AWS CUR reports extractor

This extractor downloads AWS CUR reports exported to S3 in CSV format.

AWS Setup

First, set up the CUR report export in your AWS account so that reports are delivered to an S3 bucket at the selected granularity and in CSV format. Follow this guide to set up the export.

Export Setup:

Set up the S3 bucket
Set the report prefix
Select granularity
Select report versioning (overwrite recommended)
Choose GZIP compression type

Functionality notes

The extractor downloads AWS CUR reports from S3, processes them locally using DuckDB, and exports to CSV format.

Key features:

Local processing with DuckDB (no Snowflake workspace required)
Direct S3 loading via DuckDB's httpfs extension for uncompressed files
ZIP file support with automatic extraction and processing
Dynamic schema handling - automatically expands column set when schema changes
Incremental loading - downloads only new reports when configured
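
The dynamic schema handling above can be illustrated without DuckDB. The following is a minimal sketch using only the Python standard library, not the component's actual code: the function name `merge_reports` and the sample data are hypothetical. It merges reports whose headers differ by taking the union of all columns and leaving missing values empty. DuckDB's `read_csv` offers a `union_by_name` option with a similar effect.

```python
import csv
import io


def merge_reports(csv_texts):
    """Combine CSV reports whose headers differ, expanding the column set."""
    columns = []  # union of all headers, in first-seen order
    rows = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        rows.extend(reader)

    out = io.StringIO()
    # restval fills columns that a given report did not contain.
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample: the second report gained a "tag" column.
january = "id,cost\n1,0.5\n"
february = "id,cost,tag\n2,0.7,team-a\n"
merged = merge_reports([january, february])
```

Rows from the older report simply get an empty value for any column that was added later.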

Configuration

AWS config

Your S3 bucket details and credentials, as set up in the AWS console.

New files only

If set to true, only newly generated reports are downloaded on each execution.
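
As a rough sketch of how such incremental loading can work, assuming the state stores the timestamp of the newest report already processed: the function `select_new_reports`, the state key `last_report_time`, and the report dictionaries are all hypothetical and do not reflect the component's actual state format.

```python
from datetime import datetime


def select_new_reports(reports, state):
    """Return reports generated after the timestamp saved in state, plus the new state."""
    last_seen = state.get("last_report_time")  # hypothetical state key
    new = [r for r in reports if last_seen is None or r["generated"] > last_seen]
    if new:
        # Remember the newest report so the next run skips everything up to it.
        state = {"last_report_time": max(r["generated"] for r in new)}
    return new, state


# Hypothetical report listing; keys and timestamps are made up.
reports = [
    {"key": "2024-01/report.csv.gz", "generated": datetime(2024, 2, 1)},
    {"key": "2024-02/report.csv.gz", "generated": datetime(2024, 3, 1)},
]
first_run, state = select_new_reports(reports, {})
second_run, state = select_new_reports(reports, state)
```

With an empty state the first run selects everything, which is why resetting the state triggers a backfill.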

Minimum date since

The lowest report date to download. When the New files only option is checked, this applies only to the first run; reset the state to backfill. Use a date in YYYY-MM-DD format or a relative string, e.g. 5 days ago, 1 month ago, yesterday. If left empty, all records are downloaded.

Maximum date

The highest report date to download. When the New files only option is checked, this applies only to the first run; reset the state to backfill. Use a date in YYYY-MM-DD format or a relative string, e.g. 5 days ago, 1 month ago, yesterday. If left empty, all records are downloaded.
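
To make the accepted date formats concrete, here is a small illustrative parser for the Minimum date since and Maximum date values. It is an assumption-laden sketch, not the component's parser: `parse_report_date` is a hypothetical helper, and it only understands YYYY-MM-DD, yesterday, and "N days/weeks/months ago". The component may accept more phrasings.

```python
import calendar
import re
from datetime import date, timedelta


def parse_report_date(value, today):
    """Parse 'YYYY-MM-DD' or a simple relative phrase into a date (illustrative only)."""
    text = value.strip().lower()
    if text == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"(\d+) (day|week|month)s? ago", text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        if unit == "day":
            return today - timedelta(days=amount)
        if unit == "week":
            return today - timedelta(weeks=amount)
        # Months: step back on the calendar and clamp the day to the month length.
        index = today.year * 12 + (today.month - 1) - amount
        year, month = divmod(index, 12)
        month += 1
        day = min(today.day, calendar.monthrange(year, month)[1])
        return date(year, month, day)
    return date.fromisoformat(text)  # raises ValueError on anything else


today = date(2024, 3, 31)  # fixed reference date so the example is deterministic
since = parse_report_date("1 month ago", today)
until = parse_report_date("yesterday", today)
```

Passing the reference date explicitly keeps the sketch deterministic; a real run would use the current date.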

Report prefix

The prefix as set up in the AWS CUR config. In the S3 bucket, this is the path to your report, e.g. my-report or some/long/prefix/my_report.

In most cases this is the prefix you chose. If unsure, refer to the S3 bucket containing the report and copy the path of the report folder.

Output

The output schema is described here

IMPORTANT NOTE: The result column names are modified to match the KBC Storage column name requirements:

Categories are separated by __, e.g. bill/BillingPeriodEndDate is converted to bill__billingPeriodEndDate.
Any characters that are not alphanumeric or underscores are replaced by an underscore, e.g. resourceTags/user:owner is converted to resourceTags__user_owner.
KBC Storage is case insensitive, so the above may lead to duplicate names. In such cases the names are deduplicated by adding an index, e.g. resourceTags/user:name and resourceTags/user:Name lead to the resourceTags__user_Name and resourcetags__user_name_1 columns respectively.
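
The renaming rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the component's implementation: `normalize_columns` is a hypothetical helper, it keeps the original casing of each name, and it does not guard against an indexed name colliding with another column. The exact casing and index placement produced by the component may differ.

```python
import re


def normalize_columns(names):
    """Sketch of the renaming rules described above (not the component's code)."""
    seen = {}  # lower-cased name -> how many times it has appeared
    result = []
    for name in names:
        # Rule 1: the category separator "/" becomes a double underscore.
        cleaned = name.replace("/", "__")
        # Rule 2: anything that is not alphanumeric or "_" becomes "_".
        cleaned = re.sub(r"[^A-Za-z0-9_]", "_", cleaned)
        # Rule 3: deduplicate case-insensitively by appending an index.
        key = cleaned.lower()
        count = seen.get(key, 0)
        seen[key] = count + 1
        result.append(cleaned if count == 0 else f"{cleaned}_{count}")
    return result


columns = normalize_columns([
    "bill/BillingPeriodEndDate",
    "resourceTags/user:owner",
    "resourceTags/user:name",
    "resourceTags/user:Name",
])
```

Tracking names by their lower-cased form is what makes the last two inputs count as duplicates, so the second one receives the index suffix.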

Note that the output schema changes often and may also be affected by the tags and custom columns you define.

Development

Requires Python 3.13 and UV package manager.

Local Development

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and setup
git clone repo_path my-new-component
cd my-new-component
uv sync

# Run component
source .venv/bin/activate
PYTHONPATH=src python src/component.py

# Run tests
python -m unittest discover
ruff check .

Docker Development

docker-compose build
docker-compose run --rm dev

Integration

For information about deployment and integration with KBC, please refer to the deployment section of the developers documentation.
