Skip to content

Fix for opt-in AWS S3 regions when no explicit region is specified#88930

Merged
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
zvonand:fix-s3-opt-in-regions
Oct 24, 2025
Merged

Fix for opt-in AWS S3 regions when no explicit region is specified#88930
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
zvonand:fix-s3-opt-in-regions

Conversation

@zvonand
Copy link
Contributor

@zvonand zvonand commented Oct 23, 2025

Currently, CH fails to read from buckets that are located in opt-in AWS regions if region is not specified in query explicitly:

SELECT DISTINCT _path                                                                                                                                                        
FROM s3('s3://aztest-mesouth/subfoldr/*', '<key-id>', '<secret>')                                                                
                                                                
Received exception from server (version 25.10.1):                                                                                                                            
Code: 499. DB::Exception: Received from localhost:9000. DB::Exception: Could not list objects in bucket 'aztest-mesouth' with prefix 'subfoldr/', S3 exception: `IllegalLocationConstraintException`, message: 'Unable to parse ExceptionName: IllegalLocationConstraintException Message: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.': The data format cannot be detected by the contents of the files. You can specify the format manually. (S3_ERROR)



SELECT DISTINCT _path                                                                                                                                                        
FROM s3('https://s3.me-south-1.amazonaws.com/aztest-mesouth/subfoldr/*', '<key-id>', '<secret>')                                 


   ┌─_path───────────────────────────────┐
1. │ aztest-mesouth/subfoldr/example.csv │
   └─────────────────────────────────────┘        
                                                                                                                           

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Allow using opt-in AWS regions for S3 automatically when the region is not specified in the endpoint. Reference: opt-in AWS regions.

@alexey-milovidov alexey-milovidov self-assigned this Oct 23, 2025
@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Oct 23, 2025
@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented Oct 23, 2025

Workflow [PR], commit [96aa021]

Summary:

job_name test_name status info comment
Stateless tests (amd_ubsan, parallel) failure
00900_long_parquet_load FAIL cidb, flaky
Stateless tests (amd_debug, distributed plan, s3 storage, parallel) failure
02844_max_backup_bandwidth_s3 FAIL cidb, flaky
Integration tests (amd_asan, old analyzer, 4/6) failure
test_scheduler_cpu_preemptive/test.py::test_downscaling[cpu-slot-preemption-timeout-1ms] FAIL cidb
OOM in dmesg FAIL cidb
Performance Comparison (amd_release, master_head, 2/3) failure
Check Results failure

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Oct 23, 2025
@zvonand zvonand force-pushed the fix-s3-opt-in-regions branch from 19cd84c to 96aa021 Compare October 23, 2025 23:11
@Felixoid
Copy link
Member

@zvonand, please, rotate the accidental credentials leak. I removed the version with it, but I think you still need to revoke it.

@zvonand
Copy link
Contributor Author

zvonand commented Oct 24, 2025

@zvonand, please, rotate the accidental credentials leak.

Already did, seconds after doing it.
Still, thanks for noticing!

@zvonand
Copy link
Contributor Author

zvonand commented Oct 24, 2025

@alexey-milovidov alexey-milovidov merged commit f33d205 into ClickHouse:master Oct 24, 2025
116 of 123 checks passed
@zvonand zvonand deleted the fix-s3-opt-in-regions branch October 24, 2025 10:32
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Oct 24, 2025
@fm4v fm4v requested a review from antonio2368 October 27, 2025 12:34
@fm4v
Copy link
Member

fm4v commented Oct 27, 2025

@antonio2368 hi! Could you please review this PR as well?

}

/// special handling for opt-in regions
if (new_region_detected && is_illegal_constraint_exception && initial_endpoint.substr(11) == "amazonaws.com")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you help me understand why we cannot rely on the logic below?
It should be same as what you did here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can't use getURIFromError but we can only modify that code:

auto new_uri = is_illegal_constraint_exception ? initial_endpoint : getURIFromError(error);

Would something like this work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it actually would. shall I submit a small PR to make this look a bit better then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes if you don't mind.
The code is already a bit complex because it tries to handle such edge cases so I would like to insist on keeping it simple where possible.
Also thanks for making this fix!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants