
feat: Support anonymous S3 access for public buckets #27758

Closed
0eu wants to merge 1 commit into trinodb:master from 0eu:0eu/anonymous-access

Conversation

@0eu

@0eu 0eu commented Dec 24, 2025

Description

This pull request introduces the ability to access public S3 storage anonymously.

Currently, when fs.native-s3.enabled is set to true, Trino utilizes an AWS credential provider chain that expects some form of credentials. If no credentials are found, the S3 client fails to initialize, preventing access to public datasets that do not require authentication.

This PR adds a new configuration property, s3.anonymous-access. When set to true, the S3FileSystem will use the AnonymousCredentialsProvider, allowing Trino to perform unsigned requests to public S3 buckets.
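As a rough illustration of the configuration described above, the sketch below embeds a two-line catalog properties fragment and reads the flag with `java.util.Properties`. The class name `AnonymousAccessConfigSketch` is hypothetical, and the default of `false` when the property is absent is an assumption; Trino binds the property through its own configuration framework, not through this code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not Trino code: shows how the flag could be read from a properties file.
class AnonymousAccessConfigSketch
{
    // Example catalog configuration using the two properties named in the description.
    static final String CATALOG_PROPERTIES =
            "fs.native-s3.enabled=true\n" +
            "s3.anonymous-access=true\n";

    // Returns the value of s3.anonymous-access; treating a missing property as false is an assumption.
    static boolean isAnonymousAccess(String propertiesText)
    {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(propertiesText));
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Boolean.parseBoolean(properties.getProperty("s3.anonymous-access", "false"));
    }

    public static void main(String[] args)
    {
        System.out.println(isAnonymousAccess(CATALOG_PROPERTIES));
        System.out.println(isAnonymousAccess("fs.native-s3.enabled=true\n"));
    }
}
```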

Additional context and related issues

Fixes: #27512

The implementation is inspired by a comment in the original issue:

  • Adds anonymousAccess to S3FileSystemConfig.
  • Updates S3FileSystemLoader to prioritize the AnonymousCredentialsProvider if the configuration is enabled.
  • Includes documentation updates to guide users on how to enable this for public buckets.
  • Adds unit tests in TestS3FileSystemConfig to ensure the property is correctly mapped from configuration files.
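The prioritization described in the list above can be sketched roughly as follows. This is not the code from the PR: `ProviderKind` and `selectProvider` are stand-ins invented for the example, and the ordering of the non-anonymous branches is an assumption. In the actual loader, the anonymous branch would presumably return the AWS SDK's `AnonymousCredentialsProvider` rather than an enum value.

```java
// Stand-in types for illustration; the real change works with AWS SDK credential providers.
class CredentialsSelectionSketch
{
    // Hypothetical labels for the kinds of credentials the S3 client could be built with.
    enum ProviderKind { ANONYMOUS, STATIC_KEYS, DEFAULT_CHAIN }

    static ProviderKind selectProvider(boolean anonymousAccess, boolean staticKeysConfigured)
    {
        // Anonymous access is checked first, so requests go out unsigned
        // and no credential lookup is attempted.
        if (anonymousAccess) {
            return ProviderKind.ANONYMOUS;
        }
        // Assumed ordering: explicitly configured keys win over the default chain.
        if (staticKeysConfigured) {
            return ProviderKind.STATIC_KEYS;
        }
        // Existing behavior: fall back to the default credential provider chain.
        return ProviderKind.DEFAULT_CHAIN;
    }

    public static void main(String[] args)
    {
        System.out.println(selectProvider(true, false));
        System.out.println(selectProvider(false, false));
    }
}
```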

I've also done testing on my machine and it works:

trino:default> CREATE TABLE hive.default.noaa_test (
            ->       id VARCHAR,
            ->       date VARCHAR,
            ->       element VARCHAR,
            ->       data_value INTEGER
            ->   )
            ->   WITH (
            ->       format = 'PARQUET',
            ->       external_location = 's3://noaa-ghcn-pds/parquet/by_year/YEAR=2024/ELEMENT=TMAX/'
            ->   );
CREATE TABLE

trino:default> SHOW TABLES IN hive.default;
   Table   
-----------
 noaa_test 
(1 row)

Query 20251224_224038_00007_2eahn, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0.07 [1 rows, 138B] [13 rows/s, 1.87KiB/s]

trino:default> SELECT * FROM hive.default.noaa_test LIMIT 5;
     id      |   date   | element | data_value 
-------------+----------+---------+------------
 USC00307383 | 20241222 | NULL    |        -11 
 AE000041196 | 20240101 | NULL    |        278 
 AEM00041218 | 20240101 | NULL    |        275 
 AGE00147716 | 20240101 | NULL    |        203 
 AEM00041194 | 20240101 | NULL    |        277 
(5 rows)

Query 20251224_224047_00008_2eahn, FINISHED, 1 node
Splits: 26 total, 20 done (76.92%)
1.64 [15 rows, 3.6MiB] [9 rows/s, 2.19MiB/s]

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Object Storage
* Add support for anonymous S3 access using the `s3.anonymous-access` configuration property. ({issue}`27512`)

@cla-bot

cla-bot Bot commented Dec 24, 2025

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@0eu
Author

0eu commented Dec 24, 2025

Awesome, I've submitted the signed CLA.

Comment thread docs/src/main/sphinx/object-storage/file-system-s3.md

public boolean isAnonymousAccess()
{
    return anonymousAccess;
Contributor

Looking at:

I see the use of an "authentication type" field to distinguish between the many potential ways to configure authentication for S3 access.
This could, however, be done eventually in a follow-up PR.

Or if we decide to add it - then we'd be working with

s3.auth-type=ANONYMOUS

Author

Great suggestion! I really like this direction to be honest. s3.auth-type would be more consistent with Azure/GCS and eventually, if there would be more types, they could be easily supported. I went with a boolean flag for simplicity, but see the value in the proposed idea.

However, before we implement this (now or in the follow-up PR), we need to clarify how this will interact with existing config parameters. For example, if the user doesn't specify an auth type, should we fail with an error or use the old logic (default credentials chain)?

I would be happy to move s3.auth-type to a separate PR. But let's create an issue or track it somewhere so that S3 isn't an exception when it comes to authentication configuration. In the meantime, I will see how this can be implemented -- I'm relatively new to the codebase, but willing to learn more and contribute :D

Contributor

if the user doesn't specify an auth type, should we fail with an error or use the old logic (default credentials chain)?

When introducing the new configuration, we should deprecate the existing one. This preserves backward compatibility while providing users with a warning, giving them a transition period to adjust if necessary. For reference, see #26681 for details.
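One way to read the backward-compatibility approach discussed in this thread is sketched below. Everything here is hypothetical: the `AuthType` values other than `ANONYMOUS`, the `resolve` method, and the decision to reject conflicting settings are assumptions made for illustration, not behavior confirmed by this PR or by #26681.

```java
import java.util.Optional;

// Hypothetical sketch of resolving an effective auth type while a legacy boolean is still accepted.
class AuthTypeResolutionSketch
{
    // ANONYMOUS comes from the suggestion in this thread; DEFAULT is an invented placeholder
    // for the existing default-credentials-chain behavior.
    enum AuthType { DEFAULT, ANONYMOUS }

    static AuthType resolve(Optional<AuthType> authType, boolean legacyAnonymousAccess)
    {
        if (authType.isPresent()) {
            // Assumed policy: the deprecated flag must not contradict the new property.
            if (legacyAnonymousAccess && authType.get() != AuthType.ANONYMOUS) {
                throw new IllegalArgumentException(
                        "s3.anonymous-access conflicts with s3.auth-type=" + authType.get());
            }
            return authType.get();
        }
        // No auth type given: keep the old logic so existing configurations continue to work.
        return legacyAnonymousAccess ? AuthType.ANONYMOUS : AuthType.DEFAULT;
    }

    public static void main(String[] args)
    {
        System.out.println(resolve(Optional.empty(), false));
        System.out.println(resolve(Optional.of(AuthType.ANONYMOUS), false));
    }
}
```

Under this reading, a user who sets neither property keeps the default credentials chain, which answers the question raised above without failing existing deployments.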

Author

Thanks for sharing this PR, it's helpful 👍

@0eu 0eu force-pushed the 0eu/anonymous-access branch from 490a595 to 0afaf65 Compare December 25, 2025 22:19

@0eu 0eu force-pushed the 0eu/anonymous-access branch from 0afaf65 to 4696d3b Compare December 26, 2025 10:57

}

@Config("s3.anonymous-access")
@ConfigDescription("Use anonymous credentials for accessing public S3 buckets")
Member

Please add config validation. When anonymous access is set, the other authentication options (e.g. access/secret key) should be left unset, as they are ignored.

See here for an example validation test:

@Test
public void testInvalidPath()
{
    assertFailsValidation(
            new SqlEnvironmentConfig()
                    .setPath("too.many.parts"),
            "sqlPathValid",
            "sql.path must be a valid SQL path",
            AssertTrue.class);
}

Author

It's a good point! I've added validation to ensure that certain options are mutually exclusive.
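The mutual-exclusion rule requested in this thread could look roughly like the predicate below. This is a plain-Java sketch, not the validation added in the PR: the class and method names are hypothetical, and null is used here to mean "not configured". The reviewer's example suggests that Trino config classes express such checks as a boolean method validated with `AssertTrue`; the sketch only captures the condition such a method would return.

```java
// Hypothetical sketch of the mutual-exclusion rule; not the validation code from the PR.
class AnonymousAccessValidationSketch
{
    // Valid when anonymous access is off, or when no static credentials are configured alongside it.
    static boolean isAnonymousAccessValid(boolean anonymousAccess, String accessKey, String secretKey)
    {
        return !anonymousAccess || (accessKey == null && secretKey == null);
    }

    public static void main(String[] args)
    {
        // Anonymous access combined with an access key is rejected.
        System.out.println(isAnonymousAccessValid(true, "example-access-key", null));
        // Anonymous access on its own is accepted.
        System.out.println(isAnonymousAccessValid(true, null, null));
    }
}
```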

@0eu 0eu force-pushed the 0eu/anonymous-access branch from 4696d3b to e3f9a2c Compare January 5, 2026 22:05

@0eu
Author

0eu commented Jan 5, 2026

Processing may take a few days. The CLA needs to be on file before we merge your changes.

Hmm, I submitted the signed CLA on Dec 25, 2025. Is there a way to follow up on this?

@github-actions

This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.

@github-actions github-actions Bot added the stale label Jan 27, 2026
@github-actions

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

@github-actions github-actions Bot closed this Feb 17, 2026
