Skip to content

Add IcebergTableCredentials back to IcebergSplit#29263

Closed
JohnFitzpatrick44 wants to merge 5 commits into
trinodb:masterfrom
JohnFitzpatrick44:jcf/include-creds-in-split
Closed

Add IcebergTableCredentials back to IcebergSplit#29263
JohnFitzpatrick44 wants to merge 5 commits into
trinodb:masterfrom
JohnFitzpatrick44:jcf/include-creds-in-split

Conversation

@JohnFitzpatrick44
Copy link
Copy Markdown
Member

@JohnFitzpatrick44 JohnFitzpatrick44 commented Apr 28, 2026

Description

Previously, fileIoProperties were removed from IcebergSplit as part of the response to #28537.

Passing credentials through the split is not inherently risky, however. We currently do not surface any split info that isn't explicitly selected through APIs like ConnectorSplit#getSplitInfo (here) or SplitCompletedEvent (here).

Now that credentials are wrapped in a very clearly named IcebergTableCredentials object, it should be fine to pass this through the split. We may have been overzealous with what was removed. It may still be useful to be able to handle vended credentials from splits, rather than only at the table level.

I considered an alternative approach where we do dynamically update credentials using a similar and simplified version of how we handle dynamic filters. This proved to be quite complicated and invasive, however, and I would like to limit the amount of changes outside of the Iceberg connector.

This approach is much simpler, and shouldn't result in any credential leaks that we were initially responding to from the above issue.

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot Bot added the cla-signed label Apr 28, 2026
@github-actions github-actions Bot added hudi Hudi connector iceberg Iceberg connector delta-lake Delta Lake connector hive Hive connector bigquery BigQuery connector mongodb MongoDB connector blackhole Blackhole connector elasticsearch Elasticsearch connector faker Faker connector memory Memory connector opensearch OpenSearch connector pinot Pinot connector redshift Redshift connector lakehouse labels Apr 28, 2026
@findepi findepi requested a review from ebyhr April 28, 2026 19:45
Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplit.java Outdated
@findinpath
Copy link
Copy Markdown
Contributor

[INFO] --- checkstyle:3.6.0:check (checkstyle) @ trino-spi ---
[INFO] There is 1 error reported by Checkstyle 13.4.1 with checkstyle/airbase-checks.xml ruleset.
Error:  src/main/java/io/trino/spi/connector/ConnectorPageSourceProvider.java:[19,8] (imports) UnusedImports: Unused import - java.util.Optional.
[INFO] ------------------------------------------------------------------------

@JohnFitzpatrick44 JohnFitzpatrick44 force-pushed the jcf/include-creds-in-split branch 2 times, most recently from 7de4fcd to c5d7c4e Compare April 28, 2026 21:16
@wendigo
Copy link
Copy Markdown
Contributor

wendigo commented Apr 29, 2026

How this is more secure that the current implementation? What problem does this PR solve?

@findinpath
Copy link
Copy Markdown
Contributor

How this is more secure that the current implementation? What problem does this PR solve?

This is less about security and more about use cases which require the fileIoProperties in the IcebergSplit.
The splits are deemed, at the time of this writing, secure, so having fileIoProperties packed within the IcebergSplit should not pose concerns.

One such use case which would benefit of having the fileIoProperties in the IcebergSplit is #28389
cc @sopel39

@wendigo
Copy link
Copy Markdown
Contributor

wendigo commented Apr 29, 2026

@findinpath security comes first, not a single feature use case should impose security hole. I prefer worse but secure API

@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Apr 29, 2026

"Add IcebergTableCredentials back to IcebergSplit"

Does it mean we send potentially same table credential info over and over again with each split? Should it be called IcebergSplitCredentials?

@JohnFitzpatrick44
Copy link
Copy Markdown
Member Author

JohnFitzpatrick44 commented Apr 29, 2026

@wendigo

security comes first, not a single feature use case should impose security hole. I prefer worse but secure API

Fair point, but this approach is at least as secure as the current state. We no longer indiscriminately dump splits to query json (from PRs here and here), so the only remaining risk is toString, which isn't meant to have sensitive information anyway. cc @raunaqmorarka

As for motivation, being able to vend split-level credentials makes us significantly more flexible moving forward, and this approach moves us in a much better direction than trying to find a workaround to provide creds to tasks after scheduling.

@sopel39

Does it mean we send potentially same table credential info over and over again with each split? Should it be called IcebergSplitCredentials?

Yes, although this was the norm before #28537

Good suggestion for the name change, I'll consider better alternatives. Maybe a more general IcebergCredentials.

The interface is ConnectorTableCredentials, so best leave it as-is.

@JohnFitzpatrick44 JohnFitzpatrick44 force-pushed the jcf/include-creds-in-split branch 2 times, most recently from 4b88e10 to 61e7260 Compare April 29, 2026 21:41
@findinpath
Copy link
Copy Markdown
Contributor


Expecting code to raise a throwable.
	at io.trino.plugin.iceberg.catalog.rest.TestIcebergPolarisCatalogConnectorSmokeTest.testDropTableWithMissingDataFile(TestIcebergPolarisCatalogConnectorSmokeTest.java:197)

https://github.com/trinodb/trino/actions/runs/25135388739/job/73672490397?pr=29263

Related failure?

@JohnFitzpatrick44
Copy link
Copy Markdown
Member Author

@findinpath passes locally, I'll try rerunning tests

@JohnFitzpatrick44 JohnFitzpatrick44 force-pushed the jcf/include-creds-in-split branch 4 times, most recently from bed96ba to 61e7260 Compare April 30, 2026 19:50
@JohnFitzpatrick44 JohnFitzpatrick44 force-pushed the jcf/include-creds-in-split branch 2 times, most recently from b41536b to 45a237f Compare May 4, 2026 16:13
@JohnFitzpatrick44 JohnFitzpatrick44 force-pushed the jcf/include-creds-in-split branch 2 times, most recently from b726046 to bf49fd4 Compare May 4, 2026 20:27
Copy link
Copy Markdown
Contributor

@chenjian2664 chenjian2664 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding credentials back to IcebergSplit would be more of an enhancement for connectors that support split-level specific credentials. Am I understanding that correctly?

Map<Integer, String> partitionSpecsByIdJson,
Optional<Type> partitionColumnType)
Optional<Type> partitionColumnType,
IcebergTableCredentials tableCredentials)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the IcebergTableCredentials in FilesTableSplit?

}

@Override
public Void visitTableScan(TableScanNode node, Void context)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about connectors that require, or can operate with, table-level credentials, in that case we probably still need this method

@JohnFitzpatrick44
Copy link
Copy Markdown
Member Author

Closing to reconsider this approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bigquery BigQuery connector blackhole Blackhole connector cla-signed delta-lake Delta Lake connector elasticsearch Elasticsearch connector faker Faker connector hive Hive connector hudi Hudi connector iceberg Iceberg connector lakehouse memory Memory connector mongodb MongoDB connector opensearch OpenSearch connector pinot Pinot connector redshift Redshift connector

Development

Successfully merging this pull request may close these issues.

5 participants