Add support for Iceberg table encryption #28354
Conversation
Integrate Iceberg encryption handling into read and write paths and table operations across Iceberg catalogs. Add test coverage for encrypted table behavior. Signed-off-by: kamijin_fanta <kamijin@live.jp>
        throws Exception
{
    return IcebergQueryRunner.builder()
            .addIcebergProperty("iceberg.encryption.kms-impl", TestingKmsClient.class.getName())
Can we use LocalStack instead? We should also test with real KMS such as AWS Key Management Service.
I initially thought we couldn’t override the endpoint. However, after re-reading the code, I realized we can implement our own AwsClientFactory and point to it via the catalog property client.factory. That would let us call endpointOverride(...) in the client builder.
I’ll validate whether we can run the tests against LocalStack with this approach. Thanks!
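For reference, a hypothetical catalog configuration for that approach might look like the following sketch (the factory class name and the LocalStack endpoint are assumptions for illustration, not code from this PR):

```properties
# Register a custom AwsClientFactory via Iceberg's client.factory property.
# Inside the factory, each AWS client builder would call
# endpointOverride(URI.create("http://localhost:4566")) to target LocalStack.
client.factory=io.trino.plugin.iceberg.TestingAwsClientFactory
```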
@Config("iceberg.encryption.kms-impl")
@ConfigDescription("KMS implementation class for Iceberg table encryption")
We shouldn't expose Java class name to users. Please introduce a new enum and map to class name internally.
Also, don't forget updating iceberg.md.
According to the Iceberg documentation, both encryption.kms-type and encryption.kms-impl appear to be public configuration options. However, in Iceberg 1.10.1 and also in the latest main branch, encryption.kms-type doesn’t seem to be referenced/used in the code, and users effectively must set encryption.kms-impl all the time.
Also, I think we should keep some extensibility here because there are users who use KMS providers outside of major clouds like AWS/GCP (including myself). For consistency with other configuration surfaces (e.g., catalog configs) I’m currently leaning toward accepting the Java class name for now, and then later switching to an enum-based selection once Iceberg formalizes/supports kms-type (or an equivalent stable interface). Ideally we can evolve this to an internal enum → class mapping when Iceberg’s interface becomes clearer.
What do you think, @ebyhr? Another reasonable option is to wait to merge until Iceberg finalizes this interface, but I’m trying to balance that with keeping room for non-cloud KMS implementations.
encryption.kms-type doesn’t seem to be referenced/used in the code
Did you confirm apache/iceberg#15272? We don't need to wait for 1.11.0 to use the KMS type anyway. We can introduce iceberg.encryption.kms-type and internally set encryption.kms-impl.
I’m currently leaning toward accepting the Java class name for now, and then later switching to an enum-based selection once Iceberg formalizes/supports kms-type (or an equivalent stable interface).
The expected order is different. We should begin with the strictest option, such as an enum-based approach, and only allow arbitrary options if that proves too restrictive. Note that we generally avoid such generic options in this project.
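An enum-based selection could look roughly like this sketch, where users pick a KMS type and the connector maps it to Iceberg's encryption.kms-impl class name internally. The enum values and implementation class names here are assumptions for illustration, not Trino's actual code:

```java
// Hypothetical sketch: enum-based KMS selection, mapped to Iceberg's
// encryption.kms-impl class name internally. Enum values and class names
// are illustrative assumptions.
enum EncryptionKmsType
{
    AWS_KMS("org.apache.iceberg.aws.AwsKeyManagementClient"),
    IN_MEMORY("io.trino.plugin.iceberg.TestingKmsClient");

    private final String kmsImplClassName;

    EncryptionKmsType(String kmsImplClassName)
    {
        this.kmsImplClassName = kmsImplClassName;
    }

    String kmsImplClassName()
    {
        return kmsImplClassName;
    }
}
```

With this shape, a user would set something like `iceberg.encryption.kms-type=AWS_KMS` in catalog properties, and the connector would resolve the class name internally, keeping Java class names out of user configuration.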
Thanks. I read your Iceberg PR.
One question to confirm: are you saying it’s acceptable to implement this logic on the Trino side without waiting for the next Iceberg release? For example, we could expose only the iceberg.encryption.kms-type property in Trino, and then internally set Iceberg’s encryption.kms-impl to the appropriate implementation class based on that value.
Also, my understanding is that KMS APIs don’t have an industry-standard interface like the S3 API for object storage. In particular, users who enable encryption often build their own KMS around on-prem HSMs or similar systems. Given that context, I assume Iceberg exposes kms-impl to allow users to plug in their own implementation. That said, I understand the point that Trino generally shouldn’t expect users to configure a Java class name directly, or, put differently, to distribute arbitrary jars/classes via configuration.
In that case, is there a good way to leave room for users outside cloud environments to customize KMS behavior without forking Trino? And is supporting that kind of extensibility something you’d consider in scope?
requireNonNull(schemaTableName, "schemaTableName is null");
requireNonNull(tableSchemaJson, "tableSchemaJson is null");
columns = ImmutableList.copyOf(requireNonNull(columns, "columns is null"));
storageProperties = ImmutableMap.copyOf(requireNonNull(storageProperties, "storageProperties is null"));
We don't use requireNonNull with ImmutableMap.copyOf.
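The wrapper is redundant because the copy already performs the null check: Guava's ImmutableMap.copyOf throws NullPointerException on null input, just as the JDK's Map.copyOf (used in this self-contained sketch) does.

```java
import java.util.Map;

// Demonstrates that copyOf-style methods already reject null input,
// making a wrapping requireNonNull redundant. Uses the JDK's Map.copyOf,
// which behaves like Guava's ImmutableMap.copyOf in this respect.
class CopyOfNullCheck
{
    static Map<String, String> copy(Map<String, String> storageProperties)
    {
        // No requireNonNull needed: copyOf throws NullPointerException for null
        return Map.copyOf(storageProperties);
    }

    public static void main(String[] args)
    {
        boolean threw = false;
        try {
            copy(null);
        }
        catch (NullPointerException e) {
            threw = true;
        }
        System.out.println("NPE thrown for null input: " + threw); // prints true
    }
}
```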
if (catalogKmsClient != null) {
    return catalogKmsClient;
}
catalogKmsClient = EncryptionUtil.createKmsClient(properties);
I don't think calling EncryptionUtil.createKmsClient with table properties will work against a real KMS. As I understand it, it requires credentials provided via catalog properties.
This illustrates why we don't allow such generic implementations. It becomes very easy to ship a broken implementation.
I’ve removed iceberg.encryption.kms-impl for now and added iceberg.encryption.kms-type and iceberg.encryption.kms-properties.
For the KMS client configuration, table properties are not used. Instead, iceberg.encryption.kms-properties is used.
That said, my understanding is that Trino generally has very few places where we allow passing through an arbitrary set of options like this. So I’m considering listing the required properties (for example, the AWS region) and defining explicit, typed fields for them instead.
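That direction might look like the following hypothetical catalog properties; the property names are illustrative assumptions, not names introduced by this PR:

```properties
# Hypothetical typed configuration replacing a free-form kms-properties list:
# each required setting gets its own named, validated property.
iceberg.encryption.kms-type=AWS_KMS
iceberg.encryption.aws-kms.region=us-east-1
```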
try {
    return fileIo.properties();
}
catch (UnsupportedOperationException e) {
We will be able to remove this catch once apache/iceberg#15289 is released.
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

public class TestIcebergEncryptionManagerFactory
Please follow https://trino.io/docs/current/develop/tests.html
@Inject
public IcebergEncryptionManagerFactory(IcebergConfig config)
{
    requireNonNull(config, "config is null");
We don't use requireNonNull for config classes. #13940
Signed-off-by: kamijin_fanta <kamijin@live.jp>
Thanks for the PR. I've also been working on PME in Iceberg, but haven't published the draft yet. Let me go through this PR to see if it converges in the right direction.
private final long fileSize;
private final long fileRecordCount;
private final IcebergFileFormat fileFormat;
private final Optional<byte[]> encryptionKeyMetadata;
Split should contain fileKey AND aad.
encryptionKeyMetadata shouldn't be forwarded to workers, as workers will not have access to Iceberg metadata (where keys are stored).
Example:

@VisibleForTesting
record ParquetFileDecryptionData(byte[] fileEncryptionKey, byte[] fileAadPrefix)
{
    ParquetFileDecryptionData
    {
        requireNonNull(fileEncryptionKey, "fileEncryptionKey is null");
        requireNonNull(fileAadPrefix, "fileAadPrefix is null");
    }
}

private static Optional<ParquetFileDecryptionData> parquetFileDecryptionData(
        EncryptedInputFile encryptedInputFile,
        EncryptionManager encryptionManager)
{
    InputFile inputFile;
    try {
        inputFile = encryptionManager.decrypt(encryptedInputFile);
    }
    catch (RuntimeException e) {
        return Optional.empty();
    }
    if (!(inputFile instanceof NativeEncryptionInputFile nativeEncryptionInputFile)) {
        return Optional.empty();
    }
    NativeEncryptionKeyMetadata nativeKeyMetadata;
    try {
        nativeKeyMetadata = nativeEncryptionInputFile.keyMetadata();
    }
    catch (RuntimeException e) {
        return Optional.empty();
    }
    ByteBuffer encryptionKey = nativeKeyMetadata.encryptionKey();
    ByteBuffer aadPrefix = nativeKeyMetadata.aadPrefix();
    if (encryptionKey == null || aadPrefix == null) {
        return Optional.empty();
    }
    return Optional.of(new ParquetFileDecryptionData(
            ByteBuffers.toByteArray(encryptionKey),
            ByteBuffers.toByteArray(aadPrefix)));
}
return fileSystem.newInputFile(Location.of(path), fileSize);
}

private static TrinoInputFile decryptInputFileIfNeeded(
We already have framework for PME decryption (#24517) that works with Trino native Parquet reader that should be used instead.
Parquet has different encryption modes with footer/column keys, see https://parquet.apache.org/docs/file-format/data-pages/encryption/. I don't think org.apache.iceberg.encryption.StandardEncryptionManager.StandardDecryptedInputFile even works with reading encrypted parquet files.
IIUC encryptionManager.get().decrypt is for decrypting metadata files.
PTAL at #28389. Especially split handling, read-path test and …
        "encryption_key_id = 'test-key', " +
        "encryption_data_key_length = 16)")) {
    String tableName = table.getName();
    assertUpdate("INSERT INTO " + tableName + " VALUES (1, 'a'), (2, 'b'), (3, 'c')", 3);
The encrypted table is created by Trino too, right? I think that's the reason why
InputFile decryptedInputFile = encryptionManager.get().decrypt(EncryptedFiles.encryptedInput(encryptedInputFile, metadataWithLength));
works for Parquet files. However, this should really be tested with code like https://github.com/trinodb/trino/pull/28389/changes#diff-75c880cd52216050277554400af0c70c6a983d40426c347aae427123aa28542bR140 (e.g. use apache parquet writer like
try (DataWriter<Record> writer = Parquet.writeData(encryptedOutputFile)
.forTable(table)
.withSpec(table.spec())
.withPartition(null)
.withKeyMetadata(encryptedOutputFile.keyMetadata())
.createWriterFunc(GenericParquetWriter::create)
.build()) {
Add ParquetFileDecryptionData (file key + AAD prefix) to Iceberg split metadata, delete files, and table_changes splits. Resolve decryption data on the coordinator in split sources and pass it to workers. Wire Parquet decryption properties into IcebergPageSourceProvider. Keep compatibility by falling back to legacy key-metadata-based decryption when key metadata is present. Signed-off-by: kamijin_fanta <kamijin@live.jp>
@Config("iceberg.encryption.kms-properties")
@ConfigDescription("Catalog-level KMS client properties in key=value format")
public IcebergConfig setEncryptionKmsProperties(List<String> encryptionKmsProperties)
{
    this.encryptionKmsProperties = Optional.ofNullable(encryptionKmsProperties)
            .map(ImmutableList::copyOf)
            .orElseGet(ImmutableList::of);
    return this;
}
In this project, we avoid arbitrary config properties as much as possible. With that model, it's easy to miss required properties and to permit invalid combinations of settings. It also makes it harder to correctly handle values that contain a `=`. Also, the current code risks leaking credentials, since the @ConfigSecuritySensitive annotation is missing.
You can refer to IcebergRestCatalogModule for an example of how we handle multiple implementations.
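The `=` ambiguity is concrete: naively splitting a `key=value` entry silently truncates any value that itself contains `=` (base64-encoded secrets, for example, often end in `=` padding). A small self-contained illustration, not Trino code:

```java
// Illustrates why free-form key=value lists are fragile: a value containing
// '=' is ambiguous under a naive split.
class KeyValueParsing
{
    static String naiveValue(String entry)
    {
        // Naive: split on every '=' and take the second piece,
        // silently dropping everything after a second '='
        return entry.split("=")[1];
    }

    static String firstEqualsValue(String entry)
    {
        // Splitting only on the first '=' preserves the full value
        return entry.substring(entry.indexOf('=') + 1);
    }

    public static void main(String[] args)
    {
        String entry = "kms.credential=AAAA=="; // base64 value with '=' padding
        System.out.println(naiveValue(entry));       // prints AAAA (truncated)
        System.out.println(firstEqualsValue(entry)); // prints AAAA==
    }
}
```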
{
    try {
        OrcDataSink orcDataSink = OutputStreamOrcDataSink.create(fileSystem.newOutputFile(outputPath));
        EncryptedOutput encryptedOutput = createOutputFile(fileSystem, outputPath, encryptionManager);
Iceberg encryption is not supported yet in tables with ORC data files. In the future, the native ORC encryption will need to be leveraged.
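A minimal sketch of the fail-fast guard this implies (type, method, and exception names are illustrative assumptions, not this PR's actual code):

```java
// Hypothetical validation: reject encryption up front for data-file formats
// that have no Iceberg encryption support yet, instead of failing deep
// inside a writer.
enum DataFileFormat { PARQUET, ORC, AVRO }

class EncryptionFormatGuard
{
    static void checkEncryptionSupported(DataFileFormat format, boolean encryptionEnabled)
    {
        if (encryptionEnabled && format != DataFileFormat.PARQUET) {
            throw new IllegalArgumentException(
                    "Iceberg encryption is not supported for " + format + " data files");
        }
    }
}
```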
}

@Test
void testEncryptedOrcWriterValidation()
Iceberg throws an exception when writing an encrypted ORC table:
https://github.com/apache/iceberg/blob/main/orc/src/main/java/org/apache/iceberg/orc/ORC.java#L116
@kamijin-fanta are you going to work on this PR?
#28905 takes over this PR as we talked offline.
Integrate Iceberg encryption handling into read and write paths and table operations across Iceberg catalogs.
Description
- Add table properties encryption_key_id and encryption_data_key_length.
- Add catalog configuration property iceberg.encryption.kms-impl.
- Integrate EncryptionManager into write paths for data files, position delete files, and deletion vectors.
- Use EncryptingFileIO for encrypted tables.
Additional context and related issues
- encryption_data_key_length is validated to 16, 24, or 32 bytes and requires encryption_key_id.
- Changing encryption_key_id or encryption_data_key_length after they are set is rejected.
- Catalog-level iceberg.encryption.kms-impl is reused, while table-level encryption.kms-impl is resolved per table.
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: