Add native subclasses for InputFile and OutputFile to simplify. #10

Merged
ggershinsky merged 1 commit into ggershinsky:deliver-key-metadata2 from rdblue:deliver-key-metadata2
Jan 4, 2024

@rdblue rdblue commented Jan 3, 2024

This is an update to apache#9359 that simplifies the read path and addresses review items.

This introduces NativeEncryptionInputFile to solve problems in the read path. Specifically, the encryption manager needed to know whether to decrypt or return the underlying input file, but that depends on the file format and whether that format is capable of native encryption. In addition, the read path needed to know whether to configure native encryption, which was a separate choice. I think having a separate encryption choice would also have broken existing encryption managers that handle Parquet files with encrypting streams.

Instead, this uses a new InputFile type to signal that native encryption should be used if possible and to carry the encryption key. Then the format helpers (Avro, Parquet, and ORC) are responsible for either using the encrypting stream (Avro) or applying native encryption (Parquet and ORC).
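To make the dispatch concrete, here is a minimal, self-contained sketch of the idea rather than the code in this PR. The types below are simplified stand-ins declared locally; the `encryptionKey()` accessor, the `ReadMode` enum, and the `chooseReadMode` and `nativeFile` helpers are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

public class NativeEncryptionSketch {
  /** Simplified stand-in for an input file; only the location is modeled. */
  public interface InputFile {
    String location();
  }

  /** Marker subtype: "use native encryption if the format supports it", plus the key. */
  public interface NativeEncryptionInputFile extends InputFile {
    ByteBuffer encryptionKey(); // assumed accessor, for illustration only
  }

  public enum FileFormat { AVRO, PARQUET, ORC }

  /** Hypothetical summary of what a format helper would do with the file. */
  public enum ReadMode { AS_IS, DECRYPTING_STREAM, NATIVE_DECRYPTION }

  /** Each format helper decides based on the file's type, not on a separate flag. */
  public static ReadMode chooseReadMode(FileFormat format, InputFile file) {
    if (!(file instanceof NativeEncryptionInputFile)) {
      // Not marked for native encryption: read whatever the encryption manager returned.
      return ReadMode.AS_IS;
    }
    switch (format) {
      case PARQUET:
      case ORC:
        // Formats with native encryption are configured with the carried key.
        return ReadMode.NATIVE_DECRYPTION;
      default:
        // Avro has no native encryption here, so it falls back to a decrypting stream.
        return ReadMode.DECRYPTING_STREAM;
    }
  }

  /** Hypothetical factory for a native-encryption input file carrying a key. */
  public static NativeEncryptionInputFile nativeFile(String location, byte[] key) {
    return new NativeEncryptionInputFile() {
      @Override
      public String location() {
        return location;
      }

      @Override
      public ByteBuffer encryptionKey() {
        return ByteBuffer.wrap(key).asReadOnlyBuffer();
      }
    };
  }

  public static void main(String[] args) {
    InputFile encrypted = nativeFile("data/file", new byte[] {1, 2, 3});
    for (FileFormat format : FileFormat.values()) {
      System.out.println(format + " -> " + chooseReadMode(format, encrypted));
    }
  }
}
```

The point of the sketch is that the caller passes the same object everywhere and never has to make a second "configure native encryption?" decision; the type of the file carries that signal along with the key.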

I considered other approaches here, but this was the cleanest because most others required separate handling for encryption keys and didn't cover all paths -- notably missing the Arrow readers. By using the same InputFile everywhere, this requires fewer changes to Spark and Flink readers and should work everywhere that the encryption manager is used.

import java.nio.ByteBuffer;

/** {@link EncryptionKeyMetadata} for use with format-native encryption. */
public interface NativeEncryptionKeyMetadata extends EncryptionKeyMetadata {

This was needed to avoid making StandardKeyMetadata public.

@ggershinsky ggershinsky merged commit 619c234 into ggershinsky:deliver-key-metadata2 Jan 4, 2024