
Add Parquet decryption support for Hive tables #24517

Merged: sopel39 merged 6 commits into trinodb:master from sopel39:ks/pme on Oct 1, 2025

@sopel39 (Member) commented Dec 18, 2024

Description

Adds support to read Hive tables with encrypted Parquet files.

Additional context and related issues

Parquet added support for encryption: https://parquet.apache.org/docs/file-format/data-pages/encryption/. Spark also added support for reading and writing tables with encrypted Parquet files. This PR adds support for reading Hive tables with encrypted Parquet files in Trino.

Fixes #9383

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive
* Add support for reading encrypted Parquet files. ({issue}`24517`, {issue}`9383`)

@cla-bot cla-bot Bot added the cla-signed label Dec 18, 2024
@github-actions github-actions Bot added the hudi, iceberg, delta-lake, and hive labels Dec 18, 2024
@electrum (Member) commented:

This is loading classes dynamically based on class names in config. We should use the standard Trino pattern of having explicitly enumerated providers, each with their own strongly typed config classes.

@sopel39 (Member, Author) commented Dec 18, 2024

> This is loading classes dynamically based on class names in config. We should use the standard Trino pattern of having explicitly enumerated providers, each with their own strongly typed config classes.

Yes. I just rebased for now and resolved conflicts

@github-actions (Bot) commented:

This pull request has gone a while without any activity. Tagging for triage help: @mosabua

@github-actions github-actions Bot added the stale label Feb 10, 2025
@github-actions (Bot) commented Mar 3, 2025

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

@github-actions github-actions Bot closed this Mar 3, 2025
@sopel39 sopel39 reopened this Apr 2, 2025
@github-actions github-actions Bot added the redshift label and removed the stale label Apr 2, 2025
@sopel39 sopel39 force-pushed the ks/pme branch 3 times, most recently from 990f6f2 to 07025cc Compare April 17, 2025 15:29
@sopel39 sopel39 changed the title [WIP] Add Parquet decryption support for Hive tables Add Parquet decryption support for Hive tables Apr 18, 2025
@sopel39 sopel39 force-pushed the ks/pme branch 2 times, most recently from 825272d to 1a265fc Compare April 22, 2025 11:38
@sopel39 sopel39 marked this pull request as ready for review April 22, 2025 14:33
@sopel39 sopel39 requested a review from raunaqmorarka April 22, 2025 14:33
@sopel39 (Member, Author) left a comment:

addressed comments

Comment thread lib/trino-parquet/src/main/java/io/trino/parquet/metadata/BlockMetadata.java Outdated
Comment thread lib/trino-parquet/src/main/java/io/trino/parquet/predicate/PredicateUtils.java Outdated
Comment thread lib/trino-parquet/src/main/java/io/trino/parquet/predicate/PredicateUtils.java Outdated
@sopel39 (Member, Author) commented Sep 22, 2025

addressed comments

@raunaqmorarka (Member) left a comment:

lgtm % comments

Comment thread lib/trino-parquet/src/main/java/io/trino/parquet/crypto/ParquetCipher.java Outdated
@yunrougong commented:

Hi, is there a timeline for releasing this feature?

@wendigo (Contributor) commented Sep 22, 2025

@yunrougong given that this PR is far from being merged, there is no ETA

@sopel39 (Member, Author) commented Sep 23, 2025

> Hi, is there a timeline for releasing this feature?

The code is already used in production. Please ping me if you are interested in PME (Parquet Modular Encryption).

@sopel39 (Member, Author) commented Sep 30, 2025

@raunaqmorarka applied comments

@raunaqmorarka (Member) left a comment:

minor comments

Comment thread plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/util/ParquetUtil.java Outdated
Comment thread docs/src/main/sphinx/connector/hive.md Outdated
Comment thread plugin/trino-hive/src/test/java/io/trino/plugin/hive/HiveQueryRunner.java Outdated
@raunaqmorarka (Member) left a comment:

nice work

@sopel39 sopel39 merged commit 51087fb into trinodb:master Oct 1, 2025
137 of 138 checks passed
@sopel39 sopel39 deleted the ks/pme branch October 1, 2025 08:21
@github-actions github-actions Bot added this to the 478 milestone Oct 1, 2025

Labels

cla-signed, delta-lake, docs, hive, hudi, iceberg, lakehouse, redshift, stale-ignore

Development

Successfully merging this pull request may close these issues.

Trino Parquet Column Encryption

7 participants