Add added_snapshot_id column to Iceberg $files table #28911
Pull request overview
Adds partition-level delete-file metrics to Trino’s Iceberg $partitions system table to make it easy to identify partitions accumulating position/equality deletes without scanning $files (Fixes #28910).
Changes:
- Extend `IcebergStatistics` to track position/equality delete file and record counts.
- Update `PartitionsTable` to populate the new metrics by iterating `FileScanTask.deletes()` (with deduplication).
- Update and add tests to validate the new `$partitions` columns and shifted column indexes.
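
The aggregation described in the list above can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request: `Content`, `DeleteFileInfo`, `ScanTask`, and `DeleteMetrics` are hypothetical stand-ins for the Iceberg and Trino types, and deduplicating by file path is an assumption about how "with deduplication" is implemented.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeleteMetricsSketch
{
    // Hypothetical stand-in for the delete file content type
    enum Content { POSITION_DELETES, EQUALITY_DELETES }

    // Hypothetical stand-in for a delete file attached to a scan task
    record DeleteFileInfo(String path, Content content, long recordCount) {}

    // Hypothetical stand-in for a scan task: one data file plus its delete files
    record ScanTask(String dataFilePath, List<DeleteFileInfo> deletes) {}

    record DeleteMetrics(
            long positionDeleteFiles,
            long positionDeleteRecords,
            long equalityDeleteFiles,
            long equalityDeleteRecords) {}

    // Aggregates delete metrics for the scan tasks of a single partition.
    // The same delete file can be attached to several data files, so each
    // path is counted only once (assumption: the path identifies the file).
    static DeleteMetrics aggregate(List<ScanTask> tasksForPartition)
    {
        Set<String> seenPaths = new HashSet<>();
        long positionFiles = 0;
        long positionRecords = 0;
        long equalityFiles = 0;
        long equalityRecords = 0;
        for (ScanTask task : tasksForPartition) {
            for (DeleteFileInfo delete : task.deletes()) {
                if (!seenPaths.add(delete.path())) {
                    continue; // already counted via another data file
                }
                switch (delete.content()) {
                    case POSITION_DELETES -> {
                        positionFiles++;
                        positionRecords += delete.recordCount();
                    }
                    case EQUALITY_DELETES -> {
                        equalityFiles++;
                        equalityRecords += delete.recordCount();
                    }
                }
            }
        }
        return new DeleteMetrics(positionFiles, positionRecords, equalityFiles, equalityRecords);
    }
}
```

Without the `seenPaths` check, a delete file that applies to several data files in the same partition would be counted once per data file, inflating both the file and record counts.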
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergStatistics.java | Adds delete-file counters + builder ingestion method. |
| plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/system/PartitionsTable.java | Adds new $partitions columns and aggregates delete metrics from scan tasks. |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergV2.java | Adds coverage for new delete metric columns (position + equality + OPTIMIZE reset). |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergSystemTables.java | Updates $partitions schema assertions and field indexes. |
| plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java | Updates column list expectation for $partitions. |
Force-pushed from 63810f9 to aa9a341.
ebyhr left a comment:
Iceberg: Add delete file metrics to $partitions metadata table
We don't use the "Iceberg: " prefix in commit titles. Please follow https://trino.io/development/process#pull-request-and-commit-guidelines
The "Changes: ..." section in the commit body isn't very helpful. You can remove it.
Force-pushed from 5f3a572 to f02f75a.
@ebyhr I have addressed the review comments. Could you please review again?
@kaveti Is the failure related to your changes? https://github.com/trinodb/trino/actions/runs/23729005130/job/69119458673?pr=28911
Force-pushed from f02f75a to 46cbb07.
Add testFilesTableDeleteFileDeduplication to BaseIcebergSystemTables that verifies the $files table shows each delete file exactly once, with no duplicate entries from FileScanTask expansion. Follow-up to trinodb#28911 as requested by findinpath.
Deduplicate files by path in FilesTablePageSource since the same delete file can appear multiple times. Add a HashSet to track seen file paths and skip duplicates. Add testFilesTableDeleteFileDeduplication to BaseIcebergSystemTables that verifies the $files table shows each delete file exactly once. Follow-up to trinodb#28911 as requested by findinpath.
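
The path-based deduplication described in the commit message above can be sketched roughly as follows. This is an illustration only, not the actual `FilesTablePageSource` change: `FileEntry` and `deduplicateByPath` are hypothetical names, and the sketch assumes the file path alone identifies an entry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilesDedupSketch
{
    // Hypothetical stand-in for one row of the $files table
    record FileEntry(String path, long recordCount) {}

    // Keeps the first entry seen for each path and skips later repeats,
    // preserving the original order of the remaining entries.
    static List<FileEntry> deduplicateByPath(Iterable<FileEntry> entries)
    {
        Set<String> seenPaths = new HashSet<>();
        List<FileEntry> result = new ArrayList<>();
        for (FileEntry entry : entries) {
            // Set.add returns false when the path was already present
            if (seenPaths.add(entry.path())) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

Deduplicating on the path alone would also collapse entries that legitimately share a path, so this approach only fits cases where one path corresponds to one logical file.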
Add testFilesTableDeleteFileDeduplication to BaseIcebergSystemTables that verifies the $files table shows each delete file exactly once, with no duplicate entries (v2 position + equality deletes). Add testFilesTableDeletionVectors that verifies v3 deletion vector behavior: multiple DV entries share the same Puffin file_path in the $files table. Currently there are no content_offset/content_size_in_bytes columns to distinguish individual DVs within the shared Puffin file. Follow-up to trinodb#28911 as requested by findinpath.
Force-pushed from 547d08e to 199d45a.
Force-pushed from 39d4fb8 to acd4f2d.
Force-pushed from acd4f2d to 1b2094c.
@chenjian2664 I have addressed your comments. Thank you.
Force-pushed from 1b2094c to 3bee45e.
Force-pushed from 3bee45e to 93827a1.
Please follow https://trino.io/development/process#release-note-guidelines
Summary
- `added_snapshot_id` to the Iceberg `$files` system table.
- `added_snapshot_id` from live manifest entry snapshot metadata.
- `$files` schema additions introduced after this branch diverged.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:
Fixes #28910