Upgrade to Hive 4.0.1 #299

Merged: tdcmeehan merged 1 commit into prestodb:master from imjalpreet:hive4-upgrade on Apr 22, 2026

@imjalpreet commented Apr 17, 2026

Required for prestodb/presto#24571

The changes have already been tested with CI in the above-linked PR.

sourcery-ai Bot commented Apr 17, 2026

Reviewer's Guide

Upgrades the project to Hive 4.0.1 and Thrift 0.16.0, and updates the HiveThriftClient to use the newer Hive metastore API and a safer TSocket initialization pattern compatible with the upgraded libraries.

Sequence diagram for updated setStatistics Hive metastore interaction

sequenceDiagram
    actor TestRunner
    participant HiveThriftClient
    participant ThriftHiveMetastoreClient

    TestRunner->>HiveThriftClient: setStatistics(tableName, tableStatistics)
    HiveThriftClient->>HiveThriftClient: getSchema(tableName)
    HiveThriftClient->>GetTableRequest: new GetTableRequest(schema, schemalessName)
    HiveThriftClient->>ThriftHiveMetastoreClient: get_table_req(getTableRequest)
    ThriftHiveMetastoreClient-->>HiveThriftClient: GetTableResult(table)
    HiveThriftClient->>HiveThriftClient: setRowsCount(tableName, tableStatistics, table)
    HiveThriftClient->>HiveThriftClient: setColumnStatistics(tableName, tableStatistics, table, predicate)
    HiveThriftClient-->>TestRunner: return
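
The lookup in the diagram above can be sketched in Java. This is a minimal illustration, not the code from the PR: `Table`, `GetTableRequest`, `GetTableResult`, and `MetastoreClient` below are simplified stand-ins for the Hive metastore classes (so the block compiles without Hive on the classpath), and `fetchTable` is a hypothetical helper name.

```java
// Stand-in types: simplified, hypothetical mirrors of the Hive metastore
// classes named in the diagram. They are NOT the real Hive classes.
class Table {
    final String dbName;
    final String tableName;

    Table(String dbName, String tableName) {
        this.dbName = dbName;
        this.tableName = tableName;
    }
}

class GetTableRequest {
    final String dbName;
    final String tblName;

    GetTableRequest(String dbName, String tblName) {
        this.dbName = dbName;
        this.tblName = tblName;
    }
}

class GetTableResult {
    private final Table table;

    GetTableResult(Table table) {
        this.table = table;
    }

    Table getTable() {
        return table;
    }
}

// Hypothetical slice of the metastore client: only the request-based lookup.
interface MetastoreClient {
    GetTableResult get_table_req(GetTableRequest request) throws Exception;
}

class GetTableSketch {
    // Builds the request object and unwraps the result in one place.
    static Table fetchTable(MetastoreClient client, String schema, String name) throws Exception {
        GetTableRequest request = new GetTableRequest(schema, name);
        return client.get_table_req(request).getTable();
    }

    public static void main(String[] args) throws Exception {
        // In-memory fake standing in for a real metastore connection.
        MetastoreClient fake =
                request -> new GetTableResult(new Table(request.dbName, request.tblName));
        Table table = fetchTable(fake, "default", "nation");
        System.out.println(table.dbName + "." + table.tableName); // prints "default.nation"
    }
}
```

The point of the request object is that the call site passes one value instead of positional arguments, so any extra request fields a later Hive version needs can be set on the request without changing the method signature.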

Sequence diagram for updated HiveThriftClient constructor with TSocket initialization

sequenceDiagram
    actor TestRunner
    participant HiveThriftClient
    participant TSocket

    TestRunner->>HiveThriftClient: new HiveThriftClient(thriftHost, thriftPort)
    activate HiveThriftClient
    HiveThriftClient->>TSocket: new TSocket(thriftHost, thriftPort)
    HiveThriftClient->>TSocket: open()
    TSocket-->>HiveThriftClient: success
    HiveThriftClient-->>TestRunner: instance created
    deactivate HiveThriftClient
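
A rough Java sketch of the constructor flow in the diagram above follows. It rests on assumptions: `SketchSocket` and `SketchTransportException` are hypothetical stand-ins for Thrift's `TSocket` and `TTransportException`, the stand-in constructor is made to throw in order to mimic a socket whose construction can itself fail, and wrapping the failure in a `RuntimeException` is one possible handling choice rather than what the PR necessarily does.

```java
// Hypothetical stand-in for Thrift's checked TTransportException.
class SketchTransportException extends Exception {
    SketchTransportException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for TSocket. Assumption: construction itself may
// throw, which is why it belongs inside the try block.
class SketchSocket {
    private boolean open;

    SketchSocket(String host, int port) throws SketchTransportException {
        if (port < 1 || port > 65535) {
            throw new SketchTransportException("invalid port: " + port);
        }
    }

    void open() throws SketchTransportException {
        open = true;
    }

    boolean isOpen() {
        return open;
    }
}

class ClientConstructorSketch {
    private SketchSocket transport;

    ClientConstructorSketch(String host, int port) {
        try {
            // Construction and open() share one try block, so a failure in
            // either step is reported the same way.
            transport = new SketchSocket(host, port);
            transport.open();
        } catch (SketchTransportException e) {
            throw new RuntimeException("Could not connect to " + host + ":" + port, e);
        }
    }

    boolean isConnected() {
        return transport != null && transport.isOpen();
    }

    public static void main(String[] args) {
        ClientConstructorSketch client = new ClientConstructorSketch("localhost", 9083);
        System.out.println(client.isConnected()); // prints "true"
    }
}
```

Because the constructor rethrows, a caller never receives a half-built client in this sketch; a failed construction surfaces as an exception instead.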

Class diagram for updated HiveThriftClient with new Hive metastore API usage

classDiagram
    class HiveThriftClient {
        - TTransport transport
        - ThriftHiveMetastoreClient client
        + HiveThriftClient(thriftHost, thriftPort)
        + setStatistics(tableName, tableStatistics) void
    }

    class TableName {
        + getSchemalessNameInDatabase() String
    }

    class TableStatistics {
    }

    class GetTableRequest {
        + GetTableRequest(schemaName, tableName)
    }

    class Table {
    }

    HiveThriftClient --> ThriftHiveMetastoreClient : uses
    HiveThriftClient --> TableName : parameter
    HiveThriftClient --> TableStatistics : parameter
    HiveThriftClient --> GetTableRequest : constructs
    HiveThriftClient --> Table : retrieves

File-Level Changes

Change: Update HiveThriftClient to be compatible with the Hive 4 metastore API and newer Thrift behavior.
Details:
  • Move TSocket construction into the try block before opening the transport to better handle TTransportException under newer Thrift versions.
  • Replace usage of client.get_table(schema, tableName) with construction of a GetTableRequest and retrieval via client.get_table_req(getTableRequest).getTable() to match the Hive 4 metastore API.
Files: tempto-core/src/main/java/io/prestodb/tempto/internal/fulfillment/table/hive/HiveThriftClient.java

Change: Upgrade build dependencies for Hive and Thrift.
Details:
  • Bump Hive library version from 3.0.0-2 to 4.0.1-1.
  • Bump Thrift library version from 0.9.3 to 0.16.0.
Files: build.gradle

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • By moving transport = new TSocket(thriftHost, thriftPort); inside the try block, transport can now remain null if new TSocket throws, which may change downstream behavior (e.g., in close() or other usages); consider either keeping the assignment outside the try or adding explicit null handling where transport is used.
  • With the switch from get_table to get_table_req, it may be helpful to add a small helper method to construct and invoke the GetTableRequest so the call site remains simple and consistent if additional request parameters or options need to be set in future Hive upgrades.
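
The explicit null handling mentioned in the first point could look roughly like the sketch below. It is illustrative only: `NullSafeCloseSketch` is a hypothetical class, and `java.io.Closeable` stands in for the Thrift transport type so the block needs nothing beyond the JDK. Whether such a guard is needed at all depends on how the constructor reports failure; if it rethrows, no instance exists on which `close()` could be called.

```java
import java.io.Closeable;
import java.io.IOException;

class NullSafeCloseSketch implements Closeable {
    // Stand-in for the transport field. It may be null if construction of
    // the underlying socket failed and the failure was swallowed.
    private final Closeable transport;

    NullSafeCloseSketch(Closeable transport) {
        this.transport = transport;
    }

    @Override
    public void close() throws IOException {
        // Guard against a transport that was never created.
        if (transport != null) {
            transport.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Closing a client whose transport was never set is a no-op.
        new NullSafeCloseSketch(null).close();

        // Closing a client with a live transport delegates to it.
        new NullSafeCloseSketch(() -> System.out.println("transport closed")).close();
    }
}
```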

@nmahadevuni

@imjalpreet Looks good. Did you verify by running any tempto tests?

@imjalpreet

@nmahadevuni, yes, as I mentioned in the PR description, all the changes have been validated in the linked presto PR.

@tdcmeehan merged commit 34ab163 into prestodb:master on Apr 22, 2026
3 checks passed