
Upgrade Hive client to 2.3.9 and HIVE platform JDK toolchain to 17#177

Draft
YogeshKothari26 wants to merge 2 commits into linkedin:master from YogeshKothari26:yokothar/transport-hive-2.3.9-jdk17

Conversation


@YogeshKothari26 YogeshKothari26 commented May 11, 2026

Summary

Bumps the plugin's pinned Hive client 1.2.2 → 2.3.9 and the HIVE platform's JDK toolchain 8 → 17 in Defaults.java. Bytecode for the HIVE platform stays at Java 8 (options.release.set(8)) so the produced consumer UDF jars remain runnable on Java 8 runtimes.

Concrete changes:

  • Defaults.java: HIVE platform JavaLanguageVersion.of(8) → of(17) + add org.pentaho exclusion to the consumer's hive-exec compileOnly configuration.
  • TransportPlugin.java: pin bytecode release = 8 for non-Trino Java platforms; add --add-opens JVM args to the test launcher when the platform's JLV is >= 17 and the platform is non-Trino.
  • transportable-udfs-plugin/build.gradle: hive-version 1.2.2 → 2.3.9 in the generated version-info.properties.
  • transportable-udfs-hive/build.gradle: hive-exec 1.2.2 → 2.3.9 (compileOnly + testImplementation), add org.pentaho exclusion, add --add-opens to the subproject's own test task when the build JVM is JDK 17.
  • transportable-udfs-test-hive/build.gradle: hive-exec / hive-service 1.2.2 → 2.3.9, add org.pentaho exclusion.
  • HiveTester.java: two Hive 2.x compat fixes — replace the removed FunctionInfo(boolean, String, GenericUDF) ctor with FunctionInfo(FunctionType.PERSISTENT, ...); disable METASTORE_SCHEMA_VERIFICATION and enable datanucleus.schema.autoCreateAll on the embedded HiveConf (Hive 2.3.x's embedded Derby strictly verifies schema version on startup).

Spark (spark_2.11, spark_2.12) and Trino subprojects are not changed.
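
The toolchain/bytecode split described above can be sketched in Gradle terms. This is a minimal illustration, not the plugin's actual code: the task name `compileHiveJava` is taken from the description, but the real wiring lives in TransportPlugin.java and Defaults.java.

```groovy
// Illustrative sketch: compile the HIVE platform sources with a JDK 17
// toolchain while still emitting Java 8 bytecode (class file v52).
tasks.named('compileHiveJava', JavaCompile) {
    // Use the JDK 17 toolchain compiler (mirrors JavaLanguageVersion.of(17)
    // in Defaults.java)...
    javaCompiler = javaToolchains.compilerFor {
        languageVersion = JavaLanguageVersion.of(17)
    }
    // ...but pin the bytecode target so consumer UDF jars stay runnable on
    // Java 8 Hive grid runtimes. Trino platforms are exempt from this pin.
    options.release.set(8)
}
```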

Motivation

  • Hive 1.2.2 transitively pulls org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde, which is not resolvable on Maven Central; downstream builds that do not add an explicit exclusion fail dependency resolution. With Hive 2.3.9 the dependency can be excluded cleanly.
  • Hive 1.2.2's embedded HiveServer2 + DataNucleus + Derby reflection paths fail under JDK 17. Hive 2.3.9 runs on JDK 17 with the standard --add-opens flags applied here.
  • Together, these changes let downstream UDF projects move their build JVM to Java 17 without per-project workaround patches.

Testing

  • ./gradlew build on JDK 8 build JVM — BUILD SUCCESSFUL, 215 actionable tasks.
  • ./gradlew -p transportable-udfs-examples clean build on JDK 8 build JVM — 51 / 51 example tests pass (17 hiveTest on JDK 17 launcher, 17 trinoTest, 17 generic).
  • ./gradlew :transportable-udfs-hive:test on JDK 17 build JVM — PASS with the --add-opens block in the subproject's test task.
  • Bytecode of produced Hive UDF wrappers confirmed as class file v52 (Java 8 compatible) via javap -v. Trino wrapper output unchanged at v61 (Java 17).
  • JAR-diff vs master: transportable-udfs-hive.jar and transportable-udfs-trino.jar are byte-identical (same SHA-256 + class set).

…ain (T2)

Consolidated branch for testing — DO NOT MERGE. Will be split into separate
PRs once integration testing scope is agreed with the team.

T1 — Hive bump (purely compile-time, runtime jar byte-identical):
- transportable-udfs-hive/build.gradle: hive-exec 1.2.2 -> 2.3.9 (compileOnly +
  testImplementation), add `exclude group: 'org.pentaho'` to drop the
  unresolvable pentaho-aggdesigner-algorithm transitive.
- transportable-udfs-test/transportable-udfs-test-hive/build.gradle: same
  hive-exec + hive-service bumps and pentaho exclusions.
- transportable-udfs-plugin/build.gradle: version-info.properties hive-version
  1.2.2 -> 2.3.9, so consumers of the plugin compile against Hive 2.3.9.
- Defaults.java: add `.exclude("org.pentaho")` to the consumer-facing Hive
  compileOnly DependencyConfiguration.
- HiveTester.java: two API/runtime fixes —
  * `new FunctionInfo(false, name, wrapper)` ->
    `FunctionInfo.FunctionType.PERSISTENT`. Hive 2.x dropped the
    boolean-isNative ctor.
  * Disable `METASTORE_SCHEMA_VERIFICATION` and enable
    `datanucleus.schema.autoCreateAll` on HiveConf. Hive 2.3.x's embedded
    Derby strictly verifies schema version on startup and fails with
    `MetaException: Version information not found in metastore` otherwise.
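
The dependency side of T1 can be sketched as the following build-script fragment. This is a hedged sketch assuming the standard Gradle `dependencies` block shape; the exact configuration names in the subproject's build.gradle are as described above.

```groovy
// Sketch of the hive-exec bump with the org.pentaho exclusion, which drops
// the unresolvable pentaho-aggdesigner-algorithm:5.1.5-jhyde transitive.
dependencies {
    compileOnly('org.apache.hive:hive-exec:2.3.9') {
        exclude group: 'org.pentaho'
    }
    testImplementation('org.apache.hive:hive-exec:2.3.9') {
        exclude group: 'org.pentaho'
    }
}
```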

T2 — Hive platform JDK 17 toolchain (consumer unblock):
- Defaults.java: HIVE platform JavaLanguageVersion.of(8) -> of(17). Lets
  consumers build their compileHiveJava task with the JDK 17 toolchain.
- TransportPlugin.java: pin bytecode target to Java 8 (`options.release.set(8)`)
  for non-Trino platforms, so consumer UDF jars remain runnable on Java 8 Hive
  grid runtimes despite the JDK 17 toolchain. Trino is excluded because Trino
  406+ requires Java 17 bytecode (sealed classes, records).
- TransportPlugin.java: when test launcher is JDK 17 for non-Trino platforms,
  add `--add-opens` for java.base/{lang,lang.invoke,io,net,nio,util,
  util.concurrent,sun.nio.ch,sun.security.action} so embedded HiveServer2 +
  DataNucleus + Derby reflection works on JDK 17.
- build.gradle: same `--add-opens` set conditionally on the subprojects' own
  test tasks when the build JVM itself is JDK 17.
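
The conditional --add-opens wiring can be sketched roughly as below. The module list mirrors the packages named above; the `JavaVersion.current()` guard is an illustration of the "build JVM is JDK 17" condition, while the plugin-side variant keys off the platform's configured JavaLanguageVersion instead.

```groovy
// Sketch: open the java.base packages that embedded HiveServer2 +
// DataNucleus + Derby reflect into, only when running on JDK 17+.
tasks.withType(Test).configureEach {
    if (JavaVersion.current() >= JavaVersion.VERSION_17) {
        jvmArgs(['java.lang', 'java.lang.invoke', 'java.io', 'java.net',
                 'java.nio', 'java.util', 'java.util.concurrent',
                 'sun.nio.ch', 'sun.security.action']
                .collect { "--add-opens=java.base/${it}=ALL-UNNAMED" })
    }
}
```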

Verified locally:
- ./gradlew build (JDK 8): SUCCESSFUL, 215 tasks
- ./gradlew -p transportable-udfs-examples clean build (JDK 8): 51/51 tests
  pass (17 hiveTest + 17 trinoTest + 17 generic test)
- bytecode of produced UDF jars: Hive platform = class v52 (Java 8), Trino
  platform = class v61 (Java 17), via javap -v
- jar-diff vs master: transportable-udfs-hive.jar same class set + same
  byte-size 70239B (zip-timestamp drift only); transportable-udfs-trino.jar
  same class set + same byte-size 47664B; test-hive.jar +263B (the 2 explicit
  HiveTester fixes); plugin.jar +447B (T1 + T2 explicit changes only)
- transportable-udfs-hive:test on JDK 17 build JVM: passes with the
  --add-opens flags

Known limitation (out of scope, fixed by drop-spark_2.11 follow-up):
- transportable-udfs-spark_2.11:compileScala fails on JDK 17 build JVM with
  `MissingRequirementError: object java.lang.Object in compiler mirror not
  found`. Pre-existing Scala 2.11 incompatibility with JDK 17, not introduced
  by this branch. CI uses JDK 8 build JVM so unaffected.
…nsportable-udfs-hive

The previous commit added --add-opens to subprojects { test {} } in the top-level
build.gradle, which applied the JVM args to every subproject's test task even
though only transportable-udfs-hive:test loads Hive 2.3.9 reflection. Move the
block into transportable-udfs-hive/build.gradle so only that subproject gets the
flags, keeping scope minimal and aligned with the 11-MP unblock goal.

Verified:
- ./gradlew clean build (JDK 8): SUCCESSFUL, 215 tasks
- ./gradlew -p transportable-udfs-examples clean build -s (JDK 8): 51/51 tests
  pass (hiveTest 17, trinoTest 17, generic test 17)
- ./gradlew :transportable-udfs-hive:test on JDK 17 build JVM: pass with the
  local --add-opens block

Untouched: api / utils / codegen / compile-utils / type-system /
annotation-processor / trino / trino-plugin / spark_2.11 / spark_2.12 / all
other test-* subprojects. Only Hive paths are modified.