Upgrade Hive client to 2.3.9 and HIVE platform JDK toolchain to 17 #177
Draft
YogeshKothari26 wants to merge 2 commits into
Conversation
…ain (T2)
Consolidated branch for testing — DO NOT MERGE. Will be split into separate
PRs once integration testing scope is agreed with the team.
T1 — Hive bump (purely compile-time, runtime jar byte-identical):
- transportable-udfs-hive/build.gradle: hive-exec 1.2.2 -> 2.3.9 (compileOnly +
testImplementation), add `exclude group: 'org.pentaho'` to drop the
unresolvable pentaho-aggdesigner-algorithm transitive.
- transportable-udfs-test/transportable-udfs-test-hive/build.gradle: same
hive-exec + hive-service bumps and pentaho exclusions.
- transportable-udfs-plugin/build.gradle: version-info.properties hive-version
1.2.2 -> 2.3.9, so consumers of the plugin compile against Hive 2.3.9.
- Defaults.java: add `.exclude("org.pentaho")` to the consumer-facing Hive
compileOnly DependencyConfiguration.
- HiveTester.java: two API/runtime fixes —
  * `new FunctionInfo(false, name, wrapper)` ->
    `new FunctionInfo(FunctionInfo.FunctionType.PERSISTENT, name, wrapper)`;
    Hive 2.x dropped the boolean-isNative ctor.
* Disable `METASTORE_SCHEMA_VERIFICATION` and enable
`datanucleus.schema.autoCreateAll` on HiveConf. Hive 2.3.x's embedded
Derby strictly verifies schema version on startup and fails with
`MetaException: Version information not found in metastore` otherwise.
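The T1 dependency changes above can be summarized as a Gradle fragment. This is an illustrative sketch, not the actual diff: the coordinates and versions are the ones named in the bullets, but the surrounding configuration layout is assumed.

```groovy
// transportable-udfs-hive/build.gradle (sketch): bump hive-exec and drop the
// unresolvable pentaho transitive in both configurations.
dependencies {
    compileOnly('org.apache.hive:hive-exec:2.3.9') {
        // pentaho-aggdesigner-algorithm:5.1.5-jhyde is not resolvable on
        // Maven Central, so exclude the whole group
        exclude group: 'org.pentaho'
    }
    testImplementation('org.apache.hive:hive-exec:2.3.9') {
        exclude group: 'org.pentaho'
    }
}
```

The same bump-plus-exclusion pattern applies to `hive-service` in the test-hive subproject.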
T2 — Hive platform JDK 17 toolchain (consumer unblock):
- Defaults.java: HIVE platform JavaLanguageVersion.of(8) -> of(17). Lets
consumers build their compileHiveJava task with the JDK 17 toolchain.
- TransportPlugin.java: pin bytecode target to Java 8 (`options.release.set(8)`)
for non-Trino platforms, so consumer UDF jars remain runnable on Java 8 Hive
grid runtimes despite the JDK 17 toolchain. Trino is excluded because Trino
406+ requires Java 17 bytecode (sealed classes, records).
- TransportPlugin.java: when test launcher is JDK 17 for non-Trino platforms,
add `--add-opens` for java.base/{lang,lang.invoke,io,net,nio,util,
util.concurrent,sun.nio.ch,sun.security.action} so embedded HiveServer2 +
DataNucleus + Derby reflection works on JDK 17.
- build.gradle: same `--add-opens` set conditionally on the subprojects' own
test tasks when the build JVM itself is JDK 17.
Verified locally:
- ./gradlew build (JDK 8): SUCCESSFUL, 215 tasks
- ./gradlew -p transportable-udfs-examples clean build (JDK 8): 51/51 tests
pass (17 hiveTest + 17 trinoTest + 17 generic test)
- bytecode of produced UDF jars: Hive platform = class v52 (Java 8), Trino
platform = class v61 (Java 17), via javap -v
- jar-diff vs master: transportable-udfs-hive.jar same class set + same
byte-size 70239B (zip-timestamp drift only); transportable-udfs-trino.jar
same class set + same byte-size 47664B; test-hive.jar +263B (the 2 explicit
HiveTester fixes); plugin.jar +447B (T1 + T2 explicit changes only)
- transportable-udfs-hive:test on JDK 17 build JVM: passes with the
--add-opens flags
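The class-version checks above (`v52` for the Hive jar, `v61` for the Trino jar) can also be reproduced without `javap` by reading the class-file header directly; bytes 6-7 hold the major version (52 = Java 8, 61 = Java 17). A minimal self-contained sketch:

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassVersion {
    // Returns the class-file major version (52 = Java 8, 61 = Java 17).
    public static int majorVersion(byte[] classBytes) {
        // Bytes 0-3 are the 0xCAFEBABE magic; 4-5 minor; 6-7 major (big-endian).
        if ((classBytes[0] & 0xFF) != 0xCA || (classBytes[1] & 0xFF) != 0xFE) {
            throw new IllegalArgumentException("not a class file");
        }
        return ((classBytes[6] & 0xFF) << 8) | (classBytes[7] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        // Demo: inspect this class's own bytecode from the classpath.
        try (InputStream in =
                ClassVersion.class.getResourceAsStream("ClassVersion.class")) {
            System.out.println("major=" + majorVersion(in.readAllBytes()));
        }
    }
}
```

Pointing `majorVersion` at the entries of an unpacked UDF jar is a quick cross-check of the `options.release.set(8)` pinning.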
Known limitation (out of scope, fixed by drop-spark_2.11 follow-up):
- transportable-udfs-spark_2.11:compileScala fails on JDK 17 build JVM with
`MissingRequirementError: object java.lang.Object in compiler mirror not
found`. Pre-existing Scala 2.11 incompatibility with JDK 17, not introduced
by this branch. CI uses JDK 8 build JVM so unaffected.
…nsportable-udfs-hive
The previous commit added --add-opens to subprojects { test {} } in the top-level
build.gradle, which applied the JVM args to every subproject's test task even
though only transportable-udfs-hive:test loads Hive 2.3.9 reflection. Move the
block into transportable-udfs-hive/build.gradle so only that subproject gets the
flags, keeping scope minimal and aligned with the 11-MP unblock goal.
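The relocated block might look like the following sketch, assuming a Gradle version check guards the flags so JDK 8 builds are untouched (the exact condition used in the commit is not shown here):

```groovy
// transportable-udfs-hive/build.gradle (sketch): flags scoped to this
// subproject's test task only, and only when the build JVM is 17+.
if (JavaVersion.current().isCompatibleWith(JavaVersion.VERSION_17)) {
    test {
        jvmArgs '--add-opens=java.base/java.lang=ALL-UNNAMED'
        // ...remaining --add-opens entries as listed in the commit above
    }
}
```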
Verified:
- ./gradlew clean build (JDK 8): SUCCESSFUL, 215 tasks
- ./gradlew -p transportable-udfs-examples clean build -s (JDK 8): 51/51 tests
pass (hiveTest 17, trinoTest 17, generic test 17)
- ./gradlew :transportable-udfs-hive:test on JDK 17 build JVM: pass with the
local --add-opens block
Untouched: api / utils / codegen / compile-utils / type-system /
annotation-processor / trino / trino-plugin / spark_2.11 / spark_2.12 / all
other test-* subprojects. Only Hive paths are modified.
Summary
Bumps the plugin's pinned Hive client 1.2.2 → 2.3.9 and the HIVE platform's
JDK toolchain 8 → 17 in `Defaults.java`. Bytecode for the HIVE platform stays
at Java 8 (`options.release.set(8)`) so the produced consumer UDF jars remain
runnable on Java 8 runtimes.

Concrete changes:
- `Defaults.java`: HIVE platform `JavaLanguageVersion.of(8)` → `of(17)`; add
  `org.pentaho` exclusion to the consumer's `hive-exec` `compileOnly`
  configuration.
- `TransportPlugin.java`: pin bytecode `release = 8` for non-Trino Java
  platforms; add `--add-opens` JVM args to the test launcher when the
  platform's `JavaLanguageVersion` is >= 17 and the platform is non-Trino.
- `transportable-udfs-plugin/build.gradle`: `hive-version` 1.2.2 → 2.3.9 in
  the generated `version-info.properties`.
- `transportable-udfs-hive/build.gradle`: `hive-exec` 1.2.2 → 2.3.9
  (`compileOnly` + `testImplementation`); add `org.pentaho` exclusion; add
  `--add-opens` to the subproject's own test task when the build JVM is JDK 17.
- `transportable-udfs-test-hive/build.gradle`: `hive-exec`/`hive-service`
  1.2.2 → 2.3.9; add `org.pentaho` exclusion.
- `HiveTester.java`: two Hive 2.x compat fixes — replace the removed
  `FunctionInfo(boolean, String, GenericUDF)` ctor with
  `FunctionInfo(FunctionType.PERSISTENT, ...)`; disable
  `METASTORE_SCHEMA_VERIFICATION` and enable
  `datanucleus.schema.autoCreateAll` on the embedded `HiveConf` (Hive 2.3.x's
  embedded Derby strictly verifies schema version on startup).

Spark (`spark_2.11`, `spark_2.12`) and Trino subprojects are not changed.

Motivation
- Hive 1.2.2 transitively pulls
  `org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde`, which is not
  resolvable on Maven Central; downstream builds that don't add an explicit
  exclusion fail to resolve. Hive 2.3.9 is excludable cleanly.
- Hive 1.2.2's embedded HiveServer2 + DataNucleus + Derby reflection paths
  fail under JDK 17. Hive 2.3.9 is JDK 17-friendly with the standard
  `--add-opens` flags applied here.

Testing
- `./gradlew build` on JDK 8 build JVM — BUILD SUCCESSFUL, 215 actionable
  tasks.
- `./gradlew -p transportable-udfs-examples clean build` on JDK 8 build JVM —
  51/51 example tests pass (17 `hiveTest` on JDK 17 launcher, 17 `trinoTest`,
  17 generic).
- `./gradlew :transportable-udfs-hive:test` on JDK 17 build JVM — PASS with
  the `--add-opens` block in the subproject's test task.
- Hive platform UDF jar bytecode stays at `v52` (Java 8 compatible) via
  `javap -v`. Trino wrapper output unchanged at `v61` (Java 17).
- jar-diff vs `master`: `transportable-udfs-hive.jar` and
  `transportable-udfs-trino.jar` are byte-identical (same SHA-256 + class
  set).