[#1805] feat(spark): Support Spark 4 in client-spark/extension #2749

Open
LuciferYang wants to merge 1 commit into apache:master from LuciferYang:spark4-extension-explore

@LuciferYang
Contributor

@LuciferYang LuciferYang commented May 12, 2026

What changes were proposed in this pull request?

Follow-up to #2748. That PR left client-spark/extension out of the spark4 profile because Spark 4 uses jakarta.servlet while Spark 3 uses javax.servlet, and WebUIPage.render(request) must override its parent signature exactly — so one .scala file can't serve both.

This PR adds two sibling source roots under client-spark/extension:

  • src/main/scala-javax/org/apache/spark/ui/ServletCompat.scala: type HttpServletRequest = javax.servlet.http.HttpServletRequest
  • src/main/scala-jakarta/org/apache/spark/ui/ServletCompat.scala: type HttpServletRequest = jakarta.servlet.http.HttpServletRequest

ShufflePage.scala imports ServletCompat.HttpServletRequest and overrides render with that alias. build-helper-maven-plugin picks one of the two roots via ${extension.servlet.source.dir}; a spark4 profile in this module's pom (merged with the root pom's spark4 profile by shared id) flips the property to the jakarta variant. ServletCompat is private[ui], so callers stay in org.apache.spark.ui.
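
The alias trick can be illustrated without Spark on the classpath. The sketch below is not the PR's code: JavaxApi, WebUIPageLike, ShufflePageLike, and AliasDemo are hypothetical stand-ins for the servlet API, Spark's WebUIPage, and ShufflePage, and the shape of ServletCompat as an object is an assumption. It only shows that a type alias is transparent to the compiler, so an override written against the alias matches the parent signature exactly.

```scala
// Stand-in for one servlet API. In the PR this role is played by
// javax.servlet.http.HttpServletRequest (Spark 3) or
// jakarta.servlet.http.HttpServletRequest (Spark 4).
object JavaxApi {
  final class HttpServletRequest(val path: String)
}

// Stand-in for Spark's WebUIPage, already compiled against one servlet API.
abstract class WebUIPageLike {
  def render(request: JavaxApi.HttpServletRequest): String
}

// Variant-specific file: the only definition that differs between source roots.
// (Assumed to be an object; the PR only states that it is private[ui].)
object ServletCompat {
  type HttpServletRequest = JavaxApi.HttpServletRequest
}

// Shared file: names only the alias, never a concrete servlet package.
class ShufflePageLike extends WebUIPageLike {
  import ServletCompat.HttpServletRequest

  // Compiles as a valid override because the alias expands to the parent's type.
  override def render(request: HttpServletRequest): String =
    s"rendered ${request.path}"
}

object AliasDemo {
  def main(args: Array[String]): Unit = {
    val page = new ShufflePageLike
    println(page.render(new JavaxApi.HttpServletRequest("/uniffle/"))) // prints: rendered /uniffle/
  }
}
```

Swapping the right-hand side of the alias for a different request class (with a matching parent) would leave ShufflePageLike untouched, which is the property the two source roots rely on.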

Alongside:

  • Add client-spark/extension to the root pom's spark4 profile <modules>.
  • Add rss-client-spark-ui dependency in client-spark/spark4/pom.xml (mirrors client-spark/spark3/pom.xml).
  • Bump build-helper-maven-plugin from 1.10 (2014) to 3.6.0.
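
For orientation, the source-root switch described above could be wired in the module pom roughly as follows. This is a sketch under assumptions, not the PR's actual pom: the execution id, the phase, the default property value, and whether the property holds the full relative path are all guesses. Only the property name, the plugin version, and the spark4 profile id come from the description.

```xml
<properties>
  <!-- Assumed default: the javax variant, used by every Spark 3 profile -->
  <extension.servlet.source.dir>src/main/scala-javax</extension.servlet.source.dir>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>build-helper-maven-plugin</artifactId>
      <version>3.6.0</version>
      <executions>
        <execution>
          <!-- Hypothetical execution id -->
          <id>add-servlet-compat-source</id>
          <phase>generate-sources</phase>
          <goals>
            <goal>add-source</goal>
          </goals>
          <configuration>
            <sources>
              <!-- Exactly one of the two sibling roots is added per build -->
              <source>${extension.servlet.source.dir}</source>
            </sources>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

<profiles>
  <profile>
    <!-- Same id as the root pom's profile, so -Pspark4 activates both -->
    <id>spark4</id>
    <properties>
      <extension.servlet.source.dir>src/main/scala-jakarta</extension.servlet.source.dir>
    </properties>
  </profile>
</profiles>
```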

Why are the changes needed?

Without this, Spark 4 users lose the Uniffle Shuffle tab, the History Server plugin, and the listener-driven status store. #2748 flagged this as a known limitation.

How was this patch tested?

mvn -pl client-spark/extension clean compile passes under every profile that includes the module: spark3, spark3.0, spark3.2, spark3.2.0, spark3.3, spark3.4, spark3.5, spark4.

Under -Pspark4, dependency:tree resolves spark-core_2.13:4.0.2, scala-library:2.13.16, and jakarta.servlet-api:5.0.0. Runtime validation on a live Spark 4 cluster is left as a follow-up.

@LuciferYang LuciferYang marked this pull request as draft May 13, 2026 00:00
Spark 4 switched the UI servlet API from javax.servlet (Jetty 9) to
jakarta.servlet (Jetty 11+). WebUIPage.render must override its parent
signature exactly, so the same .scala file cannot serve both.

Route extension through a ServletCompat type alias backed by two
version-specific source roots (scala-javax / scala-jakarta). The
build-helper-maven-plugin picks one via ${extension.servlet.source.dir},
switched by a spark4 profile in this module's pom that merges with the
root pom's spark4 profile.

Other changes:
- Add client-spark/extension to the root pom's spark4 profile modules.
- Add rss-client-spark-ui dependency to client-spark/spark4/pom.xml,
  mirroring client-spark/spark3/pom.xml.
- Bump build-helper-maven-plugin 1.10 -> 3.6.0.
- Contributor notes document new-variant and new-spark-version steps,
  IDE re-import on profile switch, and ~/.m2 hygiene.

Verified: extension compiles cleanly under spark3 / spark3.0 / spark3.2 /
spark3.2.0 / spark3.3 / spark3.4 / spark3.5 / spark4 profiles.
@LuciferYang LuciferYang force-pushed the spark4-extension-explore branch from aedf48a to 46a7143 Compare May 13, 2026 00:07
@LuciferYang LuciferYang marked this pull request as ready for review May 13, 2026 00:08
@github-actions

Test Results

 3 401 files  ±0   3 401 suites  ±0   7h 18m 43s ⏱️ +37s
 1 263 tests ±0   1 252 ✅ ±0  11 💤 ±0  0 ❌ ±0 
16 930 runs  ±0  16 904 ✅ ±0  26 💤 ±0  0 ❌ ±0 

Results for commit 46a7143. ± Comparison against base commit c18774d.

@LuciferYang
Contributor Author

cc @roryqi

@roryqi roryqi requested a review from zuston May 13, 2026 02:41
@roryqi
Contributor

roryqi commented May 13, 2026

@zuston Could you take a look?

@zuston
Member

zuston commented May 13, 2026

Great work! Have you tested the uniffle shuffle tab in spark4 history server or runtime? @LuciferYang

@LuciferYang
Contributor Author

LuciferYang commented May 13, 2026

Ran a local runtime smoke test — the Uniffle tab loads cleanly under Spark 4 with real shuffle traffic going through Uniffle.

Setup

  • Uniffle coordinator + shuffle server built from this PR (commit 46a71436) with mvn -Pspark4,hadoop-dependencies-included install -DskipTests
  • Local Spark 4.0.2 (scala-2.13, JDK 17) as driver
  • spark-shell --jars rss-client-spark4-shaded-*.jar,rss-client-spark-ui-*.jar --conf spark.plugins=org.apache.spark.UnifflePlugin --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager --conf spark.rss.coordinator.quorum=127.0.0.1:19999 --conf spark.rss.storage.type=MEMORY_LOCALFILE --conf spark.shuffle.sort.io.plugin.class=org.apache.spark.shuffle.RssShuffleDataIo --conf spark.serializer=org.apache.spark.serializer.KryoSerializer
  • Workload: sc.parallelize(1 to 50000, 4).map(i => (i % 20, i.toLong)).reduceByKey(_ + _).collect()

Results

GET http://localhost:4040/uniffle/ → HTTP 200, 15305 bytes.

The tab appears in the navbar:

<li class="nav-item active">
  <a class="nav-link" href="/uniffle/">Uniffle</a>
</li>

Summary block (stripped of HTML tags):

Total shuffle bytes:       343.7 KiB
CompressionRatio:          683.6 KiB / 343.7 KiB = 1.0
Shuffle Duration (W+R):    0.3 s  (0.3 s + 43 ms) / 0.8 s = 0.39
Client Observed Speed:     1.26 / 8.18 MB/s
Uniffle Speed:             3.01 / 21.99 MB/s
Reassign Status:           partitionSplit=false, blockSentFailure=false, stageRetry=false

Every collapsible section renders without error: Uniffle Build Information, Uniffle Properties, Shuffle Throughput Statistics, Hybrid Storage Read Statistics, Shuffle Server, Assignment, Shuffle Write Times, Shuffle Read Times, Shuffle Failures.

History Server

Not exercised in this run. The plugin class is registered via META-INF/services/org.apache.spark.status.AppHistoryServerPlugin → org.apache.spark.UniffleHistoryServerPlugin, which the Spark 4 History Server picks up through ServiceLoader exactly like Spark 3. Both paths attach the same ShuffleTab + UniffleStatusStore objects, so if the runtime UI renders, the History Server rendering should follow the same code path. Happy to follow up with an explicit HS run if you want one on the record.
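
Short of a full History Server run, the registration half of that claim can be checked on its own. The sketch below is a hypothetical helper (ProviderCheck and listProviders are not part of Uniffle or Spark) that lists the providers ServiceLoader finds for a service interface. It looks the interface up by name because AppHistoryServerPlugin is not accessible from user code, and it assumes Scala 2.13 for scala.jdk.CollectionConverters.

```scala
import java.util.ServiceLoader
import scala.jdk.CollectionConverters._

object ProviderCheck {
  // Returns the class names of every provider registered for the given
  // service interface on the current classpath. Throws ClassNotFoundException
  // if the interface itself is not on the classpath.
  def listProviders(serviceClassName: String): List[String] = {
    val service = Class.forName(serviceClassName).asInstanceOf[Class[AnyRef]]
    // Iterating instantiates each provider through its no-arg constructor.
    ServiceLoader.load(service).iterator().asScala.map(_.getClass.getName).toList
  }

  def main(args: Array[String]): Unit = {
    val name = args.headOption.getOrElse("org.apache.spark.status.AppHistoryServerPlugin")
    listProviders(name).foreach(println)
  }
}
```

Run in a JVM that has both Spark 4 and the rss-client-spark-ui jar on the classpath, the output would be expected to include org.apache.spark.UniffleHistoryServerPlugin if the META-INF/services entry is picked up. This confirms discovery only; it says nothing about whether the tab renders in the History Server.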

@LuciferYang
Contributor Author

@roryqi @zuston can we merge this one?
