ci: scope Spark SQL trigger paths to per-version shims and diff#4415

Merged
mbutrovich merged 1 commit into
apache:mainfrom
andygrove:ci/tighten-spark-sql-triggers
May 26, 2026

@andygrove
Member

Summary

Tighten the path allow-list on each spark_sql_test_<v>.yml so a change confined to one Spark version no longer fans out to the others.

Each workflow currently triggers on spark/src/main/** and dev/diffs/**. Today that means:

  • editing spark/src/main/spark-4.1/ runs the 3.5 and 4.0 workflows for no reason
  • editing dev/diffs/3.4.3.diff runs the 4.0 workflow for no reason
  • editing dev/diffs/iceberg/*.diff runs all four Spark SQL workflows for no reason
  • if we ever add a spark/src/main/spark-4.3/ shim it will run the 3.x workflows for no reason

The change is purely to the paths: filters; job logic is untouched.

Per-version mapping (from pom.xml)

| Profile | Shim dirs that apply | Diff |
| --- | --- | --- |
| spark-3.4 | spark-3.4, spark-3.x | dev/diffs/3.4.3.diff |
| spark-3.5 | spark-3.5, spark-3.x | dev/diffs/3.5.8.diff |
| spark-4.0 | spark-4.0, spark-4.x | dev/diffs/4.0.2.diff |
| spark-4.1 | spark-4.1, spark-4.x | dev/diffs/4.1.1.diff |

Each workflow keeps spark/src/main/** (so unrelated Java/Scala/resources still trigger) but adds !-exclusions for the shim dirs that don't apply, and replaces dev/diffs/** with the single applicable diff file.
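
As a rough illustration of the shape described above, the filter for the 4.0 workflow might look something like the sketch below. This is an assumption built from the mapping table and the directory names mentioned in this PR, not a copy of the merged file: the exact exclusion list, its ordering, and any other entries (branch filters, the workflow file's own path, and so on) may differ.

```yaml
# Hypothetical sketch of the trigger filters for the spark-4.0 workflow.
# Only the path patterns named in this PR are shown; everything else is omitted.
on:
  push:
    paths:
      # Keep the broad match so shared Java/Scala/resources still trigger.
      - 'spark/src/main/**'
      # Exclude shim dirs that do not apply to the spark-4.0 profile.
      # Negated patterns must be quoted, since a leading "!" is YAML tag syntax.
      - '!spark/src/main/spark-3.4/**'
      - '!spark/src/main/spark-3.5/**'
      - '!spark/src/main/spark-3.x/**'
      - '!spark/src/main/spark-4.1/**'
      - '!spark/src/main/spark-4.2/**'
      # Only the diff this workflow applies, instead of dev/diffs/**.
      - 'dev/diffs/4.0.2.diff'
  pull_request:
    paths:
      - 'spark/src/main/**'
      - '!spark/src/main/spark-3.4/**'
      - '!spark/src/main/spark-3.5/**'
      - '!spark/src/main/spark-3.x/**'
      - '!spark/src/main/spark-4.1/**'
      - '!spark/src/main/spark-4.2/**'
      - 'dev/diffs/4.0.2.diff'
```

The negated entries come after the broad `spark/src/main/**` entry because GitHub Actions evaluates path patterns in order, and a later matching negative pattern excludes a path that an earlier positive pattern included.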

Both push: and pull_request: filters are updated where present (3.4 and 4.1 only have push: paths, since their pull_request trigger is label-based).

Test plan

  • actionlint passes on all four workflow files
  • After merge, edit only spark/src/main/spark-4.1/... in a follow-up PR and confirm only the 4.1 workflow is queued
  • After merge, edit only dev/diffs/3.5.8.diff and confirm only the 3.5 workflow is queued
  • After merge, edit only dev/diffs/iceberg/1.10.0.diff and confirm none of the Spark SQL workflows are queued

Each spark_sql_test_<v>.yml currently triggers on `spark/src/main/**`
and `dev/diffs/**`, so a change confined to one Spark version's shim
or diff still fans out and runs the other versions' jobs.

Tighten the path allow-list per version:
- exclude unrelated `spark/src/main/spark-3.4/`, `spark-3.5/`, `spark-3.x/`,
  `spark-4.0/`, `spark-4.1/`, `spark-4.2/`, `spark-4.x/` directories so a
  3.4-only shim edit never fires the 4.x workflows, a 4.1-only shim edit
  never fires the 3.x workflows, and a future `spark-4.3/` shim won't
  trigger any 3.x workflow either
- replace `dev/diffs/**` with the single `dev/diffs/<full-version>.diff`
  the workflow actually applies, which also stops `dev/diffs/iceberg/`
  edits from triggering the Spark SQL test workflows
Contributor

@mbutrovich mbutrovich left a comment

I think this looks right to me, thanks @andygrove!

@mbutrovich mbutrovich merged commit d7c5e38 into apache:main May 26, 2026
32 of 33 checks passed
2 participants