Gate H100.8 CI workflows behind ciflow/h100.8 label #3016

Merged
SherlockNoMad merged 4 commits into main from h100.8 on Apr 20, 2026

SherlockNoMad (Contributor) commented Apr 17, 2026

Summary

H100.8 runners are scarce and the job queue is very long, blocking PRs. This PR gates H100 CI workflows behind a ciflow/h100.8 label so they only run on PRs when a maintainer explicitly requests it.

Changes to 2 workflows:

  • integration_test_8gpu_h100.yaml — PR trigger narrowed from [opened, synchronize, reopened, ready_for_review] to [labeled] only, with a job-level if checking for the ciflow/h100.8 label
  • integration_test_8gpu_graph_trainer_h100.yaml — same trigger change, replacing the draft-PR check with the ciflow/h100.8 label check

Trigger behavior after this PR:

| Event | h100.yaml | graph_trainer_h100.yaml |
| --- | --- | --- |
| Push to main (merge) | Runs (with paths-ignore: experiments/) | Runs (with paths: graph_trainer/) |
| Cron schedule | Runs (every 6h) | Runs (every 12h) |
| PR labeled ciflow/h100.8 | Runs if PR touches core (non-experiments) files | Runs if PR touches graph_trainer/ files |
| PR opened / pushed / reopened | Skipped | Skipped |
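
The h100.yaml column of the table could be expressed roughly as the workflow below. This is a minimal sketch, not the file from this PR: the workflow name, the `experiments/**` glob, the cron expression, the job name, the runner label, and the single step are all assumptions chosen to match the table.

```yaml
# Hypothetical sketch of the h100 workflow's trigger behavior after this PR.
name: 8 GPU Integration Test (H100)   # hypothetical name

on:
  push:
    branches: [main]
    paths-ignore:
      - "experiments/**"        # assumed glob; the real path may differ
  schedule:
    - cron: "0 */6 * * *"       # every 6 hours
  pull_request:
    types: [labeled]            # opening or reopening a PR no longer starts a run
    paths-ignore:
      - "experiments/**"        # assumed glob, same as above

jobs:
  integration-test:             # hypothetical job name
    # Push and schedule events always run.
    # Pull request events run only when the PR carries the gating label.
    if: >-
      github.event_name != 'pull_request' ||
      contains(github.event.pull_request.labels.*.name, 'ciflow/h100.8')
    runs-on: ubuntu-latest      # placeholder; the real workflow targets an H100.8 runner
    steps:
      - run: echo "run the H100 integration tests here"
```

Because the `if` inspects every label currently on the PR rather than only the label that was just added, adding any other label to an already-labeled PR would also start a run under this sketch.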

Note: The ciflow/h100.8 label needs to be created in the repo settings.

Test plan

  • Verify that opening a new PR does NOT trigger the two H100 workflows
  • Verify that adding the ciflow/h100.8 label to a PR triggers the appropriate H100 workflow(s)
  • Verify that push-to-main and cron schedules still trigger normally

H100.8 runners are scarce and the job queue is long. Instead of
running H100 tests on every PR event, require maintainers to
explicitly add the `ciflow/h100.8` label. Push-to-main and cron
schedules are unaffected.

- integration_test_8gpu_h100.yaml: PR trigger changed to `labeled`
  only, with paths-ignore for experiments/ and job-level label check
- integration_test_8gpu_graph_trainer_h100.yaml: PR trigger changed
  to `labeled` only, with paths filter for graph_trainer/ and
  job-level label check
meta-cla (bot) added the CLA Signed label ("This label is managed by the Meta Open Source bot.") on Apr 17, 2026
SherlockNoMad changed the title from "Gate H100 CI workflows behind ciflow/h100.8 label" to "Gate H100.8 CI workflows behind ciflow/h100.8 label" on Apr 17, 2026
SherlockNoMad added the ciflow/h100.8 (Trigger H100.8 CI) label on Apr 17, 2026

pytorch-bot Bot commented Apr 17, 2026

Warning: Unknown label ciflow/h100.8.
Currently recognized labels are

  • ciflow/8gpu

Please add the new label to .github/pytorch-probot.yml

@@ -1,3 +1,4 @@
ciflow_push_tags:
- ciflow/8gpu
- ciflow/h100.8

SherlockNoMad (Contributor Author) commented:

I am not sure what this does....

Add `synchronize` back to pull_request types so that new commits
pushed to a PR with the ciflow/h100.8 label re-trigger the H100 CI.
The job-level label check still prevents runs on unlabeled PRs.
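
The combination described in this commit message might look like the fragment below. It is a sketch under assumptions: the job name, runner label, and step are hypothetical, and only the `types` list and the label check are taken from the text above.

```yaml
# Hypothetical fragment: pull_request trigger with `synchronize` restored.
on:
  pull_request:
    # labeled:     start a run when a label is added to the PR
    # synchronize: start a run when new commits are pushed to the PR
    types: [labeled, synchronize]

jobs:
  integration-test:             # hypothetical job name
    # The job is skipped on pull_request events unless the PR carries the label,
    # so pushes to unlabeled PRs create a workflow run whose job does not execute.
    if: >-
      github.event_name != 'pull_request' ||
      contains(github.event.pull_request.labels.*.name, 'ciflow/h100.8')
    runs-on: ubuntu-latest      # placeholder; the real workflow targets an H100.8 runner
    steps:
      - run: echo "run the H100 integration tests here"
```

The label list is read from the event payload, so on a `synchronize` event it reflects the labels present on the PR at the time of the push.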

fegin commented Apr 17, 2026

How much do we actually save if these workflows still run when a PR is merged into main?

Tag-based triggering ignores path filters, so revert to
pull_request types [labeled, synchronize] with job-level label
check. Also remove stale ciflow/8gpu/* tag trigger from h100.yaml.

fegin (Contributor) left a comment

Thanks for the improvement. Can we verify that this PR is not related to the failing CI? I don't think it is, but I just want to confirm. Thanks!

Comment on lines +10 to +11
   pull_request:
-    types: [opened, synchronize, reopened, ready_for_review]
+    types: [labeled]

I would propose adding `on: push: tags: ciflow/h100.8/*` as a dispatch trigger rather than the complex condition below.
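
For reference, the tag-based trigger proposed here could be written roughly as follows. This is a sketch, and it assumes that a bot pushes a tag matching `ciflow/h100.8/*` whenever the label is added to a PR; the job name, runner label, and step are hypothetical.

```yaml
# Hypothetical fragment: tag-based dispatch instead of a job-level label check.
on:
  push:
    branches: [main]
    tags:
      - "ciflow/h100.8/*"       # assumed to be pushed by a bot when the label is added

jobs:
  integration-test:             # hypothetical job name
    runs-on: ubuntu-latest      # placeholder; the real workflow targets an H100.8 runner
    steps:
      - run: echo "run the H100 integration tests here"
```

One trade-off of this design: GitHub's workflow syntax documentation states that path filters are not evaluated for pushes of tags, so a `paths` or `paths-ignore` entry under `push` would not restrict runs started this way.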

SherlockNoMad (Contributor Author) commented Apr 20, 2026

Claude: Tag-based triggering and path filters are incompatible — GitHub ignores
  paths/paths-ignore for tag pushes. To get both label gating and path filtering,
  we need to go back to the pull_request approach.

I don't know how true this is, but what I have now seems to work, so I am shipping this.

SherlockNoMad merged commit 66b204b into main on Apr 20, 2026
16 of 19 checks passed

SherlockNoMad (Contributor Author) commented:

> Thanks for the improvement. Can we verify that this PR is not related to the failing CI? I don't think it is, but I just want to confirm. Thanks!

verified, not related.

Labels: ciflow/h100.8 (Trigger H100.8 CI), ciflow/8gpu, CLA Signed (This label is managed by the Meta Open Source bot.)
