[GraphTrainer][AutoDev] Remove compile_with_inductor annotation from qwen3 FlexAttention#3019
Merged
SherlockNoMad merged 1 commit into main on Apr 20, 2026
Conversation
Remove compile_with_inductor annotation from qwen3 FlexAttention

The qwen3 graph_trainer parallelize.py annotated FlexAttention.forward with compile_with_inductor metadata, but the llama3 and deepseek_v3 variants do not have this annotation. This divergence could cause subtle issues when FlexAttention is shared across models. Remove the annotation from qwen3 to align all graph_trainer model variants.
yiming0416 approved these changes on Apr 20, 2026
Summary
Removes the `compile_with_inductor` annotation on `FlexAttention.forward` in the qwen3 graph_trainer parallelize module to align with the llama3 and deepseek_v3 variants, which do not have this annotation.

Why
The qwen3 `annotate_qwen3` function tagged `FlexAttention.forward` with `{"compile_with_inductor": "flex_attention"}` metadata, but the llama3 and deepseek_v3 graph_trainer variants do not annotate FlexAttention this way. Since `FlexAttention` is a shared component (`torchtitan/models/common/attention.py`), annotating its `forward` method in one model variant but not the others is a global mutation that persists across all models in the same process, and could cause subtle behavioral divergence depending on model initialization order.

This PR removes the annotation from qwen3 so all three graph_trainer model variants are consistent.
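A minimal sketch of why this matters, using hypothetical names rather than the actual torchtitan annotation mechanism: because every model variant reuses the same `FlexAttention` class, metadata attached to its `forward` while setting up one model remains visible to every other model built in the same process.

```python
# Minimal sketch with hypothetical names; the real annotation lives in
# torchtitan/experiments/graph_trainer/qwen3/parallelize.py and uses its own helper.

class FlexAttention:
    """Stand-in for the shared attention module in torchtitan/models/common/attention.py."""

    def forward(self, x):
        return x


def annotate_qwen3_style(attn_cls):
    # Attaching metadata to the *class's* forward function mutates shared state:
    # every model that uses FlexAttention in this process now sees the annotation.
    attn_cls.forward.__dict__["custom_meta"] = {"compile_with_inductor": "flex_attention"}


qwen3_attn = FlexAttention()
annotate_qwen3_style(FlexAttention)   # done while parallelizing the qwen3 model

llama3_attn = FlexAttention()         # a different model built later in the same process

# llama3's attention now carries qwen3's annotation too, so behavior can diverge
# depending on which model was initialized (and annotated) first.
print(llama3_attn.forward.__func__.__dict__)
# -> {'custom_meta': {'compile_with_inductor': 'flex_attention'}}
```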
Test plan
- Import check: `from torchtitan.experiments.graph_trainer.qwen3.parallelize import annotate_qwen3, parallelize_qwen3`
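One way to run the import check above locally, assuming a torchtitan checkout on `PYTHONPATH`; it only verifies that the module still imports and exposes both entry points after the annotation removal.

```python
# Sanity check for the test plan: import still succeeds and both entry points exist.
from torchtitan.experiments.graph_trainer.qwen3.parallelize import (
    annotate_qwen3,
    parallelize_qwen3,
)

assert callable(annotate_qwen3) and callable(parallelize_qwen3)
print("qwen3 parallelize module imports cleanly")
```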