
[ROCm] revert VAE restrictions on RDNA4 and CDNA4#13411

Open
Apophis3158 wants to merge 1 commit into Comfy-Org:master from Apophis3158:master/rocm

Conversation

@Apophis3158

Better support for AMD's latest GPU architectures: RDNA4 (gfx1200, gfx1201) and CDNA4 (gfx950), determined based on FP8 support.

Reference: `__hip_fp8_e4m3` and `__hip_fp8_e5m2` support in the HIP C++ type implementation support table at AMD data types and precision support

@coderabbitai

coderabbitai Bot commented Apr 14, 2026

📝 Walkthrough

The PR makes AMD-specific behavior conditional on SUPPORT_FP8_OPS. In comfy/model_management.py, pytorch_attention_enabled_vae() now returns False for AMD only when SUPPORT_FP8_OPS is false; if SUPPORT_FP8_OPS is true it defers to pytorch_attention_enabled(). In comfy/sd.py, VAE.__init__ sets VAE_KL_MEM_RATIO to 2.73 only when is_amd() is true and SUPPORT_FP8_OPS is false; otherwise it uses the non-AMD value. No public signatures changed.
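The gating described above can be illustrated with a minimal, self-contained sketch. The function names `pytorch_attention_enabled_vae` and the constants `SUPPORT_FP8_OPS` and the 2.73 memory ratio come from the PR; the `is_amd`, `pytorch_attention_enabled`, and `vae_kl_mem_ratio` stubs below are simplified stand-ins, not the real implementations in `comfy/model_management.py` and `comfy/sd.py`:

```python
# Sketch of the PR's gating logic; helper bodies are illustrative stand-ins.
SUPPORT_FP8_OPS = False  # set True on RDNA4/CDNA4 with new-enough torch/ROCm

def is_amd():
    # Stand-in: the real check inspects the torch build (HIP vs CUDA).
    return True

def pytorch_attention_enabled():
    # Stand-in for the general PyTorch-attention setting.
    return True

def pytorch_attention_enabled_vae():
    # AMD GPUs without FP8 support keep the old restriction (no PyTorch
    # attention in the VAE); FP8-capable arches defer to the general setting.
    if is_amd() and not SUPPORT_FP8_OPS:
        return False
    return pytorch_attention_enabled()

def vae_kl_mem_ratio(non_amd_value=1.0):
    # VAE.__init__ only applies the larger AMD memory ratio (2.73) when the
    # arch lacks FP8 support; otherwise the non-AMD value is used.
    if is_amd() and not SUPPORT_FP8_OPS:
        return 2.73
    return non_amd_value
```

With `SUPPORT_FP8_OPS = False` (the pre-PR AMD behavior), VAE attention stays disabled and the 2.73 ratio applies; flipping the flag restores the defaults used on other hardware.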

🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 warning

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description clearly explains the architectural improvements and references AMD's official documentation on FP8 support, directly relating to the code changes in model_management.py and sd.py. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title references reverting VAE restrictions on RDNA4 and CDNA4 AMD architectures, which aligns with the PR's core objective of improving support for these architectures by conditionally enabling PyTorch VAE attention and adjusting memory ratios based on FP8 support. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Apophis3158
Author

  • Updated inline comments.

These two improvements have already been tested by ROCm users on Windows and Linux for a long time:

and more.

Most feedback is from gfx120x, so it is better to gate the restriction on SUPPORT_FP8_OPS.

```python
if torch_version_numeric >= (2, 7) and rocm_version >= (6, 4):
    if any((a in arch) for a in ["gfx1200", "gfx1201", "gfx950"]):  # TODO: more arches; "gfx942" gives an error on pytorch nightly 2.10 1013 rocm7.0
        SUPPORT_FP8_OPS = True
```
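The version-and-arch gate above can be exercised in isolation. A minimal sketch, assuming version tuples like `torch_version_numeric` and an arch string as reported by the ROCm runtime (the helper name `supports_fp8_ops` is illustrative, not from the PR):

```python
def supports_fp8_ops(torch_version, rocm_version, arch):
    """Mirror of the gate above: FP8 ops require torch >= 2.7, ROCm >= 6.4,
    and an RDNA4/CDNA4 arch string (gfx1200, gfx1201, gfx950)."""
    if torch_version >= (2, 7) and rocm_version >= (6, 4):
        return any(a in arch for a in ["gfx1200", "gfx1201", "gfx950"])
    return False
```

Note that substring matching means suffixed arch strings such as `gfx1201:sramecc+` still match, while older arches like gfx942 stay excluded.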

@Apophis3158 Apophis3158 changed the title [ROCm] better AMD CDNA4 and RDNA4 support [ROCm] better AMD CDNA4 and RDNA4 support for VAE Apr 17, 2026
@Apophis3158 Apophis3158 changed the title [ROCm] better AMD CDNA4 and RDNA4 support for VAE [ROCm] revert VAE restrictions on RDNA4 and CDNA4 Apr 26, 2026
