
[Qualcomm] Support native_layer_norm and affine-free LayerNorm in QNN backend#18990

Open
KevinUW114514 wants to merge 7 commits into pytorch:main from KevinUW114514:fix/qnn-layer-norm-none-check

Conversation

Contributor

@KevinUW114514 KevinUW114514 commented Apr 19, 2026

[Qualcomm] Support native_layer_norm and affine-free LayerNorm in QNN backend

Summary

Adds QNN backend support for aten.native_layer_norm.default (which is the decomposed form of torch.nn.LayerNorm) and handles models where weight/bias are not provided (elementwise_affine=False).

Problem

When exporting models that use torch.native_layer_norm or torch.nn.LayerNorm(elementwise_affine=False) to the QNN backend, the following issues occur:

  1. Missing native_layer_norm visitor: The original LayerNormVisitor only targets aten.layer_norm.default, but PyTorch decomposes torch.nn.LayerNorm to aten.native_layer_norm.default during export.

  2. None weight/bias: When elementwise_affine=False, the weight and bias arguments are None. The QNN x86_64 runtime cannot handle None tensor inputs, which causes an AttributeError when get_parameter() is called.
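The affine-free case can be reproduced in a few lines of eager PyTorch (an illustrative sketch, not code from this PR): with elementwise_affine=False the module stores no affine parameters, so the decomposed op receives None for both weight and bias.

```python
import torch

# With elementwise_affine=False, LayerNorm registers no affine parameters.
ln = torch.nn.LayerNorm(8, elementwise_affine=False)
assert ln.weight is None and ln.bias is None

# aten.native_layer_norm returns (output, mean, rstd); weight and bias
# are optional and may be None, which is exactly what the QNN builder
# previously could not handle.
x = torch.randn(2, 8)
out, mean, rstd = torch.native_layer_norm(x, [8], None, None, 1e-5)
assert out.shape == x.shape
```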

Solution

1. Update visitor target (op_layer_norm.py)

Change the visitor target from aten.layer_norm.default to aten.native_layer_norm.default:

# Before
target = ["aten.layer_norm.default"]

# After
target = ["aten.native_layer_norm.default"]

This is correct because during ExecuTorch export, aten.layer_norm.default is decomposed to aten.native_layer_norm.default before the QNN lowering stage.

2. Handle None weight/bias (op_layer_norm.py)

When weight/bias are None, create synthetic tensors:

  • Missing weight → torch.ones(normalized_shapes) (identity transform)
  • Missing bias → torch.zeros(normalized_shapes) (no offset)
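The substitution above is semantics-preserving; a quick eager-mode check (illustrative only, not the builder code) confirms that an all-ones weight and all-zeros bias reproduce the affine-free result exactly:

```python
import torch

x = torch.randn(4, 16)
shape = [16]
eps = 1e-5

# Affine-free reference: weight and bias are None.
ref = torch.native_layer_norm(x, shape, None, None, eps)[0]

# Synthetic identity weight and zero bias, mirroring what the builder
# substitutes so that QNN always receives concrete tensors.
weight = torch.ones(shape, dtype=torch.float32)   # identity scale
bias = torch.zeros(shape, dtype=torch.float32)    # no offset
synth = torch.native_layer_norm(x, shape, weight, bias, eps)[0]

assert torch.allclose(ref, synth)
```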

Create synthetic fx.Node objects to register these as QNN static tensors:

weight_tensor = torch.ones(normalized_shapes, dtype=torch.float32)
weight_node = torch.fx.Node(
    node.graph,
    node.name + "_runtime_weight",
    "call_function",
    exir_ops.edge.aten.tensor.default,
    (),
    {},
)
# Preserve quant_attrs with zero_point=0 for QNN compatibility

3. Use same annotator for both ops (htp_rules.py)

The quantizer annotator registers both aten.layer_norm.default and aten.native_layer_norm.default to the same LayerNorm class, since both ops have identical argument schemas:

@register_annotator(
    [torch.ops.aten.layer_norm.default, torch.ops.aten.native_layer_norm.default],
    QnnConstants.OpLayerNorm.op_name,
)
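For readers unfamiliar with the pattern, register_annotator is a decorator that maps op targets to annotator classes. The following is a simplified, self-contained sketch of that registry pattern; the names and structure here are illustrative, not the actual ExecuTorch implementation:

```python
# Hypothetical registry mimicking the register_annotator pattern.
OP_ANNOTATORS = {}

def register_annotator(targets, op_name):
    def wrap(cls):
        cls.op_name = op_name
        for target in targets:
            # Multiple targets can share one annotator class, which is
            # what lets layer_norm and native_layer_norm reuse LayerNorm.
            OP_ANNOTATORS[target] = cls
        return cls
    return wrap

@register_annotator(
    ["aten.layer_norm.default", "aten.native_layer_norm.default"],
    "LayerNorm",
)
class LayerNormAnnotator:
    pass

# Both ops resolve to the same annotator class.
assert (
    OP_ANNOTATORS["aten.layer_norm.default"]
    is OP_ANNOTATORS["aten.native_layer_norm.default"]
)
```

This works because both ops share the same positional argument schema (input, normalized_shape, weight, bias, eps), so one annotate() implementation covers both.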

4. Add None check to get_parameter() (utils.py)

Guard against None nodes to prevent AttributeError:

if node is None:
    return None
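The failure mode the guard prevents can be shown with a minimal stand-in (illustrative only; FakeNode and get_parameter_name are hypothetical names, not the actual builder code): without the check, accessing an attribute on a None entry raises AttributeError.

```python
# Stand-in for an fx.Node with just the attribute the builder touches.
class FakeNode:
    def __init__(self, name):
        self.name = name

def get_parameter_name(node):
    if node is None:   # the guard proposed in this PR
        return None
    return node.name   # without the guard, None here raises AttributeError

assert get_parameter_name(None) is None
assert get_parameter_name(FakeNode("layer_norm_weight")) == "layer_norm_weight"
```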

Files Changed

| File | Changes |
| --- | --- |
| builders/op_layer_norm.py | Add native_layer_norm support + handle None weight/bias |
| builders/utils.py | Add None guard in get_parameter() |
| quantizer/annotators/htp_rules.py | Register annotator for both ops |
| tests/models.py | Add NativeLayerNorm test model |
| tests/test_qnn_delegate.py | Add floating-point and quantized tests |

Test Plan

Run QNN delegate tests for layer_norm:

python backends/qualcomm/tests/test_qnn_delegate.py \
    -k "test_qnn_backend_layer_norm or test_qnn_backend_native_layer_norm" \
    --soc_model SM8650 \
    --build_folder build-x86/ \
    --executorch_root . \
    --enable_x86_64

Expected: 4 tests pass (2 floating-point, 2 quantized).

Release Notes

  • Release notes: qualcomm

Related Issues

This resolves the issue where FLUX2 transformer export fails with:

  • [QNN Delegate Op Builder]: LayerNorm weight is None, skipping
  • AttributeError: 'NoneType' object has no attribute 'name'

Fixes #18989

  • Labels: bug, module:qnn

@abhinaykukkadapu

Copilot AI review requested due to automatic review settings April 19, 2026 07:29

pytorch-bot Bot commented Apr 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18990

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (2 Unrelated Failures)

As of commit 1a5a256 with merge base 8bda9ac:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.


meta-cla Bot commented Apr 19, 2026

Hi @KevinUW114514!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@KevinUW114514
Contributor Author

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot Bot added the release notes: none Do not include this in the release notes label Apr 19, 2026

meta-cla Bot commented Apr 19, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 19, 2026
Contributor

Copilot AI left a comment


Pull request overview

Fixes a crash in the Qualcomm QNN PT2E quantizer by making _mark_nodes_as_annotated robust to None entries in node lists (e.g., when aten.layer_norm has optional affine args like weight=None).

Changes:

  • Skip None entries in _mark_nodes_as_annotated to avoid AttributeError when accessing node.meta.


from .qconfig import QuantizationConfig


def _mark_nodes_as_annotated(nodes: List[Node]):
Comment thread backends/qualcomm/quantizer/rules.py Outdated
Comment on lines +32 to +33
if node is None:
continue
Contributor

abhinaykukkadapu commented Apr 20, 2026

Hi @KevinUW114514, thank you for your contribution. I think the root cause is that we need to guard the weight and bias creation in the rules files for htp and lpai, similar to #18219; let me know if you are willing to change it. Adding the guard might silently propagate bad configs like these through the pipeline, and I think we should fail loudly. CC: @shewu-quic

@KevinUW114514
Contributor Author

Hi @abhinaykukkadapu , thanks for the follow-up! Actually I also realized this root issue as I encountered the error in my downstream tasks. I am currently working on fixing this. I can edit the issue and PR to re-state the issue and submit a complete fix for it. Let me know if any concern. Thank you!

@abhinaykukkadapu
Contributor

> Hi @abhinaykukkadapu , thanks for the follow-up! Actually I also realized this root issue as I encountered the error in my downstream tasks. I am currently working on fixing this. I can edit the issue and PR to re-state the issue and submit a complete fix for it. Let me know if any concern. Thank you!

Thanks, that would be awesome, will look forward to your changes.

@KevinUW114514 KevinUW114514 changed the title Fix AttributeError in _mark_nodes_as_annotated when node is None [QNN] Fix AttributeError in _mark_nodes_as_annotated when node is None Apr 20, 2026
Fixes AttributeError when aten.native_layer_norm has optional weight=None.
Both weight and bias are guarded to handle the None case gracefully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nil-is-all nil-is-all added the module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/ label Apr 20, 2026
… backend

- add QNN layer norm support for aten.native_layer_norm.default
- handle missing weight/bias by creating identity weight and zero bias
- always provide bias tensor for QNN LayerNorm op
- add floating-point and quantized tests for native_layer_norm
- print generated pte filename after export
@KevinUW114514 KevinUW114514 changed the title [QNN] Fix AttributeError in _mark_nodes_as_annotated when node is None [Qualcomm] Support native_layer_norm and affine-free LayerNorm in QNN backend Apr 22, 2026
@KevinUW114514
Contributor Author

Hi @abhinaykukkadapu and @shewu-quic , thanks for your help before! Could you please take a look at the PR to see whether I am doing the right fix on this? I appreciate your time 😃

Comment thread backends/qualcomm/quantizer/rules.py Outdated

def _mark_nodes_as_annotated(nodes: List[Node]):
for node in nodes:
if node is None:
Contributor


We might want to get rid of this, CC: @shewu-quic

Collaborator


I think the node should not be None in this function.

@abhinaykukkadapu
Copy link
Copy Markdown
Contributor

@KevinUW114514 LGTM, will wait for a stamp from @shewu-quic too

Copilot AI review requested due to automatic review settings April 24, 2026 01:47
Contributor

Copilot AI left a comment


Pull request overview

This PR adds Qualcomm QNN backend support for aten.native_layer_norm.default (the decomposed form of torch.nn.LayerNorm) and improves robustness when optional weight/bias inputs are None (e.g., elementwise_affine=False).

Changes:

  • Update the QNN op builder to target aten.native_layer_norm.default and synthesize identity weight / zero bias when missing.
  • Make quantizer annotation/marking logic resilient to optional None nodes and register the HTP annotator for both layer_norm and native_layer_norm.
  • Add new test model + delegate tests intended to cover native layer norm (float + quantized).

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
backends/qualcomm/builders/op_layer_norm.py Switch visitor target to native_layer_norm and add synthetic weight/bias handling for optional inputs.
backends/qualcomm/builders/utils.py Allow get_parameter() to safely handle None inputs.
backends/qualcomm/quantizer/rules.py Skip None entries when marking nodes as annotated.
backends/qualcomm/quantizer/annotators/htp_rules.py Register LayerNorm annotator for both layer_norm and native_layer_norm; avoid annotating missing optional args.
backends/qualcomm/quantizer/annotators/lpai_rules.py Make LayerNorm annotator tolerant of missing optional args (but still only registered for layer_norm).
backends/qualcomm/tests/models.py Add NativeLayerNorm test module.
backends/qualcomm/tests/test_qnn_delegate.py Add float + quantized tests for NativeLayerNorm.
backends/qualcomm/export_utils.py Print a success message after writing the generated .pte.


Comment thread backends/qualcomm/tests/models.py Outdated
Comment on lines +1395 to +1408
self.weight = torch.nn.Parameter(torch.ones(768))
self.bias = torch.nn.Parameter(torch.zeros(768))
self.normalized_shape = [768]
self.eps = 1e-6

def forward(self, x):
if self.affine:
return torch.native_layer_norm(
x, self.normalized_shape, self.weight, self.bias, self.eps
)[0]
else:
return torch.native_layer_norm(
x, self.normalized_shape, self.weight, self.bias, self.eps
)[0]
for i, module in enumerate(modules):
with self.subTest(i=i):
self.lower_module_and_test_output(module, sample_input)



def get_parameter(
node: torch.fx.Node, edge_program: torch.export.ExportedProgram
Comment on lines 475 to +479
@staticmethod
def annotate(node: Node, quantization_config: QuantizationConfig) -> None:
act_node = node.args[0]
weight_node = node.args[2]
bias_node = None
if len(node.args) > 2:
bias_node = node.args[3]
weight_node = node.args[2] if len(node.args) > 2 else None
bias_node = node.args[3] if len(node.args) > 3 else None
Collaborator

@shewu-quic shewu-quic left a comment


Thank you for your effort.

Comment thread backends/qualcomm/tests/models.py Outdated
self.eps = 1e-6

def forward(self, x):
if self.affine:
Collaborator


These two branches seem to be the same. Would it be possible to extend the current LayerNorm with torch.nn.LayerNorm(elementwise_affine=False) as a test case?


bias_node = self.get_node(node.args[3])
if bias_node is not None:
# Fake node: even when original bias is absent, QNN still needs it
Collaborator


I think the bias is optional for QNN and can be kept as in the original design.
https://docs.qualcomm.com/doc/80-63442-10/topic/MasterOpDef.html#layernorm

Comment thread backends/qualcomm/builders/utils.py Outdated
def get_parameter(
node: torch.fx.Node, edge_program: torch.export.ExportedProgram
) -> torch.Tensor:
) -> Optional[torch.Tensor]:
Collaborator


This function shouldn't return None. Perhaps we should ensure that the node is not None before this function is called.

Comment thread backends/qualcomm/quantizer/rules.py Outdated

def _mark_nodes_as_annotated(nodes: List[Node]):
for node in nodes:
if node is None:
Collaborator


I think the node should not be None in this function.

@KevinUW114514
Contributor Author

KevinUW114514 commented Apr 27, 2026

Hi @shewu-quic , really thanks for the detailed review! I updated the PR according to your comments.

Changes made:

  • Removed the None guard from _mark_nodes_as_annotated().

    • Reason: I agree that this helper should only receive real FX Nodes. Optional LayerNorm args should be filtered at the annotator/caller level instead of being silently ignored here.
  • Removed the node is None handling from get_parameter().

    • Reason: get_parameter() should only be called when the caller has already confirmed the node is a parameter/buffer/lifted constant. Passing None into this helper is now treated as an invalid call site.
  • Updated AnnotateQuantAttrs to check is_parameter(...) before calling get_parameter(...).

    • Reason: after tightening get_parameter(), the pass should not rely on get_parameter() returning None for non-parameter dq inputs. This preserves the original behavior for normal QDQ nodes while making the parameter check explicit.
  • Updated LayerNormVisitor to keep QNN bias optional.

    • Reason: as you pointed out, QNN LayerNorm supports optional bias. The builder now starts with [input, weight] and only appends bias when the original graph has a real bias node. It no longer creates a fake zero-bias tensor.
  • Kept synthetic all-ones weight only for the missing-weight case.

    • Reason: for elementwise_affine=False, PyTorch provides weight=None. QNN LayerNorm still needs a scale input, so using an all-ones tensor preserves the identity-scale semantics.
  • Reworked the tests to extend the existing LayerNorm test model instead of using NativeLayerNorm.

    • Reason: the previous NativeLayerNorm(affine=False) did not actually pass None for weight/bias, so it was not testing the intended path. The tests now use torch.nn.LayerNorm(..., elementwise_affine=False), which exports to aten.native_layer_norm.default with missing affine parameters.
  • Updated both floating-point and quantized LayerNorm delegate tests to cover:

    • regular LayerNorm
    • bias=False
    • elementwise_affine=False

I also ran the targeted x86_64 QNN tests for the updated LayerNorm cases, and they pass. Please let me know if anything needs to be changed. Thanks again!
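The final approach described above (filter optional args at the caller, keep helpers strict) can be sketched in a few lines; split_native_layer_norm_args is a hypothetical helper name mirroring the annotator change in the review diff, not code from the PR:

```python
import torch

def split_native_layer_norm_args(args):
    # Defensive indexing for native_layer_norm's optional trailing args:
    # (input, normalized_shape, weight?, bias?, eps). The caller, not a
    # downstream helper like get_parameter(), decides how to handle None.
    act = args[0]
    weight = args[2] if len(args) > 2 else None
    bias = args[3] if len(args) > 3 else None
    return act, weight, bias

# elementwise_affine=False yields None affine parameters on export.
ln = torch.nn.LayerNorm(16, elementwise_affine=False)
assert ln.weight is None and ln.bias is None

x = torch.randn(2, 16)
act, w, b = split_native_layer_norm_args((x, [16], ln.weight, ln.bias, 1e-5))
assert w is None and b is None
```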

Collaborator

@shewu-quic shewu-quic left a comment


LGTM.
Please resolve the linting issue.

Copilot AI review requested due to automatic review settings April 28, 2026 18:51
@KevinUW114514
Contributor Author

Hi @abhinaykukkadapu , the linter issue has been fixed. Could you please take a look at the update? Also, thank you @shewu-quic for the detailed suggestions; they were genuinely very helpful!

Contributor

Copilot AI left a comment


Pull request overview

Adds Qualcomm QNN backend support for aten.native_layer_norm.default (the decomposed export form of nn.LayerNorm) and improves handling of optional weight/bias during quantization and lowering.

Changes:

  • Update LayerNorm lowering to target aten.native_layer_norm.default and synthesize an identity weight when elementwise_affine=False.
  • Extend HTP quantizer annotator registration to include both aten.layer_norm.default and aten.native_layer_norm.default; harden annotators against missing (None) weight/bias.
  • Expand LayerNorm delegate tests to include elementwise_affine=False.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
backends/qualcomm/builders/op_layer_norm.py Switch visitor target to aten.native_layer_norm.default; synthesize weight when absent.
backends/qualcomm/builders/utils.py Change get_parameter() behavior to assert parameter presence and cast dtype.
backends/qualcomm/_passes/annotate_quant_attrs.py Avoid calling get_parameter() unless the dq input is actually a parameter.
backends/qualcomm/quantizer/annotators/htp_rules.py Register LayerNorm annotator for both layer_norm and native_layer_norm; skip annotating missing weight/bias.
backends/qualcomm/quantizer/annotators/lpai_rules.py Make LayerNorm annotation resilient to optional/missing weight and bias.
backends/qualcomm/tests/models.py Update LayerNorm test module to allow elementwise_affine=False.
backends/qualcomm/tests/test_qnn_delegate.py Add elementwise_affine=False coverage to existing LayerNorm delegate tests.
backends/qualcomm/export_utils.py Print a success message after writing the .pte file.
Comments suppressed due to low confidence (1)

backends/qualcomm/tests/models.py:1389

  • PR description mentions adding a NativeLayerNorm test model and corresponding tests, but this change set only extends the existing LayerNorm test module. If the intent is to validate direct torch.native_layer_norm export (not just nn.LayerNorm decomposition), add the explicit NativeLayerNorm model here (or adjust the PR description accordingly).
    def forward(self, x):
        return self.instance_norm(x)


class IsInf(torch.nn.Module):
    def __init__(self):
        super().__init__()



Comment on lines +40 to +45
assert (
param is not None
), f"Expect {node.name} to be parameter, buffer, or lifted tensor constant"
# update node.meta["val"] to qualified QNN datatype (e.g. i64 to i32)
assert isinstance(param, torch.Tensor), "Expect parameter to be tensor"
param = param.type(node.meta["val"].dtype)
Comment on lines 475 to +479
@staticmethod
def annotate(node: Node, quantization_config: QuantizationConfig) -> None:
act_node = node.args[0]
weight_node = node.args[2]
bias_node = None
if len(node.args) > 2:
bias_node = node.args[3]
weight_node = node.args[2] if len(node.args) > 2 else None
bias_node = node.args[3] if len(node.args) > 3 else None
Comment on lines 1381 to 1385

def test_qnn_backend_up_sampling_nearest_2d_with_size(self):
module = UpsampleNearest2D(sizes=(144, 208)) # noqa: F405
sample_input = (torch.randn(1, 16, 72, 104),)
self.lower_module_and_test_output(module, sample_input)
@KevinUW114514
Contributor Author

Hi @abhinaykukkadapu , just following up on this PR when you get a chance. The linter issue has been fixed, and I’d really appreciate another look at the latest update. Thanks again!


Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/ release notes: none Do not include this in the release notes


Development

Successfully merging this pull request may close these issues.

[QNN] AttributeError in _mark_nodes_as_annotated when a layer_norm node has optional weight as None

5 participants