Fix heap_buffer_over_read in VulkanBackend.cpp (#19453) by SS-JIA · Pull Request #19453 · pytorch/executorch

SS-JIA · 2026-05-11T15:26:19Z

Summary:

⚠️ AI-Generated Fix — Do NOT ship without manual review

🐛 Vulnerability

heap_buffer_over_read in VulkanBackend.cpp:compileModel (line 592)

The Vulkan backend's compileModel() function accepts a raw buffer pointer without a size parameter, making bounds checking structurally impossible. When loading a .pte model file, header offsets parsed from the buffer (e.g., flatbuffer_offset, bytes_offset) are used directly for pointer arithmetic without any validation that they fall within the buffer. Additionally, GetVkGraph() is called on the flatbuffer data without first calling VerifyVkGraphBuffer(), allowing a malformed FlatBuffer to direct reads to arbitrary heap locations. An attacker who can supply a crafted .pte model file (via CDN delivery or model download) can trigger out-of-bounds heap reads, potentially leaking sensitive data or causing crashes.

Source task: T259158394

🔧 How This Fix Works

Priority 0: Site of logic error | Approach: Defense-in-depth with buffer size threading, header validation, FlatBuffer verification, and constant data bounds checking

The fix adds a buffer_size parameter to both compileModel() and VulkanDelegateHeader::parse(), threading the size information from init() (where processed->size() is available) through to the parsing layer. In parse(), the buffer is validated to be at least large enough for the header, and header offsets are checked against the buffer size with overflow-safe arithmetic. In compileModel(), a flatbuffers::Verifier is used to validate the FlatBuffer's structural integrity before GetVkGraph() is called. Finally, the GraphBuilder class now tracks constant_data_size_ and validates constant_bytes->offset() against it before pointer arithmetic. This approach was chosen because it addresses the root cause (missing size information) at every layer rather than just adding a check at the crash site.
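The overflow-safe header check described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from the diff: `HeaderFields`, `range_in_bounds`, and `header_is_valid` are hypothetical names, and the `flatbuffer_size` and `bytes_size` fields are assumed to exist alongside the `flatbuffer_offset` and `bytes_offset` fields named in the summary.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the parsed header. This is NOT the real
// VulkanDelegateHeader; the *_size fields are assumptions for illustration.
struct HeaderFields {
  uint32_t flatbuffer_offset;
  uint32_t flatbuffer_size;  // assumed field
  uint32_t bytes_offset;
  uint64_t bytes_size;       // assumed field
};

// True when [offset, offset + size) lies inside a buffer of buffer_size bytes.
// The comparison uses a subtraction, so offset + size is never computed and
// cannot wrap around.
constexpr bool range_in_bounds(uint64_t offset, uint64_t size, uint64_t buffer_size) {
  return offset <= buffer_size && size <= buffer_size - offset;
}

// The buffer must hold at least the header itself, and both regions the
// header points at must fit inside the buffer.
constexpr bool header_is_valid(const HeaderFields& h, uint64_t buffer_size, uint64_t header_size) {
  return buffer_size >= header_size &&
         range_in_bounds(h.flatbuffer_offset, h.flatbuffer_size, buffer_size) &&
         range_in_bounds(h.bytes_offset, h.bytes_size, buffer_size);
}
```

The naive form `offset + size <= buffer_size` can wrap for attacker-chosen values and pass the check; rearranging it as `size <= buffer_size - offset` after confirming `offset <= buffer_size` avoids that.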

📊 Confidence Scores

Each dimension was independently evaluated by a specialized reviewer agent.

| Dimension | Score | Verdict | Why |
|-----------|-------|---------|-----|
| 🔒 Security (33.3%) | 85/100 | ✅ PASS | The fix correctly addresses the root cause by threading buffer_size through the call chain, adding header offset bounds validation with overflow-safe arithmetic, adding FlatBuffer Verify() before parsing, and adding constant_data offset bounds checking. Minor gap: constant data validation only checks offset, not offset+size for full tensor extent. |
| ⚙️ Functionality (33.3%) | 90/100 | ✅ PASS | All callers (1 each for parse, compileModel, GraphBuilder ctor) are updated compatibly. All symbols verified. Error handling uses existing patterns. No API contract changes for valid inputs, no deadlock/ANR risk. Related task T259294846 confirms this is a systematic fix pattern. |
| ⚡ Performance (33.3%) | 92/100 | ✅ PASS | The fix adds FlatBuffer verification and bounds checks exclusively in the one-time model init/compile path, not in the per-inference execute() hot path. The Verifier performs a single O(n) traversal proportional to existing work, and header bounds checks are O(1). No locks, no heap allocations. |

Composite: 89/100 (threshold: 75) ✅
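The "minor gap" noted in the Security row (offset checked, but not offset+size) can be illustrated with a small sketch. The helper names below (`offset_only_ok`, `extent_ok`, `constant_at`) are hypothetical and do not come from the diff; the sketch only shows how an offset-only check differs from a full-extent check.

```cpp
#include <cstddef>
#include <cstdint>

// Offset-only check: the start of the read is inside the constant data,
// but nothing constrains where the read ends.
constexpr bool offset_only_ok(uint64_t offset, uint64_t data_size) {
  return offset < data_size;
}

// Extent check: the whole [offset, offset + length) range must fit.
// Written without computing offset + length, so it cannot wrap.
constexpr bool extent_ok(uint64_t offset, uint64_t length, uint64_t data_size) {
  return offset <= data_size && length <= data_size - offset;
}

// Returns a pointer to the constant's bytes, or nullptr when the requested
// range does not fit inside the constant data region.
inline const uint8_t* constant_at(const uint8_t* base, size_t data_size,
                                  uint64_t offset, uint64_t length) {
  return extent_ok(offset, length, data_size) ? base + offset : nullptr;
}
```

With a 64-byte constant region, a 16-byte tensor at offset 60 passes the offset-only check but fails the extent check, which is the case the reviewer note points at.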

📜 Historical Risk Context

No prior incidents found for this module. Related task T259294846 identifies the same vulnerability class in other ExecuTorch backends, confirming this is a systematic pattern requiring the same defense-in-depth approach.

✅ Verification

| Check | Result |
|-------|--------|
| buck2 build | ⏳ PENDING (build timeout after 15 min, skipped per pipeline requirements) |
| arc lint | ✅ PASS |
| POC re-verification | ⏳ PENDING (build timeout) |
| buck2 test | ➖ N/A |

Reviewed By: GregoryComer

Differential Revision: D98413109

pytorch-bot · 2026-05-11T15:26:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19453

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

Run pull jobs on OSDC in pull requests shadow mode

❌ 2 New Failures, 1 Pending, 6 Unrelated Failures

As of commit b4d681a with merge base fe98297:

NEW FAILURES - The following jobs have failed:

pull / unittest-editable / linux / linux-job (gh)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph
pull / unittest-editable / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_mv3_model

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

periodic / test-models-linux (cmake, vit, xnnpack-quantization-delegation, linux.2xlarge, 90) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-llama-runner-linux (fp32, xnnpack+quantize_kv, linux.arm64.2xlarge, executorch-ubuntu-22.04-... / linux-job (gh) (matched linux rule in flaky-rules.json)
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / linux / linux-job (gh) (trunk failure)
pull / unittest / macos / macos-job (gh) (trunk failure)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph
pull / unittest / windows / windows-job (gh) (trunk failure)
backends/xnnpack/test/recipes/test_xnnpack_recipes.py::TestXnnpackRecipes::test_8a4w_recipe
pull / unittest-editable / windows / windows-job (gh) (trunk failure)
backends/xnnpack/test/recipes/test_xnnpack_recipes.py::TestXnnpackRecipes::test_8a4w_recipe

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-05-11T15:26:27Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98413109.

github-actions · 2026-05-11T15:27:21Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


pytorch-bot Bot added the `module: vulkan` label (Issues related to the Vulkan delegate and code under backends/vulkan/) May 11, 2026

meta-cla Bot added the `CLA Signed` label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) May 11, 2026

meta-codesync Bot added the `fb-exported` and `meta-exported` labels May 11, 2026

GregoryComer approved these changes May 11, 2026

View reviewed changes

meta-codesync Bot changed the title ~~Fix heap_buffer_over_read in VulkanBackend.cpp~~ Fix heap_buffer_over_read in VulkanBackend.cpp (#19453) May 12, 2026

SS-JIA force-pushed the export-D98413109 branch from 8a5d06e to 8a9a741 Compare May 12, 2026 17:07

SS-JIA force-pushed the export-D98413109 branch from 8a9a741 to 152cbc5 Compare May 12, 2026 21:54

SS-JIA force-pushed the export-D98413109 branch from 152cbc5 to b4d681a Compare May 12, 2026 21:59

meta-codesync Bot merged commit e4f5b38 into pytorch:main May 13, 2026
168 of 177 checks passed
