Fix heap_buffer_over_read in VulkanBackend.cpp (#19453)#19453
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19453
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ❌ 2 New Failures, 1 Pending, 6 Unrelated FailuresAs of commit b4d681a with merge base fe98297 ( NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
Summary:⚠️ **AI-Generated Fix — Do NOT ship without manual review** --- ## 🐛 Vulnerability **heap_buffer_over_read** in `VulkanBackend.cpp:compileModel` (line 592) The Vulkan backend's `compileModel()` function accepts a raw buffer pointer without a size parameter, making bounds checking structurally impossible. When loading a `.pte` model file, header offsets parsed from the buffer (e.g., `flatbuffer_offset`, `bytes_offset`) are used directly for pointer arithmetic without any validation that they fall within the buffer. Additionally, `GetVkGraph()` is called on the flatbuffer data without first calling `VerifyVkGraphBuffer()`, allowing a malformed FlatBuffer to direct reads to arbitrary heap locations. An attacker who can supply a crafted `.pte` model file (via CDN delivery or model download) can trigger out-of-bounds heap reads, potentially leaking sensitive data or causing crashes. **Source task**: [T259158394](https://www.internalfb.com/T259158394) --- ## 🔧 How This Fix Works **Priority 0**: Site of logic error | **Approach**: Defense-in-depth with buffer size threading, header validation, FlatBuffer verification, and constant data bounds checking The fix adds a `buffer_size` parameter to both `compileModel()` and `VulkanDelegateHeader::parse()`, threading the size information from `init()` (where `processed->size()` is available) through to the parsing layer. In `parse()`, the buffer is validated to be at least large enough for the header, and header offsets are checked against the buffer size with overflow-safe arithmetic. In `compileModel()`, a `flatbuffers::Verifier` is used to validate the FlatBuffer's structural integrity before `GetVkGraph()` is called. Finally, the `GraphBuilder` class now tracks `constant_data_size_` and validates `constant_bytes->offset()` against it before pointer arithmetic. This approach was chosen because it addresses the root cause (missing size information) at every layer rather than just adding a check at the crash site. --- ## 📊 Confidence Scores Each dimension was independently evaluated by a specialized reviewer agent. | Dimension | Score | Verdict | Why | |-----------|-------|---------|-----| | 🔒 **Security** (33.3%) | **85**/100 | ✅ PASS | The fix correctly addresses the root cause by threading buffer_size through the call chain, adding header offset bounds validation with overflow-safe arithmetic, adding FlatBuffer Verify() before parsing, and adding constant_data offset bounds checking. Minor gap: constant data validation only checks offset, not offset+size for full tensor extent. | | ⚙️ **Functionality** (33.3%) | **90**/100 | ✅ PASS | All callers (1 each for parse, compileModel, GraphBuilder ctor) are updated compatibly. All symbols verified. Error handling uses existing patterns. No API contract changes for valid inputs, no deadlock/ANR risk. Related task T259294846 confirms this is a systematic fix pattern. | | ⚡ **Performance** (33.3%) | **92**/100 | ✅ PASS | The fix adds FlatBuffer verification and bounds checks exclusively in the one-time model init/compile path, not in the per-inference execute() hot path. The Verifier performs a single O(n) traversal proportional to existing work, and header bounds checks are O(1). No locks, no heap allocations. | **Composite: 89/100** (threshold: 75) ✅ --- ## 📜 Historical Risk Context No prior incidents found for this module. Related task T259294846 identifies the same vulnerability class in other ExecuTorch backends, confirming this is a systematic pattern requiring the same defense-in-depth approach. --- ## ✅ Verification | Check | Result | |-------|--------| | buck2 build | ⏳ PENDING (build timeout after 15 min, skipped per pipeline requirements) | | arc lint | ✅ PASS | | POC re-verification | ⏳ PENDING (build timeout) | | buck2 test | ➖ N/A | Reviewed By: GregoryComer Differential Revision: D98413109
Summary:⚠️ **AI-Generated Fix — Do NOT ship without manual review** --- ## 🐛 Vulnerability **heap_buffer_over_read** in `VulkanBackend.cpp:compileModel` (line 592) The Vulkan backend's `compileModel()` function accepts a raw buffer pointer without a size parameter, making bounds checking structurally impossible. When loading a `.pte` model file, header offsets parsed from the buffer (e.g., `flatbuffer_offset`, `bytes_offset`) are used directly for pointer arithmetic without any validation that they fall within the buffer. Additionally, `GetVkGraph()` is called on the flatbuffer data without first calling `VerifyVkGraphBuffer()`, allowing a malformed FlatBuffer to direct reads to arbitrary heap locations. An attacker who can supply a crafted `.pte` model file (via CDN delivery or model download) can trigger out-of-bounds heap reads, potentially leaking sensitive data or causing crashes. **Source task**: [T259158394](https://www.internalfb.com/T259158394) --- ## 🔧 How This Fix Works **Priority 0**: Site of logic error | **Approach**: Defense-in-depth with buffer size threading, header validation, FlatBuffer verification, and constant data bounds checking The fix adds a `buffer_size` parameter to both `compileModel()` and `VulkanDelegateHeader::parse()`, threading the size information from `init()` (where `processed->size()` is available) through to the parsing layer. In `parse()`, the buffer is validated to be at least large enough for the header, and header offsets are checked against the buffer size with overflow-safe arithmetic. In `compileModel()`, a `flatbuffers::Verifier` is used to validate the FlatBuffer's structural integrity before `GetVkGraph()` is called. Finally, the `GraphBuilder` class now tracks `constant_data_size_` and validates `constant_bytes->offset()` against it before pointer arithmetic. This approach was chosen because it addresses the root cause (missing size information) at every layer rather than just adding a check at the crash site. --- ## 📊 Confidence Scores Each dimension was independently evaluated by a specialized reviewer agent. | Dimension | Score | Verdict | Why | |-----------|-------|---------|-----| | 🔒 **Security** (33.3%) | **85**/100 | ✅ PASS | The fix correctly addresses the root cause by threading buffer_size through the call chain, adding header offset bounds validation with overflow-safe arithmetic, adding FlatBuffer Verify() before parsing, and adding constant_data offset bounds checking. Minor gap: constant data validation only checks offset, not offset+size for full tensor extent. | | ⚙️ **Functionality** (33.3%) | **90**/100 | ✅ PASS | All callers (1 each for parse, compileModel, GraphBuilder ctor) are updated compatibly. All symbols verified. Error handling uses existing patterns. No API contract changes for valid inputs, no deadlock/ANR risk. Related task T259294846 confirms this is a systematic fix pattern. | | ⚡ **Performance** (33.3%) | **92**/100 | ✅ PASS | The fix adds FlatBuffer verification and bounds checks exclusively in the one-time model init/compile path, not in the per-inference execute() hot path. The Verifier performs a single O(n) traversal proportional to existing work, and header bounds checks are O(1). No locks, no heap allocations. | **Composite: 89/100** (threshold: 75) ✅ --- ## 📜 Historical Risk Context No prior incidents found for this module. Related task T259294846 identifies the same vulnerability class in other ExecuTorch backends, confirming this is a systematic pattern requiring the same defense-in-depth approach. --- ## ✅ Verification | Check | Result | |-------|--------| | buck2 build | ⏳ PENDING (build timeout after 15 min, skipped per pipeline requirements) | | arc lint | ✅ PASS | | POC re-verification | ⏳ PENDING (build timeout) | | buck2 test | ➖ N/A | Reviewed By: GregoryComer Differential Revision: D98413109
Summary:⚠️ **AI-Generated Fix — Do NOT ship without manual review** --- ## 🐛 Vulnerability **heap_buffer_over_read** in `VulkanBackend.cpp:compileModel` (line 592) The Vulkan backend's `compileModel()` function accepts a raw buffer pointer without a size parameter, making bounds checking structurally impossible. When loading a `.pte` model file, header offsets parsed from the buffer (e.g., `flatbuffer_offset`, `bytes_offset`) are used directly for pointer arithmetic without any validation that they fall within the buffer. Additionally, `GetVkGraph()` is called on the flatbuffer data without first calling `VerifyVkGraphBuffer()`, allowing a malformed FlatBuffer to direct reads to arbitrary heap locations. An attacker who can supply a crafted `.pte` model file (via CDN delivery or model download) can trigger out-of-bounds heap reads, potentially leaking sensitive data or causing crashes. **Source task**: [T259158394](https://www.internalfb.com/T259158394) --- ## 🔧 How This Fix Works **Priority 0**: Site of logic error | **Approach**: Defense-in-depth with buffer size threading, header validation, FlatBuffer verification, and constant data bounds checking The fix adds a `buffer_size` parameter to both `compileModel()` and `VulkanDelegateHeader::parse()`, threading the size information from `init()` (where `processed->size()` is available) through to the parsing layer. In `parse()`, the buffer is validated to be at least large enough for the header, and header offsets are checked against the buffer size with overflow-safe arithmetic. In `compileModel()`, a `flatbuffers::Verifier` is used to validate the FlatBuffer's structural integrity before `GetVkGraph()` is called. Finally, the `GraphBuilder` class now tracks `constant_data_size_` and validates `constant_bytes->offset()` against it before pointer arithmetic. This approach was chosen because it addresses the root cause (missing size information) at every layer rather than just adding a check at the crash site. --- ## 📊 Confidence Scores Each dimension was independently evaluated by a specialized reviewer agent. | Dimension | Score | Verdict | Why | |-----------|-------|---------|-----| | 🔒 **Security** (33.3%) | **85**/100 | ✅ PASS | The fix correctly addresses the root cause by threading buffer_size through the call chain, adding header offset bounds validation with overflow-safe arithmetic, adding FlatBuffer Verify() before parsing, and adding constant_data offset bounds checking. Minor gap: constant data validation only checks offset, not offset+size for full tensor extent. | | ⚙️ **Functionality** (33.3%) | **90**/100 | ✅ PASS | All callers (1 each for parse, compileModel, GraphBuilder ctor) are updated compatibly. All symbols verified. Error handling uses existing patterns. No API contract changes for valid inputs, no deadlock/ANR risk. Related task T259294846 confirms this is a systematic fix pattern. | | ⚡ **Performance** (33.3%) | **92**/100 | ✅ PASS | The fix adds FlatBuffer verification and bounds checks exclusively in the one-time model init/compile path, not in the per-inference execute() hot path. The Verifier performs a single O(n) traversal proportional to existing work, and header bounds checks are O(1). No locks, no heap allocations. | **Composite: 89/100** (threshold: 75) ✅ --- ## 📜 Historical Risk Context No prior incidents found for this module. Related task T259294846 identifies the same vulnerability class in other ExecuTorch backends, confirming this is a systematic pattern requiring the same defense-in-depth approach. --- ## ✅ Verification | Check | Result | |-------|--------| | buck2 build | ⏳ PENDING (build timeout after 15 min, skipped per pipeline requirements) | | arc lint | ✅ PASS | | POC re-verification | ⏳ PENDING (build timeout) | | buck2 test | ➖ N/A | Reviewed By: GregoryComer Differential Revision: D98413109
Summary:
🐛 Vulnerability
heap_buffer_over_read in
VulkanBackend.cpp:compileModel(line 592)The Vulkan backend's
compileModel()function accepts a raw buffer pointer without a size parameter, making bounds checking structurally impossible. When loading a.ptemodel file, header offsets parsed from the buffer (e.g.,flatbuffer_offset,bytes_offset) are used directly for pointer arithmetic without any validation that they fall within the buffer. Additionally,GetVkGraph()is called on the flatbuffer data without first callingVerifyVkGraphBuffer(), allowing a malformed FlatBuffer to direct reads to arbitrary heap locations. An attacker who can supply a crafted.ptemodel file (via CDN delivery or model download) can trigger out-of-bounds heap reads, potentially leaking sensitive data or causing crashes.Source task: T259158394
🔧 How This Fix Works
Priority 0: Site of logic error | Approach: Defense-in-depth with buffer size threading, header validation, FlatBuffer verification, and constant data bounds checking
The fix adds a
buffer_sizeparameter to bothcompileModel()andVulkanDelegateHeader::parse(), threading the size information frominit()(whereprocessed->size()is available) through to the parsing layer. Inparse(), the buffer is validated to be at least large enough for the header, and header offsets are checked against the buffer size with overflow-safe arithmetic. IncompileModel(), aflatbuffers::Verifieris used to validate the FlatBuffer's structural integrity beforeGetVkGraph()is called. Finally, theGraphBuilderclass now tracksconstant_data_size_and validatesconstant_bytes->offset()against it before pointer arithmetic. This approach was chosen because it addresses the root cause (missing size information) at every layer rather than just adding a check at the crash site.📊 Confidence Scores
Each dimension was independently evaluated by a specialized reviewer agent.
Composite: 89/100 (threshold: 75) ✅
📜 Historical Risk Context
No prior incidents found for this module. Related task T259294846 identifies the same vulnerability class in other ExecuTorch backends, confirming this is a systematic pattern requiring the same defense-in-depth approach.
✅ Verification
Reviewed By: GregoryComer
Differential Revision: D98413109