Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18962
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New Failures, 1 Cancelled Job, 3 Unrelated FailuresAs of commit 52b7f06 with merge base 75fe8e9 ( NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but was present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
adfc911 to
306d676
Compare
|
@claude review |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes a denial-of-service type confusion during EValue destruction by ensuring TensorList/ListOptionalTensor element destructors run on the already-unwrapped storage instead of re-dereferencing potentially-mutated wrapped pointers.
Changes:
- Added
BoxedEvalueList::destroy_elements()to destroyunwrapped_vals_without touchingwrapped_vals_. - Updated
EValuedestructor paths for TensorList and ListOptionalTensor to calldestroy_elements(). - Minor formatting adjustments to
EVALUE_DEFINE_TOmacro specializations.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| void destroy_elements() const { | ||
| for (typename executorch::aten::ArrayRef<T>::size_type i = 0; | ||
| i < wrapped_vals_.size(); | ||
| i++) { | ||
| unwrapped_vals_[i].~T(); | ||
| } | ||
| } |
There was a problem hiding this comment.
The loop bounds use wrapped_vals_.size() while indexing into unwrapped_vals_. If these ever diverge (e.g., partial unwrap, error paths, or refactoring), this becomes an out-of-bounds destructor call. Prefer iterating over the actual unwrapped element count (e.g., unwrapped_size_/unwrapped_vals_size_) or add an explicit invariant check that wrapped_vals_.size() matches the unwrapped storage length before iterating.
There was a problem hiding this comment.
I think this is fine - only wrapped_vals_ size is available, as unwrapped_vals_ is a raw pointer expected to be the same size.
see
// Source of truth for the list
executorch::aten::ArrayRef<EValue*> wrapped_vals_;
// Same size as wrapped_vals
mutable T* unwrapped_vals_;
| void destroy_elements() const { | ||
| for (typename executorch::aten::ArrayRef<T>::size_type i = 0; | ||
| i < wrapped_vals_.size(); | ||
| i++) { | ||
| unwrapped_vals_[i].~T(); | ||
| } | ||
| } |
There was a problem hiding this comment.
destroy_elements() is const but it explicitly destroys the underlying objects, which makes subsequent use of the list (including calling get()) undefined behavior. Consider making this method non-const and/or renaming to something that clearly signals one-way invalidation (e.g., destroy_unwrapped_elements_for_destruction()), and document that the list must not be accessed after calling it.
|
Claude finished @lucylq's task in 2m 29s —— View job Review of PR #18962 — Destroy unwrapped_vals instead of wrapped_vals
SummaryThis PR addresses TOB-EXECUTORCH-31, a denial-of-service vulnerability caused by type confusion during The fix is sound: by adding Correctness AnalysisThe fix is correct. Here's the reasoning:
Issues and Suggestions1. Loop bound mismatch (minor concern) — for (typename executorch::aten::ArrayRef<T>::size_type i = 0;
i < wrapped_vals_.size(); // <-- uses wrapped_vals_ size
i++) {
unwrapped_vals_[i].~T(); // <-- indexes into unwrapped_vals_
}
2. The method is marked That said, 3. Comment refers to Copilot flagged this too. The comment says "does not access the values_ table" but * This is safe to call during EValue destruction because it does not
* dereference wrapped_vals_, which may point to EValues mutated by
* MoveCall instructions.This is a nit — the current comment conveys the right idea. 4. Double-destruction safety — If 5. Whitespace-only changes in The diff includes two whitespace changes ( VerdictThe fix is well-targeted and correct. It eliminates the TOCTOU window during destruction without changing the runtime behavior for non-malicious programs. The main areas for potential improvement are the comment wording (nit) and the latent assumption about uninitialized Looks good to merge with the optional comment wording improvement. | |
GregoryComer
left a comment
There was a problem hiding this comment.
I don't have full context on the wrapping logic, but approving to unblock. The two copilot comments seem reasonable, but I'll defer to your judgement.
MoveCall instructions can overwrite values_ entries after parseTensorList validated their types, creating a time-of-check-time-of-use window. The existing to<T>() calls have ET_CHECK_MSG type guards that abort on mismatch, but the error messages don't indicate the TOCTOU cause. Add explicit type validation with clear diagnostic messages in: - BoxedEvalueList<optional<Tensor>>::get() (evalue.cpp) - Improved null check messages in the generic template (evalue.h) The to<T>() type checks in EValue provide defense in depth for the generic template (including BoxedEvalueList<Tensor> and <int64_t>). Note: fully preventing the abort (DoS) would require changing BoxedEvalueList<T>::get() to return Result<ArrayRef<T>>, which is a larger API change tracked separately. Addresses TOB-EXECUTORCH-31. This PR was authored with the assistance of Claude.
responded to copilot's comments and made destroy non-const. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } else if ( | ||
| isTensorList() && | ||
| payload.copyable_union.as_tensor_list_ptr != nullptr) { | ||
| // for (auto& tensor : toTensorList()) { | ||
| for (auto& tensor : payload.copyable_union.as_tensor_list_ptr->get()) { | ||
| tensor.~Tensor(); | ||
| } | ||
| payload.copyable_union.as_tensor_list_ptr->destroy_elements(); | ||
| } else if ( |
There was a problem hiding this comment.
Consider adding a regression test for the reported MoveCall/type-confusion scenario: construct an EValue TensorList, call toTensorList()/get() to populate unwrapped_vals_, then mutate one of the underlying wrapped EValues to a non-tensor (e.g., Int) and ensure destroying the list EValue does not crash. This would lock in the security fix and prevent reintroducing the destructor-time dereference path.
…resolution (#19163) ### Summary The Pico2 firmware link fails with an undefined reference to BoxedEvalueList<std::optional<Tensor>>::get() from RegisterCodegenUnboxedKernelsEverything.cpp.obj inside libportable_ops_lib.a. ld walks the link line once: by the time --whole-archive pulls the codegen TU's reference into the link, libexecutorch_core.a (which contains the evalue.cpp specialization) has already been processed. Pico2 was green until #18962 (TOB-EXECUTORCH-31 fix), which removed EValue::~EValue's inline call to BoxedEvalueList::get(). Before that commit, libexecutorch_core.a had its own use of the symbol that forced evalue.cpp.o to be pulled early; the link-order bug was always latent and the destructor change unmasked it. Move libexecutorch_core.a to after the kernel archives in all four USE_CMSIS_NN x USE_SELECTIVE_BUILD branches so the codegen TUs can still resolve their references after --whole-archive. Authored with Claude. ### Test plan CI
Destroy unwrapped_vals instead of dereferencing wrapped_vals. When a kernel uses TensorLists, it calls EValue::toTensorList(). This dereferences wrapped_vals into unwrapped_vals to get the tensor list. During execution, a (crafted) MoveCall potentially moves an Int into the TensorList. This means wrapped_vals now points to an Int, whereas unwrapped_vals still holds a Tensor. Instead of calling destructor on the wrapped_vals (ref to tensor), call the destructor on the unwrapped_vals which contain the real tensor. Vulnerability: During method destruction, the BoxedEvalueList dereferences its stored pointer and attempts to convert the swapped value to a Tensor, causing a type confusion that terminates the process. This results in a denial of service. Addresses TOB-EXECUTORCH-31. This PR was authored with the assistance of Claude. Co-authored-by: Github Executorch <github_executorch@arm.com>
Destroy unwrapped_vals instead of dereferencing wrapped_vals.
When a kernel uses TensorLists, it calls EValue::toTensorList(). This dereferences wrapped_vals into unwrapped_vals to get the tensor list.
During execution, a (crafted) MoveCall potentially moves an Int into the TensorList. This means wrapped_vals now points to an Int, whereas unwrapped_vals still holds a Tensor.
Instead of calling destructor on the wrapped_vals (ref to tensor), call the destructor on the unwrapped_vals which contain the real tensor.
Vulnerability: During method destruction, the BoxedEvalueList dereferences its stored pointer and attempts to convert the swapped value to a Tensor, causing a type confusion that terminates the process. This results in a denial of service.
Addresses TOB-EXECUTORCH-31.
This PR was authored with the assistance of Claude.