Add Buck support for right-sizing operator registry #19513
Summary:

Adds Buck support for auto-right-sizing the ExecuTorch operator registry, matching the CMake feature landed in pytorch#19118.

A new `operator_registry_selective()` macro mirrors `prim_ops_registry_selective()`: a genrule stages `operator_registry.cpp`, `operator_registry.h`, and a generated `selected_max_kernel_num.h` in one artifact tree, and a per-binary `cxx_library` compiles them together. `operator_registry.cpp`'s existing `__has_include` ladder picks up `EXECUTORCH_SELECTED_MAX_KERNEL_NUM` from the generated header.

`executorch_generated_lib` gains an opt-in `auto_size_kernel_registry` parameter (default `False`). The operator registry is intentionally global, so the per-binary variant defines the same external-linkage symbols as the shared `//executorch/runtime/kernel:operator_registry`. Turning the flag on requires the consumer to ensure there is no transitive dep on the shared target. A user-supplied `-c executorch.max_kernel_num=N` still wins via the existing preprocessor flags.

When `include_all_prim_ops=False`, the binary already builds its own selective prim ops variant; the counter receives the aggregated `selected_prim_ops.h` (via the new `--selected-prim-ops-header` flag on `gen_max_kernel_num.py`) so prim ops compiled out under `ET_PRIM_OPS_SELECTIVE_BUILD` aren't counted.

Differential Revision: D104440331
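The sizing rules described above can be illustrated with a small Python sketch. It is not the code in `gen_max_kernel_num.py` or the Buck macro: the function names, the op names in the usage note, and the `DEFAULT_MAX_KERNEL_NUM` value are all hypothetical and exist only to show the two ideas from the summary, namely that prim ops compiled out of a selective build are not counted, and that an explicit user value takes precedence over the generated count.

```python
from typing import Iterable, Optional, Set

# Hypothetical fallback used only in this sketch; the real default is
# defined in the ExecuTorch sources and may differ.
DEFAULT_MAX_KERNEL_NUM = 2000


def count_kernels(
    selected_kernels: Iterable[str],
    prim_ops: Iterable[str],
    selected_prim_ops: Optional[Set[str]] = None,
) -> int:
    """Count the kernels that would be registered in one binary.

    If selected_prim_ops is None, every prim op is assumed to be compiled
    in (the include_all_prim_ops=True case). Otherwise only prim ops that
    appear in selected_prim_ops are counted.
    """
    kernels = set(selected_kernels)
    prims = set(prim_ops)
    if selected_prim_ops is not None:
        # Drop prim ops that the selective build compiles out.
        prims &= selected_prim_ops
    # Union so a name listed in both inputs is only counted once.
    return len(kernels | prims)


def resolve_max_kernel_num(
    user_override: Optional[int],
    generated_count: Optional[int],
) -> int:
    """Pick the registry size: user value, then generated count, then default."""
    if user_override is not None:
        return user_override
    if generated_count is not None:
        return generated_count
    return DEFAULT_MAX_KERNEL_NUM
```

For example, with two selected kernels and two available prim ops of which only one is selected, `count_kernels` returns 3 rather than 4, and `resolve_max_kernel_num(64, 3)` returns 64 because the user-supplied value wins.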
@rascani has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104440331.
Fixes #19512