b0330dc to
156fd04
Compare
Greptile Summary
This PR adds static metadata inference (ndim, dtype, layout) to
Confidence Score: 4/5
The PR is mostly safe but has one P1 defect in expand_dims schema inference that should be fixed before merging.
All prior review concerns (prev_c_idx, dead output_dtype_fn, label typos, join axis sign) are tracked in existing threads. One new P1 issue was found: negative axes in the expand_dims OutputLayout lambda trigger an assertion / undefined behaviour during Pipeline::Build() before the operator's own validation can fire. The remaining finding (dead variable input_idx) is P2. Score 4 reflects the single outstanding P1.
dali/operators/generic/expand_dims.cc (OutputLayout lambda, negative axis handling)
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Py as Python (pipeline build)
participant PL as Pipeline::Build()
participant NM as node_meta::ComputeDataNodeMetadata
participant OS as OpSchema::Calculate*
participant EX as Executor::Build (operator ctor)
participant OP as OperatorBase::Setup/Run
Py->>PL: Build(output_descs)
PL->>NM: ComputeDataNodeMetadata(graph)
loop DFS over OpNodes
NM->>NM: propagate producer OutputDesc → consumer InputDesc
NM->>OS: InferOutputMetadata() → CalculateOutputDType/NDim/Layout
OS-->>NM: optional<dtype/ndim/layout> stored in OpSpec::outputs_
end
NM-->>PL: metadata populated
PL->>EX: executor_->Build(graph)
EX->>EX: instantiate operators (DALI_ENFORCE axis validations run here)
EX-->>PL: built
PL-->>Py: pipeline ready
Py->>OP: Setup(output_desc, ws)
OP->>OP: ValidateInputMetadata(ws, spec)
OP->>OP: SetupImpl()
Py->>OP: Run(ws)
OP->>OP: RunImpl()
OP->>OP: ValidateOutputMetadata(ws, spec)
Reviews (4): Last reviewed commit: "Make default metadata policy opt-in. Mak..."
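The P1 finding above (negative axes reaching the layout inference before the operator's own validation) can be illustrated with a minimal Python sketch. This is not the DALI implementation: `infer_expand_dims_layout` is a hypothetical name, and the insertion semantics (each axis indexes the layout as it stands at that step) are an assumption made for the example. The point is only that static inference can return "unknown" for suspicious arguments instead of asserting, so the operator's validation reports the error later.

```python
from typing import Optional, Sequence


def infer_expand_dims_layout(
    input_layout: str, axes: Sequence[int], new_axis_names: str
) -> Optional[str]:
    """Hypothetical static layout inference for an expand_dims-like operator.

    Returns None ("unknown") rather than failing when the arguments look
    invalid, leaving error reporting to the operator's own validation.
    """
    if not input_layout or len(new_axis_names) != len(axes):
        return None  # nothing reliable to infer from
    layout = list(input_layout)
    for axis, name in zip(axes, new_axis_names):
        # Guard explicitly: list.insert would silently accept a negative or
        # too-large index, which would hide the problem instead of deferring it.
        if axis < 0 or axis > len(layout):
            return None
        layout.insert(axis, name)
    return "".join(layout)
```

With these assumed semantics, a valid call yields a concrete layout, while a negative or out-of-range axis yields `None` and no assertion fires at build time.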
644da43 to
e23b196
Compare
jantonguirao
left a comment
LGTM, except for the bugs raised by greptile and some minor comments
if (window_size == 0)
  return {};
Without it a debug build fired an assertion. This is not related to this change but prevented running all the tests in a debug build.
CI MESSAGE: [47951113]: BUILD STARTED
CI MESSAGE: [47951113]: BUILD FAILED
CI MESSAGE: [47993982]: BUILD STARTED
def __init__(self, name, device="cpu", source=None):
def __init__(
    self, name, device="cpu", source=None, index=0, ndim=None, dtype=None, layout=None
Nitpick: missing type hints
We pass some wild stuff there at times, so I wouldn't risk putting type hints here.
CI MESSAGE: [47993982]: BUILD FAILED
CI MESSAGE: [48010759]: BUILD STARTED
CI MESSAGE: [48023970]: BUILD STARTED
5f38835 to
20b9194
Compare
CI MESSAGE: [48024121]: BUILD STARTED
CI MESSAGE: [48010759]: BUILD FAILED
CI MESSAGE: [48024121]: BUILD FAILED
CI MESSAGE: [48125071]: BUILD STARTED
b4b3b73 to
63b8346
Compare
CI MESSAGE: [48133713]: BUILD STARTED
@greptileai Please look at expand_dims.cc:71 - out-of-range axes are handled here.
You're right - I see the check at line 85 in
DALI_ENFORCE(d < ndim,
             make_string("Data has not enough dimensions to add new axes at specified indices."));
This correctly validates that axes aren't out of range. When an axis index is too large,
jantonguirao
left a comment
Only minor comments left
dtype = self._operator._op_spec.OutputDesc(result_index)[3]
if dtype is not None:
    return dtype
self.run(self._eval_context)
[Minor] DRY - the ndim, dtype, and layout methods now share the same pattern:
if self._results is None:
    if init_spec := getattr(self._operator, "_init_spec", None):
        init_spec(self._inputs, self._args)
    val = self._operator._op_spec.OutputDesc(result_index)[N]
    if val is not None:
        return val
    self.run(self._eval_context)
Consider extracting a helper like _get_static_meta(self, result_index, field_index) to avoid repeating the init_spec / OutputDesc / fallback logic three times.
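A rough sketch of what such a helper could look like is below. Everything except the names quoted in the comment above (`_get_static_meta`, `_init_spec`, `_op_spec`, `OutputDesc`) is a stand-in: `FakeOpSpec`, the trimmed-down `Invocation` class, `run_fn` and `run_count` are hypothetical and exist only to make the example runnable. The field indices 3 (dtype) and 4 (layout) are taken from the diff snippets in this thread.

```python
from types import SimpleNamespace

DTYPE_FIELD = 3   # index used for dtype in the diff under review
LAYOUT_FIELD = 4  # index used for layout in the diff under review


class FakeOpSpec:
    """Stand-in for the real OpSpec; returns one fixed descriptor tuple."""

    def __init__(self, desc):
        self._desc = desc

    def OutputDesc(self, index):
        return self._desc


class Invocation:
    """Hypothetical, trimmed-down stand-in for the class under review."""

    def __init__(self, operator, run_fn):
        self._operator = operator
        self._run_fn = run_fn
        self._inputs, self._args = (), {}
        self._results = None
        self.run_count = 0

    def run(self):
        if self._results is None:
            self.run_count += 1
            self._results = self._run_fn()

    def _get_static_meta(self, result_index, field_index):
        """Return a statically inferred field, or None if running is required."""
        if self._results is not None:
            return None
        if init_spec := getattr(self._operator, "_init_spec", None):
            init_spec(self._inputs, self._args)
        return self._operator._op_spec.OutputDesc(result_index)[field_index]

    def dtype(self, result_index=0):
        val = self._get_static_meta(result_index, DTYPE_FIELD)
        if val is not None:
            return val
        self.run()  # fallback: compute the result and read its dtype
        return self._results[result_index].dtype


# Usage: the first operator knows its dtype statically, the second does not.
known = SimpleNamespace(_op_spec=FakeOpSpec((None, None, None, "float32", None)))
unknown = SimpleNamespace(_op_spec=FakeOpSpec((None,) * 5))
inv_known = Invocation(known, lambda: [SimpleNamespace(dtype="uint8")])
inv_unknown = Invocation(unknown, lambda: [SimpleNamespace(dtype="uint8")])
```

With this shape, each of the three accessors reduces to one helper call plus its own fallback read, which is the deduplication the comment asks for.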
CI MESSAGE: [48411019]: BUILD FAILED
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…ALI. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…y input layout. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
3a571e7 to
d876708
Compare
CI MESSAGE: [48484956]: BUILD STARTED
layout = self._operator._op_spec.OutputDesc(result_index)[4]
if layout is not None:
    layout = str(layout)
    return None if layout == "" else layout
[Minor] The layout method's fast path returns None for empty layout (return None if layout == "" else layout), but the fallback path returns self._results[result_index].layout() which returns an empty string for no-layout tensors. This means the two code paths return different values for the same semantic state.
Also, the type hint says -> str but the fast path can return None.
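One possible way to make both paths agree is to route them through a single normalizing function. The sketch below assumes `None` is the chosen representation for "no layout"; the comment above does not say which convention is preferred, and `normalize_layout` is a hypothetical name.

```python
from typing import Optional


def normalize_layout(layout) -> Optional[str]:
    """Map both "no layout" spellings (None and the empty string) to None."""
    if layout is None:
        return None
    layout = str(layout)
    return layout or None
```

If both the fast path and the fallback path returned `normalize_layout(...)`, the return annotation would become `Optional[str]`, which also addresses the type hint remark. Choosing the empty string instead would work equally well as long as both paths use it.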
auto input_layout = input_desc.layout.value_or("");

if (input_layout.empty()) {
  // If the layout was empty, we need the number of dimesnions, as "" is legal for any ndim.
[Nit] Typo: dimesnions → dimensions
Co-authored-by: Rostan Tabet rtabet@nvidia.com
Category:
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Description:
This change adds static metadata inference (ndim, layout, dtype) to OpSchema. Most operators can infer it from OpSpec.
OpSpec now carries the statically inferred metadata.
Actual inputs and outputs, as seen in the workspace, are now automatically validated against OpSpec in OperatorBase.
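As an illustration of the validation idea only (not the actual OperatorBase code), the following Python sketch compares observed metadata against statically inferred metadata and skips any field that could not be inferred. `StaticDesc` and `validate` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaticDesc:
    """Metadata for one input or output; None means "not statically known"."""
    ndim: Optional[int] = None
    dtype: Optional[str] = None
    layout: Optional[str] = None


def validate(expected: StaticDesc, actual: StaticDesc, what: str = "output") -> None:
    """Raise ValueError if a statically known field disagrees with the actual one."""
    for field in ("ndim", "dtype", "layout"):
        exp = getattr(expected, field)
        if exp is None:
            continue  # nothing was inferred, so there is nothing to check
        act = getattr(actual, field)
        if act != exp:
            raise ValueError(f"{what} {field} mismatch: expected {exp!r}, got {act!r}")
```

Treating unknown fields as "always valid" keeps operators that cannot infer part of their metadata usable while still catching contradictions where inference succeeded.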
There's a default policy for handling metadata - it's opt-in, but can be enabled for all schemas declared with DALI_SCHEMA if DALI_SCHEMA_DEFAULT_METADATA_POLICY is defined and set to nonzero. This is the case for the DALI project, so all internal operators implement the default policy and need to either opt out or override it if they don't conform.
Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A