Precompute T1 offset for quantized conv2d NHWC in TIE kernel
Summary:
Move the zero-point correction term `t1[oc] = -input_zero_point * sum(weight[oc])` from runtime (malloc + compute_t1_..._DWH + free per inference) to compile time via a new PrecomputeForQuantizedConvPass, mirroring the existing linear pass. The precomputed offset is threaded through a new optional "offset" parameter on cadence::quantized_conv2d_nhwc.per_tensor (defaults to None for backwards compatibility). The now-dead compute_t1_..._DWH functions are removed.
The TIE kernels assume the offset parameter is present, as in the quantized_linear case.
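A minimal sketch of the arithmetic behind the precomputation, in plain Python rather than the actual pass or kernel code. The function names (`precompute_t1`, `dot_with_runtime_correction`, `dot_with_precomputed_offset`) and the flattened per-output-channel weight layout are illustrative assumptions, not identifiers from this change. It shows that subtracting the input zero point inside the accumulation gives the same result as accumulating raw inputs and adding `t1[oc]` once:

```python
# Illustrative only: hypothetical helper names, not the real pass or kernel API.

def precompute_t1(weight, input_zero_point):
    """Compile-time step: t1[oc] = -input_zero_point * sum(weight[oc]).

    `weight` is assumed to be a list with one flat list of ints per
    output channel (all kernel taps and input channels flattened).
    """
    return [-input_zero_point * sum(w_oc) for w_oc in weight]


def dot_with_runtime_correction(x, w_oc, input_zero_point):
    """Reference: subtract the zero point from every input element."""
    return sum((xi - input_zero_point) * wi for xi, wi in zip(x, w_oc))


def dot_with_precomputed_offset(x, w_oc, t1_oc):
    """Accumulate raw inputs, then add the precomputed offset once."""
    return sum(xi * wi for xi, wi in zip(x, w_oc)) + t1_oc


# One flattened input patch and two output channels of weights.
weight = [[1, -2, 3, 4], [0, 5, -6, 2]]
x = [10, 20, 30, 40]
input_zero_point = 3

t1 = precompute_t1(weight, input_zero_point)

for oc, w_oc in enumerate(weight):
    ref = dot_with_runtime_correction(x, w_oc, input_zero_point)
    fast = dot_with_precomputed_offset(x, w_oc, t1[oc])
    assert ref == fast
```

Because `t1` depends only on the weights and the input zero point, both of which are constants in this sketch, it can be computed once ahead of time instead of on every inference.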
Differential Revision: D100690813