Validate input channels match kernel in depthwise/separable conv#22687
Validate input channels match kernel in depthwise/separable conv#22687hertschuh merged 6 commits intokeras-team:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a _check_input_channels_match_kernel helper function to validate input channels for depthwise and separable convolutions. The review feedback suggests enhancing this helper to support symbolic tensors with None dimensions, which would allow for its reuse in DepthwiseConv.compute_output_spec and eliminate redundant validation logic.
| def _check_input_channels_match_kernel(inputs, kernel, data_format): | ||
| input_channels = ( | ||
| inputs.shape[-1] if data_format == "channels_last" else inputs.shape[1] | ||
| ) | ||
| kernel_input_channels = kernel.shape[-2] | ||
| if input_channels != kernel_input_channels: | ||
| raise ValueError( | ||
| "The number of input channels must match the kernel's " | ||
| f"input channels. Received: input channels=" | ||
| f"{input_channels}, kernel input channels=" | ||
| f"{kernel_input_channels}, data_format='{data_format}'." | ||
| ) |
There was a problem hiding this comment.
To improve reusability and reduce code duplication, you can modify this helper function to also handle symbolic tensors, where shape dimensions can be None. This would allow you to use it in DepthwiseConv.compute_output_spec as well.
By adding is not None checks, this function becomes safe for both eager and symbolic execution paths.
def _check_input_channels_match_kernel(inputs, kernel, data_format):
input_channels = (
inputs.shape[-1] if data_format == "channels_last" else inputs.shape[1]
)
kernel_input_channels = kernel.shape[-2]
if (
input_channels is not None
and kernel_input_channels is not None
and input_channels != kernel_input_channels
):
raise ValueError(
"The number of input channels must match the kernel's "
f"input channels. Received: input channels="
f"{input_channels}, kernel input channels="
f"{kernel_input_channels}, data_format='{data_format}'."
)| input_channels = ( | ||
| inputs.shape[-1] | ||
| if self.data_format == "channels_last" | ||
| else inputs.shape[1] | ||
| ) | ||
| kernel_input_channels = kernel.shape[-2] | ||
| if ( | ||
| input_channels is not None | ||
| and kernel_input_channels is not None | ||
| and input_channels != kernel_input_channels | ||
| ): | ||
| raise ValueError( | ||
| "The number of input channels must match the kernel's " | ||
| f"input channels. Received: input channels=" | ||
| f"{input_channels}, kernel input channels=" | ||
| f"{kernel_input_channels}, data_format=" | ||
| f"'{self.data_format}'." | ||
| ) |
There was a problem hiding this comment.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #22687 +/- ##
==========================================
+ Coverage 82.95% 83.03% +0.08%
==========================================
Files 596 596
Lines 69252 69858 +606
Branches 10814 10884 +70
==========================================
+ Hits 57451 58010 +559
- Misses 8973 8999 +26
- Partials 2828 2849 +21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
hertschuh
left a comment
There was a problem hiding this comment.
Thanks for looking into this!
| return DepthwiseConv( | ||
| strides, padding, data_format, dilation_rate | ||
| ).symbolic_call(inputs, kernel) | ||
| _check_input_channels_match_kernel(inputs, kernel, data_format) |
There was a problem hiding this comment.
The reason why we don't do any validation involving the inputs in the backend-independent code is that there are scenarios where this won't work:
- inputs can be a nested Python array and won't have a shape
- inputs can be a tracer / placeholder tensor and have dynamic dimensions
I would check which backend don't handle this correctly, then add this check for the backends that need it after the convert_to_tensor(inputs).
…per review feedback
|
Thanks for the review! I've pushed a refactor addressing the feedback:
Verified all four backends now raise the same clear |
| if ( | ||
| input_channels is None | ||
| or kernel_input_channels is None | ||
| or input_channels == kernel_input_channels | ||
| ): | ||
| return | ||
| raise ValueError( |
There was a problem hiding this comment.
Write it like this:
if (
isinstance(input_channels, int)
and isinstance(kernel_input_channels, int)
and input_channels != kernel_input_channels
):
raise ValueError(...)The reason is that dynamic dimensions can come in different forms, not just None, during tracing.
| inputs = convert_to_tensor(inputs) | ||
| kernel = convert_to_tensor(kernel) | ||
| check_depthwise_conv_input_channels(inputs, kernel, data_format) | ||
| num_spatial_dims = len(inputs.shape) - 2 | ||
| if num_spatial_dims > 2: | ||
| raise ValueError( | ||
| "`inputs` rank must be 3 (1D conv) or 4 (2D conv). Received: " | ||
| f"{inputs.ndim}." | ||
| ) |
There was a problem hiding this comment.
Can you move the check_depthwise_conv_input_channels after the check on num_spatial_dims.
The reason is that if the number of dimensions is incorrect, the math in check_depthwise_conv_input_channels will be incorrect and meaningless and may raise a confusing error.
| inputs = convert_to_tensor(inputs) | ||
| kernel = convert_to_tensor(kernel) | ||
| check_depthwise_conv_input_channels(inputs, kernel, data_format) |
There was a problem hiding this comment.
Any reason why you didn't do separable_conv?
| inputs = convert_to_tensor(inputs) | ||
| depthwise_kernel = convert_to_tensor(depthwise_kernel) | ||
| pointwise_kernel = convert_to_tensor(pointwise_kernel) | ||
| check_depthwise_conv_input_channels(inputs, depthwise_kernel, data_format) | ||
| num_spatial_dims = len(inputs.shape) - 2 | ||
| if num_spatial_dims > 2: | ||
| raise ValueError( | ||
| "`num_spatial_dims` must be 1 or 2. Received: " | ||
| f"num_spatial_dims={num_spatial_dims}." | ||
| ) |
There was a problem hiding this comment.
Can you move the check_depthwise_conv_input_channels after the check on num_spatial_dims.
The reason is that if the number of dimensions is incorrect, the math in check_depthwise_conv_input_channels will be incorrect and meaningless and may raise a confusing error.
| inputs = convert_to_tensor(inputs) | ||
| kernel = convert_to_tensor(kernel) | ||
| check_depthwise_conv_input_channels(inputs, kernel, data_format) |
There was a problem hiding this comment.
Any reason why you didn't do separable_conv?
| data_format = backend.standardize_data_format(data_format) | ||
| inputs = convert_to_tensor(inputs) | ||
| kernel = convert_to_tensor(kernel) | ||
| check_depthwise_conv_input_channels(inputs, kernel, data_format) |
There was a problem hiding this comment.
Any reason why you didn't do separable_conv?
|
Thanks for the review! Pushed
Re-verified the #22516 repro raises the clear error on all four backends for both |
Oh, I did not realize that. |
hertschuh
left a comment
There was a problem hiding this comment.
Thanks for your contribution!
Description
Fixes #22516.
keras.ops.separable_convandkeras.ops.depthwise_convpreviously gave confusing error messages when the input's channel dimension didn't match the kernel's expected input channels (e.g., passing channels-last shaped data withdata_format="channels_first"):input depth must be evenly divisible by filter depth: 5 vs 3Kernel shape must have the same length as inputNeither message clearly identifies the root cause: a mismatch between the input's channel count and the kernel's expected input channels.
This PR adds explicit channel validation in both the symbolic path (
compute_output_spec) and the eager path (before dispatching to the backend), producing a clear error:Changes:
_check_input_channels_match_kernelhelper inkeras/src/ops/nn.pyDepthwiseConv.compute_output_spec(symbolic path)depthwise_conv()andseparable_conv()(eager path)Contributor Agreement