Skip to content

nodes_morphology: implement with max-pooling.#13389

Open
maromri wants to merge 3 commits intoComfy-Org:masterfrom
maromri:feature/implement-morphology-with-max-pool
Open

nodes_morphology: implement with max-pooling.#13389
maromri wants to merge 3 commits intoComfy-Org:masterfrom
maromri:feature/implement-morphology-with-max-pool

Conversation

@maromri
Copy link
Copy Markdown
Contributor

@maromri maromri commented Apr 13, 2026

Why?

  • The ImageMorphology node uses kornia's morphology operations which default to the unfold engine.
  • Given a kernel size K, this engine calls tensor.unfold(2, K, 1).unfold(3, K, 1), materializing the entire KxK neighborhood for every pixel! (=all possible KxK patches at once)
  • Thus, memory scales as $O(H * W * K^2)$, which becomes catastrophic with larger kernel sizes when running on GPU (e.g. with the --gpu-only flag), as measured on H100 80GB, 1024x1024 RGB image, dilate operation:
Kernel Size Peak VRAM
3 ~0.9 GB
11 ~2.5 GB
31 ~14.5 GB
51 ~35.4 GB
71 ~65.6 GB
99 OOM (>80 GB)

What?

Replace kornia morphology calls with F.max_pool2d-based implementations:

F.max_pool2d(x, kernel_size, stride=1, padding=kernel_size//2)

And its negation for erosion.

  • This is a valid replacement because the node only ever uses flat square structuring elements (all-ones kernel), which is mathematically equivalent to a max/min pool.
  • All composite operations (open, close, gradient, top_hat, bottom_hat) are composed from these two primitives.
  • F.max_pool2d uses a native sliding-window kernel with $O(1)$ extra memory regardless of kernel size. After this change, VRAM usage is ~1.1 GB constant across all kernel sizes tested (3 through 99) on 1024x1024 images.

Running example

The following results compare the previous kornia-based implementation with the proposed F.max_pool2d-based implementation on a 1024x1024 test image:

morphology_test_mask

Operation kornia F.max_pool2d
Erode, kernel=55 erode_kornia erode_maxpool
Dilate, kernel=35 dilate_kornia dilate_maxpool
Open, kernel=45 open_kornia open_maxpool
Close, kernel=45 close_kornia close_maxpool
Gradient, kernel=55 gradient_kornia gradient_maxpool
Bottom hat, kernel=55 bottom_hat_kornia bottom_hat_maxpool
Top hat, kernel=55 top_hat_kornia top_hat_maxpool

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

The PR refactors morphology operators in the nodes_morphology.py file. It replaces Kornia-based implementations of erosion, dilation, opening, closing, gradient, top_hat, and bottom_hat operations with custom implementations using torch.nn.functional.max_pool2d-based max and min pooling. Two helper functions (_max_pool_dilate and _max_pool_erode) are added to handle these operations. The execute() method is updated to use these new pooling-based approaches instead of explicit kernel tensor creation.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and concisely describes the main change: replacing morphology operations with max-pooling implementations.
Description check ✅ Passed The pull request description is directly related to the changeset, providing clear motivation (VRAM usage with Kornia), detailed explanation of the solution (using F.max_pool2d), and visual comparisons demonstrating equivalence.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy_extras/nodes_morphology.py`:
- Around line 10-17: The pooling helpers must explicitly pad to keep output
shapes identical for both odd and even kernel_size and avoid implicit
zero-padding; change _max_pool_dilate and _max_pool_erode to explicitly F.pad
the input with left/top = kernel_size//2 and right/bottom = kernel_size - 1 -
(kernel_size//2), then call F.max_pool2d with padding=0 and stride=1 so shape is
preserved for even kernels; for dilation pad with a very low constant (e.g.,
tensor.new_tensor(-float("inf")) or a sufficiently small sentinel) so borders
don't spuriously raise maxima, and for erosion keep using the negation trick but
pad the negated tensor with -inf (so after negation the padding behaves like
+inf for min) before pooling and then negate the result back—update functions
_max_pool_dilate and _max_pool_erode accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 598d6909-d7dd-4ca7-8587-d880ea5325af

📥 Commits

Reviewing files that changed from the base of the PR and between acd7185 and 1ab72b1.

📒 Files selected for processing (1)
  • comfy_extras/nodes_morphology.py

Comment on lines +10 to +17
def _max_pool_dilate(tensor, kernel_size):
pad = kernel_size // 2
return F.max_pool2d(tensor, kernel_size, stride=1, padding=pad)


def _max_pool_erode(tensor, kernel_size):
pad = kernel_size // 2
return -F.max_pool2d(-tensor, kernel_size, stride=1, padding=pad)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify two claims:
# 1) even kernel with max_pool2d padding=k//2 changes output shape
# 2) zero-padded erosion differs from geodesic-style edge handling

python - <<'PY'
import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 5, 5)

print("Shape check:")
for k in (3, 4):
    y = F.max_pool2d(x, k, stride=1, padding=k//2)
    print(f"k={k} -> {tuple(y.shape)}")

k = 3
erode_zero = -F.max_pool2d(-x, k, stride=1, padding=k//2)

p = k // 2
x_inf = F.pad(x, (p, p, p, p), mode="constant", value=float("inf"))
erode_inf = -F.max_pool2d(-x_inf, k, stride=1, padding=0)

print("\nTop row erosion (zero pad):", erode_zero[0,0,0].tolist())
print("Top row erosion (+inf pad):", erode_inf[0,0,0].tolist())
PY

Repository: Comfy-Org/ComfyUI

Length of output: 178


🏁 Script executed:

cat -n comfy_extras/nodes_morphology.py | head -100

Repository: Comfy-Org/ComfyUI

Length of output: 4498


🏁 Script executed:

git log --oneline -n 20 -- comfy_extras/nodes_morphology.py

Repository: Comfy-Org/ComfyUI

Length of output: 114


🏁 Script executed:

git show 1ab72b1 --stat

Repository: Comfy-Org/ComfyUI

Length of output: 49344


🏁 Script executed:

git show 1ab72b1 -- comfy_extras/nodes_morphology.py

Repository: Comfy-Org/ComfyUI

Length of output: 4960


🏁 Script executed:

find . -type f -name "*.py" -path "*/tests*" -exec grep -l "morphology\|nodes_morphology" {} \;

Repository: Comfy-Org/ComfyUI

Length of output: 43


Handle even kernels and border padding explicitly in pooling helpers.

max_pool2d(..., padding=kernel_size // 2, stride=1) is not shape-preserving for even kernels—the output spatial dimensions increase by 1. Since the schema accepts even kernel_size values (line 34: step=1 from 3 to 999), operations like top_hat and bottom_hat that subtract pooled outputs from the original image will fail at runtime with shape mismatch.

Additionally, the negation trick in _max_pool_erode applies implicit zero-padding, which differs from proper geodesic erosion semantics where borders should be treated as infinity.

💡 Proposed fix
 def _max_pool_dilate(tensor, kernel_size):
-    pad = kernel_size // 2
-    return F.max_pool2d(tensor, kernel_size, stride=1, padding=pad)
+    pad_l = (kernel_size - 1) // 2
+    pad_r = kernel_size // 2
+    padded = F.pad(
+        tensor,
+        (pad_l, pad_r, pad_l, pad_r),
+        mode="constant",
+        value=float("-inf"),
+    )
+    return F.max_pool2d(padded, kernel_size, stride=1, padding=0)
 
 
 def _max_pool_erode(tensor, kernel_size):
-    pad = kernel_size // 2
-    return -F.max_pool2d(-tensor, kernel_size, stride=1, padding=pad)
+    pad_l = (kernel_size - 1) // 2
+    pad_r = kernel_size // 2
+    padded = F.pad(
+        tensor,
+        (pad_l, pad_r, pad_l, pad_r),
+        mode="constant",
+        value=float("inf"),
+    )
+    return -F.max_pool2d(-padded, kernel_size, stride=1, padding=0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_morphology.py` around lines 10 - 17, The pooling helpers
must explicitly pad to keep output shapes identical for both odd and even
kernel_size and avoid implicit zero-padding; change _max_pool_dilate and
_max_pool_erode to explicitly F.pad the input with left/top = kernel_size//2 and
right/bottom = kernel_size - 1 - (kernel_size//2), then call F.max_pool2d with
padding=0 and stride=1 so shape is preserved for even kernels; for dilation pad
with a very low constant (e.g., tensor.new_tensor(-float("inf")) or a
sufficiently small sentinel) so borders don't spuriously raise maxima, and for
erosion keep using the negation trick but pad the negated tensor with -inf (so
after negation the padding behaves like +inf for min) before pooling and then
negate the result back—update functions _max_pool_dilate and _max_pool_erode
accordingly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Regarding even kernels: this is true (even though even kernels don't really make sense for morphology, as there is no center pixel, so step=2 would make more sense here IMO).
  2. Regarding padding: I ran your script and both erosion rows are identical:
k=3 -> (1, 1, 5, 5)
k=4 -> (1, 1, 6, 6)
Top row erosion (zero pad): [1.0, 1.0, 1.0, 1.0, 1.0]
Top row erosion (+inf pad): [1.0, 1.0, 1.0, 1.0, 1.0]

This is because F.max_pool2d pads with -inf, not zero, so the border semantics already match kornia's geodesic padding.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maromri, thank you for the correction and for verifying empirically.

Point 2 — border padding: You're absolutely right. F.max_pool2d internally initializes its sliding window accumulator to -inf, not 0. This means the negation trick (-F.max_pool2d(-x, ...)) automatically gives +inf-equivalent border semantics for the erosion case, which is exactly what Kornia's geodesic erosion does. My concern there was unfounded — I apologize for the noise.

Point 1 — even kernels: The shape-mismatch risk is real, and I agree step=2 is the cleaner fix for this context. Morphological operations with even-sized kernels lack a well-defined center pixel anyway, so restricting the input to odd values via step=2 is both semantically correct and prevents any downstream shape issues. The current step=1 with min=3 still allows even values like 4, 6, etc.

(ᵔ◡ᵔ)∫


✏️ Learnings added
Learnt from: maromri
Repo: Comfy-Org/ComfyUI PR: 13389
File: comfy_extras/nodes_morphology.py:10-17
Timestamp: 2026-04-13T18:03:44.857Z
Learning: In `comfy_extras/nodes_morphology.py`, `F.max_pool2d` in PyTorch pads internally with `-inf` (not zero) for max pooling. This means the erosion negation trick `-F.max_pool2d(-tensor, kernel_size, stride=1, padding=pad)` already gives correct `+inf`-equivalent border semantics at the image boundaries, matching Kornia's geodesic erosion padding. Do not flag this as a border-padding correctness issue.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: rattus128
Repo: Comfy-Org/ComfyUI PR: 13062
File: comfy/ldm/lightricks/vae/causal_video_autoencoder.py:782-785
Timestamp: 2026-03-19T09:09:37.347Z
Learning: In `comfy/ldm/lightricks/vae/causal_video_autoencoder.py` (PR `#13062`), `SpaceToDepthDownsample.forward` does NOT need to buffer odd temporal tails internally. The invariant is guaranteed at two upstream points: (1) `VideoVAE.encode` truncates input to `1 + 8k` frames, so the remainder after the first frame is always a multiple of 8; (2) `Encoder.forward_orig` constrains `chunk_t` to {2, 4, 8m}, all divisors of 8, ensuring `torch.split` produces no partial final chunk. Every chunk reaching the downsample layers therefore has temporal size divisible by stride[0]=2. Do not flag the absence of per-layer remainder buffering as a bug.

Learnt from: pythongosssss
Repo: Comfy-Org/ComfyUI PR: 12555
File: comfy_extras/nodes_glsl.py:719-724
Timestamp: 2026-02-21T14:01:41.482Z
Learning: In PyOpenGL, bare Python scalars can be accepted for 1-element array parameters by NumberHandler. This means you can pass an int/float directly to OpenGL texture deletion (e.g., glDeleteTextures(tex)) without wrapping in a list. Verify function-specific expectations and ensure types match what the OpenGL call expects; use explicit lists only when the API requires an array.

Learnt from: jtydhr88
Repo: Comfy-Org/ComfyUI PR: 12757
File: comfy_extras/nodes_custom_sampler.py:1069-1089
Timestamp: 2026-03-04T14:05:31.426Z
Learning: In the ComfyUI sampling pipeline, treat percent_to_sigma(0.0) as a sentinel value (999999999.9) that means starting from pure noise. This is consistent with BasicScheduler via calculate_sigmas. The SamplingPercentToSigma node’s return_actual_sigma flag differentiates this sentinel from sigma_max. Reviewers should not flag CurveToSigmas or similar nodes that rely on percent_to_sigma as bugs; downstream samplers are expected to handle the sentinel correctly. When reviewing related sampling-related code, assume this sentinel semantics unless there is explicit handling for a real sigma_max.

Learnt from: kijai
Repo: Comfy-Org/ComfyUI PR: 13258
File: comfy_extras/nodes_frame_interpolation.py:151-189
Timestamp: 2026-04-04T13:29:15.653Z
Learning: In this ComfyUI codebase, node `execute()` inference is already run under a global `torch.inference_mode()` context established in the execution engine (e.g., `execution.py` around line ~732). During review, avoid recommending changes that wrap node inference loops in `torch.inference_mode()`—it is already applied, so such suggestions are likely redundant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant