feat(nodes): add expand_mask_with_fade to better handle canvas compositing needs

Previously we used erode/dilate and a Gaussian blur to expand and fade the edges of Canvas masks. The implementation had a number of problems:

- Erode/dilate kernel sizes were not calculated correctly, and extra iterations were run to compensate. As a result, the blur size, which should have been in pixels, was very inaccurate and unreliable.
- What we want is to add a "soft bleed" - like a drop shadow with no offset - starting from the edge of the mask and extending out by a given number of pixels. But Gaussian blur does not do this. The blurred area starts _inside_ the mask and extends outside it, so it blurs both inwards and outwards. We compensated for this by expanding the mask.
- Using a Gaussian blur can cause banding artifacts. Gaussian blur doesn't have a "size" or "radius" parameter in the sense that you think it should. It's a convolution matrix and there are _no zero values in the result_. This means that, far away from the mask, once compositing completes, we have some values that are very close to zero but not quite zero. These values are quantized by HTML Canvas, resulting in banding artifacts where you'd expect the blur to have faded to 0% alpha. At least, that is my understanding of why the banding artifacts occur.

The new node uses a better strategy to expand the mask and add the fade-out effect:

- Calculate the distance from each white pixel to the nearest black pixel.
- Normalize this distance by dividing by the fade size in px, then clip the values to 0 - 1. The result represents the distance of each white pixel to its nearest black pixel as a fraction of the fade size. At this point, it is a linear distribution.
- Create a polynomial to describe the fade's intensity so that we can have a smooth transition from the masked region (black) to unmasked (white). There are some magic numbers here, determined experimentally.
- Evaluate the polynomial over the normalized distances, so we now have a matrix representing the fade intensity for every pixel.
- Convert this matrix back to uint8 and apply it to the mask.

This works so much better than the previous method. Not only does it fix the banding issues, but when we enable "output only generated regions", we get a much smaller image.

Will add images to the PR to clarify.
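
The steps of the new strategy can be sketched outside of the node as a small standalone function. This is only an illustration, not the node itself: it uses NumPy alone, and `brute_force_distance` is a hypothetical helper that stands in for `cv2.distanceTransform` by computing exact Euclidean distances (only practical for tiny demo masks). The control points are copied from the diff; the function name and signature are assumptions made for this example.

```python
import numpy as np


def brute_force_distance(binary: np.ndarray) -> np.ndarray:
    """Euclidean distance from every pixel to the nearest zero pixel.

    Hypothetical stand-in for cv2.distanceTransform. It compares every pixel
    with every zero pixel, so it only suits tiny masks, and it assumes the
    mask contains at least one zero pixel.
    """
    ys, xs = np.indices(binary.shape)
    pixels = np.stack([ys.ravel(), xs.ravel()], axis=1)  # (N, 2) coordinates
    zeros = np.argwhere(binary == 0)  # (M, 2) coordinates of black pixels
    diff = pixels[:, None, :] - zeros[None, :, :]  # (N, M, 2) offsets
    dist = np.sqrt((diff**2).sum(axis=2)).min(axis=1)
    return dist.reshape(binary.shape)


def expand_mask_with_fade(mask: np.ndarray, fade_size_px: int, threshold: int = 0) -> np.ndarray:
    """Sketch of the fade: black stays black, white fades in over fade_size_px."""
    if fade_size_px < 1:
        # Assumption for this sketch: a zero fade size would divide by zero below.
        raise ValueError("fade_size_px must be at least 1")

    # Threshold to a binary mask: 0 for black, 255 for white.
    binary = np.where(mask > threshold, 255, 0).astype(np.uint8)
    black = binary == 0

    # Distance from each white pixel to the nearest black pixel, as a
    # fraction of the fade size, clipped to [0, 1].
    d_norm = np.clip(brute_force_distance(binary) / fade_size_px, 0, 1)

    # Control points from the diff: x is normalized distance, y is fade amount.
    x_control = np.array([0.0, 1.0 / fade_size_px, 0.2, 0.8, 1.0])
    y_control = np.array([0.0, 0.0, 0.2, 0.9, 1.0])

    # Least-squares cubic fit: five points, four coefficients, so the curve
    # approximates the control points rather than interpolating them exactly.
    poly = np.poly1d(np.polyfit(x_control, y_control, 3))
    feather = np.clip(poly(d_norm), 0, 1)

    return np.where(black, 0, (feather * 255).astype(np.uint8)).astype(np.uint8)


# One-row demo mask: 8 black pixels followed by 32 white pixels.
demo_mask = np.zeros((1, 40), dtype=np.uint8)
demo_mask[0, 8:] = 255
demo_result = expand_mask_with_fade(demo_mask, fade_size_px=16)
print(demo_result[0])
```

Because the cubic is a least-squares fit over five control points, values far from the mask may land slightly below 255 instead of exactly on it. That is worth keeping in mind if downstream code compares against pure white.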
2026-04-23 03:00:31 -04:00 · 2025-03-20 11:22:18 +10:00
parent 534f993023
commit 6cfeb71bed
1 changed files with 67 additions and 0 deletions
--- a/invokeai/app/invocations/image.py
+++ b/invokeai/app/invocations/image.py
@@ -1089,6 +1089,73 @@ class CanvasV2MaskAndCropInvocation(BaseInvocation, WithMetadata, WithBoard):
        return ImageOutput.build(image_dto)


+@invocation(
+    "expand_mask_with_fade", title="Expand Mask with Fade", tags=["image", "mask"], category="image", version="1.0.0"
+)
+class ExpandMaskWithFadeInvocation(BaseInvocation, WithMetadata, WithBoard):
+    """Expands a mask with a fade effect. The mask uses black to indicate areas to keep from the generated image and white for areas to discard.
+    The mask is thresholded to create a binary mask, and then a distance transform is applied to create a fade effect.
+    The fade size is specified in pixels, and the mask is expanded by that amount. The result is a mask with a smooth transition from black to white.
+    """
+
+    mask: ImageField = InputField(description="The mask to expand")
+    threshold: int = InputField(default=0, ge=0, le=255, description="The threshold for the binary mask (0-255)")
+    fade_size_px: int = InputField(default=32, ge=1, description="The size of the fade in pixels (at least 1)")
+
+    def invoke(self, context: InvocationContext) -> ImageOutput:
+        pil_mask = context.images.get_pil(self.mask.image_name, mode="L")
+
+        np_mask = numpy.array(pil_mask)
+
+        # Threshold the mask to create a binary mask - 0 for black, 255 for white
+        # If we don't threshold we can get some weird artifacts
+        np_mask = numpy.where(np_mask > self.threshold, 255, 0).astype(numpy.uint8)
+
+        # Create a mask for the black region (1 where black, 0 otherwise)
+        black_mask = (np_mask == 0).astype(numpy.uint8)
+
+        # Invert the black region
+        bg_mask = 1 - black_mask
+
+        # Create a distance transform of the inverted mask
+        dist = cv2.distanceTransform(bg_mask, cv2.DIST_L2, 5)
+
+        # Normalize distances so that pixels <fade_size_px become a linear gradient (0 to 1)
+        d_norm = numpy.clip(dist / self.fade_size_px, 0, 1)
+
+        # Control points: x values (normalized distance) and corresponding fade pct y values.
+
+        # There are some magic numbers here that are used to create a smooth transition:
+        # - The first point is at 0% of fade size from edge of mask (meaning the edge of the mask), and is 0% fade (black)
+        # - The second point is 1px from the edge of the mask and also has 0% fade, effectively expanding the mask
+        #   by 1px. This fixes an issue where artifacts can occur at the edge of the mask
+        # - The third point is at 20% of the fade size from the edge of the mask and has 20% fade
+        # - The fourth point is at 80% of the fade size from the edge of the mask and has 90% fade
+        # - The last point is at 100% of the fade size from the edge of the mask and has 100% fade (white)
+
+        # x values: 0 = mask edge, 1 = fade_size_px from edge
+        x_control = numpy.array([0.0, 1.0 / self.fade_size_px, 0.2, 0.8, 1.0])
+        # y values: 0 = black, 1 = white
+        y_control = numpy.array([0.0, 0.0, 0.2, 0.9, 1.0])
+
+        # Fit a cubic polynomial (least-squares) that smoothly approximates the control points
+        coeffs = numpy.polyfit(x_control, y_control, 3)
+        poly = numpy.poly1d(coeffs)
+
+        # Evaluate and clip the smooth mapping
+        feather = numpy.clip(poly(d_norm), 0, 1)
+
+        # Build final image.
+        np_result = numpy.where(black_mask == 1, 0, (feather * 255).astype(numpy.uint8))
+
+        # Convert back to PIL, grayscale
+        pil_result = Image.fromarray(np_result.astype(numpy.uint8), mode="L")
+
+        image_dto = context.images.save(image=pil_result, image_category=ImageCategory.MASK)
+
+        return ImageOutput.build(image_dto)
+
+
@invocation(
    "apply_mask_to_image",
    title="Apply Mask to Image",