Add SeedVR2 support (CORE-6) by pollockjj · Pull Request #14424 · Comfy-Org/ComfyUI

pollockjj · 2026-06-12T00:56:49Z

Summary

Add native SeedVR2 model detection, latent format, dtype policy, and VAE support.
Add SeedVR2 conditioning, progressive sampler, post-processing, and workflow-node support.
Add SeedVR2 attention, VAE tiling, temporal chunking, and regression coverage.

Resubmission of #14110, reverted in #14359 pending cleanup.

Changes since #14110

Removed 826 lines of the 7,383 the reverted PR added:
- VAE cleanup:
  - training and gradient-checkpointing paths
  - diffusers-reference machinery with no inference path: distribution sampling, weight inflation, conv-transpose variants, unused resnet options
  - constructor-dead generality: write-only attention options, quant convs, unreachable resampling and pooling branches, and the slicing, padding, and pre-norm plumbing
  - stock tiled VAE behavior restored: VAEDecodeTiled/VAEEncodeTiled public controls back to master semantics
- the legacy ByteDance sampling-schedule path in the nodes (timestep shift transform and its constants), replaced by Comfy's native samplers
- dead DiT and rotary-embedding options no released checkpoint exercises (xpos, custom frequencies, shared qkv/mlp, temporal windowing, block-type plumbing), plus dead initialization and padding fallbacks
- the static-audit tests that covered the removed options; the suite now carries contract coverage only (model detection, dtype policy, tiling dispatch, node boundaries, sampler chunking)
Relocated about 180 lines to minimize SeedVR2's footprint in shared ComfyUI code:
- the variable-length attention path moved out of comfy/ldm/modules/attention.py into comfy/ldm/seedvr/attention.py
- SeedVR2 tiling and temporal handling moved out of comfy/sd.py into the VAE wrapper behind generic hooks; no model-specific branches remain in core
- net effect: outside comfy/ldm/seedvr/ and comfy_extras/nodes_seedvr.py the diff is registration and detection glue, and nodes.py carries one registration line
Validated output against the previously released Add SeedVR2 support (CORE-6) #14110 baseline: 36.9–47.6 dB PSNR across 3B fp8, 7B fp16, 7B sharp fp8, and 7B sharp fp16, on image, single-chunk video, and multi-chunk video; outputs at this head are byte-identical across two independent runs each.

Examples

4x image upscale, 7B fp16. The input below is a 1024x1024 original taken down to 256x256 with the same second-order Real-ESRGAN degradation pipeline the SeedVR2 paper evaluates against: two rounds of random blur, rescaling, gaussian/poisson noise, and JPEG, plus H.264/MPEG-4 compression, at a fixed seed. The exact script is in the gist for degrading your own test inputs the same way.

7B fp16 model, 4x upscale

Output file, full ComfyUI workflow embedded

7B sharp fp8 model, 4x upscale

Output file, full ComfyUI workflow embedded

In each input cell the red box holds the actual input pixels at native size; the rest of the cell is that same image zoomed 4x nearest to output scale, which is what the model starts from.

4x video upscale. Two clips, 256x256 and 97 frames each, degraded from 1024x1024 originals through the same pipeline, video-compression stages included. Same red-box convention on the input panels; panels are animated WebP display transcodes, full-fidelity videos linked under each set.

3B fp8 model, 4x upscale, 3-chunk temporal blend (clip 1)

Full videos: reference | degraded input | SeedVR2 output, full ComfyUI workflow embedded

7B sharp fp16 model, 4x upscale, auto chunking (clip 2)

Full videos: reference | degraded input | SeedVR2 output, full ComfyUI workflow embedded

Degradation provenance

The test image/video inputs are not simple bicubic downscales. Each input was produced from its reference with a published second-order degradation pipeline (Real-ESRGAN / RealBasicVSR) consisting of unsharp masking, two stages of random blur (iso, aniso, generalized, plateau, sinc kernels), random rescaling (bilinear, area, bicubic), gaussian or poisson noise, and JPEG compression, with video codec compression (h264, mpeg4) supplying the temporal degradation for video sources. This follows the SeedVR2 paper's evaluation protocol (https://arxiv.org/abs/2506.05301). The exact script is included in the gist (degrade_uav.py) so reviewers can degrade any image or video the same way for independent testing. The sources shown are roughly 20% of the overall images and videos used for evaluation purposes, with another 50% taken directly from SeedVR2's original evaluation sources.

Workflows

Example workflows are available here:
https://gist.github.com/pollockjj/8d40c875e9eacb9b560709faef4ea31f

Both workflows are generated against this PR's node schemas; the image workflow was executed end-to-end at this PR's head.

Models

https://huggingface.co/Comfy-Org/SeedVR2

Repackaged model files with conditioning embedded so only one model file is needed.

Prior art

Reference implementation: https://github.com/ByteDance-Seed/SeedVR (Apache-2.0). Original weights: https://huggingface.co/ByteDance-Seed/SeedVR2-3B and https://huggingface.co/ByteDance-Seed/SeedVR2-7B (Apache-2.0).
The rotary embedding derives from lucidrains/rotary-embedding-torch, reduced to the paths SeedVR2 exercises.

Tracking

Linear: https://linear.app/comfyorg/issue/CORE-6/support-seedvr2

Validation

SeedVR2 unit suite: 45 passed. Full tests-unit/comfy_test + tests-unit/comfy_extras_test: 153 passed.
Local SeedVR2 image and video workflows completed successfully on 3B fp16, 7B fp16, and 7B sharp fp8.
Image and video workflows at this PR's head are pixel-identical across two independent runs (two image, two video, four model variants).

Copilot

Pull request overview

This pull request integrates SeedVR2 as a first-class model family in ComfyUI, including model detection/configuration, native latent/conditioning handling, SeedVR2-specific VAE wrapper + tiling, and extra workflow nodes (preprocess/conditioning/progressive sampling/post-processing), with accompanying unit/regression coverage.

Changes:

Adds SeedVR2 model detection, config, latent format, dtype policy, and model base wiring.
Introduces SeedVR2 native VAE implementation/wrapper (including tiling + temporal/memory-state behavior) and SeedVR2 variable-length attention backend.
Adds SeedVR2 extra nodes (preprocess/conditioning/progressive sampler/post-processing) and a broad unit/regression test suite.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`comfy/model_detection.py`	Adds SeedVR2 UNet config detection heuristics (3B/7B layouts).
`comfy/supported_models.py`	Registers `SeedVR2` supported model + dtype policy hook.
`comfy/supported_models_base.py`	Extends `set_inference_dtype` API to optionally accept a device.
`comfy/model_base.py`	Adds `SeedVR2` `BaseModel` wrapper and passes SeedVR2 conditioning through `extra_conds`.
`comfy/latent_formats.py`	Introduces `SeedVR2` latent format (16-channel latent).
`comfy/sd.py`	Adds SeedVR2 VAE detection/wiring and “owned tiling” hooks for VAEs that handle their own tiling.
`nodes.py`	Registers `nodes_seedvr.py` in built-in extra nodes loading.
`comfy/ldm/seedvr/constants.py`	SeedVR2/ByteDance/published-standards constants used across the integration.
`comfy/ldm/seedvr/model.py`	Implements SeedVR2 NaDiT model and related utilities (windowing, RoPE, attention plumbing).
`comfy/ldm/seedvr/attention.py`	Adds SeedVR2 split-loop var-attention backend.
`comfy/ldm/seedvr/vae.py`	Implements SeedVR2 VAE + wrapper, causal memory-state handling, and tiled encode/decode.
`comfy/ldm/seedvr/color_fix.py`	Adds LAB/wavelet/AdaIN post-processing color transfer utilities.
`comfy_extras/nodes_seedvr.py`	Adds SeedVR2 preprocess, conditioning, progressive sampler, and post-processing nodes.
`tests-unit/comfy_test/model_detection_test.py`	Adds SeedVR2 model detection coverage.
`tests-unit/comfy_test/test_seedvr2_model.py`	Consolidated regression tests for model/rope/latent format and VAE graph boundaries.
`tests-unit/comfy_test/test_seedvr2_dtype.py`	Tests SeedVR2 dtype policy + conditioning behavior.
`tests-unit/comfy_test/test_seedvr2_internals.py`	Tests GroupNorm limit gating and SeedVR2 optimized var-attention contract.
`tests-unit/comfy_test/test_seedvr2_vae_decode.py`	Tests SeedVR2 VAE wrapper decode shape/rank contracts.
`tests-unit/comfy_test/test_seedvr2_vae_tiled.py`	Tests SeedVR2 VAE tiling/dispatch behavior (encode/decode fallback routing).
`tests-unit/comfy_test/test_seedvr_progressive_sampler.py`	Tests progressive sampler schema + chunking/validation behavior.
`tests-unit/comfy_test/seedvr_vae_forward_test.py`	Regression test for SeedVR VAE forward contract (no diffusers-style attributes).
`tests-unit/comfy_extras_test/test_seedvr2_nodes.py`	Verifies node schema/execute signature alignment for SeedVR2 nodes.
`tests-unit/comfy_extras_test/test_seedvr2_conditioning.py`	Tests SeedVR2 conditioning node output determinism + fail-loud behavior.
`tests-unit/comfy_extras_test/test_seedvr2_post_processing.py`	Tests post-processing schema and error messaging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+            if encode:
+                out = model.encode(t_chunk)[0]
+            else:
+                out = model.decode_(t_chunk)


coderabbitai · 2026-06-12T01:08:10Z

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4893c8f1-80c3-4f76-ada4-c53f2a4b9d94

📥 Commits

Reviewing files that changed from the base of the PR and between cfb9c31 and ad04a61.

📒 Files selected for processing (5)

comfy/model_base.py
comfy/model_detection.py
comfy/sd.py
comfy/supported_models.py
nodes.py

🚧 Files skipped from review as they are similar to previous changes (4)

nodes.py
comfy/model_detection.py
comfy/supported_models.py
comfy/sd.py

📝 Walkthrough

Walkthrough

This PR introduces SeedVR2, a complete video diffusion model architecture for ComfyUI. It adds a multimodal transformer (NaDiT) supporting video and text conditioning with windowed attention and rotary embeddings, a 3D causal VAE for video encoding/decoding with temporal memory management and overlap-aware tiling, color correction utilities using wavelet decomposition and CIELAB histogram matching, checkpoint detection for multiple model variants (7B separate/shared, 3B), Comfy workflow nodes for preprocessing, conditioning resolution, postprocessing with color correction modes, and progressive temporal chunk sampling with VRAM-aware auto-sizing, plus comprehensive regression tests validating the model pipeline and individual components.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 8.64% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main change: adding SeedVR2 support, with the linear ticket reference providing proper context.
Description check	✅ Passed	The description comprehensively documents the changes, validation results, examples, and references, directly relating to the SeedVR2 support implementation.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@comfy_extras/nodes_seedvr.py`:
- Around line 284-286: The code currently crops odd-sized tensors by computing
h2/w2 and slicing output (variables h2, w2, output) which discards the last
row/column; instead preserve the original height/width (use h =
output.shape[-3], w = output.shape[-2]) and stop slicing away pixels. If
downstream code requires even dimensions, change the behavior to pad to the next
even size (compute pad_h = 0 if h%2==0 else 1, pad_w = 0 if w%2==0 else 1) and
apply symmetric zero/transparent padding to output (and the alpha plane) rather
than trimming; track the original h/w so you can remove padding only if and when
needed, ensuring alignment logic and alpha restore operate on the original
resized canvas rather than a cropped one.

In `@comfy/ldm/modules/diffusionmodules/model.py`:
- Around line 35-37: Guard the new downscale_freq_shift before it's used to
avoid division by zero/negative scale: check that downscale_freq_shift is an
integer >= 0 and strictly less than half_dim (where half_dim = embedding_dim //
2) so that (half_dim - downscale_freq_shift) > 0; if the check fails, raise a
clear ValueError explaining that downscale_freq_shift must be in [0,
half_dim-1]. Apply this validation immediately before the existing emb
computation that uses emb = math.log(10000) / (half_dim - downscale_freq_shift)
so the function (timestep embedding helper) fails fast instead of producing
invalid tensors.

In `@comfy/ldm/seedvr/color_fix.py`:
- Around line 36-45: The function wavelet_decomposition can return an unbound
low_freq when levels==0; to fix, initialize low_freq before the loop (e.g., set
low_freq = image or low_freq = image.clone()) so low_freq is always defined;
keep high_freq as torch.zeros_like(image) and then run the existing loop that
uses wavelet_blur, high_freq.add_(...), and image = low_freq—this ensures the
final return (high_freq, low_freq) is safe even for levels=0.

In `@comfy/ldm/seedvr/model.py`:
- Around line 1274-1275: The window_method construction uses num_layers // 2 *
["720pwin_by_size_bysize","720pswin_by_size_bysize"] which yields one element
short for odd num_layers and causes indexing errors; update window_method so it
produces exactly num_layers entries (e.g., repeat the two-item pattern and
append the first item when num_layers is odd) and replace the same pattern
wherever window_method is built in this file (the other occurrence noted in the
review). Ensure the rest of the constructor variables (txt_dim, vid_dim) remain
unchanged.

In `@comfy/ldm/seedvr/vae.py`:
- Around line 255-264: The process-global _NORM_LIMIT and setter
set_norm_limit() cause cross-request interference; change the design so memory
limits are instance-scoped: remove reliance on the global _NORM_LIMIT in
GroupNorm-related logic and add a per-wrapper/per-model attribute (e.g., an
instance field on VideoAutoencoderKLWrapper and any GroupNorm-wrapping classes)
that set_memory_limit() updates, then flow that instance-specific limit into the
chunking/forward logic instead of calling set_norm_limit(); ensure
GroupNorm/chunking functions accept the limit as an argument or read it from the
instance so concurrent VAEs don’t mutate a shared global.

In `@comfy/sd.py`:
- Around line 1235-1237: The generic post-format hook here (formatter =
getattr(self.first_stage_model, "comfy_format_encoded", None); samples =
formatter(samples)) must not turn SeedVR2 single-frame 4-D latents into 5-D
tensors; update the hook call or its result handling so that if
comfy_format_encoded (or any formatter) returns a 5-D tensor with a temporal
axis of length 1 (ndim == 5 and shape[2] == 1) and the model/instance expects
latent_dimensions == 2 (SeedVR2), you squeeze the temporal axis back to a 4-D
layout (B, C, H, W). Reference comfy_format_encoded /
VideoAutoencoderKLWrapper.comfy_format_encoded and VAE.encode()/encode_tiled()
to implement this check-and-squeeze immediately after samples =
formatter(samples) (or alternatively restrict formatters to value-only
transforms), so downstream code continues to receive 2-D/4-D SeedVR2 latents.

In `@tests-unit/comfy_test/seedvr_vae_forward_test.py`:
- Around line 23-28: The test module mutates the global CLI flag by setting
cli_args.cpu based on torch.cuda.is_available(), causing cross-module state
leakage; update the module to save the original value of cli_args.cpu before
changing it and restore it after the test module runs (use a module-level
teardown or an autouse pytest fixture) so other tests aren't
affected—specifically wrap the change to cli_args.cpu (the symbol referenced in
the import) with save/restore logic or perform the toggle inside a fixture that
yields and resets the original value.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1bf2b866-576d-4b0b-b629-992a88010ac3

📥 Commits

Reviewing files that changed from the base of the PR and between 10d466b and 019f24f.

📒 Files selected for processing (25)

comfy/latent_formats.py
comfy/ldm/modules/diffusionmodules/model.py
comfy/ldm/seedvr/attention.py
comfy/ldm/seedvr/color_fix.py
comfy/ldm/seedvr/constants.py
comfy/ldm/seedvr/model.py
comfy/ldm/seedvr/vae.py
comfy/model_base.py
comfy/model_detection.py
comfy/sd.py
comfy/supported_models.py
comfy/supported_models_base.py
comfy_extras/nodes_seedvr.py
nodes.py
tests-unit/comfy_extras_test/test_seedvr2_conditioning.py
tests-unit/comfy_extras_test/test_seedvr2_nodes.py
tests-unit/comfy_extras_test/test_seedvr2_post_processing.py
tests-unit/comfy_test/model_detection_test.py
tests-unit/comfy_test/seedvr_vae_forward_test.py
tests-unit/comfy_test/test_seedvr2_dtype.py
tests-unit/comfy_test/test_seedvr2_internals.py
tests-unit/comfy_test/test_seedvr2_model.py
tests-unit/comfy_test/test_seedvr2_vae_decode.py
tests-unit/comfy_test/test_seedvr2_vae_tiled.py
tests-unit/comfy_test/test_seedvr_progressive_sampler.py

coderabbitai · 2026-06-12T01:08:13Z

+        h2 = output.shape[-3] - (output.shape[-3] % 2)
+        w2 = output.shape[-2] - (output.shape[-2] % 2)
+        output = output[:, :, :h2, :w2, :]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't trim odd-sized outputs here.

Lines 284-286 unconditionally drop the last row/column from odd-sized results, so post-processing no longer preserves the original resized dimensions. That breaks the node’s stated alignment behavior and also shifts/restores alpha onto a smaller canvas for odd-width or odd-height inputs.

Proposed fix

- h2 = output.shape[-3] - (output.shape[-3] % 2) - w2 = output.shape[-2] - (output.shape[-2] % 2) - output = output[:, :, :h2, :w2, :] if decoded_was_4d: output = output.reshape(-1, output.shape[-3], output.shape[-2], output.shape[-1]) return io.NodeOutput(output)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

h2 = output.shape[-3] - (output.shape[-3] % 2)

w2 = output.shape[-2] - (output.shape[-2] % 2)

output = output[:, :, :h2, :w2, :]

if decoded_was_4d:

output = output.reshape(-1, output.shape[-3], output.shape[-2], output.shape[-1])

return io.NodeOutput(output)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@comfy_extras/nodes_seedvr.py` around lines 284 - 286, The code currently crops odd-sized tensors by computing h2/w2 and slicing output (variables h2, w2, output) which discards the last row/column; instead preserve the original height/width (use h = output.shape[-3], w = output.shape[-2]) and stop slicing away pixels. If downstream code requires even dimensions, change the behavior to pad to the next even size (compute pad_h = 0 if h%2==0 else 1, pad_w = 0 if w%2==0 else 1) and apply symmetric zero/transparent padding to output (and the alpha plane) rather than trimming; track the original h/w so you can remove padding only if and when needed, ensuring alignment logic and alpha restore operate on the original resized canvas rather than a cropped one.

coderabbitai · 2026-06-12T01:08:13Z

    half_dim = embedding_dim // 2
-    emb = math.log(10000) / (half_dim - 1)
+    emb = math.log(10000) / (half_dim - downscale_freq_shift)
    emb = torch.exp(torch.arange(half_dim, dtype=torch.float32) * -emb)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard the new frequency-shift parameter.

downscale_freq_shift now feeds the divisor directly. Values >= embedding_dim // 2 make the frequency scale undefined or inverted, so this helper can return broken timestep embeddings instead of failing fast.

Suggested guard

def get_timestep_embedding(timesteps, embedding_dim, flip_sin_to_cos=False, downscale_freq_shift=1): assert len(timesteps.shape) == 1 half_dim = embedding_dim // 2 + if half_dim > 0 and downscale_freq_shift >= half_dim: + raise ValueError("downscale_freq_shift must be smaller than embedding_dim // 2") emb = math.log(10000) / (half_dim - downscale_freq_shift)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

half_dim = embedding_dim // 2

emb = math.log(10000) / (half_dim - 1)

emb = math.log(10000) / (half_dim - downscale_freq_shift)

emb = torch.exp(torch.arange(half_dim, dtype=torch.float32) * -emb)

def get_timestep_embedding(timesteps, embedding_dim, flip_sin_to_cos=False, downscale_freq_shift=1):

assert len(timesteps.shape) == 1

half_dim = embedding_dim // 2

if half_dim > 0 and downscale_freq_shift >= half_dim:

raise ValueError("downscale_freq_shift must be smaller than embedding_dim // 2")

emb = math.log(10000) / (half_dim - downscale_freq_shift)

emb = torch.exp(torch.arange(half_dim, dtype=torch.float32) * -emb)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@comfy/ldm/modules/diffusionmodules/model.py` around lines 35 - 37, Guard the new downscale_freq_shift before it's used to avoid division by zero/negative scale: check that downscale_freq_shift is an integer >= 0 and strictly less than half_dim (where half_dim = embedding_dim // 2) so that (half_dim - downscale_freq_shift) > 0; if the check fails, raise a clear ValueError explaining that downscale_freq_shift must be in [0, half_dim-1]. Apply this validation immediately before the existing emb computation that uses emb = math.log(10000) / (half_dim - downscale_freq_shift) so the function (timestep embedding helper) fails fast instead of producing invalid tensors.

coderabbitai · 2026-06-12T01:08:14Z

+def wavelet_decomposition(image: Tensor, levels: int = WAVELET_DECOMP_LEVELS):
+    high_freq = torch.zeros_like(image)
+
+    for i in range(levels):
+        radius = 2 ** i
+        low_freq = wavelet_blur(image, radius)
+        high_freq.add_(image).sub_(low_freq)
+        image = low_freq
+
+    return high_freq, low_freq


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

low_freq undefined when levels=0.

If levels=0 is passed, the loop body never executes and low_freq remains unbound, causing UnboundLocalError at the return statement. While the default is 5 and this is an internal function, a defensive initialization ensures robustness.

Proposed fix

def wavelet_decomposition(image: Tensor, levels: int = WAVELET_DECOMP_LEVELS): high_freq = torch.zeros_like(image) + low_freq = image # fallback when levels=0 for i in range(levels): radius = 2 ** i low_freq = wavelet_blur(image, radius) high_freq.add_(image).sub_(low_freq) image = low_freq return high_freq, low_freq

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@comfy/ldm/seedvr/color_fix.py` around lines 36 - 45, The function wavelet_decomposition can return an unbound low_freq when levels==0; to fix, initialize low_freq before the loop (e.g., set low_freq = image or low_freq = image.clone()) so low_freq is always defined; keep high_freq as torch.zeros_like(image) and then run the existing loop that uses wavelet_blur, high_freq.add_(...), and image = low_freq—this ensures the final return (high_freq, low_freq) is safe even for levels=0.

coderabbitai · 2026-06-12T01:08:14Z

+from comfy.cli_args import args as cli_args
+
+if not torch.cuda.is_available():
+    cli_args.cpu = True
+
+from comfy.ldm.seedvr.vae import VideoAutoencoderKL  # noqa: E402


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid leaking args.cpu across test modules.

tests-unit/comfy_test/seedvr_vae_forward_test.py, tests-unit/comfy_test/test_seedvr2_internals.py, tests-unit/comfy_test/test_seedvr2_model.py, tests-unit/comfy_test/test_seedvr2_vae_decode.py, and tests-unit/comfy_test/test_seedvr2_vae_tiled.py all set the same process-global CLI flag during import and never restore it. The shared root cause is test-state leakage: once one module forces CPU mode, later modules in the same pytest worker can run under the wrong backend configuration. Save the previous value and restore it after each module (for example with module teardown or an autouse fixture).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests-unit/comfy_test/seedvr_vae_forward_test.py` around lines 23 - 28, The test module mutates the global CLI flag by setting cli_args.cpu based on torch.cuda.is_available(), causing cross-module state leakage; update the module to save the original value of cli_args.cpu before changing it and restore it after the test module runs (use a module-level teardown or an autouse pytest fixture) so other tests aren't affected—specifically wrap the change to cli_args.cpu (the symbol referenced in the import) with save/restore logic or perform the toggle inside a fixture that yields and resets the original value.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@comfy/sd.py`:
- Around line 1283-1297: The code is incorrectly truncating temporal frames by
slicing pixel_samples to maximum after converting through upscale/downscale;
instead compute any latent-aligned length needed (e.g. latent_max =
self.upscale_ratio[0](self.downscale_ratio[0](pixel_samples.shape[2]))) but do
NOT drop frames — pass the full pixel_samples to encode_tiled_3d and either pad
pixel_samples to latent_max or let encode_tiled_3d handle padding/alignment;
update the block around maximum/pixel_samples/encode_tiled_3d (symbols: maximum,
pixel_samples, encode_tiled_3d, encode_tiled) so temporal length is preserved
and tiling math uses padding/alignment rather than slicing.

In `@tests-unit/comfy_test/model_detection_test.py`:
- Around line 174-184: Add an assertion in
test_seedvr2_3b_shared_mm_detection_config to lock the 3B-only vid_out_norm
flag: after calling detect_unet_config(sd, "") and validating unet_config
fields, assert that unet_config["vid_out_norm"] is True so the 3B-specific
runtime knob (vid_out_norm) is covered; locate the test function
test_seedvr2_3b_shared_mm_detection_config and the detect_unet_config result to
add this check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1d80d0a2-2f48-4281-a04a-3b6e7289df8a

📥 Commits

Reviewing files that changed from the base of the PR and between 019f24f and cfb9c31.

📒 Files selected for processing (25)

comfy/latent_formats.py
comfy/ldm/modules/diffusionmodules/model.py
comfy/ldm/seedvr/attention.py
comfy/ldm/seedvr/color_fix.py
comfy/ldm/seedvr/constants.py
comfy/ldm/seedvr/model.py
comfy/ldm/seedvr/vae.py
comfy/model_base.py
comfy/model_detection.py
comfy/sd.py
comfy/supported_models.py
comfy/supported_models_base.py
comfy_extras/nodes_seedvr.py
nodes.py
tests-unit/comfy_extras_test/test_seedvr2_conditioning.py
tests-unit/comfy_extras_test/test_seedvr2_nodes.py
tests-unit/comfy_extras_test/test_seedvr2_post_processing.py
tests-unit/comfy_test/model_detection_test.py
tests-unit/comfy_test/seedvr_vae_forward_test.py
tests-unit/comfy_test/test_seedvr2_dtype.py
tests-unit/comfy_test/test_seedvr2_internals.py
tests-unit/comfy_test/test_seedvr2_model.py
tests-unit/comfy_test/test_seedvr2_vae_decode.py
tests-unit/comfy_test/test_seedvr2_vae_tiled.py
tests-unit/comfy_test/test_seedvr_progressive_sampler.py

✅ Files skipped from review due to trivial changes (1)

comfy/latent_formats.py

🚧 Files skipped from review as they are similar to previous changes (22)

comfy/supported_models_base.py
nodes.py
comfy/ldm/modules/diffusionmodules/model.py
comfy/model_base.py
comfy/supported_models.py
tests-unit/comfy_test/test_seedvr2_internals.py
comfy/model_detection.py
tests-unit/comfy_test/test_seedvr2_model.py
comfy/ldm/seedvr/attention.py
tests-unit/comfy_extras_test/test_seedvr2_nodes.py
tests-unit/comfy_extras_test/test_seedvr2_conditioning.py
comfy/ldm/seedvr/color_fix.py
comfy/ldm/seedvr/constants.py
tests-unit/comfy_test/test_seedvr2_dtype.py
tests-unit/comfy_test/test_seedvr2_vae_decode.py
tests-unit/comfy_test/test_seedvr2_vae_tiled.py
tests-unit/comfy_extras_test/test_seedvr2_post_processing.py
comfy/ldm/seedvr/vae.py
tests-unit/comfy_test/seedvr_vae_forward_test.py
comfy_extras/nodes_seedvr.py
comfy/ldm/seedvr/model.py
tests-unit/comfy_test/test_seedvr_progressive_sampler.py

coderabbitai · 2026-06-12T16:29:31Z

+    def test_seedvr2_3b_shared_mm_detection_config(self):
+        sd = _make_seedvr2_3b_shared_mm_sd()
+        unet_config = detect_unet_config(sd, "")
+
+        assert unet_config is not None
+        assert unet_config["image_model"] == "seedvr2"
+        assert unet_config["vid_dim"] == 2560
+        assert unet_config["heads"] == 20
+        assert unet_config["num_layers"] == 32
+        assert unet_config["mlp_type"] == "swiglu"
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Lock down the 3B-only vid_out_norm flag here.

The 3B SeedVR2 branch in comfy/model_detection.py also sets vid_out_norm = True, but this test never asserts it. That leaves the one 3B-specific runtime knob free to regress while the detection test still passes.

Suggested assertion

def test_seedvr2_3b_shared_mm_detection_config(self): sd = _make_seedvr2_3b_shared_mm_sd() unet_config = detect_unet_config(sd, "") assert unet_config is not None assert unet_config["image_model"] == "seedvr2" assert unet_config["vid_dim"] == 2560 assert unet_config["heads"] == 20 assert unet_config["num_layers"] == 32 assert unet_config["mlp_type"] == "swiglu" + assert unet_config["vid_out_norm"] is True

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests-unit/comfy_test/model_detection_test.py` around lines 174 - 184, Add an assertion in test_seedvr2_3b_shared_mm_detection_config to lock the 3B-only vid_out_norm flag: after calling detect_unet_config(sd, "") and validating unet_config fields, assert that unet_config["vid_out_norm"] is True so the 3B-specific runtime knob (vid_out_norm) is covered; locate the test function test_seedvr2_3b_shared_mm_detection_config and the detect_unet_config result to add this check.

10902940129041 · 2026-06-12T16:39:10Z

Unable to load SeedVR2 3B in 16-bit, fp8 only (log)

[ERROR] !!! Exception during processing !!! expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
[ERROR] Traceback (most recent call last):
File "/ComfyUI/execution.py", line 536, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/execution.py", line 336, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/execution.py", line 310, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/ComfyUI/execution.py", line 298, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/ComfyUI/nodes.py", line 1586, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/nodes.py", line 1550, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/sample.py", line 74, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1444, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1334, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1316, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/patcher_extension.py", line 113, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1254, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1229, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/patcher_extension.py", line 113, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 999, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/k_diffusion/sampling.py", line 205, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 639, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1202, in call
return self.outer_predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1209, in outer_predict_noise
).execute(x, timestep, model_options, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/patcher_extension.py", line 113, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 1212, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 619, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 210, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 218, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/patcher_extension.py", line 113, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/samplers.py", line 334, in calc_cond_batch
output = model.apply_model(input_x, timestep, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/model_base.py", line 189, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/patcher_extension.py", line 113, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/model_base.py", line 233, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/ldm/seedvr/model.py", line 1462, in forward
vid, txt, vid_shape, txt_shape = block(
^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/ldm/seedvr/model.py", line 975, in forward
vid_attn, txt_attn = self.attn(vid_attn, txt_attn, vid_shape, txt_shape, cache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/ldm/seedvr/model.py", line 735, in forward
vid_qkv, txt_qkv = self.proj_qkv(vid, txt)
^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/ldm/seedvr/model.py", line 620, in forward
vid = vid_module(vid, *get_args("vid", args), **get_kwargs("vid", kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ComfyUI/comfy/ops.py", line 503, in forward
return super().forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib/python3.12/dist-packages/torch/nn/modules/linear.py", line 134, in forward
return F.linear(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

tl:dr: RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

pollockjj · 2026-06-17T17:57:48Z

@10902940129041 thanks for the report.

Unable to load SeedVR2 3B in 16-bit, fp8 only (log)

tl:dr: RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

I tried to reproduce this on current PR head using the SeedVR2 3B fp16 checkpoint with the PR image workflow, but I could not reproduce the dtype error. I also tested the progressive sampler path, and that completed successfully as well.

Your traceback shows native KSampler with Euler reaching a direct fp16 Linear, where the input is still float32. In my local run, that tensor is cast before SeedVR2 forward, so there is likely a difference in the workflow or runtime path that I am not covering yet.

Could you share:

the ComfyUI commit SHA shown by your install;
the workflow JSON that reproduces the crash;
the startup log header with the Python/Torch/CUDA/GPU lines.

That should be enough for me to compare your run against the reproduction here and narrow down the dtype mismatch.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@comfy/model_base.py`:
- Around line 1760-1769: The code does not guard against empty reference_latents
lists before attempting to concatenate. When reference_latents is an empty list
(which is not None), the ordered list becomes empty, the loop does not execute,
and torch.cat(stacked, dim=2) will throw an error. Add a check to verify that
reference_latents has at least one element after the initial None check, and
only proceed with processing the latents and building the stacked concatenation
if the list is non-empty. This guard should be placed immediately after the
reference_latents is not None check and before the ordered list is constructed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4893c8f1-80c3-4f76-ada4-c53f2a4b9d94

📥 Commits

Reviewing files that changed from the base of the PR and between cfb9c31 and ad04a61.

📒 Files selected for processing (5)

comfy/model_base.py
comfy/model_detection.py
comfy/sd.py
comfy/supported_models.py
nodes.py

🚧 Files skipped from review as they are similar to previous changes (4)

nodes.py
comfy/model_detection.py
comfy/supported_models.py
comfy/sd.py

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@comfy/model_base.py`:
- Around line 1760-1769: The code does not guard against empty reference_latents
lists before attempting to concatenate. When reference_latents is an empty list
(which is not None), the ordered list becomes empty, the loop does not execute,
and torch.cat(stacked, dim=2) will throw an error. Add a check to verify that
reference_latents has at least one element after the initial None check, and
only proceed with processing the latents and building the stacked concatenation
if the list is non-empty. This guard should be placed immediately after the
reference_latents is not None check and before the ordered list is constructed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4893c8f1-80c3-4f76-ada4-c53f2a4b9d94

📥 Commits

Reviewing files that changed from the base of the PR and between cfb9c31 and ad04a61.

📒 Files selected for processing (5)

comfy/model_base.py
comfy/model_detection.py
comfy/sd.py
comfy/supported_models.py
nodes.py

🚧 Files skipped from review as they are similar to previous changes (4)

nodes.py
comfy/model_detection.py
comfy/supported_models.py
comfy/sd.py

🛑 Comments failed to post (1)

comfy/model_base.py (1)

1760-1769: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard empty reference_latents before concatenation.

If reference_latents is an empty list, stacked stays empty and torch.cat(stacked, dim=2) throws at runtime. Please add a non-empty check before building reference_latent.

Suggested fix

         reference_latents = kwargs.get("reference_latents", None)
-        if reference_latents is not None:
+        if reference_latents is not None and len(reference_latents) > 0:
             # SCAIL-2 multi-reference: reference_latents[0] is the primary ref, [1:] are additional
             # references. Stack as [additional..., primary] so the primary stays adjacent to the video.
             ordered = list(reference_latents[1:]) + list(reference_latents[:1])
             stacked = []
             for lat in ordered:
                 lat = self.process_latent_in(lat)
                 stacked.append(torch.cat([lat, torch.ones_like(lat[:, :4])], dim=1))
             out['reference_latent'] = comfy.conds.CONDRegular(torch.cat(stacked, dim=2))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        reference_latents = kwargs.get("reference_latents", None)
        if reference_latents is not None and len(reference_latents) > 0:
            # SCAIL-2 multi-reference: reference_latents[0] is the primary ref, [1:] are additional
            # references. Stack as [additional..., primary] so the primary stays adjacent to the video.
            ordered = list(reference_latents[1:]) + list(reference_latents[:1])
            stacked = []
            for lat in ordered:
                lat = self.process_latent_in(lat)
                stacked.append(torch.cat([lat, torch.ones_like(lat[:, :4])], dim=1))
            out['reference_latent'] = comfy.conds.CONDRegular(torch.cat(stacked, dim=2))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@comfy/model_base.py` around lines 1760 - 1769, The code does not guard
against empty reference_latents lists before attempting to concatenate. When
reference_latents is an empty list (which is not None), the ordered list becomes
empty, the loop does not execute, and torch.cat(stacked, dim=2) will throw an
error. Add a check to verify that reference_latents has at least one element
after the initial None check, and only proceed with processing the latents and
building the stacked concatenation if the list is non-empty. This guard should
be placed immediately after the reference_latents is not None check and before
the ordered list is constructed.

Source: Coding guidelines

pollockjj requested a review from comfyanonymous as a code owner June 12, 2026 00:56

Copilot AI review requested due to automatic review settings June 12, 2026 00:56

pollockjj requested review from Kosinkadink, alexisrolland, guill, kijai and rattus128 as code owners June 12, 2026 00:56

Copilot started reviewing on behalf of pollockjj June 12, 2026 00:57 View session

Copilot AI reviewed Jun 12, 2026

View reviewed changes

Comment thread comfy/ldm/seedvr/vae.py

Comment on lines +136 to +139

if encode:

out = model.encode(t_chunk)[0]

else:

out = model.decode_(t_chunk)

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

alexisrolland assigned comfyanonymous Jun 12, 2026

pollockjj added 7 commits June 12, 2026 11:19

Add SeedVR2 model support

cd18c44

Add SeedVR2 VAE support

a7ea0c2

Add SeedVR2 workflow nodes

d54ce3d

Add SeedVR2 core coverage

0fdbc5d

Add SeedVR2 VAE coverage

bed0cd2

Add SeedVR2 node coverage

7050bdc

Add SeedVR2 sampler coverage

cfb9c31

pollockjj force-pushed the seedvr2-native-support-v5 branch from 019f24f to cfb9c31 Compare June 12, 2026 16:20

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

Merge branch 'master' into seedvr2-native-support-v5

ad04a61

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Conversation

pollockjj commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes since #14110

Examples

7B fp16 model, 4x upscale

7B sharp fp8 model, 4x upscale

3B fp8 model, 4x upscale, 3-chunk temporal blend (clip 1)

7B sharp fp16 model, 4x upscale, auto chunking (clip 2)

Degradation provenance

Workflows

Models

Prior art

Tracking

Validation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

10902940129041 commented Jun 12, 2026

Uh oh!

pollockjj commented Jun 17, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pollockjj commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading