SeedVR2 Upscaler — Image & Video

Support for single image, batch images, and MP4 video upscaling.

This application uses the ComfyUI-SeedVR2_VideoUpscaler backend for inference.

⚠️ No GPU Detected (CPU Mode)

Neither CUDA (NVIDIA) nor MPS (macOS) was detected. Processing will be extremely slow.

  • Recommendation: Clone this repository to a local machine with a GPU for full functionality.
  • If running online (CPU): Please process Images Only.
  • Model Selection: Use a 3B GGUF model, or the 7B model with Q4_K_M quantization. Heavier models will likely fail.

Configure a custom GitHub repository to test different versions or forks.

Preset mode

Automatically adjusts compilation, tiling, and offload settings based on your hardware capabilities.

Output Format (Default: webp)

Format for saved images. For video input, the CLI produces MP4 (or PNG sequence), and this app converts the final result if needed.

Video Backend

Video encoder backend. 'ffmpeg' requires ffmpeg on the system PATH but supports advanced features such as 10-bit encoding.

Use x265 10-bit encoding (reduces banding). Requires ffmpeg backend.
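
For reference, a 10-bit HEVC encode of a rendered frame sequence can be expressed as an ffmpeg invocation like the one built below (the CRF value and frame pattern are illustrative, not the app's exact settings):

```python
def x265_10bit_cmd(frame_pattern: str, fps: float, out_path: str) -> list[str]:
    # Build an ffmpeg command for 10-bit x265 output; yuv420p10le is
    # the 10-bit pixel format that reduces gradient banding.
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),
        "-i", frame_pattern,          # e.g. "frames/%05d.png"
        "-c:v", "libx265",
        "-pix_fmt", "yuv420p10le",
        "-crf", "18",                 # illustrative quality setting
        out_path,
    ]
```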

When checked, the DiT model dropdown shows GGUF models from cmeka/SeedVR2-GGUF, an efficient choice for low-VRAM setups.

DiT model (Format: RepoID/Filename)

DiT transformer model. 7B models have higher quality but require more memory than 3B models.

Color correction

Method for matching the output's colors to the input: 'lab' (perceptual, recommended), 'wavelet' (frequency-based), 'adain' (statistical), etc.
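
The 'adain' (statistical) option, for example, shifts and scales each channel so its mean and standard deviation match the reference. A one-channel sketch of that idea (not the backend's exact implementation):

```python
from statistics import mean, pstdev

def adain_match(src: list[float], ref: list[float]) -> list[float]:
    # Normalize src to zero mean / unit std, then re-apply the
    # reference channel's statistics ('adain'-style matching).
    m_s, s_s = mean(src), pstdev(src) or 1.0
    m_r, s_r = mean(ref), pstdev(ref) or 1.0
    return [(x - m_s) / s_s * s_r + m_r for x in src]
```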

Reduces image size before upscaling. Helps remove JPEG artifacts or noise, as noted in community tips.

Pad final batch to match batch_size. Prevents temporal artifacts caused by small final batches. Adds extra compute.
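
In effect, padding repeats the last frame until the frame count divides evenly by the batch size. A sketch of that step (not the backend's exact implementation):

```python
def pad_final_batch(frames: list, batch_size: int) -> list:
    # If the last batch would come up short, repeat the final frame
    # to fill it; short trailing batches can cause temporal artifacts.
    remainder = len(frames) % batch_size
    if remainder == 0:
        return frames
    return frames + [frames[-1]] * (batch_size - remainder)
```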

Replaces the standard blockswap logic with the improved version from Nunchaku. Useful for faster offloading.

Offload DiT I/O layers for extra VRAM savings. Requires Offload Device.

DiT Offload device

Device to move DiT to when idle. 'cpu' frees VRAM between phases.

VAE Offload device

Device to move VAE to when idle. 'cpu' frees VRAM between phases.

Tensor Offload device

Where to store intermediate tensors. 'cpu' is recommended to save VRAM.
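
The offload pattern behind these three settings is the same: park a module on the offload device while idle and move it to the compute device only for its phase. A device-agnostic sketch using a dummy module (class and function names are illustrative):

```python
class DummyModule:
    # Stand-in for a DiT or VAE module; records device moves so the
    # offload sequence can be inspected.
    def __init__(self):
        self.device = "cpu"
        self.moves = []

    def to(self, device):
        self.device = device
        self.moves.append(device)
        return self

def run_phase(module, compute_device="cuda"):
    # Move the module onto the compute device, run its phase, then
    # offload it back to CPU so the next phase has the VRAM.
    module.to(compute_device)
    result = f"ran on {module.device}"
    module.to("cpu")
    return result
```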

Keep DiT model in memory between generations. Useful for batch/directory mode or streaming.

Keep VAE model in memory between generations. Useful for batch/directory mode or streaming.

Process VAE encoding in tiles to reduce VRAM usage (good for large inputs).

Process VAE decoding in tiles to reduce VRAM usage.
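
Tiling splits the input into overlapping spans so each VAE pass fits in VRAM, then blends the seams. A 1-D sketch of the span computation (the overlap handling is illustrative, not the backend's exact scheme):

```python
def tile_spans(size: int, tile: int, overlap: int) -> list[tuple[int, int]]:
    # Return (start, end) spans of width `tile` covering `size`,
    # each sharing `overlap` pixels with its neighbour; the final
    # span is shifted left so it never runs past the edge.
    stride = tile - overlap
    spans, start = [], 0
    while start < size:
        end = min(start + tile, size)
        spans.append((max(0, end - tile), end))
        if end == size:
            break
        start += stride
    return spans
```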

Tile Debug Visualization

Visualizes the tiling process for debugging purposes.

20-40% speedup. Requires PyTorch 2.0+ and Triton. May increase memory usage.

15-25% speedup for VAE encoding/decoding.

Backend

'inductor' (full optimization) or 'cudagraphs' (lightweight).

Mode

Optimization level: 'default' (fast compile), 'max-autotune' (best speed, slow compile), etc.

Attention Mode

Attention backend. 'sdpa' (default), 'flash_attn' (faster), or 'sageattn' (Blackwell).

Compile entire model as single graph. Faster but less flexible.

Handle varying input shapes without recompilation.

Show verbose output in CLI logs.
