ComfyUI Update: Stable Video Diffusion on 8GB VRAM with 25 frames and more.
Here’s what’s new recently in ComfyUI.
Here’s the link to the previous update in case you missed it.
Stable Video Diffusion
ComfyUI now supports the new Stable Video Diffusion image-to-video model. With ComfyUI you can generate 1024x576 videos that are 25 frames long on a GTX 1080 with 8GB of VRAM. I can confirm that it also works on my AMD 6800XT with ROCm on Linux.
For workflows and explanations of how to use these models, see the video examples page.
If you want the workflow I used to generate the video above, you can save it and drag it onto ComfyUI.
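If you are curious what the node graph looks like under the hood, here is a rough sketch of an img2vid graph in the API (JSON) format, written as a Python dict. The node and input names reflect my reading of the current nodes; the checkpoint and image filenames are placeholders and the values are illustrative, not tuned. The canonical, ready-to-use workflow is the one on the video examples page.

```python
# A minimal sketch (not the canonical workflow) of an SVD img2vid graph
# in ComfyUI's API format. Filenames are placeholders.
svd_graph = {
    "1": {"class_type": "ImageOnlyCheckpointLoader",
          "inputs": {"ckpt_name": "svd_xt.safetensors"}},               # placeholder
    "2": {"class_type": "LoadImage", "inputs": {"image": "input.png"}}, # placeholder
    "3": {"class_type": "SVD_img2vid_Conditioning",
          "inputs": {"clip_vision": ["1", 1], "init_image": ["2", 0], "vae": ["1", 2],
                     "width": 1024, "height": 576, "video_frames": 25,
                     "motion_bucket_id": 127, "fps": 6, "augmentation_level": 0.0}},
    "4": {"class_type": "VideoLinearCFGGuidance",
          "inputs": {"model": ["1", 0], "min_cfg": 1.0}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["4", 0], "seed": 0, "steps": 20, "cfg": 2.5,
                     "sampler_name": "euler", "scheduler": "karras",
                     "positive": ["3", 0], "negative": ["3", 1],
                     "latent_image": ["3", 2], "denoise": 1.0}},
    "6": {"class_type": "VAEDecode", "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveAnimatedWEBP",
          "inputs": {"images": ["6", 0], "filename_prefix": "svd", "fps": 6,
                     "lossless": False, "quality": 90, "method": "default"}},
}
```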
LCM
LCM (Latent Consistency Model) models can be sampled in very few steps. LoRAs have recently been released that convert regular SDXL and SD1.x models to LCM.
For how to use them in ComfyUI, see the LCM examples page.
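The settings that matter most for LCM are a very low step count, a CFG near 1.0, and the new “lcm” sampler. Here is a hedged API-format fragment showing those settings; the filenames are placeholders, and the text encode and empty latent nodes are omitted.

```python
# A minimal, assumption-heavy sketch of the LCM-specific settings in
# ComfyUI's API format. The conditioning and empty latent nodes
# ("13", "14", "15") are omitted for brevity.
lcm_fragment = {
    "10": {"class_type": "CheckpointLoaderSimple",
           "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},  # placeholder
    "11": {"class_type": "LoraLoader",
           "inputs": {"model": ["10", 0], "clip": ["10", 1],
                      "lora_name": "lcm_lora_sdxl.safetensors",     # placeholder
                      "strength_model": 1.0, "strength_clip": 1.0}},
    "12": {"class_type": "KSampler",
           "inputs": {"model": ["11", 0], "seed": 0,
                      "steps": 4,             # LCM needs very few steps
                      "cfg": 1.0,             # keep CFG near 1.0 with LCM
                      "sampler_name": "lcm",  # the new LCM sampler
                      "scheduler": "sgm_uniform",
                      "positive": ["13", 0], "negative": ["14", 0],
                      "latent_image": ["15", 0], "denoise": 1.0}},
}
```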
Kohya Deep Shrink
The _for_testing->PatchModelAddDownscale node adds a downscale operation to the UNet that can be scheduled so it is only applied during the first timesteps of sampling. This lets you generate consistent images at higher resolutions without having to do a second pass.
To use it, add the node to your workflow and double the resolution you are generating at.
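As an illustration, here is what patching a model to downscale the UNet by 2x during roughly the first third of sampling might look like in the API format. The input names match my understanding of the node at the time of writing, and the values are illustrative rather than tuned recommendations.

```python
# Hedged sketch: PatchModelAddDownscale in API format. Values are illustrative.
deep_shrink = {
    "20": {"class_type": "PatchModelAddDownscale",
           "inputs": {"model": ["10", 0],       # output of your checkpoint loader
                      "block_number": 3,        # which UNet input block to patch
                      "downscale_factor": 2.0,  # matches doubling the resolution
                      "start_percent": 0.0,     # apply from the first timestep...
                      "end_percent": 0.35,      # ...and stop about a third of the way in
                      "downscale_after_skip": True,
                      "downscale_method": "bicubic",
                      "upscale_method": "bicubic"}},
}
```

The patched MODEL output then replaces the original model input on your sampler, with the empty latent created at double the usual width and height.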
Support for ZSNR V Prediction Models
The new ModelSamplingDiscrete node lets you flag a model as v_prediction with zsnr so that it is sampled properly.
The new RescaleCFG node implements the CFG rescale algorithm from the ZSNR paper and should also be used with these models.
To use a ZSNR v-prediction model, load it with the regular checkpoint loader node, then chain a ModelSamplingDiscrete node with v_prediction and zsnr selected. You can then add the RescaleCFG node after it.
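In API-format terms the chain looks roughly like this; the checkpoint filename is a placeholder and the multiplier is an illustrative value:

```python
# Hedged sketch: ZSNR v-prediction model setup in ComfyUI's API format.
zsnr_chain = {
    "30": {"class_type": "CheckpointLoaderSimple",
           "inputs": {"ckpt_name": "zsnr_vpred_model.safetensors"}},  # placeholder
    "31": {"class_type": "ModelSamplingDiscrete",
           "inputs": {"model": ["30", 0],
                      "sampling": "v_prediction",  # treat the model as v-prediction
                      "zsnr": True}},              # zero terminal SNR schedule
    "32": {"class_type": "RescaleCFG",
           "inputs": {"model": ["31", 0],
                      "multiplier": 0.7}},         # illustrative value
}
```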
Other
The Load VAE node now supports TAESD. TAESD is a small, fast VAE implementation that is used for the high quality previews. If you have taesd_encoder and taesd_decoder (or taesdxl_encoder and taesdxl_decoder) in models/vae_approx, the options “taesd” and “taesdxl” will show up in the Load VAE node. I implemented it because I thought it might pair well with LCM for those who want maximum speed.
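Once the files are in place, using TAESD is just a matter of picking the name in the Load VAE node. A minimal API-format fragment, assuming the current input names:

```python
# Hedged sketch: selecting TAESD on the Load VAE node in API format.
taesd_fragment = {
    "40": {"class_type": "VAELoader",
           "inputs": {"vae_name": "taesd"}},  # or "taesdxl" for SDXL latents
    "41": {"class_type": "VAEDecode",
           "inputs": {"samples": ["12", 0],   # latents from your sampler
                      "vae": ["40", 0]}},
}
```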
SaveAnimatedWEBP: A node to save batches of images as an animated WEBP. These files embed the workflow metadata, so they can be loaded back into ComfyUI to recover the workflow.
DOM element clipping: Nodes can now render on top of multi-line text boxes. If this makes your UI slower, you can disable it in the UI settings.
The UI can now load workflows that are in the API format.
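API-format files are also what you can queue programmatically. Here is a small sketch that posts one to the /prompt endpoint of a local server, assuming the default address and an API-format JSON file you saved from the UI:

```python
import json
import urllib.request

# Queue an API-format workflow against a local ComfyUI server.
# Assumes the default address; adjust if your server runs elsewhere.
with open("workflow_api.json", "r") as f:  # a workflow saved in API format
    prompt = json.load(f)

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": prompt}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # response includes the queued prompt id
```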
New color schemes in the UI: settings -> color palette
RepeatImageBatch: for repeating a batch of images.
ImageCrop: node to crop images.
latent->advanced->LatentInterpolate: uses nlerp to interpolate between two different latents.
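For reference, a common formulation of nlerp interpolates the normalized directions of the two latents and then restores an interpolated magnitude. A standalone sketch of the technique (not necessarily line-for-line what the node does):

```python
import torch

def nlerp(a: torch.Tensor, b: torch.Tensor, t: float) -> torch.Tensor:
    """Normalized linear interpolation between two latents (t in [0, 1])."""
    norm_a, norm_b = a.norm(), b.norm()
    # lerp the unit directions, then renormalize the blended direction
    direction = (1.0 - t) * (a / norm_a) + t * (b / norm_b)
    direction = direction / direction.norm()
    # scale by a linearly interpolated magnitude
    return direction * ((1.0 - t) * norm_a + t * norm_b)
```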
heunpp2 sampler: A sampler that comes from this repo. The paper mentions how using different samplers in a single generation can improve quality. Using different samplers for different steps has been natively supported in ComfyUI for a long time by chaining advanced sampler nodes together. You can see a complex example of chaining different advanced samplers together on the Noise Latent Composition example page.
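The chaining works by having the first sampler stop early and return its leftover noise, and the next sampler pick up at that step without adding fresh noise. A hedged API-format sketch with two KSamplerAdvanced nodes splitting a 20 step generation (the sampler choices are illustrative; the conditioning and latent nodes are omitted):

```python
# Hedged sketch: two KSamplerAdvanced nodes splitting one 20-step generation.
chained_samplers = {
    "50": {"class_type": "KSamplerAdvanced",
           "inputs": {"model": ["10", 0], "add_noise": "enable", "noise_seed": 0,
                      "steps": 20, "cfg": 7.0,
                      "sampler_name": "heun",    # illustrative choice
                      "scheduler": "normal",
                      "positive": ["13", 0], "negative": ["14", 0],
                      "latent_image": ["15", 0],
                      "start_at_step": 0, "end_at_step": 10,
                      "return_with_leftover_noise": "enable"}},  # hand off noisy latent
    "51": {"class_type": "KSamplerAdvanced",
           "inputs": {"model": ["10", 0],
                      "add_noise": "disable",    # the latent is already noisy
                      "noise_seed": 0, "steps": 20, "cfg": 7.0,
                      "sampler_name": "dpmpp_2m",  # illustrative choice
                      "scheduler": "normal",
                      "positive": ["13", 0], "negative": ["14", 0],
                      "latent_image": ["50", 0],  # continue from the first sampler
                      "start_at_step": 10, "end_at_step": 20,
                      "return_with_leftover_noise": "disable"}},
}
```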
COMBO primitive nodes now have a filter so you can control which values they will increment through or randomize to.
The new FlipSigmas node can be used to flip the sigmas passed to the custom sampler node. Running a sampler with these reversed sigmas lets you “unsample” an image, taking it from a finished image back toward noise.
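A sketch of an unsampling chain with the custom sampling nodes, with input names as I understand them in a recent version: a scheduler produces descending sigmas, FlipSigmas reverses them, and SamplerCustom then runs from the VAE-encoded image toward noise.

```python
# Hedged sketch: "unsampling" an image with FlipSigmas in API format.
# The VAE encode node for the source image ("70") is omitted.
unsample_chain = {
    "60": {"class_type": "BasicScheduler",
           "inputs": {"model": ["10", 0], "scheduler": "normal",
                      "steps": 20, "denoise": 1.0}},
    "61": {"class_type": "FlipSigmas",
           "inputs": {"sigmas": ["60", 0]}},  # reverse: low noise -> high noise
    "62": {"class_type": "KSamplerSelect",
           "inputs": {"sampler_name": "euler"}},
    "63": {"class_type": "SamplerCustom",
           "inputs": {"model": ["10", 0], "add_noise": False, "noise_seed": 0,
                      "cfg": 1.0,  # a low CFG is typical when inverting
                      "positive": ["13", 0], "negative": ["14", 0],
                      "sampler": ["62", 0],
                      "sigmas": ["61", 0],
                      "latent_image": ["70", 0]}},  # VAE-encoded source image
}
```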
Cool ComfyUI related projects
A very cool and novel UI that uses ComfyUI for its backend: https://github.com/rvion/CushyStudio
The Krita addon that everyone is talking about right now: https://github.com/Acly/krita-ai-diffusion
If Stable Video Diffusion feels a bit lacking, here’s the best implementation of AnimateDiff: https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved