Self Forcing trains autoregressive video diffusion models by simulating the inference process during training, performing autoregressive rollout with KV caching. It resolves the train-test distribution mismatch and enables real-time, streaming video generation on a single RTX 4090 while matching the quality of state-of-the-art diffusion models.
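The chunk-wise rollout with KV caching can be sketched abstractly as follows; `rollout` and `denoise_chunk` are hypothetical names for illustration, not the repo's API:

```python
# Conceptual sketch of chunk-wise autoregressive rollout with a KV cache.
# `denoise_chunk` is a hypothetical stand-in for the diffusion denoiser;
# the real model attends to cached keys/values from earlier chunks rather
# than re-encoding the whole video at every step.
def rollout(num_chunks, denoise_chunk):
    kv_cache = []  # context from previously generated chunks
    video = []
    for _ in range(num_chunks):
        chunk = denoise_chunk(kv_cache)  # condition only on the cache
        kv_cache.append(chunk)           # extend the cache, never recompute
        video.extend(chunk)
    return video
```

Running this simulated rollout during training, not just at inference, is what closes the train-test gap.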
Requirements
We tested this repo on the following setup:
An NVIDIA GPU with at least 24 GB of memory (RTX 4090, A100, and H100 have been tested).
Linux operating system.
64 GB RAM.
Other hardware setups may also work but have not been tested.
Installation
Create a conda environment and install dependencies:
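A typical setup might look like the following; the environment name, Python version, and requirements file are assumptions, so consult the repository for the exact commands:

```shell
# Illustrative setup; the environment name, Python version, and
# requirements file are assumptions -- check the repo for specifics.
conda create -n self_forcing python=3.10 -y
conda activate self_forcing
pip install -r requirements.txt
```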
Our model works better with long, detailed prompts, since it was trained on such prompts. We will integrate prompt extension into the codebase (similar to Wan2.1) in the future. For now, we recommend using a third-party LLM (such as GPT-4o) to extend your prompt before providing it to the model.
You may want to adjust FPS so it plays smoothly on your device.
Speed can be improved by enabling torch.compile, using TAEHV-VAE, or using FP8 Linear layers, although the latter two options may sacrifice quality. We recommend enabling torch.compile whenever possible, and adding TAEHV-VAE if further speedup is needed.
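Enabling torch.compile amounts to wrapping the model with PyTorch's standard API; a minimal sketch (the eager backend is used here only so the snippet runs anywhere, whereas the default inductor backend on GPU is what yields the speedup):

```python
import torch

# Minimal sketch of wrapping a module with torch.compile. The "eager"
# backend is a pass-through used here for portability; drop the backend
# argument on a CUDA machine to get the real compiled kernels.
model = torch.nn.Linear(8, 8)
compiled = torch.compile(model, backend="eager")

x = torch.randn(2, 8)
out = compiled(x)  # same result as model(x); compile cost amortizes over repeated calls
```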
CLI Inference
Example inference script using the chunk-wise autoregressive checkpoint trained with DMD:
Note: Our training algorithm (except for the GAN version) is data-free (no video data is needed). For now, we directly provide the ODE initialization checkpoint; instructions on performing ODE initialization (identical to the process described in the CausVid repo) will be added in the future.
Our training run uses 600 iterations and completes in under 2 hours on 64 H100 GPUs. With gradient accumulation, it should be possible to reproduce the results in under 16 hours on 8 H100 GPUs.
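Gradient accumulation in PyTorch follows a standard pattern; this is an illustrative sketch, not the repo's trainer, with a toy model and made-up batch sizes:

```python
import torch

# Generic gradient-accumulation loop (illustrative sketch, not the repo's
# trainer): accumulating gradients over 8 micro-batches before each
# optimizer step emulates an 8x larger per-step batch, which is how
# 8 GPUs can stand in for 64 at the cost of more wall-clock time.
accum_steps = 8
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

micro_batches = [(torch.randn(2, 4), torch.randn(2, 1)) for _ in range(16)]
opt.zero_grad()
for step, (x, y) in enumerate(micro_batches, start=1):
    # Rescale so the summed gradient matches one large-batch gradient.
    loss = torch.nn.functional.mse_loss(model(x), y) / accum_steps
    loss.backward()          # gradients accumulate in param.grad
    if step % accum_steps == 0:
        opt.step()           # one update per accum_steps micro-batches
        opt.zero_grad()
```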
If you find this codebase useful for your research, please kindly cite our paper:
@article{huang2025selfforcing,
title={Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion},
author={Huang, Xun and Li, Zhengqi and He, Guande and Zhou, Mingyuan and Shechtman, Eli},
journal={arXiv preprint arXiv:2506.08009},
year={2025}
}