RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video
RayDer is a self-supervised novel view synthesis model that unifies camera estimation and view synthesis in a single transformer. Unlike prior self-supervised NVS approaches, which are bottlenecked by scarce static-scene data, RayDer is trained on general, dynamic real-world video — and its performance scales predictably with data, model size, and compute, following power-law relationships (R² > 0.99) analogous to those observed in LLMs.
Self-supervised novel view synthesis methods are fundamentally data-limited: they require static-scene training data, which is scarce. RayDer removes this bottleneck by enabling stable training on general, dynamic real-world video. By consolidating three separate networks into one unified transformer, introducing dynamic state prediction with dropout, and improving pose learning through autoregressive training, RayDer's performance scales predictably with data, model size, and compute.
Existing approaches rely on scarce data sources: supervised NVS requires posed multi-view images, while prior self-supervised methods require unposed videos of static scenes. RayDer instead trains from generic unposed videos that may contain dynamic objects, enabling learning from the dominant form of visual data and unlocking improved scaling with dataset size.
A single transformer unifies camera estimation and novel view synthesis, replacing the three separate networks used by prior self-supervised NVS pipelines.
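To make the power-law scaling claim above concrete, here is a minimal sketch of how such a fit and its R² are commonly computed: a least-squares line in log-log space. The helper name `fit_power_law`, the choice to compute R² in log space, and the data points are all assumptions for illustration. The points are synthetic and are not RayDer measurements.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b, r_squared)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope of the log-log line = power-law exponent
    log_a = mean_y - b * mean_x        # intercept of the log-log line = log of the prefactor
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic, made-up points generated from y = 2 * x**-0.5 (NOT RayDer results).
compute = [1e18, 1e19, 1e20, 1e21]
loss = [2.0 * c ** -0.5 for c in compute]
a, b, r2 = fit_power_law(compute, loss)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.4f}")
```

An R² close to 1 in this setting means the measured points lie almost exactly on a straight line in log-log coordinates, which is what allows performance at larger scales to be extrapolated from smaller runs.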
Usage
To integrate RayDer into your own codebase, copy rayder/model.py from the GitHub repository and instantiate the model as:
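import torch
from rayder.model import RayDer_L

# Build the RayDer-L architecture and load the released 576² checkpoint.
model = RayDer_L()
model.load_state_dict(torch.load("rayder_l_576.pt", weights_only=True))
# Inference only: freeze all parameters and switch to eval mode.
model.requires_grad_(False)
model.eval()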
The RayDer class exposes two high-level inference methods:
predict_cameras(x): estimate camera parameters from a set of input views (trained for 8 views, but the models extrapolate quite well).
predict_views(x_in, cam_in, cam_target): synthesize novel views at target camera poses (trained for 1–7 input views, arbitrarily many output views).
Images are channels-last (b, t, h, w, 3) with pixel values in [-1, 1]. Camera extrinsics use the camera-to-world (c2w) convention, and the focal length f is normalized by the shorter image side (f = f_pixels / min(h-1, w-1)).
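The input conventions above can be sketched with a few NumPy helpers. This is an illustrative sketch, not part of the RayDer codebase: the names `to_model_images`, `normalize_focal`, and `w2c_to_c2w` are hypothetical, and treating extrinsics as 4x4 rigid matrices is an assumption, since the model card does not specify the camera data structure.

```python
import numpy as np


def to_model_images(frames_uint8):
    """Convert uint8 RGB frames (t, h, w, 3) in [0, 255] to float32 (1, t, h, w, 3) in [-1, 1]."""
    frames = np.asarray(frames_uint8)
    if frames.ndim != 4 or frames.shape[-1] != 3:
        raise ValueError("expected frames shaped (t, h, w, 3)")
    scaled = frames.astype(np.float32) / 127.5 - 1.0  # maps 0 -> -1 and 255 -> +1
    return scaled[None]  # prepend the batch dimension (b = 1)


def normalize_focal(f_pixels, h, w):
    """Normalize a focal length in pixels by the shorter image side: f_pixels / min(h-1, w-1)."""
    return f_pixels / min(h - 1, w - 1)


def w2c_to_c2w(w2c):
    """Invert a 4x4 rigid world-to-camera matrix to obtain camera-to-world (assumed layout)."""
    w2c = np.asarray(w2c, dtype=np.float64)
    rot, trans = w2c[:3, :3], w2c[:3, 3]
    c2w = np.eye(4)
    c2w[:3, :3] = rot.T          # the inverse of a rotation is its transpose
    c2w[:3, 3] = -rot.T @ trans  # camera centre expressed in world coordinates
    return c2w


frames = np.zeros((8, 256, 256, 3), dtype=np.uint8)  # placeholder clip: 8 black frames
x = to_model_images(frames)                          # shape (1, 8, 256, 256, 3)
f = normalize_focal(f_pixels=255.0, h=256, w=256)    # 255 / min(255, 255)
```

The resulting array can be turned into a tensor with `torch.from_numpy(x)` before being passed to the model. The conversion helper is only needed when poses come from a tool that stores world-to-camera matrices.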
See the GitHub repository for generate_video.py (smooth view-interpolation videos from a set of input images) and app.py (Gradio demo).
Models
We currently release the following model variants:
| Variant | Width | Depth | Params | Resolution | File |
| --- | --- | --- | --- | --- | --- |
| RayDer-L | 1024 | 24 | ~743M | 256² | rayder_l.pt |
| RayDer-L-576² | 1024 | 24 | ~743M | 576² | rayder_l_576.pt |
Additional model variants and licensing available upon request.
License
This model is released under a license for personal and scientific non-commercial research purposes; see LICENSE.md for the full terms. For any commercial use or exploitation, please contact [email protected].
Citation
If you find our model or code useful, please cite our paper:
@misc{prestel2026rayderscalableselfsupervisednovel,
title={RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video},
author={Ulrich Prestel and Stefan Andreas Baumann and Nick Stracke and Björn Ommer},
year={2026},
}