Architecture: Timer-S1 is a decoder-only Mixture-of-Experts (MoE) Transformer. For time series forecasting (a sequential problem where each step depends on previous ones), we propose TimeSTP, enabling multi-step prediction with serial computations (see the sketch after this list).
Performance: Timer-S1 achieves state-of-the-art results on GIFT-Eval. The model particularly excels at medium-term and long-term forecasting tasks.
Post Training: Timer-S1 undergoes post-training, including continued pre-training (CPT) and long-context extension (LCE), which improves short-term and long-context performance.
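The serial nature of TimeSTP can be illustrated with a generic autoregressive loop: each predicted patch is conditioned on everything generated so far, so the steps cannot be parallelized across the forecast horizon. The sketch below is conceptual only; step_fn is a hypothetical stand-in for a single decoder step, not the actual TimeSTP implementation:
import torch

def serial_forecast(step_fn, context: torch.Tensor, num_steps: int) -> torch.Tensor:
    # step_fn: hypothetical stand-in mapping the full history so far
    # to the next predicted patch
    history = context
    for _ in range(num_steps):
        next_patch = step_fn(history)                    # depends on every earlier step
        history = torch.cat([history, next_patch], dim=-1)
    return history[..., context.shape[-1]:]              # keep only the predicted part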
Quickstart
pip install torch accelerate transformers~=4.57.1
import torch
from transformers import AutoModelForCausalLM
# load the pretrained model
# supports different lookback/forecast lengths
model = AutoModelForCausalLM.from_pretrained(
'bytedance-research/Timer-S1',
trust_remote_code=True,
device_map="auto"
)
# use a local model instead:
# model = AutoModelForCausalLM.from_pretrained(
#     'path_to_timer_s1',
#     trust_remote_code=True,
#     device_map="auto"
# )
# prepare input
batch_size, lookback_length = 1, 11520
seqs = torch.randn(batch_size, lookback_length).to(model.device)
# Note that Timer-S1 generates predictions at fixed quantile levels
forecast_length = 720
output = model.generate(seqs, max_new_tokens=forecast_length, revin=True)
# produce quantile forecasts at levels [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(output.shape)  # batch_size x quantile_num(9) x forecast_length
# produce the median forecast of the first sample
print(output[0][4])
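Because the quantile axis is ordered from 0.1 to 0.9, point forecasts and prediction intervals can be read off by indexing it. A minimal sketch on the output above (the interval interpretation is an assumption based on the stated quantile levels):
median = output[:, 4, :]                          # 0.5 quantile, a natural point forecast
lower, upper = output[:, 0, :], output[:, 8, :]   # 0.1 and 0.9 quantiles -> an 80% interval
print((upper - lower).mean())                     # average interval width as a rough uncertainty gauge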
Out of GPU memory?
Try the following options:
# Option 1: reduce batch size or context length
batch_size, lookback_length = 1, 2880
# Option 2: disable KV cache at runtime (or edit it in config.json for a permanent change)
model.config.use_cache = False
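For large workloads, Option 1 generalizes to processing series in small chunks so that peak memory stays bounded. A minimal sketch, where all_seqs is an assumed (num_series x lookback_length) tensor that is not defined in the quickstart above:
chunk_size = 4
forecasts = []
for start in range(0, all_seqs.shape[0], chunk_size):
    chunk = all_seqs[start:start + chunk_size].to(model.device)   # move one chunk at a time
    forecasts.append(model.generate(chunk, max_new_tokens=forecast_length, revin=True).cpu())
forecasts = torch.cat(forecasts, dim=0)                           # num_series x 9 x forecast_length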
Specification
Architecture: decoder-only Transformer
Context Length: up to 11,520
ReNorm: default=True
KV Cache: default=True
Patch Length: 16
Total Parameters: 8.3B
Activated Parameters: 0.75B
Number of Layers: 40
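Note that with a patch length of 16, the maximum context of 11,520 time points corresponds to 11,520 / 16 = 720 patch tokens, and the 0.75B activated out of 8.3B total parameters (roughly 9%) reflects the sparse MoE routing described above.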
License Agreement
This model is licensed under the Apache-2.0 License.
Citation
If you find Timer-S1 helpful for your research, please cite our paper:
@article{liu2026timer,
title={Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling},
author={Liu, Yong and Su, Xingjian and Wang, Shiyu and Zhang, Haoran and Liu, Haixuan and Wang, Yuxuan and Ye, Zhou and Xiang, Yang and Wang, Jianmin and Long, Mingsheng},
journal={arXiv preprint arXiv:2603.04791},
year={2026}
}