DeepCoder-14B-Preview huggingface.co api & agentica-org DeepCoder-14B-Preview github AI Model

Introduction of DeepCoder-14B-Preview

Model Details of DeepCoder-14B-Preview

DeepCoder-14B-Preview

🚀 Democratizing Reinforcement Learning for LLMs (RLLM) 🌟

DeepCoder Overview

DeepCoder-14B-Preview is a code reasoning LLM fine-tuned from DeepSeek-R1-Distilled-Qwen-14B using distributed reinforcement learning (RL) to scale up to long context lengths. The model achieves 60.6% Pass@1 accuracy on LiveCodeBench v5 (8/1/24-2/1/25), representing a 8% improvement over the base model (53%) and achieving similar performance to OpenAI's o3-mini with just 14B parameters.

Data

Our training dataset consists of approximately 24K unique problem-tests pairs compiled from:

Taco-Verified
PrimeIntellect SYNTHETIC-1
LiveCodeBench v5 (5/1/23-7/31/24)

Training Recipe

Our training recipe relies on an improved version of GRPO (GRPO+) and iterative context lengthening, introduced in DeepScaleR.

GRPO+

We enhance the original GRPO algorithm with insights from DAPO to enable more stable training:

Offline Difficulty Filtering: DAPO employs online dynamic sampling, discarding both entirely correct and entirely incorrect samples on the fly. While this helps maintain a more stable effective batch size, it introduces significant runtime overhead due to rejection sampling. Instead, we perform offline difficulty filtering on a subset of coding problems to ensure the training dataset remains within a suitable difficulty range.
No Entropy Loss: We observed that including an entropy loss term often led to instability, with entropy growing exponentially and ultimately collapsing training. To mitigate this, we eliminate the entropy loss entirely.
No KL Loss: Eliminating KL loss prevents the LLM from staying within trust region of the original SFT model. This removal also obviates the need to compute log probabilities for the reference policy, thereby accelerating training.
Overlong Filtering (from DAPO): To preserve long-context reasoning, we mask the loss for truncated sequences. This technique enables DeepCoder to generalize to 64K-context inference despite being trained with a 32K context.
Clip High (from DAPO): By increasing the upper bound in GRPO/PPO’s surrogate loss, we encourage more exploration and more stable entropy.

Iterative Context Lengthening

Our original Deepscaler-1.5B-Preview scaled long context training from 8K→16K→24K, achieving 33→38→43% on AIME respectively. Similarly, Deepcoder-14B-Preview is trained on 16K→32K, achieving 54→58% on LiveCodeBench (v5). DeepCoder-14B-Preview successfully generalizes to longer contexts when evaluated at 64K context, reaching 60.6%.

DeepCoder generalizes better to long contexts than the base distilled model, due to DAPO's overlong filtering. However, it's longer responses are often truncated when the max length is capped at 16K, which can lower its scores.

Model	16K	32K	64K
DeepCoder-14B-Preview	45.6	57.9	60.6
DeepSeek-R1-Distill-Qwen-14B	50.2	53.0	53.0

A more detailed description of the training recipe can be found in our blog post .

Evaluation

We evaluate Deepcoder-14B-Preview on various coding benchmarks, including LiveCodeBench (LCBv5), Codeforces, and HumanEval+.

Model	LCB (v5)(8/1/24-2/1/25)	Codeforces Rating	Codeforces Percentile	HumanEval+
DeepCoder-14B-Preview (ours)	60.6	1936	95.3	92.6
DeepSeek-R1-Distill-Qwen-14B	53.0	1791	92.7	92.0
O1-2024-12-17 (Low)	59.5	1991	96.1	90.8
O3-Mini-2025-1-31 (Low)	60.9	1918	94.9	92.6
O1-Preview	42.7	1658	88.5	89
Deepseek-R1	62.8	1948	95.4	92.6
Llama-4-Behemoth	49.4	-	-	-

Serving DeepCoder

Our model can be served using popular high-performance inference systems:

vLLM
Hugging Face Text Generation Inference (TGI)
SGLang
TensorRT-LLM

All these systems support the OpenAI Chat Completions API format.

License

This project is released under the MIT License, reflecting our commitment to open and accessible AI development. We believe in democratizing AI technology by making our work freely available for anyone to use, modify, and build upon. This permissive license ensures that researchers, developers, and enthusiasts worldwide can leverage and extend our work without restrictions, fostering innovation and collaboration in the AI community.

Acknowledgement

Our training experiments are powered by our heavily modified fork of Verl , an open-source post-training library.
Our model is trained on top of DeepSeek-R1-Distill-Qwen-14B .
Our work is done as part of Berkeley Sky Computing Lab and Berkeley AI Research .

Citation

@misc{deepcoder2025,
  title={DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level},
  author={Michael Luo, Sijun Tan, Roy Huang, Xiaoxiang Shi, Rachel Xin, Colin Cai, Ameen Patel, Alpay Ariyak, Qingyang Wu, Ce Zhang, Li Erran Li, Raluca Ada Popa, Ion Stoica, Tianjun Zhang},
  howpublished={\url{https://pretty-radio-b75.notion.site/DeepCoder-A-Fully-Open-Source-14B-Coder-at-O3-mini-Level-1cf81902c14680b3bee5eb349a512a51}},
  note={Notion Blog},
  year={2025}
}

Runs of agentica-org DeepCoder-14B-Preview on huggingface.co

495

Total runs

24-hour runs

3-day runs

7-day runs

-15

30-day runs

More Information About DeepCoder-14B-Preview huggingface.co Model

More DeepCoder-14B-Preview license Visit here:

https://choosealicense.com/licenses/mit

DeepCoder-14B-Preview huggingface.co

DeepCoder-14B-Preview huggingface.co is an AI model on huggingface.co that provides DeepCoder-14B-Preview's model effect (), which can be used instantly with this agentica-org DeepCoder-14B-Preview model. huggingface.co supports a free trial of the DeepCoder-14B-Preview model, and also provides paid use of the DeepCoder-14B-Preview. Support call DeepCoder-14B-Preview model through api, including Node.js, Python, http.

DeepCoder-14B-Preview huggingface.co Url

https://huggingface.co/agentica-org/DeepCoder-14B-Preview

agentica-org DeepCoder-14B-Preview online free

DeepCoder-14B-Preview huggingface.co is an online trial and call api platform, which integrates DeepCoder-14B-Preview's modeling effects, including api services, and provides a free online trial of DeepCoder-14B-Preview, you can try DeepCoder-14B-Preview online for free by clicking the link below.

agentica-org DeepCoder-14B-Preview online free url in huggingface.co:

https://huggingface.co/agentica-org/DeepCoder-14B-Preview

DeepCoder-14B-Preview install

DeepCoder-14B-Preview is an open source model from GitHub that offers a free installation service, and any user can find DeepCoder-14B-Preview on GitHub to install. At the same time, huggingface.co provides the effect of DeepCoder-14B-Preview install, users can directly use DeepCoder-14B-Preview installed effect in huggingface.co for debugging and trial. It also supports api for free installation.

DeepCoder-14B-Preview install url in huggingface.co:

https://huggingface.co/agentica-org/DeepCoder-14B-Preview

huggingface.co

agentica-org/DeepScaleR-1.5B-Preview

Total runs: 8.2K

Run Growth: -8.6K

Growth Rate: -103.97%

Updated:April 10 2025

huggingface.co

agentica-org/DeepSWE-Preview

Total runs: 648

Run Growth: 535

Growth Rate: 82.56%

Updated:July 03 2025

huggingface.co

agentica-org/DeepCoder-1.5B-Preview

Total runs: 157

Run Growth: -47

Growth Rate: -29.94%

Updated:April 10 2025

huggingface.co

agentica-org/DeepSWE-Verifier

Total runs: 21

Run Growth: 0

Growth Rate: 0.00%

Updated:July 02 2025

agentica-org / DeepCoder-14B-Preview

Introduction of DeepCoder-14B-Preview

Model Details of DeepCoder-14B-Preview

DeepCoder Overview

Data

Training Recipe

GRPO+

Iterative Context Lengthening

Evaluation

Serving DeepCoder

License

Acknowledgement

Citation

Runs of agentica-org DeepCoder-14B-Preview on huggingface.co

More Information About DeepCoder-14B-Preview huggingface.co Model

More DeepCoder-14B-Preview license Visit here:

DeepCoder-14B-Preview huggingface.co

DeepCoder-14B-Preview huggingface.co Url

agentica-org DeepCoder-14B-Preview online free

agentica-org DeepCoder-14B-Preview online free url in huggingface.co:

DeepCoder-14B-Preview install

DeepCoder-14B-Preview install url in huggingface.co:

Url of DeepCoder-14B-Preview

DeepCoder-14B-Preview huggingface.co Url

Provider of DeepCoder-14B-Preview huggingface.co

Other API from agentica-org