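from transformers import pipeline

# Prompt sent to the model as a single user turn.
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
# Load the model into a text-generation pipeline; device="cuda" assumes a CUDA GPU is available (use "cpu" otherwise).
generator = pipeline("text-generation", model="igorktech/grpo", device="cuda")
# Pass the prompt in chat format; return_full_text=False returns only the newly generated reply.
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])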
Cite GRPO as:
@article{shao2024deepseekmath,
    title  = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
    year   = 2024,
    eprint = {arXiv:2402.03300},
}
Cite TRL as:
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}