r1

Here are 41 public repositories matching this topic...

modelscope / awesome-deep-reasoning

Collect every awesome work about r1!

collection rl reasoning r1 o1 qwen deepseek grpo

Updated Feb 27, 2025
Python

RyanLiu112 / compute-optimal-tts

Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

r1 o1 large-language-model test-time-scaling

Updated Feb 19, 2025
Python

SmallDoges / small-doge

Doge family of small language models

python nlp natural-language-processing reinforcement-learning deep-learning pytorch transformer chinese attention-mechanism r1 attention-is-all-you-need mechine-learning foundation-models small-language-models dynamic-mask-attention cross-domain-mixture-of-experts deepseek-r1

Updated Feb 27, 2025
Python

CJReinforce / PURE

SOTA RL fine-tuning solution for advanced math reasoning of LLM

reinforcement-learning mathematics rl reasoning r1 o1 llm reinforcement-finetuning

Updated Feb 23, 2025
Python

DMontgomery40 / deepseek-mcp-server

Model Context Protocol server for DeepSeek's advanced language models

mcp r1 deepseek-chat deepseek-api model-context-protocol deepseek-v3 deepseek-r1

Updated Feb 13, 2025
JavaScript

LazaUK / AIFoundry-DeepSeek-SDK

Notebooks to demo the use of Azure AI Python SDK / LangChain with DeepSeek R1 reasoning model in Azure AI Foundry.

python sdk ai azure openai foundry r1 langchain deepseek

Updated Feb 6, 2025
Jupyter Notebook

glide-the / InterpretationoDreams

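Uses LangChain for task planning and builds conversational scenario resources for each subtask. An MCTS task executor then lets each subtask draw on the resources in its context and explore through self-reflection to reach its own best answer to the problem. This approach depends on the model's alignment preferences; for each preference we designed an engineering framework that applies a sampling strategy over the rewards the model assigns to different answers.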

task-planning r1 cot mcts-agents deep-research

Updated Feb 24, 2025
Jupyter Notebook

lachlancresswell / AutoR1

Auto-generate fallback and meter display from existing group info in d&b audiotechnik's R1 and ArrayCalc software.

r1 dbaudio dbaudiotechnik arraycalc

Updated Mar 7, 2024
Python

The-Swarm-Corporation / AgentGym

A framework that makes it effortless to convert any LLM into a reasoning agent like o1 or DeepSeek's R1

ai rl agents alibaba r1 o1 llms qwen deepseek

Updated Feb 10, 2025
Python

IoTDevice / phicomm-r1-controler

Control program for the Phicomm R1 speaker

phicomm r1 yinxiang feixun

Updated Feb 28, 2021
Go

sdiehl / tiny-r1

Recreating the minimal training methods of DeepSeek-R1 for small language models.

reasoning r1 grpo grpotrainer

Updated Feb 10, 2025
Python

ericsson-iap / python-sample-app

Python Sample App for SMO Systems like Ericsson Intelligent Automation Platform. We aim to be ORAN aligned. Use this to kickstart your own app!

python smo 3gpp r1 eic ric ran oran o-ran rapp eiap non-rt-ric

Updated Oct 25, 2024
Python

nschlaepfer / ChainForge-R1-SuperCoT

A multi-stage pipeline that enhances Qwen2.5 language models with DeepSeek Reasoner's chain-of-thought capabilities. Implements the DeepSeek-R1 methodology through cold-start SFT, reasoning-oriented RL, rejection sampling, and optional model distillation.

training ai reasoning r1 qwen deepseek deepseek-r1 cold-start-sft

Updated Jan 24, 2025
Python

turningpoint-ai / VisualThinker-R1-Zero

Explore the Multimodal “Aha Moment” on 2B Model

reinforcement-learning reasoning r1 post-training multimodal deepseek deepseek-r1 grpo deepseek-r1-zero r1-zero multimodal-journey multimodal-r1

Updated Feb 27, 2025
Python

OnerootProject / r1

R1 Protocol

protocol exchange dex r1

Updated Mar 7, 2019
JavaScript

lechmazur / goods

LLM public goods game

evaluation economics r1 o1 llm o3-mini

Updated Feb 22, 2025

RyanLiu112 / Awesome-Process-Reward-Models

A comprehensive collection of process reward models.

r1 o1 large-language-model process-reward-model

Updated Feb 24, 2025

tyler-romero / microR1

Simple repository for training small reasoning models

reasoning r1 deepseek grpo

Updated Feb 6, 2025
Python

ericsson-iap / go-sample-app

Go Sample App for SMO Systems like Ericsson Intelligent Automation Platform. We aim to be ORAN aligned. Use this to kickstart your own app!

golang smo 3gpp r1 eic ric ran oran o-ran rapp eiap non-rt-ric

Updated Oct 8, 2024
Go

Xuchen-Li / OvO-R1

Exploring the influence of using end-to-end reinforcement learning and various reward functions on the reasoning capabilities of different 1.5B base models.

r1 openr1

Updated Feb 16, 2025
Python
