DeepSeek-R1
OpenAI o1-level reasoning through pure reinforcement learning
Released on 2025.01.20
Overview
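DeepSeek-R1 is a reasoning model that reaches performance comparable to OpenAI o1 on math and reasoning benchmarks. Its reasoning ability comes from large-scale reinforcement learning (RL) rather than supervised fine-tuning (SFT) on chain-of-thought (CoT) data. Strictly speaking, the variant trained with RL alone is DeepSeek-R1-Zero; DeepSeek-R1 adds a small cold-start SFT stage before RL.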
Key Features
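- Performance comparable to OpenAI o1 on reasoning benchmarks
- Reasoning learned through RL rather than SFT on CoT data
- Open-source under the MIT License
- Distilled versions available, from 1.5B to 70B parameters
- Emergent reasoning behaviors such as self-verification and reflection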
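To make "RL without SFT on CoT data" concrete: in this kind of setup the training signal can depend only on whether the final answer is correct, not on how closely the reasoning text matches a human-written solution. The sketch below is a minimal illustration of such an outcome-based reward, not DeepSeek's training code. The function names and the convention that the final answer appears inside `\boxed{...}` are assumptions made for the example.

```python
import re
from typing import Optional

# Assumed convention (hypothetical): the final answer is wrapped in \boxed{...}.
_BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in the completion, if any."""
    matches = _BOXED.findall(completion)
    return matches[-1].strip() if matches else None


def outcome_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the extracted final answer equals the reference, else 0.0.

    The reasoning text itself is never scored; only the outcome is.
    """
    answer = extract_final_answer(completion)
    if answer is not None and answer == reference.strip():
        return 1.0
    return 0.0


if __name__ == "__main__":
    good = "First, 12 * 12 = 144. Then 144 + 1 = 145. \\boxed{145}"
    bad = "I think the answer is \\boxed{144}"
    print(outcome_reward(good, "145"))
    print(outcome_reward(bad, "145"))
```

Because the reward ignores the intermediate text, the policy is free to discover its own reasoning style, which is one plausible reading of how the emergent behaviors listed above can arise.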
Specifications
- Parameters: 671B (based on DeepSeek-V3)
- Architecture: MoE + RL reasoning
- Context Length: 128K tokens
- Benchmarks: AIME 2024 79.8%, MATH-500 97.3%
- License: MIT License
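The benchmark percentages are easier to read as problem counts. The short sketch below converts them, assuming the standard benchmark sizes of 30 problems for AIME 2024 and 500 for MATH-500; these counts are properties of the benchmarks themselves and are not stated on this page. The helper name `expected_solved` is hypothetical.

```python
# Accuracy figures are taken from the specifications above.
# Problem counts are an assumption based on the standard benchmark sizes.
BENCHMARKS = {
    "AIME 2024": {"accuracy_pct": 79.8, "num_problems": 30},
    "MATH-500": {"accuracy_pct": 97.3, "num_problems": 500},
}


def expected_solved(accuracy_pct: float, num_problems: int) -> float:
    """Average number of problems solved at the given accuracy."""
    return accuracy_pct / 100.0 * num_problems


if __name__ == "__main__":
    for name, b in BENCHMARKS.items():
        solved = expected_solved(b["accuracy_pct"], b["num_problems"])
        print(f"{name}: about {solved:.1f} of {b['num_problems']} problems")
```

Under these assumptions the scores correspond to roughly 24 of 30 AIME problems and 486 to 487 of 500 MATH-500 problems on average.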