DeepSeek-R1

OpenAI o1-level reasoning through large-scale reinforcement learning

Released on 2025.01.20

Overview

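DeepSeek-R1 is a reasoning model with performance comparable to OpenAI o1 on math, code, and reasoning tasks. Its reasoning ability comes mainly from large-scale reinforcement learning rather than from supervised fine-tuning on large amounts of chain-of-thought data: the precursor DeepSeek-R1-Zero was trained with RL alone, and DeepSeek-R1 adds a small cold-start fine-tuning stage before RL.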

Key Features

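  • Performance comparable to OpenAI o1 on reasoning benchmarks
  • Reasoning learned through large-scale RL; the R1-Zero precursor uses pure RL without SFT on CoT data
  • Open-source under the MIT License
  • Distilled versions available (1.5B to 70B parameters)
  • Emergent reasoning behaviors such as self-verification and reflection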
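
RL on reasoning without labeled chain-of-thought data generally depends on rewards that can be checked automatically from the final output. The sketch below is a toy illustration of that idea, not DeepSeek's training code: the `<think>`/`<answer>` tag layout, the reward weights, and the function name `rule_based_reward` are all assumptions made for the example.

```python
import re

# Assumed output layout: reasoning inside <think>, final answer inside <answer>.
THINK_ANSWER = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>\s*$",
    re.DOTALL,
)

def rule_based_reward(completion: str, reference: str) -> float:
    """Score a completion with two rule-based signals (hypothetical weights).

    - format reward: the completion follows the assumed tag layout
    - accuracy reward: the extracted answer equals the reference string
    """
    match = THINK_ANSWER.match(completion.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    format_reward = 0.5
    answer = match.group(2).strip()
    accuracy_reward = 1.0 if answer == reference.strip() else 0.0
    return format_reward + accuracy_reward


if __name__ == "__main__":
    sample = "<think>2 + 2 = 4</think><answer>4</answer>"
    print(rule_based_reward(sample, "4"))
```

Because the reward looks only at the format and the final answer, no human-written reasoning trace is needed; the content of the reasoning itself is left for the policy to discover.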

Specifications

Parameters: 671B (based on V3)
Architecture: MoE + RL Reasoning
Context Length: 128K tokens
Benchmark: AIME 2024: 79.8%, MATH-500: 97.3%
License: MIT License
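
To put the 671B parameter count in practical terms, the following back-of-the-envelope sketch estimates the storage needed for the weights alone at a few numeric precisions. The precision list and the helper name `weight_memory_gb` are assumptions for illustration; the estimate ignores activations, the KV cache, and any runtime overhead, and says nothing about which precisions the released checkpoints actually use.

```python
# Total parameter count taken from the specifications above.
TOTAL_PARAMS = 671e9

# Bytes per parameter for some common storage formats (illustrative list).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, width):,.0f} GB")
```

Even at one byte per parameter the weights come to roughly 671 GB, which is one reason the smaller distilled versions are useful for single-machine deployment.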

Resources