Chinese Open-Source AI DeepSeek R1 Matches OpenAI's o1 at 98% Lower Cost
The DeepSeek R1 model now delivers results that are equivalent to or better than OpenAI's systems and remains both free and open-source.
Chinese AI researchers have achieved what many thought was light years away: a free, open-source AI model that matches or surpasses OpenAI's leading reasoning systems. What makes this even more remarkable is how they did it: by letting the AI teach itself through trial and error, much as humans learn.
The research paper describes DeepSeek-R1-Zero as a model that exhibits impressive reasoning abilities through large-scale reinforcement learning without any supervised fine-tuning process.
Reinforcement learning teaches a model to tell good decisions from bad ones by rewarding the former and penalizing the latter, without ever showing it the correct answer in advance. Over many attempts, the model learns to follow the paths that those outcomes reinforce.
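As a rough illustration of that reward-driven loop, here is a minimal Python sketch of a toy "multi-armed bandit" learner. It is not DeepSeek's training method, which applies reinforcement learning to a large language model at scale. The function name `train_bandit`, the reward probabilities, and the exploration rate `epsilon` are all hypothetical choices made for this example. The learner is never told which option is best; it only sees a reward or no reward after each try.

```python
import random

def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative only, not DeepSeek's method).

    reward_probs: hidden chance that each action pays a reward of 1.
    Returns the learned value estimate and pull count for each action.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n      # how often each action was tried
    values = [0.0] * n    # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n)
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(n), key=lambda a: values[a])

        # The environment gives feedback only as a reward signal.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Nudge the estimate for this action toward the observed reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts

# Three hypothetical actions; the third pays off most often.
values, counts = train_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("preferred action:", best)
```

After enough trials, the learner's estimates should favor the action that was rewarded most often, even though nobody labeled it as the right one. That is the same basic idea, in miniature, as a model discovering which behaviors pay off.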
In the conventional approach, training begins with supervised fine-tuning, in which humans give the model examples of desired outputs to set standards for good and bad results. The model then moves on to reinforcement learning, where it generates a range of outputs and humans evaluate them to pick the best ones. This cycle repeats until the model consistently produces satisfactory results.
DeepSeek R1 represents a new direction in AI development because its training involves minimal human involvement. It relies mainly on automated reinforcement learning, which lets it learn by experimenting with different actions and using the feedback it receives to work out which ones succeed.
The research paper states that DeepSeek-R1-Zero develops powerful and interesting reasoning behaviors through reinforcement learning. The model acquired advanced abilities such as self-verification and reflection even though these capabilities were not directly programmed into it.
The performance numbers are impressive. DeepSeek R1 scored 79.8% on the AIME 2024 mathematics benchmark, topping OpenAI's o1 reasoning model. On standardized coding tests it reached "expert level" performance, with a Codeforces Elo rating of 2,029 that surpasses 96.3% of human competitors.