Developers caught DeepSeek R1 having an ‘aha moment’ on its own during training

Chinese startup DeepSeek took the world by storm this month, especially in the past few days, with its ChatGPT rivals. The latest is DeepSeek R1, and research published by DeepSeek shows the reasoning model can match ChatGPT o1, OpenAI’s publicly available reasoning AI model.

There’s a big difference between the two. The Chinese developer created R1 without access to the same computing power that US companies have. While OpenAI can afford to buy any high-end chips NVIDIA makes, DeepSeek has limited access to the latest GPUs, and these units likely have to be smuggled into the country.

The DeepSeek R1 announcement hit the market directly, with AI stocks dipping in early Monday trading on news that China is already overcoming the ban on AI chips with new ideas for training AI.

The DeepSeek R1 developers relied mostly on reinforcement learning (RL) to improve the AI’s reasoning abilities. This training method uses a reward system to give the AI feedback, which made DeepSeek R1 cheaper to train than ChatGPT o1.
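To make the idea of a reward system concrete, here is a toy sketch in Python. It is not DeepSeek's training code: the function name `reward`, the `<think>` tag format, and the score values are all assumptions chosen for illustration. It only shows how a response could be turned into a number that training might then try to increase.

```python
import re

def reward(response: str, correct_answer: str) -> float:
    """Score one model response; higher is better (hypothetical scoring rules)."""
    score = 0.0
    # Small bonus if the response shows its reasoning inside <think> tags (assumed format).
    if re.search(r"<think>.+?</think>", response, flags=re.DOTALL):
        score += 0.2
    # Main reward if the text left after removing the reasoning matches the known answer.
    final = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    if final == correct_answer:
        score += 1.0
    return score

# Three made-up responses to "What is (2 + 2) * 3?"
samples = [
    "<think>2 + 2 is 4, then times 3 is 12.</think>12",  # reasoning shown, right answer
    "12",                                                # right answer, no reasoning
    "<think>2 + 2 is 5, then times 3 is 15.</think>15",  # reasoning shown, wrong answer
]
scores = [reward(s, "12") for s in samples]
print(scores)
```

In this sketch the response that reasons step by step and lands on the right answer scores highest, so feedback of this kind would nudge a model toward that behavior.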

RL allows the AI to adapt while tackling prompts and problems, using feedback to improve itself. To prove this point, the researchers published a fragment of the AI’s chain-of-thought (CoT), the step-by-step reasoning process that models like o1 and R1 go through.

While solving a math problem, the ChatGPT rival had an “aha moment,” labeling it as such. This was, in turn, an “aha moment” for the researchers.

Developers caught DeepSeek R1 having an ‘aha moment’ on its own during training originally appeared on BGR.com on Mon, 27 Jan 2025 at 13:57:00 EDT.

Vikram Sharma
