Hi! Here's your Friday, November 29, 2024 edition of AI in the News. Today, let's dive into "reasoning models".
Let’s start with a recap of the latest developments:
This week, Alibaba launched Qwen with Questions (QwQ), a 32-billion-parameter open-source reasoning model (VentureBeat)
- It outperforms OpenAI’s o1-preview on the AIME and MATH benchmarks, but is less effective on LiveCodeBench coding tasks.
- The model emphasizes reflection and self-questioning, leading to improved problem-solving capabilities, as noted in their blog post: “When given time to ponder, question, and reflect, the model’s understanding of mathematics and programming blossoms.”
- QwQ is available under an Apache 2.0 license for commercial use, and its open nature allows users to understand its reasoning process, despite the lack of detailed training documentation.
Last week, DeepSeek made headlines by releasing R1 (TechCrunch)
- R1 is also a reasoning model that competes with OpenAI’s o1, capable of fact-checking itself and spending more processing time on complex queries.
- The model performs comparably to o1 on the AIME and MATH benchmarks but struggles with certain logic problems and can be easily jailbroken.
- DeepSeek-R1 is backed by High-Flyer Capital Management, which aims to build “superintelligent” AI.
In September, OpenAI released o1, its first model with ‘reasoning’ abilities (The Verge)
- o1 is designed to solve complex problems and write code more effectively than previous models, though it is slower and more expensive than GPT-4.
- The model uses a new training methodology involving reinforcement learning and a “chain of thought” approach, resulting in improved accuracy and fewer hallucinations, though the issue is not eliminated.
🤔 Which leaves us with this question: What’s a reasoning model?
- These models rely on “chain-of-thought reasoning”: they break a complex task down into simpler intermediate steps before answering. This makes them well suited to tasks in coding, math, and science, for example.
- A good example is the “strawberry test”: counting how many “r”s appear in the word “strawberry.” Many “ordinary” models fail this test, while reasoning models typically pass it (though some non-reasoning models also find the correct answer); see the sketch after this list.
- Most models are built in two stages: pre-training on large general datasets and fine-tuning with curated, expert-annotated data for specific tasks. "There are indications that some of the o1 AI models were trained on extensive examples of chain-of-thought reasoning that have been annotated by experts." "This raises questions about the extent to which self-improvement, rather than expert-guided training, contributes to its capabilities."
These models take different approaches to the transparency of their chain of thought. For example, “the reflection that o1 performs upon its reasoning is not available to be examined, depriving users of insights into the system’s functioning,” whereas DeepSeek displays its reasoning.
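To illustrate the difference, here is a minimal sketch of separating a model’s visible reasoning from its final answer. It assumes, purely for illustration, that the reasoning is wrapped in `<think>…</think>` tags; actual delimiters and visibility vary by provider.

```python
import re

# Hypothetical raw model output; the <think> delimiters are an assumption
# made for illustration -- providers expose (or hide) reasoning differently.
raw_output = (
    "<think>The word is strawberry: s-t-r-a-w-b-e-r-r-y. "
    "I see an r at positions 3, 8, and 9, so the count is 3.</think>"
    'There are 3 occurrences of the letter "r" in "strawberry".'
)

match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
reasoning = match.group(1) if match else ""
answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()

print("Reasoning:", reasoning)  # the chain of thought, if the model exposes it
print("Answer:", answer)        # the user-facing response
```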
If you want to explore some of these models:
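Here is a minimal sketch using the Hugging Face transformers library to try QwQ locally. The model identifier “Qwen/QwQ-32B-Preview” is the one published at release; treat it and the settings below as assumptions to verify against the model card, and note that a 32-billion-parameter model needs substantial GPU memory (or quantization) to run.

```python
# Minimal sketch: running QwQ locally with Hugging Face transformers.
# Assumes the published model id "Qwen/QwQ-32B-Preview" and enough GPU
# memory for a 32B model; check the model card before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": 'How many "r"s are in the word "strawberry"?'}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At the time of writing, DeepSeek’s R1 preview was accessible through its chat interface rather than as downloadable weights, so trying it means using DeepSeek’s own site.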
Briefly Noted
A group of news organizations has filed a copyright lawsuit against OpenAI
- They claim that OpenAI's AI models use their content without permission, violating copyright law.
- The suit seeks up to $20,000 in statutory damages per article used by OpenAI; at that rate, 100,000 articles alone would come to $2 billion, which could put the total value of the suit in the billions of dollars.
- The organizations argue that this practice undermines their business and threatens the future of journalism, stating, "We deserve to be compensated for our work."