AI That Thinks Before It Speaks - OpenAI "o1" models

What is OpenAI 'o1'? How Does It Work? When To Use o1-preview & o1-mini?

Richard Warepam

Sep 18, 2024

You might have heard that OpenAI has just released two new models in ChatGPT. But this time, things are different.

People keep saying that OpenAI has stretched its limits with these new models.

Why is that? “Reasoning/deep thinking” is the ultimate reason.

The new models are as follows:

ChatGPT o1-preview
ChatGPT o1-mini

These new models differ from previous iterations in the GPT series. They focus heavily on how the AI approaches problem-solving.

But before I go into detail about how these new models work, I want to highlight some of the ground-breaking results they produced.

Because these models focus on problem-solving, they do exceptionally well at tasks that involve logic and deep thinking. Consider coding, mathematics, and, to some extent, PhD-level science.

Take a look at the graphs.

Source: OpenAI (https://openai.com/index/learning-to-reason-with-llms/)

In terms of coding, when the "o1" model was tested on Codeforces competitions under the same constraints as human contestants (us), its Elo rating reached the 89th percentile according to OpenAI's report, and a further fine-tuned version reached the 93rd percentile. That is a significant improvement over earlier models such as GPT-4o, which was at the 11th percentile.

When this model was evaluated on several mathematics assessments, it produced encouraging results. On the AIME (an exam meant to challenge bright maths students), it averaged about 74% with a single attempt per problem and reached up to 93% (13.9/15) when the best of many sampled answers was selected, a score that would put it among the top 500 students nationally. How did GPT-4o perform? It averaged roughly 12%. So this is a significant step forward.
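To make the AIME figures concrete, here is a small sketch that converts between percentages and problems solved. It assumes a 15-problem exam, taken from the "13.9/15" figure above; the helper names are my own and purely illustrative.

```python
TOTAL_PROBLEMS = 15  # assumption: exam length, read off the "13.9/15" figure


def to_percent(problems_solved, total=TOTAL_PROBLEMS):
    """Convert an average number of solved problems into a percentage."""
    return 100 * problems_solved / total


def to_problems(percent, total=TOTAL_PROBLEMS):
    """Convert a percentage score back into an average number of solved problems."""
    return percent / 100 * total


print(round(to_percent(13.9)))     # 93   -> the best reported o1 score
print(round(to_problems(74), 1))   # 11.1 -> o1's average, in problems solved
print(round(to_problems(12), 1))   # 1.8  -> GPT-4o's average, in problems solved
```

In other words, the jump from roughly 12% to roughly 74% means going from about 2 solved problems to about 11 out of 15.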

Finally, it made significant advances in both PhD-level science, such as physics, and formal logic. On a benchmark of PhD-level science questions, OpenAI reports that the model outperformed human experts with PhDs, which shows its potential for complex tasks as well.

After all of the above successes, you might be wondering, “How does it work differently from the other GPT series models?”

How Deep Thinking Works

As I mentioned earlier, this model focuses primarily on the way the AI approaches problem-solving, and that approach is what distinguishes “o1” from all the other models. But how?

It does not simply generate a response to a query as the others do; instead, it works through the given problem logically and methodically, much as a human would, and only then outputs the result.

OpenAI describes this step-by-step reasoning as a “chain of thought.” In this post, I call the entire process of thinking, analyzing step by step, and then generating the output “Deep Thinking.”

The steps that it takes from a high-level perspective are as follows:

Analyzing the problem (question)
Taking into account the constraints
Exploring potential solutions
Reworking and backtracking as needed
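To get a feel for these steps, here is a toy sketch in Python. It is only an analogy and not how o1 works internally (OpenAI has not published that level of detail). The puzzle, the function name `solve`, and the trace format are all my own assumptions: the code picks numbers that add up to a target, respects a constraint, explores candidates, and retracts a choice when it leads to a dead end.

```python
def solve(numbers, target):
    """Toy 'think before answering' loop (illustrative only).

    Assumes all numbers are positive. Returns the chosen numbers
    (or None) plus a trace of every intermediate step.
    """
    steps = []  # trace of the intermediate "thoughts"

    def explore(start, chosen, remaining):
        # Step 1: analyze the current state of the problem.
        steps.append(f"chosen={chosen}, still need {remaining}")
        if remaining == 0:
            return list(chosen)  # a complete, verified answer
        for i in range(start, len(numbers)):
            # Step 2: respect the constraint (never overshoot the target).
            if numbers[i] > remaining:
                continue
            # Step 3: explore a potential solution.
            chosen.append(numbers[i])
            result = explore(i + 1, chosen, remaining - numbers[i])
            if result is not None:
                return result
            # Step 4: this path failed, so retract the choice and try another.
            chosen.pop()
        return None

    return explore(0, [], target), steps


answer, steps = solve([8, 6, 7, 5, 3], 16)
print(answer)      # [8, 5, 3]
print(len(steps))  # 6 intermediate steps before the answer was given
```

Notice that the answer only appears after several discarded attempts. That extra work is the price of a more reliable result.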

You already know what comes out of such a cognitive process, right?

Fewer hallucinations and greater precision.

However, it comes at a cost: all that extra reasoning requires a lot of computational power, which makes the system slower and more expensive to operate.

What you were waiting for - When to use which?

Now that you understand how this model operates, it's time to decide which version to use.

You may recall that, at the beginning, I mentioned there were two models: “o1-preview” and “o1-mini.”

Which one, then, performs better?

After using both, I felt that “o1-preview,” the big boy, was a bit of a slowpoke. Why am I saying this? It takes quite some time to answer, perhaps because of all the thinking and deductive reasoning it does.

On the other hand, “o1-mini” works quite quickly and precisely. With just one click, it provides precise and fast answers.

When to use ChatGPT o1-preview:

I know I just said how slow it is. But the slowness has a purpose.

It is slow because of the extensive thinking it goes through before generating an output. Hence, this model is appropriate when a complex task requires in-depth reasoning.

If I had to list specific scenarios, here are some of them:

It would be ideal for complex tasks that require breaking down into simpler steps to achieve the desired result. Let us term this “advanced problem-solving.”
Another example could be during extensive research on any topic. As this kind of research requires multiple viewpoints, I believe this model will work quite well.
Finally, if you intend to feed a model a large amount of information in your prompt to obtain a structured output, this model will handle it effectively, despite its slowness.

Use ChatGPT o1-preview to solve complex tasks where speed is not a priority.

When to use ChatGPT o1-mini:

ChatGPT o1-mini is a new model, which is quick, smart, and efficient. I recommend using this when you need a fast, accurate answer.

First, when you have limited time and need a quick, accurate answer.
Second, when you want straightforward responses without too much detail.
Finally, when you need to make a fact-based decision and want accurate, timely responses to support it.

If you carefully look at the cases, you will agree with me that this model is best suited for:

Content creation
Brainstorming and quick feedback

Use ChatGPT o1-mini when you need factually correct answers and speed is a priority.
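The two rules of thumb above can be written down as a tiny helper. This is just a sketch of my own recommendations, not an official API: the function name `pick_model` and its two flags are hypothetical, and the case where a task is complex but speed still matters is not covered by the rules, so falling back to “o1-mini” there is my assumption.

```python
def pick_model(needs_deep_reasoning: bool, speed_is_priority: bool) -> str:
    """Return the model name suggested by this article's rules of thumb."""
    # Complex task and you can afford to wait: take the slower, deeper thinker.
    if needs_deep_reasoning and not speed_is_priority:
        return "o1-preview"
    # Everything else, including the uncovered "complex but urgent" case
    # (an assumption on my part): take the faster model.
    return "o1-mini"


print(pick_model(needs_deep_reasoning=True, speed_is_priority=False))   # o1-preview
print(pick_model(needs_deep_reasoning=False, speed_is_priority=True))   # o1-mini
```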

Wrapping Up

In this article, I have covered what you need to know to get started with the new model “ChatGPT o1.”

Let us conclude this post by considering the human assessment of this model.

According to the human preference evaluation published by OpenAI:

This model is not particularly suitable for tasks such as copywriting, personal writing, and text editing.
On the other hand, it is better suited than all previous models for complex tasks such as programming, data analysis, and mathematical calculations.
