How AI Chatbots Such as ChatGPT and DeepSeek Reason

    In September, OpenAI revealed a new version of ChatGPT that was designed to reason through tasks involving mathematics, science and computer programming. Unlike earlier versions of the chatbot, this new technology could spend time 'thinking' through complex problems before settling on an answer.

    The company soon said that its new reasoning technology had outperformed the industry's leading systems on a series of tests that track the progress of artificial intelligence.

    Now other companies, such as Google, Anthropic and China's DeepSeek, offer similar technologies.

    But can AI actually reason like a person? What does it mean for a computer to think? Do these systems really approach true intelligence?

    Here is a guide.

    Reasoning simply means that the chatbot spends some extra time working on a problem.

    “Reasoning is when the system does extra work after the question has been asked,” said Dan Klein, a professor of computer science at the University of California, Berkeley, and Chief Technology Officer of Scaled Cognition, an AI start-up.

    It can break a problem into individual steps or try to solve it through trial and error.

    The original ChatGPT answered questions immediately. The new reasoning systems can work through a problem for several seconds, or even minutes, before answering.

    In some cases, a reasoning system will refine its approach to a question, repeatedly trying to improve the method it has chosen. Other times, it may try several different ways of approaching a problem before settling on one of them. Or it may go back and check some work it did a few seconds earlier, just to see whether it was correct.

    In short, the system tries everything it can to answer your question.

    It is a bit like a grade school student who is struggling to find a way to solve a math problem and scribbles several different options on a sheet of paper.
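
    The behavior described above can be sketched in a few lines of code. The example below is only a minimal illustration, not how any commercial system actually works: a hypothetical solver proposes several candidate answers, checks each one, and only settles on a result it can verify.

```python
import random

def propose_candidates(problem, attempts=5):
    """Stand-in for a model generating several possible solution paths."""
    # In a real system, each candidate would be a whole chain of reasoning steps.
    return [random.randint(problem["low"], problem["high"]) for _ in range(attempts)]

def check(problem, candidate):
    """Stand-in for going back and verifying a piece of earlier work."""
    return candidate * candidate == problem["target"]

def solve_by_trial_and_error(problem):
    # Try several approaches, check each one, and keep the first that passes.
    for candidate in propose_candidates(problem):
        if check(problem, candidate):
            return candidate
    return None  # No verified answer found; a real system might simply try again.

# Toy problem: find an integer between 1 and 10 whose square is 49.
print(solve_by_trial_and_error({"low": 1, "high": 10, "target": 49}))
```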

    These systems can apply reasoning to any kind of question, but it is most effective when you ask about mathematics, science and computer programming.

    With earlier chatbots, you could ask them to show you how they had reached a particular answer or to check their own work. Because the original ChatGPT had learned from text on the internet, where people showed how they had arrived at an answer or checked their own work, it was also capable of this kind of self-reflection.

    But a reasoning system goes further. It can do these things without being asked. And it can do them in more extensive and complex ways.
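
    As a rough illustration of the difference, the sketch below contrasts explicitly asking an ordinary chatbot to show and check its work with relying on a reasoning model that does so on its own. The `ask_chatbot` function and the model names are hypothetical placeholders, not a real API.

```python
def ask_chatbot(model: str, prompt: str) -> str:
    """Hypothetical placeholder for a call to some chatbot API."""
    return f"[{model}] response to: {prompt!r}"

question = "A train leaves at 2:40 p.m. and arrives at 4:05 p.m. How long is the trip?"

# With an ordinary chatbot, the user has to request the self-reflection explicitly.
plain = ask_chatbot(
    model="ordinary-chatbot",
    prompt=question + " Show your steps and check your own work.",
)

# A reasoning model is trained to break the problem into steps and verify them
# on its own, so the bare question is enough.
reasoned = ask_chatbot(model="reasoning-model", prompt=question)

print(plain)
print(reasoned)
```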

    Companies call it a reasoning system because it feels as if it works more like a person thinking through a difficult problem.

    Companies like OpenAI believe this is the best way to improve their chatbots.

    For years, these companies relied on a simple concept: the more internet data they pumped into their chatbots, the better those systems performed.

    But in 2024, they had used up almost all of the text on the internet.

    That meant that they needed a new way to improve their chatbots. So they started building reasoning systems.

    Last year, companies such as OpenAI began leaning heavily on a technique called reinforcement learning.

    Through this process, which can stretch over months, an AI system can learn behavior through extensive trial and error. By working through thousands of math problems, for example, it can learn which methods lead to the correct answer and which do not.

    Researchers design complex feedback mechanisms that show the system when it has done something right and when it has done something wrong.

    “It's a bit like training a dog,” said Jerry Tworek, an OpenAI researcher. “If the system is doing well, you give it a cookie. If things don't go well, you say, ‘Bad dog.’”
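
    A heavily simplified sketch of that kind of feedback loop is below. It assumes a toy setup in which every practice problem has a known correct answer; the model, the rewards and the update step are placeholders for what is, in practice, a far more elaborate reinforcement learning pipeline.

```python
import random

# Toy practice set: every problem has a known correct answer, which is
# part of what makes math such a convenient domain for this training.
problems = [
    {"question": "2 + 2", "answer": 4},
    {"question": "3 * 5", "answer": 15},
    {"question": "10 - 7", "answer": 3},
]

def model_attempt(question: str) -> int:
    """Placeholder for the system proposing an answer (sometimes wrong)."""
    return eval(question) if random.random() > 0.3 else random.randint(0, 20)

def update_model(reward: float) -> None:
    """Placeholder for nudging the model toward rewarded behavior."""
    pass

for step in range(1000):  # Training like this can run for a very long time.
    problem = random.choice(problems)
    attempt = model_attempt(problem["question"])
    # The feedback mechanism: a "cookie" when the answer is right,
    # a "bad dog" signal when it is wrong.
    reward = 1.0 if attempt == problem["answer"] else -1.0
    update_model(reward)
```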

    (The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to AI systems.)

    This approach works quite well in certain areas, such as mathematics, science and computer programming, where companies can clearly define good behavior and bad. Math problems have definitive answers.

    Reinforcement learning does not work as well in areas such as creative writing, philosophy and ethics, where the distinction between good and bad is harder to pin down. Researchers say the process can still improve an AI system's overall performance, even when it answers questions outside mathematics and science.

    “It gradually learns which reasoning patterns lead it in the right direction and which do not,” said Jared Kaplan, Chief Science Officer at Anthropic.

    Reinforcement learning is not the same thing as reasoning, though. It is the method that companies use to build reasoning systems: the training phase that ultimately allows chatbots to reason.

    And these systems still make mistakes. Everything a chatbot does is based on probabilities. It chooses the path that most resembles the data it learned from, whether that data came from the internet or was generated through reinforcement learning. Sometimes it chooses an option that is wrong or illogical.
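
    The following is a minimal sketch of what choosing the most likely path can look like. The vocabulary and probabilities are invented for illustration; real systems choose among tens of thousands of tokens at every step.

```python
import random

# Invented probabilities for the next token after "2 + 2 =".
next_token_probs = {"4": 0.90, "5": 0.06, "22": 0.04}

tokens = list(next_token_probs)
weights = list(next_token_probs.values())

# The chatbot samples from this distribution rather than looking up a fact,
# so occasionally it picks a wrong or illogical continuation.
choice = random.choices(tokens, weights=weights, k=1)[0]
print("2 + 2 =", choice)
```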

    On the question of whether these systems approach true intelligence, AI experts are divided. These methods are still relatively new, and researchers are still trying to understand their limits. In the AI field, new methods often progress very quickly at first before slowing down.