Why DeepSeek could change what Silicon Valley believes about AI

    The artificial intelligence breakthrough that is sending shock waves through stock markets, rattling Silicon Valley giants and generating breathless takes about the end of America's technological dominance arrived with a modest, wonky title: “Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.”

    The 22-page paper, released last week by a scrappy Chinese AI start-up called DeepSeek, did not immediately set off alarm bells. It took a few days for researchers to digest the paper's claims and the implications of what it described. The company had created a new AI model called DeepSeek-R1, built by a team of researchers who claimed to have used a modest number of second-rate AI chips to match the performance of leading American AI models at a fraction of the cost.

    DeepSeek said it had done this by using clever engineering to substitute for raw computing power. And it had done it in China, a country that many experts thought was in a distant second place in the global AI race.

    Some industry watchers initially responded to DeepSeek's breakthrough with disbelief. Surely, they thought, DeepSeek had cheated to achieve R1's results, or fudged its numbers to make its model look more impressive than it was. Perhaps the Chinese government was promoting propaganda to undermine the narrative of American AI dominance. Perhaps DeepSeek was hiding a stash of illicit Nvidia H100 chips, banned under US export controls, and lying about it. Perhaps R1 was actually just a clever repackaging of American AI models that did not represent much in the way of real progress.

    Eventually, as more people dug into the details of DeepSeek-R1 (which, unlike most leading AI models, was released as open-source software, so that outsiders could examine it more closely), their skepticism gave way to worry.

    And late last week, when many Americans started using DeepSeek's models for themselves, and the DeepSeek mobile app hit the number one spot in Apple's App Store, that worry tipped over into full-blown panic.

    I am skeptical of the most dramatic takes I have seen in recent days, such as the claim, made by one Silicon Valley investor, that DeepSeek is an elaborate plot by the Chinese government to destroy the American tech industry. I also think it is plausible that the company's shoestring budget has been badly exaggerated, or that it built on advances made by American AI companies in ways it has not disclosed.

    But I do think DeepSeek's breakthrough was real. Based on conversations I have had with industry insiders, and a week's worth of experts poking around and testing the paper's findings, it appears to call into question several major assumptions that the American tech industry has been making.

    The first is the assumption that to build advanced AI models, you have to spend huge amounts of money on powerful chips and data centers.

    It is hard to overstate how foundational this dogma has become. Companies such as Microsoft, Meta and Google have already spent tens of billions of dollars building the infrastructure they thought was needed to build and run next-generation AI models. They plan to spend tens of billions more, or, in the case of OpenAI, as much as $500 billion through a joint venture with Oracle and SoftBank announced last week.

    DeepSeek appears to have spent a small fraction of that building R1. We do not know the exact cost, and there are plenty of caveats about the figures it has released so far. The cost is almost certainly higher than $5.5 million, the amount the company claims it spent training a previous model.

    But even if R1 cost 10 times more to train than DeepSeek claims, and even if you factor in other costs it may have excluded, such as engineering salaries or the costs of basic research, it would still be orders of magnitude less than what American AI companies are spending to develop their most capable models.

    The obvious conclusion to draw is not that American tech giants are wasting their money. It is still expensive to run powerful AI models once they have been trained, and there are reasons to think that spending hundreds of billions of dollars will still make sense for companies such as OpenAI and Google, which can afford to pay dearly to stay at the head of the pack.

    But DeepSeek's breakthrough on cost challenges the “bigger is better” narrative that has driven the AI arms race in recent years, by showing that relatively small models, if trained properly, can match or exceed the performance of much larger models.

    That, in turn, means that AI companies may be able to achieve very powerful capabilities with far less investment than previously thought. And it suggests that we may soon see a flood of investment into smaller AI start-ups, and much more competition for the giants of Silicon Valley. (Which, because of the enormous costs of training their models, have mostly been competing with each other until now.)

    There are other, more technical reasons that everyone in Silicon Valley is paying attention to DeepSeek. In the research paper, the company reveals some details about how R1 was actually built, including some advanced techniques in model distillation. (Basically, that means compressing large AI models into smaller ones, making them cheaper to run without losing much in the way of performance.)

    DeepSeek also included details suggesting that it was not as hard as previously thought to convert a “vanilla” AI language model into a more sophisticated reasoning model, by applying a technique known as reinforcement learning on top of it. (Do not worry if these terms go over your head. What matters is that methods for improving AI systems that were previously closely guarded by American tech companies are now out on the internet, free for anyone to take and replicate.)

    Even if the stock prices of American tech giants recover in the coming days, DeepSeek's success raises important questions about their long-term AI strategies. If a Chinese company is able to build cheap, open-source models that match the performance of expensive American models, why would anyone pay for ours? And if you are Meta, the only American tech giant that releases its models as free open-source software, what prevents DeepSeek or another start-up from simply taking your models, which you spent billions of dollars on, and distilling them into smaller, cheaper models that they can offer for less money?

    DeepSeek's breakthrough also undercuts some of the geopolitical assumptions that many American experts had been making about China's position in the AI race.

    First, it challenges the narrative that China is meaningfully behind the frontier when it comes to building powerful AI models. For years, many AI experts (and the policymakers who listen to them) assumed that the United States had a lead of at least several years, and that copying the advances of American tech companies was prohibitively hard for Chinese companies to do quickly.

    But DeepSeek's results show that China has advanced AI capabilities that can match or exceed models from OpenAI and other American AI companies, and that breakthroughs made by American companies are trivially easy for Chinese companies, or at least one Chinese company, to replicate in a matter of weeks.

    (The New York Times has sued OpenAI and its partner, Microsoft, accusing them of infringing the copyright of news content related to AI systems. OpenAI and Microsoft have denied those claims.)

    The results also raise questions about whether the steps the US government has taken to limit the spread of powerful AI systems to our adversaries, namely the export controls used to prevent powerful AI chips from falling into China's hands, are working as designed, or whether those regulations need to adapt to take into account new, more efficient ways of training models.

    And of course there are concerns about what it would mean for privacy and censorship if China took the lead in building powerful AI systems used by millions of Americans. Users of DeepSeek's models have noticed that they routinely refuse to respond to questions about topics that are sensitive in China, such as the Tiananmen Square massacre and Uyghur detention camps. If other developers build on top of DeepSeek's models, as is common with open-source software, those censorship measures could become embedded across the industry.

    Privacy experts have also raised concerns that data shared with DeepSeek models may be accessible to the Chinese government. If you were worried about TikTok being used as an instrument of surveillance and propaganda, the rise of DeepSeek should worry you, too.

    I am still not sure what the full impact of DeepSeek's breakthrough will be, or whether we will come to regard the release of R1 as a “Sputnik moment” for the AI industry, as some have claimed.

    But it seems wise to take seriously the possibility that we are now in a new era of AI brinkmanship: that the largest and richest American tech companies may no longer win by default, and that containing the spread of increasingly powerful AI systems may be harder than we thought.

    At the very least, DeepSeek has shown that the AI arms race is real, and that after several years of dizzying progress, there are still more surprises in store.