In 2025, entrepreneurs will launch a flood of AI-powered apps. Finally, generative AI will live up to the hype with a new crop of affordable consumer and business apps. This is not the consensus view today. OpenAI, Google, and xAI are engaged in an arms race to train the most powerful large language model (LLM) in pursuit of artificial general intelligence, known as AGI, and their gladiatorial battle dominates the mindshare and revenue share of the young generative AI ecosystem.
For example, Elon Musk raised $6 billion to launch the newcomer xAI and bought 100,000 Nvidia H100 GPUs, the expensive chips used to process AI, at a cost of more than $3 billion, to train his model, Grok. At those prices, only techno tycoons can afford to build these massive LLMs.
The incredible spending by companies like OpenAI, Google, and xAI has created a lopsided, top-heavy ecosystem. The LLMs trained on these massive GPU farms also tend to be very expensive for inference, the process of entering a prompt and generating an answer from the large language model embedded in every app that uses AI. It's as if everyone had a 5G smartphone, but data were too expensive for anyone to watch a TikTok video or browse social media. As a result, excellent LLMs with high inference costs have made it prohibitively expensive to distribute great apps.
This lopsided ecosystem of ultra-wealthy tech moguls competing with each other has enriched Nvidia while forcing application developers into a catch-22: either use a cheap and low-performance model that will disappoint users, or pay exorbitant inference costs and risk going bankrupt.
In 2025, a new approach will emerge that could change all that. It will return to what we learned from previous technological revolutions, such as the PC era of Intel and Windows or the mobile era of Qualcomm and Android, where Moore's Law improved PCs and apps, and lower bandwidth costs improved mobile phones and apps, year after year.
But what about the high inference costs? A new law for AI inference is imminent. The cost of inference has fallen by a factor of 10 per year, driven by new AI algorithms, inference technologies and better chips at lower prices.
As a point of reference, if a third-party developer had used OpenAI's top models to build AI search, the cost would have been about $10 per search in May 2023, while Google's non-generative-AI search cost $0.01, a 1,000x difference. But by May 2024, the price of OpenAI's top model had dropped to around $1 per query. With this unprecedented tenfold annual price drop, application developers will be able to use increasingly higher-quality and cheaper models, leading to a proliferation of AI apps over the next two years.
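The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration: the $10 and $0.01 figures and the tenfold annual drop come from the paragraph above, while the function names and the assumption that the same rate holds beyond May 2024 are hypothetical.

```python
import math

# Figures quoted in the text above.
GEN_AI_COST_MAY_2023 = 10.00   # dollars per AI search, May 2023
CLASSIC_SEARCH_COST = 0.01     # dollars per conventional search
ANNUAL_DROP = 10               # cost falls by this factor each year


def projected_cost(start_cost, annual_factor, years):
    """Cost per query after `years`, assuming a constant annual drop."""
    return start_cost / (annual_factor ** years)


def years_to_parity(start_cost, target_cost, annual_factor):
    """Years until start_cost falls to target_cost at a constant rate."""
    return math.log10(start_cost / target_cost) / math.log10(annual_factor)


# Assumption: the tenfold annual drop continues unchanged after 2024.
for years in range(4):
    cost = projected_cost(GEN_AI_COST_MAY_2023, ANNUAL_DROP, years)
    ratio = cost / CLASSIC_SEARCH_COST
    print(f"May {2023 + years}: ${cost:.2f} per query, "
          f"about {ratio:,.0f}x conventional search")

gap_years = years_to_parity(GEN_AI_COST_MAY_2023, CLASSIC_SEARCH_COST, ANNUAL_DROP)
print(f"Years to close the gap at this rate: {gap_years:.1f}")
```

Under that assumption, a 1,000x gap is three tenfold steps, so the per-query cost would reach the conventional-search level about three years after May 2023. If the rate of decline slows, the timeline stretches accordingly.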