Meta's next Llama AI models train on a GPU cluster 'bigger than anything'

    Managing such a massive array of chips to develop Llama 4 will likely pose unique technical challenges and require enormous amounts of energy. Meta executives on Wednesday sidestepped an analyst question about energy access restrictions in parts of the U.S. that have hampered companies' efforts to develop more powerful AI.

    By one estimate, a cluster of 100,000 H100 chips would require 150 megawatts of power. In contrast, the largest national laboratory supercomputer in the United States, El Capitan, requires 30 megawatts of power. Meta expects to spend as much as $40 billion in capital on setting up data centers and other infrastructure this year, an increase of more than 42 percent compared to 2023. The company expects even faster growth in that expenditure next year.
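    For context, here is a rough back-of-envelope sketch (not from the article) of how an estimate in that range can be reached. It assumes roughly 700 W per H100 and a roughly 2x multiplier for host servers, networking and cooling; both figures are assumptions for illustration only.

        # Back-of-envelope power estimate for a 100,000-GPU H100 cluster.
        # Assumptions (not from the article): ~700 W per H100 SXM GPU and a
        # ~2x overhead factor for host servers, networking and cooling.
        num_gpus = 100_000
        gpu_power_kw = 0.7        # ~700 W per GPU (assumed)
        overhead_factor = 2.0     # servers + networking + cooling (assumed)

        gpus_only_mw = num_gpus * gpu_power_kw / 1000    # ~70 MW for the GPUs alone
        total_mw = gpus_only_mw * overhead_factor        # ~140 MW, near the cited 150 MW
        print(f"GPUs alone: {gpus_only_mw:.0f} MW; with overhead: {total_mw:.0f} MW")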

    Meta's total operating costs have increased by about 9 percent this year. But overall revenue — much of it from advertising — is up more than 22 percent, giving the company bigger margins and bigger profits even as it pours billions of dollars into the Llama effort.

    Meanwhile, OpenAI, considered the current leader in advanced AI development, is burning money despite charging developers for access to its models. The company, still a nonprofit for now, has said it is training GPT-5, a successor to the model that currently powers ChatGPT. OpenAI has said that GPT-5 will be larger than its predecessor, but it hasn't said anything about the compute cluster it is using for training. OpenAI has also said that GPT-5 will feature other innovations beyond scale, including a recently developed approach to reasoning.

    CEO Sam Altman has said that GPT-5 will be “a significant leap forward” compared to its predecessor. Last week, Altman responded to a news report stating that OpenAI's next frontier model would be released in December, writing on X: “fake news out of control.”

    On Tuesday, Google CEO Sundar Pichai said the latest version of the company's Gemini family of generative AI models is in development.

    Meta's open approach to AI has sometimes proven controversial. Some AI experts worry that making significantly more powerful AI models freely available could be dangerous, because they could help criminals launch cyberattacks or automate the design of chemical or biological weapons. Although Llama is fine-tuned before release to limit misbehavior, those restrictions are relatively trivial to remove.

    Zuckerberg remains optimistic about the open source strategy, even as Google and OpenAI push proprietary systems. “It seems pretty clear to me that open source will be the most cost-effective, adaptable, reliable, performant and easiest-to-use option available to developers,” he said on Wednesday. “And I am proud that Llama is taking the lead in this.”

    Zuckerberg added that Llama 4's new capabilities should be able to power a broader range of features across Meta's services. Today, the signature offering built on Llama models is the ChatGPT-like chatbot known as Meta AI, which is available in Facebook, Instagram, WhatsApp and other apps.

    More than 500 million people use Meta AI every month, Zuckerberg said. Meta expects to generate revenue from ads within the feature over time. “There will be a broader and broader set of questions that people use it for, and the monetization opportunities will exist over time,” Meta CFO Susan Li said on the call Wednesday. With that potential ad revenue, Meta may be able to subsidize the cost of Llama for everyone else.