Nvidia has made a fortune supplying chips to companies working on artificial intelligence, but today the chipmaker took a step toward becoming a more serious model maker itself, releasing a range of advanced open models, along with data and tools to help engineers use them.
The move, which comes at a time when AI companies like OpenAI, Google and Anthropic are developing their own increasingly capable chips, could serve as a hedge against those companies moving away from Nvidia's technology over time.
Open models are already a crucial part of the AI ecosystem, and many researchers and startups use them to experiment, prototype and build. Although OpenAI and Google offer small open models, they don't update them as often as their rivals in China do. For this and other reasons, open models from Chinese companies are currently much more popular, according to data from Hugging Face, a hosting platform for open-source projects.
Nvidia's new Nemotron 3 models are among the best that can be downloaded, customized, and run on one's own hardware, according to benchmark scores the company shared ahead of release.
“Open innovation is the foundation of progress in AI,” CEO Jensen Huang said in a statement ahead of the news. “With Nemotron, we are transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”
Nvidia is taking a far more transparent approach than many of its US rivals by releasing the data used to train Nemotron, which should help engineers modify the models more easily. The company is also releasing tools to help with customization and fine-tuning. The release includes a new hybrid latent mixture-of-experts model architecture, which Nvidia says is especially good for building AI agents that can take actions on computers or on the web. The company is also launching libraries that let users train agents with reinforcement learning, which involves giving models simulated rewards and punishments.
Nemotron 3 models come in three sizes: Nano, with 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model's parameter count roughly corresponds to how capable it is and how unwieldy it is to run. The largest models are so cumbersome that they have to run on racks of expensive hardware.
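To give a rough sense of why parameter count translates into hardware, here is a back-of-envelope sketch in Python. It assumes each parameter is stored in 2 bytes (16-bit precision) and counts only the weights, ignoring the extra memory a model needs while running. The `weight_memory_gb` helper and the precision choices are illustrative assumptions, not figures from Nvidia; only the three parameter counts come from the announcement.

```python
# Back-of-envelope estimate of the memory needed just to hold a model's weights.
# Assumptions (illustrative, not from Nvidia): every parameter is stored at a
# fixed number of bytes, and runtime overhead (activations, caches) is ignored.

# Parameter counts as stated for the three Nemotron 3 sizes.
NEMOTRON_3_PARAMS = {
    "Nano": 30_000_000_000,
    "Super": 100_000_000_000,
    "Ultra": 500_000_000_000,
}


def weight_memory_gb(num_params: int, bytes_per_param: float = 2.0) -> float:
    """Return the weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, params in NEMOTRON_3_PARAMS.items():
    at_16_bit = weight_memory_gb(params)       # 2 bytes per parameter
    at_8_bit = weight_memory_gb(params, 1.0)   # 1 byte per parameter
    print(f"{name}: ~{at_16_bit:.0f} GB at 16-bit, ~{at_8_bit:.0f} GB at 8-bit")
```

Under these assumptions, the 500-billion-parameter model would need on the order of a terabyte for its weights alone at 16-bit precision, far more than a single accelerator typically holds, which is consistent with the largest models needing racks of hardware.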
Model foundations
Kari Ann Briski, vice president of generative AI software for enterprise at Nvidia, said open models are important for AI builders for three reasons: builders increasingly need to customize models for particular tasks; it often helps to route queries to different models; and it is easier to get more intelligent responses out of these models after training by having them perform a kind of simulated reasoning. “We believe open source is the foundation for AI innovation that will continue to accelerate the global economy,” said Briski.
Social media giant Meta released the first advanced open models under the name Llama in February 2023. However, as competition has increased, Meta has indicated that its future releases may not be open source.
Meta's shift is part of a larger trend in the AI industry. Over the past year, American companies have moved away from openness, becoming more secretive about their research and more reluctant to let their rivals learn about their latest engineering tricks.