NVIDIA released Nemotron 3 Ultra, an open model built for long-running agents with contributions from the Nemotron Coalition. Models powering long-running agents do more than generate text. They interpret information, plan next steps, call tools, evaluate results and iterate across turns to complete complex coding, research and enterprise tasks. This requires efficient models that can explore more of the search space in less time to deliver higher-accuracy results faster.
Nemotron 3 Ultra is built for that new workload. It’s a frontier-class model that delivers up to 5x faster inference and lowers the cost of complex agentic tasks by up to 30%. This lets agents finish the same job in less time or complete more jobs in the same time.
Nemotron 3 Ultra, a 550-billion-parameter mixture-of-experts model, handles the orchestration and hardest reasoning calls in an autonomous workflow: architectural decisions in long-running coding sessions, synthesis across hundreds of research sources and verification across thousands of interdependent constraints.
Enterprise software leaders are building agents with the new model for workflows spanning software development, deep research, customer service and enterprise automation.
- Aible is integrating Nemotron 3 Ultra into the AIbleClaw platform, allowing its customers to build secure long-running agents at scale for various domains.
- Glean is making Nemotron 3 Ultra available in its model-agnostic agent harness, alongside an agentic search model fine-tuned with Nemotron 3 Nano, expanding enterprises’ access to cost-effective agentic AI.
- Greptile is integrating Nemotron 3 Ultra into its code review platform for codebase indexing, enabling code reviews with leading accuracy at lower cost.
- Harvey is enabling support for Nemotron 3 Ultra and post-trained versions of the model through its platform, helping customers build and deploy AI-powered legal workflows with greater control over their data.
- Perplexity is using Nemotron 3 Ultra for search and Perplexity Computer, and using its agent router to direct workloads to fine-tuned open models or proprietary models based on the task, helping its AI assistants operate with speed, efficiency and scale.
As announced earlier this week, CrowdStrike and Palantir are adopting Nemotron 3 Ultra to enable a new class of long-running AI agents that help teams analyze complex data, coordinate tasks and streamline operations across cybersecurity and enterprise environments.
Additional companies adopting the model include Applied Compute, CodeRabbit, Dataiku and ServiceNow.
The model is trained on agent traces and optimized for agent harnesses, enabling developers to choose their preferred frameworks while maintaining accuracy.
Agent platforms and harnesses including BlackBox AI, Cline, Factory AI, Hermes Agent, Kilo Code, LangChain Deep Agents, OpenClaw, OpenCode, OpenHands and Pi support the new Nemotron models.
Nemotron 3 Ultra works with the NVIDIA NemoClaw blueprint, which provides enterprises with a secure runtime, open models and domain-specific skills to put autonomous agents to work at scale.
H Company, Naver, Nous and Prime Intellect Join Nemotron Coalition
H Company, NAVER Cloud, Nous Research and Prime Intellect are joining the Nemotron Coalition. These members will contribute unique strengths spanning data, training environments, evaluation frameworks and domain expertise to support the collaborative development of an open frontier model trained on NVIDIA DGX Cloud, which will serve as the foundation for the upcoming Nemotron 4 family.
By combining forces, the coalition is bringing together leading global AI labs and infrastructure providers to accelerate the development of open frontier models. This collaborative approach aims to broaden access to cutting-edge AI innovation while enabling developers and enterprises worldwide to build and customize models for their industries, regions and use cases.
New Nemotron Speech and Safety Models
Also live today, a new Nemotron speech recognition model brings real-time streaming ASR to 40 language locales for voice agent workflows across global enterprise deployments. The Nemotron 3.5 Content Safety model — a 4-billion-parameter open multimodal model — classifies content across 23 safety categories and a dozen languages, with support for custom enterprise policies.
Open and Customizable, Deployable Anywhere
Nemotron models are released with open weights, datasets and recipes, giving organizations transparency and control to customize models for domain-specific workflows and deploy them where their applications and data reside.
Developers can use tools like NVIDIA NeMo for customization, evaluation and optimization for their use cases. Because the Nemotron family of models is open, organizations can deploy them in environments that meet regulatory, sovereignty or data localization requirements.
The models are available on Hugging Face, ModelScope, OpenRouter and build.nvidia.com as NVIDIA NIM microservices, as well as through a broad ecosystem of NVIDIA Cloud Partners, inference platforms and cloud service providers.