NVIDIA’s GROOT N1 lets robots see, understand, and move within a single model — a step toward truly general-purpose AI for automating human tasks across many fields.

GROOT N1 and its improved successor, GROOT N1.5, are NVIDIA’s big step toward open, general-purpose brains for humanoid robots. These models aim to give robots the ability to perceive, reason, and control their movements the way humans do, all from a single AI design.

But it’s not just how capable these models are that makes them new; it’s how open and flexible they are, and how likely they are to change physical AI across industries.

What Is GROOT N1?

GROOT N1 is NVIDIA’s open-source Vision-Language-Action (VLA) foundation model, built for robots that work in the real world. Where other AI models are trained only for specific robot arms or assembly jobs, GROOT N1 is designed to handle a wide range of humanoid scenarios, from kitchen chores to factory work.

It works like this:

  • It sees the world through robot cameras.
  • It understands language commands like “put the bottle on the shelf.”
  • And it acts using real-time motion control through a robot’s joints and limbs.
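
To make that see–understand–act loop concrete, here is a minimal sketch in Python. Everything in it is hypothetical — `ToyVLAPolicy`, its `act` method, and the stubbed camera frame are stand-ins for illustration, not the real GROOT N1 API.

```python
# Illustrative only: a toy vision-language-action loop with stubbed
# components. Class and method names are hypothetical, not GROOT N1's API.
import numpy as np

class ToyVLAPolicy:
    """Stand-in for a VLA model: image + instruction -> joint commands."""
    def __init__(self, num_joints: int = 7):
        self.num_joints = num_joints

    def act(self, image: np.ndarray, instruction: str) -> np.ndarray:
        # A real model would fuse vision and language here; the stub
        # just returns a zero action of the right shape.
        return np.zeros(self.num_joints)

policy = ToyVLAPolicy()
instruction = "put the bottle on the shelf"

for step in range(100):                        # control loop
    frame = np.zeros((224, 224, 3), np.uint8)  # stub camera frame
    action = policy.act(frame, instruction)    # one action per tick
    # robot.apply(action)  # hypothetical: send commands to the joints
```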

The next wave of humanoid robots will need to handle many different tasks, and with this design a single AI brain can cover all three stages.

Dual Intelligence System Inspired by Human Thinking

NVIDIA designed GROOT N1 to think and act more like a person by combining two specialized modules:

  • System 2: The Vision-Language Module (VLM) handles reasoning and context. It interprets what the robot sees and hears and generates high-level task tokens.
  • System 1: The Diffusion Transformer (DiT) turns those tokens into fast, fluid motor actions. Think of it as the robot’s reflex system, controlling movements with real-time precision.

With this setup, a robot can not only figure out what it needs to do, but also carry it out with smooth, adaptable motion.
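
A toy two-rate loop makes the division of labor clearer: a slow module re-plans occasionally while a fast module emits an action every tick. The functions, latent size, and rates below are assumptions for the sketch, not GROOT N1’s actual code or published frequencies.

```python
# Illustrative only: System 2 (VLM) updates a task latent slowly;
# System 1 (DiT) emits motor actions on every control tick.
import numpy as np

SLOW_EVERY = 12   # assumed ratio of control ticks per reasoning update

def system2_reason(frame: np.ndarray, instruction: str) -> np.ndarray:
    """Stub VLM: would encode scene + command into task tokens."""
    return np.random.randn(64)          # fake task latent

def system1_act(latent: np.ndarray, joint_state: np.ndarray) -> np.ndarray:
    """Stub DiT: would denoise a short chunk of motor actions."""
    return 0.01 * latent[: joint_state.shape[0]]

joint_state = np.zeros(7)
latent = None
for tick in range(120):
    if tick % SLOW_EVERY == 0:          # slow path: re-plan occasionally
        frame = np.zeros((224, 224, 3), np.uint8)
        latent = system2_reason(frame, "put the bottle on the shelf")
    action = system1_act(latent, joint_state)  # fast path: every tick
    joint_state = joint_state + action         # pretend the robot moved
```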

How It Learns – The Data Pyramid

GROOT N1’s capability comes from learning across three levels of training data, which NVIDIA calls the “data pyramid”:

  1. At the base, it uses publicly available video data like first-person human recordings to learn general behaviors.
  2. In the middle, it uses AI-generated synthetic simulations to connect high-level understanding with robotic control.
  3. At the top, it learns from real-world robot demonstrations — including precise actions captured from advanced humanoid robots.

GROOT N1 can generalize better and do new tasks with little data thanks to this multilayered learning approach.
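
As a rough illustration of what a tiered data mixture looks like in practice, here is a toy sampler that draws each training example from one of the three tiers. The tier names and weights are assumptions made up for this sketch, not NVIDIA’s published training mixture.

```python
# Illustrative only: weighted sampling across the three "data pyramid"
# tiers. Names and weights are assumptions, not NVIDIA's actual mixture.
import random

DATA_PYRAMID = {
    "web_and_human_video": 0.6,   # base: broad and cheap, no robot actions
    "synthetic_simulation": 0.3,  # middle: generated robot trajectories
    "real_robot_demos": 0.1,      # top: scarce but highest quality
}

def sample_batch_sources(batch_size: int) -> list[str]:
    """Pick a data tier for each example in a training batch."""
    tiers = list(DATA_PYRAMID)
    weights = [DATA_PYRAMID[t] for t in tiers]
    return random.choices(tiers, weights=weights, k=batch_size)

print(sample_batch_sources(8))
```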

What Can GROOT N1 Actually Do?

In real-world tests, GROOT N1 has delivered impressive results:

  • Achieved over 80% success in pick-and-place and bimanual manipulation on the Fourier GR-1 humanoid robot.
  • Adapted to different robot bodies using a single model.
  • Accepted natural language instructions and translated them into physical motion.
  • Outperformed imitation learning benchmarks in both simulation and real-world robotics tests.

Some of the biggest names in robotics already use it. Boston Dynamics, Agility Robotics, and Mentee Robotics are all adding GROOT N1 to their own systems.

What’s New in GROOT N1.5

GROOT N1.5 builds on N1 and adds several upgrades that substantially improve performance, generalization, and language following.

  • Frozen VLM (Eagle 2.5): Keeps language understanding stable during training and fine-tuning.
  • FLARE Objective: A new learning strategy that helps the model learn from videos shot from human perspectives.
  • DreamGen Synthetic Data: Generates simulated robot training data at scale to improve how well the model adapts to new objects and tasks.
  • Streamlined adapter design: Helps the vision and action modules work more efficiently together.
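
The “frozen VLM” idea in the first bullet follows a generic pattern in deep learning: keep the language backbone’s weights fixed while fine-tuning the action components. Here is a minimal PyTorch sketch of that pattern; the module names are hypothetical stand-ins, not GROOT N1.5’s actual code.

```python
# Illustrative only: the generic PyTorch pattern for freezing a backbone
# while fine-tuning a head. Module names are hypothetical.
import torch
import torch.nn as nn

class ToyVLAModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.vlm = nn.Linear(512, 512)        # stand-in for the VLM
        self.action_head = nn.Linear(512, 7)  # stand-in for the action module

model = ToyVLAModel()

# Freeze the VLM so language understanding stays stable during tuning.
for param in model.vlm.parameters():
    param.requires_grad_(False)

# Only the still-trainable parameters go to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```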

With these improvements, GROOT N1.5 is more than twice as good as N1 at following language instructions: in real-world tests on the GR-1, its success rate went from 46.6% to 93.3%.

GROOT N1 vs GROOT N1.5 (Quick Comparison)

Feature                    | GROOT N1         | GROOT N1.5
---------------------------|------------------|-----------------------------------------
Language Model             | Eagle 2          | Eagle 2.5 (Frozen)
Success on GR-1 Tasks      | 46.6%            | 93.3%
Generalization             | Moderate         | High (better with new objects)
Post-training Optimization | Standard         | FLARE Objective for human-like learning
Training Data              | Real + Synthetic | Adds DreamGen synthetic environments


Final Take

GROOT N1 and N1.5 are two of the most exciting robot models available right now because they combine deep learning, real-time motor control, and open-source access. If you work in robotics or automation, this is the AI brain you’ll want to study, adapt, and deploy in the real world.

The robots of the future won’t just be smart; they’ll also be open, flexible, and aware of the people around them. And there’s a good chance they’ll be running on GROOT.