
Project GR00T: NVIDIA to boost robot dexterity, mobility with new AI, simulation tools

V. Rodriguez, 55 min ago

NVIDIA has unveiled new AI and simulation tools and workflows to help robotics developers greatly accelerate their work on AI-enabled robots.

The lineup, unveiled this week at the Conference on Robot Learning (CoRL) in Munich, Germany, includes the general availability of the NVIDIA Isaac Lab robot learning framework and six new humanoid robot learning workflows for Project GR00T, an initiative to accelerate humanoid robot development. The company also unveiled new world-model development tools for video data curation and processing, including the NVIDIA Cosmos tokenizer and NVIDIA NeMo Curator for video processing.

Project GR00T to advance robot development

"Humanoid robots are the next wave of embodied AI," said Jim Fan, senior research manager of embodied AI at NVIDIA. "NVIDIA research and engineering teams are collaborating across the company and our developer ecosystem to build Project GR00T to help advance the progress and development of global humanoid robot developers."

The company claimed that the six new Project GR00T workflows give humanoid developers blueprints for realizing the most challenging humanoid robot capabilities. The workflows are GR00T-Gen, GR00T-Mimic, GR00T-Dexterity, GR00T-Control, GR00T-Mobility and GR00T-Perception.

NVIDIA Cosmos tokenizer

Eric Jang, vice president of AI at 1X Technologies, stated that the NVIDIA Cosmos tokenizer achieves really high temporal and spatial compression of "our data while still retaining visual fidelity." "This allows us to train world models with long horizon video generation in an even more compute-efficient manner," added Jang.

According to NVIDIA, the Cosmos tokenizer provides high-quality compression and up to 12x faster visual reconstruction, paving the way for scalable, robust and efficient development of generative applications across a broad spectrum of visual domains.

Superior visual tokenization

The company claimed that the open-source Cosmos tokenizer gives robotics developers superior visual tokenization by breaking down images and videos into high-quality tokens with exceptionally high compression rates. It runs up to 12x faster than current tokenizers, while NeMo Curator performs video curation up to 7x faster than unoptimized pipelines. Other humanoid and general-purpose robot developers, including XPENG Robotics and Hillbot, are developing with the NVIDIA Cosmos tokenizer to manage high-resolution images and videos.

NeMo Curator

NeMo Curator now includes a video processing pipeline. This enables robot developers to improve their world-model accuracy by processing large-scale text, image, and video data. Curating video data is challenging because of its massive size, which requires scalable pipelines and efficient orchestration for load balancing across GPUs. Additionally, the models used for filtering, captioning and embedding need optimization to maximize throughput.

The company claimed that NeMo Curator overcomes these challenges by streamlining data curation with automatic pipeline orchestration, reducing processing time significantly. It supports linear scaling across multi-node, multi-GPU systems and can efficiently handle over 100 petabytes of data. According to the company, this simplifies AI development, reduces costs and accelerates time to market.
