- NVIDIA is open-sourcing massive datasets to accelerate AI development, including 10 trillion language tokens and specialized data for robotics and autonomous vehicles.
- The “Open Data for AI” initiative gives developers high-quality, diverse data for training foundation models and physical AI models.
- Key releases include more than 1,700 hours of driving data and 500,000 robotics trajectories, intended to address the “data wall” in physical AI.
- These resources are hosted on Hugging Face and GitHub, supporting the creation of sovereign AI systems tailored to specific industry needs.
Entities: NVIDIA, Hugging Face