
Intern Robotics

Building inclusive infrastructure for Embodied AI, from Shanghai AI Lab.

Toolchain - Training, Inference and Evaluation

  • InternUtopia: A simulation platform for versatile Embodied AI research and development.
  • InternManip: An all-in-one robot manipulation learning suite (5 pretrained models, 3 benchmarks, and more coming soon).
  • InternNav: An open platform for building generalized navigation foundation models (with 6 mainstream benchmarks and 10+ baselines).
  • InternHumanoid: A versatile, all-in-one toolbox for whole-body humanoid robot control.
  • InternSR: An open-source toolbox for vision-based embodied spatial intelligence.

Models, Datasets and Research

  • Humanoids/Legged Robots

    • Datasets:
      • InternData-H1: The largest open-source 3D human motion dataset with text annotations, comprising 2.5k hours and 1.9M episodes.
    • Models and Research:
      • UniHSI: Unified Human-Scene Interaction via Prompted Chain-of-Contacts
      • HIMLoco: Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response
      • 🏆 HoST [Best Systems Paper Finalist at RSS 2025]: Learning Humanoid Standing-up Control across Diverse Postures
      • HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit
  • Manipulation

    • Datasets:
      • InternData-A1: A hybrid synthetic-real manipulation dataset integrating 5 heterogeneous robots, 15 skills, and 200+ scenes, emphasizing multi-robot collaboration under dynamic scenarios.
      • InternData-M1: A large-scale synthetic dataset for generalizable pick-and-place over 80K objects, with open-ended instructions covering object recognition, spatial and commonsense reasoning, and long-horizon tasks.
    • Models and Research:
      • InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
      • InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
      • F1-VLA: Visual foresight generation for planning-based control
      • VLAC: A generalist vision-language-action-critic model for robotic real-world reinforcement learning
      • Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
      • RoboSplat: Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation
      • GenManip: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
  • Navigation

    • Datasets:
      • InternData-N1: A high-quality navigation dataset featuring the most diverse scenes and extensive randomization across embodiments and viewpoints, including 3k+ scenes and 830k VLN samples.
    • Models and Research:
      • InternVLA-N1: An Open Dual-System Vision-Language Navigation Foundation Model with Learned Latent Plans
      • NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance
      • StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling
      • VLN-PE: A Holistic Study of Physical and Visual Disparities in Vision-and-Language Navigation
  • AIGC for Embodied AI

    • Datasets:
      • OmniWorld: A large-scale, multi-domain, multi-modal dataset that enables significant performance improvements in 4D reconstruction and video generation.
    • Models and Research:
      • MeshCoder: Generate Structured 3D Object Blender Code from Point Clouds
      • Infinite-Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation
      • Aether: Geometric-Aware Unified World Modeling
  • 3D Vision and Embodied Perception

    • EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
      • 🏆 PointLLM [Best Paper Candidate at ECCV 2024]: Empowering Large Language Models to Understand Point Clouds
    • MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
    • OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
  • 3D Assets for Embodied AI

    • InternScenes: A large-scale interactive indoor scene dataset with realistic layouts, comprising 40,000 diverse scenes and 1.96M 3D objects.

Pinned Repositories

  1. InternUtopia (Public)

    A simulation platform for versatile Embodied AI research and development.

    Python · 1.2k stars · 69 forks

  2. InternNav (Public)

    InternRobotics' open platform for building generalized navigation foundation models.

    Jupyter Notebook · 645 stars · 74 forks

  3. EmbodiedScan (Public)

    [CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

    Python · 649 stars · 50 forks

  4. PointLLM (Public)

    [ECCV 2024 Best Paper Candidate & TPAMI 2025] PointLLM: Empowering Large Language Models to Understand Point Clouds

    Python · 963 stars · 51 forks

  5. HIMLoco (Public)

    Learning-based locomotion control from OpenRobotLab, including Hybrid Internal Model & H-Infinity Locomotion Control

    Python · 783 stars · 79 forks

  6. OpenHomie (Public)

    Open-sourced code for "HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit".

    C++ · 504 stars · 46 forks
