NVIDIA Isaac Enables Healthcare Robot Development with SO-ARM Starter Workflow
These articles are AI-generated summaries. Please check the original sources for full details.
Building Healthcare Robots with NVIDIA Isaac and the SO-ARM Starter Workflow
Summary
NVIDIA Isaac for Healthcare has released a new SO-ARM starter workflow to accelerate the development and deployment of autonomous medical robots. This workflow provides a comprehensive, end-to-end pipeline for collecting data, training policies, and deploying them on real hardware, significantly reducing the time required to bring these robots to the operating room. The workflow leverages a mixed training approach, combining simulation and real-world data to overcome the limitations of each.
Introduction
The development of medical robots has been hindered by the challenges of acquiring sufficient and diverse data. Simulation has emerged as a key solution, but translating simulated policies to real-world systems has been difficult. NVIDIA Isaac for Healthcare addresses this challenge with its developer framework, offering integrated pipelines for data collection, training, and evaluation across both simulation and hardware. The v0.4 release introduces the SO-ARM starter workflow, designed to lower the barrier to entry for MedTech developers.
The SO-ARM Starter Workflow
The SO-ARM starter workflow provides a complete end-to-end pipeline for autonomous surgical assistance, enabling developers to:
- Collect Data: Utilize the LeRobot system with the SO-ARM101 hardware to gather both real-world and synthetic data.
- Train Policies: Fine-tune the GR00T N1.5 model on the combined dataset, leveraging IsaacLab’s RL framework.
- Deploy to Hardware: Deploy the trained policies to physical hardware for real-time inference.
This workflow emphasizes a safe and repeatable environment for training and refining robotic skills before deployment in a clinical setting.
Technical Implementation
The workflow employs a three-stage pipeline:
- Data Collection: This involves mixed simulation and real-world teleoperation demonstrations using the SO101 and LeRobot systems. Approximately 70 simulation episodes with diverse scenarios and environmental variations are combined with 10-20 real-world episodes for authenticity.
- Model Training: The GR00T N1.5 model is fine-tuned on the combined dataset, utilizing dual-camera vision for enhanced perception.
- Policy Deployment: Trained policies are deployed for real-time inference on physical hardware using RTI DDS communication.
A significant aspect of this workflow is the heavy reliance on synthetic data, with over 93% of the data used for policy training generated in simulation.
Sim2Real Mixed Training Approach
The workflow combines simulation and real-world data to address the limitations of each approach. Pure simulation often fails to capture real-world complexities, while real-world training is expensive and limited. The mixed training approach aims to create policies that generalize effectively across both domains.
Hardware Requirements
To run the SO-ARM starter workflow, the following hardware is required:
- GPU: An Ampere or later architecture GPU with ≥30GB VRAM is required for GR00T N1.5 inference.
- SO-ARM101 Follower: A 6-DOF precision manipulator with dual-camera vision (wrist and room).
- SO-ARM101 Leader: A 6-DOF teleoperation interface for expert demonstration collection.
Notably, the workflow can be run on a single DGX Spark, enabling developers to perform all simulation, training, and deployment steps on one machine.
Data Collection Implementation
The following Python commands are used for data collection:
- Real-world Data:
python lerobot-record \ --robot.type=so101_follower \ --robot.port=<follower_port_id> \ --robot.cameras="{wrist: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, room: {type: opencv, index_or_path: 2, width: 640, height: 480, fps: 30}}" \ --robot.id=so101_follower_arm \ --teleop.type=so101_leader \ --teleop.port=<leader_port_id> \ --teleop.id=so101_leader_arm \ --dataset.repo_id=<user>/surgical_assistance/surgical_assistance \ --dataset.num_episodes=15 \ --dataset.single_task="Prepare and hand surgical instruments to surgeon" - Simulation Data:
# With keyboard teleoperation python -m simulation.environments.teleoperation_record \ --enable_cameras \ --record \ --dataset_path=/path/to/save/dataset.hdf5 \ --teleop_device=keyboard # With SO-ARM101 leader arm python -m simulation.environments.teleoperation_record \ --port=<your_leader_arm_port_id> \ --enable_cameras \ --record \ --dataset_path=/path/to/save/dataset.hdf5
Simulation Teleoperation Controls
The workflow provides keyboard-based teleoperation controls for users without physical SO-ARM101 hardware:
- Joint 1 (shoulder_pan): Q (+) / U (-)
- Joint 2 (shoulder_lift): W (+) / I (-)
- Joint 3 (elbow_flex): E (+) / O (-)
- Joint 4 (wrist_flex): A (+) / J (-)
- Joint 5 (wrist_roll): S (+) / K (-)
- Joint 6 (gripper): D (+) / L (-)
- R Key: Reset recording environment
- N Key: Mark episode as successful
Model Training Pipeline
The following Python commands are used to convert and combine datasets for training:
# Convert simulation data to LeRobot format
python -m training.hdf5_to_lerobot \
--repo_id=surgical_assistance_dataset \
--hdf5_path=/path/to/your/sim_dataset.hdf5 \
--task_description="Autonomous surgical instrument handling and preparation"
# Fine-tune GR00T N1.5 on mixed dataset
python -m training.gr00t_n1_5.train \
--dataset_path /path/to/your/surgical_assistance_dataset \
--output_dir /path/to/surgical_checkpoints \
--data_config so100_dualcam
End-to-End Sim-Collect-Train-Eval Pipelines
IsaacLab v0.3 supports a full pipeline of simulation-based development:
- Generate Synthetic Data: Teleoperate robots using keyboard or hardware controllers to capture multi-camera observations, robot states, and actions.
- Train and Evaluate Policies: Utilize IsaacLab’s RL framework for PPO training, leveraging parallel environments and built-in metrics.
- Convert Models to TensorRT: Optimize models for production deployment, supporting dynamic shapes and multi-camera inference.
Getting Started
To get started with the SO-ARM starter workflow:
- Clone the repository:
git clone https://github.com/isaac-for-healthcare/i4h-workflows - Choose a workflow: Start with the SO-ARM Starter Workflow or explore other workflows.
- Run the setup script:
tools/env_setup_so_arm_starter.sh
Resources
- GitHub Repository: https://huggingface.co/blog/lerobotxnvidia-healthcare
- Documentation: Setup and usage guides
- GR00T Models: Pre-trained foundation models
- Hardware Guides: SO-ARM101 setup instructions
- LeRobot Repository: End-to-end robotics learning
Continue reading
Next article
Hugging Face and VirusTotal Collaborate to Enhance AI Security
Related Content
NVIDIA DreamDojo: Scaling Robotics with 44k Hours of Human Video Data
NVIDIA releases DreamDojo, an open-source world model trained on 44,711 hours of human video, enabling real-time robot simulation at 10.81 FPS.
OpenMind OM1: Building an Open Source Operating System for Humanoid Robots
Jan Liphardt introduces OM1, an open-source robotic OS that leverages large language models for data fusion and utilizes $1,250 hardware components with 10,000-hour durability to enable human-centric robot interactions, shifting the focus from complex motor tasks like onion chopping to social engagement and spatial understanding.
LeRobot v0.4.0: Supercharging OSS Robot Learning with New Features and Integrations
LeRobot v0.4.0 introduces significant advancements in datasets, simulation environments, codebase flexibility, and hardware integration, empowering open-source robot learning.