From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Adaptive & Intelligent Robotics Lab, Imperial College London
Abstract
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention.
Key Results
URSA combines world models with quality-diversity algorithms to discover skills autonomously and directly in the real world, without requiring simulation. The extended video gives a comprehensive overview of our method and of all the key results shown below.
Unsupervised Behavior Discovery
With URSA, the robot discovers diverse locomotion skills over 5 hours of training directly in the real world (no simulation). The skills themselves are not explicitly supervised; learning is guided by a reward for forward velocity.
Few-Shot Adaptation with ITE
Using the Intelligent Trial and Error (ITE) algorithm, we demonstrate that the learned skill repertoire enables rapid adaptation to damage scenarios with no additional training.
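To make the adaptation loop concrete, here is a minimal sketch of an ITE-style search over a fixed repertoire. It is not the code used in our experiments. It assumes a tiny one-dimensional skill space, made-up performance numbers, an RBF kernel, and illustrative hyperparameters (`alpha`, `kappa`, `noise`, `length`). The idea is to treat the performance each skill had before damage as a prior and correct it with a Gaussian process fitted to a few trials on the damaged robot. The search stops once the best observed performance is within a factor `alpha` of the best predicted one.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two skill descriptors."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ite(skills, prior, evaluate, alpha=0.9, kappa=0.05, noise=1e-3, max_trials=10):
    """Pick skills by upper confidence bound until one is good enough."""
    mean = list(prior)            # predicted performance, starts at the prior
    std = [1.0] * len(skills)     # predictive uncertainty per skill
    tried, residuals = [], []
    best_idx, best_perf = None, -math.inf
    for _ in range(max_trials):
        # Acquisition: predicted performance plus an exploration bonus.
        idx = max(range(len(skills)), key=lambda i: mean[i] + kappa * std[i])
        perf = evaluate(idx)      # one trial on the (damaged) robot
        tried.append(idx)
        residuals.append(perf - prior[idx])
        if perf > best_perf:
            best_idx, best_perf = idx, perf
        # GP posterior on the residual between observation and prior.
        K = [[rbf(skills[a], skills[b]) + (noise if i == j else 0.0)
              for j, b in enumerate(tried)] for i, a in enumerate(tried)]
        weights = solve(K, residuals)
        for i, s in enumerate(skills):
            k = [rbf(s, skills[t]) for t in tried]
            mean[i] = prior[i] + sum(ki * wi for ki, wi in zip(k, weights))
            var = 1.0 - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
            std[i] = math.sqrt(max(var, 0.0))
        # Stop when the best trial is close to the best prediction.
        if best_perf >= alpha * max(mean):
            break
    return best_idx, best_perf, len(tried)

# Hypothetical repertoire: three skills with their pre-damage performance.
skills = [(0.0,), (0.5,), (1.0,)]
prior = [1.0, 0.8, 0.6]
damaged = [0.1, 0.2, 0.55]        # made-up performance after damage
best_idx, best_perf, trials = ite(skills, prior, lambda i: damaged[i])
print(best_idx, best_perf, trials)
```

Because the Gaussian process only models the gap between the prior and the observations, one disappointing trial also lowers the predictions for nearby skills. This lets the search move to a different region of the repertoire after only a few trials.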
Real-World Supervised Learning
URSA can also learn from hand-crafted skill definitions. In this setting, the robot learns to track diverse target velocities, both forward and angular, through on-robot training without simulation.
Velocity Control
We can then use the learned skill repertoire for controlling the robot's forward and angular velocities.
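One way to picture this kind of control is as a lookup from a velocity command to a skill. The sketch below is illustrative only: the repertoire layout, the skill names, and the `nearest_skill` helper are hypothetical and do not come from our implementation. It assumes each repertoire entry stores the (forward, angular) velocity the skill was measured to produce, and it returns the entry closest to the commanded velocity.

```python
import math

def nearest_skill(repertoire, target):
    """Return the entry whose measured velocity is closest to the target."""
    return min(repertoire, key=lambda entry: math.dist(entry["velocity"], target))

# Hypothetical repertoire: (forward m/s, angular rad/s) -> skill identifier.
repertoire = [
    {"velocity": (0.0, 0.0), "skill": "stand"},
    {"velocity": (0.5, 0.0), "skill": "walk_forward"},
    {"velocity": (0.5, 0.6), "skill": "turn_left"},
    {"velocity": (0.5, -0.6), "skill": "turn_right"},
]

command = (0.45, 0.5)  # e.g. a joystick command
chosen = nearest_skill(repertoire, command)
print(chosen["skill"])
```

With a skill-conditioned policy, the commanded velocity could instead be passed to the policy directly as the target skill; the lookup above is simply the easiest version to illustrate.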
Citation
@inproceedings{grillotti2025ursa,
  title={From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity},
  author={Grillotti, Luca and Coiffard, Lisa and Pang, Oscar and Faldor, Maxence and Cully, Antoine},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2025},
  address={Seoul, Korea}
}