From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity

Luca Grillotti · Lisa Coiffard · Oscar Pang · Maxence Faldor · Antoine Cully

Adaptive & Intelligent Robotics Lab, Imperial College London

CoRL 2025 · Seoul, South Korea

Abstract

Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention.

Key Results

URSA combines world models with quality-diversity algorithms to discover skills autonomously and directly in the real world, without requiring simulation. Watch the extended video for a comprehensive overview of our method and all the key results demonstrated below.

Unsupervised Behavior Discovery

With URSA, the robot discovers diverse locomotion skills without explicit supervision, guided by a reward for forward velocity. Training takes 5 hours and runs directly in the real world (no simulation).

Few-Shot Adaptation with ITE

Using the Intelligent Trial and Error (ITE) algorithm, we demonstrate how the learned skill repertoire enables rapid adaptation to damage scenarios with no additional training.
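To make the adaptation step more concrete, here is a minimal sketch of a generic Intelligent Trial and Error loop in the style of Cully et al. (2015). It is not the URSA implementation: the function names, the parameter values, and the squared-exponential kernel (the original ITE work uses a Matérn kernel) are assumptions chosen for brevity. The idea is to treat the performance predicted by the repertoire as a prior, try the most promising skill on the damaged robot, correct the predictions with a Gaussian process, and stop once an observed performance is close to the best predicted one.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between two sets of skill descriptors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def ite_adapt(descriptors, prior_perf, evaluate, alpha=0.9, kappa=0.05,
              noise=1e-3, max_trials=20):
    """Hypothetical ITE-style loop; returns (best skill index, trials used).

    descriptors: (N, D) array of skill descriptors in the repertoire.
    prior_perf:  (N,) performance of each skill before damage (the prior).
    evaluate:    callable(index) -> performance measured on the damaged robot.
    """
    tried, observed = [], []
    mean = prior_perf.copy()
    var = np.ones_like(prior_perf)
    for _ in range(max_trials):
        # Pick the skill with the highest upper confidence bound.
        idx = int(np.argmax(mean + kappa * np.sqrt(var)))
        tried.append(idx)
        observed.append(evaluate(idx))

        # Gaussian process regression on the gap between observation and prior.
        X = descriptors[tried]
        K = sq_exp_kernel(X, X) + noise * np.eye(len(tried))
        k_star = sq_exp_kernel(descriptors, X)
        residual = np.array(observed) - prior_perf[tried]
        mean = prior_perf + k_star @ np.linalg.solve(K, residual)
        var = 1.0 - np.einsum("ij,ji->i", k_star, np.linalg.solve(K, k_star.T))
        var = np.clip(var, 0.0, None)

        # Stop once a tried skill is within a factor alpha of the best prediction.
        if max(observed) >= alpha * mean.max():
            break
    best = tried[int(np.argmax(observed))]
    return best, len(tried)
```

In this sketch, `evaluate` stands in for one short trial on the damaged robot, so the number of loop iterations is the number of real-world trials spent on adaptation.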

Real-World Supervised Learning

URSA can also learn from hand-crafted skill definitions, enabling the robot to track diverse target velocities (both forward and angular) through on-robot training without simulation.

Velocity Control

The learned skill repertoire can then be used to control the robot's forward and angular velocities.

Citation

@inproceedings{grillotti2025ursa,
  title={From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity},
  author={Grillotti, Luca and Coiffard, Lisa and Pang, Oscar and Faldor, Maxence and Cully, Antoine},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2025},
  address={Seoul, South Korea}
}