I lead the robot simulation team at X (formerly Google [x]). My research at X focuses on physics-based simulation and deep learning for robotics. In particular, I am interested in learning robot manipulation skills through deep reinforcement learning and learning from demonstration, leveraging simulation and transferring controllers learned in simulation to real robots.


I received my Ph.D. degree in Computer Science from the Georgia Institute of Technology in 2015, advised by Dr. C. Karen Liu. My thesis focused on designing algorithms for synthesizing human motion for object manipulation. I was a member of the Computer Graphics Lab at Georgia Tech.


I received my B.E. degree from Tsinghua University, China in 2010.


We introduce Meta Strategy Optimization, a meta-learning algorithm for training policies with latent variable inputs that can quickly adapt to new scenarios with a handful of trials in the target environment. We evaluate our method on a real quadruped robot and demonstrate successful adaptation in various scenarios, including sim-to-real transfer.

Learning a complex vision-based task requires an impractical number of demonstrations. We propose a method that can learn to learn from both demonstrations and trial-and-error experience with sparse reward feedback.

We propose a self-supervised approach for learning representations of objects from monocular videos and demonstrate it is particularly useful in situated settings such as robotics. 

Training a deep network policy for robot manipulation is notoriously costly and time-consuming, as it depends on collecting a significant amount of real-world data. We propose a method that learns to perform table-top instance grasping of a wide variety of objects without any real-world grasping data, using a learned 3D point cloud of the object as input.

Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. The control policies are learned in a physics simulator and then deployed on real robots.

Learning-based approaches to robotic manipulation are limited by the scalability of data collection and accessibility of labels. In this paper, we present a multi-task domain adaptation framework for instance grasping in cluttered scenes by utilizing simulated robot experiments.

Learning to interact with objects in the environment is a fundamental AI problem involving perception, motion planning, and control. We propose a geometry-aware learning agent, which predicts 3D geometry and uses it for grasping interactions.

© 2017-2020 by Yunfei Bai
