The power of continuous learning

During my first 2.5 years at OpenAI, I worked on the robotics team with a moonshot idea: We wanted to teach a single human-like robot hand to solve the Rubik’s Cube. It was a tremendously exciting, challenging and emotional experience. We solved the challenge with deep reinforcement learning (RL), insane amounts of domain randomization, and no real-world training data. Most importantly, we overcame the challenge as a team.

From RL simulation and training to vision perception and hardware firmware, we collaborated closely and cohesively. It was an incredible experiment, and during that time, I often thought about Steve Jobs’ reality distortion field: when you believe in something so strongly and keep pushing for it so persistently, you can somehow make the “impossible” happen.

Since the beginning of 2021, I have led the Applied AI Research team. Managing a team presents a different set of challenges and requires changes in work style. I am very proud of several projects related to language model safety within Applied AI:

  1. We designed and built a dataset and evaluation tasks to measure the tendency of pre-trained language models to generate hateful, sexual or violent content.
  2. We’ve created a detailed taxonomy and built a strong classifier to detect unwanted content and identify why the content is inappropriate.
  3. We are working on several techniques to make the model less likely to generate unsafe results.

As the Applied AI team practices how best to deploy cutting-edge AI techniques, such as large pre-trained language models, we see how powerful and useful they are for real-world tasks. We are also aware of the importance of deploying techniques safely, as our Charter underlines.

