More Stories

Proximal Policy Optimization
RESEARCH

8 years ago

Robust adversarial inputs
RESEARCH

8 years ago

Faster physics in Python
RESEARCH

8 years ago

Learning from human preferences
SAFETY & ALIGNMENT

8 years ago

OpenAI Baselines: DQN
RESEARCH

8 years ago

Robots that learn
RESEARCH

8 years ago