OpenAI AI Coding Filter page 14

OpenAI AI Coding Filter October 31, 2018 07:00

Reinforcement learning with prediction-based rewards

We’ve developed Random Network Distillation (RND), a prediction-based method that encourages reinforcement learning agents to explore their environments through curiosity. For the first time, this method exceeds average human performance on Montezuma’s Revenge.
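
To illustrate what a prediction-based reward can look like, here is a minimal sketch, not the published implementation. It assumes the usual RND setup: a fixed, randomly initialized target network and a predictor network trained to match it, with the squared prediction error serving as the exploration bonus. The network sizes, the learning rate, and the helper names (make_random_net, forward, intrinsic_reward, train_predictor) are all hypothetical choices made for this example.

```python
import math
import random

def make_random_net(in_dim, out_dim, rng):
    # One tanh layer with Gaussian weights (an illustrative size, not the real one).
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def forward(weights, x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def intrinsic_reward(predictor, target, x):
    # Exploration bonus: squared error between predictor and fixed target outputs.
    p, t = forward(predictor, x), forward(target, x)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

def train_predictor(predictor, target, x, lr=0.1):
    # One gradient step on the squared error; the target network is never updated.
    p, t = forward(predictor, x), forward(target, x)
    for row, pi, ti in zip(predictor, p, t):
        g = 2.0 * (pi - ti) * (1.0 - pi * pi)  # chain rule through tanh
        for j, xj in enumerate(x):
            row[j] -= lr * g * xj

rng = random.Random(0)
target = make_random_net(3, 4, rng)     # fixed random network
predictor = make_random_net(3, 4, rng)  # trained to imitate the target

state = [1.0, 0.0, 0.5]                 # a state the agent keeps visiting
reward_before = intrinsic_reward(predictor, target, state)
for _ in range(200):
    train_predictor(predictor, target, state)
reward_after = intrinsic_reward(predictor, target, state)
print(reward_before, reward_after)
```

In this sketch the bonus for a repeatedly visited state shrinks as the predictor learns it, so states the agent has seen less often tend to keep a larger bonus.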

OpenAI AI Coding Filter July 25, 2018 07:00

OpenAI Scholars 2018: Meet our Scholars

Our first class of OpenAI Scholars is underway, and you can now follow along as this group of experienced software developers becomes machine learning practitioners.

OpenAI AI Coding Filter July 04, 2018 07:00

Learning Montezuma’s Revenge from a single demonstration

We’ve trained an agent to achieve a high score of 74,500 on Montezuma’s Revenge from a single human demonstration, better than any previously published result. Our algorithm is simple: the agent plays a sequence of games starting from carefully chosen...

OpenAI AI Coding Filter March 07, 2018 08:00

Reptile: A scalable meta-learning algorithm

We’ve developed a simple meta-learning algorithm called Reptile which works by repeatedly sampling a task, performing stochastic gradient descent on it, and updating the initial parameters towards the final parameters learned on that task. Reptile is the...
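
The loop described above (sample a task, run stochastic gradient descent on it, move the initial parameters towards the result) can be sketched as follows. This is a toy illustration under stated assumptions rather than the released code: each task is a one-weight regression y = slope * x, and the function names, step counts, and learning rates are hypothetical.

```python
import random

def sgd_on_task(w, slope, rng, steps=5, lr=0.1):
    # Inner loop: a few SGD steps fitting y = slope * x with a single weight w.
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        grad = 2.0 * (w * x - slope * x) * x  # d/dw of (w*x - slope*x)^2
        w -= lr * grad
    return w

def reptile(meta_steps=2000, meta_lr=0.1, seed=0):
    rng = random.Random(seed)
    w_init = 0.0
    for _ in range(meta_steps):
        slope = rng.uniform(2.0, 4.0)           # sample a task
        w_task = sgd_on_task(w_init, slope, rng)  # SGD on that task
        w_init += meta_lr * (w_task - w_init)   # move init towards final params
    return w_init

w0 = reptile()
print(w0)
```

With these toy settings the learned initialization should settle inside the range of task slopes, which gives the inner loop a starting point that is close to every task in the distribution.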

OpenAI AI Coding Filter February 26, 2018 08:00

Ingredients for robotics research

We’re releasing eight simulated robotics environments and a Baselines implementation of Hindsight Experience Replay, all developed for our research over the past year. We’ve used these environments to train models which work on physical robots. We’re also...

OpenAI AI Coding Filter October 26, 2017 07:00

Learning a hierarchy

We’ve developed a hierarchical reinforcement learning algorithm that learns high-level actions useful for solving a range of tasks, so that tasks requiring thousands of timesteps can be solved quickly. Our algorithm, when applied to a set of navigation problems,...

OpenAI AI Coding Filter October 11, 2017 07:00

Meta-learning for wrestling

We show that, for the task of simulated robot wrestling, a meta-learning agent can learn to quickly defeat a stronger non-meta-learning agent, and that the meta-learning agent can adapt to physical malfunction.

OpenAI AI Coding Filter August 16, 2017 07:00

More on Dota 2

Our Dota 2 result shows that self-play can catapult the performance of machine learning systems from far below human level to superhuman, given sufficient compute. In the span of a month, our system went from barely matching a high-ranked player to beating...

OpenAI AI Coding Filter August 03, 2017 07:00

Gathering human feedback

RL-Teacher is an open-source implementation of our interface to train AIs via occasional human feedback rather than hand-crafted reward functions. The underlying technique was developed as a step towards safe AI systems, but also applies to reinforcement...

OpenAI AI Coding Filter June 28, 2017 07:00

Faster physics in Python

We’re open-sourcing a high-performance Python library for robotic simulation using the MuJoCo engine, developed over our past year of robotics research.

OpenAI AI Coding Filter June 13, 2017 07:00

Learning from human preferences

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration...

OpenAI AI Coding Filter June 08, 2017 07:00

Learning to cooperate, compete, and communicate

Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum: the difficulty of the environment is determined by the skill of your...