Cade Metz
The New York Times
Originally published August 13, 2017
Here is an excerpt:
Many specialists in the A.I. field believe a technique called reinforcement learning — a way for machines to learn specific tasks through extreme trial and error — could be a primary path to artificial intelligence. Researchers specify a particular reward the machine should strive for, and as it navigates a task at random, the machine keeps close track of what brings the reward and what doesn’t. When OpenAI trained its bot to play Coast Runners, the reward was more points.
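As a rough illustration of the trial-and-error loop described above, here is a minimal Python sketch. It is not OpenAI's code or the method from the article: the action names, point values, and the `learn_by_trial_and_error` function are all hypothetical, and the simple "explore at random some of the time, otherwise repeat what has paid off" rule is an assumed stand-in for a real reinforcement learning algorithm.

```python
import random


def learn_by_trial_and_error(rewards, steps=500, explore_rate=0.2, seed=0):
    """Estimate the reward of each action by trying actions repeatedly.

    `rewards` maps a hypothetical action name to the points it yields.
    The learner never reads this table directly; it only sees the
    reward for the action it just tried.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    actions = list(rewards)
    estimates = {a: 0.0 for a in actions}  # current guess of each action's reward
    counts = {a: 0 for a in actions}       # how often each action was tried

    for _ in range(steps):
        if rng.random() < explore_rate:
            # Trial and error: sometimes pick an action at random.
            action = rng.choice(actions)
        else:
            # Otherwise repeat the action that has looked best so far.
            action = max(actions, key=lambda a: estimates[a])

        reward = rewards[action]
        counts[action] += 1
        # Keep track of what brings reward: update the running average.
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates


# Hypothetical point values for three moves in a made-up racing game.
points = {"drift": 0.0, "follow_track": 1.0, "hit_boost": 3.0}
estimates = learn_by_trial_and_error(points)
best = max(estimates, key=estimates.get)
```

In this toy setup the only goal the machine has is the number it is told to maximize, so it settles on whichever move scores the most points, whether or not that move is what a human designer had in mind.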
This video game training has real-world implications.
If a machine can learn to navigate a racing game like Grand Theft Auto, researchers believe, it can learn to drive a real car. If it can learn to use a web browser and other common software apps, it can learn to understand natural language and maybe even carry on a conversation. At places like Google and the University of California, Berkeley, robots have already used the technique to learn simple tasks like picking things up or opening a door.
All this is why Mr. Amodei and Mr. Christiano are working to build reinforcement learning algorithms that accept human guidance along the way. This can ensure systems don’t stray from the task at hand.
Together with others at the London-based DeepMind, a lab owned by Google, the two OpenAI researchers recently published some of their research in this area. Spanning two of the world’s top A.I. labs — and two that hadn’t really worked together in the past — these algorithms are considered a notable step forward in A.I. safety research.
The article is here.