Originally published February 13, 2017
Here is an excerpt:
But when they used larger, more complex networks as the agents, the AI was far more willing to sabotage its opponent early to get the lion's share of virtual apples.
The researchers suggest that the more intelligent the agent, the more able it was to learn from its environment, allowing it to use some highly aggressive tactics to come out on top.
"This model ... shows that some aspects of human-like behaviour emerge as a product of the environment and learning," one of the team, Joel Z Leibo, told Matt Burgess at Wired.
"Less aggressive policies emerge from learning in relatively abundant environments with less possibility for costly action. The greed motivation reflects the temptation to take out a rival and collect all the apples oneself."
DeepMind was then tasked with playing a second video game, called Wolfpack. This time, there were three AI agents - two of them played as wolves, and one as the prey.
The article is here.