Originally published February 1, 2017
Here is an excerpt:
Libratus relied on three different systems that worked together, a reminder that modern AI is driven not by one technology but many. Deep neural networks get most of the attention these days, and for good reason: They power everything from image recognition to translation to search at some of the world’s biggest tech companies. But the success of neural nets has also pumped new life into so many other AI techniques that help machines mimic and even surpass human talents.
Libratus, for one, did not use neural networks. Mainly, it relied on a form of AI known as reinforcement learning, a method of extreme trial-and-error. In essence, it played game after game against itself. Google’s DeepMind lab used reinforcement learning in building AlphaGo, the system that cracked the ancient game of Go ten years ahead of schedule, but there’s a key difference between the two systems. AlphaGo learned the game by analyzing 30 million Go moves from human players, before refining its skills by playing against itself. By contrast, Libratus learned from scratch.
Through an algorithm called counterfactual regret minimization, it began by playing at random, and eventually, after several months of training and trillions of hands of poker, it too reached a level where it could not just challenge the best humans but play in ways they couldn’t—playing a much wider range of bets and randomizing these bets, so that rivals have more trouble guessing what cards it holds. “We give the AI a description of the game. We don’t tell it how to play,” says Noam Brown, a CMU grad student who built the system alongside his professor, Tuomas Sandholm. “It develops a strategy completely independently from human play, and it can be very different from the way humans play the game.”
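To make the self-play idea in the excerpt concrete, here is a minimal sketch of regret matching, the update rule at the core of counterfactual regret minimization, applied to rock-paper-scissors. It is not Libratus's code. The toy game, the function names, and the starting bias toward rock are assumptions chosen for illustration. Full counterfactual regret minimization applies this kind of update at every decision point of a much larger game tree, which the sketch leaves out.

```python
# Toy sketch of regret matching in self-play (illustrative only, not Libratus's code).
ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[a][b]: payoff to a player who picks action a against an opponent's action b.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to playing uniformly at random.
    return [1.0 / len(regrets)] * len(regrets)


def train(iterations=100_000):
    """Run self-play and return each player's average strategy."""
    n = len(ACTIONS)
    # Assumption for illustration: player 0 starts biased toward rock,
    # so there is something for both players to learn from.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * n, [0.0] * n]

    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(n))
                for a in range(n)
            ]
            # Expected payoff of the strategy the player actually used.
            expected = sum(strategies[player][a] * action_values[a] for a in range(n))
            for a in range(n):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train()):
        print(player, [round(p, 3) for p in average])
```

The quantity that matters is the average strategy over all iterations, not the final one. In this toy game it should drift toward playing each action about a third of the time, which is the kind of randomized, hard-to-read play the excerpt describes.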
The article is here.