Human-Complete Problems

Why define a set of problems in such a human-centric way?

.. Chess-boxing and the Ironman triathlon are real-world examples of such compound games.

.. Each of these exists as a universe of open-ended variety. Lee Sedol’s “make a living” game does not just involve the “beat everybody else at Go” finite game. It likely also includes: win awards, trash-talk other Go players, make the New Yorker cover, drink tea, respect your elders, eat bibimbap, and so on. AlphaGo beat Lee Sedol at Go, but hasn’t yet proven better than him at the specific infinite game problem of Making a Living as Lee Sedol (which would mean continuing to fulfill the ineffable potential of being Lee Sedol better than the human Lee Sedol himself manages). It also hasn’t figured out the problem of Making a Living as AlphaGo.

.. When I was in grad school studying control theory — a field that attracts glum, pessimistic people — I used to hang out a lot with AI people, since I was using some AI methods in my work. Back then, AI people were even more glum and pessimistic than controls people, which is an achievement worthy of the Nobel prize in literature.

Is AlphaGo Really Such a Big Deal?

Will the technical advances that led to AlphaGo’s success have broader implications? To answer this question, we must first understand the ways in which the advances that led to AlphaGo are qualitatively different and more important than those that led to Deep Blue.

.. In chess, beginning players are taught a notion of a chess piece’s value. In one system, a knight or bishop is worth three pawns.

.. The notion of value is crucial in computer chess.
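The piece-value system described above is easy to make concrete. Below is a minimal, illustrative sketch of the classic material-count evaluation that computer chess programs build on (the piece letters and list encoding are toy conventions of this sketch, not any real engine's API; real engines layer many positional terms on top of raw material).

```python
# Standard beginner piece values: a knight or bishop is worth three pawns.
# The king gets 0 here because it can never actually be traded.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material_score(pieces):
    """Score a position from White's point of view.

    `pieces` is a list of piece letters: uppercase = White,
    lowercase = Black (a toy encoding for illustration only).
    Positive scores favor White, negative favor Black.
    """
    score = 0
    for p in pieces:
        value = PIECE_VALUES[p.upper()]
        score += value if p.isupper() else -value
    return score
```

For example, a position where White has an extra knight and Black an extra pawn scores +2 for White: exactly the kind of simple, well-defined valuation that, as the next excerpt notes, has no obvious analogue in Go.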

.. Top Go players use a lot of intuition in judging how good a particular board position is. They will, for instance, make vague-sounding statements about a board position having “good shape.” And it’s not immediately clear how to express this intuition in simple, well-defined systems like the valuation of chess pieces.

.. What’s new and important about AlphaGo is that its developers have figured out a way of bottling something very like that intuitive sense.

.. AlphaGo took 150,000 games played by good human players and used an artificial neural network to find patterns in those games. In particular, it learned to predict with high probability what move a human player would take in any given position. AlphaGo’s designers then improved the neural network by repeatedly playing it against earlier versions of itself, adjusting the network so it gradually improved its chance of winning.
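The two-stage process described above can be sketched in miniature. This is a hedged, tabular stand-in for what AlphaGo does with a deep neural network: stage one nudges a move policy toward the moves human experts played (a cross-entropy update), and stage two adjusts the policy after self-play games so that moves from won games become more likely (a REINFORCE-style update). All names, the learning rate, and the tabular representation are illustrative assumptions, not AlphaGo's actual code.

```python
import math

def softmax(logits):
    """Turn per-move scores into a probability distribution over moves."""
    m = max(logits.values())
    exps = {a: math.exp(v - m) for a, v in logits.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}

def supervised_step(logits, human_move, lr=0.1):
    """Stage 1: nudge the policy toward the move a human expert played.
    (Cross-entropy gradient on a tabular policy; AlphaGo did this with
    a neural network over 150,000 human games.)"""
    probs = softmax(logits)
    for a in logits:
        target = 1.0 if a == human_move else 0.0
        logits[a] += lr * (target - probs[a])

def self_play_step(logits, played_move, won, lr=0.1):
    """Stage 2: after a self-play game, reinforce moves from won games
    and discourage moves from lost ones (REINFORCE-style update)."""
    probs = softmax(logits)
    reward = 1.0 if won else -1.0
    for a in logits:
        grad = (1.0 if a == played_move else 0.0) - probs[a]
        logits[a] += lr * reward * grad
```

After a supervised step on a human move, the policy assigns that move higher probability; after a losing self-play game, the played move's probability drops. Repeating such tiny adjustments billions of times is, in spirit, how the policy network was built.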

.. AlphaGo created a policy network through billions of tiny adjustments, each intended to make just a tiny incremental improvement. That, in turn, helped AlphaGo build a valuation system that captures something very similar to a good Go player’s intuition about the value of different board positions.
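One way to see how self-play yields a valuation system: each position visited during self-play games can be scored by how often games passing through it were eventually won. The sketch below is a tabular stand-in for that idea (AlphaGo's actual value network generalizes across positions with a neural network rather than memorizing a table; the class and method names here are invented for illustration).

```python
from collections import defaultdict

class ValueEstimator:
    """Estimate each position's win probability as the running average
    of self-play outcomes observed from that position."""

    def __init__(self):
        self.totals = defaultdict(float)   # sum of outcomes per position
        self.counts = defaultdict(int)     # games seen per position

    def record_game(self, positions, outcome):
        """Record one finished self-play game.

        `positions` is the sequence of positions visited;
        `outcome` is 1.0 for a win, 0.0 for a loss.
        """
        for pos in positions:
            self.totals[pos] += outcome
            self.counts[pos] += 1

    def win_probability(self, pos, prior=0.5):
        """Estimated chance of winning from `pos`; unseen positions
        fall back to a neutral prior."""
        if self.counts[pos] == 0:
            return prior
        return self.totals[pos] / self.counts[pos]
```

A table like this captures, crudely, the same judgment a strong player makes when calling a position “good shape”: positions that tend to lead to wins get high values.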

.. I see AlphaGo not as a revolutionary breakthrough in itself, but rather as the leading edge of an extremely important development: the ability to build systems that can capture intuition and learn to recognize patterns. Computer scientists have attempted to do this for decades, without making much progress. But now, the success of neural networks has the potential to greatly expand the range of problems we can use computers to attack.

The Sadness and Beauty of Watching Google’s AI Play Go

According to Google, 60 million Chinese watched the first game on Wednesday afternoon.

.. In the first game, Lee Sedol was caught off-guard. In the second, he was powerless.

.. Kwon even went so far as to say that he is now more aware of the potential for machines to break free from the control of humans, echoing words we’ve long heard from people like Elon Musk and Sam Altman. “There was an inflection point for all human beings,” he said of AlphaGo’s win. “It made us realize that AI is really near us—and realize the dangers of it too.”

.. Lee Sedol said that over the course of the four-hour match he never once felt in control. “Yesterday, I was surprised,” he said through an interpreter, referring to Game One. “But today I am speechless. If you look at the way the game was played, I admit, it was a very clear loss on my part. From the very beginning of the game, there was not a moment in time when I felt that I was leading.”

.. In the end, Lee Sedol said he felt that, unlike in Game One, AlphaGo made no real mistakes. Not one. “I really feel that AlphaGo played the near-perfect game.”

Google AI Wins Pivotal Second Game in Match with Go Grandmaster

The thing to realize is that, after playing AlphaGo for the first time on Wednesday, Lee Sedol could adjust his style of play—just as Kasparov did back in 1996. But AlphaGo could not. Because this Google creation relies so heavily on machine learning techniques, the DeepMind team needs a good four to six weeks to train a new incarnation of the system. And that means they can’t really change things during this eight-day match.

“This is about teaching and learning,” Hassabis told us just before Game Two. “One game is not enough data to learn from—for a machine—and training takes an awful lot of time.”

.. Following Game One, Lee Sedol acknowledged he was “shocked” by how well AlphaGo played and said he’d made a notable mistake at the beginning of the game that led to his loss about three hours later. “The failure I made at the very beginning of the game lasted until the very end,” he said, through an interpreter. “I didn’t think that AlphaGo would play the game in such a perfect manner.” It’s unclear what early mistake he was referring to.

.. the current version of AlphaGo not only plays more aggressively. It makes fewer mistakes.

.. “Although we have programmed this machine to play, we have no idea what moves it will come up with,” Graepel said. “Its moves are an emergent phenomenon from the training. We just create the data sets and the training algorithms. But the moves it then comes up with are out of our hands—and much better than we, as Go players, could come up with.”

.. AlphaGo does not attempt to maximize its points or its margin of victory. It tries to maximize its probability of winning. So, Graepel said, if AlphaGo must choose between a scenario where it will win by 20 points with 80 percent probability and another where it will win by 1.5 points with 99 percent probability, it will choose the latter.
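Graepel's decision rule is simple enough to state in a few lines: compare candidate lines of play by win probability alone and ignore the expected margin entirely. The numbers below reproduce his example; the tuple format and function name are illustrative, not anything from DeepMind's code.

```python
def pick_move(options):
    """Choose among candidate outcomes, each given as
    (description, win_probability, expected_margin).
    The expected margin is deliberately ignored."""
    return max(options, key=lambda o: o[1])

options = [
    ("win by 20 points", 0.80, 20.0),
    ("win by 1.5 points", 0.99, 1.5),
]
# pick_move(options) selects the 99%-probability line
# despite its far smaller margin of victory.
```

This is why AlphaGo's endgames can look oddly conservative to human observers: a safe half-point win and a crushing 20-point win are worth exactly the same to it.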