The thing to realize is that, after playing AlphaGo for the first time on Wednesday, Lee Sedol could adjust his style of play—just as Kasparov did back in 1996. But AlphaGo could not. Because this Google creation relies so heavily on machine learning techniques, the DeepMind team needs a good four to six weeks to train a new incarnation of the system. And that means they can’t really change things during this eight-day match.
“This is about teaching and learning,” Hassabis told us just before Game Two. “One game is not enough data to learn from—for a machine—and training takes an awful lot of time.”
.. Following Game One, Lee Sedol acknowledged he was “shocked” by how well AlphaGo played and said he’d made a notable mistake at the beginning of the game that led to his loss about three hours later. “The failure I made at the very beginning of the game lasted until the very end,” he said, through an interpreter. “I didn’t think that AlphaGo would play the game in such a perfect manner.” It’s unclear what early mistake he was referring to.
.. the current version of AlphaGo not only plays more aggressively; it also makes fewer mistakes.
.. “Although we have programmed this machine to play, we have no idea what moves it will come up with,” Graepel said. “Its moves are an emergent phenomenon from the training. We just create the data sets and the training algorithms. But the moves it then comes up with are out of our hands—and much better than we, as Go players, could come up with.”
.. AlphaGo does not attempt to maximize its points or its margin of victory. It tries to maximize its probability of winning. So, Graepel said, if AlphaGo must choose between a scenario where it will win by 20 points with 80 percent probability and another where it will win by 1.5 points with 99 percent probability, it will choose the latter.
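Graepel’s example can be sketched in a few lines. This is an illustrative toy, not AlphaGo’s actual code: the candidate numbers come straight from the article’s example, and the function and field names are hypothetical. It shows how an agent that ranks moves by win probability alone will prefer the narrow-but-near-certain win over the larger-but-riskier one.

```python
def pick_move(candidates):
    """Pick the candidate move with the highest probability of winning,
    ignoring the expected margin of victory entirely."""
    return max(candidates, key=lambda c: c["win_prob"])

# The two scenarios from Graepel's example (hypothetical data structure):
candidates = [
    {"name": "big_win",  "win_prob": 0.80, "margin": 20.0},  # win by 20, 80% of the time
    {"name": "safe_win", "win_prob": 0.99, "margin": 1.5},   # win by 1.5, 99% of the time
]

best = pick_move(candidates)
print(best["name"])  # prints "safe_win"

# A margin-maximizing agent would point the other way:
# expected margin of big_win  = 0.80 * 20.0 = 16.0 points
# expected margin of safe_win = 0.99 * 1.5  ≈ 1.49 points
```

The contrast in the closing comment is the whole point: by expected margin, the aggressive line looks roughly ten times better, yet AlphaGo’s objective picks the 1.5-point win because Go scoring rewards winning, not winning big.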