A computer that thinks for itself has mastered the world’s most difficult game with no human input, scientists have revealed


The artificial intelligence program taught itself the game, and scientists claim the breakthrough is a game-changer with all sorts of applications.

DeepMind’s updated AlphaGo algorithm defeated its predecessor 100–0 – a predecessor that had itself proved too good for Go champion Lee Sedol last year.

Go is a 3,000-year-old Chinese game with trillions of possible move sequences – far harder than chess.


There are an astonishing 10 to the power of 170 possible board configurations in Go – more than the number of atoms in the known universe.

Prof David Silver, of London-based DeepMind, said: “What we are most excited about is how far it can go in the real world.

“The fact we have seen a program can achieve a very high level of performance in a domain as complicated and challenging as Go should mean now we can start to tackle some of the most challenging and impactful problems for society.”

The breakthrough means humans having “meaningful conversations” with artificial intelligence – like with the menacing computer HAL in 2001: A Space Odyssey – is no longer the stuff of science fiction.

It’s hoped AI will eventually lead to better medical treatments – such as discovering drugs and speeding up diagnosis of serious illnesses like cancer, diabetes and heart disease.

It can also be used in a diverse range of other areas from wildlife preservation and agriculture to cybersecurity and the law.

DeepMind said: “After a few days of training — including almost 5 million games of self-play — AlphaGo Zero could outperform humans and defeat all previous versions of AlphaGo.

“As the program trained, it independently discovered some of the same game principles that took humans thousands of years to conceptualise and also developed novel strategies that provide new insights into this ancient game.”

AlphaGo’s previous training relied on games played by human experts – along with multiple machines and chips. The new method is free of human guidance.

In one of the biggest advances in AI to date, it uses the game board as the input to an artificial neural network.

This calculates the probability of each possible next move being played and estimates the chance of winning for the player whose turn it is to move.

Through trial and error – a technique called ‘reinforcement learning’ – the AI learns the moves that maximise its chance of winning, and was trained exclusively by playing games against itself.
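The idea of learning purely from self-play can be sketched in miniature. The snippet below is a heavily simplified, assumption-laden illustration, not AlphaGo Zero’s method: it swaps Go for a trivial take-away game and the deep neural network for a lookup table, but the learning loop is the same shape – play against yourself, observe who won, and nudge your value estimates toward the outcome.

```python
import random

# Toy "take-away" game standing in for Go: from a pile of stones, each player
# in turn removes 1-3 stones, and whoever takes the last stone wins.
TAKE = (1, 2, 3)

def train(start=21, episodes=20000, alpha=0.1, eps=0.2, seed=0):
    """Self-play reinforcement learning: V[n] estimates the win probability
    for the player about to move when n stones remain."""
    rng = random.Random(seed)
    V = {n: 0.5 for n in range(1, start + 1)}
    for _ in range(episodes):
        pile, mover, trail = start, 0, []
        while pile > 0:
            moves = [k for k in TAKE if k <= pile]
            if rng.random() < eps:
                k = rng.choice(moves)              # explore: try a random move
            else:                                  # exploit: best move so far
                k = max(moves, key=lambda m: 1.0 if m == pile else 1.0 - V[pile - m])
            trail.append((pile, mover))
            pile -= k
            mover = 1 - mover
        winner = 1 - mover                         # whoever took the last stone
        for n, p in trail:                         # learn from the final outcome
            target = 1.0 if p == winner else 0.0
            V[n] += alpha * (target - V[n])
    return V

def best_move(V, pile):
    """Greedy policy read off the learned value estimates."""
    moves = [k for k in TAKE if k <= pile]
    return max(moves, key=lambda m: 1.0 if m == pile else 1.0 - V[pile - m])
```

With these settings the table recovers the classic winning strategy – always leave the opponent a multiple of four stones – from trial and error alone, with no strategy ever programmed in.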

AlphaGo Zero used about 0.4 seconds of thinking time per move to perform a look-ahead search.

That means it ran game simulations to decide which moves would give it the highest probability of winning – updating its neural network as it went.
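A simulation-based look-ahead of this kind can be sketched with plain Monte Carlo playouts. To be clear, AlphaGo Zero’s actual search is a Monte Carlo tree search guided by its neural network; this toy version – again on a simple take-away game, with uniformly random playouts – only illustrates the core idea of scoring each candidate move by simulating games after it.

```python
import random

TAKE = (1, 2, 3)  # toy game: remove 1-3 stones; taking the last stone wins

def playout(pile, rng):
    """Finish the game with uniformly random moves; return the winner
    (0 or 1), where player 0 is the one to move first."""
    player = 0
    while True:
        pile -= rng.choice([k for k in TAKE if k <= pile])
        if pile == 0:
            return player
        player = 1 - player

def best_move(pile, sims=3000, seed=0):
    """Look-ahead by simulation: estimate each candidate move's win
    probability from random playouts and pick the highest."""
    rng = random.Random(seed)
    scores = {}
    for k in (m for m in TAKE if m <= pile):
        if k == pile:
            scores[k] = 1.0       # taking the last stone wins outright
        else:
            # After our move, the opponent is player 0 in the playout;
            # we win whenever the playout says they lose.
            wins = sum(playout(pile - k, rng) == 1 for _ in range(sims))
            scores[k] = wins / sims
    return max(scores, key=scores.get)
```

Even with purely random playouts, the simulated win rates are enough to steer the search toward moves that leave the opponent in a weaker position – AlphaGo Zero sharpens this by replacing random playouts with its network’s own move and value predictions.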

Previously AlphaGo required many neural networks and multiple sources of training data.

AlphaGo Zero also needed only 4.9 million training games instead of 30 million – three days rather than several months of training.

Playing under conditions that match those of human games, AlphaGo Zero beat AlphaGo 100–0.

DeepMind says AlphaGo Zero starts from a blank slate and conquers the grand challenge of Go with “superhuman proficiency”.

Describing it in Nature, the team says it learns solely from the games it plays against itself.

It starts from random moves, with only the board and pieces as inputs and no human data.

DeepMind said: “AlphaGo Zero uses a single neural network which is trained to predict the program’s own move selection and the winner of its games – improving with each iteration of self-play.”

The new program uses a single machine and just four specialised chips.



Prof Silver said it could be transplanted from Go to “any other domain” – and was “so good it could be applied anywhere.”


He said the aim of AlphaGo Zero is not to beat humans but to “understand what knowledge is”.

Computer scientist Professor Satinder Singh, of the University of Michigan, reviewed the findings for the journal and said teaching a computer to learn from scratch is a “major achievement”.

He said: “Go players, coming from so many nations, speak to each other with their moves, even when they do not share an ordinary language.

“They share ideas, intuitions and, ultimately, their values over the board – not only particular openings or tactics, but whether they prefer chaos or order, risk or certainty, and complexity or simplicity.

“The time when humans can have a meaningful conversation with an AI has always seemed far off and the stuff of science fiction. But for Go players, that day is here.”

AlphaGo made its name last year when it defeated high-profile Go player Lee Sedol in Seoul – following this up by beating world number one Ke Jie in China.

DeepMind Technologies Limited is a British artificial intelligence company founded in September 2010. It was bought by Google in 2014 for £400 million.
