How An Advanced AI Took Over Google Search
By: Daniel Jeffers, Elizabeth Navas
To many people, Google’s search algorithm is the symbol of an advanced intelligence. Google/Skynet jokes have been around for a long time and reflect how important the search engine is to our daily lives and our ability to interact with all the information outside of us.
But the truth is that Google search, specifically, held out against the modern wave of artificial intelligence until very recently. Machine learning is the term for programs that learn their behavior from huge amounts of data instead of following rules written by hand, so that in effect they program themselves. The advantage of machine learning is that it can be more flexible and more efficient. The disadvantage, from an engineering point of view, is that once a system has learned its own behavior, you can no longer be entirely sure how it works.
Though Google has invested heavily in machine learning, the search team long resisted using it as part of the algorithm. Machine learning has been incorporated into advertising, Gmail inbox features, and image search, and is central to many Google research projects. But search was considered too important, too sacred. The search engineers preferred to keep control over the algorithm.
In 2014, the neural network engineers convinced the search ranking team to let them try an experiment. Initially, the resulting system, RankBrain, was rolled out only to deal with never-before-seen queries, which account for between 10 and 15% of all searches on Google. RankBrain could learn to match these queries to phrases already known and indexed by using learned vectors that capture relationships between words and concepts.
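To make the idea of matching by learned vectors concrete, here is a minimal sketch in Python. It is not Google's method, which has not been published. The tiny three-number vectors, the word list, and the function names are all invented for illustration; a real system would learn vectors with hundreds of dimensions from enormous amounts of text. The sketch only shows the general principle: a query nobody has typed before can still land close to a phrase the engine already knows.

```python
import math

# Hypothetical toy word vectors, invented for this example.
# Real learned vectors are much longer and come from training on text.
WORD_VECTORS = {
    "cheap": [0.90, 0.10, 0.00],
    "inexpensive": [0.85, 0.15, 0.05],
    "flights": [0.10, 0.90, 0.10],
    "airfare": [0.20, 0.85, 0.10],
    "pasta": [0.05, 0.00, 0.90],
    "recipes": [0.00, 0.10, 0.95],
}


def embed(phrase):
    """Average the vectors of the known words in a phrase (None if no word is known)."""
    vectors = [WORD_VECTORS[w] for w in phrase.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return None
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_known_phrase(query, known_phrases):
    """Return the known phrase whose vector is most similar to the query's."""
    query_vector = embed(query)
    if query_vector is None:
        return None
    best_phrase, best_score = None, -1.0
    for phrase in known_phrases:
        phrase_vector = embed(phrase)
        if phrase_vector is None:
            continue
        score = cosine(query_vector, phrase_vector)
        if score > best_score:
            best_phrase, best_score = phrase, score
    return best_phrase


# Phrases we pretend are already known and indexed.
KNOWN_PHRASES = ["cheap flights", "pasta recipes"]

print(closest_known_phrase("inexpensive airfare", KNOWN_PHRASES))
```

In this toy setup, "inexpensive airfare" shares no words with "cheap flights", yet the two phrases end up nearly identical as vectors, so the unseen query is matched to the known one. A query made up entirely of unknown words returns nothing, which is where a real system would need some fallback.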
A year later, RankBrain is involved in every query and has been cited as the third most important ranking factor. Apparently it has passed its audition.
Other than that, we really don’t know a lot about how RankBrain is being used. Is it connected to project Leapfrog? Google contracts out a project that employs thousands of human raters to review and compare search result pages on both desktop and mobile. Could RankBrain be using their results to teach itself? Could it even be submitting problems to the human raters to help itself learn faster? We don’t know.
We do know that Google claims RankBrain is helping the algorithm become more and more human-like, or at least better able to predict human responses to the results offered. But we don’t know how it does this, or what modifications it may be making to the algorithm. That’s partly because Google, famously tight-lipped about the factors it uses, doesn’t want us to know. But it is also because the Google engineers don’t really know either.
The process of training the code is now more organic. According to expert Christine Robson:
“The machine learning model is not a static piece of code — you're constantly feeding it data,” says Robson. “We are constantly updating the models and learning, adding more data and tweaking how we're going to make predictions. It feels like a living, breathing thing. It’s a different kind of engineering.”
How does this change our relationship to Google? Well, we have always been guessing. We can run a lot of tests, read published papers, and scour the words of those few Google engineers allowed to comment publicly, but mostly we have to try things and see what works. Of course, we also use Google’s Webmaster Guidelines, which should give us at least a safe place from which to operate.
Presumably RankBrain will continue to adhere to and value the Webmaster Guidelines. It may be just as mysterious as the rest of the algorithm, and it currently has no publicly authorized spokesperson to explain it. Trial and error should work about as well as it always has, but when we form theories to test, we need to think of RankBrain as something other than code written by engineers. It is, essentially, a self-writing program whose goal is to become as human-like as possible.