There have been quite a few major changes over the years in how search engines rank websites, pages, and content. In this article, I briefly survey the most significant changes made by Google. The emphasis will be on the recent Google updates affecting SEO and the best strategies for dealing with them.
In the Beginning there was PageRank
In the beginning (of the internet), the internet was without form and void. Unless you had a specific address for the content you wanted to load, there was nothing to “browse”. There was no method to search across addresses for content relevant to your research interests. In fact, the internet was not used for research at all.
I fondly recall doing my university research in the 1990s. All research was done in the libraries. The libraries did have a searchable indexed digital catalogue for their books by title, author, year, synopsis, and keywords. For journals, it was a little different: we had microfiche and microfiche scanners. Searching through academic journals was essentially analogue and took many hours, days, or even weeks to find enough relevant material.
There was a shortcut, however, to wading through the microfiche of millions of journal articles. Nearly every well-regarded textbook in your field referred to the most significant journal entries. So if you wanted to research a topic beyond the introductory textbook information in a chapter, all you had to do was jump to the bibliography. And bang! There’s a list of mostly journal articles to start your research. Spend a long day in the library, and about $40 in photocopying, and you could come home with every primary article source for that topic. After a week or so of reading, you could go back with a secondary list of sources compiled from the bibliographies of the relevant first sources. And so on, until your day of rest, usually the day before the paper was due (the very last day reserved for writing!).
If you think about it, and especially if you have a similar academic experience in a research based discipline, the challenges in finding the relevant information were more than just analogous to the development of a search engine for the internet. The challenges and existing solutions were both the context and inspiration for Google’s early search criteria.
PageRank, named after Larry Page, both an academic and a founder of Google Search, is a criterion for ranking the authority of pages based on which and how many other pages refer to them. It’s a method for mathematically discovering the primary, secondary, tertiary, and further degrees of sources. If everyone in a field refers to some seminal journal article, we should assume that it is an essential primary source of the highest quality, authority, and trustworthiness. PageRank is basically the same method we used as an academic research shortcut around the inefficient and time-consuming methods provided by the library, except the concept was to index the entire internet and apply that shortcut to it.
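To make the idea concrete, here is a minimal sketch of the PageRank calculation in Python. This illustrates only the underlying principle, not Google’s actual implementation; the page names and link graph are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.

    Each iteration, every page passes a share of its current score
    to the pages it links to, so heavily cited pages accumulate rank.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score (the "damping" term).
        new_rank = {page: (1 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

# A seminal "primary source" cited by everyone ends up ranked highest.
web = {
    "seminal-article": [],
    "review-a": ["seminal-article"],
    "review-b": ["seminal-article", "review-a"],
    "blog-post": ["review-b", "seminal-article"],
}
scores = pagerank(web)
```

Running this, the most-referenced page wins even though it links to nothing itself, mirroring the bibliography shortcut described above.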
All this assumes sufficient similarity between academic content and internet content, well before anyone could predict exactly how, or how diverse, the internet would become, both in content and use. The early implementation of PageRank also used other commonly accessible journal article attributes such as the title, author, keywords, source (a URL instead of a journal name and number), and abstract (the page description).
These worked pretty well for the first decade or so, well enough to make Google the dominant search engine for the internet. But they weren’t without problems. And as the internet evolved, so did the challenges Google faced in indexing and ranking searches. A dominant theme throughout is relevancy. How does Google rank and present the most relevant results for a given query? We will continue to explore this theme as we get to the best SEO strategies and practices for 2016.
Problems in Paradise (and the Rise of Censorship)
Or How Not to do SEO
While Google built one of the most profitable tech empires, partly based on Google Search, and by far the largest internet database and search engine in the world, it was not without problems. The goal was to provide the most relevant search algorithm for content on the internet. But their model for content was academic journals with standard library search parameters. There was one significant difference between internet content and academic journal articles. Academic journal articles were already peer reviewed for content and quality prior to publication. The internet was an open space for content of any quality and topic.
This big difference created big problems for Google Search and motivated their revisions over their second decade. All their revisions involved increased censorship, with the intention of quality management similar to the peer-reviewed management of academic journal publications. But there was no possible way Google could be an expert in every possible topic for content on the internet. So while Google went public, with revenue driven mostly by AdWords, their search engine suffered abuse after abuse, creating a decade where SEO was more about exploiting Google’s ranking deficiencies than providing genuine quality content.
Bogus Backlink Exploits
You probably still see spam email telling you that someone is willing to provide hundreds of quality backlinks to your website for almost nothing. This is a legacy of the most common Google exploit during this period. PageRank made backlinks the most important method for improving page position in a Google search. For a time, you could effectively fool the Googlebot into thinking that your site was the most referenced on the internet for a specific topic. If you recall my explanation of academic research using journal article references and sources, that would have worked for internet pages as well, but only if they were already peer reviewed. To combat fraudulent use of backlinks, Google took a two pronged approach.
First, Google modified how they determined the “quality” of the source of a backlink. Previously, as with academic references, quality was merely a quantitative measure of the relative number of references to a source: in Google rankings, backlinks from references with higher PageRanks were higher quality backlinks. But now, backlink quality had to be determined somehow from the source itself, apart from PageRank. A backlink from Apple.com, for example, had to mean more than one from some unknown blogger’s website. It’s not generally known exactly how they changed their backlink quality assessment; because of the proliferation of exploits, many of Google’s updates were kept secret. It could have been as simple as manually creating a database ranking the top sources in different industries and sectors. What is clear is that they changed something, because this sort of backlink exploit stopped working.
Anybody who tells you they can get your website into the top page of a search just by generating backlinks for you is selling you a scam. You should immediately junk it as spam. Don’t waste your money.
Second, Google censored suspicious targets of excessive backlinks. Again, their exact methods were very secretive. However, what was clear was that if an unknown website suddenly got a large number of backlinks, it could be flagged for manual review. Once reviewed, the website and all its associated domains could be blacklisted from Google Search. There was no algorithm for blacklisting, and it was totally up to the judgement of someone at Google.
During this time, there was a lot of buzz throughout the SEO community about “negative SEO”. As SEO professionals, we were worried that malicious attacks could spam backlinks to our client websites, flag them, and shut them down. This would be entirely out of our control, since we can’t control who puts a link on their website pointing to ours (outside of manually blocking those references one by one by URL or IP server-side, which would be loads of manual labour).
When Google addressed this issue, they denied using negative backlink signals in their ranking algorithm, but they did not deny blacklisting domains. They did say, however, that the number of domains they had to blacklist was extremely small. Basically, they had to have enough evidence that the domain receiving the excessive backlinks was also the one generating them. “Negative SEO” was a myth in that sense. But it was also totally up to the judgement of Google. Google was doing the same kind of source-checking an academic peer reviewer would do for an article under review.
Backlink exploits aren’t only scams, they can be viciously harmful to your website’s search position. They can even get your domain blacklisted by Google Search. But only if they have sufficient reason to believe the exploit was intentional.
Since then, backlinks and PageRank have been consistently decreased in their ranking priority. There is good reason to believe that eventually they will be removed altogether, and that Google is beta testing versions of their ranking algorithm without them. As I will explain later, other factors, such as the shift toward social media interaction, seem to have a greater effect. These shifts better reflect how non-academic, word-of-mouth references work in the business and commercial world to peer review content and products.
URL Abuse and Domain Name Parking
Google used to prioritize keywords in domain names and URLs, even over keywords in content. I drew the analogy earlier between URLs and journal sources. Academic journals tend to include their topic in their name. We know that “The American Journal of Physics”, for example, is going to contain articles about physics. In the same way, people were advised to purchase domain names which were descriptive of their content. A plumbing company named Grunge would be advised to get a domain name like grungeplumbing.com. While this might be a good idea in general, there are exceptions (“Google.com”, for example, isn’t “googlesearch.com”). And the emphasis on descriptive domain names led to a virtual real estate submarket where companies would register and hoard domain names with descriptive content, jump on registration of expired domain names, and pick up domain names similar to, or even misspellings of, existing domain names. All in hopes of later reselling them at premium prices.
This practice also led companies to buy multiple domain names related to their primary domain. The fear was that someone could use a related domain name and quickly outrank yours. Google basically ignored this practice and fear throughout this period. They probably hoped that their other modifications would outweigh the abuses. Eventually, however, over the last five years, domain names and URLs have decreased as a ranking factor. Today, they are virtually non-existent as a factor, but for other reasons I will explain later.
Don’t worry about your domain name in terms of SEO. Get something easy to remember. And you only need one. Your content will speak for itself.
Keyword Abuse

Keywords were probably the most commonly understood SEO element. When someone uses a search engine, they try to guess the keywords which best match what they are looking for. So on your webpage, you would do the same in reverse, trying to guess the keywords which users searching for you would use. It’s sort of like the SEO version of the television game show Family Feud. But, without any body of peers reviewing your keywords, keywords were also the simplest criterion to abuse.
Instead of trying to accurately describe your page content with a few keywords, SEO hackers would use popular keywords and phrases to boost a page’s position across a wider range of searches. Or they would hunt for niche keywords or phrases which had high query volumes but a low number of results. This was all done apart from, and often unrelated to, the page content.
Again, for a long time, Google did almost nothing to combat keyword abuses. Again, they probably hoped that by adding other factors, the harm of keyword abuses would be diminished. Eventually, they shifted toward prioritizing keywords in content. What that meant was that they expected you to put your most valuable keywords in the most prominent content, such as in your titles and introductory paragraphs, along with repetition of your most desired keywords. This practice paved the way for the SEO analysis of keyword density, whereby we measured which words and phrases were used how many times and compared that, using Google’s Webmaster Tools, with the actual queries giving your page impressions in a Google Search. Google banked so heavily on keywords in content and keyword density that they eventually stopped reading the keywords meta tag altogether (which was the original primary source for keywords, just as in academic journal articles).
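Keyword density itself is a simple metric. Here is a generic sketch in Python of how SEOs computed it; this is not any tool Google provides, and the sample page text is invented for illustration.

```python
import re
from collections import Counter

def keyword_density(text, top=5):
    """Count each word and express it as a percentage of all words on the page."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)
    return [(word, count, round(100 * count / total, 1))
            for word, count in counts.most_common(top)]

# A deliberately repetitive sample page.
page = ("Plumbing services for every plumbing emergency. "
        "Our plumbing team handles residential plumbing repairs.")
print(keyword_density(page, top=3))
```

In practice, we compared numbers like these against the queries reported in Webmaster Tools and adjusted the copy accordingly.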
Neither keywords in content nor keyword density were sufficiently secured from abuse, however. Exploits included hidden keywords in content and misrepresentative or irrelevant keywords scattered throughout the content, visible or hidden, meant to artificially target high value queries. Businesses themselves were encouraged to have two layers of content, one for the search engines, and one for their actual potential visitors.
I believe Google still hasn’t resolved this issue based on the evidence that their latest series of updates have primarily targeted these issues. Keyword density, which used to be a positive measure, is now considered part of keyword stuffing, which is negative. Since these were common practices, and their latest series of minor updates were very quiet with their releases, content revisions conforming to Google’s latest best practices should be considered a top SEO priority. We’ll discuss these revisions later when we look specifically at the best SEO strategies and practices for 2016.
It is against Google policy to hide keywords in your content with the intent to mislead Google and misrepresent your content. “Keyword stuffing” is also against their policy. Content revisions conforming to Google’s latest best practices should be considered a top SEO priority.
Best SEO Strategies and Practices for 2016 and beyond
I hope that having a broader understanding of the historical context of Google’s Search rankings provides a better understanding of the changing role of SEO going forward. The challenge for Google has been translating their search engine from a tool for finding academic library content into one for uncensored, unreviewed, real-world content on the internet, and making it relevant to businesses and commerce without overbearing central censorship. I’ll review their most recent changes here and try to put them within this context so we have a long-term strategy for developing best SEO practices from 2016 and beyond.
Localization Makes Everyone a Winner
There’s one big difference between searching for academic journal articles and searching for online businesses which I haven’t yet discussed. Many, if not most, businesses are local. That means they have a location (even if they distribute to a wide number of locations). For SEO, that means you don’t need to be in the top-ranking listings globally for a specific query. You just need to place at the top within your locality, against your closest local competitors.
SEO is like an auction in that it is a bidding war. Given a group of competing businesses, and given equal efficiency on their SEO expenditures, all else being equal, the one who spends the most is going to get the best positions in a Google search. If that were true also over a global distribution, no one except the massive multi-national brand empires would ever get good placement on a Google search. So one of the solutions Google worked very hard on (while others were working on social media), was localization of search queries. This resulted in a number of different, but later connected, Google technologies.
The biggest winner for Google in the localization space was Google Maps. Google Maps has become as ubiquitous as Google Search, if not more so. High-tech growth businesses like Uber rely on the GPS location data provided by Google Maps. Google Maps doesn’t only include people searching the map for a location; it also includes all the background API calls to Google Search from web and mobile apps which want to integrate location data about a user or content query. Many times you don’t even see these calls being made. Google Maps also allows users to add business and other location-specific information, media, and other content. This additional content is also available to apps through the API. Ads can also be used to target location-specific content.
Within Google Search results, the top ten local maps listings are often provided at the top of the Google Search listings (right under the top three AdWords paid advertisements). Businesses can make use of localization for SEO just by having their business listed in Google Maps. Most recently, Google has launched My Business as a central manager for both your business page(s) in Google+ and your Google Maps listing(s).
I believe we are just seeing the beginning of the uses and effects of localization. Localization could potentially be used for highly personalized and targeted marketing campaigns. An app, for example, could allow a business to tailor a store special based on a user’s buying patterns, then offer that special just as the user is walking or driving within a specified proximity of the business location. With the addition of tracked devices in the Internet of Things (such as pet tracking collars), the localization possibilities explode even further.
Here’s a list of my top localization strategies for 2016 and beyond:
- Get listed on Google My Business. By adding a location there, you will also be listed on Google Maps. If you already have a location listing on Google Maps, make sure that location shows up in My Business; otherwise you might end up creating duplicate listings, which are both bad for your SEO and really difficult to fix afterwards. Tip: make sure you use the same Google account email to log in to all your Google services. If you need more than one account to access these services, learn how to add a user account to the particular service.
- Make your Google Map listing stand out by adding images and business information. Don’t forget to add your website address! This helps the user and Google connect your business location with your website and better localize searches for your content.
- Get your web developer or SEO expert to add location structured-data tags to highlight the location information on your website. Most websites put their location information in their footer. That helps Google localize you for searches. But you can explicitly highlight that data for Google and your search listing by adding structured-data markup around that information. Google integrates structured data as defined by http://schema.org/.
- Using Google’s Webmaster Tools, your SEO expert can help you localize your content for common related search queries in your area. A sound SEO strategy is to optimize your content for queries in the shortest optimal radius to your location (roughly 5-10km). Once your average position and click through rate (CTR) is high in a narrow radius, that radius can be expanded with little effort. For SEO, narrowing your target first is a much more effective strategy than casting a broad net to see what catches.
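To illustrate the structured-data point above, here is a sketch that generates a schema.org LocalBusiness snippet in JSON-LD, one of the markup formats Google accepts. It reuses the hypothetical Grunge Plumbing example from earlier; the address and phone number are invented placeholders.

```python
import json

# Hypothetical business details for illustration only.
business = {
    "@context": "http://schema.org",
    "@type": "LocalBusiness",
    "name": "Grunge Plumbing",
    "url": "http://www.grungeplumbing.com",
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example Street",
        "addressLocality": "Toronto",
        "addressRegion": "ON",
        "postalCode": "M5V 0A0",
    },
}

# Wrap the JSON-LD in the script tag that goes into the page's HTML.
snippet = ('<script type="application/ld+json">\n'
           + json.dumps(business, indent=2)
           + "\n</script>")
print(snippet)
```

A snippet like this can sit anywhere in the page, so your footer address stays human-readable while Google gets an unambiguous, machine-readable location.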
The Rise of Social Media Marketing
Many technology and communications experts believe that Google completely missed predicting the rise of social media giants. Some think it was because Google is a geeky academic computer science company, not a cool hip company like Facebook. Others believe that their focus has always been on data, not on social interaction, so social media entrance wasn’t a good fit for them. I believe that they probably predicted the size of the data available in a social media space like Facebook, but didn’t see it as core to their business model, for whatever reason.
Officially, Google has claimed that social media activity has no direct influence on Google Search. The keyword there is direct. Other SEO experts like those at SearchMetrics and KissMetrics have both reported that their tests show there is some influence between social media activity and page rankings. This may be indirect based on other Google factors. For example, Google might have shifted their backlinks ranking to include backlinks from social media posts. This would make sense since social media references to a page are a much more natural representation of word-of-mouth recommendations. Plus, social media is already semi-censored and peer reviewed.
Of all the social media platforms, even though Facebook is by far the most used, Google+ clearly provides the most SEO gain. Again, these effects are likely indirect. It’s not so much that Google is explicitly trying to promote Google+ through Google Search rankings. Rather, Google’s integration of their services provides tools to help you rank higher. For example, Google+ Pages provide a means for user feedback, ratings, and reviews. If you have connected the services correctly in My Business, these Google+ Page ratings and reviews will show up both in your Google Maps and Google Search listings. Not only will they show up, but they will factor into the position of the listing over those which haven’t been reviewed. Following my analogy to peer-reviewed journal articles, these ratings and reviews are a strong and common replacement for academic peer review in the non-academic business and commercial world. Consumers, especially Millennials, rarely make a purchase online or step into a new store without first consulting reviews. Google is merely utilizing and patterning this behavior within its search results while phasing out the less personalized method of backlinks and PageRank. The integration and use of Google’s services to demonstrate the quality of your products or services is a highly effective SEO strategy for 2016 and going forward.
By sheer volume and engagement of users, Facebook is probably the most important social media platform for marketing. For SEO, it is a close second to Google+ according to recent tests. To understand the SEO benefits of Facebook activity, first understand that Facebook links back to your website count as backlinks. They might also count as quality backlinks, since Facebook content has some censorship and peer review characteristics. While anyone can post nearly anything, if your peers don’t like it, they aren’t going to repost it. And Likes of content increase the number of feeds the post appears in. So the increased number of impressions in feeds really does represent a sort of general consensus about the quality of that post.

You might disagree if you’ve seen the kinds of posts which get massive likes and reposts! But that is purely personal opinion. “Objective” quality in social media is equivalent to popularity. Peers need not be highly educated for their reviews to count, as they must be for academic journals! By embracing this difference, we can understand and use popularity as socially authoritative of quality content on the internet. This is a tough one for Gen Xers and older to understand, since culturally we tend to come from a pre-populist, pre-internet, pro-education culture in which much of what is popular on the internet appears, by our standards, to be “objectively” garbage. If this is your problem too, the trick to success in the social media space, I think, is lowering your standards, dumbing down content, and appealing to the lowest common denominator instead of to highly educated specialists. Of course, this totally depends on the sort of content you are trying to promote.
I’m not a social media expert by professional standards. I don’t even enjoy most social media activity for personal use. But I understand it enough to recognize its cross-over implications for SEO and digital marketing. The biggest advantage of social media is that it changes the way we access information. Instead of actively having to search for everything we want, social media provides passive “feeds” of information intended to be relevant to our interests and behavior. Feeds give you the opportunity to actively present yourself to potential audiences rather than passively wait for them to find you. Social media puts the onus on the content provider to be seen and heard rather than on the consumer to find what they are looking for.
Hiring a social marketing specialist, or using a professional social media manager like Hubspot, can be expensive and often out of the reach of small to medium sized businesses. Even using a free social media management service like Hootsuite can be overwhelming and time-consuming for many. The trick is understanding what you can do effectively in-house without expending excessive time and resources. If you don’t think you will be investing heavily in social media for your digital marketing, I would at least consider investing in it enough to help your SEO.
A quick and easy example of a social media workflow for SEO that I recommend goes something as follows:
- Publish blog posts on your website regularly (how regularly is up to your available time and content). A blog can be considered the starting point and manager for your outgoing social media posts. As with press releases and newsletters, a blog post can highlight an aspect of your business. It’s a really good idea to provide links from a post both to other related pages on your website and to your contact page. It’s also a good idea to have a subscription call to action which can collect interested viewers’ contact information and notify them when you release new blog posts.
- Auto-publish your post to your social media pages. If you are using a CMS, there are plugins to help auto-publish your post to your social media pages. The plugin should format your post nicely for the social feed using Facebook’s Open Graph protocol (this protocol is widely accepted beyond Facebook for formatting page data nicely within a social post). The plugin will also provide a “read more” link back to your post on your website. That little “read more” link is the first step to creating the desired backlink and social authority for your post page.
- Use social sharing on the post to facilitate re-posts and likes. Again, if you’re using a CMS, there are plugins to assist social sharing. Social sharing provides the means to increase the reach of your post. Increasing the reach of your post is the second step to building the social authority and backlinks to your post page.
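As a sketch of what an auto-publishing plugin emits, here are the basic Open Graph meta tags for a post, built in Python. The property names (og:title, og:type, og:url, og:description) come from the Open Graph protocol; the post values here are placeholders, not real URLs.

```python
# Hypothetical post metadata; a CMS plugin would pull these from the post itself.
post = {
    "og:title": "Is it time to upgrade your website for mobile traffic?",
    "og:type": "article",
    "og:url": "http://www.example.com/blog/mobile-upgrade",
    "og:description": "Why Google's mobile friendly test matters for SEO.",
}

# Render one <meta> tag per Open Graph property for the page's <head>.
tags = "\n".join(
    f'<meta property="{prop}" content="{content}" />'
    for prop, content in post.items()
)
print(tags)
```

With tags like these in place, Facebook (and other platforms that read Open Graph) can format the shared post with the correct title, link, and description, which is what makes the “read more” backlink land on the right page.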
You might additionally use social media marketing tools to boost the reach of your post. Facebook has a “boost” feature available on feed content. But even if you don’t spend advertising money on social media, if you follow the steps above regularly, using your blog you can easily and quickly increase the quality traffic to your website and, indirectly, boost your page position in Google Search.
Mobile Devices, Multiple Devices, and the Internet of Things
For about two decades, internet access was basically restricted to desktop computers (there are a few technical exceptions I won’t go into here). Only relatively recently has computational power gotten cheap and small enough to provide internet access on other devices such as mobile phones and tablets. Mobile devices, however, have had a huge impact on the way we use the internet.
The internet is not just something which is searched in a browser. Apps can access it, reprocess it, and represent it in niche specific ways. Users can browse or use apps across multiple devices. And, increasingly, apps will be connected to devices with very specific functions and no interface themselves (recall my GPS pet tracker collar example). The proliferation of devices and uses of the internet provided Google some major challenges to indexing and ranking content since we’re not just talking about website pages anymore. (I was just recently browsing the Samsung website. While they put smart technology into all their electronics and appliances now, they have also created a category for SmartThings which contains, for now, Smart Home sensors, hubs, and outlets which can all be controlled with your mobile phone.)
For some time, while Google worked on the challenges of tracking the same user across multiple devices given content variations on different devices, Google Search kept mobile and desktop search traffic separate. That meant that you could rank high on one but not the other. The same search on one could have radically different results than a search on the other. That was until the highly publicized update on April 21, 2015.
What you probably heard about that update was that you had better get your website mobile friendly or it would be penalized and drop in rankings. I provided our clients at Allegra with a similar notification in my post, Is it time to upgrade your website for mobile traffic?. What Google promised to do was (a) test for mobile friendliness, (b) penalize pages which did not pass their test, and (c) merge mobile and desktop search into a single index.
Since Google incorporated their Mobile Friendly Test into their page ranking system, I’ve seen an average session drop of 20% on client websites which have not been updated for mobile traffic. This drop has come entirely from organic Google Search traffic. This update is real, and it has a major effect on traffic to websites which do not pass Google’s mobile friendly test.
That your client base does not use mobile phones to browse your website is no longer a good excuse to avoid updating it. Failing to be mobile friendly and responsive makes it harder for desktop clients to find you in Google Search as well. And if you’re like most websites, users from Google Search probably account for about 50% of your new traffic.
Content Killed the Keyword
Over the last year, since the latest big Google update, I began noticing a small drop in traffic on some client websites even though we had made them ready for the mobile friendly test with completely responsive design and high scores on mobile-related PageSpeed Insights. This puzzled me. I retested the websites and still found no mobile-related issues. So I began digging and doing some research.
It turns out that behind the big fan-fare of mobile readiness was another really important update to the way Google reads content. I’ve already mentioned that Google stopped reading keywords in the keywords meta-tag. Since then, all SEO advice was to embed keywords in your content, particularly in the significant markup like titles (and even bolded tags). We all measured keyword density, the number of times a word or word phrase is used throughout the page. And our strategy was to make the keyword density accurately reflect both the intended page content and also the search queries being used to find that content. Well, that’s all changed again!
I’m telling you today, it’s confirmed. Google has dropped keywords entirely. That includes keyword density. Matching keywords to search phrases is dead.
Google had been cautioning against “keyword stuffing” for a long time. For SEO, this meant not making our keywords too dense (i.e., not too many repetitions). But now it appears that any keyword density is considered “keyword stuffing”. Without warning, and hidden behind their big update announcements, was a major change to the way Google reads page content.
So what has Google replaced keywords with?
That’s a good question, and it’s something we’re still investigating. The short answer is “content”. But that’s not very helpful is it? It’s not even entirely accurate. The almost as short answer is “quality content” with the emphasis on quality. Apparently, there have been a number of subsequent minor updates to refine or correct their quality assessment. From what we can tell so far, quality content includes:
- Uniqueness. The uniqueness of the content includes having a unique angle on some topic which sets it apart from similar competitor pages. It also means using synonyms instead of repetitions. You can repeat the same idea as much as you like, as long as you do it using different words and present it within a unique context. Generic phrases, especially meaningless or context-dependent ones, will not help and might hinder your content’s uniqueness score.
- Readability. Some SEO experts believe that Google is using the Flesch-Kincaid readability tests to measure readability. These tests score the reading difficulty of the writing and assign it a grade level, based on a combination of the average sentence length (in words) and the average word length (in syllables). The ideal writing level, apparently, is about Grade 9 (although this might differ across categories of content). For most people, this just means writing naturally, as one speaks, avoiding technical jargon and big words as much as possible. Spelling and grammar might also be factored into readability, but I haven’t seen this confirmed yet.
- Content length. There doesn’t appear to be any minimum or maximum length for content. The ideal length seems to depend on the topic and how much unique content you can contribute to it. A contact page, for example, doesn’t need loads of extra information weakening the point of that page, which is to get contact information or submit a form. A homepage that has only a slider, menu, and footer, however, might be too thin. I’ve done several tests with those, changing them to more of a “one page” scrolling style with several sections describing the main product or service categories. All of my tests have shown an improvement in organic Google Search traffic, as well as engagement with the content on the website (fewer bounces, more pageviews, and higher conversion rates). The idea is to have just enough content on the page to fully describe what the page is about, and no more (although I haven’t seen any pages penalized yet for too much content). The main problem on most websites is that the content is too short.
- Content organization. This aspect hasn’t changed. A well-written page will be organized by topic using heading tags. Paragraphs will flow from an introductory paragraph to a closing one. Asides, advertisements, forms, and anything not significant to the topic should be labeled as such so the Googlebot does not read it as part of the content.
- Remove everything keywordy. Especially keywords hidden behind things: in tags, in images, or anywhere not visible to the user. But this also means re-reading your page content to look for anything that might pop out and get flagged as a keyword. This sounds counter-intuitive to those of us who have been doing SEO for a long time, since keywords have been a central aspect of our SEO strategies.
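To make the readability point above concrete: nobody outside Google knows whether Flesch-Kincaid is actually in use, but the test itself is a public formula. Here is a rough sketch, using a naive vowel-group heuristic for syllable counting (my own simplification; real readability tools use dictionaries and are more accurate):

```python
import re

def syllables(word):
    # Naive heuristic: count groups of vowels, with a small silent-'e' fix.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1 and not word.endswith(("le", "ee")):
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    # Grade = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    total_syllables = sum(syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * total_syllables / len(words)
            - 15.59)

text = "The quick brown fox jumps over the lazy dog."
print(round(flesch_kincaid_grade(text), 1))  # → 2.3
```

Short sentences made of short words score at a low grade level; long, jargon-heavy sentences push the grade up, which is exactly what the “write naturally” advice is trying to avoid.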
Google’s general move, I think, is that content should be user driven, not bot driven. Every SEO exploit to date has used an aspect of Google’s assessment that is disconnected from actual visitor use. The more human Google makes its bot, the harder it will be to exploit and abuse. SEO should be about meeting human usability conditions, not about satisfying or hacking search engine criteria.
So how does Google understand the content without using keywords?
The short answer is that we don’t entirely know. I’ve given some suggestions above for some tested tactics for meeting Google’s content requirements. But that doesn’t tell us anything about the actual engine or algorithm driving Google’s content scoring. For that, I can merely offer some educated speculation.
The key, I believe, is to understand where Google has been spending its research money. One of its huge recent acquisitions is DeepMind, a British artificial intelligence company specializing in machine learning. In just the last year, DeepMind’s AlphaGo program was able to defeat a professional Go player, a feat that was thought to be many years away for AI. Google is not the only large company investing in AI; others include Microsoft, IBM, and Facebook. While it’s highly unlikely that DeepMind is being used in Google Search yet, it’s very conceivable that sometime in the near future it will be. What’s presently likely is that the Googlebot is using very complex semantic processing, possibly with some machine learning components to assist how it understands content.
In a recent post on Google’s Official Blog, the new CEO of Google, Sundar Pichai, writes:

> Looking to the future, the next big step will be for the very concept of the “device” to fade away. Over time, the computer itself—whatever its form factor—will be an intelligent assistant helping you through your day. We will move from mobile first to an AI first world.

(From this year’s Founders’ Letter.)
What that means for SEO is that it will get increasingly difficult to “beat” the Googlebot with exploits to get to the top of Google Search. More than ever, we need to focus our SEO efforts on providing authentic quality content and genuine usability for relevant human visitors. We measure these in Google Analytics by engagement metrics like bounce rate, pageviews per session, and goal conversions. We test these with A/B Split Testing methods using Google Content Experiments. By improving the performance of our websites for human traffic, we help teach the Googlebot to interact with our websites more like a human. Google ultimately wants to score our webpages as real humans would, and they have access to all the metrics by which to measure their success. And so do we.
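As a toy illustration of those engagement metrics, here is how bounce rate and pageviews per session fall out of raw session data. The sessions below are hypothetical, and Google Analytics computes these for you; this just shows what the numbers mean:

```python
# Each session is the list of pages viewed; a "bounce" is a
# single-pageview session. Hypothetical sample data for illustration.
sessions = [
    ["/home"],                           # bounce
    ["/home", "/services", "/contact"],  # engaged visit
    ["/blog/seo-update"],                # bounce
    ["/home", "/about"],
]

bounces = sum(1 for s in sessions if len(s) == 1)
bounce_rate = 100.0 * bounces / len(sessions)
pages_per_session = sum(len(s) for s in sessions) / len(sessions)

print(f"Bounce rate: {bounce_rate:.0f}%")             # → Bounce rate: 50%
print(f"Pages per session: {pages_per_session:.2f}")  # → Pages per session: 1.75
```

If a content change pushes bounce rate down and pages per session up in an A/B test, that is the human signal we want the Googlebot to learn from.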