Google got its start when the web was new, and search was dominated by keyword spam, directories, and webrings. Very few websites from Google's early days still exist, and few of those that remain are still popular.
Google has survived by being nimble and working hard to understand the data it presents to its users. Google has changed and updated its algorithm many times to keep up with the times, and those changes now happen continuously.
One of the most significant changes was switching to an entity-based approach instead of the older keyword-based approach. Entities allowed Google to understand the data that it served instead of blindly dishing out matching strings of letters and numbers.
Before we get into entities, let's talk a little about their older and easier to understand cousins, the keywords.
Keywords are key search terms frequently used by humans to find content. "Car Tires" and "iPhone 6 battery" are good examples.
The first search engines tried to match people's keywords with text strings inside of content. To rank content, they counted the number of times a keyword appeared in the text. In theory, this should have matched people with content that was discussing their topic of choice.
This technique had two fatal flaws. First, talking about a topic doesn't imply quality. Second, because search engines permitted low-quality content and rewarded keyword frequency, the web became a soup of useless documents stuffed with keywords and their synonyms.
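To make the counting approach and its weakness concrete, here is a minimal Python sketch. It is not any real search engine's code. The function names (keyword_count, rank_by_keyword) and the two sample documents are made up for illustration. It ranks pages purely by how often the search phrase appears.

```python
import re


def keyword_count(text: str, keyword: str) -> int:
    """Count case-insensitive, whole-phrase occurrences of a keyword."""
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))


def rank_by_keyword(docs: dict, keyword: str) -> list:
    """Sort documents by raw keyword count, highest first."""
    counts = {name: keyword_count(text, keyword) for name, text in docs.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample pages, invented for this example.
docs = {
    "helpful_guide": "How to choose car tires for winter driving and check tread depth.",
    "spam_page": "Car tires car tires cheap car tires best car tires buy car tires.",
}

ranking = rank_by_keyword(docs, "car tires")
print(ranking)
```

The stuffed page comes out on top because the score only measures repetition, not usefulness.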
Before Google, web search was so bad that it was almost useless. In the late 90s, internet providers solved the problem by creating walled gardens filled with curated content.
Google improved search with its PageRank algorithm, which ranked sites according to their backlinks, but this also had problems. Just as creating useless pages filled with keywords was easy money, so was creating useless documents filled with backlinks.
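The idea of ranking by backlinks can be sketched with a simplified version of the published PageRank calculation. This is a toy illustration, not Google's production system. The damping value of 0.85 is the commonly cited default, and the page names and link structure are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: a small real site plus a link farm.
web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "link_farm_1": ["spam_target"],
    "link_farm_2": ["spam_target"],
    "link_farm_3": ["spam_target"],
    "spam_target": [],
}

ranks = pagerank(web)
for page, value in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {value:.3f}")
```

In this toy graph, spam_target collects rank from three pages that exist only to link to it, which is the backlink version of keyword stuffing.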
It was clear that something needed to be done.
Introduced in Google's Hummingbird update, entities were one of many improvements that led to where we are today with the internet.
Entities are singular objects with internal attributes and can be weighed against each other using machine learning.
Each real-world noun exists as a single Google entity. Each business, person, animal, plant, concept, idea, thought, or opinion exists once.
At first, this seems like a minor change. Simply shifting the data type from being a string to an object doesn't necessarily improve outcomes. But entities have been a game-changer for Google.
The real power of entities is that they can be linked together in Google's Knowledge Graph and have internal attributes. These attributes and links help Google AIs like RankBrain to understand the entity and its relationships to other entities.
Even though Google works hard to understand search intent with entities, keywords are still powerful SEO tools. Entities and keywords currently work together to signal relevance to a particular search query.
When a person searches for something, the actual search query starts as a long-tail keyword. When I type a long-tail keyword into the search bar, that's a string of letters that Google doesn't understand but can sort, match, and count.
Google has worked hard to begin to understand these keywords, and after many years of work, hit upon entities and machine learning.
Entities are conceptual boxes that contain keywords and attributes and have weighted relationships in Google's Knowledge Graph.
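As a rough mental model of that "conceptual box," here is a small Python sketch. The Entity class, its field names, and the weights are assumptions made up for illustration. Google has not published its internal data structures in this form.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A hypothetical 'conceptual box': keywords, attributes, weighted links."""
    name: str
    keywords: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)
    relations: dict = field(default_factory=dict)  # related entity name -> weight

    def relate(self, other: "Entity", weight: float) -> None:
        """Record a symmetric weighted relationship between two entities."""
        self.relations[other.name] = weight
        other.relations[self.name] = weight


tire = Entity(
    name="car tire",
    keywords={"car tires", "tyres", "winter tires"},
    attributes={"category": "auto part"},
)
car = Entity(name="car", keywords={"car", "automobile"})
horseshoe = Entity(name="horseshoe", keywords={"horseshoes"})

tire.relate(car, 0.9)         # strongly related (made-up weight)
tire.relate(horseshoe, 0.05)  # barely related (made-up weight)

closest = max(tire.relations, key=tire.relations.get)
print(closest)
```

Several keywords point at one box, and the box knows which other boxes it sits close to. That is the difference from a bare string.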
Because Google understands the entities, search results are more likely to be on-topic, answer the question fully, and make better suggestions for continued content consumption.
Entities give focus to keywords and make them even more powerful. That's a strange paradox, but let me explain.
Entities allow you to focus on making good content instead of counting the number of times a specific word appears in your text. They allow you to be creative and establish yourself as an expert.
The end goal of every website is to convert. We all want our site visitors to do something. Even Wikipedia has a purpose and a dream that it intends to accomplish.
Entities work with humans, and they convert like fire. Once you understand them, you'll love them.
You don't want a whole bunch of trash leads that gum up your system and cost you money. Remember that website visitors cost some money and that bounces and pogo-sticking penalize your search engine ranking.
You want leads that will convert fast and efficiently. Entities-based SEO is the method that gets you there.
My custom shoes website will appropriately target people who are looking for youth fashion. I don't want or need horseshoes, car brakes, or gambling.
One of the challenges with entities is that each one is a concept with a single instance in the real world. Many people are named Mary, so the single name concept "Mary" is linked to many different people, each of whom is a separate entity. Figuring out which concepts are singular has proven challenging.
Because of this challenge, Google engineers did a lot of the early work by taking information from well-organized and authoritative sources like Wikipedia. Focusing on Wikipedia, however, left out a lot of the information in the world.
The promise is that as machine learning improves, it will make improved conceptual leaps and handle increased complexity. Eventually, machine learning will build out the knowledge graph in a much more sophisticated way than current technology allows.
As entities have improved, keywords have become more powerful. However, now keywords need to span a topic rather than simply repeating as much as possible. Thorough coverage of a long-tail keyword is the idea.
Techniques like spoke and wheel (often called hub and spoke) have developed. The spoke and wheel SEO strategy links a primary piece of content to supporting pieces that target long-tail keywords. It thoroughly saturates a niche with content and makes the website a go-to resource for that niche. Your site gets bookmarked and followed rather than relying on the SERP.
Internal and external links need to saturate your niche. SEO auditors look for opportunities to build website footprints by thoroughly saturating a niche with content, linking that content internally, and becoming a community content resource that other niche sites turn to for respected citations.
Backlinks need to come from within your niche or from generalist websites with extremely high domain authority (DA), like news media.
A newspaper writing an article in your niche should be going to your website for authoritative citation. That's good news for your SEO.
On the other hand, a large percentage of your backlinks should be coming from websites in your niche. Google sees these as community votes for which websites have the best content.
This backlink strategy directly relates to entities because Google sees those websites as nodes and clusters in Google’s Knowledge Graph.
Rather than simply having some number of links, you want to be the central hub of a cluster of nodes and the primary source that high-DA websites link to when they cover your niche.
Search engines use entities to try and understand content and search intent and then match people with the content that best fits their needs.
That's also true for keywords, but the difference is that Google's AI has some understanding of what each entity means and what its place in the world is. In contrast, computers see keywords as meaningless collections of letters that can be sorted, matched, and counted.
Entities model the way the human brain works. One of Google's examples is Leonardo da Vinci. Leonardo da Vinci has the attributes of "Renaissance," "painter," and "polymath," and his entity links to great painters of the Renaissance and his namesake Leonardo DiCaprio.
Google uses this understanding to match people's search intent with search results.
In the case of Leonardo, I might ask a search engine, "Who was Leonardo in Italy?" Google considers that I used the past tense and then looks at the weighting of Leonardo and Italy. Leonardos that have a solid connection to Italy will come before Leonardos with weak ties.
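The Leonardo example can be sketched as a small scoring function. Everything here is hypothetical: the link weights, the "historical" flag standing in for the past-tense signal, and the 0.5 bonus are invented to show the idea, not taken from anything Google has published.

```python
def score(profile, context_terms, past_tense):
    """Sum link weights for the context terms, with a bonus for historical figures on past-tense queries."""
    total = sum(profile["links"].get(term, 0.0) for term in context_terms)
    if past_tense and profile["historical"]:
        total += 0.5  # arbitrary illustrative bonus
    return total


def rank_candidates(candidates, context_terms, past_tense):
    """Return candidate entity names, best match first."""
    return sorted(
        candidates,
        key=lambda name: score(candidates[name], context_terms, past_tense),
        reverse=True,
    )


# Made-up weights describing how strongly each Leonardo is tied to each concept.
candidates = {
    "Leonardo da Vinci": {"links": {"Italy": 0.95, "painter": 0.9}, "historical": True},
    "Leonardo DiCaprio": {"links": {"Italy": 0.2, "actor": 0.9}, "historical": False},
}

# "Who was Leonardo in Italy?" -> context term "Italy", past tense.
ranked = rank_candidates(candidates, ["Italy"], past_tense=True)
print(ranked)
```

With these weights, the Leonardo with the strong tie to Italy comes first and the one with the weak tie comes second.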
Entities are crucial for translation because they allow your global content to connect and let Google understand that it's all part of the same conversation.
Google uses entities to connect concepts in different languages. The concept of red is understood worldwide but has different keywords depending on the language.
"Roja" in Spanish would ordinarily have a weak keyword association with "red" in English because they rarely occur together except in language textbooks.
But, because "Roja" and "red" both have strong connections to the color concept, Google understands the connection that they share.
Attributes develop Google's understanding even more because "red" and "Roja" both have the attribute "color," and both share a similar range of the electromagnetic spectrum.
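Here is a minimal sketch of how keywords in different languages might resolve to one shared entity. The lookup table, the entity IDs like "color:red," and the helper names are all assumptions for illustration. The wavelength range is an approximate textbook figure.

```python
# Hypothetical mapping from (language, keyword) to a language-independent entity ID.
KEYWORD_TO_ENTITY = {
    ("en", "red"): "color:red",
    ("es", "roja"): "color:red",
    ("es", "rojo"): "color:red",
    ("en", "blue"): "color:blue",
}

# Attributes live on the entity, not on any single keyword.
ENTITY_ATTRIBUTES = {
    "color:red": {"type": "color", "approx_wavelength_nm": (620, 750)},
}


def entity_for(language, keyword):
    """Look up the entity ID for a keyword, or None if it is unknown."""
    return KEYWORD_TO_ENTITY.get((language, keyword.lower()))


def same_concept(a, b):
    """True when two (language, keyword) pairs resolve to the same known entity."""
    entity_a = entity_for(*a)
    entity_b = entity_for(*b)
    return entity_a is not None and entity_a == entity_b


print(same_concept(("en", "red"), ("es", "Roja")))
print(same_concept(("en", "blue"), ("es", "roja")))
```

The two strings never have to appear together in any document. They only have to point at the same box.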
Google isn't simply a search engine company. They're also a mapping company, self-driving car company, video sharing company, social media company, and SaaS company.
Because Google has its hands in many different technology areas, its uses of and interest in entities are complex.
The YouTube algorithm, for example, builds profiles of its users and content creators to try and match them up as well as possible.
Google published a paper in 2016 called "Deep Neural Networks for YouTube Recommendations." It describes creating a profile of a user's watch history, which is used to predict the videos that user will watch next.
Today, Google uses the viewer's history and preferences, local audience preferences, and the recent success of a particular video. These factors are weighted to get the video suggestions.
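To picture what "weighted factors" means, here is a deliberately simple Python sketch. It is not how YouTube computes recommendations. The three signal names mirror the factors mentioned above, while the weights, scores, and video names are invented.

```python
# Hypothetical weights for the three factors; the real weighting is not public.
WEIGHTS = {"viewer_history": 0.5, "local_audience": 0.2, "recent_success": 0.3}


def suggestion_score(signals):
    """Weighted sum of per-video signals, each assumed to be between 0 and 1."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)


# Made-up candidate videos for a viewer who mostly watches shoe content.
videos = {
    "sneaker_restoration": {"viewer_history": 0.9, "local_audience": 0.4, "recent_success": 0.3},
    "viral_cat_clip": {"viewer_history": 0.1, "local_audience": 0.6, "recent_success": 0.9},
}

suggestions = sorted(videos, key=lambda v: suggestion_score(videos[v]), reverse=True)
print(suggestions)
```

Changing the weights changes the order, which is why the weighting matters as much as the signals themselves.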
Entities are similar to objects in object-oriented programming but expanded to encompass concepts used in machine learning. Because of the broad definition, entities are everywhere at Google.
The Knowledge Graph is where Google got started with entities. The Knowledge Graph currently receives a lot of its information from Wikipedia, but it will probably encompass every concept in the universe in the future.
Currently, the Knowledge Graph is a fantastic resource for presenting contextual knowledge panels in Google Search. In the future, it will likely be beneficial for Google Maps, self-driving cars, and probably most things that Google does.
There is a lot of machine learning at Google that doesn't belong to the Knowledge Graph, though. That's OK because the same techniques that work with entities also work well with humans and other AI approaches.
Even though Google isn't always using entities in their product lines per se, the machine learning techniques and focus on understanding the content and searchers always apply.
Google wants to understand its customers and products. One way to look at Google's business model is that information delivery is Google's product, searchers are their customers, and ads are the price society pays for the service.
Google needs to understand the information in a profound and sophisticated way to remain the leader in the search engine market.
Google patents don't always tell us Google's plans for the future because Google doesn't ever use most of its patents. But we can see that Google is betting heavily on machine learning and AI.
Google's biggest use for entities will be understanding the world more deeply and then selling that knowledge.
YouTube doesn't seem to be using entities per se, and Google Search doesn't count them amongst the highest priorities in organic search.
And yet, an entity-based approach is still the best approach towards SEO for either YouTube or Search.
The reason is that entities and the knowledge graph are based on machine learning and closely track how humans and AI understand the world. Entities build networks of understanding and cluster your content so that it connects and yet spans a niche.
Entity-based approaches work well with humans, convert like fire, and they rank better than keyword-based strategies.
It's worth mentioning that since RankBrain changes the algorithm continuously, there isn't a single human on the planet who knows exactly what the algorithm is at a given moment.
Google processes an estimated 60,000 to 70,000 search queries per second. In the time that it takes humans to understand what the algorithm is doing, massive amounts of data have already been served.
SEO experts don't know all of the factors Google uses to build the algorithm, and nobody knows the exact weighting.
But we know what works. That's the key.
Entity-based keywords work. They build good thick data that answers relevant questions. Entity-based content becomes the pillar you use to support your entire content strategy because it performs well.
Based on community experience, SEO experts recommend an entity-based approach, and Google agrees with our general assessment.
Google detects entities in many different ways. One way is to tell Google that an entity exists. However, that's not the primary way that Google finds entities.
The primary way that Google detects entities is by using nightly Wikipedia dumps. Wikipedia pages are well organized and have good tags to identify information.
Entities are fundamental to how AI understands the world, and Google is betting big.
Artificial intelligence is developing stunningly quickly. Moore's law described the trend in which the number of transistors on a chip, and with it computational power, roughly doubled every two years. By comparison, an OpenAI analysis found that the compute used in the largest AI training runs doubled about every 3.4 months between 2012 and 2018.
As the AI's capability improves, new applications become possible. In other words, entities are becoming more significant monthly as Google's AI improves and develops. The ways AI will use entities are an important space to watch.
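To get a feel for how different those two doubling periods are, here is a quick back-of-the-envelope calculation in Python. It simply compounds each doubling period over the same two-year window. The function name and the 24-month window are choices made for this example.

```python
def growth_factor(months_elapsed: float, doubling_period_months: float) -> float:
    """How many times larger a quantity gets after months_elapsed at a fixed doubling period."""
    return 2 ** (months_elapsed / doubling_period_months)


MONTHS = 24  # compare both trends over the same two-year window

two_year_doubling = growth_factor(MONTHS, 24)  # doubles once
fast_doubling = growth_factor(MONTHS, 3.4)     # doubles about seven times

print(f"24-month doubling period: {two_year_doubling:.0f}x after two years")
print(f"3.4-month doubling period: {fast_doubling:.0f}x after two years")
```

Over two years, a two-year doubling period gives a 2x increase, while a 3.4-month doubling period gives an increase of roughly 130x.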
Even so, Google isn't perfect at detecting entities yet. Entities are deceptively tricky for AI because AI lacks some of the context clues we have as humans.
Google finds context to detect entities in several ways.
One way is Schema markup. If you're not using schema, you should start today. Schema's structured markup data allows you to mark data as an entity.
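As a starting point, here is a small Python sketch that builds a JSON-LD snippet using the schema.org Organization type. The business name and URLs are placeholders, and the helper function organization_jsonld is made up for this example. Check the schema.org documentation for the type and properties that actually fit your site.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build a JSON-LD <script> tag describing an organization as a schema.org entity."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,  # other pages that describe the same entity
    }
    payload = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values; replace with your own business details.
snippet = organization_jsonld(
    name="Example Custom Shoes",
    url="https://www.example.com",
    same_as=["https://www.example.com/social-profile"],
)
print(snippet)
```

The printed tag goes into the HTML of your page. The sameAs links tell search engines that your site and your other profiles describe the same entity.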
Another way Google develops entities is by using off-page factors. If Google has high confidence that your site is about movies and movie stars, Leonardo is probably a movie star. On the other hand, Leonardo is likely an artist if your site is about Renaissance artists.
Google's AI is also simply getting better at Natural Language Processing. Currently, Google’s Natural Language API is good enough to detect most people, places, and organizations but is weak at detecting objects and concepts.
As the Google NLP API develops, it will become a more significant factor in how Google detects entities.
Google Search is still where Google makes money and why you're reading blog posts about SEO. Since the Google Hummingbird update, entities have become increasingly important to search.
RankBrain, in particular, is one of the three top ranking factors for the SERP. RankBrain is an AI that makes continuous optimizations to the Google algorithm and then tests the optimizations to see if the results are better than before.
So far, yes, RankBrain's optimizations are better than Google engineers can do by hand. Don't expect this to go away.
The BERT update extended Google Search's use of Natural Language Processing. Language learning is challenging and goes through stages of complexity as humans gain experience. AI goes through the same stages, so we shouldn't expect Google to understand nuance and humor yet, but that day is coming.
Entities are a ranking factor, but SEO is very complex today, and it's about more than ranking factors. Today, SEO uses best practices in public relations and high-quality content to build notability.
Let me give an example. Ranking factors for doing well in class might include doing your homework, taking notes, and arriving at class on time. These are best practices.
A student might arrive late and never take notes but still get a good grade. That might be because another factor, such as prior knowledge, picked up the slack.
If you have a massive number of people following your social media accounts, you might be able to push traffic into your site without SEO, and then your site might get featured in news reports and end up with a lot of high-quality backlinks. A terrible website could win.
Best practices work for the majority of websites. Entities are a ranking factor, but their real power is that an entity-based approach encourages best practices all the way around.
Entities are a fascinating development, and they're going to grow in importance throughout the 2020s as AI develops and begins to mature. These are still very early days for the technology and approach.
It's easy to see exciting places where Google might go with entities, indexing individual real-world objects as entities and mapping them. For example, particular banknotes passing through various people's hands might show intriguing patterns. We don't know the full potential of the technology yet.
Right now, entities enable a much more refined version of keyword search, one that understands intent. In addition, thick content is rewarded over its thin-content competition. Entities are one of many technologies helping to create a world where good design and unique content are good SEO.
Remember that your SEO strategy should focus on good design and thick content. Keyword and backlink spam won't work anymore.
Current Search Engine Optimization is a combination of public relations campaigns, excellent design, and thick content.
If your site isn't getting the traffic it needs, talk to an SEO auditor. SEO auditors have the experience and knowledge to turn a decent website into a winner.