Google’s Semantic Search: What a Year Has Taught Us (Part 1 of 3)


Semantic Search: A Look Back

Semantic search arrived in its most concrete form with Google's Hummingbird update, released over a year ago in September 2013. Back then the digital marketing world was abuzz with chatter about how this would change the future of SEO. There seemed to be two sides to the argument: one held that SEO was dead, the other that this was great news for good and honest SEOs. If you were doing what you were supposed to be doing, building quality content and listening to your customers, then you were already doing what the new release required. Of course, knowing exactly what to focus on is important even for good and honest SEOs.

Hindsight is 20/20, and a year later we can take a clear look at what really did change and what we've learned in the post-Hummingbird days. Three truths came out of the semantic search changes:

  1. Knowledge Graph Continues to Grow in Importance and Is Fed by Structured Data
  2. We Now Know More About Buyers' Intent
  3. Link Your Social Media to Your Website

Here's more about them and advice on how all of us good and honest SEOs can make sure we're covering the basics.

Truth #1:

Knowledge Graph Continues to Grow in Importance and Is Fed by Structured Data

A year ago the Knowledge Graph, in the format it appears in today, was a new concept, and it is still in its infancy. It will continue to grow, and making sure your content is included will become more and more important. And if the Knowledge Graph doesn't persuade you to start making your content easy for search engines to decipher, think of the future of semantic search beyond your desktop: navigation, operating your home, and Google Now.

When considering your website, there are three types of data: structured, semi-structured, and unstructured. Structured data is what Google relies on to populate the Knowledge Graph. Where structured data isn't available, Google will turn to semi-structured and then unstructured. The following lays out the types of data and details about each. Use this as a guide for preparing your websites for Knowledge Graph.

Structured Data

Structured data is content on a website that has been marked up using one of the following four systems. Although Google accepts all of them, it recommends microdata because it has "found that microdata strikes a balance between the extensibility of RDFa and the simplicity of microformats..." (Google Support, "Schema.org FAQ").

1. Schema.org

Schema.org uses "the microdata markup format and a vocabulary that is shared by all the search engines and that supports a wide variety of item types and properties," (Google Support, "Schema.org FAQ"). Because Schema.org pairs Google's preferred format with an agreed-upon vocabulary, it is the best place to start when marking up your next website. For more about microdata, skip to #3 below.

2. Microformats

"In general, microformats use the class attribute in HTML tags (often <span> or <div>) to assign brief and descriptive names to entities and their properties," (Google Support, "About Microformats").

Example of How to Write Microformats from Google
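
As a rough illustration of the class-attribute approach quoted above, here is a minimal Python sketch. It is not Google's example: the person, the values, and the helper names (ClassTextCollector, parse_hcard) are made up for this article, and the markup simply borrows the hCard class names fn, org, and title.

```python
from html.parser import HTMLParser

# Hypothetical hCard-style snippet; the person and values are made up.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span> works at
  <span class="org">Example Agency</span> as a
  <span class="title">Content Strategist</span>.
</div>
"""


class ClassTextCollector(HTMLParser):
    """Collect the text of elements whose class is a known property name.

    Deliberately simple: it assumes property elements are not nested.
    """

    def __init__(self, property_names):
        super().__init__()
        self.property_names = set(property_names)
        self.current = None  # property currently being read, if any
        self.found = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.property_names:
                self.current = name
                self.found[name] = ""

    def handle_data(self, data):
        if self.current:
            self.found[self.current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the property being read.
        self.current = None


def parse_hcard(html):
    parser = ClassTextCollector(["fn", "org", "title"])
    parser.feed(html)
    return {name: text.strip() for name, text in parser.found.items()}


card = parse_hcard(HCARD_HTML)
print(card)
```

The point of the sketch is that the class names alone tell a machine which piece of text is the name, the organization, and the job title.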

3. Microdata

"Microdata uses simple attributes in HTML tags (often <span> or <div>) to assign brief and descriptive names to items and properties," (Google Support, "About Microdata"). To mark up your next site with microdata, head over to schema.org, which also supplies the shared vocabulary.

Microdata Example from Google
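
To make the itemscope, itemtype, and itemprop attributes more concrete, here is a minimal Python sketch. It is not Google's example: the person and values are invented, the MicrodataReader class is a hypothetical helper written for this article, and it handles only a single, flat item. The type and property names (Person, name, jobTitle, worksFor, url) come from the schema.org vocabulary.

```python
from html.parser import HTMLParser

# Hypothetical snippet marked up with microdata and schema.org terms.
MICRODATA_HTML = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>,
  <span itemprop="jobTitle">Content Strategist</span> at
  <span itemprop="worksFor">Example Agency</span>.
  <a itemprop="url" href="http://www.example.com/">Website</a>
</div>
"""


class MicrodataReader(HTMLParser):
    """Read one flat microdata item (no nested itemscope elements)."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self.current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item["type"] = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "a":
            # For links, the property value is the href, not the link text.
            self.item["properties"][prop] = attrs.get("href")
        else:
            self.current_prop = prop
            self.item["properties"][prop] = ""

    def handle_data(self, data):
        if self.current_prop:
            self.item["properties"][self.current_prop] += data.strip()

    def handle_endtag(self, tag):
        self.current_prop = None


reader = MicrodataReader()
reader.feed(MICRODATA_HTML)
item = reader.item
print(item)
```

Compared with the class-based approach of microformats, the item type here is a URL into a shared vocabulary, so any search engine that knows schema.org can tell what kind of thing is being described.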

4. RDFa

"In general, RDFa uses simple attributes in XHTML tags (often <span> or <div>) to assign brief and descriptive names to entities and properties." (Google Support, "About RDFa").

Example of RDFa from Google
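
For comparison, here is a minimal Python sketch of the same idea in RDFa. It is not Google's example: it assumes the RDFa Lite attributes vocab, typeof, and property with schema.org terms, the values are invented, and read_rdfa is a hypothetical helper that only works on small, well-formed XHTML fragments like this one.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment marked up with RDFa Lite attributes.
RDFA_XHTML = """
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>
  <span property="jobTitle">Content Strategist</span>
  <a property="url" href="http://www.example.com/">Website</a>
</div>
"""


def read_rdfa(xhtml):
    """Return the vocabulary, type, and properties of one RDFa-marked element."""
    root = ET.fromstring(xhtml.strip())
    item = {
        "vocab": root.get("vocab"),
        "typeof": root.get("typeof"),
        "properties": {},
    }
    for element in root.iter():
        prop = element.get("property")
        if prop:
            # Links contribute their href; other elements contribute their text.
            item["properties"][prop] = element.get("href") or (element.text or "").strip()
    return item


rdfa_item = read_rdfa(RDFA_XHTML)
print(rdfa_item)
```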

Semi-structured Data

Semi-structured data in SEO is content that carries markup, but markup that is not grouped in a structured manner that search engines recognize.

When could semi-structured data be useful? When no fixed structure exists that could constrain the data.

1. XML

For instance, an XML sitemap uses tags but doesn't use an entity-and-property hierarchy like fully structured data would. "...a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site," (Sitemaps.org).

XML Sitemap Example from sitemaps.org
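
To show what such a file contains, here is a minimal Python sketch that builds a small sitemap. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol; the URLs, dates, and the build_sitemap helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Serialize a list of page dicts into sitemap XML (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Only loc is required; the other fields are optional metadata.
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")


# Made-up pages for an example site.
pages = [
    {
        "loc": "http://www.example.com/",
        "lastmod": "2014-10-01",
        "changefreq": "monthly",
        "priority": 0.8,
    },
    {"loc": "http://www.example.com/blog/", "changefreq": "weekly"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Each url entry tells a crawler where a page lives and how it changes, but says nothing about what the page is about, which is why it sits in the semi-structured category here.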

2. JSON

"JSON or JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs," (Wikipedia). Notice that here, too, nothing ties the attribute names to a shared vocabulary of entities and properties.

Example of JSON from Wikipedia
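
A minimal Python sketch of those attribute-value pairs, using the standard json module. This is not the Wikipedia example: the record and its attribute names are invented for illustration.

```python
import json

# Made-up record; the attribute names are whatever the author chooses.
person = {
    "firstName": "Jane",
    "lastName": "Doe",
    "isAlive": True,
    "age": 30,
    "phoneNumbers": [
        {"type": "office", "number": "555-0100"},
    ],
}

# Serialize to human-readable JSON text, then read it back.
json_text = json.dumps(person, indent=2)
restored = json.loads(json_text)

print(json_text)
```

The text round-trips cleanly between systems, but a search engine has no agreed-upon way of knowing that firstName here means a person's given name.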

Unstructured Data

Unstructured data is the content on a website that has not been organized to fit neatly into a database or to be read easily by an algorithm. "Experts estimate that 80 to 90 percent of the data in any organization is unstructured," (Webopedia).

Any type of content can be unstructured, including images, videos, and text. If no markup has been added, the data is left for the search engine to try to decipher implicitly. "The technology utilized to obtain these entities is typically some sort of stochastic algorithm like NLP (Natural Language Processing) or a similar form of information retrieval technique," (Search Engine Land).

To optimize data that can't be structured, the best advice is to establish the topic through structured data where possible and then speak to that topic in the unstructured portion of the content.

Next Steps

Now that you are armed with a basic understanding of the importance of the Knowledge Graph and how to properly "feed" it (with as much structured data as possible), what should you do right now? Go here and download these awesome tools for decoding what entities are on websites, and start structuring everything in site (pun intended).

Stay Tuned for Part Two, Coming Soon!

Follow Steph McGuinn on Twitter, @HeartBuzzAgency.