Rich summaries in site search

Use external web search engines' recommendations for rich snippets to increase clickthroughs and on-site search success.

In recent years, major whole-of-web search engines have been promoting, and encouraging users to promote, content that enables rich snippets to be produced on search results pages.

The benefits of this approach seem obvious: searchers get a stronger information scent, answers can be generated directly on search results pages, and greater trust is given to both the search engine and the content author.

Unfortunately, whole-of-web search engines never guarantee that the effort you put into designing, creating and publishing the content needed for rich snippets will be taken into account when they create summaries, or even that your results will appear at all. Thankfully, your site search can be directed to take advantage of this structure once it exists, increasing click-through rates and task completion for your users.

What needs enriching?

Content types that typically benefit from enriched summaries might include:

  • Products
  • People
  • Places
  • Events
  • Videos

For a government body, these might include public figures, elected representatives, taxpayer-funded services, town hall consultations, etc. For a university, these might be teaching staff, teaching room locations, public lectures, courses, etc. Other industry verticals will generally have their own equivalents - the concept of 'products' doesn't need to be limited to ecommerce-style operations. The general principle to consider here is:

Are there items of content included in the scope of my search that deserve special visual treatment, ranking boosts, or can be used to summarise a direct answer to a search query?

Examples of rich summaries

Let's take a look at a few examples:

Person result type

Rich summary - person result type
The enriched summary for a fictional university staff member shows a thumbnail image of the person's face, their honorific, first name, last name, role and faculty, along with their areas of interest. Additional functionality is provided via click to call/email links.

Video result type

Rich summary - video result type
Rich video result showing video thumbnail, overlaid duration, along with a video title, source, date, and clickable tags.

Product result type

Rich summary - product result
Rich product result showing product thumbnail, title, author, ISBN, category, description, format and size.

Review result type

Rich summary - movie result
Rich summary showing thumbnail from movie, title, summary, movie rating, audience sentiment and star ratings from movie critics.

Event result type

Rich summary - event result
Rich summary for an event showing event thumbnail, title, URL, description, performing group, location and start / end dates.

Using these as an aspirational end-state, we'll now examine what's required to achieve this via configuration of our CMS and search engine.

Part 1: How to enrich content

Mixing and matching content types as part of your publishing strategy is a task most modern content management systems should be able to support, albeit with some minor backend configuration. You may already have pre-existing templates for these content types, but you may not be expressing the structures in a form that search engines understand. Ideally, this configuration effort is performed early in the life of a CMS deployment - retro-fitting CMS templates may be feasible, but will often be slightly more painful.

The good news: this effort can be used by both external and internal search.

The simplest way to get started is to examine the templates used by your CMS to display pages - extending the <head> region to include these new fields is a good starting point. Let's start with the basics:

<!DOCTYPE html>
  <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# product: http://ogp.me/ns/product#">
  ...
    <!-- OpenGraph //-->
    <meta property="og:title" content="Bachelor of Basket Weaving - University of Funnelback" />
    <meta property="og:type" content="product" />
    <meta property="og:url" content="http://example.com/courses/b-basket-weaving" />
    <meta property="og:site_name" content="University of Funnelback" />
    <meta property="og:image" content="http://example.com/path/to/course/image.png" />
    <meta property="og:description" content="Learn how to make the most of those old, dry vines.  Assessment conducted by qualified basket weavers." />
    <meta property="fb:app_id" content="1234567890" />

    <!-- OpenGraph - Product //-->
    <meta property="product:category" content="Arts & Crafts" />
    <meta property="product:price" content="£40000" />

    <!-- Twitter Card //-->
    <meta name="twitter:title" content="Bachelor of Basket Weaving - University of Funnelback" />
    <meta name="twitter:card" content="product" />
    <meta name="twitter:site" content="http://example.com/" />
    <meta name="twitter:image" content="http://example.com/path/to/course/image.png" />
    <meta name="twitter:description" content="Who doesn't need a high quality basket? Guaranteed post-grad employment " />

    <!-- Twitter Card - product //-->
    <meta name="twitter:label1" content="Location" />
    <meta name="twitter:value1" content="Sydney, London, New York" />
    <meta name="twitter:label2" content="Faculty" />
    <meta name="twitter:value2" content="Arts & Crafts" />

    <!-- Organization-specific metadata //-->
    <meta name="course:duration" content="2 years" />
    <meta name="course:entryScore" content="750" />
    <meta name="course:campus" content="Sydney;London;New York" />
    <meta name="course:mode" content="Full-time;Part-time;Online" />
    <meta name="course:faculty" content="Arts & Crafts" />
    <meta name="course:fees" content="£40000" />

    <!-- Auto-generated metadata //-->
    <meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
    <meta name="dcterms:language" scheme="RFC4646" content="en" />
  ...
  </head>
  ...
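To check that a template really emits these fields, it can help to read them back the way a crawler would. The following is a minimal sketch using only Python's standard library; the class and function names are our own, and it makes no claim about how Funnelback or any external search engine parses pages.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> property/name/itemprop -> content pairs from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name") or attr.get("itemprop")
        if key and attr.get("content") is not None:
            # Keep every value: the same field may appear more than once.
            self.fields.setdefault(key, []).append(attr["content"])


def extract_meta(html):
    """Return {field_name: [values]} for all <meta> tags in the HTML."""
    collector = MetaTagCollector()
    collector.feed(html)
    return collector.fields


sample = """
<head>
  <meta property="og:type" content="product" />
  <meta property="product:category" content="Arts &amp; Crafts" />
  <meta name="course:campus" content="Sydney;London;New York" />
</head>
"""
fields = extract_meta(sample)
print(fields["course:campus"][0].split(";"))
```

Running this against a published page is a quick way to spot fields that are missing or empty before the search engine indexes them.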

Enriched content - markup examples

Using the basic template above, we can generate some more detailed markup examples that could be used to produce the rich snippets in the earlier screenshots.

University Course

The limitations of the generic schema.org structures become immediately apparent when attempting to describe a university course - in this example we'll need to add several organization- and type-specific fields: entry score; faculty; mode; duration and campus. External search engines will not be able to leverage this information, but as a user progresses further through their search for a course, it's appropriate to reveal these additional details in site search.

Arguably, a university's courses are being regarded as 'products', in order to shoehorn them into schema.org concepts. In this example, 'course:faculty' and 'course:fees' map to 'product:category' and 'product:price' respectively. 'Location' and 'Faculty' have been nominated as two of the most valuable fields to expose when showing this product as a Twitter card.

<!DOCTYPE html>
  <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# product: http://ogp.me/ns/product#">
  ...
    <!-- OpenGraph //-->
    <meta property="og:title" content="Bachelor of Basket Weaving - University of Funnelback" />
    <meta property="og:type" content="product" />
    <meta property="og:url" content="http://example.com/courses/b-basket-weaving" />
    <meta property="og:site_name" content="University of Funnelback" />
    <meta property="og:image" content="http://example.com/path/to/course/image.png" />
    <meta property="og:description" content="Learn how to make the most of those old, dry vines.  Assessment conducted by qualified basket weavers." />
    <meta property="fb:app_id" content="1234567890" />

    <!-- OpenGraph - Product //-->
    <meta property="product:category" content="Arts & Crafts" />
    <meta property="product:price" content="£40000" />

    <!-- Twitter Card //-->
    <meta name="twitter:title" content="Bachelor of Basket Weaving - University of Funnelback" />
    <meta name="twitter:card" content="product" />
    <meta name="twitter:site" content="http://example.com/" />
    <meta name="twitter:image" content="http://example.com/path/to/course/image.png" />
    <meta name="twitter:description" content="Who doesn't need a high quality basket? Guaranteed post-grad employment " />

    <!-- Twitter Card - product //-->
    <meta name="twitter:label1" content="Location" />
    <meta name="twitter:value1" content="Sydney, London, New York" />
    <meta name="twitter:label2" content="Faculty" />
    <meta name="twitter:value2" content="Arts & Crafts" />

    <!-- Organization-specific metadata //-->
    <meta name="course:duration" content="2 years" />
    <meta name="course:entryScore" content="750" />
    <meta name="course:campus" content="Sydney;London;New York" />
    <meta name="course:mode" content="Full-time;Part-time;Online" />
    <meta name="course:faculty" content="Arts & Crafts" />
    <meta name="course:fees" content="£40000" />

    <!-- Auto-generated metadata //-->
    <meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
    <meta name="dcterms:language" scheme="RFC4646" content="en" />
  ...
  </head>
  ...
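Because 'course:faculty' and 'course:fees' are repeated as 'product:category' and 'product:price', it is worth generating both from a single record so they cannot drift apart. Here is a rough sketch of that idea; the record layout and function names are hypothetical and not tied to any particular CMS.

```python
from html import escape


def meta_tag(attr, key, value):
    """Render one <meta> tag, escaping characters such as & and quotes."""
    return f'<meta {attr}="{escape(key)}" content="{escape(str(value))}" />'


def course_meta_tags(course):
    """Emit organization-specific tags plus their OpenGraph product equivalents."""
    tags = [
        meta_tag("property", "og:type", "product"),
        meta_tag("property", "product:category", course["faculty"]),
        meta_tag("property", "product:price", course["fees"]),
        meta_tag("name", "course:faculty", course["faculty"]),
        meta_tag("name", "course:fees", course["fees"]),
        meta_tag("name", "course:campus", ";".join(course["campus"])),
    ]
    return "\n".join(tags)


# Hypothetical course record, using the values from the example above.
course = {
    "faculty": "Arts & Crafts",
    "fees": "£40000",
    "campus": ["Sydney", "London", "New York"],
}
print(course_meta_tags(course))
```

In a real deployment this logic would live in the CMS template itself; the point is simply that one source value feeds both the generic and the organization-specific field.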

Staff member

Schema.org does a good job of describing some of the fields necessary for 'people' in general, but there may still be organization-specific values that are useful to display and refine by. We've added organization-specific fields for faculty, role, honorific, email and work phone number below:

<!DOCTYPE html>
  <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# profile: http://ogp.me/ns/profile#">
    ...
    <!-- OpenGraph //-->
    <meta property="og:title" content="Dr Solomon Otombe - University of Funnelback" />
    <meta property="og:type" content="profile" />
    <meta property="og:url" content="http://example.com/staff/solomon-otombe" />
    <meta property="og:site_name" content="University of Funnelback" />
    <meta property="og:image" content="http://example.com/path/to/person/image.png" />
    <meta property="og:description" content="Lecturer - Social Science Education, School of Education" />
    <meta property="og:category" content="Social Science Education,Teacher Education,Education Policy" />
    <meta property="og:updated_time" content="2014-08-02" />
    <meta property="fb:app_id" content="1234567890" />

    <!-- OpenGraph - Person //-->
    <meta property="profile:first_name" content="Solomon" />
    <meta property="profile:last_name" content="Otombe" />

    <!-- Twitter Card //-->
    <meta name="twitter:title" content="Dr Solomon Otombe - University of Funnelback" />
    <meta name="twitter:card" content="summary_large_image" />
    <meta name="twitter:site" content="http://example.com/" />
    <meta name="twitter:image" content="http://example.com/path/to/person/image.png" />
    <meta name="twitter:description" content="Lecturer - Social Science Education, School of Education" />

    <!-- Organization-specific metadata //-->
    <meta name="staff:faculty" content="School of Education" />
    <meta name="staff:role" content="Lecturer - Social Science Education" />
    <meta name="staff:honorific" content="Dr" />
    <meta name="staff:email" content="s.otombe@university.edu" />
    <meta name="staff:phoneWork" content="1 2345 6789" />

    <!-- Auto-generated metadata //-->
    <meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
    <meta name="dcterms:language" scheme="RFC4646" content="en" />
    ...
  </head>
  ...

Public consultation

If we regard a Public Consultation as an event - something that has a start and end date - it maps quite closely to the Event concept defined by schema.org:

<!DOCTYPE html>
    <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb#" itemscope itemtype="http://schema.org/Event">
        ...
        <!-- OpenGraph //-->
        <meta property="og:title" content="Public Consultation: Objective relevancy ranking - Department of Search" />
        <meta property="og:type" content="article" />
        <meta property="og:url" content="http://example.com/consultations/objective-relevancy-ranking" />
        <meta property="og:site_name" content="Department of Search" />
        <meta property="og:image" content="http://example.com/path/to/consultation/image.png" />
        <meta property="og:description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />
        <meta property="og:updated_time" content="2014-08-02" />
        <meta property="fb:app_id" content="1234567890" />

        <!-- Twitter Card //-->
        <meta name="twitter:title" content="Public Consultation: Objective relevancy ranking - Department of Search" />
        <meta name="twitter:card" content="summary_large_image" />
        <meta name="twitter:site" content="http://example.com/" />
        <meta name="twitter:image" content="http://example.com/path/to/consultation/image.png" />
        <meta name="twitter:description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />

        <!-- schema.org - event //-->
        <meta itemprop="name" content="Public Consultation: Objective relevancy ranking - Department of Search" />
        <meta itemprop="description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />
        <meta itemprop="image" content="http://example.com/path/to/consultation/image.png" />
        <meta itemprop="startDate" content="2015-09-01" />
        <meta itemprop="endDate" content="2015-09-10" />

        <!-- Auto-generated metadata //-->
        <meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
        <meta name="dcterms:language" scheme="RFC4646" content="en" />
        ...
    </head>
    ...
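With 'startDate' and 'endDate' available, a results page can show whether a consultation is still open. The following is a small sketch under the assumption that both values are plain ISO 8601 dates, as in the markup above; the function name and status labels are our own.

```python
from datetime import date


def event_status(start_date, end_date, today):
    """Classify an event as 'upcoming', 'open' or 'closed' relative to today."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if end < start:
        raise ValueError("endDate precedes startDate")
    if today < start:
        return "upcoming"
    if today > end:
        return "closed"
    return "open"


print(event_status("2015-09-01", "2015-09-10", date(2015, 9, 5)))
```

Rejecting an end date that precedes the start date catches a common data-entry error before it reaches the results page.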

Video lecture

Unless you're producing a television series or a big-budget movie, most videos you're publishing on your website will probably conform quite closely to the general 'video' rich result type:

<!DOCTYPE html>
    <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb#  video: http://ogp.me/ns/video#">
        ...
        <!-- OpenGraph //-->
        <meta property="og:title" content="Computer Science: Information Retrieval 101 - University of Funnelback" />
        <meta property="og:type" content="video.other" />
        <meta property="og:url" content="http://example.com/consultations/objective-relevancy-ranking" />
        <meta property="og:site_name" content="University of Funnelback" />
        <meta property="og:image" content="http://example.com/path/to/course/video/image.png" />
        <meta property="og:description" content="Dr. Tim Smith spends 2hrs introducing filtering, indexing and ranking concepts alongside real-world examples." />
        <meta property="og:updated_time" content="2014-08-02" />
        <meta property="fb:app_id" content="1234567890" />

        <!-- OpenGraph - Video //-->
        <meta property="video:duration" content="7200" />
        <meta property="video:tag" content="computer science, information retrieval, filtering, indexing, ranking, tim smith" />
        <meta property="video:writer" content="Dr. Tim Smith" />
        <meta property="video:release_date" content="2014-08-02" />
        <meta property="video:secure_url" content="https://example.com/path/to/video/video_id.mp4" />
        <meta property="video:type" scheme="RFC4337" content="video/mp4" />
        <meta property="video:width" content="1024" />
        <meta property="video:height" content="768" />

        <!-- Twitter Card //-->
        <meta name="twitter:title" content="Computer Science: Information Retrieval 101 - University of Funnelback" />
        <meta name="twitter:card" content="player" />
        <meta name="twitter:site" content="http://example.com/" />
        <meta name="twitter:image" content="http://example.com/path/to/course/video/image.png" />
        <meta name="twitter:description" content="Dr. Tim Smith spends 2hrs introducing filtering, indexing and ranking concepts alongside real-world examples" />

        <!-- Twitter Card - Player //-->
        <meta name="twitter:player" content="https://example.com/path/to/video/player/video_id" />
        <meta name="twitter:player:height" content="768" />
        <meta name="twitter:player:width" content="1024" />
        <meta name="twitter:player:stream" content="https://example.com/path/to/video/video_id.mp4" />
        <meta name="twitter:player:stream:content_type" scheme="RFC4337" content="video/mp4" />

        <!-- Auto-generated metadata //-->
        <meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
        <meta name="dcterms:language" scheme="RFC4646" content="en" />
        ...
    </head>
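The 'video:duration' value is expressed in seconds and 'video:tag' is a comma-separated string, so both need a little shaping before they can appear as an overlaid duration and clickable tags. A minimal sketch of that shaping follows; the helper names are hypothetical.

```python
def format_duration(seconds):
    """Turn a duration in seconds into H:MM:SS, or M:SS when under an hour."""
    seconds = int(seconds)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def split_tags(tag_string):
    """Split a comma-separated tag string into clean, non-empty tags."""
    return [tag.strip() for tag in tag_string.split(",") if tag.strip()]


print(format_duration("7200"))
print(split_tags("computer science, information retrieval, filtering"))
```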

Part 2: Funnelback configuration

Metadata mapping

We can use an organization's metadata schema together with other industry standards for site search - this is particularly useful if you've created separate versions for a page's description or title for redisplay on Twitter and Facebook.

Funnelback v14.2 and above supports the use of long-form metadata field names. The date field is the exception below, being mapped to 'd':

d,0,dcterms:published
dateUpdated,0,og:updated_time
duration,0,video:duration

# Common concepts, different sources
description,1,og:description
description,1,twitter:description
imageUrl,0,og:image
imageUrl,0,twitter:image
title,1,og:title
title,1,twitter:title
author,1,video:writer
author,1,article:author
type,0,og:type
type,0,twitter:card
tags,1,article:tag
tags,1,video:tag

# People
firstName,1,profile:first_name
lastName,1,profile:last_name
faculty,1,staff:faculty
staffRole,1,staff:role
staffHonorific,1,staff:honorific
staffEmail,1,staff:email
staffPhonework,1,staff:phoneWork

# Events
startDate,0,startDate
endDate,0,endDate
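To see how several source fields collapse into one common concept, the mapping lines above can be read as simple comma-separated triples. This is only an illustrative sketch of that idea, not Funnelback's own implementation; the function names are ours, and the middle column is parsed but otherwise ignored here.

```python
def parse_mappings(text):
    """Return {source_field: mapped_name} from 'name,type,source' lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        mapped_name, _type, source = line.split(",", 2)
        mapping[source] = mapped_name
    return mapping


def apply_mappings(mapping, page_fields):
    """Group a page's raw meta values under their mapped names."""
    grouped = {}
    for source, value in page_fields.items():
        mapped_name = mapping.get(source)
        if mapped_name:
            grouped.setdefault(mapped_name, []).append(value)
    return grouped


config = """
# Common concepts, different sources
description,1,og:description
description,1,twitter:description
imageUrl,0,og:image
"""
mapping = parse_mappings(config)
page = {"og:description": "A", "twitter:description": "B", "og:image": "img.png"}
print(apply_mappings(mapping, page))
```

Both descriptions end up under the single 'description' name, which is what lets one search field draw on the OpenGraph and Twitter variants alike.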

Ranking adjustments

You may decide that matches on 'author' or 'tags' are more valuable than matches on title or description. You can experiment with these settings by weighting those fields for each search query, appending parameters such as these to your search UI URL:

/s/search.html?...&wmeta_author=1.0&wmeta_tags=1.0&wmeta_title=0.2
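When trying out several weightings, building the URL programmatically avoids hand-editing query strings. A short sketch follows, assuming the 'wmeta_' parameter naming shown above; the function name and the sample query are hypothetical.

```python
from urllib.parse import urlencode


def weighted_search_url(collection, query, weights):
    """Build a search URL with one wmeta_<field> parameter per weighted field."""
    params = {"collection": collection, "query": query}
    for field, weight in weights.items():
        params[f"wmeta_{field}"] = weight
    return "/s/search.html?" + urlencode(params)


url = weighted_search_url("COLLECTION", "basket weaving", {"tags": 1.0, "title": 0.2})
print(url)
```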

Front-end output

Alter your query processing options to return these fields and expand summaries to include both metadata and snippets:

#collection.cfg
query_processor_options=-SM=both -SF=[type,duration,title,author,imageUrl,startDate,endDate,tags]

These metadata fields will now be available in your data model for display, clicking, tracking, and other secondary actions.

/s/search.json?PARAMS
response.resultPacket.results[...] {
  "rank": 1,
  "score": 1000,
  "title": "Page Title",
  "liveUrl": "http://example.com/path/to/page",
  "metaData": {
    "type": "...",
    "author": "...",
    "imageUrl": "...",
    "tags": "...",
    ...
  }
}
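Once these fields are in the data model, the front end can pick a display treatment per result. The sketch below works on a plain dictionary shaped like the result above; the template names are hypothetical, it assumes 'type' holds a single value, and person fields such as 'firstName' would need adding to the returned field list before they appear.

```python
def choose_template(result):
    """Pick a display template name from the metadata a result carries."""
    meta = result.get("metaData", {})
    result_type = meta.get("type", "")
    if result_type.startswith("video"):
        return "video"
    if meta.get("startDate") and meta.get("endDate"):
        return "event"
    if meta.get("firstName") or meta.get("lastName"):
        return "person"
    if result_type == "product":
        return "product"
    # Fall back to the standard title-and-snippet display.
    return "plain"


result = {
    "title": "Information Retrieval 101",
    "liveUrl": "http://example.com/path/to/page",
    "metaData": {"type": "video.other", "duration": "7200"},
}
print(choose_template(result))
```

Falling back to a plain display means pages without enriched markup still render sensibly alongside the rich results.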

Part 3: Tracking impacts

Tracking impacts of rich displays

Simple A/B testing using your existing web analytics package is probably the easiest way to measure the cosmetic changes to rich results, assuming that the ranking behaviour is unchanged.

If using Funnelback's default search analytics, you may want to configure separate profiles, each with their own analytics and search forms, to examine the click-through rates for users exposed to rich snippets, as opposed to the 'plain vanilla' result rankings:

$SEARCH_HOME/conf/COLLECTION
/_default/simple.ftl
/rich-display/simple.ftl

Calls to search results pages could correspondingly be cycled between:

/s/search.html?collection=COLLECTION&query=QUERY

and

/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display
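One way to cycle users between the two pages is to assign each session to a profile deterministically, so a given user sees the same treatment throughout their visit. This is a sketch only: the session identifier and function names are assumptions, and your analytics package may already provide its own bucketing.

```python
import hashlib
from urllib.parse import urlencode

PROFILES = ("_default", "rich-display")


def assign_profile(session_id, profiles=PROFILES):
    """Map a session ID to one profile, stable across requests."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(profiles)
    return profiles[bucket]


def search_url(collection, query, session_id):
    """Build the results URL, adding the profile parameter when needed."""
    params = {"collection": collection, "query": query}
    profile = assign_profile(session_id)
    if profile != "_default":
        params["profile"] = profile
    return "/s/search.html?" + urlencode(params)


print(search_url("COLLECTION", "QUERY", "session-1234"))
```

Hashing the session ID rather than choosing at random keeps the assignment repeatable without storing any extra state.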

Tracking impact of up-weighting rich snippets

Experimenting with manually applied up-weighting is fraught with issues - up-weighting may benefit some users' search queries whilst penalising others. One method worth considering is the application of query-independent evidence: all other ranking elements being equal, certain site sections or types will receive a ranking boost.

Query-independent evidence is applied at index-time:

# qie.cfg
# Upweight videos, people, courses
1.0 example.com/courses
0.8 example.com/people
0.6 example.com/videos
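Before re-indexing, it can be useful to preview which weight a given URL would pick up from this file. The sketch below assumes simple longest-prefix matching on host and path, which is our assumption for illustration only; Funnelback's own matching rules may differ, and the function names are hypothetical.

```python
def parse_qie(text):
    """Return [(url_prefix, weight)] from 'weight url_prefix' lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        weight, prefix = line.split(None, 1)
        rules.append((prefix, float(weight)))
    return rules


def preview_weight(rules, url, default=None):
    """Return the weight of the longest matching prefix (assumed matching rule)."""
    host_and_path = url.split("://", 1)[-1]
    matches = [rule for rule in rules if host_and_path.startswith(rule[0])]
    if not matches:
        return default
    return max(matches, key=lambda rule: len(rule[0]))[1]


qie = """
# Upweight videos, people, courses
1.0 example.com/courses
0.8 example.com/people
0.6 example.com/videos
"""
rules = parse_qie(qie)
print(preview_weight(rules, "http://example.com/courses/b-basket-weaving"))
```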

This weight can also be ignored or applied at query time by modifying the query_processor_options, using a similar profile structure to our rich snippet display testing:

#collection.cfg
# Negate impact of query-independent evidence
query_processor_options=-cool.5=0.0

#/rich-display-ranking/padre_opts.cfg
# Maximise impact of query-independent evidence
query_processor_options=-cool.5=1.0

$SEARCH_HOME/conf/COLLECTION
/qie.cfg
/collection.cfg
/rich-display-ranking/padre_opts.cfg

Compare the impacts of applying query-independent evidence to these styled (or unstyled) rich snippet search results pages by tracking a proportion of search queries from:

/s/search.html?collection=COLLECTION&query=QUERY

(baseline styling, baseline weighting on rich content types)

and

/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display

(rich styling, normal weighting on rich content types)

and

/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display-ranking

(rich styling, upweighted rich content types)
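Once each variant has collected enough traffic, the comparison comes down to click-through rate per profile and its lift over the baseline. Here is a minimal sketch of that arithmetic; the counts are made-up illustrative numbers rather than measured results, and the function names are our own.

```python
def click_through_rate(clicks, searches):
    """Clicked searches divided by total searches (0.0 when there is no traffic)."""
    return clicks / searches if searches else 0.0


def compare_to_baseline(counts, baseline="_default"):
    """Return {profile: (ctr, relative_lift_vs_baseline)}."""
    base_ctr = click_through_rate(*counts[baseline])
    report = {}
    for profile, (clicks, searches) in counts.items():
        ctr = click_through_rate(clicks, searches)
        lift = (ctr - base_ctr) / base_ctr if base_ctr else 0.0
        report[profile] = (ctr, lift)
    return report


# Hypothetical (clicked searches, total searches) tallies, for illustration only.
counts = {
    "_default": (200, 1000),
    "rich-display": (250, 1000),
    "rich-display-ranking": (270, 1000),
}
report = compare_to_baseline(counts)
for profile, (ctr, lift) in report.items():
    print(f"{profile}: CTR {ctr:.1%}, lift {lift:+.1%}")
```

A difference in raw rates is only a starting point; with small samples it is worth applying a significance test before acting on it.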

