In recent years, major whole-of-web search engines have been promoting, and encouraging users to promote, content that enables rich snippets to be produced on search results pages.
The benefits of this approach seem obvious: searchers get a stronger information scent, answers can be generated directly on search results pages, and greater trust is given to both the search engine and the content author.
Unfortunately, whole-of-web search engines never guarantee that the effort you put into designing, creating and publishing the content needed for rich snippets will be reflected in their summaries, or even that those results will appear at all. Thankfully, your site search can be directed to take advantage of this structure once it exists, increasing site search click-through rates and task completion by your users.
Content types that typically benefit from enriched summaries might include:
For a government body, these might include your public figures, elected representatives, taxpayer-funded services, town hall consultations, etc. For a university, these might be your teaching staff, teaching room locations, public lectures, courses, etc. Other industry verticals will generally have their own equivalents - the concept of 'products' doesn't necessarily need to be limited to ecommerce-style operations. The general principle to consider here is:
Are there items of content included in the scope of my search that deserve special visual treatment, ranking boosts, or can be used to summarise a direct answer to a search query?
Let's take a look at a few examples:
Using these as an aspirational end-state, we'll now examine what's required to achieve this via configuration of our CMS and search engine.
Mixing and matching content types as part of your publishing strategy is a task most modern content management systems should be able to support, albeit with some minor backend configuration. You may already have pre-existing templates for these content types, but you may not be expressing the structures in a form that search engines understand. Ideally, this configuration effort is performed early in the life of a CMS deployment - retro-fitting CMS templates may be feasible, but will often be slightly more painful.
The good news: this effort can be used by both external and internal search.
The simplest way to get started is to examine the templates used by your CMS to display pages - extending the <head> region to include these new fields is a good starting point. Let's start with the basics:
<!DOCTYPE html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# product: http://ogp.me/ns/product#">
...
<!-- OpenGraph //-->
<meta property="og:title" content="Bachelor of Basket Weaving - University of Funnelback" />
<meta property="og:type" content="product" />
<meta property="og:url" content="http://example.com/courses/b-basket-weaving" />
<meta property="og:site_name" content="University of Funnelback" />
<meta property="og:image" content="http://example.com/path/to/course/image.png" />
<meta property="og:description" content="Learn how to make the most of those old, dry vines. Assessment conducted by qualified basket weavers." />
<meta property="fb:app_id" content="1234567890" />
<!-- OpenGraph - Product //-->
<meta property="product:category" content="Arts & Crafts" />
<meta property="product:price" content="£40000" />
<!-- Twitter Card //-->
<meta name="twitter:title" content="Bachelor of Basket Weaving - University of Funnelback" />
<meta name="twitter:card" content="product" />
<meta name="twitter:site" content="http://example.com/" />
<meta name="twitter:image" content="http://example.com/path/to/course/image.png" />
<meta name="twitter:description" content="Who doesn't need a high quality basket? Guaranteed post-grad employment " />
<!-- Twitter Card - product //-->
<meta name="twitter:label1" content="Location" />
<meta name="twitter:value1" content="Sydney, London, New York" />
<meta name="twitter:label2" content="Faculty" />
<meta name="twitter:value2" content="Arts & Crafts" />
<!-- Organization-specific metadata //-->
<meta name="course:duration" content="2 years" />
<meta name="course:entryScore" content="750" />
<meta name="course:campus" content="Sydney;London;New York" />
<meta name="course:mode" content="Full-time;Part-time;Online" />
<meta name="course:faculty" content="Arts & Crafts" />
<meta name="course:fees" content="£40000" />
<!-- Auto-generated metadata //-->
<meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
<meta name="dcterms:language" scheme="RFC4646" content="en" />
...
</head>
...
Using the basic template above, we can produce more detailed markup examples of the kind needed to generate the rich snippets in the earlier screenshots.
The limitations of the generic schema.org structures become immediately apparent when attempting to describe a university course - in this example we'll need to add several organization- and type-specific fields: entry score, faculty, mode, duration and campus. External search engines will not be able to leverage this information, but as a user progresses further through their search for a course, it's appropriate to reveal these additional details in site search.
Arguably, a university's courses are being regarded as 'products', in order to shoehorn them into schema.org concepts. In this example, 'course:faculty' and 'course:fees' map to 'product:category' and 'product:price' respectively. 'Location' and 'Faculty' have been nominated as two of the most valuable fields to expose when showing this product as a Twitter card.
<!DOCTYPE html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# product: http://ogp.me/ns/product#">
...
<!-- OpenGraph //-->
<meta property="og:title" content="Bachelor of Basket Weaving - University of Funnelback" />
<meta property="og:type" content="product" />
<meta property="og:url" content="http://example.com/courses/b-basket-weaving" />
<meta property="og:site_name" content="University of Funnelback" />
<meta property="og:image" content="http://example.com/path/to/course/image.png" />
<meta property="og:description" content="Learn how to make the most of those old, dry vines. Assessment conducted by qualified basket weavers." />
<meta property="fb:app_id" content="1234567890" />
<!-- OpenGraph - Product //-->
<meta property="product:category" content="Arts & Crafts" />
<meta property="product:price" content="£40000" />
<!-- Twitter Card //-->
<meta name="twitter:title" content="Bachelor of Basket Weaving - University of Funnelback" />
<meta name="twitter:card" content="product" />
<meta name="twitter:site" content="http://example.com/" />
<meta name="twitter:image" content="http://example.com/path/to/course/image.png" />
<meta name="twitter:description" content="Who doesn't need a high quality basket? Guaranteed post-grad employment " />
<!-- Twitter Card - product //-->
<meta name="twitter:label1" content="Location" />
<meta name="twitter:value1" content="Sydney, London, New York" />
<meta name="twitter:label2" content="Faculty" />
<meta name="twitter:value2" content="Arts & Crafts" />
<!-- Organization-specific metadata //-->
<meta name="course:duration" content="2 years" />
<meta name="course:entryScore" content="750" />
<meta name="course:campus" content="Sydney;London;New York" />
<meta name="course:mode" content="Full-time;Part-time;Online" />
<meta name="course:faculty" content="Arts & Crafts" />
<meta name="course:fees" content="£40000" />
<!-- Auto-generated metadata //-->
<meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
<meta name="dcterms:language" scheme="RFC4646" content="en" />
...
</head>
...
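One way to check what a CMS template actually emits is to parse a rendered page and list its meta fields. The following is a minimal sketch using only Python's standard library; the `MetaTagCollector` class and `extract_meta` function are hypothetical names for illustration, and the semicolon split simply follows the delimiter used in the `course:campus` example above.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta> property/name/itemprop -> content pairs from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name") or attr.get("itemprop")
        if key and attr.get("content") is not None:
            # Keep every value, since a field may legitimately repeat.
            self.fields.setdefault(key, []).append(attr["content"])


def extract_meta(html):
    """Return {field name: [values]} for every <meta> tag in the markup."""
    collector = MetaTagCollector()
    collector.feed(html)
    collector.close()
    return collector.fields


sample = """
<head>
<meta property="og:type" content="product" />
<meta property="product:category" content="Arts &amp; Crafts" />
<meta name="course:campus" content="Sydney;London;New York" />
</head>
"""

fields = extract_meta(sample)
# Multi-valued organization fields use ';' as a delimiter in the example markup.
campuses = fields["course:campus"][0].split(";")
print(fields["og:type"], campuses)
```

Running a check like this against a handful of published pages is a quick way to spot templates that silently drop a field.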
Schema.org does a good job of describing some of the fields necessary for 'people' in general, but there may still be organization-specific values that are useful to display and refine by. We've added organization-specific fields for faculty, role, honorific, email and work phone number below:
<!DOCTYPE html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# profile: http://ogp.me/ns/profile#">
...
<!-- OpenGraph //-->
<meta property="og:title" content="Dr Solomon Otombe - University of Funnelback" />
<meta property="og:type" content="profile" />
<meta property="og:url" content="http://example.com/staff/solomon-otombe" />
<meta property="og:site_name" content="University of Funnelback" />
<meta property="og:image" content="http://example.com/path/to/person/image.png" />
<meta property="og:description" content="Lecturer - Social Science Education, School of Education" />
<meta property="og:category" content="Social Science Education,Teacher Education,Education Policy" />
<meta property="og:updated_time" content="2014-08-02" />
<meta property="fb:app_id" content="1234567890" />
<!-- OpenGraph - Person //-->
<meta property="profile:first_name" content="Solomon" />
<meta property="profile:last_name" content="Otombe" />
<!-- Twitter Card //-->
<meta name="twitter:title" content="Dr Solomon Otombe - University of Funnelback" />
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:site" content="http://example.com/" />
<meta name="twitter:image" content="http://example.com/path/to/person/image.png" />
<meta name="twitter:description" content="Lecturer - Social Science Education, School of Education" />
<!-- Organization-specific metadata //-->
<meta name="staff:faculty" content="School of Education" />
<meta name="staff:role" content="Lecturer - Social Science Education" />
<meta name="staff:honorific" content="Dr" />
<meta name="staff:email" content="s.otombe@university.edu" />
<meta name="staff:phoneWork" content="1 2345 6789" />
<!-- Auto-generated metadata //-->
<meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
<meta name="dcterms:language" scheme="RFC4646" content="en" />
...
</head>
...
If we regard a Public Consultation as an event - something that has a start and end date - it maps quite closely to the Event concept defined by schema.org:
<!DOCTYPE html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb#" itemscope itemtype="http://schema.org/Event">
...
<!-- OpenGraph //-->
<meta property="og:title" content="Public Consultation: Objective relevancy ranking - Department of Search" />
<meta property="og:type" content="article" />
<meta property="og:url" content="http://example.com/consultations/objective-relevancy-ranking" />
<meta property="og:site_name" content="Department of Search" />
<meta property="og:image" content="http://example.com/path/to/consultation/image.png" />
<meta property="og:description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />
<meta property="og:updated_time" content="2014-08-02" />
<meta property="fb:app_id" content="1234567890" />
<!-- Twitter Card //-->
<meta name="twitter:title" content="Public Consultation: Objective relevancy ranking - Department of Search" />
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:site" content="http://example.com/" />
<meta name="twitter:image" content="http://example.com/path/to/consultation/image.png" />
<meta name="twitter:description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />
<!-- schema.org - event //-->
<meta itemprop="name" content="Public Consultation: Objective relevancy ranking - Department of Search" />
<meta itemprop="description" content="Contribute to this public consultation on objectivity and search result relevancy ranking." />
<meta itemprop="image" content="http://example.com/path/to/consultation/image.png" />
<meta itemprop="startDate" content="2015-09-01" />
<meta itemprop="endDate" content="2015-09-10" />
<!-- Auto-generated metadata //-->
<meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
<meta name="dcterms:language" scheme="RFC4646" content="en" />
...
</head>
...
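Because the event's startDate and endDate are published as ISO 8601 dates, a results page could use them to label a consultation as upcoming, open or closed. The sketch below is one possible way to do that in Python; the `consultation_status` function and its three labels are assumptions for illustration, not something defined by schema.org or by the search engine.

```python
from datetime import date


def consultation_status(start_iso, end_iso, today):
    """Return 'upcoming', 'open' or 'closed' for an event's date range."""
    start = date.fromisoformat(start_iso)
    end = date.fromisoformat(end_iso)
    if end < start:
        # Catch templates that publish the two dates the wrong way round.
        raise ValueError("endDate is before startDate")
    if today < start:
        return "upcoming"
    if today > end:
        return "closed"
    return "open"


# Dates taken from the consultation markup above.
print(consultation_status("2015-09-01", "2015-09-10", date(2015, 9, 5)))
```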
Unless you're producing a television series or a big-budget movie, most videos you're publishing on your website will probably conform quite closely to the general 'video' rich result type:
<!DOCTYPE html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# video: http://ogp.me/ns/video#">
...
<!-- OpenGraph //-->
<meta property="og:title" content="Computer Science: Information Retrieval 101 - University of Funnelback" />
<meta property="og:type" content="video.other" />
<meta property="og:url" content="http://example.com/consultations/objective-relevancy-ranking" />
<meta property="og:site_name" content="University of Funnelback" />
<meta property="og:image" content="http://example.com/path/to/course/video/image.png" />
<meta property="og:description" content="Dr. Tim Smith spends 2hrs introducing filtering, indexing and ranking concepts alongside real-world examples." />
<meta property="og:updated_time" content="2014-08-02" />
<meta property="fb:app_id" content="1234567890" />
<!-- OpenGraph - Video //-->
<meta property="video:duration" content="7200" />
<meta property="video:tag" content="computer science, information retrieval, filtering, indexing, ranking, tim smith" />
<meta property="video:writer" content="Dr. Tim Smith" />
<meta property="video:release_date" content="2014-08-02" />
<meta property="og:video:secure_url" content="https://example.com/path/to/video/video_id.mp4" />
<meta property="og:video:type" scheme="RFC4337" content="video/mp4" />
<meta property="og:video:width" content="1024" />
<meta property="og:video:height" content="768" />
<!-- Twitter Card //-->
<meta name="twitter:title" content="Computer Science: Information Retrieval 101 - University of Funnelback" />
<meta name="twitter:card" content="player" />
<meta name="twitter:site" content="http://example.com/" />
<meta name="twitter:image" content="http://example.com/path/to/course/video/image.png" />
<meta name="twitter:description" content="Dr. Tim Smith spends 2hrs introducing filtering, indexing and ranking concepts alongside real-world examples" />
<!-- Twitter Card - Player //-->
<meta name="twitter:player" content="https://example.com/path/to/video/player/video_id" />
<meta name="twitter:player:height" content="768" />
<meta name="twitter:player:width" content="1024" />
<meta name="twitter:player:stream" content="https://example.com/path/to/video/video_id.mp4" />
<meta name="twitter:player:stream:content_type" scheme="RFC4337" content="video/mp4" />
<!-- Auto-generated metadata //-->
<meta name="dcterms:published" scheme="ISO8601" content="2015-08-02" />
<meta name="dcterms:language" scheme="RFC4646" content="en" />
...
</head>
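The video:duration value above is expressed in seconds, so a results page will usually want to convert it before display. A minimal sketch in Python follows; `format_duration` is a hypothetical helper and the h:mm:ss output style is just one reasonable choice.

```python
def format_duration(seconds):
    """Turn a duration in seconds into h:mm:ss (or m:ss under an hour)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


# The metadata value arrives as a string, e.g. content="7200".
print(format_duration("7200"))
```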
We can combine an organization's own metadata schema with other industry standards for site search. This is particularly useful if you've created separate versions of a page's description or title for redisplay on Twitter and Facebook.
Funnelback v14.2 and above supports the use of long-form metadata field names. The date field is the exception below, being mapped to 'd':
d,0,dcterms:published
dateUpdated,0,og:updated_time
duration,0,video:duration
# Common concepts, different sources
description,1,og:description
description,1,twitter:description
imageUrl,0,og:image
imageUrl,0,twitter:image
title,1,og:title
title,1,twitter:title
author,1,video:writer
author,1,article:author
type,0,og:type
type,0,twitter:card
tags,1,article:tag
tags,1,video:tag
# People
firstName,1,profile:first_name
lastName,1,profile:last_name
faculty,1,staff:faculty
staffRole,1,staff:role
staffHonorific,1,staff:honorific
staffEmail,1,staff:email
staffPhonework,1,staff:phoneWork
# Events
startDate,0,startDate
endDate,0,endDate
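To make the "common concepts, different sources" idea concrete, here is a small Python sketch that reads mapping lines in the same three-column shape and groups a page's raw meta values under the shared class names. It only illustrates the many-to-one mapping and is not how the search engine itself indexes content; `parse_mapping` and `apply_mapping` are hypothetical names, and the middle column is read but otherwise ignored here.

```python
MAPPING = """\
# Common concepts, different sources
description,1,og:description
description,1,twitter:description
title,1,og:title
title,1,twitter:title
"""


def parse_mapping(text):
    """Return {source field: metadata class} from 'class,type,source' lines."""
    source_to_class = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        meta_class, _type, source = line.split(",", 2)
        source_to_class[source] = meta_class
    return source_to_class


def apply_mapping(page_fields, source_to_class):
    """Group a page's raw meta values under their mapped class names."""
    grouped = {}
    for source, value in page_fields.items():
        meta_class = source_to_class.get(source)
        if meta_class is None:
            continue  # unmapped fields are skipped in this sketch
        values = grouped.setdefault(meta_class, [])
        if value not in values:  # identical copies collapse to one value
            values.append(value)
    return grouped


page = {
    "og:title": "Bachelor of Basket Weaving - University of Funnelback",
    "twitter:title": "Bachelor of Basket Weaving - University of Funnelback",
    "og:description": "Learn how to make the most of those old, dry vines.",
    "twitter:description": "Who doesn't need a high quality basket?",
    "fb:app_id": "1234567890",
}
grouped = apply_mapping(page, parse_mapping(MAPPING))
print(grouped)
```

In this sketch the identical titles collapse to a single value, while the two differently worded descriptions are both kept under `description`.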
You may decide that matches on 'author' or 'tags' are more valuable than matches on title or description. You can experiment with these settings by weighting those fields for each search query, appending these parameters to your search UI URL:
/s/search.html?...&wmeta_author=1.0&wmeta_tags=1.0&wmeta_title=0.2
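If you are cycling through several weight combinations, it can help to generate these URLs rather than hand-edit them. The following Python sketch builds the query string from a dictionary of weights; the `search_url` helper is hypothetical, and the `wmeta_` prefix and `/s/search.html` path are simply taken from the example above.

```python
from urllib.parse import urlencode


def search_url(collection, query, weights):
    """Build a search URL with one wmeta_<class> parameter per weight."""
    params = {"collection": collection, "query": query}
    for meta_class, weight in weights.items():
        params[f"wmeta_{meta_class}"] = f"{weight:.1f}"
    return "/s/search.html?" + urlencode(params)


url = search_url(
    "COLLECTION",
    "basket weaving",
    {"author": 1.0, "tags": 1.0, "title": 0.2},
)
print(url)
```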
Alter your query processing options to return these fields and expand summaries to include both metadata and snippets:
#collection.cfg
query_processor_options=-SM=both -SF=[type,duration,title,author,imageUrl,startDate,endDate,tags]
These metadata fields will now be available in your data model for display, clicking, tracking, and other secondary actions.
response.resultPacket.results[...]{
rank: 1,
score: 1000,
title: "Page Title",
liveUrl: "http://example.com/path/to/page",
metaData: {
"type":"...",
"author":"...",
"imageUrl":"...",
"tags":"...",
...
}
}
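The display logic for a rich result is mostly a matter of checking which metadata fields are present. In practice this would live in your search results template; the Python sketch below only illustrates the conditional structure, using a plain dictionary shaped like the result above. The `render_result` function and the CSS class names are assumptions for illustration.

```python
from html import escape


def render_result(result):
    """Render one result as HTML, adding rich elements only when present."""
    meta = result.get("metaData", {})
    url = escape(result["liveUrl"], quote=True)
    parts = [f'<a href="{url}">{escape(result["title"])}</a>']
    if meta.get("imageUrl"):
        image = escape(meta["imageUrl"], quote=True)
        parts.insert(0, f'<img src="{image}" alt="">')
    if meta.get("author"):
        parts.append(f'<span class="author">{escape(meta["author"])}</span>')
    if meta.get("startDate") and meta.get("endDate"):
        dates = f'{escape(meta["startDate"])} to {escape(meta["endDate"])}'
        parts.append(f'<span class="dates">{dates}</span>')
    return "\n".join(parts)


example = {
    "title": "Page Title",
    "liveUrl": "http://example.com/path/to/page",
    "metaData": {"author": "Dr. Tim Smith"},
}
print(render_result(example))
```

Results without any of the rich fields fall back to a plain linked title, so the same function can serve both rich and ordinary pages.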
Simple A/B testing using your existing web analytics package is probably the easiest way to measure the cosmetic changes to rich results, assuming that the ranking behaviour is unchanged.
If using Funnelback's default search analytics, you may want to configure separate profiles, each with their own analytics and search forms, to examine the click-through rates for users exposed to rich snippets, as opposed to the 'plain vanilla' result rankings:
$SEARCH_HOME/conf/COLLECTION
/_default/simple.ftl
/rich-display/simple.ftl
Calls to search results pages could correspondingly be cycled between:
/s/search.html?collection=COLLECTION&query=QUERY
and
/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display
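One simple way to cycle between the two URLs is to assign each visitor to a variant deterministically, so the same person always sees the same style of results. The Python sketch below hashes a session identifier into one of 100 buckets; `choose_profile`, `results_url` and the idea of keying on a session ID are assumptions for illustration, and the profile names are those used above.

```python
import hashlib
from urllib.parse import urlencode


def choose_profile(session_id, rich_percent=50):
    """Deterministically assign a session to '_default' or 'rich-display'."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return "rich-display" if bucket < rich_percent else "_default"


def results_url(collection, query, session_id, rich_percent=50):
    """Build the results URL, adding the profile only for the rich variant."""
    params = {"collection": collection, "query": query}
    profile = choose_profile(session_id, rich_percent)
    if profile != "_default":
        params["profile"] = profile
    return "/s/search.html?" + urlencode(params)


print(results_url("COLLECTION", "QUERY", "session-abc"))
```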
Experimenting with manually applied upweighting is fraught with issues - upweighting may benefit some users' search queries whilst penalising others. One method worth considering is the application of query-independent evidence: all other ranking elements being equal, certain site sections or content types will receive a ranking boost.
Query-independent evidence is applied at index-time:
# qie.cfg
# Upweight videos, people, courses
1.0 example.com/courses
0.8 example.com/people
0.6 example.com/videos
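Before re-indexing, it can be useful to sanity-check which weight a given URL would pick up from a file like this. The Python sketch below parses the weight and URL pattern pairs and looks up a URL by prefix. The matching rule used here (simple prefix match, longest prefix wins, `None` when nothing matches) is an assumption for illustration and may differ from how the search engine evaluates the file; `parse_qie` and `qie_weight` are hypothetical names.

```python
QIE_CFG = """\
# Upweight videos, people, courses
1.0 example.com/courses
0.8 example.com/people
0.6 example.com/videos
"""


def parse_qie(text):
    """Return [(url_prefix, weight)] sorted so the longest prefix comes first."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        weight, prefix = line.split(None, 1)
        rules.append((prefix, float(weight)))
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def qie_weight(url, rules):
    """Return the weight of the first matching prefix, or None if none match."""
    bare = url.split("://", 1)[-1]  # drop the scheme before comparing
    for prefix, weight in rules:
        if bare.startswith(prefix):
            return weight
    return None


rules = parse_qie(QIE_CFG)
print(qie_weight("http://example.com/courses/b-basket-weaving", rules))
```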
This weight can also be ignored or applied at query time by modifying the query_processor_options, using a similar profile structure to our rich snippet display testing:
#collection.cfg
# Negate impact of query-independent evidence
query_processor_options=-cool.5=0.0
#/rich-display-ranking/padre_opts.cfg
# Maximise impact of query-independent evidence
query_processor_options=-cool.5=1.0
$SEARCH_HOME/conf/COLLECTION
/qie.cfg
/collection.cfg
/rich-display-ranking/padre_opts.cfg
Compare the impacts of applying query-independent evidence to these styled (or unstyled) rich snippet search results pages by tracking a proportion of search queries from:
/s/search.html?collection=COLLECTION&query=QUERY
(baseline styling, baseline weighting on rich content types)
and
/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display
(rich styling, normal weighting on rich content types)
and
/s/search.html?collection=COLLECTION&query=QUERY&profile=rich-display-ranking
(rich styling, upweighted rich content types)
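Once searches and clicks are being logged against each variant, comparing them comes down to a click-through rate per profile. A minimal Python sketch of that calculation follows; the `(profile, clicked)` event shape and the `click_through_rates` function are assumptions about how your analytics export might look, and the sample events are placeholders that exist only to exercise the function.

```python
from collections import Counter


def click_through_rates(events):
    """Compute clicks / searches per profile from (profile, clicked) pairs."""
    searches = Counter()
    clicks = Counter()
    for profile, clicked in events:
        searches[profile] += 1
        if clicked:
            clicks[profile] += 1
    return {profile: clicks[profile] / searches[profile] for profile in searches}


# Placeholder events, not real measurements.
sample_events = [
    ("_default", True),
    ("_default", False),
    ("rich-display", True),
    ("rich-display", False),
    ("rich-display-ranking", True),
    ("rich-display-ranking", False),
]
rates = click_through_rates(sample_events)
for profile, rate in sorted(rates.items()):
    print(f"{profile}: {rate:.0%}")
```

With real traffic, make sure each variant has received enough searches before reading much into the difference between the rates.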