GoodRelations is a standardized vocabulary for product, price, and company data that can (1) be embedded into existing static and dynamic Web pages and that (2) can be processed by other computers. This increases the visibility of your products and services in the latest generation of search engines, recommender systems, and other novel applications.
Martin Hepp
martin.hepp at ebusiness-unibw.org
Tue Aug 24 15:04:40 CEST 2010
Dear all:
Some shop applications, unfortunately, display the very same item at
multiple URIs. This is problematic for Search Engine Optimization
(SEO) and the Web of Data alike.
Examples and Causes
===================
There are two typical causes:
a) The navigation path of the shop system is used to create "clean"
URIs:
http://www.myshop.com/staplers/green_pocket_stapler123
http://www.myshop.com/featured-items/green_pocket_stapler123
b) Query parameters, e.g. to control the language of the output, the
preferred currency, the session ID (bad...), or the referrer.
This case is more severe, because it can easily cause 10 - 100
duplicates per single page:
http://www.myshop.com/staplers/green_pocket_stapler123
http://www.myshop.com/staplers/green_pocket_stapler123?lang=en
http://www.myshop.com/staplers/green_pocket_stapler123?currency=usd&lang=en
http://www.myshop.com/staplers/green_pocket_stapler123?referrer=clicksale
Problems
========
1. If you embed GoodRelations markup in RDFa syntax into your HTML/
XHTML shop templates, this may cause massive duplication of data
elements for applications that try to consume your shop data.
2. This will reduce the findability of your items in GoodRelations-
aware applications.
3. It spoils the Web of Data ("proliferation of URIs").
4. Your pages will receive a lower ranking in search engines, because
the incoming links will be spread over multiple URIs.
5. Pages may even be banned from search engines ("duplication of
content"); that is independent of whether you are using GoodRelations
or not.
6. Crawlers will waste more resources crawling your site, consume more
of your valuable bandwidth, and are more likely to use outdated cached
versions of your pages.
Solutions
=========
1. The ideal solution is to aim for canonical URIs (one URI per
product) as much as possible. This may not be easy for case a), but
it is straightforward for case b), e.g. by using session cookies and/or
HTTP redirects.
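
For case b), the redirect approach could look roughly like the following
sketch. It is only an illustration, not part of any shop system: the
parameter names in NON_CANONICAL_PARAMS and the helper names canonical_uri
and redirect_target are assumptions, and the language or currency
preference would have to be kept elsewhere (e.g. in a session cookie)
before redirecting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only change presentation or tracking,
# not which product is shown. Adjust the set for your own shop.
NON_CANONICAL_PARAMS = {"lang", "currency", "sessionid", "referrer"}


def canonical_uri(requested_uri):
    """Return the requested URI without presentation-only query parameters."""
    parts = urlsplit(requested_uri)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    # The fragment is dropped; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def redirect_target(requested_uri):
    """Return the URI to redirect to (e.g. via HTTP 301), or None if none is needed."""
    target = canonical_uri(requested_uri)
    return target if target != requested_uri else None
```

The shop would call redirect_target() for each incoming request and answer
with an HTTP redirect whenever it returns a URI.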
2. If that is not possible, you should use *absolute* instead of
*relative* identifiers in RDFa for all major data elements ("about"
attribute in RDFa).
For example, in the template for the "product item" page, use
<div about="http://www.myshop.com/staplers/green_pocket_stapler123#offering" typeof="gr:Offering">
...
instead of
<div about="#offering" typeof="gr:Offering">
...
The effect will be that no matter from which URI the page was actually
requested, the same RDF data will be extracted. It requires, though,
that you can determine the canonical URI of the page at the time of
the request.
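
To see why this helps, the following sketch mimics how a consumer would
resolve the "about" value against the URI from which the page was fetched.
It assumes that the page URI is used as the base (no separate base URI is
declared in the page) and uses Python's standard urljoin for the
resolution; the helper name subject_uri is made up for this example.

```python
from urllib.parse import urljoin

# URIs under which the very same product page can be requested.
REQUESTED = [
    "http://www.myshop.com/staplers/green_pocket_stapler123",
    "http://www.myshop.com/staplers/green_pocket_stapler123?lang=en",
    "http://www.myshop.com/featured-items/green_pocket_stapler123",
]
CANONICAL = "http://www.myshop.com/staplers/green_pocket_stapler123#offering"


def subject_uri(page_uri, about):
    """Resolve an RDFa "about" value against the URI of the fetched page."""
    return urljoin(page_uri, about)


# A relative identifier yields one subject URI per requested page URI ...
relative_subjects = {subject_uri(uri, "#offering") for uri in REQUESTED}
# ... while an absolute identifier always yields the same subject URI.
absolute_subjects = {subject_uri(uri, CANONICAL) for uri in REQUESTED}

print(len(relative_subjects), len(absolute_subjects))
```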
3. The last option (the least powerful, but still much better than
doing nothing) is to add owl:sameAs statements that link the resource
in the RDFa pattern to the canonical URI.
The canonical URI should also be used for the foaf:page property,
which is the crucial link from the data to the page from where the
product can be ordered.
Example:
<div about="#offering" typeof="gr:Offering">
<div rel="owl:sameAs" resource="http://www.myshop.com/staplers/green_pocket_stapler123#offering"></div>
<div property="rdfs:label" content="Cool green stapler for $8.99" xml:lang="en"></div>
...
<div rel="foaf:page" resource="http://www.myshop.com/staplers/green_pocket_stapler123"></div>
</div>
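
If the shop template is filled in by code, the pattern could be generated
along these lines. This is a minimal sketch under assumptions: the function
name offering_markup and its parameters are hypothetical, and only the
properties shown in the example are emitted (a real template would add the
remaining GoodRelations properties).

```python
from html import escape


def offering_markup(canonical_page_uri, label, lang="en"):
    """Build the RDFa pattern with owl:sameAs and foaf:page pointing
    to the canonical URI, regardless of the URI that was requested."""
    page = escape(canonical_page_uri, quote=True)
    return "\n".join([
        '<div about="#offering" typeof="gr:Offering">',
        f'<div rel="owl:sameAs" resource="{page}#offering"></div>',
        f'<div property="rdfs:label" content="{escape(label, quote=True)}" '
        f'xml:lang="{escape(lang, quote=True)}"></div>',
        f'<div rel="foaf:page" resource="{page}"></div>',
        "</div>",
    ])


print(offering_markup(
    "http://www.myshop.com/staplers/green_pocket_stapler123",
    "Cool green stapler for $8.99",
))
```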
Best wishes
Martin Hepp
--------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen
e-mail: hepp at ebusiness-unibw.org
phone: +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
http://www.heppnetz.de/ (personal)
skype: mfhepp
twitter: mfhepp