GoodRelations is a standardized vocabulary for product, price, and company data that can (1) be embedded into existing static and dynamic Web pages and that (2) can be processed by other computers. This increases the visibility of your products and services in the latest generation of search engines, recommender systems, and other novel applications.
Martin Hepp (UniBW)
martin.hepp at ebusiness-unibw.org
Sat Oct 10 17:39:52 CEST 2009
Dear all:
The distributed character of the Web makes it very likely that the exact
same entity is being defined in multiple graphs. In particular, there
will be significant redundancy in the definition of
- Business Entities and
- Product Models.
The main cause is that providers of data may define entities locally
rather than searching for an authoritative URI on the Web. For example,
someone exporting a catalog may want to refer to the manufacturer of a
gr:ProductOrServiceModel or gr:ProductOrServicesSomeInstancesPlaceholder
without searching for the authoritative URI of that manufacturer.
This is not a major technical problem, since providers of commerce
dataspaces will very likely offer entity consolidation as one important
feature.
For your own projects, you can start with the following simple SPARQL
CONSTRUCT rules, which create owl:sameAs statements so that multiple
definitions of the very same entity are treated as one.
Note that the current rules require an exact match of the legal names
or of the EAN/UPC codes, respectively. You could use more sophisticated
filters to widen the scope of the consolidation, for example by ignoring
capitalization and special characters ("Miller Ltd." vs. "miller ltd").
# Consolidate Business Entities that have the exact same legalName
PREFIX gr: <http://purl.org/goodrelations/v1#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
CONSTRUCT {?u2 owl:sameAs ?u1.}
WHERE
{?u1 a gr:BusinessEntity.
?u2 a gr:BusinessEntity.
?u1 gr:legalName ?name1.
?u2 gr:legalName ?name2.
FILTER (?u1!=?u2 && ?name1=?name2)}
# Consolidate Product Models that have the exact same gr:hasEAN_UCC-13 value
PREFIX gr: <http://purl.org/goodrelations/v1#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
CONSTRUCT {?u2 owl:sameAs ?u1.}
WHERE
{?u1 a gr:ProductOrServiceModel.
?u2 a gr:ProductOrServiceModel.
?u1 gr:hasEAN_UCC-13 ?ean1.
?u2 gr:hasEAN_UCC-13 ?ean2.
FILTER (?u1!=?u2 && ?ean1=?ean2 && ?ean1!="")}
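As an illustration of the note above on more sophisticated filters, here
is a minimal Python sketch of a name comparison that ignores
capitalization and special characters. It is not part of GoodRelations:
the function names and the example.org/example.com URIs are made up for
this example, and the input is assumed to be a plain mapping from entity
URI to gr:legalName that you have already extracted from your graphs.
Unlike the SPARQL rules, which emit pairs in both directions, the sketch
links every duplicate to one arbitrarily chosen URI per group.

```python
import re
from collections import defaultdict


def normalize_legal_name(name):
    """Return a comparison key that ignores case and special characters."""
    # casefold() is a more aggressive lower(); [\W_]+ matches every run of
    # characters that are neither letters nor digits.
    return re.sub(r"[\W_]+", " ", name.casefold()).strip()


def same_as_candidates(legal_names):
    """Yield (duplicate_uri, canonical_uri) pairs for matching names.

    `legal_names` maps an entity URI to its gr:legalName string
    (an assumed input format, chosen for this sketch).
    """
    groups = defaultdict(list)
    for uri, name in legal_names.items():
        key = normalize_legal_name(name)
        if key:  # skip names that normalize to the empty string
            groups[key].append(uri)
    for uris in groups.values():
        canonical = min(uris)  # arbitrary but deterministic choice
        for uri in sorted(uris):
            if uri != canonical:
                yield (uri, canonical)


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    names = {
        "http://example.org/a#miller": "Miller Ltd.",
        "http://example.com/b#miller": "miller ltd",
        "http://example.org/c#smith": "Smith Inc.",
    }
    for duplicate, canonical in same_as_candidates(names):
        print(f"<{duplicate}> owl:sameAs <{canonical}> .")
```

A looser key finds more duplicates but also raises the risk of false
matches, so candidates produced this way deserve a manual look before
they are turned into owl:sameAs statements.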
Important:
1. Make sure you consolidate only nodes of the same type. For example,
two gr:Offering nodes may have the same gr:hasEAN_UCC-13 value, but they
are of course not the same offer.
2. For sets of such statements that you use only locally, you have
complete freedom, and I encourage you to experiment with different
rules. Before publishing such sameAs statements, however, run thorough
quality checks. Reckless usage of sameAs can spam the Web of Linked
Data, and dataspaces will consequently ignore all your graphs.
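One possible quality check along the lines of point 1 can be sketched in
Python as follows. This is only an illustration under stated
assumptions: the function name, the input format (candidate pairs plus a
mapping from node to its set of rdf:type values), and the ex: identifiers
are hypothetical, and the check covers only the type rule, not every
problem a thorough review should catch.

```python
# Types for which consolidation by owl:sameAs is considered here.
CONSOLIDATABLE_TYPES = {"gr:BusinessEntity", "gr:ProductOrServiceModel"}


def split_candidates(pairs, types_of):
    """Split candidate owl:sameAs pairs into (accepted, rejected).

    A pair is accepted only if the two nodes differ and share at least
    one of the consolidatable types.
    """
    accepted, rejected = [], []
    for left, right in pairs:
        shared = types_of.get(left, set()) & types_of.get(right, set())
        if left != right and shared & CONSOLIDATABLE_TYPES:
            accepted.append((left, right))
        else:
            rejected.append((left, right))
    return accepted, rejected


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    types_of = {
        "ex:model1": {"gr:ProductOrServiceModel"},
        "ex:model2": {"gr:ProductOrServiceModel"},
        "ex:offer1": {"gr:Offering"},
        "ex:offer2": {"gr:Offering"},
    }
    pairs = [("ex:model2", "ex:model1"), ("ex:offer2", "ex:offer1")]
    accepted, rejected = split_candidates(pairs, types_of)
    print("accepted:", accepted)
    print("rejected:", rejected)
```

Keeping the rejected pairs rather than dropping them silently makes it
easier to see which rule produced questionable matches.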
Best wishes
Martin Hepp