GoodRelations is a standardized vocabulary for product, price, and company data that can (1) be embedded into existing static and dynamic Web pages and (2) be processed by other computers. This increases the visibility of your products and services in the latest generation of search engines, recommender systems, and other novel applications.
Martin Hepp
martin.hepp at ebusiness-unibw.org
Mon Apr 4 10:26:47 CEST 2011
This may be interesting for developers adding GoodRelations to their sites, too.

As said, our current advice is:

1. Use a standard sitemap for your site AND make sure that the lastmod data is correct. This field tells a crawler whether a particular resource has been modified since the last crawl. Using the proper lastmod attribute minimizes the crawling load on your site, reduces the crawling work, and thus increases the likelihood that your site data is updated in time in the respective index. If you omit the lastmod attribute, or set it to the date the sitemap was generated, you force a crawler to crawl ALL pages anew, even if only a small percentage has actually changed.

2. Use a complementary semantic sitemap for listing your specific Semantic Web resources, e.g. data dump files.

3. List both in the robots.txt file.

References:

Standard sitemaps: http://sitemaps.org/protocol.php
Semantic sitemaps: http://sw.deri.org/2007/07/sitemapextension/
Sindice recommendation: http://sindice.com/developers/publishing

Best
Martin Hepp

Begin forwarded message:

> Resent-From: semantic-web at w3.org
> From: Martin Hepp <martin.hepp at ebusiness-unibw.org>
> Date: April 4, 2011 10:14:35 AM GMT+02:00
> To: Richard Cyganiak <richard at cyganiak.de>
> Cc: Giovanni Tummarello <giovanni.tummarello at deri.org>, Francisco Javier López Pellicer <fjlopez at unizar.es>, semantic-web <semantic-web at w3c.org>
> Subject: Re: SPARQL endpoint discovery
>
> Hi all:
>
> Richard raises an important point: since Semantic Sitemaps don't validate in Google's tools, it is hard to convince site owners to use them.
>
> However, there is a workaround: you can publish BOTH a regular sitemap and a semantic sitemap for your site and list both in the robots.txt file.
>
> Google should accept the regular one (you could also submit it to them manually) and ignore the semantic sitemap. RDF-aware crawlers would find both and could prefer the semantic sitemap.
>
> The downside of this approach is that you risk increasing the crawling load on your site. But I would assume you could minimize the overlap of URIs between the two; e.g., you do not need to tell Google about your compressed RDF dump file resources.
>
> Best wishes
> Martin
>
> On Apr 4, 2011, at 8:53 AM, Richard Cyganiak wrote:
>
>> Hi Giovanni,
>>
>> Semantic Sitemaps seemed like a good idea because they were a very simple extension to standard XML Sitemaps, which are a widely adopted format supported by Google and other major search engines.
>>
>> What killed Semantic Sitemaps for me is the fact that adding *any* extension element, even a single line, makes Google reject the Sitemap.
>>
>> In practice, XML Sitemaps are not an extensible format.
>>
>> On the question of the complexity of Sitemaps and VoID: publishers will get it right if and only if there is (a) some serious consumption of the data that publishers actually care about and (b) a validator. At the moment, neither (a) nor (b) is given, neither for Semantic Sitemaps nor for VoID.
>>
>> Best,
>> Richard
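For step 1 above, a minimal standard sitemap with a per-URL lastmod might look as follows (the www.example.com URLs and the date are placeholders, not from the thread):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- one <url> entry per crawlable page -->
    <loc>http://www.example.com/products/widget-1.html</loc>
    <!-- set lastmod to the page's real modification date,
         NOT the date the sitemap file was generated -->
    <lastmod>2011-03-28</lastmod>
  </url>
</urlset>
```

A crawler that honors lastmod can then skip this URL on subsequent visits until the date changes.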
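For step 2, a semantic sitemap adds an extension namespace and a dataset description to the same basic format; the sketch below follows the Semantic Sitemaps extension referenced above, with hypothetical example.com locations (check element names against the spec at http://sw.deri.org/2007/07/sitemapextension/):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <!-- describes an RDF dataset rather than individual HTML pages -->
  <sc:dataset>
    <sc:datasetLabel>Example shop product data (GoodRelations)</sc:datasetLabel>
    <sc:dataDumpLocation>http://www.example.com/dump.rdf.gz</sc:dataDumpLocation>
    <sc:sparqlEndpointLocation>http://www.example.com/sparql</sc:sparqlEndpointLocation>
  </sc:dataset>
</urlset>
```

Note that, as Richard points out in the quoted message, adding any such extension element makes Google reject the sitemap, which is exactly why it should be kept separate from the regular one.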
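For step 3, both sitemaps can be announced in robots.txt via two Sitemap directives; filenames here are placeholders:

```
User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/semantic-sitemap.xml
```

Google will use the first and (per the thread) ignore the semantic one it cannot parse, while RDF-aware crawlers can discover and prefer the second.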