Search Engines take on Structured Data
Structured data on the web got a boost this week, with Google's announcement of Rich Snippets and Rich Snippets in Custom Search. Structured data at such a large scale raises at least three issues:
- Syntax
- Vocabulary
- Policy
Google's documentation shows support for both microformats and RDFa. It follows the hReview microformat syntax with small vocabulary changes (name vs fn). Support for RDFa syntax, in theory, means support for vocabularies that anyone makes; but in practice, Google is starting with a clean slate: data-vocabulary.org. That's a place to start, though it doesn't provide synergy with anyone who already uses FOAF or Dublin Core or the like to share their data.
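To make the syntax/vocabulary distinction concrete, here is a minimal sketch in Python that pulls the same three review fields out of an RDFa-style snippet and a microformat-style snippet. Both snippets are hypothetical, written for illustration rather than copied from Google's documentation; the property names (`v:itemreviewed`, `v:reviewer`, `v:rating`), the namespace URI, and the helper names `PropertyCollector` and `extract` are assumptions. It is not a conforming RDFa or microformats parser, only a way to see that the two syntaxes carry their labels in different attributes (`property` vs `class`).

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration; attribute values are assumptions,
# not taken verbatim from any search engine's documentation.
RDFA_SNIPPET = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">Example Cafe</span>
  <span property="v:reviewer">A. Reader</span>
  <span property="v:rating">4</span>
</div>
"""

MICROFORMAT_SNIPPET = """
<div class="hreview">
  <span class="item"><span class="fn">Example Cafe</span></span>
  <span class="reviewer">A. Reader</span>
  <span class="rating">4</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect the text of elements labelled via one attribute (e.g. property or class)."""

    def __init__(self, attr_name, wanted):
        super().__init__()
        self.attr_name = attr_name
        self.wanted = wanted
        self.stack = []   # one entry per open element: matched label or None
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # The attribute may hold several space-separated tokens (as class does).
        value = dict(attrs).get(self.attr_name) or ""
        label = next((tok for tok in value.split() if tok in self.wanted), None)
        self.stack.append(label)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to the innermost labelled ancestor, if there is one.
        for label in reversed(self.stack):
            if label is not None:
                self.found[label] = self.found.get(label, "") + text
                break


def extract(html, attr_name, wanted):
    """Return {label: text} for every wanted label found in the markup."""
    collector = PropertyCollector(attr_name, set(wanted))
    collector.feed(html)
    collector.close()
    return collector.found


rdfa_fields = extract(RDFA_SNIPPET, "property",
                      ["v:itemreviewed", "v:reviewer", "v:rating"])
microformat_fields = extract(MICROFORMAT_SNIPPET, "class",
                             ["fn", "reviewer", "rating"])
print(rdfa_fields)
print(microformat_fields)
```

The only thing that changes between the two calls is the attribute name and the list of labels, which is the vocabulary question in miniature: a consumer has to know in advance which names to look for, whichever syntax carries them. The sketch assumes well-nested markup with no void elements such as `<br>` or `<img>`.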
The policy questions are perhaps the most difficult. Structured data is a pointy instrument; if anyone can say anything about anything, surely the system will be gamed and defrauded. Google's rollout is one step at a time, starting with some trusted sites and an application process to get your site added. The O'Reilly interview with Guha and Hansson is an interesting look at where they hope to go after this first step; if you're curious about how this fits into HTML standards, see Sam Ruby's microdata.
While issues remain--there are syntactic i's to dot and t's to cross and even larger policy issues to work out--between Google's rollout, Yahoo's SearchMonkey, and the UK Central Office of Information rollout, it seems that the industry is ready to take on the challenges of using structured data in search engines.
It's good to know that you folks are watching this. I hope you can prevail on Google to accept more microformats not invented there.
By the way, is there an RDFa equivalent to hReview that I might have overlooked?
Building a profitable website seems “easy”, but it is not. Expectations are high, and as I see it, too many IT nerds master the technical side but lack basic knowledge of the relevant business principles. I do not think a website can be turned into a golden egg just by using the tricks some people rely on to get the best ranking. Much more competence is needed, along with a deep understanding of how to manage through crises like the one we are experiencing right now; put another way, few have the ability to manage when rapid changes take place. I think it is also a question of communication: to whom do we speak? Both to our present customers and to potential ones. In advertising it is a known fact that a major part of the spending goes to keeping existing customers happy and the rest goes to potential ones, but maybe other rules apply these days?