Data Shapes Working Group Launched
It's taken a while but we've finally been able to launch the RDF Data Shapes Working Group. As the charter for the new WG says, the mission is to produce a language for defining structural constraints on RDF graphs. In the same way that SPARQL made it possible to query RDF data, the product of the RDF Data Shapes WG will enable the definition of graph topologies for interface specification, code development, and data verification. In simpler terms, it will provide for RDF what XML Schema does for XML: a way to define cardinalities, lists of allowed values for properties, and so on.
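To make "cardinalities" and "lists of allowed values" concrete, here is a minimal sketch in plain Python. It is not the language the WG will produce (no syntax has been defined yet) and it does not follow any of the existing technologies. The example.org namespace, the `person_shape` dictionary and the `validate` function are all hypothetical names invented for this illustration, and the graph is modelled simply as a set of (subject, predicate, object) tuples.

```python
from collections import defaultdict

# Hypothetical namespace and data, invented for illustration only.
EX = "http://example.org/"

# An RDF graph modelled as a set of (subject, predicate, object) triples.
graph = {
    (EX + "alice", EX + "name", "Alice"),
    (EX + "alice", EX + "status", "active"),
    (EX + "bob", EX + "name", "Bob"),
    (EX + "bob", EX + "name", "Robert"),      # second name: breaks max cardinality
    (EX + "bob", EX + "status", "unknown"),   # not in the allowed list
}

# A hypothetical "shape": for each property, a minimum and maximum
# cardinality and, optionally, the set of allowed values.
person_shape = {
    EX + "name": {"min": 1, "max": 1},
    EX + "status": {"min": 1, "max": 1, "allowed": {"active", "inactive"}},
}


def validate(graph, node, shape):
    """Return a list of constraint violations for `node` against `shape`."""
    # Collect the values of each property attached to the node.
    values = defaultdict(list)
    for s, p, o in graph:
        if s == node:
            values[p].append(o)

    errors = []
    for prop, rule in shape.items():
        found = values.get(prop, [])
        count = len(found)
        if count < rule.get("min", 0):
            errors.append(f"{prop}: expected at least {rule['min']}, found {count}")
        if "max" in rule and count > rule["max"]:
            errors.append(f"{prop}: expected at most {rule['max']}, found {count}")
        allowed = rule.get("allowed")
        if allowed is not None:
            for value in sorted(found):
                if value not in allowed:
                    errors.append(f"{prop}: value {value!r} is not allowed")
    return errors


if __name__ == "__main__":
    for person in (EX + "alice", EX + "bob"):
        print(person, validate(graph, person, person_shape))
```

The check works on triples rather than on any particular file format, so the same shape would apply whether the data arrived as Turtle, RDF/XML or anything else.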
Can't you do that already?
Of course you can.
You can do it with OWL, SPIN, Resource Shapes, Shape Expressions and in any number of other ways, but the workshop held a year ago suggested that this landscape was less than satisfactory. Each of the technologies in that incomplete list has its adherents and implementation experience to draw on, but what is the best way forward? Does the technology to address 80% of use cases need to be as sophisticated as the technology to address all 100%?
As the charter makes clear, there are many different areas where we see this work as being important. Data ingestion is the obvious one (if I'm going to ingest and make sense of your RDF data then it must conform to a topology I define), but we also see it as being important for the generation of user interfaces that can guide the creation of data, such as metadata about resources being uploaded to a portal. Tantalizingly, knowing the structure of the graph in detail has the potential to lead to significant improvements in the performance of SPARQL engines.
The new WG will begin by developing a detailed Use Cases and Requirements document. More than anything, it is that work that will inform the future direction of the working group. If you're a W3C Member, please join the working group. If not, please subscribe to the RDF Data Shapes public mailing list.
Can't we just apply XML Schema to an RDF/XML document, for those who desire structure?
Why would I want to accompany my TTL file with a TTL schema?
That's another way of doing it, yes, but the underlying problem with the XML serialization of RDF is that people then try to think in XML terms, not in graphs, and that's where confusion can occur. There are any number of existing systems that do the job, and if your context means that using XML Schema to validate RDF/XML works, then, OK, that works for you. More generally, a graph structure needs a graph-centric validation method, however it's serialized.
The problem with using XML Schema and RDF/XML is that you're operating at the wrong level of abstraction. People want a solution that is independent from the serialization format.
Re using XML Schema at the syntax level - Jane Hunter had a paper on this approach at WWW10 - http://www10.org/cdrom/papers/572/
At that time, the layering of RDF and XML was a hugely controversial topic, e.g. http://www.w3.org/TR/schema-arch addresses the 'problem' of W3C having two different schema languages.
A closer ancestor, conceptually, here is Rick Jelliffe's Schematron approach - http://en.wikipedia.org/wiki/Schematron - you can do pretty much the same thing over RDF graphs as he did with XML trees, and that's what W3C is finally standardizing here. There's enough experience now (a decade plus, e.g. http://markmail.org/message/4c4kavqkfn5gp4ge etc.) with this approach for it to make sense to document a best practice.