Eric Prud'hommeaux, RDF Data Access Working Group team contact
subjects | predicates | object |
---|---|---|
order7 | product | product23 |
order7 | shipDate | 20050508 |
product23 | product-name | "Mystifier" |
subjects | object |
---|---|
order7 | 20050508 |
subjects | object |
---|---|
product23 | "Mystifier" |
Example use case: OrderTracking.
id | customer | product | orderDate | shippingAddress |
---|---|---|---|---|
2186 | 518 | 7385 | 2006-05-01 01:08:30 | NULL |
2186 | 208 | 2922 | 2006-05-01 18:34:38 | NULL |
2187 | 518 | 7439 | 2006-05-01 01:05:30 | NULL |
SQL graph shape comes from table aliases:
... WHERE o.orderDate < "20060501" AND o.shipDate = "20060508"
SPARQL graph shape (mostly) comes from variable.
... WHERE { ?o <orderDate> "20060501" . ?o <shipDate> "20060508" }
SQL restricts cross joins on attribute values.
... FROM Orders AS o JOIN Products AS p ON o.product=p.id
SPARQL has links in the data model that it queries.
... WHERE { ?o <Orders.products> ?p }
{ ?o <Orders.product> ?p }
{ ?o <Orders.product> ?p ?o <Orders.shipDate> ?ship }
{ ?t <Orders.product> ?p OPTIONAL { ?t <Orders.shipDate> ?ship } }
SPARQL UNION corresponds to an SQL subselected UNION
WHERE { { ?x <TableA.field> ?y } UNION { ?x <TableB.field> ?y } }
JOIN ( SELECT TableA.field AS foo FROM ... WHERE ...
UNION
SELECT TableB.field [AS foo] FROM ... WHERE ... ) AS U0 ON U0.foo=...
SELECT ?part ?descr ?given ?family ?posn ?rma ?billCity ?billState ?shipCity ?shipState WHERE { { ?order <Orders.product> ?prod . ?order <Orders.customer> ?cust . OPTIONAL { ?order <Orders.shippingAddress> ?shipAddr . ?shipAddr <Addresses.city> ?shipCity . ?shipAddr <Addresses.state> ?shipState } } UNION { ?order <Returns.product> ?prod . ?order <Returns.customer> ?cust . ?order <Returns.rma> ?rma } ?prod <Products.partNo> ?part . ?prod <Products.description> ?descr . { ?cust <Customers.id> ?num . ?cust <Customers.givenName> ?given . ?cust <Customers.familyName> ?family . ?cust <Customers.billingAddress> ?addr } UNION { ?cust <Employees.id> ?num . ?cust <Employees.givenName> ?given . ?cust <Employees.familyName> ?family . ?cust <Employees.position> ?posn } ?cust <Employees.homeAddress> ?addr . ?addr <Addresses.city> ?billCity . ?addr <Addresses.state> ?billState }
This SQL approximates the previous SPARQL query:
SELECT Products_0.partNo AS part, Products_0.description AS descr, U1.given, U1.family, U1.posn,U0.rma, Addresses_1.city AS billCity, Addresses_1.state AS billState, U0.shipCity, U0.shipState FROM ( SELECT Orders_0.customer AS cust, Orders_0.product AS prod, NULL AS rma, IF(Addresses_0.city IS NULL OR Addresses_0.state IS NULL, NULL, Addresses_0.city) AS shipCity, IF(Addresses_0.city IS NULL OR Addresses_0.state IS NULL, NULL, Addresses_0.state) AS shipState FROM Orders AS Orders_0 LEFT OUTER JOIN Addresses AS Addresses_0 ON Orders_0.shippingAddress=Addresses_0.id WHERE Orders_0.customer IS NOT NULL AND Orders_0.product IS NOT NULL UNION ALL SELECT Returns_0.customer AS cust, Returns_0.product AS prod, Returns_0.RMAnumber AS rma, NULL AS shipCity, NULL AS shipState FROM Returns AS Returns_0 WHERE Returns_0.customer IS NOT NULL AND Returns_0.product IS NOT NULL AND Returns_0.RMAnumber IS NOT NULL ) AS U0 INNER JOIN Products AS Products_0 ON U0.prod=Products_0.id INNER JOIN ( SELECT Customers_0.id AS num, Customers_0.givenName AS given, Customers_0.familyName AS family, NULL AS posn, Customers_0.billingAddress AS addr FROM Customers AS Customers_0 WHERE Customers_0.id IS NOT NULL AND Customers_0.givenName IS NOT NULL AND Customers_0.familyName IS NOT NULL AND Customers_0.billingAddress IS NOT NULL UNION ALL SELECT Employees_0.id AS num, Employees_0.givenName AS given, Employees_0.familyName AS family, Employees_0.position AS posn, Employees_0.homeAddress AS addr FROM Employees AS Employees_0 WHERE Employees_0.id IS NOT NULL AND Employees_0.givenName IS NOT NULL AND Employees_0.familyName IS NOT NULL AND Employees_0.position IS NOT NULL AND Employees_0.homeAddress IS NOT NULL ) AS U1 ON U1.num=U0.cust INNER JOIN Addresses AS Addresses_1 ON U1.addr=Addresses_1.id WHERE Products_0.partNo IS NOT NULL AND Products_0.description IS NOT NULL AND Addresses_1.city IS NOT NULL AND Addresses_1.state IS NOT NULL;
$query1 = "SELECT id FROM Orders WHERE Orders.shipDate = '20060501'"; mysql_query($query1); ... $query2 = "SPARQL: SELECT ?o WHERE { ?o <Orders.product> '20060501' }"; mysql_query($query2);
operator mapping in XTech talk:
http://www.w3.org/2005/05/22-SPARQL-MySQL/XTech#features
SPASQL in MySQL project page
http://www.w3.org/2005/05/22-SPARQL-MySQL/
# Assign convenient prefixes to common for later syntactic shorthand. PREFIX db: <http://www.w3.org/2003/01/21-RDF-RDB-access/ns#SqlDB?properties=..%2Ftest%2F> PREFIX ma: <http://med.example/ma#> PREFIX cs: <http://med.example/cs#> PREFIX up: <http://med.example/up#> PREFIX sa: <http://med.example/sa#> PREFIX mt: <http://med.example/mt#> SELECT ?name ?chemical ?motif ?saProt ?kd50 ?like ?ld FROM NAMED db:MicroArray.prop FROM NAMED db:Uniprot.rdf FROM NAMED db:ScreeningAssay.prop FROM NAMED db:ChemStructure.prop FROM NAMED db:MouseToxicity.prop WHERE { # Get a name and a chemical from the (SQL) MicroArray database. GRAPH db:MicroArray.prop { ?g ma:name ?name . ?g ma:expression "up" . ?g ma:experiment ?kinase . ?kinase ma:against ?agin . ?agin cs:chemical ?chemical } # The uniprot data (in RDF) has motif and pathway information. GRAPH db:Uniprot.rdf { ?p ma:name ?name . # bound to ?ma.ma:name ?p up:motif ?motif . ?p up:pathway "apoptosis" } # Use the (SQL) Kinase databaes to limit to the interesting chemicals. GRAPH db:ScreeningAssay.prop { ?a sa:name "KinaseAssay" . ?a cs:chemical ?chemical . # bound to ?ma.cs:chemical ?a sa:upname ?saProt . ?a ma:kd50 ?kd50 FILTER (?kd50 >= .7 || ?kd50 < .2) } # This (SQL) chemical database indexes like sidechains. GRAPH db:ChemStructure.prop { ?c cs:chemical ?chemical . # bound to ?ma.cs:chemical, ?sa.cs:chemical ?c cs:structure "asdfasdf" . ?c cs:sidechain ?side . ?c2 cs:sidechain ?side . ?c2 cs:chemical ?like } # Limit by toxicity in the (SQL) MouseToxicity experiments. GRAPH db:MouseToxicity.prop { ?t cs:chemical ?like . # bound to ?cs.cs:sidechain ?t mt:toxicity ?ld FILTER (?ld < .35) } }
# Get a name and a chemical from the (SQL) MicroArray database. GRAPH db:MicroArray.prop { ?g ma:name ?name . ?g ma:expression "up" . ?g ma:experiment ?kinase . ?kinase ma:against ?agin . ?agin cs:chemical ?chemical } }
gives me:
g | name | kinase | agin | chemical |
---|---|---|---|---|
g1 | name1 | kinase1 | agin1 | chemical1 |
g2 | name2 | kinase2 | agin2 | chemical2 |
g3 | name3 | kinase2 | agin2 | chemical3 |
g | name | kinase | agin | chemical |
---|---|---|---|---|
g1 | name1 | kinase1 | agin1 | chemical1 |
g2 | name2 | kinase2 | agin2 | chemical2 |
g3 | name3 | kinase2 | agin2 | chemical3 |
The next GRAPH
query
# The uniprot data (in RDF) has motif and pathway information.
GRAPH db:Uniprot.rdf {
?p ma:name ?name . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
is constrained by the variable name
# The uniprot data (in RDF) has motif and pathway information.
GRAPH db:Uniprot.rdf {
?p ma:name ?span . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
name1
, name2
, name3
) substituted for ?name
GRAPH db:Uniprot.rdf {
?p ma:name "name1" . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
GRAPH db:Uniprot.rdf {
?p ma:name "name2" . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
GRAPH db:Uniprot.rdf {
?p ma:name "name3" . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
Send current (relevent) bindings with the query:
# The uniprot data (in RDF) has motif and pathway information.
GRAPH db:Uniprot.rdf {
?p ma:name ?name . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
BINDINGS ?name {
("name1")
("name2")
("name3") }
BINDINGS
section at bottom of query.# The uniprot data (in RDF) has motif and pathway information.
GRAPH db:Uniprot.rdf {
?p ma:name ?name . # bound to ?ma.ma:name
?p up:motif ?motif .
?p up:pathway "apoptosis" }
BINDINGS ?name {
("name1") # ⊨ rows 1-3
("name2") # ⊨ no new rows
("name3") } # ⊨ row 4
name | p | motif |
---|---|---|
name1 | <http://...protein/352354#p> | motifA |
name1 | <http://...protein/856876#p> | motifB |
name1 | <http://...protein/833993#p> | motifC |
name3 | <http://...protein/685685#p> | motifX |