W3C logo
slanted W3C logo
 Cover page images (keys)

MySQL in the Semantic Web
a SPARQL parser in mysqld

Eric Prud'hommeaux, RDF Data Access Working Group team contact

http://www.w3.org/2007/Talks/0424-spasql/
http://tinyurl.com/39sp65

Why did you add a parser to mysqld?

One Query Language

A pharmaceutical company does a query joining

Save $1,000,000 / day with early termination.

What is SPASQL

Making queries like:

SELECT ?billCity ?billState
 WHERE { ?order <Orders.customer> ?cust .
         ?cust <Customers.billingAddress> ?addr .
         ?addr <Addresses.city> ?billCity .
         ?addr <Addresses.state> ?billState }

on data like:

Orders
idcustomerproductorderDateshippingAddress
218651873852006-05-01 01:08:30NULL
218620829222006-05-01 18:34:38NULL
218751874392006-05-01 01:05:30NULL
Customers
givenNamefamilyNamebillingAddress
BobSmith18

(simple one-database example)

What is That Query?

What's RDF?

Why Put Data in RDF?

Basis in URIs means everything is globally identifiable...

Data Models

XML: query element/attribute structure.

<Person name="Sue"><knows><Person name="Bob">...</Person></name></Person>
<Person id="p7"><name>Sue</name>...</Person>
<Person id="p23"><name>Bob</name>...</Person>
<knows who="p7" whom="p23"/>
<knows who="p7" whom="..."/>

Relational: tuples of attributes

knows and Persons tables

RDF: relationships between objects.

Sue knows Bob triple

Patterns

XQuery: query element/attribute structure.

let $pairs := //knows
let $n1 := Person/@name[@id=$pairs/@who]
let $n2 := Person/@name[@id=$pairs/@whom]

SQL: query attributes of objects

FROM knows
     JOIN Persons as who ON knows.who=who.id
     JOIN Persons as whom ON knows.whom=whom.id

SPARQL: query relationships between objects.

WHERE { ?who foaf:knows ?whom .
        ?who foaf:name ?n1 .
        ?whom foaf:name ?n2 }

What is SPASQL?

SPASQL Example

Use case: OrderTracking.

Orders
idcustomerproductorderDateshippingAddress
218651873852006-05-01 01:08:30NULL
218620829222006-05-01 18:34:38NULL
218751874392006-05-01 01:05:30NULL

Data Structure

Term: Graph

Many ways to represent graphs:

XML

<order prodct="#p23">...<Product id="p23" name="mystifier">

relational

...

RDF

simple order-product graph

Comparing SPARQL to SQL — references

Both SPARQL and SQL can query graph data.

SQL graph shape comes from reuse of table aliases:

... FROM Orders AS o
    WHERE o.orderDate < "20060501"
      AND o.shipDate = "20060508"

SPARQL graph shape comes from reuse of variables.

   PREFIX ot: <http://bigco.example/Orders/>
... WHERE { ?o ot:orderDate "20060501" .
            ?o ot:shipDate "20060508" }

Comparing SPARQL to SQL — links

Term: restrict

SQL restricts cross joins on attribute values.

  ...
  FROM Orders AS o
       JOIN Products AS p ON o.product=p.id

SPARQL has links in the data model that it queries.

  ...
  WHERE { ?o <Orders.products> ?p }

What's Out There?

Costs of SPASQL

UNION Example — SPARQL

SELECT ?part ?descr ?given ?family ?posn ?rma ?billCity ?billState ?shipCity ?shipState
 WHERE { { ?order <Orders.product> ?prod .
           ?order <Orders.customer> ?cust .
           OPTIONAL { ?order <Orders.shippingAddress> ?shipAddr .
                      ?shipAddr <Addresses.city> ?shipCity .
                      ?shipAddr <Addresses.state> ?shipState }
         } UNION {
           ?order <Returns.product> ?prod .
           ?order <Returns.customer> ?cust .
           ?order <Returns.rma> ?rma
         }
         ?prod <Products.partNo> ?part .
         ?prod <Products.description> ?descr .
         { ?cust <Customers.id> ?num .
           ?cust <Customers.givenName> ?given .
           ?cust <Customers.familyName> ?family .
           ?cust <Customers.billingAddress> ?addr
         } UNION {
           ?cust <Employees.id> ?num .
           ?cust <Employees.givenName> ?given .
           ?cust <Employees.familyName> ?family .
           ?cust <Employees.position> ?posn
         }
         ?cust <Employees.homeAddress> ?addr .
         ?addr <Addresses.city> ?billCity .
         ?addr <Addresses.state> ?billState }

UNION Example — corresponding SQL

SELECT Products_0.partNo AS part,
       Products_0.description AS descr,
       U1.given,
       U1.family,
       U1.posn,U0.rma,
       Addresses_1.city AS billCity,
       Addresses_1.state AS billState,
       U0.shipCity,
       U0.shipState
  FROM (
              SELECT Orders_0.customer AS cust,
                     Orders_0.product AS prod,
                     NULL AS rma,
                     IF(Addresses_0.city IS NULL OR Addresses_0.state IS NULL, NULL, Addresses_0.city) AS shipCity,
                     IF(Addresses_0.city IS NULL OR Addresses_0.state IS NULL, NULL, Addresses_0.state) AS shipState
                FROM Orders AS Orders_0
                     LEFT OUTER JOIN Addresses AS Addresses_0 ON Orders_0.shippingAddress=Addresses_0.id
               WHERE Orders_0.customer IS NOT NULL AND Orders_0.product IS NOT NULL
        UNION ALL
              SELECT Returns_0.customer AS cust,
                     Returns_0.product AS prod,
                     Returns_0.RMAnumber AS rma,
                     NULL AS shipCity,
                     NULL AS shipState
                FROM Returns AS Returns_0
               WHERE Returns_0.customer IS NOT NULL AND Returns_0.product IS NOT NULL AND Returns_0.RMAnumber IS NOT NULL
    ) AS U0
    INNER JOIN Products AS Products_0 ON U0.prod=Products_0.id
    INNER JOIN (
              SELECT Customers_0.id AS num,
                     Customers_0.givenName AS given,
                     Customers_0.familyName AS family,
                     NULL AS posn,
                     Customers_0.billingAddress AS addr
                FROM Customers AS Customers_0
               WHERE Customers_0.id IS NOT NULL AND Customers_0.givenName IS NOT NULL AND Customers_0.familyName IS NOT NULL AND Customers_0.billingAddress IS NOT NULL
        UNION ALL
              SELECT Employees_0.id AS num,
                     Employees_0.givenName AS given,
                     Employees_0.familyName AS family,
                     Employees_0.position AS posn,
                     Employees_0.homeAddress AS addr
                FROM Employees AS Employees_0
               WHERE Employees_0.id IS NOT NULL AND Employees_0.givenName IS NOT NULL AND Employees_0.familyName IS NOT NULL AND Employees_0.position IS NOT NULL AND Employees_0.homeAddress IS NOT NULL
    ) AS U1 ON U1.num=U0.cust
    INNER JOIN Addresses AS Addresses_1 ON U1.addr=Addresses_1.id
 WHERE Products_0.partNo IS NOT NULL AND Products_0.description IS NOT NULL AND Addresses_1.city IS NOT NULL AND Addresses_1.state IS NOT NULL;

Benefits

Future Work

Deployment Plan

Data Mapping Details

Term: tuple

Term: predicate

... ...

Data Mapping Details

... ...

Data Mapping Details

... ...
{ ?o <Orders.product> ?p }

Data Mapping Details

... ...
{ ?o <Orders.product> ?p
  ?o <Orders.shipDate> ?ship }

Data Mapping — NULLs

... ...
{ ?t <Orders.product> ?p
  OPTIONAL { ?t <Orders.shipDate> ?ship } }

Data Mapping — UNION

SPARQL UNION corresponds to an SQL subselected UNION

SPARQL

WHERE { { ?x <TableA.field> ?y }
        UNION
        { ?x <TableB.field> ?y } }

SQL

JOIN ( SELECT TableA.field AS foo FROM ... WHERE ...
       UNION
       SELECT TableB.field [AS foo] FROM ... WHERE ...  ) AS U0 ON U0.foo=...

Questions?

$#&@!!!