SPARQL Query Language for RDF

Comments by Pat Hayes

Editors working draft.
Live Draft - version:
$Revision: 1.140 $ of $Date: 2004/11/28 08:28:53 $
Latest published version:
First Working Draft


RDF is a flexible, extensible way to represent information about World Wide Web resources. It is used to represent, among other things, personal information, social networks, metadata about digital artifacts like music and images, as well as provide a means of integration over disparate sources of information. A standardized query language for RDF data with multiple implementations offers developers and end users a way to write and to consume the results of queries across this wide range of information. This document describes a query language for RDF, called SPARQL, for querying RDF data.

This document describes the query language part of SPARQL for easy access to RDF stores. It is designed to meet the requirements and design objectives described in the W3C RDF Data Access Working Group (DAWG) document "RDF Data Access Use Cases and Requirements".

Status of This document

This is a live document and is subject to change without notice. See also the change log. It represents the editor's best effort to reflect implementation experience and incorporate input from various members of the WG, but is not yet endorsed by the WG as a whole.


See also:


DAWG issues list

1 Introduction

Section status: bare outline

Key features in one page.  Refs to other documents by DAWG.

An RDF graph is a set of triples, each consisting of a subject, a predicate and an object, and a property relationship between them as defined in RDF Concepts and Abstract syntax. These triples can come from a variety of sources. For instance, they may come directly from an RDF document. They may be inferred from other RDF triples. They may be the RDF expression of data stored in other formats, such as XML or relational databases.

SPARQL is a query language for accessing ?Is this the same as querying? such RDF graphs. It provides facilities to:

As a data access language, it is suitable for both local and remote use. When used across networks, the companion document [@@ protocol document not yet published @@] describes a remote access protocol.

1.1 Document Conventions

When undeclared, the namespace rdf stands in place of http://www.w3.org/1999/02/22-rdf-syntax-ns#, the namespace rdfs stands in place of http://www.w3.org/2000/01/rdf-schema#, and the namespace xsd for http://www.w3.org/2001/XMLSchema#.

2 Making Simple Queries

Queries match graph patterns against the target graph of the query. Patterns are like graphs but may have named variables in place of some of the nodes or predicates; the simplest graph patterns are single triple patterns. Graph patterns can be combined using various operators into more complicated graph patterns.

A binding is a mapping from a variable in a query to terms ?Not yet defined. A pattern solution is a set of bindings which, when applied to the variables in the query, can be used to produce a subgraph of the target graph; query results are a set of pattern solutions. If there are no result mappings, the query results are the empty set.

Pictorially, suppose we have a graph with two triples and the given triple pattern:

_:1 foaf:mbox "alice@work.example"


_:2 foaf:mbox "robt@home.example"


?who foaf:mbox ?addr


with the result:

reference author
http://www.w3.org/TR/xpath "James Clark"
http://www.w3.org/TR/xpath "Steve DeRose"

RDF graphs are constructed from one or more triples, ex. graph1.

_:1 foaf:mbox "alice@work.example". _:1 foaf:knows _:2. _:2 foaf:mbox "robt@home.example"


?who foaf:mbox "alice@work.example". ?who foaf:knows ?whom. ?whom foaf:mbox ?address


A query for graphPattern1 will return the email address of people known by Alice (specifically, the person with the mbox alice@work.example). When matched against the example RDF graph, we get one result mapping which binds three variables:

referrer reference author
http://www.w3.org/TR/xpath http://www.w3.org/TR/xpath "James Clark"
http://www.w3.org/TR/xpath http://www.w3.org/TR/xpath "Steve DeRose"
But this looks like two result mappings.

2.1 Writing a Simple Query

The example below shows a query to find the title of a book from the information in an RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. Here, the SELECT clause names the variable of interest to the application ?What does that mean? The ideas of binding, variable etc. do not mention 'interest to the application'. What application??, and the WHERE clause has one triple pattern.


<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" . 


SELECT ?title
WHERE  ( <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title )

Query Result:

"SPARQL Tutorial"

The terms delimited by "<>" are URI References [13] (URIRefs); URIRefs can also be abbreviated with an XML QName-like form [14]; this is syntactic assistance and is translated to the full URIRef. Translated by what? When? Other RDF terms are literals which, following N-Triples syntax [7], are a string and optional language tag (introduced with '@') and datatype URIRef (introduced by '^^').

Variables in SPARQL queries have global scope; it is the same variable everywhere the name is used. Everywhere? Or everywhere in the same query? Variables are indicated by '?'; the '?' does not form part of the variable's name. This is a very bad idea, IMO. I'll expand on it elsewhere. It would be much better to not distinguish between variables and variable names. The distinction is unnecessary and introduces a hornet's nest of potential confusions, eg can two distinct variables have the same name? (Why not?) Also it means that a variable cannot be identified with a string in any SPARQL syntax.

An alternative choice here is '$'. Awaiting reports of usage in DB connection technologies.

Because URIRefs can be long, SPARQL provides an abbreviation mechanism. Prefixes can be defined and a QName-like syntax provides shorter forms: we also use the N3/Turtle [15] prefix mechanism for describing data. Prefixes apply to the whole query. Does the query have to use the same abbreviations as the target graph specification? I ask because in all the examples they do, suggesting an alignment is assumed or required.

PREFIX  dc: <http://purl.org/dc/elements/1.1/>
SELECT  ?title
WHERE   ( <http://example.org/book/book1> dc:title ?title )

PREFIX  dc: <http://purl.org/dc/elements/1.1/>
PREFIX  : <http://example.org/book/>
SELECT  ?title
WHERE   ( :book1  dc:title  ?title )

Similarly, we abbreviate data:

@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix :     <http://example.org/book/> .
:book1  dc:title  "SPARQL Tutorial" .

Prefixes are syntactic: the prefix name does not affect the query ?That sounds like you could change the prefixes without changing the query. Surely not?, nor do prefix names in queries need to be the same prefixes as used for data. This query is equivalent to the previous one and will give the same results when applied to the same graph.

PREFIX  dcore:  <http://purl.org/dc/elements/1.1/>
PREFIX  xsd:    <http://www.w3.org/2001/XMLSchema#>
SELECT  ?title
WHERE   ( ?book dcore:title ?title )
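The statement that prefix names are purely syntactic can be illustrated with a short sketch. The Python below is not part of SPARQL: the function name expand_qname and the use of a dictionary to hold PREFIX declarations are assumptions made only for illustration. It shows two different prefix names expanding to the same full URIRef, which is the sense in which the two queries are equivalent.

```python
def expand_qname(qname, prefixes):
    """Expand a QName-like token such as 'dc:title' to a full URIRef string.

    `prefixes` maps a prefix name (possibly the empty string) to a namespace URI.
    """
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local


# Two queries declare different prefix names for the same namespace.
first_query = {"dc": "http://purl.org/dc/elements/1.1/",
               "": "http://example.org/book/"}
second_query = {"dcore": "http://purl.org/dc/elements/1.1/"}

title_a = expand_qname("dc:title", first_query)
title_b = expand_qname("dcore:title", second_query)
book = expand_qname(":book1", first_query)

print(title_a == title_b)  # both name the same predicate
print(book)
```

On this reading, only the expanded URIRefs take part in matching, so the choice of prefix name in the query is independent of the prefix names used to write the data.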

RDF has typed literals. Such literals are written using "^^". You said that already. Integers can be directly written and are interpreted as typed literals of datatype xsd:integer.

@prefix ns:   <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://example.org/book/> .

:book1 ns:numPages  "200"^^xsd:integer .
:book2 ns:numPages  100 .

2.2 Triple Patterns

The building blocks of queries are triple patterns. Syntactically, a SPARQL triple pattern is a subject, predicate and object delimited by parentheses. (Can a URIref start with a parenthesis?) The previous example shows a triple pattern with a variable subject (the variable book), a predicate of dcore:title and a variable object (the variable title).

( ?book dcore:title ?title )

A triple pattern applied to a graph matches all triples with identical RDF terms for the corresponding subject, predicate and object. The variables in the triple pattern, if any, are bound to the corresponding RDF terms in the matching triples.

Definition: RDF Term

An RDF Term is anything that can occur in the RDF data model.
let RDF-U be the set of all RDF URI References
let RDF-L be the set of all RDF Literals
let RDF-B be the set of all bNodes

The set of RDF Terms, RDF-T, is RDF-U union RDF-L union RDF-B.

Definition: Query Variable

Let V be the set of all query variables.  V and RDF-T are disjoint.

A query variable is a name ?Is it? That reads very oddly to me. What is it a name of? Isn't it just a character string? , used to define queries as graph patterns. A query variable is associated with RDF terms in a graph by a binding.

An RDF triple contains three components:

In SPARQL, a triple pattern is like an RDF triple but with the addition that components can be a query variable instead.

Definition: Triple Pattern

The set of triple patterns is
    (RDF-U union RDF-B union V) x (RDF-U union V) x (RDF-T union V)

Proposed http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0313.html to make this
(RDF-T union V) x (RDF-T union V) x (RDF-T union V)

Do we really want to allow bnodes in patterns? There seems to be a misalignment between these definitions and later text. It does not make sense to both allow bnodes in patterns and to allow a subset of variables to be 'selected' for answer bindings. We have two obvious options: either (A) allow bnodes in queries, as these definitions do, and use variables only when one expects to get a binding; or (B) not allow bnodes in queries and distinguish selected variables from unselected ones. Which is the intended way we are going? I can fix the definitions either way round, but there is not enough information in this draft to tell which way is intended.  

Later. On reading further I am pretty sure that option B is the one intended, so I will tweak the definitions to conform to that. Where it matters, I'll mark the changes with [**B] to indicate that if we decide on a different option these will need to be changed.

[**B] Definition: Triple Pattern

The set of triple patterns is
    (RDF-U union V) x (RDF-U union V) x (RDF-U union RDF-L union V)


[**B] Define RDF Ground Terms, RDF-G to be RDF-U union RDF-L. Then a triple pattern is an element of

(RDF-G union V) x (RDF-U union V) x (RDF-G union V)

and a graph pattern, or simply a pattern, is a set of triple patterns.

Notice, no blank nodes allowed in a pattern anywhere.

Definition: Binding

A binding is a pair which defines a mapping from a variable to an RDF Term. If B is such a binding, var(B) is the variable of the binding, and val(B) is the RDF term.

See other document for summary of definitions.

In this document, we illustrate bindings in results in tabular form, Does this mean that SPARQL implementations must use this tabular form? I have already had negative feedback about these tables, eg Bob McGregor strongly suggested that returning answers as triple stores made more sense. Also, later in the document the tables are extended with blank entries, which do not make sense when understood as illustrations of bindings. :

x y
"Alice" "Bob"

Not every binding needs to exist in every row of the table. So far, the examples have shown queries that either exactly match the graph, or do not match at all. Optional Matches can cause bindings, but if they fail to match, they do not cause the solution to be rejected, and so can leave variables unset in a row of the table. ?So what does this mean, when understood as a specification of a binding? Is the unset variable bound or not? Or is it bound to a special 'blank' value?

I've sketched a series of definitions in another document to try to overcome this kind of objection.

Definition: Substitution

A substitution S is a partial functional relation from variables to RDF terms or variables. We write S[v] for the RDF term that S pairs with the variable v and define S[v] to be v where there is no such pairing.

Definition: Triple Pattern Matching

For substitution S and Triple Pattern T, S(T) is the triple pattern formed by replacing any variable v in T with S[v].

Triple Pattern T matches RDF graph G with substitution S, if S(T) is a triple of G.

If the same variable name is used more than once in a pattern then, within each solution to the query, the variable has the same value.

For example, the query:

SELECT * WHERE ( ?x ?x ?v )

matches the triple:

rdf:type rdf:type rdf:Property .

with solution:

x v
rdf:type rdf:Property

It does not match the triple:

rdfs:seeAlso rdf:type rdf:Property .

because the variable x would need to be both rdfs:seeAlso and rdf:type in the same solution.
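The matching rule illustrated by this example can be sketched in a few lines of Python. This is only an illustration, not a definition: the function names is_var and match_triple are hypothetical, triples are represented as plain tuples of strings, and a variable is assumed to be any string beginning with '?'. A substitution is represented as a dictionary from variables to RDF terms.

```python
def is_var(term):
    # Assumption for this sketch: variables are strings that start with '?'.
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, subst=None):
    """Return a substitution S extending `subst` such that S(pattern) == triple,
    or None if no such substitution exists."""
    result = dict(subst or {})
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must take the same value everywhere it occurs.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            # Non-variable components must be identical RDF terms.
            return None
    return result


pattern = ("?x", "?x", "?v")
matched = match_triple(pattern, ("rdf:type", "rdf:type", "rdf:Property"))
rejected = match_triple(pattern, ("rdfs:seeAlso", "rdf:type", "rdf:Property"))

print(matched)
print(rejected)
```

The second call returns None because ?x would have to be paired with two different terms in the same substitution.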

2.3 Graph Patterns

The keyword WHERE is followed by ?This sounds like a syntactic description, but you haven't mentioned syntax yet. a Graph Pattern which is made of one or more Triple Patterns. These Triple Patterns are "and"ed together. More formally, the Graph Pattern is the conjunction of the Triple Patterns. ?Not sure what that means, since patterns don't have truthvalues (do they??) Why do we need to say this?

There is a delicate issue in saying that a query is a conjunction, since queries are often treated logically as being on the RHS of an entailment - a goal to be proved - and these in turn are often treated as if they were negated, which maps conjunction into disjunction. So for many readers trained in eg. Prolog, a multiple-triple query would be considered to be a disjunction rather than a conjunction. But the chief point is that queries do not have truthvalues, and are not asserted, so the use of disjunction/conjunction language is not appropriate. In each query solution, all the triple patterns must be satisfied with the same binding of variables to values.


@prefix foaf:    <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .

There is a bNode [12] in this dataset. Just within the file, for encoding purposes, the bNode is identified by _:a but the information about the bNode label is not in the RDF graph. No query will be able to identify that bNode by the label used in the serialization. OK, but are you saying that this is inevitable or that this is just the way that SPARQL works? After all, it would be possible to return a bnode ID in an answer binding.


PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?mbox
WHERE
  ( ?x foaf:name "Johnny Lee Outlaw" )
  ( ?x foaf:mbox ?mbox )

Query Result:

mbox
<mailto:jlow@example.com>

This query contains a conjunctive graph pattern Is there any other kind of graph pattern?. A conjunctive graph pattern is a set of triple patterns, each of which must match for the graph pattern to match. You have to say that they all match with the same substitution.

It would be easier to just say all this directly for graphs, seems to me. Nothing is made simpler by first restricting to the single-triple case, and you have to say everything twice.

Definition: Graph Pattern (Partial Definition) – Conjunction

A set of triple patterns is a graph pattern GP. For such a graph pattern to match with substitution S, each triple pattern in GP must match with substitution S.

Definition: Graph Pattern Matching

For substitution S, we write S(GP) for the graph pattern produced by applying S to each triple pattern T in GP.

If GP = { T | T triple pattern } then S(GP) = { S(T) }

Graph Pattern GP matches RDF graph G with substitution S if G simply entails S(GP).

2.4 Multiple Matches

The results of a query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query, depending on the data.


@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .
_:b  foaf:name   "Peter Goodguy" .
_:b  foaf:mbox   <mailto:peter@example.org> .


PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?name, ?mbox
WHERE
  ( ?x foaf:name ?name )
  ( ?x foaf:mbox ?mbox )

Query Result:

name mbox
"Johnny Lee Outlaw" <mailto:jlow@example.com>
"Peter Goodguy" <mailto:peter@example.org>

The results enumerate the RDF terms to which the selected variables What does selected mean? You haven't mentioned that until now. can be bound in the graph pattern. In the above example, the following two subsets of the data caused the two matches.

 _:a foaf:name  "Johnny Lee Outlaw" .
 _:a foaf:mbox  <mailto:jlow@example.com> .

 _:b foaf:name  "Peter Goodguy" .
 _:b foaf:mbox  <mailto:peter@example.org> .

For a simple, conjunctive graph pattern match, all the variables used in the query pattern will be bound in every solution.

Definition: Pattern Solution

A Pattern Solution of Graph Pattern GP on graph G is any substitution S such that GP matches G with S.

For a graph pattern GP formed as a set of triple patterns, S(GP) has no variables and is a subgraph of G.

Definition: Query Solution

A Query Solution is a Pattern Solution where the pattern is the whole pattern of the query.

Definition: Query Results

The Query Results, for a given graph pattern GP on G, is written R(GP,G), and is the set of all query solutions such that GP matches G.

R(GP, G) may be the empty set.
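As an informal illustration of these definitions, the following Python sketch enumerates R(GP, G) for a graph pattern that is a set of triple patterns. It is a naive enumeration under stated assumptions, not a prescribed processing strategy: the names is_var, match_triple and solutions are hypothetical, terms are plain strings, variables are strings beginning with '?', and the graph is held in a list so that the output order is repeatable.

```python
def is_var(term):
    # Assumption for this sketch: variables are strings that start with '?'.
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, subst):
    """Extend `subst` so that the pattern becomes the triple, or return None."""
    result = dict(subst)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solutions(graph_pattern, graph):
    """Return every substitution under which all triple patterns match."""
    results = [{}]
    for pattern in graph_pattern:
        extended = []
        for subst in results:
            for triple in graph:
                new = match_triple(pattern, triple, subst)
                if new is not None:
                    extended.append(new)
        results = extended
    return results


graph = [
    ("_:a", "foaf:name", "Johnny Lee Outlaw"),
    ("_:a", "foaf:mbox", "mailto:jlow@example.com"),
    ("_:b", "foaf:name", "Peter Goodguy"),
    ("_:b", "foaf:mbox", "mailto:peter@example.org"),
]
pattern = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]

results = solutions(pattern, graph)
for row in results:
    print(row["?name"], row["?mbox"])

# A pattern with no match yields the empty set of solutions.
empty = solutions([("?x", "foaf:nick", "?nick")], graph)
```

Each returned dictionary binds every variable of the pattern, in line with the remark that a simple conjunctive match binds all variables in every solution.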

2.5 Blank Nodes

Blank Nodes and Queries

There is no standard representation of bNodes in RDF ?? Yes there is. What do you mean?? and the syntax of SPARQL queries does not allow them ?? But you just defined it to allow them: RDF-T includes RDF-B as a subset, and patterns are sets of triples of (RDF-T union V). In fact, this allows bnodes in places that RDF does not allow them, eg as the predicate of a pattern triple. They can form part of a pattern match ???You just said they could not !! and do take part in the pattern matching process.

Suggestions for better wording most welcome. I would make some if I could understand what you have in mind. The above seems self-contradictory.

I'd suggest that we EITHER allow bnodes in query patterns and do not distinguish 'selected' variables, OR ELSE we do not allow bnodes in query patterns and distinguish 'selected' variables. Let me know which of these you want. For now I'll assume the second alternative.

Blank Nodes and Results

In the results of queries, the presence of bNodes can be indicated but the internal system identification is not preserved. We need to say this without talking about systems. It would probably be best to use the terminology used in the RDF specs. The answer bindings are not required to be those in the target graph, only a renaming of them. This allows bnodes to be exchanged on a 1:1 basis. Thus, a client can tell that two solutions to a query differ in bNodes needed to perform the graph match but this information is only scoped to the results (result set or RDF graph).

Redo when XML syntax document is available and use the syntax there.


@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Alice" .
_:b  foaf:name   "Bob" .


PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?x ?name
WHERE  ( ?x foaf:name ?name )

Query Result:

x name
_:a "Alice"
_:b "Bob"

x name
_:r "Alice"
_:s "Bob"

?Why are there two results? These two results have the same information: the blank node used to match the query was different in the two solutions. There is no relation between using _:a in the results and any internal blank node label in the data graph; the labels in the results only indicate whether elements in the solutions were the same or different.
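The sense in which the two result tables carry the same information can be made concrete with a small sketch. The Python below is an illustration only: the function names is_bnode and same_up_to_renaming are hypothetical, rows are tuples of strings, and blank nodes are assumed to be the strings beginning with "_:". It tries every 1:1 renaming of blank node labels, which is adequate for small tables but not intended as an efficient algorithm.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption for this sketch: blank node labels are strings starting with "_:".
    return term.startswith("_:")


def same_up_to_renaming(rows_a, rows_b):
    """True if some 1:1 renaming of blank node labels turns rows_a into rows_b."""
    labels_a = sorted({t for row in rows_a for t in row if is_bnode(t)})
    labels_b = sorted({t for row in rows_b for t in row if is_bnode(t)})
    if len(labels_a) != len(labels_b):
        return False
    target = set(rows_b)
    for perm in permutations(labels_b):
        renaming = dict(zip(labels_a, perm))
        renamed = {tuple(renaming.get(t, t) for t in row) for row in rows_a}
        if renamed == target:
            return True
    return False


first = [("_:a", '"Alice"'), ("_:b", '"Bob"')]
second = [("_:r", '"Alice"'), ("_:s", '"Bob"')]
collapsed = [("_:r", '"Alice"'), ("_:r", '"Bob"')]

print(same_up_to_renaming(first, second))
print(same_up_to_renaming(first, collapsed))
```

The third table is not equivalent to the first because it uses one blank node where the first uses two, and that distinction is exactly what the labels in a result are meant to preserve.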

3 Working with RDF Literals

RDF Literals are written in SPARQL as strings, with optional language tag (indicated by '@') or optional datatype (indicated by '^^'), with additional convenience forms for xsd:integers and xsd:doubles:

Examples of literal syntax in SPARQL:

3.1 Matching RDF Literals

The dataset below contains a number of RDF literals:

@prefix dt:   <http://example.org/datatype#> .
@prefix ns:   <http://example.org/ns#> .
@prefix :     <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:x   ns:p     "42"^^xsd:integer .
:x   ns:p     "abc"^^dt:specialDatatype .
:x   ns:p     "cat"@en .

The pattern in the query matches because 42 is syntax for "42" with datatype URI http://www.w3.org/2001/XMLSchema#integer. Is it significant that the variable names correspond exactly to the binding qnames? Presumably not, but the coincidence is awfully suggestive.

SELECT ?v WHERE (?x ?p 42) 

?? The selected variable does not occur in the pattern. What is the result binding in this case? This query matches, without requiring the query processor to have any understanding of the values in the space:

SELECT ?v WHERE ( ?x ?p "abc"^^<http://example.org/datatype#specialDatatype> )

This query has a pattern that fails to match because "cat" is not the same RDF literal as "cat"@en:

SELECT ?v WHERE ( ?x ?p "cat" )

but this does find a solution: Can someone write a query with a variable in part of a literal? Eg in this case, a query like

SELECT ?v WHERE (?x ?p "cat"@?v)

Why not? After all, it seems to make sense and even would work in much the same way. Similarly for selecting a literal with a given type, by a construction like ?v^^xsd:date.

SELECT ?v WHERE ( ?x ?p "cat"@en ) 

Implementation Requirements

An implementation of SPARQL only needs to be able to match lexical forms and datatypes in graph patterns. It is not required to support the datatype hierarchy of XML Schema nor application-defined hierarchies. It is not required to provide matching in patterns based on value spaces ?What would that mean? . Thus, testing numerical equality in a constraint ?What is a constraint? is not identical to literal matching in pattern matching.

In this dataset,

@prefix ns:   <http://example.org/ns#> .
@prefix :     <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:x   ns:p     "42"^^xsd:short .

there is no match required ?? Do you mean no match exists? for the query:

SELECT ?v WHERE ( ?x ?p 42 )

but there is for this query,

SELECT ?v WHERE ( ?x ?p ?v ) AND ?v == 42

because of the use of numeric equality. ?!!? What? This is completely new syntax. What does it mean? WHY does this query match but not the previous one?? You have to give more explanation here of what the hell is going on. So far, a match is defined in terms of substitutions for variables, with no mention of equality or values. Introducing equality changes the language's expressive power enormously, so we need to say very exactly what we mean by this. For example, is this a legal query?

SELECT ?v ?w WHERE (?v ?p ?w) AND ?v == ?w

If so, does it match things that would not be matched by

SELECT ?v WHERE (?v ?p ?v)

An implementation may choose to provide datatype hierarchies and value based pattern matching. Applications using a SPARQL processor should not assume that the processor provides datatype hierarchies or matching based on value-spaces of literals unless the application knows explicitly that this is the case.
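The difference between matching literals as terms and comparing them by value can be sketched as follows. This Python is one possible reading, offered only as an illustration: the draft does not define the == operator, so the class Literal, the functions same_term and numeric_equal, and the choice of which datatypes count as numeric are all assumptions of this sketch.

```python
from typing import NamedTuple, Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


# Assumption: only these two datatypes are treated as numeric in this sketch.
NUMERIC = {XSD + "integer", XSD + "short"}


def same_term(a, b):
    """Term matching: lexical form, datatype and language tag must all agree."""
    return a == b


def numeric_equal(a, b):
    """Value comparison: parse both lexical forms and compare the numbers."""
    if a.datatype not in NUMERIC or b.datatype not in NUMERIC:
        raise TypeError("numeric equality needs two numeric literals")
    return int(a.lexical) == int(b.lexical)


short_42 = Literal("42", XSD + "short")
integer_42 = Literal("42", XSD + "integer")

print(same_term(short_42, integer_42))      # different datatype, different term
print(numeric_equal(short_42, integer_42))  # same number
print(same_term(Literal("cat"), Literal("cat", lang="en")))
```

Under this reading, a pattern containing 42 does not match "42"^^xsd:short because the terms differ, while a constraint that compares values can still accept it.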

3.2 Constraining Values

Graph pattern matching creates bindings of variables. It is possible to further restrict possible solutions by constraining the allowable binding of variables to RDF Terms. ?Surely they are already constrained to be RDF terms, right? Constraints in SPARQL take the form of boolean-valued expressions; the language also allows application-specific filter functions. This is not enough information to enable the reader to understand what this all means. What counts as a filter function? Boolean-valued expressions of what form??


@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix :     <http://example.org/book/> .
@prefix ns:   <http://example.org/ns#> .

:book1  dc:title  "SPARQL Tutorial" . 
:book1  ns:price  42 .
:book2  dc:title  "The Semantic Web" . 
:book2  ns:price  23 .


PREFIX  dc:  <http://purl.org/dc/elements/1.1/>
PREFIX  ns:  <http://example.org/ns#> 
SELECT  ?title ?price
WHERE   ( ?x dc:title ?title )
        ( ?x ns:price ?price ) AND ?price < 30

Query Result:

title price
"The Semantic Web" 23

By having a constraint on the "price" variable, only one of the books matches the query. Not with the definitions of 'match' given so far. Do you intend that the definitions should be modified? Like a triple pattern, this is just a restriction on the allowable values of a variable. No, it's not. It is a restriction on the value of the variable, not a syntactic restriction. This is a completely different ball game. I no longer know what the rules are. Are arbitrary computations allowed on terms? Can we restrict to, say, answer strings with more than ten characters, or URIrefs that do not contain Cyrillic script? Can the computation be state-sensitive, so we can restrict answers to those that are longer than a previous answer? The text needs to be very clear at this point on exactly what is being allowed here and what it means.

In any case, the definitions so far have been in terms of bindings to variables. Something has to make clear what the intended connection is between that way of understanding a query answer and this new way based on predicates on values.

Definition: Constraints

A constraint is a boolean-valued expression of variables and RDF Terms that can be applied to restrict query solutions. That isn't a definition. Any computable predicate can be counted as a boolean-valued expression.

Definition: Graph Pattern (Partial Definition) – Constraints

A graph pattern can also include constraints. These constraints further restrict the possible query solutions of matching a graph pattern with a graph. That isn't a definition either.

SPARQL defines a set of operations that all implementations must provide. In addition, there is an extension mechanism for boolean tests that are specific to an application domain or kind of data.

A constraint may lead to an error condition when testing some variable binding.  The exact error will depend on the constraint: in numeric operations, supplying a non-number or a bNode will lead to such  an error. ?What does 'supplying' mean here? You have not specified when these operators are to be called during the answering process, so a procedural understanding is incomplete, and you have not given a descriptive or formal specification of what the operation means. Any potential solution that causes an error condition in a constraint will not form part of the final results. Does that mean that errors are simply ignored in the answer set? Or that the answering process must report an error condition to the querying process?
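One way to read the rule that a solution causing an error is left out of the results is sketched below in Python. This is an assumption-laden illustration rather than a specification: the function name apply_constraint is hypothetical, solutions are dictionaries from variable names to already-parsed values, and the third row (a book whose price is not a number) is invented purely to trigger an error. The sketch silently drops such rows; whether an error should also be reported to the caller is left open by the text.

```python
def apply_constraint(solutions, constraint):
    """Keep the solutions for which the constraint evaluates to true.

    A solution whose evaluation raises an error is dropped, not reported.
    """
    kept = []
    for solution in solutions:
        try:
            if constraint(solution):
                kept.append(solution)
        except (TypeError, KeyError):
            # Error condition, for example comparing a non-number with a
            # number, or a variable that is not bound in this solution.
            continue
    return kept


solutions = [
    {"title": "SPARQL Tutorial", "price": 42},
    {"title": "The Semantic Web", "price": 23},
    {"title": "Hypothetical Book", "price": "n/a"},  # invented row, not in the data
]

cheap = apply_constraint(solutions, lambda s: s["price"] < 30)
print(cheap)
```

Only the solution for "The Semantic Web" survives: the first fails the test, and the invented third row raises an error during comparison and is therefore excluded.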

4 Including Optional Values

So far, the graph matching and value constraints allow queries that perform exact matches on a graph. For every solution of the query, every variable is bound to an RDF Term. Sometimes useful, additional information about some item of interest in the graph can be found but, for another item, the information is not present. If the application writer wants that additional information, the query should not fail just because some information is missing. ?? I cannot follow what the last two sentences are saying. What is an 'item'? Why is the information 'additional'? (Additional to what?)

4.1 Optional Matching

Optional portions of the graph ?What does that mean? RDF does not have a notion of an optional portion of a graph. may be specified in either of two equivalent ways:

 OPTIONAL (?s ?p ?o)...
 [ (?s ?p ?o)... ]


@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .

_:a  rdf:type        foaf:Person .
_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .

_:b  rdf:type        foaf:Person .
_:b  foaf:name       "Bob" .

Query (these two are the same query using slightly different syntax):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name  ?name )
       OPTIONAL ( ?x  foaf:mbox  ?mbox )

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name  ?name )
       [ ( ?x  foaf:mbox  ?mbox ) ]

Query result:

name     mbox
"Alice"  <mailto:alice@work.example>
"Bob"

Now, there is no value of mbox where the name is "Bob". It is left unset in the result.

But what does that mean? A result is defined in terms of mappings from variables to terms. If there is no term, there is no mapping.

This query finds the names of people in the dataset and, if there is an mbox property, retrieves that as well. In the example, only a single triple pattern is given in the optional match part of the query but in general it is a graph pattern.

For each optional block, the query processor attempts to match the query pattern. It would be better to describe this without referring to processing. The spec should not constrain the processing strategy. Failure to match the block does not cause this query solution to be rejected. The whole graph pattern of an optional block must match for the optional to add to the query solution.

4.2 Multiple Optional Blocks

A query may have zero or more top-level optional blocks. These blocks will fail or provide bindings independently. Optional blocks can also be nested, that is, an optional block may appear inside another optional block. Dear God, is all this really necessary? What does a nested optional mean? If the block is optional already, what purpose is served by making part of it even more optional?


@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .

_:a  foaf:name       "Alice" .
_:a  foaf:homepage   <http://work.example.org/alice/> .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .


PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?hpage
WHERE  ( ?x foaf:name  ?name )
       [ ( ?x foaf:mbox ?mbox ) ]
       [ ( ?x foaf:homepage ?hpage ) ]

Query result:

name     mbox                       hpage
"Alice"                             <http://work.example.org/alice/>
"Bob"    <mailto:bob@work.example>

In this example, there are two independent optional blocks. Each depends only on variables defined in the non-optional part of the graph pattern. If a new variable is mentioned ?Do you mean, occurs in? in an optional block (as mbox and hpage are mentioned in the previous example), that variable can be mentioned in that block and can not be mentioned in a subsequent block. ??Why not? Is there a reason for this exclusion?

4.3 Optional Matching – Formal Definition

In an optional match, either a graph pattern matches a graph and so defines one or more pattern solutions, or gives an empty pattern solution but does not cause matching to fail overall.

Definition: Optional Matching

Given graph pattern GP1, and graph pattern GP2, let GP= (GP1 union GP2).

The optional match of GP2 of graph G, given GP1, defines a pattern solution PS such that:

If GP matches G, then the solutions are the pattern solutions of GP matching G; otherwise the solutions are the pattern solutions of GP1 matching G.
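The worked example in section 4.1 suggests a per-solution reading of this definition: each solution of GP1 is extended by GP2 where possible and kept unchanged otherwise. The Python sketch below follows that reading, which is an assumption and not the only way to interpret the definition. The names is_var, match_triple, solutions and optional_match are hypothetical, terms are plain strings, and an unset variable is simply absent from the solution dictionary.

```python
def is_var(term):
    # Assumption for this sketch: variables are strings that start with '?'.
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, subst):
    """Extend `subst` so that the pattern becomes the triple, or return None."""
    result = dict(subst)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solutions(graph_pattern, graph, start=None):
    """All substitutions extending `start` under which every pattern matches."""
    results = [dict(start or {})]
    for pattern in graph_pattern:
        extended = []
        for subst in results:
            for triple in graph:
                new = match_triple(pattern, triple, subst)
                if new is not None:
                    extended.append(new)
        results = extended
    return results


def optional_match(gp1, gp2, graph):
    """Solutions of gp1, each extended by gp2 when gp2 matches, else unchanged."""
    results = []
    for base in solutions(gp1, graph):
        extended = solutions(gp2, graph, base)
        results.extend(extended if extended else [base])
    return results


graph = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@work.example"),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", "Bob"),
]

rows = optional_match([("?x", "foaf:name", "?name")],
                      [("?x", "foaf:mbox", "?mbox")],
                      graph)
for row in rows:
    print(row["?name"], row.get("?mbox"))
```

In this representation the solution for Bob has no entry for ?mbox at all, which is one possible answer to the question of what an unset variable means: the substitution is partial rather than bound to a special blank value.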

5 Nested Patterns

Graph patterns may contain nested patterns ??What does that mean? . We've seen this earlier in optional matches. Nested patterns are delimited with {}s:

{ ( ?s ?p ?n2 ) ( ?n2 ?p2 ?n3 ) }

Definition: Graph Pattern – Nesting

A graph pattern GP can contain other graph patterns GPi. No, it can't. You have defined what 'graph pattern' means: it is a set of pattern triples. A set of triples cannot (by definition) contain a set of triples. A query solution of Graph Pattern GP on graph G is any B such that each element GPi of GP matches G with binding B.

For example:

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name ?name )
         { ( ?x foaf:mbox ?mbox ) }

Because this example has a simple conjunction for the nested pattern, and because the nested pattern is a conjunctive element What does that mean? in the outer pattern, this has the same results:

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name ?name ) ( ?x foaf:mbox ?mbox )

Optional blocks can be nested. The outer optional block must match for any nested one to apply. That is, the outer graph pattern is fixed for the purposes of any nested optional block. I am completely unable to understand what is going on here. What does 'fixed for the purposes' mean? What does it mean for a pattern to 'apply'? What is a block?


@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#> .
_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .
_:a  vcard:N         _:d .

_:d  vcard:Family    "Hacker" .
_:d  vcard:Given     "Alice" .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .

_:c  foaf:name       "Eve" .
_:c  vcard:N         _:e .

_:e  vcard:Family    "Hacker" .
_:e  vcard:Given     "Eve" .


PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?foafName ?mbox ?fname ?gname
WHERE  ( ?x foaf:name ?foafName )
       [ ( ?x foaf:mbox ?mbox ) ]
       [ ( ?x  vcard:N  ?vc )
          [ ( ?vc vcard:Family ?fname )
            ( ?vc vcard:Given  ?gname ) ]
       ]

Query result:

foafName  mbox                          fname     gname
"Alice"   <mailto:alice@work.example>   "Hacker"  "Alice"
"Bob"     <mailto:bob@work.example>
"Eve"                                   "Hacker"  "Eve"

This query finds the name, optionally the mbox, and also optionally the vCard structured name components. By nesting the optional access to vcard:Family and vcard:Given, the query only reaches these What does that mean? There seems to be an implicit assumption here of a processing model being used by the query answering process. I suggest that this either be made explicit or eliminated. if there is a vcard:N property. It is possible to expand out What does that mean? optional blocks to remove nesting at the cost of duplication of expressions. Is this always the case? That is, can we always think of nesting as syntactic sugar for an expanded form? If so, I suggest we do that explicitly. Here, the expression is a simple triple pattern on vcard:N but it could be a complex graph match with value constraints.

5.1 Nested Optional Blocks

There is an additional condition that must be met for nested optional blocks. Considering the query pattern as a tree of blocks ?? I have no idea how to make sense of this. A query pattern is a set. , then a variable in an optional block can only be mentioned in other optional blocks nested within this one. ??Is this a constraint that follows from something structural, or is it an imposed condition? If the latter, why?? A variable cannot be used in two optional blocks where the outermost mention (shallowest occurrence in the tree for each occurrence) of the two uses is not the same block.

For all occurrences of a variable v in a query, the outermost mention of v must be the same.

Suggestions for better wording most welcome! I would make some if I had the remotest idea what was being said.

The purpose of this condition is to enable the query processor to process the query blocks What is a query block? in arbitrary (or optimized) order. If a variable were introduced in one optional block and mentioned in another, it would be used to constrain the second. Reversing the order of the optional blocks would reverse which block introduced the variable and which block was constrained by it. Such a query could give different results depending on the order in which those blocks were evaluated.
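The order dependence described above only arises under a particular processing model. The Python sketch below assumes one such model, in which optional blocks are applied one after another to the solutions found so far. This model is an assumption of the sketch, not something this document defines. The ex: property names and the helper names are hypothetical.

```python
def match_triple(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def match_pattern(patterns, graph, binding=None):
    """All bindings under which every triple pattern occurs in graph."""
    solutions = [dict(binding or {})]
    for pattern in patterns:
        solutions = [
            extended
            for solution in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, solution)) is not None
        ]
    return solutions


def apply_optional(solutions, block, graph):
    """Extend each solution by block where possible, else keep it unchanged."""
    out = []
    for solution in solutions:
        extended = match_pattern(block, graph, solution)
        out.extend(extended if extended else [solution])
    return out


graph = [
    ("_:a", "ex:name", "Alice"),
    ("_:a", "ex:work", "mailto:alice@work.example"),
    ("_:a", "ex:home", "mailto:alice@home.example"),
]
base = match_pattern([("?x", "ex:name", "?name")], graph)

# Two optional blocks that both mention ?mbox, which the condition forbids.
block_work = [("?x", "ex:work", "?mbox")]
block_home = [("?x", "ex:home", "?mbox")]

work_first = apply_optional(apply_optional(base, block_work, graph), block_home, graph)
home_first = apply_optional(apply_optional(base, block_home, graph), block_work, graph)
```

In this sketch, whichever block runs first binds ?mbox and the other block then fails to extend the solution, so work_first and home_first differ. A purely set-based reading of patterns would need a different account.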

This is extremely puzzling. Both RDF graphs and queries are defined as sets of triples, and answers depend only on the existence of substitution mappings. So how can results possibly depend on the order in which anything is evaluated?

6 More Pattern Matching – Alternatives

SPARQL provides a means of combining graph patterns into more complex ones so that one of several possibilities is attempted ?? Attempted by what? This seems inappropriate language. to see if it matches.  If more than one of the alternatives matches, all the possible pattern solutions are found. I can't make sense of this.

6.1 Joining Patterns with UNION

The UNION keyword is the syntax for pattern alternatives.


@prefix dc10:  <http://purl.org/dc/elements/1.0/> .
@prefix dc11:  <http://purl.org/dc/elements/1.1/> .

_:a  dc10:title     "SPARQL Query Language Tutorial" .
_:a  dc10:creator   "Alice" .

_:b  dc11:title     "SPARQL Protocol Tutorial" .
_:b  dc11:creator   "Bob" .


PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title
WHERE  ( ?book dc10:title  ?title ) UNION ( ?book dc11:title  ?title )

Query result:

"SPARQL Protocol Tutorial"
"SPARQL Query Language Tutorial"

This query finds titles of the books in the dataset, whether the title is recorded using Dublin Core properties from version 1.0 or version 1.1. If the application wishes to know how exactly the information was recorded, then the query:

PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title10 ?title11
WHERE  ( ?book dc10:title ?title10 ) UNION ( ?book dc11:title  ?title11 )

title11                      title10
"SPARQL Protocol Tutorial"
                             "SPARQL Query Language Tutorial"

will return results with the variables title10 or title11 bound depending on which way the query processor matches the pattern to the dataset. Note that, unlike optionals, if no part of the union pattern matched, then the query pattern would not match.

6.2 Blocks in Union Patterns

More than one triple pattern can be given in a pattern being used in a pattern union: In general, it seems kind of silly to first introduce each topic using single triples and then re-state it all using general graphs. Of course more than one triple pattern can be given: triple patterns are only a trivial case of general patterns.

PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title ?author
WHERE  { ( ?book dc10:title ?title )  ( ?book dc10:creator ?author ) }
       UNION
       { ( ?book dc11:title ?title )  ( ?book dc11:creator ?author ) }

author   title
"Alice"  "SPARQL Query Language Tutorial"
"Bob"    "SPARQL Protocol Tutorial"

This query will only match a book if it has both a title and creator property from the same version of Dublin Core.

6.3 Alternative Matching – Formal Definition

Definition: Pattern Matching (Union)

Given graph patterns GP1 and GP2, and graph G,  then a union pattern solution of GP1 and GP2 is any pattern solution S such that either S(GP1) matches G or S(GP2) matches G with substitution S.

Query results involving a pattern containing GP1 and GP2 will include separate solutions for each match where GP1 and GP2 give rise to different sets of bindings.
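As an illustration only, the union definition can be sketched in Python as the concatenation of the solutions of each alternative. The sketch assumes triples are plain tuples and variables are strings starting with "?". The helper names are hypothetical.

```python
def match_triple(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def match_pattern(patterns, graph, binding=None):
    """All bindings under which every triple pattern occurs in graph."""
    solutions = [dict(binding or {})]
    for pattern in patterns:
        solutions = [
            extended
            for solution in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, solution)) is not None
        ]
    return solutions


def union_match(gp1, gp2, graph):
    """Every solution of GP1 followed by every solution of GP2."""
    return match_pattern(gp1, graph) + match_pattern(gp2, graph)


graph = [
    ("_:a", "dc10:title", "SPARQL Query Language Tutorial"),
    ("_:a", "dc10:creator", "Alice"),
    ("_:b", "dc11:title", "SPARQL Protocol Tutorial"),
    ("_:b", "dc11:creator", "Bob"),
]

# Same variable in both alternatives: one column of titles.
titles = union_match(
    [("?book", "dc10:title", "?title")],
    [("?book", "dc11:title", "?title")],
    graph,
)

# Different variables: each solution binds only one of them.
by_version = union_match(
    [("?book", "dc10:title", "?title10")],
    [("?book", "dc11:title", "?title11")],
    graph,
)
```

With different variable names, each solution records which alternative produced it, because only one of ?title10 and ?title11 is bound.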

7 More Pattern Matching – Unsaid

Section status: working group is not working on this feature at the moment. It is currently likely to be dropped from the SPARQL query language.

8 Choosing What to Query

While the RDF data model is limited to expressing triples with a subject, predicate and object, many RDF data stores augment this with a notion of the source of each triple or part of the overall graph. Typically, implementations associate RDF triples or graphs with a URI specifying their real or virtual origin.

A SPARQL query is against a single RDF Graph called the Data Graph. This graph may be constructed through logical inference, and never materialized. It can be arbitrarily large or infinite.

The Data Graph is either a single RDF graph or it is an RDF-merge operation over a collection of RDF Graphs.  Such a merge can be virtual and it does not have to be materialised.

Definition: Data Graph (from a collection)

Given a collection of RDF Graphs {RG1, ..., RGn}, the Data Graph, DG, is an RDF graph formed from the RDF-merge of the graphs RG1, ..., RGn.
In the query context, there is a mapping from a collection of URI References (URIrefs) GN1...GNn to the collection of graphs RG1...RGn.

The RDF data graph can be given implicitly by the local API, externally from the SPARQL protocol or it can be specified in the query itself. The FROM clause gives URIs that the query processor can use to supply RDF Graphs for the query execution.


FROM <http://www.w3.org/2000/08/w3c-synd/home.rss>
WHERE ( ?x ?y ?z )

A query processor may use any local mechanism to associate a data graph with a query.  For example, it could use this URI to retrieve the document, parse it and use the resulting triples to provide the data graph; alternatively, it might only service queries against one of a known set of data graphs and the FROM clause is used to identify one or more such locally known graphs with the query.

Aggregate graphs may also be queried by using multiple source URIs in the FROM clause such as:

FROM <http://example.org/foaf/aliceFoaf>  <http://example.org/foaf/bobFoaf>

An aggregate graph is the RDF-merge of a number of subgraphs. Implementations may provide a single web service target that aggregates multiple source URIs, accessed by the DAWG protocol or some other mechanism.

Will need to significantly update when the protocol is drafted.

The RDF graph may be constructed through inference rather than retrieval or never be materialized.

The abbreviated form of URI references given by prefixes can also be used: this FROM clause gives the same URIs as the example above:

PREFIX data: <http://example.org/foaf/>
FROM   data:aliceFoaf data:bobFoaf


9 Querying the Origin of Statements

When querying a collection of graphs, the SOURCE keyword allows access to the URI references naming the graphs in the collection, or restriction by URI reference. SOURCE causes a pattern to be applied to graphs in the collection, respecting their graph labels.

The following two graphs will be used in examples:

# Graph: http://example.org/foaf/aliceFoaf
@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name     "Alice" .
_:a  foaf:mbox     <mailto:alice@work.example> .
_:a  foaf:knows    _:b .

_:b  foaf:name     "Bob" .
_:b  foaf:mbox     <mailto:bob@work.example> .
_:b  foaf:age      32 .
_:b  foaf:PersonalProfileDocument <http://example.org/foaf/bobFoaf> .

# Graph: http://example.org/foaf/bobFoaf
@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:1  foaf:mbox     <mailto:bob@work.example> .
_:1  foaf:PersonalProfileDocument <http://example.org/foaf/bobFoaf>.
_:1  foaf:age 35 .

9.1 Accessing Graph Labels

Access to the graph labels of the collection of graphs being queried is by variable in the SOURCE expression.

The query below looks for the source and age for Bob:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?age
FROM  <http://example.org/foaf/aliceFoaf> <http://example.org/foaf/bobFoaf>
      ( ?x foaf:mbox <mailto:bob@work.example> )
      SOURCE ?src ( ?x foaf:age ?age )

Because the bNode for the FOAF records is within a single graph of the collection, this information can also be found when restricting a graph pattern, not just a triple pattern, as below:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX data: <http://example.org/foaf/>

SELECT ?src ?age
FROM   data:aliceFoaf data:bobFoaf
  SOURCE ?src
    ( ?x foaf:mbox <mailto:bob@work.example> )
    ( ?x foaf:age ?age )

The query result gives the label and value for the age information:

src age
<http://example.org/foaf/aliceFoaf> 32
<http://example.org/foaf/bobFoaf> 35
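The behaviour of SOURCE over a collection of graphs can be sketched in Python. This is a simplified illustration, not a definition: it assumes the collection is a dictionary from graph label to a list of triples, that a pattern under SOURCE is matched inside one graph at a time, and that variables are strings starting with "?". The function source_match and the other helper names are hypothetical.

```python
def match_triple(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, or None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def match_pattern(patterns, graph, binding=None):
    """All bindings under which every triple pattern occurs in graph."""
    solutions = [dict(binding or {})]
    for pattern in patterns:
        solutions = [
            extended
            for solution in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, solution)) is not None
        ]
    return solutions


def source_match(label, patterns, named_graphs, binding=None):
    """Match patterns inside single graphs, selected or reported via label."""
    binding = dict(binding or {})
    results = []
    for name, graph in named_graphs.items():
        if label.startswith("?"):
            # Variable label: bind it to the graph name (or check a prior binding).
            if binding.get(label, name) != name:
                continue
            start = {**binding, label: name}
        elif label != name:
            continue  # Fixed label: only the named graph is searched.
        else:
            start = binding
        results.extend(match_pattern(patterns, graph, start))
    return results


ALICE = "http://example.org/foaf/aliceFoaf"
BOB = "http://example.org/foaf/bobFoaf"
named_graphs = {
    ALICE: [
        ("_:a", "foaf:knows", "_:b"),
        ("_:b", "foaf:mbox", "mailto:bob@work.example"),
        ("_:b", "foaf:age", 32),
    ],
    BOB: [
        ("_:1", "foaf:mbox", "mailto:bob@work.example"),
        ("_:1", "foaf:age", 35),
    ],
}
pattern = [("?x", "foaf:mbox", "mailto:bob@work.example"), ("?x", "foaf:age", "?age")]

rows = source_match("?src", pattern, named_graphs)   # label reported in ?src
only_bob = source_match(BOB, pattern, named_graphs)  # restricted to one graph
```

Here rows carries one solution per graph that states an age for Bob, while only_bob contains the single solution from the graph labelled bobFoaf.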

It is not necessary to use the FROM clause to create the data graph for a collection of graphs. The query environment may contain the single RDF graph or a named collection of graphs that the query is to be applied to.

9.2 Restricting by Graph Label

The query can restrict the matching applied to a specific graph by supplying the graph label.  This query looks for the age as the graph file:bobFoaf.n3 states it.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX data: <http://example.org/foaf/>

FROM   data:aliceFoaf data:bobFoaf
    ( ?x foaf:mbox <mailto:bob@work.example> )
    SOURCE data:bobFoaf ( ?x foaf:age ?age )

which yields a single solution:


9.3 Restricting via Query Pattern

The Query Graph is the RDF-merge of the graphs derived from files aliceFoaf.n3 and bobFoaf.n3.  This can be queried using graph patterns and the information used in SOURCE restrictions.

The query below uses the merged graph to find the profile document for Bob. Note that the pattern in the SOURCE part finds the bNode for the person with the same mail box (given by variable mbox), because the bNode matched by variable whom in Alice's FOAF file is not the same bNode as the one in the profile document.

PREFIX data: <http://example.org/foaf/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?mbox ?age ?ppd
FROM   data:aliceFoaf data:bobFoaf
  SOURCE data:aliceFoaf
    ( ?alice foaf:mbox <mailto:alice@work.example> )
    ( ?alice foaf:knows ?whom )
    ( ?whom foaf:mbox ?mbox )
    ( ?whom foaf:PersonalProfileDocument ?ppd )
  SOURCE ?ppd
     ( ?w foaf:mbox ?mbox )
     ( ?w foaf:age ?age )

mbox                       age  ppd
<mailto:bob@work.example>  35   <http://example.org/foaf/bobFoaf>

The triple in Alice's FOAF file giving Bob's age is not used to provide an age for Bob.

9.4 SOURCE and a single, unnamed graph

The datagraph for a query can be a collection of named graphs but it can also be a single RDF graph, with no information as to the source of triples. In this case, an error is generated.

Original text was "A data store that does not support source SHOULD bind SOURCE variables to NULL and fail to match source-constrained queries."

But NULL is not a plain value, nor is it a graph label. NULL != NULL (in SQL at least), so the following fails:

   SOURCE ?src (?x ?y ?z)
   SOURCE ?src (?x ?y ?z)

which is odd and introduces order dependences in query execution.

There are 5 possibilities that I can see:

  1. Fail a query with SOURCE in it - special error, detectable during parsing.
  2. Don't match that part of the query
  3. Just don't bind ?src (then the above corner case does work - there is no restriction placed on ?src)
  4. Ensure there is always a URI like http://localhost/dataGraph
  5. Require a name for any graph.

Proposal: 1 - justification is that the application asked for a kind of information that is not available and it is clearer than 2 because it says that any query will fail.  It's not a feature of the current state of the data graph that there are no matches. 3 is also acceptable.  4 and 5 do not reflect that it will be common to query a graph with no labelled subgraphs.

9.5 Definition for SOURCE

Definition: SOURCE


10 Result Forms

SPARQL has a number of query forms for returning results. These result forms use the solutions from pattern matching the query pattern to form result sets or RDF graphs. A result set is a serialization of the bindings in a query result. The query forms are:

SELECT: Returns all, or a subset of, the variables bound in a query pattern match. Formats for the result set can be in XML or RDF/XML (see the result format document).
CONSTRUCT: Returns either an RDF graph that provides matches for all the query results or an RDF graph constructed by substituting variables in a set of triple templates.
DESCRIBE: Returns an RDF graph that describes the resources found.
ASK: Returns whether a query pattern matches or not.

10.1 Selecting which Variables to Return

The SELECT form of results returns the variables directly. The syntax SELECT * is shorthand for select all the variables.

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:name   "Alice" .
_:a    foaf:knows  _:b .
_:a    foaf:knows  _:c .

_:b    foaf:name   "Bob" .

_:c    foaf:name   "Clare" .
_:c    foaf:nick   "CT" .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?nameX ?nameY ?nickY
     ( ?x foaf:knows ?y    )
     ( ?x foaf:name ?nameX )
     ( ?y foaf:name ?nameY ) [ ( ?y foaf:nick ?nickY ) ]

nameX    nameY    nickY
"Alice"  "Bob"
"Alice"  "Clare"  "CT"

Result sets can be accessed by the local API but also can be serialized into either XML or an RDF graph. The XML result set [@@linkme@@] form gives:

Example in XML

And in RDF/XML, using the result set vocabulary [@@link to result set description@@] gives:

    <rs:solution rdf:parseType="Resource">
      <rs:binding rdf:parseType="Resource">
      <rs:binding rdf:parseType="Resource">
      <rs:binding rdf:parseType="Resource">
   <rs:solution rdf:parseType="Resource">
      <rs:binding rdf:parseType="Resource">
      <rs:binding rdf:parseType="Resource">

Results can be thought of as a table, with one row per query solution. Some cells may be empty because a variable is not bound in that particular solution.

Results form a set of tuples. However, implementations may include duplicates for implementation and performance reasons unless indicated otherwise by the presence of the DISTINCT keyword.


The result set can be restricted by adding the DISTINCT keyword which ensures that every combination of variable bindings (i.e. each result) in a result set must be unique. Thought of as a table, each row is different.

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:name   "Alice" .
_:a    foaf:mbox   <mailto:alice@org> .

_:z    foaf:name   "Alice" .
_:z    foaf:mbox   <mailto:smith@work> .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name WHERE ( ?x foaf:name ?name )


The LIMIT form puts an upper bound on the number of solutions returned. A query may return a number of results up to and including the limit.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE ( ?x foaf:name ?name )
LIMIT 20

Limits on the number of results can also be applied via the SPARQL query protocol [@@ protocol document not yet published @@]. Do we say anything about which 20 results are delivered? If the same query is repeated, must the answer be the same? Or if there are in fact 40 results, must the second query get the second batch of 20?

Definition for SELECT

Definition: Projection

For a substitution S and a finite set of variables VS,
project(S, VS) = { (v, S[v]) | v in VS }

For a query solution Q, project(Q, VS) is { project(S, VS) | S in Q }

For a set QS of query solutions, project(QS, VS) is { project(Q, VS) | Q in QS }

The SELECT query form is a projection of pattern solutions.
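A minimal Python sketch of projection, together with the DISTINCT and LIMIT behaviour described earlier in this section, is given here for illustration. It assumes a solution is a dictionary from variable name to value and represents an unbound variable as None. Both are choices made for this sketch, not by the definition. The function names project and select are hypothetical.

```python
def project(solution, variables):
    """Keep only the named variables; an unbound variable maps to None."""
    return {v: solution.get(v) for v in variables}


def select(solutions, variables, distinct=False, limit=None):
    """Project every solution, optionally dropping duplicates and truncating."""
    rows = [project(s, variables) for s in solutions]
    if distinct:
        unique = []
        for row in rows:
            if row not in unique:
                unique.append(row)
        rows = unique
    # LIMIT is an upper bound: fewer rows are returned if fewer exist.
    return rows if limit is None else rows[:limit]


solutions = [
    {"?x": "_:a", "?name": "Alice", "?mbox": "mailto:alice@org"},
    {"?x": "_:z", "?name": "Alice", "?mbox": "mailto:smith@work"},
]

all_names = select(solutions, ["?name"])
distinct_names = select(solutions, ["?name"], distinct=True)
first_only = select(solutions, ["?name", "?nick"], limit=1)
```

Without DISTINCT the two solutions project to two identical rows; with it, a single row remains.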

10.2 Constructing an Output Graph

The CONSTRUCT result form returns a single RDF graph specified by either a graph template or by "*". If a graph template is supplied, then the RDF graph is formed by taking each query solution and substituting the variables into the graph template and merging the triples into a single RDF graph.


The form CONSTRUCT * returns an RDF graph that is equivalent to ?In what sense of equivalence? the subgraph of the data graph that has all the triples that matched the query. It will give all the same bindings if the query is executed on the returned graph. Is this the same as the minimal subgraph that would give the same answer to the query?

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:name   "Alice" .
_:a    foaf:mbox   <mailto:alice@org> .

_:b    foaf:name   "Bob" .
_:b    foaf:mbox   <mailto:bob@org> .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
CONSTRUCT * WHERE ( ?x foaf:name ?name )

Gives the result graph having just the triples with property foaf:name:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

[] foaf:name "Bob" .
[] foaf:name "Alice" .

what does [] mean here? Why isnt there a bnode in the answer graph?

CONSTRUCT with a template

The CONSTRUCT form with a template returns a single RDF graph formed by merging ?? the triples in the template replacing variables by their RDF terms in bindings from the query pattern matching. Explicit variable bindings are not returned.

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:name   "Alice" .
_:a    foaf:mbox   <mailto:alice@org> .

_:b    foaf:name   "Bob" .
_:b    foaf:mbox   <mailto:bob@org> .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>
CONSTRUCT   ( ?x vcard:FN ?name )
WHERE       ( ?x foaf:name ?name )

creates vcard properties corresponding to the FOAF information:

@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

[] vcard:FN "Alice" .
[] vcard:FN "Bob" .

If a triple template has a variable, and in a query solution, the variable is unset, then the substitution of this triple template is skipped but other triple templates ??What other templates? How can there be more than one? are still processed for the same solution and any triples from other solutions are included in the result graph.

Templates with bNodes

To be discussed ...

A template can create an RDF graph containing bNodes, Ambiguous. Can a template contain bnodes? Or can the bindings introduce bnodes substituted for variables? Or both? indicated by the syntax of a prefixed name with prefix _ and some label for the local name.  The labels are scoped to the template for each solution. ??Surely they are scoped to the solution, not the template. If two such prefixed names share the same label in the template, then there will be one bNode created for each query solution but there will be different bNodes Different, or just that if the same Bnode is used in different solutions, it does not mean the same thing is referred to? across triples generated by different query solutions.

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:givenname   "Alice" .
_:a    foaf:family_name "Hacker" .

_:b    foaf:firstname   "Bob" .
_:b    foaf:surname     "Hacker" .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT   ( ?x vcard:N _:n )
            ( _:n vcard:givenName  ?gname )
            ( _:n vcard:familyName ?fname )
WHERE  { ( ?x foaf:firstname ?gname ) UNION ( ?x foaf:givenname   ?gname ) }
       { ( ?x foaf:surname   ?fname ) UNION ( ?x foaf:family_name ?fname ) }

creates vcard properties corresponding to the FOAF information:

@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

_:v1 vcard:N         _:x .
_:x vcard:givenName  "Alice" .
_:x vcard:familyName "Hacker" .

_:v2 vcard:N         _:z .
_:z vcard:givenName  "Bob" .
_:z vcard:familyName "Hacker" .

The use of variable ?x in the template, which in this example will be bound to bNodes, causes an equivalent graph to be constructed with a different bNode as shown by the document-scoped label.
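Template instantiation can be sketched in Python as follows. This is an illustration under stated assumptions: a solution is a dictionary from variable name to value, template terms are strings, a template bNode label such as _:n is mapped to a fresh label once per solution, and a template triple with an unset variable is skipped. The generated label form _:gen1 and the function names are hypothetical. For simplicity the sketch keeps the labels of bNodes bound to variables as they are, and assumes they do not clash with the generated ones.

```python
import itertools


def instantiate(triple, solution, fresh, counter):
    """Substitute one template triple; return None if a variable is unset."""
    out = []
    for term in triple:
        if term.startswith("?"):
            if term not in solution:
                return None  # Unset variable: this template triple is skipped.
            out.append(solution[term])
        elif term.startswith("_:"):
            # Template bNode: one fresh label per solution, shared within it.
            if term not in fresh:
                fresh[term] = f"_:gen{next(counter)}"
            out.append(fresh[term])
        else:
            out.append(term)
    return tuple(out)


def construct(template, solutions):
    """Merge the instantiated template triples of all solutions into one graph."""
    counter = itertools.count(1)
    graph = set()
    for solution in solutions:
        fresh = {}  # template bNode label -> generated label for this solution
        for triple in template:
            instantiated = instantiate(triple, solution, fresh, counter)
            if instantiated is not None:
                graph.add(instantiated)
    return graph


template = [
    ("?x", "vcard:N", "_:n"),
    ("_:n", "vcard:givenName", "?gname"),
    ("_:n", "vcard:familyName", "?fname"),
]
solutions = [
    {"?x": "_:a", "?gname": "Alice", "?fname": "Hacker"},
    {"?x": "_:b", "?gname": "Bob", "?fname": "Hacker"},
]
result = construct(template, solutions)

# A solution without ?fname still yields the other two triples.
partial = construct(template, [{"?x": "_:c", "?gname": "Eve"}])
```

Each solution receives its own generated bNode for _:n, so the two name structures in result stay separate.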

10.3 Descriptions of Resources

The DESCRIBE form returns a single RDF graph containing data associated with the resources. That is very vague. Im not sure what it means. What 'resources' are being referred to? . The resource can be a query variable or it can be a URI.??What does that mean? What resource, and why would it be a variable? (Is this instead of the pattern, so there are no patterns in this kind of query? If so, all the definitions from the top need to be re-stated to include this case. Sigh.) The RDF returned is the choice of the deployment and may be dependent on the query processor implementation, data source and local configuration.  It should be the useful information the server has (within security matters outside of SPARQL) about a resource. It may include information about other resources: the RDF data for a book may also include details of the author.

A simple query such as

PREFIX ent:  <http://myorg.example/employees#> 
DESCRIBE ?x WHERE (?x ent:employeeId "1234")

might return a description of the employee and some other potentially useful details:

@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix vcard:  <http://www.w3.org/2001/vcard-rdf/3.0#> .
@prefix myOrg:  <http://myorg.example/employees#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .

_:a     myOrg:employeeId    "1234" ;
        foaf:mbox_sha1sum   "ABCD1234" ;
        vcard:N
         [ vcard:Family       "Smith" ;
           vcard:Given        "John"  ] .

foaf:mbox_sha1sum  rdf:type  owl:InverseFunctionalProperty .

which includes the bNode closure for the vcard vocabulary vcard:N. For a vocabulary such as FOAF, where the resources are typically bNodes,?? Really?? The resources are bnodes? That seems very unlikely. returning sufficient information to identify a node such as the InverseFunctionalProperty foaf:mbox_sha1sum as well as information such as name and other details recorded would be appropriate. I can't parse that into an English sentence, sorry. In the example, the match to the WHERE clause was returned but this is not required.

In the returned graph there is information about one of the properties that the query server has deemed to be relevant and helpful in further processing.

DESCRIBE ?x ?y WHERE (?x ns:marriedTo ?y)

When there are multiple resources found, the RDF data for each is merged into the result graph.

A URI can be provided directly:

DESCRIBE <http://example.org/>

Directly specified URIrefs and variables can be mixed in the same DESCRIBE request.

10.4 Asking "yes or no" questions

Applications can use the ASK form to test whether or not a query pattern has a solution. No information is returned about the possible query solutions, just whether the server can find one or not.

@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .

_:a  foaf:name       "Alice" .
_:a  foaf:homepage   <http://work.example.org/alice/> .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
ASK  (?x foaf:name  "Alice" )

Align results to XML results format

On the same data, the following returns no match because Alice's mbox is not as described.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
ASK  (?x foaf:name  "Alice" )
     (?x foaf:mbox  <mailto:alise@work.example> )

11 Testing Values

SPARQL defines a number of test operations on the RDF values in a query. These operations are taken from the XQuery and XPath Functions and Operators and apply to XML Schema built-in datatypes.

Evaluation rules:


11.1 Standard Operations on XML Datatypes

The SPARQL language provides subsets of the operations on plain literals, XSD Strings, XSD integers, XSD floats/doubles and XSD dateTime defined in XQuery and XPath Functions and Operators.

Operations "fn:" are callable by applications
fn: = http://www.w3.org/2004/07/xpath-functions

Operations "op:" underpin syntax. SPARQL defines op: for its usage of operations with
op: = http://www.w3.org/2001/sw/DataAccess/operations

Implementation levels

Possible level of implementation completeness: suggestions?

  1. xsd:integer, xsd:string
  2. xsd:integer, xsd:string, xsd:float, xsd:double
  3. xsd:integer, xsd:string, xsd:float, xsd:double, xsd:dateTime

Not sure about the right set for dates & time.  Need advice here.

11.1.0 General Operations

The following operators name the general comparison operations that are backed by specific operations depending on the arguments:

Operators Meaning SPARQL Syntax
sparql:eq Equality eq
sparql:ne Non-equality ne

11.1.1 Boolean Types

These are defined in XPath 2.0.

Operators Meaning SPARQL Syntax
  conjunction &&
  disjunction ||

In "Functions and Operators":

Function Meaning
fn:not Inverts the xs:boolean value of the argument.

11.1.2 Numeric Types

The following operations, taken from XQuery and XPath Functions and Operators, are provided in all implementations of SPARQL.

Variables which are unbound, bound to bNodes or URIrefs cause an evaluation error.

Operators on Numeric Values

Operators Meaning SPARQL Syntax
op:numeric-add Addition +
op:numeric-subtract Subtraction -
op:numeric-multiply Multiplication *
op:numeric-divide Division /
op:numeric-integer-divide Integer division  
op:numeric-mod Modulus %
op:numeric-unary-plus Unary plus +
op:numeric-unary-minus Unary minus (negation) -

Comparison of Numeric Values

Operator Meaning SPARQL Syntax
op:numeric-equal Equality comparison ==
op:numeric-less-than Less-than comparison <
op:numeric-greater-than Greater-than comparison >
  Less-than-or-equal (not-greater-than) <=
  Greater-than-or-equal (not-less-than) >=

Functions on Numeric Values

Function Meaning
fn:abs Returns the absolute value of the argument.
fn:ceiling Returns the smallest number with no fractional part that is greater than or equal to the argument.
fn:floor Returns the largest number with no fractional part that is less than or equal to the argument.
fn:round Rounds to the nearest number with no fractional part.
fn:round-half-to-even Takes a number and a precision and returns a number rounded to the given precision. If the fractional part is exactly half, the result is the number whose least significant digit is even.
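The rounding rule of fn:round-half-to-even can be illustrated with Python's decimal module. This is a sketch of the rounding behaviour only, not of the full function as specified in XQuery and XPath Functions and Operators. The Python function name round_half_to_even and its string-valued argument are assumptions of this example.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_to_even(value, precision=0):
    """Round a decimal string to `precision` fractional digits, ties to even."""
    quantum = Decimal(1).scaleb(-precision)  # e.g. precision 2 -> 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


two = round_half_to_even("2.5")        # tie: the even neighbour is 2
four = round_half_to_even("3.5")       # tie: the even neighbour is 4
low = round_half_to_even("0.125", 2)   # tie: 0.12 has an even last digit
high = round_half_to_even("0.135", 2)  # tie: 0.14 has an even last digit
```

Values are passed as strings so that the ties are exact decimal ties rather than binary floating-point approximations.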

11.1.3 String Types

Need to do more here to take account of structure in RDF literals.

Comparison and Collation on Strings

Function Meaning
fn:compare Compare two strings. Returns -1, 0, or 1, according to the rules of the collation used.

Implementations need only support codepoint collation.

Functions on Strings

Strings: XPath and XQuery Functions and Operators / Functions on Strings.

Function Meaning
fn:contains Indicates whether one xs:string contains another xs:string. A collation may be specified.
fn:starts-with Indicates whether the value of one xs:string begins with the collation units of another xs:string. A collation may be specified.
fn:ends-with Indicates whether the value of one xs:string ends with the collation units of another xs:string. A collation may be specified.
fn:substring-before Returns the collation units of one xs:string that precede in that xs:string the collation units of another xs:string. A collation may be specified.
fn:substring-after Returns the collation units of xs:string that follow in that xs:string the collation units of another xs:string. A collation may be specified.
Function Meaning
fn:string-length Returns the length of the argument.
fn:upper-case Returns the upper-cased value of the argument.
fn:lower-case Returns the lower-cased value of the argument.

String Functions that Use Pattern Matching

Regular expressions: Perl5 syntax as defined in "Functions and Operators".

Function Meaning
fn:matches Returns an xs:boolean value that indicates whether the value of the first argument is matched by the regular expression that is the value of the second argument.

Comparison of Strings

Functions and Operators / Equality and Comparison of Strings

Operators Meaning SPARQL Syntax
sparql:regex String pattern match =~
  String pattern not match !~

11.1.4 Dates and Times

See Functions and Operators on Durations, Dates and Times

See XML Schema Part 2: Datatypes. http://www.w3.org/TR/xmlschema-2/

dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth

11.1.5 AnyURI

Function Meaning
op:anyURI-equal Returns true if the two arguments are equal.

11.2 Value Testing / RDF Types

RDF Literals can be typed or have an optional language tag. In addition to the operations above, SPARQL provides operations to access information in RDF Literals:

11.2.1 SPARQL Operations on RDF Literals

Operator Meaning Return Type
lang(expr, lang) Tests whether the expression (normally a variable) is compatible with the supplied language. xsd:boolean
lang(expr) Return the language tag, if any; else return "-" (an illegal language tag). plain string
dtype(expr) Return the datatype URI URI

11.2.2 SPARQL specific operations

Operator Meaning Return Type
isBound(expr) Return true if its argument, which may be an expression, is defined.
NaNs and INFs count as defined.

11.3 Extending Value Testing

Section status: placeholder text - not integrated

Implementations may provide custom extended value testing operations, for example, for specialised datatypes. These are provided by functions in the query that return true or false for their arguments.

&qname(?var or constant, ?var or constant, ...)


SELECT ?x WHERE (?x ns:location ?loc) AND &func:test(?loc, 20) 

A function can test some condition of bound and unbound variables or constants. The function is called for each possible query result (or the equivalent effect is achieved if the call is optimized in some way). A function is named by a URIRef written in QName form, and returns a boolean value: "true" means accept this result; "false" means reject it.

If a query processor encounters a function that it does not provide, the query is not executed and an error is returned.

Functions should have no side-effects. A SPARQL query processor may remove calls to functions if it can optimize them away, and may call a function more than once for the same solution.
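
The rules above (functions named by URI, one boolean result per candidate solution, an error before execution if the function is not provided) can be sketched as a small registry. Everything here is hypothetical: the registry, the `apply_filter` helper, the URI `http://example.org/func#test`, and the meaning given to the test function are invented for illustration and are not defined by this draft.

```python
from typing import Callable, Dict, List, Sequence

Solution = Dict[str, object]      # variable name -> bound value
ExtFunc = Callable[..., bool]

REGISTRY: Dict[str, ExtFunc] = {}


class UnknownFunctionError(Exception):
    """Raised when a query names a function the processor does not provide."""


def register(uri: str) -> Callable[[ExtFunc], ExtFunc]:
    """Decorator that makes a function available under the given URI."""
    def wrap(func: ExtFunc) -> ExtFunc:
        REGISTRY[uri] = func
        return func
    return wrap


@register("http://example.org/func#test")
def location_test(loc, limit) -> bool:
    # ASSUMPTION: an invented meaning; a side-effect-free test on its arguments.
    return loc is not None and loc < limit


def apply_filter(solutions: Sequence[Solution], func_uri: str,
                 args: Sequence[object]) -> List[Solution]:
    # Look the function up first, so an unknown function stops the query
    # before any solution is examined.
    func = REGISTRY.get(func_uri)
    if func is None:
        raise UnknownFunctionError(func_uri)
    kept = []
    for solution in solutions:
        # "?name" arguments are variables; unbound variables are passed as None.
        values = [solution.get(a[1:]) if isinstance(a, str) and a.startswith("?")
                  else a for a in args]
        if func(*values):          # True means accept, False means reject
            kept.append(solution)
    return kept
```

Because the test function has no side effects, a processor built this way could cache or skip calls without changing the result.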

A. SPARQL Grammar

Section status: drafted – terminal syntax not checked against that of the XML 1.1 spec

This grammar defines the allowable constructs in a SPARQL query. The EBNF format is the same as that used in the XML 1.1 [14] specification. Please see the "Notation" section of that specification for specific information about the notation. Why not use the ISO format, since ebnf is an ISO standard? (There are many folk who will be offended if you don't.)

References to lexical tokens are enclosed in <>. Whitespace is skipped. Then it's not EBNF. Check the ISO documentation.

A comment in a SPARQL query starts with '#' outside a URI or string and continues to the end of the line, or to the end of the file if no end of line follows the comment marker.
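
The comment rule can be illustrated with a short Python sketch that removes comments while leaving '#' characters inside URIs and strings untouched. The regular expressions only approximate the <URI>, <STRING_LITERAL1> and <STRING_LITERAL2> terminals of the grammar (in particular, the first character of a URI is simplified to "any letter or underscore"), and the function name `strip_comments` is hypothetical.

```python
import re

# Approximations of the grammar terminals; ASSUMPTION: simplified character sets.
_URI = r"<[^\W\d][^> ]*>"                 # "<", a name-start char, no spaces, ">"
_STR1 = r"'(?:[^'\\\n\r]|\\[^\n\r])*'"    # single-quoted string
_STR2 = r'"(?:[^"\\\n\r]|\\[^\n\r])*"'    # double-quoted string
_COMMENT = r"#[^\n\r]*"                   # '#' up to end of line or end of input

_TOKEN = re.compile("|".join(
    f"(?P<{name}>{pattern})"
    for name, pattern in [("uri", _URI), ("s1", _STR1),
                          ("s2", _STR2), ("comment", _COMMENT)]
))


def strip_comments(query: str) -> str:
    """Drop comments; URIs and strings are matched first and copied verbatim."""
    return _TOKEN.sub(
        lambda m: "" if m.lastgroup == "comment" else m.group(0),
        query,
    )
```

Matching URIs and strings before comments is what keeps a '#' in a namespace URI such as `<http://example.org/ns#p>` from being read as a comment marker. A bare '<' followed by a space is not treated as the start of a URI, so a comparison like `?x < 5 # note` still loses its comment.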

Notes: The term "literal" refers to a constant value, and not only an RDF Literal.

The grammar starts with the Query production.


[1]    Query    ::=    PrefixDecl* ReportFormat PrefixDecl* FromClause? WhereClause?
[2]    ReportFormat    ::=    'select' 'distinct'? <VAR> ( CommaOpt <VAR> )*
| 'select' 'distinct'? '*'
| 'construct' ConstructPattern
| 'construct' '*'
| 'describe' VarOrURI ( CommaOpt VarOrURI )*
| 'describe' '*'
| 'ask'
[3]    FromClause    ::=    'from' FromSelector ( CommaOpt FromSelector )*
[4]    FromSelector    ::=    URI
[5]    WhereClause    ::=    'where' GraphPattern
[6]    SourceGraphPattern    ::=    'source' '*' PatternElement
| 'source' VarOrURI PatternElement
[7]    OptionalGraphPattern    ::=    'optional' PatternElement
| '[' GraphPattern ']'
[8]    GraphPattern    ::=    GraphAndPattern ('union' GraphAndPattern)*
[9]    GraphAndPattern    ::=    PatternElement+
[10]    PatternElement    ::=    TriplePattern
| GroupGraphPattern
| SourceGraphPattern
| OptionalGraphPattern
| 'and'? Expression
[11]    GroupGraphPattern    ::=    '{' GraphPattern '}'
[12]    TriplePattern    ::=    '(' VarOrURI CommaOpt VarOrURI CommaOpt VarOrLiteral ')'
[13]    ConstructPattern    ::=    ConstructElement+
[14]    ConstructElement    ::=    TriplePattern
| '{' ConstructPattern '}'
[15]    VarOrURI    ::=    <VAR>
| URI
[16]    VarOrLiteral    ::=    <VAR>
| Literal
[17]    PrefixDecl    ::=    'prefix' <NCNAME> ':' QuotedURI
| 'prefix' ':' QuotedURI
[18]    Expression    ::=    ConditionalOrExpression
[19]    ConditionalOrExpression    ::=    ConditionalAndExpression ( '||' ConditionalAndExpression )*
[20]    ConditionalAndExpression    ::=    ValueLogical ( '&&' ValueLogical )*
[21]    ValueLogical    ::=    StringEqualityExpression
[22]    StringEqualityExpression    ::=    EqualityExpression StringComparitor*
[23]    StringComparitor    ::=    'eq' EqualityExpression
| 'ne' EqualityExpression
[24]    EqualityExpression    ::=    RelationalExpression RelationalComparitor?
[25]    RelationalComparitor    ::=    '==' RelationalExpression
| '!=' RelationalExpression
[26]    RelationalExpression    ::=    AdditiveExpression NumericComparitor?
[27]    NumericComparitor    ::=    '<' AdditiveExpression
| '>' AdditiveExpression
| '<=' AdditiveExpression
| '>=' AdditiveExpression
[28]    AdditiveExpression    ::=    MultiplicativeExpression AdditiveOperation*
[29]    AdditiveOperation    ::=    '+' MultiplicativeExpression
| '-' MultiplicativeExpression
[30]    MultiplicativeExpression    ::=    UnaryExpression MultiplicativeOperation*
[31]    MultiplicativeOperation    ::=    '*' UnaryExpression
| '/' UnaryExpression
| '%' UnaryExpression
[32]    UnaryExpression    ::=    UnaryExpressionNotPlusMinus
[33]    UnaryExpressionNotPlusMinus    ::=    ( '~' | '!' ) UnaryExpression
| PrimaryExpression
[34]    PrimaryExpression    ::=    <VAR>
| Literal
| FunctionCall
| '(' Expression ')'
[35]    FunctionCall    ::=    '&' <QNAME> '(' ArgList? ')'
[36]    ArgList    ::=    VarOrLiteral ( ',' VarOrLiteral )*
[37]    Literal    ::=    URI
| NumericLiteral
| TextLiteral
[38]    NumericLiteral    ::=    <INTEGER_LITERAL>
[39]    TextLiteral    ::=    String <LANG>? ( '^^' URI )?
[40]    String    ::=    <STRING_LITERAL1>
[41]    URI    ::=    QuotedURI
| QName
[42]    QName    ::=    <QNAME>
[43]    QuotedURI    ::=    <URI>
[44]    CommaOpt    ::=    ','?
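
The layering of productions [19] to [34] fixes operator precedence: '||' binds loosest, then '&&', then '==' and '!=', then the numeric comparisons, then '+' and '-', then '*', '/' and '%', then unary '!'. The following Python sketch is a minimal recursive-descent evaluator for a subset of these productions. It is an illustration under stated assumptions, not a reference implementation: it handles only integer literals, variables and parentheses, omits 'eq'/'ne', '~' and function calls, and uses Python's own arithmetic (including true division for '/'). All names are hypothetical.

```python
import re
from typing import Dict, List

_TOKEN = re.compile(r"\s*(\|\||&&|==|!=|<=|>=|[-+*/%<>()!]|\d+|\?\w+)")


def tokenize(text: str) -> List[str]:
    tokens, pos, text = [], 0, text.strip()
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Evaluator:
    def __init__(self, tokens: List[str], bindings: Dict[str, object]):
        self.toks, self.i, self.env = tokens, 0, bindings

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self) -> str:
        tok = self.toks[self.i]
        self.i += 1
        return tok

    def or_expr(self):                      # [19] ConditionalOrExpression
        value = self.and_expr()
        while self.peek() == "||":
            self.take()
            rhs = self.and_expr()
            value = bool(value) or bool(rhs)
        return value

    def and_expr(self):                     # [20] ConditionalAndExpression
        value = self.equality()
        while self.peek() == "&&":
            self.take()
            rhs = self.equality()
            value = bool(value) and bool(rhs)
        return value

    def equality(self):                     # [24]-[25], at most one comparator
        value = self.relational()
        if self.peek() in ("==", "!="):
            op, rhs = self.take(), self.relational()
            value = (value == rhs) if op == "==" else (value != rhs)
        return value

    def relational(self):                   # [26]-[27], at most one comparator
        value = self.additive()
        if self.peek() in ("<", ">", "<=", ">="):
            op, rhs = self.take(), self.additive()
            value = {"<": value < rhs, ">": value > rhs,
                     "<=": value <= rhs, ">=": value >= rhs}[op]
        return value

    def additive(self):                     # [28]-[29], left-associative
        value = self.multiplicative()
        while self.peek() in ("+", "-"):
            op, rhs = self.take(), self.multiplicative()
            value = value + rhs if op == "+" else value - rhs
        return value

    def multiplicative(self):               # [30]-[31], left-associative
        value = self.unary()
        while self.peek() in ("*", "/", "%"):
            op, rhs = self.take(), self.unary()
            value = value * rhs if op == "*" else (
                value / rhs if op == "/" else value % rhs)
        return value

    def unary(self):                        # [33], '!' only in this sketch
        if self.peek() == "!":
            self.take()
            return not self.unary()
        return self.primary()

    def primary(self):                      # [34], without FunctionCall
        tok = self.take()
        if tok == "(":
            value = self.or_expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.startswith("?"):
            return self.env[tok[1:]]        # KeyError for an unbound variable
        return int(tok)


def evaluate(text: str, bindings: Dict[str, object]):
    ev = Evaluator(tokenize(text), bindings)
    value = ev.or_expr()
    if ev.peek() is not None:
        raise SyntaxError(f"unexpected token {ev.peek()!r}")
    return value
```

Each method corresponds to one grammar level and calls the next tighter level for its operands, which is how the production order turns into precedence.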


These terminals are further factored for readability.

[45]    <URI>    ::=    "<" <NCCHAR1> (~[">"," "])* ">"
[46]    <QNAME>    ::=    (<NCNAME>)? ":" <NCNAME>
[47]    <VAR>    ::=    "?" <NCNAME>
[48]    <LANG>    ::=    '@' <A2Z><A2Z> ("-" <A2Z><A2Z>)?
[49]    <A2Z>    ::=    ["a"-"z","A"-"Z"]
[50]    <INTEGER_LITERAL>    ::=    (["+","-"])? <DECIMAL_LITERAL> (["l","L"])?
| <HEX_LITERAL> (["l","L"])?
[51]    <DECIMAL_LITERAL>    ::=    <DIGITS>
[52]    <HEX_LITERAL>    ::=    "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+
[53]    <FLOATING_POINT_LITERAL>    ::=    (["+","-"])? (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)?
| "." (["0"-"9"])+ (<EXPONENT>)?
| (["0"-"9"])+ <EXPONENT>
[54]    <EXPONENT>    ::=    ["e","E"] (["+","-"])? (["0"-"9"])+
[55]    <STRING_LITERAL1>    ::=    "'" ( (~["'","\\","\n","\r"]) | ("\\" ~["\n","\r"]) )* "'"
[56]    <STRING_LITERAL2>    ::=    "\"" ( (~["\"","\\","\n","\r"]) | ("\\" ~["\n","\r"]) )* "\""
[57]    <DIGITS>    ::=    (["0"-"9"])+
[58]    <PATTERN_LITERAL>    ::=    [m]/pattern/[i][m][s][x]
[59]    <NCCHAR1>    ::=    ["A"-"Z"]
| "_" | ["a"-"z"]
| ["\u00C0"-"\u02FF"]
| ["\u0370"-"\u037D"]
| ["\u037F"-"\u1FFF"]
| ["\u200C"-"\u200D"]
| ["\u2070"-"\u218F"]
| ["\u2C00"-"\u2FEF"]
| ["\u3001"-"\uD7FF"]
| ["\uF900"-"\uFFFF"]
[60]    <NCNAME>    ::=    <NCCHAR1> (<NCCHAR1> | "." | "-" | ["0"-"9"] | "\u00B7" )*
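
To make the name-related terminals concrete, the following Python sketch transcribes <NCCHAR1>, <NCNAME>, <VAR>, <QNAME> and <LANG> into regular expressions. It is a direct reading of productions [46] to [49], [59] and [60] as written above, offered as a sketch; the `TERMINALS` table and `matches_terminal` helper are hypothetical names.

```python
import re

# Character ranges of <NCCHAR1> (production [59]), as regex class content.
NCCHAR1 = (
    "A-Z_a-z"
    "\u00C0-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFFFF"
)

# <NCNAME> (production [60]): a name-start character, then name characters.
NCNAME = f"[{NCCHAR1}][{NCCHAR1}.\\-0-9\u00B7]*"

TERMINALS = {
    "VAR": re.compile("\\?" + NCNAME),                      # [47]
    "QNAME": re.compile(f"(?:{NCNAME})?:{NCNAME}"),         # [46]
    "LANG": re.compile("@[A-Za-z]{2}(?:-[A-Za-z]{2})?"),    # [48], [49]
}


def matches_terminal(name: str, text: str) -> bool:
    """Return True if the whole of `text` is an instance of the terminal."""
    return TERMINALS[name].fullmatch(text) is not None
```

Under this reading, `?loc` is a variable, `ns:location` and `:location` are QNames, and `@en-GB` is a language tag, while `?1x`, `ns:` and `@eng` are rejected.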

B. References

Section status: misc


[1] "Three Implementations of SquishQL, a Simple RDF Query Language", Libby Miller, Andy Seaborne, Alberto Reggiori; ISWC2002

[2] "RDF Query and Rules: A Framework and Survey", Eric Prud'hommeaux

[3] "RDF Query and Rule languages Use Cases and Example", Alberto Reggiori, Andy Seaborne

[4] RDQL Tutorial for Jena (in the Jena tutorial).

[5] RDQL BNF from Jena

[6] Enabling Inference, R.V. Guha, Ora Lassila, Eric Miller, Dan Brickley

[7] N-Triples

[8] RDF http://www.w3.org/RDF/

[9] "Representing vCard Objects in RDF/XML", Renato Iannella, W3C Note.

[10] "RDF Data Access Working Group"

[11] "RDF Data Access Use Cases and Requirements - W3C Working Draft 2 June 2004", Kendall Grant Clark.

[12] "Resource Description Framework (RDF): Concepts and Abstract Syntax", Graham Klyne, Jeremy J. Carroll, W3C Recommendation.

[13] "RFC 2396", T. Berners-Lee, R. Fielding, L. Masinter, Internet Draft.

[14] "Namespaces in XML 1.1", Tim Bray et al., W3C Recommendation.

[15] "Turtle - Terse RDF Triple Language", Dave Beckett.


CVS Change Log:

$Log: Overview.html,v $
Revision 1.140  2004/11/28 08:28:53  eric
updated to reflect andy's current grammar

Revision 1.139  2004/11/22 16:52:02  aseaborne
+ "subgraph" => simple entail in def of Graph Pattern Matching
+ Noted generalization for TriplePatterns to be uniform.
+ sec 11: added op:anyURI-equals to list of tests
+ removed "casting" from eq/ne.

Revision 1.138  2004/11/22 11:36:06  aseaborne
*** empty log message ***

Revision 1.137  2004/11/19 17:44:46  aseaborne
Section 8 drafted, based on DaveB email

Revision 1.136  2004/11/19 09:29:26  eric
integrated some comments from http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0241.html

Revision 1.135  2004/11/18 18:33:29  eric
use new introductory example

Revision 1.134  2004/11/17 16:17:15  aseaborne
Tidy up of some markup

Revision 1.133  2004/11/17 16:09:26  aseaborne
*** empty log message ***

Revision 1.132  2004/11/15 15:44:16  aseaborne
+ Move 4.3 (Nested optionals) to 5.1
+ Section 8 : added first part of text from DaveB
  (awaiting clarification before doing rest).

Revision 1.131  2004/11/12 15:24:29  aseaborne
Replace text mistakenly deleted in first Graph Pattern definition

Revision 1.130  2004/11/12 14:50:43  aseaborne
Hopefully, I have:
+ Removed all commas from query examples.
+ changed "Set of bindings" => "Substitution"

Revision 1.129  2004/11/10 18:09:55  aseaborne
+ Added a little text to 2.5 that discusses bNodes.
+ Added definition of projection to SELECT result form.

Revision 1.128  2004/11/08 15:27:14  aseaborne
+ Section 3 renamed and split into subsections on
  literal matching and testing values.

Revision 1.127  2004/11/05 15:32:42  aseaborne
+ Added text about UNION in the section on alternatives.
  (not complete but outlines a design)

Revision 1.126  2004/11/04 15:20:29  aseaborne
+ ASK examples done: one that answers "yes" and one that answers "no"
+ triple pattern grouping is now {braces} (except for [] optionals)
+ Results from SELECT are sets (but implementations may include duplicates
  unless DISTINCT).
+ Text for error conditions in CONSTRUCT

Revision 1.125  2004/10/25 13:02:12  aseaborne
foaf:box => foaf:mbox in 2.4

Revision 1.124  2004/10/25 12:44:14  aseaborne
Types and trying to get ,spell to not give seemingly false negatives.

Revision 1.123  2004/10/25 12:40:38  aseaborne
Editorial changes based on comments in:

Revision 1.122  2004/10/25 11:18:59  aseaborne

Revision 1.121  2004/10/25 11:14:53  aseaborne
Changes based on comments in:

+ Trimmed log back to working draft publication point.
+ Added 4.3 which talks about nested optional blocks.
+ Removed discussion in sec 12 on constraints and graph pattern matching.
+ section 2.5 to discuss bNodes in queries (can't have them) and
  in results (doc-scoped ids only).
+ section 3 says that errors in constraints lead to solution rejection
  Includes bNodes for numeric comparisons.

Revision 1.120  2004/10/19 20:32:32  connolly
replace fictitious status with true status
add tbody markup (required by nxml-mode's rng schema, at least)

Revision 1.119  2004/10/15 16:00:42  aseaborne
Added notes on various ways to access RDF literals

Revision 1.118  2004/10/15 14:31:23  eric
- added latest published version link
- XML validated

Revision 1.117  2004/10/13 11:32:06  aseaborne
+ replaced section 12 text with new, preliminary material.
+ Commented out publishing header - put in "editors draft" text

Revision 1.116  2004/10/12 18:52:52  eric
- reflect implementation experience in the SOTD

Revision 1.115  2004/10/12 09:46:58  eric
- CSS validated
- removed links to old text

Revision 1.114  2004/10/12 09:39:36  eric
validating pubrules compliance and CSS

Revision 1.113  2004/10/12 09:28:11  eric
: commit to check pubrules
- switch to publication headers (this version, ...)