This document defines Notation 3 (also known as N3), an assertion and logic language which is a superset of RDF.
Status of this document
This document is one of three documents produced by the W3C N3 Community Group:
- Notation3 Language (this document)
- Notation3 Builtin Functions
- Notation3 Semantics
This specification was published by the Notation 3 (N3) Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.
GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-n3-dev@w3.org (subscribe, archives).
Introduction
The Semantic Web represents a vision of online and interconnected knowledge. The core building block is a logical formalism called the Resource Description Framework (RDF) ([[[RDF11-PRIMER]]]). RDF includes logical conjunctions of statements, each describing properties of resources, the properties of the objects of those properties, and so on, ultimately leading to a Knowledge Graph. It builds on the fundamental identification mechanism of the Web, i.e., the Uniform Resource Identifier (URI), generalized as the Internationalized Resource Identifier (IRI), as a means to identify any RDF resource, ranging from abstract concepts (the book "Moby Dick") to physical objects (a paper copy of the book "Moby Dick") to electronic objects (an e-book copy of "Moby Dick"). People have used RDF to build vast quantities of online, connected Knowledge Graphs. The Semantic Web has set the stage for decision making within an open environment of interconnected knowledge graphs, as opposed to a closed system of locally trusted facts.
Notation3 Logic, or N3 for short, aims to implement such decision-making abilities in an open web environment — by (a) extending the representational abilities of RDF and (b) allowing access to, operating on, and reasoning over online information. N3 attempts to walk the line between, on the one hand, ease-of-use by authors and simplicity of reasoning for developers; and, on the other hand, extended utility and practicality for building real-world applications.
The main characteristics of N3 are as follows:
- N3 is a superset of RDF and Turtle. Any valid [[[Turtle]]] graph will be valid in N3 as well, meaning that all of Turtle's syntactic sugar is available in N3 — including predicate and object lists and unlabeled blank nodes. Moreover, collections are first-class citizens in N3, with an associated set of builtins for accessing and manipulating them.
- N3 adds declarative programming. N3 allows making statements about the world, which include logical implications with variables — these can loosely be compared to If-Then statements. Subsequently, an N3 reasoner can draw inferences from these statements, allowing for problem solving, automating decision making, or simply enriching your Knowledge Graph.
- N3 supports quoting and describing graphs of statements. A graph term includes a conjunction of quoted statements. It allows expressing where particular statements came from, at what time they were stated and by whom (i.e., provenance), and any other description in general.
- N3 includes builtins for accessing online knowledge. To dynamically retrieve online knowledge for decision making, the `log:semantics` and `log:conclusion` builtins allow pulling in, parsing, and reasoning over logical expressions from online or local sources.
- N3 supports a scoped version of negation-as-failure. It can be useful to check whether a specific online source, local graph term, or even the N3 document itself, at a given point in time, does or does not support a set of facts. This is referred to as scoped negation as failure, and is supported by N3's `log:collectAllIn`, `log:forAllIn`, and `log:notIncludes` builtins.
Language
The aim of this section is to provide an informal overview of the N3 language and its different features. Where possible, this section is based on the [[[Turtle]]] specification. More formal definitions will follow in the subsequent sections.
N3 document
An N3 document represents an N3 graph in a compact textual form. An N3 graph is a series of N3 statements. These are written as triples consisting of a subject, predicate, and object resource. An N3 resource can be an RDF IRI, literal or blank node; or an N3 graph term, collection, logical implication, variable or N3 builtin. We introduce these types of resources and statements below.
Comments are indicated using a separate '#' and continue until the end of the line.
Simple triples
The simplest N3 statement or triple is a sequence of a subject, predicate, and object resource, separated by whitespace and terminated by '`.`' after each triple.
In the example below, three N3 triples highlight the enmity between Spiderman and the Green Goblin and list their human-readable names:
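A minimal sketch of what such triples could look like. The property IRIs `enemyOf` and `name` under `https://example.org/` are assumptions made for illustration only:

```n3
# Hypothetical IRIs under the example.org namespace.
<https://example.org/Spiderman> <https://example.org/enemyOf> <https://example.org/GreenGoblin> .
<https://example.org/Spiderman> <https://example.org/name> "Spiderman" .
<https://example.org/GreenGoblin> <https://example.org/name> "Green Goblin" .
```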
For now, we will assume that a resource is represented either by an IRI or a literal. In general, one uses an IRI to identify an identifiable entity such as a person, place, or thing; a literal is used for a textual or numerical (i.e., datatype) value, such as name, date, height, and so on.
There is no inherent order for N3 or RDF triples. The same is true for the relational database model, i.e., relational tuples or rows do not have an inherent order. Hence, it is a mistake to associate meaning with the order of statements in an N3 document (e.g., assuming that a first listed telephone for Spiderman is their landline, and the second one their mobile number).
Predicate and object lists
As shown in the above example, the same subject (here, Spiderman) will often be described by several N3 statements. To make these N3 statements less cumbersome to write, one can put a semicolon ("`;`") at the end of an N3 statement to describe the same subject in the subsequent statement:
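As a sketch, two statements about the same subject could be combined as follows (the `enemyOf` and `name` property IRIs are hypothetical):

```n3
# One subject, two predicate-object pairs separated by ';'.
<https://example.org/Spiderman> <https://example.org/enemyOf> <https://example.org/GreenGoblin> ;
    <https://example.org/name> "Spiderman" .
```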
Similarly, a predicate (e.g., name) can often list multiple objects for the same subject (e.g., Spiderman). This can be written by listing the object values separated by a '`,`':
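For illustration, an object list might look like this (the `name` property IRI and the second name are assumptions):

```n3
# One subject and predicate, two objects separated by ','.
<https://example.org/Spiderman> <https://example.org/name> "Spiderman", "Spidey" .
```

This is shorthand for two separate triples that share the same subject and predicate.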
IRIs
An IRI is used to represent an identifiable entity — such as a person, place, or thing.
Until now, we have been writing absolute IRIs [[RFC3987]] (e.g., `https://example.org/Spiderman`), which include both the namespace (e.g., `https://example.org/`) and the local name (e.g., `Spiderman`). It is often much easier to write an IRI as a prefixed name, e.g., `ex:Spiderman`, which includes a prefix label (e.g., `ex`) as a shorthand for the namespace, and the local name (e.g., `Spiderman`), separated by a colon ("`:`").
The `@prefix` directives associate the prefix label
with a namespace IRI. A prefixed name is turned into an absolute IRI
by concatenating the namespace IRI with the local name.
The following example is equivalent to the original example:
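A sketch of such a document, assuming the `ex` prefix label from the text and hypothetical `enemyOf` and `name` properties:

```n3
# ex:Spiderman expands to <https://example.org/Spiderman>.
@prefix ex: <https://example.org/> .

ex:Spiderman ex:enemyOf ex:GreenGoblin ;
    ex:name "Spiderman" .
ex:GreenGoblin ex:name "Green Goblin" .
```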
N3 also supports case-insensitive `PREFIX` and `BASE` directives, as does Turtle, to align the syntax with SPARQL (see grammar). These do not have a trailing '.'.
To further simplify prefixed names, one can leave the prefix label empty (e.g., for an often-used namespace):
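For instance, under the assumption that `https://example.org/` is the often-used namespace (property names are hypothetical):

```n3
# The empty prefix label: ':Spiderman' expands to <https://example.org/Spiderman>.
@prefix : <https://example.org/> .

:Spiderman :enemyOf :GreenGoblin ;
    :name "Spiderman" .
```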
One can also write relative IRI references, e.g., `<#Spiderman>`. A relative IRI reference is resolved, i.e., turned into an absolute IRI, against the base IRI; in simple cases such as this one, this amounts to concatenating the base IRI with the relative reference (e.g., `#Spiderman`). A base IRI is defined using the `@base` directive. For instance, the following is equivalent to the prior example:
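A sketch using an assumed base IRI of `https://example.org/` and hypothetical property names:

```n3
# <Spiderman> resolves to <https://example.org/Spiderman> against the base IRI.
@base <https://example.org/> .

<Spiderman> <enemyOf> <GreenGoblin> ;
    <name> "Spiderman" .
```

Note that with this base, `<#Spiderman>` would instead resolve to `https://example.org/#Spiderman`, which is a different IRI.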
Specifics of relative IRI reference resolution are described in .
We recommend listing `@prefix` / `PREFIX` and `@base` / `BASE` declarations at the top of an N3 document. This is not mandatory, however, and they can technically be put anywhere before the prefixed name or relative IRI that relies on the declaration. Subsequent `@prefix` / `PREFIX` directives may "re-map" the same prefix label to another namespace IRI.
Prefixed names are a superset of XML QNames. They differ in that the local name may include:
- leading digits, e.g. `leg:3032571` or `isbn13:9780136019701`
- non leading colons, e.g. `og:video:height`
- reserved character escape sequences, e.g. `wgs:lat\-long`
IRI property lists
In many cases, an IRI occurs only once as an object, and is then further described as a subject in other statements (e.g., see a prior example). The example below illustrates how this may result in difficult-to-read code: to find descriptions of IRI objects such as `tobey-maguire` or `willem-dafoe`, one must scan the full N3 graph, as they are only described near the end.
Using the IRI property list syntax, descriptions of these object IRIs can be directly "embedded" within the object position:
Using basic statements: | Using IRI property lists: |
---|---|
|
|
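The two forms could be sketched as follows. The `portrayedBy` and `name` properties and the namespace are assumptions; only the `id` keyword and the object IRIs `tobey-maguire` and `willem-dafoe` come from the text:

```n3
@prefix : <https://example.org/> .

# Using basic statements: the actors are only described at the end.
:spiderman :portrayedBy :tobey-maguire .
:green-goblin :portrayedBy :willem-dafoe .
:tobey-maguire :name "Tobey Maguire" .
:willem-dafoe :name "Willem Dafoe" .

# Using IRI property lists: the same triples, with the descriptions embedded.
:spiderman :portrayedBy [ id :tobey-maguire :name "Tobey Maguire" ] .
:green-goblin :portrayedBy [ id :willem-dafoe :name "Willem Dafoe" ] .
```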
IRI property lists build on the predicate object list syntax to group statements about IRIs. One can use object lists in this syntax as well.
See also the blank node property list syntax.
See embedding in [[JSON-LD11]] for a discussion of a similar feature in another format.
Literals
Literals are used to represent a textual or numerical (i.e., datatype) value, such as name, date, height, and so on. Numbers (integers, decimals, and doubles) are simply represented using their numerical value, and booleans are represented using `true` or `false`:
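A short sketch with hypothetical property names and values:

```n3
@prefix : <https://example.org/> .

:Spiderman :age 28 ;           # integer
    :heightInMeters 1.78 ;     # decimal
    :massInKg 7.6e1 ;          # double (note the exponent)
    :isFictional true .        # boolean
```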
For details on numerical syntaxes (e.g., decimals, doubles), we refer to the RDF 1.1: Turtle (Section 2.5.2) specification.
Other literals, such as strings, but also dates, binary, octal or hex code, XML or JSON code, or other types of numbers (e.g., shorts), need to be written as datatyped literals. These are represented as a string, followed by the `^^` symbol and the corresponding datatype IRI. (`xsd:string` is the datatype IRI for strings; this is the default and can be omitted, in which case the `^^` MUST also be omitted; in other words, `"example"^^` is not a valid datatyped literal). For instance:
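As an illustration, with an assumed `:event` resource and hypothetical properties:

```n3
@prefix : <https://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:event :date "2001-08-10"^^xsd:date ;      # a date
    :attendees "120"^^xsd:short ;          # a number type without shorthand syntax
    :label "example"^^xsd:string ;         # explicit string datatype
    :note "example" .                      # same datatype, with '^^xsd:string' omitted
```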
The literal's lexical form will include the characters between the delimiter quotes (e.g., `2001-08-10`).
The language of the string literal can be indicated using the `@` symbol and the corresponding language tag (as defined in [[BCP47]]; find the registry in [[LNG-TAG]]). For instance:
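A sketch with a hypothetical `name` property, using the language tags `en` and `ru`:

```n3
@prefix : <https://example.org/> .

:Spiderman :name "Spiderman"@en , "Человек-паук"@ru .
```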
The reading direction can be described together with the language [[BCP47]], using the `i18n` namespace:
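As a sketch, assuming the `i18n` namespace IRI `https://www.w3.org/ns/i18n#` and a datatype local name of the form language tag, underscore, direction (the `:greeting` resource and `text` property are hypothetical):

```n3
@prefix : <https://example.org/> .
@prefix i18n: <https://www.w3.org/ns/i18n#> .

# Egyptian Arabic, read right-to-left.
:greeting :text "مرحبا"^^i18n:ar-eg_rtl .
```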
If no datatype IRI or language tag is given, the datatype `xsd:string` will be assumed. In case a language tag is given or an `i18n` "datatype" is used, the datatype `rdf:langString` is inferred. It is not possible to specify both a datatype IRI and a language tag.
There are also other ways to describe the language tag and base direction of RDF literals, see Compound literals. Please note that these various ways are still in flux, see the "text direction" discussions in the RDF-star working group.
Integers, decimals, doubles, and booleans may also be written as string literals with the appropriate datatype IRI (e.g., `"8"^^xsd:integer` or `"true"^^xsd:boolean`).
In case the string literal itself contains the string delimiter (e.g., double quotes), or includes newlines, other string delimiters can be used, i.e., single quotes or the compound delimiters `"""` or `'''`. Alternatively, one can also use a '`\`' for escaping the delimiter each time it occurs within a string literal. For instance:
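The options could be sketched as follows (the resources and the `text` property are hypothetical):

```n3
@prefix : <https://example.org/> .

:quote1 :text 'He said "hello"' .       # single quotes as delimiter
:quote2 :text "He said \"hello\"" .     # escaped double quotes
:quote3 :text """A string that
spans "two" lines""" .                  # compound delimiter allows quotes and newlines
```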
The escape symbol '`\`' (U+005C) may only appear in a string literal as part of an escape sequence. Other restrictions within string literals depend on the delimiter:
- Literals delimited by `'` may not contain the characters `'`, `LF` (U+000A), or `CR` (U+000D).
- Literals delimited by `"` may not contain the characters `"`, `LF`, or `CR`.
- Literals delimited by `'''` may not contain the sequence of characters `'''`.
- Literals delimited by `"""` may not contain the sequence of characters `"""`.
Blank Nodes
When describing resources in RDF, you can run into the following situations:
- It is not worth minting a new IRI for the resource, as it is unlikely that other N3 graphs will refer to it.
- There likely already exists an IRI for the resource, and we don't want to do an extensive search to find out what it is.
Instead, you can use blank nodes to talk about resources. They are existential variables; that is, they state the existence of a thing without identifying it. Blank nodes can be represented in several ways, as described below. For details, please refer to ([[[RDF11-CONCEPTS]]]).
Blank node identifiers
A blank node can be represented by a blank node identifier, which is unique within the N3 graph, expressed as `_:someLabel`. Then, we use this identifier within our N3 graph to describe the corresponding resource, same as with an IRI.
For instance, this example shows that the Mona Lisa has an unidentified tree in its background. We don't want to concretely identify this tree, but we do want to describe it — such as the painting it is in, and the type of tree:
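A sketch of such a description. The blank node identifier `_:someTree` comes from the text, while the property and class names are assumptions:

```n3
@prefix : <https://example.org/> .

:monaLisa :hasInBackground _:someTree .
_:someTree a :CypressTree ;
    :depictedIn :monaLisa .
```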
In this example, we minted a blank node identifier `_:someTree` to represent the resource that we want to describe. Note that this identifier is only usable within our local N3 graph; if you want other N3 graphs to describe it, you should represent it using an IRI.
Blank node property lists
Using a blank node identifier requires introducing a new identifier for each blank node. In many cases, however, a blank node occurs only once as an object, and is then further described as a subject in other statements, such as in the prior example. In those cases, a more convenient syntax can be used.
The example below shows a more elaborate example: identifiers `_:a` and `_:t` are used as object in only one statement, and are then described as subject in other statements. Using the blank node property list syntax, these cases can be represented as follows:
Using blank node identifiers: | Using blank node property lists: |
---|---|
|
|
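The two forms could be sketched as follows, under the assumption that `_:a` stands for an address and `_:t` for a town (all property names are hypothetical):

```n3
@prefix : <https://example.org/> .

# Using blank node identifiers.
:alice :address _:a .
_:a :street "Main Street" ;
    :town _:t .
_:t :name "Springfield" .

# Using blank node property lists: the same structure without identifiers.
:bob :address [
    :street "Main Street" ;
    :town [ :name "Springfield" ]
] .
```

The second form introduces fresh blank nodes, so Alice's and Bob's addresses here are two distinct, similarly described resources.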
See also the IRI property list syntax. However, note that you cannot use that syntax (`id` keyword) with blank node identifiers.
The "blank node property list" syntax is especially useful in cases where a blank node occurs only once as an object. When the blank node occurs several times as an object, such as when describing multiple people with the same address, it is better to use blank node identifiers. For instance:
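A sketch with hypothetical names, where two people share one address:

```n3
@prefix : <https://example.org/> .

# The shared address needs an identifier, since it occurs twice as object.
:alice :address _:a .
:bob :address _:a .
_:a :street "Main Street" ;
    :town [ :name "Springfield" ] .
```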
We can still use the blank node property list syntax for describing the town, as this blank node still only occurs once as object.
Blank node use cases
Below, we summarize typical use cases where blank nodes are used to describe resources.
Unknown resources: we might want to state that the Mona Lisa painting has in its background an unidentified tree, which we know to be a cypress tree. We could mint an IRI such as "mona-lisa-cypress-tree", but we feel that would be redundant — we simply want to describe the tree, such as the painting it is in, and the type of tree. We're not particularly interested in allowing other N3 graphs to refer to the tree. Moreover, there may already exist an IRI for that particular tree, and we don't want to mint another IRI to refer to the same tree (nor do we want to lookup this existing IRI).
Composite information: when describing composite pieces of information, such as street addresses, telephone numbers and dates, it is often unlikely that anyone outside this N3 graph would need to refer to this address or its pieces. Hence, it would be redundant to mint an IRI just for the purpose of structuring this information. Instead, one can use a blank node to connect the "composed" pieces of information, e.g., the street address, to its composite values, e.g., street name, number, and city, as shown in this example.
N-ary relations: blank nodes are a convenient way to represent n-ary relations in N3.
Compound literal: to represent language-tagged strings, see section Compound literals.
Collections
We often need to describe ordered collections of things, e.g., the authors of a book, the students in a course, or the software modules within a package. N3 supports collections as first-class citizens to represent ordered collections of resources enclosed by parentheses ("`( )`"). The contained resources are called members. For instance:
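A sketch that reuses the resource names appearing later in this section; the namespace is an assumption:

```n3
@prefix : <https://example.org/> .

# The object is a single collection resource with three ordered members.
:description_logic_handbook :author ( :deborah :daniele :peter ) .
```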
This states that the value of the author property is the collection resource. Any additional meaning is not given by the N3 semantics. For instance, this example does not imply that each member (e.g., `:deborah`) can be considered as a value of the author property, i.e., the author property does not "distribute" across the members. For that matter, it is also not implied that the first member put the most effort into the book. Any additional meaning, such as each member being an author of the book, and the ordering reflecting the authors' effort, would always be application-specific. In general, to implement such application-specific meanings, N3 rules could be used.
Alternatively, one could use an object list to explicitly state that each of these resources is an author of the book:
`:description_logic_handbook :author :deborah, :daniele, :peter .`
However, since RDF (and thus N3) statements do not have an inherent order, this example loses meaning as we no longer know who can be considered the "first" author.
To an extent, an N3 collection is equivalent to the more verbose RDF Collection vocabulary. We refer to the original N3 submission for the additional axioms needed for this equivalency.
In N3, collections may occur as subjects, predicates or objects in a statement. For instance:
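For illustration, with hypothetical names, a collection in both subject and object position:

```n3
@prefix : <https://example.org/> .

( :john :mary ) :parentsOf ( :pete :mike ) .
```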
Or even:
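A sketch with placeholder members, where subject, predicate, and object are all collections:

```n3
@prefix : <https://example.org/> .

( :a :b ) ( :c :d ) ( :e :f ) .
```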
As before: any meaning that goes beyond the fact that 3 collection resources are involved in this statement, would be application-specific.
Graph Terms
It is often useful to attach metadata to groups of triples — to give the provenance, context, or version of the information, our opinion on the matter, and so on. We can use graph terms to quote RDF graphs, and then describe the graph term using N3 statements. For instance:
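A sketch of a quoted graph with metadata. The quoted triple comes from the text; the metadata properties, the source IRI, the timestamp, and the `dc` namespace binding are assumptions:

```n3
@prefix : <https://example.org/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The graph term in braces is quoted, not asserted.
{ :cervantes dc:wrote :moby_dick }
    :opinion :lie ;
    :retrievedFrom <https://example.org/some-source> ;
    :retrievedAt "2020-07-12T09:13:00Z"^^xsd:dateTime .
```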
Essentially, a graph term represents an occurrence of an RDF graph, i.e., a quoting or citing of the graph. Importantly, a graph term does not assert the contents of the RDF graph as being true (e.g., `:cervantes dc:wrote :moby_dick`). In fact, the graph term is interpreted as a resource on its own.
As with collections, graph terms can be used in any position in an N3 statement.
As they represent a quoting of RDF graphs, graph terms are not "referentially transparent". For instance:
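A sketch of such a statement; `:Superman` comes from the text, the other names are hypothetical:

```n3
@prefix : <https://example.org/> .

:LoisLane :believes { :Superman :can :fly } .
```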
This N3 statement states that Lois Lane believes that Superman can fly. Even if it is known that `:Superman` is the same as `:ClarkKent`, one cannot infer from this that Lois Lane believes that `:ClarkKent` can fly. Indeed, this is an accurate depiction of Lois Lane's statement at the time: as she did not know that Superman is Clark Kent at that point, she would certainly not be saying that Clark Kent can fly.
N3 Rules
N3 supports declarative programming by allowing authors to make statements about the world, including logical implications, or N3 rules, which can be loosely compared to If-Then statements. Based on the N3 semantics, so-called inferences can be drawn from these statements. An N3 reasoner can draw such inferences automatically, to support problem solving, decision making, or simply enriching your Knowledge Graph.
For instance, the following is an N3 rule:
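A minimal sketch, using the `:weather`, `:Raining`, and `:Cloudy` names from this section under an assumed namespace:

```n3
@prefix : <https://example.org/> .

# If it is raining, then it is cloudy.
{ :weather a :Raining } => { :weather a :Cloudy } .
```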
An implication is a logical construct, like a conjunction (AND) or a disjunction (OR). Stating such a logical implication means that, in case the rule premise is true (it is raining), the rule conclusion must be true as well (it is cloudy). If this were not the case, i.e., raining did not imply that it is cloudy, the implication would be false. However, this is not possible, since all statements asserted in N3 (and RDF) are taken to be true.
Hence, when the premise holds, then we can safely infer the conclusion. This is also referred to as firing the rule. For the example above, if we state `:weather a :Raining`, then we can safely infer `:weather a :Cloudy`.
An N3 rule is actually an N3 statement, where the subject and object constitute graph terms, and the predicate is `log:implies`, with symbol `=>` as syntactic sugar.
Rule Variables
N3 rules are more useful when they include variables. For instance, the following includes a universal variable (prefixed by `?`):
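A sketch with one fact and one rule, using names from this section under an assumed namespace:

```n3
@prefix : <https://example.org/> .

:spiderman a :SuperHero .

# Every super hero is imaginary.
{ ?x a :SuperHero } => { ?x a :Imaginary } .
```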
This states that the N3 rule is true for each value of variable `?x`: when the premise is true for a particular value for `?x` (being a super hero), then the conclusion must be true for that value as well (being imaginary). For the example above, the premise is true for value `:spiderman`, meaning we can infer the conclusion for that value, i.e., `:spiderman a :Imaginary`.
More technically, the triples in the rule premise (e.g., `?x a :SuperHero`) can be seen as triple patterns. In order to fire the rule, these triple patterns are matched to concrete triples in the N3 document. If a match is successful, concrete resources from the matching triple (e.g., `:spiderman`) are bound to the triple pattern's variables (e.g., `?x`). These bound variable values are then used in the inferred conclusion (e.g., `:spiderman a :Imaginary`).
Rule Chaining
In practice, a pivotal feature is that N3 rules can be "chained": a rule can depend on the results of other rules. For example:
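A sketch of three chained rules. Only `:locomotion` and `:flying` come from the text; all other names are assumptions:

```n3
@prefix : <https://example.org/> .

# Two separate ways of concluding a flying locomotion.
{ ?hero :hasAbility :flight } => { ?hero :locomotion :flying } .
{ ?craft a :Helicopter } => { ?craft :locomotion :flying } .

# A rule that builds on the conclusions of the two rules above.
{ ?observer :locomotion :flying .
  ?street a :Street ; :traffic :heavy
} => { ?observer :suitableObserverFor ?street } .
```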
The third N3 rule relies on the rule conclusion (`:locomotion`) of the first two rules: if one of those rules fires, i.e., inferring a flying locomotion for a resource, then the third rule would fire as well, i.e., inferring that the resource would be a suitable observer for a street with heavy traffic. This allows for a modularization of N3 code: the first two rules separately decide when something supports a flying locomotion; the third rule determines what the general effects of flying locomotion are.
Variable names only have to be unique within a rule, and not across rules. For instance, in the prior example, we simply could have named every variable `?x`. As with general coding practices, it is typically a good idea to give variables a meaningful name.
An N3 reasoner, which can draw inferences from N3 rules, can operate in forward chaining, backward chaining, or some hybrid mode. An N3 rule can also indicate which mode should be used, by using the appropriate predicate. Up until now, we have used `=>`, which is syntactic sugar for `log:implies` and directs the reasoner to operate in forward chaining mode. The `<=` predicate, which is syntactic sugar for `log:impliedBy`, indicates a backward chaining mode.
For instance, the following is equivalent to the prior example and combines forward and backward chaining:
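A self-contained sketch in which the two locomotion rules are written with `<=` and the third rule with `=>`. Apart from `:locomotion` and `:flying`, all names are assumptions:

```n3
@prefix : <https://example.org/> .

# Backward rules: conclusion on the left, premise on the right.
{ ?hero :locomotion :flying } <= { ?hero :hasAbility :flight } .
{ ?craft :locomotion :flying } <= { ?craft a :Helicopter } .

# Forward rule that relies on the backward rules above.
{ ?observer :locomotion :flying .
  ?street a :Street ; :traffic :heavy
} => { ?observer :suitableObserverFor ?street } .
```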
In a nutshell, forward reasoning presents a bottom-up approach: starting from a set of initial and inferred statements, a reasoner will fire any N3 rule where the premise holds, each time adding the inferences to the set, until no more N3 rules can be fired. Backward chaining is a top-down approach: given a query, such as `?x :locomotion :flying`, the reasoner will search for any rules with conclusions that may satisfy the query (e.g., first and second rule). Then, it will check whether the premises of those rules hold, which may, in itself, require searching for rules with matching conclusions. Regarding general expressivity, these two reasoning modes can be considered equivalent. A detailed discussion of these two reasoning modes, their subtle differences and impacts on performance, is beyond the scope of this specification.
N3 Builtins
An N3 builtin is used within an N3 rule to implement an arbitrary operation on an N3 resource or even an entire N3 document. Builtins include mathematical, string, time and cryptography operators, operations on collections and graphs, and logical operations in general.
Lists of built-ins
Built-ins are denoted by a controlled IRI defined in one of the core namespaces:
- Crypto – http://www.w3.org/2000/10/swap/crypto#,
- List – http://www.w3.org/2000/10/swap/list#,
- Log – http://www.w3.org/2000/10/swap/log#,
- Math – http://www.w3.org/2000/10/swap/math#,
- String – http://www.w3.org/2000/10/swap/string#, and
- Time – http://www.w3.org/2000/10/swap/time#.
Simple Builtins
As a simple example, the rule below uses the `math:quotient` builtin to calculate a person's height in meters:
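A sketch with hypothetical data and property names; the `math` prefix is bound to the SWAP namespace as conventionally written by N3 tooling:

```n3
@prefix : <https://example.org/> .
@prefix math: <http://www.w3.org/2000/10/swap/math#> .

:peter :heightInCm 180 .

{ ?person :heightInCm ?cm .
  ( ?cm 100 ) math:quotient ?m      # builtin statement: ?m is bound to ?cm divided by 100
} => { ?person :heightInMeters ?m } .
```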
An N3 builtin is used as the predicate in an N3 builtin statement, where the subject and object act as input or output arguments. In the example above, the second triple constitutes a builtin statement: the subject is the input, as `math:quotient` calculates the quotient of the subject collection elements (i.e., the person's height in cm and 100); the object constitutes the output, as the result is then bound to the object variable `?m`.
Builtins with Multiple Results
N3 builtins can generate multiple results. Below, we use the `list:iterate` builtin to iterate over collection elements, and subsequently construct a string for each element:
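A sketch of such a rule. The variable names and finalists follow the surrounding text, while `:competition`, `:finalists`, and `:entry` are assumptions:

```n3
@prefix : <https://example.org/> .
@prefix list: <http://www.w3.org/2000/10/swap/list#> .
@prefix string: <http://www.w3.org/2000/10/swap/string#> .

:competition :finalists ( :flash :superman :spiderman ) .

{ :competition :finalists ?finalists .
  ?finalists list:iterate ( ?i ?finalist ) .            # one result per member
  ( ?i ". " ?finalist ) string:concatenation ?entry     # evaluated for each result
} => { :competition :entry ?entry } .
```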
The `list:iterate` builtin iterates over all elements in the subject collection (`?finalists`).
For each element, it binds its index, and the element itself, to the variables in the object collection
(`?i` and `?finalist`, respectively).
Hence, the `list:iterate` builtin generates multiple sets of bindings for the `?i` and `?finalist`
variables:
( ?i = 1, ?finalist = :flash )
( ?i = 2, ?finalist = :superman )
( ?i = 3, ?finalist = :spiderman )
For each set of bindings, the `string:concatenation` builtin concatenates the resources in the subject
collection, and binds the result to the object variable (`?entry`).
This will lead to the following object values:
"`1. :flash`" , "`2. :superman`" , and "`3. :spiderman`"
A builtin statement can be seen as a triple pattern that, for each builtin result,
yields one or more "matching" triples depending on the builtin. For the example above:
( :flash :superman :spiderman ) list:iterate ( 1 :flash ) .
( :flash :superman :spiderman ) list:iterate ( 2 :superman ) .
( :flash :superman :spiderman ) list:iterate ( 3 :spiderman ) .
As with a SPARQL query, for each "matching triple",
a join is attempted with the remaining triples in the rule premise.
Hence, for each matching triple, the `string:concatenation` builtin statement is evaluated.
Finally, for each consistent set of variable bindings, the rule conclusion is instantiated and inferred.
Scoped Negation as Failure (SNAF)
N3 builtins can also operate on a local graph term or even an entire N3 document. The `log:collectAllIn`, `log:forAllIn`, and `log:notIncludes` builtins are logical operators that support a negation-as-failure scoped on the builtin statement object: the object can be either a graph term, which then acts as the scope, or a blank node, in which case the current document is the scope. For instance, the rule below uses `log:collectAllIn` to collect all of spiderman's defeated villains found in the current document:
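A sketch of such a rule. The `:defeated` property is an assumption; the variables, the `_:t` scope, and the `:defeatedEnemies` conclusion follow the surrounding text:

```n3
@prefix : <https://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:spiderman :defeated :green-goblin , :doctor-octopus .

{ ( ?enemy                                  # variable whose values are collected
    { :spiderman :defeated ?enemy }         # where clause
    ?enemies                                # resulting collection
  ) log:collectAllIn _:t                    # blank node: scope is the current document
} => { :spiderman :defeatedEnemies ?enemies } .
```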
The `log:collectAllIn` builtin accepts as subject a collection that includes:
- a variable of which the values will be collected (`?enemy`),
- a where clause with triple patterns that constrain this variable, and
- a variable bound to the resulting collection of values (`?enemies`).
For this example, the rule infers the following triple:
`:spiderman :defeatedEnemies ( :green-goblin :doctor-octopus ) .`
The example below uses `log:forAllIn` to determine whether a super-hero's identity is safe:
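A sketch of such a rule. The property names and the conclusion are assumptions; the persons and the `_:t` scope follow the surrounding text:

```n3
@prefix : <https://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:mary-jane-watson :knowsIdentityOf :spiderman ; :canKeepSecret true .
:aunt-may :knowsIdentityOf :spiderman ; :canKeepSecret true .

{ ( { ?person :knowsIdentityOf :spiderman }     # first clause
    { ?person :canKeepSecret true }             # second clause, must hold for every match
  ) log:forAllIn _:t
} => { :spiderman :identity :safe } .
```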
The `log:forAllIn` builtin accepts as subject a collection with two clauses of a logical implication. If, for all cases where the first clause holds, the second clause holds as well, then the logical implication is true. In this example, the implication is true when, for each person who knows the identity of spiderman, this person can keep a secret. In this case the rule will fire, as both `mary-jane-watson` and `aunt-may` can keep secrets.
In the two examples above, the blank node object `_:t` indicates the current N3 document as scope. Alternatively, a graph term can be used as the scope as well. This is illustrated in the `log:notIncludes` example below.
The `log:notIncludes` builtin checks whether the given scope does not include a given clause. The example below uses a graph term as scope, instead of the current N3 document, as was the case for the prior two examples. The rule finds any of spiderman's enemies who have not been defeated, at least, as reported by the Daily Bugle:
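A sketch of such a rule. The `:enemy`, `:reports`, and `:defeated` properties and the `:Undefeated` class are assumptions; `:daily_bugle`, `?enemy`, `?graph`, and `:sandman` follow the surrounding text:

```n3
@prefix : <https://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:spiderman :enemy :green-goblin , :doctor-octopus , :sandman .

# The report is a quoted graph term, used as the scope below.
:daily_bugle :reports { :spiderman :defeated :green-goblin , :doctor-octopus } .

{ :spiderman :enemy ?enemy .
  :daily_bugle :reports ?graph .
  ?graph log:notIncludes { :spiderman :defeated ?enemy }
} => { ?enemy a :Undefeated } .
```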
For each of spiderman's enemies (`?enemy`), the rule checks whether the graph term `?graph`, as reported by the `:daily_bugle`, does not include any statement where the enemy has been defeated. If so, we infer that this enemy is undefeated, in this case, `:sandman`.
The counterpart of `log:notIncludes` is `log:includes`, which checks whether the given object scope includes a given clause.
Access Online Knowledge
To dynamically retrieve online knowledge within an N3 rule, the `log:semantics` and `log:conclusion` builtins allow pulling in, parsing, and reasoning over logical expressions from online sources.
In the rule below, the `log:semantics` builtin retrieves an online source and subsequently parses it into a graph term:
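A sketch of such a rule. The source IRI, the `:Person` class, and the conclusion are hypothetical:

```n3
@prefix : <https://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

{ <https://example.org/people.n3> log:semantics ?graph .        # fetch and parse
  ( ?person { ?person a :Person } ?people ) log:collectAllIn ?graph
} => { :result :persons ?people } .
```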
Subsequently, `log:collectAllIn` is used to collect all persons found within the online source.
The `log:conclusion` builtin allows generating the closure of a given graph term, i.e., extending the graph with all applicable rule inferences. This includes graph terms retrieved and parsed from online sources. For example:
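A sketch under the assumption that the (hypothetical) online source contains both facts and rules; here the closure is simply attached to a `:result` resource:

```n3
@prefix : <https://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

{ <https://example.org/people.n3> log:semantics ?graph .
  ?graph log:conclusion ?closure        # ?closure: ?graph plus all applicable inferences
} => { :result :closure ?closure } .
```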
Instead of simply printing the inferences, one could, e.g., use the `log:forAllIn` builtin to check whether all persons listed in the online source are indeed inferred to be animals as well.
Resource Paths
Similar to SPARQL property paths, N3 resource paths concisely express paths between resources, when intermediary resources on the path are not relevant. In practice, resource paths are most useful in N3 rules.
A resource path starts from a subject resource, followed by one or more predicates; each predicate is separated by a directional indicator to follow the predicate either forward (`!`) or in reverse (`^`). For example, the following describes the city of Joe's mother's office's address:
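A sketch with hypothetical predicate names, where the subject of the statement is a resource path:

```n3
@prefix : <https://example.org/> .

:joe!:hasMother!:hasOffice!:hasAddress :hasCity "Metropolis" .
```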
In this example, the intermediary resources — Joe's mother, her office, and its address — are not described or referenced further, so the more concise resource path syntax can be used.
The expansion of this shorthand syntax uses blank nodes to express the path between the two resources. The example above is equivalent to the following (using blank node identifiers for clarity):
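For a path from `:joe` over mother, office, and address (predicate names and blank node labels are hypothetical), the expanded form could be sketched as:

```n3
@prefix : <https://example.org/> .

:joe :hasMother _:mother .
_:mother :hasOffice _:office .
_:office :hasAddress _:address .
_:address :hasCity "Metropolis" .
```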
In other words, each predicate in a resource path is expanded into a statement, with as subject either the starting resource, or prior blank node object; and as object a newly minted blank node.
Relations can also be followed in reverse using the reverse (`^`) indicator. The following example, starting from `joe`, follows the `hasMother` predicate to Joe's mother, and then follows the `hasMother` predicate in reverse (thus pointing to someone who has the same mother):
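A sketch in which the `:hasName` property and its value are assumptions:

```n3
@prefix : <https://example.org/> .

# Forward to Joe's mother, then backward to someone with that same mother.
:joe!:hasMother^:hasMother :hasName "Jim" .
```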
This could equally well be represented by the more verbose:
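For a path that goes forward and then in reverse over `:hasMother`, the verbose form could be sketched as follows (blank node labels and the `:hasName` statement are hypothetical):

```n3
@prefix : <https://example.org/> .

:joe :hasMother _:mother .
_:someone :hasMother _:mother .
_:someone :hasName "Jim" .
```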
In SPARQL, a property path is used in the predicate position to describe a string of relationships between a subject and an object resource. In contrast, an N3 resource path is most often used in the subject or object position. Using a resource path in the predicate position, although technically allowed, is often a mistake. For instance, the following N3 resource path:
`:joe :hasAddress!:hasCity "Metropolis" .`
would lead to the following expansion:
`:joe _:bn_1 "Metropolis" .`
`:hasAddress :hasCity _:bn_1 .`
which is likely not the intention of the author.
EBNF Grammar
The Turtle grammar was used as the starting point for the N3 grammar, which was subsequently adapted and extended with N3 constructs.
The N3 Working Group made the following decisions that modify the N3 grammar as originally presented in [[N3]]:
- Dropping the `@keywords` declaration. It is complex and difficult to explain. Also, when using the declaration, N3 documents look wholly different from when it is not being used, since it allows local names in the default namespace to be listed without the "`:`" symbol.
- Supporting all verb and boolean keywords (`is .. of`, `has`, `a`, `true`, `false`) both with and without "`@`" prefix. Turtle supports the `a` keyword, i.e., without an "`@`", but requires the symbol for the `@prefix` and `@base` declarations. The original N3 grammar required the "`@`" prefix for all verb and boolean keywords. Hence, this decision was made for compatibility with Turtle as well as to avoid an unintuitive grammar, i.e., where some keywords have the "`@`" prefix and some don't.
- Representing an inverted property using the `<-` symbol. The original N3 grammar allowed the inverting of a property by using the `@is .. @of` construct. But, this construct can be unintuitive when property names are more verbosely specified (e.g., `:hasFather`), leading to statements such as `?x @is :hasFather @of ?y`. The `is .. of` construct is still supported, but the above statement can now be represented as follows: `?x <- :hasFather ?y`.
Whitespace
White space (WS production) is used to separate terminals. The amount and type of white space (e.g., newline (`%0A`) or space (`%20`)) is only significant within terminals.
We note that the IRIREF production only allows IRI-encoded white spaces.
Escape sequences
There are three forms of escapes used in N3 documents:
- Escape sequences in string literals. These are characters that are traditionally escaped in strings:

Escape sequence | Unicode code point |
---|---|
'\t' | U+0009 |
'\b' | U+0008 |
'\n' | U+000A |
'\r' | U+000D |
'\f' | U+000C |
'\"' | U+0022 |
'\'' | U+0027 |
'\\' | U+005C |

- Numeric escape sequences. These represent Unicode code points:

Escape sequence | Unicode code point |
---|---|
'\u' hex hex hex hex | A Unicode character in the range U+0000 to U+FFFF inclusive corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit. |
'\U' hex hex hex hex hex hex hex hex | A Unicode character in the range U+0000 to U+10FFFF inclusive corresponding to the value encoded by the eight hexadecimal digits interpreted from most significant to least significant digit. |

- Escape sequences in local names. These escape the following reserved characters in the local name part of prefixed names:
`_ ~ . - ! $ & \ ( ) * + , ; = / ? # @ %`
(see the PN_LOCAL_ESC production)
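The first two escape forms can be decoded mechanically. The following Python sketch is illustrative only: `unescape_string` and `STRING_ESCAPES` are hypothetical names, and the function assumes it receives the body of a string literal with the surrounding quotes already removed. Local name escapes are not covered.

```python
import re

# Single-character string escapes, as listed in the first table above.
STRING_ESCAPES = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# A backslash followed by uXXXX, UXXXXXXXX, or (at most) one other character.
_ESCAPE = re.compile(
    r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.?))", re.DOTALL
)


def unescape_string(body: str) -> str:
    """Decode string and numeric escapes in the body of a string literal."""

    def replace(match: re.Match) -> str:
        hex4, hex8, char = match.groups()
        if hex4 is not None:
            return chr(int(hex4, 16))
        if hex8 is not None:
            # chr() raises ValueError for values beyond U+10FFFF.
            return chr(int(hex8, 16))
        if char in STRING_ESCAPES:
            return STRING_ESCAPES[char]
        raise ValueError(f"unknown escape sequence: \\{char}")

    return _ESCAPE.sub(replace, body)


print(unescape_string(r"caf\u00E9 \"au lait\""))
```

Scanning left to right with a single regular expression means an escaped backslash is consumed together with the character after it, so `\\n` decodes to a backslash followed by `n` rather than to a newline.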
IRI resolution
Relative IRIs are resolved with base IRIs using the algorithm in [[[RFC3986]]] [[RFC3986]] Section 5.2 "Relative Resolution" as supplemented by Section 6.5 of [[[RFC3987]]] [[RFC3987]].
The N3 `@base` or `BASE` directive can be used to define the Base IRI, per [[RFC3986]] Section 5.1.1 "Base URI Embedded in Content".
Each `@base` or `BASE` directive sets a new In-Scope Base IRI,
relative to the previous base IRI.
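As a rough illustration of how successive base directives chain, the following Python sketch uses the standard library's `urllib.parse.urljoin`, which performs RFC 3986 style reference resolution for common schemes such as `http`. The function name `in_scope_base` and the example IRIs are hypothetical, and the IRI-specific additions of RFC 3987 are not modelled.

```python
from urllib.parse import urljoin


def in_scope_base(document_iri, base_directives):
    """Fold successive @base/BASE IRIs into the final in-scope base IRI.

    Each directive is resolved against the base IRI that was in scope
    before it, starting from the IRI of the document itself.
    """
    base = document_iri
    for directive_iri in base_directives:
        base = urljoin(base, directive_iri)
    return base


# Two relative base directives, applied one after the other.
base = in_scope_base("http://example.org/docs/data.n3", ["ns/", "v2/"])
print(base)                     # http://example.org/docs/ns/v2/
print(urljoin(base, "alice"))   # http://example.org/docs/ns/v2/alice
print(urljoin(base, "../bob"))  # http://example.org/docs/ns/bob
print(urljoin(base, "#me"))     # http://example.org/docs/ns/v2/#me
```

An absolute IRI in a base directive simply replaces the previous base, since resolving an absolute reference returns that reference.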
Resource path resolution
Resource paths are resolved into zero or more N3 triples, and a single N3 triple element which is used as the expression value of the path.
This section describes two logically equivalent algorithms for transforming a path into a set of N3 triples, and providing a resource to use as the effective subject, predicate, or object in place of the original path expression.
Right to Left Algorithm
The first algorithm describes a means of processing a path starting from the right hand side of the path. This is useful for an implementation based on processing an Abstract Syntax Tree generated by a parser, or where the entire path is treated as a token and language-specific tools are used to process it further.
Processing is performed by recursively processing the path |p|, in reverse, from the last directional indicator (`!`) or (`^`). The result of resolving |p| into an expression and set of emitted N3 triples MUST be equivalent to using the following algorithm:
- If |p| matches the pathItem production, then |p| can be reduced no further; return |p| as the result.
- Otherwise, separate |p| into two components p_{n-1} and pred_n on the last occurrence of a directional indicator dir_n.
- Create obj_n by invoking this algorithm recursively using p_{n-1} for |p|.
- Create a novel blank node B_n.
- If dir_n is "`!`", emit a new N3 triple (obj_n pred_n B_n).
- Otherwise, dir_n is "`^`"; emit a new N3 triple (B_n pred_n obj_n).
- Return B_n as the result.
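The recursive steps can be sketched in Python as follows. This is a simplified illustration rather than a conforming implementation: the path is handled as a plain string, every path item is assumed to be free of `!` and `^` characters (so IRIs or literals containing them are out of scope), and the names `resolve_right_to_left` and `resolve_path` as well as the `_:bN` labels are hypothetical.

```python
from itertools import count


def resolve_right_to_left(path, triples, fresh):
    """Resolve `path`, appending emitted triples to `triples`.

    Returns the term to use in place of the path expression.
    """
    # Position of the last directional indicator, or -1 if there is none.
    index = max(path.rfind("!"), path.rfind("^"))
    if index == -1:
        return path  # a single path item cannot be reduced further

    prefix, direction, pred = path[:index], path[index], path[index + 1:]
    obj = resolve_right_to_left(prefix, triples, fresh)  # recurse on the rest
    bnode = f"_:b{next(fresh)}"                          # novel blank node
    if direction == "!":
        triples.append((obj, pred, bnode))
    else:  # direction == "^"
        triples.append((bnode, pred, obj))
    return bnode


def resolve_path(path):
    """Convenience wrapper returning (expression value, emitted triples)."""
    triples = []
    value = resolve_right_to_left(path, triples, count(1))
    return value, triples


value, triples = resolve_path(":joe!:hasMother^:hasMother")
print(value)
for triple in triples:
    print(*triple, ".")
```

Because the recursive call runs before the blank node for the current step is created, blank nodes end up numbered from the left of the path, even though the path is split from the right.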
Left to Right Algorithm
The second algorithm describes a means of processing a path starting from the left hand side of a path. This is useful for an implementation based on the parser productions described in the grammar that create events in this order (i.e., event-based).
Processing is performed by iteratively processing the path |p|, in the forward direction, from the first directional indicator (`!`) or (`^`). The result of resolving |p| into an expression and set of emitted N3 triples MUST be equivalent to using the following algorithm:
Initialize |n| to `0` and B_0 to the first pathItem in |p|; on each subsequent iteration, B_n is a novel blank node. Repeat the following steps until the algorithm returns.
- If |n| equals the number of directional indicators in |p|, return B_n.
- Increment |n|, set B_n to a novel blank node, set dir_n to the nth directional indicator, and set pred_n to the pathItem that follows it.
- If dir_n is "`!`", emit a new N3 triple (B_{n-1} pred_n B_n).
- Otherwise, if dir_n is "`^`", emit a new N3 triple (B_n pred_n B_{n-1}).
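The iterative steps can be sketched in Python as follows. As a simplifying assumption, the path is given as a plain string whose path items contain no `!` or `^` characters; the function name `resolve_left_to_right` and the `_:bN` labels are hypothetical.

```python
import re
from itertools import count


def resolve_left_to_right(path):
    """Return (expression value, emitted triples) for a resource path."""
    # Splitting on a capturing group keeps the indicators, so the tokens
    # alternate: pathItem, indicator, pathItem, indicator, pathItem, ...
    tokens = re.split(r"([!^])", path)
    current = tokens[0]  # the first path item
    triples = []
    fresh = count(1)

    for direction, pred in zip(tokens[1::2], tokens[2::2]):
        bnode = f"_:b{next(fresh)}"  # novel blank node for this step
        if direction == "!":
            triples.append((current, pred, bnode))
        else:  # direction == "^"
            triples.append((bnode, pred, current))
        current = bnode

    return current, triples


value, triples = resolve_left_to_right(":joe!:hasMother^:hasMother")
print(value)
for triple in triples:
    print(*triple, ".")
```

For a path used in the predicate position, such as `:hasAddress!:hasCity`, the same function returns the blank node that ends up as the predicate, together with the single emitted triple linking `:hasAddress` to it.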
Syntax shorthands
Similar to Turtle, N3 provides a special shorthand syntax for commonly used URIs. These shorthands may only be used in the predicate position. From the original Team Submission [[N3]]:

Shorthand | URI |
---|---|
`a` | `<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>` |
`=` | `<http://www.w3.org/2002/07/owl#sameAs>` |
`=>` | `<http://www.w3.org/2000/10/swap/log#implies>` |
`<=` | `<http://www.w3.org/2000/10/swap/log#impliedBy>` |
Grammar
A textual version of this grammar may be found here.
Relationship to Other Languages
Turtle
N3 is a superset of [[[turtle]]], meaning that all valid Turtle documents will be valid in N3 as well. The inverse is not true, i.e., a valid N3 document will not always be valid in Turtle.
The current N3 grammar started from the Turtle grammar which was adapted and extended to be in line with the original N3 grammar. Hence, many of the grammar productions will be much more similar to the Turtle grammar than the initial N3 grammar.
Important differences with Turtle are the following:
- Literals are allowed at any s/p/o position (i.e., subject, predicate or object) in a statement. See the pathItem production, which is (eventually) referenced by the "subject", "predicate" and "object" productions.
- N3 includes graph terms (i.e., between "`{`" and "`}`"), which are allowed in any s/p/o position in a statement. See the pathItem production.
- Quick-variables (e.g., "`?x`"), which are allowed in any s/p/o position in a statement. See the pathItem production.
- A path syntax, comparable to (but not quite as extensive as) the SPARQL 1.1 Property Path syntax. See the path production.
- The possibility to invert the predicate within a statement. See the predicate production.
- An additional set of keywords, including "`is .. of`", "`has`", "`=`", "`=>`", "`<=`", in addition to Turtle's "`a`" keyword (among others). All keywords can be optionally preceded by "`@`", for consistency with the "`@prefix`" and "`@base`" keywords.
SPARQL
The SPARQL 1.1 Query Language (SPARQL) [[SPARQL11-QUERY]] uses a Turtle-style [[turtle]] syntax for its TriplesBlock production. Differences between Turtle and SPARQL are elaborated in the Turtle specification.
Below, we indicate some important differences between the TriplesBlock production and N3:
- Like N3, SPARQL permits literals as the subject of RDF triples, but, in contrast to N3, it does not allow literals as the predicate of RDF triples. Similarly, N3 allows for blank node property lists and collections in any position, whereas SPARQL only allows them in the subject or object position.
- Like N3, SPARQL permits variables in any part of the triple. But, in contrast to N3, SPARQL allows writing variables as both `?name` and `$name`, whereas N3 only allows `?name`.
- N3 allows prefix and base directives anywhere outside of a triple. In SPARQL, they are only allowed in the Prologue (i.e., at the start of the SPARQL query). However, in general, we also recommend listing prefix and base directives at the start of N3 documents.
- In N3, most keywords (including the `@prefix` and `@base` directives) are case sensitive, but most keywords in SPARQL are case-insensitive (aside from `a`). An exception in N3 are the `PREFIX` and `BASE` directives, which are derived from SPARQL and are case insensitive in N3 as well.
- The N3 path syntax resembles the SPARQL 1.1 Property Path syntax, but there are important differences:
  - It is assumed that the path starts from a resource (IR) instead of a property (IP). Hence, paths are meant to be used in the subject and object positions, rather than the predicate position as is the case for SPARQL 1.1 property paths. This has important repercussions on how paths are resolved. Note that paths are unrelated to the inverted notation `^` for predicates (see the predicate production).
  - Compared to SPARQL, N3 only supports the SequencePath and InversePath expressions, but with syntactic differences: the '`!`' symbol is used to separate path items, whereas the '`^`' symbol is used to indicate an inverse predicate.
For more information, see the SPARQL Grammar section of [[[SPARQL11-QUERY]]].
TriG
[[[TRIG]]] is itself a superset of the Turtle syntax and includes a compact way to write RDF datasets, i.e., sets of named graphs. In particular, TriG allows the specification of so-called graph statements, which are a pair of an IRI or blank node label and a group of triple statements surrounded by "`{`" and "`}`".
For instance:
N3 is not directly compatible with TriG as it does not support this graph statement notation. Nevertheless, since N3 supports graph terms as part of regular N3 statements, authors can use the N3 Named Graphs extension; this extension allows associating names or identifiers with graph terms. For instance:
The N3 Named Graphs extension defines a set of builtins (used as predicates) to associate names or identifiers with graph terms, which then become "named graphs". Moreover, each predicate has a well-defined semantics on how the named graph should be interpreted: as graph terms (the default N3 interpretation), a partitioning of triples within a dataset context, sets of triples with their own isolated contexts, or specifying relations between local and online graphs.
Design Patterns
In this section, we present common patterns to solve often-occurring problems, for instance regarding data modeling, in N3.
N-ary Relations
Until now, we only considered binary relations between entities and/or values. But, many types of relations are ternary, quaternary, or, in general, n-ary in nature, i.e., they have an arbitrary number of participants. Typical examples are purchase, employment, or membership relations.
In other cases, we want to describe properties of relations — such as the provenance of a piece of information, or the probability of a diagnosis. But, in essence, this is the same problem as representing n-ary relations.
There are several ways of representing n-ary relations in RDF — these are described in [[[swbp-n-aryRelations]]].
Below, we illustrate options for representing n-ary relations in N3 in particular.
Using sets of binary relations
In general, it is possible to convert any n-ary relation into an equivalent set of binary relations. This is a convenient solution, since we already know how to represent binary relations.
First, we create a resource that represents the n-ary relation, and then use a set of binary relations to link each participant to this newly minted resource. Each binary relation is hereby given a meaningful name that represents the role of the participant in the n-ary relation.
For instance, say we want to describe the Purchase relation between a buyer called "John", a purchased book called "Lenny the Lion", the amount paid for the book, and the seller:
Either you could mint a new IRI for representing the n-ary relation, or simply use a blank node. In case other parties may want to refer to the n-ary relation from outside the N3 graph, one could choose to mint a new IRI.
In other cases, things are more naturally described as properties of relations, rather than n-ary relations — for instance, the provenance of a piece of information, the trend of someone's body temperature and when the temperature was taken. Nevertheless, these can be represented in the same way as n-ary relation participants.
We start from the same solution above, i.e., introducing a resource to represent the (in this case, binary) relation, and then linking the two participants to this resource. Subsequently, we use a set of binary relations to attach each descriptive property (e.g., diagnosis probability; temperature trend) to the relation resource.
For instance, when describing someone's (e.g., Christine) current temperature, you may want to indicate the absolute value (e.g., 40 degrees), a description of that value (e.g., elevated), the trend compared to the prior value (e.g., rising), and the time the temperature was taken:
This is possible since we know that the relation resource (e.g., `_:to1`) represents the n-ary relation. Hence, any descriptive properties of the relation, in addition to participants in the relation, can simply be attached to the entity.
In this example, we made a statement with one of the participants (`:Christine`) as subject, and the relation resource (`_:to1`) as object. An alternative would have been to add `:Christine` as just another element of the n-ary relation, e.g., using a property `temperatureOf`. Our modeling choice here aimed to indicate that Christine is somehow the "owner" of the relationship.
Using collections
An alternative solution is to use a collection to keep all the participants of the n-ary relation. For instance:
A clear advantage of this approach is that it is easier and much less verbose to write down. However, the role each participant plays in the n-ary relation is no longer made explicit. This is not a problem when the participants do not have different roles, as in the example above, where they all play the same role of group member.
Compound literal
This solution is inspired by a separate discussion within the RDF community on Language Tagged Strings. The essence of the discussion is to separate the string, as simple data, from its various characterizations, such as reading direction and language.
This design pattern uses the `rdf:CompoundLiteral` class together with the `rdf:language`, `rdf:direction`, and `rdf:value` properties to describe, respectively, the language, base direction, and string value of the subject.
Please note that these various ways are still in flux; see the "text direction" discussions in the RDF-star working group.
Graph Terms
Graph terms allow attaching metadata to groups of triples, such as the provenance, context, version, opinion, probability, etc. See the example below:
Importantly, as discussed in the graph terms section, a graph term does not assert the contents of the RDF graph as being true. This allows expressing examples such as above, where the statement within the graph term should not be asserted.
Embedding N3 in HTML documents
The HTML [[HTML5]] `script` element can be used to embed data blocks in documents. N3 can be easily embedded in `script` with the `type` attribute set to `text/n3`.
Such content may be escaped as indicated below:
- `&amp;`: & (ampersand, U+0026)
- `&lt;`: < (less-than sign, U+003C)
- `&gt;`: > (greater-than sign, U+003E)
- `&quot;`: " (quotation mark, U+0022)
- `&apos;`: ' (apostrophe, U+0027)
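The mapping from these five characters to the standard HTML character references (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`) can be applied with a one-pass translation table. The Python sketch below is illustrative only; `CHARACTER_REFERENCES` and `escape_for_html` are hypothetical names, and the standard library's `html.unescape` is used merely to check that the escaping is reversible.

```python
import html

# The five characters listed above and their named character references.
CHARACTER_REFERENCES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
}
_TABLE = str.maketrans(CHARACTER_REFERENCES)


def escape_for_html(n3_text):
    """Replace each listed character with its character reference."""
    # str.translate visits every input character exactly once, so the "&"
    # introduced by a replacement is never escaped a second time.
    return n3_text.translate(_TABLE)


snippet = '<#joe> :says "Tom & Jerry" .'
escaped = escape_for_html(snippet)
print(escaped)  # &lt;#joe&gt; :says &quot;Tom &amp; Jerry&quot; .

# Decoding the character references gives back the original N3 text.
assert html.unescape(escaped) == snippet
```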
When embedded in XHTML, N3 data blocks must be enclosed in CDATA sections. Those CDATA markers must be in N3 comments. If the character sequence `]]>` occurs in the document, it must be escaped using string escapes (`\u005d\u005d\u003e`). This will also make N3 safe in polyglot documents served as both `text/html` and `application/xhtml+xml`. Failing to use CDATA sections or escape `]]>` may result in a non-well-formed XML document.
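A minimal Python sketch of these two steps follows. It assumes that `]]>` can only occur inside string literals, where numeric escapes are available (`]` is U+005D and `>` is U+003E), and it shows one possible layout in which the CDATA markers sit on comment lines. The function names `escape_cdata_end` and `wrap_for_xhtml` are hypothetical.

```python
CDATA_END = "]]>"
# Numeric string escapes for the three characters of "]]>".
ESCAPED_CDATA_END = r"\u005d\u005d\u003e"


def escape_cdata_end(string_body):
    """Escape every "]]>" inside the body of a string literal."""
    return string_body.replace(CDATA_END, ESCAPED_CDATA_END)


def wrap_for_xhtml(n3_text):
    """Wrap N3 text in a script element with commented CDATA markers."""
    if CDATA_END in n3_text:
        raise ValueError('unescaped "]]>" would end the CDATA section early')
    return (
        '<script type="text/n3">\n'
        "# <![CDATA[\n"
        f"{n3_text}\n"
        "# ]]>\n"
        "</script>"
    )


literal = escape_cdata_end("a]]>b")
print(wrap_for_xhtml(f':s :p "{literal}" .'))
```

Rejecting input that still contains `]]>` keeps the sketch from silently producing a document that is not well-formed.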
Internet Media Type, File Extension and Macintosh File Type
This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.
- Contact:
- NAME
- Type name:
- text
- Subtype name:
- n3
- Optional parameters:
- `charset`: this parameter is required when transferring non-ASCII data. If present, the value of `charset` is always `UTF-8`.
- Encoding considerations:
- The syntax of Notation3 is expressed over code points in Unicode [[UNICODE]]. The encoding is always UTF-8 [[UTF-8]]. Unicode code points may also be expressed using an \uXXXX (U+0000 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-Fa-f]
- Security considerations:
- Notation3 is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [[RFC3023]] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning. Notation3 is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (e.g. PGP encryption, MD5 sum validation, password-protected compression) may also be used on Notation3 documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information. Notation3 can express data which is presented to the user, for example, RDF Schema labels. Application rendering strings retrieved from untrusted Notation3 documents must ensure that malignant strings may not be used to mislead the reader. The security considerations in the media type registration for XML ([[RFC3023]] section 10) provide additional guidance around the expression of arbitrary data and markup. Notation3 uses IRIs as term identifiers. Applications interpreting data expressed in Notation3 should address the security issues of Internationalized Resource Identifiers (IRIs) [[RFC3987]] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [[RFC3986]] Section 7. Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). 
A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Any person or application that is writing or interpreting data in Notation3 must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching of similar characters can be found in Unicode Security Considerations [[UNICODE-SECURITY]] and Internationalized Resource Identifiers (IRIs) [[RFC3987]] Section 8.
- Interoperability considerations:
- Not Applicable
- Published specification:
- This specification.
- Applications which use this media type:
- Any programming environment that requires the exchange of directed graphs. Implementations of Notation3 have been created for JavaScript, Python, Java, and Prolog. It may be used by some web services and clients consuming their data.
- Additional information:
- Magic number(s):
- Notation3 documents may have the strings 'prefix' or 'base' (case independent) near the beginning of the document.
- File extension(s):
- .n3
- Macintosh file type code(s):
- TEXT
- Person & email address to contact for further information:
- NAME <EMAIL>
- Intended usage:
- Common
- Restrictions on usage:
- None
- Author(s):
- Dörthe Arndt, William Van Woensel, Dominik Tomaszuk
- Change controller:
- W3C
Changes since the Team Submission
The following is a summary of changes made since the original Team Submission [[N3]]:
- Removed support for `@keywords` customizations.
- Removed `@a`, `@is`, `@of`, and `@has` in favor of `a`, `is`, `of` and `has`.
- Removed `@true` and `@false` in favor of `true` and `false`.
- Added `<-` to represent an inverted property as a synonym of `is` expression `of`.
- Removed support for explicit N3 quantifiers (`@forSome` and `@forAll`).
- Removed whitespace from the IRIREF grammar terminal to be consistent with Turtle.
There are more accumulated changes to account for, to be sure.
Comments
Comments are indicated using a "`#`" symbol outside an N3 terminal (e.g., IRIREF, STRING) and will continue until the end of the line (indicated by `\r`, `\n` or `\f`) or end of file, if there is no end of line marker. All recognized comment terminals will be skipped by the grammar (i.e., a resulting parser will not call listener or visitor code when encountering a comment).
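The rule that a "`#`" only starts a comment outside a terminal can be illustrated with a small Python sketch. The regular expressions below are rough approximations of the string and IRIREF terminals, not the grammar's actual productions, and `strip_comments` is a hypothetical helper name.

```python
import re

# Approximate terminals inside which "#" must NOT start a comment.
_KEEP = "|".join([
    r'"""(?:[^"\\]|\\.|"(?!""))*"""',        # long string, double quotes
    r"'''(?:[^'\\]|\\.|'(?!''))*'''",        # long string, single quotes
    r'"(?:[^"\\\r\n]|\\.)*"',                # short string, double quotes
    r"'(?:[^'\\\r\n]|\\.)*'",                # short string, single quotes
    r'<(?:[^\x00-\x20<>"{}|^`\\]|\\.)*>',    # IRI reference
    r"\\.",                                  # escaped character, e.g. \#
])
# A comment runs from "#" up to (not including) the end-of-line character.
_TOKEN = re.compile(rf"(?P<keep>{_KEEP})|(?P<comment>#[^\r\n\f]*)", re.DOTALL)


def strip_comments(text):
    """Remove comments while leaving strings and IRIs untouched."""
    return _TOKEN.sub(
        lambda m: "" if m.lastgroup == "comment" else m.group(0), text
    )


doc = (
    "@prefix : <http://example.org/#> . # declare the default prefix\n"
    ':s :p "not # a comment" . # trailing comment\n'
)
print(strip_comments(doc))
```

Matching the protected terminals first, and the comment pattern only as the last alternative, is what keeps the `#` in the IRI and in the string literal intact.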