Generating RDF from Tabular Data on the Web
W3C Recommendation
- This version:
- https://www.w3.org/TR/2015/REC-csv2rdf-20151217/
- Latest published version:
- https://www.w3.org/TR/csv2rdf/
- Latest editor's draft:
- https://w3c.github.io/csvw/csv2rdf/
- Test suite:
- https://www.w3.org/2013/csvw/tests/
- Implementation report:
- https://www.w3.org/2013/csvw/implementation_report.html
- Previous version:
- https://www.w3.org/TR/2015/PR-csv2rdf-20151117/
- Editors:
- Jeremy Tandy, Met Office
- Ivan Herman, W3C
- Gregg Kellogg, Kellogg Associates
- Repository:
- We are on GitHub
- File a bug
- Changes:
- Diff to previous version
- Commit history
Please check the errata for any errors or issues reported since publication.
This document is also available in this non-normative format: ePub
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2015 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
Abstract
This document defines the procedures and rules to be applied when converting tabular data into RDF. Tabular data may be complemented with metadata annotations that describe its structure, the meaning of its content and how it may form part of a collection of interrelated tabular data. This document specifies the effect of this metadata on the resulting RDF.
Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
The CSV on the Web Working Group was chartered to produce a recommendation "Access methods for CSV Metadata" as well as recommendations for "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various formats (e.g., RDF, JSON, or XML)". This document aims to satisfy the RDF variant of the mapping recommendation.
This document was published by the CSV on the Web Working Group as a Recommendation. If you wish to make comments regarding this document, please send them to public-csv-wg@w3.org (subscribe, archives). All comments are welcome.
Please see the Working Group's implementation report.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
1. Introduction
This document describes the processing of tabular data to create RDF subject-predicate-object triples [rdf11-concepts]. Since RDF is an abstract syntax, these triples MAY be serialized in a concrete RDF syntax such as N-Triples [n-triples], Turtle [turtle], RDFa [rdfa-primer], JSON-LD [json-ld], or TriG [trig]. The RDF serializations offered by a conversion application are implementation defined.
The [tabular-data-model] defines an annotated tabular data model consisting of tables, columns, rows, and cells, enriched with annotations that describe the structure of the tabular data and the meaning of its content. A group of tables is a collection of tables published as a single atomic unit.
The conversion procedure described in this specification operates on the annotated tabular data model. This specification does not specify the processes needed to convert CSV-encoded data into tabular data form. Please refer to [tabular-data-model] for details of parsing tabular data.
Conversion applications MUST provide at least two modes of operation: standard and minimal.
Standard mode conversion frames the information gleaned from the cells of the tabular data with details of the rows, tables, and a group of tables within which that information is provided.
Minimal mode conversion includes only the information gleaned from the cells of the tabular data.
Standard and minimal conversion are described normatively below.
Conversion applications MAY offer additional implementation specific conversion modes.
Transformation definitions, as defined in [tabular-metadata], MAY be used to specify how tabular data can be transformed into another format using a script or template. Such transformation definitions MAY use the RDF output described in this specification as input.
There is no requirement on conversion applications to check the semantic consistency of the data during the conversion, nor to validate the triples against RDF schema. Downstream applications SHOULD be aware of the potential for inconsistencies and take appropriate action.
2. Conformance
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, and SHOULD are to be interpreted as described in [RFC2119].
Tabular data MUST conform to the description from [tabular-data-model]. In particular note that each row MUST contain the same number of cells (although some of these cells may be empty).
Not all CSV-encoded data can be parsed into a tabular data model. An algorithm for parsing CSV-based files is described in [tabular-data-model].
This specification makes use of the compact IRI syntax; please refer to the Compact IRIs section of [json-ld].
This specification makes use of the following namespaces:
- csvw: http://www.w3.org/ns/csvw#
- rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
- xsd: http://www.w3.org/2001/XMLSchema#
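As an illustration of how a compact IRI such as csvw:Table relates to a full IRI, the following is a minimal sketch and not part of the specification. The function name expand_compact_iri and the NAMESPACES table are hypothetical; the namespace IRIs are the ones used by the @prefix declarations in the Turtle examples of this document.

```python
# Hypothetical prefix table; the IRIs match the @prefix declarations
# used by the Turtle examples in this document.
NAMESPACES = {
    "csvw": "http://www.w3.org/ns/csvw#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_compact_iri(value, namespaces=NAMESPACES):
    """Expand prefix:name to a full IRI; return any other value unchanged."""
    prefix, sep, local = value.partition(":")
    # A "//" right after the colon indicates an absolute IRI (http://...),
    # not a prefixed name, so it is left alone.
    if sep and prefix in namespaces and not local.startswith("//"):
        return namespaces[prefix] + local
    return value


print(expand_compact_iri("csvw:Table"))
print(expand_compact_iri("xsd:integer"))
```

Values that do not use a known prefix are returned as they are, which keeps the helper safe to call on absolute URLs.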
3. Typographical conventions
The following typographic conventions are used in this specification:
- markup
- Markup (elements, attributes, properties), machine processable values (string, characters, media types), property name, or a file name is in red-orange monospace font.
- variable
- A variable in pseudo-code or in an algorithm description is in italics.
- definition
- A definition of a term, to be used elsewhere in this or other specifications, is in bold and italics.
- definition reference
- A reference to a definition in this document is underlined and is also an active link to the definition itself.
- markup definition reference
- A reference to a definition in this document, when the reference itself is also markup, is underlined, in red-orange monospace font, and is also an active link to the definition itself.
- external definition reference
- A reference to a definition in another document is underlined, in italics, and is also an active link to the definition itself.
- markup external definition reference
- A reference to a definition in another document, when the reference itself is also a markup, is underlined, in italics red-orange monospace font, and is also an active link to the definition itself.
- hyperlink
- A hyperlink is underlined and in blue.
- [reference]
- A document reference (normative or informative) is enclosed in square brackets and links to the references section.
Notes are in light green boxes with a green left border and with a "Note" header in green. Notes are normative or informative depending on whether they are in a normative or informative section, respectively.
Examples are in light khaki boxes, with khaki left border, and with a numbered "Example" header in khaki. Examples are always informative. The content of the example is in monospace font and may be syntax colored.
4. Converting Tabular Data to RDF
The procedures for converting tabular data into RDF are described below for both standard and minimal modes.
4.1 Algorithm terms
- about URL
- The about URL annotation on the current cell. As defined in [tabular-data-model].
- annotated table
- The annotated table is defined in [tabular-data-model] as describing a particular table and its annotations.
- blank node
- A blank node is defined in [rdf11-concepts] as an RDF Term disjoint from IRIs or literals.
- cell
- A cell is defined in [tabular-data-model] as the intersection of a row and a column within a table.
- cell errors
- Cell errors are defined in [tabular-data-model] as a (possibly empty) list of validation errors generated while parsing the literal content of a cell to generate the semantic value.
- cell value
- A cell value is defined in [tabular-data-model] as the semantic value of the cell; this MAY be null or a sequence of values.
- column
- A column is defined in [tabular-data-model] as a vertical arrangement of cells within a table.
- group of tables
- A group of tables is defined in [tabular-data-model] as comprising a set of annotated tables and a set of annotations that relate to that group.
- group of tables identifier
- The group of tables identifier is the id annotation on a group of tables. As defined in [tabular-data-model].
- literal node
- A literal node is defined in [rdf11-concepts] as a node within an RDF graph that provides values such as strings, numbers, and dates.
- node
- A node is defined in [rdf11-concepts] as a subject or an object of an RDF triple. When in subject position, it can be either a blank node or identified with a URL; when in object position, it can be a blank node, a literal, or identified with a URL.
- non-core annotations
- Core annotations are listed in [tabular-data-model]; groups of tables and tables may also have other annotations that are not defined in that specification; these are known as non-core annotations.
- notes
- A list of notes, as defined in [tabular-data-model], attached to an annotated table or group of tables using the notes property. This may be an empty list.
- predicate
- A predicate is defined in [rdf11-concepts] as an IRI that denotes the property used to relate nodes within an RDF triple.
- prefixed name
- A prefixed name is an abbreviation for a URI, in the syntax prefix:name. See Names of Common Properties in [tabular-metadata] for information on expansion.
- property URL
- The property URL annotation on the current cell. As defined in [tabular-data-model].
- row
- The row is defined in [tabular-data-model] as a horizontal arrangement of cells within a table.
- row number
- A row number is defined in [tabular-data-model] as the position of the row within the table, starting from 1.
- row source number
- A row source number is defined in [tabular-data-model] as the position of the row within the source tabular data file. Provision of the row source number is dependent on parsing applications and may be reported as null.
- subject
- Within this algorithm, a subject is the resource that the value of a given cell refers to. This may be specified using about URL.
- table identifier
- The table identifier is the id annotation on an annotated table. As defined in [tabular-data-model].
- tabular data mapping
- The mapping from tabular data to RDF, as defined by this Recommendation.
- value URL
- The value URL annotation on the current cell. As defined in [tabular-data-model].
4.2 Generating RDF
A conformant RDF conversion application MUST emit triples conforming to those described in this algorithm according to the chosen mode of conversion: standard or minimal.
Unless specified otherwise, the steps in the algorithm defined herein apply to both standard and minimal modes.
Where an annotated table is defined in isolation (e.g. in the absence of a group of tables), a default group of tables is provided with a single tables annotation that refers to the given table.
The [tabular-data-model] specifies that string values within tabular data (such as column titles or cell string values) MUST contain only Unicode characters. No Unicode normalization (as specified in [UAX15]) is applied to these string values during the conversion to RDF.
If a CSV file is originally encoded as UTF-8, it should not go through Unicode normalization during parsing, nor in conversion to RDF. This can result in RDF literals that are not in Normal Form C as they should be according to [rdf11-concepts].
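The effect of not normalizing can be seen in a short sketch. The sample string and the function name to_literal_lexical_form are assumptions made for illustration; the function only stands in for the rule that the conversion passes string values through unchanged.

```python
import unicodedata

# Sample value (an assumption): "e" followed by U+0301 COMBINING ACUTE ACCENT,
# i.e. a decomposed spelling that is not in Normal Form C.
decomposed = "Cafe\u0301"


def to_literal_lexical_form(cell_string):
    """Return the string untouched; no Unicode normalization is applied."""
    return cell_string


literal = to_literal_lexical_form(decomposed)
nfc = unicodedata.normalize("NFC", literal)

# The emitted literal keeps the decomposed form, so it differs from its
# NFC-normalized counterpart.
print(literal == nfc)          # False
print(len(literal), len(nfc))  # 5 4
```

A downstream application that compares literals by code point would treat the two spellings as different values, which is why the warning above matters.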
-
In standard mode only, establish a new node G.
If the group of tables has an identifier then node G MUST be identified accordingly; else if identifier is null, then node G MUST be a new blank node.
-
In standard mode only, specify the type of node G as csvw:TableGroup; emit the following triple:
- subject
- node G
- predicate
- rdf:type
- object
- csvw:TableGroup
-
In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any notes and non-core annotations specified for the group of tables, with node G as an initial subject, the notes or non-core annotation as property, and the value of the notes or non-core annotation as value.
-
For each table where the suppress output annotation is false:
-
In standard mode only, establish a new node T which represents the current table.
If the table has an identifier then node T MUST be identified accordingly; else if identifier is null, then node T MUST be a new blank node.
-
In standard mode only, relate the table to the group of tables; emit the following triple:
- subject
- node G
- predicate
- csvw:table
- object
- node T
-
In standard mode only, specify the type of node T as csvw:Table; emit the following triple:
- subject
- node T
- predicate
- rdf:type
- object
- csvw:Table
-
In standard mode only, specify the source tabular data file URL for the current table based on the url annotation; emit the following triple:
- subject
- node T
- predicate
- csvw:url
- object
- a node identified by URL
-
In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any notes and non-core annotations specified for the table, with node T as an initial subject, the notes or non-core annotation as property, and the value of the notes or non-core annotation as value.
Note: All other core annotations for the table are ignored during the conversion, including information about table schemas and their columns, foreign keys, table direction, transformations, etc.
-
For each row in the current table:
-
In standard mode only, establish a new blank node R which represents the current row.
-
In standard mode only, relate the row to the table; emit the following triple:
- subject
- node T
- predicate
- csvw:row
- object
- node R
-
In standard mode only, specify the type of node R as csvw:Row; emit the following triple:
- subject
- node R
- predicate
- rdf:type
- object
- csvw:Row
-
In standard mode only, specify the row number n for the row; emit the following triple:
- subject
- node R
- predicate
- csvw:rownum
- object
- a literal n; specified with datatype IRI xsd:integer
-
In standard mode only, specify the row source number nsource for the row within the source tabular data file URL using a fragment-identifier as specified in [RFC7111]; if row source number is not null, emit the following triple:
- subject
- node R
- predicate
- csvw:url
- object
- a node identified by URL#row=nsource
-
In standard mode only, if row titles is not null, insert any titles specified for the row. For each value, tv, of the row titles annotation, emit the following triple:
- subject
- node R
- predicate
- csvw:title
- object
- a literal tv; specified with the appropriate language tag (as defined in [rdf11-concepts]) for that row title annotation value
-
In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any non-core annotations specified for the row, with node R as an initial subject, the non-core annotation as property, and the value of the non-core annotation as value.
-
Establish a new blank node Sdef to be used as the default subject for cells where about URL is undefined.
Note: A row MAY describe multiple interrelated subjects, where the value URL annotation on one cell matches the about URL annotation on another cell in the same row.
-
For each cell in the current row where the suppress output annotation for the column associated with that cell is false:
-
Establish a node S from about URL if set, or from Sdef otherwise as the current subject.
-
In standard mode only, relate the current subject to the current row; emit the following triple:
- subject
- node R
- predicate
- csvw:describes
- object
- node S
-
If the value of property URL for the cell is not null, then predicate P takes the value of property URL.
Else, predicate P is constructed by appending the value of the name annotation for the column associated with the cell to the tabular data file URL as a fragment identifier.
-
If the value URL for the current cell is not null, then value URL identifies a node Vurl that is related to the current subject using the predicate P; emit the following triple:
- subject
- node S
- predicate
- P
- object
- node Vurl
Else, if the cell value is a list and the cell ordered annotation is true, then the cell value provides an ordered sequence of literal nodes for inclusion within the RDF output using an instance of rdf:List Vlist as defined in [rdf-schema]. This instance is related to the subject using the predicate P; emit the triples defining list Vlist plus the following triple:
- subject
- node S
- predicate
- P
- object
- node Vlist
Else, if the cell value is a list, then the cell value provides an unordered sequence of literal nodes for inclusion within the RDF output, each of which is related to the subject using the predicate P. For each value provided in the sequence, add a literal node Vliteral; emit the following triple:
- subject
- node S
- predicate
- P
- object
- literal node Vliteral
Else, if the cell value is not null, then the cell value provides a single literal node Vliteral for inclusion within the RDF output that is related to the current subject using the predicate P; emit the following triple:
- subject
- node S
- predicate
- P
- object
- literal node Vliteral
The literal nodes derived from the cell values MUST be expressed according to the cell value's datatype as defined below: Interpreting datatypes.
Note: In the case when a cell value does not have a datatype, the conversion SHOULD default to string.
Note: In the case where a sequence of values is provided, each value in the list has its own datatype; the datatype may be different for different items in the sequence.
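The cell-level part of the algorithm can be sketched for minimal mode as follows. This is an illustrative simplification under stated assumptions, not a conformant converter: the function name convert_minimal, the dictionary shapes for columns and cells, and the blank node labels are all hypothetical, literals and IRIs are both represented as plain Python values, datatypes are ignored, and an ordered list is represented by a tuple rather than by rdf:List triples.

```python
from itertools import count


def convert_minimal(table_url, columns, rows):
    """Return (subject, predicate, object) tuples for minimal mode.

    columns: list of dicts with "name" and optional "suppress_output".
    rows:    list of rows; each row is a list of cell dicts with optional
             keys "about_url", "property_url", "value_url", "value", "ordered".
    All of these shapes are assumptions made for this sketch.
    """
    blank_ids = count(1)
    triples = []
    for row in rows:
        # One default subject (S_def) per row, shared by all cells
        # that have no about URL.
        default_subject = f"_:b{next(blank_ids)}"
        for column, cell in zip(columns, row):
            if column.get("suppress_output", False):
                continue
            subject = cell.get("about_url") or default_subject
            # Predicate: property URL if set, else table URL + "#" + column name.
            predicate = cell.get("property_url") or f"{table_url}#{column['name']}"
            value = cell.get("value")
            if cell.get("value_url") is not None:
                triples.append((subject, predicate, cell["value_url"]))
            elif isinstance(value, list) and cell.get("ordered", False):
                # Stand-in for an rdf:List; a real converter emits list triples.
                triples.append((subject, predicate, tuple(value)))
            elif isinstance(value, list):
                triples.extend((subject, predicate, item) for item in value)
            elif value is not None:
                triples.append((subject, predicate, value))
    return triples


columns = [{"name": "countryCode"}, {"name": "latitude"}]
rows = [[{"value": "AD"}, {"value": "42.5"}]]
for triple in convert_minimal("http://example.org/countries.csv", columns, rows):
    print(triple)
```

Standard mode would additionally emit the group, table and row triples described in the steps marked "In standard mode only".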
4.3 Interpreting datatypes
Cell values are expressed in the RDF output according to the cell value's datatype. The relationship between the value of the cell value's datatype and the datatype IRI used in the RDF output is as follows:
- if the datatype's id annotation is not
null
, then its value MUST be used as the RDF datatype IRI; - else, the datatype's base annotation value MUST be mapped to the RDF datatype IRI as shown below:
datatype | RDF datatype IRI | Remarks |
---|---|---|
anyAtomicType | xsd:anyAtomicType | |
anyURI | xsd:anyURI | |
base64Binary | xsd:base64Binary | |
boolean | xsd:boolean | |
date | xsd:date | |
dateTime | xsd:dateTime | |
dateTimeStamp | xsd:dateTimeStamp | |
decimal | xsd:decimal | |
integer | xsd:integer | |
long | xsd:long | |
int | xsd:int | |
short | xsd:short | |
byte | xsd:byte | |
nonNegativeInteger | xsd:nonNegativeInteger | |
positiveInteger | xsd:positiveInteger | |
unsignedLong | xsd:unsignedLong | |
unsignedInt | xsd:unsignedInt | |
unsignedShort | xsd:unsignedShort | |
unsignedByte | xsd:unsignedByte | |
nonPositiveInteger | xsd:nonPositiveInteger | |
negativeInteger | xsd:negativeInteger | |
double | xsd:double | |
duration | xsd:duration | |
dayTimeDuration | xsd:dayTimeDuration | |
yearMonthDuration | xsd:yearMonthDuration | |
float | xsd:float | |
gDay | xsd:gDay | |
gMonth | xsd:gMonth | |
gMonthDay | xsd:gMonthDay | |
gYear | xsd:gYear | |
gYearMonth | xsd:gYearMonth | |
hexBinary | xsd:hexBinary | |
QName | xsd:QName | |
string | xsd:string or rdf:langString | Choice depends on whether or not the value has an associated language |
normalizedString | xsd:normalizedString | |
token | xsd:token | |
language | xsd:language | |
Name | xsd:Name | |
NMTOKEN | xsd:NMTOKEN | |
xml | rdf:XMLLiteral | |
html | rdf:HTML | |
json | csvw:JSON | csvw:JSON is a sub-class of xsd:string |
time | xsd:time | |
A datatype's format annotation is irrelevant to the conversion procedure defined in this specification; the cell value has already been parsed from the contents of the cell according to the format annotation.
Cell errors MUST be recorded by applications where the contents of a cell cannot be parsed or validated (see Parsing Cells and Validating Tables in [tabular-data-model] respectively). In cases where cell errors are recorded, applications may attempt to determine the appropriate RDF datatype IRI during the subsequent conversion process according to local rules.
In the case of rdf:langString, the appropriate language tag (as defined in [rdf11-concepts]) MUST be provided for the string, based on the value of cell value's language.
(See section on Graph Literals in [rdf11-concepts] for further details on language tagged literals.)
According to [rdf11-concepts] language tags cannot be combined with other xsd datatypes. If a cell has any datatype other than string, the value of lang MUST be ignored. Also, all literals have a datatype; however, specific serializations, like [turtle], MAY provide a special syntax for literals with datatype xsd:string or rdf:langString.
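The datatype rules of this section can be condensed into a small lookup, sketched below. The function name rdf_datatype and the dictionary shape used for a datatype (optional "id" and "base" keys) are assumptions for illustration; the sketch does not check that the base name actually appears in the table above.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CSVW = "http://www.w3.org/ns/csvw#"

# Base datatypes whose RDF datatype IRI is not simply xsd:<base>.
SPECIAL = {
    "xml": RDF + "XMLLiteral",
    "html": RDF + "HTML",
    "json": CSVW + "JSON",
}


def rdf_datatype(datatype, lang=None):
    """Return (datatype_iri, language_tag_or_None) for a cell value.

    datatype: dict with optional "id" and "base" keys (hypothetical shape).
    lang:     the language associated with the cell value, if any.
    """
    # Rule 1: an explicit id annotation is used as the datatype IRI.
    if datatype.get("id") is not None:
        return datatype["id"], None
    # Rule 2: otherwise map the base annotation; default to string.
    base = datatype.get("base", "string")
    if base == "string":
        if lang:
            return RDF + "langString", lang
        return XSD + "string", None
    # Any datatype other than string ignores the language.
    return SPECIAL.get(base, XSD + base), None


print(rdf_datatype({"base": "string"}, lang="en"))
print(rdf_datatype({"base": "integer"}, lang="en"))
print(rdf_datatype({"base": "json"}))
```

Returning the language tag alongside the IRI keeps the "string or langString" choice in one place, so a serializer only needs to look at the returned pair.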
5. Inclusion of provenance information
This section is non-normative.
In addition to the namespaces defined above, the following namespace is used in this section:
- prov: http://www.w3.org/ns/prov#
Conversion applications MAY include provenance information in the RDF output describing how and when the output was created; e.g., using terms from the PROV Ontology [prov-o]. Information that may be of interest to downstream applications includes:
- the source tabular data file;
- the metadata description file(s) used;
- when the conversion to RDF occurred; and
- the conversion application used.
In order to facilitate providing such information, this specification introduces two instances of prov:Role:
- csvw:csvEncodedTabularData
- Defines the role of the source tabular data file.
- csvw:tabularMetadata
- Defines the role of the metadata description file.
Example provenance information for the conversion application https://example.org/my-csv2rdf-application:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<> prov:wasGeneratedBy [
    a prov:Activity ;
    prov:wasAssociatedWith <http://example.org/my-csv2rdf-application> ;
    prov:startedAtTime "2015-02-13T15:12:44"^^xsd:dateTime ;
    prov:endedAtTime "2015-02-13T15:12:46"^^xsd:dateTime ;
    prov:qualifiedUsage [
      a prov:Usage ;
      prov:entity <http://example.org/csv/data.csv> ;
      prov:hadRole csvw:csvEncodedTabularData
    ];
    prov:qualifiedUsage [
      a prov:Usage ;
      prov:entity <http://example.org/csv/data.csv-metadata.json> , <http://example.org/csv/csv-metadata.json> ;
      prov:hadRole csvw:tabularMetadata
    ];
  ] .
6. JSON-LD to RDF
This section defines a mechanism for transforming the [json-ld] dialect used for non-core annotations and notes originating from the processing of metadata (as defined in [tabular-metadata]) into RDF in a manner consistent with the Deserialize JSON-LD to RDF Algorithm defined in [json-ld-api]. Converters MAY use any algorithm which results in equivalent triples.
Conversion applications may have other means to create annotated tables, e.g. through some application specific APIs. In such cases the exact format for non-core annotations or notes may be different. Specifications for such annotation processes should specify how these annotations are converted into RDF.
Given a subject, property and value in normalized form:
- Property is a term defined in the [csvw-context], a prefixed name, or an absolute URL; expand to an absolute URL by replacing a term with the URI from the term definition in [csvw-context] or a prefixed name as described in Names of Common Properties in [tabular-metadata].
- If value is an array, generate RDF by running this algorithm with subject and property, using each array member as value.
- If value is an object containing @value, create an RDF Literal lit using the string value of @value and language from @language, or datatype from @type if present, expanding @type as necessary using the procedure outlined for property, and emit the following triple:
  - subject
  - node subject
  - predicate
  - property
  - object
  - literal node lit
  Note: If neither @language nor @type is present, the literal lit has the datatype xsd:string.
- Else, if value is an object:
  - Establish a new node S that is identified with the value of @id if defined, or else as a blank node, and emit the following triple:
    - subject
    - node subject
    - predicate
    - property
    - object
    - node S
  - For every value of @type, either a term defined in the [csvw-context], a prefixed name, or an absolute URL; establish a new node Ti by expanding the value to an absolute URL by replacing a term with the URI from the term definition in [csvw-context] or a prefixed name with its expanded value. For each Ti, emit the following triple:
    - subject
    - node S
    - predicate
    - rdf:type
    - object
    - node Ti
  - For every key and val from value where key does not start with @ (U+0040), generate RDF by running this algorithm using S for subject, key for property and val for value.
- Else, establish lit as an RDF Literal as follows:
  - If value is true or false, create an RDF Literal lit using the strings "true" or "false", accordingly with datatype xsd:boolean.
  - Else, if value is a JSON number with a non-zero fractional part, create an RDF Literal lit using the canonical representation for value with datatype xsd:double.
  - Else, if value is a JSON number with no non-zero fractional part, create an RDF Literal lit using the canonical representation for value with datatype xsd:integer.
  - Otherwise, create an RDF Literal lit using the canonical representation for value with datatype xsd:string.
  Emit the following triple:
  - subject
  - node subject
  - predicate
  - property
  - object
  - literal node lit
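The recursion described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not a replacement for the Deserialize JSON-LD to RDF Algorithm: the function name jsonld_to_rdf, the expand hook (which stands in for term and prefixed name expansion and defaults to the identity), the blank node labels, and the representation of literals as (lexical form, datatype or @language) tuples are all hypothetical. The sketch assumes @value holds a string, and Python's repr is used for doubles, which is a valid but not canonical xsd:double lexical form.

```python
from itertools import count

XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
_blank_ids = count(1)


def jsonld_to_rdf(subject, prop, value, expand=lambda term: term):
    """Return a list of triples for one (subject, property, value)."""
    prop = expand(prop)
    if isinstance(value, list):
        # Arrays: run the algorithm once per member.
        return [t for member in value
                for t in jsonld_to_rdf(subject, prop, member, expand)]
    if isinstance(value, dict) and "@value" in value:
        if "@language" in value:
            lit = (str(value["@value"]), "@" + value["@language"])
        else:
            lit = (str(value["@value"]), expand(value.get("@type", XSD + "string")))
        return [(subject, prop, lit)]
    if isinstance(value, dict):
        # Node objects: identified by @id, or else a fresh blank node.
        node = value.get("@id") or f"_:n{next(_blank_ids)}"
        triples = [(subject, prop, node)]
        types = value.get("@type", [])
        for type_name in ([types] if isinstance(types, str) else types):
            triples.append((node, RDF_TYPE, expand(type_name)))
        for key, val in value.items():
            if not key.startswith("@"):
                triples.extend(jsonld_to_rdf(node, key, val, expand))
        return triples
    # Native JSON values. bool is tested first because in Python
    # bool is a subclass of int.
    if isinstance(value, bool):
        lit = ("true" if value else "false", XSD + "boolean")
    elif isinstance(value, float) and not value.is_integer():
        lit = (repr(value), XSD + "double")
    elif isinstance(value, (int, float)):
        lit = (str(int(value)), XSD + "integer")
    else:
        lit = (str(value), XSD + "string")
    return [(subject, prop, lit)]


for triple in jsonld_to_rdf("http://example.org/t", "dc:modified",
                            {"@value": "2010-12-31", "@type": "xsd:date"}):
    print(triple)
```

Passing a real expansion function as expand would turn dc:modified and xsd:date into absolute URLs; with the identity default they are left as written.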
7. Examples
This section is non-normative.
In addition to the namespaces defined above, the examples provided here make use of the following namespaces:
- dc: http://purl.org/dc/terms/
- foaf: http://xmlns.com/foaf/0.1/
- oa: http://www.w3.org/ns/oa#
- org: http://www.w3.org/ns/org#
- schema: http://schema.org/
Furthermore, these examples also make use of the Turtle syntax @base declaration (as defined in [turtle]). Where a single tabular data file is used in the example, the @base declaration is set to the URL of that tabular data file.
Each successive example expresses a more complex conversion; readers of this specification are advised to work through the examples in sequential order.
7.1 Simple example
This example comprises a single annotated table containing attributes of countries: country code, position (latitude, longitude) and name. Whilst the input tabular data file, published at https://example.org/countries.csv, includes a header line, no further metadata annotations are given. The tabular data file is provided below:

countryCode,latitude,longitude,name
AD,42.5,1.6,Andorra
AE,23.4,53.8,"United Arab Emirates"
AF,33.9,67.7,Afghanistan

The annotated table generated from parsing the tabular data file is shown below and provides the basis for the conversion to RDF.
Annotations for the resulting table T, with 4 columns and 3 rows, are shown below (every column other than id is a core annotation):
id | url | columns | rows |
---|---|---|---|
T | https://example.org/countries.csv | C1, C2, C3, C4 | R1, R2, R3 |
Annotations for the columns, rows and cells in table T are shown in the tables below:
Column annotations (every column other than id is a core annotation):
id | table | number | source number | cells | name | titles |
---|---|---|---|---|---|---|
C1 | T | 1 | 1 | C1.1, C2.1, C3.1 | countryCode | countryCode |
C2 | T | 2 | 2 | C1.2, C2.2, C3.2 | latitude | latitude |
C3 | T | 3 | 3 | C1.3, C2.3, C3.3 | longitude | longitude |
C4 | T | 4 | 4 | C1.4, C2.4, C3.4 | name | name |
Row annotations (every column other than id is a core annotation):
id | table | number | source number | cells |
---|---|---|---|---|
R1 | T | 1 | 2 | C1.1, C1.2, C1.3, C1.4 |
R2 | T | 2 | 3 | C2.1, C2.2, C2.3, C2.4 |
R3 | T | 3 | 4 | C3.1, C3.2, C3.3, C3.4 |
Cell annotations (every column other than id is a core annotation):
id | table | column | row | string value | value | property URL |
---|---|---|---|---|---|---|
C1.1 | T | C1 | R1 | "AD" | "AD" | null |
C1.2 | T | C2 | R1 | "42.5" | "42.5" | null |
C1.3 | T | C3 | R1 | "1.6" | "1.6" | null |
C1.4 | T | C4 | R1 | "Andorra" | "Andorra" | null |
C2.1 | T | C1 | R2 | "AE" | "AE" | null |
C2.2 | T | C2 | R2 | "23.4" | "23.4" | null |
C2.3 | T | C3 | R2 | "53.8" | "53.8" | null |
C2.4 | T | C4 | R2 | "United Arab Emirates" | "United Arab Emirates" | null |
C3.1 | T | C1 | R3 | "AF" | "AF" | null |
C3.2 | T | C2 | R3 | "33.9" | "33.9" | null |
C3.3 | T | C3 | R3 | "67.7" | "67.7" | null |
C3.4 | T | C4 | R3 | "Afghanistan" | "Afghanistan" | null |
Minimal mode output for this example is provided in [turtle] syntax below:

@base <http://example.org/countries.csv> .

_:8228a149-8efe-448d-b15f-8abf92e7bd17
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:ec59dcfc-872a-4144-822b-9ad5e2c6149c
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .

The about URL annotation has not been set for cells in table T ({ "url": "https://example.org/countries.csv"}); cells in a given row where about URL has not been specified are assumed to refer to the same subject. This unspecified subject is treated as a blank node.
Given that the property URL is null for cells in table T ({ "url": "https://example.org/countries.csv"}), the property URL defaults to the URI Template (see [RFC6570]) #{[column-name]}, where [column-name] is the value of the name annotation of the column associated with the cell. For example, the value of the property URL annotation for all cells in column C1 ("name": "countryCode") is https://example.org/countries.csv#countryCode.
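The default described here amounts to appending the column name to the table URL as a fragment. A minimal sketch follows; the function name default_property_url is hypothetical, and the sketch assumes the column name needs no further escaping.

```python
from urllib.parse import urldefrag


def default_property_url(table_url, column_name):
    """Build the default property URL: table URL + "#" + column name."""
    # Drop any fragment already present on the table URL before appending.
    base, _fragment = urldefrag(table_url)
    return f"{base}#{column_name}"


print(default_property_url("https://example.org/countries.csv", "countryCode"))
# https://example.org/countries.csv#countryCode
```

In the Turtle output the same value appears as the relative reference <#countryCode>, resolved against the @base declaration.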
Standard mode output for this example is provided in [turtle] syntax below:

@base <http://example.org/countries.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:d4f8e548-9601-4e41-aadb-09a8bce32625 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/countries.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <#row=2> ;
      csvw:describes _:8228a149-8efe-448d-b15f-8abf92e7bd17
    ], [ a csvw:Row ;
      csvw:rownum "2"^^xsd:integer ;
      csvw:url <#row=3> ;
      csvw:describes _:ec59dcfc-872a-4144-822b-9ad5e2c6149c
    ], [ a csvw:Row ;
      csvw:rownum "3"^^xsd:integer ;
      csvw:url <#row=4> ;
      csvw:describes _:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
    ]
  ] .

_:8228a149-8efe-448d-b15f-8abf92e7bd17
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:ec59dcfc-872a-4144-822b-9ad5e2c6149c
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .

Even though the table was defined in isolation, the annotated table is wrapped in a group of tables.
The type of both table and group of tables objects is explicitly stated; csvw:TableGroup
and csvw:Table
respectively.
The csvw:url
property provides reference to the original tabular data file and to specific rows therein - noting the need to escape the Turtle-syntax reserved character =
(U+003D
) within the fragment identifier.
The row number is provided for each row using csvw:rownum
property.
A subject and row are related using the csvw:describes
property.
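As a rough illustration of the row-level statements listed above, the following Python sketch builds the triples that describe one row in standard mode. The function name `row_description` and the blank node label are hypothetical. Note that the row number counts data rows, while the `#row=` fragment uses the source number, which is one higher here because of the header line.

```python
def row_description(table_url, row_number, source_number, subjects):
    """Return the standard-mode triples for one row (illustrative only).

    Triples are plain tuples of strings in Turtle-like notation; the
    blank node label is an arbitrary choice made for this sketch.
    """
    row_node = f"_:row{row_number}"
    triples = [
        (row_node, "rdf:type", "csvw:Row"),
        (row_node, "csvw:rownum", f'"{row_number}"^^xsd:integer'),
        # The fragment identifies the row in the source file, header included.
        (row_node, "csvw:url", f"<{table_url}#row={source_number}>"),
    ]
    # One csvw:describes statement per subject described by the row.
    triples += [(row_node, "csvw:describes", subject) for subject in subjects]
    return triples


if __name__ == "__main__":
    for triple in row_description("https://example.org/countries.csv", 1, 2, ["_:s1"]):
        print(*triple, ".")
```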
7.2 Example with single table and rich annotations
This example is based on Use Case #11 - City of Palo Alto Tree Data and comprises a single annotated table describing an inventory of tree maintenance operations. The input tabular data file, published at https://example.org/tree-ops-ext.csv
, and the associated metadata description https://example.org/tree-ops-ext.csv-metadata.json
are provided below:
GID,On Street,Species,Trim Cycle,Diameter at Breast Ht,Inventory Date,Comments,Protected,KML
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,11,10/18/2010,,,"<Point><coordinates>-122.156485,37.440963</coordinates></Point>"
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,11,6/2/2010,,,"<Point><coordinates>-122.156749,37.440958</coordinates></Point>"
6,ADDISON AV,Robinia pseudoacacia,Large Tree Routine Prune,29,6/1/2010,cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay; beware of BEES,YES,"<Point><coordinates>-122.156299,37.441151</coordinates></Point>"
{
  "@context": ["https://www.w3.org/ns/csvw", {"@language": "en"}],
  "@id": "https://example.org/tree-ops-ext",
  "url": "tree-ops-ext.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": [{
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "https://example.org"}
  }],
  "dc:license": {"@id": "https://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "notes": [{
    "@type": "oa:Annotation",
    "oa:hasTarget": {"@id": "https://example.org/tree-ops-ext"},
    "oa:hasBody": {
      "@type": "oa:EmbeddedContent",
      "rdf:value": "This is a very interesting comment about the table; it's a table!",
      "dc:format": {"@value": "text/plain"}
    }
  }],
  "dialect": {"trim": true},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": [ "GID", "Generic Identifier" ],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true,
      "suppressOutput": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string",
      "lang": "en"
    }, {
      "name": "dbh",
      "titles": "Diameter at Breast Ht",
      "dc:description": "Diameter at Breast Height (DBH) of the tree (in feet), measured 4.5ft above ground.",
      "datatype": "integer"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }, {
      "name": "comments",
      "titles": "Comments",
      "dc:description": "Supplementary comments relating to the operation or tree.",
      "datatype": "string",
      "separator": ";"
    }, {
      "name": "protected",
      "titles": "Protected",
      "dc:description": "Indication (YES / NO) whether the tree is subject to a protection order.",
      "datatype": {"base": "boolean", "format": "YES|NO"},
      "default": "NO"
    }, {
      "name": "kml",
      "titles": "KML",
      "dc:description": "KML-encoded description of tree location.",
      "datatype": "xml"
    }],
    "primaryKey": "GID",
    "aboutUrl": "https://example.org/tree-ops-ext#gid-{GID}"
  }
}
The notes annotation in the metadata description uses the Open Annotation data model currently under development within the Web Annotations Working Group. This is purely illustrative; no constraints are placed on the value of the notes annotation.
The annotated table generated from parsing the tabular data file and associated metadata is shown below and provides the basis for the conversion to RDF.
Core annotations for the resulting table T, with 9 columns and 3 rows, are shown below:
id | id (core annotation) | url | columns | rows | notes |
---|---|---|---|---|---|
T | <https://example.org/tree-ops-ext> | https://example.org/tree-ops-ext.csv | C1, C2, C3, C4, C5, C6, C7, C8, C9 | R1, R2, R3 | [{ "@type": "oa:Annotation", ... }] |
Non-core annotations for the table T are:
dc:title
"Tree Operations"
dcat:keyword
["tree", "street", "maintenance"]
dc:publisher
[{ "schema:name": "Example Municipality", "schema:url": { "@id": "https://example.org" } }]
dc:license
{ "@id": "https://opendefinition.org/licenses/cc-by/" }
dc:modified
"2010-12-31"
The value of the notes annotation has been shortened for clarity in the table above.
Annotations for the columns, rows and cells in table T are shown in the tables below:
Column annotations:
id | table | number | source number | cells | name | titles | required | suppress output | dc:description |
---|---|---|---|---|---|---|---|---|---|
C1 | T | 1 | 1 | C1.1, C2.1, C3.1 | GID | GID, Generic Identifier | true | true | An identifier for the operation on a tree. |
C2 | T | 2 | 2 | C1.2, C2.2, C3.2 | on_street | On Street | | | The street that the tree is on. |
C3 | T | 3 | 3 | C1.3, C2.3, C3.3 | species | Species | | | The species of the tree. |
C4 | T | 4 | 4 | C1.4, C2.4, C3.4 | trim_cycle | Trim Cycle | | | The operation performed on the tree. |
C5 | T | 5 | 5 | C1.5, C2.5, C3.5 | dbh | Diameter at Breast Ht | | | Diameter at Breast Height (DBH) of the tree (in feet), measured 4.5ft above ground. |
C6 | T | 6 | 6 | C1.6, C2.6, C3.6 | inventory_date | Inventory Date | | | The date of the operation that was performed. |
C7 | T | 7 | 7 | C1.7, C2.7, C3.7 | comments | Comments | | | Supplementary comments relating to the operation or tree. |
C8 | T | 8 | 8 | C1.8, C2.8, C3.8 | protected | Protected | | | Indication (YES / NO) whether the tree is subject to a protection order. |
C9 | T | 9 | 9 | C1.9, C2.9, C3.9 | kml | KML | | | KML-encoded description of tree location. |
In this example, output for column C1 (GID
) is not required; note the suppress output annotation on this column.
Row annotations:
id | table | number | source number | cells | primary key |
---|---|---|---|---|---|
R1 | T | 1 | 2 | C1.1, C1.2, C1.3, C1.4, C1.5, C1.6, C1.7, C1.8, C1.9 | C1.1 |
R2 | T | 2 | 3 | C2.1, C2.2, C2.3, C2.4, C2.5, C2.6, C2.7, C2.8, C2.9 | C2.1 |
R3 | T | 3 | 4 | C3.1, C3.2, C3.3, C3.4, C3.5, C3.6, C3.7, C3.8, C3.9 | C3.1 |
Cell annotations:
id | table | column | row | string value | value | about URL |
---|---|---|---|---|---|---|
C1.1 | T | C1 | R1 | "1" | "1" | <https://example.org/tree-ops-ext#gid-1> |
C1.2 | T | C2 | R1 | "ADDISON AV" | "ADDISON AV" | <https://example.org/tree-ops-ext#gid-1> |
C1.3 | T | C3 | R1 | "Celtis australis" | "Celtis australis" | <https://example.org/tree-ops-ext#gid-1> |
C1.4 | T | C4 | R1 | "Large Tree Routine Prune" | "Large Tree Routine Prune" (English) | <https://example.org/tree-ops-ext#gid-1> |
C1.5 | T | C5 | R1 | "11" | 11 | <https://example.org/tree-ops-ext#gid-1> |
C1.6 | T | C6 | R1 | "10/18/2010" | 2010-10-18 | <https://example.org/tree-ops-ext#gid-1> |
C1.7 | T | C7 | R1 | "" | null | <https://example.org/tree-ops-ext#gid-1> |
C1.8 | T | C8 | R1 | "" | false | <https://example.org/tree-ops-ext#gid-1> |
C1.9 | T | C9 | R1 | "<Point><coordinates>-122.156485,37.440963</coordinates></Point>" | "<Point><coordinates>-122.156485,37.440963</coordinates></Point>" (XML) | <https://example.org/tree-ops-ext#gid-1> |
C2.1 | T | C1 | R2 | "2" | "2" | <https://example.org/tree-ops-ext#gid-2> |
C2.2 | T | C2 | R2 | "EMERSON ST" | "EMERSON ST" | <https://example.org/tree-ops-ext#gid-2> |
C2.3 | T | C3 | R2 | "Liquidambar styraciflua" | "Liquidambar styraciflua" | <https://example.org/tree-ops-ext#gid-2> |
C2.4 | T | C4 | R2 | "Large Tree Routine Prune" | "Large Tree Routine Prune" (English) | <https://example.org/tree-ops-ext#gid-2> |
C2.5 | T | C5 | R2 | "11" | 11 | <https://example.org/tree-ops-ext#gid-2> |
C2.6 | T | C6 | R2 | "6/2/2010" | 2010-06-02 | <https://example.org/tree-ops-ext#gid-2> |
C2.7 | T | C7 | R2 | "" | null | <https://example.org/tree-ops-ext#gid-2> |
C2.8 | T | C8 | R2 | "" | false | <https://example.org/tree-ops-ext#gid-2> |
C2.9 | T | C9 | R2 | "<Point><coordinates>-122.156749,37.440958</coordinates></Point>" | "<Point><coordinates>-122.156749,37.440958</coordinates></Point>" (XML) | <https://example.org/tree-ops-ext#gid-2> |
C3.1 | T | C1 | R3 | "6" | "6" | <https://example.org/tree-ops-ext#gid-6> |
C3.2 | T | C2 | R3 | "ADDISON AV" | "ADDISON AV" | <https://example.org/tree-ops-ext#gid-6> |
C3.3 | T | C3 | R3 | "Robinia pseudoacacia" | "Robinia pseudoacacia" | <https://example.org/tree-ops-ext#gid-6> |
C3.4 | T | C4 | R3 | "Large Tree Routine Prune" | "Large Tree Routine Prune" (English) | <https://example.org/tree-ops-ext#gid-6> |
C3.5 | T | C5 | R3 | "29" | 29 | <https://example.org/tree-ops-ext#gid-6> |
C3.6 | T | C6 | R3 | "6/1/2010" | 2010-06-01 | <https://example.org/tree-ops-ext#gid-6> |
C3.7 | T | C7 | R3 | "cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay; beware of BEES" | "cavity or decay" , "trunk decay" , "codominant leaders" , "included bark" , "large leader or limb decay" , "previous failure root damage" , "root decay" , "beware of BEES" | <https://example.org/tree-ops-ext#gid-6> |
C3.8 | T | C8 | R3 | "YES" | true | <https://example.org/tree-ops-ext#gid-6> |
C3.9 | T | C9 | R3 | "<Point><coordinates>-122.156299,37.441151</coordinates></Point>" | "<Point><coordinates>-122.156299,37.441151</coordinates></Point>" (XML) | <https://example.org/tree-ops-ext#gid-6> |
Minimal mode output for this example is provided in [turtle] syntax below:
@base <http://example.org/tree-ops-ext.csv> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/tree-ops-ext#gid-1>
  <#on_street> "ADDISON AV" ;
  <#species> "Celtis australis" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-10-18"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156485,37.440963</coordinates></Point>"^^rdf:XMLLiteral .

<http://example.org/tree-ops-ext#gid-2>
  <#on_street> "EMERSON ST" ;
  <#species> "Liquidambar styraciflua" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-06-02"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156749,37.440958</coordinates></Point>"^^rdf:XMLLiteral .

<http://example.org/tree-ops-ext#gid-6>
  <#on_street> "ADDISON AV" ;
  <#species> "Robinia pseudoacacia" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 29 ;
  <#inventory_date> "2010-06-01"^^xsd:date ;
  <#comments> "cavity or decay",
    "trunk decay",
    "codominant leaders",
    "included bark",
    "large leader or limb decay",
    "previous failure root damage",
    "root decay",
    "beware of BEES" ;
  <#protected> true ;
  <#kml> "<Point><coordinates>-122.156299,37.441151</coordinates></Point>"^^rdf:XMLLiteral .
The subject described by each row is explicitly defined using the about URL annotation; e.g. the subject of row R1 is https://example.org/tree-ops-ext#gid-1
.
Output for column C1 ({ "name": "GID" }
) is not included as the column suppress output annotation is true
.
A language tag is specified for values of column C4 ({ "name": "trim_cycle" }
) as the cell value language annotation is en
.
The datatype
annotation is set on columns C5, C6, C8 and C9 ({ "name": "dbh"}
, { "name": "inventory_date" }
, { "name": "protected" }
and { "name": "kml" }
); as integer
, date
, boolean
and xml
respectively. The datatype
property is inherited by all cells in each of those columns, therefore the RDF output for those cells includes the appropriate datatype IRI.
Cells C1.7 and C2.7 (rows R1 and R2; column, { "name": "comments" }
) have null
values - no output is included for these cells.
Cell C3.7 (row R3; column, { "name": "comments" }
) contains an unordered sequence of values; the values are included as a simple set of triples, rather than as an instance of rdf:List,
as the ordered annotation has defaulted to false
.
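The cell values noted above result from parsing each string value according to its column description. The Python sketch below illustrates three of those steps for this example only: the `M/d/yyyy` date format, the `YES|NO` boolean format with its `NO` default, and the `;` separator with trimming. All function names are hypothetical, and treating an empty `comments` cell as null is a simplification that mirrors the cell annotations table above rather than the complete parsing rules.

```python
from datetime import datetime


def parse_date_mdy(text: str) -> str:
    """Parse an 'M/d/yyyy' string value into an xsd:date lexical form."""
    return datetime.strptime(text, "%m/%d/%Y").date().isoformat()


def parse_boolean(text: str, fmt: str = "YES|NO", default: str = "NO") -> bool:
    """Parse a boolean whose format is 'true-token|false-token'."""
    true_token, false_token = fmt.split("|")
    value = text if text != "" else default  # an empty cell takes the default
    if value == true_token:
        return True
    if value == false_token:
        return False
    raise ValueError(f"not a valid boolean for format {fmt!r}: {text!r}")


def split_values(text: str, separator: str = ";", trim: bool = True):
    """Split a cell into a list of values; an empty cell is treated as null."""
    if text == "":
        return None
    parts = text.split(separator)
    return [part.strip() for part in parts] if trim else parts


if __name__ == "__main__":
    print(parse_date_mdy("10/18/2010"))
    print(parse_boolean(""))
    print(split_values("cavity or decay; trunk decay; codominant leaders"))
```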
Standard mode output for this example is provided in [turtle] syntax below:
@base <http://example.org/tree-ops-ext.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:68fc08e5-56a0-47e2-a784-3a644d8257c4 a csvw:TableGroup ;
  csvw:table <http://example.org/tree-ops-ext> .

<http://example.org/tree-ops-ext> a csvw:Table ;
  csvw:url <http://example.org/tree-ops-ext.csv> ;
  dc:title "Tree Operations"@en ;
  dcat:keyword "tree"@en, "street"@en, "maintenance"@en ;
  dc:publisher [
    schema:name "Example Municipality"@en ;
    schema:url <http://example.org>
  ] ;
  dc:license <http://opendefinition.org/licenses/cc-by/> ;
  dc:modified "2010-12-31"^^xsd:date ;
  csvw:note [ a oa:Annotation ;
    oa:hasTarget <http://example.org/tree-ops-ext> ;
    oa:hasBody [ a oa:EmbeddedContent ;
      rdf:value "This is a very interesting comment about the table; it's a table!"@en ;
      dc:format "text/plain"
    ]
  ] ;
  csvw:row [ a csvw:Row ;
    csvw:rownum 1 ;
    csvw:url <#row=2> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-1>
  ], [ a csvw:Row ;
    csvw:rownum 2 ;
    csvw:url <#row=3> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-2>
  ], [ a csvw:Row ;
    csvw:rownum 3 ;
    csvw:url <#row=4> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-6>
  ] .

<http://example.org/tree-ops-ext#gid-1>
  <#on_street> "ADDISON AV" ;
  <#species> "Celtis australis" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-10-18"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156485,37.440963</coordinates></Point>"^^rdf:XMLLiteral .

<http://example.org/tree-ops-ext#gid-2>
  <#on_street> "EMERSON ST" ;
  <#species> "Liquidambar styraciflua" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-06-02"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156749,37.440958</coordinates></Point>"^^rdf:XMLLiteral .

<http://example.org/tree-ops-ext#gid-6>
  <#on_street> "ADDISON AV" ;
  <#species> "Robinia pseudoacacia" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 29 ;
  <#inventory_date> "2010-06-01"^^xsd:date ;
  <#comments> "cavity or decay",
    "trunk decay",
    "codominant leaders",
    "included bark",
    "large leader or limb decay",
    "previous failure root damage",
    "root decay",
    "beware of BEES" ;
  <#protected> true ;
  <#kml> "<Point><coordinates>-122.156299,37.441151</coordinates></Point>"^^rdf:XMLLiteral .
Table T ({ "url": "https://example.org/tree-ops-ext.csv"}
) has been explicitly identified: { "@id": "https://example.org/tree-ops-ext"}
.
Non-core annotations and notes specified for table T ({ "url": "https://example.org/tree-ops-ext.csv"}
) are included in the output.
As the metadata description file https://example.org/tree-ops-ext.csv-metadata.json
defines a default language within the context ("@context": ["https://www.w3.org/ns/csvw", {"@language": "en"}]
), all non-core annotations of type string
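(e.g. dc:title
, dcat:keyword
and the schema:name
of dc:publisher
) are expressed in the RDF output using the appropriate language tag. Values that are not plain strings, such as dc:license
(an IRI) and dc:modified
(an xsd:date
), carry no language tag.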
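The effect of the default language can be sketched as follows. This is a simplified illustration and not a JSON-LD processor: the function names are hypothetical, only the value shapes used in this example are handled (plain strings, `@id` objects and `@value` objects), and the default language is assumed to apply to plain strings only.

```python
def turtle_string(text: str) -> str:
    """Quote a string for Turtle, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def annotation_to_turtle(value, default_language=None):
    """Serialise one non-core annotation value as a Turtle term."""
    if isinstance(value, dict):
        if "@id" in value:
            return f"<{value['@id']}>"  # an IRI never carries a language tag
        literal = turtle_string(value["@value"])
        if "@type" in value:
            return f"{literal}^^{value['@type']}"  # typed literal
        if "@language" in value:
            return f"{literal}@{value['@language']}"
        return literal  # explicit value object without a language
    literal = turtle_string(value)
    # Plain strings pick up the default language from the context, if any.
    return f"{literal}@{default_language}" if default_language else literal


if __name__ == "__main__":
    print(annotation_to_turtle("Tree Operations", "en"))
    print(annotation_to_turtle({"@value": "2010-12-31", "@type": "xsd:date"}, "en"))
```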
7.3 Example with single table and using virtual columns to produce multiple subjects per row
This example uses a single annotated table describing a listing of music events. Each row from the tabular data file corresponds to three resources; the music event itself, the location where that event occurs and the offer to sell tickets for that event. The goal is to convert the CSV content into schema.org markup that a search engine such as Google can use to index music events. Details of how Google expects this information to be structured can be found here.
The input tabular data file, published at https://example.org/events-listing.csv
, and the associated metadata description https://example.org/events-listing.csv-metadata.json
are provided below:
Name, Start Date, Location Name, Location Address, Ticket Url
B.B. King,2014-04-12T19:30,"Lupo’s Heartbreak Hotel","79 Washington St., Providence, RI",https://www.etix.com/ticket/1771656
B.B. King,2014-04-13T20:00,"Lynn Auditorium","Lynn, MA, 01901",https://frontgatetickets.com/venue.php?id=11766
{
  "@context": ["https://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "events-listing.csv",
  "dialect": {"trim": true},
  "tableSchema": {
    "columns": [{
      "name": "name",
      "titles": "Name",
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:name"
    }, {
      "name": "start_date",
      "titles": "Start Date",
      "datatype": {
        "base": "datetime",
        "format": "yyyy-MM-ddTHH:mm"
      },
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:startDate"
    }, {
      "name": "location_name",
      "titles": "Location Name",
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "schema:name"
    }, {
      "name": "location_address",
      "titles": "Location Address",
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "schema:address"
    }, {
      "name": "ticket_url",
      "titles": "Ticket Url",
      "datatype": "anyURI",
      "aboutUrl": "#offer-{_row}",
      "propertyUrl": "schema:url"
    }, {
      "name": "type_event",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:MusicEvent"
    }, {
      "name": "type_place",
      "virtual": true,
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:Place"
    }, {
      "name": "type_offer",
      "virtual": true,
      "aboutUrl": "#offer-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:Offer"
    }, {
      "name": "location",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:location",
      "valueUrl": "#place-{_row}"
    }, {
      "name": "offers",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:offers",
      "valueUrl": "#offer-{_row}"
    }]
  }
}
The CSV to RDF translation is limited to providing one statement, or triple, per column in the table. The target schema.org markup requires 10 statements to describe each event. As the base tabular data file contains 5 columns, an additional 5 virtual columns have been added in order to provide for the full complement of statements—including the relationships between the 3 resources (event, location, and offer) described by each row of the table. Note that the virtual annotation is set to true
for these virtual columns.
Furthermore, note that no attempt is made to reconcile between locations or offers that may be associated with more than one event; every row in the table will create both a new location resource and offer resource in addition to the event resource. If considered necessary, applications such as OpenRefine may be used to identify and reconcile duplicate location resources once the RDF output has been generated.
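The one-statement-per-column rule can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the normative algorithm: the helper names are hypothetical, only simple `{variable}` expansion is implemented, datatype conversion is omitted, and only references starting with `#` are resolved against the table URL.

```python
import re
from urllib.parse import quote, urljoin

TABLE_URL = "https://example.org/events-listing.csv"

# (name, aboutUrl, propertyUrl, valueUrl) for the 5 source and 5 virtual columns.
COLUMNS = [
    ("name", "#event-{_row}", "schema:name", None),
    ("start_date", "#event-{_row}", "schema:startDate", None),
    ("location_name", "#place-{_row}", "schema:name", None),
    ("location_address", "#place-{_row}", "schema:address", None),
    ("ticket_url", "#offer-{_row}", "schema:url", None),
    ("type_event", "#event-{_row}", "rdf:type", "schema:MusicEvent"),
    ("type_place", "#place-{_row}", "rdf:type", "schema:Place"),
    ("type_offer", "#offer-{_row}", "rdf:type", "schema:Offer"),
    ("location", "#event-{_row}", "schema:location", "#place-{_row}"),
    ("offers", "#event-{_row}", "schema:offers", "#offer-{_row}"),
]


def expand(template, variables):
    """Replace each {variable} with its percent-encoded value."""
    return re.sub(r"\{(\w+)\}",
                  lambda m: quote(str(variables[m.group(1)]), safe=""),
                  template)


def resolve(reference):
    """Resolve fragment references against the table URL; keep others as-is."""
    return urljoin(TABLE_URL, reference) if reference.startswith("#") else reference


def row_triples(row_number, cells):
    """Emit at most one (subject, predicate, object) triple per column."""
    variables = {"_row": row_number, **cells}
    triples = []
    for name, about, prop, value_url in COLUMNS:
        subject = resolve(expand(about, variables))
        if value_url is not None:
            obj = resolve(expand(value_url, variables))
        elif cells.get(name) is not None:
            obj = cells[name]  # literal value, left unconverted in this sketch
        else:
            continue  # a null cell without a value URL produces no triple
        triples.append((subject, prop, obj))
    return triples


ROW_1 = {
    "name": "B.B. King",
    "start_date": "2014-04-12T19:30",
    "location_name": "Lupo’s Heartbreak Hotel",
    "location_address": "79 Washington St., Providence, RI",
    "ticket_url": "https://www.etix.com/ticket/1771656",
}
TRIPLES = row_triples(1, ROW_1)
```

For the first row, `TRIPLES` holds ten triples: five from the source columns and five from the virtual columns.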
The annotated table generated from parsing the tabular data file and associated metadata is shown below and provides the basis for the conversion to RDF.
Annotations for the resulting table T, with 10 columns and 2 rows, are shown below:
id | url | columns | rows |
---|---|---|---|
T | https://example.org/events-listing.csv | C1, C2, C3, C4, C5, C6, C7, C8, C9, C10 | R1, R2 |
Annotations for the columns, rows and cells in table T are shown in the tables below:
Column annotations:
id | table | number | source number | cells | name | titles | virtual |
---|---|---|---|---|---|---|---|
C1 | T | 1 | 1 | C1.1, C2.1 | name | Name | |
C2 | T | 2 | 2 | C1.2, C2.2 | start_date | Start Date | |
C3 | T | 3 | 3 | C1.3, C2.3 | location_name | Location Name | |
C4 | T | 4 | 4 | C1.4, C2.4 | location_address | Location Address | |
C5 | T | 5 | 5 | C1.5, C2.5 | ticket_url | Ticket Url | |
C6 | T | 6 | 6 | C1.6, C2.6 | type_event | | true |
C7 | T | 7 | 7 | C1.7, C2.7 | type_place | | true |
C8 | T | 8 | 8 | C1.8, C2.8 | type_offer | | true |
C9 | T | 9 | 9 | C1.9, C2.9 | location | | true |
C10 | T | 10 | 10 | C1.10, C2.10 | offers | | true |
Row annotations:
id | table | number | source number | cells |
---|---|---|---|---|
R1 | T | 1 | 2 | C1.1, C1.2, C1.3, C1.4, C1.5, C1.6, C1.7, C1.8, C1.9, C1.10 |
R2 | T | 2 | 3 | C2.1, C2.2, C2.3, C2.4, C2.5, C2.6, C2.7, C2.8, C2.9, C2.10 |
Cell annotations:
id | table | column | row | string value | value | about URL | property URL | value URL |
---|---|---|---|---|---|---|---|---|
C1.1 | T | C1 | R1 | "B.B. King" | "B.B. King" | <https://example.org/events-listing.csv#event-1> | schema:name | |
C1.2 | T | C2 | R1 | "2014-04-12T19:30" | 2014-04-12T19:30:00 | <https://example.org/events-listing.csv#event-1> | schema:startDate | |
C1.3 | T | C3 | R1 | "Lupo’s Heartbreak Hotel" | "Lupo’s Heartbreak Hotel" | <https://example.org/events-listing.csv#place-1> | schema:name | |
C1.4 | T | C4 | R1 | "79 Washington St., Providence, RI" | "79 Washington St., Providence, RI" | <https://example.org/events-listing.csv#place-1> | schema:address | |
C1.5 | T | C5 | R1 | "https://www.etix.com/ticket/1771656" | <https://www.etix.com/ticket/1771656> | <https://example.org/events-listing.csv#offer-1> | schema:url | |
C1.6 | T | C6 | R1 | "" | null | <https://example.org/events-listing.csv#event-1> | rdf:type | schema:MusicEvent |
C1.7 | T | C7 | R1 | "" | null | <https://example.org/events-listing.csv#place-1> | rdf:type | schema:Place |
C1.8 | T | C8 | R1 | "" | null | <https://example.org/events-listing.csv#offer-1> | rdf:type | schema:Offer |
C1.9 | T | C9 | R1 | "" | null | <https://example.org/events-listing.csv#event-1> | schema:location | <https://example.org/events-listing.csv#place-1> |
C1.10 | T | C10 | R1 | "" | null | <https://example.org/events-listing.csv#event-1> | schema:offers | <https://example.org/events-listing.csv#offer-1> |
C2.1 | T | C1 | R2 | "B.B. King" | "B.B. King" | <https://example.org/events-listing.csv#event-2> | schema:name | |
C2.2 | T | C2 | R2 | "2014-04-13T20:00" | 2014-04-13T20:00:00 | <https://example.org/events-listing.csv#event-2> | schema:startDate | |
C2.3 | T | C3 | R2 | "Lynn Auditorium" | "Lynn Auditorium" | <https://example.org/events-listing.csv#place-2> | schema:name | |
C2.4 | T | C4 | R2 | "Lynn, MA, 01901" | "Lynn, MA, 01901" | <https://example.org/events-listing.csv#place-2> | schema:address | |
C2.5 | T | C5 | R2 | "https://frontgatetickets.com/venue.php?id=11766" | <https://frontgatetickets.com/venue.php?id=11766> | <https://example.org/events-listing.csv#offer-2> | schema:url | |
C2.6 | T | C6 | R2 | "" | null | <https://example.org/events-listing.csv#event-2> | rdf:type | schema:MusicEvent |
C2.7 | T | C7 | R2 | "" | null | <https://example.org/events-listing.csv#place-2> | rdf:type | schema:Place |
C2.8 | T | C8 | R2 | "" | null | <https://example.org/events-listing.csv#offer-2> | rdf:type | schema:Offer |
C2.9 | T | C9 | R2 | "" | null | <https://example.org/events-listing.csv#event-2> | schema:location | <https://example.org/events-listing.csv#place-2> |
C2.10 | T | C10 | R2 | "" | null | <https://example.org/events-listing.csv#event-2> | schema:offers | <https://example.org/events-listing.csv#offer-2> |
Minimal mode output for this example is provided in [turtle] syntax below:
@base <http://example.org/events-listing.csv> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#event-1> a schema:MusicEvent ;
  schema:name "B.B. King" ;
  schema:startDate "2014-04-12T19:30:00"^^xsd:dateTime ;
  schema:location <#place-1> ;
  schema:offers <#offer-1> .

<#place-1> a schema:Place ;
  schema:name "Lupo’s Heartbreak Hotel" ;
  schema:address "79 Washington St., Providence, RI" .

<#offer-1> a schema:Offer ;
  schema:url "https://www.etix.com/ticket/1771656"^^xsd:anyURI .

<#event-2> a schema:MusicEvent ;
  schema:name "B.B. King" ;
  schema:startDate "2014-04-13T20:00:00"^^xsd:dateTime ;
  schema:location <#place-2> ;
  schema:offers <#offer-2> .

<#place-2> a schema:Place ;
  schema:name "Lynn Auditorium" ;
  schema:address "Lynn, MA, 01901" .

<#offer-2> a schema:Offer ;
  schema:url "https://frontgatetickets.com/venue.php?id=11766"^^xsd:anyURI .
Three resources are defined for each row within the table; event, location, and offer.
Each column description in the metadata explicitly defines both aboutUrl
and propertyUrl
properties which are used to create the about URL and property URL annotations on the column's cells.
Columns C6, C7 and C8 ({ "name": "type_event"}
, { "name": "type_place"}
and { "name": "type_offer"}
) define the semantic types of the resources described by each row: schema:MusicEvent
, schema:Place
and schema:Offer
respectively.
Column C9 ({ "name": "location"}
) uses the about URL, property URL and value URL to assert the relationship between the event and location resources.
Column C10 ({ "name": "offers"}
) uses the about URL, property URL and value URL to assert the relationship between the event and offer resources.
Standard mode output for this example is provided in [turtle] syntax below:
@base <http://example.org/events-listing.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:95cc7970-ce99-44b0-900c-e2c2c028bbd3 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/events-listing.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum 1 ;
      csvw:url <#row=2> ;
      csvw:describes <#event-1>, <#place-1>, <#offer-1>
    ], [ a csvw:Row ;
      csvw:rownum 2 ;
      csvw:url <#row=3> ;
      csvw:describes <#event-2>, <#place-2>, <#offer-2>
    ]
  ] .

<#event-1> a schema:MusicEvent ;
  schema:name "B.B. King" ;
  schema:startDate "2014-04-12T19:30:00"^^xsd:dateTime ;
  schema:location <#place-1> ;
  schema:offers <#offer-1> .

<#place-1> a schema:Place ;
  schema:name "Lupo’s Heartbreak Hotel" ;
  schema:address "79 Washington St., Providence, RI" .

<#offer-1> a schema:Offer ;
  schema:url "https://www.etix.com/ticket/1771656"^^xsd:anyURI .

<#event-2> a schema:MusicEvent ;
  schema:name "B.B. King" ;
  schema:startDate "2014-04-13T20:00:00"^^xsd:dateTime ;
  schema:location <#place-2> ;
  schema:offers <#offer-2> .

<#place-2> a schema:Place ;
  schema:name "Lynn Auditorium" ;
  schema:address "Lynn, MA, 01901" .

<#offer-2> a schema:Offer ;
  schema:url "https://frontgatetickets.com/venue.php?id=11766"^^xsd:anyURI .
The resources described by each row are explicitly defined using the about URL annotation; in this case, three resources per row (event, location, and offer). The relationship between the row and each subject resource is asserted using the csvw:describes
property; e.g. for row R1, a blank node, we state the following triples:
subject | predicate | object |
---|---|---|
R1 | csvw:describes | t1:event-1 |
R1 | csvw:describes | t1:place-1 |
R1 | csvw:describes | t1:offer-1 |
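The set of subjects that a row describes can be derived from the about URL annotations of its cells. The Python sketch below is illustrative only: the function name `described_subjects` is hypothetical, and the input list reproduces the about URLs of cells C1.1 to C1.10 from the cell annotations table, written as relative references.

```python
def described_subjects(about_urls):
    """Return the distinct about URLs of a row's cells, in first-seen order."""
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(about_urls))


# About URLs of cells C1.1 to C1.10 (row R1).
ROW_1_ABOUT_URLS = [
    "#event-1", "#event-1", "#place-1", "#place-1", "#offer-1",
    "#event-1", "#place-1", "#offer-1", "#event-1", "#event-1",
]

if __name__ == "__main__":
    for subject in described_subjects(ROW_1_ABOUT_URLS):
        print(f"_:R1 csvw:describes <{subject}> .")
```

Ten cells therefore yield only three `csvw:describes` statements for the row.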
7.4 Example with group of tables comprising four interrelated tables
This example is based on Use Case #4 - Publication of public sector roles and salaries and uses four annotated tables published as a group of tables. Information about senior roles and junior roles within a government department or organization is published in CSV format by each department. These files are validated against a centrally published schema to ensure that all the data published by departments is consistent. Additionally, lists of organizations and professions are published centrally, providing controlled vocabularies against which departmental submissions are validated.
Information published about junior and senior roles provides summary information for each post within the government department or organization. Whilst the junior role information is anonymous, providing only an indication of the number of full-time-equivalent (FTE) staff occupying a given post, the senior role information specifies the named individual occupying each post. As such, each row from the tabular data file describing senior roles corresponds to two resources; the post and the person occupying that post.
This example is concerned only with converting the information provided by each government department or organization rather than the centrally published information listing organizations and professions.
The input tabular data files and associated metadata descriptions are provided below:
Organization Unique Reference,Organization Name,Department Reference
hefce.ac.uk,Higher Education Funding Council for England,bis.gov.uk
bis.gov.uk,"Department for Business, Innovation and Skills",xx

Profession
Finance
Information Technology
Operational Delivery
Policy

Post Unique Reference,Name,Grade,Job Title,Reports to Senior Post,Profession,Organization Reference
90115,Steve Egan,SCS1A,Deputy Chief Executive,90334,Finance,hefce.ac.uk
90334,Sir Alan Langlands,SCS4,Chief Executive,xx,Policy,hefce.ac.uk

Reporting Senior Post,Grade,Payscale Minimum (£),Payscale Maximum (£),Generic Job Title,Number of Posts (FTE),Profession,Organization Reference
90115,4,17426,20002,Administrator,8.67,Operational Delivery,hefce.ac.uk
90115,5,19546,22478,Administrator,0.5,Operational Delivery,hefce.ac.uk
{
  "@type": "TableGroup",
  "@context": ["https://www.w3.org/ns/csvw", {"@language": "en"}],
  "tables": [{
    "url": "gov.uk/data/organizations.csv",
    "tableSchema": "gov.uk/schema/organizations.json",
    "suppressOutput": true
  }, {
    "url": "gov.uk/data/professions.csv",
    "tableSchema": "gov.uk/schema/professions.json",
    "suppressOutput": true
  }, {
    "url": "senior-roles.csv",
    "tableSchema": "gov.uk/schema/senior-roles.json"
  }, {
    "url": "junior-roles.csv",
    "tableSchema": "gov.uk/schema/junior-roles.json"
  }]
}
{ "@id": "https://example.org/gov.uk/schema/organizations.json", "@context": "https://www.w3.org/ns/csvw", "columns": [{ "name": "ref", "titles": "Organization Unique Reference", "datatype": "string", "required": true, "propertyUrl": "dc:identifier" }, { "name": "name", "titles": "Organization Name", "datatype": "string", "propertyUrl": "foaf:name" }, { "name": "department", "titles": "Department Reference", "datatype": "string", "null": "xx", "propertyUrl": "org:subOrganizationOf", "valueUrl": "https://example.org/organization/{department}" }], "primaryKey": "ref", "aboutUrl": "https://example.org/organization/{ref}", "foreignKeys": [{ "columnReference": "department", "reference": { "schemaReference": "https://example.org/gov.uk/schema/organizations.json", "columnReference": "ref" } }] }
{ "@id": "https://example.org/gov.uk/schema/professions.json", "@context": "https://www.w3.org/ns/csvw", "columns": [{ "name": "name", "titles": "Profession", "datatype": "string", "required": true }], "primaryKey": "name" }
{ "@id": "https://example.org/gov.uk/schema/senior-roles.json", "@context": "https://www.w3.org/ns/csvw", "columns": [{ "name": "ref", "titles": "Post Unique Reference", "datatype": "string", "required": true, "propertyUrl": "dc:identifier" }, { "name": "name", "titles": "Name", "datatype": "string", "aboutUrl": "https://example.org/organization/{organizationRef}/person/{_row}", "propertyUrl": "foaf:name" }, { "name": "grade", "titles": "Grade", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/grade" }, { "name": "job", "titles": "Job Title", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/job" }, { "name": "reportsTo", "titles": "Reports to Senior Post", "datatype": "string", "null": "xx", "propertyUrl": "org:reportsTo", "valueUrl": "https://example.org/organization/{organizationRef}/post/{reportsTo}" }, { "name": "profession", "titles": "Profession", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/profession" }, { "name": "organizationRef", "titles": "Organization Reference", "datatype": "string", "propertyUrl": "org:postIn", "valueUrl": "https://example.org/organization/{organizationRef}", "required": true }, { "name": "post_holder", "virtual": true, "propertyUrl": "org:heldBy", "valueUrl": "https://example.org/organization/{organizationRef}/person/{_row}" }], "primaryKey": "ref", "aboutUrl": "https://example.org/organization/{organizationRef}/post/{ref}", "foreignKeys": [{ "columnReference": "reportsTo", "reference": { "schemaReference": "https://example.org/gov.uk/schema/senior-roles.json", "columnReference": "ref" } }, { "columnReference": "profession", "reference": { "schemaReference": "https://example.org/gov.uk/schema/professions.json", "columnReference": "name" } }, { "columnReference": "organizationRef", "reference": { "schemaReference": "https://example.org/gov.uk/schema/organizations.json", "columnReference": "ref" } }] }
{ "@id": "https://example.org/gov.uk/schema/junior-roles.json", "@context": "https://www.w3.org/ns/csvw", "columns": [{ "name": "reportsToSenior", "titles": "Reporting Senior Post", "datatype": "string", "propertyUrl": "org:reportsTo", "valueUrl": "https://example.org/organization/{organizationRef}/post/{reportsToSenior}", "required": true }, { "name": "grade", "titles": "Grade", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/grade" }, { "name": "min_pay", "titles": "Payscale Minimum (£)", "datatype": "integer", "propertyUrl": "https://example.org/gov.uk/def/min_pay" }, { "name": "max_pay", "titles": "Payscale Maximum (£)", "datatype": "integer", "propertyUrl": "https://example.org/gov.uk/def/max_pay" }, { "name": "job", "titles": "Generic Job Title", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/job" }, { "name": "number", "titles": "Number of Posts (FTE)", "datatype": "number", "propertyUrl": "https://example.org/gov.uk/def/number_of_posts" }, { "name": "profession", "titles": "Profession", "datatype": "string", "propertyUrl": "https://example.org/gov.uk/def/profession" }, { "name": "organizationRef", "titles": "Organization Reference", "datatype": "string", "propertyUrl": "org:postIn", "valueUrl": "https://example.org/organization/{organizationRef}", "required": true }], "foreignKeys": [{ "columnReference": "reportsToSenior", "reference": { "schemaReference": "https://example.org/gov.uk/schema/senior-roles.json", "columnReference": "ref" } }, { "columnReference": "profession", "reference": { "schemaReference": "https://example.org/gov.uk/schema/professions.json", "columnReference": "name" } }, { "columnReference": "organizationRef", "reference": { "schemaReference": "https://example.org/gov.uk/schema/organizations.json", "columnReference": "ref" } }] }
This example makes extensive use of the example.org domain. As described in [RFC6761], this domain is used for illustrative examples within documentation. In reality, the resources described here with the URL path /gov.uk would be centrally published by the UK Government at, say, the domain data.gov.uk.
Given that these resources are centrally published with an aspiration for reuse, the schema descriptions have been factored out into separate resources. As such, the top-level metadata description resource metadata.json simply provides the list of tables and binds each of them to the appropriate schema that is defined elsewhere.
Finally, note that because the centrally published metadata descriptions are intended to be reused across many government departments and organizations, extra consideration has been given to defining URIs for the person and post resources defined in each row of the senior roles tabular data and subsequently referenced from the junior roles tabular data. To ensure that naming clashes are avoided, the unique reference for the organization to which the person or post belongs has been included in a path segment of the identifier. For example, the URI template property aboutUrl used to identify the senior post is specified as https://example.org/organization/{organizationRef}/post/{ref}, thus yielding the URI https://example.org/organization/hefce.ac.uk/post/90115 for the post described in the first row of the senior roles tabular data.
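The template expansion described above can be illustrated with a minimal Python sketch. It handles only simple {name} variables (roughly level 1 of URI Templates) and is not a full template processor; the function name expand_template and the row dictionary are assumptions made for this illustration.

```python
import re
from urllib.parse import quote


def expand_template(template, values):
    """Expand simple {name} variables, percent-encoding each value.

    Hypothetical helper: covers only the simplest URI Template form.
    """
    def substitute(match):
        value = values.get(match.group(1))
        # Variables without a value expand to the empty string.
        return "" if value is None else quote(str(value), safe="-._~")

    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# Values taken from the first row of the senior roles data.
row = {"ref": "90115", "organizationRef": "hefce.ac.uk", "_row": 1}

post_uri = expand_template(
    "https://example.org/organization/{organizationRef}/post/{ref}", row)
person_uri = expand_template(
    "https://example.org/organization/{organizationRef}/person/{_row}", row)

print(post_uri)
print(person_uri)
```

Because the organization reference forms its own path segment, two organizations that reuse the same post reference still yield distinct URIs.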
The group of tables generated from parsing the tabular data files and associated metadata is shown below and provides the basis for the conversion to RDF.
Annotations for the group of tables G and the four tables Ta, Tb, Tc, and Td are shown below:
Group of Tables annotations (all core annotations):
id | tables |
---|---|
G | Ta, Tb, Tc, Td |
Table annotations (all core annotations):
id | url | columns | rows | suppress output | foreign keys |
---|---|---|---|---|---|
Ta | https://example.org/gov.uk/data/organizations.csv | Ca1, Ca2, Ca3 | Ra1, Ra2 | true | Fa1 |
Tb | https://example.org/gov.uk/data/professions.csv | Cb1 | Rb1, Rb2, Rb3, Rb4 | true | |
Tc | https://example.org/senior-roles.csv | Cc1, Cc2, Cc3, Cc4, Cc5, Cc6, Cc7, Cc8 | Rc1, Rc2 | false | Fc1, Fc2, Fc3 |
Td | https://example.org/junior-roles.csv | Cd1, Cd2, Cd3, Cd4, Cd5, Cd6, Cd7, Cd8 | Rd1, Rd2 | false | Fd1, Fd2, Fd3 |
In this example, output for the centrally published lists of organizations and professions, tables Ta and Tb (https://example.org/gov.uk/data/organizations.csv and https://example.org/gov.uk/data/professions.csv respectively), is not required; only information from the departmental submissions is to be translated to RDF. Note the suppress output annotation on these tables.
The following foreign keys are defined:
id | columns in table | columns in referenced table |
---|---|---|
Fa1 | Ca3 | Ca1 |
Fc1 | Cc5 | Cc1 |
Fc2 | Cc6 | Cb1 |
Fc3 | Cc7 | Ca1 |
Fd1 | Cd1 | Cc1 |
Fd2 | Cd7 | Cb1 |
Fd3 | Cd8 | Ca1 |
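As a rough illustration of what the foreign keys listed above express, the following Python sketch checks that every referencing value also occurs in the referenced column. The function name, the dictionary-based row representation, and the choice to skip rows whose referencing value is null are all assumptions of this sketch rather than statements about conforming processors.

```python
def foreign_key_violations(rows, columns, referenced_rows, referenced_columns):
    """Return (row number, key) pairs whose key has no match in the referenced table."""
    known = {tuple(r[c] for c in referenced_columns) for r in referenced_rows}
    violations = []
    for number, row in enumerate(rows, start=1):
        key = tuple(row[c] for c in columns)
        # Assumption: a null referencing value is not checked.
        if any(part is None for part in key):
            continue
        if key not in known:
            violations.append((number, key))
    return violations


# Cut-down versions of the senior (Tc) and junior (Td) roles tables.
senior = [{"ref": "90115", "reportsTo": "90334"},
          {"ref": "90334", "reportsTo": None}]
junior = [{"reportsToSenior": "90115"}, {"reportsToSenior": "90115"}]

fc1 = foreign_key_violations(senior, ["reportsTo"], senior, ["ref"])
fd1 = foreign_key_violations(junior, ["reportsToSenior"], senior, ["ref"])
broken = foreign_key_violations([{"reportsToSenior": "99999"}],
                                ["reportsToSenior"], senior, ["ref"])
print(fc1, fd1, broken)
```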
Annotations for the columns, rows and cells in tables Ta, Tb, Tc and Td are shown in the tables below:
Column annotations (all core annotations):
id | table | number | source number | cells | name | titles | required | virtual |
---|---|---|---|---|---|---|---|---|
Ca1 | Ta | 1 | 1 | Ca1.1, Ca2.1 | ref | Organization Unique Reference | true | |
Ca2 | Ta | 2 | 2 | Ca1.2, Ca2.2 | name | Organization Name | | |
Ca3 | Ta | 3 | 3 | Ca1.3, Ca2.3 | department | Department Reference | | |
Cb1 | Tb | 1 | 1 | Cb1.1, Cb2.1, Cb3.1, Cb4.1 | name | Profession | true | |
Cc1 | Tc | 1 | 1 | Cc1.1, Cc2.1 | ref | Post Unique Reference | true | |
Cc2 | Tc | 2 | 2 | Cc1.2, Cc2.2 | name | Name | | |
Cc3 | Tc | 3 | 3 | Cc1.3, Cc2.3 | grade | Grade | | |
Cc4 | Tc | 4 | 4 | Cc1.4, Cc2.4 | job | Job Title | | |
Cc5 | Tc | 5 | 5 | Cc1.5, Cc2.5 | reportsTo | Reports to Senior Post | | |
Cc6 | Tc | 6 | 6 | Cc1.6, Cc2.6 | profession | Profession | | |
Cc7 | Tc | 7 | 7 | Cc1.7, Cc2.7 | organizationRef | Organization Reference | true | |
Cc8 | Tc | 8 | 8 | Cc1.8, Cc2.8 | post_holder | | | true |
Cd1 | Td | 1 | 1 | Cd1.1, Cd2.1 | reportsToSenior | Reporting Senior Post | true | |
Cd2 | Td | 2 | 2 | Cd1.2, Cd2.2 | grade | Grade | | |
Cd3 | Td | 3 | 3 | Cd1.3, Cd2.3 | min_pay | Payscale Minimum (£) | | |
Cd4 | Td | 4 | 4 | Cd1.4, Cd2.4 | max_pay | Payscale Maximum (£) | | |
Cd5 | Td | 5 | 5 | Cd1.5, Cd2.5 | job | Generic Job Title | | |
Cd6 | Td | 6 | 6 | Cd1.6, Cd2.6 | number | Number of Posts (FTE) | | |
Cd7 | Td | 7 | 7 | Cd1.7, Cd2.7 | profession | Profession | | |
Cd8 | Td | 8 | 8 | Cd1.8, Cd2.8 | organizationRef | Organization Reference | true | |
Column Cc8, with the virtual annotation specified as true, is used to relate the person resource, whose name is provided in column Cc2, to the associated post resource within the current row of table Tc ({ "url": "https://example.org/senior-roles.csv" }).
Row annotations (all core annotations):
id | table | number | source number | cells |
---|---|---|---|---|
Ra1 | Ta | 1 | 2 | Ca1.1, Ca1.2, Ca1.3 |
Ra2 | Ta | 2 | 3 | Ca2.1, Ca2.2, Ca2.3 |
Rb1 | Tb | 1 | 2 | Cb1.1 |
Rb2 | Tb | 2 | 3 | Cb2.1 |
Rb3 | Tb | 3 | 4 | Cb3.1 |
Rb4 | Tb | 4 | 5 | Cb4.1 |
Rc1 | Tc | 1 | 2 | Cc1.1, Cc1.2, Cc1.3, Cc1.4, Cc1.5, Cc1.6, Cc1.7, Cc1.8 |
Rc2 | Tc | 2 | 3 | Cc2.1, Cc2.2, Cc2.3, Cc2.4, Cc2.5, Cc2.6, Cc2.7, Cc2.8 |
Rd1 | Td | 1 | 2 | Cd1.1, Cd1.2, Cd1.3, Cd1.4, Cd1.5, Cd1.6, Cd1.7, Cd1.8 |
Rd2 | Td | 2 | 3 | Cd2.1, Cd2.2, Cd2.3, Cd2.4, Cd2.5, Cd2.6, Cd2.7, Cd2.8 |
Cell annotations (all core annotations):
id | table | column | row | string value | value | about URL | property URL | value URL |
---|---|---|---|---|---|---|---|---|
Ca1.1 | Ta | Ca1 | Ra1 | "hefce.ac.uk" | "hefce.ac.uk" | <https://example.org/organization/hefce.ac.uk> | dc:identifier | |
Ca1.2 | Ta | Ca2 | Ra1 | "Higher Education Funding Council for England" | "Higher Education Funding Council for England" | <https://example.org/organization/hefce.ac.uk> | foaf:name | |
Ca1.3 | Ta | Ca3 | Ra1 | "bis.gov.uk" | "bis.gov.uk" | <https://example.org/organization/hefce.ac.uk> | org:subOrganizationOf | <https://example.org/organization/bis.gov.uk> |
Ca2.1 | Ta | Ca1 | Ra2 | "bis.gov.uk" | "bis.gov.uk" | <https://example.org/organization/bis.gov.uk> | dc:identifier | |
Ca2.2 | Ta | Ca2 | Ra2 | "Department for Business, Innovation and Skills" | "Department for Business, Innovation and Skills" | <https://example.org/organization/bis.gov.uk> | foaf:name | |
Ca2.3 | Ta | Ca3 | Ra2 | "xx" | null | <https://example.org/organization/bis.gov.uk> | org:subOrganizationOf | |
Cb1.1 | Tb | Cb1 | Rb1 | "Finance" | "Finance" | | | |
Cb2.1 | Tb | Cb1 | Rb2 | "Information Technology" | "Information Technology" | | | |
Cb3.1 | Tb | Cb1 | Rb3 | "Operational Delivery" | "Operational Delivery" | | | |
Cb4.1 | Tb | Cb1 | Rb4 | "Policy" | "Policy" | | | |
Cc1.1 | Tc | Cc1 | Rc1 | "90115" | "90115" | <https://example.org/organization/hefce.ac.uk/post/90115> | dc:identifier | |
Cc1.2 | Tc | Cc2 | Rc1 | "Steve Egan" | "Steve Egan" | <https://example.org/organization/hefce.ac.uk/person/1> | foaf:name | |
Cc1.3 | Tc | Cc3 | Rc1 | "SCS1A" | "SCS1A" | <https://example.org/organization/hefce.ac.uk/post/90115> | <https://example.org/gov.uk/def/grade> | |
Cc1.4 | Tc | Cc4 | Rc1 | "Deputy Chief Executive" | "Deputy Chief Executive" | <https://example.org/organization/hefce.ac.uk/post/90115> | <https://example.org/gov.uk/def/job> | |
Cc1.5 | Tc | Cc5 | Rc1 | "90334" | "90334" | <https://example.org/organization/hefce.ac.uk/post/90115> | org:reportsTo | <https://example.org/organization/hefce.ac.uk/post/90334> |
Cc1.6 | Tc | Cc6 | Rc1 | "Finance" | "Finance" | <https://example.org/organization/hefce.ac.uk/post/90115> | <https://example.org/gov.uk/def/profession> | |
Cc1.7 | Tc | Cc7 | Rc1 | "hefce.ac.uk" | "hefce.ac.uk" | <https://example.org/organization/hefce.ac.uk/post/90115> | org:postIn | <https://example.org/organization/hefce.ac.uk> |
Cc1.8 | Tc | Cc8 | Rc1 | "" | null | <https://example.org/organization/hefce.ac.uk/post/90115> | org:heldBy | <https://example.org/organization/hefce.ac.uk/person/1> |
Cc2.1 | Tc | Cc1 | Rc2 | "90334" | "90334" | <https://example.org/organization/hefce.ac.uk/post/90334> | dc:identifier | |
Cc2.2 | Tc | Cc2 | Rc2 | "Sir Alan Langlands" | "Sir Alan Langlands" | <https://example.org/organization/hefce.ac.uk/person/2> | foaf:name | |
Cc2.3 | Tc | Cc3 | Rc2 | "SCS4" | "SCS4" | <https://example.org/organization/hefce.ac.uk/post/90334> | <https://example.org/gov.uk/def/grade> | |
Cc2.4 | Tc | Cc4 | Rc2 | "Chief Executive" | "Chief Executive" | <https://example.org/organization/hefce.ac.uk/post/90334> | <https://example.org/gov.uk/def/job> | |
Cc2.5 | Tc | Cc5 | Rc2 | "xx" | null | <https://example.org/organization/hefce.ac.uk/post/90334> | org:reportsTo | |
Cc2.6 | Tc | Cc6 | Rc2 | "Policy" | "Policy" | <https://example.org/organization/hefce.ac.uk/post/90334> | <https://example.org/gov.uk/def/profession> | |
Cc2.7 | Tc | Cc7 | Rc2 | "hefce.ac.uk" | "hefce.ac.uk" | <https://example.org/organization/hefce.ac.uk/post/90334> | org:postIn | <https://example.org/organization/hefce.ac.uk> |
Cc2.8 | Tc | Cc8 | Rc2 | "" | null | <https://example.org/organization/hefce.ac.uk/post/90334> | org:heldBy | <https://example.org/organization/hefce.ac.uk/person/2> |
Cd1.1 | Td | Cd1 | Rd1 | "90115" | "90115" | | org:reportsTo | <https://example.org/organization/hefce.ac.uk/post/90115> |
Cd1.2 | Td | Cd2 | Rd1 | "4" | "4" | | <https://example.org/gov.uk/def/grade> | |
Cd1.3 | Td | Cd3 | Rd1 | "17426" | 17426 | | <https://example.org/gov.uk/def/min_pay> | |
Cd1.4 | Td | Cd4 | Rd1 | "20002" | 20002 | | <https://example.org/gov.uk/def/max_pay> | |
Cd1.5 | Td | Cd5 | Rd1 | "Administrator" | "Administrator" | | <https://example.org/gov.uk/def/job> | |
Cd1.6 | Td | Cd6 | Rd1 | "8.67" | 8.67 | | <https://example.org/gov.uk/def/number_of_posts> | |
Cd1.7 | Td | Cd7 | Rd1 | "Operational Delivery" | "Operational Delivery" | | <https://example.org/gov.uk/def/profession> | |
Cd1.8 | Td | Cd8 | Rd1 | "hefce.ac.uk" | "hefce.ac.uk" | | org:postIn | <https://example.org/organization/hefce.ac.uk> |
Cd2.1 | Td | Cd1 | Rd2 | "90115" | "90115" | | org:reportsTo | <https://example.org/organization/hefce.ac.uk/post/90115> |
Cd2.2 | Td | Cd2 | Rd2 | "5" | "5" | | <https://example.org/gov.uk/def/grade> | |
Cd2.3 | Td | Cd3 | Rd2 | "19546" | 19546 | | <https://example.org/gov.uk/def/min_pay> | |
Cd2.4 | Td | Cd4 | Rd2 | "22478" | 22478 | | <https://example.org/gov.uk/def/max_pay> | |
Cd2.5 | Td | Cd5 | Rd2 | "Administrator" | "Administrator" | | <https://example.org/gov.uk/def/job> | |
Cd2.6 | Td | Cd6 | Rd2 | "0.5" | 0.5 | | <https://example.org/gov.uk/def/number_of_posts> | |
Cd2.7 | Td | Cd7 | Rd2 | "Operational Delivery" | "Operational Delivery" | | <https://example.org/gov.uk/def/profession> | |
Cd2.8 | Td | Cd8 | Rd2 | "hefce.ac.uk" | "hefce.ac.uk" | | org:postIn | <https://example.org/organization/hefce.ac.uk> |
Notice that value URL is not specified for cells Ca2.3 and Cc2.5 because in each case the cell value is null and the column concerned (Ca3 and Cc5 respectively) is not virtual.
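The interplay between the null annotation, the virtual annotation and value URL can be sketched in Python as follows. This is an illustrative simplification: the function annotate_cell and its parameters are hypothetical, and Python's str.format stands in for proper URI template expansion.

```python
def annotate_cell(string_value, row_values, null="", value_url=None, virtual=False):
    """Return (value, value URL) for a single cell of string type."""
    # A string value equal to the column's null marker yields a null cell value.
    value = None if string_value == null else string_value
    url = None
    # The value URL is only produced for non-null cells or for virtual columns.
    if value_url is not None and (value is not None or virtual):
        url = value_url.format(**row_values)  # stand-in for template expansion
    return value, url


# Second row of the senior roles table: reportsTo holds the null marker "xx".
row = {"organizationRef": "hefce.ac.uk", "reportsTo": "xx", "_row": 2}

reports_to = annotate_cell(
    "xx", row, null="xx",
    value_url="https://example.org/organization/{organizationRef}/post/{reportsTo}")
post_holder = annotate_cell(
    "", row,
    value_url="https://example.org/organization/{organizationRef}/person/{_row}",
    virtual=True)

print(reports_to)
print(post_holder)
```

The first call mirrors cell Cc2.5 (null value, no value URL); the second mirrors cell Cc2.8, where the column is virtual and the value URL is still generated.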
Minimal mode output for this example is provided in [turtle] syntax below:
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://example.org/organization/hefce.ac.uk/post/90115>
  dc:identifier "90115" ;
  org:heldBy <https://example.org/organization/hefce.ac.uk/person/1> ;
  <https://example.org/gov.uk/def/grade> "SCS1A" ;
  <https://example.org/gov.uk/def/job> "Deputy Chief Executive" ;
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90334> ;
  <https://example.org/gov.uk/def/profession> "Finance" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

<https://example.org/organization/hefce.ac.uk/person/1>
  foaf:name "Steve Egan" .

<https://example.org/organization/hefce.ac.uk/post/90334>
  dc:identifier "90334" ;
  org:heldBy <https://example.org/organization/hefce.ac.uk/person/2> ;
  <https://example.org/gov.uk/def/grade> "SCS4" ;
  <https://example.org/gov.uk/def/job> "Chief Executive" ;
  <https://example.org/gov.uk/def/profession> "Policy" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

<https://example.org/organization/hefce.ac.uk/person/2>
  foaf:name "Sir Alan Langlands" .

_:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90115> ;
  <https://example.org/gov.uk/def/grade> "4" ;
  <https://example.org/gov.uk/def/min_pay> "17426"^^xsd:integer ;
  <https://example.org/gov.uk/def/max_pay> "20002"^^xsd:integer ;
  <https://example.org/gov.uk/def/job> "Administrator" ;
  <https://example.org/gov.uk/def/number_of_posts> "8.67"^^xsd:double ;
  <https://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

_:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90115> ;
  <https://example.org/gov.uk/def/grade> "5" ;
  <https://example.org/gov.uk/def/min_pay> "19546"^^xsd:integer ;
  <https://example.org/gov.uk/def/max_pay> "22478"^^xsd:integer ;
  <https://example.org/gov.uk/def/job> "Administrator" ;
  <https://example.org/gov.uk/def/number_of_posts> "0.5"^^xsd:double ;
  <https://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .
Output for tables Ta and Tb ({ "url": "https://example.org/gov.uk/data/organizations.csv" } and { "url": "https://example.org/gov.uk/data/professions.csv" }) is not included as the suppress output annotation is true.
The property URL is specified for all cells in tables Tc and Td.
Columns Cc5 and Cd1 ({ "name": "reportsTo" } and { "name": "reportsToSenior" }) use the about URL, property URL and value URL annotations to assert the relationship between the post described by a given row and the senior post to which it reports.
Similarly, columns Cc7 and Cd8 (both with { "name": "organizationRef" }) use the about URL, property URL and value URL annotations to assert the relationship between the post described by a given row and the organization to which it belongs.
Finally, note that two resources are created for each row within table Tc ({ "url": "https://example.org/senior-roles.csv" }): the person and the post they occupy. The relationship between these resources is specified via virtual column Cc8 ({ "name": "post_holder" }) using the about URL, property URL and value URL annotations.
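The behaviour summarized in the notes above can be approximated with the Python sketch below. It is a simplified illustration of minimal mode, not the normative algorithm: the dictionary layout and the names minimal_triples, suppress_output, about_url, property_url and value_url are assumptions of this sketch, and literals are not distinguished from IRIs.

```python
from itertools import count


def minimal_triples(tables):
    """Emit (subject, predicate, object) tuples for non-suppressed tables."""
    blank_ids = count(1)
    triples = []
    for table in tables:
        if table.get("suppress_output"):
            continue  # e.g. the organizations and professions tables
        for row in table["rows"]:
            # Cells without an about URL share one blank node per row.
            default_subject = f"_:row{next(blank_ids)}"
            for cell in row:
                obj = cell["value_url"] if cell["value_url"] is not None else cell["value"]
                if obj is None:
                    continue  # null cell with no value URL: nothing to state
                subject = cell["about_url"] or default_subject
                triples.append((subject, cell["property_url"], obj))
    return triples


tables = [
    {"suppress_output": True, "rows": [[
        {"about_url": "https://example.org/organization/hefce.ac.uk",
         "property_url": "dc:identifier", "value": "hefce.ac.uk", "value_url": None}]]},
    {"suppress_output": False, "rows": [[
        {"about_url": None, "property_url": "org:reportsTo", "value": "90115",
         "value_url": "https://example.org/organization/hefce.ac.uk/post/90115"},
        {"about_url": None, "property_url": "https://example.org/gov.uk/def/min_pay",
         "value": 17426, "value_url": None}]]},
]

result = minimal_triples(tables)
for triple in result:
    print(triple)
```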
Standard mode output for this example is provided in [turtle] syntax below:
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:3d36cfbb-d2d5-4573-a1a7-3bf817062db8 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <https://example.org/senior-roles.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <https://example.org/senior-roles.csv#row=2> ;
      csvw:describes <https://example.org/organization/hefce.ac.uk/post/90115>,
        <https://example.org/organization/hefce.ac.uk/person/1>
    ], [ a csvw:Row ;
      csvw:rownum "2"^^xsd:integer ;
      csvw:url <https://example.org/senior-roles.csv#row=3> ;
      csvw:describes <https://example.org/organization/hefce.ac.uk/post/90334>,
        <https://example.org/organization/hefce.ac.uk/person/2>
    ]
  ], [ a csvw:Table ;
    csvw:url <https://example.org/junior-roles.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <https://example.org/junior-roles.csv#row=2> ;
      csvw:describes _:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
    ], [ a csvw:Row ;
      csvw:rownum "2"^^xsd:integer ;
      csvw:url <https://example.org/junior-roles.csv#row=3> ;
      csvw:describes _:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
    ]
  ] .

<https://example.org/organization/hefce.ac.uk/post/90115>
  dc:identifier "90115" ;
  org:heldBy <https://example.org/organization/hefce.ac.uk/person/1> ;
  <https://example.org/gov.uk/def/grade> "SCS1A" ;
  <https://example.org/gov.uk/def/job> "Deputy Chief Executive" ;
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90334> ;
  <https://example.org/gov.uk/def/profession> "Finance" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

<https://example.org/organization/hefce.ac.uk/person/1>
  foaf:name "Steve Egan" .

<https://example.org/organization/hefce.ac.uk/post/90334>
  dc:identifier "90334" ;
  org:heldBy <https://example.org/organization/hefce.ac.uk/person/2> ;
  <https://example.org/gov.uk/def/grade> "SCS4" ;
  <https://example.org/gov.uk/def/job> "Chief Executive" ;
  <https://example.org/gov.uk/def/profession> "Policy" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

<https://example.org/organization/hefce.ac.uk/person/2>
  foaf:name "Sir Alan Langlands" .

_:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90115> ;
  <https://example.org/gov.uk/def/grade> "4" ;
  <https://example.org/gov.uk/def/min_pay> "17426"^^xsd:integer ;
  <https://example.org/gov.uk/def/max_pay> "20002"^^xsd:integer ;
  <https://example.org/gov.uk/def/job> "Administrator" ;
  <https://example.org/gov.uk/def/number_of_posts> "8.67"^^xsd:double ;
  <https://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .

_:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
  org:reportsTo <https://example.org/organization/hefce.ac.uk/post/90115> ;
  <https://example.org/gov.uk/def/grade> "5" ;
  <https://example.org/gov.uk/def/min_pay> "19546"^^xsd:integer ;
  <https://example.org/gov.uk/def/max_pay> "22478"^^xsd:integer ;
  <https://example.org/gov.uk/def/job> "Administrator" ;
  <https://example.org/gov.uk/def/number_of_posts> "0.5"^^xsd:double ;
  <https://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <https://example.org/organization/hefce.ac.uk> .
Table group G was explicitly defined, but has not been explicitly identified; the table group and table resources are treated as blank nodes.
The person and post resources described by each row of table Tc ({ "url": "https://example.org/senior-roles.csv" }) are explicitly defined using the aboutUrl property; therefore, for row Rc1, say, we state the following triples:
- subject: Rc1
  predicate: csvw:describes
  object: <https://example.org/organization/hefce.ac.uk/post/90115>
- subject: Rc1
  predicate: csvw:describes
  object: <https://example.org/organization/hefce.ac.uk/person/1>
Conversely, the aboutUrl property has not been defined for resources described by each row of table Td ({ "url": "https://example.org/junior-roles.csv" }); therefore blank nodes are used, e.g. for row Rd1 we state the following triple:
- subject: Rd1
  predicate: csvw:describes
  object: _:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
A. Relationship to RDB Direct Mapping
This section is non-normative.
The "Direct Mapping of Relational Data to RDF" W3C Recommendation [rdb-direct-mapping] defines a simple transformation (referred to as direct mapping) from a relational representation of data to RDF. The direct mapping takes as input a relational database (data and schema), and generates an RDF graph called the direct graph. Tables in a relational database bear a strong resemblance to tabular data as defined in [tabular-data-model]; this section highlights the similarities and differences between the direct mapping and the tabular data mapping defined by this document. The following statements summarize the relationships:
- It is possible, for a single table in a relational database, to provide an annotation to the corresponding tabular data (i.e., the CSV export of the table from the database) such that the output of the direct mapping is semantically equivalent to the output of the (minimal) tabular data mapping. See the example with a single table.
- If the database consists of several interrelated tables it is not always possible to provide an annotation to the corresponding tabular data to achieve the same level of semantic equivalence. Examples are provided indicating where tabular data mapping can and cannot match direct mapping: the example with two tables and the example with three tables respectively.
The rest of this section will provide some examples. These are somewhat simplified forms of the examples as used in the [rdb-direct-mapping] document.
A.1 Single table case
Consider a simple table in a relational database, called People:
ID (PK) | fname | addr |
---|---|---|
7 | Bob | 18 |
A relational table always has a schema that the direct mapping makes use of. In this example, the schema defines that:
- The table has three columns named ID, fname, and addr.
- Values in ID are the primary keys in the table, and the corresponding cells contain integers.
- Cells in the fname column contain strings.
- Cells in the addr column contain integers.
Using the schema information, the direct graph is as follows (see [rdb-direct-mapping] for further details):
<https://foo.example.org/DB/People/ID=7> rdf:type <https://foo.example.org/DB/People>; <https://foo.example.org/DB/People/#ID> 7; <https://foo.example.org/DB/People/#fname> "Bob"; <https://foo.example.org/DB/People/#addr> 18.
where https://foo.example.org/DB/ is the URL for the database containing the People table.
When exporting the table into CSV the column names would naturally be mapped to the titles of the respective columns. The (minimal) tabular data mapping yields:
[ <https://foo.example.org/CSV/People/#ID> "7"; <https://foo.example.org/CSV/People/#fname> "Bob"; <https://foo.example.org/CSV/People/#addr> "18"; ]
where https://foo.example.org/CSV/People is the URL of the CSV file exported from the relational database.
Comparing the two conversion results:
- The tabular data mapping does not have the information that the first column provides unique identifiers for the rows (i.e., that it is a primary key); consequently, a blank node must be used for the common row subject.
- The tabular data mapping does not have information on the data types and, therefore, cannot presume that the first and the third columns contain integers.
- The direct mapping adds a triple with predicate rdf:type to the direct graph. However, by default (i.e., without additional annotation), the tabular data mapping has insufficient information to make such an assertion.
It is, however, possible to add annotation to the tabular data so that the two graphs would be semantically equivalent. Indeed, consider the following metadata using the definitions of the [tabular-metadata] specification:
{
  "@context": "https://www.w3.org/ns/csvw",
  "url": "https://foo.example.org/CSV/People",
  "tableSchema": {
    "aboutUrl": "https://foo.example.org/CSV/People/ID={ID}",
    "columns": [{
      "name": "ID",
      "datatype": "integer"
    }, {
      "name": "fname"
    }, {
      "name": "addr",
      "datatype": "integer"
    }, {
      "name": "type",
      "virtual": true,
      "propertyUrl": "rdf:type",
      "valueUrl": "https://foo.example.org/CSV/People"
    }]
  }
}
The metadata adds annotations on datatypes, adds the RDF typing triple explicitly (using a virtual column), and uses the aboutUrl URL template property to provide the common subject. Essentially, the metadata provides the information that the direct mapping retrieves from the table schema.
The [tabular-metadata] specification includes an annotation for primary keys. However, that annotation is used for validation and does not influence the tabular data mapping, hence it is not included in the example metadata.
A processor exporting the table into a CSV file may be able to automatically generate the metadata to complement the CSV file.
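As a rough sketch of how an exporting processor might generate such metadata, the Python function below derives a metadata document from a simplified description of a relational table. The function name, the SQL type names and the SQL_TO_CSVW lookup are assumptions made for illustration, and the url property is placed on the table description itself.

```python
import json

# Hypothetical mapping from SQL column types to datatype names.
SQL_TO_CSVW = {"INTEGER": "integer", "VARCHAR": "string"}


def metadata_for_table(base_url, table, columns, primary_key):
    """Build a metadata dictionary for the CSV export of one relational table."""
    table_url = f"{base_url}/{table}"
    descriptions = []
    for name, sql_type in columns:
        description = {"name": name}
        datatype = SQL_TO_CSVW.get(sql_type, "string")
        if datatype != "string":  # string is the default, so it is left implicit
            description["datatype"] = datatype
        descriptions.append(description)
    # Virtual column reproducing the rdf:type triple of the direct mapping.
    descriptions.append({"name": "type", "virtual": True,
                         "propertyUrl": "rdf:type", "valueUrl": table_url})
    return {
        "@context": "https://www.w3.org/ns/csvw",
        "url": table_url,
        "tableSchema": {
            "aboutUrl": f"{table_url}/{primary_key}={{{primary_key}}}",
            "columns": descriptions,
        },
    }


metadata = metadata_for_table(
    "https://foo.example.org/CSV", "People",
    [("ID", "INTEGER"), ("fname", "VARCHAR"), ("addr", "INTEGER")], "ID")
print(json.dumps(metadata, indent=2))
```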
A.2 Simple, two table case
Consider two interrelated simple tables in a relational database, called People and Addresses, respectively:
People:
ID (PK) | fname | addr (→ Addresses(ID)) |
---|---|---|
7 | Bob | 18 |
Addresses:
ID (PK) | city | state |
---|---|---|
18 | Cambridge | MA |
Beyond what is already specified for the People table, the corresponding relational schema also specifies that:
- The Addresses table has three columns named ID, city, and state.
- Values of ID are primary keys in the Addresses table, and the corresponding cells contain integers.
- Cells in both the city and state columns in the Addresses table contain strings.
- Cells in the addr column of the People table are foreign keys that reference the ID field of the Addresses table.
Using the schema information, the direct graph is as follows (see [rdb-direct-mapping] for further details):
<https://foo.example.org/DB/People/ID=7> rdf:type <https://foo.example.org/DB/People>;
<https://foo.example.org/DB/People#ID> 7;
<https://foo.example.org/DB/People#fname> "Bob";
<https://foo.example.org/DB/People#addr> 18;
<https://foo.example.org/DB/People#ref-addr> <https://foo.example.org/DB/Addresses/ID=18>.
<https://foo.example.org/DB/Addresses/ID=18> rdf:type <https://foo.example.org/DB/Addresses>;
<https://foo.example.org/DB/Addresses#ID> 18;
<https://foo.example.org/DB/Addresses#city> "Cambridge";
<https://foo.example.org/DB/Addresses#state> "MA".
Note the RDF triple with the ref-addr predicate, which links to the relevant row of the Addresses table; this corresponds to the foreign key in the schema.
Using the following metadata the tabular data mapping yields a semantically equivalent result:
{
  "@context": "https://www.w3.org/ns/csvw",
  "tables": [{
    "url": "https://foo.example.org/CSV/People",
    "aboutUrl": "https://foo.example.org/CSV/People/ID={ID}",
    "tableSchema": {
      "columns": [{
        "name": "ID",
        "datatype": "integer"
      }, {
        "name": "fname"
      }, {
        "name": "addr",
        "datatype": "integer"
      }, {
        "name": "ref",
        "virtual": true,
        "propertyUrl": "https://foo.example.org/CSV/People#ref-addr",
        "valueUrl": "https://foo.example.org/CSV/Addresses/ID={addr}"
      }, {
        "name": "type",
        "virtual": true,
        "propertyUrl": "rdf:type",
        "valueUrl": "https://foo.example.org/CSV/People"
      }]
    }
  }, {
    "url": "https://foo.example.org/CSV/Addresses",
    "aboutUrl": "https://foo.example.org/CSV/Addresses/ID={ID}",
    "tableSchema": {
      "columns": [{
        "name": "ID",
        "datatype": "integer"
      }, {
        "name": "city"
      }, {
        "name": "state"
      }, {
        "name": "type",
        "virtual": true,
        "propertyUrl": "rdf:type",
        "valueUrl": "https://foo.example.org/CSV/Addresses"
      }]
    }
  }]
}
A.3 More complex example with three tables
Consider a case with three interrelated simple tables in a relational database, called People, Addresses, and Departments, respectively:
People:
ID (PK) | fname | addr (→ Addresses(ID)) | deptName (→ Departments(name, city)) | deptCity (→ Departments(name, city)) |
---|---|---|---|---|
7 | Bob | 18 | accounting | Cambridge |
Addresses:
ID (PK) | city | state |
---|---|---|
18 | Cambridge | MA |
Departments:
ID (PK) | name (Unique Key) | city (Unique Key) |
---|---|---|
23 | accounting | Cambridge |
The corresponding relational schema specifies that:
- For the People table:
  - The table has five columns, named ID, fname, addr, deptName, and deptCity.
  - Values in ID are the primary keys in the table, and the corresponding cells contain integers.
  - Cells in the fname, deptName, and deptCity columns contain strings.
  - Cells in the addr column contain integers.
- For the Addresses table:
  - The table has three columns, named ID, city, and state.
  - Values in ID are the primary keys in the table, and the corresponding cells contain integers.
  - Cells in both the city and state columns contain strings.
- For the Departments table:
  - The table has three columns, named ID, name, and city.
  - Values in ID are the primary keys in the table, and the corresponding cells contain integers.
  - Cells in both the name and city columns contain strings.
  - The combination of name and city is a unique key.
- The cells in the addr column of the People table are foreign keys that reference the ID field of the Addresses table.
- The cells in the deptName and deptCity columns in the People table are combined foreign keys referencing candidate keys for the name and city pairs of the Departments table.
Using the schema information, the direct graph is as follows (see [rdb-direct-mapping] for further details):
<https://foo.example.org/DB/People/ID=7> rdf:type <https://foo.example.org/DB/People>;
<https://foo.example.org/DB/People/#ID> 7;
<https://foo.example.org/DB/People/#fname> "Bob";
<https://foo.example.org/DB/People/#addr> 18;
<https://foo.example.org/DB/People/#ref-addr> <https://foo.example.org/DB/Addresses/ID=18>;
<https://foo.example.org/DB/People/#deptName> "accounting";
<https://foo.example.org/DB/People/#deptCity> "Cambridge";
<https://foo.example.org/DB/People/#ref-deptName;deptCity> <https://foo.example.org/DB/Departments/ID=23>.
<https://foo.example.org/DB/Addresses/ID=18> rdf:type <https://foo.example.org/DB/Addresses>;
<https://foo.example.org/DB/Addresses/#ID> 18;
<https://foo.example.org/DB/Addresses/#city> "Cambridge";
<https://foo.example.org/DB/Addresses/#state> "MA".
<https://foo.example.org/DB/Departments/ID=23> rdf:type <https://foo.example.org/DB/Departments>;
<https://foo.example.org/DB/Departments/#ID> 23;
<https://foo.example.org/DB/Departments/#name> "accounting";
<https://foo.example.org/DB/Departments/#city> "Cambridge".
The major difference, compared to the simpler example with foreign keys, is the usage of unique keys.
To generate the correct object URI in the last statement about the People row in the direct graph above (the triple with predicate ref-deptName;deptCity), the processor has to:
- Extract the values of columns deptName and deptCity for the current row in the People table in order to determine the value of the compound unique key to the Departments table for that row.
- Find the associated row in the Departments table that matches that compound unique key value and determine the subject for that row.
This can be done because the direct mapping processor has simultaneous access to several tables within the same relational database. It is therefore straightforward to access all the tables in parallel and establish the necessary relationships to generate the triples.
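The two-step lookup described above can be sketched in Python as follows. The in-memory lists stand in for the relational tables, and the function names and the base URL parameter are assumptions of this sketch; it illustrates the idea of joining on a compound key, not the direct mapping algorithm itself.

```python
BASE = "https://foo.example.org/DB/"

people = [{"ID": 7, "fname": "Bob", "addr": 18,
           "deptName": "accounting", "deptCity": "Cambridge"}]
departments = [{"ID": 23, "name": "accounting", "city": "Cambridge"}]


def index_departments(rows, base=BASE):
    """Map each (name, city) unique key to the subject URI of its row."""
    return {(row["name"], row["city"]): f"{base}Departments/ID={row['ID']}"
            for row in rows}


def department_references(people_rows, department_rows, base=BASE):
    """Produce one reference triple per People row with a matching department."""
    subjects = index_departments(department_rows, base)
    triples = []
    for person in people_rows:
        # Step 1: read the compound key from the current People row.
        key = (person["deptName"], person["deptCity"])
        # Step 2: find the Departments row with that key and take its subject.
        target = subjects.get(key)
        if target is not None:
            triples.append((f"{base}People/ID={person['ID']}",
                            f"{base}People/#ref-deptName;deptCity",
                            target))
    return triples


references = department_references(people, departments)
print(references)
```

The index over the Departments table is what requires both tables to be available at the same time.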
However, this combination cannot be handled by the tabular data mapping. The situation for tabular data is indeed different: tables are typically generated from single and, potentially, very large CSV files, meaning that a tabular data mapping processor cannot be expected to handle several tables in parallel. That is the reason why the [tabular-data-model] does not include features that would require such parallel access. As a consequence, the output of the direct mapping for such tables cannot be reproduced by the tabular data mapping.
Note that the [tabular-data-model] includes an annotation for transformations; implementations may include scripts or templates to transform the output of the tabular data mapping and generate the required RDF graph.
B. Acknowledgements
C. Changes since previous versions
C.1 Changes since candidate recommendation of 16 July 2015
- Editorial changes arising from issue 679.
C.2 Changes since the working draft of 16 April 2015
- Added an appendix describing the relationships to the "Direct Mapping to RDF Specification" [rdb-direct-mapping].
- The section on datatype has been changed to allow for the reference to externally defined datatypes (as datatype IRI-s in RDF Literals) using the new @id annotation in the model.
- The section on generating RDF has been amended to include the provision of row title annotation values in standard mode.
- The section on generating RDF has been amended to include a note on the fact that no Unicode normalization is necessary during conversion.
C.3 Changes since the first public working draft of 08 January 2015
The document has undergone substantial changes since the first public working draft. Below are some of the changes made:
- Introduced the concept of a "standard" and a "minimal" mode.
- Removed the separate section on "Mapping Core Tabular Data" (to be in line with the latest version of the tabular model specification).
- Reformulated the mappings in terms of table "annotations" (i.e., based on the tabular model) rather than table "properties" (i.e., not based on the specificities of the metadata vocabulary).
- Provided a set of standard mappings from the general datatypes and the corresponding JSON types.
- Defined a JSON-LD dialect to RDF mapping for non-core annotations.
- Provided a non-normative section on how to include provenance information.
- Added more detailed examples.
D. References
D.1 Normative references
- [RFC2119]
- S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
- [json-ld-api]
- Markus Lanthaler; Gregg Kellogg; Manu Sporny. JSON-LD 1.0 Processing Algorithms and API. 16 January 2014. W3C Recommendation. URL: https://www.w3.org/TR/json-ld-api/
- [rdf11-concepts]
- Richard Cyganiak; David Wood; Markus Lanthaler. RDF 1.1 Concepts and Abstract Syntax. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf11-concepts/
- [tabular-data-model]
- Jeni Tennison; Gregg Kellogg. Model for Tabular Data and Metadata on the Web. W3C Recommendation. URL: https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/
- [tabular-metadata]
- Jeni Tennison; Gregg Kellogg. Metadata Vocabulary for Tabular Data. W3C Recommendation. URL: https://www.w3.org/TR/2015/REC-tabular-metadata-20151217/
D.2 Informative references
- [RFC6570]
- J. Gregorio; R. Fielding; M. Hadley; M. Nottingham; D. Orchard. URI Template. March 2012. Proposed Standard. URL: https://tools.ietf.org/html/rfc6570
- [RFC6761]
- S. Cheshire; M. Krochmal. Special-Use Domain Names. February 2013. Proposed Standard. URL: https://tools.ietf.org/html/rfc6761
- [RFC7111]
- M. Hausenblas; E. Wilde; J. Tennison. URI Fragment Identifiers for the text/csv Media Type. January 2014. Informational. URL: https://tools.ietf.org/html/rfc7111
- [UAX15]
- Mark Davis; Ken Whistler. Unicode Normalization Forms. 31 August 2012. Unicode Standard Annex #15. URL: https://www.unicode.org/reports/tr15
- [csvw-context]
- Gregg Kellogg. Metadata Vocabulary for Tabular Data. URL: https://www.w3.org/ns/csvw
- [json-ld]
- Manu Sporny; Gregg Kellogg; Markus Lanthaler. JSON-LD 1.0. 16 January 2014. W3C Recommendation. URL: https://www.w3.org/TR/json-ld/
- [n-triples]
- Gavin Carothers; Andy Seaborne. RDF 1.1 N-Triples. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/n-triples/
- [prov-o]
- Timothy Lebo; Satya Sahoo; Deborah McGuinness. PROV-O: The PROV Ontology. 30 April 2013. W3C Recommendation. URL: https://www.w3.org/TR/prov-o/
- [rdb-direct-mapping]
- Marcelo Arenas; Alexandre Bertails; Eric Prud'hommeaux; Juan Sequeda. A Direct Mapping of Relational Data to RDF. 27 September 2012. W3C Recommendation. URL: https://www.w3.org/TR/rdb-direct-mapping/
- [rdf-schema]
- Dan Brickley; Ramanathan Guha. RDF Schema 1.1. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf-schema/
- [rdfa-primer]
- Ivan Herman; Ben Adida; Manu Sporny; Mark Birbeck. RDFa 1.1 Primer - Third Edition. 17 March 2015. W3C Note. URL: https://www.w3.org/TR/rdfa-primer/
- [trig]
- Gavin Carothers; Andy Seaborne. RDF 1.1 TriG. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/trig/
- [turtle]
- Eric Prud'hommeaux; Gavin Carothers. RDF 1.1 Turtle. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/turtle/