Gleaning Resource Descriptions from Dialects of Languages (GRDDL)
W3C Working Draft 24 October 2006
- This Version:
- https://www.w3.org/TR/2006/WD-grddl-20061024/
- Latest Version:
- https://www.w3.org/TR/grddl/
- Editor:
- Dan Connolly
- Authors:
- see Acknowledgments
Copyright © 2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Abstract
GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages. This GRDDL specification introduces markup for declaring that an XML document includes gleanable data and for linking to an algorithm, typically represented in XSLT, for gleaning the resource descriptions from the document.
The markup includes a namespace-qualified attribute for use in general-purpose XML documents and a profile-qualified link relationship for use in valid XHTML documents. The GRDDL mechanism also allows an XML namespace document (or XHTML profile document) to declare that every document associated with that namespace (or profile) includes gleanable data and for linking to an algorithm for gleaning the data.
A corresponding GRDDL Use Case Working Draft provides motivating examples. A GRDDL Primer demonstrates the mechanism on XHTML documents which include widely-deployed dialects, more recently known as microformats.
Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This is the First Public Working Draft of the GRDDL design by the GRDDL Working Group, which was chartered in July 2006 to review the specification and develop use cases, tutorial materials, and tests. The GRDDL design was released as a Note as early as April 2004; see the change log appendix for details.
GRDDL is intended to contribute to addressing Web Architecture issues such as RDFinXHTML-35 and namespaceDocument-8 as well as issues postponed by the RDF Core working group such as rdfms-validating-embedded-rdf and faq-html-compliance.
There are now multiple implementations, including an online service, and a growing collection of tests. A number of issues remain to be decided by the working group; this draft takes a position on some of them.
A few editorial notes and TODOs in this style remain. In particular, the figures have not yet been updated with respect to changes in the text; we hope they are more helpful than distracting in their present state.
Please send comments about this document to public-grddl-comments@w3.org (with public archive).
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Table of Contents
- Introduction
- Adding GRDDL to well-formed XML
- GRDDL for XML Namespaces
- The GRDDL profile for XHTML
- GRDDL for HTML Profiles
- GRDDL Transformations
- Security Considerations
- The GRDDL Vocabulary
- References
- Appendix: Transformations for Styling versus data extraction
- Appendix: Available Software and Services
- Appendix: Issues
- Appendix: Test Cases
- Acknowledgements and Change History
1. Introduction: Data and Documents
There are many dialects of languages in practice among the many XML documents on the web. There are dialects of XHTML, XML and RDF that are used to represent everything from poetry to prose, purchase orders to invoices, spreadsheets to databases, schemas to scripts, and linked lists to ontologies. Some offer more formally defined semantics and others more loosely coupled semantics. Recently, two progressive encoding techniques have emerged to overlay additional semantics onto valid XHTML documents: RDFa and microformats offer simple, open data formats built upon existing and widely adopted standards.
While this breadth of expression is quite liberating, inspiring new dialects to codify both common and customized meanings, it can prove to be a barrier to understanding across different domains or fields. How, for example, does software discover the author of a poem, a spreadsheet and an ontology? And how can software determine whether authors of each are in fact the same person?
DocBook V4.X | TEI |
<book> <bookinfo> <title>The Stand</title> <author> <firstname>Stephen</firstname> <surname>King</surname> </author></bookinfo> ... </book> |
<TEI ... > ... <title>The Stand</title> <author><persName> <forename>Stephen</forename> <surname>King</surname> </persName></author> ... </TEI> |
Atom | Open Office |
<entry ... > <title>The Stand</title> <author> <name>Stephen King</name> </author> ... </entry> |
<office:document-meta ... > <office:meta> <dc:title>The Stand</dc:title> <meta:initial-creator> Stephen King </meta:initial-creator> <dc:creator>Stephen King</dc:creator> </office:meta> </office:document-meta> |
Resource Descriptions
The Resource Description Framework[RDFC04] provides a standard for making statements about resources in the form of a subject-predicate-object expression. One way to represent the fact "The Stand's author is Stephen King" in RDF would be as a triple whose subject is "The Stand," whose predicate is "has the author," and whose object is "Stephen King." The predicate, "has the author" expresses a relationship between the subject (The Stand) and the object (Stephen King). Using URIs to uniquely identify the book, the author and even the relationship would facilitate software design because not everyone knows Stephen King or even spells his name consistently.
<rdf:RDF xmlns:rdf="https://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="https://purl.org/dc/elements/1.1/" xmlns:foaf="https://xmlns.com/foaf/0.1/" > <rdf:Description rdf:about="https://www.stephenking.com/pages/works/stand/"> <dc:title>The Stand</dc:title> <dc:creator>Stephen King</dc:creator> <foaf:maker> <foaf:Person> <foaf:isPrimaryTopicOf rdf:resource="https://en.wikipedia.org/wiki/Stephen_King" /> </foaf:Person> </foaf:maker> <dc:format>Book</dc:format> </rdf:Description> </rdf:RDF>
GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages. That is, GRDDL provides a relatively inexpensive mechanism for bootstrapping RDF content from uniform XML dialects, shifting the burden from formulating RDF to creating transformation algorithms specifically for each dialect. XML transformation languages such as XSLT are quite versatile in their ability to process, manipulate, and generate XML. The use of XSLT to generate XHTML from single-purpose XML vocabularies is historically celebrated as a powerful idiom for separating structured content from presentation.
GRDDL shifts this idiom to a different end: separating structured content from its authoritative meaning (or semantics). GRDDL works by associating transformations with an individual document, either through direct inclusion of references or indirectly through profile and namespace documents. Content authors can nominate the transformations for producing RDF from their content and use GRDDL to refer to them.
Faithful Renditions
By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition of the source document, or some portion of the source document, that preserves its meaning in RDF.
Likewise, by specifying a GRDDL namespace transformation or profile transformation, the creator of that namespace or profile states that the transformation will provide a faithful rendition of a class of source documents which relate to that namespace or profile, or some portion of such a source document, that preserves its meaning in RDF. A namespace document or a profile document also provides a means for its author to explain, in prose, the purpose of the transformation or any policy statements.
GRDDL Primer
The GRDDL Primer[primer] is a step-by-step tutorial on the GRDDL mechanism. It builds on a number of examples from the GRDDL Use Cases document to illustrate GRDDL techniques for associating documents with transformations for extracting RDF.
GRDDL Use Cases
The use cases document[usecases] collects a number of use cases together with their goals and requirements for GRDDL. These use cases also illustrate how XML and XHTML documents can be decorated with microformat, Embedded RDF or RDFa statements to support GRDDL transformations in charge of extracting valuable data that can then be used to automate a variety of tasks.
GRDDL Specification
This GRDDL specification is a concise technical specification of the GRDDL mechanism and its XML syntax. It specifies the GRDDL syntax to use in valid XHTML and well-formed XML documents, as well as how to encode GRDDL into namespaces and HTML profiles. Discussions of the GRDDL transformation link and security issues are also covered. Appendices provide links to extended examples and existing software and services that employ GRDDL.
2. Adding GRDDL to well-formed XML
The general form of associating a GRDDL transformation link with a well-formed XML document is by adorning the root element with a grddl namespace declaration and a grddl:transformation attribute whose value is a URI reference, or list of URI references, that refer to executable scripts or programs which are expected to transform the source document into RDF. This method is suitable for use with a wide variety of XML dialects that are not constrained from adding attributes by an XML DTD.
Stated more formally:
- An XML document whose root element has an attribute with a local name of transformation and a namespace name of https://www.w3.org/2003/g/data-view# has a GRDDL transformation for each resource identified by a URI reference listed in the value of the attribute (cf. section 4.4.1. URI references in [WEBARCH]).
- If ?D is an XML document with GRDDL transformation ?TX, then the result of applying ?TX to ?D is a GRDDL result of ?D.
Note that issue-output-formats is open; this draft takes the somewhat liberal position that GRDDL transformations yield RDF graphs, not RDF/XML documents.
- If ?G1 and ?G2 are GRDDL results of ?D, then the merge [RDF-MT] of ?G1 and ?G2 is also a GRDDL result of ?D.
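The merge rule above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a graph is a set of (subject, predicate, object) tuples and that blank nodes are strings beginning with "_:"; neither representation comes from this specification, and the example.org URI is hypothetical. In a merge [RDF-MT], blank nodes from different graphs are kept apart, so the sketch renames them per graph before taking the union.

```python
from itertools import count


def is_blank(term):
    # Assumption of this sketch: blank nodes are strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def merge_graphs(*graphs):
    """Merge graphs given as sets of (subject, predicate, object) tuples.

    Blank node labels are renamed per input graph, so two graphs that
    happen to reuse the label "_:a" do not end up sharing a node.
    """
    fresh = count()
    merged = set()
    for graph in graphs:
        renamed = {}  # old blank label -> new label, for this graph only
        for triple in graph:
            new_triple = []
            for term in triple:
                if is_blank(term):
                    if term not in renamed:
                        renamed[term] = f"_:b{next(fresh)}"
                    term = renamed[term]
                new_triple.append(term)
            merged.add(tuple(new_triple))
    return merged


# Two hypothetical GRDDL results of the same document ?D.
g1 = {("https://example.org/book", "dc:title", '"The Stand"')}
g2 = {("https://example.org/book", "foaf:maker", "_:a")}
result = merge_graphs(g1, g2)
```

Graphs without blank nodes merge to their plain set union; the renaming only matters when labels collide across results.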
GRDDL in well-formed XML
In any dialect of XML not constrained by an XML DTD, the following example applies:
<root-element xmlns:grddl="https://www.w3.org/2003/g/data-view#" grddl:transformation="https://example.com/fmt3/txformRDF.xsl"> <etc> ... </etc> </root-element>
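As a rough illustration of how a client might read this attribute, the following Python sketch uses the standard library's xml.etree.ElementTree. The function name root_transformations and the base_uri parameter are hypothetical, and resolving relative references against the document's URI is an assumption of the sketch, not something this draft settles. The namespace name is copied as written in this document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace name as it is written in this document.
GRDDL_NS = "https://www.w3.org/2003/g/data-view#"


def root_transformations(xml_text, base_uri):
    """Return the URI references listed in the root element's
    transformation attribute, resolved against base_uri."""
    root = ET.fromstring(xml_text)
    # ElementTree keys namespaced attributes as "{namespace}localname",
    # so the prefix chosen by the author (grddl, data-view, ...) is irrelevant.
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    return [urljoin(base_uri, ref) for ref in value.split()]


example = """<root-element xmlns:grddl="https://www.w3.org/2003/g/data-view#"
    grddl:transformation="https://example.com/fmt3/txformRDF.xsl">
  <etc/>
</root-element>"""

print(root_transformations(example, "https://example.com/doc.xml"))
```

Because the lookup is by namespace name rather than prefix, the same function handles documents that declare the namespace under a different prefix.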
GRDDL in well-formed XHTML
In any dialect of XHTML that is not constrained by DTD syntax, the above example can be written:
<html xmlns="https://www.w3.org/1999/xhtml" xmlns:data-view="https://www.w3.org/2003/g/data-view#" data-view:transformation="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokCC.xsl https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokGeoURL.xsl"> <head> <title>Joe Lambda's Home page [an example of RDF in XHTML]</title> ...
Notice that data-view was used in the namespace declaration in place of grddl in this example to illustrate the fact that the prefix does not have to be grddl and to emphasize the data-centric focus of the RDF/XML view.
As you will see in later sections, there are other ways to add GRDDL to HTML documents, especially designed to leverage HTML's existing capabilities and thereby overcome constraints imposed by the XML DTDs for some dialects of HTML. See Using GRDDL with valid XHTML and GRDDL for HTML Profiles.
3. Using GRDDL with XML Namespace Documents
Transformations can be associated not only with individual documents but also with whole dialects that share an XML namespace. Any resource available for retrieval from a namespace URI is a namespace document (cf. section 4.5.4. Namespace documents in [WEBARCH]). For example, a namespace document may have an XML Schema representation or an RDF Schema representation, or perhaps both, using content negotiation.
To associate a GRDDL transformation with a whole dialect, have the namespace document include the grddl:namespaceTransformation property. The precise methods for allowing various types of namespace documents to include this property are detailed below, first formally and then by example.
- if an information resource ?D has an XML representation whose root element has a namespace name ?NS then any GRDDL result of the resource identified by ?NS is a GRDDL result of ?D
- if an information resource ?D has an XML representation whose root element has a namespace name ?NSDOC and ?D has a GRDDL result that includes, for any ?TX, the RDF triple { ?NSDOC <https://www.w3.org/2003/g/data-view#namespaceTransformation> ?TX } then ?TX is also a transformation of ?D
Note that issue-mt-ns is open. Perhaps a special case is needed for the RDF/XML namespace: RDF/XML documents are associated with RDF graphs as per the RDF/XML specification.
Using GRDDL with an RDF Namespace document
For example, consider this privacy policy written in P3Q, a contrived analog to P3P[P3P]:
<POLICIES xmlns="https://www.w3.org/2004/01/rdxh/p3q-ns-example"> <EXPIRY max-age="604800"/> ...
The namespace document for P3Q relates the grokP3Q.xsl transformation to all P3Q documents:
<rdf:RDF xmlns:rdf="https://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dataview="https://www.w3.org/2003/g/data-view#"> <rdf:Description rdf:about="https://www.w3.org/2004/01/rdxh/p3q-ns-example"> <dataview:namespaceTransformation rdf:resource="https://www.w3.org/2004/01/rdxh/grokP3Q.xsl"/> </rdf:Description> </rdf:RDF>
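The lookup from a source document to its namespace transformation can be sketched in Python as follows. This is a simplified illustration, not a conforming RDF/XML parser: it only recognises the rdf:Description / rdf:resource pattern used in the example above, and the function names are hypothetical. A real client would obtain the triples from a GRDDL result of the namespace document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace names as they are written in this document.
RDF_NS = "https://www.w3.org/1999/02/22-rdf-syntax-ns#"
GRDDL_NS = "https://www.w3.org/2003/g/data-view#"


def root_namespace(xml_text):
    """Return the namespace name of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def namespace_transformations(ns_doc_text, ns_uri):
    """Find namespaceTransformation values stated about ns_uri.

    Only handles <rdf:Description rdf:about="..."> elements with
    property elements that carry rdf:resource.
    """
    found = []
    root = ET.fromstring(ns_doc_text)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        about = urljoin(ns_uri, desc.get(f"{{{RDF_NS}}}about", ""))
        if about != ns_uri:
            continue
        for prop in desc.findall(f"{{{GRDDL_NS}}}namespaceTransformation"):
            ref = prop.get(f"{{{RDF_NS}}}resource")
            if ref is not None:
                found.append(urljoin(ns_uri, ref))
    return found


policy = """<POLICIES xmlns="https://www.w3.org/2004/01/rdxh/p3q-ns-example">
  <EXPIRY max-age="604800"/>
</POLICIES>"""

ns_doc = """<rdf:RDF xmlns:rdf="https://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dataview="https://www.w3.org/2003/g/data-view#">
  <rdf:Description rdf:about="https://www.w3.org/2004/01/rdxh/p3q-ns-example">
    <dataview:namespaceTransformation
        rdf:resource="https://www.w3.org/2004/01/rdxh/grokP3Q.xsl"/>
  </rdf:Description>
</rdf:RDF>"""

ns = root_namespace(policy)
print(namespace_transformations(ns_doc, ns))
```

In practice ns_doc would be retrieved by dereferencing the namespace URI; it is inlined here so the sketch runs without network access.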
The Working Group is likely to add a section to the GRDDL primer much like this subsection. Since this subsection has no novel normative material, we're interested in feedback on whether it should remain part of this specification once it is covered by the primer.
Using GRDDL with an XML Schema namespace document
A namespace transformation link may be discoverable by transforming the namespace document itself. Note that this means that namespace documents need not be written in RDF/XML directly.
Consider a purchase order that has a namespace document represented in XML Schema, where the XML Schema bears a data-view:transformation attribute licensing extraction of statements that include namespaceTransformation statements:
<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema" xmlns="http:.../Order-1.0" targetNamespace="http:.../Order-1.0" version="1.0" ... xmlns:data-view="https://www.w3.org/2003/g/data-view#" data-view:transformation="https://www.w3.org/2003/g/embeddedRDF.xsl" > <xsd:element name="Order" type="OrderType"> <xsd:annotation> <xsd:documentation>This element is the root element.</xsd:documentation> </xsd:annotation> ... <xsd:annotation> <xsd:appinfo> <rdf:RDF xmlns:rdf="https://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="https://www.w3.org/2003/g/po-ex"> <data-view:namespaceTransformation rdf:resource="grokPO.xsl" /> </rdf:Description> </rdf:RDF> </xsd:appinfo> </xsd:annotation>
Every purchase order using that schema as a namespace document is linked to the grokPO.xsl transformation, as illustrated below:
@@oops... the figure uses "result" as the result of an XSLT transformation, but that clashes with GRDDL result.
The Working Group is likely to add a section to the GRDDL primer much like this subsection. Since this subsection has no novel normative material, we're interested in feedback on whether it should remain part of this specification once it is covered by the primer.
4. Using GRDDL with valid XHTML
To accommodate the DTD-based syntax of XHTML[XHTML], which precludes using attributes from foreign namespaces, we use https://www.w3.org/2003/g/data-view as a metadata profile (cf. section 7.4.4.3 Meta data profiles of [HTML4]).
The general form of adding a GRDDL assertion to a valid XHTML document is by specifying the GRDDL profile in the profile attribute of the head element, and transformation as the value of the rel attribute of a link element or an a element whose href attribute value is a URI reference that refers to an executable script or program which is expected to transform the source document into RDF. This method is suitable for use with valid XHTML documents which are constrained by an XML DTD.
Stated more formally:
- An XHTML document whose metadata profiles include https://www.w3.org/2003/g/data-view has a GRDDL transformation for each resource identified by a link of type transformation.
@@TODO: be more clear about what "whose metadata profiles" means
An example Dublin Core META transformation
For example, this document follows the conventions of [RFC2731], and it explicitly uses the GRDDL profile and links to an XSLT transformation that extracts the metadata in RDF/XML in a way that preserves the meaning of the document:
<html xmlns="https://www.w3.org/1999/xhtml"> <head profile="https://www.w3.org/2003/g/data-view"> <title>Some Document</title> <link rel="transformation" href="https://www.w3.org/2000/06/dc-extract/dc-extract.xsl" /> <meta name="DC.Subject" content="ADAM; Simple Search; Index+; prototype" /> ... </head> ... </html>
In the figure below, the arrow labelled info relates a document to an abstract notion of the information contained in the document. It shows that the RDF data extracted via the dc-extract.xsl transformation is part of the information contained in the document:
This is what the data looks like in RDF/XML:
<rdf:RDF xmlns:dc="https://purl.org/dc/elements/1.1/" xmlns:rdf="https://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about=""> <dc:subject>ADAM; Simple Search; Index+; prototype</dc:subject> </rdf:Description> </rdf:RDF>
Multiple transformations in XHTML
An XHTML document may conform to a number of dialects simultaneously and link to more than one decoding algorithm. However, since the href attribute of the link and a elements accepts only a single URI reference, multiple instances of these elements must be used to assert multiple links:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="https://www.w3.org/1999/xhtml"> <head profile="https://www.w3.org/2003/g/data-view"> <title>Joe Lambda's Home page [an example of RDF in XHTML]</title> <link rel="transformation" href="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl" /> <link rel="transformation" href="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokCC.xsl" /> <link rel="transformation" href="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokGeoURL.xsl" /> ...
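Collecting these links can be sketched in Python as shown below. The sketch makes several assumptions that this draft leaves open: it treats the profile attribute as a space-separated list, matches rel tokens exactly, and resolves href values against a caller-supplied base URI. The function name is hypothetical, the XHTML namespace name is copied as written in this document, and the example.org URIs are placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Names as they are written in this document.
XHTML_NS = "https://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "https://www.w3.org/2003/g/data-view"


def xhtml_transformations(xhtml_text, base_uri):
    """Return transformation links of an XHTML document, in document
    order, provided its head element names the GRDDL profile."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []  # without the profile, rel="transformation" carries no GRDDL meaning
    link_tags = (f"{{{XHTML_NS}}}link", f"{{{XHTML_NS}}}a")
    found = []
    for element in root.iter():
        if element.tag not in link_tags:
            continue
        href = element.get("href")
        if href and "transformation" in element.get("rel", "").split():
            found.append(urljoin(base_uri, href))
    return found


page = """<html xmlns="https://www.w3.org/1999/xhtml">
  <head profile="https://www.w3.org/2003/g/data-view">
    <title>Joe Lambda's Home page</title>
    <link rel="transformation"
          href="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl" />
    <link rel="transformation"
          href="https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokCC.xsl" />
    <link rel="stylesheet" href="style.css" />
  </head>
  <body><p>Hello</p></body>
</html>"""

print(xhtml_transformations(page, "https://example.org/home/index.html"))
```

The parser used here does not load external DTDs, so a document that relies on named entities such as &nbsp; would need a different parsing strategy.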
5. GRDDL for HTML Profiles
XHTML provides the profile mechanism to link to the meaning of properties
and the set of legal values for those properties. As with namespace documents,
a profile document can effectively be written using XHTML with embedded RDF statements
and a GRDDL transformation to extract the definition of terms that are applicable.
Those terms can then be used in an XHTML document to convey profile-dependent meaning.
As discussed in Using GRDDL with valid XHTML, the GRDDL profile can be used with XHTML documents to apply GRDDL semantics over link elements where the value of the rel attribute is transformation.
This very powerful and flexible mechanism integrates well with
microformat profiles[MF-RDF-FAQ] which overlay the normally semantically-poor HTML markup.
Adding a GRDDL profileTransformation assertion to a profile document is much like adding a namespaceTransformation assertion to a namespace document. For a dialect defined by a valid XHTML profile document, add profile="https://www.w3.org/2003/g/data-view" to the head element and make a link of type profileTransformation to the transformation of the dialect.
A more formal description of the relation between GRDDL and XHTML profiles follows:
- if an information resource ?D has an XHTML representation whose profile attribute refers to ?PROFILE, then any GRDDL result of ?PROFILE is a GRDDL result of ?D
- if an information resource ?D has an XHTML representation whose profile attribute refers to ?PROFILE and ?D has a GRDDL result that includes, for any ?TX, the RDF triple { ?PROFILE <https://www.w3.org/2003/g/data-view#profileTransformation> ?TX } then ?TX is also a GRDDL transformation of ?D
In the following example, written in XHTML, the a element is a link by HTML conventions and a profile transformation assertion by GRDDL convention:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="https://www.w3.org/1999/xhtml"> <head profile="https://www.w3.org/2003/g/data-view"> <link rel="transformation" href="https://www.w3.org/2003/g/glean-profile" /> ... <p>This is a profile transformation link: <a rel="profileTransformation" href="https://example.org/BIZ/calendar/extract-rdf.xsl">extract-rdf.xsl</a>
6. Transformation Algorithms
Transformations should have available representations in widely-supported formats. We expect most consumers to support XSLT version 1[XSLT1] for the foreseeable future, though XSLT2[XSLT2] deployment is increasing. While JavaScript, C, or any other programming language could technically express the relevant information, XSLT is specifically designed to express XML to XML transformations and has some good safety characteristics.
XProc: An XML Pipeline Language[XPROC], a language for describing operations to be performed on XML documents, has recently been published as a W3C Working Draft. It merits consideration for expressing more complex or sophisticated transformations which require control over the flow of processing through a variety of XML processing tools. Using XProc, one could apply a sequence of operations such as XInclude, validation, and transformation to a document, aborting if the result of an intermediate stage is not valid, for example.
7. Security considerations
RFC 2046, in section 9. Security Considerations says:
Implementors should pay special attention to the security implications of any media types that can cause the remote execution of any actions in the recipient's environment. In such cases, the discussion of the "application/postscript" type may serve as a model for considering other media types with remote execution capabilities.
Given the expressive power of XSLT, and the possibility to access external resources from an XSLT style sheet (e.g. through the document function or the xsl:import mechanism), implementors should take the appropriate measures to prevent malicious usage of this mechanism.
Note that evaluating a transformation may involve finding representations for not only the resource identified as the transformation, but also any resources referred to by way of mechanisms such as xsl:include. Likewise, it may involve finding representations for not only the source document but also any resources referred to using mechanisms such as the XSLT document() function.
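One possible precaution, sketched below in Python, is for a client to run only transformations whose URIs pass a local policy check. The policy shown (an allow-list of schemes and hosts) and the function name permitted are illustrative assumptions; this specification does not prescribe any particular policy. Note that the check covers only the transformation URI itself, so resources reached through xsl:import, xsl:include or document() would need to be vetted separately, for example in the resolver hook of the XSLT processor in use.

```python
from urllib.parse import urlsplit


def permitted(transformation_uri, allowed_hosts, allowed_schemes=("https",)):
    """Return True if the URI uses an allowed scheme and names an allowed host."""
    parts = urlsplit(transformation_uri)
    return parts.scheme in allowed_schemes and parts.hostname in allowed_hosts


# Hypothetical local policy: trust transformations hosted on www.w3.org only.
trusted_hosts = {"www.w3.org"}
candidates = [
    "https://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl",
    "https://example.com/fmt3/txformRDF.xsl",
    "file:///etc/passwd",
]
to_run = [uri for uri in candidates if permitted(uri, trusted_hosts)]
```

Filtering before retrieval means a rejected transformation is never fetched, let alone executed.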
8. The GRDDL Vocabulary
The following extract from the GRDDL profile document is written using XHTML and embedded RDF statements and includes all of the markup required to define an XHTML profile. (View the source to see the embedded markup)
This document, https://www.w3.org/2003/g/data-view, is a metadata profile in the sense of the HTML specification, in section 7.4.4.3 Meta data profiles.
We define the following terms as XHTML link relationships and RDF properties:
- rel
HTML4 definition of the 'rel' attribute.
- transformation
- relates a Document to an Algorithm, usually represented in XSLT, for extracting an RDF/XML representation of (some of) the document's meaning. See GRDDL specification for full details.
- namespaceTransformation
- relates a Document, e.g. a namespace document, to an Algorithm, usually encoded in XSLT, for extracting an RDF/XML representation of (some of) the meaning of any document whose root element's namespace name refers to the subject document.
- profileTransformation
- relates a Document to an Algorithm, usually encoded in XSLT, for extracting an RDF/XML representation of (some of) the meaning of any XHTML document with a profile that refers to the subject document.
This document uses Embedded RDF to encode Description of a Project (DOAP) data as well as RDF Schema data and one or two RDDL properties. We have moved away from the RDDL syntax itself.
9. References
Normative References
Parts of the following specifications are included in this one by reference:
- HTML4
- HTML 4.01 Specification , D. Raggett, A. Le Hors, I. Jacobs, Editors, W3C Recommendation, 24 December 1999, https://www.w3.org/TR/1999/REC-html401-19991224 . Latest version available at https://www.w3.org/TR/html401 .
- RDFC04
- Resource Description Framework (RDF): Concepts and Abstract Syntax , G. Klyne, J. J. Carroll, Editors, W3C Recommendation, 10 February 2004, https://www.w3.org/TR/2004/REC-rdf-concepts-20040210/ . Latest version available at https://www.w3.org/TR/rdf-concepts/ .
- XHTML
- Modularization of XHTML™ , S. Schnitzenbaumer, F. Boumphrey, T. Wugofski, S. McCarron, M. Altheim, S. Dooley, Editors, W3C Recommendation, 10 April 2001, https://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/ . Latest version available at https://www.w3.org/TR/xhtml-modularization/ .
- WEBARCH
- Architecture of the World Wide Web, Volume One , N. Walsh, I. Jacobs, Editors, W3C Recommendation, 15 December 2004, https://www.w3.org/TR/2004/REC-webarch-20041215/ . Latest version available at https://www.w3.org/TR/webarch/ .
- RDF-MT
- RDF Semantics , P. Hayes, Editor, W3C Recommendation, 10 February 2004, https://www.w3.org/TR/2004/REC-rdf-mt-20040210/ . Latest version available at https://www.w3.org/TR/rdf-mt/ .
@@TODO: cite FOAF normatively or remove the dependency on foaf:Document from the GRDDL namespace document.
Informative references
The following documents provide additional background but are not part of this specification.
- primer
- GRDDL Primer , I. Davis, Editor, W3C Working Draft (work in progress), 2 October 2006, https://www.w3.org/TR/2006/WD-grddl-primer-20061002/ . Latest version available at https://www.w3.org/TR/grddl-primer/ .
- usecases
- GRDDL Use Cases: Scenarios of extracting RDF data from XML documents , F. Gandon, Editor, W3C Working Draft (work in progress), 2 October 2006, https://www.w3.org/TR/2006/WD-grddl-scenarios-20061002/ . Latest version available at https://www.w3.org/TR/grddl-scenarios/ .
- XSLT1
- XSL Transformations (XSLT) Version 1.0 , J. Clark, Editor, W3C Recommendation, 16 November 1999, https://www.w3.org/TR/1999/REC-xslt-19991116 . Latest version available at https://www.w3.org/TR/xslt .
- XSLT2
- XSL Transformations (XSLT) Version 2.0 , M. Kay, Editor, W3C Working Draft (work in progress), 11 February 2005, https://www.w3.org/TR/2005/WD-xslt20-20050211/ . Latest version available at https://www.w3.org/TR/xslt20 .
- RFC2731
- J. Kunze Encoding Dublin Core Metadata in HTML in 1999
- DCRDF
- Expressing Simple Dublin Core in RDF/XML Beckett, Miller, Brickley 2002-07-31
- P3P
- The Platform for Privacy Preferences 1.0 (P3P1.0) Specification , M. Marchiori, Editor, W3C Recommendation, 16 April 2002, https://www.w3.org/TR/2002/REC-P3P-20020416/ . Latest version available at https://www.w3.org/TR/P3P/ .
- STYPI
- Associating Style Sheets with XML documents , J. Clark, Editor, W3C Recommendation, 29 June 1999, https://www.w3.org/1999/06/REC-xml-stylesheet-19990629 . Latest version available at https://www.w3.org/TR/xml-stylesheet .
- XPROC
- XProc: An XML Pipeline Language , N. Walsh, Editor, W3C Working Draft (work in progress), 28 September 2006, https://www.w3.org/TR/2006/WD-xproc-20060928/ . Latest version available at https://www.w3.org/TR/xproc/ .
- MF-RDF-FAQ
- Microformat FAQs for RDF Fans, last modified 17:57, 30 May 2006
Appendix: Transformations for Styling versus data extraction
The xml-stylesheet processing instruction[STYPI] is generally deployed for automated presentation processing. This type of link is different from links to GRDDL transformation algorithms, which are intended to facilitate extracting data. Also, parsing the content of processing instructions is not supported by XML tools such as XSLT processors, and grounding processing instructions in URI space is not as straightforward as using namespaces with attributes.
Appendix: Available Software and Services
The authors provide a pair of online services on an experimental, best-effort basis:
Client-side implementations are also in development:
- glean.py released 28 May 2004
- garner.py released 7 Jun 2004
- GRDDL parser for PHP announced 2 Nov 2004
- a Greasemonkey script that loads RDF data from XHTML pages
- The Redland Raptor RDF Parser Demonstration supports some GRDDL since 04 Apr 2005
Appendix: Issues
The editor acknowledges the following issues and expects the Working Group to make decisions about them:
- issue-tx-element: is there a way to push the grddl:transformation attribute down from the document element to individual elements without breaking the chain of authority? See RDF in parts of XHTML documents 09 Mar 2004 and discussion of trackback in an 18 March message.
- issue-base-param: how the transformation algorithm gets the base URI; specify how that works as an XSLT param
- issue-mt-ns: how GRDDL interacts with XML and RDF media types, including:
- how a GRDDL client interacts with a document whose root element is an XSLT literal result element
- how a GRDDL client interacts with an RDF document that has a root element other than rdf:RDF. see Grddl Squirrel 6 test cases, McBride, 9 Mar 2006
- whether RDF/XML statements labelled as application/xml constitute a "document whose meaning includes the RDF statement ..." (9 Mar 2006 from McBride)
- what happens if data-view:transformation is given on an rdf:RDF root element (9 Mar 2006 from McBride)
- issue-output-formats: whether GRDDL transformations may produce RDF in a format other than RDF/XML .
discussed in the March 2006 SemWeb IG meeting; see irc notes
See also GRDDL extraction *to* RDFa Ben Adida (Friday, 8 September) and following, and comments on Sequential Transformations 20 Oct
- issue-conformance-labels: which conformance labels, if any, should we have? e.g. GRDDL client/processor, GRDDL document
- issue-http-header-links: should links expressed in HTTP headers be supported? Connolly 11Sep
The following issues have been resolved by the Working Group:
- issue-whichlangs: which languages, if any, should GRDDL clients/processors be required to support? XSLT1? XSLT2? ECMAscript? closed 30 Aug
Appendix: Test Cases
A collection of test cases is in development. The original announcement was 02 Feb 2005. As of April 2005, they include:
- xhtmlWithGrddlEnabledProfile
- xhtmlWithGrddlProfile
- xhtmlWithGrddlTransformationInBody
- xhtmlWithMoreThanOneGrddlTransformation
- xhtmlWithMoreThanOneProfile
- xmlWithGrddlAttribute
- xmlWithGrddlAttributeAndNonXMLNamespaceDocument
Tests pending:
- test use of grddl:transformation attribute in conjunction with namespace document(s), XHTML profile(s)
- for issue-base-param "I think I'll make a test case with the xslt2 function too." 30 Aug
- the po-doc.xml XML Schema example
- something to demonstrate microsummaries with GRDDL? see Dom 11 Sep
- Passing an XSLT 2.0 transformation to an XSLT 1.0 engine (6 Sep)
- multiple namespace transformations (6 Sep)
- running only some of the transforms for policy reasons 5 Sep
Extended Example
An example homepage with Dublin Core, GeoURL, RSS, Creative Commons, etc. demonstrates several transformations and dialects.
Acknowledgements and Change History
- In May 2003, Joseph Reagle sent a Kickoff of public-rdf-in-xhtml-tf@w3.org message.
- This design started with a sketch in May 2003 by Dan Connolly.
- In Nov 2003, Dominique Hazaël-Massieux wrote An RDF-in-XHTML Proposal, a predecessor of this spec.
- In Jan 2004, Dan Connolly integrated that draft into this one and sent a message calling for review. Discussion with Tim Berners-Lee led to generalizing from XHTML to all of XML and to indirection via namespace/profile document.
- In Feb 2004, Connolly presented a GRDDL design history and rationale which discusses contribution of this design to Web Architecture issues such as RDFinXHTML-35 and namespaceDocument-8. Feedback from Norm Walsh has been valuable, and Noah Mendelsohn noted a connection to the Cambridge Communiqué in a message of 22 March.
- In February 2004, the RDF Core specifications became W3C Recommendations; the issues rdfms-validating-embedded-rdf and faq-html-compliance were postponed, rather than addressed in those specifications.
- A 13 April 2004 snapshot was published as a W3C Coordination Group Note to facilitate exchange between the Semantic Web Best Practices and Deployment Working Group and the HTML Working Group.
- Ben Adida started contributing use cases from Creative Commons in a March 2004 meeting of the Semantic Web Best Practices & Deployment Working Group
- In a Semantic Web Interest Group meeting that week, Murray Maloney took an interest in the connection with XML Schemas and the readability of the specification, Brian McBride demonstrated some related implementation experience with transforming documents to RDF, and Ian Davis contributed the eRDF use case and profile.
- A 16 May 2005 snapshot was published as a W3C Team Submission by Dom and Dan
The GRDDL Working Group convened August 2006 with Harry Halpin as chair and several of the contributors and implementors above participating, plus Chimezie Ogbuji, Fabien Gandon, Brian Suda, and Rachel Yager.