WD-xptr-19980303
XML Pointer Language (XPointer)
World Wide Web Consortium Working Draft 03-March-1998
- This version:
- https://www.w3.org/TR/1998/WD-xptr-19980303
- Previous version:
- https://www.w3.org/TR/WD-xml-link-970731
- Latest version:
- https://www.w3.org/TR/WD-xptr
- Editors:
- Eve Maler (ArborText) <elm@arbortext.com>
- Steve DeRose (Inso Corp. and Brown University) <sderose@eps.inso.com>
Status of this document
This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at https://www.w3.org/TR.
This work is part of the W3C XML Activity (for current status, see https://www.w3.org/MarkUp/XML/Activity). For information about the XLink language in which XPointer is expected to be used, see https://www.w3.org/TR/WD-xlink.
See https://www.w3.org/TR/NOTE-xlink-principles for additional background on the design principles informing XPointer.
Abstract
This document specifies constructs that support addressing into the internal structures of XML documents. In particular, it provides for specific reference to elements, character strings, and other parts of XML documents, whether or not they bear an explicit ID attribute.
XML Pointer Language (XPointer)
Version 1.0
Table of Contents
1. Introduction
1.1 Language Design Goals
1.2 Relationship to Existing Standards
1.3 Terminology
1.4 Notation
2. XPointers in Locators
3. The XPointer Language
3.1 XPointer Structure
3.2 Absolute Location Terms
3.2.1 The root Keyword
3.2.2 The origin Keyword
3.2.3 The id Keyword
3.2.4 The html Keyword
3.3 Relative Location Terms
3.3.1 Relative Location Term Arguments
3.3.2 Selection by Instance Number
3.3.3 Selection by Node Type
3.3.4 Selection by Attribute
3.3.5 The descendant Keyword
3.3.6 The ancestor Keyword
3.3.7 The preceding Keyword
3.3.8 The following Keyword
3.3.9 The psibling Keyword
3.3.10 The fsibling Keyword
3.4 Spanning Location Term
3.5 Attribute Location Term
3.6 String Location Term
3.7 Locations That Are Not Simply Nodes
3.7.1 Spanning Strings
3.7.2 The all Keyword
3.7.3 Spanning XPointers
3.8 Link Persistence
4. Conformance
Appendices
A. Unfinished Work
A.1 Case Sensitivity in Attribute Values
A.2 XPointers and Abstract Data Types
B. XPointers and TEI Extended Pointers
C. References
1. Introduction
This document specifies a language that supports addressing into the internal structures of XML documents. In particular, it provides for specific reference to elements, character strings, and other parts of XML documents, whether or not they bear an explicit ID attribute.
1.1 Language Design Goals
Following is a summary of the design principles governing XPointer:
- XPointers shall address into XML documents.
- XPointers shall be straightforwardly usable over the Internet.
- XPointers shall be straightforwardly usable in URIs.
- The XPointer design shall be prepared quickly.
- The XPointer design shall be formal and concise.
- The XPointer syntax shall be reasonably compact and human readable.
- XPointers shall be optimized for usability.
- XPointers must be feasible to implement.
1.2 Relationship to Existing Standards
Three standards have been especially influential:
- HTML: Has popularized an important location specifier type, the URL (now URI).
- HyTime: Defines location specifier types for all kinds of data.
- Text Encoding Initiative Guidelines (TEI P3): Provide a formal syntax for location specifiers for structured data, graphics, and other data.
Many other linking systems have also informed this design, especially Dexter, FRESS, MicroCosm, and InterMedia.
1.3 Terminology
The following basic terms apply in this document.
- element tree
- An abstract representation of the relevant structure specified by the tags, attributes, and other markup constructs in an XML document.
- link
- An explicit relationship between two or more data objects or portions of data objects.
- linking element
- An element that asserts the existence and describes the characteristics of a link.
- locator
- Data, provided as part of a link, which identifies a resource.
- resource
- In the abstract sense, an addressable unit of information or service that is participating in a link. Examples include files, images, documents, programs, and query results. Concretely, anything reachable by the use of a locator in some linking element. Note that this term and its definition are taken from the basic specifications governing the World Wide Web.
- sub-resource
- A portion of a resource, pointed to as the precise destination of a link. As one example, a link might specify that an entire document be retrieved and displayed, but that some specific part(s) of it is the specific linked data, to be treated in an application-appropriate manner such as indication by highlighting, scrolling, etc.
1.4 Notation
The formal grammar for locators is given using a simple Extended Backus-Naur Form (EBNF) notation, as described in the XML specification.
2. XPointers in Locators
The locator for a resource is typically provided by means of a Uniform Resource Identifier, or URI. XPointers can be used as fragment identifiers in conjunction with the URI structure to specify a more precise sub-resource. Any fragment identifier that points into an XML resource must be an XPointer. However, for any locator in an XML resource that identifies a resource that is not an XML document (for example, an HTML or PDF document), XPointer does not constrain the syntax or semantics of the locator.
3. The XPointer Language
XPointers operate on the tree defined by the elements and other markup constructs of an XML document.
An XPointer consists of a series of location terms, each of which specifies a location, usually relative to the location specified by the prior location term. Each location term has a keyword (such as id, child, ancestor, and so on) and can have arguments such as an instance number, element type, or attribute. For example, the location term child(2,CHAP) refers to the second child element whose type is CHAP.
3.1 XPointer Structure
At the heart of the XPointer is the location term, the basic unit of addressing information. The combination of location terms in an XPointer has the effect of specifying a precise location.
[Grammar productions: XPointer]
Many XPointers locate individual nodes in an element tree. However, some location
terms can locate more complex sets of data. For example, a string match may
locate only a portion of a node, and an XPointer containing the span
location term (called a spanning XPointer) can reference
sub-resources that do not constitute whole elements.
Note that the implementation of traversal to a resource is not constrained by this specification. In particular, handling a resource designated by a span is probably highly application-dependent. In a display-oriented application program, such traversal might simply highlight the designated characters; but a structure-oriented viewer might have no interest in sub-resources that are not complete nodes or subtrees. Note that a span cannot be treated as a set or list of elements, because spans may locate partial elements.
Location terms are classified into absolute terms, relative terms, span terms, attribute terms, and string data terms. An absolute term selects one or more elements or locations in an XML document without reference to any other sub-resource location. A relative or string data term specifies a location in terms of another location, called the location source. The location source is the entire resource if there are no preceding location terms; otherwise it is the location specified by the preceding term (which might be relative to a location term before that).
3.2 Absolute Location Terms
The keywords described in this section do not depend on the existence of a location source. They can be used to establish a location source or can serve as self-contained XPointers.
[Grammar productions: Absolute location terms]
The empty parentheses after root and origin are for consistency with other keywords and to avoid ambiguous interpretation of an XPointer containing just the string "root" or "origin".
3.2.1 The root Keyword
If an XPointer begins with root(), the location source is the root element of the containing resource. If an XPointer omits any leading absolute location term (that is, it consists only of OtherTerms), it is assumed to have a leading root() absolute location term.
3.2.2 The origin Keyword
The origin keyword produces a meaningful location source for any following location terms only if the XPointer is being processed by application software in response to a request for traversal such as defined in the XLink specification. If an XPointer begins with origin(), the location source is the sub-resource from which the user initiated traversal rather than the default root element. This allows XPointers to select abstract items such as "the next chapter".
It is an error to use origin() in a locator where a URI is also provided and identifies a containing resource different from the resource from which traversal was initiated.
3.2.3 The id Keyword
If an XPointer begins with id(Name), the location source is the element in the containing resource with an attribute having a declared type of ID and a value matching the given Name.
For example, the location term id(a27) chooses the necessarily unique element of the containing resource which has an attribute declared to be of type ID whose value is a27.
Note that if an XML document does not declare all attributes whose values are intended to serve as unique IDs, application software cannot reliably distinguish ID attributes from others with the same string value. Application software processing an XPointer must first attempt to locate an element with a declared ID attribute whose value matches the Name argument. If unable to do so, at user option the application software may locate any element having an attribute with the desired value.
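The two-step lookup described above might be sketched in Python as follows. This is an illustration only, not part of this specification: the function name find_by_id, the id_attrs parameter (standing in for the attribute names that a DTD declares as type ID), and the allow_fallback flag (standing in for the "user option") are all assumptions of the sketch.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, name, id_attrs, allow_fallback=False):
    """Locate the element for id(name); returns None if nothing matches."""
    # First pass: only attributes the caller reports as declared type ID.
    for elem in root.iter():
        for attr in id_attrs:
            if elem.get(attr) == name:
                return elem
    # Optional second pass: any attribute at all with the desired value.
    if allow_fallback:
        for elem in root.iter():
            if name in elem.attrib.values():
                return elem
    return None


doc = ET.fromstring('<DOC><P N="a27"/><P ID="a27"/><P REF="b1"/></DOC>')
target = find_by_id(doc, "a27", {"ID"})
```

Here the element carrying N="a27" is skipped in the first pass because N is not reported as an ID attribute; it could only be reached through the fallback.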
3.2.4 The html Keyword
If an XPointer begins with html(NAMEVALUE), the location source is the first element whose type is A and which has an attribute called NAME whose value is the same as the supplied NAMEVALUE. This is exactly the function performed by the "#" fragment identifier in the context of an HTML document.
3.3 Relative Location Terms
The keywords described in this section depend on the existence of a location source. If none is explicitly provided, the location source is the root element of the containing resource. These location terms provide facilities for navigating forward, backward, up, and down through the element tree. These location terms all accept the same list of arguments.
[Grammar productions: Relative location terms]
Each of these keywords identifies a sequence of elements or other XML node types from which the resulting location source will be chosen. The arguments passed to the keyword determine which node types from that sequence are in fact chosen. Each keyword summarized here is described in detail in the following sections.
child
- Identifies direct child nodes of the location source.
descendant
- Identifies nodes appearing anywhere within the content of the location source.
ancestor
- Identifies element nodes containing the location source.
preceding
- Identifies nodes that appear before (preceding) the location source.
following
- Identifies nodes that appear after (following) the location source.
psibling
- Identifies sibling nodes (sharing their parent with the location source) that appear before (preceding) the location source.
fsibling
- Identifies sibling nodes (sharing their parent with the location source) that appear after (following) the location source.
If the keyword is omitted, it is treated as equivalent to the immediately preceding keyword; the keyword must not be omitted from the first location term of any XPointer (including embedded ones). For example, the following two XPointers are equivalent:
child(2,SECTION).(1,SUBSECTION)
child(2,SECTION).child(1,SUBSECTION)
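The keyword-inheritance rule can be illustrated with a small Python sketch. The helper names expand_terms and normalize are hypothetical, and the regular expression is a simplification: it handles only flat location terms, not nested ones (as in span) or quoted arguments containing commas or parentheses.

```python
import re

# One location term: an optional keyword followed by a parenthesized
# argument list. Nested terms and quoted arguments that contain commas
# or parentheses are deliberately not handled in this sketch.
TERM = re.compile(r"([A-Za-z]*)\(([^()]*)\)")


def expand_terms(xpointer):
    """Return (keyword, args) pairs with omitted keywords filled in."""
    terms = []
    previous = None
    for match in TERM.finditer(xpointer):
        keyword, raw_args = match.groups()
        if not keyword:
            if previous is None:
                raise ValueError("the first location term needs a keyword")
            keyword = previous  # inherit the immediately preceding keyword
        args = [a.strip() for a in raw_args.split(",")] if raw_args else []
        terms.append((keyword, args))
        previous = keyword
    return terms


def normalize(xpointer):
    """Rewrite an XPointer so that every term carries its keyword."""
    return ".".join(f"{k}({','.join(a)})" for k, a in expand_terms(xpointer))


print(normalize("child(2,SECTION).(1,SUBSECTION)"))
# prints child(2,SECTION).child(1,SUBSECTION)
```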
3.3.1 Relative Location Term Arguments
All relative location terms operate using the same set of potential arguments:
[Grammar productions: Relative location term arguments]
3.3.2 Selection by Instance Number
Elements and other node types can be selected by occurrence number:
[Grammar productions: Instance]
For a positive instance number n, the nth of
the candidate locations is identified.
For a negative instance number, the candidate locations are counted from last
to first (in a manner that is specific to each keyword). If the instance value
all
is given, then all the candidate locations are selected. Numbers
that are out of range cause the XPointer to fail.
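As a rough Python sketch of this rule, assuming the candidate locations have already been collected into a list in the order appropriate to the keyword: the function name select_instance is hypothetical, and treating an instance number of zero as out of range is an assumption of the sketch.

```python
def select_instance(candidates, instance):
    """Pick from an ordered candidate list by instance number or 'all'."""
    if instance == "all":
        return list(candidates)
    n = int(instance)
    # Treating zero as out of range is an assumption of this sketch.
    if n == 0 or abs(n) > len(candidates):
        raise LookupError("instance number out of range; the XPointer fails")
    # Positive numbers count from the first candidate, negative from the last.
    return [candidates[n - 1] if n > 0 else candidates[n]]
```

For a candidate list ["a", "b", "c"], instance 2 yields ["b"], instance -1 yields ["c"], and instance 4 raises LookupError.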
3.3.3 Selection by Node Type
XML sub-resources can be selected by specific node type as well as number:
[Grammar productions: Node type]
The node type may be specified by one of the following values:
Name
- Selects a particular XML element type; only elements of the specified type will count as candidates. For example, the following identifies the 29th paragraph of the fourth sub-division of the third major division of the location source:
child(3,DIV1).child(4,DIV2).child(29,P)
The following XPointer selects the last EXAMPLE element in the document:
descendant(-1,EXAMPLE)
#element
- Identifies XML elements. If no NodeType is specified, #element is the default. The following example identifies the fifth child element:
child(5)
#pi
- Identifies XML processing instructions. This node type cannot satisfy any attribute constraints. The only location term that can meaningfully be used with a PI location source is string.
#comment
- Identifies XML comments. This node type cannot satisfy any attribute constraints. The only location term that can meaningfully be used with a comment location source is string.
#text
- Selects among text regions directly inside elements and CDATA sections. This node type cannot satisfy any attribute constraints. The only location term that can meaningfully be used with a text-region location source is string.
#cdata
- Selects among text regions found inside CDATA sections. This node type cannot satisfy any attribute constraints. The only location term that can meaningfully be used with a CDATA-region location source is string.
#all
- Selects among nodes of all the above types. No node but an element can satisfy any attribute constraints, so if attribute constraints are provided, #all is effectively equivalent to #element.
Among the node types, elements can contain other types, but no other types
can contain anything but strings. Thus, for example, ancestor
location terms locate only element node types, and descendant
location terms navigate downward through elements (not other node types) to
reach the desired element or non-element node type.
Selection by a named element type when possible is strongly recommended; see "3.8 Link Persistence" for more information.
Consider the following example:
<!DOCTYPE SPEECH [
The following XPointers select various sub-resources within this resource:
id(a27).child(2,DIRECTION)
- Selects the second "DIRECTION" element (whose content is "To Ros.").
id(a27).child(2,#element)
- Selects the second child element (that is, the first direction, whose content is "crossing downstage").
id(a27).child(2,#text)
- Selects the second text region, "Fare you well, my lord." (The line break between the SPEAKER and DIRECTION elements is the first text region.)
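The three selections above can be traced with a short Python sketch built on xml.dom.minidom. The SPEECH markup below is a hypothetical stand-in written to be consistent with the results just described; it is not the example document itself. The helper names matches and child are likewise assumptions, and the id(a27) step is skipped by starting directly from the SPEECH element.

```python
from xml.dom import Node, minidom

# Hypothetical markup, chosen to agree with the selections described above.
SPEECH = (
    '<SPEECH ID="a27"><SPEAKER>Polonius</SPEAKER>\n'
    "<DIRECTION>crossing downstage</DIRECTION>Fare you well, my lord."
    "<DIRECTION>To Ros.</DIRECTION>You go to seek Lord Hamlet?</SPEECH>"
)


def matches(node, node_type):
    """Test a DOM node against an XPointer node type or element name."""
    if node_type == "#all":
        return True
    if node_type == "#element":
        return node.nodeType == Node.ELEMENT_NODE
    if node_type == "#text":
        return node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    if node_type == "#cdata":
        return node.nodeType == Node.CDATA_SECTION_NODE
    if node_type == "#pi":
        return node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    if node_type == "#comment":
        return node.nodeType == Node.COMMENT_NODE
    # Otherwise the node type is an element type name.
    return node.nodeType == Node.ELEMENT_NODE and node.tagName == node_type


def child(source, n, node_type="#element"):
    """Select the nth direct child of the given node type."""
    candidates = [c for c in source.childNodes if matches(c, node_type)]
    if n == 0 or abs(n) > len(candidates):
        raise LookupError("instance number out of range")
    return candidates[n - 1] if n > 0 else candidates[n]


speech = minidom.parseString(SPEECH).documentElement
print(child(speech, 2, "DIRECTION").firstChild.data)  # To Ros.
print(child(speech, 2, "#element").firstChild.data)   # crossing downstage
print(child(speech, 2, "#text").data)                 # Fare you well, my lord.
```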
3.3.4 Selection by Attribute
Candidate elements can be selected based on their attribute names and values. Note that non-element node types have no attributes, and so can never satisfy selection criteria that include attribute name or value specifications.
[Grammar productions: Attribute]
The Attr
and Val
arguments
are used to provide attribute names and values to use in selecting among candidate
elements.
If specified within quotation marks, the attribute-value argument is case-sensitive; otherwise not.
Attribute names may be specified as "*" in location terms in the (unlikely) event that an attribute value constitutes a constraint regardless of what attribute name it is a value for.
For example, the following location term selects the first child of the
location source for which the attribute TARGET
has a value:
The following XPointer chooses an element using the N
attribute:
Beginning at the location source, the first child (whatever element type
it is) with an N
attribute having the value 2
is
chosen; then that element's first child element having the value 1
for the same attribute is chosen. Non-element node types cannot be
chosen because they cannot have an N
attribute.
The following example selects the first child of the location source that
is an FS
element for which the RESP
attribute has
been left unspecified:
child(1,FS,RESP,#IMPLIED)
Note that the html keyword is a synonym for a very specific instance of attribute-based addressing such that the following two XPointers are equivalent:
html(Sec3.2)
root().descendant(1,A,NAME,"Sec3.2")
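The attribute-matching rules in this section might be sketched in Python as follows. The names attr_matches and html are hypothetical. The sketch also assumes that an "unspecified" attribute is simply one absent from the start-tag, since ElementTree does not expose DTD defaults.

```python
import xml.etree.ElementTree as ET


def attr_matches(elem, attr, val):
    """Test one element against an Attr/Val pair from a location term."""
    if val == "#IMPLIED":
        # Assumption: "unspecified" means absent from the start-tag.
        return attr not in elem.attrib
    quoted = len(val) >= 2 and val[0] == val[-1] and val[0] in "\"'"
    wanted = val[1:-1] if quoted else val
    # An attribute name of "*" lets any attribute supply the value.
    if attr == "*":
        values = list(elem.attrib.values())
    else:
        values = [v for k, v in elem.attrib.items() if k == attr]
    if quoted:
        return any(v == wanted for v in values)  # case-sensitive
    return any(v.lower() == wanted.lower() for v in values)


def html(root, namevalue):
    """Sketch of html(NAMEVALUE) as root().descendant(1,A,NAME,"...")."""
    for elem in root.iter("A"):
        if elem is not root and attr_matches(elem, "NAME", '"' + namevalue + '"'):
            return elem
    return None


doc = ET.fromstring('<DOC><A NAME="sec3.2"/><A NAME="Sec3.2" LANG="DE"/><FS/></DOC>')
```

With this document, html(doc, "Sec3.2") skips the first A element because the quoted comparison is case-sensitive, whereas attr_matches(doc[0], "NAME", "SEC3.2") succeeds because the unquoted value is compared without regard to case.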
3.3.5 The descendant Keyword
The descendant
keyword selects a node of the specified type
anywhere inside the location source, either directly or indirectly nested.
The descendant
location term looks down through trees of subelements
in order to end at the node type requested, not down through
nested levels of intermediate PIs, comments, or text regions. The search for
matching node types occurs in the same order that the start-tags of elements
occur in the XML data stream: The first child of the location source is tested
first, then (if it is an element) that element's first child, and so on. In
formal terms, this is a depth-first traversal.
For example, the following XPointer selects the second TERM element with a LANG attribute whose value is DE, occurring within the element with an ID attribute whose value is A23:
id(a23).descendant(2,TERM,LANG,DE)
If an instance number is positive, the search is depth-first and left-to-right.
If an instance number is negative, the search is depth-first but right-to-left,
in which the right-most, deepest matching element is numbered -1, etc. The
order in which elements are examined corresponds to the ordering of the first
tag encountered. Thus, the following example chooses the last NOTE
element in the document, that is, the one with the rightmost end-tag:
descendant(-1,NOTE)
If the last NOTE happens to be within another NOTE, the containing one is chosen, not the subelement, because it extends to a later point in the document.
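One reading of these ordering rules can be sketched in Python: positive instance numbers walk the subtree in start-tag order, and negative instance numbers walk it in reverse end-tag order. The helper names are hypothetical, and this interpretation of the right-to-left search is an assumption of the sketch, chosen so that it agrees with the NOTE example above.

```python
import xml.etree.ElementTree as ET


def by_start_tag(elem):
    """Yield descendants in the order their start-tags occur."""
    for sub in elem:
        yield sub
        yield from by_start_tag(sub)


def by_end_tag(elem):
    """Yield descendants in the order their end-tags occur."""
    for sub in elem:
        yield from by_end_tag(sub)
        yield sub


def descendant(source, n, name):
    """Select the nth descendant element of the given type."""
    if n > 0:
        ordered = list(by_start_tag(source))
    else:
        # Scanning right to left, the first tag met is an end-tag.
        ordered = list(by_end_tag(source))[::-1]
    candidates = [e for e in ordered if e.tag == name]
    if n == 0 or abs(n) > len(candidates):
        raise LookupError("instance number out of range")
    return candidates[abs(n) - 1]


doc = ET.fromstring(
    '<DOC><NOTE id="n1"/><SEC><NOTE id="n2"><NOTE id="n3"/></NOTE></SEC></DOC>'
)
print(descendant(doc, -1, "NOTE").get("id"))  # n2, the containing NOTE
```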
3.3.6 The ancestor Keyword
The ancestor
keyword selects an element from among the direct
ancestors of the location source. For positive instance numbers, it counts
upwards from the parent of the location source to the root of the containing
resource. For negative instance numbers, it counts downwards from the root
to the direct parent. Note that ancestor
can never select the
location source itself.
For example, the following XPointer first chooses the innermost element (nearest ancestor) properly containing the location source and having attribute N with value 1, and then the smallest DIV element properly containing that ancestor:
ancestor(1,#element,N,1).(1,DIV)
The node type parameter for ancestor, if supplied, must be either #element or a particular element type name. If the current location source is an attribute, the element on which that attribute occurs is considered the first ancestor.
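A minimal Python sketch of the ancestor counting rules, using xml.dom.minidom, is given below. The function name ancestor and its optional name parameter are assumptions of the sketch; attribute constraints are omitted for brevity.

```python
from xml.dom import Node, minidom


def ancestor(source, n, name=None):
    """Select the nth ancestor element, optionally of a given type."""
    # For an attribute, the element bearing it is the first ancestor.
    if source.nodeType == Node.ATTRIBUTE_NODE:
        node = source.ownerElement
    else:
        node = source.parentNode
    chain = []  # nearest ancestor first, root element last
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        if name is None or node.tagName == name:
            chain.append(node)
        node = node.parentNode
    if n == 0 or abs(n) > len(chain):
        raise LookupError("instance number out of range")
    # Positive n counts up from the parent; negative n counts down from the root.
    return chain[n - 1] if n > 0 else chain[n]


doc = minidom.parseString('<DOC><DIV N="1"><DIV N="2"><P X="y"/></DIV></DIV></DOC>')
p = doc.getElementsByTagName("P")[0]
print(ancestor(p, 1).getAttribute("N"))   # 2
print(ancestor(p, -1).tagName)            # DOC
```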
3.3.7 The preceding Keyword
The preceding
keyword selects a node of the specified type
from among those which precede the location source. The set of nodes which
may be selected is the set of all those in the entire document that occur
or begin before the location source. For a positive instance number, it counts
left from the location source; for a negative instance number, it counts right
from the root element of the containing resource. The first delimiter or tag
encountered, starting or ending, counts as an occurrence of that node.
For example, the following XPointer designates the fifth element that occurs or starts before the element that has an ID of a23:
Because all ancestors of the location source contain it and potentially other content, ancestors both "precede" and "follow" their descendants. Therefore, the following example selects the root element (probably among other nodes):
3.3.8 The following Keyword
The following
keyword selects a node of the specified type
from among those which follow the location source. The set of nodes which
may be selected is the set of all those in the entire document that occur
or end after the location source. For a positive instance number, it counts
right from the location source; for a negative instance number, it counts
left from the end-tag of the root element of the containing resource. The
first delimiter or tag encountered, starting or ending, counts as an occurrence
of that node.
For example, the following XPointer designates the second PI that occurs after the element that has an ID of a23:
id(a23).following(2,#pi)
Because all ancestors of the location source contain it and potentially other content, ancestors both "precede" and "follow" their descendants. Therefore, the following example selects the root element (probably among other nodes):
3.3.9 The psibling Keyword
The psibling
keyword selects a node of the specified type
from among those which precede the location source within the same parent
element. The nodes immediately contained by the same
parent element are siblings; those siblings which precede the location
source are its elder siblings, and those which follow it are its younger siblings.
For a positive instance number, psibling
counts left from
the most recent elder sibling to the eldest sibling. For a negative instance
number, it counts right from the eldest sibling. The location term fails if
the location source does not have at least as many elder siblings as the absolute
value of the instance number.
For example, this XPointer designates the element immediately preceding the element with an ID of a23, as long as they share the same parent:
id(a23).psibling(1,#element)
If the location source has at least one elder sibling, then the following location term designates the very eldest sibling:
psibling(-1,#element)
This location term is synonymous with the following XPointer:
ancestor(1,#element).child(1,#element)
The value all may be used to select the entire range of elder siblings of an element. For example, the following XPointer designates the set of elements that precede the element with an ID of a23 and are contained by the same parent:
id(a23).psibling(all,#element)
3.3.10 The fsibling Keyword
The fsibling
keyword selects a node of the specified type
from among those which follow the location source within the same parent element.
The nodes immediately contained by the same
parent element are siblings; those siblings which precede the location
source are its elder siblings, and those which follow it are its younger siblings.
For a positive instance number, fsibling
counts right from
the most recent younger sibling to the youngest sibling. For a negative instance
number, it counts left from the youngest sibling. The location term fails
if the location source does not have at least as many younger siblings as
the absolute value of the instance number.
For example, this XPointer designates the element immediately following the element with an ID of a23, as long as they share the same parent:
id(a23).fsibling(1,#element)
If the location source has at least one younger sibling, then the following location term designates the very youngest sibling:
fsibling(-1,#element)
This location term is synonymous with the following XPointer:
ancestor(1,#element).child(-1,#element)
The value all may be used to select the entire range of younger siblings of an element. For example, the following XPointer designates the set of elements that follow the element with an ID of a23 and are contained by the same parent:
id(a23).fsibling(all,#element)
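The counting rules for psibling and fsibling can be sketched together in Python with xml.dom.minidom. The function and helper names are hypothetical, only the #element node type is handled, and the order in which all returns its results (nearest sibling first) is an assumption, since this section does not state one.

```python
from xml.dom import Node, minidom


def _pick(ordered, n):
    """Apply an instance number (or 'all') to a nearest-first list."""
    if n == "all":
        return list(ordered)
    if n == 0 or abs(n) > len(ordered):
        raise LookupError("not enough siblings; the location term fails")
    return ordered[n - 1] if n > 0 else ordered[n]


def _element_siblings(source, step):
    """Collect element siblings by following previousSibling or nextSibling."""
    found, node = [], getattr(source, step)
    while node is not None:
        if node.nodeType == Node.ELEMENT_NODE:
            found.append(node)  # nearest sibling ends up first
        node = getattr(node, step)
    return found


def psibling(source, n):
    """Elder siblings: 1 is the nearest, -1 is the eldest."""
    return _pick(_element_siblings(source, "previousSibling"), n)


def fsibling(source, n):
    """Younger siblings: 1 is the nearest, -1 is the youngest."""
    return _pick(_element_siblings(source, "nextSibling"), n)


doc = minidom.parseString('<L><I id="a"/><I id="b"/><I id="c"/><I id="d"/></L>')
a, b, c, d = doc.getElementsByTagName("I")
print(psibling(c, -1).getAttribute("id"))  # a, the eldest sibling
print(fsibling(b, -1).getAttribute("id"))  # d, the youngest sibling
```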
3.4 Spanning Location Term
The span
keyword locates a sub-resource starting at the beginning
of the data selected by its first argument and continuing through to the end
of the data selected by its second argument. Both arguments are interpreted
relative to the location source for the spanning location term itself; the
second argument does not use the first argument as its location source.
[Grammar productions: Spanning term]
Following is an example of a spanning XPointer that selects the first through third children of the element with ID a23:
id(a23).span(child(1),child(3))
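The point that both arguments share one location source can be made concrete with a very small Python sketch. The Span type, the span function, and the nth_child helper (a stand-in for child(n) with positive n) are hypothetical names, and representing the result as a pair of end points is just one possible choice.

```python
from typing import Any, NamedTuple
import xml.etree.ElementTree as ET


class Span(NamedTuple):
    start: Any  # location where the span begins
    end: Any    # location through whose end the span continues


def span(source, first, second):
    """Evaluate both arguments against the same location source."""
    return Span(first(source), second(source))


def nth_child(n):
    """Hypothetical stand-in for child(n): the nth child element, n > 0."""
    return lambda source: list(source)[n - 1]


section = ET.fromstring('<SEC ID="a23"><P/><P/><P/><P/></SEC>')
selected = span(section, nth_child(1), nth_child(3))
```

Note that nth_child(3) is applied to the SEC element, not to the first P; applying it to the result of the first argument would contradict the rule stated above.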
3.5 Attribute Location Term
The attr
keyword takes only an attribute name as a selector
and returns the attribute's value.
[Grammar productions: Attribute-match term]
3.6 String Location Term
The string
keyword selects one or more strings or positions
between strings in the location source.
[Grammar productions: String-match term]
InstanceOrAll
- Identifies the nth occurrence of the specified string. For a positive instance number, it counts right from the beginning of the location source. For a negative instance number, it counts left from the end of the location source. For the value all, all occurrences of the string are used as candidates in forming the designated resource.
SkipLit
- Identifies the candidate string to be found within the location source. A null SkipLit string is considered to identify the position immediately preceding each character in the location source. For example, assuming that the element with ID x37 contains the character string "Thomas", the following XPointer identifies the position before the third character ("o"):
id(x37).string(3,"")
Position
- Identifies a character offset from the start of the candidate string(s) to the beginning of the desired final string match. The position number may not be zero; if omitted, it is assumed to be 1. A positive position number counts right from the beginning of the specified string. A negative position number counts left from the end of the string; for example, position -1 is the position immediately preceding the last character in the match. A position value of end selects the position immediately following the last character of the match.
Length
- Specifies the number of characters to be selected. A length of zero or an omitted length references a precise point preceding the character indicated by Position.
When the location source is a PI or comment, string operates on the content of that node. However, the content of PIs and comments is not otherwise considered text content.
For example, the following XPointer selects the position immediately preceding the letter "P" (8 from the start of the string) in the third occurrence of the string "Thomas Pynchon":
root().string(3,"Thomas Pynchon",8)
The following XPointer selects the fifth exclamation mark and the character immediately following it:
id(a27).string(5,'!',1,1)
For purposes of string matching, the "text of the element" means all the character data in the element(s) in the current location source and descendant elements. Markup characters are ignored. The pattern matching is exact and character-for-character. No case, space, or combining-character normalization of any kind is to be performed. Thus, there would be no match to "Thomas Pynchon" in the following example. The first seeming match differs in case, and the second by omission of the word-separating space:
<example>thomas pynchon,
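The string-matching rules can be sketched in Python over the already-extracted text of a location source. The function name string_term and its return format (a list of start and end character offsets) are assumptions of the sketch, as is the decision to count overlapping occurrences; markup handling is left out entirely.

```python
def string_term(text, instance, skiplit, position=1, length=0):
    """Return (start, end) offsets into text for each selected match."""
    if skiplit == "":
        # A null SkipLit identifies the position before each character.
        starts = list(range(len(text)))
    else:
        starts, i = [], text.find(skiplit)  # exact, case-sensitive search
        while i != -1:
            starts.append(i)
            i = text.find(skiplit, i + 1)  # overlapping matches: an assumption
    if instance != "all":
        n = int(instance)
        if n == 0 or abs(n) > len(starts):
            raise LookupError("no such occurrence; the location term fails")
        starts = [starts[n - 1] if n > 0 else starts[n]]
    if position == "end":
        offset = len(skiplit)             # just after the last character
    elif position > 0:
        offset = position - 1             # counts right from the match start
    elif position < 0:
        offset = len(skiplit) + position  # counts left from the match end
    else:
        raise ValueError("the position number may not be zero")
    return [(s + offset, s + offset + length) for s in starts]


text = "Thomas Pynchon; Thomas Pynchon; Thomas Pynchon"
print(string_term(text, 3, "Thomas Pynchon", 8))  # [(39, 39)], the point before "P"
print(string_term("Thomas", 3, ""))               # [(2, 2)], the point before "o"
```

Because no normalization is applied, searching for "Thomas Pynchon" in a text that contains only "thomas pynchon" or "ThomasPynchon" raises LookupError in this sketch.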
3.7 Locations That Are Not Simply Nodes
Most location terms select a single element as their result: for example, the following XPointer selects one element:
id(foo).child(1,SEC)
Such cases trivially correspond to nodes in element trees, thus admitting certain implementation simplifications. However, not all location terms have this limitation:
- The string location term generally returns only part of a node, but if the matched content had markup within it, the result may include portions of multiple elements.
- The string location term, when used with the all instance value, returns a list of typically discontiguous portions of string data.
- The relative location terms may specify the instance argument as all, meaning that all candidate nodes are included in the result. The result is thus a vector of possibly non-adjacent nodes, rather than a subtree.
- A spanning XPointer may include various elements only partially.
Each of these cases is described in more detail below.
3.7.1 Spanning Strings
A string location term may return parts of several elements. For example, a string location term that specified the 12 characters beginning at the "c" below would return the entire text content of the EMPH element, plus the text region that follows the EMPH inside the P:
<P>Hello, <EMPH>cruel</EMPH> world.</P>
3.7.2 The all Keyword
The XPointers shown below specify ordered lists of elements. The elements may or may not be contiguous; in the first case they probably are; in the second, they probably are not:
id(sec2.1).child(1,list).child(all,list)
id(div1).descendant(all,h3)
Note that a discontiguous series of elements such as this may be usefully implemented using the same underlying abstract type that would represent the results of a query in certain processing scenarios.
3.7.3 Spanning XPointers
The following spanning XPointer selects everything from the last P element in one section through the first P in another:
span(id(sec2.1).child(-1,P),id(sec2.2).child(1,P))
Span locations are not subtrees of XML documents, nor are they mere content data strings. Thus, the result of a spanning selection cannot generally be expressed as a well-formed XML document, nor as a node or list of nodes from an element tree. This is because, in general, some elements are neither "in" nor "out of" the span, but are in fact partly in it. For example, the span above includes the end of the element with ID sec2.1, but not its start. Because of this, implementations that support spans cannot represent them merely as single nodes or as well-formed XML documents; instead they must represent them as pairs of locations or by some other means that can express their greater generality.
Some processing semantics that make sense for nodes or vectors of nodes may not make sense for spans. A browser could easily highlight just the character content of a span, but there may be no appropriate semantics to apply in an outliner or tree-oriented display.
3.8 Link Persistence
It is impossible to guarantee that links to target resources will never break; the resources could be changed in such a way that even the most robust link will break. At worst, the author of a target resource could rewrite it to discuss another subject entirely, making all links irrelevant even if they refer to resources using IDs. However, under typical conditions, some XPointers can be reasonably robust.
The most robust locators are usually those which use only an ID, and this
is the preferred locator when available. However, not all elements have IDs,
and link creators often do not have enough control over a target resource
to have an ID added to it. In such cases the preferred locator is one that
points to the nearest containing element that does have an ID, and then walks
down the element tree using the child
location term. This form
is relatively robust for two reasons:
- It has a good probability of withstanding editing; for example, no edit outside the element with the ID can harm the reference.
- It will fail obviously rather than quietly if the link does break.
In addition, where relative location terms such as child
are
used, selection by named element type (where the second argument in a relative
location term has a Name
in it)
is preferred over selection without specifying a name, for two reasons:
- It is clearer because people typically refer to things by type: "the second section", "the third paragraph", etc.
- It is more robust because it increases the chance of detecting breakage if the original target no longer exists.
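The second reason can be made concrete with a small Python sketch. The helper nth_child and both sample documents are hypothetical, and the sketch assumes that an instance number counts only the children that match the given name. It compares an untyped selection (the third child) with a typed one (the second p) after the target paragraph has been replaced by a table.

```python
import xml.etree.ElementTree as ET

def nth_child(parent, n, name=None):
    """Return the n-th (1-based) child element, optionally restricted
    to children whose element type is `name`."""
    candidates = [c for c in parent if name is None or c.tag == name]
    if not 1 <= n <= len(candidates):
        raise LookupError(f"no child number {n} of type {name or 'any'}")
    return candidates[n - 1]

# Hypothetical documents: in the edited version the target paragraph
# has been replaced by a table.
original = ET.fromstring("<sec><title>T</title><p>first</p><p>target</p></sec>")
edited = ET.fromstring("<sec><title>T</title><p>first</p><table/></sec>")

# In the original document both forms reach the intended paragraph.
untyped_before = nth_child(original, 3)
typed_before = nth_child(original, 2, "p")

# After the edit, the untyped form quietly returns a different element.
silent = nth_child(edited, 3)

# The typed form fails visibly, so the breakage can be reported.
try:
    nth_child(edited, 2, "p")
    typed_result = "resolved"
except LookupError:
    typed_result = "broken link detected"

print(silent.tag, "/", typed_result)
```

The untyped selection still resolves after the edit but now points at the table, whereas the typed selection raises an error that an application can surface to the user.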
4. Conformance
A string conforms to XPointer if it adheres to the syntactic requirements imposed by this specification. Note that this does not require that the string, in association with a URI, actually point to a resource that exists at any given moment.
Application software conforms to XPointer if it interprets XPointer-conforming strings according to all required semantics prescribed by this specification and, for any optional semantics it chooses to support, supports them in the way prescribed. Application software is free to define its own requirements on where XPointer strings will be recognized. For example, an XML application program might choose to recognize XPointers only when they occur in locator attributes of XLink elements.
Appendices
A. Unfinished Work
A.1 Case Sensitivity in Attribute Values
It is possible to specify a link's resource based on the value of an attribute. It is difficult to decide what the correct behavior is as regards case-sensitivity in matching. Ideally, the declared type of the attribute value should be taken into account, but that presupposes fetching and reading the document's DTD, which may not be appropriate in many XML applications. The current system, while easy to explain, may not prove suitable in the long run.
A.2 XPointers and Abstract Data Types
Formally, the operations of the XPointer mechanism may be specified as operating on abstract data structures, such as those defined in DOM and the HyTime standard ([ISO/IEC 10744]). Every node type in such locators has a corresponding expression in SDQL, and most also have direct equivalents in the HyTime location module.
B. XPointers and TEI Extended Pointers
The XPointer language is based on "extended pointers," a publicly available technology in use by various SGML-based hypermedia applications, defined in the Text Encoding Initiative guidelines [TEI]. This appendix describes how XPointers differ from extended pointers. The main differences make it easier to package locators within URIs and omit some more advanced capabilities:
- Arguments in locator terms must be separated by commas rather than spaces, to facilitate including XPointers within URIs without escapes, and location terms are now separated by periods.
- A spanning XPointer may contain two XPointers separated by the string ",". This combines the capability of the TEI FROM and TO attributes into a single locator syntax for spans.
- The argument-less terms origin and root take an empty argument list, to distinguish them from possible IDs.
- Regular expression matching for GIs and attributes is not included.
- The PATTERN term is replaced by a literal string matching term.
- Options have been added to allow the specification of various non-element node types, including PIs, comments, and portions of unmarked-up text content within elements and CDATA sections.
- In addition, a few terms have been renamed for greater clarity.
- The SPACE, HyQ, and FOREIGN keywords have been omitted.
These changes have been communicated to the TEI, which is considering them for inclusion in a subsequent revision.
Note that the proposed TEI keyword ATTR has been included in XPointer.
C. References
- XLINK
- Eve Maler and Steve DeRose, editors. XML Linking Language (XLink) V1.0. ArborText, Inso, and Brown University. Burlington, Seekonk, et al.: World Wide Web Consortium, 1998. (See https://www.w3.org/TR/WD-xlink.)
- ISO/IEC 10744
- ISO (International Organization for Standardization). ISO/IEC 10744-1992 (E). Information technology -- Hypermedia/Time-based Structuring Language (HyTime). [Geneva]: International Organization for Standardization, 1992. Extended Facilities Annex. [Geneva]: International Organization for Standardization, 1996. (See https://www.ornl.gov/sgml/wg8/docs/n1920/html/n1920.html.)
- IETF RFC 1738
- IETF (Internet Engineering Task Force). RFC 1738: Uniform Resource Locators. 1994. (See https://www.w3.org/Addressing/rfc1738.txt.)
- IETF RFC 1808
- IETF (Internet Engineering Task Force). RFC 1808: Relative Uniform Resource Locators. 1995. (See https://www.w3.org/Addressing/rfc1808.txt.)
- TEI
- C. M. Sperberg-McQueen and Lou Burnard, editors. Guidelines for Electronic Text Encoding and Interchange. Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), and Association for Literary and Linguistic Computing (ALLC). Chicago, Oxford: Text Encoding Initiative, 1994.
- DOM
- Document Object Model Specification. World Wide Web Consortium, 1997. (See https://www.w3.org/TR/WD-DOM.)
- CHUM
- Steven J. DeRose and David G. Durand. 1995. "The TEI Hypertext Guidelines." In Computers and the Humanities 29(3). Reprinted in Text Encoding Initiative: Background and Context, ed. Nancy Ide and Jean Véronis, ISBN 0-7923-3704-2.
Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.