Date: 2009-03-02, last change: $Date: 2009/05/12 14:22:44 $
Status: personal view only. Editing status: first draft. Written as it became clear that my views on conneg were not really available in a coherent place.
Content Negotiation of Content-type
Content negotiation is a flexibility point in the web architecture which has been there from the very beginning of HTTP 1.0 (though not in HTTP 0.9). It was designed to allow the web to evolve through the introduction of new formats.
In this note we discuss only the negotiation of Content-type, not that of natural language.
In the simplest form, the client sends an Accept: header with a list of content types it understands, and the server sends back the information using one of them.
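As a minimal sketch of this simplest form, the function below takes the value of an Accept: header and the list of types a server can produce, and returns one of them. The function name and the media types used are illustrative assumptions, as is the choice to take the first listed type the server has: HTTP itself gives no meaning to the order.

```python
def pick_simple(accept_header, available):
    """Return the first media type in the Accept header that the server can produce."""
    # Split the comma-separated list and drop any parameters such as ";q=0.8".
    accepted = [item.split(";")[0].strip().lower()
                for item in accept_header.split(",") if item.strip()]
    for media_type in accepted:
        if media_type in available:
            return media_type
    return None  # nothing in common; a server might answer 406 Not Acceptable


# Hypothetical server that can produce the same information in two formats.
available = ["text/html", "application/rdf+xml"]
print(pick_simple("application/rdf+xml, text/html", available))
```

A client that only lists types the server does not have gets None back, which is the case the weighted form is better at handling gracefully.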
In a slightly more complex form, the client sends weights (qc) which express some form of relative disadvantage of different types. The server may combine this information with a knowledge of the different formats available, and the extent (qs) to which they really convey the intent of the resource. A simple model is that the q can be thought of as the proportion of the original value which is preserved.
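One way to read the "proportion preserved" model is that the server multiplies the client's qc by its own qs for each format and serves the format with the highest product. The sketch below does that under stated assumptions: the function names and the qs values are hypothetical, a missing q parameter is treated as 1, and wildcards such as */* are not handled.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: qc}; a missing q defaults to 1.0."""
    weights = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        qc = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                qc = float(value)
        weights[parts[0].lower()] = qc
    return weights


def negotiate(header, variants):
    """variants maps media type -> qs. Return the type with the highest qc * qs."""
    qc = parse_accept(header)
    scores = {t: qc.get(t, 0.0) * qs for t, qs in variants.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical qs values: the JPEG preserves less of the original than the PNG.
variants = {"image/png": 1.0, "image/jpeg": 0.8}
# JPEG scores 1.0 * 0.8 = 0.8, PNG scores 0.5 * 1.0 = 0.5.
print(negotiate("image/jpeg, image/png;q=0.5", variants))
```

Here the client's mild preference against PNG outweighs the server's view that the PNG is the better rendition, so the JPEG is chosen.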
Content negotiation is useful. One use is to shepherd a new data format into a world which initially does not typically accept it. It allows those systems which do accept the new format to advertise it and so participate in the new technology.
Examples
The tabulator project gave an interesting test case. The tabulator is a Firefox add-on which allows Firefox to browse RDF data as well as HTML documents. For the sake of argument, let's say that it renders them both equally well. For this reason it sets the q values for HTML and for RDF the same. It exists in a world in which the conditions listed under each example below hold.
Example: A page of RDF and HTML generated from it.
The HTML page contains the same information as RDF, but degraded in that it is not machine readable.
- People want an HTML browser user to see something useful when following a link to a thing described in an RDF file, say <foo.rdf#bar>.
- There are RDF-only clients which use the data.
- Tabulator users want to see the object itself rather than an HTML page, because they have more power when using the data browser.
Client | HTML (qs = 0.5) | RDF (qs = 1) | Result
HTML browser (HTML qc=1) | 0.5 | 0 | Gets HTML
RDF system (RDF qc=1) | 0 | 1 | Gets RDF
Tabulator (HTML qc=1, RDF qc=1) | 0.5 | 1 | Gets RDF
Example: A page of HTML and some RDF extracted from it.
There is an RDF page which is derived from the HTML page but contains very little of the information on it. The HTML page is someone's home page with all kinds of details of their life and photos and contact information.
- People want an HTML browser user to see something useful when following a link to a thing described in an RDF file, say <foo.rdf#bar>.
- There are RDF-only clients which use the data.
- Tabulator users want to see the object itself rather than an HTML page, because they have more power when using the data browser.
Client | HTML (qs = 1) | RDF (qs = 0.5) | Result
HTML browser (HTML qc=1) | 1 | 0 | Gets HTML
RDF system (RDF qc=1) | 0 | 0.5 | Gets RDF
Tabulator (HTML qc=1, RDF qc=1) | 1 | 0.5 | Gets HTML
The tabulator user follows a link from a friend to the page, and gets the full web page.
In this case, a normal way to cater for the user who explicitly wants the data is to put an explicit link to another URI which is for the RDF itself.
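The two example tables above can be reproduced with a few lines of code, which makes it easier to see that only the server's qs values differ between them. This is a sketch, not the tabulator's or any server's actual logic: the media type strings text/html and application/rdf+xml stand in for "HTML" and "RDF", the function names are invented for illustration, and the score is assumed to be the product qc * qs.

```python
HTML, RDF = "text/html", "application/rdf+xml"  # assumed media type names

# The qc values each kind of client sends, as in the tables.
clients = {
    "HTML browser": {HTML: 1.0},
    "RDF system": {RDF: 1.0},
    "Tabulator": {HTML: 1.0, RDF: 1.0},
}


def result(qc, qs):
    """Return the media type with the highest qc * qs for one client."""
    scores = {t: qc.get(t, 0.0) * q for t, q in qs.items()}
    return max(scores, key=scores.get)


def table(qs):
    """Return what every client receives for a given set of server qs values."""
    return {name: result(qc, qs) for name, qc in clients.items()}


generated_html = table({HTML: 0.5, RDF: 1.0})  # HTML generated from the RDF
extracted_rdf = table({HTML: 1.0, RDF: 0.5})   # RDF extracted from the HTML

for name in clients:
    print(name, "|", generated_html[name], "|", extracted_rdf[name])
```

The HTML browser and the RDF system get the same format in both cases; only the tabulator row changes, because it is the only client for which the server's qs values decide the outcome.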
When not to use conneg
It doesn't work for everything. A specific problem with trying to use it for everything is that a typical web browser, especially if you include all the resources of a laptop operating system, can often understand hundreds of MIME types, and there isn't space for them all in the Accept: header.
It must only be used to negotiate between things which, while being of different content types, carry the same information, more or less: there may be some quality degradation. A JPEG version of a photo may have less quality than the PNG alternative. An RDF/XML file may not be able to express all the information in an N3 file, but it might be an important subset. When there is a subset relationship, q values must be used to allow the best one to be selected.
It is really important that content negotiation is not used to give different clients under different circumstances completely different documents, such as a picture of something and an essay about it, or the card catalogue for a book and the content of the book.
Content negotiation must never be used to negotiate between a document and metadata about that document.
Bad example:
A server serves, using content negotiation:
- A PNG image of a person
- Some RDF about the person
This is a bad idea, in that a user will expect to bookmark a picture and return to it later, and will not expect, on using an RDF-aware browser, to be given data instead. The RDF file and the PNG data are not different content-types of the same information. They are completely different information.