File API
W3C Working Draft 21 April 2015
- This version:
- https://www.w3.org/TR/2015/WD-FileAPI-20150421/
- Latest published version:
- https://www.w3.org/TR/FileAPI/
- Latest editor's draft:
- https://w3c.github.io/FileAPI/
- Test suite:
- https://w3c-test.org/FileAPI/
- Previous version:
- https://www.w3.org/TR/2013/WD-FileAPI-20130912/
- Editors:
- Arun Ranganathan, Mozilla Corporation, <arun@mozilla.com>
- Jonas Sicking, Mozilla Corporation <jonas@sicking.cc>
- Repository and Participation:
- We are on github.
- File a bug/issue.
- Commit history.
- Mailing list search.
Copyright © 2015 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
Abstract
This specification provides an API for representing file objects in web applications, as well as programmatically selecting them and accessing their data. This includes:
- A FileList interface, which represents an array of individually selected files from the underlying system. The user interface for selection can be invoked via <input type="file">, i.e. when the input element is in the File Upload state [HTML].
- A Blob interface, which represents immutable raw binary data, and allows access to ranges of bytes within the Blob object as a separate Blob.
- A File interface, which includes readonly informational attributes about a file such as its name and the date of the last modification (on disk) of the file.
- A FileReader interface, which provides methods to read a File or a Blob, and an event model to obtain the results of these reads.
- A URL scheme for use with binary data such as files, so that they can be referenced within web applications.
Additionally, this specification defines objects to be used within threaded web applications for the synchronous reading of files.
The section on Requirements and Use Cases [REQ] covers the motivation behind this specification.
This API is designed to be used in conjunction with other APIs and elements on the web platform, notably: XMLHttpRequest (e.g. with an overloaded send() method for File or Blob arguments), postMessage, DataTransfer (part of the drag and drop API defined in [HTML]) and Web Workers. Additionally, it should be possible to programmatically obtain a list of files from the input element when it is in the File Upload state [HTML].
These kinds of behaviors are defined in the appropriate affiliated specifications.
Status of this Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
If you have comments for this spec, please send them to public-webapps@w3.org with a Subject: prefix of [FileAPI]. See Bugzilla for this specification's open bugs.
This document was published by the Web Applications Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-webapps@w3.org (subscribe, archives). All comments are welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
Table of Contents
- 1. Introduction
- 2. Conformance
- 3. Dependencies
- 4. Terminology
- 5. The Blob Interface and Binary Data
- 6. The File Interface
- 7. The FileList Interface
- 8. Reading Data
- 9. Errors and Exceptions
- 10. A URL for Blob and File reference
- 11. Security Considerations
- 12. Requirements and Use Cases
- 13. Appendix A
- 14. Acknowledgements
- 15. References
1. Introduction
This section is informative.
Web applications should have the ability to manipulate as wide as possible a range of user input, including files that a user may wish to upload to a remote server or manipulate inside a rich web application. This specification defines the basic representations for files, lists of files, errors raised by access to files, and programmatic ways to read files. Additionally, this specification also defines an interface that represents "raw data" which can be asynchronously processed on the main thread of conforming user agents. The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.
The File interface represents file data typically obtained from the underlying (OS) file system, and the Blob interface ("Binary Large Object" - a name originally introduced to web APIs in Google Gears) represents immutable raw data. File or Blob reads should happen asynchronously on the main thread, with an optional synchronous API used within threaded web applications. An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent's main thread. This specification defines an asynchronous API based on an event model to read and access a File or Blob's data. A FileReader object provides asynchronous read methods to access that file's data through event handler attributes and the firing of events. The use of events and event handlers allows separate code blocks the ability to monitor the progress of the read (which is particularly useful for remote drives or mounted drives, where file access performance may vary from local drives) and error conditions that may arise during reading of a file. An example will be illustrative.
In the example below, different code blocks handle progress, error, and success conditions.
function startRead() {
  // obtain input element through DOM
  var file = document.getElementById('file').files[0];
  if (file) {
    getAsText(file);
  }
}

function getAsText(readFile) {
  var reader = new FileReader();
  // Read file into memory as UTF-16
  reader.readAsText(readFile, "UTF-16");
  // Handle progress, success, and errors
  reader.onprogress = updateProgress;
  reader.onload = loaded;
  reader.onerror = errorHandler;
}

function updateProgress(evt) {
  if (evt.lengthComputable) {
    // evt.loaded and evt.total are ProgressEvent properties
    var loaded = (evt.loaded / evt.total);
    if (loaded < 1) {
      // Increase the prog bar length
      // style.width = (loaded * 200) + "px";
    }
  }
}

function loaded(evt) {
  // Obtain the read file data
  var fileString = evt.target.result;
  // Handle UTF-16 file dump
  if (utils.regexp.isChinese(fileString)) {
    // Chinese Characters + Name validation
  }
  else {
    // run other charset test
  }
  // xhr.send(fileString)
}

function errorHandler(evt) {
  if (evt.target.error.name == "NotReadableError") {
    // The file could not be read
  }
}
2. Conformance
Everything in this specification is normative except for examples and sections marked as being informative.
The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “RECOMMENDED”, “MAY” and “OPTIONAL” in this document are to be interpreted as described in Key words for use in RFCs to Indicate Requirement Levels [RFC2119].
The following conformance classes are defined by this specification:
- conforming user agent
-
A user agent is considered to be a conforming user agent if it satisfies all of the MUST-, REQUIRED- and SHALL-level criteria in this specification that apply to implementations. This specification uses both the terms "conforming user agent" and "user agent" to refer to this product class.
User agents may implement algorithms in this specification in any way desired, so long as the end result is indistinguishable from the result that would be obtained from the specification's algorithms.
User agents that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL] as this specification uses that specification and terminology.
3. Dependencies
This specification relies on underlying specifications.
- DOM
A conforming user agent must support at least the subset of the functionality defined in DOM4 that this specification relies upon; in particular, it must support EventTarget. [DOM4]
- Progress Events
A conforming user agent must support the Progress Events specification. Data access on read operations is enabled via Progress Events. [ProgressEvents]
- HTML
A conforming user agent must support at least the subset of the functionality defined in HTML that this specification relies upon; in particular, it must support event loops and event handler attributes. [HTML]
- Web IDL
A conforming user agent must also be a conforming implementation of the IDL fragments in this specification, as described in the Web IDL specification. [WebIDL]
- Typed Arrays
A conforming user agent must support the Typed Arrays specification [TypedArrays].
Parts of this specification rely on the Web Workers specification; for those parts of this specification, the Web Workers specification is a normative dependency. [Workers]
4. Terminology
4.1 Terms
The terms and algorithms document, unloading document cleanup steps, event handler attributes, event handler event type, effective script origin, incumbent settings object, event loops, task, task source, URL, global script cleanup jobs list, global script cleanup, queue a task, UTF-8, UTF-16, structured clone, collect a sequence of characters and converting a string to ASCII lowercase are as defined by the HTML specification [HTML].
The terms origin and same origin are as defined by the Web Origin Concept specification [ORIGIN].
When this specification says to terminate an algorithm the user agent must terminate the algorithm after finishing the step it is on, and return from it. Asynchronous read methods defined in this specification may return before the algorithm in question is terminated, and can be terminated by an abort() call.
The term throw in this specification, as it pertains to exceptions, is used as defined in the DOM4 specification [DOM4].
The term byte in this specification is used as defined in the Encoding Specification [Encoding Specification].
The term chunk in this specification is used as defined in the Streams Specification [Streams Specification].
The term context object in this specification is used as defined in the DOM4 specification [DOM4].
The terms URL, relative URL, base URL, URL parser, basic URL Parser, scheme, host, relative scheme, scheme data, and fragment are as defined by the WHATWG URL Specification [URL].
The terms request, response, body and cross-origin request are as defined in the WHATWG Fetch Specification [Fetch Specification].
The term Unix Epoch is used in this specification to refer to the time 00:00:00 UTC on January 1 1970 (or 1970-01-01T00:00:00Z ISO 8601); this is the same time that is conceptually "0" in ECMA-262 [ECMA-262].
The algorithms and steps in this specification use the following mathematical operations:
max(a,b) returns the maximum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of max(6,4) the result is 6. This operation is also defined in ECMAScript [ECMA-262].
min(a,b) returns the minimum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of min(6,4) the result is 4. This operation is also defined in ECMAScript [ECMA-262].
Mathematical comparisons such as < (less than), ≤ (less than or equal to) and > (greater than) are as in ECMAScript [ECMA-262].
5. The Blob Interface and Binary Data
A Blob object refers to a byte sequence, and has a size attribute which is the total number of bytes in the byte sequence, and a type attribute, which is an ASCII-encoded string in lower case representing the media type of the byte sequence.
A Blob must have a readability state, which is one of OPENED or CLOSED. A Blob that refers to a byte sequence, including one of 0 bytes, is said to be in the OPENED readability state. A Blob is said to be closed if its close method has been called. A Blob that is closed is said to be in the CLOSED readability state.
Each Blob must have an internal snapshot state, which must be initially set to the state of the underlying storage, if any such underlying storage exists, and must be preserved through structured clone. Further normative definition of snapshot state can be found for files.
[Constructor,
 Constructor(sequence<(ArrayBuffer or ArrayBufferView or Blob or DOMString)> blobParts, optional BlobPropertyBag options), Exposed=Window,Worker]
interface Blob {
  readonly attribute unsigned long long size;
  readonly attribute DOMString type;
  readonly attribute boolean isClosed;

  //slice Blob into byte-ranged chunks
  Blob slice([Clamp] optional long long start,
             [Clamp] optional long long end,
             optional DOMString contentType);
  void close();
};

dictionary BlobPropertyBag {
  DOMString type = "";
};
5.1. Constructors
The Blob() constructor can be invoked with zero or more parameters. When the Blob() constructor is invoked, user agents must run the following Blob constructor steps:
- If invoked with zero parameters, return a new Blob object with its readability state set to OPENED, consisting of 0 bytes, with size set to 0, and with type set to the empty string.
- Otherwise, the constructor is invoked with a blobParts sequence. Let a be that sequence.
- Let bytes be an empty sequence of bytes.
- Let length be a's length. For 0 ≤ i < length, repeat the following steps:
  - Let element be the ith element of a.
  - If element is a DOMString, run the following substeps:
  - If element is an ArrayBufferView [TypedArrays], convert it to a sequence of byteLength bytes from the underlying ArrayBuffer, starting at the byteOffset of the ArrayBufferView [TypedArrays], and append those bytes to bytes.
  - If element is an ArrayBuffer [TypedArrays], convert it to a sequence of byteLength bytes, and append those bytes to bytes.
  - If element is a Blob, append the bytes it represents to bytes. The type of the Blob array element is ignored.
- If the type member of the optional options argument is provided and is not the empty string, run the following sub-steps:
  - Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
  - Convert every character in t to lowercase using the "converting a string to ASCII lowercase" algorithm [WebIDL].
- Return a Blob object with its readability state set to OPENED, referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above.
  Note: The type t of a Blob is considered a parsable MIME type if the ASCII-encoded string representing the Blob object's type, when converted to a byte sequence, does not return undefined for the parse MIME type algorithm [MIMESNIFF].
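The way parts contribute bytes in the steps above can be modelled in script. The following is only an illustrative sketch, not how a user agent implements the constructor: collectBytes is a hypothetical helper name, Blob parts are left out because their bytes cannot be read synchronously from script, and encoding string parts as UTF-8 (via TextEncoder) is an assumption of this sketch.

```javascript
// Hypothetical helper: models how blobParts contribute bytes, in order.
function collectBytes(parts) {
  var chunks = [];
  var total = 0;
  parts.forEach(function (element) {
    var chunk;
    if (typeof element === "string") {
      // Assumption of this sketch: string parts are encoded as UTF-8.
      chunk = new TextEncoder().encode(element);
    } else if (ArrayBuffer.isView(element)) {
      // Only the window covered by the view is used:
      // byteLength bytes starting at byteOffset of the underlying buffer.
      chunk = new Uint8Array(element.buffer, element.byteOffset, element.byteLength);
    } else if (element instanceof ArrayBuffer) {
      // All byteLength bytes of the buffer.
      chunk = new Uint8Array(element);
    } else {
      throw new TypeError("part type not handled by this sketch");
    }
    chunks.push(chunk);
    total += chunk.byteLength;
  });
  // Concatenate the chunks into a single byte sequence.
  var bytes = new Uint8Array(total);
  var offset = 0;
  chunks.forEach(function (chunk) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  });
  return bytes;
}

var buffer = new ArrayBuffer(8);
new Uint8Array(buffer).set([0, 1, 2, 3, 4, 5, 6, 7]);
var shorts = new Uint16Array(buffer, 4, 2); // covers bytes 4..7 of buffer
var bytes = collectBytes(["ab", shorts, buffer]);
// bytes.length is 2 + 4 + 8 = 14
```

The view contributes only its own four bytes, while the bare ArrayBuffer contributes all eight, which is the distinction the ArrayBufferView and ArrayBuffer steps draw.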
5.1.1. Constructor Parameters
The Blob() constructor can be invoked with the parameters below:
- A blobParts sequence
  which takes any number of the following types of elements, and in any order:
  - ArrayBuffer [TypedArrays] elements.
  - ArrayBufferView [TypedArrays] elements.
  - Blob elements.
  - DOMString [WebIDL] elements.
- An optional BlobPropertyBag
  which takes one member:
  - type, the ASCII-encoded string in lower case representing the media type of the Blob. Normative conditions for this member are provided in the Blob constructor steps.
Examples of constructor usage follow.
// Create a new Blob object
var a = new Blob();
// Create a 1024-byte ArrayBuffer
// buffer could also come from reading a File
var buffer = new ArrayBuffer(1024);
// Create ArrayBufferView objects based on buffer
var shorts = new Uint16Array(buffer, 512, 128);
var bytes = new Uint8Array(buffer, shorts.byteOffset + shorts.byteLength);
var b = new Blob(["foobarbazetcetc" + "birdiebirdieboo"], {type: "text/plain;charset=UTF-8"});
var c = new Blob([b, shorts]);
var a = new Blob([b, c, bytes]);
var d = new Blob([buffer, b, c, bytes]);
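The handling of the type member in the Blob constructor steps can be sketched as a small function. This is a rough model under stated assumptions: normalizeType is a hypothetical name, and the sketch does not check whether the result is a parsable MIME type.

```javascript
// Hypothetical helper mirroring the type substeps of the Blob constructor.
function normalizeType(t) {
  if (t === undefined || t === "") {
    return "";
  }
  // Any character outside U+0020..U+007E makes the whole type empty.
  if (/[^\u0020-\u007E]/.test(t)) {
    return "";
  }
  // ASCII lowercase: only A-Z are changed.
  return t.replace(/[A-Z]/g, function (c) {
    return c.toLowerCase();
  });
}

var normalized = normalizeType("text/plain;charset=UTF-8");
var rejected = normalizeType("text/plain\u00e9");
```

Under this model, the type given for b in the example above would be stored in lower case, and a type containing a non-ASCII character would become the empty string.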
5.2. Attributes
size
Returns the size of the byte sequence in number of bytes. On getting, conforming user agents must return the total number of bytes that can be read by a FileReader or FileReaderSync object, or 0 if the Blob has no bytes to be read. If the Blob has a readability state of CLOSED then size must return 0.
type
The ASCII-encoded string in lower case representing the media type of the Blob. On getting, user agents must return the type of a Blob as an ASCII-encoded string in lower case, such that when it is converted to a byte sequence, it is a parsable MIME type [MIMESNIFF], or the empty string -- 0 bytes -- if the type cannot be determined. The type attribute can be set by the web application itself through constructor invocation and through the slice call; in these cases, further normative conditions for this attribute are in the Blob constructor steps, the File Constructor steps, and the slice method algorithm respectively. User agents can also determine the type of a Blob, especially if the byte sequence is from an on-disk file; in this case, further normative conditions are in the file type guidelines.
isClosed
The boolean value that indicates whether the Blob is in the CLOSED readability state. On getting, user agents must return false if the Blob is in the OPENED readability state, and true if the Blob is in the CLOSED readability state as a result of the close method being called.
Use of the type attribute informs the encoding determination and parsing the Content-Type header when dereferencing Blob URLs.
5.3. Methods and Parameters
5.3.1. The slice method
The slice method returns a new Blob object with bytes ranging from the optional start parameter up to but not including the optional end parameter, and with a type attribute that is the value of the optional contentType parameter. It must act as follows:
- Let O be the Blob context object on which the slice method is being called.
- The optional start parameter is a value for the start point of a slice call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must process slice with start normalized according to the following:
  - If the optional start parameter is not used as a parameter when making this call, let relativeStart be 0.
  - If start is negative, let relativeStart be max((size + start), 0).
  - Else, let relativeStart be min(start, size).
- The optional end parameter is a value for the end point of a slice call. User agents must process slice with end normalized according to the following:
  - If the optional end parameter is not used as a parameter when making this call, let relativeEnd be size.
  - If end is negative, let relativeEnd be max((size + end), 0).
  - Else, let relativeEnd be min(end, size).
- The optional contentType parameter is used to set the ASCII-encoded string in lower case representing the media type of the Blob. User agents must process the slice with contentType normalized according to the following:
  - If the contentType parameter is not provided, let relativeContentType be set to the empty string.
  - Else let relativeContentType be set to contentType and run the substeps below:
    - If relativeContentType contains any characters outside the range of U+0020 to U+007E, then set relativeContentType to the empty string and return from these substeps.
    - Convert every character in relativeContentType to lower case using the "converting a string to ASCII lowercase" algorithm.
- Let span be max((relativeEnd - relativeStart), 0).
- Return a new Blob object S with the following characteristics:
  - S has a readability state equal to that of O's readability state.
    Note: The readability state of the context object is retained by the Blob object returned by the slice call; this has implications on whether the returned Blob is actually usable for read operations or as a Blob URL.
  - S refers to span consecutive bytes from O, beginning with the byte at byte-order position relativeStart.
  - S.size = span.
  - S.type = relativeContentType.
    Note: The type t of a Blob is considered a parsable MIME type if the ASCII-encoded string representing the Blob object's type, when converted to a byte sequence, does not return undefined for the parse MIME type algorithm [MIMESNIFF].
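The start and end normalization in the slice algorithm above can be sketched as follows. This is a minimal model, assuming integer arguments (the [Clamp] conversion is not modelled); normalizeSliceRange is a hypothetical name and is not part of the API.

```javascript
// Hypothetical helper: computes relativeStart, relativeEnd and span
// for a Blob of `size` bytes, following the slice normalization rules.
function normalizeSliceRange(size, start, end) {
  var relativeStart;
  if (start === undefined) {
    relativeStart = 0;
  } else if (start < 0) {
    relativeStart = Math.max(size + start, 0);
  } else {
    relativeStart = Math.min(start, size);
  }

  var relativeEnd;
  if (end === undefined) {
    relativeEnd = size;
  } else if (end < 0) {
    relativeEnd = Math.max(size + end, 0);
  } else {
    relativeEnd = Math.min(end, size);
  }

  // A range whose end precedes its start yields an empty span.
  var span = Math.max(relativeEnd - relativeStart, 0);
  return { relativeStart: relativeStart, relativeEnd: relativeEnd, span: span };
}

var lastHalf = normalizeSliceRange(100, -50);      // start 50, end 100, span 50
var tooShort = normalizeSliceRange(100, 0, -150);  // end clamps to 0, span 0
var inverted = normalizeSliceRange(100, 80, 20);   // span 0
```

Negative arguments count back from the end of the byte sequence, and out-of-range values are clamped instead of raising an error, so the resulting span is never negative.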
The examples below illustrate the different types of slice calls possible. Since the File interface inherits from the Blob interface, examples are based on the use of the File interface.
// obtain input element through DOM
var file = document.getElementById('file').files[0];
if (file)
{
  // create an identical copy of file
  // the two calls below are equivalent
  var fileClone = file.slice();
  var fileClone2 = file.slice(0, file.size);

  // slice file into 1/2 chunk starting at middle of file
  // Note the use of negative number
  var fileChunkFromEnd = file.slice(-(Math.round(file.size/2)));

  // slice file into 1/2 chunk starting at beginning of file
  var fileChunkFromStart = file.slice(0, Math.round(file.size/2));

  // slice file from beginning till 150 bytes before end
  var fileNoMetadata = file.slice(0, -150, "application/experimental");
}
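A common use of slice is to process a large file in fixed-size pieces. The sketch below only computes the byte ranges; chunkRanges is a hypothetical helper, and it assumes chunkSize is a positive integer.

```javascript
// Hypothetical helper: [start, end) byte ranges covering `size` bytes
// in pieces of at most `chunkSize` bytes. Assumes chunkSize > 0.
function chunkRanges(size, chunkSize) {
  var ranges = [];
  for (var start = 0; start < size; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, size)]);
  }
  return ranges;
}

var ranges = chunkRanges(10, 4); // [[0, 4], [4, 8], [8, 10]]

// In a page, each range could then be passed to slice, for example:
// var pieces = chunkRanges(file.size, 1024).map(function (r) {
//   return file.slice(r[0], r[1]);
// });
```

Because slice excludes the byte at the end position, consecutive ranges share a boundary value without overlapping.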
5.3.2. The close method
The close method is said to close a Blob, and must act as follows on the Blob on which the method has been called:
- If the readability state of the context object is CLOSED, terminate this algorithm.
- Otherwise, set the readability state of the context object to CLOSED.
- If the context object has an entry in the Blob URL Store, remove the entry that corresponds to the context object.
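The effect of close on the readability state, on size, and on the Blob URL Store can be illustrated with a small model. Everything here is hypothetical scaffolding for illustration: ModelBlob stands in for a Blob, blobUrlStore is assumed to be a simple map from URL strings to objects, and the URL string is a placeholder.

```javascript
// Illustrative model only; not the real Blob or Blob URL Store.
var blobUrlStore = new Map(); // hypothetical: URL string -> ModelBlob

function ModelBlob(byteLength) {
  this.readabilityState = "OPENED";
  this.byteLength = byteLength;
}

Object.defineProperty(ModelBlob.prototype, "isClosed", {
  get: function () { return this.readabilityState === "CLOSED"; }
});

Object.defineProperty(ModelBlob.prototype, "size", {
  // size must return 0 once the readability state is CLOSED.
  get: function () { return this.isClosed ? 0 : this.byteLength; }
});

ModelBlob.prototype.close = function () {
  // Step 1: closing an already closed object does nothing.
  if (this.readabilityState === "CLOSED") {
    return;
  }
  // Step 2: mark the object as closed.
  this.readabilityState = "CLOSED";
  // Step 3: drop any store entries that refer to this object.
  var self = this;
  blobUrlStore.forEach(function (blob, url) {
    if (blob === self) {
      blobUrlStore.delete(url);
    }
  });
};

var blob = new ModelBlob(3);
blobUrlStore.set("blob:hypothetical-id", blob);
blob.close();
blob.close(); // second call returns at step 1
```

After the first call the model reports isClosed as true, size as 0, and the store no longer holds an entry for the object.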
6. The File Interface
A File object is a Blob object with a name attribute, which is a string; it can be created within the web application via a constructor, or is a reference to a byte sequence from a file from the underlying (OS) file system.
If a File object is a reference to a byte sequence originating from a file on disk, then its snapshot state should be set to the state of the file on disk at the time the File object is created.
This is a non-trivial requirement to implement for user agents, and is thus not a must but a should [RFC2119]. User agents should endeavor to have a File object's snapshot state set to the state of the underlying storage on disk at the time the reference is taken. If the file is modified on disk following the time a reference has been taken, the File's snapshot state will differ from the state of the underlying storage. User agents may use modification time stamps and other mechanisms to maintain snapshot state, but this is left as an implementation detail.
When a File object refers to a file on disk, user agents must return the type of that file, and must follow the file type guidelines below:
- User agents must return the type as an ASCII-encoded string in lower case, such that when it is converted to a corresponding byte sequence, it is a parsable MIME type [MIMESNIFF], or the empty string -- 0 bytes -- if the type cannot be determined.
- When the file is of type text/plain user agents must NOT append a charset parameter to the dictionary of parameters portion of the media type [MIMESNIFF].
- User agents must not attempt heuristic determination of encoding, including statistical methods.
[Constructor(sequence<(Blob or DOMString or ArrayBufferView or ArrayBuffer)> fileBits,
 [EnsureUTF16] DOMString fileName, optional FilePropertyBag options), Exposed=Window,Worker]
interface File : Blob {
  readonly attribute DOMString name;
  readonly attribute long long lastModified;
};

dictionary FilePropertyBag {
  DOMString type = "";
  long long lastModified;
};
6.1 Constructor
The File constructor is invoked with two or three parameters, depending on whether the optional dictionary parameter is used. When the File() constructor is invoked, user agents must run the following File constructor steps:
- Let a be the fileBits sequence argument. Let bytes be an empty sequence of bytes. Let length be a's length. For 0 ≤ i < length, repeat the following steps:
  - Let element be the ith element of a.
  - If element is a DOMString, run the following substeps:
  - If element is an ArrayBufferView [TypedArrays], convert it to a sequence of byteLength bytes from the underlying ArrayBuffer, starting at the byteOffset of the ArrayBufferView [TypedArrays], and append those bytes to bytes.
  - If element is an ArrayBuffer [TypedArrays], convert it to a sequence of byteLength bytes, and append those bytes to bytes.
  - If element is a Blob, append the bytes it represents to bytes. The type of the Blob argument must be ignored.
- Let n be a new string of the same size as the fileName argument to the constructor. Copy every character from fileName to n, replacing any "/" character (U+002F SOLIDUS) with a ":" (U+003A COLON).
  Note: Underlying OS filesystems use differing conventions for file name; with constructed files, mandating UTF-16 lessens ambiguity when file names are converted to byte sequences.
- If the optional FilePropertyBag dictionary argument is used, then run the following substeps:
  - If the type member is provided and is not the empty string, let t be set to the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
  - Convert every character in t to lowercase using the "converting a string to ASCII lowercase" algorithm [WebIDL].
  - If the lastModified member is provided, let d be set to the lastModified dictionary member. If it is not provided, set d to the current date and time represented as the number of milliseconds since the Unix Epoch (which is the equivalent of Date.now() [ECMA-262]).
    Note: Since ECMA-262 Date objects convert to long long values representing the number of milliseconds since the Unix Epoch, the lastModified member could be a Date object [ECMA-262].
- Return a new File object F such that:
  - F has a readability state of OPENED.
  - F refers to the bytes byte sequence.
  - F.size is set to the number of total bytes in bytes.
  - F.name is set to n.
  - F.type is set to t.
    Note: The type t of a File is considered a parsable MIME type if the ASCII-encoded string representing the File object's type, when converted to a byte sequence, does not return undefined for the parse MIME type algorithm [MIMESNIFF].
  - F.lastModified is set to d.
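The name and lastModified handling in the File constructor steps can be sketched with two small functions. Both names are hypothetical, and the sketch assumes that converting a Date with Number() is an adequate stand-in for the conversion to long long.

```javascript
// Hypothetical helper: copies fileName, replacing "/" (U+002F) with ":" (U+003A).
function normalizeFileName(fileName) {
  return fileName.replace(/\//g, ":");
}

// Hypothetical helper: picks d from the optional dictionary,
// falling back to the current time in milliseconds since the Unix Epoch.
function resolveLastModified(options) {
  if (options && options.lastModified !== undefined) {
    // A Date converts to its millisecond timestamp.
    return Number(options.lastModified);
  }
  return Date.now();
}

var n = normalizeFileName("drafts/2013/Draft1.txt");
var d = resolveLastModified({ lastModified: new Date(5000) });
```

Here n no longer contains any "/" characters, and d is the millisecond timestamp carried by the Date object.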
6.1.1 Constructor Parameters
The File() constructor can be invoked with the parameters below:
- A fileBits sequence
  which takes any number of the following elements, and in any order:
  - ArrayBuffer [TypedArrays] elements.
  - ArrayBufferView [TypedArrays] elements.
  - DOMString [WebIDL] elements.
- A name parameter
  A DOMString [WebIDL] parameter representing the name of the file; normative conditions for this constructor parameter can be found in the File constructor steps.
- An optional FilePropertyBag dictionary
  which takes the following members:
  - An optional type member; the ASCII-encoded string in lower case representing the media type of the File. Normative conditions for this member are provided in the File constructor steps.
  - An optional lastModified member, which must be a long long; normative conditions for this member are provided in the File constructor steps.
6.2. Attributes
name
The name of the file; on getting, this must return the name of the file as a string. There are numerous file name variations and conventions used by different underlying OS file systems; this is merely the name of the file, without path information. On getting, if user agents cannot make this information available, they must return the empty string. If a File object is created using a constructor, further normative conditions for this attribute are found in the file constructor steps.
lastModified
The last modified date of the file. On getting, if user agents can make this information available, this must return a long long set to the time the file was last modified as the number of milliseconds since the Unix Epoch. If the last modification date and time are not known, the attribute must return the current date and time as a long long representing the number of milliseconds since the Unix Epoch; this is equivalent to Date.now() [ECMA-262]. If a File object is created using a constructor, further normative conditions for this attribute are found in the file constructor steps.
The File interface is available on objects that expose an attribute of type FileList; these objects are defined in HTML [HTML]. The File interface, which inherits from Blob, is immutable, and thus represents file data that can be read into memory at the time a read operation is initiated. User agents must process reads on files that no longer exist at the time of read as errors, throwing a NotFoundError exception if using a FileReaderSync on a Web Worker [Workers] or firing an error event with the error attribute returning a NotFoundError DOMError.
In the examples below, metadata from a file object is displayed meaningfully, and a file object is created with a name and a last modified date.
var file = document.getElementById("filePicker").files[0];
var date = new Date(file.lastModified);
console.log("You selected the file " + file.name + " which was modified on " + date.toDateString() + ".");
...
// Generate a file with a specific last modified date
// Note: Date months are zero-based, so a month value of 12 rolls over to January 2014
var d = new Date(2013, 12, 5, 16, 23, 45, 600);
var generatedFile = new File(["Rough Draft ...."], "Draft1.txt", {type: "text/plain", lastModified: d});
...
7. The FileList Interface
The FileList interface should be considered "at risk" since the general trend on the Web Platform is to replace such interfaces with the Array platform object in ECMAScript [ECMA-262]. In particular, this means syntax of the sort filelist.item(0) is at risk; most other programmatic use of FileList is unlikely to be affected by the eventual migration to an Array type.
This interface is a list of File objects.
Sample usage typically involves DOM access to the <input type="file"> element within a form, and then accessing selected files.
// uploadData is a form element
// fileChooser is input element of type 'file'
var file = document.forms['uploadData']['fileChooser'].files[0];

// alternative syntax can be
// var file = document.forms['uploadData']['fileChooser'].files.item(0);

if (file)
{
  // Perform file ops
}
7.1. Attributes
length
must return the number of files in the FileList object. If there are no files, this attribute must return 0.
7.2. Methods and Parameters
item(index)
must return the indexth File object in the FileList. If there is no indexth File object in the FileList, then this method must return null.
index must be treated by user agents as value for the position of a File object in the FileList, with 0 representing the first file. Supported property indices [WebIDL] are the numbers in the range zero to one less than the number of File objects represented by the FileList object. If there are no such File objects, then there are no supported property indices [WebIDL].
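The behaviour of item(index) for positions outside the list can be modelled over a plain array. This is a sketch only: the standalone item function and the use of strings in place of File objects are assumptions made for illustration.

```javascript
// Hypothetical model of FileList.item over a plain array.
function item(files, index) {
  // Supported indices run from 0 to files.length - 1; anything else yields null.
  return index >= 0 && index < files.length ? files[index] : null;
}

var names = ["a.txt", "b.txt"]; // strings stand in for File objects
var second = item(names, 1);    // "b.txt"
var missing = item(names, 2);   // null
var bracket = names[2];         // undefined on a plain array
```

The difference between null from item and undefined from indexed access on an array is one reason code written against filelist.item(0) may need care if the interface is replaced by an Array type.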
8. Reading Data
8.1 The Read Operation
The algorithm below defines a read operation, which takes a Blob and a synchronous flag as input, and reads bytes into a byte stream which is returned as the result of the read operation, or else fails along with a failure reason. Methods in this specification invoke the read operation with the synchronous flag either set or unset.
The synchronous flag determines if a read operation is synchronous or asynchronous, and is unset by default. Methods may set it. If it is set, the read operation takes place synchronously. Otherwise, it takes place asynchronously.
To perform a read operation on a Blob and the synchronous flag, run the following steps:
Let s be a new body, b be the Blob to be read from, and bytes initially set to an empty byte sequence. Set the length on s to the size of b. While there are still bytes to be read in b perform the following substeps:
Note: The algorithm assumes that invoking methods have checked for readability state. A Blob in the CLOSED state must not have a read operation called on it.
If the synchronous flag is set, follow the steps below:
Let bytes be the byte sequence that results from reading a chunk from b. If an error occurs reading a chunk from b, return s with the error flag set, along with a failure reason, and terminate this algorithm.
Note: Along with returning failure, the synchronous part of this algorithm must return the failure reason that occurred for throwing an exception by synchronous methods that invoke this algorithm with the synchronous flag set.
If there are no errors, push bytes to s, and increment s's transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.
When all the bytes of b have been read into s, return s and terminate this algorithm.
Otherwise, the synchronous flag is unset. Return s and process the rest of this algorithm asynchronously.
Let bytes be the byte sequence that results from reading a chunk from b. If an error occurs reading a chunk from b, set the error flag on s, and terminate this algorithm with a failure reason.
Note: The asynchronous part of this algorithm must signal the failure reason that occurred for asynchronous error reporting by methods expecting s and which invoke this algorithm with the synchronous flag unset.
If no error occurs, push bytes to s, and increment s's transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.
To perform an annotated task read operation on a Blob b, perform the steps below:
Perform a read operation on b with the synchronous flag unset, along with the additional steps below.
If the read operation terminates with a failure reason, queue a task to process read error with the failure reason and terminate this algorithm.
When the first chunk is being pushed to the body s during the read operation, queue a task to process read.
Once the body s from the read operation has at least one chunk read into it, or there are no chunks left to read from b, queue a task to process read data. Keep queuing tasks to process read data for every chunk read or every 50ms, whichever is least frequent.
When all of the chunks from b are read into the body s from the read operation, queue a task to process read EOF.
Use the file reading task source for all these tasks.
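The order in which an annotated task read operation queues its tasks can be sketched as a simulation. This is not a real reader: simulateAnnotatedRead, its chunks array and its failAt parameter are hypothetical names, chunks are supplied up front rather than read from storage, and the 50ms throttle is ignored so that one "process read data" task is recorded per chunk.

```javascript
// Simulates the task order of an annotated task read operation.
// chunks: array of Uint8Array; failAt: index of a chunk whose read fails (-1 = none).
function simulateAnnotatedRead(chunks, failAt = -1) {
  const tasks = [];
  const body = { bytes: [], transmitted: 0, error: false };
  for (let i = 0; i < chunks.length; i++) {
    if (i === failAt) {
      body.error = true;               // error flag set on the body
      tasks.push("process read error");
      return { body, tasks };          // the algorithm terminates here
    }
    if (i === 0) tasks.push("process read"); // first chunk is being pushed
    body.bytes.push(...chunks[i]);
    body.transmitted += chunks[i].length;
    tasks.push("process read data");   // one per chunk; 50ms throttle ignored
  }
  // Assumption: with no chunks at all, a single "process read data" task is
  // still queued, following a literal reading of the steps above.
  if (chunks.length === 0) tasks.push("process read data");
  tasks.push("process read EOF");
  return { body, tasks };
}

const run = simulateAnnotatedRead([new Uint8Array([1, 2]), new Uint8Array([3])]);
console.log(run.tasks.join(" > "));
```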
8.2. The File Reading Task Source
This specification defines a new generic task source called the file reading task source, which is used for all tasks that are queued in this specification to read byte sequences associated with Blob
and File
objects. It is to be used for features that trigger in response to asynchronously reading binary data.
8.3 The FileReader API
[Constructor, Exposed=Window,Worker]
interface FileReader: EventTarget {
// async read methods
void readAsArrayBuffer(Blob blob);
void readAsText(Blob blob, optional DOMString label);
void readAsDataURL(Blob blob);
void abort();
// states
const unsigned short EMPTY = 0;
const unsigned short LOADING = 1;
const unsigned short DONE = 2;
readonly attribute unsigned short readyState;
// File or Blob data
readonly attribute (DOMString or ArrayBuffer)? result;
readonly attribute DOMError? error;
// event handler attributes
attribute EventHandler onloadstart;
attribute EventHandler onprogress;
attribute EventHandler onload;
attribute EventHandler onabort;
attribute EventHandler onerror;
attribute EventHandler onloadend;
};
8.3.1. Constructors
When the FileReader()
constructor is invoked, the user agent must return a new FileReader
object.
In environments where the global object is represented by a Window
or a WorkerGlobalScope
object, the FileReader
constructor
must be available.
8.3.2. Event Handler Attributes
The following are the event
handler attributes (and their corresponding event
handler event types) that user agents must support on
FileReader
as
DOM attributes:
event handler attribute | event handler event type |
---|---|
onloadstart | loadstart |
onprogress | progress |
onabort | abort |
onerror | error |
onload | load |
onloadend | loadend |
8.3.3. FileReader States
The FileReader
object can be in one of 3 states. The
readyState
attribute, on getting,
must return the current state, which must be one of the following values:
EMPTY (numeric value 0)
The FileReader object has been constructed, and there are no pending reads. None of the read methods has been called. This is the default state of a newly minted FileReader object, until one of the read methods has been called on it.
LOADING (numeric value 1)
A File or Blob is being read. One of the read methods is being processed, and no error has occurred during the read.
DONE (numeric value 2)
The entire File or Blob has been read into memory, OR a file error occurred during read, OR the read was aborted using abort(). The FileReader is no longer reading a File or Blob. If readyState is set to DONE it means at least one of the read methods has been called on this FileReader.
8.3.4. Reading a File or Blob
Multiple Reads
The FileReader
interface makes available three asynchronous read methods - readAsArrayBuffer
, readAsText
, and readAsDataURL
, which read files into memory. If multiple concurrent read methods are called on the same FileReader
object, user agents must throw an InvalidStateError
[DOM4] on any of the read methods that occur when readyState
= LOADING
.
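The guard against concurrent reads can be illustrated with a toy state holder. This is a sketch only: ReaderStateSketch and its beginRead and finish methods are hypothetical names that model readyState and nothing else, and a plain Error with its name set stands in for the real exception type.

```javascript
// A toy state holder, not the real FileReader: it only models readyState.
const EMPTY = 0, LOADING = 1, DONE = 2;

class ReaderStateSketch {
  constructor() {
    this.readyState = EMPTY;
  }
  // Called at the start of any read method.
  beginRead() {
    if (this.readyState === LOADING) {
      const err = new Error("a read is already in progress");
      err.name = "InvalidStateError"; // stand-in for the real exception
      throw err;
    }
    this.readyState = LOADING;
  }
  // Called when the read completes, fails, or is aborted.
  finish() {
    this.readyState = DONE;
  }
}

const sketch = new ReaderStateSketch();
sketch.beginRead();        // EMPTY -> LOADING
try {
  sketch.beginRead();      // second read while LOADING
} catch (e) {
  console.log(e.name);     // "InvalidStateError"
}
```

A read started from the DONE state is allowed, which is why a finished reader can be reused for a later read.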
8.3.4.1. The result
attribute
On getting, the result
attribute returns a Blob
's data as a DOMString
, or as
an ArrayBuffer
[TypedArrays], or null
, depending on the read method
that has been called on the FileReader
, and any errors that may have occurred.
The list below is normative for the result
attribute and is the conformance criteria for this attribute:
On getting, if the readyState is EMPTY (no read method has been called) then the result attribute must return null.
On getting, if an error in reading the File or Blob has occurred (using any read method), then the result attribute must return null.
On getting, if the readAsDataURL read method is used, the result attribute must return a DOMString that is a Data URL [DataURL] encoding of the File or Blob's data.
On getting, if the readAsText read method is called and no error in reading the File or Blob has occurred, then the result attribute must return a string representing the File or Blob's data as a text string, and should decode the string into memory in the format specified by the encoding determination as a DOMString.
On getting, if the readAsArrayBuffer read method is called and no error in reading the File or Blob has occurred, then the result attribute must return an ArrayBuffer [TypedArrays] object.
8.3.4.2. The readAsDataURL(blob)
method
When the readAsDataURL(blob)
method is called, the user agent must run the steps below.
If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate this algorithm.
Note: The readAsDataURL() method returns due to the algorithm being terminated.
If the blob is in the CLOSED readability state, set the error attribute of the context object to return an InvalidStateError DOMError and fire a progress event called error at the context object. Terminate this algorithm.
Otherwise set readyState to LOADING.
Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
To process read error with a failure reason, proceed to the error steps.
To process read fire a progress event called loadstart at the context object.
To process read data fire a progress event called progress at the context object.
To process read EOF run these substeps:
- Set readyState to DONE.
- Set the result attribute to the body returned by the read operation as a DataURL [DataURL]; on getting, the result attribute returns the blob as a Data URL [DataURL].
  - Use the blob's type attribute as part of the Data URL if it is available in keeping with the Data URL specification [DataURL].
  - If the type attribute is not available on the blob return a Data URL without a media-type [DataURL]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [DataURL].
- Fire a progress event called load at the context object.
- Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.
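The Data URL construction in the substeps above can be sketched as follows. The helper name toDataURL is hypothetical, the bytes are passed in directly rather than coming from a read operation, and base64 is assumed as the encoding. When no type is given, the sketch omits the media type as the text above describes; shipping user agents may produce something different.

```javascript
// Builds a base64 Data URL from raw bytes and an optional media type.
// toDataURL is a hypothetical helper, not part of the File API.
function toDataURL(bytes, type) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one char per byte, as btoa expects
  }
  const mediaType = type || "";       // empty when the type is not available
  return "data:" + mediaType + ";base64," + btoa(binary);
}

const hi = new Uint8Array([0x68, 0x69]); // the bytes of "hi"
console.log(toDataURL(hi, "text/plain")); // "data:text/plain;base64,aGk="
console.log(toDataURL(hi, ""));           // "data:;base64,aGk="
```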
8.3.4.3. The readAsText(blob, label)
method
The readAsText()
method can be called with an optional parameter, label
, which is a DOMString
argument that represents the label of an encoding [Encoding Specification]; if provided, it must be used as part of the encoding determination used when processing this method call.
When the readAsText(blob, label)
method is called (the label
argument is optional),
the user agent must run the steps below.
If readyState = LOADING throw an InvalidStateError [DOM4] and terminate these steps.
Note: The readAsText() method returns due to the algorithm being terminated.
If the blob is in the CLOSED readability state, set the error attribute of the context object to return an InvalidStateError DOMError and fire a progress event called error at the context object. Terminate this algorithm.
Otherwise set readyState to LOADING.
Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
To process read error with a failure reason, proceed to the error steps.
To process read fire a progress event called loadstart at the context object.
To process read data fire a progress event called progress at the context object.
To process read EOF run these substeps:
- Set readyState to DONE.
- Set the result attribute to the body returned by the read operation, represented as a string in a format determined by the encoding determination.
- Fire a progress event called load at the context object.
- Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.
8.3.4.4. The readAsArrayBuffer(blob)
method
When the readAsArrayBuffer(blob)
method is called, the user agent must run the steps below.
If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate these steps.
Note: The readAsArrayBuffer() method returns due to the algorithm being terminated.
If the blob is in the CLOSED readability state, set the error attribute of the context object to return an InvalidStateError DOMError and fire a progress event called error at the context object. Terminate this algorithm.
Note: The readAsArrayBuffer() method returns due to the algorithm being terminated.
Otherwise set readyState to LOADING.
Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
To process read error with a failure reason, proceed to the error steps.
To process read fire a progress event called loadstart at the context object.
To process read data fire a progress event called progress at the context object.
To process read EOF run these substeps:
- Set readyState to DONE.
- Set the result attribute to the body returned by the read operation as an ArrayBuffer [TypedArrays] object.
- Fire a progress event called load at the context object.
- Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.
8.3.4.5. Error Steps
These error steps are to process read error with a failure reason.
- Set the context object's readyState to DONE and result to null if it is not already set to null.
- Set the error attribute on the context object; on getting, the error attribute must be a DOMError object that corresponds to the failure reason. Fire a progress event called error at the context object.
- Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.
- Terminate the algorithm for any read method.
Note: The read method returns due to the algorithm being terminated.
8.3.4.6. The abort() method
When the abort()
method is called, the user agent must run the steps below:
If readyState = EMPTY or if readyState = DONE set result to null and terminate this algorithm.
If readyState = LOADING set readyState to DONE and result to null.
If there are any tasks from the context object on the file reading task source in an affiliated task queue, then remove those tasks from that task queue.
Terminate the algorithm for the read method being processed.
Fire a progress event called abort.
Fire a progress event called loadend.
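The abort() steps can be traced with a small model. The object shape used here (readyState, result, pendingTasks) and the names abortSketch and READY are assumptions made for illustration; a real FileReader exposes no task list and fires real ProgressEvent objects rather than returning event names.

```javascript
// Numeric states as defined on the FileReader interface.
const READY = { EMPTY: 0, LOADING: 1, DONE: 2 };

// Applies the abort() steps to a plain object and returns the names of the
// events that would be fired, in order. abortSketch is a hypothetical helper.
function abortSketch(reader) {
  const fired = [];
  if (reader.readyState === READY.EMPTY || reader.readyState === READY.DONE) {
    reader.result = null;
    return fired;                  // terminate: no events are fired
  }
  reader.readyState = READY.DONE;
  reader.result = null;
  reader.pendingTasks.length = 0;  // remove queued file reading tasks
  fired.push("abort", "loadend");
  return fired;
}

const loading = { readyState: READY.LOADING, result: "partial", pendingTasks: ["process read data"] };
console.log(abortSketch(loading)); // [ 'abort', 'loadend' ]
```

Calling abortSketch on an object in the EMPTY or DONE state returns an empty list, matching the first step, where the algorithm terminates before any event is fired.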
8.3.4.7. Blob Parameters
The three asynchronous read methods, the three synchronous read methods, URL.createObjectURL
and URL.createFor
take a
Blob
parameter. This section defines this parameter.
8.4. Determining Encoding
When reading blob
objects using the readAsText()
read method, the following encoding determination steps must be followed:
Let encoding be null.
If the label argument is present when calling the method, set encoding to the result of "getting an encoding" using the Encoding Specification [Encoding Specification] for label.
If the "getting an encoding" steps above return failure, then set encoding to null.
If encoding is null, and the blob argument's type attribute is present, and it uses a Charset Parameter [RFC2046], set encoding to the result of "getting an encoding" using the Encoding Specification [Encoding Specification] for the portion of the Charset Parameter that is a label of an encoding.
If the "getting an encoding" steps above return failure, then set encoding to null.
If encoding is null, then set encoding to utf-8.
"Decode" [Encoding Specification] this blob using fallback encoding encoding, and return the result. On getting, the result attribute of the FileReader object returns a string in encoding format. The synchronous readAsText method of the FileReaderSync object returns a string in encoding format.
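The encoding determination steps can be approximated with TextDecoder. This is a sketch under stated assumptions: getEncoding, determineEncoding and decodeBytes are hypothetical helpers, the TextDecoder constructor stands in for "getting an encoding" (it throws a RangeError for unknown labels), the charset parameter is extracted with a simple regular expression, and the byte order mark handling of the "decode" algorithm is not reproduced exactly.

```javascript
// Approximates "getting an encoding": returns the encoding name, or null on failure.
function getEncoding(label) {
  try {
    return new TextDecoder(label).encoding;
  } catch (e) {
    return null; // unknown or unsupported label
  }
}

// Follows the encoding determination steps for a label and a blob type string.
function determineEncoding(label, type) {
  let encoding = null;
  if (label !== undefined) {
    encoding = getEncoding(label);
  }
  if (encoding === null && type) {
    // Simplified extraction of the charset parameter (an assumption).
    const match = /;\s*charset\s*=\s*"?([^\s;"]+)"?/i.exec(type);
    if (match) encoding = getEncoding(match[1]);
  }
  if (encoding === null) encoding = "utf-8";
  return encoding;
}

// Decodes bytes with the determined encoding as the fallback.
function decodeBytes(bytes, label, type) {
  return new TextDecoder(determineEncoding(label, type)).decode(bytes);
}

console.log(determineEncoding(undefined, "text/plain;charset=utf-16le")); // "utf-16le"
console.log(determineEncoding("bogus-label", "text/plain"));              // "utf-8"
```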
8.5.10. Events
The FileReader
object must be the event target for all events in this specification.
When this specification says to fire a progress event called e (for some
ProgressEvent
e
at a given FileReader
reader
as the context object),
the following are normative:
The progress event e does not bubble. e.bubbles must be false [DOM4].
The progress event e is NOT cancelable. e.cancelable must be false [DOM4].
The term "fire an event" is defined in DOM Core [DOM4]. Progress Events are defined in Progress Events [ProgressEvents].
8.5.10.1. Event Summary
The following are the events that are fired at FileReader
objects; firing events is defined in
DOM Core [DOM4].
Event name | Interface | Fired when… |
---|---|---|
loadstart | ProgressEvent | When the read starts. |
progress | ProgressEvent | While reading (and decoding) blob |
abort | ProgressEvent | When the read has been aborted. For instance, by invoking the abort() method. |
error | ProgressEvent | When the read has failed (see errors). |
load | ProgressEvent | When the read has successfully completed. |
loadend | ProgressEvent | When the request has completed (either in success or failure). |
8.5.10.2. Summary of Event Invariants
This section is informative. The following are invariants applicable to event firing for a given asynchronous read method in this specification:
Once a loadstart has been fired, a corresponding loadend fires at completion of the read, EXCEPT if
- the read method has been cancelled using abort() and a new read method has been invoked;
- the event handler function for a load event initiates a new read;
- the event handler function for an error event initiates a new read.
Example: This example showcases "read-chaining", namely initiating another read from within an event handler while the "first" read continues processing.
// In code of the sort...
reader.readAsText(file);
reader.onload = function(){reader.readAsText(alternateFile);}
// ...
//... the loadend event must not fire for the first read
// The abort handler is registered before abort() is called,
// because abort() fires the abort event synchronously.
reader.onabort = function(){reader.readAsText(updatedFile);}
reader.readAsText(file);
reader.abort();
//... the loadend event must not fire for the first read
One progress event will fire when blob has been completely read into memory.
No progress event fires after any one of abort, load, and error has fired. At most one of abort, load, and error fires for a given read.
8.6 Reading on Threads
Web Workers allow for the use of synchronous File
or Blob
read APIs,
since such reads on threads do not block the main thread.
This section defines a synchronous API, which can be used within Workers [Web Workers]. Workers can avail of both the asynchronous API (the
FileReader
object) and the synchronous API (the FileReaderSync
object).
8.6.1. The FileReaderSync
API
This interface provides methods to synchronously read File
or Blob
objects into memory.
[Constructor, Exposed=Worker]
interface FileReaderSync {
// Synchronously return the data that was read
ArrayBuffer readAsArrayBuffer(Blob blob);
DOMString readAsText(Blob blob, optional DOMString label);
DOMString readAsDataURL(Blob blob);
};
8.6.1.1. Constructors
When the FileReaderSync()
constructor is invoked, the user agent must return a new FileReaderSync
object.
In environments where the global object is represented by a WorkerGlobalScope
object, the FileReaderSync
constructor must be available.
8.6.1.2. The readAsText
method
When the readAsText(blob, label)
method is called (the
label
argument is optional), the following steps must be followed:
If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate these steps.
Note: The method returns due to the algorithm being terminated.
If the blob has been closed through the close method, throw an InvalidStateError exception [DOM4] and terminate this algorithm.
Note: The method returns due to the algorithm being terminated.
Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception. Terminate this algorithm.
If no error has occurred, return the result of the read operation represented as a string in a format determined through the encoding determination algorithm.
8.6.1.3. The readAsDataURL
method
When the readAsDataURL(blob)
method is called, the following steps must be followed:
If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate these steps.
Note: The method returns due to the algorithm being terminated.
If the blob has been closed through the close method, throw an InvalidStateError exception [DOM4] and terminate this algorithm.
Note: The method returns due to the algorithm being terminated.
Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception. Terminate this algorithm.
If no error has occurred, return the result of the read operation as a Data URL [DataURL] subject to the considerations below.
- Use the blob's type attribute as part of the Data URL if it is available in keeping with the Data URL specification [DataURL].
- If the type attribute is not available on the blob return a Data URL without a media-type [DataURL]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [DataURL].
8.6.1.4. The readAsArrayBuffer
method
When the readAsArrayBuffer(blob)
method is called, the following steps must be followed:
If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate these steps.
Note: The method returns due to the algorithm being terminated.
If the blob has been closed through the close method, throw an InvalidStateError exception [DOM4] and terminate this algorithm.
Note: The method returns due to the algorithm being terminated.
Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception. Terminate this algorithm.
If no error has occurred, return the result of the read operation as an ArrayBuffer [TypedArrays].
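Because the synchronous methods report failure by throwing, callers inside a worker would typically wrap them in try/catch. The sketch below assumes a hypothetical wrapper, readTextSync, that takes the reader constructor as a parameter so the same code can be exercised outside a worker with a stand-in; the "FileReaderSyncUnavailable" marker is likewise invented for illustration.

```javascript
// Wraps a synchronous text read and reports either the text or the error name.
// ReaderCtor defaults to the worker-only FileReaderSync when it exists.
function readTextSync(blob, label, ReaderCtor = globalThis.FileReaderSync) {
  if (typeof ReaderCtor !== "function") {
    // Hypothetical marker: FileReaderSync is only exposed in workers.
    return { text: null, errorName: "FileReaderSyncUnavailable" };
  }
  const reader = new ReaderCtor();
  try {
    const text = label === undefined
      ? reader.readAsText(blob)
      : reader.readAsText(blob, label);
    return { text, errorName: null };
  } catch (e) {
    // For example "NotFoundError" or "NotReadableError".
    return { text: null, errorName: e.name };
  }
}
```

Inside a worker, readTextSync(file) would use the real FileReaderSync; elsewhere a stand-in class with a readAsText method can be passed as the third argument.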
9. Errors and Exceptions
Error conditions can occur when reading files from the underlying filesystem. The list below of potential error conditions is informative.
The File or Blob being accessed may not exist at the time one of the asynchronous read methods or synchronous read methods is called. This may be due to it having been moved or deleted after a reference to it was acquired (e.g. concurrent modification with another application). See NotFoundError.
A File or Blob may be unreadable. This may be due to permission problems that occur after a reference to a File or Blob has been acquired (e.g. concurrent lock with another application). Additionally, the snapshot state may have changed. See NotReadableError.
User agents MAY determine that some files are unsafe for use within Web applications. A file may change on disk since the original file selection, thus resulting in an invalid read. Additionally, some file and directory structures may be considered restricted by the underlying filesystem; attempts to read from them may be considered a security violation. See the security considerations. See SecurityError.
9.1. Throwing an Exception or Returning an Error
This section is normative. Error conditions can arise when reading a File
or a Blob
.
The read operation can terminate due to error conditions when reading a File
or a Blob
; the particular error condition that causes a read operation to return failure or queue a task to process read error is called a failure reason.
Synchronous read methods throw exceptions of the type in the table below if there has been an error owing to a particular failure reason.
Asynchronous read methods use the error
attribute of the FileReader
object, which must return a DOMError object [DOM4] of the most appropriate type from the table below if there has been an error owing to a particular failure reason, or otherwise return null.
Type | Description and Failure Reason |
---|---|
NotFoundError | If the File or Blob resource could not be found at the time the read was processed, this is the NotFound failure reason. For asynchronous read methods the error attribute must return a "NotFoundError" DOMError and synchronous read methods must throw a NotFoundError exception. |
SecurityError
| If:
For asynchronous read methods the This is a security error to be used in situations not covered by any other failure reason. |
NotReadableError
| If:
For asynchronous read methods the |
10. A URL for Blob and File reference
This section defines a scheme for a URL used to refer to Blob
objects (and File
objects).
10.1. Requirements for a New Scheme
This specification defines a scheme with URLs of the sort: blob:550e8400-e29b-41d4-a716-446655440000#aboutABBA
.
This section provides some requirements and is an informative discussion.
This scheme should be able to be used with web APIs such as XMLHttpRequest [XHR], and with elements that are designed to be used with HTTP URLs, such as the img element [HTML]. In general, this scheme should be designed to be used wherever URLs can be used on the web.
This scheme should have defined response codes, so that web applications can respond to scenarios where the resource is not found, or raises an error, etc.
This scheme should have an origin policy and a lifetime stipulation, to allow safe access to binary data from web applications.
URLs in this scheme should be used as references to "in-memory" Blobs, and also be re-used elsewhere on the platform to refer to binary resources (e.g. for video-conferencing [WebRTC]). URLs in this scheme are designed for impermanence, since they will be typically used to access "in memory" resources.
Developers should have the ability to revoke URLs in this scheme, so that they no longer refer to Blob objects. This includes scenarios where file references are no longer needed by a program, but also other uses of Blob objects. Consider a scenario where a Blob object can be exported from a drawing program which uses the canvas element and API [HTML]. A snapshot of the drawing can be created by exporting a Blob. This scheme can be used with the img [HTML] element to display the snapshot; if the user deletes the snapshot, any reference to the snapshot in memory via a URL should be invalid, and hence the need to be able to revoke such a URL.
10.2. Discussion of Existing Schemes
This section is an informative discussion of existing schemes that may have been repurposed or reused for the use cases for URLs above, and justification for why a new scheme is considered preferable. These schemes include HTTP [RFC7230], file [RFC1630][RFC1738], and a scheme such as urn:uuid [RFC4122]. One broad consideration in determining what scheme to use is providing something with intuitive appeal to web developers.
HTTP could be repurposed for the use cases mentioned above; it already comes with well-defined request-response semantics that are already used by web applications. But Blob resources are typically "in-memory" resident (e.g. after a file has been read into memory), and are thus unlike "traditional" HTTP resources that are dereferenced via DNS. While some user agents automatically "proxy" the underlying file system on the local machine via an HTTP server (e.g. with URLs of the sort https://localhost), HTTP is not traditionally used with local resources. Moreover, an important use case for these URLs is that they can be revoked with an API call. HTTP URLs have traditionally been used for resources that may be more permanent (and that are certainly not chiefly memory-resident, such as files that a web application can read). Reusing the HTTP scheme might be confusing for web developers owing to well-established practice.
The reuse of file URLs would involve changes to file URL use today, such as adding response codes. While they are used inconsistently in web applications, the structure of the URLs would change, and request-response behavior would have to be superimposed on what already works in a more-or-less ad-hoc manner. Modifying this for the use cases cited above is imprudent, given legacy usage. Additionally, the use cases for a Blob URL scheme call for uses beyond the file system.
A scheme of the sort urn:uuid [RFC4122] could be used, though use of this scheme is unprecedented in HTML and JavaScript web applications. The urn:uuid scheme is very generic. URLs in the scheme urn:uuid have the disadvantage of unfamiliarity and inconsistency across the web platform. A new scheme has the advantage of being explicit about what is being referenced. In theory, URLs make no guarantee about what sort of resource is obtained when they are dereferenced; that is left to content labeling and media type. But in practice, the name of the scheme creates an expectation about both the resource and the protocol of the request-response transaction. Choosing a name that clarifies the primary use case - namely, access to memory-resident Blob resources - is a worthwhile compromise, and favors clarity, familiarity, and consistency across the web platform.
10.3. The Blob URL
A Blob URL must consist of the blob: scheme followed by scheme data which must consist of a string tuple comprising the origin of the Blob URL, the "/" (U+002F SOLIDUS) character, and a UUID [RFC4122; for an ABNF of UUID, see Appendix A]. A Blob URL may contain an optional fragment. The Blob URL is serialized as a string according to the Unicode Serialization of a Blob URL algorithm.
A fragment, if used,
has a distinct interpretation depending on the media type of the Blob
or File
resource in question (see fragment discussion).
blob = scheme ":" origin "/" UUID [fragIdentifier]
scheme = "blob"
; scheme is always "blob"
; origin is a string representation of the Blob URL's origin.
; UUID is as defined in [RFC4122] and Appendix A
; fragIdentifier is optional and as defined in [RFC3986] and Appendix A
An example of a Blob URL might be blob:https://example.org/9115d58c-bcda-ff47-86e5-083e9a215304.
10.3.1. Origin of Blob URLs
Blob URLs are created using URL.createObjectURL
and/or URL.createFor
, and are revoked using URL.revokeObjectURL
. The origin of a Blob URL must be the same as the effective script origin specified by the incumbent settings object at the time the method that created it -- either URL.createObjectURL
or URL.createFor
-- was called. The origin of a Blob URL, when serialized as a string, must conform to the Web Origin Specification's Unicode Serialization of an Origin algorithm [ORIGIN]. Cross-origin requests on Blob URLs must return a network error.
In practice this means that HTTP and HTTPS origins are covered by this specification as valid origins for use with Blob URLs. This specification does not address the case of non-HTTP and non-HTTPS origins. For instance blob:file:///Users/arunranga/702efefb-c234-4988-a73b-6104f1b615ee (which uses the "file:" origin, which is part of the Blob URL's scheme data) may have behavior that is undefined, even though user agents may treat such Blob URLs as valid.
10.3.2. Unicode Serialization of a Blob URL
The Unicode Serialization of a Blob URL is the value returned by the following algorithm, which is invoked by URL.createObjectURL
and/or URL.createFor
:
Let result be the empty string. Append the string "blob" (that is, the Unicode code point sequence U+0062, U+006C, U+006F, U+0062) to result.
Append the ":" (U+003A COLON) character to result.
Let O be the origin of the Blob URL. If the Unicode Serialization of an Origin algorithm [ORIGIN] on O returns null, user agents may substitute an implementation defined value for the return value of the Unicode Serialization of an Origin algorithm. Append the result of the Unicode Serialization of an Origin algorithm for O [ORIGIN] to result.
Append the "/" character (U+002F SOLIDUS) to result.
Generate a UUID [RFC4122] as a Unicode string and append it to result.
Return result.
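The serialization steps above translate almost line for line into code. In this sketch serializeBlobURL is a hypothetical helper; the origin is assumed to be already serialized as a string, and the UUID is passed in rather than generated so that the output is predictable.

```javascript
// Mirrors the Unicode Serialization of a Blob URL, step by step.
// origin: an already serialized origin string; uuid: a UUID string.
function serializeBlobURL(origin, uuid) {
  let result = "";
  result += "blob";   // U+0062, U+006C, U+006F, U+0062
  result += ":";      // U+003A COLON
  result += origin;   // serialized origin of the Blob URL
  result += "/";      // U+002F SOLIDUS
  result += uuid;     // the generated UUID
  return result;
}

console.log(serializeBlobURL("https://example.org", "9115d58c-bcda-ff47-86e5-083e9a215304"));
// "blob:https://example.org/9115d58c-bcda-ff47-86e5-083e9a215304"
```

In an environment that provides it, a call such as crypto.randomUUID() could supply the UUID argument.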
10.3.3. Discussion of Fragment Identifier
The fragment's resolution and processing directives depend on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the Blob URL is dereferenced. For example, in an HTML file [HTML] the fragment could be used to refer to an anchor within the file. If the user agent does not recognize the media type of the resource, OR if a fragment is not meaningful within the resource, it must ignore the fragment. The fragment must not be used to identify a resource; only the blob: scheme and the scheme data constitute a valid resource identifier.
A valid Blob URL reference could look like: blob:https://example.org:8080/550e8400-e29b-41d4-a716-446655440000#aboutABBA
where "#aboutABBA" might be an HTML fragment identifier referring to an
element with an id attribute of "aboutABBA".
The fragment is not used to identify the resource. Neither the URL.createObjectURL
method nor the URL.createFor
method generate a fragment.
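To illustrate that only the scheme and scheme data identify the resource, the following sketch splits a Blob URL string into its parts. The pattern and the splitBlobURL helper are assumptions made for illustration; they are not the normative parsing rules, which belong to the URL specification.

```javascript
// Matches blob:<origin>/<UUID> with an optional #fragment (illustrative only).
const BLOB_URL_PATTERN =
  /^blob:(.+)\/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})(?:#(.*))?$/i;

// Returns the parts of a Blob URL string, or null if it does not match.
function splitBlobURL(url) {
  const match = BLOB_URL_PATTERN.exec(url);
  if (!match) return null;
  return {
    origin: match[1],
    uuid: match[2],
    fragment: match[3] === undefined ? null : match[3],
    // The identifier of the resource: the fragment is never part of it.
    resource: "blob:" + match[1] + "/" + match[2],
  };
}

const parts = splitBlobURL(
  "blob:https://example.org:8080/550e8400-e29b-41d4-a716-446655440000#aboutABBA"
);
console.log(parts.fragment); // "aboutABBA"
console.log(parts.resource); // "blob:https://example.org:8080/550e8400-e29b-41d4-a716-446655440000"
```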
10.4. Dereferencing Model for Blob URLs
The URL and Fetch Specifications should be considered normative for parsing and fetching Blob URLs. The section below is for informational purposes only, and is not considered normative.
Blob
URLs are dereferenced when the user agent retrieves the resource identified by the Blob URL and returns it to the requesting entity. This section provides guidance on requests and responses.
Only requests with GET [RFC7231] are supported. Specifically, responses are only a subset of the following from HTTP [RFC7231]:
10.4.1. 200 OK
This response is used if the request has succeeded, and no network errors are generated.
10.4.2. Response Headers
Along with 200 OK responses, user agents use a Content-Type header [RFC7231] that is equal to the value of the Blob
's type
attribute, if it is not the empty string.
Along with 200 OK responses, user agents use a Content-Length header [RFC7230] that is equal to the value of the Blob
's size
attribute.
If a Content-Type header [RFC7231] is provided, then user agents obtain and process that media type in a manner consistent with the MIME Sniffing specification [MIMESNIFF].
If a resource identified with a Blob URL is a File
object, user agents use that file's name
attribute, as if the response had a Content-Disposition
header with the filename parameter set to the File
's name
attribute [RFC6266].
A corollary to this is a non-normative implementation guideline: that the "File Save As" user interface in user agents takes into account the File
's name
attribute if presenting a default name to save the file as.
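The header rules above can be summarized in a short sketch. It assumes a plain object with `type` and `size` properties (and `name` for a File); the function `describeBlobResponse` and the `suggestedFilename` field are hypothetical names chosen for illustration.

```javascript
// Hypothetical function: builds the header set described above for a
// successful dereference. `blob` is any object with `type` and `size`;
// a File additionally has `name`.
function describeBlobResponse(blob) {
  // Content-Length always reflects the size attribute.
  const headers = { "Content-Length": String(blob.size) };
  // Content-Type is only used when the type attribute is not empty.
  if (blob.type !== "") {
    headers["Content-Type"] = blob.type;
  }
  const response = { status: 200, statusText: "OK", headers };
  if (typeof blob.name === "string") {
    // Treated as if a Content-Disposition filename parameter were present.
    response.suggestedFilename = blob.name;
  }
  return response;
}
```

The file name is kept as a separate field rather than a literal header string, since the text above only says that user agents act as if such a header were present.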
10.4.3. Network Errors
Responses that do not succeed with a 200 OK act as if a network error has occurred [Fetch]. Network errors are used when:
Any request method other than GET is used.
The Blob URL does not have an entry in the Blob URL Store.
The Blob has been closed; this also results in the Blob URL not having an entry in the Blob URL Store.
A cross-origin request is made on a Blob URL.
A security error has occurred.
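The conditions above can be sketched as a single decision function. This is an informal model under stated assumptions: the Blob URL Store is represented as a `Map` whose entries record the `blob` and the `origin` that minted the URL, the name `dereferenceBlobURL` is hypothetical, and other security errors are not modeled.

```javascript
// Hypothetical model: `store` is a Map from Blob URL strings to entries of
// the form { blob, origin }. Returns either a network error or a 200 result.
function dereferenceBlobURL(store, method, url, requestOrigin) {
  const networkError = { type: "error" };
  // Any request method other than GET is a network error.
  if (method !== "GET") return networkError;
  // A missing entry covers revoked URLs and closed Blobs alike.
  const entry = store.get(url);
  if (entry === undefined) return networkError;
  // Cross-origin requests on a Blob URL are a network error.
  if (entry.origin !== requestOrigin) return networkError;
  return { type: "ok", status: 200, blob: entry.blob };
}
```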
10.4.4. Sample Request and Response Headers
This section is informative.
This section provides sample exchanges between web applications and user agents using Blob URLs. A request can be triggered using HTML markup of the sort <img src="blob:https://example.org:8080/550e8400-e29b-41d4-a716-446655440000">
. These examples merely illustrate the request and response; web developers are not likely to interact with all the headers, but the getAllResponseHeaders()
method of XMLHttpRequest
, if used, will show relevant response headers [XHR].
Requests could look like this:
GET blob:https://example.org:8080/550e8400-e29b-41d4-a716-446655440000
If the blob
has an affiliated media type [RFC2046] represented by its type
attribute, then the response message should include the Content-Type header from RFC7231 [RFC7231]. See processing media types.
200 OK
Content-Type: image/jpeg
Content-Length: 21897
....
If there is a file error associated with the blob
, then a user agent acts as if a network error has occurred.
10.5. Creating and Revoking a Blob URL
Blob
URLs are created and revoked using methods exposed on the URL
object, supported by
global objects Window
[HTML] and WorkerGlobalScope
[Web Workers]. Revocation of a Blob
URL
decouples the Blob
URL from the resource it refers to, and if it is dereferenced after it is revoked, user agents must act as if a network error has occurred.
This section describes a supplemental interface to the URL specification [URL API] and presents methods for Blob
URL creation and revocation.
partial interface URL {
static DOMString createObjectURL(Blob blob);
static DOMString createFor(Blob blob);
static void revokeObjectURL(DOMString url);
};
ECMAScript user agents of this specification must ensure that they do not expose a prototype
property on the URL interface
object unless the user agent also implements the URL [URL API] specification. In other words, URL.prototype
must
evaluate to true if the user agent implements the URL [URL API] specification, and must NOT evaluate to true otherwise.
10.5.1. Methods and Parameters
- The createObjectURL static method returns a unique Blob URL. This method must act as follows:
If called with a Blob argument that has a readability state of CLOSED, user agents must return the output of the Unicode Serialization of a Blob URL.
Note: No entry is added to the Blob URL Store; consequently, when this Blob URL is dereferenced, a network error occurs.
Otherwise, user agents must run the following sub-steps:
- Let url be the result of the Unicode Serialization of a Blob URL algorithm.
- Add an entry to the Blob URL Store for url and blob.
- Return url.
- The createFor static method returns a unique Blob URL; Blob URLs created with this method are said to be auto-revoking since user agents are responsible for the revocation of Blob URLs created with this method, subject to the lifetime stipulation for Blob URLs. This method must act as follows:
If called with a Blob argument that has a readability state of CLOSED, user agents must return the output of the Unicode Serialization of a Blob URL.
Note: No entry is added to the Blob URL Store; consequently, when this Blob URL is dereferenced, a network error occurs.
Otherwise, user agents must run the following steps:
- Let url be the result of the Unicode Serialization of a Blob URL.
- Add an entry to the Blob URL Store for url and blob.
- Add an entry to the Revocation List for url.
- Return url.
Example: In the example below, after obtaining a reference to a Blob object (in this case, a user-selected File from the underlying file system), the static method URL.createObjectURL() is called on that Blob object.
var file = document.getElementById('file').files[0];
if (file) {
  var blobURLref = window.URL.createObjectURL(file);
  myimg.src = blobURLref;
  ....
}
- The revokeObjectURL static method revokes the Blob URL provided in the string url by removing the corresponding entry from the Blob URL Store. This method must act as follows:
If the url refers to a Blob that has a readability state of CLOSED, OR if the value provided for the url argument is not a Blob URL, OR if the value provided for the url argument does not have an entry in the Blob URL Store, this method call does nothing. User agents may display a message on the error console.
Otherwise, user agents must remove the entry from the Blob URL Store for url.
Note: Subsequent attempts to dereference url result in a network error, since the entry has been removed from the Blob URL Store.
The url argument to the revokeObjectURL method is a Blob URL string.
Example: In the example below, window1 and window2 are separate, but in the same origin; window2 could be an iframe [HTML] inside window1.
myurl = window1.URL.createObjectURL(myblob);
window2.URL.revokeObjectURL(myurl);
Since window1 and window2 are in the same origin and share the same Blob URL Store, the URL.revokeObjectURL call ensures that subsequent dereferencing of myurl results in the user agent acting as if a network error has occurred.
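The three methods described in this section can be modeled informally as operations on a store and a revocation list. The sketch below is an assumption-laden illustration rather than a definition of user agent behavior: `BlobURLStoreModel` and `mintURL` are hypothetical names, a counter stands in for the UUID, and the `isClosed` property is assumed here as a stand-in for a readability state of CLOSED.

```javascript
// Hypothetical model of the three static methods.
class BlobURLStoreModel {
  constructor(serializedOrigin) {
    this.origin = serializedOrigin;
    this.store = new Map();          // Blob URL Store: url -> blob
    this.revocationList = new Set(); // auto-revoking Blob URLs
    this.counter = 0;
  }

  // Stand-in for the Unicode Serialization of a Blob URL; a counter
  // replaces the UUID so that the sketch stays deterministic.
  mintURL() {
    this.counter += 1;
    return "blob:" + this.origin + "/" + String(this.counter).padStart(8, "0");
  }

  createObjectURL(blob) {
    const url = this.mintURL();
    if (blob.isClosed) return url; // closed Blob: no store entry is added
    this.store.set(url, blob);
    return url;
  }

  createFor(blob) {
    const url = this.mintURL();
    if (blob.isClosed) return url; // closed Blob: no store entry is added
    this.store.set(url, blob);
    this.revocationList.add(url);  // marks the URL as auto-revoking
    return url;
  }

  revokeObjectURL(url) {
    // Map.prototype.delete does nothing for unknown keys, which matches
    // the "this method call does nothing" cases.
    this.store.delete(url);
  }
}
```

In this model, two windows in the same origin would share one `BlobURLStoreModel` instance, which is why a URL minted through one window can be revoked through the other.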
10.5.2. Examples of Blob URL Creation and Revocation
Blob URLs are strings that are used to dereference Blob
objects, and can persist for as long as the document from which they were minted (using URL.createObjectURL()
or URL.createFor
) remains loaded; see Lifetime of Blob URLs.
This section gives sample usage of creation and revocation of Blob URLs with explanations.
In the example below, two img
elements [HTML] refer to the same Blob URL:
<script>url = URL.createObjectURL(blob); </script><script> img2.src=url;</script>
In the example below, URL.revokeObjectURL()
is explicitly called.
var blobURLref = URL.createObjectURL(file);
var img1 = new Image();
var img2 = new Image();
// Both assignments below work as expected
img1.src = blobURLref;
img2.src = blobURLref;
// ... Following body load
// Check if both images have loaded
if (img1.complete && img2.complete) {
  // Ensure that subsequent dereferences result in a network error
  URL.revokeObjectURL(blobURLref);
} else {
  // msg is assumed to be an application-defined reporting function
  msg("Images cannot be previewed!");
  // Revoke the string-based reference
  URL.revokeObjectURL(blobURLref);
}
The example above allows multiple references to a single Blob URL, and the web developer then revokes the Blob URL string after both image objects have been loaded. While not restricting the number of uses of the Blob URL offers more flexibility, it increases the likelihood of leaks; developers should pair each call to URL.createObjectURL with a corresponding call to URL.revokeObjectURL.
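One way to make the pairing of creation and revocation harder to forget is to wrap both calls in a helper. The sketch below is only a suggestion: `withObjectURL` is a hypothetical name, and `urlApi` is assumed to be any object exposing `createObjectURL` and `revokeObjectURL` (in a browser, the global `URL` object).

```javascript
// Hypothetical helper: mints a Blob URL, hands it to `use`, and revokes
// it once `use` has finished, whether it succeeded or failed.
async function withObjectURL(urlApi, blob, use) {
  const url = urlApi.createObjectURL(blob);
  try {
    // `use` may return a promise, for example one that settles when an
    // image has finished loading.
    return await use(url);
  } finally {
    urlApi.revokeObjectURL(url);
  }
}
```

A caller would typically pass a callback that assigns the URL to an image and resolves when the image's load event fires, so that revocation only happens after the resource has been fetched.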
var blobURLref2 = URL.createFor(file);
img1 = new Image();
img1.src = blobURLref2;
....
The example above uses URL.createFor
, which allows uses such as the one above, and obviates the need for a corresponding call by the web developer to URL.revokeObjectURL
.
// file is an HTML file
// One of the anchor identifiers in file is "#applicationInfo"
var blobURLAnchorRef = URL.createFor(file) + "#applicationInfo";
// openPopup is a utility function to open a small informational window
var windowRef = openPopup(blobURLAnchorRef, "Application Info");
....
The example above uses URL.createFor
to mint a Blob URL, and appends a fragment that has an understood meaning within the resource (in this case, it references an anchor within the HTML document).
10.6. Lifetime of Blob URLs
A global object which exposes URL.createObjectURL
or URL.createFor
must maintain a Blob URL Store which is a list of Blob URLs created by the URL.createObjectURL
method or the URL.createFor
method, and the blob
resource that each refers to.
A global object which exposes URL.createFor
must maintain a Revocation List which is a list of Blob URLs created with the URL.createFor
method.
When this specification says to add an entry to the Blob URL Store for a Blob URL and a blob
input, the user agent must add to the Blob URL Store both the Blob URL and a reference to the blob
it refers to.
When this specification says to add an entry to the Revocation List for a Blob URL, user agents must add the Blob URL to the Revocation List. This is only for auto-revoking Blob URLs.
When this specification says to remove an entry from the Blob URL Store for a given Blob URL or for a given Blob
, user agents must remove the Blob URL and the blob
it refers to from the Blob URL Store. Subsequent attempts to dereference this URL must result in a network error.
The Revocation List and the Blob URL Store must be processed together as follows:
- Add a job that removes the Blob URL Store entries for all the Blob URLs in the Revocation List to the global script cleanup jobs list.
When all the Blob URLs in the Revocation List have had their corresponding entries in the Blob URL Store removed, remove all the Blob URLs from the Revocation List.
This specification adds an additional unloading document cleanup step [HTML]: user agents must remove all Blob URLs from the Blob URL Store within that document.
User agents are free to garbage collect resources removed from the Blob URL Store.
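The lifetime rules above can be sketched as two cleanup routines. As elsewhere, this is an informal model: the Blob URL Store is assumed to be a `Map` from Blob URL strings to blobs, the Revocation List a `Set` of Blob URL strings, and the function names are hypothetical.

```javascript
// Hypothetical model of the cleanup job for auto-revoking Blob URLs:
// every URL on the Revocation List loses its Blob URL Store entry, and
// the Revocation List is then emptied.
function runRevocationCleanup(store, revocationList) {
  for (const url of revocationList) {
    store.delete(url);
  }
  revocationList.clear();
}

// Hypothetical model of the additional unloading document cleanup step:
// all Blob URLs are removed from the Blob URL Store.
function unloadDocumentCleanup(store) {
  store.clear();
}
```

After `runRevocationCleanup` runs, URLs minted with `URL.createObjectURL` remain in the store, while those minted with `URL.createFor` no longer resolve.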
11. Security Considerations
This section is informative.
This specification allows web content to read files from the underlying file system and provides a means for files to be accessed by unique identifiers,
and as such is subject to some security considerations. This specification also assumes that the
primary user interaction is with the <input type="file"/>
element of HTML forms [HTML], and that all files that are being read by
FileReader
objects have first been selected by the user. Important security considerations include preventing malicious file
selection attacks (selection looping), preventing access to system-sensitive files, and guarding against modifications of files on disk after a selection has taken place.
Preventing selection looping. During file selection, a user may be bombarded with the file picker associated with <input type="file"/> (in a "must choose" loop that forces selection before the file picker is dismissed) and a user agent may prevent file access to any selections by making the FileList object returned be of size 0.
System-sensitive files (e.g. files in /usr/bin, password files, and other native operating system executables) typically should not be exposed to web content, and should not be accessed via Blob URLs. User agents may throw a SecurityError exception for synchronous read methods, or return a SecurityError DOMError for asynchronous reads.
This section is provisional; more security data may supplement this in subsequent drafts.
12. Requirements and Use Cases
This section covers what the requirements are for this API, as well as illustrates some use cases. This version of the API does not satisfy all use cases; subsequent versions may elect to address these.
Once a user has given permission, user agents should provide the ability to read and parse data directly from a local file programmatically.
- Example: A lyrics viewer. User wants to read song lyrics from songs in his plist file. User browses for plist file. File is opened, read, parsed, and presented to the user as a sortable, actionable list within a web application. User can select songs to fetch lyrics. User uses the "browse for file" dialog.
Data should be able to be stored locally so that it is available for later use, which is useful for offline data access for web applications.
Example: A Calendar App. User's company has a calendar. User wants to sync local events to company calendar, marked as "busy" slots (without leaking personal info). User browses for file and selects it. The text/calendar file is parsed in the browser, allowing the user to merge the files to one calendar view. The user wants to then save the file back to his local calendar file. (using "Save As" ?). The user can also send the integrated calendar file back to the server calendar store asynchronously.
User agents should provide the ability to save a local file programmatically given an amount of data and a file name.
Note: While this specification doesn't provide an explicit API call to trigger downloads, the HTML5 specification has addressed this. The download attribute of the a element [HTML] initiates a download, saving a File with the name specified. The combination of this API and the download attribute on a elements allows for the creation of files within web applications, and the ability to save them locally.
- Example: A Spreadsheet App. User interacts with a form, and generates some input. The form then generates a CSV (Comma Separated Values) output for the user to import into a spreadsheet, and uses "Save...". The generated output can also be directly integrated into a web-based spreadsheet, and uploaded asynchronously.
User agents should provide a streamlined programmatic ability to send data from a file to a remote server that works more efficiently than form-based uploads today.
- Example: A Video/Photo Upload App. User is able to select large files for upload, which can then be "chunk-transferred" to the server.
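As a rough illustration of the chunked-transfer use case, the sketch below splits a Blob into consecutive ranges using the `size` attribute and the `slice()` method of the Blob interface. The helper name `sliceIntoChunks` and the choice of chunk size are assumptions of this example; sending each chunk to a server is left out.

```javascript
// Hypothetical helper: splits a Blob (or File) into consecutive chunks of
// at most `chunkSize` bytes using Blob.slice(start, end).
function sliceIntoChunks(blob, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError("chunkSize must be a positive number");
  }
  const chunks = [];
  for (let start = 0; start < blob.size; start += chunkSize) {
    // The final chunk may be shorter than chunkSize.
    const end = Math.min(start + chunkSize, blob.size);
    chunks.push(blob.slice(start, end));
  }
  return chunks;
}
```

Each returned chunk is itself a Blob, so it could be uploaded independently, for example with one request per chunk.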
User agents should provide an API exposed to script that exposes the features above. The user is notified by UI anytime interaction with the file system takes place, giving the user full ability to cancel or abort the transaction. The user is notified of any file selections, and can cancel these. No invocations to these APIs occur silently without user intervention.
13. Appendix A
This section uses the Augmented Backus-Naur Form (ABNF), defined in [RFC5234] to describe components of blob: URLs.
13.1. An ABNF for UUID
The following is an ABNF [RFC5234] for UUID. UUID strings must only use characters in the ranges U+002A to U+002B, U+002D to U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E [Unicode], and should be at least 36 characters long.
UUID = time-low "-" time-mid "-"
time-high-and-version "-"
clock-seq-and-reserved
clock-seq-low "-" node
time-low = 4hexOctet
time-mid = 2hexOctet
time-high-and-version = 2hexOctet
clock-seq-and-reserved = hexOctet
clock-seq-low = hexOctet
node = 6hexOctet
hexOctet = hexDigit hexDigit
hexDigit =
"0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
"a" / "b" / "c" / "d" / "e" / "f" /
"A" / "B" / "C" / "D" / "E" / "F"
13.2. An ABNF for Fragment Identifiers
fragIdentifier = "#" fragment
; Fragment Identifiers depend on the media type of the Blob
; fragment is defined in [RFC3986]
; fragment processing for HTML is defined in [HTML]
fragment = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
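The two grammars in this appendix can be approximated with regular expressions, as in the sketch below. The names `isUUID` and `isFragmentIdentifier` are hypothetical, and the patterns are a direct transcription of the UUID and fragIdentifier productions rather than a complete URL validator.

```javascript
// UUID = 8 hex digits "-" 4 "-" 4 "-" 4 "-" 12, following the hexOctet
// counts in the ABNF (clock-seq-and-reserved and clock-seq-low are adjacent).
const UUID_PATTERN =
  /^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$/;

// fragIdentifier = "#" *( pchar / "/" / "?" ), where pchar is an unreserved
// character, a sub-delimiter, ":" or "@", or a percent-encoded octet.
const FRAGMENT_IDENTIFIER_PATTERN =
  /^#(?:[A-Za-z0-9\-._~!$&'()*+,;=:@\/?]|%[0-9A-Fa-f]{2})*$/;

// Hypothetical helpers wrapping the two patterns.
function isUUID(value) {
  return UUID_PATTERN.test(value);
}

function isFragmentIdentifier(value) {
  return FRAGMENT_IDENTIFIER_PATTERN.test(value);
}
```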
14. Acknowledgements
This specification was originally developed by the SVG Working Group. Many thanks to Mark Baker and Anne van Kesteren for their feedback.
Thanks to Robin Berjon for editing the original specification.
Special thanks to Olli Pettay, Nikunj Mehta, Garrett Smith, Aaron Boodman, Michael Nordman, Jian Li, Dmitry Titov, Ian Hickson, Darin Fisher, Sam Weinig, Adrian Bateman and Julian Reschke.
Thanks to the W3C WebApps WG, and to participants on the public-webapps@w3.org listserv.
15. References
15.1. Normative references
- RFC2119
- Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. IETF.
- HTML
- HTML 5: A vocabulary and associated APIs for HTML and XHTML, I. Hickson, R. Berjon, S. Faulkner, T. Leithead, E. Doyle Navara, E. O'Connor, S. Pfeiffer. W3C.
- Web Origin Concept
- Web Origin Concept, A. Barth. IETF.
- ProgressEvents
- Progress Events, A. van Kesteren. W3C.
- RFC2397
- The "data" URL Scheme, L. Masinter. IETF.
- Web Workers
- Web Workers (work in progress), I. Hickson. W3C.
- DOM4
- DOM4 (work in progress), A. Gregor, A. van Kesteren, Ms2ger. W3C.
- Unicode
- The Unicode Standard, Version 5.2.0., J. D. Allen, D. Anderson, et al. Unicode Consortium.
- RFC7230
- Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, R. Fielding, J. Reschke. IETF.
- RFC7231
- Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, R. Fielding, J. Reschke. IETF.
- RFC2046
- Multipurpose Internet Mail Extensions (MIME) Part Two: Media Extensions, N. Freed, N. Borenstein. IETF.
- RFC6266
- Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP), J. Reschke. IETF.
- Encoding Specification
- Encoding Living Standard, A. van Kesteren, J. Bell.
- Streams Specification
- Streams Living Standard, Domenic Denicola, Takeshi Yoshino.
- Typed Arrays
- Typed Arrays (work in progress), V. Vukicevic, K. Russell. Khronos Group.
- RFC5234
- Augmented BNF for Syntax Specifications: ABNF, D. Crocker, P. Overell. IETF.
- URL Specification
- URL Living Standard (work in progress), A. van Kesteren.
- Fetch Specification
- Fetch Specification, A. van Kesteren. WHATWG.
- WebIDL Specification
- WebIDL (work in progress), C. McCormack.
- ECMAScript
- ECMAScript 5th Edition, A. Wirfs-Brock, P. Lakshman et al.
- MIME Sniffing
- MIME Sniffing (work in progress), A. Barth, I. Hickson.
- XMLHttpRequest
- XMLHttpRequest Living Standard, A. van Kesteren.
15.2. Informative References
- Google Gears Blob API
- Google Gears Blob API (deprecated)
- RFC4122
- A Universally Unique IDentifier (UUID) URN Namespace, P. Leach, M. Mealling, R. Salz. IETF.
- RFC3986
- Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. IETF.
- RFC1630
- Universal Resource Identifiers in WWW, T. Berners-Lee. IETF.
- RFC1738
- Uniform Resource Locators (URL), T. Berners-Lee, L. Masinter, M. McCahill. IETF.
- WebRTC 1.0
- WebRTC 1.0, A. Bergkvist, D. Burnett, C. Jennings, A. Narayanan. W3C.