O'Reilly Media | Java I/O
Java I/O
Table of Contents
- Chapter 1: Introducing I/O
Input and output, I/O for short, are fundamental to any computer operating system or programming language. Only theorists find it interesting to write programs that don't require input or produce output. At the same time, I/O hardly qualifies as one of the more "thrilling" topics in computer science. It's something in the background, something you use every day, but for most developers, it's not a topic with much sex appeal.

There are plenty of reasons for Java programmers to find I/O interesting. Java includes a particularly rich set of I/O classes in the core API, mostly in the java.io package. For the most part I/O in Java is divided into two types: byte- and number-oriented I/O, which is handled by input and output streams; and character and text I/O, which is handled by readers and writers. Both types provide an abstraction for external data sources and targets that allows you to read from and write to them, regardless of the exact type of the source. You use the same methods to read from a file that you do to read from the console or from a network connection.

But that's just the tip of the iceberg. Once you've defined abstractions that let you read or write without caring where your data is coming from or where it's going to, you can do a lot of very powerful things. You can define I/O streams that automatically compress, encrypt, and filter from one data format to another, and more. Once you have these tools, programs can send encrypted data or write zip files with almost no knowledge of what they're doing; cryptography or compression can be isolated in a few lines of code that say, "Oh yes, make this an encrypted output stream."

In this book, I'll take a thorough look at all parts of Java's I/O facilities. This includes all the different kinds of streams you can use. We're also going to investigate Java's support for Unicode (the standard multilingual character set). We'll look at Java's powerful facilities for formatting I/O, which are, oddly enough, not part of the java.io package proper. (We'll see the reasons for this design decision later.) Finally, we'll take a brief look at the Java Communications API (...
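As a rough illustration of the "make this a compressed output stream" idea, the sketch below wraps an in-memory stream in the JDK's GZIPOutputStream. The class and method names (FilterStreamSketch, writeMessage, compress, decompress) are made up for this example, and readAllBytes() assumes Java 9 or later. The point is only that writeMessage() works unchanged whether or not a filter sits in front of the destination.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; not part of the book's code.
class FilterStreamSketch {

    // Writes text to any OutputStream. The method neither knows nor cares
    // whether the stream compresses, encrypts, or simply stores the bytes.
    static void writeMessage(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    // Compresses text by placing a GZIP filter in front of an in-memory sink.
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(sink)) {
            writeMessage(out, text);
        } // closing the filter finishes the GZIP data and flushes it to the sink
        return sink.toByteArray();
    }

    // Reverses compress() by reading through the matching input filter.
    static String decompress(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = compress("Oh yes, make this a compressed output stream.");
        System.out.println(decompress(packed));
    }
}
```

Swapping GZIPOutputStream for another filter stream would change what happens to the bytes without touching writeMessage().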
Purchase this book now or read it online at Safari to get the whole thing!
- What Is a Stream?
A stream is an ordered sequence of bytes of undetermined length. Input streams move bytes of data into a Java program from some generally external source. Output streams move bytes of data from Java to some generally external target. (In special cases streams can also move bytes from one part of a Java program to another.)

The word stream is derived from an analogy with a stream of water. An input stream is like a siphon that sucks up water; an output stream is like a hose that sprays out water. Siphons can be connected to hoses to move water from one place to another. Sometimes a siphon may run out of water if it's drawing from a finite source like a bucket. On the other hand, if the siphon is drawing water from a river, it may well provide water indefinitely. So too an input stream may read from a finite source of bytes like a file or an unlimited source of bytes like System.in. Similarly an output stream may have a definite number of bytes to output or an indefinite number of bytes.

Input to a Java program can come from many sources. Output can go to many different kinds of destinations. The power of the stream metaphor and in turn the stream classes is that the differences between these sources and destinations are abstracted away. All input and output are simply treated as streams.

The first source of input most programmers encounter is System.in. This is the same thing as stdin in C, generally some sort of console window, probably the one in which the Java program was launched. If input is redirected so the program reads from a file, then System.in is changed as well. For instance, on Unix, the following command redirects stdin so that when the MessageServer program reads from System.in, the actual data comes from the file data.txt instead of the console:

    % java MessageServer < data.txt

The console is also available for output through the static field ...
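To make the abstraction concrete, here is a minimal sketch of a method that counts bytes from any input stream. StreamSourceSketch and countBytes are names invented for this example; the same method would accept System.in, a file stream, or a network stream without modification.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example class; not part of the book's code.
class StreamSourceSketch {

    // Counts bytes until end of stream. The source could be a file, the
    // console, or a socket; this method sees only an InputStream.
    static long countBytes(InputStream in) throws IOException {
        long total = 0;
        byte[] buffer = new byte[1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // A finite, in-memory source, like the bucket in the siphon analogy.
        InputStream bucket = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(countBytes(bucket));
        // Passing System.in instead would count bytes from the console, or
        // from data.txt when run as: java StreamSourceSketch < data.txt
    }
}
```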
- Numeric Data
Input streams read bytes and output streams write bytes. Readers read characters and writers write characters. Therefore, to understand input and output, you first need a solid understanding of how Java deals with bytes, integers, characters, and other primitive data types, and when and why one is converted into another. In many cases Java's behavior is not obvious.

The fundamental integer data type in Java is the int, a four-byte, big-endian, two's complement integer. An int can take on all values between -2,147,483,648 and 2,147,483,647. When you type a literal integer like 7, -8345, or 3000000000 in Java source code, the compiler treats that literal as an int. In the case of 3000000000 or similar numbers too large to fit in an int, the compiler emits an error message citing "Numeric overflow."

longs are eight-byte, big-endian, two's complement integers with ranges from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. long literals are indicated by suffixing the number with a lower- or uppercase L. An uppercase L is preferred because the lowercase l is too easily confused with the numeral 1 in most fonts. For example, 7L, -8345L, and 3000000000L are all 64-bit long literals.

There are two more integer data types available in Java, the short and the byte. shorts are two-byte, big-endian, two's complement integers with ranges from -32,768 to 32,767. They're rarely used in Java and are included mainly for compatibility with C.

bytes, however, are very much used in Java. In particular they're used in I/O. A byte is an eight-bit, two's complement integer that ranges from -128 to 127. Note that like all numeric data types in Java, a byte is signed. The maximum byte value is 127. 128, 129, and so on through 255 are not legal values for bytes.
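The signed range of a byte is easiest to see by casting. The following sketch (ByteRangeSketch and its method names are assumptions made for illustration) shows how a narrowing cast wraps values above 127 into negatives and how masking with 0xFF recovers the unsigned value.

```java
// Hypothetical example class; not part of the book's code.
class ByteRangeSketch {

    // A narrowing cast keeps only the low-order eight bits, so values from
    // 128 through 255 come out as negative bytes.
    static byte toSignedByte(int value) {
        return (byte) value;
    }

    // Masking with 0xFF promotes the byte to an int in the range 0 to 255.
    static int toUnsignedInt(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        long big = 3000000000L; // too large for an int, so the L suffix is required
        System.out.println(big);
        System.out.println(toSignedByte(200));         // prints -56
        System.out.println(toUnsignedInt((byte) 200)); // prints 200
    }
}
```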
- Character Data
Numbers are only part of the data a typical Java program needs to read and write. Most programs also need to handle text, which is composed of characters. Since computers only really understand numbers, characters are encoded by matching each character in a given script to a particular number. For example, in the common ASCII encoding, the character A is mapped to the number 65; the character B is mapped to the number 66; the character C is mapped to the number 67; and so on. Different encodings may encode different scripts or may encode the same or similar scripts in different ways.

Java understands several dozen different character sets for a variety of languages, ranging from ASCII to Shift-JIS (SJIS, a Japanese Industrial Standards encoding) to Unicode. Internally, Java uses the Unicode character set. Unicode is a two-byte extension of the one-byte ISO Latin-1 character set, which in turn is an eight-bit superset of the seven-bit ASCII character set.

ASCII, the American Standard Code for Information Interchange, is a seven-bit character set. Thus it defines 2^7, or 128, different characters whose numeric values range from 0 to 127. These characters are sufficient for handling most of American English and can make reasonable approximations to most European languages (with the notable exceptions of Russian and Greek). It's an often used lowest common denominator format for different computers. If you were to read a byte value between 0 and 127 from a stream, then cast it to a char, the result would be the corresponding ASCII character.

ASCII characters 0-31 and character 127 are nonprinting control characters. Characters 32-47 are various punctuation and space characters. Characters 48-57 are the digits 0-9. Characters 58-64 are another group of punctuation characters. Characters 65-90 are the capital letters A-Z. Characters 91-96 are a few more punctuation marks. Characters 97-122 are the lowercase letters a-z. Finally, characters 123 through 126 are a few remaining punctuation symbols. The complete ASCII character set is shown in Table 2.1 in Appendix B.
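The byte-to-char cast and the ASCII ranges described above can be sketched as follows. AsciiSketch, toAsciiChar, and classify are names chosen for this illustration only; the range checks simply restate the ranges listed in the text.

```java
// Hypothetical example class; not part of the book's code.
class AsciiSketch {

    // Casts a byte in the ASCII range (0 to 127) to the corresponding char.
    static char toAsciiChar(byte b) {
        if (b < 0) {
            throw new IllegalArgumentException("Not a 7-bit ASCII value: " + b);
        }
        return (char) b;
    }

    // Classifies an ASCII code using the ranges given in the text.
    static String classify(int code) {
        if (code < 0 || code > 127) return "not ASCII";
        if (code <= 31 || code == 127) return "control";
        if (code >= 48 && code <= 57) return "digit";
        if (code >= 65 && code <= 90) return "uppercase letter";
        if (code >= 97 && code <= 122) return "lowercase letter";
        return "punctuation or space";
    }

    public static void main(String[] args) {
        byte[] fromStream = {72, 105, 33}; // bytes as they might arrive from a stream
        StringBuilder text = new StringBuilder();
        for (byte b : fromStream) {
            text.append(toAsciiChar(b));
        }
        System.out.println(text);         // prints Hi!
        System.out.println(classify(65)); // prints uppercase letter
    }
}
```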
- Readers and Writers
In Java 1.1 and later, streams are primarily intended for data that can be read as pure bytes: basically byte data and numeric data encoded as binary numbers of one sort or another. Streams are specifically not intended for use when reading and writing text, including both ASCII text, like "Hello World," and numbers formatted as text, like "3.1415929." For these purposes, you should use readers and writers.

Input and output streams are fundamentally byte-based. Readers and writers are based on characters, which can have varying widths depending on the character set. For example, ASCII and ISO Latin-1 use one-byte characters. Unicode uses two-byte characters. UTF-8 uses characters of varying width (between one and three bytes). Since characters are ultimately composed of bytes, readers take their input from streams. However, they convert those bytes into chars according to a specified encoding format before passing them along. Similarly, writers convert chars to bytes according to a specified encoding before writing them onto some underlying stream.

The java.io.Reader and java.io.Writer classes are abstract superclasses for classes that read and write character-based data. The subclasses are notable for handling the conversion between different character sets. There are nine reader and eight writer classes in the core Java API, all in the java.io package:

    BufferedReader
    BufferedWriter
    CharArrayReader
    CharArrayWriter
    ...
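The conversion between chars and bytes can be sketched with OutputStreamWriter and InputStreamReader, the JDK classes that bridge writers and readers to streams. EncodingSketch, encode, and decode are hypothetical names; the encodings come from java.nio.charset.StandardCharsets, which requires Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of the book's code.
class EncodingSketch {

    // Converts chars to bytes by writing them through a writer that sits
    // on top of an in-memory output stream.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Converts bytes back to chars by reading them through a reader that
    // sits on top of an in-memory input stream.
    static String decode(byte[] data, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // The same character can occupy a different number of bytes
        // depending on the encoding.
        System.out.println(encode("A", StandardCharsets.UTF_8).length);           // prints 1
        System.out.println(encode("\u00E9", StandardCharsets.ISO_8859_1).length); // prints 1
        System.out.println(encode("\u00E9", StandardCharsets.UTF_8).length);      // prints 2
        System.out.println(encode("\u20AC", StandardCharsets.UTF_8).length);      // prints 3
    }
}
```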
- The Ubiquitous IOException
As computer operations go, input and output are unreliable. They are subject to problems completely outside the programmer's control. Disks can develop bad sectors while a file is being read; construction workers drop backhoes through the cables that connect your WAN; users unexpectedly cancel their input; telephone repair crews shut off your modem line while trying to repair someone else's. (This last one actually happened to me while writing this chapter. My modem kept dropping the connection and then not getting a dial tone; I had to hunt down the telephone "repairman" in my building's basement and explain to him that he was working on the wrong line.)

Because of these potential problems and many more, almost every method that performs input or output is declared to throw IOException. IOException is a checked exception, so you must either declare that your methods throw it or enclose the call that can throw it in a try/catch block. The only real exceptions to this rule are the PrintStream and PrintWriter classes. Because it would be inconvenient to wrap a try/catch block around each call to System.out.println(), Sun decided to have PrintStream (and later PrintWriter) catch and eat any exceptions thrown inside a print() or println() method. If you do want to check for exceptions inside a print() or println() method, you can call checkError():

    public boolean checkError()

The checkError() method returns true if an exception has occurred on this print stream, false if one hasn't. It only tells you that an error occurred. It does not tell you what sort of error occurred. If you need to know more about the error, you'll have to use a different output stream or writer class.

IOException has many subclasses (15 in java.io), and methods often throw a more specific exception that subclasses IOException. (However, methods usually only declare that they throw an IOException.) Here are the subclasses of IOException that you'll find in ...
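A small sketch of checkError() in action follows. FailingOutputStream is a made-up stream that always throws, standing in for a broken device or connection; CheckErrorSketch and printAndCheck are likewise names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical example class; not part of the book's code.
class CheckErrorSketch {

    // A stand-in for an unreliable destination: every write fails.
    static final class FailingOutputStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated device failure");
        }
    }

    // Prints a message and reports whether the print stream hit an error.
    static boolean printAndCheck(OutputStream target, String message) {
        PrintStream out = new PrintStream(target);
        out.println(message);    // any IOException is caught inside PrintStream
        return out.checkError(); // flushes the stream and returns its error flag
    }

    public static void main(String[] args) {
        System.out.println(printAndCheck(new FailingOutputStream(), "lost"));    // prints true
        System.out.println(printAndCheck(new ByteArrayOutputStream(), "saved")); // prints false
    }
}
```

Note that the caller learns only that something went wrong, not what; the IOException's message never reaches it.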
- The Console: System.out, System.in, and System.err
The console is the default destination for output written to System.out or System.err and the default source of input for System.in. On most platforms the console is the command-line environment from which the Java program was initially launched, perhaps an xterm (Figure 1.1) or a DOS shell window (Figure 1.2). The word console is something of a misnomer, since on Unix systems the console refers to a very specific command-line shell, rather than being a generic term for command-line shells overall.

Figure 1.1: An xterm console on Unix
Figure 1.2: A DOS shell console on Windows NT

Many common misconceptions about I/O occur because most programmers' first exposure to I/O is through the console. The console is convenient for quick hacks and toy examples commonly found in textbooks, and I will use it for that in this book, but it's really a very unusual source of input and destination for output, and good Java programs avoid it. It behaves almost, but not completely, unlike anything else you'd want to read from or write to. While consoles make convenient examples in programming texts like this one, they're a horrible user interface and really have little place in modern programs. Users are more comfortable with a well-defined graphical user interface. Furthermore, the console is unreliable across platforms. The Mac, for example, has no native console. Macintosh Runtime for Java 2 and earlier has a console window that works only for output, but not for input; that is, System.out works but System.in does not. Figure 1.3 shows the Mac console window.

Figure 1.3: The Mac console, used exclusively by Java programs
- Security Checks on I/O
One of the original fears about downloading executable content like applets from the Internet was that a hostile applet could erase your hard disk or read your Quicken files. Nothing's happened to change that since Java was introduced. This is why Java applets run under the control of a security manager that checks each operation an applet performs to prevent potentially hostile acts.

The security manager is particularly careful about I/O operations. For the most part, the checks are related to these questions:

- Can an applet read a file?
- Can an applet write a file?
- Can an applet delete a file?
- Can an applet determine whether a file exists?
- Can an applet make a network connection to a particular host?
- Can an applet accept an incoming connection from a particular host?

The short answer to all these questions is "No, it cannot." A slightly more elaborate answer would specify a few exceptions. Applets can make network connections to the host they came from; applets can read a few very specific files that contain information about the Java environment; and trusted applets may sometimes run without these restrictions. But for almost all practical purposes, the answer is almost always no.

For more exotic situations, such as trusted applets, see Java Security by Scott Oaks (O'Reilly & Associates, 1998). Trusted applets are useful on corporate networks, but you shouldn't waste a lot of time laboring under the illusion that anyone on the Internet at large will trust your applets.

Because of these security issues, you need to be careful when using code fragments and examples from this book in an applet. Everything shown here works when run in an application, but when run in an applet, it may fail with a ...
- Chapter 2: Output Streams
- The OutputStream Class
The java.io.OutputStream class declares the three basic methods you need to write bytes of data onto a stream. It also has methods for closing and flushing streams.

    public abstract void write(int b) throws IOException
    public void write(byte[] data) throws IOException
    public void write(byte[] data, int offset, int length) throws IOException
    public void flush() throws IOException
    public void close() throws IOException

OutputStream is an abstract class. Subclasses provide implementations of the abstract write(int b) method. They may also override the four nonabstract methods. For example, the FileOutputStream class overrides all five methods with native methods that know how to write bytes into files on the host platform. Although OutputStream is abstract, often you only need to know that the object you have is an OutputStream; the more specific subclass of OutputStream is hidden from you. For example, the getOutputStream() method of java.net.URLConnection has the signature:

    public OutputStream getOutputStream() throws IOException

Depending on the type of URL associated with this URLConnection object, the actual class of the output stream that's returned may be a sun.net.TelnetOutputStream, a sun.net.smtp.SmtpPrintStream, a sun.net.www.http.KeepAliveStream, or something else completely. All you know as a programmer, and all you need to know, is that the object returned is in fact some instance of OutputStream. That's why the detailed classes that handle particular kinds of connections are hidden inside the sun packages.

Furthermore, even when working with subclasses whose types you know, you still need to be able to use the methods inherited from OutputStream. And since methods that are inherited are not included in the online documentation, it's important to remember that they're there. For example, the java.io.DataOutputStream class does not declare a close() method, but you can still call the one it inherits from its superclass.
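The two points above, programming against OutputStream and relying on inherited methods, can be sketched together. InheritedMethodsSketch, writeGreeting, and demo are illustrative names, and the in-memory ByteArrayOutputStream merely stands in for whatever destination a real program would use.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; not part of the book's code.
class InheritedMethodsSketch {

    // Accepts any OutputStream; the concrete subclass is irrelevant here.
    static void writeGreeting(OutputStream out) throws IOException {
        out.write(new byte[] {'h', 'i', '\n'});
        out.flush();
    }

    // Writes an int and a greeting, then returns the bytes that arrived.
    static byte[] demo() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(sink);
        data.writeInt(42);   // declared by DataOutputStream itself
        writeGreeting(data); // accepted because a DataOutputStream is an OutputStream
        data.close();        // not declared by DataOutputStream; inherited from a superclass
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo().length); // prints 7: four bytes for the int, three for the greeting
    }
}
```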
- Writing Bytes to Output Streams
The fundamental method of the OutputStream class is write():

    public abstract void write(int b) throws IOException

This method writes a single unsigned byte of data whose value should be between 0 and 255. If you pass a number larger than 255 or smaller than zero, it's reduced modulo 256 before being written.

Example 2.1, AsciiChart, is a simple program that writes the printable ASCII characters (32 to 126) on the console. The console interprets the numeric values as ASCII characters, not as numbers. This is a feature of the console, not of the OutputStream class or the specific subclass of which System.out is an instance. The write() method merely sends a particular bit pattern to a particular output stream. How that bit pattern is interpreted depends on what's connected to the other end of the stream.

Example 2.1. The AsciiChart Program

    import java.io.*;

    public class AsciiChart {

      public static void main(String[] args) {

        for (int i = 32; i < 127; i++) {
          System.out.write(i);
          // break line after every eight characters.
          if (i % 8 == 7) System.out.write('\n');
          else System.out.write('\t');
        }
        System.out.write('\n');
      }
    }

Notice the use of the char literals '\t' and '\n'. The compiler converts these to the numbers 9 and 10, respectively. When these numbers are written on the console, the console interprets those numbers as a tab and a linefeed, respectively. The same effect could have been achieved by writing the if clause like this:

    if (i % 8 == 7) System.out.write(10);
    else System.out.write(9);

Here's the output:

    % java AsciiChart
       !  "  #  $  %  &  '
    (  )  *  +  ,  -  .  /
    0  1  2  3  4  5  6  7
    8  9  :  ;  <  =  >  ?
    @  A  B  C  D  E  F  G
    H  I  J  K  L  M  N  O
    P  Q  R  S  T  U  V  W
    X  Y  Z  [  \  ]  ^  _
    `  a  b  c  d  e  f  g
    h  i  j  k  l  m  n  o
    p  q  r  s  t  u  v  w
    x  y  z  {  |  }  ~
    %
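The modulo-256 behavior of write(int) is easy to check against an in-memory stream. In the sketch below, WriteModuloSketch and writtenValue are invented names; ByteArrayOutputStream is used because its write(int) method does not throw IOException, which keeps the example short.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical example class; not part of the book's code.
class WriteModuloSketch {

    // Writes one int to an in-memory stream and returns the byte that
    // actually arrived, as an unsigned value from 0 to 255.
    static int writtenValue(int b) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(b); // only the low-order eight bits are kept
        return out.toByteArray()[0] & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(writtenValue(65));  // prints 65
        System.out.println(writtenValue(321)); // prints 65, since 321 = 256 + 65
        System.out.println(writtenValue(-1));  // prints 255
    }
}
```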
- Writing Arrays of Bytes
It's often faster to write larger chunks of data than to write byte by byte. Two overloaded variants of the write() method do this:

    public void write(byte[] data) throws IOException
    public void write(byte[] data, int offset, int length) throws IOException

The first variant writes the entire byte array data. The second writes only the sub-array of data starting at offset and continuing for length bytes. For example, the following code fragment blasts the bytes in a string onto System.out:

    String s = "How are streams treating you?";
    byte[] data = s.getBytes();
    System.out.write(data);

Conversely, you may run into performance problems if you attempt to write too much data at a time. The exact turnaround point depends on the eventual destination of the data. Files are often best written in small multiples of the block size of the disk, typically 512, 1024, or 2048 bytes. Network connections often require smaller buffer sizes, 128 or 256 bytes. The optimal buffer size depends on too many system-specific details for anything to be guaranteed, but I often use 128 bytes for network connections and 1024 bytes for files.

Example 2.2 is a simple program that constructs a byte array filled with an ASCII chart, then blasts it onto the console in one call to write().

Example 2.2. The AsciiArray Program

    import java.io.*;

    public class AsciiArray {

      public static void main(String[] args) {

        // 95 printable characters, each followed by a tab or a newline,
        // plus one final newline
        byte[] b = new byte[(127-32)*2 + 1];
        int index = 0;
        for (int i = 32; i < 127; i++) {
          b[index++] = (byte) i;
          // Break line after every eight characters.
          if (i % 8 == 7) b[index++] = (byte) '\n';
          else b[index++] = (byte) '\t';
        }
        b[index++] = (byte) '\n';
        try {
          System.out.write(b);
        }
        catch (IOException e) { System.err.println(e); }
      }
    }

The output is the same as in Example 2.1. Because of the nature of the console, this particular program probably isn't a lot faster than Example 2.1, but it certainly could be if you were writing data into a file rather than onto the console. The difference in performance between writing a ...
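One way to combine the three-argument write() with a fixed buffer size is to send a large array in chunks. The sketch below is only an illustration: ChunkedWriteSketch and writeInChunks are assumed names, and the 1024-byte chunk used in main() simply echoes the file-oriented figure mentioned above rather than a measured optimum.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; not part of the book's code.
class ChunkedWriteSketch {

    // Writes data in pieces of at most chunkSize bytes and returns the
    // number of write() calls that were made.
    static int writeInChunks(byte[] data, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int calls = 0;
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            // The last chunk may be shorter than the others.
            int length = Math.min(chunkSize, data.length - offset);
            out.write(data, offset, length);
            calls++;
        }
        out.flush();
        return calls;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(writeInChunks(new byte[2500], sink, 1024)); // prints 3
        System.out.println(sink.size());                               // prints 2500
    }
}
```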
- Flushing and Closing Output Streams
Many output streams buffer writes to improve performance. Rather than sending each byte to its destination as it's written, the bytes are accumulated in a memory buffer ranging in size from several bytes to several thousand bytes. When the buffer fills up, all the data is sent at once. The flush() method forces the data to be written whether or not the buffer is full:

    public void flush() throws IOException

This is not the same as any buffering performed by the operating system or the hardware. These buffers will not be emptied by a call to flush(). (The sync() method in the FileDescriptor class, discussed in Chapter 12, can sometimes be used to empty these buffers.) For example, assuming out is an OutputStream of some sort, you would call out.flush() to empty the buffers.

If you only use a stream for a short time, you don't need to flush it explicitly. It should be flushed when the stream is closed. This should happen when the program exits or when you explicitly invoke the close() method:

    public void close() throws IOException

For example, again assuming out is an OutputStream of some sort, calling out.close() closes the stream and implicitly flushes it. Once you have closed an output stream, you can no longer write to it. Attempting to do so will throw an IOException.

Again, System.out is a partial exception because as a PrintStream, all exceptions it throws are eaten. Once you close System.out, you can't write to it, but trying to do so won't throw any exceptions. However, your output will not appear on the console.

You only need to flush an output stream explicitly if you want to make sure data is sent before you're through with the stream. For example, a program that sends a burst of data across the network periodically should flush after each burst of data is written to the stream.

Flushing is often important when you're trying to debug a crashing program. All streams flush automatically when their buffers fill up, and all streams should be flushed when a program terminates normally. If a program terminates abnormally, however, buffers may not get flushed. In this case, unless there is an explicit call to ...
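The effect of flush() can be observed by putting a BufferedOutputStream in front of an in-memory sink and checking the sink's size before and after. FlushSketch and sizesBeforeAndAfterFlush are names made up for this sketch, and the 1024-byte buffer is an arbitrary choice.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical example class; not part of the book's code.
class FlushSketch {

    // Returns how many bytes had reached the destination before and after
    // an explicit call to flush().
    static int[] sizesBeforeAndAfterFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(sink, 1024);
        out.write(new byte[] {1, 2, 3}); // small write: held in the buffer
        int before = sink.size();        // nothing has reached the sink yet
        out.flush();                     // forces the buffered bytes out
        int after = sink.size();
        out.close();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesBeforeAndAfterFlush();
        System.out.println(sizes[0]); // prints 0
        System.out.println(sizes[1]); // prints 3
    }
}
```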
- Subclassing OutputStream
- Content preview·Buy reprint rights for this chapter
OutputStream
is an abstract class that mainly describes the operations available with any particularOutputStream
object. Specific subclasses know how to write bytes to particular destinations. For instance, aFileOutputStream
uses native code to write data in files. AByteArrayOutputStream
uses pure Java to write its output in a potentially expandingbyte
array.Recall that there are three overloaded variants of thewrite()
method inOutputStream
, one abstract, two concrete:public abstract void write(int b) throws IOException public void write(byte[] data) throws IOException public void write(byte[] data, int offset, int length) throws IOException
Subclasses must implement the abstractwrite(int
b)
method. They often choose to override the third variant,write(byte[],
data
int
offset,
int
length)
, for reasons of performance. The implementation of the three-argument version of thewrite()
method inOutputStream
simply invokeswrite(int
b)
repeatedly; that is:public void write(byte[] data, int offset, int length) throws IOException { for (int i = offset; i < offset+length; i++) write(data[i]); }
Most subclasses can provide more efficient implementations of this method. The one-argument variant ofwrite()
merely invokeswrite(data,
0,
data.length)
; if the three-argument variant has been overridden, this method will perform reasonably well. However, a few subclasses may override it anyway.Example 2.3 is a simple program calledNullOutputStream
that mimics the behavior of /dev/null on Unix operating systems. Data written into a null output stream is lost.Example 2.3. The NullOutputStream Classpackage com.macfaq.io; import java.io.*; public class NullOutputStream extends OutputStream { public void write(int b) { } public void write(byte[] data) { } public void write(byte[] data, int offset, int length) { } }

- A Graphical User Interface for Output Streams

As a useful example, I'm going to show a subclass of java.awt.TextArea that can be connected to an output stream. As data is written onto the stream, it is appended to the text area in the default character set (generally ISO Latin-1). (This isn't ideal. Since text areas contain text, a writer would be a better source for this data; in later chapters I'll expand on this class to use a writer instead. For now this makes a neat example.) This subclass is shown in Example 2.4.

The actual output stream is contained in an inner class inside the StreamedTextArea class. Each StreamedTextArea component contains a TextAreaOutputStream object in its theOutput field. Client programmers access this object via the getOutputStream() method of the StreamedTextArea class. The StreamedTextArea class has five overloaded constructors that imitate the five constructors in the java.awt.TextArea class, each taking a different combination of text, rows, columns, and scrollbar information. The first four constructors merely pass their arguments and suitable defaults to the most general fifth constructor using this(). The fifth constructor calls the most general superclass constructor, then calls setEditable(false) to ensure that the user doesn't change the text while output is streaming into it.

I've chosen not to override any methods in the TextArea superclass. However, you might want to do so if you feel a need to change the normal abilities of a text area. For example, you could include a do-nothing append() method so that data can only be moved into the text area via the provided output stream or a setEditable() method that doesn't allow the client programmer to make this area editable.

Example 2.4. The StreamedTextArea Component

    package com.macfaq.awt;

    import java.awt.*;
    import java.io.*;

    public class StreamedTextArea extends TextArea {

      OutputStream theOutput = new TextAreaOutputStream();

      public StreamedTextArea() {
        this("", 0, 0, SCROLLBARS_BOTH);
      }

      public StreamedTextArea(String text) {
        this(text, 0, 0, SCROLLBARS_BOTH);
      }

      public StreamedTextArea(int rows, int columns) {
        this("", rows, columns, SCROLLBARS_BOTH);
      }

      public StreamedTextArea(String text, int rows, int columns) {
        this(text, rows, columns, SCROLLBARS_BOTH);
      }

      public StreamedTextArea(String text, int rows, int columns, int scrollbars) {
        super(text, rows, columns, scrollbars);
        setEditable(false);
      }

      public OutputStream getOutputStream() {
        return theOutput;
      }

      class TextAreaOutputStream extends OutputStream {

        public synchronized void write(int b) {
          // recall that the int should really just be a byte
          b &= 0x000000FF;
          // must convert byte to a char in order to append it
          char c = (char) b;
          append(String.valueOf(c));
        }

        public synchronized void write(byte[] data, int offset, int length) {
          append(new String(data, offset, length));
        }

      }

    }

- Chapter 3: Input Streams


- The InputStream Class

The java.io.InputStream class is the abstract superclass for all input streams. It declares the three basic methods needed to read bytes of data from a stream. It also has methods for closing streams, checking how many bytes of data are available to be read, skipping over input, marking a position in a stream and resetting back to that position, and determining whether marking and resetting are supported.

    public abstract int read() throws IOException
    public int read(byte[] data) throws IOException
    public int read(byte[] data, int offset, int length) throws IOException
    public long skip(long n) throws IOException
    public int available() throws IOException
    public void close() throws IOException
    public synchronized void mark(int readlimit)
    public synchronized void reset() throws IOException
    public boolean markSupported()

- The read( ) Method

The fundamental method of the InputStream class is read(), which reads a single unsigned byte of data and returns the integer value of the unsigned byte. This is a number between 0 and 255:

    public abstract int read() throws IOException

The following code reads 10 bytes from the System.in input stream and stores them in the int array data:

    int[] data = new int[10];
    for (int i = 0; i < data.length; i++) {
      data[i] = System.in.read();
    }

Notice that although read() is reading a byte, it returns an int. If you want to store the raw bytes instead, you can cast the int to a byte. For example:

    byte[] b = new byte[10];
    for (int i = 0; i < b.length; i++) {
      b[i] = (byte) System.in.read();
    }

Of course, this produces a signed byte instead of the unsigned byte returned by the read() method (that is, a byte in the range -128 to 127 instead of 0 to 255). As long as you're clear in your mind and your code about whether you're working with signed or unsigned data, you won't have any trouble. Signed bytes can be converted back to ints in the range 0 to 255 like this:

    int i = (b >= 0) ? b : 256 + b;

When you call read(), you also have to catch the IOException that it might throw. As I've observed, input and output are often subject to problems outside of your control: disks fail, network cables break, and so on. Therefore, virtually any I/O method can throw an IOException, and read() is no exception. You don't get an IOException if read() encounters the end of the input stream; in this case, it returns -1. You use this as a flag to watch for the end of stream. The following code shows how to catch the IOException and test for the end of the stream:

    try {
      int[] data = new int[10];
      for (int i = 0; i < data.length; i++) {
        int datum = System.in.read();
        if (datum == -1) break;
        data[i] = datum;
      }
    }
    catch (IOException e) {System.err.println("Couldn't read from System.in!");}
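The signed/unsigned conversion and the end-of-stream test can be tried without a console. The sketch below is only an illustration: the class name UnsignedReadDemo and its helper methods are hypothetical, and it uses the common b & 0xFF mask, which yields the same result as the conditional expression shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the original examples.
class UnsignedReadDemo {

  // Converts a signed byte (-128 to 127) to its unsigned value (0 to 255).
  static int toUnsigned(byte b) {
    return b & 0xFF;
  }

  // Reads single bytes until read() returns -1 and reports how many were seen.
  static int countBytes(InputStream in) throws IOException {
    int count = 0;
    while (in.read() != -1) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    byte signed = (byte) 200;
    System.out.println(signed);             // prints -56
    System.out.println(toUnsigned(signed)); // prints 200

    InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
    System.out.println(countBytes(in));     // prints 3
  }
}
```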

- Reading Chunks of Data from a Stream

Input and output are often the performance bottlenecks in a program. Reading from or writing to disk can be hundreds of times slower than reading from or writing to memory; network connections and user input are even slower. While disk capacities and speeds have increased over time, they have never kept pace with CPU speeds. Therefore, it's important to minimize the number of reads and writes a program actually performs.

All input streams have overloaded read() methods that read chunks of contiguous data into a byte array. The first variant tries to read enough data to fill the array data. The second variant tries to read length bytes of data starting at position offset into the array data. Neither of these methods is guaranteed to read as many bytes as they want. Both methods return the number of bytes actually read, or -1 on end of stream.

    public int read(byte[] data) throws IOException
    public int read(byte[] data, int offset, int length) throws IOException

The default implementation of these methods in the java.io.InputStream class merely calls the basic read() method enough times to fill the requested array or subarray. Thus, reading 10 bytes of data takes 10 times as long as reading one byte of data. However, most subclasses of InputStream override these methods with more efficient methods, perhaps native, that read the data from the underlying source as a block.

For example, to attempt to read 10 bytes from System.in, you could write the following code:

    try {
      byte[] b = new byte[10];
      System.in.read(b);
    }
    catch (IOException e) {System.err.println("Couldn't read from System.in!");}

Reads don't always succeed in getting as many bytes as you want. Conversely, there's nothing to stop you from trying to read more data into the array than will fit. If you read more data than the array can hold, an ArrayIndexOutOfBoundsException will be thrown. For example, the following code loops repeatedly until it either fills the array or sees the end of stream:
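The original listing is not included in this preview. What follows is a minimal sketch of such a loop, written for this rewrite rather than taken from the book; the class name ReadFully and the method fill() are hypothetical. Each pass asks only for the space still free in the array, so the read can never run past its end.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not the book's listing.
class ReadFully {

  // Reads until the buffer is full or the stream ends.
  // Returns the number of bytes actually stored in the buffer.
  static int fill(InputStream in, byte[] buffer) throws IOException {
    int offset = 0;
    while (offset < buffer.length) {
      // Never request more than the remaining space in the array.
      int bytesRead = in.read(buffer, offset, buffer.length - offset);
      if (bytesRead == -1) break; // end of stream
      offset += bytesRead;
    }
    return offset;
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30});
    byte[] buffer = new byte[10];
    int n = fill(in, buffer);
    System.out.println(n + " bytes read"); // prints "3 bytes read"
  }
}
```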

- Counting the Available Bytes

It's sometimes convenient to know how many bytes are available to be read before you attempt to read them. The InputStream class's available() method tells you how many bytes you can read without blocking. It returns 0 if there's no data available to be read.

    public int available() throws IOException

For example:

    try {
      byte[] b = new byte[100];
      int offset = 0;
      while (offset < b.length) {
        int a = System.in.available();
        int bytesRead = System.in.read(b, offset, a);
        if (bytesRead == -1) break; // end of stream
        offset += bytesRead;
      }
    }
    catch (IOException e) {System.err.println("Couldn't read from System.in!");}

There's a potential bug in this code. There may be more bytes available than there's space in the array to hold them. One common idiom is to size the array according to the number available() returns, like this:

    try {
      byte[] b = new byte[System.in.available()];
      System.in.read(b);
    }
    catch (IOException e) {System.err.println("Couldn't read from System.in!");}

This works well if you're only going to perform a single read. For multiple reads, however, the overhead of creating multiple arrays is excessive. You should probably reuse the array and only create a new array if more bytes are available than will fit in the array.

The available() method in java.io.InputStream always returns 0. Subclasses are supposed to override it, but I've seen a few that don't. You may be able to read more bytes from the underlying stream without blocking than available() suggests; you just can't guarantee that you can. If this is a concern, you can place input in a separate thread so that blocked input doesn't block the rest of the program.
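One way to avoid the overflow described above is to cap each request at the space left in the array. The following is a sketch under that assumption, not the book's code; AvailableDemo and readAvailable() are hypothetical names. It also stops when available() reports 0 instead of spinning on zero-length reads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not part of the original examples.
class AvailableDemo {

  // Reads whatever can be read without blocking, up to the buffer's capacity.
  // Returns the number of bytes stored in the buffer.
  static int readAvailable(InputStream in, byte[] buffer) throws IOException {
    int offset = 0;
    while (offset < buffer.length) {
      int available = in.available();
      if (available == 0) break; // nothing more can be read without blocking
      // Cap the request so it always fits in the remaining space.
      int toRead = Math.min(available, buffer.length - offset);
      int bytesRead = in.read(buffer, offset, toRead);
      if (bytesRead == -1) break; // end of stream
      offset += bytesRead;
    }
    return offset;
  }

  public static void main(String[] args) throws IOException {
    byte[] buffer = new byte[100];
    InputStream big = new ByteArrayInputStream(new byte[150]);
    System.out.println(readAvailable(big, buffer));   // prints 100
    InputStream small = new ByteArrayInputStream(new byte[3]);
    System.out.println(readAvailable(small, buffer)); // prints 3
  }
}
```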

- Skipping Bytes

Although you can just read from a stream and ignore the bytes read, Java provides a skip() method that jumps over a certain number of bytes in the input:

    public long skip(long bytesToSkip) throws IOException

The argument to skip() is the number of bytes to skip. The return value is the number of bytes actually skipped, which may be less than bytesToSkip and may even be 0, for example when the end of the stream has already been reached. Unlike read(), skip() does not return -1 to signal the end of the stream. Both the argument and return value are longs, allowing skip() to handle extremely long input streams. Skipping is often faster than reading and discarding the data you don't want. For example, when an input stream is attached to a file, skipping bytes just requires that an integer called the file pointer be changed, whereas reading involves copying bytes from the disk into memory. For example, to skip the next 80 bytes of the input stream in:

    try {
      long bytesSkipped = 0;
      long bytesToSkip = 80;
      while (bytesSkipped < bytesToSkip) {
        long n = in.skip(bytesToSkip - bytesSkipped);
        if (n <= 0) break; // no progress; possibly the end of the stream
        bytesSkipped += n;
      }
    }
    catch (IOException e) {System.err.println(e);}
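Because skip() may legitimately skip fewer bytes than requested, possibly none, a defensive loop can fall back to read() when skip() makes no progress; read() then distinguishes a short skip from the end of the stream. This is a sketch of that idea, not code from the book, and the names SkipFully and skipFully() are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not part of the original examples.
class SkipFully {

  // Tries to skip exactly bytesToSkip bytes.
  // Returns the number actually skipped, which is smaller only at end of stream.
  static long skipFully(InputStream in, long bytesToSkip) throws IOException {
    long skipped = 0;
    while (skipped < bytesToSkip) {
      long n = in.skip(bytesToSkip - skipped);
      if (n > 0) {
        skipped += n;
        continue;
      }
      // skip() made no progress: read one byte to find out whether
      // the stream has ended or skip() just declined to skip.
      if (in.read() == -1) break;
      skipped++;
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    InputStream longStream = new ByteArrayInputStream(new byte[100]);
    System.out.println(skipFully(longStream, 80));  // prints 80
    InputStream shortStream = new ByteArrayInputStream(new byte[50]);
    System.out.println(skipFully(shortStream, 80)); // prints 50
  }
}
```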

- Closing Input Streams

When you're through with a stream, you should close it. This allows the operating system to free any resources associated with the stream; exactly what these resources are depends on your platform and varies with the type of the stream. However, systems only have finite resources. For example, on most personal computer operating systems, no more than several hundred files can be open at once. Multiuser operating systems have larger limits, but limits nonetheless.

To close a stream, you invoke its close() method:

    public void close() throws IOException

Not all streams need to be closed; System.in generally does not need to be closed, for example. However, streams associated with files and network connections should always be closed when you're done with them. For example:

    try {
      URL u = new URL("https://www.javasoft.com/");
      InputStream in = u.openStream();
      // Read from the stream...
      in.close();
    }
    catch (IOException e) {System.err.println(e);}

Once you have closed an input stream, you can no longer read from it. Attempting to do so will throw an IOException.
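In the example above, close() is skipped if reading throws an exception. On Java 7 and later (an assumption; the original text predates it), a try-with-resources statement closes the stream on every path. The sketch below illustrates this with an in-memory stream; CloseDemo and firstByte() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; assumes Java 7 or later for try-with-resources.
class CloseDemo {

  // Reads one byte and closes the stream, even if read() throws.
  static int firstByte(InputStream source) throws IOException {
    try (InputStream in = source) {
      return in.read();
    }
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[] {42});
    System.out.println(firstByte(in)); // prints 42
  }
}
```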

- Marking and Resetting

It's often useful to be able to read a few bytes and then back up and reread them. For example, in a Java compiler, you don't know for sure whether you're reading the token <, <<, or <<= until you've read one too many characters. It would be useful to be able to back up and reread the token once you know which token you've read. Compiler design and other parsing problems provide many more examples, and this need occurs in other domains as well.

Some (but not all) input streams allow you to mark a particular position in the stream and then return to it. Three methods in the java.io.InputStream class handle marking and resetting:

    public synchronized void mark(int readLimit)
    public synchronized void reset() throws IOException
    public boolean markSupported()

The boolean markSupported() method returns true if this stream supports marking and false if it doesn't. If marking is not supported, reset() throws an IOException and mark() does nothing. Assuming the stream does support marking, the mark() method places a bookmark at the current position in the stream. You can rewind the stream to this position later with reset() as long as you haven't read more than readLimit bytes. There can be only one mark in the stream at any given time. Marking a second location erases the first mark.

The only two input stream classes in java.io that always support marking are BufferedInputStream (of which System.in is an instance) and ByteArrayInputStream. However, other input streams, like DataInputStream, may support marking if they're chained to a buffered input stream first.
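The lookahead problem described above can be sketched with mark() and reset(). The following example is an illustration only; the class name MarkResetDemo and the method peek() are hypothetical. It inspects the next byte without consuming it.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the original examples.
class MarkResetDemo {

  // Returns the next byte without consuming it, or -1 at end of stream.
  static int peek(InputStream in) throws IOException {
    if (!in.markSupported()) {
      throw new IOException("mark/reset not supported by this stream");
    }
    in.mark(1);        // remember the current position; allow 1 byte of lookahead
    int b = in.read();
    in.reset();        // rewind to the marked position
    return b;
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new BufferedInputStream(
        new ByteArrayInputStream(new byte[] {'<', '='}));
    System.out.println((char) peek(in));   // prints <
    System.out.println((char) in.read());  // prints < (the byte was not consumed)
    System.out.println((char) peek(in));   // prints =
  }
}
```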

- Subclassing InputStream

Immediate subclasses of InputStream must provide an implementation of the abstract read() method. They may also override some of the nonabstract methods. For example, the default markSupported() method returns false, mark() does nothing, and reset() throws an IOException. Any class that allows marking and resetting must override these three methods. Furthermore, they may want to override methods that perform functions like skip() and the other two read() methods to provide more efficient implementations.

Example 3.2 is a simple class called RandomInputStream that "reads" random bytes of data. This provides a useful source of unlimited data you can use in testing. A java.util.Random object provides the data.

Example 3.2. The RandomInputStream Class

    package com.macfaq.io;

    import java.util.*;
    import java.io.*;

    public class RandomInputStream extends InputStream {

      private transient Random generator = new Random();

      public int read() {
        int result = generator.nextInt() % 256;
        if (result < 0) result = -result;
        return result;
      }

      public int read(byte[] data, int offset, int length) throws IOException {
        byte[] temp = new byte[length];
        generator.nextBytes(temp);
        System.arraycopy(temp, 0, data, offset, length);
        return length;
      }

      public int read(byte[] data) throws IOException {
        generator.nextBytes(data);
        return data.length;
      }

      public long skip(long bytesToSkip) throws IOException {
        // It's all random so skipping has no effect.
        return bytesToSkip;
      }

    }

The no-argument read() method returns a random int in the range of an unsigned byte (0 to 255). The other two read() methods fill a specified part of an array with random bytes. They return the number of bytes read (in this case the number of bytes created).

- An Efficient Stream Copier

As a useful example of both input and output streams, in Example 3.3 I'll present a StreamCopier class that copies data between two streams as quickly as possible. (I'll reuse this class in later chapters.) Its copy() method reads from the input stream and writes onto the output stream until the input stream is exhausted. A 256-byte buffer is used to try to make the reads efficient. A main() method provides a simple test for this class by reading from System.in and copying to System.out.

Example 3.3. The StreamCopier Class

    package com.macfaq.io;

    import java.io.*;

    public class StreamCopier {

      public static void main(String[] args) {
        try {
          copy(System.in, System.out);
        }
        catch (IOException e) {System.err.println(e);}
      }

      public static void copy(InputStream in, OutputStream out) throws IOException {
        // Do not allow other threads to read from the input
        // or write to the output while copying is taking place
        synchronized (in) {
          synchronized (out) {
            byte[] buffer = new byte[256];
            while (true) {
              int bytesRead = in.read(buffer);
              if (bytesRead == -1) break;
              out.write(buffer, 0, bytesRead);
            }
          }
        }
      }

    }

Here's a simple test run:

    D:\JAVA\ioexamples\03>java com.macfaq.io.StreamCopier
    this is a test
    this is a test
    0987654321
    0987654321
    ^Z

Input was not fed from the console (DOS prompt) to the StreamCopier program until the end of each line. Since I ran this in Windows, the end-of-stream character is Ctrl-Z. On Unix it would have been Ctrl-D.

- Chapter 4: File Streams

Until now, most of the examples in this book have used the streams System.in and System.out. These are convenient for examples, but in real life, you'll more commonly attach streams to data sources like files and network connections. You'll use the java.io.FileInputStream and java.io.FileOutputStream classes, which are concrete subclasses of java.io.InputStream and java.io.OutputStream, to read and write files. We'll discuss these classes in detail in this chapter; they provide the standard methods for reading and writing data. What they don't provide is a mechanism for file-specific operations, like finding out whether a file is readable or writable. For that, you may want to look forward to Chapter 12, which talks about the File class itself and the way Java works with files.

- Reading Files

java.io.FileInputStream is a concrete subclass of java.io.InputStream. It provides an input stream connected to a particular file.

    public class FileInputStream extends InputStream

FileInputStream has all the usual methods of input streams, such as read(), available(), skip(), and close(), which are used exactly as they are for any other input stream.

    public native int read() throws IOException
    public int read(byte[] data) throws IOException
    public int read(byte[] data, int offset, int length) throws IOException
    public native long skip(long n) throws IOException
    public native int available() throws IOException
    public native void close() throws IOException

These methods are all implemented in native code, except for the two multibyte read() methods. These, however, just pass their arguments on to a private native method called readBytes(), so effectively all these methods are implemented with native code. (In Java 2, read(byte[] data, int offset, int length) is a native method that read(byte[] data) invokes.)

There are three FileInputStream() constructors, which differ only in how the file to be read is specified:

    public FileInputStream(String fileName) throws IOException
    public FileInputStream(File file) throws FileNotFoundException
    public FileInputStream(FileDescriptor fdObj)

The first constructor uses a string containing the name of the file. The second constructor uses a java.io.File object. The third constructor uses a java.io.FileDescriptor object. Filenames are platform-dependent, so hardcoded file names should be avoided where possible. Using the first constructor violates Sun's rules for "100% Pure Java" immediately. Therefore, the second two constructors are much preferred. Nonetheless, the second two will have to wait until File objects and file descriptors are discussed in Chapter 12. For now, I will use only the first.
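As an illustration of the first constructor, here is a minimal sketch that opens a file by name and counts its bytes using the chunked read() method. It is not taken from the book: the class name FileReadDemo, the method countBytes(), and the 4096-byte buffer size are assumptions, and the try-with-resources statement requires Java 7 or later.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical demo class; not part of the original examples.
class FileReadDemo {

  // Opens the named file, reads it in chunks, and returns its length in bytes.
  static long countBytes(String fileName) throws IOException {
    try (InputStream in = new FileInputStream(fileName)) {
      byte[] buffer = new byte[4096];
      long total = 0;
      int n;
      while ((n = in.read(buffer)) != -1) {
        total += n;
      }
      return total;
    }
  }

  public static void main(String[] args) throws IOException {
    // Create a small temporary file so the example does not depend on
    // any particular file existing on the reader's system.
    File temp = File.createTempFile("filereaddemo", ".bin");
    temp.deleteOnExit();
    try (OutputStream out = new FileOutputStream(temp)) {
      out.write(new byte[] {1, 2, 3, 4, 5});
    }
    System.out.println(countBytes(temp.getPath())); // prints 5
  }
}
```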

- Writing Files

The java.io.FileOutputStream class is a concrete subclass of java.io.OutputStream that provides output streams connected to files.

    public class FileOutputStream extends OutputStream

This class has all the usual methods of output streams, such as write(), flush(), and close(), which are used exactly as they are for any other output stream.

    public native void write(int b) throws IOException
    public void write(byte[] data) throws IOException
    public void write(byte[] data, int offset, int length) throws IOException
    public native void close() throws IOException

These are all implemented in native code except for the two multibyte write() methods. These, however, just pass their arguments on to a private native method called writeBytes(), so effectively all these methods are implemented with native code.

There are three main FileOutputStream() constructors, differing primarily in how the file is specified:

    public FileOutputStream(String filename) throws IOException
    public FileOutputStream(File file) throws IOException
    public FileOutputStream(FileDescriptor fd)

The first constructor uses a string containing the name of the file; the second constructor uses a java.io.File object; the third constructor uses a java.io.FileDescriptor object. I will avoid using the second and third constructors until I've discussed File objects and file descriptors (Chapter 12). To write data to a file, just pass the name of the file to the FileOutputStream() constructor, then use the write() methods as normal. If the file does not exist, all three constructors will create it. If the file does exist, any data inside it will be overwritten.

A fourth constructor also lets you specify whether the file's contents should be erased before data is written into it (append == false) or whether data is to be tacked onto the end of the file (...
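The difference between overwriting and appending can be sketched with the FileOutputStream(String name, boolean append) constructor. This example is written for this rewrite, not taken from the book; AppendDemo and its write() helper are hypothetical names, and try-with-resources assumes Java 7 or later.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demo class; not part of the original examples.
class AppendDemo {

  // Writes data to the named file, either replacing its contents
  // (append == false) or adding to the end (append == true).
  static void write(String fileName, byte[] data, boolean append) throws IOException {
    try (OutputStream out = new FileOutputStream(fileName, append)) {
      out.write(data);
    }
  }

  public static void main(String[] args) throws IOException {
    File temp = File.createTempFile("appenddemo", ".bin");
    temp.deleteOnExit();
    String name = temp.getPath();

    write(name, new byte[] {1, 2, 3}, false);
    System.out.println(temp.length()); // prints 3
    write(name, new byte[] {4, 5}, true);
    System.out.println(temp.length()); // prints 5
    write(name, new byte[] {9}, false);
    System.out.println(temp.length()); // prints 1
  }
}
```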

- File Viewer, Part 1

I often find it useful to be able to open an arbitrary file and interpret it in an arbitrary fashion. Most commonly I want to view a file as text, but occasionally it's useful to interpret it as hexadecimal integers, IEEE 754 floating-point data, or something else. In this book, I'm going to develop a program that lets you open any file and view its contents in a variety of different ways. In each chapter, I'll add a piece to the program until it's fully functional. Since this is only the beginning of the program, it's important to keep the code as general and adaptable as possible.

Example 4.3 reads a series of filenames from the command line in the main() method. Each filename is passed to a method that opens the file. The file's data is read and printed on System.out. Exactly how the data is printed on System.out is determined by a command-line switch. If the user selects ASCII format (-a), then the data will be assumed to be ASCII (more properly, ISO Latin-1) text and printed as chars. If the user selects decimal dump (-d), then each byte should be printed as an unsigned decimal number between 0 and 255, 16 to a line. For example:

    000 234 127 034 234 234 000 000 000 002 004 070 000 234 127 098

Leading zeros are used to maintain a constant width for the printed byte values and for each line. A simple selection algorithm is used to determine how many leading zeros to attach to each number. For hex dump format (-h), each byte should be printed as two hexadecimal digits. For example:

    CA FE BA BE 07 89 9A 65 45 65 43 6F F6 7F 8F EE E5 67 63 26 98 9E 9C

Hexadecimal encoding is easier, because each byte is always exactly two hex digits. The static Integer.toHexString() method is used to convert each byte read into two hexadecimal digits.

ASCII format is the default and is the simplest to implement. This conversion can be accomplished merely by copying the input data to the console.
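The two dump formats can be sketched as small formatting helpers. This is not the book's Example 4.3, which is not included in this preview; the class name DumpFormat and both method names are hypothetical. Note that Integer.toHexString() returns a single digit for values below 16, so the sketch pads those with a leading zero to match the sample output above.

```java
// Hypothetical helper class; not the book's Example 4.3.
class DumpFormat {

  // Formats an unsigned byte value (0 to 255) as three decimal digits.
  static String toDecimal(int b) {
    String s = Integer.toString(b);
    if (b < 10) return "00" + s;
    if (b < 100) return "0" + s;
    return s;
  }

  // Formats an unsigned byte value (0 to 255) as two uppercase hex digits.
  static String toHex(int b) {
    String s = Integer.toHexString(b).toUpperCase();
    return (b < 16) ? "0" + s : s;
  }

  public static void main(String[] args) {
    int[] sample = {0, 34, 234, 7, 0xCA};
    for (int b : sample) {
      System.out.println(toDecimal(b) + " " + toHex(b));
    }
    // prints:
    // 000 00
    // 034 22
    // 234 EA
    // 007 07
    // 202 CA
  }
}
```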

- Chapter 5: Network Streams

From its first days, Java has had the network in mind, more so than any other common programming language. Java is the first programming language to provide as much support for network I/O as it does for file I/O, perhaps even more: Java's URL, URLConnection, Socket, and ServerSocket classes are all fertile sources of streams. The exact type of the stream used by a network connection is typically hidden inside the undocumented sun classes. Thus, network I/O relies primarily on the basic InputStream and OutputStream methods, which you can wrap with any higher-level stream that suits your needs: buffering, cryptography, compression, or whatever your application requires.

- URLs

- The java.net.URL class represents a Uniform Resource Locator like https://metalab.unc.edu/javafaq/. Each URL unambiguously identifies the location of a resource on the Internet. The URL class has four constructors. All are declared to throw MalformedURLException, a subclass of IOException.

    public URL(String u) throws MalformedURLException
    public URL(String protocol, String host, String file) throws MalformedURLException
    public URL(String protocol, String host, int port, String file) throws MalformedURLException
    public URL(URL context, String u) throws MalformedURLException

A MalformedURLException is thrown if the constructor's arguments do not specify a valid URL. Often this means a particular Java implementation does not have the right protocol handler installed. Thus, given a complete absolute URL like https://www.poly.edu/schedule/fall97/bgrad.html#cs, you construct a URL object like this:

    URL u = null;
    try {
      u = new URL("https://www.poly.edu/schedule/fall97/bgrad.html#cs");
    }
    catch (MalformedURLException e) { }

You can also construct the URL object by passing its pieces to the constructor:

    URL u = null;
    try {
      u = new URL("http", "www.poly.edu", "/schedule/fall97/bgrad.html#cs");
    }
    catch (MalformedURLException e) { }

You don't normally need to specify a port for a URL; most protocols have default ports. For instance, the HTTP port is 80. Sometimes the port used does change, and in that case you can use the third constructor:

    URL u = null;
    try {
      u = new URL("http", "www.poly.edu", 80, "/schedule/fall97/bgrad.html#cs");
    }
    catch (MalformedURLException e) { }

Finally, many HTML files contain relative URLs. The fourth constructor in the previous code creates URLs relative to a given URL and is particularly useful when parsing HTML. For example, the following code creates a URL pointing to the file 08.html, taking the rest of the URL from ...
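As an illustration of the relative-URL and explicit-port constructors described above, here is a minimal sketch. It is not the book's elided example: the class name UrlConstructorsSketch, the helper names, and the www.example.com addresses are assumptions made for illustration. Recent JDK releases mark the URL constructors as deprecated, so expect compiler warnings there.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class; names and URLs are illustrative only.
class UrlConstructorsSketch {

    // Resolves a relative reference against a base URL with the
    // URL(URL context, String spec) constructor.
    static String resolve(String base, String relative) throws MalformedURLException {
        URL context = new URL(base);
        URL resolved = new URL(context, relative);
        return resolved.toString();
    }

    // Builds a URL from its pieces, stating the port explicitly.
    static String fromPieces(String protocol, String host, int port, String file)
            throws MalformedURLException {
        return new URL(protocol, host, port, file).toString();
    }

    public static void main(String[] args) throws MalformedURLException {
        // The file name of the base is replaced; protocol, host, and directory are kept.
        System.out.println(resolve("http://www.example.com/javaio/07.html", "08.html"));
        System.out.println(fromPieces("http", "www.example.com", 8080, "/index.html"));
    }
}
```

Nothing here touches the network; constructing a URL only parses it and checks that a handler for the protocol exists.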
- URL Connections
- URL connections are closely related to URLs, as their name implies. Indeed, you get a reference to a URLConnection by using the openConnection() method of a URL object; in many ways, the URL class is only a wrapper around the URLConnection class. However, URL connections provide more control over the communication between the client and the server. In particular, URL connections provide not just input streams by which the client can read data from the server, but also output streams to send data from the client to the server. This is essential for protocols like mailto.

The java.net.URLConnection class is an abstract class that handles communication with different kinds of servers, like FTP servers and web servers. Protocol-specific subclasses of URLConnection, hidden inside the sun classes, handle different kinds of servers.

URL connections take place in five steps:
1. The URL object is constructed.
2. The openConnection() method of the URL object creates the URLConnection object.
3. The parameters for the connection and the request properties that the client sends to the server are set up.
4. The connect() method makes the connection to the server, perhaps using a socket for a network connection or a file input stream for a local connection. The response header information is read from the server.
5. Data is read from the connection by using the input stream returned by getInputStream() or through a content handler with getContent(). Data can be sent to the server using the output stream provided by getOutputStream().
This scheme is very much based on the HTTP/1.0 protocol. It does not fit other schemes that have a more interactive "request, response, request, response, request, response" pattern instead of HTTP/1.0's "single request, single response, close connection" pattern. In particular, FTP and even HTTP/1.1 aren't well suited to this pattern. I wouldn't be surprised to see this replaced with something more general in a future version of Java.
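The five steps listed above can be traced in a few lines of code. The following is a rough sketch rather than the book's own listing; it reads from a file: URL that points at a temporary file so that it runs without a network, and the class name UrlConnectionSteps and the helper fetch() are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class walking through the five steps.
class UrlConnectionSteps {

    // Step 1 (constructing the URL) is done by the caller.
    static String fetch(URL u) throws IOException {
        URLConnection connection = u.openConnection();   // step 2: create the connection object
        connection.setUseCaches(false);                  // step 3: set parameters before connecting
        connection.connect();                            // step 4: actually connect
        try (InputStream in = connection.getInputStream()) {   // step 5: read the data
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("urlconnection", ".txt");
        try {
            Files.write(temp, "hello from a local file".getBytes(StandardCharsets.UTF_8));
            System.out.println(fetch(temp.toUri().toURL()));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

The same fetch() method should work for an http: URL as well, since the protocol-specific subclass is chosen behind openConnection().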
- Sockets
- Before data is sent across the Internet from one host to another, it is split into packets of varying but finite size called datagrams. Datagrams range in size from a few dozen bytes to about 60,000 bytes. Anything larger, and often things smaller, must be split into smaller pieces before it can be transmitted. The advantage of this scheme is that if one packet is lost, it can be retransmitted without requiring redelivery of all other packets. Furthermore, if packets arrive out of order, they can be reordered at the receiving end of the connection.

Fortunately, packets are invisible to the Java programmer. The host's native networking software splits data into packets on the sending end and reassembles packets on the receiving end. Instead, the Java programmer is presented with a higher-level abstraction called a socket. The socket represents a reliable connection for the transmission of data between two hosts. It isolates you from the details of packet encodings, lost and retransmitted packets, and packets that arrive out of order. A socket performs four fundamental operations:

1. Connect to a remote machine
2. Send data
3. Receive data
4. Close the connection

A socket may not be connected to more than one host at a time. However, a socket may both send data to and receive data from the host to which it's connected.

The java.net.Socket class is Java's interface to a network socket and allows you to perform all four fundamental socket operations. It provides raw, uninterpreted communication between two hosts. You can connect to remote machines; you can send data; you can receive data; you can close the connection. No part of the protocol is abstracted out, as it is with URL and URLConnection.
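The four fundamental operations can be seen together in a short client. This is only a sketch under stated assumptions: the class name SocketOperationsSketch, the one-line text protocol, and the tiny loopback echo peer in demo() are invented for illustration so that the client has something to talk to.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the line-based protocol is an assumption.
class SocketOperationsSketch {

    static String exchange(String host, int port, String message) throws IOException {
        try (Socket socket = new Socket(host, port)) {                        // 1. connect
            OutputStream out = socket.getOutputStream();
            out.write((message + "\n").getBytes(StandardCharsets.US_ASCII)); // 2. send
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return in.readLine();                                             // 3. receive
        }                                                                     // 4. close
    }

    // Starts a one-shot echo peer on the loopback interface, then runs the client against it.
    static String demo(String message) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {  // port 0: any free port
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    BufferedReader reader = new BufferedReader(
                        new InputStreamReader(peer.getInputStream(), StandardCharsets.US_ASCII));
                    String line = reader.readLine();
                    peer.getOutputStream().write((line + "\n").getBytes(StandardCharsets.US_ASCII));
                } catch (IOException e) {
                    // The demo peer just gives up on failure.
                }
            });
            echo.start();
            String reply = exchange(loopback.getHostAddress(), server.getLocalPort(), message);
            echo.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("hello"));
    }
}
```

Note that the socket itself knows nothing about lines or character sets; those choices belong to the program, which is the sense in which no part of the protocol is abstracted out.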
- Server Sockets
- There are two ends to each connection: the client, which initiates the connection, and the server, which responds to the connection. So far, we've only discussed the client side and assumed that a server existed out there for the client to talk to. To implement a server, you need to write a program that waits for other hosts to connect to it. A server socket binds to a particular port on the local machine (the server); once it has successfully bound to a port, it listens for incoming connection attempts from remote machines (the clients). When the server detects a connection attempt, it accepts the connection. This creates a socket between the two machines over which the client and the server communicate.

Many clients can connect to a port on the server simultaneously. Incoming data is distinguished by the port to which it is addressed and the client host and port from which it came. The server can tell for which service (like HTTP or FTP) the data is intended by inspecting the port. It knows where to send any response by looking at the client address and port stored with the data.

No more than one server socket can listen to a particular port at one time. Therefore, since a server may need to handle many connections at once, server programs tend to be heavily multithreaded. Generally, the server socket listening on the port only accepts the connections. It passes off the actual processing of each connection to a separate thread. Incoming connections are stored in a queue until the server can accept them. On most systems, the default queue length is between 5 and 50. Once the queue fills up, further incoming connections are refused until space in the queue opens up.

The java.net.ServerSocket class represents a server socket. Three constructors let you specify the port to bind to, the queue length for incoming connections, and the IP address:

    public ServerSocket(int port) throws IOException
    public ServerSocket(int port, int backlog) throws IOException
    public ServerSocket(int port, int backlog, InetAddress bindAddr) throws IOException
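The accept-then-hand-off pattern described above might look like the following sketch. The class name ThreadedEchoServer, the echo behavior, and the roundTrip() helper are assumptions made for illustration; the sketch binds to the loopback address only and uses the three-argument constructor with a queue length of 50.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical echo server: one acceptor loop, one thread per connection.
class ThreadedEchoServer implements Runnable, AutoCloseable {

    private final ServerSocket server;

    ThreadedEchoServer(int port, int backlog) throws IOException {
        server = new ServerSocket(port, backlog, InetAddress.getLoopbackAddress());
    }

    int port() { return server.getLocalPort(); }

    @Override
    public void run() {
        try {
            while (true) {
                Socket client = server.accept();          // only accepts ...
                new Thread(() -> echo(client)).start();   // ... processing happens elsewhere
            }
        } catch (IOException e) {
            // accept() throws once close() has been called; that ends the loop.
        }
    }

    private static void echo(Socket client) {
        try (Socket connection = client) {
            InputStream in = connection.getInputStream();
            OutputStream out = connection.getOutputStream();
            int b;
            while ((b = in.read()) != -1) out.write(b);
        } catch (IOException e) {
            // A failed connection is simply dropped in this sketch.
        }
    }

    @Override
    public void close() throws IOException { server.close(); }

    // Starts a server on a free port, sends text to it, and returns the echoed reply.
    static String roundTrip(String text) throws IOException {
        try (ThreadedEchoServer echoServer = new ThreadedEchoServer(0, 50)) {
            new Thread(echoServer).start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), echoServer.port())) {
                socket.getOutputStream().write(text.getBytes(StandardCharsets.US_ASCII));
                socket.shutdownOutput();                  // tell the server the request is complete
                ByteArrayOutputStream reply = new ByteArrayOutputStream();
                InputStream in = socket.getInputStream();
                int b;
                while ((b = in.read()) != -1) reply.write(b);
                return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping"));
    }
}
```

A thread per connection is the simplest way to keep the acceptor free; a production server would more likely use a bounded thread pool.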
- URLViewer
- Example 5.6 is an improved version of the URLViewer you first encountered in Chapter 2. This is a simple application that provides a window in which you can view the contents of a URL. It assumes that those contents are more or less ASCII text. (In future chapters, I'll remove that restriction.) Figure 5.1 shows the result. Our application has a text field in which the user can type a URL, a Load button that the user uses to load the specified URL, and a StreamedTextArea component that displays the text from the URL. Each of these corresponds to a field in the URLViewer class.

Figure 5.1: The URLViewer

Example 5.6. The URLViewer Program

    import java.awt.*;
    import java.awt.event.*;
    import java.io.*;
    import java.net.*;
    import com.macfaq.awt.*;
    import com.macfaq.io.*;

    public class URLViewer extends Frame
     implements WindowListener, ActionListener {

      TextField theURL = new TextField();
      Button loadButton = new Button("Load");
      StreamedTextArea theDisplay = new StreamedTextArea();

      public URLViewer() {
        super("URL Viewer");
      }

      // Lays out the components and registers the listeners.
      public void init() {
        this.add("North", theURL);
        this.add("Center", theDisplay);
        Panel south = new Panel();
        south.add(loadButton);
        this.add("South", south);
        theURL.addActionListener(this);
        loadButton.addActionListener(this);
        this.addWindowListener(this);
        this.setLocation(50, 50);
        this.pack();
        this.show();
      }

      // Loads the URL in the text field and copies its contents to the display.
      public void actionPerformed(ActionEvent evt) {
        try {
          URL u = new URL(theURL.getText());
          InputStream in = u.openStream();
          OutputStream out = theDisplay.getOutputStream();
          StreamCopier.copy(in, out);
          in.close();
          out.close();
        }
        catch (MalformedURLException ex) {theDisplay.setText("Invalid URL");}
        catch (IOException ex) {theDisplay.setText("Invalid URL");}
      }

      public void windowClosing(WindowEvent e) {
        this.setVisible(false);
        this.dispose();
      }

      public void windowOpened(WindowEvent e) {}
      public void windowClosed(WindowEvent e) {}
      public void windowIconified(WindowEvent e) {}
      public void windowDeiconified(WindowEvent e) {}
      public void windowActivated(WindowEvent e) {}
      public void windowDeactivated(WindowEvent e) {}

      public static void main(String args[]) {
        URLViewer me = new URLViewer();
        me.init();
      }
    }
- Chapter 6: Filter Streams
- Filter input streams read data from a preexisting input stream like a FileInputStream and have an opportunity to work with or change the data before it is delivered to the client program. Filter output streams write data to a preexisting output stream such as a FileOutputStream and have an opportunity to work with or change the data before it is written onto the underlying stream. Multiple filters can be chained onto a single underlying stream. Filter streams are used for encryption, compression, translation, buffering, and much more.

The word filter is derived by analogy from a water filter. A water filter sits between the pipe and faucet, pulling out impurities. A stream filter sits between the source of the data and its eventual destination and applies a specific algorithm to the data. As drops of water are passed through the water filter and modified, so too are bytes of data passed through the stream filter. Of course, there are some big differences: most notably, a stream filter can add data or some other kind of annotation to the stream, in addition to removing things you don't want; it may even produce a stream that is completely different from its original input (for example, by compressing the original data).
- The Filter Stream Classes
- java.io.FilterInputStream and java.io.FilterOutputStream are concrete superclasses for input and output stream subclasses that somehow modify or manipulate data of an underlying stream:

    public class FilterInputStream extends InputStream
    public class FilterOutputStream extends OutputStream

Each of these classes has a single protected constructor that specifies the underlying stream from which the filter stream reads or writes data:

    protected FilterInputStream(InputStream in)
    protected FilterOutputStream(OutputStream out)

These constructors set protected InputStream and OutputStream fields, called in and out, inside the FilterInputStream and FilterOutputStream classes, respectively.

    protected InputStream in
    protected OutputStream out

Since the constructors are protected, filter streams may only be created by subclasses. Each subclass implements a particular filtering operation. Normally, such a pattern suggests that polymorphism is going to be used heavily, with subclasses standing in for the common superclass; however, it is uncommon to use filter streams polymorphically as instances of FilterInputStream or FilterOutputStream. Most of the time, references to a filter stream are either references to a more specific subclass like BufferedInputStream or they're polymorphic references to InputStream or OutputStream with no hint of the filter left.

Beyond the constructors, both FilterInputStream and FilterOutputStream declare exactly the methods of their respective superclasses. For FilterInputStream, these are:

    public int read() throws IOException
    public int read(byte[] data) throws IOException
    public int read(byte[] data, int offset, int length) throws IOException
    public long skip(long n) throws IOException
    public int available() throws IOException
    public void close() throws IOException
    public synchronized void mark(int readlimit)
    public synchronized void reset() throws IOException
    public boolean markSupported()
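To make the subclassing pattern concrete, here is a minimal sketch of a filter that changes data on its way through. The class UpperCaseInputStream and its filter() helper are hypothetical and are not taken from the book; the filter simply converts ASCII lowercase letters to uppercase.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical filter: uppercases ASCII letters read from the underlying stream.
class UpperCaseInputStream extends FilterInputStream {

    UpperCaseInputStream(InputStream in) {
        super(in);   // stores the underlying stream in the protected field "in"
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        return (b >= 'a' && b <= 'z') ? b - ('a' - 'A') : b;
    }

    @Override
    public int read(byte[] data, int offset, int length) throws IOException {
        int count = in.read(data, offset, length);
        // When count is -1 (end of stream) the loop body never runs.
        for (int i = offset; i < offset + count; i++) {
            if (data[i] >= 'a' && data[i] <= 'z') data[i] -= ('a' - 'A');
        }
        return count;
    }

    // Reads the whole text through the filter, holding it only as a plain InputStream.
    static String filter(String text) throws IOException {
        InputStream in = new UpperCaseInputStream(
            new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] chunk = new byte[8];
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            result.write(chunk, 0, n);
        }
        return new String(result.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(filter("filter streams"));
    }
}
```

The filter() helper refers to the stream only as an InputStream, which matches the observation above that the filter type rarely appears in the reference.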
- The Filter Stream Subclasses
- The java.io package contains many useful filter stream classes. The BufferedInputStream and BufferedOutputStream classes buffer reads and writes by first putting data into a buffer (an internal array of bytes). Thus, an application can read or write bytes to the stream without necessarily calling the underlying native methods. The data is read from or written into the buffer in blocks; subsequent accesses go straight to the buffer. This improves performance in many situations. Buffered input streams also allow the reader to back up and reread data.

The java.io.PrintStream class, which System.out and System.err are instances of, allows very simple printing of primitive values, objects, and string literals. It uses the platform's default character encoding to convert characters into bytes. This class traps all IOExceptions and is primarily intended for debugging. System.out and System.err are the most popular examples of the PrintStream class, but you can connect a PrintStream filter to other output streams as well. For example, you can chain a PrintStream to a FileOutputStream to easily write text into a file.

The PushbackInputStream class has a one-byte pushback buffer so a program can "unread" the last character read. The next time data is read from the stream, the unread character is reread.

The DataInputStream and DataOutputStream classes read and write primitive Java data types and strings in a machine-independent way. (Big-endian for integer types, IEEE-754 for floats and doubles, UTF-8 for Unicode.) These are important enough to justify a chapter of their own and will be discussed in the next chapter. The ObjectInputStream and ObjectOutputStream classes offer the same methods as DataInputStream and DataOutputStream (through the ObjectInput and ObjectOutput interfaces rather than by subclassing) and add methods to read and write arbitrary Java objects as well as primitive data types. These will be taken up in Chapter 11.

The java.util.zip package also includes several filter stream classes. The filter input streams in this package decompress compressed data; the filter output streams compress raw data. These will be discussed in Chapter 9.
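Chaining the filters described above is a matter of passing one stream to the constructor of the next. The sketch below is illustrative only: the class name FilterChainSketch and its helpers are assumptions, and a ByteArrayOutputStream stands in for the FileOutputStream mentioned in the text so the example leaves no files behind.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical demo class showing two filter chains over an in-memory sink.
class FilterChainSketch {

    // DataOutputStream -> BufferedOutputStream -> sink: writes an int as four big-endian bytes.
    static byte[] writeInt(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(sink))) {
            out.writeInt(value);
        }   // close() flushes the buffer into the sink
        return sink.toByteArray();
    }

    // PrintStream -> sink: converts values to text in the platform's default encoding.
    static String printText(String label, int number) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink);
        out.print(label);
        out.print(number);
        out.flush();
        return sink.toString();
    }

    public static void main(String[] args) throws IOException {
        for (byte b : writeInt(258)) {
            System.out.print(b + " ");
        }
        System.out.println();
        System.out.println(printText("n=", 42));
    }
}
```

Swapping the sink for a FileOutputStream would send the same bytes to a file without changing any of the filter code.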
- Buffered Streams
- Buffered input streams read more data than they initially need into a buffer (an internal array of bytes). When the stream's read() methods are invoked, the data is removed from the buffer rather than the underlying stream. When the buffer runs out of data, the buffered stream refills its buffer from the underlying stream. Likewise, buffered output streams store data in an internal byte array until the buffer is full or the stream is flushed; then the data is written out to the underlying output stream in one swoop. In situations where it's almost as fast to read or write several hundred bytes from the underlying stream as it is to read or write a single byte, a buffered stream can provide a significant performance gain.

There are two BufferedInputStream constructors and two BufferedOutputStream constructors:

    public BufferedInputStream(InputStream in)
    public BufferedInputStream(InputStream in, int size)
    public BufferedOutputStream(OutputStream out)
    public BufferedOutputStream(OutputStream out, int size)

The first argument is the underlying stream from which data will be read or to which data will be written. The size argument is the number of bytes in the buffer. If a size isn't specified, a 2048-byte buffer is used. The best size for the buffer depends on the platform and is generally related to the block size of the disk (at least for file streams). Less than 512 bytes is probably too small and more than 4096 bytes is probably too large. Ideally, you want an integral multiple of the block size of the disk. However, you might want to use smaller buffer sizes for unreliable network connections. For example:

    URL u = new URL("https://java.developer.com");
    BufferedInputStream bis = new BufferedInputStream(u.openStream(), 256);

Example 6.4 copies files named on the command line to System.out with buffered reads and writes.

Example 6.4. A BufferedStreamCopier
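As a rough illustration of buffered reads and writes (this is not the book's Example 6.4, whose listing is not shown here), the sketch below copies one stream to another a byte at a time through a pair of buffered streams. The class name BufferedCopySketch and the copy() signature are assumptions.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical demo class; copies byte by byte and lets the buffers batch the real I/O.
class BufferedCopySketch {

    static long copy(InputStream source, OutputStream target, int bufferSize) throws IOException {
        BufferedInputStream in = new BufferedInputStream(source, bufferSize);
        BufferedOutputStream out = new BufferedOutputStream(target, bufferSize);
        long total = 0;
        int b;
        while ((b = in.read()) != -1) {   // served from the input buffer most of the time
            out.write(b);                 // collected in the output buffer
            total++;
        }
        out.flush();                      // push any bytes still sitting in the buffer
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), target, 2048);
        System.out.println(copied + " bytes copied");
    }
}
```

Forgetting the final flush() is the usual mistake with buffered output: the last partial buffer would never reach the underlying stream.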
- PushbackInputStream
- The java.io.PushbackInputStream class provides a pushback buffer so a program can "unread" the last several bytes read. The next time data is read from the stream, the unread bytes are reread.

    public void unread(int b) throws IOException
    public void unread(byte[] data, int offset, int length) throws IOException
    public void unread(byte[] data) throws IOException

By default the buffer is only one byte long, and trying to unread more than one byte throws an IOException. However, you can change the default buffer size with the second constructor:

    public PushbackInputStream(InputStream in)
    public PushbackInputStream(InputStream in, int size)

Although both PushbackInputStream and BufferedInputStream use buffers, only a PushbackInputStream allows unreading, and only a BufferedInputStream allows marking and resetting. In a PushbackInputStream, markSupported() returns false.

    public boolean markSupported()

The read() and available() methods work exactly as with normal input streams. However, they first attempt to read from the pushback buffer.

    public int read() throws IOException
    public int read(byte[] data, int offset, int length) throws IOException
    public int available() throws IOException
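A typical use of unreading is a parser that must look one byte ahead. The following sketch, with the hypothetical class PushbackSketch and helpers readNumber() and secondUnreadFails(), reads a run of ASCII digits and pushes back the first byte that does not belong to the number; it also shows the IOException raised when the default one-byte buffer overflows.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for one-byte lookahead.
class PushbackSketch {

    // Reads leading ASCII digits as a number and leaves the first non-digit unread.
    static int readNumber(PushbackInputStream in) throws IOException {
        int value = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b < '0' || b > '9') {
                in.unread(b);          // give the byte back to the stream
                break;
            }
            value = value * 10 + (b - '0');
        }
        return value;
    }

    // With the default one-byte buffer, a second consecutive unread() fails.
    static boolean secondUnreadFails() throws IOException {
        PushbackInputStream in =
            new PushbackInputStream(new ByteArrayInputStream(new byte[] {'x', 'y'}));
        in.read();
        in.read();
        in.unread('y');                // fills the pushback buffer
        try {
            in.unread('x');            // no room left
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = new PushbackInputStream(
            new ByteArrayInputStream("123abc".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(readNumber(in));
        System.out.println((char) in.read());
        System.out.println(secondUnreadFails());
    }
}
```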
- Print Streams
- System.out and System.err are instances of the java.io.PrintStream class. This is a subclass of FilterOutputStream that converts numbers and objects to text. System.out is primarily used for simple, character-mode applications and for debugging. Its raison d'être is convenience, not robustness; print streams ignore many issues involved in internationalization and error checking. This makes System.out easy to use in quick and dirty hacks and simple examples, while simultaneously making it unsuitable for production code, which should use the java.io.PrintWriter class (discussed in Chapter 15) instead.

The PrintStream class has print() and println() methods that handle every Java data type. The print() and println() methods differ only in that println() prints a platform-specific line terminator after printing its arguments and print() does not. These methods are:

    public void print(boolean b)
    public void print(char c)
    public void print(int i)
    public void print(long l)
    public void print(float f)
    public void print(double d)
    public void print(char[] s)
    public void print(String s)
    public void print(Object o)
    public void println()
    public void println(boolean b)
    public void println(char c)
    public void println(int i)
    public void println(long l)
    public void println(float f)
    public void println(double d)
    public void println(char[] s)
    public void println(String s)
    public void println(Object o)

Anything at all can be passed to a print() method; whatever argument you give is guaranteed to match at least one of these methods. Object types are converted to strings by invoking their toString() method. Primitive types are converted with the appropriate String.valueOf() method.

One aspect of making System.out simple for quick jobs is not in the PrintStream class at all but in the compiler. Because Java overloads the + operator to signify concatenation of strings, primitive data types, and objects, you can pass multiple variables to the ...
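The difference between print() and println(), and the convenience of string concatenation, can be seen in a small sketch. The class PrintStreamSketch and its render() helper are hypothetical; the output is captured in a ByteArrayOutputStream instead of being sent to System.out so that it can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Arrays;

// Hypothetical demo class exercising several print()/println() overloads.
class PrintStreamSketch {

    static String render(int count, double ratio, Object item) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true);
        out.print(count);      // print(int)
        out.print(' ');        // print(char)
        out.print(ratio);      // print(double)
        out.print(' ');
        out.print(item);       // print(Object), converted to a string
        out.println();         // line terminator only
        // The compiler builds a single String here, so only println(String) is called.
        out.println("count=" + count + ", ratio=" + ratio);
        out.flush();
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(3, 0.5, Arrays.asList(1, 2)));
    }
}
```

Because a print stream swallows IOExceptions, code that cares about failures has to ask for them with checkError(); none of the calls above can throw.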
- Multitarget Output Streams
- As a final example, I present two slightly unusual filter output streams that direct their data to multiple underlying streams. The TeeOutputStream class, given in Example 6.5, has not one but two underlying streams. The TeeOutputStream does not modify the data that's written in any way; it merely writes it on both of its underlying streams.

Example 6.5. The TeeOutputStream Class

    package com.macfaq.io;

    import java.io.*;

    public class TeeOutputStream extends FilterOutputStream {

      OutputStream out1;
      OutputStream out2;

      public TeeOutputStream(OutputStream stream1, OutputStream stream2) {
        super(stream1);
        out1 = stream1;
        out2 = stream2;
      }

      public synchronized void write(int b) throws IOException {
        out1.write(b);
        out2.write(b);
      }

      public synchronized void write(byte[] data, int offset, int length)
       throws IOException {
        out1.write(data, offset, length);
        out2.write(data, offset, length);
      }

      public void flush() throws IOException {
        out1.flush();
        out2.flush();
      }

      public void close() throws IOException {
        out1.close();
        out2.close();
      }
    }

It would be possible to store one of the output streams in FilterOutputStream's protected out field and the other in a field in this class. However, it's simpler and cleaner to maintain the parallelism between the two streams by storing them both in the TeeOutputStream class.

I've synchronized the write() methods to make sure that two different threads don't try to write to the same TeeOutputStream at the same time. Depending on unpredictable thread-scheduling issues, this could lead to data being written out of order or in different orders on different streams. It's important to make sure that one write is completely finished on all streams before the next write begins.

Example 6.6 demonstrates how one might use this class to write a TeeCopier ...
- File Viewer, Part 2
- There's a saying among object-oriented programmers that you should create one design just to throw away. Now that we've got filter streams in hand, I'm ready to throw out the monolithic design for the FileDumper program used in Chapter 4. I'm going to rewrite it using a more flexible, extensible, object-oriented approach that relies on multiple chained filters. This allows us to extend the system to handle new formats without rewriting all the old classes. (It also makes some of the examples in subsequent chapters smaller, since I won't have to repeat all the code each time.) The basic idea is to make each interpretation of the data a filter input stream. Bytes from the underlying stream move into the filter; the filter converts the bytes into strings. Since more bytes generally come out of the filter than go into it (for instance, the single byte 32 is replaced by the four bytes "0", "3", "2", " " in decimal dump format), our filter streams buffer the data as necessary.

The architecture revolves around the abstract DumpFilter class shown in Example 6.9. The public interface of this class is identical to that of FilterInputStream. Internally, a buffer holds the string interpretation of each byte as an array of bytes. The read() method returns bytes from this array as long as possible. An index field tracks the next available byte. When index reaches the length of the array, the abstract fill() method is invoked to read from the underlying stream and place data in the buffer. By changing how the fill() method translates the bytes it reads into the bytes in the buffer, you can change how the data is interpreted.

Example 6.9. DumpFilter

    package com.macfaq.io;

    import java.io.*;

    public abstract class DumpFilter extends FilterInputStream {

      // This is really an array of unsigned bytes.
      protected int[] buf = new int[0];
      protected int index = 0;

      public DumpFilter(InputStream in) {
        super(in);
      }

      public int read() throws IOException {
        int result;
        if (index < buf.length) {
          result = buf[index];
          index++;
        } // end if
        else {
          try {
            this.fill();
            // fill is required to put at least one byte
            // in the buffer or throw an EOF or IOException.
            result = buf[0];
            index = 1;
          }
          catch (EOFException e) {result = -1;}
        } // end else
        return result;
      }

      protected abstract void fill() throws IOException;

      public int read(byte[] data, int offset, int length) throws IOException {
        if (data == null) {
          throw new NullPointerException();
        }
        else if ((offset < 0) || (offset > data.length) || (length < 0)
         || ((offset + length) > data.length) || ((offset + length) < 0)) {
          throw new ArrayIndexOutOfBoundsException();
        }
        else if (length == 0) {
          return 0;
        }

        // Check for end of stream.
        int datum = this.read();
        if (datum == -1) {
          return -1;
        }
        data[offset] = (byte) datum;
        int bytesRead = 1;
        try {
          for (; bytesRead < length ; bytesRead++) {
            datum = this.read();
            // In case of end of stream, return as much as we've got,
            // then wait for the next call to read to return -1.
            if (datum == -1) break;
            data[offset + bytesRead] = (byte) datum;
          }
        }
        catch (IOException e) {
          // Return what's already in the data array.
        }
        return bytesRead;
      }

      public int available() throws IOException {
        return buf.length - index;
      }

      public long skip(long bytesToSkip) throws IOException {
        long bytesSkipped = 0;
        for (; bytesSkipped < bytesToSkip; bytesSkipped++) {
          int c = this.read();
          if (c == -1) break;
        }
        return bytesSkipped;
      }

      public synchronized void mark(int readlimit) {}

      public synchronized void reset() throws IOException {
        throw new IOException("marking not supported");
      }

      public boolean markSupported() {
        return false;
      }
    }
- Chapter 7: Data Streams
- Data streams read and write strings, integers, floating-point numbers, and other data that's commonly presented at a higher level than mere bytes. The java.io.DataInputStream and java.io.DataOutputStream classes read and write the primitive Java data types (boolean, int, double, etc.) and strings in a particular, well-defined, platform-independent format. Since DataInputStream and DataOutputStream use the same formats, they're complementary. What a data output stream writes, a data input stream can read. These classes are especially useful when you need to move data between platforms that may use different native formats for integers or floating-point numbers.
- The Data Stream Classes
- The java.io.DataInputStream and java.io.DataOutputStream classes are subclasses of FilterInputStream and FilterOutputStream, respectively.

    public class DataInputStream extends FilterInputStream implements DataInput
    public class DataOutputStream extends FilterOutputStream implements DataOutput

They have all the usual methods you've come to associate with input and output stream classes, such as read(), write(), flush(), available(), skip(), close(), markSupported(), and reset(). (Data input streams support marking if, and only if, their underlying input stream supports marking.) However, the real purpose of DataInputStream and DataOutputStream is not to read and write raw bytes using the standard input and output stream methods. It's to read and interpret multibyte data like ints, floats, doubles, and chars.

The java.io.DataInput interface declares 15 methods that read various kinds of data:

    public abstract boolean readBoolean() throws IOException
    public abstract byte readByte() throws IOException
    public abstract int readUnsignedByte() throws IOException
    public abstract short readShort() throws IOException
    public abstract int readUnsignedShort() throws IOException
    public abstract char readChar() throws IOException
    public abstract int readInt() throws IOException
    public abstract long readLong() throws IOException
    public abstract float readFloat() throws IOException
    public abstract double readDouble() throws IOException
    public abstract String readLine() throws IOException
    public abstract String readUTF() throws IOException
    public void readFully(byte[] data) throws IOException
    public void readFully(byte[] data, int offset, int length) throws IOException
    public int skipBytes(int n) throws IOException

These methods are all available from the DataInputStream class and any other class that implements ...
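Because the two classes use the same formats, a value written with one can be read back with the other, provided the reads happen in the same order as the writes. The sketch below illustrates this with an in-memory buffer; the class name DataStreamRoundTrip and its write() and read() helpers are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class: writes four values, then reads them back in the same order.
class DataStreamRoundTrip {

    static byte[] write(int i, double d, boolean flag, String s) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(i);        // 4 bytes
            out.writeDouble(d);     // 8 bytes
            out.writeBoolean(flag); // 1 byte
            out.writeUTF(s);        // 2-byte length followed by the encoded characters
        }
        return sink.toByteArray();
    }

    static String read(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int i = in.readInt();
            double d = in.readDouble();
            boolean flag = in.readBoolean();
            String s = in.readUTF();
            return i + " " + d + " " + flag + " " + s;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = write(42, 1.5, true, "data");
        System.out.println(bytes.length + " bytes: " + read(bytes));
    }
}
```

The stream carries no type information, so reading the fields in a different order would silently produce garbage rather than an error.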
- Reading and Writing Integers
The DataOutputStream class has methods for writing all of Java's primitive integer data types: byte, short, int, and long. The DataInputStream class has methods to read these types. It also has methods for reading two integer data types not directly supported by Java or the DataOutputStream class: the unsigned byte and the unsigned short.

While Java's platform independence guarantees that you don't have to worry about precise data formats when working exclusively in Java, you frequently need to read data created by a program written in another language. Similarly, it's not unusual to have to write data that will be read by a program written in a different language. For example, most Java network clients (like HotJava) talk primarily to servers written in other languages, and most Java network servers (like the Java Web Server) talk primarily to clients written in other languages. You cannot naively assume that the data format Java uses is the data format other programs will understand; you must take care to understand and recognize the data formats being used.

Although other schemes are possible, almost all modern computers have standardized on binary arithmetic performed on integers composed of an integral number of bytes. Furthermore, they've standardized on two's complement arithmetic for signed numbers. In two's complement arithmetic, the most significant bit is 1 for a negative number and 0 for a positive number; the absolute value of a negative number is calculated by taking the complement of the number and adding 1. In Java terms, this means (-n == ~n + 1) is true where n is a negative int.

Regrettably, this is about all that's been standardized. One big difference between computer architectures is the size of an int. Probably the majority of modern computers use four-byte integers that can hold a number between -2,147,483,648 and 2,147,483,647. However, some systems are moving to 64-bit architectures where the native integer ranges from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and takes eight bytes. And many older systems use 16-bit integers that only range from -32,768 to 32,767. Exactly how many bytes a C compiler uses for each

Additional content appearing in this section has been removed.
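To make the signed and unsigned readings concrete, here is a small sketch that interprets the same two bytes with readShort() and readUnsignedShort(), and checks the two's complement identity mentioned in the text. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the book's examples.
class UnsignedReads {

  // Interprets two bytes as a signed, big-endian, two's complement short.
  static int readSignedShort(byte[] twoBytes) {
    try {
      return new DataInputStream(new ByteArrayInputStream(twoBytes)).readShort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Interprets the same two bytes as an unsigned value from 0 to 65,535.
  static int readUnsignedShort(byte[] twoBytes) {
    try {
      return new DataInputStream(new ByteArrayInputStream(twoBytes)).readUnsignedShort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // The two's complement identity: negating n equals complementing it and adding 1.
  static boolean negationIsComplementPlusOne(int n) {
    return -n == ~n + 1;
  }

  public static void main(String[] args) {
    byte[] allOnes = {(byte) 0xFF, (byte) 0xFF};
    System.out.println(readSignedShort(allOnes));
    System.out.println(readUnsignedShort(allOnes));
    System.out.println(negationIsComplementPlusOne(-5));
  }
}
```

The unsigned result is returned as an int because Java has no unsigned 16-bit type large enough to hold 65,535 as a short.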
- Reading and Writing Floating-Point Numbers
Java understands two floating-point number formats, both specified by the IEEE 754 standard. Floats are stored in four bytes with a 1-bit sign, a 24-bit mantissa, and an 8-bit exponent. Float values range from 1.40129846432481707×10^-45 to 3.40282346638528860×10^38, either positive or negative. Doubles take up eight bytes with a one-bit sign, 53-bit mantissa, and 11-bit exponent. This gives them a range of 4.94065645841246544×10^-324 to 1.79769313486231570×10^308, either positive or negative. Both floats and doubles also have representations of positive and negative zero, positive and negative infinity, and not a number (or NaN).

Astute readers will notice that the number of bits given for floats and doubles adds up to 33 and 65 bits, respectively, one too many for the width of the number. A trick is used whereby the first bit of the mantissa of a nonzero number is assumed to be 1. With this trick, it is unnecessary to include the first bit of the mantissa. Thus, an extra bit of precision is gained for free.

The details of this format are too complicated to discuss here. You can order the actual specification from the IEEE for about $29.00. That's approximately $1.50 a page, more than a little steep in my opinion. The specification isn't available online, but it was published in the February 1987 issue of ACM SIGPLAN Notices (Volume 22, #2, pp. 9-18), which should be available in any good technical library. The main thing you need to know is that these formats are supported by most modern RISC architectures and by all Pentium and Motorola 680x0 chips with either external or internal floating-point units (FPUs). Nowadays the only chips that don't natively support this format are a few embedded processors and some old 486SX, 68LC040, and other earlier FPU-less chips in legacy hardware. And even these systems are able to emulate IEEE 754 floating-point arithmetic in software.

The DataInputStream class reads and the DataOutputStream class writes floating-point numbers of either four or eight bytes in length, as specified in the IEEE 754 standard. They do not support the 10-byte and longer long

Additional content appearing in this section has been removed.
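The bit layout described above can be inspected with Float.floatToIntBits(), which returns the 32 bits of a float as an int. The sketch below, with hypothetical class and method names, pulls out the three stored fields; the stored mantissa field is 23 bits wide because the leading 1 bit is implied.

```java
// Hypothetical demo class; not part of the book's examples.
class FloatBits {

  // The sign is the most significant bit: 0 for positive, 1 for negative.
  static int sign(float f) {
    return Float.floatToIntBits(f) >>> 31;
  }

  // The next 8 bits hold the exponent, stored with a bias of 127.
  static int exponentField(float f) {
    return (Float.floatToIntBits(f) >>> 23) & 0xFF;
  }

  // The low 23 bits hold the mantissa without its implied leading 1.
  static int mantissaField(float f) {
    return Float.floatToIntBits(f) & 0x7FFFFF;
  }

  public static void main(String[] args) {
    float[] samples = {1.0f, -0.75f};
    for (float f : samples) {
      System.out.println(f + ": sign=" + sign(f)
          + " exponent=" + exponentField(f)
          + " mantissa=0x" + Integer.toHexString(mantissaField(f)));
    }
  }
}
```

For -0.75, which is -1.5 × 2^-1, the exponent field is 126 (127 - 1) and the mantissa field holds only the fractional .5, since the leading 1 is not stored.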
- Reading and Writing Booleans
The DataOutputStream class has a writeBoolean() method and the DataInputStream class has a corresponding readBoolean() method:

    public final void writeBoolean(boolean b) throws IOException
    public final boolean readBoolean() throws IOException

Although theoretically a single bit could be used to indicate the value of a boolean, in practice a whole byte is used. This makes alignment much simpler and doesn't waste enough space to be an issue on modern machines. The writeBoolean() method writes a zero byte (0x00) to indicate false, a one byte (0x01) to indicate true. The readBoolean() method interprets 0 as false and any positive number as true. Negative numbers indicate end of stream and lead to an EOFException being thrown.

Additional content appearing in this section has been removed.
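A short sketch of these rules, using in-memory streams so that the bytes can be inspected directly. The class and method names are hypothetical. Note that a byte such as 0xFF arrives from the underlying read() as the positive value 255, so it is read as true; only the -1 that signals end of stream triggers the EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the book's examples.
class BooleanBytes {

  // Returns the single byte that writeBoolean() produces.
  static byte encode(boolean b) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try {
      new DataOutputStream(bytes).writeBoolean(b);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray()[0];
  }

  // Reads one boolean from a stream containing only the given byte.
  static boolean decode(byte value) {
    try {
      return new DataInputStream(new ByteArrayInputStream(new byte[] {value})).readBoolean();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns true if reading from an empty stream raises EOFException.
  static boolean endOfStreamDetected() {
    try {
      new DataInputStream(new ByteArrayInputStream(new byte[0])).readBoolean();
      return false;
    } catch (EOFException e) {
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(encode(true) + " " + encode(false));
    System.out.println(decode((byte) 0x7F) + " " + decode((byte) 0));
    System.out.println(endOfStreamDetected());
  }
}
```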
- Reading Byte Arrays
As already mentioned, the DataInputStream class has the usual two methods for reading bytes into a byte array:

    public int read(byte[] data) throws IOException
    public int read(byte[] data, int offset, int length) throws IOException

Neither of these methods guarantees that all the bytes requested will be read. Instead, you're expected to check the number of bytes actually read, then call read() again for a different part of the array as necessary. For example, to read 1024 bytes from the InputStream in into the byte array data:

    int offset = 0;
    while (true) {
      int bytesRead = in.read(data, offset, data.length - offset);
      if (bytesRead == -1) break;   // end of stream; don't add -1 to offset
      offset += bytesRead;
      if (offset >= data.length) break;
    }

The DataInputStream class has two readFully() methods that provide this logic. Each reads repeatedly from the underlying input stream until the array data or specified portion thereof is filled.

    public final void readFully(byte[] data) throws IOException
    public final void readFully(byte[] data, int offset, int length) throws IOException

If the data runs out before the array is filled and no more data is forthcoming, then an EOFException (a subclass of IOException) is thrown.

Additional content appearing in this section has been removed.
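The difference between read() and readFully() only shows up when the underlying stream delivers data in small pieces. The sketch below simulates that with a hypothetical trickle() helper that hands out at most one byte per bulk read; the class and method names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the book's examples.
class ReadFullyDemo {

  // A stream that returns at most one byte per bulk read, imitating a slow source.
  static InputStream trickle(byte[] source) {
    return new FilterInputStream(new ByteArrayInputStream(source)) {
      @Override
      public int read(byte[] b, int off, int len) throws IOException {
        return super.read(b, off, Math.min(len, 1));
      }
    };
  }

  // One plain read() call: returns how many bytes actually arrived.
  static int singleRead(byte[] source, int wanted) {
    try {
      return trickle(source).read(new byte[wanted], 0, wanted);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // readFully() keeps reading until the whole array is filled.
  static byte[] readAll(byte[] source, int wanted) {
    byte[] data = new byte[wanted];
    try {
      new DataInputStream(trickle(source)).readFully(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return data;
  }

  // Returns true if readFully() ran out of data before filling the array.
  static boolean hitsEndOfStream(byte[] source, int wanted) {
    try {
      new DataInputStream(trickle(source)).readFully(new byte[wanted]);
      return false;
    } catch (EOFException e) {
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```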
- Reading and Writing Text
Because of the difficulties caused by different character sets, reading and writing text is one of the trickiest things you can do with streams. Most of the time, text should be handled with readers and writers, a subject we'll take up in Chapter 15. However, the DataInputStream and DataOutputStream classes do provide methods a Java program can use to read and write text that another Java program will understand. The text format used is a compressed form of Unicode called UTF-8. It's unlikely that other, non-Java programs will understand this format unless they've been specially coded to interoperate with text data written by Java, especially since Java's UTF-8 differs slightly from the standard UTF-8 used in XML and elsewhere.

Java strings and chars are Unicode. However, Unicode isn't particularly efficient. Most files of English text contain almost nothing but ASCII characters. Thus, using two bytes for these characters is really overkill. UTF-8 solves this problem by encoding the ASCII characters in a single byte at the expense of having to use three bytes for many more of the less common characters. For the purposes of this chapter, UTF-8 provides a more efficient way to read and write strings; it is used by the readUTF() and writeUTF() methods implemented by the DataInputStream and DataOutputStream classes. For a full description of UTF-8, see Chapter 14.

The variant form of UTF-8 that these classes use is intended for string literals embedded in compiled byte code and serialized Java objects and for communication between two Java programs. It is not intended for reading and writing arbitrary UTF-8 text. To read standard UTF-8, you should use an InputStreamReader; to write it, you should use an OutputStreamWriter. These classes do not improperly encode the null character and will be discussed in Chapter 15.

Additional content appearing in this section has been removed.
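The difference between the two encodings is easiest to see on a string containing the null character. The following sketch, with hypothetical class and method names, compares the bytes produced by writeUTF() with the standard UTF-8 bytes produced by String.getBytes(). It assumes Java 7 or later for StandardCharsets.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the book's examples.
class ModifiedUtf8Demo {

  // Bytes written by writeUTF(): a two-byte length, then Java's variant UTF-8.
  static byte[] writeUTF(String s) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try {
      new DataOutputStream(bytes).writeUTF(s);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Standard UTF-8 with no length prefix.
  static byte[] standardUtf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Formats bytes as space-separated, two-digit hexadecimal values.
  static String hex(byte[] data) {
    StringBuilder sb = new StringBuilder();
    for (byte b : data) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String s = "A\0"; // the letter A followed by the null character
    System.out.println(hex(writeUTF(s)));
    System.out.println(hex(standardUtf8(s)));
  }
}
```

In the writeUTF() output the null character occupies two bytes (C0 80) and the data is preceded by a big-endian byte count, which is why a standard UTF-8 decoder cannot read it directly.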
- Miscellaneous Methods
The DataInputStream and DataOutputStream classes each have one method left to discuss, skipBytes() and size(), respectively.

The DataOutputStream class has a protected field called written that stores the number of bytes written to the output stream since it was constructed. The value of this field is returned by the public size() method:

    protected int written
    public final int size()

Every time you invoke writeInt(), writeBytes(), writeUTF(), or some other write method, the written
field is incremented by the number of bytes written. This might be useful if for some reason you're trying to limit the number of bytes you write. For instance, you may prefer to open a new file when you reach some preset size rather than continuing to write into a very large file.

The DataInputStream class's skipBytes() method skips over a specified number of bytes without reading them. Unlike a single call to the skip() method of java.io.InputStream that DataInputStream inherits, skipBytes() keeps calling skip() on the underlying stream until n bytes have been skipped or the underlying stream skips no more:

    public final int skipBytes(int n) throws IOException
    public long skip(long n) throws IOException

skipBytes() returns the number of bytes actually skipped. This is normally n, but it can be smaller, possibly zero, for example when the end of the stream is reached first. The DataInput contract in current JDK documentation states that skipBytes() never throws an EOFException, so check the return value if you need to be sure that all n bytes were skipped. An IOException is thrown if the underlying stream throws an IOException.

Additional content appearing in this section has been removed.
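The sketch below exercises both methods against in-memory streams. The class and method names are hypothetical. It relies on the behavior documented for DataInput.skipBytes() in current JDK releases, namely that the method may skip fewer than n bytes and reports the actual count through its return value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the book's examples.
class SizeAndSkipDemo {

  // Returns size() after writing an int, a short UTF string, and a boolean.
  static int bytesWritten() {
    DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
    try {
      out.writeInt(7);        // 4 bytes
      out.writeUTF("hi");     // 2-byte length + 2 bytes of text
      out.writeBoolean(true); // 1 byte
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.size();
  }

  // Asks skipBytes() to skip n bytes of data and returns what it reports.
  static int skip(byte[] data, int n) {
    try {
      return new DataInputStream(new ByteArrayInputStream(data)).skipBytes(n);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("size() = " + bytesWritten());
    System.out.println("skipped " + skip(new byte[10], 4) + " of 4");
    System.out.println("skipped " + skip(new byte[3], 8) + " of 8");
  }
}
```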
- Reading and Writing Little-Endian Numbers
It's likely that at some point in time you'll need to read a file full of little-endian data, especially if you're working on Intel hardware or with data written by native code on such a platform. The java.io stream classes have essentially no support for little-endian numbers. The LittleEndianOutputStream class in Example 7.8 and the LittleEndianInputStream class in Example 7.9 provide the support you need to do this. These classes are closely modeled on the java.io.DataInputStream and java.io.DataOutputStream classes. Some of the methods in these classes do exactly the same thing as the same methods in the DataInputStream and DataOutputStream classes. After all, a big-endian byte is no different from a little-endian byte. In fact, these two classes come very close to implementing the java.io.DataInput and java.io.DataOutput interfaces. Actually doing so would have been a bad idea, however, because client programmers will expect objects implementing DataInput and DataOutput to use big-endian numbers, and it's best not to go against such common assumptions.

I also considered making the little-endian classes subclasses of DataInputStream and DataOutputStream. While this would have eliminated some duplicated methods like readBoolean() and writeBoolean(), it would also have required the new, little-endian methods to have unwieldy names like readLittleEndianInt() and writeLittleEndianInt(). Furthermore, it's unlikely you'll need to read or write both little-endian and big-endian numbers from the same stream. Most streams will contain one or the other but not both.

Example 7.8. A LittleEndianOutputStream Class

    /*
     * @(#)LittleEndianOutputStream.java  1.0 98/08/29
     */
    package com.macfaq.io;

    import java.io.*;

    /**
     * A little-endian output stream writes primitive Java numbers
     * and characters to an output stream in a little-endian format.
     * The standard java.io.DataOutputStream class which this class
     * imitates uses big-endian integers.
     *
     * @author  Elliotte Rusty Harold
     * @version 1.0, 29 Aug 1998
     * @see     com.macfaq.io.LittleEndianInputStream
     * @see     java.io.DataOutputStream
     */
    public class LittleEndianOutputStream extends FilterOutputStream {

      /**
       * The number of bytes written so far to the little-endian output stream.
       */
      protected int written;

      /**
       * Creates a new little-endian output stream and chains it to the
       * output stream specified by the out argument.
       *
       * @param   out   the underlying output stream.
       * @see     java.io.FilterOutputStream#out
       */
      public LittleEndianOutputStream(OutputStream out) {
        super(out);
      }

      /**
       * Writes the specified byte value to the underlying output stream.
       *
       * @param      b   the <code>byte</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public synchronized void write(int b) throws IOException {
        out.write(b);
        written++;
      }

      /**
       * Writes <code>length</code> bytes from the specified byte array
       * starting at <code>offset</code> to the underlying output stream.
       *
       * @param      data     the data.
       * @param      offset   the start offset in the data.
       * @param      length   the number of bytes to write.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public synchronized void write(byte[] data, int offset, int length)
       throws IOException {
        out.write(data, offset, length);
        written += length;
      }

      /**
       * Writes a <code>boolean</code> to the underlying output stream as
       * a single byte. If the argument is true, the byte value 1 is written.
       * If the argument is false, the byte value <code>0</code> is written.
       *
       * @param      b   the <code>boolean</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeBoolean(boolean b) throws IOException {
        if (b) this.write(1);
        else this.write(0);
      }

      /**
       * Writes out a <code>byte</code> to the underlying output stream
       *
       * @param      b   the <code>byte</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeByte(int b) throws IOException {
        out.write(b);
        written++;
      }

      /**
       * Writes a two byte <code>short</code> to the underlying output stream in
       * little-endian order, low byte first.
       *
       * @param      s   the <code>short</code> to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeShort(int s) throws IOException {
        out.write(s & 0xFF);
        out.write((s >>> 8) & 0xFF);
        written += 2;
      }

      /**
       * Writes a two byte <code>char</code> to the underlying output stream
       * in little-endian order, low byte first.
       *
       * @param      c   the <code>char</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeChar(int c) throws IOException {
        out.write(c & 0xFF);
        out.write((c >>> 8) & 0xFF);
        written += 2;
      }

      /**
       * Writes a four-byte <code>int</code> to the underlying output stream
       * in little-endian order, low byte first, high byte last
       *
       * @param      i   the <code>int</code> to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeInt(int i) throws IOException {
        out.write(i & 0xFF);
        out.write((i >>> 8) & 0xFF);
        out.write((i >>> 16) & 0xFF);
        out.write((i >>> 24) & 0xFF);
        written += 4;
      }

      /**
       * Writes an eight-byte <code>long</code> to the underlying output stream
       * in little-endian order, low byte first, high byte last
       *
       * @param      l   the <code>long</code> to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeLong(long l) throws IOException {
        out.write((int) l & 0xFF);
        out.write((int) (l >>> 8) & 0xFF);
        out.write((int) (l >>> 16) & 0xFF);
        out.write((int) (l >>> 24) & 0xFF);
        out.write((int) (l >>> 32) & 0xFF);
        out.write((int) (l >>> 40) & 0xFF);
        out.write((int) (l >>> 48) & 0xFF);
        out.write((int) (l >>> 56) & 0xFF);
        written += 8;
      }

      /**
       * Writes a 4 byte Java float to the underlying output stream in
       * little-endian order.
       *
       * @param      f   the <code>float</code> value to be written.
       * @exception  IOException  if an I/O error occurs.
       */
      public final void writeFloat(float f) throws IOException {
        this.writeInt(Float.floatToIntBits(f));
      }

      /**
       * Writes an 8 byte Java double to the underlying output stream in
       * little-endian order.
       *
       * @param      d   the <code>double</code> value to be written.
       * @exception  IOException  if an I/O error occurs.
       */
      public final void writeDouble(double d) throws IOException {
        this.writeLong(Double.doubleToLongBits(d));
      }

      /**
       * Writes a string to the underlying output stream as a sequence of
       * bytes. Each character is written to the data output stream as
       * if by the <code>writeByte()</code> method.
       *
       * @param      s   the <code>String</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       * @see        com.macfaq.io.LittleEndianOutputStream#writeByte(int)
       * @see        com.macfaq.io.LittleEndianOutputStream#out
       */
      public void writeBytes(String s) throws IOException {
        int length = s.length();
        for (int i = 0; i < length; i++) {
          out.write((byte) s.charAt(i));
        }
        written += length;
      }

      /**
       * Writes a string to the underlying output stream as a sequence of
       * characters. Each character is written to the data output stream as
       * if by the <code>writeChar</code> method.
       *
       * @param      s   a <code>String</code> value to be written.
       * @exception  IOException  if the underlying stream throws an IOException.
       * @see        com.macfaq.io.LittleEndianOutputStream#writeChar(int)
       * @see        com.macfaq.io.LittleEndianOutputStream#out
       */
      public void writeChars(String s) throws IOException {
        int length = s.length();
        for (int i = 0; i < length; i++) {
          int c = s.charAt(i);
          out.write(c & 0xFF);
          out.write((c >>> 8) & 0xFF);
        }
        written += length * 2;
      }

      /**
       * Writes a string of no more than 65,535 characters
       * to the underlying output stream using UTF-8
       * encoding. This method first writes a two byte short
       * in <b>big</b> endian order as required by the
       * UTF-8 specification. This gives the number of bytes in the
       * UTF-8 encoded version of the string, not the number of characters
       * in the string. Next each character of the string is written
       * using the UTF-8 encoding for the character.
       *
       * @param      s   the string to be written.
       * @exception  UTFDataFormatException if the string is longer than
       *             65,535 characters.
       * @exception  IOException  if the underlying stream throws an IOException.
       */
      public void writeUTF(String s) throws IOException {
        int numchars = s.length();
        int numbytes = 0;

        // First pass: count the encoded bytes.
        for (int i = 0 ; i < numchars ; i++) {
          int c = s.charAt(i);
          if ((c >= 0x0001) && (c <= 0x007F)) numbytes++;
          else if (c > 0x07FF) numbytes += 3;
          else numbytes += 2;
        }

        if (numbytes > 65535) throw new UTFDataFormatException();

        // The length prefix is always big-endian.
        out.write((numbytes >>> 8) & 0xFF);
        out.write(numbytes & 0xFF);

        // Second pass: write the encoded characters.
        for (int i = 0 ; i < numchars ; i++) {
          int c = s.charAt(i);
          if ((c >= 0x0001) && (c <= 0x007F)) {
            out.write(c);
          }
          else if (c > 0x07FF) {
            out.write(0xE0 | ((c >> 12) & 0x0F));
            out.write(0x80 | ((c >>  6) & 0x3F));
            out.write(0x80 | (c & 0x3F));
            written += 2;
          }
          else {
            out.write(0xC0 | ((c >>  6) & 0x1F));
            out.write(0x80 | (c & 0x3F));
            written += 1;
          }
        }

        written += numchars + 2;
      }

      /**
       * Returns the number of bytes written to this little-endian output stream.
       * (This class is not thread-safe with respect to this method. It is
       * possible that this number is temporarily less than the actual
       * number of bytes written.)
       *
       * @return  the value of the <code>written</code> field.
       * @see     com.macfaq.io.LittleEndianOutputStream#written
       */
      public int size() {
        return this.written;
      }
    }
Additional content appearing in this section has been removed.
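The byte order produced by the shifting scheme in writeInt() can be checked in isolation. The sketch below uses hypothetical class and method names and compares the hand-rolled result with java.nio.ByteBuffer, a class added to the JDK in Java 1.4 that can also produce little-endian bytes. The comparison is offered as a cross-check, not as the book's approach.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of the book's examples.
class LittleEndianCheck {

  // Same shifting scheme as writeInt() in Example 7.8: low byte first.
  static byte[] littleEndianBytes(int i) {
    return new byte[] {
      (byte) i, (byte) (i >>> 8), (byte) (i >>> 16), (byte) (i >>> 24)
    };
  }

  // Reassembles an int from four little-endian bytes.
  static int fromLittleEndian(byte[] b) {
    return (b[0] & 0xFF)
        | ((b[1] & 0xFF) << 8)
        | ((b[2] & 0xFF) << 16)
        | ((b[3] & 0xFF) << 24);
  }

  // The same encoding produced by java.nio for comparison.
  static byte[] viaByteBuffer(int i) {
    return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(i).array();
  }

  public static void main(String[] args) {
    for (byte b : littleEndianBytes(0x12345678)) {
      System.out.print(Integer.toHexString(b & 0xFF) + " ");
    }
    System.out.println();
  }
}
```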
- Thread Safety
The LittleEndianInputStream class is not perfectly thread-safe. Consider the readInt() method:

    public int readInt() throws IOException {
      int byte1 = in.read();
      int byte2 = in.read();
      int byte3 = in.read();
      int byte4 = in.read();
      if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
        throw new EOFException();
      }
      return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
    }

If two threads are trying to read from this input stream at the same time, there is no guarantee that bytes 1 through 4 will be read in order. The first thread might read bytes 1 and 2, then the second thread could preempt it and read any number of bytes. When the first thread regained control, it would no longer be able to read bytes 3 and 4, but would read whichever bytes happened to be next in line. It would then return an erroneous result.

A synchronized block would solve this problem neatly:

    public int readInt() throws IOException {
      int byte1, byte2, byte3, byte4;
      synchronized (this) {
        byte1 = in.read();
        byte2 = in.read();
        byte3 = in.read();
        byte4 = in.read();
      }
      if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
        throw new EOFException();
      }
      return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
    }

It isn't necessary to synchronize the entire method, only the four lines that read from the underlying stream. However, this solution is still imperfect. It is remotely possible that another thread has a reference to the underlying stream rather than the little-endian input stream and will try to read directly from that. Therefore, you might be better off synchronizing on the underlying input stream in.

However, this would only prevent another thread from reading from the underlying input stream if the second thread also synchronized on the underlying input stream. In general you can't count on this, so it's not really a solution. In fact, Java really doesn't provide a good means to guarantee thread safety when you have to modify objects you don't control passed as arguments to your methods.

Additional content appearing in this section has been removed.
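The variant that synchronizes on the underlying stream is described in the text but not shown. A minimal sketch might look like the following; the class name UnderlyingLockReader and the decode() helper are hypothetical. As the text warns, the lock only helps if every other reader of the underlying stream also synchronizes on it.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical class for illustration; not the book's LittleEndianInputStream.
class UnderlyingLockReader extends FilterInputStream {

  UnderlyingLockReader(InputStream in) {
    super(in);
  }

  // Reads a little-endian int while holding the underlying stream's monitor.
  public int readInt() throws IOException {
    int byte1, byte2, byte3, byte4;
    synchronized (in) { // lock the wrapped stream, not this filter
      byte1 = in.read();
      byte2 = in.read();
      byte3 = in.read();
      byte4 = in.read();
    }
    if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
      throw new EOFException();
    }
    return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
  }

  // Convenience helper for trying the method on four in-memory bytes.
  static int decode(byte[] fourBytes) {
    try {
      return new UnderlyingLockReader(new ByteArrayInputStream(fourBytes)).readInt();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```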
- File Viewer, Part 3
In Chapter 4, I introduced a FileDumper program that could print the raw bytes of a file in ASCII, hexadecimal, or decimal. In this chapter, I'm going to expand that program so that it can interpret the file as containing binary numbers of varying widths. In particular I'm going to make it possible to dump a file as shorts, unsigned shorts, ints, longs, floats, and doubles. Integer types may be either big-endian or little-endian. The main class, FileDumper3, is shown in Example 7.10. As in Chapter 4, this program reads a series of filenames and arguments from the command line in the main() method. Each filename is passed to a method that opens a file input stream from the file. Depending on the command-line arguments, a particular subclass of DumpFilter from Chapter 6 is selected and chained to the input stream. Finally, the StreamCopier.copy() method pours data from the input stream onto System.out.

Example 7.10. The FileDumper3 Class

    import java.io.*;
    import com.macfaq.io.*;

    public class FileDumper3 {

      public static final int ASC    = 0;
      public static final int DEC    = 1;
      public static final int HEX    = 2;
      public static final int SHORT  = 3;
      public static final int INT    = 4;
      public static final int LONG   = 5;
      public static final int FLOAT  = 6;
      public static final int DOUBLE = 7;

      public static void main(String[] args) {

        if (args.length < 1) {
          System.err.println(
           "Usage: java FileDumper3 [-ahdsilfx] [-little] file1 file2...");
        }

        boolean bigEndian = true;
        int firstFile = 0;
        int mode = ASC;

        // Process command-line switches.
        for (firstFile = 0; firstFile < args.length; firstFile++) {
          if (!args[firstFile].startsWith("-")) break;
          if (args[firstFile].equals("-h")) mode = HEX;
          else if (args[firstFile].equals("-d")) mode = DEC;
          else if (args[firstFile].equals("-s")) mode = SHORT;
          else if (args[firstFile].equals("-i")) mode = INT;
          else if (args[firstFile].equals("-l")) mode = LONG;
          else if (args[firstFile].equals("-f")) mode = FLOAT;
          else if (args[firstFile].equals("-x")) mode = DOUBLE;
          else if (args[firstFile].equals("-little")) bigEndian = false;
        }

        for (int i = firstFile; i < args.length; i++) {
          try {
            InputStream in = new FileInputStream(args[i]);
            dump(in, System.out, mode, bigEndian);

            if (i < args.length-1) {  // more files to dump
              System.out.println();
              System.out.println("--------------------------------------");
              System.out.println();
            }
          }
          catch (Exception e) {
            System.err.println(e);
            e.printStackTrace();
          }
        }
      }

      public static void dump(InputStream in, OutputStream out, int mode,
       boolean bigEndian) throws IOException {

        // The reference variable in may point to several different objects
        // within the space of the next few lines. We can attach
        // more filters here to do decompression, decryption, and more.

        if (bigEndian) {
          DataInputStream din = new DataInputStream(in);
          switch (mode) {
            case HEX:
              in = new HexFilter(in);
              break;
            case DEC:
              in = new DecimalFilter(in);
              break;
            case INT:
              in = new IntFilter(din);
              break;
            case SHORT:
              in = new ShortFilter(din);
              break;
            case LONG:
              in = new LongFilter(din);
              break;
            case DOUBLE:
              in = new DoubleFilter(din);
              break;
            case FLOAT:
              in = new FloatFilter(din);
              break;
            default:
          }
        }
        else {
          LittleEndianInputStream lin = new LittleEndianInputStream(in);
          switch (mode) {
            case HEX:
              in = new HexFilter(in);
              break;
            case DEC:
              in = new DecimalFilter(in);
              break;
            case INT:
              in = new LEIntFilter(lin);
              break;
            case SHORT:
              in = new LEShortFilter(lin);
              break;
            case LONG:
              in = new LELongFilter(lin);
              break;
            case DOUBLE:
              in = new LEDoubleFilter(lin);
              break;
            case FLOAT:
              in = new LEFloatFilter(lin);
              break;
            default:
          }
        }
        StreamCopier.copy(in, out);
        in.close();
      }
    }
Additional content appearing in this section has been removed.
- Chapter 8: Streams in Memory
In the last several chapters, you've learned how to use streams to move data between a running Java program and external programs and stores. Streams can also be used to move data from one part of a Java program to another. This chapter explores three such methods. Sequence input streams chain several input streams together so that they appear as a single stream. Byte array streams allow output to be stored in byte arrays and input to be read from byte arrays. Finally, piped input and output streams allow output from one thread to become input for another thread.
- Sequence Input Streams
The java.io.SequenceInputStream class connects multiple input streams together in a particular order:

    public class SequenceInputStream extends InputStream

Reads from a SequenceInputStream first read all the bytes from the first stream in the sequence, then all the bytes from the second stream in the sequence, then all the bytes from the third stream, and so on. When the end of one of the streams is reached, that stream is closed; the next data comes from the next stream. Of course, this assumes that the streams in the sequence are in fact finite. There are two constructors for this class:

    public SequenceInputStream(Enumeration e)
    public SequenceInputStream(InputStream in1, InputStream in2)

The first constructor creates a sequence out of all the elements of the Enumeration e. This assumes all objects in the enumeration are input streams. If this isn't the case, a ClassCastException will be thrown the first time a read is attempted from an object that is not an InputStream. The second constructor creates a sequence input stream that reads first from in1, then from in2. Note that in1 or in2 may themselves be sequence input streams, so repeated application of this constructor allows a sequence input stream with an indefinite number of underlying streams to be created. For example, to read the home pages of both JavaSoft and AltaVista, you might do this:

    try {
      URL u1 = new URL("https://java.sun.com/");
      URL u2 = new URL("https://www.altavista.com");
      SequenceInputStream sin = new SequenceInputStream(u1.openStream(),
       u2.openStream());
    }
    catch (IOException e) {
      //...
    }

Example 8.1 reads a series of filenames from the command line, creates a sequence input stream from file input streams for each file named, then copies the contents of all the files onto System.out. The SequenceInputStream class already provides the necessary layer of abstraction for this problem. There's nothing to be gained by constructing a new object that chains streams together and prints them. Therefore, this class only has a

Additional content appearing in this section has been removed.
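The Enumeration-based constructor can be tried without any network access by chaining in-memory streams. The following is a sketch with hypothetical class and method names; it uses Collections.enumeration() to build the Enumeration and assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not Example 8.1 from the book.
class SequenceDemo {

  // Wraps each string in its own stream, chains them, and reads the result.
  static String concatenate(String... parts) {
    List<InputStream> streams = new ArrayList<>();
    for (String part : parts) {
      streams.add(new ByteArrayInputStream(part.getBytes(StandardCharsets.US_ASCII)));
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (InputStream in = new SequenceInputStream(Collections.enumeration(streams))) {
      int b;
      while ((b = in.read()) != -1) { // -1 only after the last stream ends
        out.write(b);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) {
    System.out.println(concatenate("abc", "def", "g"));
  }
}
```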
- Byte Array Streams
- Content preview·Buy reprint rights for this chapterIt's sometimes convenient to use stream methods to manipulate data in byte arrays. For example, you might receive an array of raw bytes that you want to interpret as double-precision, floating-point numbers. (This is common when using UDP to transfer data across the Internet, for one example.) The quickest way to do this is to use a
DataInputStream
. However, before you can create a data input stream, you first need to create a raw, byte-oriented stream. This is what thejava.io.ByteArrayInputStream
class gives you. Similarly, you might want to send a group of double-precision, floating-point numbers across the network with UDP. Before you can do this, you have to convert the numbers into bytes. The simplest solution is to use a data output stream chained to ajava.io.ByteArrayOutputStream
. By chaining the data output stream to a byte array output stream, you can write the binary form of the floating-point numbers into a byte array, then send the entire array in a single packet.Byte array input and output streams are commonly used when sending and receiving UDP data over the Internet. Unlike the more common TCP data, which acts like the streams I discuss in this book, UDP data arrives in raw packets of bytes, which do not necessarily have any relation to the previous packet or the next packet. Each packet is just a group of bytes to be processed in isolation from other packets. Thus, you may get nothing for several seconds, or even minutes, and then suddenly have a few hundred numbers to deal with.In Java, UDP data is sent and received via thejava.net.DatagramSocket
andjava.net.DatagramPacket
classes. Thereceive()
method of theDatagramSocket
class returns its data in aDatagramPacket
, which is little more than a wrapper around a byte array. This byte array can be easily used as the source of aByteArrayInputStream
. UDP is discussed in more detail in Chapter 9 of my book Java Network Programming (O'Reilly & Associates, 1997).Additional content appearing in this section has been removed.
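The round trip described above, doubles into a byte array and back, can be sketched as follows. This is only an illustration: the class name DoublePacketCodec and its two methods are hypothetical, and the network step is left out entirely, so the byte array stands in for the payload of a datagram.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper, not from the book: converts doubles to and from
// the kind of raw byte array that a single UDP packet would carry.
class DoublePacketCodec {

    // Chain a data output stream to a byte array output stream and
    // write the 8-byte big-endian form of each double into it.
    public static byte[] encode(double[] values) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream(values.length * 8);
        DataOutputStream dout = new DataOutputStream(bout);
        for (double value : values) {
            dout.writeDouble(value);
        }
        dout.flush();
        return bout.toByteArray();
    }

    // Use part of a byte array as the source of a byte array input stream
    // and read length/8 doubles back out of it.
    public static double[] decode(byte[] data, int offset, int length) throws IOException {
        DataInputStream din =
            new DataInputStream(new ByteArrayInputStream(data, offset, length));
        double[] values = new double[length / 8];
        for (int i = 0; i < values.length; i++) {
            values[i] = din.readDouble();
        }
        return values;
    }
}
```

With a real DatagramPacket, the array, offset, and length passed to decode() would presumably come from the packet's getData(), getOffset(), and getLength() methods.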
- Communicating Between Threads with Piped Streams
The java.io.PipedInputStream class and java.io.PipedOutputStream class provide a convenient means to move streaming data from one thread to another. Output from one thread becomes input for the other thread, as shown in Figure 8.1.

Figure 8.1: Data moving between threads with piped streams

public class PipedInputStream extends InputStream
public class PipedOutputStream extends OutputStream

The PipedInputStream class has two constructors:

public PipedInputStream()
public PipedInputStream(PipedOutputStream source) throws IOException

The no-argument constructor creates a piped input stream that is not yet connected to a piped output stream. The second constructor creates a piped input stream that's connected to the piped output stream source.

The PipedOutputStream class also has two constructors:

public PipedOutputStream(PipedInputStream sink) throws IOException
public PipedOutputStream()

The no-argument constructor creates a piped output stream that is not yet connected to a piped input stream. The second constructor creates a piped output stream that's connected to the piped input stream sink.

Piped streams are normally created in pairs. The piped output stream becomes the underlying source for the piped input stream. For example:

PipedOutputStream pout = new PipedOutputStream();
PipedInputStream pin = new PipedInputStream(pout);

This simple example is a little deceptive, because these lines of code will normally be in different methods and perhaps even different classes. Some mechanism must be established to pass a reference to the PipedOutputStream into the thread that handles the PipedInputStream. Or you can create them in the same thread, then pass a reference to the connected stream into a separate thread. Alternately, you can reverse the order:
Additional content appearing in this section has been removed.
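As a rough sketch of the second approach, creating the connected pair in one thread and handing one end to another, consider the following. The class PipeDemo and its method are hypothetical names invented for this illustration; only the PipedInputStream and PipedOutputStream calls come from java.io.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical demonstration class, not from the book.
class PipeDemo {

    // Creates a connected pair, lets a second thread write the data into
    // the pipe, and adds up the (unsigned) bytes read in the calling thread.
    public static int sumThroughPipe(final byte[] data)
            throws IOException, InterruptedException {
        final PipedOutputStream pout = new PipedOutputStream();
        PipedInputStream pin = new PipedInputStream(pout);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                // Closing the output end is what tells the reader
                // that the stream has ended.
                try (PipedOutputStream out = pout) {
                    out.write(data);
                } catch (IOException ex) {
                    // The reading end went away; nothing more to do here.
                }
            }
        });
        producer.start();

        int sum = 0;
        int b;
        while ((b = pin.read()) != -1) {
            sum += b;
        }
        pin.close();
        producer.join();
        return sum;
    }
}
```

The writer may block partway through a large array until the reader drains the pipe's internal buffer, which is why the two ends belong in different threads.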
- Chapter 9: Compressing Streams
The java.util.zip package, shown in Figure 9.1, contains six stream classes and another half dozen assorted classes that read and write data in zip, gzip, and inflate/deflate formats. Java uses these classes to read and write JAR archives and to display PNG images. You can use the java.util.zip classes as general-purpose utilities for compression and decompression. Among other things, these classes make it trivial to write a simple file compression or decompression program.

Figure 9.1: The java.util.zip package hierarchy

The java.util.zip.Deflater and java.util.zip.Inflater classes provide compression and decompression services for all other classes. They are Java's compression and decompression engines. These classes support several related compression formats, including zlib, deflate, and gzip. These formats are documented in RFCs 1950, 1951, and 1952. (See ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html.) They all use the Lempel-Ziv 1977 (LZ77) compression algorithm (named after the inventors, Jacob Ziv and Abraham Lempel), though each has a different way of storing metadata that describes an archive's contents. Since compression and decompression are extremely CPU-intensive operations, for the most part these classes are Java wrappers around native methods written in C. More precisely, these are wrappers around the zlib compression library written by Jean-Loup Gailly and Mark Adler. According to Greg Roelofs, writing on the zlib web page at https://www.cdrom.com/pub/infozip/zlib/, "zlib is designed to be a free, general-purpose, legally unencumbered—that is, not covered by any patents—lossless data-compression library for use on virtually any computer hardware and operating system."
Additional content appearing in this section has been removed.
- Inflaters and Deflaters
The java.util.zip.Deflater and java.util.zip.Inflater classes provide compression and decompression services for all other classes. They are Java's compression and decompression engines. These classes support several related compression formats, including zlib, deflate, and gzip. These formats are documented in RFCs 1950, 1951, and 1952. (See ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html.) They all use the Lempel-Ziv 1977 (LZ77) compression algorithm (named after the inventors, Jacob Ziv and Abraham Lempel), though each has a different way of storing metadata that describes an archive's contents. Since compression and decompression are extremely CPU-intensive operations, for the most part these classes are Java wrappers around native methods written in C. More precisely, these are wrappers around the zlib compression library written by Jean-Loup Gailly and Mark Adler. According to Greg Roelofs, writing on the zlib web page at https://www.cdrom.com/pub/infozip/zlib/, "zlib is designed to be a free, general-purpose, legally unencumbered—that is, not covered by any patents—lossless data-compression library for use on virtually any computer hardware and operating system."

Without going into excessive detail, zip, gzip, and zlib all compress data in more or less the same way. Repeated bit sequences in the input data are replaced with pointers back to the first occurrence of that bit sequence. Other tricks are used, but this is basically how these compression schemes work, and it has certain implications for compression and decompression code. First, you can't randomly access data in a compressed file. To decompress the nth byte of data, you must first decompress bytes 1 through n-1 of the data. Second, a single twiddled bit doesn't just change the meaning of the byte it's part of. It also changes the meaning of bytes that come after it in the data, since subsequent bytes may be stored as copies of the previous bytes. Therefore, compressed files are much more susceptible to corruption than uncompressed files. For more general information about compression and archiving algorithms and formats, the
Additional content appearing in this section has been removed.
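To give a feel for how raw these engines are, here is a minimal sketch of a round trip through a Deflater and an Inflater working on byte arrays. The class name ZlibRoundTrip and the 512-byte working buffer are arbitrary choices made for this illustration, not anything prescribed by the text.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not from the book.
class ZlibRoundTrip {

    // Feed the whole input to a Deflater and collect the zlib-format output.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();                 // no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();                    // release native resources
        return out.toByteArray();
    }

    // Feed the compressed bytes to an Inflater and collect the original data.
    public static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Avoid looping forever on truncated input.
                    throw new DataFormatException("incomplete compressed data");
                }
                out.write(buffer, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Highly repetitive input, such as a long run of the same byte, should shrink dramatically, since almost all of it can be stored as back references.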
- Compressing and Decompressing Streams
The Inflater and Deflater classes are a little raw for easy digestion. It would be more convenient to write uncompressed data onto an output stream and have it compressed by the stream itself, without having to worry about the mechanics of deflation. Similarly, it would be useful to have an input stream class that could read from a compressed file but return the uncompressed data. Java, in fact, has several classes that do exactly this. The java.util.zip.DeflaterOutputStream class is a filter stream that compresses the data it receives in deflated format before writing it out to the underlying stream. The java.util.zip.InflaterInputStream class inflates deflated data before passing it to the reading program. java.util.zip.GZIPInputStream and java.util.zip.GZIPOutputStream do the same thing except with the gzip format.

DeflaterOutputStream is a filter stream that deflates data before writing it onto the underlying stream:

public class DeflaterOutputStream extends FilterOutputStream

Each stream uses a protected Deflater object called def to compress data stored in a protected internal buffer called buf:

protected Deflater def;
protected byte[] buf;

The same deflater must not be used in multiple streams at the same time, though Java takes no steps to guarantee this.

The underlying output stream that receives the deflated data, the deflater object def, and the length of the byte array buf are all set by one of the three DeflaterOutputStream constructors:

public DeflaterOutputStream(OutputStream out, Deflater def, int bufferLength)
public DeflaterOutputStream(OutputStream out, Deflater def)
public DeflaterOutputStream(OutputStream out)

The underlying output stream must be specified. The buffer length defaults to 512 bytes, and the
Additional content appearing in this section has been removed.
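The following sketch shows the stream classes doing the same work with much less ceremony. It uses the three-argument DeflaterOutputStream constructor listed above; the class name StreamCompression, its method names, and the 1024-byte buffer length are assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, not from the book.
class StreamCompression {

    // Deflate by writing through a filter stream with an explicit
    // deflater and buffer length.
    public static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        Deflater def = new Deflater(Deflater.BEST_COMPRESSION);
        try (DeflaterOutputStream dout = new DeflaterOutputStream(bout, def, 1024)) {
            dout.write(data);
        } finally {
            def.end();  // the stream does not end a deflater it was handed
        }
        return bout.toByteArray();
    }

    public static byte[] inflate(byte[] compressed) throws IOException {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return readAll(in);
        }
    }

    // The gzip streams are used the same way; only the format differs.
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (GZIPOutputStream gout = new GZIPOutputStream(bout)) {
            gout.write(data);
        }
        return bout.toByteArray();
    }

    public static byte[] gunzip(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return readAll(in);
        }
    }

    // Copies everything remaining in the stream into a byte array.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```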
- Working with Zip Files
Gzip and deflate are compression formats. Zip is both a compression and an archive format. This means that a single zip file may contain more than one uncompressed file, along with information about the names, permissions, creation and modification dates, and other information about each file in the archive. This makes reading and writing zip archives somewhat more complex and somewhat less amenable to a stream metaphor than reading and writing deflated or gzipped files.

The java.util.zip.ZipFile class represents a file in the zip format. Such a file might be created by zip, PKZip, ZipIt, WinZip, or any of the many other zip programs. The java.util.zip.ZipEntry class represents a single file stored in such an archive.

public class ZipFile extends Object implements ZipConstants
public class ZipEntry extends Object implements ZipConstants

The java.util.zip.ZipConstants interface that both these classes implement is a rare nonpublic interface that contains constants useful for reading and writing zip files. Most of these constants define the positions in a zip file where particular information, like the compression method used, is found. You don't need to concern yourself with it.

The ZipFile class contains two constructors. The first takes a filename as an argument. The second takes a java.io.File object as an argument. File objects will be discussed in Chapter 12; for now, I'll just use the constructor that accepts a filename. Functionally, these two constructors are similar.

public ZipFile(String filename) throws IOException
public ZipFile(File file) throws ZipException, IOException

ZipException is a subclass of IOException that generally indicates that data in the zip file doesn't fit the zip format. In this case, the zip exception's message will contain more details, like "invalid END header signature" or "cannot have more than one drive." While these may be useful to a zip expert, in general they indicate that the file is corrupted, and there's not much that can be done about it.
Additional content appearing in this section has been removed.
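A short sketch of the filename constructor in use: the hypothetical ZipLister class below opens an archive and collects the name of each ZipEntry. It relies on the entries() method of ZipFile, which the preview above does not reach but which is part of java.util.zip.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not from the book.
class ZipLister {

    // Opens the named zip file and returns the names of the files stored in it.
    // A file that isn't valid zip data is reported with a ZipException,
    // which is a kind of IOException.
    public static List<String> entryNames(String filename) throws IOException {
        List<String> names = new ArrayList<String>();
        try (ZipFile zf = new ZipFile(filename)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```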
- Checksums
Compressed files are especially susceptible to corruption. While changing a bit from 0 to 1 or vice versa in a text file generally only affects a single character, changing a single bit in a compressed file often makes the entire file unreadable. Therefore, it's customary to store a checksum with the compressed file so that the recipient can verify that the file is intact. The zip format does this automatically, but you may wish to use manual checksums in other circumstances as well.

There are many different checksum schemes. A particularly simple example adds a parity bit to the data, typically 1 if the number of 1 bits is odd, 0 if the number of 1 bits is even. This checksum can be calculated by summing up the number of 1 bits and taking the remainder when that sum is divided by two. However, this scheme isn't very robust. It can detect single-bit errors, but in the face of bursts of errors as often occur in transmissions over modems and other noisy connections, there's a 50/50 chance that corrupt data will be reported as correct.

Better checksum schemes use more bits. For example, a 16-bit checksum could sum up the number of 1 bits and take the remainder modulo 65,536. This means that in the face of completely random data, there's only a 1 in 65,536 chance of corrupt data being reported as correct. This chance drops exponentially as the number of bits in the checksum increases. More mathematically sophisticated schemes can reduce the likelihood of a false positive even further. For more details about checksums, see "Everything you wanted to know about CRC algorithms, but were afraid to ask for fear that errors in your understanding might be detected," by Ross Williams, available from https://www.geocities.com/CapeCanaveral/Launchpad/3632/crcguide.htm. Of course, the advantage of a class library is that you only really need to understand the interface of the classes you use and what they do in broad perspective. You don't necessarily have to know all the technical details of the algorithms used inside the classes.
Additional content appearing in this section has been removed.
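The two toy schemes described above are easy to write down. The sketch below does so in a hypothetical SimpleChecksums class and, for comparison, also asks the java.util.zip.CRC32 class for a 32-bit checksum without needing to know anything about how it is computed.

```java
import java.util.zip.CRC32;

// Hypothetical helper class, not from the book.
class SimpleChecksums {

    // Counts the 1 bits in the data.
    private static long countOneBits(byte[] data) {
        long ones = 0;
        for (byte b : data) {
            ones += Integer.bitCount(b & 0xFF);
        }
        return ones;
    }

    // Parity bit: 1 if the number of 1 bits is odd, 0 if it is even.
    public static int parity(byte[] data) {
        return (int) (countOneBits(data) % 2);
    }

    // 16-bit checksum: the number of 1 bits modulo 65,536.
    public static int sixteenBit(byte[] data) {
        return (int) (countOneBits(data) % 65536);
    }

    // A far more robust checksum supplied by the class library.
    public static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }
}
```

Flipping any two bits leaves the parity unchanged, which is exactly the weakness the text points out: parity({0x00}) and parity({0x03}) are both 0.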
- JAR Files
Java 1.1 added support for Java ARchive files, JAR files for short. JAR files bundle the many different classes, images, and sound files an applet requires into a single file. It is generally faster for a web browser to download one JAR file than to download the individual files the archive contains, since only one HTTP connection is required. An applet stored in a JAR file, instead of as merely loose .class files, is embedded in a web page with an <applet> tag with an archive attribute pointing to the JAR file. For example:

<applet code=NavigationMenu archive="NavigationMenu.jar" width=400 height=80>
</applet>

The code attribute still says that the main class of this applet is called NavigationMenu. However, a Java 1.1 web browser, rather than asking the web server for the file NavigationMenu.class as a Java 1.0 web browser would, asks the web server for the file NavigationMenu.jar. Then the browser looks inside NavigationMenu.jar to find the file NavigationMenu.class. Only if it doesn't find NavigationMenu.class inside NavigationMenu.jar does it then go back to the web server and ask for NavigationMenu.class. Now suppose the NavigationMenu applet tries to load an image called menu.gif. The applet will look for this file inside the JAR archive too. It only has to make a new connection to the web server if it can't find menu.gif in the archive.

Sun wisely decided not to attempt to define a new file format for JAR files. Instead, they stuck with the tried-and-true zip format. This means that the classes, images, sounds, and other files stored inside a JAR archive can be compressed, making the applet even faster to download. This also means that standard tools like PKZip and standard zip libraries like java.util.zip can work with JAR files.

JAR files have also become Java's preferred means of distributing Java Beans and class libraries. For instance, the Java Cryptography Extension, discussed in the next chapter, is mostly a set of classes packed up in the file
Additional content appearing in this section has been removed.
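Because a JAR file is a zip file, the plain zip classes can read one. The sketch below builds a tiny archive in memory with java.util.jar.JarOutputStream and then lists it with nothing but java.util.zip.ZipInputStream. The class name JarAsZip and its methods are hypothetical, and the archive is written without a manifest to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical demonstration class, not from the book.
class JarAsZip {

    // Writes a one-entry JAR archive (no manifest) into a byte array.
    public static byte[] makeJar(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (JarOutputStream jout = new JarOutputStream(bout)) {
            jout.putNextEntry(new JarEntry(entryName));
            jout.write(content);
            jout.closeEntry();
        }
        return bout.toByteArray();
    }

    // Lists the archive's entries using only the general-purpose zip classes.
    public static List<String> listWithZipClasses(byte[] jarBytes) throws IOException {
        List<String> names = new ArrayList<String>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```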
- File Viewer, Part 4
Because of the nature of filter streams, it is relatively straightforward to add decompression services to the FileDumper program last seen in Chapter 7. Generally, you'll want to decompress a file before dumping it. Adding decompression does not require a new dump filter. Instead, it simply requires passing the file through an inflater input stream before passing it to one of the dump filters. We'll let the user choose from either gzipped or deflated files with the command-line switches -gzip and -deflated. When one of these switches is seen, the appropriate inflater input stream is selected; it is an error to select both. Example 9.15, FileDumper4, demonstrates.

Example 9.15. FileDumper4

import java.io.*;
import java.util.zip.*;
import com.macfaq.io.*;

public class FileDumper4 {

  public static final int ASC = 0;
  public static final int DEC = 1;
  public static final int HEX = 2;
  public static final int SHORT = 3;
  public static final int INT = 4;
  public static final int LONG = 5;
  public static final int FLOAT = 6;
  public static final int DOUBLE = 7;

  public static void main(String[] args) {

    if (args.length < 1) {
      System.err.println("Usage: java FileDumper4 [-ahdsilfx] [-little] "
       + "[-gzip|-deflated] file1...");
    }

    boolean bigEndian = true;
    int firstFile = 0;
    int mode = ASC;
    boolean deflated = false;
    boolean gzipped = false;

    // Process command-line switches.
    for (firstFile = 0; firstFile < args.length; firstFile++) {
      if (!args[firstFile].startsWith("-")) break;
      if (args[firstFile].equals("-h")) mode = HEX;
      else if (args[firstFile].equals("-d")) mode = DEC;
      else if (args[firstFile].equals("-s")) mode = SHORT;
      else if (args[firstFile].equals("-i")) mode = INT;
      else if (args[firstFile].equals("-l")) mode = LONG;
      else if (args[firstFile].equals("-f")) mode = FLOAT;
      else if (args[firstFile].equals("-x")) mode = DOUBLE;
      else if (args[firstFile].equals("-little")) bigEndian = false;
      else if (args[firstFile].equals("-deflated") && !gzipped) deflated = true;
      else if (args[firstFile].equals("-gzip") && !deflated) gzipped = true;
    }

    for (int i = firstFile; i < args.length; i++) {
      try {
        InputStream in = new FileInputStream(args[i]);
        dump(in, System.out, mode, bigEndian, deflated, gzipped);

        if (i < args.length-1) {  // more files to dump
          System.out.println();
          System.out.println("--------------------------------------");
          System.out.println();
        }
      }
      catch (Exception e) {
        System.err.println(e);
        e.printStackTrace();
      }
    }
  }

  public static void dump(InputStream in, OutputStream out, int mode,
   boolean bigEndian, boolean deflated, boolean gzipped) throws IOException {

    // The reference variable in may point to several different objects
    // within the space of the next few lines. We can attach
    // more filters here to do decompression, decryption, and more.

    if (deflated) {
      in = new InflaterInputStream(in);
    }
    else if (gzipped) {
      in = new GZIPInputStream(in);
    }

    // could really pass to FileDumper3 at this point
    if (bigEndian) {
      DataInputStream din = new DataInputStream(in);
      switch (mode) {
        case HEX:
          in = new HexFilter(in);
          break;
        case DEC:
          in = new DecimalFilter(in);
          break;
        case INT:
          in = new IntFilter(din);
          break;
        case SHORT:
          in = new ShortFilter(din);
          break;
        case LONG:
          in = new LongFilter(din);
          break;
        case DOUBLE:
          in = new DoubleFilter(din);
          break;
        case FLOAT:
          in = new FloatFilter(din);
          break;
        default:
      }
    }
    else {
      LittleEndianInputStream lin = new LittleEndianInputStream(in);
      switch (mode) {
        case HEX:
          in = new HexFilter(in);
          break;
        case DEC:
          in = new DecimalFilter(in);
          break;
        case INT:
          in = new LEIntFilter(lin);
          break;
        case SHORT:
          in = new LEShortFilter(lin);
          break;
        case LONG:
          in = new LELongFilter(lin);
          break;
        case DOUBLE:
          in = new LEDoubleFilter(lin);
          break;
        case FLOAT:
          in = new LEFloatFilter(lin);
          break;
        default:
      }
    }
    StreamCopier.copy(in, out);
    in.close();
  }
}
Additional content appearing in this section has been removed.
- Chapter 10: Cryptographic Streams
This chapter discusses filter streams that implement some sort of cryptography. The Java core API contains two of these in the java.security package, DigestInputStream and DigestOutputStream. There are two more cryptography streams in the javax.crypto package, CipherInputStream and CipherOutputStream. All four of these streams use an engine object to handle the filtering. DigestInputStream and DigestOutputStream use a MessageDigest object, while CipherInputStream and CipherOutputStream use a Cipher object. The streams rely on the programmer to properly initialize and, in the case of the digest streams, clean up after the engines. Therefore, we'll first look at the engine classes, then at the streams built around these engines.

In a sane world, these classes would all be part of the core API in a java.crypto package. Regrettably, U.S. export laws prohibit the export of cryptographic software without special permission. Therefore, the cryptography API and associated classes must be downloaded separately from the main JDK. Collectively these are called the Java Cryptography Extension, or JCE for short. To protect national security, you'll have to fill out a form promising you're not an international terrorist before you can download it. I feel safer already. If you're outside the United States and Canada, and you're one of the three people worldwide who actually respect U.S. export laws or who can't figure out how to penetrate the incredible security Sun has placed around JCE to make sure it doesn't fall into the hands of international terrorists, there are several third-party implementations of the JCE created outside the United States and thus not subject to its laws, including at least two free ones. These may not be completely synced with the beta release of the JCE 1.2 discussed here, but they should be close by the time you read this.

Although the initial version of the JCE worked with Java 1.1, the only version available from Sun at the time of this writing, JCE 1.2, requires Java 2 to run. The material in this chapter about message digests, hash functions, and digest streams applies to both Java 1.1 and 2. The remainder of the chapter, encryption and decryption mostly, only works in Java 2.
Additional content appearing in this section has been removed.
- Hash Function Basics
Sometimes it's essential to know whether data has changed. For instance, crackers invading Unix systems often replace crucial files like /etc/passwd or /usr/ucb/cc with their own hacked versions that allow them to regain access to the system if the original hole they entered through is plugged. Therefore, if you discover your system has been penetrated, one of the first things you need to do is to replace any changed files. Of course, this raises the question of how you identify the changed files, especially since anybody who's capable of replacing system executables is more than capable of resetting the last-modified date of the files. You can keep an offline copy of the system files, but this is costly and difficult, especially since multiple copies need to be stored for long periods of time. If you don't discover a penetration until several months after it occurred, you may need to roll back the system files to that point in time. Recent backups are likely to have been made after the penetration occurred and thus are also likely to be compromised.

As a less threatening example, suppose you want to be notified whenever a particular web page changes. It's not hard to write a robot that connects to the site at periodic intervals, downloads the page, and compares it to a previously retrieved copy for changes. However, if you need to do this for hundreds or thousands of web pages, the space to store the pages becomes prohibitive. Email clients have similar needs. Many broken mail clients and mailing list managers send multiple copies of the same message. A mail client should recognize when multiple copies of the same message are being passed through the system and delete them. On an ISP level, it might be possible to use this as a spam filter by comparing messages sent to different customers.

All these tasks need a way to compare files at different times without storing the files themselves. You can write a special kind of method called a hash function that reads an indefinite number of sequential bytes and assigns a number to that sequence of bytes. This number is called a
Additional content appearing in this section has been removed.
- The MessageDigest Class
The java.security.MessageDigest class is an abstract class that represents a hash code and its associated algorithm. Concrete subclasses (actually concrete subclasses of java.security.MessageDigestSpi, though the difference isn't relevant from a client's point of view) implement particular, professionally designed, well-known hash code algorithms. Thus, rather than constructing instances of this class directly, you ask the static MessageDigest.getInstance() factory method to provide an implementation of an algorithm with a particular name. Table 10.1 lists the standard names for message digest algorithms. Depending on which service providers are installed, you may or may not have all of these. The JDK 1.1 includes SHA-1 (which is the same as SHA) and MD5 but not MD2. RSA's payware Crypto-J cryptography library also supports MD2. (See https://www.rsa.com/rsa/products/jsafe/.)

Table 10.1: Message Digest Algorithms in Java 1.1
Name | Algorithm
SHA-1 | The Secure Hash Algorithm, as defined in Secure Hash Standard, NIST FIPS 180-1 (National Institute of Standards and Technology Federal Information Processing Standards Publications 180-1); produces 20-byte digests; see https://www.itl.nist.gov/div897/pubs/fip180-1.htm
SHA | Another name for SHA-1
MD2 | RSA-MD2 as defined in RFC 1319 and RFC 1423 (RFC 1423 corrects a mistake in RFC 1319); produces 16-byte digests; suitable for use with digital signatures; see https://www.faqs.org/rfcs/rfc1319.html
Additional content appearing in this section has been removed.
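In practice, asking the factory method for an algorithm by name looks something like the following sketch. The DigestDemo class and its hex formatting are hypothetical conveniences for this illustration; the getInstance(), update(), and digest() calls are the standard MessageDigest methods.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class, not from the book.
class DigestDemo {

    // Computes the SHA-1 digest of the data and returns it as 40 hex digits.
    public static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        // Ask the factory method for an implementation by its standard name.
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        md.update(data);
        byte[] digest = md.digest();   // 20 bytes for SHA-1

        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        return hex.toString();
    }
}
```

If no installed provider supplies the requested algorithm, getInstance() throws a NoSuchAlgorithmException rather than returning null.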
- Digest Streams
The MessageDigest class isn't particularly hard to use, as I hope Example 10.1 and Example 10.2 demonstrated. It's flexible and can be used to calculate a digest for anything that can be converted into a byte array, such as a string, an array of floating point numbers, or the contents of a text area. Nonetheless, the input data almost always comes from streams. Therefore, the java.security package contains an input stream and an output stream class that each possess a MessageDigest object to calculate a digest for the stream as it is read or written. These are DigestInputStream and DigestOutputStream.

The DigestInputStream class is a subclass of FilterInputStream:

public class DigestInputStream extends FilterInputStream

DigestInputStream has all the usual methods of any input stream, like read(), skip(), and close(). It overrides two read() methods to do its filtering. Clients use these methods exactly as they use the read() methods of other input streams:

public int read() throws IOException
public int read(byte[] data, int offset, int length) throws IOException

DigestInputStream does not change the data it reads in any way. However, as each byte or group of bytes is read, it is fed as input to a MessageDigest object stored in the class as the protected digest field:

protected MessageDigest digest;

The digest field is normally set in the constructor:

public DigestInputStream(InputStream stream, MessageDigest digest)

For example:

URL u = new URL("https://java.sun.com");
DigestInputStream din = new DigestInputStream(u.openStream(), MessageDigest.getInstance("SHA"));

The digest is not cloned inside the class. Only a reference to it is stored. Therefore, the message digest used inside the stream should only be used by the stream. Simultaneous or interleaved use by other objects will corrupt the digest.
Additional content appearing in this section has been removed.
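A minimal sketch of a digest stream at work follows. The StreamDigester class is a hypothetical name, and a byte array input stream would do just as well as the URL stream in the example above. The data is simply read and discarded; the digest accumulates as a side effect of reading.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not from the book.
class StreamDigester {

    // Reads the source to its end through a DigestInputStream and
    // returns the digest of everything that was read.
    public static byte[] digestOf(InputStream source, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        // This MessageDigest is used by the stream and by nothing else
        // until the stream has been read to the end.
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (DigestInputStream din = new DigestInputStream(source, md)) {
            byte[] buffer = new byte[1024];
            while (din.read(buffer, 0, buffer.length) != -1) {
                // Nothing to do; reading updates the digest.
            }
        }
        return md.digest();
    }
}
```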
- Encryption Basics
In this section we begin discussing cryptography. The packages, classes, and methods discussed in this and following sections are part of Sun's separately available Java Cryptography Extension (JCE). As a standard extension to Java, the JCE cryptography classes live in the javax package rather than the java package. They are not part of the core API. You will need to download JCE from https://java.sun.com/products/jce/index.html and install it before continuing.

Because Sun is not legally allowed to export the JCE outside the U.S. and Canada, a number of third parties in other countries have implemented their own versions. In particular, Austria's Institute for Applied Information Processing and Communications has released the IAIK-JCE, which is free for noncommercial use and can be retrieved from https://jcewww.iaik.tu-graz.ac.at/products/jce/index.php. Also notable is the more-or-less open source Cryptix package, which can be downloaded from many mirror sites worldwide. See https://www.cryptix.org/.

There are many different kinds of codes and ciphers, both for digital and nondigital data. To be precise, a code encrypts data at word or higher levels. Ciphers encrypt data at the level of letters or, in the case of digital ciphers, bytes. Most ciphers replace each byte in the original, unencrypted data, called plaintext, with a different byte, thus producing encrypted data, called ciphertext. There are many different possible algorithms for determining how plaintext is transformed into ciphertext (encryption) and how the ciphertext is transformed back into plaintext (decryption).

All the algorithms discussed here, and included in the JCE, are key-based. The key is a sequence of bytes used to parameterize the cipher. The same algorithm will encrypt the same plaintext differently when a different key is used. Decryption also requires a key. Good algorithms make it effectively impossible to decrypt ciphertext without knowing the right key.
Additional content appearing in this section has been removed.
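The role of the key can be illustrated with a short sketch. Note the assumptions: it uses the AES algorithm in CBC mode, which is available in current Java runtimes but is not one of the algorithms this chapter discusses, and the KeyedCipherDemo class, its method names, and the fixed 16-byte keys and initialization vector are all invented for the example. A real program would not use a constant initialization vector.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demonstration class, not from the book.
// AES is an assumption of this sketch; the text discusses other algorithms.
class KeyedCipherDemo {

    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] input)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        // The key (16 bytes here) parameterizes the algorithm.
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(input);
    }

    // Plaintext in, ciphertext out.
    public static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext)
            throws GeneralSecurityException {
        return run(Cipher.ENCRYPT_MODE, key, iv, plaintext);
    }

    // Ciphertext in, plaintext out; requires the same key and iv.
    public static byte[] decrypt(byte[] key, byte[] iv, byte[] ciphertext)
            throws GeneralSecurityException {
        return run(Cipher.DECRYPT_MODE, key, iv, ciphertext);
    }
}
```

Encrypting the same plaintext under two different keys produces two different ciphertexts, and only the matching key turns either one back into the original bytes.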
- The Cipher Class
The javax.crypto.Cipher class is a concrete class that encrypts arrays of bytes. The default implementation performs no encryption, but you'll never see this. You'll only receive subclasses that implement particular algorithms.

public class Cipher extends Object

The subclasses of Cipher that do real encryption are supplied by providers. Different providers can provide different sets of algorithms. For instance, an authoritarian government might only allow the installation of algorithms it knew how to crack, and create a provider that provided those algorithms and only those algorithms. A corporation might want to install algorithms that allowed for key recovery in the event that an employee left the company or forgot their password.

JDK 1.2 includes only the Sun provider, which supplies no encryption schemes, though it does supply several digest algorithms. The JCE adds one more provider, SunJCE, which provides DES, triple DES (DESede), and password-based encryption (PBE). RSA's payware JSafe product has a security provider that provides the RSA, DES, DESede, RC2, RC4, and RC5 cipher algorithms. Ireland's Baltimore Technologies payware J/Crypto software has a security provider that provides the RSA, DES, DESede, RC2, RC4, and PBE cipher algorithms. Table 10.2 lists several of the available security providers and the algorithms they implement.

Table 10.2: Security Providers
Product (Company, Country) | URL | Digests | Ciphers | License
Additional content appearing in this section has been removed.
- Cipher Streams
The Cipher class is the engine that powers encryption. Chapter 10 and Example 10.7 showed how this class could be used to encrypt and decrypt data read from a stream. The javax.crypto package also provides CipherInputStream and CipherOutputStream filter streams that use a Cipher object to encrypt or decrypt data passed through the stream. Like DigestInputStream and DigestOutputStream, they aren't a great deal of use in themselves. However, you can chain them in the middle of several other streams. For example, if you chain a GZIPOutputStream to a CipherOutputStream that is chained to a FileOutputStream, you can compress, encrypt, and write to a file, all with a single call to write(). This is shown in Figure 10.3. Similarly, you might read from a URL with the input stream returned by openStream(), decrypt the data read with a CipherInputStream, then check the decrypted data with a DigestInputStream, then finally pass it all into an InputStreamReader for conversion from ISO Latin-1 to Unicode. On the other side of the connection, a web server could read a file from its hard drive, write the file onto a socket with an output stream, calculate a digest with a DigestOutputStream, and encrypt the file with a CipherOutputStream.

Figure 10.3: The CipherOutputStream in the middle of a chain of filters

CipherInputStream is a subclass of FilterInputStream.

public class CipherInputStream extends FilterInputStream

CipherInputStream has all the usual methods of any input stream, like read(), skip(), and close(). It overrides seven of these methods to do its filtering:

public int read() throws IOException
public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException
public long skip(long n) throws IOException
public int available() throws IOException
public void close() throws IOException
public boolean markSupported()
Additional content appearing in this section has been removed.
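The compress-then-encrypt chain described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the book's own example: the class name CipherChainSketch is hypothetical, a ByteArrayOutputStream stands in for the FileOutputStream so the sketch needs no file, and AES in CBC mode is assumed in place of the DES-era algorithms the chapter discusses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.GeneralSecurityException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

final class CipherChainSketch {
    /** Compresses and encrypts the bytes, then reverses both steps and returns the result. */
    static byte[] roundTrip(byte[] plain) {
        try {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            enc.init(Cipher.ENCRYPT_MODE, key);     // the provider chooses a random IV
            byte[] iv = enc.getIV();

            // GZIP -> cipher -> sink: one write() call compresses, encrypts, and stores
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = new GZIPOutputStream(new CipherOutputStream(sink, enc))) {
                out.write(plain);
            }

            // Reading reverses the order: decrypt first, then decompress
            Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
            dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            try (InputStream in = new GZIPInputStream(
                    new CipherInputStream(new ByteArrayInputStream(sink.toByteArray()), dec))) {
                byte[] buffer = new byte[1024];
                for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                    result.write(buffer, 0, n);
                }
            }
            return result.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Closing the outermost stream matters here: it finishes the GZIP trailer and lets the cipher stream write its final padded block.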
- File Viewer, Part 5
- Handling a particular form of encryption in the FileDumper program is not hard. Handling the general case is. It's not that decryption is difficult. In fact, it's quite easy. However, most encryption schemes require more than simply providing a key. You also need to know an assortment of algorithm parameters, like initialization vector, salt, iteration count, and more. Higher-level protocols are usually used to pass this information between the encryption program and the decryption program. The most common type of protocol is to simply store the information unencrypted at the beginning of the encrypted file. You saw an example of this in the FileDecryptor and FileEncryptor programs. The FileEncryptor chose a random initialization vector and placed its length and the vector itself at the beginning of the encrypted file so the decryptor could easily find it.

For the next iteration of the FileDumper program, I am going to use the simplest available encryption scheme, DES in ECB mode with PKCS5Padding. Furthermore, the key will simply be the first eight bytes of the password. This is probably the least secure algorithm discussed in this chapter; however, it doesn't require an initialization vector, salt, or other meta-information to be passed between the encryptor and the decryptor. Because of the nature of filter streams, it is relatively straightforward to add decryption services to the FileDumper program, assuming you know the format in which the encrypted data is stored. Generally, you'll want to decrypt a file before dumping it. This does not require a new dump filter. Instead, I simply pass the file through a cipher input stream before passing it to one of the dump filters.

When a file is both compressed and encrypted, compression is usually performed first. Therefore, we'll always decompress after decrypting. The reason is twofold. Since encryption schemes make data appear random, and compression works by taking advantage of redundancy in nonrandom data, it is difficult, if not impossible, to compress encrypted files. In fact, one quick test of how good an encryption scheme is checks whether encrypted files are compressible; if they are, it's virtually certain the encryption scheme is flawed and can be broken. Conversely, compressing files before encrypting them removes redundancy from the data that a code breaker can exploit. Therefore, it may serve to shore up some weaker algorithms. On the other hand, some algorithms have been broken by taking advantage of magic numbers and other known plaintext sequences that some compression programs insert into the compressed data. Thus, there's no guarantee that compressing files before encrypting them will make them harder to penetrate. The best option is simply to use the strongest encryption that's available to you.

Additional content appearing in this section has been removed.
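As a rough illustration of the scheme just described (DES in ECB mode with PKCS5Padding, keyed by the first eight bytes of the password), the sketch below wraps an arbitrary input stream in a CipherInputStream so that a later filter sees plaintext. It is not the FileDumper code itself: the class DesPasswordSketch and its method names are hypothetical, and encoding the password as UTF-8 is an assumption. DESKeySpec reads only the first eight bytes of the array it is given and rejects shorter input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.DESKeySpec;

final class DesPasswordSketch {
    /** Builds a DES/ECB/PKCS5Padding cipher keyed by the first eight bytes of the password. */
    private static Cipher desCipher(int mode, String password) throws GeneralSecurityException {
        byte[] material = password.getBytes(StandardCharsets.UTF_8); // needs at least 8 bytes
        SecretKey key = SecretKeyFactory.getInstance("DES").generateSecret(new DESKeySpec(material));
        Cipher cipher = Cipher.getInstance("DES/ECB/PKCS5Padding");
        cipher.init(mode, key);   // ECB takes no initialization vector
        return cipher;
    }

    static byte[] encrypt(byte[] plain, String password) {
        try {
            return desCipher(Cipher.ENCRYPT_MODE, password).doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps the raw stream; whatever reads from the returned stream sees decrypted bytes. */
    static InputStream decrypting(InputStream raw, String password) {
        try {
            return new CipherInputStream(raw, desCipher(Cipher.DECRYPT_MODE, password));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Convenience for the demonstration: decrypts a whole byte array through the stream. */
    static byte[] decrypt(byte[] encrypted, String password) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = decrypting(new ByteArrayInputStream(encrypted), password)) {
            byte[] buffer = new byte[512];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                result.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }
}
```

In a dumper-style program the stream returned by decrypting() would be handed to a decompression stream or dump filter next, matching the decrypt-then-decompress order described above.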
- Chapter 11: Object Serialization
- The last several chapters have shown you how to read and write Java's fundamental data types (byte, int, String, etc.). However, there's been one glaring omission. Java is a fully object-oriented language; and yet aside from the special case of strings, you haven't seen any general-purpose methods for reading or writing objects.

Object serialization, first used in the context of Remote Method Invocation (RMI) and later for JavaBeans, addresses this need. The java.io.ObjectOutputStream class provides a writeObject() method you can use to write a Java object onto a stream. The java.io.ObjectInputStream class has a readObject()
method you can use to read an object from a stream. In this chapter you'll learn how to use these two classes to read and write objects as well as how to customize the format used for serialization.

Additional content appearing in this section has been removed.
- Reading and Writing Objects
- Object serialization saves an object's state in a sequence of bytes so that the object can be reconstituted from those bytes at a later time. Serialization in Java was first developed for use in RMI. RMI allows an object in one virtual machine to invoke methods in an object in another virtual machine, possibly in a different computer on the other side of the planet, by sending arguments and return values across the Internet. This requires a way to convert those arguments and return values to and from byte streams. It's a trivial task for primitive data types, but you need to be able to convert objects as well. That's what object serialization provides.

Object serialization is also used in the JavaBeans component software architecture. Bean classes are loaded into visual builder tools like the BeanBox (shown in Figure 11.1) or Borland's JBuilder. The designer then customizes the beans by assigning fonts, sizes, text, and other properties to each bean and connects them together with events. For instance, a button bean generally has a label property that is encoded as a string of text ("Start" in the button in Figure 11.1). The designer can change this text.

Figure 11.1: The BeanBox showing a Juggler bean and an ExplicitButton bean

Once the designer has assembled and customized the beans, the form containing all the beans must be saved. It's not enough to save the bean classes themselves; the customizations that have been applied to the beans must also be saved. That's where serialization comes in: it stores the bean as an object and thus includes any customizations, which are nothing more than the values of the bean's fields. The customized beans are stored in a .ser file, which is often placed inside a JAR archive. This JAR archive can then be loaded into web browsers as an applet; then both the classes and the objects used by the applet are loaded into the virtual machine.

Thus, instead of having to write long

Additional content appearing in this section has been removed.
- Object Streams
- Objects are serialized by object output streams. They are deserialized by object input streams. These are instances of java.io.ObjectOutputStream and java.io.ObjectInputStream, respectively:

    public class ObjectOutputStream extends OutputStream
     implements ObjectOutput, ObjectStreamConstants
    public class ObjectInputStream extends InputStream
     implements ObjectInput, ObjectStreamConstants

The ObjectOutput interface is a subinterface of java.io.DataOutput that declares the basic methods used to write objects and data. The ObjectInput interface is a subinterface of java.io.DataInput that declares the basic methods used to read objects and data. java.io.ObjectStreamConstants is an unimportant interface that merely declares mnemonic constants for "magic numbers" used in the object serialization. (A major goal of the object stream classes is shielding client programmers from details of the format used to serialize objects such as magic numbers.)

Although these classes are not technically filter output streams, since they do not extend FilterOutputStream and FilterInputStream, they are chained to underlying streams in the constructors:

    public ObjectOutputStream(OutputStream out) throws IOException
    public ObjectInputStream(InputStream in) throws IOException

To write an object onto a stream, you chain an object output stream to the stream, then pass the object to the object output stream's writeObject() method:

    public final void writeObject(Object o) throws IOException

For example:

    try {
      Point p = new Point(34, 22);
      FileOutputStream fout = new FileOutputStream("point.ser");
      ObjectOutputStream oout = new ObjectOutputStream(fout);
      oout.writeObject(p);
      oout.close();
    }
    catch (Exception e) {System.err.println(e);}

Later, the object can be read back using the readObject() method of the ObjectInputStream class:

    public final Object readObject()
     throws OptionalDataException, ClassNotFoundException, IOException

Additional content appearing in this section has been removed.
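Putting writeObject() and readObject() together, a complete round trip might look like the sketch below. It is an illustration only: ObjectRoundTrip and its nested GridPoint class are hypothetical stand-ins (GridPoint plays the role of the Point in the example above), and byte array streams replace the file so the sketch leaves nothing on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ObjectRoundTrip {
    /** Hypothetical serializable point used only for this demonstration. */
    static final class GridPoint implements Serializable {
        final int x;
        final int y;
        GridPoint(int x, int y) { this.x = x; this.y = y; }
    }

    /** Chains an object output stream to a byte array and writes one object. */
    static byte[] write(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oout = new ObjectOutputStream(bytes)) {
            oout.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back the single object stored by write(). */
    static Object read(byte[] data) {
        try (ObjectInputStream oin = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return oin.readObject();   // the caller casts to the expected type
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because readObject() returns Object, the caller is responsible for casting the result, for example `(GridPoint) ObjectRoundTrip.read(data)`.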
- How Object Serialization Works
- Objects possess state. This state is stored in the values of the nonstatic, nontransient fields of an object's class. Consider this TwoDPoint class:

    public class TwoDPoint {
      public double x;
      public double y;
    }

Every object of this class has a state defined by the values of the double fields x and y. If you know the values of those fields, you know the value of the TwoDPoint. Nothing changes if you add some methods to the class or make the fields private, as in Example 11.1.

Example 11.1. The TwoDPoint Class

    public class TwoDPoint {

      private double x;
      private double y;

      public TwoDPoint(double x, double y) {
        this.x = x;
        this.y = y;
      }

      public double getX() { return x; }
      public double getY() { return y; }
      public void setX(double x) { this.x = x; }
      public void setY(double y) { this.y = y; }

      public String toString() {
        return "[TwoDPoint:x=" + this.x + ", y=" + y + "]";
      }
    }

The object information, the information stored in the fields, is still the same. If you know the values of x and y, you know everything there is to know about the state of the object. The methods only affect the actions an object can perform. They do not change what an object is. Now suppose you wanted to save the state of a particular point object by writing a sequence of bytes onto a stream. This process is called serialization, since the object is serialized into a sequence of bytes. You could add a writeState() method to your class that looked something like this:

    public void writeState(OutputStream out) throws IOException {
      DataOutputStream dout = new DataOutputStream(out);
      dout.writeDouble(x);
      dout.writeDouble(y);
    }

To restore the state of a TwoDPoint object, you could add a readState() method like this:

    public void readState(InputStream in) throws IOException {
      DataInputStream din = new DataInputStream(in);
      this.x = din.readDouble();
      this.y = din.readDouble();
    }

Additional content appearing in this section has been removed.
- Performance
- Serialization is often the easiest way to save the state of your program. You simply write out the objects you're using, then read them back in when you're ready to restore the document. There is a downside, however. First of all, serialization is slow. If you can define a custom file format for your application's documents, using that format will almost certainly be much faster than object serialization.

Second, serialization can slow or prevent garbage collection. Every time an object is written onto an object output stream, the stream holds on to a reference to the object. Then, if the same object is written onto the same stream again, it can be replaced with a reference to its first occurrence in the stream. However, this means that your program holds on to live references to the objects it has written until the stream is reset or closed, which means these objects won't be garbage-collected. The worst-case scenario is when you keep a stream open as long as your program runs and write every object you create onto the stream. This prevents any objects from being garbage-collected.

The easy solution is to avoid keeping a running stream of the objects you create. Instead, save the entire state only when the entire state is available, and then close the stream immediately.

If this isn't possible, you have the option to reset the stream by invoking its reset() method:

    public void reset() throws IOException

reset() flushes the ObjectOutputStream object's internal cache of the objects it has already written so they can be garbage-collected. However, this also means that an object may be written onto the stream more than once, so use this method with caution.

Additional content appearing in this section has been removed.
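The effect of the stream's internal cache, and of reset(), can be observed by writing the same object twice and changing it in between. The sketch below is a hypothetical demonstration (the class ResetSketch and its method are not from the book): without reset() the second write is stored as a reference to the first occurrence, so the change is lost; with reset() the object is written out again in full.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ResetSketch {
    /**
     * Writes one array twice, setting its element to 2 between the writes,
     * and returns the element found in the second object read back.
     */
    static int secondCopyValue(boolean resetBetweenWrites) {
        try {
            int[] data = {1};
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(data);
                data[0] = 2;
                if (resetBetweenWrites) {
                    out.reset();          // forget which objects were already written
                }
                out.writeObject(data);    // a back reference unless the stream was reset
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();                          // first copy
                return ((int[]) in.readObject())[0];      // second copy
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here secondCopyValue(false) yields 1 and secondCopyValue(true) yields 2, which shows both the memory behavior and the duplicate-write caveat mentioned above.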
- The Serializable Interface
- Unlimited serialization would introduce some security problems. For one thing, it allows unrestricted access to an object's private fields. By chaining an object output stream to a byte array output stream, a hacker can convert an object into a byte array. The byte array can be manipulated and modified without any access protection or security manager checks. Then the byte array can be reconstituted into a Java object by using it as the source of a byte array input stream.

Security isn't the only potential problem. Some objects exist only as long as the current program is running. A java.net.Socket object represents an active connection to a remote host. Suppose a socket is serialized to a file, and the program exits. Later the socket is deserialized from the file in a new program, but the connection it represents no longer exists. Similar problems arise with file descriptors, I/O streams, and many more classes.

For these and other reasons, Java does not allow instances of arbitrary classes to be serialized. You can only serialize instances of classes that implement the java.io.Serializable interface. By implementing this interface, a class indicates that it may be serialized without undue problems.

    public interface Serializable

This interface does not declare any methods or fields; it serves purely to indicate that a class may be serialized. You should recall, however, that subclasses of a class that implements a particular interface also implement that interface by inheritance. Thus, many classes that do not explicitly declare that they implement Serializable are in fact serializable. For instance, java.awt.Component implements Serializable. Therefore, its direct and indirect subclasses, including Button, Scrollbar, TextArea, List, Container, Panel, java.applet.Applet, all subclasses of Applet, and all Swing components may be serialized. java.lang.Throwable implements Serializable. Therefore, all exceptions and errors are serializable.

Additional content appearing in this section has been removed.
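One way to see the rule in action is to attempt a write and watch for java.io.NotSerializableException, which writeObject() throws for objects whose classes do not implement Serializable. The helper below is a hypothetical sketch (SerializableCheck is not part of any API); it discards the bytes and reports only whether the write succeeded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializableCheck {
    /** Returns true if the object (and everything it refers to) can be written. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // the object's class, or the class of something it references, is not Serializable
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new RuntimeException("inherits from Throwable")));
        System.out.println(canSerialize(new Object()));
    }
}
```

The exception passes the check because it inherits Serializable from Throwable, while a plain Object does not.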
- The ObjectInput and ObjectOutput Interfaces
- As well as the ObjectInputStream and ObjectOutputStream classes, the java.io package also provides ObjectInput and ObjectOutput interfaces:

    public interface ObjectInput extends DataInput
    public interface ObjectOutput extends DataOutput

These interfaces are not much used in Java 1.1 and 2. The only classes in the core API that actually implement them are ObjectInputStream and ObjectOutputStream. However, several methods used for customization of the serialization process are declared to accept ObjectInput or ObjectOutput objects as arguments, rather than specifically ObjectInputStream or ObjectOutputStream objects. This provides a little wiggle room for Java to grow in unforeseen ways.

The ObjectInput interface declares seven methods, all of which ObjectInputStream faithfully implements:

    public abstract Object readObject() throws ClassNotFoundException, IOException
    public abstract int read() throws IOException
    public abstract int read(byte[] data) throws IOException
    public abstract int read(byte[] data, int offset, int length) throws IOException
    public abstract long skip(long n) throws IOException
    public abstract int available() throws IOException
    public abstract void close() throws IOException

The readObject() method has already been discussed in the context of object input streams. The other six methods behave exactly as they do for all input streams. In fact, at first glance, all these methods except readObject() appear superfluous, since any InputStream subclass will possess read(), skip(), available(), and close() methods with these signatures. However, this interface may be implemented by classes that aren't subclasses of InputStream.

The ObjectOutput interface declares the following six methods, all of which ObjectOutputStream faithfully implements. Except for writeObject(), which has already been discussed in the context of object output streams, these methods should behave exactly as they do for all output streams:

Additional content appearing in this section has been removed.
- Versioning
- When an object is written onto a stream, only the state of the object and the name of the object's class are stored; the byte codes for the object's class are not stored with the object. There's no guarantee that a serialized object will be deserialized into the same environment from which it was serialized. It's possible for the class definition to change between the time the object is written and the time it's read. For instance, a Component object may be written in Java 1.1 but read in Java 2. However, in Java 2 the Component class has three nonstatic, nontransient fields the 1.1 version of Component does not:

    boolean inputMethodsEnabled;
    DropTarget dropTarget;
    private PropertyChangeSupport changeSupport;

There are even more differences when methods, constructors, and static and transient fields are considered. Not all changes, however, prevent deserialization. For instance, the values of static fields aren't saved when an object is serialized. Therefore, you don't have to worry about adding or deleting a static field to or from a class. Similarly, serialization completely ignores the methods in a class, so changing method bodies or adding or removing methods does not affect serialization. However, removing an instance field does affect serialization, because deserializing an object saved by the earlier version of the class will result in an attempt to set the value of a field that no longer exists.

Changes to a class are divided into two groups: compatible changes and incompatible changes. Compatible changes are those that do not affect the serialization format of the object, like adding a method or deleting a static field. Incompatible changes are those that do prevent a previously serialized object from being restored. Examples include deleting an instance field or changing the type of a field. As a general rule, any change that affects the signatures of the nontransient instance fields of a class is incompatible, while any change that does not affect the signatures of the nontransient instance fields of a class is compatible. However, there are a couple of exceptions. The following is a complete list of compatible changes:

Additional content appearing in this section has been removed.
- Customizing the Serialization Format
- The default serialization procedure does not always produce the results you want. Most often, a nonserializable field like a Socket or a FileOutputStream needs to be excluded from serialization. Sometimes, a class may contain data in nonserializable fields like a Socket that you nonetheless want to save, for example, the host that the socket's connected to. Or perhaps a singleton object wants to verify that no other instance of itself exists in the virtual machine before it's reconstructed. Or perhaps an incompatible change to a class (such as changing a Font field to three separate fields storing the font's name, style, and size) can be made compatible with a little programmer-supplied logic. Or perhaps you want an exceptionally large array of image data to be compressed before being written to disk. For these or many other reasons, you're allowed to customize the serialization process.

The simplest way to customize serialization is to declare certain fields transient. The values of transient fields will not be written onto the underlying output stream when an object in the class is serialized. However, this only goes as far as excluding certain information from serialization; it doesn't help you change the format that's used to store the data or take action on deserialization or ensure that no more than one instance of a singleton class is created.

For more control over the details of your class's serialization, you can provide custom readObject() and writeObject() methods. These are private methods that the virtual machine uses to read and write the data for your class. This gives you complete control over how objects in your class are written onto the underlying stream but does not require you to handle data stored in your objects' superclasses.

If you need even more control over the superclasses and everything else, you can implement the java.io.Externalizable interface, a subinterface of java.io.Serializable. When serializing an externalizable object, the virtual machine does almost nothing except identify the class. The class itself is completely responsible for reading and writing its state and its superclass's state in whatever format it chooses.

Additional content appearing in this section has been removed.
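The first two customization techniques, a transient field plus private writeObject() and readObject() methods, can be combined as in the sketch below. Everything named here is hypothetical (CustomFormatSketch, Session, and the extra integer written after the default fields); the sketch relies on defaultWriteObject() and defaultReadObject(), which handle the nontransient fields in the usual way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class CustomFormatSketch {
    /** Hypothetical class: saves a host name but never its in-memory log buffer. */
    static final class Session implements Serializable {
        private final String host;
        private transient StringBuilder log = new StringBuilder(); // excluded from serialization

        Session(String host) { this.host = host; }
        String host() { return host; }
        StringBuilder log() { return log; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();      // writes host, the only nontransient field
            out.writeInt(log.length());    // extra data in a format this class chooses
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();        // restores host
            int oldLength = in.readInt();  // must mirror what writeObject() added
            log = new StringBuilder("restored; previous log had " + oldLength + " chars");
        }
    }

    /** Serializes and immediately deserializes a session. */
    static Session copy(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without the custom readObject(), the transient log field would simply come back as null, because field initializers do not run during deserialization.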
- Resolving Classes
- The readObject() method of java.io.ObjectInputStream only creates new objects from known classes. It doesn't load classes. If a class for an object can't be found, readObject() throws a ClassNotFoundException. It specifically does not attempt to read the class data from the object stream. This is limiting for some things you might want to do, particularly RMI. Therefore, trusted subclasses of ObjectInputStream may be allowed to load classes from the stream or some other source like a URL. Specifically, a class is trusted if, and only if, it was loaded from the local class path; that is, the ClassLoader object returned by getClassLoader() is null.

Two protected methods are involved. The first is the annotateClass() method of ObjectOutputStream:

    protected void annotateClass(Class c) throws IOException

In ObjectOutputStream this is a do-nothing method. A subclass of ObjectOutputStream can provide a different implementation that provides data for the class. For instance, this might be the byte code of the class itself or a URL where the class can be found.

Standard object input streams cannot read and resolve the class data written by annotateClass(). For each subclass of ObjectOutputStream that overrides annotateClass(), there will normally be a corresponding subclass of ObjectInputStream that implements the resolveClass() method:

    protected Class resolveClass(ObjectStreamClass v)
     throws IOException, ClassNotFoundException

In java.io.ObjectInputStream, this is a do-nothing method. A subclass of ObjectInputStream can provide an implementation that loads a class based on the data read from the stream. For instance, if annotateClass() wrote byte code to the stream, then the resolveClass() method would need to have a class loader that read the data from the stream. If annotateClass() wrote the URL of the class to the stream, then the resolveClass() method would need a class loader that read the URL from the stream and downloaded the class from that URL.

Additional content appearing in this section has been removed.
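A matched pair of subclasses along these lines might look like the following sketch. It is deliberately simplified and makes several assumptions: the class names are hypothetical, the annotation is just a URL string written with writeUTF(), and instead of downloading anything the input side records the annotation and then defers to super.resolveClass(), which in current JDKs looks the class up locally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class AnnotatedStreams {
    /** Writes a codebase URL after every class descriptor. */
    static final class AnnotatingOutput extends ObjectOutputStream {
        private final String codebase;
        AnnotatingOutput(OutputStream out, String codebase) throws IOException {
            super(out);
            this.codebase = codebase;
        }
        @Override protected void annotateClass(Class<?> c) throws IOException {
            writeUTF(codebase);
        }
    }

    /** Reads the matching annotation before resolving each class. */
    static final class ResolvingInput extends ObjectInputStream {
        final List<String> seen = new ArrayList<>();
        ResolvingInput(InputStream in) throws IOException { super(in); }
        @Override protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            seen.add(readUTF());               // must consume exactly what annotateClass() wrote
            return super.resolveClass(desc);   // a real loader could use the URL here instead
        }
    }

    /** Round-trips a value and returns the annotations the reader encountered. */
    static List<String> annotationsSeen(Serializable value, String codebase) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (AnnotatingOutput out = new AnnotatingOutput(bytes, codebase)) {
                out.writeObject(value);
            }
            try (ResolvingInput in = new ResolvingInput(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return in.seen;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The important design constraint is symmetry: whatever annotateClass() writes, resolveClass() has to read back in the same order, or the rest of the stream will be misinterpreted.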
- Resolving Objects
- There may be occasions where you want to replace the objects read from the stream with other, alternative objects. Perhaps an old version of a program whose data you need to read used Franc objects, but the new version of the program uses Euro objects. The ObjectInputStream can replace each Franc object read with the equivalent Euro object.

Only trusted subclasses of ObjectInputStream may replace objects. A class is only trusted if it was loaded from the local class path; that is, the class loader returned by getClassLoader() is null. To make it possible for a trusted subclass to replace objects, you must first pass true to its enableResolveObject() method:

    protected final boolean enableResolveObject(boolean enable)
     throws SecurityException

Generally, you would do this in the constructor of any class that needed to replace objects. Once object replacement is enabled, whenever an object is read, it is passed to the ObjectInputStream subclass's resolveObject() method before readObject() returns:

    protected Object resolveObject(Object o) throws IOException

The resolveObject() method may return the object itself (the default behavior) or return a different object. Resolving objects is a tricky business. The substituted object must be compatible with the use of the original object, or errors will soon surface as the program tries to invoke methods or access fields that don't exist. Most of the time, the replacing object is an instance of a subclass of the class of the replaced object. Another possibility is that the replacing object and the object it replaces are both instances of different subclasses of a common superclass or interface, where the original object was only used as an instance of that superclass or interface.

Additional content appearing in this section has been removed.
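The Franc-to-Euro replacement could be sketched as follows. The Franc and Euro classes here are hypothetical stand-ins invented for the illustration, as are ResolveObjectSketch and EuroInput; the conversion uses the fixed rate of 6.55957 francs per euro. Note that enableResolveObject(true) is called in the constructor, as suggested above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ResolveObjectSketch {
    static final double FRANCS_PER_EURO = 6.55957;

    /** Hypothetical legacy class found in old data. */
    static final class Franc implements Serializable {
        final double amount;
        Franc(double amount) { this.amount = amount; }
    }

    /** Hypothetical replacement class used by the new program. */
    static final class Euro implements Serializable {
        final double amount;
        Euro(double amount) { this.amount = amount; }
    }

    /** Object input stream that swaps every Franc it reads for an equivalent Euro. */
    static final class EuroInput extends ObjectInputStream {
        EuroInput(InputStream in) throws IOException {
            super(in);
            enableResolveObject(true);   // without this, resolveObject() is never called
        }
        @Override protected Object resolveObject(Object o) throws IOException {
            if (o instanceof Franc) {
                return new Euro(((Franc) o).amount / FRANCS_PER_EURO);
            }
            return o;                    // leave everything else untouched
        }
    }

    /** Writes a Franc, reads it back through EuroInput, and returns the euro amount. */
    static double francsToEurosViaStream(double francs) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Franc(francs));
            }
            try (EuroInput in = new EuroInput(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((Euro) in.readObject()).amount;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The cast to Euro in the helper only works because the caller knows about the substitution; code that still expects a Franc would fail with a ClassCastException, which is the compatibility hazard described above.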
- Validation
- It is not always enough to merely restore the state of a serialized object. You may need to verify that the value of a field still makes sense, you may need to notify another object that this object has come into existence, or you simply may need to have the entire graph of the object available before you can finish initializing it.

For example, valid XML documents are essentially trees of elements combined with a document type definition (DTD). The DTD defines a grammar the document must follow. The Document Object Model (DOM) defines a means of representing XML (and HTML) documents as instances of Java classes and interfaces, including XMLNode, EntityReference, EntityDeclaration, DocumentType, ElementDefinition, AttributeDefinition, and others.

An XML document could be saved as a set of these serialized objects. In that case, when you deserialized the document, you would want to check that the deserialized document is still valid; that is, that the document adheres to the grammar given in the DTD. You can't do this until the entire document (all its elements and its entire DTD) has been read. There are also a number of smaller checks you might want to perform. For instance, well-formedness (well-formedness is a slightly less stringent requirement than validity) requires that all entity references like &date; be defined in the DTD. To check this, it's not enough to have deserialized the EntityReference object. You must also have deserialized the corresponding DocumentType object that contains the necessary EntityDeclaration objects.

You can use the ObjectInputStream class's registerValidation() method to specify an ObjectInputValidation object that will be notified of the object after its entire graph has been reconstructed but before readObject() has returned it. This gives the validator an opportunity to make sure that the object doesn't violate any implicit assertions about the state of the system.

    public synchronized void registerValidation(ObjectInputValidation oiv, int priority)
     throws NotActiveException, InvalidObjectException

Additional content appearing in this section has been removed.
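A much smaller case than an XML document shows the mechanics. In the sketch below, a hypothetical Range class registers itself as its own validator from inside its private readObject() method and checks a simple invariant once the graph is complete. The class names and the invariant are invented for the illustration; the constructor deliberately skips the check so that an invalid object can be written for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ValidationSketch {
    /** Hypothetical class whose invariant is low <= high. */
    static final class Range implements Serializable, ObjectInputValidation {
        private final int low;
        private final int high;
        Range(int low, int high) { this.low = low; this.high = high; } // unchecked on purpose

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.registerValidation(this, 0);  // only legal while an object is being read
            in.defaultReadObject();
        }

        @Override public void validateObject() throws InvalidObjectException {
            // called after the whole graph is rebuilt, before readObject() returns to the caller
            if (low > high) {
                throw new InvalidObjectException("low " + low + " exceeds high " + high);
            }
        }
    }

    /** Returns false if validation rejects the deserialized range. */
    static boolean survivesRoundTrip(int low, int high) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Range(low, high));
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

When several validators are registered, those with higher priority values are called first, so an object that others depend on can be checked before they are.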
Purchase this book now or read it online at Safari to get the whole thing! - Sealed Objects
The JCE standard extension to Java 2, discussed in the last chapter, provides a SealedObject class that lets you encrypt objects written onto an object output stream using any available cipher. Most of the time, I suspect, you'll either encrypt the entire object output stream by chaining it to a cipher output stream, or you won't encrypt anything at all. However, if there's some reason to encrypt only some of the objects you're writing to the stream, you can make them sealed objects.
The javax.crypto.SealedObject class wraps a serializable object in an encrypted digital lockbox. The sealed object is serializable so it can be written onto object output streams and read from object input streams as normal. However, the object inside the sealed object can only be deserialized by someone who knows the key.
    public class SealedObject extends Object implements Serializable
The big advantage to using sealed objects rather than encrypting the entire output stream is that the sealed objects contain all necessary parameters for decryption (algorithm used, initialization vector, salt, iteration count). All the receiver of the sealed object needs to know is the key. Thus, there doesn't necessarily have to be any prior agreement about these other aspects of encryption.
You seal an object with the SealedObject() constructor. The constructor takes as arguments the object to be sealed, which must be serializable, and the properly initialized Cipher object with which to encrypt the object:
    public SealedObject(Serializable object, Cipher c) throws IOException, IllegalBlockSizeException
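To make the constructor concrete, here is a minimal sketch of sealing and unsealing. The SealDemo class and its method names are hypothetical, and the choice of AES with the provider's default mode and padding is an assumption made only to keep the example short; it is not a recommendation for real use. SealedObject.getObject(Key) is the standard way to recover the wrapped object when you hold the key.

```java
import java.io.Serializable;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

public class SealDemo {
    // Generates a fresh 128-bit AES key (algorithm chosen for illustration only).
    public static SecretKey newKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // Seals the payload; it is serialized and encrypted inside the constructor.
    public static SealedObject seal(Serializable payload, SecretKey key)
            throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return new SealedObject(payload, cipher);
    }

    // Recovers the original object; only the key has to be supplied.
    public static Object unseal(SealedObject sealed, SecretKey key)
            throws Exception {
        return sealed.getObject(key);
    }
}
```

The receiver passes nothing but the key to getObject(), which matches the point made above: the algorithm and its parameters travel with the sealed object.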
Inside the constructor, the object is immediately serialized by an object output stream chained to a byte array output stream. The resulting byte array is then encrypted using the Cipher object c and stored in a private field. The cipher's algorithms and parameters are also stored. Thus, the state of the original object written onto the ultimate object output stream is the state of the object when it was sealed; subsequent changes it may undergo between being sealed and being written are not reflected in the sealed object. Since serialization takes place immediately inside the constructor, the constructor throws a
- Chapter 12: Working with Files
You've already learned how to read and write data in files using file input streams and file output streams. That's not all there is to files. Files can be created, moved, renamed, copied, deleted, and otherwise manipulated without respect to their contents. Files are also often associated with meta-information that's not strictly part of the contents of the file, such as the time the file was created, the icon for the file, the permissions that determine which users can read or write to the file, and even the name of the file.
While the abstraction of the contents of a file as an ordered sequence of bytes used by file input and output streams is almost standard across platforms, the meta-information is not. The java.io.File class attempts to provide a platform-independent abstraction for common file operations and meta-information. Unfortunately, this class really shows its Unix roots. It works well on Unix, adequately on Windows and OS/2 (with a few caveats), and fails miserably on the Macintosh. Java 2 improves things, but there's still a lot of history, and coming up with something that genuinely works on all platforms is an extremely difficult problem.
File manipulation is thus one of the real difficulties of cross-platform Java programming. Before you can hope to write truly cross-platform code, you need a solid understanding of the filesystem basics on all the target platforms. This chapter tries to cover those basics for the major platforms that support Java (Unix; DOS/Windows 3.x; Windows 95, 98, and NT; OS/2; and the Mac), then it shows you how to write your file code so that it's as portable as possible.
- Understanding Files
As far as a Java program knows, a file is a sequential set of bytes stored on a disk like a hard drive or a CD-ROM. There is a first byte in the file, a second byte, and so on, until the end of the file. In this way a file is similar to a stream. However, a program can jump around in a file, reading first one part of a file, then another. This isn't possible with a stream.
Macintosh files are a little different. Mac files are divided into two forks, each of which is equivalent to a separate file on other platforms. The first part of a Mac file is called the data fork and contains the text, image data, or other basic information of the file. The second part of the file is called the resource fork and typically contains localizable strings, pictures, icons, graphical user interface components like menubars and dialogs, executable code, and more. On a Macintosh, all the standard java.io classes work exclusively with the data fork.
Every file has a name. The format of the filename is determined by the operating system. For example, in DOS and Windows 3.1, filenames are case-insensitive (though generally rendered as all capitals) and consist of up to eight ASCII characters with a three-letter extension. README.TXT is a valid DOS filename, but "Read me before you run this program or your hard drive will get trashed" is not. All ASCII characters from 32 up (that is, noncontrol characters), except for the 15 punctuation characters (+=/][":;,?*\<>|) and the space character, may be used in filenames. A period may be used only as a separator between the eight-character name and the three-letter extension. Furthermore, the complete path to the file, including the disk drive and all directories, may not exceed 80 characters in length.
On the other hand, "Read me before you run this program or your hard drive will get trashed" is a valid Win32 (Windows 95, 98, and NT) filename. On those systems filenames may contain up to 255 characters, though room also has to be left for the path to the file. The full pathname may not exceed 255 characters. Furthermore, Win32 filenames are stored in Unicode, though in most circumstances only the ISO Latin-1 character set is actually used to name files. Win32 systems allow any Unicode character with value 32 or above to be used, except
- Directories and Paths
Modern operating systems organize files into hierarchical directories. Each directory contains zero or more files or other directories. Like files, directories have names and attributes, though, depending on the operating system, those names and attributes may be different from the attributes allowed for files. For example, on the Macintosh, a file or directory name can be up to 31 bytes long, but a volume name can be no more than 27 bytes long.
To specify a file completely, you don't just give its name. You also give the directory the file lives in. Of course, that directory may itself be inside another directory, which may be in another directory, until you reach the root of the filesystem. The complete list of directories from the root to a specified file plus the name of the file itself is called the absolute path to the file. The exact syntax of absolute paths varies from system to system. Here are a few examples:
    DOS      C:\PUBLIC\HTML\JAVAFAQ\INDEX.HTM
    Win32    C:\public\html\javafaq\index.html
    MacOS    Macintosh HD:public:html:javafaq:index.html
    Unix     /public/html/javafaq/index.html
- The File Class
Instances of the java.io.File class represent filenames on the local system, not actual files. Occasionally, this distinction is crucial. For instance, File objects can represent directories as well as files. Also, you cannot assume that a file exists just because you have a File object for a file.
    public class File extends Object implements Serializable
In Java 2, the File class also implements the java.lang.Comparable interface:
    public class File extends Object implements Serializable, Comparable // Java 2
Although there are no guarantees that a file named by a File object actually exists, the File class does contain many methods for getting information about the attributes of a file and for manipulating those files. The File class attempts to account for system-dependent features like the file separator character and file attributes, though in practice it doesn't do a very good job, especially in Java 1.0 and 1.1.
Each File object contains a single String field called path that contains either a relative or absolute path to the file, including the name of the file or directory itself:
    private String path
Many methods in this class work solely by looking at this string. They do not necessarily look at any part of the filesystem.
The java.io.File class has three constructors. Each accepts some variation of a filename as an argument. This one is the simplest:
    public File(String path)
The path argument should be either an absolute or relative path to the file in a format understood by the host operating system. For example, using Unix filename conventions:
    File uf1 = new File("25.html");
    File uf2 = new File("course/week2/25.html");
    File uf3 = new File("/public/html/course/week2/25.html");
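The distinction between a File object and an actual file can be seen in a few lines. This is a minimal sketch; the PathOnly class and its missingFile() helper are hypothetical, and the example assumes that a name built from System.nanoTime() is not already present in the temporary directory.

```java
import java.io.File;

public class PathOnly {
    // Builds a File for a name that (by assumption) does not exist yet.
    // Constructing the File touches only the path string, not the disk.
    public static File missingFile() {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        return new File(tmp, "no-such-file-" + System.nanoTime() + ".html");
    }

    public static void main(String[] args) throws Exception {
        File f = missingFile();
        System.out.println(f.getName() + " exists? " + f.exists());
        if (f.createNewFile()) {          // now the file really is created
            System.out.println("after createNewFile(): " + f.exists());
            f.delete();                   // clean up
        }
    }
}
```

Methods such as getName() answer purely from the stored path, whereas exists() and createNewFile() have to consult the filesystem.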
- Filename Filters
You often want to look for a particular kind of file, for example, text files. To do this, you need a FilenameFilter object that specifies which files you'll accept. FilenameFilter is an interface in the java.io package:
    public interface FilenameFilter
This interface declares a single method, accept():
    public abstract boolean accept(File directory, String name);
The directory argument is a File object pointing to a directory, and the name argument is the name of a file. The method should return true if a file with this name in this directory passes through the filter and false if it doesn't. Because FilenameFilter is an interface, it must be implemented in a class. Example 12.6 is a class that filters out everything that is not an HTML file.
Example 12.6. HTMLFilter
    import java.io.*;
    public class HTMLFilter implements FilenameFilter {
      public boolean accept(File directory, String name) {
        if (name.endsWith(".html")) return true;
        if (name.endsWith(".htm")) return true;
        return false;
      }
    }
Files can be filtered using any criteria you like. An accept() method may test modification date, permissions, file size, and any attribute Java supports. (You can't filter by attributes Java does not support, like Macintosh file and creator codes, at least not without native methods or some sort of access to the native API.) This accept() method tests whether the file ends with .html and is in a directory where the program can read files:
    public boolean accept(File directory, String name) {
      if (name.endsWith(".html") && directory.canRead()) {
        return true;
      }
      return false;
    }
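For a sense of how such a filter is applied outside a file dialog, the sketch below passes one to File.list(FilenameFilter), which returns the matching names in a directory. The ListHtml class and htmlNames() method are hypothetical names for this example; the filter logic mirrors Example 12.6.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;

public class ListHtml {
    // Returns the sorted names of the HTML files directly inside the directory.
    public static String[] htmlNames(File directory) {
        FilenameFilter filter = new FilenameFilter() {
            public boolean accept(File dir, String name) {
                return name.endsWith(".html") || name.endsWith(".htm");
            }
        };
        String[] names = directory.list(filter);
        if (names == null) {
            return new String[0]; // not a directory, or an I/O error occurred
        }
        Arrays.sort(names);       // list() makes no promise about order
        return names;
    }

    public static void main(String[] args) {
        File cwd = new File(System.getProperty("user.dir"));
        for (String name : htmlNames(cwd)) {
            System.out.println(name);
        }
    }
}
```

The null check matters in practice: list() returns null rather than an empty array when the File does not denote a readable directory.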
Filename filters are primarily intended for the use of file dialogs, which will be discussed in the next chapter. However, in Java 2 the File class has a listFiles() method that takes a FilenameFilter
- File Filters
Java 2 adds a new java.io.FileFilter interface that's very similar to FilenameFilter:
    public abstract interface FileFilter // Java 2
The accept() method of FileFilter takes a single File object as an argument, rather than a File giving the directory and a String giving the name:
    public boolean accept(File pathname) // Java 2
Example 12.7 is a filter that only passes HTML files. Its logic is essentially the same as the filter of Example 12.6.
Example 12.7. HTMLFileFilter
    import java.io.*;
    public class HTMLFileFilter implements FileFilter {
      public boolean accept(File pathname) {
        if (pathname.getName().endsWith(".html")) return true;
        if (pathname.getName().endsWith(".htm")) return true;
        return false;
      }
    }
This class appears as an argument in one of the listFiles() methods of java.io.File:
    public File[] listFiles(FileFilter filter) // Java 2
Example 12.8 uses the HTMLFileFilter to list the HTML files in the current working directory.
Example 12.8. List HTML Files
    import java.io.*;
    public class HTMLFiles {
      public static void main(String[] args) {
        File cwd = new File(System.getProperty("user.dir"));
        File[] htmlFiles = cwd.listFiles(new HTMLFileFilter());
        for (int i = 0; i < htmlFiles.length; i++) {
          System.out.println(htmlFiles[i]);
        }
      }
    }
There's a nasty name conflict between the java.io.FileFilter interface and the abstract javax.swing.filechooser.FileFilter class discussed in the next chapter. I would not be surprised if this interface were replaced by a new abstract FileFilter class more like javax.swing.filechooser.FileFilter.
- File Descriptors
As I've said several times so far, the existence of a java.io.File object doesn't imply the existence of the file it represents. A java.io.FileDescriptor object does, however, refer to an actual file:
    public final class FileDescriptor extends Object
A FileDescriptor object is an abstraction of an underlying machine-specific structure that represents an open file. While file descriptors are very important for the underlying OS and filesystem, their only real use in Java is to guarantee that data that's been written to a stream is in fact committed to disk; that is, to synchronize between the program and the hardware.
In addition to open files, file descriptors can also represent open sockets, though this use won't be emphasized in this book. There are also three file descriptors for the console: System.in, System.out, and System.err. These are available as the three mnemonic constants FileDescriptor.in, FileDescriptor.out, and FileDescriptor.err:
    public static final FileDescriptor in
    public static final FileDescriptor out
    public static final FileDescriptor err
Because file descriptors are very closely tied to the native operating system, you never construct your own file descriptors. Various methods in other classes that refer to open files or sockets may return them. Both the FileInputStream and FileOutputStream classes and the RandomAccessFile class have a getFD() method that returns the file descriptor associated with the open stream or file:
    public final FileDescriptor getFD() throws IOException
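The "committed to disk" use mentioned above can be sketched with FileDescriptor's sync() method, which asks the operating system to force buffered data out to the underlying device. The SyncedWrite class and writeAndSync() method are hypothetical names for this example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class SyncedWrite {
    // Writes the data, then asks the OS to force it to the storage device.
    public static void writeAndSync(File target, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);
            out.flush();          // push any Java-side buffering to the OS
            out.getFD().sync();   // may throw SyncFailedException (an IOException)
        }
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("synced", ".bin");
        writeAndSync(target, new byte[] {1, 2, 3});
        System.out.println(target + " holds " + target.length() + " bytes");
        target.delete();
    }
}
```

The descriptor is only valid while the stream is open, which is why sync() is called before the try block closes the stream.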
The java.net.SocketImpl class stores the file descriptor for a socket in a protected field called fd:
    protected FileDescriptor fd
This field is returned by SocketImpl's protected getFileDescriptor() method:
    protected FileDescriptor getFileDescriptor()
Since file descriptors are only associated with open
- Random-Access Files
File input and output streams require you to start reading or writing at the beginning of a file and then read or write the file in order, possibly skipping over some bytes or backing up but more or less moving from start to finish. Sometimes, however, you need to read parts of a file in a more or less random order, where the data near the beginning of the file isn't necessarily read before the data nearer the end. Other times you need to both read and write the same file. For example, in record-oriented applications like databases, the actual data may be indexed; you would use the index to determine where in the file to find the record you need to read or write. While you could do this by constantly opening and closing the file and skipping to the point where you needed to read, this is far from efficient. Writes are even worse, since you would need to read and rewrite the entire file, even to change just one byte of data.
Random-access files can be read from, written to, or both, starting at a particular byte position in the file. A single random-access file can be both read and written without first being closed. The position in the file where reads and writes start is indicated by an integer called the file pointer. Each read or write advances the file pointer by the number of bytes read or written. Furthermore, the programmer can reposition the file pointer at different bytes in the file without closing the file.
In Java, random file access is performed through the java.io.RandomAccessFile class. This is not a subclass of java.io.File:
    public class RandomAccessFile extends Object implements DataInput, DataOutput
Among other differences between File objects and RandomAccessFile objects, the RandomAccessFile constructors actually open the file in question and throw an IOException if it doesn't exist:
    public RandomAccessFile(String filename, String mode) throws FileNotFoundException
    public RandomAccessFile(File file, String mode) throws IOException
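The record-oriented case described above can be sketched with fixed-size records. In this minimal example the IntRecords class, its method names, and the four-byte record layout are all assumptions made for illustration; seek(), readInt(), writeInt(), and setLength() are standard RandomAccessFile methods.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class IntRecords {
    private static final int RECORD_SIZE = 4; // one int per record

    // Replaces the file's contents with the given records.
    public static void writeAll(File file, int[] values) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            for (int value : values) {
                raf.writeInt(value); // each write advances the file pointer by 4
            }
        }
    }

    // Reads one record by moving the file pointer straight to it.
    public static int readRecord(File file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * RECORD_SIZE);
            return raf.readInt();
        }
    }

    // Reads and writes through a single open file, without closing in between.
    public static void swap(File file, int i, int j) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek((long) i * RECORD_SIZE);
            int a = raf.readInt();
            raf.seek((long) j * RECORD_SIZE);
            int b = raf.readInt();
            raf.seek((long) j * RECORD_SIZE);
            raf.writeInt(a);
            raf.seek((long) i * RECORD_SIZE);
            raf.writeInt(b);
        }
    }
}
```

Only the eight bytes belonging to the two swapped records are rewritten; the rest of the file is never touched, which is the efficiency argument made above.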
- General Techniques for Cross-Platform File Access Code
File manipulation vies with AWT for being the part of Java where it's hardest to write truly cross-platform, robust code. Until Java 2, Sun really didn't pay a lot of attention to differences between filesystems on different platforms. The situation is getting better, however. The java.io.File class does work much more reliably across Windows and Unix in Java 2 and has hooks to allow it to work more naturally on other platforms as well. Of course, Java 1.1 is still the primary delivery platform for most Java applications that work with files. To help you achieve greater serenity and overall cross-platform nirvana, I've summarized some basic rules from this chapter to help you write file manipulation code that's robust across a multitude of platforms:
  - Never, never, never hardcode pathnames in your application.
  - Ask the user to name your files. If you must provide a name for a file, try to make it fit in an 8.3 DOS filename with only pure ASCII characters.
  - Do not assume the file separator is "/" (or anything else). Use File.separatorChar instead.
  - Do not parse pathnames to find directories. Use the methods of the java.io.File class instead.
  - Do not use renameTo() for anything except renaming a file. In particular, do not use it to move a file.
  - Try to avoid moving and copying files from within Java programs if at all possible.
  - Do not use . to refer to the current directory. Use System.getProperty("user.dir") instead.
  - Do not use .. to refer to the parent directory. Use getParent() instead.
  - Do not assume the current working directory is the one where your
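Several of the rules above can be sketched together: start from the user.dir property rather than ".", let File assemble the path instead of concatenating separators, and use File's own methods rather than parsing the path string. The PortablePaths class, the inCurrentDirectory() helper, and the sample names are hypothetical and exist only for this illustration.

```java
import java.io.File;

public class PortablePaths {
    // Builds a path under the current working directory one component at a
    // time, so no separator character is ever written by hand.
    public static File inCurrentDirectory(String... parts) {
        File result = new File(System.getProperty("user.dir"));
        for (String part : parts) {
            result = new File(result, part);
        }
        return result;
    }

    public static void main(String[] args) {
        File page = inCurrentDirectory("course", "week2", "25.html");
        System.out.println("file:      " + page.getName());
        System.out.println("directory: " + page.getParent()); // no string parsing
        System.out.println("separator: " + File.separatorChar);
    }
}
```

On Unix the printed directory uses "/" and on Windows "\", yet the code is identical on both.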
- Chapter 13: File Dialogs and Choosers
Filenames are problematic, even if you don't have to worry about cross-platform idiosyncrasies. Users forget filenames, mistype them, can't remember the exact path to files they need, and more. The proper way to ask a user to select a file is to show them a list of the files in the current directory and get them to select from that list. You also need to allow them to navigate between directories, insert and remove floppy disks, mount network servers, and more.
Most graphical user interfaces (and not a few nongraphical ones) provide standard widgets for selecting a file. In Java the platform's native file selector widget is exposed through the java.awt.FileDialog class. Like many native peer-based classes, however, FileDialog doesn't behave exactly the same on all platforms. Therefore, Swing (part of the Java Foundation Classes) provides a pure Java implementation of a file dialog, the javax.swing.JFileChooser class. JFileChooser (and Swing in general) has much more reliable cross-platform behavior.
- File Dialogs
I'm going to jump out of the java.io package for a minute to pick up one file-related class from the AWT, java.awt.FileDialog. File dialogs are the standard open and save dialogs provided by the host GUI. Users use them to pick a directory and a name under which to save a file or to choose a file to open. The appearance varies from platform to platform, but the intent is the same. Figure 13.1 shows a standard Save dialog on the Mac; Figure 13.2 shows a standard Open dialog on Solaris.
Figure 13.1: The Mac's standard Save dialog
Figure 13.2: Motif standard Open dialog
FileDialog is a subclass of java.awt.Dialog that represents the native save and open dialog boxes:
    public class FileDialog extends Dialog
A file dialog is almost completely implemented by a native peer. Your program doesn't add components to a file dialog or handle user interaction with event listeners. It just displays the dialog and retrieves the name and directory of the file the user chose after the dialog is dismissed.
Since applets normally can't read or write files, file dialogs are primarily useful only in applications. Nonetheless, there is no specific security manager check to see whether file dialogs are allowed. Sun's applet viewer, HotJava, and some recent versions of Netscape Navigator do allow untrusted applets to display file dialogs, retrieve the name and path of the file selected, and send that information back to the originating host over the network. Although this is a very minor security hole, since it only exposes the name and path of a single file selected by the user, it's still on the worrisome side for the paranoid. Internet Explorer 4.0 and Navigator 4.0.3 and earlier do not allow applets to display file dialogs. Certainly, you can't count on being allowed to use a file dialog in an applet, nor can you be guaranteed that it isn't allowed either.
- JFileChooser
Swing, part of the Java Foundation Classes, provides a much more sophisticated and useful file chooser component written in pure Java, javax.swing.JFileChooser:
    public class JFileChooser extends JComponent implements Accessible
JFileChooser is not an independent, free-standing window like FileDialog. Instead, it is a component you can add to your own frame, dialog, or other container or window. You can, however, ask the JFileChooser class to create a modal dialog just for your file chooser. Figure 13.3 shows a file chooser embedded in a JFrame window with the Metal look and feel. Of course, like all Swing components, the exact appearance depends on the look and feel currently selected.
Figure 13.3: A JFileChooser with the Metal look and feel
For the most part, the file chooser works as you expect, especially if you're accustomed to Windows. You select a file with the mouse. Double-clicking the filename or pressing the Open button returns the currently selected file. You can change which files are displayed by selecting different filters from the pop-up list of choosable file filters. All the components have tooltips to help users who are a little thrown by an unfamiliar look and feel. One difference between a Swing file chooser and a standard, native chooser may surprise you. While double-clicking on a directory will open the directory as you expect, selecting a directory and then pressing the Open button returns the selected directory as a File object.
The JFileChooser class relies on support from several classes in the javax.swing.filechooser package, including:
    public abstract class FileFilter
    public abstract class FileSystemView
    public abstract class FileView
Unfortunately, these classes still have a few rough edges as of Java 2. They still don't support the Macintosh (though an early access release is available), and they have to jump through some hoops to account for the different levels of support for I/O in Java 1.1 and Java 2.
- File Viewer, Part 6
We've now got the tools needed to put a graphical user interface onto the FileViewer application we've been developing. The back end doesn't need to change at all. It's still based on the same filter streams we've used for the last several chapters. However, instead of reading filenames from the command line, we can get them from a file chooser. Instead of dumping the files on System.out, we can display them in a text area. And instead of relying on the user remembering a lot of confusing command-line switches, we can provide simple radio buttons for the user to choose from. This has the added advantage of making it easy to repeatedly interpret the same file according to different filters.
Figure 13.6 shows the finished application. This will give you some idea of what the code is aiming at. Initially, I started with a pencil-and-paper sketch, but I'll spare you my inartistic renderings. The single JFrame window is organized with a border layout. The west panel contains various controls for determining how the data is interpreted. The east panel contains the JFileChooser used to select the file. Notice that the Approve button has been customized to say "View File" rather than "Open". Ideally, I'd like to make the Cancel button say "Quit" instead, but the JFileChooser class doesn't allow you to do that without using resource bundles, a subject I would prefer to leave for another book. The south panel contains a scroll pane. Inside the scroll pane is a streamed text area.
Figure 13.6: The FileViewer
One fact I discovered while developing this application was that Swing components don't get along well with standard AWT components like Frame and TextArea. My initial attempts that mixed AWT components with the Swing JFileChooser rapidly crashed the VM. Replacing all components with their Swing equivalents solved the problem.
- Chapter 14: Multilingual Character Sets and Unicode
We live on a planet on which many languages are spoken. I can walk out my front door in Brooklyn on any given day and hear people conversing in French, Creole, Hebrew, Arabic, Spanish, and languages I don't even recognize. And the Internet is even more diverse than Brooklyn. A local doctor's office that sets up a storefront on the Web to sell vitamins may soon find itself shipping to customers whose native language is Chinese, Gujarati, Turkish, German, Portuguese, or something else. There's no such thing as a local business on the Internet.
However, the first computers and the first programming languages were mostly designed by English-speaking programmers in countries where English was the native language. These programmers designed character sets that worked well for English text, though not much else. The preeminent such set is ASCII. Since ASCII is a seven-bit character set, each ASCII character can easily be represented as a single byte, signed or unsigned. Thus, it's natural for ASCII-based programming languages to equate the character data type with the byte data type. In these languages, such as C, the same operations that read and write bytes also read and write characters.
Unfortunately, ASCII is inadequate for almost all non-English languages. It contains no cedillas, umlauts, betas, thorns, or any of the other thousands of non-English characters that are used to read and write text around the world. Fairly shortly after the development of ASCII, there was an explosion of extended character sets around the world, each of which encoded the basic ASCII characters as well as the additional characters needed for another language like Greek, Turkish, Arabic, Chinese, Japanese, or Russian. Many of these character sets are still used today, and much existing data is encoded in them.
However, these character sets are still inadequate for many needs. For one thing, most assume that you only want to encode English plus one other language. This makes it difficult for a Russian classicist to write a commentary on an ancient Greek text, for example. Furthermore, documents are limited by their character sets. Email sent from Morocco may become illegible in India if the sender is using an Arabic character set but the recipient is using Devanagari.
- Unicode
Unicode is Java's native character set. Each Unicode character is a two-byte, unsigned number with a value between 0 and 65,535. This provides enough space for characters from all the world's alphabetic scripts and the most common characters from the ideographic scripts of Chinese and Japanese. The current version of Unicode (2.1) defines 38,887 different characters from many languages, including English, Russian, Arabic, Hebrew, Greek, Thai, Korean, and Sanskrit. The most common ideographic characters from Japanese and Chinese are also included. However, Chinese alone contains over 80,000 different ideograms, so it's impossible to include them all in a two-byte set. A four-byte Universal Character Set (UCS) that will include the full Chinese and Japanese scripts is under development. Java does not yet support UCS.
The first 128 Unicode characters (characters 0 through 127) are identical to the ASCII character set. 32 is the ASCII space; therefore, 32 is the Unicode space. 33 is the ASCII exclamation point, so 33 is the Unicode exclamation point, and so on. Table 2.1, in Appendix B, shows this character set. The next 128 Unicode characters (characters 128 through 255) have the same values as the equivalent characters in the Latin-1 character set defined by ISO standard 8859-1. Latin-1, a slight variation of which is used by Windows, adds the various accented characters, umlauts, cedillas, upside-down question marks, and other characters needed to write text in most Western European languages. Table 2.2 shows these characters. The first 128 characters in Latin-1 are identical to the ASCII character set.
Values beyond 255 encode characters from various other character sets. Where possible, character blocks describing a particular group of characters map onto established encodings for that set of characters by simple transposition.
For instance, the Unicode Greek block, which begins at character 880, encodes the Greek alphabet and associated characters like the Greek question mark (;). Most of its letters are a direct transposition by 720 of the corresponding characters in the upper half (128 through 255) of the ISO 8859-7 character set, which is in turn based on the Greek national standard ELOT 928. For example, the small letter delta,
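The claim that the first 256 Unicode values coincide with ASCII and Latin-1 can be checked mechanically. The sketch below is only an illustration; the Latin1Check class and its method name are hypothetical. It decodes every possible byte value as ISO 8859-1 and compares each resulting char with the byte's unsigned value.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Check {
    // Returns true if decoding each byte 0..255 as Latin-1 yields the
    // Unicode character with the same numeric value.
    public static boolean latin1MatchesUnicode() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        String decoded = new String(all, StandardCharsets.ISO_8859_1);
        for (int i = 0; i < 256; i++) {
            if (decoded.charAt(i) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("space is " + (int) ' ');            // ASCII and Unicode 32
        System.out.println("exclamation point is " + (int) '!'); // ASCII and Unicode 33
        System.out.println("Latin-1 matches: " + latin1MatchesUnicode());
    }
}
```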
- Displaying Unicode Text
Although internally Java can handle full Unicode data (it's just numbers, after all), not all Java environments can display all Unicode characters. In fact, I'll go so far as to say none of the current Java environments, whether standalone virtual machines or web browsers, can display all Unicode characters.
Unicode is divided into blocks. For example, characters 0 through 127 are the Basic Latin block and contain ASCII. Characters 128 through 255 are the Latin-1 Supplement block and contain the upper 128 characters of the Latin-1 character set. Characters 9984 through 10,175 are the Dingbats block and contain the characters in the popular Zapf Dingbats font. Characters 19,968 through 40,959 are the unified Chinese-Japanese-Korean ideograph block. Each block represents a script or a subset of a script. As a rule of thumb, most runtime environments can display only some of these blocks. Occasionally, a particular runtime may be able to display some characters from a block but not others. For instance, most Macintoshes can display the entire Latin-1 Supplement block except for the Icelandic characters þ, Þ, Ý, Ð, and ð.
The biggest problem is the lack of fonts. Few computers have fonts for all the scripts Java supports. Even computers that possess the necessary fonts can't install a lot of them because of their size. A normal, 8-bit outline font ranges from about 30-60K. A Unicode font that omits the Han ideographs will be about 10 times that size. And a full Unicode font that includes the full range of Han ideographs will occupy between five and seven megabytes. Furthermore, text display algorithms based on English often break down when faced with right-to-left languages like Hebrew and Arabic, vertical languages like the traditional Chinese still used in Taiwan, or context-sensitive languages like Arabic.
Finally, even web browsers that can handle Chinese, Cyrillic, Arabic, Japanese, or other non-Roman scripts in HTML don't necessarily support those same scripts in applets. (HotJava 1.1 and earlier is a notable offender here.) It's even sometimes the case that characters an applet can draw directly using a
- Unicode Escapes
- Currently, there isn't a large installed base of Unicode text editors. There's an even smaller installed base of machines with full Unicode fonts installed. Therefore, it's essential that all valid Java programs can be written using nothing more than ASCII characters.

All Java keywords and operators as well as the names of all the classes, methods, and fields in the core API may be written in pure ASCII. This is by deliberate design on the part of JavaSoft. However, Unicode characters are explicitly allowed in comments, string and char literals, and identifiers. The following, the opening line from Homer's Odyssey, should be legal Java:

    /* Οδυσσεια */
    String αρχη = "Άνδρα μοι " + "έννεπε, " + "Μουσα, " + " ος μάλα πολλα";

To enable statements like that in Java source, non-ASCII characters are embedded through Unicode escape sequences. The escape sequence for a character is a backslash (\) followed by a small u, followed by the four-digit hexadecimal code for the character. For example:

    char tab = '\u0009';
    char softHyphen = '\u00AD';
    char sigma = '\u03C3';
    char squareKeesu = '\u30B9';

Using Unicode escapes, the opening line from Homer's Odyssey would be rendered as:

    /* \u039F\u03B4\u03C5\u03C3\u03C3\u03B5\u03B9\u03B1 */
    String \u03B1\u03C1\u03C7\u03B7 = "\u0386\u03BD\u03B4\u03C1\u03B1 \u03BC\u03BF\u03B9 "
     + "\u03AD\u03BD\u03BD\u03B5\u03C0\u03B5, "
     + "\u039C\u03BF\u03C5\u03C3\u03B1, "
     + " \u03BF\u03C2 \u03BC\u03AC\u03BB\u03B1 \u03C0\u03BF\u03BB\u03BB\u03B1";

Obviously, this is horribly inconvenient for anything more than an occasional non-ASCII character.

Many Java compilers assume that source files are written in ASCII and that the only Unicode characters present are Unicode escapes. During a single-pass preprocessing phase, the compiler converts each raw ASCII character or Unicode escape sequence to a two-byte Unicode character it stores in memory. Only after preprocessing is complete and the ASCII file has been converted to in-memory Unicode is the file actually compiled. Some compilers and runtimes will also compile the upper 128 characters of the ISO Latin-1 character set. However, some do not. Worse yet, some Java virtual machines can compile files containing non-ASCII, ISO Latin-1 characters but can't run the files they've compiled. For safety's sake and maximum portability, you should escape all non-ASCII characters.
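The advice to escape all non-ASCII characters can be automated. The following is a minimal sketch, not code from the book: the class name UnicodeEscaper and its escape() method are hypothetical helpers that turn every character above 127 into the backslash-u form described above and leave ASCII untouched.

```java
class UnicodeEscaper {

    // Hypothetical helper: copies ASCII characters unchanged and rewrites
    // every other char as a backslash, a small u, and four hex digits.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                sb.append(c);
            } else {
                // The cast to int is needed because %X does not accept a char
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The argument holds a real sigma; the output holds its ASCII escape
        System.out.println(escape("\u03C3 = sigma"));
    }
}
```

Because each char is escaped separately, the result is always pure ASCII and can be pasted into a source file that any compiler will accept.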
- UTF-8
- Since every Unicode character is encoded in exactly two bytes, Unicode is a fairly simple encoding. The first two bytes of a file are the first character. The next two bytes are the second character, and so on. This makes parsing Unicode data relatively simple compared to schemes that use variable-width characters. The downside is that Unicode is far from the most efficient encoding possible. In a file containing mostly English text, the high bytes of almost all the characters will be 0. These bytes can occupy as much as half of the file. If you're sending data across the network, Unicode data can take twice as long.

A more efficient encoding can be achieved for files that are composed primarily of ASCII text by encoding the more common characters in fewer bytes. UTF-8 is one such format that encodes the non-null ASCII characters in a single byte, characters between 128 and 2047 and ASCII null in two bytes, and the remaining characters in three bytes. While theoretically this encoding might expand a file's size by 50%, because most text files contain primarily ASCII, in practice it's almost always a huge savings. Therefore, Java uses UTF-8 in string literals, identifiers, and other text data in compiled byte code. UTF-8 is also a common encoding for XML files and the native encoding of Bell Labs' experimental Plan 9 operating system.

To better understand UTF-8, consider a typical Unicode character as a sequence of 16 bits: x15 x14 x13 x12 x11 x10
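The one-, two-, and three-byte ranges described above can be checked directly. This is a small sketch under stated assumptions: the class and method names are hypothetical, utf8Length() uses the standard UTF-8 encoder, and modifiedUtf8Length() uses DataOutputStream.writeUTF(), which writes the variant in which the null character takes two bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8Widths {

    // Number of bytes the standard UTF-8 encoder produces for one char
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes writeUTF() uses for the text itself
    static int modifiedUtf8Length(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        // writeUTF() prefixes the data with a two-byte length field
        return bytes.size() - 2;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length('A'));          // ASCII letter
        System.out.println(utf8Length('\u03C3'));     // Greek small sigma
        System.out.println(utf8Length('\u30B9'));     // a katakana character
        System.out.println(modifiedUtf8Length("\0")); // null in writeUTF() form
    }
}
```

Note that the standard encoder writes the null character as a single zero byte; only the writeUTF() form gives it two bytes.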
- The char Data Type
- The char primitive data type in Java is a two-byte unsigned integer whose values range from 0 to 65,535. char variables may be assigned from int literals, like this:

    char exclamationPoint = 33;

In the virtual machine, chars are promoted to ints in arithmetic operations like addition and multiplication. Therefore, operations more complicated than a simple assignment require an explicit cast to char, like this:

    char a = 97;
    char b = (char) (a + 1);

In practice, chars are rarely used in arithmetic operations. Instead, they're given symbolic meanings through mappings to particular elements of the Unicode character set. For instance, 33 is the Unicode (and ASCII) character for the exclamation point (!). 97 is the Unicode (and ASCII) character for the small letter a. When the Unicode and printable ASCII characters converge, as they do for values between 32 and 127, a char may be written in Java source code as a char literal. This is the desired ASCII character between single quote marks, like this:

    char exclamationPoint = '!';
    char a = 'a';
    char b = 'b';

For characters outside this range, you can assign values to chars using Unicode escape sequences, like this:

    char tab = '\u0009';
    char softHyphen = '\u00AD';
    char sigma = '\u03C3';
    char squareKeesu = '\u30B9';
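The promotion rule and the three ways of writing a char value can be seen side by side in a short sketch. The class name CharBasics and the next() helper are hypothetical and exist only for illustration.

```java
class CharBasics {

    // Returns the character that follows c in Unicode order.
    // c + 1 has type int, so the result must be cast back to char.
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char fromInt = 97;            // int literal in range, no cast needed
        char fromLiteral = 'a';       // char literal
        char fromEscape = '\u0061';   // Unicode escape for the same character

        System.out.println(fromInt == fromLiteral);
        System.out.println(fromLiteral == fromEscape);
        System.out.println(next(fromInt));
    }
}
```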
As for the other primitive data types, the core API includes a type wrapper class for char values. This is java.lang.Character:

    public final class Character implements Serializable

In Java 2, Character also implements Comparable:

    public final class Character implements Serializable, Comparable // Java 2

Section 14.5.1.1: Constructor
This class has a single constructor:
- Other Encodings
- Although Unicode is the most advanced and comprehensive character set yet designed on this planet, it has not taken the world by storm. Compared to the vast quantities of ASCII data, there are virtually no Unicode files on today's computers. Although Unicode support is growing, there will doubtless be legacy data in other encodings that must be read for centuries to come. A lot of it is in the Unicode subsets ASCII and ISO Latin-1, but a lot of it is also in less popular encoding schemes like EBCDIC and MacRoman. Those only cover English and a few Western European languages. There are multiple encodings in use for Arabic, Turkish, Hebrew, Greek, Cyrillic, Chinese, Japanese, Korean, and many other languages and scripts. The Reader and Writer classes (discussed in the next chapter) allow you to read and write data in these different character sets. The String class also has a number of methods that convert between different encodings (though a String object itself is always represented in Unicode). Furthermore, the JDK includes a character-mode tool based on these classes called native2ascii that performs such conversions on existing files.

The name native2ascii is a misnomer. Rather than converting to ASCII, it converts to ISO Latin-1 with Unicode characters embedded with Unicode escape sequences like \u020F. It can also work in reverse, converting an ISO Latin-1 file with embedded Unicode to a native character set. For example, to copy the contents of the file macdata.txt from the MacRoman encoding into a new file called isodata.txt encoded with ISO Latin-1 with Unicode escapes, you would type:

    % native2ascii -encoding MacRoman macdata.txt isodata.txt

You can convert it back with the -reverse option:

    % native2ascii -encoding MacRoman -reverse isodata.txt macdata.txt

If you don't specify a particular encoding, native2ascii makes its best guess as to the platform's native encoding. This best guess is read from the system property file.encoding
- Converting Between Byte Arrays and Strings
- The java.lang.String class has several constructors that form strings from byte arrays and several methods that return a byte array corresponding to a given string. Anytime a Unicode string is converted to bytes or vice versa, that conversion happens according to one of the encodings listed in Table 2.4. The same string can produce different byte arrays if different encodings are used. Six constructors form a new String object from a byte array:

    public String(byte[] ascii, int highByte)
    public String(byte[] ascii, int highByte, int offset, int length)
    public String(byte[] data, String encoding) throws UnsupportedEncodingException
    public String(byte[] data, int offset, int length, String encoding) throws UnsupportedEncodingException
    public String(byte[] data)
    public String(byte[] data, int offset, int length)
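To illustrate the point that one string can yield different byte arrays, here is a minimal sketch built on the String(byte[] data, String encoding) constructor and the matching getBytes(String) method. The class name StringBytes and its two helpers are hypothetical; the encoding names used ("ISO-8859-1" and "UTF-8") are ones every Java platform is required to support.

```java
import java.io.UnsupportedEncodingException;

class StringBytes {

    // Converts a string to bytes in the named encoding
    static byte[] encode(String text, String encoding) {
        try {
            return text.getBytes(encoding);
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, ex);
        }
    }

    // Converts bytes in the named encoding back to a string
    static String decode(byte[] data, String encoding) {
        try {
            return new String(data, encoding);
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, ex);
        }
    }

    public static void main(String[] args) {
        String cafe = "caf\u00E9"; // four characters, the last one non-ASCII

        System.out.println(encode(cafe, "ISO-8859-1").length);
        System.out.println(encode(cafe, "UTF-8").length);

        // Decoding with the wrong encoding does not restore the original text
        String wrong = decode(encode(cafe, "UTF-8"), "ISO-8859-1");
        System.out.println(wrong.equals(cafe));
    }
}
```

The round trip only works when the same encoding is used in both directions, which is why the encoding has to travel with the bytes.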
The first two constructors listed above, the ones with the highByte argument, are leftovers from Java 1.0 that are deprecated in Java 1.1. These two constructors do not accurately translate non-Latin-1 character sets into Unicode. Instead, they read each byte in the ascii array as the low-order byte of a two-byte character, then fill in the high-order byte with the highByte argument. For example:

    byte[] isoLatin1 = new byte[256];
    for (int i = 0; i < 256; i++) isoLatin1[i] = (byte) i;
    String s = new String(isoLatin1, 0);

Frankly, this is a kludge; it's deprecated for good reason. This scheme works quite well for Latin-1 data with a high byte of 0. However, it's extremely difficult to use for character sets where different characters need to have different high bytes, and it's completely unworkable for character sets like MacRoman that also need to adjust bits in the low-order byte to conform to Unicode. The only approach that genuinely works for the broad range of character sets Java programs may be asked to handle is table lookup. Each character set in Table 2.4 is associated with a table mapping characters in the set to Unicode characters. These tables are hidden inside the
- Chapter 15: Readers and Writers
- A language that supports international text must separate the reading and writing of raw bytes from the reading and writing of characters, since in an international system they are no longer the same thing. Classes that read characters must be able to parse a variety of character encodings, not just ASCII, and translate them into the language's native character set. Classes that write characters must be able to translate the language's native character set into a variety of formats and write those. In Java this task is performed by the Reader and Writer classes.

You're probably going to experience a little déjà vu. The java.io.Writer class is modeled on the java.io.OutputStream class. The java.io.Reader class is modeled on the java.io.InputStream class. The names and signatures of the members of the Reader and Writer classes are similar (sometimes identical) to the names and signatures of the members of the InputStream and OutputStream classes. The patterns these classes follow are similar as well. Filtered input and output streams are chained to other streams in their constructors. Similarly, filtered readers and writers are chained to other readers and writers in their constructors. InputStream and OutputStream are abstract superclasses that identify common functionality in the concrete subclasses. Likewise, Reader and Writer are abstract superclasses that identify common functionality in the concrete subclasses. The difference between readers and writers and input and output streams is that streams are fundamentally byte based, while readers and writers are fundamentally character based. Where an input stream reads a byte, a reader reads a character; where an output stream writes a byte, a writer writes a character.

While bytes are a more or less universal concept, characters are not. As you learned in the last chapter, the same character can be encoded differently in different character sets. Different character sets encode different characters. Characters can even have different widths in different character sets. For example, ASCII and ISO Latin-1 use one-byte characters. Unicode uses two-byte characters. UTF-8 uses characters of varying width between one and three bytes. Concrete subclasses of the
- The java.io.Writer Class
- The Writer class is abstract, just like OutputStream is abstract. You won't have any pure instances of Writer that are not also instances of some concrete subclass of Writer. However, many of the subclasses of Writer differ primarily in the targets of the text they write, just as many concrete subclasses of OutputStream differ only in the targets of the data they write. Most of the time you don't care about the difference between FileOutputStream and ByteArrayOutputStream. Similarly, most of the time you won't care about the differences between FileWriter and StringWriter. You'll just use the methods of the common superclass, java.io.Writer.

You use a writer almost exactly as you use an output stream. Rather than writing bytes, you write chars. The write() method writes a subarray from the char array text starting at offset and continuing for length characters:

    public abstract void write(char[] text, int offset, int length) throws IOException

For example, given some Writer object w, you can write the string Testing 1-2-3 like this:

    char[] test = {'T', 'e', 's', 't', 'i', 'n', 'g', ' ', '1', '-', '2', '-', '3'};
    w.write(test, 0, test.length);

This method is abstract. Concrete subclasses that convert chars into bytes according to a specified encoding and write those bytes onto an underlying stream must override this method. An IOException may be thrown if the underlying stream's write() method throws an IOException. You can also write a single character, an entire array of characters, a string, or a substring:

    public void write(int c) throws IOException
    public void write(char[] text) throws IOException
    public void write(String s) throws IOException
    public void write(String s, int offset, int length) throws IOException
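The write() variants listed above can be combined freely on the same writer. The following sketch is illustrative only: the class WriterOverloads and its methods are hypothetical, and a StringWriter is used as the target simply so the result can be inspected without touching a file.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class WriterOverloads {

    // Writes "Testing 1-2-3" to any Writer using four different overloads
    static void writeTest(Writer w) throws IOException {
        char[] test = {'T', 'e', 's', 't', 'i', 'n', 'g'};
        w.write(test, 0, test.length); // subarray of chars
        w.write(' ');                  // single character passed as an int
        w.write("1-2");                // whole string
        w.write("xx-3yy", 2, 2);       // substring: two chars starting at index 2
    }

    // Runs writeTest() against an in-memory writer and returns the text
    static String demo() {
        StringWriter target = new StringWriter();
        try {
            writeTest(target);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because writeTest() depends only on the abstract Writer type, the same method works unchanged with a FileWriter or an OutputStreamWriter.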
The default implementations of these four write() methods convert their first argument into an array of chars and pass that to
- The OutputStreamWriter Class
- java.io.Writer is an abstract class. Its most basic concrete subclass is OutputStreamWriter:

    public class OutputStreamWriter extends Writer

Its constructor connects a character writer to an underlying output stream:

    public OutputStreamWriter(OutputStream out)
    public OutputStreamWriter(OutputStream out, String encoding) throws UnsupportedEncodingException

The first constructor assumes that the text in the stream is to be written using the platform's default encoding. The second constructor specifies an encoding. There's no easy way to determine which encodings are supported, but the ones listed in Table 2.4 in Appendix B are supported by most VMs. For example, this code attaches an OutputStreamWriter to System.out with the default encoding:

    OutputStreamWriter osw = new OutputStreamWriter(System.out);

The default encoding is normally ISO Latin-1, except on Macs, where it is MacRoman. Whatever it is, you can find it in the system property file.encoding:

    String defaultEncoding = System.getProperty("file.encoding");

On the other hand, if you want to write a file encoded in ISO 8859-7 (ASCII plus Greek) you might do this:

    FileOutputStream fos = new FileOutputStream("greek.txt");
    OutputStreamWriter greekWriter = new OutputStreamWriter(fos, "8859_7");

The write() methods convert characters to bytes according to a specified character encoding and write those bytes onto the underlying output stream:

    public void write(int c) throws IOException
    public void write(char[] text, int offset, int length) throws IOException
    public void write(String s, int offset, int length) throws IOException

Once the Writer is constructed, writing the characters is easy. For example:

    String arete = "\u03B1\u03C1\u03B5\u03C4\u03B7";
    greekWriter.write(arete, 0, arete.length());
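The effect of the encoding argument is easiest to see when the bytes land in memory rather than in a file. This is a sketch under stated assumptions: the class EncodedWrite and its encodeWith() helper are hypothetical, and the name "ISO-8859-7" is the modern alias for the Greek encoding, which is commonly available but not guaranteed on every VM, so main() checks for it first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class EncodedWrite {

    // Writes text through an OutputStreamWriter and returns the encoded bytes
    static byte[] encodeWith(String text, String encoding) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the writer flushes any characters it is still holding
        try (OutputStreamWriter osw = new OutputStreamWriter(bytes, encoding)) {
            osw.write(text, 0, text.length());
        } catch (IOException ex) { // also covers UnsupportedEncodingException
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String arete = "\u03B1\u03C1\u03B5\u03C4\u03B7";

        System.out.println("UTF-8 bytes: " + encodeWith(arete, "UTF-8").length);
        if (Charset.isSupported("ISO-8859-7")) {
            System.out.println("ISO-8859-7 bytes: " + encodeWith(arete, "ISO-8859-7").length);
        }
    }
}
```

The writer must be flushed or closed before the byte array is read; otherwise some characters may still be sitting in the writer's internal buffer.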
- The java.io.Reader Class
- You use a reader almost exactly as you use an input stream. Rather than reading bytes, you read characters. The basic read() method reads a specified number of characters from the underlying input stream into an array starting at a given offset:

    public abstract int read(char[] buffer, int offset, int length) throws IOException

This read() method returns the number of characters actually read. As with input streams reading bytes, there may not be as many characters available as you requested. Also like the read() method of an input stream, it returns -1 when it detects the end of the data.

This read() method is abstract. Concrete subclasses that read bytes from some source must override this method. An IOException may be thrown if the underlying stream's read() method throws an IOException or an encoding error is detected.

You can also fill an array with characters using this method:

    public int read(char[] buffer) throws IOException

This is equivalent to invoking read(buffer, 0, buffer.length). Thus, it also returns the number of characters read and throws an IOException when the underlying stream throws an IOException or when an encoding error is detected. The following method reads a single character and returns it:

    public int read() throws IOException

Although an int is returned, this int is always between 0 and 65,535 and may be cast to a char without losing information. (The exception is the value -1, which signals the end of the stream.) All three read() methods block until some input is available, an I/O error occurs, or the end of the stream is reached.

You can skip a certain number of characters with the skip() method. This method also blocks until some characters are available. It returns the number of characters actually skipped, which may be fewer than requested and is 0 if the end of the stream has already been reached:

    public long skip(long n) throws IOException
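Because read() may return fewer characters than requested, callers normally loop until it reports -1. The following sketch shows that loop; the class ReadAll and its methods are hypothetical, and the four-character buffer is deliberately tiny so that several calls are needed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadAll {

    // Reads characters until the reader reports the end of the data
    static String readFully(Reader in) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            // Only the first n chars of the buffer are valid after each call
            text.append(buffer, 0, n);
        }
        return text.toString();
    }

    // Convenience wrapper that reads from an in-memory string
    static String demo(String source) {
        try {
            return readFully(new StringReader(source));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("Testing 1-2-3"));
    }
}
```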
The ready() method returns true if the reader is ready to be read from,
- The InputStreamReader Class
- The most basic concrete subclass of Reader is InputStreamReader:

    public class InputStreamReader extends Reader

The constructor connects a character reader to an underlying input stream:

    public InputStreamReader(InputStream in)
    public InputStreamReader(InputStream in, String encoding) throws UnsupportedEncodingException

The first constructor uses the platform's default encoding, as given by the system property file.encoding. The second one uses the specified encoding. For example, to attach an InputStreamReader to System.in with the default encoding (generally ISO Latin-1):

    InputStreamReader isr = new InputStreamReader(System.in);

If you want to read a file encoded in Latin-5 (ASCII plus Turkish, as specified by ISO 8859-9), you might do this:

    FileInputStream fin = new FileInputStream("symbol.txt");
    InputStreamReader isr = new InputStreamReader(fin, "8859_9");

There's no easy way to determine which encodings are supported, but the ones listed in Table 2.4 are supported by most VMs.

The read() methods read bytes from an underlying input stream and convert those bytes to characters according to the specified encoding:

    public int read() throws IOException
    public int read(char c[], int off, int length) throws IOException

The getEncoding() method returns a string containing the name of the encoding used by this reader:

    public String getEncoding()

The remaining two methods just override methods from java.io.Reader but behave identically from the perspective of the programmer:

    public boolean ready() throws IOException
    public void close() throws IOException

Example 15.2 uses an InputStreamReader to read a file in a user-specified encoding. The FileConverter reads the name of the input file, the name of the output file, the input encoding, and the output encoding. Characters that are not available in the output character set are replaced by the substitution character, generally the question mark.
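The conversion idea can be sketched in memory without any files. This is not the book's Example 15.2: the class Transcoder and its convert() method are hypothetical, and they simply chain an InputStreamReader to an OutputStreamWriter so that bytes in one encoding come out as bytes in another.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

class Transcoder {

    // Decodes input with inEncoding, then re-encodes it with outEncoding
    static byte[] convert(byte[] input, String inEncoding, String outEncoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(input), inEncoding);
             Writer w = new OutputStreamWriter(out, outEncoding)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = r.read(buffer)) != -1) {
                w.write(buffer, 0, n);
            }
        } catch (IOException ex) { // also covers UnsupportedEncodingException
            throw new UncheckedIOException(ex);
        }
        // The writer has been closed, and therefore flushed, by this point
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9}; // e with acute accent in ISO Latin-1
        byte[] utf8 = convert(latin1, "ISO-8859-1", "UTF-8");
        System.out.println(utf8.length + " byte(s) after conversion to UTF-8");
    }
}
```

When the output encoding has no byte sequence for a character, the writer substitutes a replacement, which for ASCII output is the question mark mentioned above.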
- Character Array Readers and Writers
- The java.io.ByteArrayInputStream and java.io.ByteArrayOutputStream classes let programmers use stream methods to read and write arrays of bytes. The java.io.CharArrayReader and java.io.CharArrayWriter classes allow programmers to use Reader and Writer methods to read and write arrays of chars. Since char arrays are purely internal to Java and thus composed of true Unicode characters, this is one of the few uses of readers and writers where you don't need to concern yourself with conversions between different encodings. If you want to read arrays of text encoded in some non-Unicode encoding, you should chain a ByteArrayInputStream to an InputStreamReader instead. Similarly, to write text into a byte array in a non-Unicode encoding, just chain an OutputStreamWriter to a ByteArrayOutputStream.

The CharArrayWriter maintains an internal array of chars into which successive characters are written. The array is expanded as needed. This array is stored in a protected field called buf:

    protected char[] buf

For efficiency, the array generally contains more components than characters. The number of characters actually written is stored in a protected int field called count:

    protected int count

The value of the count field is always less than or equal to buf.length.

The no-argument constructor creates a CharArrayWriter object with a 32-character buffer. This is on the small side, so you can expand it with the second constructor:

    public CharArrayWriter()
    public CharArrayWriter(int initialSize)

The write() methods write their characters into the buffer. If there's insufficient space in buf to hold the characters, its size is doubled.

    public void write(int c)
    public void write(char[] text, int offset, int length)
    public void write(String s, int offset, int length)
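The automatic expansion can be observed by writing more than 32 characters into a default-sized writer. The sketch below is illustrative: the class CharArrayDemo and its collect() helper are hypothetical, and it relies on the toCharArray() method of CharArrayWriter, which returns a copy of just the characters written so far.

```java
import java.io.CharArrayWriter;

class CharArrayDemo {

    // Appends each part to a CharArrayWriter and returns the collected chars
    static char[] collect(String... parts) {
        CharArrayWriter caw = new CharArrayWriter(); // default 32-char buffer
        for (String part : parts) {
            // These write() methods do not declare IOException
            caw.write(part, 0, part.length());
        }
        return caw.toCharArray();
    }

    public static void main(String[] args) {
        // 40 characters in total, more than the initial buffer can hold
        char[] result = collect("0123456789", "0123456789", "0123456789", "0123456789");
        System.out.println(result.length);
    }
}
```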
- String Readers and Writers
- The java.io.StringReader and java.io.StringWriter classes allow programmers to use Reader and Writer methods to read and write strings. Like char arrays, Java strings are also composed of pure Unicode characters. Therefore, they're good sources of data for readers and good targets for writers. This is the other common case where readers and writers don't need to convert between different encodings.

The StringWriter class would more accurately be called StringBufferWriter, but StringWriter is more poetic. A StringWriter maintains an internal java.lang.StringBuffer object to which written characters are appended. This buffer can easily be converted to a string as necessary.

    public class StringWriter extends Writer

There is a single public constructor:

    public StringWriter()

There is also a constructor that allows you to specify the initial size of the internal string buffer. This isn't too important, because string buffers (and, by extension, string writers) are expanded as necessary. Still, if you can estimate the size of the string in advance, it's marginally more efficient to select a size big enough to hold all characters that will be written. The constructor is protected in Java 1.1 and public in Java 2:

    protected StringWriter(int initialSize)
    public StringWriter(int initialSize) // Java 2

The StringWriter class has the usual collection of write() methods, all of which just append their data to the StringBuffer:

    public void write(int c)
    public void write(char[] text, int offset, int length)
    public void write(String s)
    public void write(String s, int offset, int length)

There are flush() and close() methods, but both have empty method bodies, as string writers operate completely internal to Java and do not require flushing or closing:

    public void flush()
    public void close()
- Reading and Writing Files
- You've already learned how to chain an OutputStreamWriter to a FileOutputStream and an InputStreamReader to a FileInputStream. Although this isn't hard, Java provides two simple utility classes that take care of the details, java.io.FileWriter and java.io.FileReader.

The FileWriter class is a subclass of OutputStreamWriter that writes text files using the platform's default character encoding and buffer size. If you need to change these values, construct an OutputStreamWriter on a FileOutputStream instead.

    public class FileWriter extends OutputStreamWriter

This class has four constructors:

    public FileWriter(String fileName) throws IOException
    public FileWriter(String fileName, boolean append) throws IOException
    public FileWriter(File file) throws IOException
    public FileWriter(FileDescriptor fd)

The first constructor opens a file and positions the file pointer at the beginning of the file. Any text in the file is overwritten. For example:

    FileWriter fw = new FileWriter("36.html");

The second constructor allows you to specify that new text is appended to the existing contents of the file rather than overwriting them by setting the second argument to true. For example:

    FileWriter fw = new FileWriter("36.html", true);
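The difference between overwriting and appending can be demonstrated with a throwaway file. This is a minimal sketch: the class AppendDemo is hypothetical, and it writes to a temporary file created with File.createTempFile() rather than to a fixed filename.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

class AppendDemo {

    // Overwrites a file, appends to it, and returns the final contents
    static String overwriteThenAppend() {
        try {
            File file = File.createTempFile("filewriter-demo", ".txt");
            file.deleteOnExit();
            String name = file.getPath();

            try (FileWriter fw = new FileWriter(name)) {
                fw.write("old text");
            }
            // One-argument constructor: the previous contents are discarded
            try (FileWriter fw = new FileWriter(name)) {
                fw.write("first");
            }
            // append == true: the new text follows the existing contents
            try (FileWriter fw = new FileWriter(name, true)) {
                fw.write(" second");
            }

            StringBuilder contents = new StringBuilder();
            try (FileReader fr = new FileReader(name)) {
                int c;
                while ((c = fr.read()) != -1) {
                    contents.append((char) c);
                }
            }
            return contents.toString();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwriteThenAppend());
    }
}
```

The text is kept to ASCII so that the result does not depend on the platform's default encoding, which both FileWriter and FileReader use here.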
The third and fourth FileWriter constructors use a File object and a FileDescriptor, respectively, instead of a filename to identify the file to be written to. Any pre-existing contents in a file so opened are overwritten.

No methods other than the constructors are declared in this class. You use the standard Writer methods like write(), flush(), and close() to write the text in the file.

The FileReader class is a subclass of InputStreamReader that reads text files using the platform's default character encoding. If you need to change the encoding, construct an
- Buffered Readers and Writers
- Input and output can be time-consuming operations. It's often quicker to read or write text in large chunks rather than in many separate smaller pieces, even when you only process the text in the smaller pieces. The java.io.BufferedReader and java.io.BufferedWriter classes provide internal character buffers. Text that's written to a buffered writer is stored in the internal buffer and only written to the underlying writer when the buffer fills up or is flushed. Likewise, reading text from a buffered reader may cause more characters to be read than were requested; the extra characters are stored in an internal buffer. Future reads first access characters from the internal buffer and only access the underlying reader when the buffer is emptied.

The java.io.BufferedWriter class is a subclass of java.io.Writer that you chain to another Writer class to buffer characters. This allows more efficient writing of text.

    public class BufferedWriter extends Writer

There are two constructors. One has a default buffer size (8192 characters); the other lets you specify the buffer size:

    public BufferedWriter(Writer out)
    public BufferedWriter(Writer out, int size)

Each time you write to an unbuffered writer, there's a matching write to the underlying output stream. Therefore, it's a good idea to wrap a BufferedWriter around each writer whose write() operations are expensive, such as a FileWriter. For example:

    BufferedWriter bw = new BufferedWriter(new FileWriter("37.html"));

BufferedWriter overrides most of its superclass's methods, including:

    public void write(int c) throws IOException
    public void write(char[] text, int offset, int length) throws IOException
    public void write(String s, int offset, int length) throws IOException
    public void flush() throws IOException
    public void close() throws IOException
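The buffering described above can be made visible by wrapping a writer whose contents can be inspected at any time. In this sketch the class BufferingDemo is hypothetical, and a StringWriter stands in for an expensive target such as a FileWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class BufferingDemo {

    // Returns what the underlying writer holds before and after flush()
    static String[] beforeAndAfterFlush() {
        StringWriter target = new StringWriter();
        BufferedWriter bw = new BufferedWriter(target);
        try {
            bw.write("buffered text");
            // The short string still sits in the BufferedWriter's own buffer
            String before = target.toString();

            bw.flush();
            // flush() pushes the buffered characters to the underlying writer
            String after = target.toString();

            return new String[] {before, after};
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        String[] states = beforeAndAfterFlush();
        System.out.println("before flush: [" + states[0] + "]");
        System.out.println("after flush:  [" + states[1] + "]");
    }
}
```

This is also why forgetting to flush or close a buffered writer is a common cause of truncated output files.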
- Print Writers
- The java.io.PrintWriter class is a subclass of java.io.Writer that contains the familiar print() and println() methods from System.out and other instances of PrintStream. It's deliberately similar to the java.io.PrintStream class. In Java 1.0 PrintStream was used for text-oriented output, but it didn't handle multiple-byte character sets particularly well (or really at all). In Java 1.1 and later, streams are only for byte-oriented and numeric output; writers should be used when you want to output text.

The main difference between PrintStream and PrintWriter is that PrintWriter handles multiple-byte and other non-ISO Latin-1 character sets properly. The other, more minor difference is that automatic flushing is performed only when println() is invoked, not every time a newline character is seen. Sun would probably like to deprecate PrintStream and use PrintWriter instead, but that would break too much existing code. (In fact, Sun did deprecate the PrintStream() constructors in 1.1, but they undeprecated them in Java 2.)

There are four constructors in this class:

    public PrintWriter(Writer out)
    public PrintWriter(Writer out, boolean autoFlush)
    public PrintWriter(OutputStream out)
    public PrintWriter(OutputStream out, boolean autoFlush)

The PrintWriter can send text either to an output stream or to another writer. If autoFlush is set to true, the PrintWriter is flushed every time println() is invoked.

The PrintWriter class implements the abstract write() method from java.io.Writer and overrides five other methods:

    public void write(int c)
    public void write(char[] text)
    public void write(String s)
    public void write(String s, int offset, int length)
    public void flush()
    public void close()
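The automatic flushing rule, that only println() triggers a flush, can be checked with a buffered target. The sketch below is illustrative: the class AutoFlushDemo is hypothetical, and the BufferedWriter is placed between the PrintWriter and the StringWriter only so that unflushed text stays out of sight.

```java
import java.io.BufferedWriter;
import java.io.PrintWriter;
import java.io.StringWriter;

class AutoFlushDemo {

    // Returns what the target holds after print() and then after println()
    static String[] printThenPrintln() {
        StringWriter target = new StringWriter();
        PrintWriter pw = new PrintWriter(new BufferedWriter(target), true);

        pw.print("no flush yet");
        // print() does not flush, so the text is still in the buffer
        String afterPrint = target.toString();

        pw.println();
        // println() with autoFlush set to true flushes the whole chain
        String afterPrintln = target.toString();

        return new String[] {afterPrint, afterPrintln};
    }

    public static void main(String[] args) {
        String[] states = printThenPrintln();
        System.out.println("after print():   [" + states[0] + "]");
        System.out.print("after println(): [" + states[1] + "]");
    }
}
```

None of these calls needs a try block, because PrintWriter methods do not throw IOException.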
The write(), flush(), and close() methods are used almost identically to their equivalents in any other Writer class. The one difference is that none of them throw IOExceptions; in fact, no method in the PrintWriter
Purchase this book now or read it online at Safari to get the whole thing! - Piped Readers and Writers
- Piped readers and writers do for character streams what piped input and output streams do for byte streams: they allow two threads to communicate. Character output from one thread becomes character input for the other thread:
    public class PipedWriter extends Writer
    public class PipedReader extends Reader
The PipedWriter class has two constructors. The first constructs an unconnected PipedWriter object. The second constructs one that's connected to the PipedReader object sink:
    public PipedWriter()
    public PipedWriter(PipedReader sink) throws IOException
The PipedReader class also has two constructors. Again, the first constructor creates an unconnected PipedReader object. The second constructs one that's connected to the PipedWriter object source:
    public PipedReader()
    public PipedReader(PipedWriter source) throws IOException
Piped readers and writers are normally created in pairs. The piped writer becomes the underlying source for the piped reader. This is one of the few cases where a reader does not have an underlying input stream. For example:
    PipedWriter pw = new PipedWriter();
    PipedReader pr = new PipedReader(pw);
This simple example is a little deceptive, because these lines of code will normally be in different methods and perhaps even different classes. Some mechanism must be established to pass a reference to the PipedWriter into the thread that handles the PipedReader, or you can create them in the same thread, then pass a reference to the connected stream into a separate thread.
Alternately, you can start with a PipedReader and then wrap it with a PipedWriter:
    PipedReader pr = new PipedReader();
    PipedWriter pw = new PipedWriter(pr);
Or you can create them both unconnected, then use one or the other's connect() method to link them:
    public void connect(PipedReader sink) throws IOException
    public void connect(PipedWriter source) throws IOException
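As a rough illustration of two threads communicating through a connected pair, the following sketch creates both ends in one thread and hands the writer to a second thread. The class name PipeDemo and the method transfer() are made up for this example, and the lambda and try-with-resources syntax assume a much newer Java than the one this chapter describes.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.UncheckedIOException;

public class PipeDemo {

    // Hypothetical helper: sends message through a pipe from a second
    // thread and returns whatever the calling thread reads back.
    static String transfer(final String message) {
        try {
            final PipedWriter pw = new PipedWriter();
            PipedReader pr = new PipedReader(pw); // connect the pair

            // The producer thread owns the writer and closes it when done,
            // which is what lets the reader see end of stream.
            Thread producer = new Thread(() -> {
                try (PipedWriter w = pw) {
                    w.write(message);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            // The calling thread reads characters until the writer is closed.
            StringBuilder received = new StringBuilder();
            int c;
            while ((c = pr.read()) != -1) {
                received.append((char) c);
            }
            pr.close();
            producer.join();
            return received.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transfer("hello from the writer thread"));
    }
}
```

Closing the writer in the producer thread matters here: without it the reading loop would have no way to learn that no more characters are coming.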
- Filtered Readers and Writers
- The java.io.FilterReader and java.io.FilterWriter classes are abstract classes that read characters and filter them in some way before passing the text along. You can imagine a FilterReader that converts all characters to uppercase.
    public abstract class FilterReader extends Reader
    public abstract class FilterWriter extends Writer
Although FilterReader and FilterWriter are modeled after java.io.FilterInputStream and java.io.FilterOutputStream, they are much less commonly used than those classes. There are no concrete subclasses of FilterWriter in the java packages and only one concrete subclass of FilterReader (PushbackReader, discussed later). These classes exist so you can write your own filters.
FilterReader has a single constructor, which is protected:
    protected FilterReader(Reader in)
The in argument is the Reader to which this filter is chained. This reference is stored in a protected field called in, from which text for this filter is read. The field is null after the filter has been closed.
    protected Reader in
Since FilterReader is an abstract class, only subclasses may be instantiated. Therefore, it doesn't matter that the constructor is protected, since it may only be invoked from subclass constructors.
FilterReader provides the usual collection of read(), skip(), ready(), markSupported(), mark(), reset(), and close() methods:
    public int read() throws IOException
    public int read(char[] text, int offset, int length) throws IOException
    public long skip(long n) throws IOException
    public boolean ready() throws IOException
    public boolean markSupported()
    public void mark(int readAheadLimit) throws IOException
    public void reset() throws IOException
    public void close() throws IOException
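The uppercasing filter imagined at the start of this section could be sketched roughly as follows. The class name UpperCaseReader and the convenience method upper() are invented for illustration; the sketch converts one char at a time, so it makes no attempt to handle surrogate pairs or locale-specific case rules.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class UpperCaseReader extends FilterReader {

    public UpperCaseReader(Reader in) {
        super(in); // stores the underlying reader in the protected field "in"
    }

    // Reads one character from the underlying reader and uppercases it.
    @Override
    public int read() throws IOException {
        int c = in.read();
        return c == -1 ? -1 : Character.toUpperCase((char) c);
    }

    // Reads a block from the underlying reader, then uppercases it in place.
    @Override
    public int read(char[] text, int offset, int length) throws IOException {
        int n = in.read(text, offset, length);
        for (int i = offset; i < offset + n; i++) {
            text[i] = Character.toUpperCase(text[i]);
        }
        return n; // -1 at end of stream, in which case the loop never runs
    }

    // Hypothetical convenience method used only to exercise the filter.
    static String upper(String s) {
        try (Reader r = new UpperCaseReader(new StringReader(s))) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[64];
            int n;
            while ((n = r.read(buffer, 0, buffer.length)) != -1) {
                sb.append(buffer, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(upper("Hello, filtered world"));
    }
}
```

Only the two read() methods are overridden; skip(), ready(), mark(), reset(), and close() are inherited unchanged from FilterReader.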
These all simply invoke the equivalent method in the
- File Viewer Finis
- As a final example of working with readers and writers, we return for the last time to the FileDumper application last seen in Chapter 13. At that point, we had a GUI program that allowed any file to be opened and interpreted in one of several formats, including ASCII, decimal, hexadecimal, short, regular, and long integers in both big- and little-endian formats, floating point, and double-precision floating point.
In this section we expand the program to read many different text formats besides ASCII. The user interface must be adjusted to allow a binary choice of whether the file contains text or numeric data. If the user chooses text, you'll need to use a reader to read the file instead of an input stream. You'll also need to provide some means for the user to pick the encoding in which the text should be read (e.g., MacRoman, ISO Latin-1, Unicode, etc.). Since there are several dozen text encodings, the best choice is a list box. All of this can be integrated into the mode panel. Figure 15.1 shows the revised ModePanel2 class. The code is given in Example 15.9. Two new public methods are added, isText() and getEncoding(). The rest of the changes are fairly minor ones to set up the GUI.
Figure 15.1: A mode panel with a list box for encodings
Example 15.9. ModePanel2
    import java.awt.*;
    import javax.swing.*;

    public class ModePanel2 extends JPanel {

      // Options that apply to any numeric or compressed input
      JCheckBox bigEndian = new JCheckBox("Big Endian", true);
      JCheckBox deflated = new JCheckBox("Deflated", false);
      JCheckBox gzipped = new JCheckBox("GZipped", false);

      // Exactly one data type can be selected at a time
      ButtonGroup dataTypes = new ButtonGroup();
      JRadioButton asciiRadio = new JRadioButton("Text");
      JRadioButton decimalRadio = new JRadioButton("Decimal");
      JRadioButton hexRadio = new JRadioButton("Hexadecimal");
      JRadioButton shortRadio = new JRadioButton("Short");
      JRadioButton intRadio = new JRadioButton("Int");
      JRadioButton longRadio = new JRadioButton("Long");
      JRadioButton floatRadio = new JRadioButton("Float");
      JRadioButton doubleRadio = new JRadioButton("Double");

      JTextField password = new JTextField();

      // Names of the text encodings offered in the list box
      final static String[] encodings = {"8859_1", "8859_2", "8859_3", "8859_4",
       "8859_5", "8859_6", "8859_7", "8859_8", "8859_9", "Big5", "CNS11643",
       "Cp037", "Cp273", "Cp277", "Cp278", "Cp280", "Cp284", "Cp285", "Cp297",
       "Cp420", "Cp424", "Cp437", "Cp500", "Cp737", "Cp775", "Cp850", "Cp852",
       "Cp855", "Cp856", "Cp857", "Cp860", "Cp861", "Cp862", "Cp863", "Cp864",
       "Cp865", "Cp866", "Cp868", "Cp869", "Cp870", "Cp871", "Cp874", "Cp875",
       "Cp918", "Cp921", "Cp922", "Cp1006", "Cp1025", "Cp1026", "Cp1046",
       "Cp1097", "Cp1098", "Cp1112", "Cp1122", "Cp1123", "Cp1124", "Cp1250",
       "Cp1251", "Cp1252", "Cp1253", "Cp1254", "Cp1255", "Cp1256", "Cp1257",
       "Cp1258", "EUCJIS", "GB2312", "JIS", "JIS0208", "KSC5601", "MacArabic",
       "MacCentralEurope", "MacCroatian", "MacCyrillic", "MacDingbat",
       "MacGreek", "MacHebrew", "MacIceland", "MacRoman", "MacRomania",
       "MacSymbol", "MacThai", "MacTurkish", "MacUkraine", "SJIS", "UTF8",
       "Unicode" };

      JList theEncoding = new JList(encodings);

      public ModePanel2() {

        // Controls on the left, scrolling encoding list on the right
        this.setLayout(new GridLayout(1, 2));
        JPanel left = new JPanel();
        JScrollPane right = new JScrollPane(theEncoding);
        left.setLayout(new GridLayout(13, 1));
        left.add(bigEndian);
        left.add(deflated);
        left.add(gzipped);

        left.add(asciiRadio);
        asciiRadio.setSelected(true);
        left.add(decimalRadio);
        left.add(hexRadio);
        left.add(shortRadio);
        left.add(intRadio);
        left.add(longRadio);
        left.add(floatRadio);
        left.add(doubleRadio);

        dataTypes.add(asciiRadio);
        dataTypes.add(decimalRadio);
        dataTypes.add(hexRadio);
        dataTypes.add(shortRadio);
        dataTypes.add(intRadio);
        dataTypes.add(longRadio);
        dataTypes.add(floatRadio);
        dataTypes.add(doubleRadio);

        left.add(password);
        this.add(left);
        this.add(right);
      }

      public boolean isBigEndian() {
        return bigEndian.isSelected();
      }

      public boolean isDeflated() {
        return deflated.isSelected();
      }

      public boolean isGZipped() {
        return gzipped.isSelected();
      }

      // True when the file should be read with a reader rather than a stream
      public boolean isText() {
        return this.getMode() == FileDumper6.ASC;
      }

      // The encoding selected in the list box, or null if none is selected
      public String getEncoding() {
        return (String) theEncoding.getSelectedValue();
      }

      public int getMode() {
        if (asciiRadio.isSelected()) return FileDumper6.ASC;
        else if (decimalRadio.isSelected()) return FileDumper6.DEC;
        else if (hexRadio.isSelected()) return FileDumper6.HEX;
        else if (shortRadio.isSelected()) return FileDumper6.SHORT;
        else if (intRadio.isSelected()) return FileDumper6.INT;
        else if (longRadio.isSelected()) return FileDumper6.LONG;
        else if (floatRadio.isSelected()) return FileDumper6.FLOAT;
        else if (doubleRadio.isSelected()) return FileDumper6.DOUBLE;
        else return FileDumper6.ASC;
      }

      public String getPassword() {
        return password.getText();
      }

      // A simple test method.
      public static void main(String[] args) {
        JFrame jf = new JFrame("Test Mode Panel");
        ModePanel2 mp2 = new ModePanel2();
        jf.getContentPane().add(mp2);
        jf.pack();
        jf.setVisible(true);
        System.out.println("done");
      }
    }
- Chapter 16: Formatted I/O with java.text
- One of the most obvious differences between Java and C is that Java has no equivalent of printf() or scanf(). Part of the reason is that Java doesn't support the variable length argument lists on which these functions depend. However, the real reason Java doesn't have equivalents to C's formatted I/O routines is a difference in philosophy. C's printf() and the like combine number formatting with I/O in an inflexible manner. Java separates number formatting and I/O into separate packages and by so doing produces a much more general and powerful system.
More than one programmer has attempted to recreate printf() and scanf() in Java. This task is difficult, since those functions are designed around variable length argument lists, which Java does not support. However, overloading the + sign for string concatenation is easily as effective, probably more so, since it doesn't share the problems of mismatched argument lists. For example, which is clearer to you? This:
    printf("%s worked %d hours at $%d per/hour for a total of %d dollars.\n", hours, salary, hours*salary);
or this:
    System.out.println(employee + " worked " + hours + " hours at $" + salary + "per/hour for a total of $" + (hours * salary) + ".");
I'd argue that the second is clearer. Among other advantages, it avoids problems with mismatched format strings and argument lists. (Did you notice that an argument is missing from the previous printf() statement?) On the flip side, the format string approach is a little less prone to missing spaces. (Did you notice that the println() statement would print pay scales as "$5.35per/hour" rather than "$5.35 per/hour"?) However, this is only a cosmetic problem and is easily fixed. A mismatched argument list in a printf() or scanf() statement may crash the computer, especially if pointers are involved.
The real advantage of the printf()/scanf() family of functions is not the format string. It's number formatting:
- The Old Way
- Traditional computer languages have combined input of text with the parsing of numeric strings. For example, to read a decimal number into the variable x, programmers are accustomed to writing C code like this:
    scanf("%d", &x);
In C++, that line would become:
    cin >> x;
In Pascal:
    READLN (X);
In Fortran:
      READ 2, X
    2 FORMAT (F5.1)
Similarly, formatting numeric strings for output tends to be mixed up with writing the string to the screen. For instance, consider the simple task of writing the double variable salary with two decimal digits of precision. In C, you'd write this:
    printf("%.2f", salary);
In C++:
    cout.setf(ios::fixed);
    cout.precision(2);
    cout << salary;
In Fortran:
      PRINT 20, SALARY
   20 FORMAT(F10.2)
This conflation of basic input and output with number formatting is so ingrained in most programmers today that we rarely stop to think whether it actually makes sense. What, precisely, does the formatting of numbers as text strings have to do with input and output? It's certainly true that you often need to format numbers to print them on the console, but you also need to format numbers to write data in files, to include numbers in text fields and text areas, and to send data across the network. What makes the console so special that it has to have a group of number-formatting routines all to itself? In C, the printf() and scanf() functions are supplemented by fprintf() and fscanf() for formatted I/O to files and by sprintf() and sscanf() for formatted I/O to strings. Perhaps the conflation of I/O with number formatting is really a relic of a time when command-line interfaces were a lot more important than they are today, and it's simply that nobody's thought to challenge this assumption, at least until Java.
When you think about it, there's no fundamental connection between converting a binary number like 11010100110110100100011101011011 to a text string like " -7.500E+12" and writing that string onto an output stream. These are two different operations, and in Java they're handled by separate classes. Input and output are handled by all the streams and readers and writers I've been discussing, while number formatting is handled by a few
- Choosing a Locale
- Number formats are dependent on the locale; that is, the country/language/culture group of the local operating system. The number formats most English-speaking Americans are accustomed to use a period as a decimal point, a comma to separate every three orders of magnitude, a dollar sign for currency, and numbers in base 10 that read from left to right. In this locale, Bill Gates's personal fortune, in Microsoft stock alone as of January 12, 1998, is represented as $74,741,086,650.
However, in Egypt this number would be written as:
[Image: the same amount written with Arabic-Indic digits]
The primary difference here is that Egyptians use a different set of glyphs for the digits 0 through 9. For example, in Egypt zero is a ٠ and the ٦ glyph means 6. There are other differences in how Arabic and English treat numbers, and these vary from country to country. In most of the rest of North Africa, this number would be $74,741,086,650 as it is in the U.S. These are just two different scripts; there are several dozen more to go!
Java encapsulates many of the common differences between language/script/culture/country combinations in a loosely defined group called a locale. There's really no better word for it. You can't just rely on language or country or culture alone. Many languages are shared between countries (English is only the most obvious example) but with subtle differences between how they are used in different places: Do commas and periods belong inside or outside of quotation marks? Is it color or colour? Many countries have no clearly dominant tongue: Is Canada an English- or a French-speaking nation? Switzerland has four official languages. Almost all countries have significant minority populations with their own languages. The New York City public school system has to hire teachers fluent in over 100 different languages.
- Number Formats
- To print a formatted number in Java, perform these two steps:
1. Format the number as a string.
2. Print the string.
Simple, right? Of course, this is a little like the old recipe for rabbit stew:
1. Catch a rabbit.
2. Boil rabbit in pot with vegetables and spices.
Obviously, step 1 is the tricky part. Fortunately, formatting numbers as strings is somewhat easier than catching a rabbit. The key class that formats numbers as strings is java.text.NumberFormat. This is an abstract subclass of java.text.Format. Concrete subclasses such as java.text.DecimalFormat implement formatting policies for particular kinds of numbers.
    public abstract class NumberFormat extends Format implements Cloneable
The static NumberFormat.getAvailableLocales() method returns a list of all locales installed that provide number formats. (There may be a few locales installed that only provide date or text formats, not number formats.)
    public static Locale[] getAvailableLocales()
You can request a NumberFormat object for the default locale of the host computer or for one of the specified locales in Table 16.1 using the static NumberFormat.getInstance() method. For example:
    NumberFormat myFormat = NumberFormat.getInstance();
    NumberFormat canadaFormat = NumberFormat.getInstance(Locale.CANADA);
    Locale turkey = new Locale("tr", "");
    NumberFormat turkishFormat = NumberFormat.getInstance(turkey);
    Locale swissItalian = new Locale("it", "CH");
    NumberFormat swissItalianFormat = NumberFormat.getInstance(swissItalian);
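To make the two-step recipe concrete, here is a small sketch that formats the same value for two locales and then prints the resulting strings. The class name LocaleFormats and its format() helper are illustrative only; the Locale.US and Locale.GERMANY constants are used because their separators are well known.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocaleFormats {

    // Step 1: format the number as a string for the given locale.
    static String format(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        return nf.format(value);
    }

    public static void main(String[] args) {
        double value = 1234567.891;

        // Step 2: print the strings.
        System.out.println(format(value, Locale.US));       // 1,234,567.891
        System.out.println(format(value, Locale.GERMANY));  // 1.234.567,891

        // The no-argument form uses the default locale of the host,
        // so its output depends on where the program runs.
        System.out.println(NumberFormat.getInstance().format(value));
    }
}
```

Because formatting is separate from output, the same strings could just as easily be placed in a text field or written to a file.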
The number format returned by NumberFormat.getInstance() should do a reasonable job of formatting most numbers. However, there's at least a theoretical possibility that the instance returned will format numbers as currencies or percentages. Therefore, it wouldn't hurt to use
- Specifying Width with FieldPosition
- The Java core API does not include any classes that pad numbers with spaces like the traditional I/O APIs in Fortran, C, and other languages. Part of the reason is that it's no longer a valid assumption that all output is written in a monospaced font on a VT-100 terminal. Therefore, spaces are insufficient to line up numbers in tables. Ideally, if you're writing tabular data in a GUI, you can use a real table component like JTable in the Java Foundation Classes. If that's not possible, you can measure the width of the string using a FontMetrics object and offset the position at which you draw the string. And if you are outputting to a terminal or a monospaced font, then you can manually prefix the string with the right number of spaces.
The java.text.FieldPosition class separates strings into their component parts, called fields. (This is another unfortunate example of an overloaded term. These fields have nothing to do with the fields of a Java class.) For example, a typical date string can be separated into 18 fields including era, year, month, day, date, hour, minute, second, and so on. Of course, not all of these may be present in any given string. For example, 1999 CE includes only a year and an era field. The different fields that can be parsed are represented as public final static int fields (there's that annoying overloading again) in the corresponding format class. The java.text.DateFormat class defines these kinds of fields as mnemonic constants:
    public static final int ERA_FIELD
    public static final int YEAR_FIELD
    public static final int MONTH_FIELD
    public static final int DATE_FIELD
    public static final int HOUR_OF_DAY1_FIELD
    public static final int HOUR_OF_DAY0_FIELD
    public static final int MINUTE_FIELD
    public static final int SECOND_FIELD
    public static final int MILLISECOND_FIELD
    public static final int DAY_OF_WEEK_FIELD
    public static final int DAY_OF_YEAR_FIELD
    public static final int DAY_OF_WEEK_IN_MONTH_FIELD
    public static final int WEEK_OF_YEAR_FIELD
    public static final int WEEK_OF_MONTH_FIELD
    public static final int AM_PM_FIELD
    public static final int HOUR1_FIELD
    public static final int HOUR0_FIELD
    public static final int TIMEZONE_FIELD
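For the monospaced case mentioned above, one possible way to line numbers up on their decimal points is to ask the formatter where the integer part ends and pad accordingly. The sketch below assumes the NumberFormat.INTEGER_FIELD constant and the three-argument format() method; the class name DecimalAlign, the align() helper, and the integerWidth parameter are invented for this example.

```java
import java.text.FieldPosition;
import java.text.NumberFormat;
import java.util.Locale;

public class DecimalAlign {

    // Formats value and left-pads it with spaces so that its integer part
    // ends at column integerWidth (a hypothetical parameter).
    static String align(double value, int integerWidth, NumberFormat nf) {
        FieldPosition fp = new FieldPosition(NumberFormat.INTEGER_FIELD);
        StringBuffer formatted = nf.format(value, new StringBuffer(), fp);

        // getEndIndex() is the index just past the integer part.
        int padding = integerWidth - fp.getEndIndex();
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < padding; i++) {
            result.append(' ');
        }
        return result.append(formatted).toString();
    }

    public static void main(String[] args) {
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        double[] values = {3.5, 42.125, 0.75, 987.0};
        for (double value : values) {
            System.out.println(align(value, 6, nf));
        }
    }
}
```

This only lines up in a monospaced font, which is exactly the limitation the text describes.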
- Parsing Input
- Number formats also handle input. When used for input, a number format converts a string in the appropriate format to a binary number, achieving more flexible conversions than you can get with the methods in the type wrapper classes (like Integer.parseInt()). For instance, a percent format parse() method can interpret 57% as 0.57 instead of 57. A currency format can read (12.45) as -12.45.
There are three parse() methods in the NumberFormat class. All do roughly the same thing:
    public Number parse(String text) throws ParseException
    public abstract Number parse(String text, ParsePosition parsePosition)
    public final Object parseObject(String source, ParsePosition parsePosition)
The first parse() method attempts to parse a number from the given text. If the text represents an integer, it's returned as an instance of java.lang.Long. Otherwise, it's returned as an instance of java.lang.Double. If a string contains multiple numbers, only the first one is returned. For instance, if you parse "32 meters" you'll get the number 32 back. Java throws away everything after the number finishes. If the text cannot be interpreted as a number in the given format, a ParseException is thrown.
The second parse() method specifies where in the text parsing starts. The position is given by a ParsePosition object. This is a little more complicated than using a simple int but does have the advantage of allowing one to read successive numbers from the same string.
The third parse() method merely invokes the second. It's declared to return Object rather than Number so that it can override the method of the same signature in java.text.Format. If you know you're working with a NumberFormat rather than a DateFormat or some other nonnumeric format, there's no reason to use it.
The java.text.ParsePosition class has one constructor and two public methods:
    public ParsePosition(int index)
    public int getIndex()
    public void setIndex(int index)
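The idea of reading successive numbers from one string with a ParsePosition can be sketched as follows. The class ParseDemo and its parseAll() and percent() helpers are hypothetical names; the sketch skips whitespace by hand between numbers and simply stops at the first text it cannot parse.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ParseDemo {

    // Reads whitespace-separated numbers from text, advancing one
    // ParsePosition through the string as each number is consumed.
    static List<Number> parseAll(String text, NumberFormat nf) {
        List<Number> numbers = new ArrayList<>();
        ParsePosition pos = new ParsePosition(0);
        while (pos.getIndex() < text.length()) {
            if (Character.isWhitespace(text.charAt(pos.getIndex()))) {
                pos.setIndex(pos.getIndex() + 1); // skip the separator
                continue;
            }
            Number n = nf.parse(text, pos); // null if no number starts here
            if (n == null) {
                break;
            }
            numbers.add(n);
        }
        return numbers;
    }

    // Parses a percentage such as "57%" into a fraction.
    static double percent(String text) throws ParseException {
        return NumberFormat.getPercentInstance(Locale.US).parse(text).doubleValue();
    }

    public static void main(String[] args) throws ParseException {
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        System.out.println(parseAll("10 20.5 30", nf));
        System.out.println(nf.parse("32 meters")); // trailing text is ignored
        System.out.println(percent("57%"));
    }
}
```

Integers come back as Long objects and everything else as Double objects, so callers that care about the distinction can test the returned Number with instanceof.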
This whole class is just a wrapper around an
- Decimal Formats
- The java.text package contains a single concrete subclass of NumberFormat, DecimalFormat. The DecimalFormat class provides even more control over how floating point numbers are formatted:
    public class DecimalFormat extends NumberFormat
Most number formats are in fact decimal formats. Generally, you can simply cast any number format to a decimal format, like this:
    DecimalFormat df = (DecimalFormat) NumberFormat.getCurrencyInstance();
At least in theory, you might encounter a nondecimal format. Therefore, you should use instanceof to test whether or not you've got a DecimalFormat:
    NumberFormat nf = NumberFormat.getCurrencyInstance();
    if (nf instanceof DecimalFormat) {
      DecimalFormat df = (DecimalFormat) nf;
      //...
    }
Alternately, you can place the cast and associated operations in a try/catch block that catches ClassCastExceptions:
    try {
      DecimalFormat df = (DecimalFormat) NumberFormat.getCurrencyInstance();
      //...
    }
    catch (ClassCastException e) {System.err.println(e);}
Every DecimalFormat object has a pattern that describes how numbers are formatted and a list of symbols that describes with which characters they're formatted. This allows the single DecimalFormat class to be parameterized so that it can handle many different formats for different kinds of numbers in many locales. The pattern is given as an ASCII string. The symbols are provided by a DecimalFormatSymbols object. These are accessed and manipulated through the following six methods:
    public DecimalFormatSymbols getDecimalFormatSymbols()
    public void setDecimalFormatSymbols(DecimalFormatSymbols newSymbols)
    public String toPattern()
    public String toLocalizedPattern()
    public void applyPattern(String pattern)
    public void applyLocalizedPattern(String pattern)
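The split between pattern and symbols can be seen in a short sketch: the same pattern produces different text once the symbols are swapped. The class PatternDemo and its two helper methods are made-up names, and the pattern "#,##0.00" is just one plausible choice.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class PatternDemo {

    // Formats value using an explicit pattern and an explicit symbol set.
    static String formatWith(String pattern, DecimalFormatSymbols symbols,
                             double value) {
        DecimalFormat df = new DecimalFormat(pattern, symbols);
        return df.format(value);
    }

    // Same pattern, but with the decimal and grouping characters swapped.
    static String european(double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.US);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        return formatWith("#,##0.00", symbols, value);
    }

    public static void main(String[] args) {
        DecimalFormatSymbols us = new DecimalFormatSymbols(Locale.US);
        System.out.println(formatWith("#,##0.00", us, 1234.5)); // 1,234.50
        System.out.println(european(1234.5));                   // 1.234,50

        // An existing format can also be given a new pattern later.
        DecimalFormat df = new DecimalFormat("#,##0.00", us);
        df.applyPattern("000.0");
        System.out.println(df.toPattern() + " -> " + df.format(7.5));
    }
}
```

Note that the pattern itself always uses the ASCII comma and period; only the symbols object decides which characters actually appear in the output.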
- An Exponential Number Format
- The DecimalFormat class is useful for medium-sized numbers, but it doesn't work very well for exceptionally large numbers like Avogadro's number (6,022,094,300,000,000,000,000,000) or exceptionally small numbers like Planck's constant (0.00000000000000000000000000625 erg-seconds). These are traditionally written in scientific notation as a decimal number times 10 to a certain power, positive or negative; for example, 6.0220943 × 10^23 and 6.25 × 10^-27 erg-seconds. In most programming languages, including Java, an E followed by either a + or a - is used to represent "× 10 to the power"; for example, 6.0220943E+23 or 6.25E-27 erg-seconds.
The java.text package does not provide support for formatting numbers in scientific notation, so as the final example of this chapter, I'll develop a new subclass of NumberFormat that does use scientific notation. Technically, scientific notation requires exactly one nonzero digit before the decimal point, but I'll be a little more general than that, providing for numbers like 13.2E-8 as well.
The NumberFormat class is abstract. It declares three abstract methods any subclass must implement:
    public abstract StringBuffer format(double number, StringBuffer toAppendTo, FieldPosition pos)
    public abstract StringBuffer format(long number, StringBuffer toAppendTo, FieldPosition pos)
    public abstract Number parse(String text, ParsePosition parsePosition)
The two format methods must format a double and a long respectively, update the FieldPosition object with the locations of the different fields, append the formatted string to the string buffer toAppendTo, and return that same string buffer. The parse() method must read a number in scientific notation, convert it to a java.lang.Number (that is, a java.lang.Long or a java.lang.Double) and return that.
The concrete formatting methods in NumberFormat all invoke these methods, so they may be kept as is rather than being overridden. However, it would not hurt to override
- Chapter 17: The Java Communications API
- This chapter covers the Java Communications API 2.0, a standard extension available in Java 1.1 and later that allows Java applications (but not applets) to send and receive data to and from the serial and parallel ports of the host computer. The Java Communications API allows Java programs to communicate with essentially any device connected to a serial or parallel port, like a printer, a scanner, a modem, a tape backup unit, and so on. The Comm API operates at a very low level. It only understands how to send and receive bytes to these ports. It does not understand anything about what these bytes mean. Doing useful work generally requires not only understanding the Java Communications API (which is actually quite simple) but also the protocols spoken by the devices connected to the ports (which can be almost arbitrarily complex).
Because the Java Communications API is a standard extension, it is not installed by default with the JDK. You have to download it from https://java.sun.com/products/javacomm/index.html and install it separately.
This chapter is based on the first beta of the Java Communications API. It is almost certain that some parts of this chapter will become inaccurate by the time you read this. Indeed, throughout the process of writing this chapter, I identified a number of bugs and inconsistencies that I forwarded to Sun. They even fixed a few in between early access 3 and beta 1. If you have trouble with anything you see here, cross-check it with the most up-to-date documentation from Sun. I'll also try to post minor corrections on my web site at https://metalab.unc.edu/javafaq/books/javaio/.
The Java Communications API contains a single package, javax.comm, which holds a baker's dozen of classes, exceptions, and interfaces. Because the Comm API is a standard extension, the javax prefix is used instead of the java prefix. The Java Comm API also includes a DLL, or shared library, containing the native code to communicate with the ports, and a few driver classes in the
- The Architecture of the Java Communications API
- The Java Communications API contains a single package, javax.comm, which holds a baker's dozen of classes, exceptions, and interfaces. Because the Comm API is a standard extension, the javax prefix is used instead of the java prefix. The Java Comm API also includes a DLL, or shared library, containing the native code to communicate with the ports, and a few driver classes in the com.sun.comm package that mostly handle the vagaries of Unix or Wintel ports. Other vendors may need to muck around with these if they're porting the Comm API to another platform (e.g., the Mac or OS/2), but as a user of the API, you'll only concern yourself with the documented classes in javax.comm.
javax.comm is divided into high-level and low-level classes. High-level classes are responsible for controlling access to and ownership of the communication ports and performing basic I/O. The CommPortIdentifier class lets you find and open the ports available on a system. The CommPort class provides input and output streams connected to the ports. Low-level classes (javax.comm.SerialPort and javax.comm.ParallelPort, for example) manage interaction with particular kinds of ports and help you read and write the control wires on the ports. They also provide event-based notification of changes to the state of the port.
- Identifying Ports
- Content preview·Buy reprint rights for this chapterThe
javax.comm.CommPortIdentifier
class is the control room for the ports on a system. It has methods that list the available ports, figure out which program owns them, take control of a port, and open a port so you can perform I/O with it. The actual I/O, stream-based or otherwise, is performed through an instance ofjavax.comm.CommPort
that represents the port in question. The purpose ofCommPortIdentifier
is to mediate between different programs, objects, or threads that want to use the same port.Before you can use a port, you need a port identifier for the port. Because the possible port identifiers are closely tied to the physical ports on the system, you cannot simply construct an arbitraryCommPortIdentifier
object. (For instance, Macs have no parallel ports, and iMacs don't have serial or parallel ports.) Instead, you use one of several static methods injavax.comm.CommPortIdentifier
that use native methods and nonpublic constructors to find and create the right port. These include:public static Enumeration getPortIdentifiers() public static CommPortIdentifier getPortIdentifier(String portName) throws NoSuchPortException public static CommPortIdentifier getPortIdentifier(CommPort port) throws NoSuchPortException
The most general of these is CommPortIdentifier.getPortIdentifiers(), which returns a java.util.Enumeration containing one CommPortIdentifier for each of the ports on the system. Example 17.1 uses this method to list all the ports on the system.
Example 17.1. PortLister
  import javax.comm.*;
  import java.util.*;

  public class PortLister {

    public static void main(String[] args) {

      // Walk every port identifier the driver reports and print it.
      Enumeration e = CommPortIdentifier.getPortIdentifiers();
      while (e.hasMoreElements()) {
        System.out.println((CommPortIdentifier) e.nextElement());
      }
    }
  }
- Communicating with a Device on a Port
- The open() method of the CommPortIdentifier class returns a CommPort object. The javax.comm.CommPort class has methods for getting input and output streams from a port and for closing the port. There are also a number of driver-dependent methods for adjusting the properties of the port.
There are five basic steps to communicating with a port:
1. Open the port using the open() method of CommPortIdentifier. If the port is available, this returns a CommPort object. Otherwise, a PortInUseException is thrown.
2. Get the port's output stream using the getOutputStream() method of CommPort.
3. Get the port's input stream using the getInputStream() method of CommPort.
4. Read and write data onto those streams as desired.
5. Close the port using the close() method of CommPort.
Steps 2 through 4 are new. However, they're not particularly complex. Once the connection has been established, you simply use the normal methods of any input or output stream to read and write data. The getInputStream() and getOutputStream() methods of CommPort are similar to the methods of the same name in the java.net.URLConnection class. The primary difference is that with comm ports, you're completely responsible for understanding and handling the data that's sent to you. There are no content or protocol handlers that perform any manipulation of the data. If the device attached to the port requires a complicated protocol (a fax modem, for example), then you'll have to handle the protocol manually.
  public abstract InputStream getInputStream() throws IOException
  public abstract OutputStream getOutputStream() throws IOException
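The stream-handling steps above can be sketched without any port hardware. The javax.comm package is not part of the standard JDK, so this sketch substitutes a hypothetical in-memory LoopbackPort for CommPort; the class, its echo behavior, and the carriage-return terminator are all assumptions made for illustration. What it shows is that the program itself, not a protocol handler, decides where a reply ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PortStepsSketch {

  // Hypothetical stand-in for javax.comm.CommPort: whatever is written
  // to the output stream can be read back from the input stream.
  static final class LoopbackPort {
    private final ByteArrayOutputStream wire = new ByteArrayOutputStream();
    private boolean open = true;

    OutputStream getOutputStream() {
      return wire;
    }

    // Returns a snapshot of everything written so far.
    InputStream getInputStream() {
      return new ByteArrayInputStream(wire.toByteArray());
    }

    void close() {
      open = false;
    }

    boolean isOpen() {
      return open;
    }
  }

  // Sends one command and reads the echoed reply up to the carriage return.
  static String exchange(LoopbackPort port, String command) {
    try {
      OutputStream out = port.getOutputStream();        // step 2
      out.write((command + "\r").getBytes(StandardCharsets.US_ASCII));
      out.flush();

      // Step 3. The snapshot is taken after writing because this
      // stand-in has no live connection behind it.
      InputStream in = port.getInputStream();

      StringBuilder reply = new StringBuilder();
      int b;
      while ((b = in.read()) != -1 && b != '\r') {      // step 4
        reply.append((char) b);
      }
      return reply.toString();
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    } finally {
      port.close();                                     // step 5
    }
  }
}
```

With a real port, step 1 would obtain the CommPort from CommPortIdentifier, and the reply would come from the attached device rather than from an echo.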
- Serial Ports
- The javax.comm.SerialPort class is an abstract subclass of CommPort that provides various methods and constants useful for working with RS-232 serial ports and devices. The main purposes of the class are to allow the programmer to inspect, adjust, and monitor changes in the settings of the serial port. Simple input and output is accomplished with the methods of the superclass, CommPort.
SerialPort has a public constructor, but that shouldn't be used by applications. Instead, you should call the open() method of a CommPortIdentifier that maps to the port you want to communicate with, then cast the result to SerialPort. For example:
  CommPortIdentifier cpi = CommPortIdentifier.getPortIdentifier("COM2");
  if (cpi.getType() == CommPortIdentifier.PORT_SERIAL) {
    try {
      // open() takes an owner name and a timeout in milliseconds;
      // both values here are placeholders.
      SerialPort modem = (SerialPort) cpi.open("SerialDemo", 2000);
    }
    catch (PortInUseException e) {
      // Another program currently owns the port.
    }
  }
Methods in the SerialPort class fall into roughly three categories:
- Methods that return the state of the port
- Methods that set the state of the port
- Methods that listen for the changes in the state of the port
Data cannot simply be sent over a wire; you need to deal with many issues, like timing, noise, and the fundamentally analog nature of electronics. Therefore, there's a host of layered protocols so that the receiving end can recognize when data is being sent, whether the data was received correctly, and more.
Serial communication uses some very basic, simple protocols. Sending between 3 and 25 volts across the serial cable for a number of nanoseconds inversely proportional to the baud rate of the connection is a zero bit. Sending between -3 and -25 volts for the same amount of time is a one bit. These bits are grouped into serial data units, SDUs for short. Common SDU lengths are 8 (used for binary data) and 7 (used for basic ASCII text). Most modern devices use eight data bits per SDU. However, some older devices use seven, six, or even five data bits per SDU. Once an SDU is begun, the rest of the SDU follows in close order. However, there may be gaps of indeterminate length between SDUs.
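The timing and SDU arithmetic described above can be worked through in a few lines. This is a minimal sketch, assuming one bit per signaling interval so that the baud rate equals bits per second; the class and method names are made up for illustration and are not part of the Java Communications API.

```java
class SerialTiming {

  // Duration of a single bit in nanoseconds, rounded down.
  // Assumes the baud rate equals the number of bits per second.
  static long bitNanos(int bitsPerSecond) {
    if (bitsPerSecond <= 0) {
      throw new IllegalArgumentException("rate must be positive");
    }
    return 1_000_000_000L / bitsPerSecond;
  }

  // Keeps only the low dataBits bits of a value, which is what survives
  // when a byte is sent over a link configured for 5 to 8 data bits.
  static int toSdu(int value, int dataBits) {
    if (dataBits < 5 || dataBits > 8) {
      throw new IllegalArgumentException("data bits must be 5 through 8");
    }
    return value & ((1 << dataBits) - 1);
  }

  public static void main(String[] args) {
    System.out.println(bitNanos(9600));   // one bit at 9600 bits per second
    System.out.println(toSdu(0xE9, 7));   // high bit lost on a 7-bit link
  }
}
```

The second call illustrates why seven data bits suffice for basic ASCII text but not for binary data: any value above 127 loses its high bit.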
- Parallel Ports
- Parallel ports are most common on PCs. Sun SparcStations from the Sparc V on also have them. However, Macs do not have them, nor do many non-x86 workstations. Parallel ports are sometimes called printer ports, because their original purpose was to support printers. The names of the parallel ports ("LPT1," "LPT2," etc.) stand for "Line PrinTer," reflecting this usage. Nowadays, parallel ports are also used for Zip drives, tape drives, and various other devices. However, parallel ports are still largely limited by their original goal of providing simple printing. A parallel port sends data eight bits at a time on eight wires. These bits are sent at the same time in parallel, hence the name. The original parallel ports only allowed data to flow one way, from the PC to the printer. The printer could only respond by sending a few standard messages on other wires. Each return wire corresponded to a particular message, like "Out of paper" or "Printer busy." Modern parallel ports allow full, bidirectional communication.
The javax.comm.ParallelPort class is an abstract subclass of javax.comm.CommPort that provides various methods and constants useful for working with parallel ports and devices. The main purposes of the class are to allow the programmer to inspect, adjust, and monitor changes in the settings of the parallel port. Simple input and output are accomplished with the methods of the superclass, CommPort.
ParallelPort has a single public constructor, but that shouldn't be used by applications. Instead, you should simply call the open() method of a CommPortIdentifier that maps to the port you want to communicate with, then cast the result to ParallelPort:
  CommPortIdentifier cpi = CommPortIdentifier.getPortIdentifier("LPT2");
  if (cpi.getType() == CommPortIdentifier.PORT_PARALLEL) {
    try {
      // open() takes an owner name and a timeout in milliseconds;
      // both values here are placeholders.
      ParallelPort printer = (ParallelPort) cpi.open("ParallelDemo", 2000);
    }
    catch (PortInUseException e) {
      // Another program currently owns the port.
    }
  }
Methods in the ParallelPort class fall into roughly four categories:
- Appendix A: Additional Resources
- When I began work on this book, I thought it would take me about 200 pages and about two months. Now, more than a year and 500 pages later, I can see that I/O is a far larger, more important, and more encompassing topic than I originally guessed. Many chapters could easily lead to books of their own. Indeed, several (Chapter 5 and Chapter 10) already are other books.
Since I can't possibly say everything there is to say about all the fascinating topics I've touched on in one page or another of this tome, I'd like to point you to several books, mailing lists, and web sites that explore some of the issues raised in this book in greater detail. Some of these are I/O-specific; some are mostly tangential. However, they're all interesting and worthy of further study and thought.
Section A.1: Digital Think
Section A.2: Design Patterns
Section A.3: The java.io Package
Section A.4: Network Programming
Section A.5: Data Compression
Section A.6: Encryption and Related Technology
Section A.7: Object Serialization
Section A.8: International Character Sets and Unicode
Section A.9: Java Communications API
Section A.10: Updates and Breaking News
- Digital Think
- Digital Think (https://www.digitalthink.com/) offers web-based training courses for programmers, developers, system administrators, and end users in C, C++, Java, Windows, web development, object-oriented programming, and more. This book grew out of two web-based courses I wrote for Digital Think, Java Streams (https://www.digitalthink.com/catalog/cs/cs108/) and Java Readers and Writers (https://www.digitalthink.com/catalog/cs/cs208/). Although this book is far more comprehensive than those two courses, they're a good way to get started with this material, especially if you think you need a personal helping hand or a leg up. Each course includes graded exercises, a hands-on course project, and tutors to answer your questions and assist you with the difficult parts.
- Design Patterns
- At the time I was writing the first draft of this book, I also happened to be learning about design patterns. Gradually, it became obvious that much of the AWT was written by programmers who had patterns on the brain. The java.awt.Toolkit class is a textbook example of the "abstract factory" pattern. The URL class's openConnection() method is a factory method. The Reader and Writer classes are decorators on top of InputStream and OutputStream. The engine classes in the JCE are proxies, and I could cite many more examples. Much of the class library, including the java.io package, has been designed with design patterns, and it will all make a lot more sense if you're familiar with the standard patterns.
The seminal text on the subject is Design Patterns, by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley, 1995). The four authors are colloquially known as the "Gang of Four," and the book is often cited informally as "GoF." The 23 patterns covered in GoF are rapidly becoming part of the vocabulary of the object-oriented programming community. Design patterns are also beginning to be covered in many more introductory books about object-oriented programming and Java.
There are also several extremely active mailing lists and web sites devoted to design patterns. To subscribe to the patterns@cs.uiuc.edu list, send email to patterns-request@cs.uiuc.edu with the word "subscribe" in the Subject: field. Archives of this and several related lists may be perused at https://www.DistributedObjects.com/portfolio/archives/patterns/index.html.
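The layering of readers on top of streams mentioned above is easy to see in a few lines. This is a minimal sketch using only standard java.io classes; the class name, the method name, and the choice of UTF-8 are assumptions made for illustration. Each wrapper adds one capability to the object beneath it: first character decoding, then buffering and line-at-a-time reads.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LayeredReaderSketch {

  // Wraps a raw byte stream in a character decoder, then wraps the
  // decoder in a buffer that contributes readLine().
  static String firstLine(InputStream raw) {
    Reader decoded = new InputStreamReader(raw, StandardCharsets.UTF_8);
    try (BufferedReader lines = new BufferedReader(decoded)) {
      return lines.readLine();
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
    System.out.println(firstLine(new ByteArrayInputStream(data)));
  }
}
```

Because each layer depends only on the interface of the layer below, the same method works unchanged whether the bytes come from memory, a file, or a socket.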
- The java.io Package
- The original source for much of the information contained herein about I/O is the javadoc documentation for the java.io package. You should have downloaded this with the JDK, but it's also available online at:
https://java.sun.com/products/jdk/1.2/docs/api/java/io/package-summary.html (Java 1.2)
https://java.sun.com/products/jdk/1.1/docs/api/Package-java.io.html (Java 1.1)
https://java.sun.com/products/jdk/1.0.2/api/Package-java.io.html (Java 1.0)
The class library documentation is, however, woefully incomplete. While it explains what each method does, it often fails to explain how, why, or when you should use those methods. Furthermore, it only occasionally discusses assumptions about the behavior of those methods, assumptions that are crucial for anyone not merely using but also subclassing particular classes. There are many implicit assumptions about what particular methods should do (for instance, that a close() method of a filter input stream also closes any other streams it's connected to), and these are generally not documented anywhere (or at least they weren't until I wrote this book).
I've tried to document all of these assumptions in this book, but if you're faced with a new class not covered here, the canonical reference is the source code itself. The JDK includes Java source code for the java packages. You'll find it in a file called src.zip in your JDK distribution. Sometimes the only way to figure out exactly what Sun intended particular classes to do or how they expected them to do it is to read the source code for those classes.
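The close() assumption mentioned above can be checked directly rather than taken on faith. This is a minimal sketch; the TrackedStream class and the method name are hypothetical helpers written for this illustration, while BufferedInputStream and ByteArrayInputStream are the standard java.io classes.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class CloseChainSketch {

  // A source stream that records whether close() ever reached it.
  static final class TrackedStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackedStream(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Closes only the outer filter stream, then reports whether the
  // underlying stream was closed as a side effect.
  static boolean closingFilterClosesSource() {
    TrackedStream source = new TrackedStream(new byte[] {1, 2, 3});
    InputStream filter = new BufferedInputStream(source);
    try {
      filter.close();
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return source.closed;
  }
}
```

A small probe like this is often quicker than reading src.zip when the question is simply whether a wrapper propagates a call.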
- Network Programming
- In many ways this book is a prequel to my previous book with O'Reilly, Java Network Programming. Although written first, Java Network Programming presumes a solid familiarity with input and output, streams, and readers and writers as discussed in this book. Java Network Programming explains the fundamental protocols and technology that underlie the Internet, shows you how to communicate with sockets, provides detailed examples of working network clients and servers, and even develops content and protocol handlers. If you want to learn more about TCP/IP, HTTP, URLs, sockets and server sockets, and other elements of Internet programming in Java, you should definitely pick up Java Network Programming. (There's probably an ad for it in the back of this very book.)
The Centre for Distance-spanning Technology (CDT) runs the unmoderated java-networking@cdt.luth.se list for informal discussion of Java network programming, which I participate in. To subscribe, send an email containing the word "subscribe" in the body of the message to java-networking-request@cdt.luth.se. An archive of the list and complete instructions are available from https://www.cdt.luth.se/~peppar/java/java-networking-list/.
- Data Compression
- Java supports several related compression formats, including zlib, deflate, and gzip. These formats are documented in RFCs 1950, 1951, and 1952, and are available wherever RFCs are found, including https://www.faqs.org/rfcs/. The master site for these particular RFCs is ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html.
Java's compression classes are native wrappers around the ZLIB compression library written by Jean-Loup Gailly and Mark Adler. You can learn about this library at https://www.cdrom.com/pub/infozip/zlib/.
For more general information about compression and archiving algorithms and formats, the comp.compression FAQ is a good place to start. See https://www.faqs.org/faqs/compression-faq/part1/preamble.html. More technical details and sample code in C for a variety of algorithms are available in The Data Compression Book, by Mark Nelson and Jean-Loup Gailly (M&T Books, 1996, ISBN 1-55851-434-1).
The JAR file format was developed by Sun for Java. The full specification can be found at https://java.sun.com/products/jdk/1.2/docs/guide/jar/jarGuide.html (Java 2) or https://java.sun.com/products/jdk/1.1/docs/guide/jar/jarGuide.html (Java 1.1). Aside from the name, the only thing that really distinguishes a JAR file from a zip file is the optional manifest of the contents. The manifest format specification can be found at https://java.sun.com/products/jdk/1.2/docs/guide/jar/manifest.html.
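As a concrete illustration of the gzip support mentioned above, here is a minimal round-trip sketch using the standard java.util.zip stream classes. The class and method names are made up for this example, and the 4096-byte buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipRoundTrip {

  // Compresses a byte array into gzip format (RFC 1952).
  static byte[] compress(byte[] plain) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // Closing the gzip stream writes the trailer, so read the sink
    // only after the try block has finished.
    try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
      gz.write(plain);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return sink.toByteArray();
  }

  // Expands gzip data back into the original bytes.
  static byte[] decompress(byte[] packed) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (GZIPInputStream gz =
             new GZIPInputStream(new ByteArrayInputStream(packed))) {
      byte[] buffer = new byte[4096];
      int n;
      while ((n = gz.read(buffer)) != -1) {
        sink.write(buffer, 0, n);
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return sink.toByteArray();
  }
}
```

The compressed output begins with the two magic bytes 0x1f 0x8b that RFC 1952 specifies, which is a quick way to recognize gzip data.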
- Encryption and Related Technology
- Chapter 10 only began to explore the fascinating subject of cryptography. The JCE is explicated in much more detail by Jonathan Knudsen in Java Cryptography (O'Reilly & Associates, 1998). Java Cryptography expands on the coverage of the Cipher and MessageDigest classes you'll find in this book. It also includes thorough discussions of the java.security package and the Java Cryptography Extension (JCE), showing you how to use security providers and even implement your own provider. It discusses authentication, key management, and public and private key encryption and includes a secure talk application that encrypts all data sent over the network. If you write Java programs that communicate sensitive data, you'll find this book indispensable.
For a more in-depth look at the mathematics and protocols that underlie the JCE, you'll want to check out Bruce Schneier's Applied Cryptography (John Wiley & Sons, 1995). This is the standard practical text on cryptographic protocols and algorithms, and the attacks on them. Schneier discusses a wide range of cryptographic algorithms, key management and exchange schemes, one-way hash functions, signature algorithms, and many other problems in sufficient detail to allow a competent programmer to implement them. Although Schneier's language of choice is C, the techniques discussed are applicable in any language.
The formal specification of the Java Cryptography API is available from Sun at https://java.sun.com/products/jdk/1.2/docs/guide/security/CryptoSpec.html. The actual implementation is in beta at the time of this writing and can be downloaded from https://developer.java.sun.com/developer/earlyAccess/jdk12/jce.html.
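For readers who have not yet used the MessageDigest class mentioned above, a minimal sketch follows. The choice of SHA-256 is an assumption made for this example (it is available in current JDKs and is not necessarily the algorithm the book demonstrates), and the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestSketch {

  // Returns the SHA-256 digest of a string as lowercase hexadecimal.
  static String sha256Hex(String text) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException ex) {
      // Every conforming Java platform is required to provide SHA-256.
      throw new IllegalStateException(ex);
    }
  }
}
```

The engine class hides the provider that actually performs the computation; the caller names an algorithm and works only with the MessageDigest interface.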
- Object Serialization
- Sun's serialization web page at https://java.sun.com/products/jdk/1.2/docs/guide/serialization/ includes a FAQ list, sample code, and the complete object serialization specification. The specification covers serialization as implemented in Java 1.2, which is mostly upward-compatible with the Java 1.1 serialization discussed in Chapter 11. An earlier prebeta specification that covers Java 1.0.2 serialization is posted at https://java.sun.com/products/jdk/rmi/doc/serial-spec/serialTOC.doc.html. A formal specification of Java 1.1 serialization was never published. However, the Java 1.2 spec is mostly the same, with the addition of a few extra features like the readResolve() method.
Sun's formal specification for object serialization is not always clear, especially when it comes to motivating the more esoteric areas of serialization like ObjectInputValidation. However, it is complete and does add some to what I discussed in Chapter 11, including the binary protocol for serialized objects and .ser files.
Object serialization was originally developed to support Remote Method Invocation (RMI), an architecture that allows Java objects in one virtual machine to invoke methods on objects in another virtual machine, possibly running on a different computer somewhere else on the Internet. RMI is discussed briefly in Chapter 14 of my Java Network Programming and at great length in Jim Farley's Java Distributed Computing (O'Reilly & Associates, 1998, ISBN 1-56592-206-9).
Object serialization is also used extensively as part of the JavaBeans component software architecture, a standard part of Java 1.1 and later. To learn more about this, I recommend you pick up Robert Englander's Developing Java Beans (O'Reilly & Associates, 1997, ISBN 1-56592-289-1) or my own JavaBeans: Developing Component Software in Java (IDG Books, 1997, ISBN 0-76458-052-3).
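The readResolve() method mentioned above is easiest to understand from a small example. This is a minimal sketch, assuming a hypothetical Season class whose instances are meant to be unique constants; the class names and the in-memory round trip are inventions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class ReadResolveSketch {

  // A class whose only instances are the two constants below.
  static final class Season implements Serializable {
    private static final long serialVersionUID = 1L;

    static final Season WINTER = new Season("WINTER");
    static final Season SUMMER = new Season("SUMMER");

    private final String name;

    private Season(String name) {
      this.name = name;
    }

    // Invoked by ObjectInputStream after the object has been read.
    // The returned object replaces the freshly deserialized copy.
    private Object readResolve() throws ObjectStreamException {
      return "WINTER".equals(name) ? WINTER : SUMMER;
    }
  }

  // Serializes an object to memory and reads it straight back.
  static Object roundTrip(Object original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException ex) {
      throw new IllegalStateException(ex);
    }
  }
}
```

Without readResolve(), deserialization would create a new Season object, and comparisons with == against the constants would fail.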
- International Character Sets and Unicode
- The canonical reference to Unicode is The Unicode Standard, Version 2.0 (Addison-Wesley, 1996, ISBN 0-201-48345-9). This book features detailed analysis of the Unicode standard as well as discussion of the difficulties of defining character sets for all the world's different languages. It's also got tables of almost all the defined characters in Unicode, including about 20,000 Han ideographs. The size of the book and the large number of interesting tables of different scripts from around the world make it a good choice for a techie coffee-table book that can even amuse your liberal arts friends. Updates, corrections, and errata to that volume are available on the Web at https://www.unicode.org/.
There's no single source of information for all the different non-Unicode character sets Java readers and writers can translate. However, most of the Windows character sets are enumerated in Developing International Software for Windows 95 and NT, by Nadine Kano (Microsoft Press, 1995, ISBN 1-55615-840-8). Kano ignores non-Windows platforms, and she does occasionally sound too much like a Microsoft press release. Nonetheless, this book contains a lot of useful details about how various localized versions of Windows operate. This book is also available on the MSDN Online Library web site at https://premium.microsoft.com/msdn/library/. Registration is required, but otherwise it's free. Assuming Microsoft hasn't added an actually navigable interface to MSDN by the time you read this, you'll find it by clicking on "Books" in the lefthand frame, then clicking on "Developing International Software." (I normally wouldn't bother you with such details, but the interface really is painfully obscure.)
Roman Czyborra maintains a lot of useful information about various ISO 8859 and Cyrillic character sets on his web site at https://czyborra.com/, including charts of a wide range of character sets and code pages.
Ken Lunde's CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing
- Java Communications API
- This may well be the first book to cover the Java Communications API. Sun includes a limited amount of documentation with the Java Communications API itself, mostly javadoc class library documentation. The latter is also available from Sun's web site at https://java.sun.com/products/javacomm/javadocs/Package-javax.comm.html.
The RS-232 serial port and IEEE 1284 parallel port standards predate the Web and widespread use of the Internet. Thus, these standards are still available only on dead trees for the moment. A number of books do cover them in reasonable detail, including Scott Mueller's Upgrading and Repairing PCs, 10th edition (Que, 1998, ISBN 0-7897-1636-4).
Several books discuss writing port-aware programs in a variety of languages. Although none yet use Java, it's generally not hard to translate from the low-level C or Basic code to the equivalent code that uses the Java Communications API. The best book I've found for parallel ports is Jan Axelson's Parallel Port Complete (Lakeview Research, 1996, ISBN 096508191-5).
There are more choices for serial port books, but the most comprehensive one is certainly Joe Campbell's C Programmer's Guide to Serial Communications (Sams, 1993, ISBN 0-672-30286-1). Despite the title, the first half of this 900-page tome is an exhaustive treatment of more or less language-independent serial communication hardware and protocols from 19th-century telegraphy to the present day.
- Updates and Breaking News
- In the fast-moving world of Java, it's an effort to publish a book that isn't out of date by the time it reaches store shelves. Most of what I've written about in this book seems fairly stable. However, there will undoubtedly be many new developments after publication. The following three web sites can help you stay abreast of new technologies and strategies for Java I/O.
My Café au Lait site at https://metalab.unc.edu/javafaq/ features almost daily news updates about Java topics. I pay special attention to new material that's closely related to my books, like I/O and networking libraries. Café au Lait also features many resources to help you develop your Java programming skills, including FAQ lists, tutorials, course notes, examples, exercises, book reviews, and more. Of particular interest will be the Java I/O page at https://metalab.unc.edu/javafaq/books/javaio/. I'll post corrections and updates to this book there as necessary.
O'Reilly's official Java site at https://java.oreilly.com/ contains feature articles and links to the official O'Reilly sites for all our Java books. You can peruse the rather impressive O'Reilly Java catalog (18 books and counting) and view descriptions, author bios, tables of contents, indexes, reviews, exercises, examples, errata, and reader comments for all the books (including this one).
I/O isn't the sexiest topic in the programming community, but it is one of the most important. IDG's JavaWorld (https://www.javaworld.com/) is to be commended for treating I/O on an equal footing with sexier topics like JavaBeans and the Java Media APIs. JavaWorld publishes monthly how-to articles, book reviews, news, and more. They're particularly notable for providing short, technical articles that show you how to do things Sun's only hinted at and how to work around common problems programmers face.
- Appendix B: Character Sets
- The first 128 Unicode characters (that is, characters 0 through 127) are identical to the ASCII character set. 32 is the ASCII space; therefore, 32 is the Unicode space. 33 is the ASCII exclamation point; therefore, 33 is the Unicode exclamation point, and so on. Table 2.1 lists this character set.
In the first column, characters 0 through 31 are referred to as control characters, because they're traditionally entered by holding down the control key and a letter key (on at least some dumb terminals). For instance, Ctrl-H is often ASCII 8, backspace. Ctrl-S is often mapped to ASCII 19, DC3 or XOFF. Ctrl-Q is often mapped to ASCII 17, DC1 or XON. Generally, each control character is entered by pressing the Control key and the printable character whose ASCII value is the ASCII value of the character you want plus 64 (or plus 96, if you count from the lowercase letters). Character 127, delete, is also a control character.
The common abbreviation for the character is given first, followed by its common meaning. Some of these codes are pretty much obsolete. For instance, I'm not aware of any modern OS that actually uses characters 28 through 31 as file, group, record, and unit separators. Those control codes that are still used often have different meanings on different platforms. For example, character 10, the linefeed, originally meant move the platen on the printer up one line, while character 13, the carriage return, meant return the print-head to the beginning of the line. On paper-based teletype terminals, this could be used to position the print-head anywhere on a page and perhaps overtype characters that had already been typed. This no longer makes sense in an era of glass terminals and GUIs, so linefeed has come to mean a generic end-of-line character.
The next 128 Unicode characters (that is, 128 through 255) have the same values as the equivalent characters in the Latin-1 character set defined in ISO standard 8859-1. Latin-1, a slight variation of which is used by Windows, adds the various accented characters, umlauts, cedillas, upside-down question marks, and other characters needed to write text in most Western European languages. Table 2.2 shows these characters. The first 128 characters in Latin-1 are the ASCII characters shown in Table 2.1.
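Both pieces of arithmetic above, the control-key offset and the identity between Latin-1 byte values and the first 256 Unicode code points, can be confirmed with a few lines of code. This is a minimal sketch; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class CharsetSketch {

  // ASCII code produced by Ctrl plus an uppercase letter:
  // the letter's own code minus 64.
  static int controlCode(char upperCaseLetter) {
    return upperCaseLetter - 64;
  }

  // Decodes a single Latin-1 byte and returns its Unicode code point.
  static int latin1ToUnicode(int byteValue) {
    byte[] one = {(byte) byteValue};
    String decoded = new String(one, StandardCharsets.ISO_8859_1);
    return decoded.charAt(0);
  }

  public static void main(String[] args) {
    System.out.println(controlCode('H'));       // backspace
    System.out.println(latin1ToUnicode(0xE9));  // e with acute accent
  }
}
```

Looping latin1ToUnicode() over 0 through 255 returns each input unchanged, which is exactly the correspondence the two tables describe.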
© 2008, O'Reilly Media, Inc. | (707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.