O'Reilly Media | C in a Nutshell

C in a Nutshell

By Peter Prinz, Tony Crawford
Book Price: $39.95 USD
£28.50 GBP
PDF Price: $27.99

Chapter 1: Language Basics

This chapter describes the basic characteristics and elements of the C programming language.

C is a general-purpose, procedural programming language. Dennis Ritchie first devised C in the 1970s at AT&T Bell Laboratories in Murray Hill, New Jersey, for the purpose of implementing the Unix operating system and utilities with the greatest possible degree of independence from specific hardware platforms. The key characteristics of the C language are the qualities that made it suitable for that purpose:

Source code portability
The ability to operate "close to the machine"
Efficiency

As a result, the developers of Unix were able to write most of the operating system in C, leaving only a minimum of system-specific hardware manipulation to be coded in assembler.

C's ancestors are the typeless programming languages BCPL (the Basic Combined Programming Language), developed by Martin Richards; and B, a descendant of BCPL, developed by Ken Thompson. A new feature of C was its variety of data types: characters, numeric types, arrays, structures, and so on. Brian Kernighan and Dennis Ritchie published an official description of the C programming language in 1978. As the first de facto standard, their description is commonly referred to simply as "K&R."

C owes its high degree of portability to a compact core language that contains few hardware-dependent elements. For example, the C language proper has no file access or dynamic memory management statements. In fact, there aren't even any statements for console input and output. Instead, the extensive standard C library provides the functions for all of these purposes.

This language design makes the C compiler relatively compact and easy to port to new systems. Furthermore, once the compiler is running on a new system, you can compile most of the functions in the standard library with no further modification, because they are in turn written in portable C. As a result, C compilers are available for practically every computer system.

Because C was expressly designed for system programming, it is hardly surprising that one of its major uses today is in programming embedded systems. At the same time, however, many developers use C as a portable, structured high-level language to write programs such as powerful word processor, database, and graphics applications.

Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!

The Structure of C Programs

The procedural building blocks of a C program are functions, which can invoke one another. Every function in a well-designed program serves a specific purpose. The functions contain statements for the program to execute sequentially, and statements can also be grouped to form block statements, or blocks. As the programmer, you can use the ready-made functions in the standard library, or write your own whenever no standard function fulfills your intended purpose. In addition to the standard C library, there are many specialized libraries available, such as libraries of graphics functions. However, by using such nonstandard libraries, you limit the portability of your program to those systems to which the libraries themselves have been ported.

Every C program must define at least one function of its own, with the special name main(): this is the first function invoked when the program starts. The main() function is the program's top level of control, and can call other functions as subroutines.

Example 1-1 shows the structure of a simple, complete C program. We will discuss the details of declarations, function calls, output streams and more elsewhere in this book. For now, we are simply concerned with the general structure of the C source code. The program in Example 1-1 defines two functions, main() and circularArea(). The main() function calls circularArea() to obtain the area of a circle with a given radius, and then calls the standard library function printf() to output the results in formatted strings on the console.

Example 1-1. A simple C program

// circle.c: Calculate and print the areas of circles
#include <stdio.h>                // Preprocessor directive
double circularArea( double r );  // Function declaration (prototype form)
int main()                        // Definition of main() begins
{
  double radius = 1.0, area = 0.0;
  printf( "    Areas of Circles\n\n" );
  printf( "     Radius          Area\n"
          "-------------------------\n" );
  area = circularArea( radius );
  printf( "%10.1f     %10.2f\n", radius, area );
  radius = 5.0;
  area = circularArea( radius );
  printf( "%10.1f     %10.2f\n", radius, area );
  return 0;
}
// The function circularArea() calculates the area of a circle
// Parameter:    The radius of the circle
// Return value: The area of the circle
double circularArea( double r )      // Definition of circularArea() begins
{
  const double pi = 3.1415926536;    // Pi is a constant
  return  pi * r * r;
}

Source Files

The function definitions, global declarations and preprocessing directives make up the source code of a C program. For small programs, the source code is written in a single source file. Larger C programs consist of several source files. Because the function definitions generally depend on preprocessor directives and global declarations, source files usually have the following internal structure:

Preprocessor directives
Global declarations
Function definitions

C supports modular programming by allowing you to organize a program in as many source and header files as desired, and to edit and compile them separately. Each source file generally contains functions that are logically related, such as the program's user interface functions. It is customary to label C source files with the filename suffix .c.

Examples 1-2 and 1-3 show the same program as Example 1-1, but divided into two source files.

Example 1-2. The first source file, containing the main() function

// circle.c: Prints the areas of circles.
// Uses circulararea.c for the math
#include <stdio.h>
double circularArea( double r );
int main()
{
  /* ... As in Example 1-1 ... */
}

Example 1-3. The second source file, containing the circularArea() function

// circulararea.c: Calculates the areas of circles.
// Called by main() in circle.c
double circularArea( double r )
{
  /* ... As in Example 1-1 ... */
}

When a program consists of several source files, you need to declare the same functions and global variables, and define the same macros and constants, in many of the files. These declarations and definitions thus form a sort of file header that is more or less constant throughout a program. For the sake of simplicity and consistency, you can write this information just once in a separate header file, and then reference the header file using an #include directive in each source code file. Header files are customarily identified by the filename suffix .h. A header file explicitly included in a C source file may in turn include other files.

Each C source file, together with all the header files included in it, makes up a translation unit.

Comments

You should use comments generously in the source code to document your C programs. There are two ways to insert a comment in C: block comments begin with /* and end with */, and line comments begin with // and end with the next new line character.

You can use the /* and */ delimiters to begin and end comments within a line, and to enclose comments of several lines. For example, in the following function prototype, the ellipsis (...) signifies that the open() function has a third, optional parameter. The comment explains the usage of the optional parameter:

    int open( const char *name, int mode, ... /* int permissions */ );

You can use // to insert comments that fill an entire line, or to write source code in a two-column format, with program code on the left and comments on the right:

    const double pi = 3.1415926536;     // Pi is constant

These line comments were officially added to the C language by the C99 standard, but most compilers already supported them even before C99. They are sometimes called "C++-style" comments, although they originated in C's forerunner, BCPL.

Inside the quotation marks that delimit a character constant or a string literal, the characters /* and // do not start a comment. For example, the following statement contains no comments:

    printf( "Comments in C begin with /* or //.\n" );

The only thing that the preprocessor looks for in examining the characters in a comment is the end of the comment; thus it is not possible to nest block comments. However, you can insert /* and */ to comment out part of a program that contains line comments:

    /* Temporarily removing two lines:
      const double pi = 3.1415926536;     // Pi is constant
      area = pi * r * r;                  // Calculate the area
       Temporarily removed up to here */

If you want to comment out part of a program that contains block comments, you can use a conditional preprocessor directive (described in Chapter 14):

    #if 0
      const double pi = 3.1415926536;     /* Pi is constant     */
      area = pi * r * r;                  /* Calculate the area */
    #endif

Character Sets

C makes a distinction between the environment in which the compiler translates the source files of a program (the translation environment) and the environment in which the compiled program is executed (the execution environment). Accordingly, C defines two character sets: the source character set is the set of characters that may be used in C source code, and the execution character set is the set of characters that can be interpreted by the running program. In many C implementations, the two character sets are identical. If they are not, then the compiler converts the characters in character constants and string literals in the source code into the corresponding elements of the execution character set.

Each of the two character sets includes both a basic character set and extended characters. The C language does not specify the extended characters, which are usually dependent on the local language. The extended characters together with the basic character set make up the extended character set.

The basic source and execution character sets both contain the following types of characters:

The letters of the Latin alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
The decimal digits: 0 1 2 3 4 5 6 7 8 9
The following 29 punctuation marks: ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
The five whitespace characters: Space, horizontal tab, vertical tab, new line, and form feed

The basic execution character set also includes four nonprintable characters: the null character, which acts as the termination mark in a character string; alert; backspace; and carriage return. To represent these characters in character and string literals, type the corresponding escape sequences beginning with a backslash: \0 for the null character, \a for alert, \b for backspace, and \r for carriage return. See Chapter 3 for more details.

The actual numeric values of characters, called the character codes, may vary from one C implementation to another. The language itself imposes only the following conditions:

Each character in the basic character set must be representable in one byte.

Identifiers

The term identifier refers to the names of variables, functions, macros, structures and other objects defined in a C program. Identifiers can contain the following characters:

The letters in the basic character set, a-z and A-Z. Identifiers are case-sensitive.
The underscore character, _.
The decimal digits 0-9, although the first character of an identifier must not be a digit.
Universal character names that represent the letters and digits of other languages.

The permissible universal characters are defined in Annex D of the C standard, and correspond to the characters defined in the ISO/IEC TR 10176 standard, minus the basic character set.

Multibyte characters may also be permissible in identifiers. However, it is up to the given C implementation to determine exactly which multibyte characters are permitted and what universal character names they correspond to.

The following 37 keywords are reserved in C, each having a specific meaning to the compiler, and must not be used as identifiers:

auto	enum	restrict	unsigned
break	extern	return	void
case	float	short	volatile
char	for	signed	while
const	goto	sizeof	_Bool
continue	if	static	_Complex
default	inline	struct	_Imaginary
do	int	switch
double	long	typedef
else	register	union

The following examples are valid identifiers:

x dollar Break error_handler scale64

The following are not valid identifiers:

1st_rank switch y/n x-ray

If the compiler supports universal character names, then α is also an example of a valid identifier, and you can define a variable by that name:

    double α = 0.5;

Your source code editor might save the character α in the source file as the universal character name \u03B1.

When choosing identifiers in your programs, remember that many identifiers are already used by the C standard library. These include the names of standard library functions, which you cannot use for functions you define or for global variables. See Chapter 15 for details.

The C compiler provides the predefined identifier __func__, which you can use in any function to access a string constant containing the name of the function. This is useful for logging or for debugging output.

How the C Compiler Works

Once you have written a source file using a text editor, you can invoke a C compiler to translate it into machine code. The compiler operates on a translation unit consisting of a source file and all the header files referenced by #include directives. If the compiler finds no errors in the translation unit, it generates an object file containing the corresponding machine code. Object files are usually identified by the filename suffix .o or .obj. In addition, the compiler may also generate an assembler listing (see Part III).

Object files are also called modules. A library, such as the C standard library, contains compiled, rapidly accessible modules of the standard functions.

The compiler translates each translation unit of a C program—that is, each source file with any header files it includes—into a separate object file. The compiler then invokes the linker, which combines the object files, and any library functions used, in an executable file. Figure 1-1 illustrates the process of compiling and linking a program from several source files and libraries. The executable file also contains any information that the target operating system needs to load and start it.

Figure 1-1: From source code to executable file

The compiling process takes place in eight logical steps. A given compiler may combine several of these steps, as long as the results are not affected. The steps are:

Characters are read from the source file and converted, if necessary, into the characters of the source character set. The end-of-line indicators in the source file, if different from the newline character, are replaced. Likewise, any trigraph sequences are replaced with the single characters they represent. (Digraphs, however, are left alone; they are not converted into their single-character equivalents.)
Wherever a backslash is followed immediately by a newline character, the preprocessor deletes both. Since a line end character ends a preprocessor directive, this processing step lets you place a backslash at the end of a line in order to continue a directive, such as a macro definition, on the next line.

Chapter 2: Types

Programs have to store and process different kinds of data, such as integers and floating-point numbers, in different ways. To this end, the compiler needs to know what kind of data a given value represents.

In C, the term object refers to a location in memory whose contents can represent values. Objects that have names are also called variables. An object's type determines how much space the object occupies in memory, and how its possible values are encoded. For example, the same pattern of bits can represent completely different integers depending on whether the data object is interpreted as signed (that is, either positive or negative) or unsigned, and hence unable to represent negative values.

The types in C can be classified as follows:

Basic types
- Standard and extended integer types
- Real and complex floating-point types
Enumerated types
The type void
Derived types
- Pointer types
- Array types
- Structure types
- Union types
- Function types

The basic types and the enumerated types together make up the arithmetic types. The arithmetic types and the pointer types together are called the scalar types. Finally, array types and structure types are referred to collectively as the aggregate types. (Union types are not considered aggregate, because only one of their members can store a value at any given time.)

A function type describes the interface to a function; that is, it specifies the type of the function's return value, and may also specify the types of all the parameters that are passed to the function when it is called.

All other types describe objects. This description may or may not include the object's storage size: if it does, the type is properly called an object type; if not, it is an incomplete type. An example of an incomplete type might be an externally defined array variable:

    extern float fArr[ ];     // External declaration

This line declares fArr as an array whose elements have type float. However, because the array's size is not specified here, fArr's type is incomplete. As long as the global array fArr is defined with a specified size at another location in the program—in another source file, for example—this declaration is sufficient to let you use the array in its present scope. (For more details on external declarations, see Chapter 11.)


Typology

This chapter describes the basic types, enumerations and the type void. The derived types are described in Chapters 7 through 10.

Some types are designated by a sequence of more than one keyword, such as unsigned short. In such cases, the keywords can be written in any order. However, there is a conventional keyword order, which we use in this book.

Integer Types

There are five signed integer types. Most of these types can be designated by several synonyms, which are listed in Table 2-1.

Table 2-1: Standard signed integer types
Type	Synonyms
signed char
int	signed, signed int
short	short int, signed short, signed short int
long	long int, signed long, signed long int
long long (C99)	long long int, signed long long, signed long long int

For each of the five signed integer types in Table 2-1, there is also a corresponding unsigned type that occupies the same amount of memory, with the same alignment: in other words, if the compiler aligns signed int objects on even-numbered byte addresses, then unsigned int objects are also aligned on even addresses. These unsigned types are listed in Table 2-2.

Table 2-2: Unsigned standard integer types
Type	Synonyms
_Bool	bool (defined in stdbool.h)
unsigned char
unsigned int	unsigned
unsigned short	unsigned short int
unsigned long	unsigned long int
unsigned long long	unsigned long long int

C99 introduced the unsigned integer type _Bool to represent Boolean truth values. The Boolean value true is coded as 1, and false is coded as 0. If you include the header file stdbool.h in a program, you can also use the identifiers bool, true, and false, which are familiar to C++ programmers. The macro bool is a synonym for the type _Bool, and true and false are symbolic constants equal to 1 and 0.

The type char is also one of the standard integer types. However, the one-word type name char is synonymous either with signed char or with unsigned char, depending on the compiler. Because this choice is left up to the implementation, char, signed char, and unsigned char are formally three different types.

If your program relies on char being able to hold values less than zero or greater than 127, you should be using either signed char or unsigned char instead.

You can do arithmetic with character variables. It's up to you to decide whether your program interprets the number in a char variable as a character code or as something else.

Floating-Point Types

C also includes special numeric types that can represent nonintegers with a decimal point in any position. The standard floating-point types for calculations with real numbers are as follows:

float: For variables with single precision
double: For variables with double precision
long double: For variables with extended precision

A floating-point value can be stored only with a limited precision, which is determined by the binary format used to represent it and the amount of memory used to store it. The precision is expressed as a number of significant digits. For example, a "precision of six decimal digits" or "six-digit precision" means that the type's binary representation is precise enough to store a real number of six decimal digits, so that its conversion back into a six-digit decimal number yields the original six digits. The position of the decimal point does not matter, and leading and trailing zeros are not counted in the six digits. The numbers 123,456,000 and 0.00123456 can both be stored in a type with six-digit precision.

In C, arithmetic operations with floating-point numbers may be performed internally with greater precision than the type of the operands; many implementations evaluate float expressions with double or greater precision. For example, the following product may be calculated using the double type.

    float height = 1.2345, width = 2.3456;  // Float variables have single
                                            // precision.
    double area = height * width;           // The actual calculation may be
                                            // performed with double
                                            // (or greater) precision.

If you assign the result to a float variable, the value is rounded as necessary. For more details on floating-point math, see the section "math.h" in Chapter 15.

C defines only minimal requirements for the storage size and the binary format of the floating-point types. However, the format commonly used is the one defined by the International Electrotechnical Commission (IEC) in the 1989 standard for binary floating-point arithmetic, IEC 60559. This standard is based in turn on the Institute of Electrical and Electronics Engineers' 1985 standard IEEE 754. Compilers can indicate that they support the IEC floating-point standard by defining the macro __STDC_IEC_559__.

Complex Floating-Point Types (C99)

C99 supports mathematical calculations with complex numbers. The 1999 standard introduced complex floating-point types and extended the mathematical library to include complex arithmetic functions. These functions are declared in the header file complex.h, and include, for example, the trigonometric functions csin(), ctan(), and so on (see Chapter 15).

A complex number z can be represented in Cartesian coordinates as z = x + y × i, where x and y are real numbers, and i is the imaginary unit, defined by the equation i² = -1. The number x is called the real part and y the imaginary part of z.

In C, a complex number is represented by a pair of floating-point values for the real and imaginary parts. Both parts have the same type, whether float, double, or long double. Accordingly, these are the three complex floating-point types:

float _Complex
double _Complex
long double _Complex

Each of these types has the same size and alignment as an array of two float, double, or long double elements.

The header file complex.h defines the macros complex and I. The macro complex is a synonym for the keyword _Complex. The macro I represents the imaginary unit i, and has the type const float _Complex:

    #include <complex.h>
    // ...
    double complex z = 1.0 + 2.0 * I;
    z *= I;      // Rotate z through 90° counterclockwise around the origin.

Enumerated Types

Enumerations are integer types that you define in a program. The definition of an enumeration begins with the keyword enum, possibly followed by an identifier for the enumeration, and contains a list of the type's possible values, with a name for each value:

    enum [identifier] { enumerator-list };

The following example defines the enumerated type enum color:

    enum color { black, red, green, yellow, blue, white=7, gray };

The identifier color is the tag of this enumeration. The identifiers in the list (black, red, and so on) are the enumeration constants, and have the type int. You can use these constants anywhere within their scope, for example as case constants in a switch statement.

Each enumeration constant of a given enumerated type represents a certain value, which is determined either implicitly by its position in the list, or explicitly by initialization with a constant expression. A constant without an initialization has the value 0 if it is the first constant in the list, or the value of the preceding constant plus one. Thus in the previous example, the constants listed have the values 0, 1, 2, 3, 4, 7, 8.

Within an enumerated type's scope, you can use the type in declarations:

    enum color bgColor = blue,         // Define two variables
               fgColor = yellow;       // of type enum color.
    void setFgColor( enum color fgc ); // Declare a function with a parameter
                                       // of type enum color.

An enumerated type always corresponds to one of the standard integer types. Thus your C programs may perform ordinary arithmetic operations with variables of enumerated types. The compiler may select the appropriate integer type depending on the defined values of the enumeration constants. In the previous example, the type char would be sufficient to represent all the values of the enumerated type enum color.

Different constants in an enumeration may have the same value:

    enum { OFF, ON, STOP = 0, GO = 1, CLOSED = 0, OPEN = 1 };

As the preceding example also illustrates, the definition of an enumerated type does not necessarily have to include a tag. Omitting the tag makes sense if you want only to define constants, and not declare any variables of the given type. Defining integer constants in this way is generally preferable to using a long list of

Additional content appearing in this section has been removed.

The Type void

The type specifier void indicates that no value is available. Consequently, you cannot declare variables or constants with this type. You can use the type void for the purposes described in the following sections.

A function with no return value has the type void. For example, the standard function perror() is declared by the prototype:

    void perror( const char * );

The keyword void in the parameter list of a function prototype indicates that the function has no parameters:

    FILE *tmpfile( void );

As a result, the compiler issues an error message if you try to use a function call such as tmpfile("name.tmp"). If the function were declared without void in the parameter list, the C compiler would have no information about the function's parameters, and hence be unable to determine whether the function call is correct.

A void expression is one that has no value. For example, a call to a function with no return value is an expression of type void:

    char filename[ ] = "memo.txt";
    if ( fopen( filename, "r" ) == NULL )
      perror( filename );             // A void expression.

The cast operation (void)expression explicitly discards the value of an expression, such as the return value of a function:

    (void)printf("I don't need this function's return value!\n");

A pointer of type void * represents the address of an object, but not its type. You can use such quasi-typeless pointers mainly to declare functions that can operate on various types of pointer arguments, or that return a "multipurpose" pointer. The standard memory management functions are a simple example:

    void *malloc( size_t size );
    void *realloc( void *ptr, size_t size );
    void free( void *ptr );

As Example 2-3 illustrates, you can assign a void pointer value to another object pointer type, or vice versa, without explicit type conversion.

Example 2-3. Using the type void

// usingvoid.c: Demonstrates uses of the type void
// -------------------------------------------------------
#include <stdio.h>
#include <time.h>
#include <stdlib.h>  // Provides the following function prototypes:
                     // void srand( unsigned int seed );
                     // int rand( void );
                     // void *malloc( size_t size );
                     // void free( void *ptr );
                     // void exit( int status );
enum { ARR_LEN = 100 };
int main()
{
  int i,                                // Obtain some storage space.
      *pNumbers = malloc(ARR_LEN * sizeof(int));
  if ( pNumbers == NULL )
  {
    fprintf(stderr, "Insufficient memory.\n");
    exit(1);
  }
  srand( (unsigned)time(NULL) );        // Initialize the
                                        // random number generator.
  for ( i=0; i < ARR_LEN; ++i )
    pNumbers[i] = rand() % 10000;          // Store some random numbers.
  printf("\n%d random numbers between 0 and 9999:\n", ARR_LEN );
  for ( i=0; i < ARR_LEN; ++i )         // Output loop:
  {
    printf("%6d", pNumbers[i]);         // Print one number per loop iteration
    if ( i % 10 == 9 ) putchar('\n');   // and a newline after every 10 numbers.
  }
  free( pNumbers );                     // Release the storage space.
  return 0;
}
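
A void pointer is also a convenient parameter type for functions that work on objects of any type. The following is a minimal sketch of that idea; the names clone_bytes and clone_demo are invented for this illustration and do not belong to the standard library:

```c
#include <stdlib.h>
#include <string.h>

// Return a heap copy of any object of `size` bytes, or NULL on failure.
// The caller releases the copy with free().
void *clone_bytes(const void *src, size_t size)
{
    void *copy = malloc(size);    // malloc() returns void *
    if (copy != NULL)
        memcpy(copy, src, size);
    return copy;
}

// Copy an int array and add up the elements of the copy.
// Returns -1 if no memory is available.
int clone_demo(void)
{
    int src[3] = { 1, 2, 3 };
    int *dst = clone_bytes(src, sizeof src);  // void * to int *: no cast needed
    if (dst == NULL)
        return -1;
    int sum = dst[0] + dst[1] + dst[2];
    free(dst);                                // int * to void *: no cast needed
    return sum;
}
```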

Additional content appearing in this section has been removed.

Chapter 3: Literals

In C source code, a literal is a token that denotes a fixed value, which may be an integer, a floating-point number, a character, or a string. A literal's type is determined by its value and its notation.

The literals discussed here are different from compound literals, which were introduced in the C99 standard. Compound literals are ordinary modifiable objects, similar to variables. For a full description of compound literals and the special operator used to create them, see Chapter 5.

An integer constant can be expressed as an ordinary decimal numeral, or as a numeral in octal or hexadecimal notation. You must specify the intended notation by a prefix.

A decimal constant begins with a nonzero digit. For example, 255 is the decimal constant for the base-10 value 255.

A number that begins with a leading zero is interpreted as an octal constant. Octal (or base eight) notation uses only the digits from 0 to 7. For example, 047 is a valid octal constant representing 4 × 8 + 7, and is equivalent to the decimal constant 39. The decimal constant 255 is equal to the octal constant 0377.

A hexadecimal constant begins with the prefix 0x or 0X. The hexadecimal digits A to F can be upper- or lowercase. For example, 0xff, 0Xff, 0xFF, and 0XFF represent the same hexadecimal constant, which is equivalent to the decimal constant 255.

Because the integer constants you define will eventually be used in expressions and declarations, their type is important. The type of a constant is determined at the same time as its value is defined. Integer constants such as the examples just mentioned usually have the type int. However, if the value of an integer constant is outside the range of the type int, then it must have a bigger type. In this case, the compiler assigns it the first type in a hierarchy that is large enough to represent the value. For decimal constants, the type hierarchy is:

    int, long, long long

For octal and hexadecimal constants, the type hierarchy is:

    int, unsigned int, long, unsigned long, long long, unsigned long long

For example, on a 16-bit system, the decimal constant

Additional content appearing in this section has been removed.

Integer Constants

An integer constant can be expressed as an ordinary decimal numeral, or as a numeral in octal or hexadecimal notation. You must specify the intended notation by a prefix.

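A decimal constant begins with a nonzero digit. For example, 255 is the decimal constant for the base-10 value 255.

A number that begins with a leading zero is interpreted as an octal constant. Octal (or base eight) notation uses only the digits from 0 to 7. For example, 047 is a valid octal constant representing 4 × 8 + 7, and is equivalent to the decimal constant 39. The decimal constant 255 is equal to the octal constant 0377.

A hexadecimal constant begins with the prefix 0x or 0X. The hexadecimal digits A to F can be upper- or lowercase. For example, 0xff, 0Xff, 0xFF, and 0XFF represent the same hexadecimal constant, which is equivalent to the decimal constant 255.

Because the integer constants you define will eventually be used in expressions and declarations, their type is important. The type of a constant is determined at the same time as its value is defined. Integer constants such as the examples just mentioned usually have the type int. However, if the value of an integer constant is outside the range of the type int, then it must have a bigger type. In this case, the compiler assigns it the first type in a hierarchy that is large enough to represent the value. For decimal constants, the type hierarchy is:

    int, long, long long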

For octal and hexadecimal constants, the type hierarchy is:

    int, unsigned int, long, unsigned long, long long, unsigned long long

For example, on a 16-bit system, the decimal constant 50000 has the type long, since the greatest possible int value is 32,767, or 2¹⁵ − 1.
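
One way to observe the type that the compiler assigns to a constant is a generic selection. This is only a sketch: it assumes a compiler that supports the C11 keyword _Generic, which is not part of C99, and the names TYPE_NAME and same_value are invented for the illustration:

```c
// Yield the name of the type of the expression x (C11 or later).
#define TYPE_NAME(x) _Generic((x),                               \
    int: "int",             unsigned int: "unsigned int",        \
    long: "long",           unsigned long: "unsigned long",      \
    long long: "long long", unsigned long long: "unsigned long long", \
    default: "other")

// The same values written in decimal, octal, and hexadecimal notation.
int same_value(void)
{
    return 255 == 0377 && 0377 == 0xFF && 047 == 39;
}
```

With this macro, TYPE_NAME(255) and TYPE_NAME(0xFF) both yield "int". The results for larger constants depend on the platform: where int is 32 bits wide, the hexadecimal constant 0x80000000 fits in unsigned int, while the decimal constant 2147483648 skips the unsigned types and gets the type long or long long.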

You can also influence the types of constants in your programs explicitly by using suffixes. A constant with the suffix l or L has the type long (or a larger type if necessary, in accordance with the hierarchies just mentioned). Similarly, a constant with the suffix ll or LL has at least the type long long. The suffix u or U can be used to ensure that the constant has an unsigned type. The

Additional content appearing in this section has been removed.

Floating-Point Constants

Floating-point constants can be written either in decimal or in hexadecimal notation. These notations are described in the next two sections.

An ordinary floating-point constant consists of a sequence of decimal digits containing a decimal point. You may also multiply the value by a power of 10, as in scientific notation: the power of 10 is represented simply by an exponent, introduced by the letter e or E. A floating-point constant that contains an exponent does not need to have a decimal point. Table 3-2 gives a few examples of decimal floating-point constants.

Table 3-2: Examples of decimal floating-point constants
Floating-point constant	Value
10.0	10
2.34E5	2.34 × 10⁵
67e-12	67.0 × 10⁻¹²

The decimal point can also be the first or last character. Thus 10. and .234E6 are permissible numerals. However, the numeral 10 with no decimal point would be an integer constant, not a floating-point constant.

The default type of a floating-point constant is double. You can also append the suffix F or f to assign a constant the type float, or the suffix L or l to give a constant the type long double, as this example shows:

    float  f_var = 123.456F;              // Initialize a float variable.
    long double ld_var = f_var * 987E7L;  // Initialize a long double variable
                                          // with the product of a
                                          // multiplication performed with
                                          // long double precision.
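
The effect of the suffixes can be made visible with sizeof, which yields the size of a constant's type. The function names in this sketch are invented for the illustration:

```c
#include <stddef.h>

// sizeof applied to a constant yields the size of the constant's type.
size_t size_no_suffix(void) { return sizeof 123.456;  }  // type double
size_t size_suffix_f(void)  { return sizeof 123.456F; }  // type float
size_t size_suffix_l(void)  { return sizeof 123.456L; }  // type long double

// Different spellings of the same values, all of type double.
int same_values(void)
{
    return 10. == 10.0 && .234E6 == 2.34E5 && 2.34E5 == 234000.0;
}
```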

The C99 standard introduced hexadecimal floating-point constants, which have a key advantage over decimal floating-point numerals: if you specify a constant value in hexadecimal notation, it can be stored in the computer's binary floating-point format exactly, with no rounding error, whereas values that are "round numbers" in decimal notation, like 0.1, may be repeating fractions in binary, and have to be rounded for representation in the internal format. (For an example of rounding with floating-point numbers, see Example 2-2.)
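
As a brief sketch: a hexadecimal floating-point constant starts with 0x or 0X, followed by hexadecimal digits and a mandatory binary exponent introduced by p or P, which scales the value by a power of 2. The function names below are invented for this illustration, and the second function assumes that double uses the IEEE 754 64-bit format:

```c
// Values with short binary representations are stored exactly.
int hex_constants_exact(void)
{
    return 0x1p-1  == 0.5      // 1 * 2^-1
        && 0x1.8p1 == 3.0      // (1 + 8/16) * 2^1
        && 0xA.8p0 == 10.5;    // (10 + 8/16) * 2^0
}

// Assumption: IEEE 754 double. The decimal constant 0.1 is rounded to the
// nearest double, whose exact value can be written in hexadecimal notation.
int tenth_is_rounded(void)
{
    return 0.1 == 0x1.999999999999ap-4;
}
```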

A hexadecimal floating-point constant consists of the prefix

Additional content appearing in this section has been removed.

Character Constants

A character constant consists of one or more characters enclosed in single quotation marks. Some examples:

    'a'   'XY'   '0'   '*'

All the characters of the source character set are permissible in character constants, except the single quotation mark ', the backslash \, and the newline character. To represent these characters, you must use escape sequences:

    '\''   '\\'   '\n'

All the escape sequences that are permitted in character constants are described in the upcoming section "Escape sequences."

Character constants have the type int, unless they are explicitly defined as wide characters, with type wchar_t, by the prefix L. If a character constant contains one character that can be represented in a single byte, then its value is the character code of that character in the execution character set. For example, the constant 'a' in ASCII encoding has the decimal value 97. The value of character constants that consist of more than one character can vary from one compiler to another.

The following code fragment tests whether the character read is a digit between 1 and 5, inclusive:

    #include <stdio.h>
    int c = 0;
    /* ... */
    c = getchar();                          // Read a character.
    if ( c != EOF && c > '0' && c < '6' )   // Compare input to character
                                            // constants.
    {
      /* This block is executed if the user entered a digit from 1 to 5. */
    }
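
The same comparison can be wrapped in a function so that it is easy to try out with different values. The name is_digit_1_to_5 is hypothetical. The test relies on the fact that C guarantees consecutive character codes for the digits '0' through '9':

```c
#include <stdio.h>    // Provides the macro EOF.

// Returns 1 if c is one of the digits '1' through '5', otherwise 0.
int is_digit_1_to_5(int c)
{
    return c != EOF && c > '0' && c < '6';
}
```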

If the type char is signed, then the value of a character constant can also be negative, because the constant's value is the result of a type conversion of the character code from char to int. For example, ISO 8859-1 is a commonly used 8-bit character set, also known as the ISO Latin 1 or ANSI character set. In this character set, the currency symbol for pounds sterling, £, is coded as hexadecimal A3:

    int c = '\xA3';                         // Symbol for pounds sterling
    printf("Character: %c     Code: %d\n", c, c);

If the execution character set is ISO 8859-1, and the type char is signed, then the printf statement in the preceding example generates the following output:

Additional content appearing in this section has been removed.

String Literals

A string literal consists of a sequence of characters (and/or escape sequences) enclosed in double quotation marks. Example:

    "Hello world!\n"

Like character constants, string literals may contain all the characters in the source character set. The only exceptions are the double quotation mark ", the backslash \, and the newline character, which must be represented by escape sequences. The following printf statement first produces an alert tone, then indicates a documentation directory in quotation marks, substituting the string literal addressed by the pointer argument doc_path for the conversion specification %s:

    char doc_path[128] = ".\\share\\doc";
    printf("\aSee the documentation in the directory \"%s\"\n", doc_path);

A string literal is a static array of char that contains character codes followed by a string terminator, the null character \0 (see also Chapter 8). The empty string "" occupies exactly one byte in memory, which holds the terminating null character. Characters that cannot be represented in one byte are stored as multibyte characters.

As illustrated in the previous example, you can use a string literal to initialize a char array. A string literal can also be used to initialize a pointer to char:

    char *pStr = "Hello, world!";     // pStr points to the first character, 'H'

In such an initializer, the string literal represents the address of its first element, just as an array name would.
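
The following sketch contrasts the two kinds of initialization and shows the storage that a string literal occupies. The function names are invented for this illustration. Note that a char array initialized with a string literal is an ordinary modifiable copy, whereas an attempt to modify the literal itself through a pointer has undefined behavior:

```c
#include <stddef.h>

// The size of a string literal includes the terminating null character.
size_t empty_literal_size(void) { return sizeof ""; }        // 1 byte
size_t hello_literal_size(void) { return sizeof "Hello"; }   // 5 + 1 bytes

// Returns the first character of the array after modifying it.
char modify_array_copy(void)
{
    char buf[] = "Hello";       // An array holding a copy of the literal.
    const char *p = "Hello";    // A pointer to the literal's first character.
    buf[0] = 'J';               // Permissible: buf is a modifiable array.
    return (p[0] == 'H') ? buf[0] : '\0';
}
```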

In Example 3-1, the array error_msg contains three pointers to char, each of which is assigned the address of the first character of a string literal.

Example 3-1. Sample function error_exit()

#include <stdlib.h>
#include <stdio.h>
void error_exit(unsigned int error_n)  // Print a last error message
{                                      // and exit the program.
  char * error_msg[ ] = { "Unknown error code.\n",
                         "Insufficient memory.\n",
                         "Illegal memory access.\n" };
  unsigned int arr_len = sizeof(error_msg)/sizeof(char *);
  if ( error_n >= arr_len )
     error_n = 0;
  fputs( error_msg[error_n], stderr );
  exit(1);
}

Additional content appearing in this section has been removed.

Chapter 4: Type Conversions

In C, operands of different types can be combined in one operation. For example, the following expressions are permissible:

    double dVar = 2.5;   // Define dVar as a variable of type double.
    dVar *= 3;           // Multiply dVar by an integer constant.
    if ( dVar < 10L )    // Compare dVar with a long-integer constant.
      { /* ... */ }

When the operands have different types, the compiler tries to convert them to a uniform type before performing the operation. In certain cases, furthermore, you must insert type conversion instructions in your program. A type conversion yields the value of an expression in a new type, which can be either the type void (meaning that the value of the expression is discarded: see "Expressions of Type void" in Chapter 2), or a scalar type—that is, an arithmetic type or a pointer. For example, a pointer to a structure can be converted into a different pointer type. However, an actual structure value cannot be converted into a different structure type.

The compiler provides implicit type conversions when operands have mismatched types, or when you call a function using an argument whose type does not match the function's corresponding parameter. Programs also perform implicit type conversion as necessary when initializing variables or otherwise assigning values to them. If the necessary conversion is not possible, the compiler issues an error message.

You can also convert values from one type to another explicitly using the cast operator (see Chapter 5):

    (type_name) expression

In the following example, the cast operator causes the division of one integer variable by another to be performed as a floating-point operation:

    int sum = 22, count = 5;
    double mean = (double)sum / count;

Because the cast operator has precedence over division, the value of sum in this example is first converted to type double. The compiler must then implicitly convert the divisor, the value of count, to the same type before performing the division.
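
To see why the position of the cast matters, compare the following two functions. This is only a sketch, and the function names are invented for the illustration:

```c
// The cast applies to sum before the division: floating-point division.
double mean_cast_operand(int sum, int count)
{
    return (double)sum / count;
}

// The cast applies to the result of the integer division:
// the fractional part has already been discarded.
double mean_cast_result(int sum, int count)
{
    return (double)(sum / count);
}
```

With sum = 22 and count = 5, the first function returns 4.4, and the second returns 4.0.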

You should always use the cast operator whenever there is a possibility of losing information, as in a conversion from

Additional content appearing in this section has been removed.

Conversion of Arithmetic Types

Type conversions are always possible between any two arithmetic types, and the compiler performs them implicitly wherever necessary. The conversion preserves the value of an expression if the new type is capable of representing it. This is not always the case. For example, when you convert a negative value to an unsigned type, or convert a floating-point fraction from type double to the type int, the new type simply cannot represent the original value. In such cases the compiler generally issues a warning.

When arithmetic operands have different types, the implicit type conversion is governed by the types' conversion rank. The types are ranked according to the following rules:

Any two unsigned integer types have different conversion ranks. If one is wider than the other, then it has a higher rank.
Each signed integer type has the same rank as the corresponding unsigned type. The type char has the same rank as signed char and unsigned char.

The standard integer types are ranked in the order:

    _Bool < char < short < int < long < long long

Any standard integer type has a higher rank than an extended integer type of the same width. (Extended integer types are described in the section "Integer Types with Exact Width (C99)" in Chapter 2.)
Every enumerated type has the same rank as its corresponding integer type (see "Enumerated Types" in Chapter 2).
The floating-point types are ranked in the following order:

    float < double < long double

The lowest-ranked floating-point type, float, has a higher rank than any integer type.
Every complex floating-point type has the same rank as the type of its real and imaginary parts.
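
The effect of these ranks on mixed-type operations can be observed with a generic selection. This sketch assumes a compiler that supports the C11 keyword _Generic, which is not available in C99; the macro and function names are invented for the illustration:

```c
// Yield the name of the type of the expression x (C11 or later).
#define TYPE_NAME(x) _Generic((x),                       \
    int: "int", long: "long", long long: "long long",    \
    float: "float", double: "double",                    \
    long double: "long double",                          \
    default: "other")

// In each sum, the operand with the lower rank is converted
// to the type of the operand with the higher rank.
const char *int_plus_long(void)     { return TYPE_NAME(1 + 1L); }
const char *long_plus_float(void)   { return TYPE_NAME(1L + 1.0F); }
const char *float_plus_double(void) { return TYPE_NAME(1.0F + 1.0); }
```

The three functions return "long", "float", and "double", respectively.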

In any expression, you can always use a value whose type ranks lower than int in place of an operand of type int or unsigned int. You can also use a bit-field as an integer operand (bit-fields are discussed in Chapter 10). In these cases, the compiler applies integer promotion: any operand whose type ranks lower than int is automatically converted to the type int, provided int is capable of representing all values of the operand's original type. If

Additional content appearing in this section has been removed.

Conversion of Nonarithmetic Types

Pointers and the names of arrays and functions are also subject to certain implicit and explicit type conversions. Structures and unions cannot be converted, although pointers to them can be converted to and from other pointer types.

An array or function designator is any expression that has an array or function type. In most cases, the compiler implicitly converts an expression with an array type, such as the name of an array, into a pointer to the array's first element. The array expression is not converted into a pointer only in the following cases:

When the array is the operand of the sizeof operator
When the array is the operand of the address operator &
When a string literal is used to initialize an array of char or wchar_t

The following examples demonstrate the implicit conversion of array designators into pointers, using the conversion specification %p to print pointer values:

    #include <stdio.h>
    int *iPtr = 0;                      // A pointer to int, initialized with 0.
    int iArray[ ] = { 0, 10, 20 };       // An array of int, initialized.
    int array_length = sizeof(iArray) / sizeof(int); // The number of elements:
                                                     // in this case, 3.
    printf("The array starts at the address %p.\n", (void *)iArray);
    *iArray = 5;                      // Equivalent to iArray[0] = 5;
    iPtr = iArray + array_length - 1; // Point to the last element of iArray:
                                      // Equivalent to
                                      // iPtr = &iArray[array_length-1];
    printf("The last element of the array is %d.\n", *iPtr);

In the initialization of array_length in this example, the expression sizeof(iArray) yields the size of the whole array, not the size of a pointer. However, the same identifier iArray is implicitly converted to a pointer in the other three statements in which it appears:

As an argument in the first printf() call.
As the operand of the dereferencing operator *.
In the pointer arithmetic operations and assignment to iPtr (see also "Modifying and Comparing Pointers" in Chapter 9).
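
The same distinction can be expressed in small functions. This is a sketch only; the function names are invented for the illustration:

```c
#include <stddef.h>

static int iArray[] = { 0, 10, 20 };

// As the operand of sizeof, the array is not converted to a pointer:
// the result is the size of the whole array.
size_t whole_array_size(void)
{
    return sizeof iArray;
}

// In a return statement, the array is converted to a pointer
// to its first element.
int *first_element(void)
{
    return iArray;
}

// Pointer arithmetic on the converted array designator.
int last_value(void)
{
    size_t length = sizeof iArray / sizeof iArray[0];
    return *(iArray + length - 1);
}
```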

Additional content appearing in this section has been removed.

Chapter 5: Expressions and Operators

An expression consists of a sequence of constants, identifiers, and operators that the program evaluates by performing the operations indicated. The expression's purpose in the program may be to obtain the resulting value, or to produce side effects of the evaluation, or both (see the section "Side Effects and Sequence Points," later in this chapter).

A single constant, a string literal, or the identifier of an object or function is in itself an expression. Such a simple expression, or a more complex expression enclosed in parentheses, is called a primary expression.

Every expression has a type. An expression's type is the type of the value that results when the expression is evaluated. If the expression yields no value, it has the type void. Some simple examples of expressions are listed in Table 5-1 (assume that a has been declared as a variable of type int, and z as a variable of type float _Complex).

Table 5-1: Example expressions
Expression	Type
`'\n'`	`int`
`a + 1`	`int`
`a + 1.0`	`double`
`a < 77.7`	`int`
`"A string literal."`	`char *`
`abort()`	`void`
`sqrt(2.0)`	`double`
`z / sqrt(2.0)`	`double _Complex`

As you can see from the examples in Table 5-1, compound expressions are formed by using an operator with expressions as its operands. The operands can themselves be primary or compound expressions. For example, you can use a function call as a factor in a multiplication. Likewise, the arguments in a function call can be expressions involving several operators, as in this example:

    2.0 * sin( 3.14159 * fAngleDegrees/180.0 )

Before we consider specific operators in detail, this section explains a few fundamental principles that will help you understand how C expressions are evaluated. The precedence and associativity of operators are obviously important in parsing compound expressions, but sequence points and lvalues are no less essential to understanding how a C program works.

Additional content appearing in this section has been removed.

How Expressions Are Evaluated

An lvalue is an expression that designates an object. The simplest example is the name of a variable. The initial "L" in the term originally meant "left": because an lvalue designates an object, it can appear on the left side of an assignment operator, as in leftexpression = rightexpression. Other expressions—those that represent a value without designating an object—are called, by analogy, rvalues. An rvalue is an expression that can appear on the right side of an assignment operator, but not the left. Examples include constants and arithmetic expressions.

An lvalue can always be resolved to the corresponding object's address, unless the object is a bit-field or a variable declared with the register storage class (see the section "Storage Class Specifiers" in Chapter 11). The operators that yield an lvalue include the subscript operator [ ] and the indirection operator *, as the examples in Table 5-2 illustrate (assume that array has been declared as an array and ptr as a pointer variable).

Table 5-2: Pointer and array expressions may be lvalues
Expression	Lvalue?
`array[1]`	Yes; an array element is an object with a location.
`&array[1]`	No; the location of the object is not an object with a location.
`ptr`	Yes; the pointer variable is an object with a location.
`*ptr`	Yes; what the pointer points to is also an object with a location.
`ptr+1`	No; the addition yields a new address value, but not an object.
`*ptr+1`	No; the addition yields a new arithmetic value, but not an object.

An object may be declared as constant. If this is the case, you can't use it on the left side of an assignment, even though it is an lvalue, as the following example illustrates:

    int a = 1;
    const int b = 2, *ptr = &a;
    b = 20;                // Error: b is declared as const int.
    *ptr = 10;             // Error: ptr is declared as a pointer to const int.
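
For contrast, here is a sketch of assignments that are permitted with the same declarations. The function name const_demo is invented for this illustration:

```c
// Returns the sum of *ptr and a after two permitted assignments.
int const_demo(void)
{
    int a = 1;
    const int b = 2, *ptr = &a;

    a = 10;       // OK: a itself is not const, although ptr points to it.
    ptr = &b;     // OK: the pointer is modifiable; only *ptr is read-only.

    return *ptr + a;    // 2 + 10
}
```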

Additional content appearing in this section has been removed.

Operators in Detail

This section describes in detail the individual operators, and indicates what kinds of operands are permissible. The descriptions are arranged according to the customary usage of the operators, beginning with the usual arithmetic and assignment operators.

Table 5-5 lists the arithmetic operators.

Table 5-5: Arithmetic operators
Operator	Meaning	Example	Result
`*`	Multiplication	`x * y`	The product of `x` and `y`
`/`	Division	`x / y`	The quotient of `x` by `y`
`%`	The modulo operation	`x % y`	The remainder of `x` divided by `y`
`+`	Addition	`x + y`	The sum of `x` and `y`
`-`	Subtraction	`x - y`	The difference of `x` and `y`
`+` (unary)	Positive sign	`+x`	The value of `x`
`-` (unary)	Negative sign	`-x`	The arithmetic negation of `x`

The operands of the arithmetic operators are subject to the following rules:

Only the % operator requires integer operands.
The operands of all other operators may have any arithmetic type.

Furthermore, addition and subtraction operations may also be performed on pointers in the following cases:

In an addition, one addend can be an object pointer while the other has an integer type.
In a subtraction, either both operands can be pointers to objects of the same type (without regard to type qualifiers), or the minuend (the left operand) can be an object pointer, while the subtrahend (the right operand) has an integer type.
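
The permissible pointer operations can be sketched as follows. The array samples and the function names are invented for this illustration:

```c
#include <stddef.h>

static const int samples[5] = { 10, 20, 30, 40, 50 };

// Addition: an object pointer plus an integer.
int third_sample(void)
{
    const int *p = samples + 2;
    return *p;
}

// Subtraction: a pointer minus an integer.
int next_to_last_sample(void)
{
    const int *last = &samples[4];
    return *(last - 1);
}

// Subtraction: two pointers to elements of the same array.
// The result is the distance in elements, with the type ptrdiff_t.
ptrdiff_t sample_span(void)
{
    const int *first = samples;
    const int *last = samples + 4;
    return last - first;
}
```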

Section 5.2.1.1: Standard arithmetic

The operands are subject to the usual arithmetic conversions (see "Conversion of Arithmetic Types" in Chapter 4). The result of division with two integer operands is also an integer! To obtain the remainder of an integer division, use the modulo operation (the % operator). Implicit type conversion takes place in the evaluation of the following expressions, as shown in Table 5-6 (assume n is declared by short n = -5;).

Table 5-6: Implicit type conversions in arithmetic expressions
Expression	Implicit type conversion	The expression's type	The expression's value
`-n`	Integer promotion.	`int`	5
`n * -2L`	Integer promotion: the value of `n` is promoted to `long`, because the constant

Additional content appearing in this section has been removed.

Constant Expressions

The compiler recognizes constant expressions in source code and replaces them with their values. The resulting constant value must be representable in the expression's type. You may use a constant expression wherever a simple constant is permitted.

Operators in constant expressions are subject to the same rules as in other expressions. Because constant expressions are evaluated at translation time, though, they cannot contain function calls or operations that modify variables, such as assignments.

An integer constant expression is a constant expression with any integer type. These are the expressions you use to define the following items:

The size of an array
The value of an enumeration constant
The size of a bit-field
The value of a case constant in a switch statement

For example, you may define an array as follows:

    #define BLOCK_SIZE 512
    char buffer[4*BLOCK_SIZE];

The operands can be integer, character, or enumeration constants, or sizeof expressions. However, the operand of sizeof in a constant expression must not be a variable-length array. You can also use floating-point constants, if you cast them as an integer type.
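
The four uses of integer constant expressions listed above can be sketched together in one fragment. Apart from BLOCK_SIZE and buffer, all names here are invented for the illustration:

```c
#include <stddef.h>

#define BLOCK_SIZE 512

enum { BUFFER_SIZE = 4 * BLOCK_SIZE };      // Value of an enumeration constant

static char buffer[BUFFER_SIZE];            // Size of an array

struct flags {
    unsigned int mode : BLOCK_SIZE / 256;   // Size of a bit-field: 2 bits
};

size_t buffer_size(void)
{
    return sizeof buffer;                   // A sizeof expression
}

int classify(int n)
{
    switch (n) {
    case BLOCK_SIZE / 2:  return 1;         // Values of case constants
    case 2 * BLOCK_SIZE:  return 2;
    default:              return 0;
    }
}
```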

You can also use constant expressions to initialize static and external objects. In these cases, the constant expressions can have any arithmetic or pointer type desired. You may use floating-point constants as operands in an arithmetic constant expression.

A constant with a pointer type, called an address constant, is usually a null pointer, an array or function name, or a value obtained by applying the address operator & to an object with static storage duration. However, you can also construct an address constant by casting an integer constant as a pointer type, or by pointer arithmetic. Example:

    #define ARRAY_SIZE 200
    static float fArray[ARRAY_SIZE];
    static float *fPtr = fArray + ARRAY_SIZE - 1;  // Pointer to the last
                                                   // array element

In composing an address constant, you can also use other operators, such as . and ->, as long as you do not actually dereference a pointer to access the value

Additional content appearing in this section has been removed.

Chapter 6: Statements

A statement specifies one or more actions to be performed, such as assigning a value to a variable, passing control to a function, or jumping to another statement. The sum total of all a program's statements determines what the program does.

Jumps and loops are statements that control the flow of the program. Except when those control statements result in jumps, statements are executed sequentially; that is, in the order in which they appear in the program.

An expression statement is an expression followed by a semicolon:

[expression] ;

In an expression statement, the expression, whether an assignment or another operation, is evaluated for the sake of its side effects. Following are some typical expression statements:

y = x;                         // An assignment
sum = a + b;                   // Calculation and assignment
++x;
printf("Hello, world\n");      // A function call

The type and value of the expression are irrelevant, and are discarded before the next statement is executed. For this reason, statements such as the following are syntactically correct, but not very useful:

100;
y < x;

If a statement is a function call and the return value of the function is not needed, it can be discarded explicitly by casting the call expression to void:

char name[32];
/* ... */
(void)strcpy( name, "Jim" );   // Explicitly discard
                               // the return value.

A statement can also consist of a semicolon alone: this is called a null statement. Null statements are necessary in cases where syntax requires a statement, but the program should not perform any action. In the following example, a null statement forms the body of a for loop:

for ( i = 0; s[i] != '\0'; ++i ) // Loop conditions
  ;                              // A null statement

This code sets the variable i to the index of the first null character in the array s, using only the expressions in the head of the for loop.
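Wrapped in a function, the same loop might look like the following sketch. The function name indexOfTerminator() is made up for this illustration.

```c
#include <stddef.h>

// Returns the index of the first null character in s,
// which is also the length of the string.
size_t indexOfTerminator( const char *s )
{
   size_t i;
   for ( i = 0; s[i] != '\0'; ++i )
      ;                       // Null statement: the loop head does all the work.
   return i;
}
```

Placing the semicolon on a line of its own makes it obvious to the reader that the empty loop body is intentional.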

A compound statement, called a block for short, groups a number of statements and declarations together between braces to form a single statement:

{ [list of declarations and statements

Additional content appearing in this section has been removed.

Expression Statements

An expression statement is an expression followed by a semicolon:

[expression] ;

In an expression statement, the expression, whether an assignment or another operation, is evaluated for the sake of its side effects. Following are some typical expression statements:

y = x;                         // An assignment
sum = a + b;                   // Calculation and assignment
++x;
printf("Hello, world\n");      // A function call

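The type and value of the expression are irrelevant, and are discarded before the next statement is executed. For this reason, statements such as the following are syntactically correct, but not very useful:

100;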
y < x;

If a statement is a function call and the return value of the function is not needed, it can be discarded explicitly by casting the call expression to void:

char name[32];
/* ... */
(void)strcpy( name, "Jim" );   // Explicitly discard
                               // the return value.

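A statement can also consist of a semicolon alone: this is called a null statement. Null statements are necessary in cases where syntax requires a statement, but the program should not perform any action. In the following example, a null statement forms the body of a for loop:

for ( i = 0; s[i] != '\0'; ++i ) // Loop conditions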
  ;                              // A null statement

This code sets the variable i to the index of the first null character in the array s, using only the expressions in the head of the for loop.

Additional content appearing in this section has been removed.

Block Statements

A compound statement, called a block for short, groups a number of statements and declarations together between braces to form a single statement:

{ [list of declarations and statements] }

Unlike simple statements, block statements are not terminated by a semicolon. A block is used wherever the syntax calls for a single statement, but the program's purpose requires several statements. For example, you can use a block statement in an if statement, or when more than one statement needs to be repeated in a loop:

{  double result = 0.0, x = 0.0;   // Declarations
   static long status = 0;
   extern int limit;
   ++x;                            // Statements
   if ( status == 0 )
   {                               // New block
      int i = 0;
      while ( status == 0 && i < limit )
      {  /* ... */  }              // Another block
   }
   else
   {  /* ... */  }                 // And yet another block
}

The declarations in a block are usually placed at the beginning, before any statements. However, C99 allows declarations to be placed anywhere.

Names declared within a block have block scope; in other words, they are visible only from their declaration to the end of the block. Within that scope, such a declaration can also hide an object of the same name that was declared outside the block. The storage duration of automatic variables is likewise limited to the block in which they occur. This means that the storage space of a variable not declared as static or extern is automatically freed at the end of its block statement. For a full discussion of scope and storage duration, see Chapter 11.
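The following sketch illustrates both points: a name declared in an inner block hides an outer object of the same name, and a static variable keeps its storage after the block ends. The function names are invented for the example.

```c
// Stores the inner x in *innerValue and returns the outer x.
int outerAfterInnerBlock( int *innerValue )
{
   int x = 1;                 // Outer x
   {
      int x = 100;            // Hides the outer x until the end of this block
      ++x;                    // Modifies only the inner x
      *innerValue = x;        // 101
   }                          // The inner x ceases to exist here.
   return x;                  // The outer x is visible again, and unchanged.
}

// A static variable in a block keeps its value between calls,
// although its name is visible only inside the block.
int nextTicket( void )
{
   static int ticket = 0;     // Static storage duration, block scope
   return ++ticket;
}
```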

Additional content appearing in this section has been removed.

Loops

Use a loop to execute a group of statements, called the loop body, more than once. In C, you can introduce a loop by one of three iteration statements: while, do ... while, and for.

In each of these statements, the number of iterations through the loop body is controlled by a condition, the controlling expression. This is an expression of a scalar type; that is, an arithmetic expression or a pointer. The loop condition is true if the value of the controlling expression is not equal to 0; otherwise, it is considered false.

A while statement executes a statement repeatedly as long as the controlling expression is true:

while ( expression ) statement

The while statement is a top-driven loop: first the loop condition (i.e., the controlling expression) is evaluated. If it yields true, the loop body is executed, and then the controlling expression is evaluated again. If the condition is false, program execution continues with the statement following the loop body.
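Because the controlling expression may be a pointer, a while loop can walk a linked list until it reaches a null pointer. The structure type Item and the function countItems() in this sketch are assumptions made for the illustration.

```c
#include <stddef.h>

struct Item { int value; struct Item *next; };

// Counts the elements of a linked list.
int countItems( const struct Item *p )
{
   int n = 0;
   while ( p )                // True as long as p is not a null pointer
   {
      ++n;
      p = p->next;
   }
   return n;
}
```

Since the condition is tested before the loop body runs, an empty list (a null pointer) yields zero iterations.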

Syntactically, the loop body consists of one statement. If several statements are required, they are grouped in a block. Example 6-1 shows a simple while loop that reads in floating-point numbers from the console and accumulates a running total of them.

Example 6-1. A while loop

/* Read in numbers from the keyboard and
 * print out their average.
 * -------------------------------------- */
#include <stdio.h>
int main()
{
   double x = 0.0, sum = 0.0;
   int count = 0;
   printf( "\t--- Calculate Averages ---\n" );
   printf( "\nEnter some numbers:\n"
           "(Type a letter to end your input)\n" );
   while ( scanf( "%lf", &x ) == 1 )
   {
      sum += x;
      ++count;
   }
   if ( count == 0 )
     printf( "No input data!\n" );
   else
     printf( "The average of your numbers is %.2f\n", sum/count );
   return 0;
}

In Example 6-1, the controlling expression:

scanf( "%lf", &x ) == 1

is true as long as the user enters a decimal number. As soon as the function scanf() is unable to convert the string input into a floating-point number—when the user types the letter q, for example—scanf() returns the value 0 (or -1 for

Additional content appearing in this section has been removed.

Selection Statements

A selection statement can direct the flow of program execution along different paths depending on a given condition. There are two selection statements in C: if and switch.

An if statement has the following form:

if ( expression ) statement1 [ else statement2 ]

The else clause is optional. The expression is evaluated first, to determine which of the two statements is executed. This expression must have a scalar type. If its value is true—that is, not equal to 0—then statement1 is executed. Otherwise, statement2, if present, is executed.

The following example uses if in a recursive function to test for the condition that ends its recursion:

// The recursive function power() calculates
// integer powers of floating-point numbers.
// -----------------------------------------
double power( double base, unsigned int exp )
{
   if ( exp == 0 ) return 1.0;
   else return base * power( base, exp-1 );
}

If several if statements are nested, then an else clause always belongs to the last if (on the same block nesting level) that does not yet have an else clause:

if ( n > 0 )
   if ( n % 2 == 0 )
      puts( "n is positive and even" );
   else                                 // This is the alternative
      puts( "n is positive and odd" );  // to the last if

An else clause can be assigned to a different if by enclosing the last if statement that should not have an else clause in a block:

if ( n > 0 )
{
  if ( n % 2 == 0 )
     puts( "n is positive and even" );
}
else                                  // This is the alternative
   puts( "n is negative or zero" );   // to the first if

To select one of more than two alternative statements, if statements can be cascaded in an else if chain. Each new if statement is simply nested in the else clause of the preceding if statement:

// Test measurements for tolerance.
// --------------------------------
double spec = 10.0, measured = 10.3, diff;
/* ... */
diff = measured - spec;
if ( diff >= 0.0 && diff < 0.5 )
   printf( "Upward deviation: %.2f\n", diff );
else if ( diff < 0.0 && diff > -0.5 )
   printf( "Downward deviation: %.2f\n", diff );
else
   printf( "Deviation out of tolerance!\n" );

Additional content appearing in this section has been removed.

Unconditional Jumps

Jump statements interrupt the sequential execution of statements, so that execution continues at a different point in the program. A jump destroys automatic variables if the jump destination is outside their scope. There are four statements that cause unconditional jumps in C: break, continue, goto, and return.

The break statement can occur only in the body of a loop or a switch statement, and causes a jump to the first statement after the loop or switch statement in which it is immediately contained:

break;

Thus the break statement can be used to end the execution of a loop statement at any position in the loop body. For example, the while loop in Example 6-7 may be ended either at the user's request (by entering a non-numeric string), or by a numeric value outside the range that the programmer wants to accept.

Example 6-7. The break statement

// Read user input of scores from 0 to 100
// and store them in an array.
// Return value: the number of values stored.
// ------------------------------------------
int getScores( short scores[ ], int len )
{
   int i = 0;
   puts( "Please enter scores between 0 and 100.\n"
         "Press <Q> and <Return> to quit.\n" );
   while ( i < len )
   {
      printf( "Score No. %2d: ", i+1 );
      if ( scanf( "%hd", &scores[i] ) != 1 )
         break;          // No number read: end the loop.
      if ( scores[i] < 0  ||  scores[i] > 100 )
      {
         printf( "%d: Value out of range.\n", scores[i] );
         break;          // Discard this value and end the loop.
      }
      ++i;
   }
   return i;             // The number of values stored.
}

The continue statement can be used only within the body of a loop, and causes the program flow to skip over the rest of the current iteration of the loop:

continue;

In a while or do ... while loop, the program jumps to the next evaluation of the loop's controlling expression. In a for loop, the program jumps to the next evaluation of the third expression in the for statement, containing the operations that are performed after every loop iteration.
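The difference between the loop types can be seen in a small sketch. Both functions below are hypothetical and perform the same task: adding up the non-negative elements of an array.

```c
// for loop: continue jumps to ++i, then to the loop condition.
long sumNonNegative( const int a[], int n )
{
   long sum = 0;
   for ( int i = 0; i < n; ++i )
   {
      if ( a[i] < 0 )
         continue;            // Skip the rest of this iteration.
      sum += a[i];
   }
   return sum;
}

// while loop: continue jumps straight to the loop condition,
// so the index must be advanced before the jump.
long sumNonNegativeWhile( const int a[], int n )
{
   long sum = 0;
   int i = 0;
   while ( i < n )
   {
      int value = a[i++];     // Advance first, or continue would loop forever.
      if ( value < 0 )
         continue;
      sum += value;
   }
   return sum;
}
```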

In Example 6-7, the second

Additional content appearing in this section has been removed.

Chapter 7: Functions

All the instructions of a C program are contained in functions. Each function performs a certain task. A special function name is main(): the function with this name is the first one to run when the program starts. All other functions are subroutines of the main() function (or otherwise dependent procedures, such as call-back functions), and can have any names you wish.

Every function is defined exactly once. A program can declare and call a function as many times as necessary.

The definition of a function consists of a function head (or the declarator), and a function block. The function head specifies the name of the function, the type of its return value, and the types and names of its parameters, if any. The statements in the function block specify what the function does. The general form of a function definition is as follows:

In the function head, name is the function's name, while type consists of at least one type specifier, which defines the type of the function's return value. The return type may be void or any object type, except array types. Furthermore, type may include the function specifier inline, and/or one of the storage class specifiers extern and static.

A function cannot return a function or an array. However, you can define a function that returns a pointer to a function or a pointer to an array.

The parameter declarations are contained in a comma-separated list of declarations of the function's parameters. If the function has no parameters, this list is either empty or contains merely the word void.

The type of a function specifies not only its return type, but also the types of all its parameters. Example 7-1 is a simple function to calculate the volume of a cylinder.

Example 7-1. Function cylinderVolume()

// The  cylinderVolume() function calculates the volume of a cylinder.
// Arguments: Radius of the base circle; height of the cylinder.
// Return value: Volume of the cylinder.
extern double cylinderVolume( double r, double h )
{
   const double pi = 3.1415926536;     // Pi is constant
   return  pi * r * r * h;
}

This function has the name

Additional content appearing in this section has been removed.

Function Definitions

A function cannot return a function or an array. However, you can define a function that returns a pointer to a function or a pointer to an array.
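The declarator syntax for such functions is easier to follow with an example. In this sketch, all of the names (add, mul, selectOp, squares, getSquares) are invented for illustration.

```c
#include <stddef.h>

static double add( double a, double b ) { return a + b; }
static double mul( double a, double b ) { return a * b; }

// selectOp() returns a pointer to a function that takes
// two double arguments and returns double.
double (*selectOp( char symbol ))( double, double )
{
   switch ( symbol )
   {
      case '+': return add;
      case '*': return mul;
      default:  return NULL;  // No matching operation
   }
}

static int squares[4] = { 0, 1, 4, 9 };

// getSquares() returns a pointer to an array of 4 int.
int (*getSquares( void ))[4]
{
   return &squares;
}
```

A caller might write selectOp('+')( 2.0, 3.0 ) to call the selected function, or (*getSquares())[3] to read the last array element. In practice, a typedef for the function pointer type usually makes declarations like these more readable.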

The type of a function specifies not only its return type, but also the types of all its parameters. Example 7-1 is a simple function to calculate the volume of a cylinder.

Example 7-1. Function cylinderVolume()

// The  cylinderVolume() function calculates the volume of a cylinder.
// Arguments: Radius of the base circle; height of the cylinder.
// Return value: Volume of the cylinder.
extern double cylinderVolume( double r, double h )
{
   const double pi = 3.1415926536;     // Pi is constant
   return  pi * r * r * h;
}

This function has the name cylinderVolume, and has two parameters, r and h, both with type double. It returns a value with the type double.

The function in Example 7-1 is declared with the storage class specifier extern. This is not strictly necessary, since extern is the default storage class for functions. An ordinary function definition that does not contain a static or inline specifier can be placed in any source file of a program. Such a function is available in all of the program's source files, because its name is an external identifier (or in strict terms, an identifier with external linkage: see "Linkage of Identifiers" in Chapter 11). You merely have to declare the function before its first use in a given translation unit (see the section "Function Declarations," later in this chapter). Furthermore, you can arrange functions in any order you wish within a source file. The only restriction is that you cannot define one function within another. C does not allow you to define "local functions" in this way.

Additional content appearing in this section has been removed.

Function Declarations

By declaring a function before using it, you inform the compiler of its type: in other words, a declaration describes a function's interface. A declaration must indicate at least the type of the function's return value, as the following example illustrates:

int rename();

This line declares rename() as a function that returns a value with type int. Because function names are external identifiers by default, that declaration is equivalent to this one:

extern int rename();

As it stands, this declaration does not include any information about the number and the types of the function's parameters. As a result, the compiler cannot test whether a given call to this function is correct. If you call the function with arguments that are different in number or type from the parameters in its definition, the behavior is undefined, and the result is typically a critical runtime error. To prevent such errors, you should always declare a function's parameters as well. In other words, your declaration should be a function prototype. The prototype of the standard library function rename(), for example, which changes the name of a file, is as follows:

int rename( const char *oldname, const char *newname );

This function takes two arguments with type pointer to const char. In other words, the function uses the pointers only to read char objects. The arguments may thus be string literals.

The identifiers of the parameters in a prototype declaration are optional. If you include the names, their scope ends with the prototype itself. Because they have no meaning to the compiler, they are practically no more than comments telling programmers what each parameter's purpose is. In the prototype declaration of rename(), for example, the parameter names oldname and newname indicate that the old filename goes first and the new filename second in your rename() function calls. To the compiler, the prototype declaration would have exactly the same meaning without the parameter names:

int rename( const char *, const char * );
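The same applies to your own functions. In the following sketch, the hypothetical function average() is declared by a prototype without parameter names, called, and only then defined.

```c
// Prototype without parameter names: the compiler needs only the types.
double average( const double *, int );

// The call can precede the definition, because the prototype
// has already told the compiler the function's type.
double averageOfTwo( double a, double b )
{
   double pair[2] = { a, b };
   return average( pair, 2 );   // Arguments are checked against the prototype.
}

// The definition itself must name its parameters.
double average( const double *values, int count )
{
   double sum = 0.0;
   for ( int i = 0; i < count; ++i )
      sum += values[i];
   return count > 0 ? sum / count : 0.0;
}
```

With the prototype in scope, a call such as average( 2, pair ) is diagnosed at compile time rather than failing at run time.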

The prototypes of the standard library functions are contained in the standard header files. If you want to call the

Additional content appearing in this section has been removed.

How Functions Are Executed

The instruction to execute a function—the function call—consists of the function's name and the operator () (see the section "Other Operators" in Chapter 5). For example, the following statement calls the function maximum() to compute the maximum of the matrix mat, which has r rows and c columns:

maximum( r, c, mat );

The program first allocates storage space for the parameters, then copies the argument values to the corresponding locations. Then the program jumps to the beginning of the function, and execution of the function begins with the first variable definition or statement in the function block.

If the program reaches a return statement or the closing brace } of the function block, execution of the function ends, and the program jumps back to the calling function. If the program "falls off the end" of the function by reaching the closing brace, the value returned to the caller is undefined. For this reason, you must use a return statement to stop any function that does not have the type void. The value of the return expression is returned to the calling function (see the section "The return Statement" in Chapter 6).
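As a sketch, here is one way a function matching the call maximum( r, c, mat ) might be defined. The definition is an assumption made for illustration: the parameter names, the element type double, and the use of a C99 variable-length array parameter are not taken from the text above. It also assumes that the matrix has at least one row and one column.

```c
// Returns the largest element of a matrix
// with nrows rows and ncols columns.
double maximum( int nrows, int ncols, double matrix[nrows][ncols] )
{
   double max = matrix[0][0];        // Start with the first element.
   for ( int r = 0; r < nrows; ++r )
      for ( int c = 0; c < ncols; ++c )
         if ( matrix[r][c] > max )
            max = matrix[r][c];
   return max;                       // Every path through the function
                                     // ends in a return statement.
}
```

The parameters nrows and ncols are local copies of the caller's arguments, while the array parameter gives the function access to the caller's matrix.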

Additional content appearing in this section has been removed.

Pointers as Arguments and Return Values

C is inherently a call by value language, as the parameters of a function are local variables initialized with the argument values. This type of language has the advantage that any expression desired can be used as an argument, as long as it has the appropriate type. On the other hand, the drawback is that copying large data objects to begin a function call can be expensive. Moreover, a function has no way to modify the originals—that is, the caller's variables—as it knows how to access only the local copy.

However, a function can directly access any variable visible to the caller if one of its arguments is that variable's address. In this way C also provides call by reference functions. A simple example is the standard function scanf(), which reads the standard input stream and places the results in variables referenced by pointer arguments that the caller provides:

int var;
scanf( "%d", &var );

This function call reads a string as a decimal numeral, converts it to an integer, and stores the value in the location of var.
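The contrast between the two calling styles can be shown with a pair of small functions. Both are hypothetical and exist only for this sketch.

```c
// Call by value: the function changes only its local copy.
int incrementCopy( int n )
{
   ++n;
   return n;
}

// Call by reference: the function changes the caller's variable
// through the pointer it receives.
void incrementInPlace( int *pn )
{
   ++*pn;
}
```

After int x = 5;, the call incrementCopy( x ) returns 6 but leaves x unchanged, whereas incrementInPlace( &x ) sets x itself to 6.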

In the following example, the initNode() function initializes a structure variable. The caller passes the structure's address as an argument.

#include <string.h>                 // Prototypes of memset() and strcpy().
struct Node { long key;
              char name[32];
              /* ... more structure members ... */
              struct Node *next;
            };
void initNode( struct Node *pNode )      // Initialize the structure *pNode.
{
  memset( pNode, 0, sizeof(*pNode) );
  strcpy( pNode->name, "XXXXX" );
}

Even if a function needs only to read and not to modify a variable, it still may be more efficient to pass the variable's address rather than its value. That's because passing by address avoids the need to copy the data; only the variable's address is pushed onto the stack. If the function does not modify such a variable, then you should declare the corresponding parameter as a "read-only" pointer, as in the following example:

#include <stdio.h>                  // Prototype of printf().
void printNode( const struct Node *pNode )
{
  printf( "Key:  %ld\n", pNode->key );
  printf( "Name: %s\n",  pNode->name );
  /* ... */
}

Additional content appearing in this section has been removed.

Inline Functions

Ordinarily, calling a function causes the computer to save its current instruction address, jump to the function called and execute it, then make the return jump to the saved address. With small functions that you need to call often, this can degrade the program's run-time behavior substantially. As a result, C99 has introduced the option of defining inline functions. The keyword inline is a request to the compiler to insert the function's machine code wherever the function is called in the program. The result is that the function is executed as efficiently as if you had inserted the statements from the function body in place of the function call in the source code.

To define a function as an inline function, use the function specifier inline in its definition. In Example 7-7, swapf() is defined as an inline function that exchanges the values of two float variables, and the function selection_sortf() calls the inline function swapf().

Example 7-7. Function swapf()

// The function swapf() exchanges the values of two float variables.
// Arguments:    Two pointers to float.
// Return value: None.
inline void swapf( float *p1, float *p2 ) // Define it as an inline function.
{
   float tmp = *p1; *p1 = *p2; *p2 = tmp;
}
// The function selection_sortf() uses the selection-sort
// algorithm to sort an array of float elements.
// Arguments:    An array of float, and its length.
// Return value: None.
void selection_sortf( float a[ ], int n )  // Sort an array a of length n.
{
  register int i, j, mini;                // Three index variables.
  for ( i = 0;  i < n - 1;  ++i )
  {
    mini = i;                 // Search for the minimum starting at index i.
    for ( j = i+1;  j < n;  ++j )
      if ( a[j] < a[mini] )
        mini = j;
    swapf( a+i, a+mini );     // Swap the minimum with the element at index i.
  }
}

It is generally not a good idea to define a function containing loops, such as selection_sortf(), as inline. Example 7-7 uses inline instead to speed up the instructions inside a for loop.

The inline specifier is not imperative: the compiler may ignore it. Recursive functions, for example, are usually not compiled inline. It is up to the given compiler to determine when a function defined with

Additional content appearing in this section has been removed.

Recursive Functions

A recursive function is one that calls itself, whether directly or indirectly. Indirect recursion means that a function calls another function (which may call a third function, and so on), which in turn calls the first function. Because a function cannot continue calling itself endlessly, recursive functions must always have an exit condition.
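Indirect recursion can be sketched with two functions that call each other. The functions isEven() and isOdd() are invented for this illustration and are not a practical way to test parity, since the recursion depth grows with n.

```c
#include <stdbool.h>

bool isOdd( unsigned int n );   // Prototype: isEven() calls isOdd()
                                // before isOdd() is defined.

bool isEven( unsigned int n )
{
   if ( n == 0 ) return true;   // Exit condition
   return isOdd( n - 1 );       // Indirect recursion via isOdd()
}

bool isOdd( unsigned int n )
{
   if ( n == 0 ) return false;  // Exit condition
   return isEven( n - 1 );      // Indirect recursion via isEven()
}
```

Each call reduces n by one, so the chain of calls is certain to reach the exit condition n == 0.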

In Example 7-8, the recursive function binarySearch() implements the binary search algorithm to find a specified element in a sorted array. First the function compares the search criterion with the middle element in the array. If they are the same, the function returns a pointer to the element found. If not, the function searches in whichever half of the array could contain the specified element by calling itself recursively. If the length of the array that remains to be searched reaches zero, then the specified element is not present, and the recursion is aborted.

Example 7-8. Function binarySearch()

#include <stddef.h>      // Definition of NULL.
// The binarySearch() function searches a sorted array.
// Arguments:    The value of the element to find;
//               the array of long to search; the array length.
// Return value: A pointer to the element found,
//               or NULL if the element is not present in the array.
long *binarySearch( long val, long array[ ], int n )
{
  int m = n/2;
  if ( n <= 0 )          return NULL;
  if ( val == array[m] ) return array + m;
  if ( val <  array[m] ) return binarySearch( val, array, m );
  else                   return binarySearch( val, array+m+1, n-m-1 );
}

For an array of n elements, the binary search algorithm performs at most 1+log₂(n) comparisons. With a million elements, the maximum number of comparisons performed is 20, which means at most 20 recursions of the binarySearch() function.

Recursive functions depend on the fact that a function's automatic variables are created anew on each recursive call. These variables, and the caller's address for the return jump, are stored on the stack with each recursion of the function that begins. It is up to the programmer to make sure that there is enough space available on the stack. The

Additional content appearing in this section has been removed.

Variable Numbers of Arguments

C allows you to define functions that you can call with a variable number of arguments. These are sometimes called variadic functions. Such functions require a fixed number of mandatory arguments, followed by a variable number of optional arguments. Each such function must have at least one mandatory argument. The types of the optional arguments can also vary. The number of optional arguments is determined either by the values of the mandatory arguments or by a special value that terminates the list of optional arguments.

The best-known examples of variadic functions in C are the standard library functions printf() and scanf(). Each of these two functions has one mandatory argument, the format string. The conversion specifiers in the format string determine the number and the types of the optional arguments.

For each mandatory argument, the function head shows an appropriate parameter, as in ordinary function declarations. These are followed in the parameter list by a comma and an ellipsis (...), which stands for the optional arguments.

Internally, variadic functions access any optional arguments through an object with the type va_list, which contains the argument information. An object of this type—also called an argument pointer—contains at least the position of one argument on the stack. The argument pointer can be advanced from one optional argument to the next, allowing a function to work through the list of optional arguments. The type va_list is defined in the header file stdarg.h.

When you write a function with a variable number of arguments, you must define an argument pointer with the type va_list in order to read the optional arguments. In the following description, the va_list object is named argptr. You can manipulate the argument pointer using four macros, which are defined in the header file stdarg.h:

void va_start( va_list argptr, lastparam );

The macro va_start initializes the argument pointer argptr with the position of the first optional argument. The macro's second argument must be the name of the function's last named parameter. You must call this macro before your function can use the optional arguments.
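A minimal sketch of a variadic function follows. The function sumInts() is hypothetical; besides va_start, it uses the standard stdarg.h macros va_arg, to fetch each optional argument, and va_end, to clean up. Here the mandatory argument count determines the number of optional arguments.

```c
#include <stdarg.h>

// Adds up count optional arguments, all of which must have type int.
long sumInts( int count, ... )
{
   va_list argptr;
   long total = 0;

   va_start( argptr, count );           // count is the last named parameter.
   for ( int i = 0; i < count; ++i )
      total += va_arg( argptr, int );   // Read the next optional argument
                                        // as an int.
   va_end( argptr );                    // Clean up before returning.

   return total;
}
```

A call such as sumInts( 3, 10, 20, 30 ) yields 60. The compiler cannot check the optional arguments, so the caller is responsible for passing exactly count values of type int.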

Additional content appearing in this section has been removed.

Chapter 8: Arrays

An array contains objects of a given type, stored consecutively in a contiguous memory block. The individual objects are called the elements of an array. The elements' type can be any object type. No other types are permissible: array elements may not have a function type or an incomplete type (see the section "Typology" in Chapter 2).

An array is also an object itself, and its type is derived from its elements' type. More specifically, an array's type is determined by the type and number of elements in the array. If an array's elements have type T, then the array is called an "array of T." If the elements have type int, for example, then the array's type is "array of int." The type is an incomplete type, however, unless it also specifies the number of elements. If an array of int has 16 elements, then it has a complete object type, which is "array of 16 int elements."

The definition of an array determines its name, the type of its elements, and the number of elements in the array. An array definition without any explicit initialization has the following syntax:

type name[ number_of_elements ];

The number of elements, between square brackets ([ ]), must be an integer expression whose value is greater than zero. An example:

char buffer[4*512];

This line defines an array with the name buffer, which consists of 2,048 elements of type char.

You can determine the size of the memory block that an array occupies using the sizeof operator. The array's size in memory is always equal to the size of one element times the number of elements in the array. Thus, for the array buffer in our example, the expression sizeof(buffer) yields the value of 2048 * sizeof(char); in other words, the array buffer occupies 2,048 bytes of memory, because sizeof(char) always equals one.
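The same relationship can be turned around to compute the number of elements in an array. In this sketch, the array samples and the three helper functions are invented for illustration.

```c
#include <stddef.h>

char   buffer[4*512];
double samples[16];

// Number of elements = size of the array / size of one element.
size_t bufferLength( void )  { return sizeof(buffer)  / sizeof(buffer[0]); }
size_t samplesLength( void ) { return sizeof(samples) / sizeof(samples[0]); }

// Size in bytes = number of elements * size of one element.
size_t samplesBytes( void )  { return sizeof(samples); }
```

This idiom works only where the array itself is visible. It does not work on a pointer to the first element, such as a function parameter.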

In an array definition, you can specify the number of elements as a constant expression, or, under certain conditions, as an expression involving variables. The resulting array is accordingly called a fixed-length or a variable-length array.

Most array definitions specify the number of array elements as a constant expression. An array so defined has a fixed length. Thus the array

Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!

Defining Arrays

Most array definitions specify the number of array elements as a constant expression. An array so defined has a fixed length. Thus the array buffer defined in the previous example is a fixed-length array.

Fixed-length arrays can have any storage class: you can define them outside all functions or within a block, and with or without the storage class specifier static. The only restriction is that no function parameter can be an array. An array argument passed to a function is always converted into a pointer to the first array element (see the section "Arrays as Function Parameters" in Chapter 7).

The four array definitions in the following example are all valid:

int a[10];             // a has external linkage.
static int b[10];      // b has static storage duration and file scope.
void func()
{
  static int c[10];    // c has static storage duration and block scope.
  int d[10];           // d has automatic storage duration.
  /* ... */
}

C99 also allows you to define an array using a nonconstant expression for the number of elements, if the array has automatic storage duration—in other words, if the definition occurs within a block and does not have the specifier

Accessing Array Elements

The subscript operator [ ] provides an easy way to address the individual elements of an array by index. If myArray is the name of an array and i is an integer, then the expression myArray[i] designates the array element with the index i. Array elements are indexed beginning with 0. Thus, if len is the number of elements in an array, the last element of the array has the index len-1 (see the section "Memory Addressing Operators" in Chapter 5).

The following code fragment defines the array myArray and assigns a value to each element.

#define A_SIZE 4
long myArray[A_SIZE];
for ( int i = 0;  i < A_SIZE;  ++i )
  myArray[i] = 2 * i;

The diagram in Figure 8-1 illustrates the result of this assignment loop.

Figure 8-1: Values assigned to elements by index

An array index can be any integer expression desired. The subscript operator [ ] does not bring any range checking with it; C gives priority to execution speed in this regard. It is up to you, the programmer, to ensure that an index does not exceed the range of permissible values. The following incorrect example assigns a value to a memory location outside the array:

long myArray[4];
myArray[4] = 8;         // Error: subscript must not exceed 3.

Such "off-by-one" errors can easily cause a program to crash, and are not always as easy to recognize as in this simple example.
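Because the language performs no range checking, a program can do the check itself wherever an index comes from an untrusted source. The following is a minimal sketch of that idea; the function name set_element and its parameters are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: store value in arr[index] only if index is in range.
// Returns true on success, or false if the index is out of bounds.
bool set_element(long *arr, size_t len, size_t index, long value)
{
    assert(arr != NULL);        // A null array is a programming error.
    if (index >= len)
        return false;           // Refuse to write outside the array.
    arr[index] = value;
    return true;
}
```

With a four-element array, a call with the index 3 succeeds, while a call with the index 4 returns false and leaves memory untouched. The price is one comparison per access, which is exactly the cost that the subscript operator avoids.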

Another way to address array elements, as an alternative to the subscript operator, is to use pointer arithmetic. After all, the name of an array is implicitly converted into a pointer to the first array element in all expressions except when it is the operand of the sizeof operator or the address operator &. For example, the expression myArray+i yields a pointer to the element with the index i, and the expression *(myArray+i) is equivalent to myArray[i] (see the section "Pointer arithmetic" in Chapter 5).

The following loop statement uses a pointer instead of an index to step through the array myArray, and doubles the value of each element:

for ( long *p = myArray; p < myArray + A_SIZE; ++p )
  *p *= 2;
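The equivalence of the two notations can be checked directly. The sketch below wraps the pointer loop in a function and compares the addresses produced by index and pointer notation; the function names double_elements and same_element are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Double every element, stepping through the array with a pointer.
// The loop condition compares the pointer p itself (not the value *p)
// with the address one past the last element.
void double_elements(long *first, size_t len)
{
    assert(first != NULL);
    for (long *p = first; p < first + len; ++p)
        *p *= 2;
}

// Returns 1 if index notation and pointer notation designate
// the same element, that is, if &arr[i] equals arr + i.
int same_element(long *arr, size_t i)
{
    return &arr[i] == arr + i;
}
```

Applied to an array that holds 0, 2, 4, 6, the function double_elements() leaves it holding 0, 4, 8, 12.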

Initializing Arrays

If you do not explicitly initialize an array variable, the usual rules apply: if the array has automatic storage duration, then its elements have undefined values. Otherwise, all elements are initialized by default to the value 0. (If the elements are pointers, they are initialized to NULL.) For more details, see the section "Initialization" in Chapter 11.

To initialize an array explicitly when you define it, you must use an initialization list: this is a comma-separated list of initializers, or initial values for the individual array elements, enclosed in braces. An example:

int a[4] = { 1, 2, 4, 8 };

This definition gives the elements of the array a the following initial values:

a[0] = 1,  a[1] = 2,  a[2] = 4,  a[3] = 8

When you initialize an array, observe the following rules:

You cannot include an initialization in the definition of a variable-length array.
If the array has static storage duration, then the array initializers must be constant expressions. If the array has automatic storage duration, then you can use variables in its initializers.
You may omit the length of the array in its definition if you supply an initialization list. The array's length is then determined by the index of the last array element for which the list contains an initializer. For example, the definition of the array a in the previous example is equivalent to this:
```
int a[ ] = { 1, 2, 4, 8 };     // An array with four elements.
```
If the definition of an array contains both a length specification and an initialization list, then the length is that specified by the expression between the square brackets. Any elements for which there is no initializer in the list are initialized to zero (or NULL, for pointers). If the list contains more initializers than the array has elements, the superfluous initializers are simply ignored, although the C standard treats this as a constraint violation, so compilers typically issue a diagnostic.
A superfluous comma after the last initializer is also ignored.
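Several of these rules can be observed in a short program. The following is a sketch only; the function names deduced_length, element_of_partial, and sum_of_pair are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// The length is deduced from the initialization list: four initializers.
size_t deduced_length(void)
{
    int a[] = { 1, 2, 4, 8, };          // The trailing comma is permitted.
    return sizeof(a) / sizeof(a[0]);
}

// Elements without an initializer are set to zero.
int element_of_partial(size_t index)
{
    int a[4] = { 1, 2 };                // a[2] and a[3] become 0.
    assert(index < 4);
    return a[index];
}

// An array with automatic storage duration may use variables
// in its initializers.
int sum_of_pair(int x, int y)
{
    int pair[2] = { x, 2 * y };
    return pair[0] + pair[1];
}
```

Here deduced_length() returns 4, element_of_partial(3) returns 0, and sum_of_pair(1, 2) returns 5.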

As a result of these rules, all of the following definitions are equivalent:

int a[4] = { 1, 2 };
int a[ ]  = { 1, 2, 0, 0 };
int a[ ]  = { 1, 2, 0, 0, };
int a[4] = { 1, 2, 0, 0, 5 };

In the final definition, the initializer

Strings

A string is a continuous sequence of characters terminated by '\0', the null character. The length of a string is considered to be the number of characters excluding the terminating null character. There is no string type in C, and consequently there are no operators that accept strings as operands.

Instead, strings are stored in arrays whose elements have the type char or wchar_t. Strings of wide characters (that is, characters of the type wchar_t) are also called wide strings. The C standard library provides numerous functions to perform basic operations on strings, such as comparing, copying, and concatenating them (see the section "String Processing" in Chapter 16).

You can initialize arrays of char or wchar_t using string literals. For example, the following two array definitions are equivalent:

char str1[30] = "Let's go";       // String length: 8; array length: 30.
char str1[30] = { 'L', 'e', 't', '\'', 's',' ', 'g', 'o', '\0' };

An array holding a string must always be at least one element longer than the string length to accommodate the terminating null character. Thus the array str1 can store strings up to a maximum length of 29. It would be a mistake to define the array with length 8 rather than 30, because then it wouldn't contain the terminating null character.

If you define a character array without an explicit length and initialize it with a string literal, the array created is one element longer than the string length. An example:

char str2[ ] = " to London!";      // String length: 11 (note leading space);
                                  // array length: 12.
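The difference between string length and array length is easy to verify with the standard function strlen() and the sizeof operator. The sketch below reuses the two arrays defined above; the four helper functions are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

char str1[30] = "Let's go";
char str2[ ]  = " to London!";

// String length: the number of characters before the terminating '\0'.
size_t str1_length(void) { return strlen(str1); }
size_t str2_length(void) { return strlen(str2); }

// Array length: for char arrays, the same as the size in bytes.
size_t str1_size(void) { return sizeof(str1); }

size_t str2_size(void)
{
    // The last element of str2 is the terminating null character.
    assert(str2[sizeof(str2) - 1] == '\0');
    return sizeof(str2);
}
```

With these definitions, the string lengths are 8 and 11, and the array lengths are 30 and 12.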

The following statement uses the standard function strcat() to append the string in str2 to the string in str1. The array str1 must be large enough to hold all the characters in the concatenated string.

#include <string.h>
char str1[30] = "Let's go";
char str2[ ] = " to London!";
/* ... */
strcat( str1, str2 );
puts( str1 );

The output printed by the puts() call is the new content of the array str1:

Let's go to London!
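The function strcat() does not itself check whether the destination array is large enough. One way to guard against overflow is to compare the lengths first, as in this sketch; the function name append_checked and its capacity parameter are assumptions, not part of the standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: append src to the string in dest only if the
// result, including the terminating null character, fits in an array
// of capacity elements. Returns false and leaves dest unchanged otherwise.
bool append_checked(char *dest, size_t capacity, const char *src)
{
    assert(dest != NULL && src != NULL);
    size_t used = strlen(dest);
    size_t need = strlen(src);
    if (used >= capacity || need > capacity - used - 1)
        return false;                   // Not enough room.
    strcat(dest, src);
    return true;
}
```

A call such as append_checked( str1, sizeof(str1), str2 ) succeeds for the arrays in the example, because 8 + 11 characters plus the null character fit in 30 elements.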

The names str1 and str2 are pointers to the first character of the string stored in each array. Such a pointer is called a

Multidimensional Arrays

A multidimensional array in C is merely an array whose elements are themselves arrays. The elements of an n-dimensional array are (n-1)-dimensional arrays. For example, each element of a two-dimensional array is a one-dimensional array. The elements of a one-dimensional array, of course, do not have an array type.

A multidimensional array declaration has a pair of brackets for each dimension:

char screen[10][40][80];      // A three-dimensional array.

The array screen consists of the 10 elements screen[0] to screen[9]. Each of these elements is a two-dimensional array, consisting in turn of 40 one-dimensional arrays of 80 characters each. All in all, the array screen contains 32,000 elements with the type char.

To access a char element in the three-dimensional array screen, you must specify three indices. For example, the following statement writes the character Z in the last char element of the array:

screen[9][39][79] = 'Z';
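The nesting of the dimensions also shows up in the sizes that sizeof reports for the array and its elements. The following sketch illustrates this; the helper functions and their names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

char screen[10][40][80];

size_t screen_bytes(void) { return sizeof(screen); }        // 10 * 40 * 80
size_t plane_bytes(void)  { return sizeof(screen[0]); }     // 40 * 80
size_t line_bytes(void)   { return sizeof(screen[0][0]); }  // 80

// Write one character, checking all three indices first.
void put_char(size_t page, size_t row, size_t col, char c)
{
    assert(page < 10 && row < 40 && col < 80);
    screen[page][row][col] = c;
}

// Read one character back.
char get_char(size_t page, size_t row, size_t col)
{
    assert(page < 10 && row < 40 && col < 80);
    return screen[page][row][col];
}
```

Because sizeof(char) is 1, the three size functions return 32000, 3200, and 80.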

Two-dimensional arrays are also called matrices. Because they are so frequently used, they merit a closer look. It is often helpful to think of the elements of a matrix as being arranged in rows and columns. Thus the matrix mat in the following definition has three rows and five columns:

float mat[3][5];

The three elements mat[0], mat[1], and mat[2] are the rows of the matrix mat. Each of these rows is an array of five float elements. Thus the matrix contains a total of 3 × 5 = 15 float elements, as the following diagram illustrates:

	0	1	2	3	4
mat[0]	0.0	0.1	0.2	0.3	0.4
mat[1]	1.0	1.1	1.2	1.3	1.4
mat[2]	2.0	2.1	2.2	2.3	2.4

The values specified in the diagram can be assigned to the individual elements by a nested loop statement. The first index specifies a row, and the second index addresses a column in the row:

for ( int row = 0;  row < 3;  ++row )
  for ( int col = 0;  col < 5;  ++col )
    mat[row][col] = row + (float)col/10;

In memory, the three rows are stored consecutively, since they are the elements of the array mat. As a result, the float values in this matrix are all arranged consecutively in memory in ascending order.
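One way to observe this layout is to copy the bytes of the matrix into a one-dimensional array and inspect the result. The sketch below does so with memcpy(); the macros ROWS and COLS and the function name flat_value are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ROWS 3
#define COLS 5

// Fill the matrix as in the nested loop above, copy its bytes into a
// one-dimensional array, and return the element at flat position k.
float flat_value(size_t k)
{
    float mat[ROWS][COLS];
    float flat[ROWS * COLS];

    assert(k < (size_t)(ROWS * COLS));
    for (int row = 0; row < ROWS; ++row)
        for (int col = 0; col < COLS; ++col)
            mat[row][col] = row + (float)col / 10;

    assert(sizeof(flat) == sizeof(mat));    // The rows are stored without gaps.
    memcpy(flat, mat, sizeof(mat));
    return flat[k];                         // Equals mat[k / COLS][k % COLS].
}
```

The flat positions 0, 5, and 10 hold the first elements of the three rows: 0.0, 1.0, and 2.0.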

In an array declaration that is not a definition, the array type can be incomplete; you can declare an array without specifying its length. Such a declaration is a reference to an array that you must define with a specified length elsewhere in the program. However, you must always declare the complete type of an array's elements. For a multidimensional array declaration, only the first dimension can have an unspecified length. All other dimensions must have a magnitude. In declaring a two-dimensional matrix, for example, you must always specify the number of columns.

Arrays as Arguments of Functions

When the name of an array appears as a function argument, the compiler implicitly converts it into a pointer to the array's first element. Accordingly, the corresponding parameter of the function is always a pointer to the same object type as the type of the array elements.

You can declare the parameter either in array form or in pointer form: type name[ ] or type *name. The strcat() function defined in Example 8-1 illustrates the pointer notation. For more details and examples, see the section "Arrays as Function Parameters" in Chapter 7. Here, however, we'll take a closer look at the case of multidimensional arrays.

When you pass a multidimensional array as a function argument, the function receives a pointer to an array type. Because this array type is the type of the elements of the outermost array dimension, it must be a complete type. For this reason, you must specify all dimensions of the array elements in the corresponding function parameter declaration.

For example, the type of a matrix parameter is a pointer to a "row" array, and the length of the rows (i.e., the number of "columns") must be included in the declaration. More specifically, if NCOLS is the number of columns, then the parameter for a matrix of float elements can be declared as follows:

#define NCOLS 10                              // The number of columns.
/* ... */
void somefunction( float (*pMat)[NCOLS] );    // A pointer to a row array.

This declaration is equivalent to the following:

void somefunction( float pMat[ ][NCOLS] );

The parentheses in the parameter declaration float (*pMat)[NCOLS] are necessary in order to declare a pointer to an array of float. Without them, float *pMat[NCOLS] would declare the identifier pMat as an array whose elements have the type float*, or pointer to float. See the section "Complex Declarators" in Chapter 11.
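A function with such a parameter might look like the following sketch. The function name matrix_sum and the extra parameter nrows are assumptions; the number of rows must be passed separately because the pointer type carries only the row length.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 10                    // The number of columns.

// Sum all elements of a matrix with nrows rows and NCOLS columns.
// pMat points to the first row; pMat[r] is row r, an array of NCOLS floats.
float matrix_sum(float (*pMat)[NCOLS], size_t nrows)
{
    assert(pMat != NULL);
    float sum = 0.0f;
    for (size_t r = 0; r < nrows; ++r)
        for (size_t c = 0; c < NCOLS; ++c)
            sum += pMat[r][c];
    return sum;
}
```

A caller that has defined float m[2][NCOLS] can pass the array name directly, as in matrix_sum( m, 2 ).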

In C99, parameter declarations can contain variable-length arrays. Thus in a declaration of a pointer to a matrix, the number of columns need not be constant, but can be another parameter of the function. For example, you can declare a function as follows:

Chapter 9: Pointers

A pointer is a reference to a data object or a function. Pointers have many uses: defining "call-by-reference" functions, and implementing dynamic data structures such as linked lists and trees, to name just two examples.

Very often the only efficient way to manage large volumes of data is to manipulate not the data itself, but pointers to the data. For example, if you need to sort a large number of large records, it is often more efficient to sort a list of pointers to the records, rather than moving the records themselves around in memory. Similarly, if you need to pass a large record to a function, it's more economical to pass a pointer to the record than to pass the record contents, even if the function doesn't modify the contents.

A pointer represents both the address and the type of an object or function. If an object or function has the type T, then a pointer to it has the derived type pointer to T. For example, if var is a float variable, then the expression &var—whose value is the address of the float variable—has the type pointer to float, or in C notation, the type float *. A pointer to any type T is also called a T pointer for short. Thus the address operator in &var yields a float pointer.

Because var doesn't move around in memory, the expression &var is a constant pointer. However, C also allows you to define variables with pointer types. A pointer variable stores the address of another object or a function. We describe pointers to arrays and functions a little further on. To start out, the declaration of a pointer to an object that is not an array has the following syntax:

type * [type-qualifier-list] name [= initializer];

In declarations, the asterisk (*) means "pointer to." The identifier name is declared as an object with the type type *, or pointer to type. The optional type qualifier list may contain any combination of the type qualifiers const, volatile, and restrict. For details about qualified pointer types, see the section "Pointers and Type Qualifiers," later in this chapter.

Here is a simple example:

int *iPtr;          // Declare iPtr as a pointer to int.

Declaring Pointers

The type int is the type of object that the pointer iPtr can point to. To make a pointer refer to a certain object, assign it the address of the object. For example, if iVar is an int variable, then the following assignment makes iPtr point to the variable iVar:

iPtr = &iVar;       // Let iPtr point to the variable iVar.

The general form of a declaration consists of a comma-separated list of declarators, each of which declares one identifier (see Chapter 11). In a pointer declaration, the asterisk (*) is part of an individual declarator. We can thus define and initialize the variables iVar and iPtr in one declaration, as follows:

int iVar = 77, *iPtr = &iVar; // Define an int variable and a pointer to it.

The second of these two declarations initializes the pointer

Operations with Pointers

This section describes the operations that can be performed using pointers. The most important of these operations is accessing the object or function that the pointer refers to. You can also compare pointers, and use them to iterate through a memory block. For a complete description of the individual operators in C, with their precedence and permissible operands, see Chapter 5.

The indirection operator * yields the location in memory whose address is stored in a pointer. If ptr is a pointer, then *ptr designates the object (or function) that ptr points to. Using the indirection operator is sometimes called dereferencing a pointer. The type of the pointer determines the type of object that is assumed to be at that location in memory. For example, when you access a given location using an int pointer, you read or write an object of type int.

Unlike the multiplication operator *, the indirection operator * is a unary operator; that is, it has only one operand. In Example 9-1, ptr points to the variable x. Hence the expression *ptr is equivalent to the variable x itself.

Example 9-1. Dereferencing a pointer

double x, y, *ptr;     // Two double variables and a pointer to double.
ptr = &x;              // Let ptr point to x.
*ptr = 7.8;            // Assign the value 7.8 to the variable x.
*ptr *= 2.5;           // Multiply x by 2.5.
y = *ptr + 0.5;        // Assign y the result of the addition x + 0.5.

Do not confuse the asterisk (*) in a pointer declaration with the indirection operator. The syntax of the declaration can be seen as an illustration of how to use the pointer. An example:

double *ptr;

As declared here, ptr has the type double * (read: "pointer to double"). Hence the expression *ptr would have the type double.

Of course, the indirection operator * must be used only with a pointer that contains a valid address. This usage requires careful programming! Without the assignment ptr = &x in Example 9-1, all of the statements containing *ptr would be senseless (they would dereference an undefined pointer value) and might well cause the program to crash.
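A program cannot detect an uninitialized pointer, but it can test for a null pointer before dereferencing. The sketch below shows two common strategies, a run-time check and an assertion; the function names scale and read_value are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: multiply the object that ptr points to by factor.
// A null pointer is rejected at run time instead of being dereferenced.
bool scale(double *ptr, double factor)
{
    if (ptr == NULL)
        return false;
    *ptr *= factor;
    return true;
}

// Alternative strategy: treat a null pointer as a programming error
// that an assertion catches during development.
double read_value(const double *ptr)
{
    assert(ptr != NULL);
    return *ptr;
}
```

Such checks are useful only if pointers that do not yet refer to an object are consistently initialized with NULL.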

A pointer variable is itself an object in memory, which means that a pointer can point to it. To declare a pointer to a pointer, you must use two asterisks, as in the following example:

Pointers and Type Qualifiers

The declaration of a pointer may contain the type qualifiers const, volatile, and/or restrict. The type qualifiers const and volatile may qualify either the pointer type itself, or the type of object it points to. The difference is important. Those type qualifiers that occur in the pointer's declarator—that is, between the asterisk and the pointer's name—qualify the pointer itself. An example:

short const volatile * restrict ptr;

In this declaration, the keyword restrict qualifies the pointer ptr. This pointer can refer to objects of type short that may be qualified with const or volatile, or both.

An object whose type is qualified with const is constant: the program cannot modify it after its definition. The type qualifier volatile is a hint to the compiler that the object so qualified may be modified not only by the present program, but also by other processes or events (see Chapter 11).

The most common use of qualifiers in pointer declarations is in pointers to constant objects, especially as function parameters. For this reason, the following description refers to the type qualifier const. The same rules govern the use of the type qualifier volatile with pointers.

When you define a constant pointer, you must also initialize it, because you can't modify it later. As the following example illustrates, a constant pointer does not necessarily point to a constant object:

int var;                 // An object with type int.
int *const c_ptr = &var; // A constant pointer to int.
*c_ptr = 123;            // OK: we can modify the object referenced, but ...
++c_ptr;                 // error: we can't modify the pointer.

You can modify a pointer that points to an object that has a const-qualified type (also called a pointer to const). However, you can use such a pointer only to read the referenced object, not to modify it. For this reason, pointers to const are commonly called "read-only pointers." The referenced object itself may or may not be constant. An example:

int var;                     // An object with type int.
const int c_var = 100,       // A constant int object.
          *ptr_to_const;     // A pointer to const int:
                             // the pointer itself is not constant!
ptr_to_const = &c_var;       // OK: Let ptr_to_const point to c_var.
var = 2 * *ptr_to_const;     // OK. Equivalent to: var = 2 * c_var;
ptr_to_const = &var;         // OK: Let ptr_to_const point to var.
if ( c_var < *ptr_to_const ) // OK: "read-only" access.
  *ptr_to_const = 77;        // Error: we can't modify var using
                             // ptr_to_const, even though var is
                             // not constant.
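Read-only pointers appear most often as function parameters, where they document that the function does not modify the caller's data. The following sketch shows such a parameter; the function names sum_elements and read_both are assumptions.

```c
#include <assert.h>
#include <stddef.h>

// Read-only access: the function cannot modify the array elements
// through the parameter values.
long sum_elements(const int *values, size_t len)
{
    assert(values != NULL || len == 0);
    long sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += values[i];
    return sum;
}

// A pointer to const may refer first to a constant object and then
// to a nonconstant one; only writing through it is prohibited.
int read_both(void)
{
    int var = 5;
    const int c_var = 100;
    const int *ptr_to_const = &c_var;
    int total = *ptr_to_const;
    ptr_to_const = &var;
    total += *ptr_to_const;
    return total;                       // 100 + 5
}
```

An ordinary array of int can be passed to sum_elements() without a cast, because a pointer to int is implicitly converted to a pointer to const int.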

Pointers to Arrays and Arrays of Pointers

Pointers occur in many C programs as references to arrays, and also as elements of arrays. A pointer to an array type is called an array pointer for short, and an array whose elements are pointers is called a pointer array.

For the sake of example, the following description deals with an array of int. The same principles apply for any other array type, including multidimensional arrays.

To declare a pointer to an array type, you must use parentheses, as the following example illustrates:

int (* arrPtr)[10] = NULL; // A pointer to an array of
                           // ten elements with type int.

Without the parentheses, the declaration int * arrPtr[10]; would define arrPtr as an array of 10 pointers to int. Arrays of pointers are described in the next section.

In the example, the pointer to an array of 10 int elements is initialized with NULL. However, if we assign it the address of an appropriate array, then the expression *arrPtr yields the array, and (*arrPtr)[i] yields the array element with the index i. According to the rules for the subscript operator, the expression (*arrPtr)[i] is equivalent to *((*arrPtr)+i) (see "Memory Addressing Operators" in Chapter 5). Hence **arrPtr yields the first element of the array, with the index 0.

In order to demonstrate a few operations with the array pointer arrPtr, the following example uses it to address some elements of a two-dimensional array—that is, some rows of a matrix (see "Matrices" in Chapter 8):

int matrix[3][10];       // Array of three rows, each with 10 columns.
                         // The array name is a pointer to the first
                         // element; i.e., the first row.
arrPtr = matrix;         // Let arrPtr point to the first row of
                         // the matrix.
(*arrPtr)[0] = 5;        // Assign the value 5 to the first element of the
                         // first row.
                         //
arrPtr[2][9] = 6;        // Assign the value 6 to the last element of the
                         // last row.
                         //
++arrPtr;                // Advance the pointer to the next row.
(*arrPtr)[0] = 7;        // Assign the value 7 to the first element of the
                         // second row.
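The same operations can be used to process a matrix row by row. In the following sketch, the macro NCOLS and the functions number_rows and row_total are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 10

// Set every element of row r to the value r + 1, advancing the
// array pointer from one row to the next.
void number_rows(int (*arrPtr)[NCOLS], size_t nrows)
{
    assert(arrPtr != NULL);
    for (size_t r = 0; r < nrows; ++r, ++arrPtr)    // ++arrPtr: next row.
        for (size_t i = 0; i < NCOLS; ++i)
            (*arrPtr)[i] = (int)r + 1;
}

// Return the sum of the row that rowPtr points to.
int row_total(int (*rowPtr)[NCOLS])
{
    assert(rowPtr != NULL);
    int total = 0;
    for (size_t i = 0; i < NCOLS; ++i)
        total += (*rowPtr)[i];
    return total;
}
```

For a matrix defined as int matrix[3][NCOLS], the expression matrix + 1 and the expression &matrix[1] both yield a pointer to the second row, so either can be passed to row_total().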

Pointers to Functions

There are a variety of uses for function pointers in C. For example, when you call a function, you might want to pass it not only the data for it to process, but also pointers to subroutines that determine how it processes the data. We have just seen an example of this use: the standard function qsort(), used in Example 9-4, takes a pointer to a comparison function as one of its arguments, in addition to the information about the array to be sorted. qsort() uses the pointer to call the specified function whenever it has to compare two array elements.

You can also store function pointers in arrays, and then call the functions using array index notation. For example, a keyboard driver might use a table of function pointers whose indices correspond to the key numbers. When the user presses a key, the program would jump to the corresponding function.

Like declarations of pointers to array types, function pointer declarations require parentheses. The examples that follow illustrate how to declare and use pointers to functions.

double (*funcPtr)(double, double);

This declaration defines a pointer to a function type with two parameters of type double and a return value of type double. The parentheses that enclose the asterisk and the identifier are important. Without them, the declaration double *funcPtr(double, double); would be the prototype of a function, not the definition of a pointer.

Wherever necessary, the name of a function is implicitly converted into a pointer to the function. Thus the following statements assign the address of the standard function pow() to the pointer funcPtr, and then call the function using that pointer:

double result;
funcPtr = pow;                  // Let funcPtr point to the function pow().
                                // The expression *funcPtr now yields the
                                // function pow().
result = (*funcPtr)( 1.5, 2.0 );  // Call the function referenced by
                                  // funcPtr.
result = funcPtr( 1.5, 2.0 );     // The same function call.

As the last line in this example shows, when you call a function using a pointer, you can omit the indirection operator, because the left operand of the function call operator (i.e., the parentheses enclosing the argument list) has the type "pointer to function" (see "Function Calls" in Chapter 5).
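Function pointers can also be stored in an array and called by index, as in the table-driven design mentioned at the beginning of this section. The following is a minimal sketch with three hypothetical arithmetic functions; the names add, subtract, multiply, operations, and apply are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

static double add(double x, double y)      { return x + y; }
static double subtract(double x, double y) { return x - y; }
static double multiply(double x, double y) { return x * y; }

// A table of constant pointers to functions that take two double
// arguments and return a double.
static double (*const operations[])(double, double) =
    { add, subtract, multiply };

#define N_OPERATIONS  (sizeof(operations) / sizeof(operations[0]))

// Call the function stored at the given table index.
double apply(size_t index, double x, double y)
{
    assert(index < N_OPERATIONS);
    return operations[index](x, y);   // Same as (*operations[index])(x, y).
}
```

With this table, apply( 0, 1.5, 2.0 ) calls add() and yields 3.5, and apply( 2, 1.5, 2.0 ) calls multiply() and yields 3.0.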

Chapter 10: Structures and Unions and Bit-Fields

The pieces of information that describe the characteristics of objects, such as information on companies or customers, are generally grouped together in records. Records make it easy to organize, present, and store information about similar objects.

A record is composed of fields that contain the individual details, such as the name, address, and legal form of a company. In C, you determine the names and types of the fields in a record by defining a structure type. The fields are called the members of the structure.

A union is defined in the same way as a structure. Unlike the members of a structure, all the members of a union start at the same address. Hence you define a union type when you want to use the same location in memory for different types of objects.

In addition to the basic and derived types, the members of structures and unions can also include bit-fields. A bit-field is an integer variable composed of a specified number of bits. By defining bit-fields, you can break down an addressable memory unit into groups of individual bits that you can address by name.

A structure type is a type defined within the program that specifies the format of a record, including the names and types of its members, and the order in which they are stored. Once you have defined a structure type, you can use it like any other type in declaring objects, pointers to those objects, and arrays of such structure elements.

The definition of a structure type begins with the keyword struct, and contains a list of declarations of the structure's members, in braces:

struct [tag_name] { member_declaration_list };

A structure must contain at least one member. The following example defines the type struct Date, which has three members of type short:

struct Date { short year, month, day; };

The identifier Date is this structure type's tag. The identifiers year, month, and day are the names of its members. The tags of structure types are a distinct name space: the compiler distinguishes them from variables or functions whose names are the same as a structure tag. Likewise, the names of structure members form a separate name space for each structure type. In this book, we have generally capitalized the first letter in the names of structure, union, and enumeration types: this is merely a common convention to help programmers distinguish such names from those of variables.

Structures

The members of a structure may have any desired complete type, including previously defined structure types. They must not be variable-length arrays, or pointers to such arrays.

The following structure type, struct Song, has five members to store five pieces of information about a music recording. The member published has the type struct Date, defined in the previous example:

struct Song { char title[64];
              char artist[32];
              char composer[32];
              short duration;          // Playing time in seconds.
              struct Date published;   // Date of publication.
            };
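As a rough illustration of how such a nested structure might be filled and read, the following sketch repeats the two definitions and adds two hypothetical helpers, make_sample_song() and song_minutes(). The sample values are invented.

```c
struct Date { short year, month, day; };

struct Song { char title[64];
              char artist[32];
              char composer[32];
              short duration;          // Playing time in seconds.
              struct Date published;   // Date of publication.
            };

// Hypothetical helper: returns a Song filled with sample values.
// The inner braces initialize the nested member published.
struct Song make_sample_song( void )
{
  struct Song s = { "Sample Title", "Sample Artist", "Sample Composer",
                    215, { 2005, 12, 1 } };
  return s;
}

// Hypothetical helper: playing time in whole minutes.
int song_minutes( const struct Song *pSong )
{
  return pSong->duration / 60;
}
```

Members of the nested structure are reached by chaining the dot operator, as in s.published.year.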

A structure type cannot contain itself as a member, as its definition is not complete until the closing brace (}). However, structure types can and often do contain pointers to their own type. Such

Unions

Unlike structure members, which all have distinct locations in the structure, the members of a union all share the same location in memory; that is, all members of a union start at the same address. Thus you can define a union with many members, but only one member can contain a value at any given time. Unions are an easy way for programmers to use a location in memory in different ways.

The definition of a union is formally the same as that of a structure, except for the keyword union in place of struct:

union [tag_name] { member_declaration_list };

The following example defines a union type named Data which has the three members i, x, and str:

union Data { int i; double x; char str[16]; };

An object of this type can store an integer, a floating-point number, or a short string.

union Data var, myData[100];

This declaration defines var as an object of type union Data, and myData as an array of 100 elements of type union Data. A union is at least as big as its largest member. To obtain the size of a union, use the sizeof operator. In our example, on a system where the 16-byte array str is the largest member and the compiler adds no padding, sizeof(var) yields the value 16, and sizeof(myData) yields 1,600.

As Figure 10-2 illustrates, all the members of a union begin at the same address in memory.

Figure 10-2: An object of the type union Data in memory

To illustrate how unions are different from structures, consider an object of the type struct Record with members i, x, and str, defined as follows:

struct Record { int i; double x; char str[16]; };

As Figure 10-3 shows, each member of a structure object has a separate location in memory.

Figure 10-3: An object of the type struct Record in memory
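The difference between the two layouts can also be checked in code. The following is a minimal sketch that uses the standard offsetof macro; the function names union_members_overlap() and struct_members_distinct() are assumptions for illustration.

```c
#include <stddef.h>                    // Provides offsetof.

union Data    { int i; double x; char str[16]; };
struct Record { int i; double x; char str[16]; };

// Returns 1 if all three members of union Data start at offset 0.
int union_members_overlap( void )
{
  return offsetof( union Data, i ) == 0
      && offsetof( union Data, x ) == 0
      && offsetof( union Data, str ) == 0;
}

// Returns 1 if the members of struct Record lie at increasing offsets.
int struct_members_distinct( void )
{
  return offsetof( struct Record, i ) < offsetof( struct Record, x )
      && offsetof( struct Record, x ) < offsetof( struct Record, str );
}
```

Both functions should return 1 on any conforming implementation. The exact offsets and sizes in the structure depend on the compiler's alignment rules.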

You can access the members of a union in the same ways as structure members. The only difference is that when you change the value of a union member, you modify all the members of the union. Here are a few examples using the union objects var and myData:

var.x = 3.21;
var.x += 0.5;
strcpy( var.str, "Jim" );         // Occupies the place of var.x.
myData[0].i = 50;
for ( int i = 0; i < 50; ++i )
  myData[i].i = 2 * i;

As for structures, the members of each union type form a name space unto themselves. Hence in the last of these statements, the index variable

Bit-Fields

Members of structures or unions can also be bit-fields. A bit-field is an integer variable that consists of a specified number of bits. If you declare several small bit-fields in succession, the compiler packs them into a single machine word. This permits very compact storage of small units of information. Of course, you can also manipulate individual bits using the bitwise operators, but bit-fields offer the advantage of handling bits by name, like any other structure or union member.

The declaration of a bit-field has the form:

type [member_name] : width ;

The parts of this syntax are as follows:

type: An integer type that determines how the bit-field's value is interpreted. The type may be _Bool, int, signed int, unsigned int, or another type defined by the given implementation. The type may also include type qualifiers.
Bit-fields with type signed int are interpreted as signed; bit-fields whose type is unsigned int are interpreted as unsigned. Bit-fields of type int may be signed or unsigned, depending on the compiler.
member_name: The name of the bit-field, which is optional. If you declare a bit-field with no name, though, there is no way to access it. Nameless bit-fields can serve only as padding to align subsequent bit-fields to a certain position in a machine word.
width: The number of bits in the bit-field. The width must be a constant integer expression whose value is non-negative, and must be less than or equal to the bit width of the specified type.

Nameless bit-fields can have zero width. In this case, the next bit-field declared is aligned at the beginning of a new addressable storage unit.

When you declare a bit-field in a structure or union, the compiler allocates an addressable unit of memory that is large enough to accommodate it. Usually the storage unit allocated is a machine word whose size is that of the type int. If the following bit-field fits in the rest of the same storage unit, then it is defined as being adjacent to the previous bit-field. If the next bit-field does not fit in the remaining bits of the same unit, then the compiler allocates another storage unit, and may place the next bit-field at the start of the new unit, or wrap it across the end of one storage unit and the beginning of the next.
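A small sketch may make the declaration syntax more concrete. The structure struct Flags, its member names, and the helper store_priority() are hypothetical and chosen only for illustration.

```c
struct Flags {
  unsigned int ready    : 1;    // One bit: 0 or 1.
  unsigned int priority : 3;    // Three bits: 0 to 7.
  unsigned int          : 0;    // Zero width: the next bit-field starts
                                // in a new storage unit.
  unsigned int count    : 12;   // Twelve bits: 0 to 4095.
};

// Hypothetical helper: stores a value in the 3-bit field and reads it back.
unsigned int store_priority( unsigned int value )
{
  struct Flags f = { 0 };
  f.priority = value;           // Unsigned bit-field: the value is reduced
                                // modulo 8 to fit in three bits.
  return f.priority;
}
```

Values that fit in the field are stored unchanged. Larger values lose their high-order bits, so store_priority(9) yields 1.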

Chapter 11: Declarations

A declaration determines the significance and properties of one or more identifiers. The identifiers you declare can be the names of objects, functions, types, or other things, such as enumeration constants. Identifiers of objects and functions can have various types and scopes. The compiler needs to know all of these characteristics of an identifier before you can use it in an expression. For this reason, each translation unit must contain a declaration of each identifier used in it.

Labels used as the destination of goto statements may be placed before any statement. These identifiers are declared implicitly where they occur. All other identifiers require explicit declaration before their first use, either outside of all functions or at the beginning of a block. In C99, declarations may also appear after statements within a block.

After you have declared an identifier, you can use it in expressions until the end of its scope. The identifiers of objects and functions can have file or block scope (see "Identifier Scope" in Chapter 1).
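The following sketch illustrates these placement rules in a single function. The function sum_to() is a hypothetical example, not taken from the book: it uses a label without any prior declaration and, as C99 permits, declares a variable after a statement.

```c
// Returns the sum of the integers from 1 to n, or 0 if n is not positive.
int sum_to( int n )
{
  int total = 0;          // Declaration at the beginning of the block.
  if ( n <= 0 )
    goto done;            // The label done needs no explicit declaration.

  total = n;              // A statement ...
  int i = n - 1;          // ... followed by a declaration (C99).
  while ( i > 0 )
    total += i--;
done:
  return total;
}
```

The identifier i can be used only from its declaration to the end of the block, while the label done is visible throughout the function.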

There are several different kinds of declarations:

Declarations that only declare a structure, union, or enumeration tag, or the members of an enumeration (that is, the enumeration constants)
Declarations that declare one or more object or function identifiers
typedef declarations, which declare new names for existing types

Declarations of enumerated, structure, and union types are described in Chapter 2 and Chapter 10. This chapter deals mainly with object, function, and typedef declarations. These declarations contain a declarator list with one or more declarators. Each declarator declares a typedef name or an identifier for an object or a function. The general form of this kind of declaration is:

    [typedef | storage_class_specifier] type declarator [, declarator [, ...]];

The parts of this syntax are as follows:

storage_class_specifier: No more than one of the storage class specifiers extern, static, auto, or register. A typedef declaration cannot include a storage class specifier. The exact meanings of the storage class specifiers, and restrictions on their use, are described in "Storage Class Specifiers," later in this section.

General Syntax

type

At least a type specifier, possibly with type qualifiers. The type specifier may be any of these:

A basic type
The type void
An enumerated, structure, or union type
A name defined by a previous typedef declaration

In a function declaration, the function specifier inline may also appear.

type may also contain one or more of the type qualifiers const, volatile, and restrict.

declarator

The declarator list is a comma-separated list containing at least one declarator. A declarator names the identifier that is being declared. If the declarator defines an object, it may also include an initializer for the identifier. There are four different kinds of declarators:

Function declarator: The identifier is declared as a function name if it is immediately followed by a left parenthesis (().
Array declarator: The identifier is declared as an array name if it is immediately followed by a left bracket ([).
Pointer declarator: The identifier is the name of a pointer if it is preceded by an asterisk (

Type Names

To convert a value explicitly from one type to another using the cast operator, you must specify the new type by name. For example, in the cast expression (char *)ptr, the type name is char * (read: "char pointer" or "pointer to char"). When you use a type name as the operand of sizeof, it appears the same way, in parentheses. Function prototype declarations also designate a function's parameters by their type names, even if the parameters themselves have no names.

The syntax of a type name is like that of an object or function declaration, but with no identifier (and no storage class specifier). Two simple examples to start with:

    unsigned char

The type unsigned char.

    unsigned char *

The type "pointer to unsigned char."

In the examples that follow, the type names are more complex. Each type name contains at least one asterisk (*) for "pointer to," as well as parentheses or brackets. To interpret a complex type name, start with the first pair of brackets or parentheses that you find to the right of the last asterisk. (If you were parsing a declarator with an identifier rather than a type name, the identifier would be immediately to the left of those brackets or parentheses.) If the type name includes a function type, then the parameter declarations must be interpreted separately.

float *[ ]: The type "array of pointers to float." The number of elements in the array is undetermined.
float (*)[10]: The type "pointer to an array of ten elements whose type is float."
double *(double *): The type "function whose only parameter has the type pointer to double, and which also returns a pointer to double."
double (*)(): The type "pointer to a function whose return value has the type double." The number and types of the function's parameters are not specified.
int *(*(*)[10])(void): The type "pointer to an array of ten elements whose type is pointer to a function with no parameters which returns a pointer to int."
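To show two of these type names at work, here is a minimal sketch that uses them in cast expressions. The names answer, get_answer(), third_element(), and call_first() are assumptions for illustration, and each function assumes that its argument really points to an object of the matching type.

```c
int answer = 42;
int *get_answer( void ) { return &answer; }

// Uses the type name float (*)[10] in a cast. The result points to a whole
// array of ten floats, so (*p)[2] selects the third element.
float third_element( void *block )
{
  float (*p)[10] = (float (*)[10])block;
  return (*p)[2];
}

// Uses the type name int *(*(*)[10])(void) in a cast. The result points to
// an array of ten function pointers; the first function is called and the
// int that its return value points to is returned.
int call_first( void *table )
{
  int *(*(*pt)[10])(void) = (int *(*(*)[10])(void))table;
  return *(*pt)[0]();
}
```

In each declaration, the identifier sits exactly where the type name has its gap: immediately to the left of the first brackets that follow the innermost asterisk.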

typedef Declarations

The easy way to use types with complex names, such as those described in the previous section, is to declare simple synonyms for them. In a declaration that starts with the keyword typedef, each declarator defines an identifier as a synonym for the specified type. The identifier is then called a typedef name for that type. Except for the keyword typedef, the syntax is exactly the same as for a declaration of an object or function of the specified type. Some examples:

    typedef unsigned int UINT, UINT_FUNC();
    typedef struct Point { double x, y; } Point_t;
    typedef float Matrix_t[3][10];

In the scope of these declarations, UINT is synonymous with unsigned int, and Point_t is synonymous with the structure type struct Point. You can use the typedef names in declarations, as the following examples show:

    UINT ui = 10, *uiPtr = &ui;

The variable ui has the type unsigned int, and uiPtr is a pointer to unsigned int.

    UINT_FUNC *funcPtr;

The pointer funcPtr can refer to a function whose return value has the type unsigned int. The function's parameters are not specified.

    Matrix_t *func( float * );

The function func() has one parameter, whose type is pointer to float, and returns a pointer to the type Matrix_t.

Example 11-1 uses the typedef name of one structure type, Point_t, in the typedef definition of a second structure type.

Example 11-1. typedef declarations

typedef struct Point { double x, y; } Point_t;
typedef struct { Point_t top_left; Point_t bottom_right; } Rectangle_t;
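As a brief sketch of how these typedef names might be used, the following hypothetical function rect_area() computes the area of a Rectangle_t. It assumes that the top-left corner has the smaller x and the larger y coordinate.

```c
typedef struct Point { double x, y; } Point_t;
typedef struct { Point_t top_left; Point_t bottom_right; } Rectangle_t;

// Hypothetical helper: area of an axis-parallel rectangle.
double rect_area( const Rectangle_t *pRect )
{
  double width  = pRect->bottom_right.x - pRect->top_left.x;
  double height = pRect->top_left.y - pRect->bottom_right.y;
  return width * height;
}
```

Because the second structure type has no tag, Rectangle_t is the only name by which a program can refer to it.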

Ordinarily, you would use a header file to hold the definitions of any typedef names that you need to use in multiple source files. However, you must make an exception in the case of typedef declarations for types that contain a variable-length array. Variable-length arrays can only be declared within a block, and the actual length of the array is calculated anew each time the flow of program execution reaches the typedef declaration. An example:

    int func( int size )
    {
      typedef float VLA[size];    // A typedef name for the type "array of float
                                  // whose length is (the value of size)."
      size *= 2;
      VLA temp;                   // An array of float whose length is the value
                                  // that size had when execution reached the
                                  // typedef declaration, not the doubled value.
      /* ... */
    }

Linkage of Identifiers

An identifier that is declared in several translation units, or several times in the same translation unit, may refer to the same object or function in each instance. The extent of an identifier's identity in and among translation units is determined by the identifier's linkage. The term reflects the fact that identifiers in separate source files need to be linked if they are to refer to a common object.

Identifiers in C have either external, internal, or no linkage. The linkage is determined by the declaration's position and storage class specifier, if any. Only object and function identifiers can have external or internal linkage.

An identifier with external linkage represents the same function or object throughout the program. The compiler presents such identifiers to the linker, which resolves them with other occurrences in other translation units and libraries.

Function and object identifiers declared with the storage class specifier extern have external linkage, with one exception: if an identifier has already been declared with internal linkage, a second declaration within the scope of the first cannot change the identifier's linkage to external.

The compiler treats function declarations without a storage class specifier as if they included the specifier extern. Similarly, any object identifiers that you declare outside all functions and without a storage class specifier have external linkage.

An identifier with internal linkage represents the same object or function within a given translation unit. The identifier is not presented to the linker. As a result, you cannot use the identifier in another translation unit to refer to the same object or function.

A function or object identifier has internal linkage if it is declared outside all functions and with the storage class specifier static.

Identifiers with internal linkage do not conflict with similar identifiers in other translation units. However, if a given identifier is declared with external linkage in any translation unit, you cannot declare the same identifier with internal linkage in that translation unit. Or to put it another way, if you declare an identifier with internal linkage in a given translation unit, you cannot also declare and use an external identifier defined in another translation unit with the same spelling.
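The following sketch shows both kinds of linkage within one source file. The identifiers call_count, last_id, bump(), and next_id() are hypothetical.

```c
static int call_count;        // Internal linkage: private to this file.
int last_id;                  // External linkage: declared outside all
                              // functions with no storage class specifier.

static int bump( void )       // Internal linkage: not presented to the linker.
{
  return ++call_count;
}

int next_id( void )           // External linkage, as if declared extern.
{
  extern int last_id;         // Refers to the file-scope object above.
  last_id = bump();
  return last_id;
}
```

Another translation unit could declare extern int last_id; and int next_id( void ); to use these two identifiers, but it would have no way to refer to call_count or bump().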

Storage Duration of Objects

During the execution of the program, each object exists as a location in memory for a certain period, called its lifetime. There is no way to access an object before or after its lifetime. For example, the value of a pointer becomes invalid when the object that it references reaches the end of its lifetime.

In C, the lifetime of an object is determined by its storage duration. Objects in C have one of three kinds of storage duration: static, automatic, or allocated. C does not specify how objects must actually be stored in any particular system architecture, but typically, objects with static storage duration are located in a data segment of the program in memory, while objects with automatic storage duration are located on the stack. Allocated storage is memory that the program obtains at runtime by calling the malloc(), calloc(), and realloc() functions. Dynamic storage allocation is described in Chapter 12.

Objects that are defined outside all functions, or within a function and with the storage class specifier static, have static storage duration. These include all objects whose identifiers have internal or external linkage.

All objects with static storage duration are generated and initialized before execution of the program begins. Their lifetime spans the program's entire runtime.

Objects defined within a function and with no storage class specifier (or with the unnecessary specifier auto) have automatic storage duration. Function parameters also have automatic storage duration. Objects with automatic storage duration are generally called automatic variables for short.

The lifetime of an automatic object is delimited by the braces ({}) that begin and end the block in which the object is defined. Variable-length arrays are an exception: their lifetime begins at the point of declaration, and ends with the identifier's scope—that is, at the end of the block containing the declaration, or when a jump occurs to a point before the declaration.

Each time the flow of program execution enters a block, new instances of any automatic objects defined in the block are generated (and initialized, if the declaration includes an initializer). This fact is important in recursive functions, for example.
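To contrast static and automatic storage duration, consider the following sketch. The functions count_calls() and factorial() and their variable names are assumptions chosen for illustration.

```c
// The object calls has static storage duration: it is initialized once,
// before the program starts, and keeps its value between calls.
int count_calls( void )
{
  static int calls = 0;
  int fresh = 0;              // Automatic: a new instance on every call.
  ++fresh;                    // Always 1 at this point.
  calls += fresh;
  return calls;
}

// Each recursive call has its own instance of the automatic object n_copy.
unsigned long factorial( unsigned int n )
{
  unsigned int n_copy = n;
  if ( n_copy < 2 )
    return 1;
  return n_copy * factorial( n_copy - 1 );
}
```

If n_copy had static storage duration instead, all levels of the recursion would share a single object, and the function would no longer compute the factorial.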

Initialization

You can explicitly specify an object's initial value by including an initializer in its definition. An object defined without an initializer either has an undetermined initial value, or is implicitly initialized by the compiler.

Objects with automatic storage duration have an undetermined initial value if their definition does not include an initializer. Function parameters, which also have automatic storage duration, are initialized with the argument values when the function call occurs. All other objects have static storage duration, and are implicitly initialized with the default value 0, unless their definition includes an explicit initializer. Or, to put it more exactly:

Objects with an arithmetic type have the default initial value 0.
The default initial value of pointer objects is a null pointer (see "Initializing Pointers" in Chapter 9).

The compiler applies these rules recursively in initializing array elements, structure members, and the first members of unions.
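These default values can be verified with a short sketch like the one below. All of the type and object names in it (struct Entry, union Value, counter, cursor, table, val, and defaults_are_zero()) are hypothetical.

```c
#include <stddef.h>                  // Provides NULL.

struct Entry { int id; double weight; const char *name; };
union Value  { int i; double x; };

static int counter;                  // Arithmetic type: starts at 0.
static double *cursor;               // Pointer: starts as a null pointer.
static struct Entry table[4];        // Rules applied to every member of
                                     // every array element.
static union Value val;              // The first member, i, starts at 0.

// Returns 1 if all of the objects above have their default initial values.
int defaults_are_zero( void )
{
  int ok = ( counter == 0 ) && ( cursor == NULL ) && ( val.i == 0 );
  for ( int k = 0; k < 4; ++k )
    ok = ok && table[k].id == 0
            && table[k].weight == 0.0
            && table[k].name == NULL;
  return ok;
}
```

The same objects declared inside a function without the specifier static would have undetermined initial values instead.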

An initializer in an object definition specifies the object's initial value explicitly. The initializer is appended to the declarator for the object's identifier with an equals sign (=). The initializer can be either a single expression or a list of initializer expressions enclosed in braces.

For objects with a scalar type, the initializer is a single expression:

    #include <string.h>                // Prototypes of string functions.
    double var = 77, *dPtr = &var;
    int (*funcPtr)( const char*, const char* ) = strcmp;

The initializers here are 77 for the variable var, and &var for the pointer dPtr. The function pointer funcPtr is initialized with the address of the standard library function strcmp().

As in an assignment operation, the initializer must be an expression that can be implicitly converted to the object's type. Thus in the previous example, the constant value 77, with type int, is implicitly converted to the type double.

Objects with an array, structure or union type are initialized with a comma-separated list containing initializers for their individual elements or members:

    short a[4] = { 1, 2, 2*2, 2*2*2 };
    Rectangle_t rect1 = { { -1, 1 }, { 1, -1 } };

Chapter 12: Dynamic Memory Management

When you're writing a program, you often don't know how much data it will have to process; or you can anticipate that the amount of data to process will vary widely. In these cases, efficient resource use demands that you allocate memory only as you actually need it at runtime, and release it again as soon as possible. This is the principle of dynamic memory management, which also has the advantage that a program doesn't need to be rewritten in order to process larger amounts of data on a system with more available memory.

This chapter describes dynamic memory management in C, and demonstrates the most important functions involved using a general-purpose binary tree implementation as an example.

The standard library provides the following four functions for dynamic memory management:

malloc(), calloc(): Allocate a new block of memory.
realloc(): Resize an allocated memory block.
free(): Release allocated memory.

All of these functions are declared in the header file stdlib.h. The size of an object in memory is specified as a number of bytes. Various header files, including stdlib.h, define the type size_t specifically to hold information of this kind. The sizeof operator, for example, yields a number of bytes with the type size_t.

The two functions for allocating memory, malloc() and calloc(), have slightly different parameters:

void *malloc( size_t size );: The malloc() function reserves a contiguous memory block whose size in bytes is at least size. When a program obtains a memory block through malloc(), its contents are undetermined.
void *calloc( size_t count, size_t size );: The calloc() function reserves a block of memory whose size in bytes is at least count × size. In other words, the block is large enough to hold an array of count elements, each of which takes up size bytes. Furthermore, calloc() initializes every byte of the memory with the value 0.

Both functions return a pointer to void, also called a typeless pointer. The pointer's value is the address of the first byte in the memory block allocated, or a null pointer if the memory requested is not available.
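The difference between the two allocation functions can be sketched as follows. The helper names new_zeroed_ints() and new_filled_ints() are assumptions for illustration, and the second helper assumes that count is small enough for count * sizeof(int) not to overflow.

```c
#include <stdlib.h>                  // Provides malloc() and calloc().

// Hypothetical helper: returns a zero-filled array of count ints,
// or a null pointer if the memory is not available.
int *new_zeroed_ints( size_t count )
{
  return calloc( count, sizeof(int) );     // Every byte is set to 0.
}

// Hypothetical helper: returns an array of count ints that all have the
// given value, or a null pointer if the memory is not available.
int *new_filled_ints( size_t count, int value )
{
  int *p = malloc( count * sizeof(int) );  // Contents undetermined at first.
  if ( p == NULL )
    return NULL;
  for ( size_t k = 0; k < count; ++k )
    p[k] = value;
  return p;
}
```

In both cases the caller must test the return value for a null pointer before using the block, and must eventually release the block with free().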

Allocating Memory Dynamically

When a program assigns the void pointer to a pointer variable of a different type, the compiler implicitly performs the appropriate type conversion. Some programmers prefer to use an explicit type conversion, however. When you access locations in the allocated memory block, the type of the pointer you use determines how the contents of the location are interpreted. Some examples:

    #include <stdlib.h>                         // Provides function prototypes.
    typedef struct { long key;
                     /* ... more members ... */
                   } Record;                    // A structure type.
    float *myFunc( size_t n )
    {
      // Reserve storage for an object of type double.
      double *dPtr = malloc( sizeof(double) );
      if  ( dPtr == NULL )                      // Insufficient memory.
      {
        /* ... Handle the error ... */
        return NULL;
      }
      else                                      // Got the memory: use it.
      {
        *dPtr = 0.07;
        /* ... */
      }
      // Get storage for two objects of type Record.
      Record *rPtr;
      if  ( ( rPtr = malloc( 2 * sizeof(Record) ) ) == NULL )
      {
        /* ... Handle the insufficient-memory error ... */
        return NULL;
      }
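      // Get storage for an array of n elements of type float.
      float *fPtr = malloc( n * sizeof(float) );
      if  ( fPtr == NULL )
      {
        /* ... Handle the insufficient-memory error ... */
        return NULL;
      }
      /* ... */
      return fPtr;
    }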

Characteristics of Allocated Memory

A successful memory allocation call yields a pointer to the beginning of a memory block. "The beginning" means that the pointer's value is equal to the lowest byte address in the block. The allocated block is aligned so that any type of object can be stored at that address.

An allocated memory block stays reserved for your program until you explicitly release it by calling free() or realloc(). In other words, the storage duration of the block extends from its allocation to its release, or to the end of the program.

The arrangement of memory blocks allocated by successive calls to malloc(), calloc(), and/or realloc() is unspecified.

It is also unspecified whether a request for a block of size zero results in a null pointer or an ordinary pointer value. In any case, however, there is no way to use a pointer to a block of zero bytes, except perhaps as an argument to realloc() or free().

Resizing and Releasing Memory

When you no longer need a dynamically allocated memory block, you should give it back to the operating system. You can do this by calling the function free(). Alternatively, you can increase or decrease the size of an allocated memory block by calling the function realloc(). The prototypes of these functions are as follows:

void free( void *ptr );: The free() function releases the dynamically allocated memory block that begins at the address in ptr. A null pointer value for the ptr argument is permitted, and such a call has no effect.
void *realloc( void *ptr, size_t size );: The realloc() function releases the memory block addressed by ptr and allocates a new block of size bytes, returning its address. The new block may start at the same address as the old one.
realloc() also preserves the contents of the original memory block—up to the size of whichever block is smaller. If the new block doesn't begin where the original one did, then realloc() copies the contents to the new memory block. If the new memory block is larger than the original, then the values of the additional bytes are unspecified.
It is permissible to pass a null pointer to realloc() as the argument ptr. If you do, then realloc() behaves similarly to malloc(), and reserves a new memory block of the specified size.
The realloc() function returns a null pointer if it is unable to allocate a memory block of the size requested. In this case, it does not release the original memory block or alter its contents.

The pointer argument that you pass to either of the functions free() and realloc()—if it is not a null pointer—must be the starting address of a dynamically allocated memory block that has not yet been freed. In other words, you may pass these functions only a null pointer or a pointer value obtained from a prior call to malloc(), calloc(), or realloc(). If the pointer argument passed to free() or realloc() has any other value, or if you try to free a memory block that has already been freed, the program's behavior is undefined.
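Because realloc() leaves the original block untouched when it fails, a common precaution is to store its return value in a temporary pointer first. The following is a minimal sketch of that pattern; the helper resize_ints() is hypothetical and assumes that new_count is greater than zero.

```c
#include <stdlib.h>                  // Provides realloc() and free().

// Hypothetical helper: resizes the array that *pArr points to so that it
// can hold new_count ints. Returns 1 on success. On failure, returns 0
// and leaves *pArr pointing to the original, still valid block.
int resize_ints( int **pArr, size_t new_count )
{
  int *tmp = realloc( *pArr, new_count * sizeof(int) );
  if ( tmp == NULL )
    return 0;                        // The original block is still allocated.
  *pArr = tmp;                       // The block may have moved.
  return 1;
}
```

Assigning the result of realloc() directly to the original pointer would overwrite the only copy of the block's address if the call failed, so the block could never be freed.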

The memory management functions keep internal records of the size of each allocated memory block. This is why the functions

An All-Purpose Binary Tree

Dynamic memory management is fundamental to the implementation of dynamic data structures such as linked lists and trees. In Chapter 10 we presented a simple linked list (see Figure 10-1). The advantage of linked lists over arrays is that new elements can be inserted and existing members removed quickly. However, they also have the drawback that you have to search through the list in sequential order to find a specific item.

A binary search tree (BST), on the other hand, makes linked data elements more quickly accessible. The data items must have a key value that can be used to compare and sort them. A binary search tree combines the flexibility of a linked list with the advantage of a sorted array, in which you can find a desired data item using the binary search algorithm.
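For comparison, this is roughly what the binary search in a sorted array looks like when it is performed with the standard function bsearch(). The names cmp_ints() and find_sorted() are assumptions for illustration.

```c
#include <stdlib.h>                  // Provides bsearch().

// Comparison function for bsearch(): compares two int values.
static int cmp_ints( const void *p1, const void *p2 )
{
  int a = *(const int *)p1, b = *(const int *)p2;
  return ( a > b ) - ( a < b );
}

// Hypothetical helper: returns the index of key in the sorted array arr
// of n elements, or -1 if the array does not contain key.
long find_sorted( const int *arr, size_t n, int key )
{
  const int *hit = bsearch( &key, arr, n, sizeof *arr, cmp_ints );
  return hit != NULL ? (long)( hit - arr ) : -1L;
}
```

The search is fast, but inserting a new element into the sorted array may require moving many existing elements. That cost is what a binary search tree avoids.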

Characteristics

A binary tree consists of a number of nodes that contain the data to be stored (or pointers to the data), and has the following structural characteristics:

Each node has up to two direct child nodes.
There is exactly one node, called the root of the tree, that has no parent node. All other nodes have exactly one parent.
In a binary search tree, nodes are placed according to this rule: the value of a node is greater than or equal to the value of any descendant in its left branch, and less than the value of any descendant in its right branch.

Figure 12-1 illustrates the structure of a binary tree.

Figure 12-1: A binary tree

A leaf is a node that has no children. Each node of the tree is also considered as the root of a subtree, which consists of the node and all its descendants.

An important property of a binary tree is its height. The height is the length of the longest path from the root to any leaf. A path is a succession of linked nodes that form the connection between a given pair of nodes. The length of a path is the number of nodes in the path, not counting the first node. It follows from these definitions that a tree consisting only of its root node has a height of 0, and the height of the tree in Figure 12-1 is 3.
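
The definition of height translates directly into a short recursive function. The following is only a sketch: TreeNode is a hypothetical minimal node type used for illustration, and the convention that an empty tree has the height -1 is an assumption (it is not stated in the text) chosen so that a tree consisting only of its root yields 0.

```c
#include <stddef.h>

/* Hypothetical minimal node type, for illustration only. */
typedef struct TreeNode {
    struct TreeNode *left, *right;   /* Null pointers if there is no child. */
    int value;
} TreeNode;

/* Height = number of links on the longest path from the root to a leaf.
 * Assumption: an empty tree (a null pointer) has the height -1. */
int tree_height( const TreeNode *node )
{
    if ( node == NULL )
        return -1;
    int hl = tree_height( node->left );
    int hr = tree_height( node->right );
    return 1 + ( hl > hr ? hl : hr );
}
```

Because each node is visited once, the function runs in time proportional to the number of nodes.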

Implementation

The example that follows is an implementation of the principal functions for a binary search tree, and uses dynamic memory management. This tree is intended to be usable for data of any kind. For this reason, the structure type of the nodes includes a flexible member to store the data, and a member indicating the size of the data:

    typedef struct Node { struct Node *left,    // Pointers to the left and
                                      *right;   // right child nodes.
                          size_t size;          // Size of the data payload.
                          char data[ ];          // The data itself.
                        } Node_t;

The pointers left and right are null pointers if the node has no left or right child.
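
Because data is a flexible array member, the storage for a node has to be allocated with room for the payload. The following sketch shows one way this might be done. The helper name makeNode() is hypothetical and is not necessarily the function used in the book's implementation.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Node { struct Node *left,    // Pointers to the left and
                                  *right;   // right child nodes.
                      size_t size;          // Size of the data payload.
                      char data[];          // The data itself.
                    } Node_t;

/* Hypothetical helper: allocates a leaf node that holds a copy of the
 * size bytes addressed by pData. Returns a null pointer if malloc() fails. */
Node_t *makeNode( const void *pData, size_t size )
{
    /* sizeof(Node_t) does not include the flexible member, so add size. */
    Node_t *pNode = malloc( sizeof(Node_t) + size );
    if ( pNode != NULL )
    {
        pNode->left = pNode->right = NULL;
        pNode->size = size;
        memcpy( pNode->data, pData, size );
    }
    return pNode;
}
```

A single call to free() releases the node together with its payload, since both live in one allocation.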

As the user of our implementation, you must provide two auxiliary functions. The first of these is a function to obtain a key that corresponds to the data value passed to it, and the second compares two keys. The first function has the following type:

    typedef const void *GetKeyFunc_t( const void *dData );

The second function has a type like that of the comparison function used by the standard function bsearch():

    typedef int CmpFunc_t( const void *pKey1, const void *pKey2 );

The arguments passed on calling the comparison function are pointers to the two keys that you want to compare. The function's return value is less than zero if the first key is less than the second, equal to zero if the two keys are equal, and greater than zero if the first key is greater than the second. The key may be the same as the data itself. In this case, you need to provide only a comparison function.
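
As an illustration of the two auxiliary functions, the following sketch assumes a hypothetical Record type whose name member serves as the key. The names Record, recordKey(), and compareNames() are invented for this example.

```c
#include <string.h>

typedef const void *GetKeyFunc_t( const void *dData );
typedef int CmpFunc_t( const void *pKey1, const void *pKey2 );

/* Hypothetical data type: the name member is used as the key. */
typedef struct { int id; char name[32]; } Record;

/* Key function: maps a pointer to the data to a pointer to its key. */
const void *recordKey( const void *dData )
{
    return ((const Record *)dData)->name;
}

/* Comparison function: both arguments point to keys, here strings. */
int compareNames( const void *pKey1, const void *pKey2 )
{
    return strcmp( (const char *)pKey1, (const char *)pKey2 );
}

/* Assigning to pointers of the typedef'd function types lets the
 * compiler check that the signatures match. */
GetKeyFunc_t *const getKeyExample = recordKey;
CmpFunc_t    *const cmpExample    = compareNames;
```

If the stored data were plain strings, the key function could be omitted and only compareNames() would be needed.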

Next, we define a structure type to represent a tree. This structure has three members: a pointer to the root of the tree; a pointer to the comparison function, with the type CmpFunc_t; and a pointer to the function to calculate a key, with the type GetKeyFunc_t.

    typedef struct { struct Node  *pRoot;     // Pointer to the root.
                     CmpFunc_t    *cmp;       // Compares two keys.
                     GetKeyFunc_t *getKey;    // Converts data into a key value.
                   } BST_t;
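
To show how the three members work together, here is a minimal sketch of insertion and lookup. The functions bstInsert(), bstFind(), keyOf(), and cmpInt() are hypothetical names, not the book's own functions, and treating a null getKey pointer as "the data is its own key" is an assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Node { struct Node *left, *right;
                      size_t size;
                      char data[];
                    } Node_t;
typedef const void *GetKeyFunc_t( const void *dData );
typedef int CmpFunc_t( const void *pKey1, const void *pKey2 );
typedef struct { struct Node  *pRoot;
                 CmpFunc_t    *cmp;
                 GetKeyFunc_t *getKey;
               } BST_t;

/* Assumption: a null getKey pointer means the data is its own key. */
static const void *keyOf( const BST_t *pBST, const void *pData )
{
    return pBST->getKey != NULL ? pBST->getKey( pData ) : pData;
}

/* Copies the data into a new leaf. Returns 1 on success, 0 on failure. */
int bstInsert( BST_t *pBST, const void *pData, size_t size )
{
    if ( pBST == NULL || pBST->cmp == NULL || pData == NULL )
        return 0;
    Node_t *pNew = malloc( sizeof(Node_t) + size );
    if ( pNew == NULL )
        return 0;
    pNew->left = pNew->right = NULL;
    pNew->size = size;
    memcpy( pNew->data, pData, size );

    const void *newKey = keyOf( pBST, pNew->data );
    Node_t **ppLink = &pBST->pRoot;          // Link that will hold the new node.
    while ( *ppLink != NULL )
    {   // Keys less than or equal to the node's key go left, greater keys go right.
        int r = pBST->cmp( newKey, keyOf( pBST, (*ppLink)->data ) );
        ppLink = ( r <= 0 ) ? &(*ppLink)->left : &(*ppLink)->right;
    }
    *ppLink = pNew;
    return 1;
}

/* Returns a pointer to stored data whose key equals *pKey, or NULL. */
const void *bstFind( const BST_t *pBST, const void *pKey )
{
    if ( pBST == NULL || pBST->cmp == NULL || pKey == NULL )
        return NULL;
    const Node_t *p = pBST->pRoot;
    while ( p != NULL )
    {
        int r = pBST->cmp( pKey, keyOf( pBST, p->data ) );
        if ( r == 0 )
            return p->data;
        p = ( r < 0 ) ? p->left : p->right;
    }
    return NULL;
}

/* Example comparison function for int keys. */
int cmpInt( const void *pKey1, const void *pKey2 )
{
    int a, b;
    memcpy( &a, pKey1, sizeof a );
    memcpy( &b, pKey2, sizeof b );
    return ( a > b ) - ( a < b );
}
```

A tree of int values could then be set up with BST_t tree = { NULL, cmpInt, NULL }; and filled by repeated calls to bstInsert(). The sketch omits functions to delete nodes or free the tree.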

Chapter 13: Input and Output

Programs must be able to write data to files or to physical output devices such as displays or printers, and to read in data from files or input devices such as a keyboard. The C standard library provides numerous functions for these purposes. This chapter presents a survey of the part of the standard library that is devoted to input and output, often referred to as the I/O library. Further details on the individual functions can be found in Part II. Apart from these library functions, the C language itself contains no input or output support at all.

All of the basic functions, macros, and types for input and output are declared in the header file stdio.h. The corresponding declarations for wide character input and output functions—that is, for input and output of characters with the type wchar_t—are contained in the header file wchar.h.

From the point of view of a C program, all kinds of files and devices for input and output are uniformly represented as logical data streams, regardless of whether the program reads or writes a character or byte at a time, or text lines, or data blocks of a given size. Streams in C can be either text or binary streams, although on some systems even this difference is nil. Opening a file by means of the function fopen() (or tmpfile()) creates a new stream, which then exists until closed by the fclose() function. C leaves file management up to the execution environment, that is, the system on which the program runs. Thus a stream is a channel by which data can flow from the execution environment to the program, or from the program to its environment. Devices, such as consoles, are addressed in the same way as files.

A text stream transports the characters of a text that is divided into lines. A line of text consists of a sequence of characters ending in a newline character. A line of text can also be empty, meaning that it consists of a newline character only. The last line transported may or may not have to end with a newline character, depending on the implementation.

The internal representation of text in a C program is the same regardless of the system on which the program is running. Thus text input and output on a given system may involve removing, adding, or altering certain characters. For example, on systems that are not Unix-based, end-of-line indicators ordinarily have to be converted into newline characters when reading text files, as on Windows systems for instance, where the end-of-line indicator is a sequence of two control characters,

Streams

As the programmer, you generally do not have to worry about the necessary adaptations, because they are performed automatically by the I/O functions in the standard library. However, if you want to be sure that an input function call yields exactly the same text that was written by a previous output function call, your text should contain only the newline and horizontal tab control characters, in addition to printable characters. Furthermore, the last line should end with a newline character, and no line should end with a space immediately before the newline character.

Files

A file represents a sequence of bytes. The fopen() function associates a file with a stream and initializes an object of the type FILE, which contains all the information necessary to control the stream. Such information includes a pointer to the buffer used; a file position indicator, which specifies a position to access in the file; and flags to indicate error and end-of-file conditions.

Each of the functions that open files—namely fopen(), freopen(), and tmpfile()—returns a pointer to a FILE object for the stream associated with the file being opened. Once you have opened a file, you can call functions to transfer data and to manipulate the stream. Such functions have a pointer to a FILE object—commonly called a FILE pointer—as one of their arguments. The FILE pointer specifies the stream on which the operation is carried out.

The I/O library also contains functions that operate on the file system, and take the name of a file as one of their parameters. These functions do not require the file to be opened first. They include the following:

The remove() function deletes a file (or an empty directory). The string argument is the file's name. If the file has more than one name, then remove() only deletes the specified name, not the file itself. The data may remain accessible in some other way, but not under the deleted filename.
The rename() function changes the name of a file (or directory). The function's two string arguments are the old and new names, in that order. The remove() and rename() functions both have the return type int, and return zero on success, or a non-zero value on failure. The following statement changes the name of the file songs.dat to mysongs.dat:
```
    if ( rename( "songs.dat", "mysongs.dat" ) != 0 )
      fprintf( stderr, "Error renaming \"songs.dat\".\n" );
```

Conditions that can cause the rename() function to fail include the following: no file exists with the old name; the program does not have the necessary access privileges; or the file is open. The rules for forming permissible filenames depend on the implementation.
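
The following sketch wraps remove() and rename() in small helpers that report failures. The helper names fileExists(), renameFile(), and deleteFile() are hypothetical. Note that what happens when the new name already exists is left to the implementation, so the sketch does not rely on either outcome.

```c
#include <stdio.h>

/* Returns nonzero if the named file can be opened for reading. */
int fileExists( const char *name )
{
    FILE *fp = fopen( name, "r" );
    if ( fp == NULL )
        return 0;
    fclose( fp );
    return 1;
}

/* Renames a file. Returns 0 on success, like rename() itself. */
int renameFile( const char *oldName, const char *newName )
{
    int result = rename( oldName, newName );
    if ( result != 0 )
        fprintf( stderr, "Error renaming \"%s\" to \"%s\".\n", oldName, newName );
    return result;
}

/* Deletes a file. Returns 0 on success, like remove() itself. */
int deleteFile( const char *name )
{
    int result = remove( name );
    if ( result != 0 )
        fprintf( stderr, "Error deleting \"%s\".\n", name );
    return result;
}
```

Neither helper requires the file to be open; as the text notes, renaming a file that is open may fail.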

Like the elements of a

Opening and Closing Files

To write to a new file or modify the contents of an existing file, you must first open the file. When you open a file, you must specify an access mode indicating whether you plan to read, to write, or some combination of the two. When you have finished using a file, close it to release resources.

The standard library provides the function fopen() to open a file. For special cases, the freopen() and tmpfile() functions also open files.

    FILE *fopen( const char * restrict filename, const char * restrict mode );

This function opens the file whose name is specified by the string filename. The filename may contain a directory part. The second argument, mode, is also a string, and specifies the access mode. The possible access modes are described in the next section. The fopen() function associates the file with a new stream.

    FILE *freopen( const char * restrict filename, const char * restrict mode,
                   FILE * restrict stream );

This function redirects a stream. Like fopen(), freopen() opens the specified file in the specified mode. However, rather than creating a new stream, freopen() associates the file with the existing stream specified by the third argument. The file previously associated with that stream is closed. The most common use of freopen() is to redirect the standard streams, stdin, stdout, and stderr.

    FILE *tmpfile( void );

The tmpfile() function creates a new temporary file whose name is distinct from all other existing files, and opens the file for binary writing and reading (as if the mode string "wb+" were used in an fopen() call). If the program is terminated normally, the file is automatically deleted.

All three file-opening functions return a pointer to the stream opened if successful, or a null pointer to indicate failure.
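
Since every file-opening function can return a null pointer, the result should be tested before the stream is used. The sketch below illustrates this for fopen() and tmpfile(). The function names openOrReport() and tmpfileRoundTrip() are hypothetical helpers invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Opens a file and prints a message to stderr if that fails. */
FILE *openOrReport( const char *filename, const char *mode )
{
    FILE *fp = fopen( filename, mode );
    if ( fp == NULL )
        fprintf( stderr, "Cannot open \"%s\" in mode \"%s\".\n", filename, mode );
    return fp;
}

/* Writes text to a temporary file, then reads it back into buf.
 * Returns the number of bytes read, or -1 on failure. */
long tmpfileRoundTrip( const char *text, char *buf, size_t bufSize )
{
    if ( bufSize == 0 )
        return -1;
    FILE *fp = tmpfile();                 // Opened as if with mode "wb+".
    if ( fp == NULL )
        return -1;

    long result = -1;
    size_t len = strlen( text );
    if ( fwrite( text, 1, len, fp ) == len )
    {
        rewind( fp );                     // Required between writing and reading.
        size_t n = fread( buf, 1, bufSize - 1, fp );
        buf[n] = '\0';
        result = (long)n;
    }
    fclose( fp );                         // Closing also deletes the temporary file.
    return result;
}
```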

The access mode specified by the second argument to fopen() or freopen() determines what input and output operations the new stream permits. The permissible values of the mode string are restricted. The first character in the mode string is always r for "read," w for "write," or a for "append," and in the simplest case, the string contains just that one character. However, the mode string may also contain one or both of the characters

Reading and Writing

This section describes the functions that actually retrieve data from or send data to a stream. First, there is another detail to consider: an open stream can be used either for byte characters or for wide characters.

In addition to the type char, C also provides a type for wide characters, named wchar_t. This type is wide enough to represent any character in the extended character sets that the implementation supports (see "Wide Characters and Multibyte Characters" in Chapter 1). Accordingly, there are two complete sets of functions for input and output of characters and strings: the byte-character I/O functions and the wide-character I/O functions. Functions in the second set operate on characters with the type wchar_t. Each stream has an orientation that determines which set of functions is appropriate.

Immediately after you open a file, the orientation of the stream associated with it is undetermined. If the first file access is performed by a byte-character I/O function, then from that point on the stream is byte-oriented. If the first access is by a wide-character function, then the stream is wide-oriented. The orientation of the standard streams, stdin, stdout, and stderr, is likewise undetermined when the program starts.

You can call the function fwide() at any time to ascertain a stream's orientation. Before the first I/O operation, fwide() can also set a new stream's orientation. To change a stream's orientation once it has been determined, you must first reopen the stream by calling the freopen() function.
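
A brief sketch of fwide() in use follows. The second argument selects the action: zero only queries the orientation, a positive value requests wide orientation, and a negative value requests byte orientation. The wrapper names streamOrientation() and requestWide() are hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Returns 1 for a wide-oriented stream, -1 for a byte-oriented stream,
 * and 0 if the orientation is still undetermined. */
int streamOrientation( FILE *fp )
{
    int mode = fwide( fp, 0 );            // 0: query only, changes nothing.
    return ( mode > 0 ) - ( mode < 0 );
}

/* Requests wide orientation. Returns nonzero if the stream is
 * wide-oriented afterward. The request has no effect on a stream
 * whose orientation has already been determined. */
int requestWide( FILE *fp )
{
    return fwide( fp, 1 ) > 0;
}
```

Calling requestWide() on a stream that has already been used with a byte-character function such as fputc() leaves the stream byte-oriented and returns zero.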

The wide characters written to a wide-oriented stream are stored in the file associated with the stream as multibyte characters. The read and write functions implicitly perform the necessary conversion between wide characters of type wchar_t and the multibyte character encoding. This conversion may be stateful. In other words, the value of a given byte in the multibyte encoding may depend on control characters that precede it, which alter the shift state or conversion state of the character sequence. For this reason, each wide-oriented stream has an associated object with the type

Random File Access

Random file access refers to the ability to read or modify information directly at any given position in a file. You do this by getting and setting a file position indicator, which represents the current access position in the file associated with a given stream.

The following functions return the current file access position. Use one of these functions when you need to note a position in the file to return to it later.

long ftell( FILE *fp );: ftell() returns the file position of the stream specified by fp. For a binary stream, this is the same as the number of characters in the file before this given position—that is, the offset of the current character from the beginning of the file. ftell() returns -1 if an error occurs.
int fgetpos( FILE * restrict fp, fpos_t * restrict ppos );: fgetpos() writes the file position indicator for the stream designated by fp to an object of type fpos_t, addressed by ppos. If fp is a wide-oriented stream, then the indicator saved by fgetpos() also includes the stream's current conversion state (see "Byte-Oriented and Wide-Oriented Streams," earlier in this chapter). fgetpos() returns a nonzero value to indicate that an error occurred. A return value of zero indicates success.

The following example records the positions of all lines in the text file messages.txt that begin with the character #:

    #define ARRAY_LEN 1000
    long arrPos[ARRAY_LEN] = { 0L };
    FILE *fp = fopen( "messages.txt", "r" );
    if ( fp != NULL)
    {
      int i = 0, c1 = '\n', c2;
      while ( i < ARRAY_LEN  && ( c2 = getc(fp) ) != EOF )
      {
        if ( c1 == '\n'  &&  c2 == '#' )
          arrPos[i++] = ftell( fp ) - 1;
        c1 = c2;
      }
      /* ... */
    }

The following functions modify the file position indicator:

int fsetpos( FILE *fp, const fpos_t *ppos );: Sets both the file position indicator and the conversion state to the values stored in the object referenced by ppos. These values must have been obtained by a call to the fgetpos() function. If successful, fsetpos() returns 0 and clears the stream's EOF flag. A nonzero return value indicates an error.
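
The pairing of fgetpos() and fsetpos() can be sketched as a "peek" operation that reads a line and then returns to where it started. The function name peekLine() is hypothetical.

```c
#include <stdio.h>

/* Reads the next line into buf without consuming it: the position is
 * saved with fgetpos() and restored with fsetpos().
 * Returns 0 on success, or -1 on error or end-of-file. */
int peekLine( FILE *fp, char *buf, int bufSize )
{
    fpos_t pos;
    if ( fgetpos( fp, &pos ) != 0 )
        return -1;
    char *line = fgets( buf, bufSize, fp );
    if ( fsetpos( fp, &pos ) != 0 )       // Also clears the EOF flag.
        return -1;
    return line != NULL ? 0 : -1;
}
```

After a successful call, the next read operation on the stream yields the same line again.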

Chapter 14: Preprocessing Directives

In the section "How the C Compiler Works" in Chapter 1, we outlined the eight steps in translation from C source to an executable program. In the first four of those steps, the C preprocessor prepares the source code for the actual compiler. The result is a modified source in which comments have been deleted and preprocessing directives have been replaced with the results of their execution.

This chapter describes the C preprocessing directives. Among these are directives to insert the contents of other source files; to identify sections of code to be compiled only under certain conditions; and to define macros, which are identifiers that the preprocessor replaces with another text.

Each preprocessor directive appears on a line by itself, beginning with the character #. Only space and tab characters may precede the # character on a line. A directive ends with the first newline character that follows its beginning. The shortest preprocessor directive is the null directive. This directive consists of a line that contains nothing but the character #, and possibly comments or whitespace characters. Null directives have no effect: the preprocessor removes them from the source file.

If a directive doesn't fit on one text line, you can end the line with a backslash (\) and continue the directive on the next line. An example:

    #define MacroName  A long, \
    long macro replacement value

The backslash must be the last character before the newline character. The preprocessor concatenates the lines by removing each backslash-and-newline pair that it encounters. Because the preprocessor also replaces each comment with a space, the backslash no longer has the same effect if you put a comment between the backslash and the newline character.

Spaces and tab characters may appear between the # character that introduces a directive and the directive name. (In the previous example, the directive name is define.)

You can verify the results of the C preprocessor, either by running the preprocessor as a separate program or by using a compiler option to perform only the preprocessing steps.

Inserting the Contents of Header Files

An #include directive instructs the preprocessor to insert the contents of a specified file in the place of the directive. There are two ways to specify the file to be inserted:

    #include  <filename>
    #include  "filename"

Use the first form, with angle brackets, when you include standard library header files or additional header files provided by the implementation. An example:

    #include <math.h>        // Prototypes of mathematical functions,
                             // with related types and macros.

Use the second form, with double quotation marks, to include source files specific to your programs. Files inserted by #include directives typically have names ending in .h, and contain function prototypes, macro definitions, and type definitions. These definitions can then be used in any program source file after the corresponding #include directive. An example:

    #include "myproject.h"   // Function prototypes, type definitions
                             // and macros used in my project.

You may use macros in an #include directive. If you do use a macro, the macro's replacement must result in a correct #include directive. Example 14-1 demonstrates such #include directives.

Example 14-1. Macros in #include directives

    #ifdef _DEBUG_
      #define MY_HEADER "myProject_dbg.h"
    #else
      #define MY_HEADER "myProject.h"
    #endif
    #include MY_HEADER

If the macro _DEBUG_ is defined when this segment is preprocessed, then the preprocessor inserts the contents of myProject_dbg.h. If not, it inserts myProject.h. The #ifdef, #else, and #endif directives are described in detail in the section "Conditional Compiling," later in this chapter.

It is up to the given C implementation to define where the preprocessor searches for files specified in #include directives. Whether filenames are case-sensitive is also implementation-dependent. For files specified between angle brackets (<filename>), the preprocessor usually searches in certain system directories, such as /usr/local/include and /usr/include on Unix systems.

For files specified in quotation marks ("filename

Defining and Using Macros

You can define macros in C using the preprocessor directive #define. This directive allows you to give a name to any text you want, such as a constant or a statement. Wherever the macro's name appears in the source code after its definition, the preprocessor replaces it with that text.

A common use of macros is to define a name for a numeric constant:

    #define ARRAY_SIZE 100
    double data[ARRAY_SIZE];

These two lines define the macro name ARRAY_SIZE for the number 100, and then use the macro in a definition of the array data. Writing macro names in all capitals is a widely used convention that helps to distinguish them from variable names. This simple example also illustrates how macros can make a C program more flexible. It's safe to assume that the length of an array like data will be used in several places in the program—to control for loops that iterate through the elements of the array, for example. In each instance, use the macro name instead of a number. Then, if a program maintainer ever needs to modify the size of the array, it needs to be changed in only one place: in the #define directive.

In the third translation step, the preprocessor parses the source file as a sequence of preprocessor tokens and whitespace characters (see "The C Compiler's Translation Phases" in Chapter 1). If any token is a macro name, the preprocessor expands the macro; that is, it replaces the macro name with the text it has been defined to represent. Macro names that occur in string literals are not expanded, because a string literal is itself a single preprocessor token.
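
The following sketch illustrates both points: the macro controls a loop over the array, and the same name inside a string literal is left alone. The function names fillData() and macroLabel() are invented for this example.

```c
#include <stddef.h>

#define ARRAY_SIZE 100
double data[ARRAY_SIZE];

/* The macro, not a literal number, controls the loop. */
void fillData( double value )
{
    for ( size_t i = 0; i < ARRAY_SIZE; ++i )
        data[i] = value;
}

/* Inside a string literal the macro name is not expanded:
 * the string contains the text ARRAY_SIZE, not 100. */
const char *macroLabel( void )
{
    return "ARRAY_SIZE";
}
```

Changing the #define directive to another size adjusts the array and the loop together, while the string returned by macroLabel() stays the same.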

Preprocessor directives cannot be created by macro expansion. Even if a macro expansion results in a formally valid directive, the preprocessor doesn't execute it.

You can define macros with or without parameters.

A macro definition with no parameters has the form:

    #define macro_name replacement_text

Whitespace characters before and after replacement_text are not part of the replacement text. The replacement_text can also be empty. Some examples:

    #define TITLE  "*** Examples of Macros Without Parameters ***"
    #define BUFFER_SIZE  (4 * 512)
    #define RANDOM  (-1.0 + 2.0*(double)rand() / RAND_MAX)

Conditional Compiling

The conditional compiling directives instruct the preprocessor to retain or omit parts of the source code depending on specified conditions. You can use conditional compiling to adapt a program to different target systems, for example, without having to manage a variety of source files.

A conditional section begins with one of the directives #if, #ifdef, or #ifndef, and ends with the directive #endif. Any number of #elif directives, and at most one #else directive, may occur within the conditional section. A conditional section that begins with #if has the following form:

    #if expression1
      [ group1 ]
    [#elif expression2
      [ group2 ]]
    ...
    [#elif expression(n)
      [ group(n) ]]
    [#else
      [ group(n+1) ]]
    #endif

The preprocessor evaluates the conditional expressions in sequence until it finds one whose value is nonzero, or "true." The preprocessor retains the text in the corresponding group for further processing. If none of the expressions is true, and the conditional section contains an #else directive, then the text in the #else directive's group is retained.

The token groups group1, group2, and so on consist of any C source code, and may include more preprocessing directives, including nested conditional compiling directives. Groups that the preprocessor does not retain for further processing are removed from the program at the end of the preprocessor phase.

The expression that forms the condition of an #if or #elif directive must be an integer constant preprocessor expression. This is different from an ordinary integer constant expression (see "Constant Expressions" in Chapter 5) in these respects:

You may not use the cast operator in an #if or #elif expression.
You may use the preprocessor operator defined (see "The defined Operator," later in this chapter).
After the preprocessor has expanded all macros and evaluated all defined expressions, it replaces all other identifiers or keywords in the expression with the character 0.
All signed values in the expression have the type intmax_t, and all unsigned values have the type uintmax_t
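
The rules above can be tried out with a small sketch. All macro names in it (TARGET_BITS, WORD_NAME, NOT_DEFINED_ANYWHERE, and so on) are hypothetical and serve only as illustration.

```c
/* TARGET_BITS is a hypothetical configuration macro. */
#define TARGET_BITS 64

#if TARGET_BITS == 64
  #define WORD_NAME "64-bit"
#elif TARGET_BITS == 32
  #define WORD_NAME "32-bit"
#else
  #define WORD_NAME "unknown"
#endif

/* An identifier that is not a macro is replaced with 0,
 * so this condition is false. */
#if NOT_DEFINED_ANYWHERE
  #define UNDEFINED_IS_TRUE 1
#else
  #define UNDEFINED_IS_TRUE 0
#endif

/* The defined operator tests for a definition without evaluating it. */
#if defined( TARGET_BITS ) && !defined( NOT_DEFINED_ANYWHERE )
  #define CONFIG_OK 1
#else
  #define CONFIG_OK 0
#endif

const char *wordName( void ) { return WORD_NAME; }
```

Only the group belonging to the first true condition survives preprocessing, so wordName() is compiled with the "64-bit" string alone.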

Defining Line Numbers

The compiler includes line numbers and source filenames in warnings, error messages, and information provided to debugging tools. You can use the #line directive in the source file itself to change the compiler's filename and line numbering information. The #line directive has the following syntax:

    #line line_number ["filename"]

The next line after a #line directive has the number specified by line_number. If the directive also includes the optional string literal "filename", then the compiler uses the contents of that string as the name of the current source file.

The line_number must be a decimal constant greater than zero. An example:

    #line 1200 "primary.c"

The line containing the #line directive may also contain macros. If so, the preprocessor expands them before executing the #line directive. The #line directive must then be formally correct after macro expansion.

Programs can access the current line number and filename settings as values of the standard predefined macros __LINE__ and __FILE__:

    printf( "This message was printed by line %d in the file %s.\n", __LINE__,
            __FILE__ );

The #line directive is typically used by programs that generate C source code as their output. By placing the corresponding input file line numbers in #line directives, such programs can make the C compiler's error messages refer to the pertinent lines in the original source.
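
The effect of a #line directive can be observed directly through the predefined macros __LINE__ and __FILE__. In this sketch the function names are hypothetical, and the filename primary.c is simply the one from the example above.

```c
/* The line that follows a #line directive takes the number given in it. */
#line 1200 "primary.c"
int lineAfterDirective( void ) { return __LINE__; }            /* Line 1200 */
const char *fileAfterDirective( void ) { return __FILE__; }    /* Line 1201 */
int twoLinesLater( void ) { return __LINE__; }                 /* Line 1202 */
```

Any compiler diagnostics for the code after the directive would likewise refer to primary.c and to line numbers counted from 1200.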

Generating Error Messages

The #error directive makes the preprocessor issue an error message, regardless of any actual formal error. Its syntax is:

    #error [text]

If the optional text is present, it is included in the preprocessor's error message. The compiler then stops processing the source file and exits as it would on encountering a fatal error. The text can be any sequence of preprocessor tokens. Any macros contained in it are not expanded. It is a good idea to use a string literal here to avoid problems with punctuation characters, such as single quotation marks.

The following example tests whether the standard macro __STDC__ is defined, and generates an error message if it is not:

    #ifndef __STDC__
      #error  "This compiler does not conform to the ANSI C standard."
    #endif
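
A common use of #error is to stop compilation when an assumption built into the code does not hold. The following sketch is one possible arrangement. The function bitsInInt() is a hypothetical example of code that depends on the checks.

```c
#include <limits.h>

#if CHAR_BIT != 8
  #error "This sketch assumes 8-bit bytes."
#endif

#if !defined( __STDC_VERSION__ ) || __STDC_VERSION__ < 199901L
  #error "A C99 (or later) compiler is required."
#endif

/* Correct only because the first check above guarantees 8-bit bytes. */
unsigned bitsInInt( void )
{
    return (unsigned)( sizeof(int) * 8 );
}
```

On a conforming C99 system with 8-bit bytes, neither directive is reached and the file compiles normally.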

The #pragma Directive

The #pragma directive is a standard way to provide additional information to the compiler. This directive has the following form:

    #pragma [tokens]

If the first token after #pragma is STDC, then the directive is a standard pragma. If not, then the effect of the #pragma directive is implementation-dependent. For the sake of portability, you should use #pragma directives sparingly.

If the preprocessor recognizes the specified tokens, it performs whatever action they stand for, or passes information on to the compiler. If the preprocessor doesn't recognize the tokens, it must ignore the #pragma directive.

Recent versions of the GNU C compiler and Microsoft's Visual C compiler both recognize the pragma pack(n), for example, which instructs the compiler to align structure members on certain byte boundaries. The following example uses pack(1) to specify that each structure member be aligned on a byte boundary:

    #if defined( __GNUC__ ) || defined( _MSC_VER )
      #pragma pack(1)                              // Byte-aligned: no padding.
    #endif

Single-byte alignment ensures that there are no gaps between the members of a structure. The argument n in a pack pragma is usually a small power of two. For example, pack(2) aligns structure members on even-numbered byte addresses, and pack(4) on four-byte boundaries. pack() with no arguments resets the alignment to the implementation's default value.
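
The effect of the pack pragma can be made visible by comparing member offsets. The sketch below assumes a compiler that recognizes pack(n), such as the two named above. The structure and function names are hypothetical.

```c
#include <stddef.h>

/* Laid out with the implementation's default alignment. */
struct DefaultLayout { char tag; int value; };

#if defined( __GNUC__ ) || defined( _MSC_VER )
  #pragma pack(1)                 // Byte-aligned: no padding.
#endif
struct PackedLayout { char tag; int value; };
#if defined( __GNUC__ ) || defined( _MSC_VER )
  #pragma pack()                  // Back to the default alignment.
#endif

size_t defaultOffset( void ) { return offsetof( struct DefaultLayout, value ); }
size_t packedOffset( void )  { return offsetof( struct PackedLayout, value ); }
```

With packing in effect, value immediately follows tag at offset 1. In the default layout, it typically starts at a multiple of the alignment of int. Access to misaligned members may be slower, or require special handling, on some architectures.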

C99 introduced the following three standard pragmas:

    #pragma  STDC  FP_CONTRACT  on_off_switch
    #pragma  STDC  FENV_ACCESS  on_off_switch
    #pragma  STDC  CX_LIMITED_RANGE  on_off_switch

The value of the on_off_switch must be ON, OFF, or DEFAULT. The effects of these pragmas are discussed in "Mathematical Functions" in Chapter 16.

The _Pragma Operator

You cannot construct a #pragma directive (or any other preprocessor directive) by means of a macro expansion. For cases where you would want to do that, C99 has also introduced the preprocessor operator _Pragma, which you can use with macros. Its syntax is as follows:

    _Pragma ( string_literal )

Here is how the _Pragma operator works. First, the string_literal operand is "de-stringized," or converted into a sequence of preprocessor tokens, in this way: the quotation marks enclosing the string are removed; each sequence of a backslash followed by a double quotation mark (\") is replaced by a quotation mark alone ("); and each sequence of two backslash characters (\\) is replaced with a single backslash (\). Then the preprocessor interprets the resulting sequence of tokens as if it were the text of a #pragma directive.

The following line defines a helper macro, STR, which you can use to rewrite any #pragma directive using the _Pragma operator:

    #define  STR(s)  #s             // This # is the "stringify" operator.

With this definition, the following two lines are equivalent:

    #pragma tokens
    _Pragma ( STR(tokens) )

The following example uses the _Pragma operator in a macro:

    #define ALIGNMENT(n) _Pragma( STR(pack(n)) )
    ALIGNMENT(2)

Macro replacement changes the ALIGNMENT(2) macro call to the following:

    _Pragma( "pack(2)" )

The preprocessor then processes the line as it would the following directive:

    #pragma pack(2)
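
Put together, the macro can be used as in the following sketch, which assumes a compiler that recognizes the pack pragma. The structure TwoByteAligned and the function valueOffset() are hypothetical.

```c
#include <stddef.h>

#define STR(s)  #s
#define ALIGNMENT(n) _Pragma( STR(pack(n)) )

ALIGNMENT(2)                              // Same effect as #pragma pack(2)
struct TwoByteAligned { char tag; int value; };
_Pragma( "pack()" )                       // Same effect as #pragma pack()

size_t valueOffset( void )
{
    return offsetof( struct TwoByteAligned, value );
}
```

With two-byte packing, one padding byte follows tag, so value is placed at offset 2 on implementations where int normally requires at least two-byte alignment.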

Predefined Macros

Every compiler that conforms to the ISO C standard must define the following seven macros. Each of these macro names begins and ends with two underscore characters:

__DATE__: The replacement text is a string literal containing the compilation date in the format "Mmm dd yyyy" (example: "Mar 19 2006"). If the day of the month is less than 10, the tens place contains a space character.
__FILE__: A string literal containing the name of the current source file.
__LINE__: An integer constant whose value is the number of the line in the current source file that contains the __LINE__ macro reference, counting from the beginning of the file.
__TIME__: A string literal that contains the time of compilation, in the format "hh:mm:ss" (example: "08:00:59").
__STDC__: The integer constant 1, indicating that the compiler conforms to the ISO C standard.
__STDC_HOSTED__: The integer constant 1 if the current implementation is a hosted implementation; otherwise the constant 0.
__STDC_VERSION__: The long integer constant 199901L if the compiler supports the C99 standard (ISO/IEC 9899:1999).

The values of the __FILE__ and __LINE__ macros can be influenced by the #line directive. The values of all the other predefined macros remain constant throughout the compilation process.
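As a small illustration of the #line directive, the following sketch renumbers the source lines and renames the file as seen by the preprocessor. The function names and the filename "renamed.c" are made up for this example.

```c
// Hypothetical helpers, for illustration only.

#line 100 "renamed.c"                     // The next source line counts as line 100.
int line_after(void) { return __LINE__; }            // yields 100
const char *file_after(void) { return __FILE__; }    // yields "renamed.c"
```

The new numbering and filename stay in effect for the rest of the translation unit, or until the next #line directive.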

The value of the constant __STDC_VERSION__ will be adjusted with each future revision of the international C standard.

Under the C99 standard, C programs are executed either in a hosted or in a freestanding environment. Most C programs are executed in a hosted environment, which means that the C program runs under the control and with the support of an operating system. In this case, the constant __STDC_HOSTED__ has the value 1, and the full standard library is available.

A program in a freestanding environment runs without the support of an operating system, and therefore only minimal standard library resources are available to it (see "Execution Environments" in Chapter 15).
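A program can test these macros with conditional compilation. The sketch below shows one possible way to do so; the function names language_level() and is_hosted() are hypothetical, as are the strings returned.

```c
// Hypothetical helper: describes the language level the compiler reports.
const char *language_level(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return "C99 or later";
#elif defined(__STDC__)
    return "C89/C90";
#else
    return "not ISO C";
#endif
}

// Hypothetical helper: 1 for a hosted implementation, 0 for a freestanding
// one, and -1 if the compiler predates C99 and does not define the macro.
int is_hosted(void)
{
#if defined(__STDC_HOSTED__)
    return __STDC_HOSTED__;
#else
    return -1;
#endif
}
```

Testing with defined() first avoids relying on the rule that an undefined identifier in an #if expression evaluates to 0.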

Unlike the macros listed previously, the following standard macros are optional. If any of these macros is defined, it indicates that the implementation supports a certain IEC or ISO standard:

Chapter 15: The Standard Headers

Each standard library function is declared in one or more of the standard headers. These headers also contain all the macro and type definitions that the C standard provides. This chapter describes the contents and use of the standard headers.

Each of the standard headers contains a set of related function declarations, macros, and type definitions. The standard headers are also called header files, as the contents of each header are usually stored in a file. Strictly speaking, however, the standard does not require the headers to be organized in files.

The C standard defines the following 24 headers. Those marked with an asterisk have been added in C99.

`assert.h`	`inttypes.h`*	`signal.h`	`stdlib.h`
`complex.h`*	`iso646.h`	`stdarg.h`	`string.h`
`ctype.h`	`limits.h`	`stdbool.h`*	`tgmath.h`*
`errno.h`	`locale.h`	`stddef.h`	`time.h`
`fenv.h`*	`math.h`	`stdint.h`*	`wchar.h`
`float.h`	`setjmp.h`	`stdio.h`	`wctype.h`

You can add the contents of a standard header to a source file by inserting an #include directive, which must be placed outside all functions. You can include the standard headers as many times as you want, and in any order. However, before the #include directive for any header, your program must not define any macro with the same name as an identifier in that header. To make sure that your programs respect this condition, always include the required standard headers at the beginning of your source files, before any header files of your own.

C programs run in one of two execution environments: hosted or freestanding. Most common programs run in a hosted environment; that is, under the control and with the support of an operating system. In a hosted environment, the full capabilities of the standard library are available. Furthermore, programs compiled for a hosted environment must define a function named main(), which is the first function invoked on program start.

A program designed for a freestanding environment runs without the support of an operating system. In a freestanding environment, the name and type of the first function invoked when a program starts is determined by the given implementation. Programs for a freestanding environment cannot use complex floating-point types, and may be limited to the following headers:

Using the Standard Headers

float.h

iso646.h

limits.h

stdarg.h

stdbool.h

stddef.h

stdint.h

Specific implementations may also provide additional standard library resources.

All standard library functions have external linkage. You may use standard library functions without including the corresponding header by declaring them in your own code. However, if a standard function requires a type defined in the header, then you must include the header.
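For instance, a source file that needs only abs() could declare it by hand, as in this sketch. The helper function distance() is hypothetical; the declaration of abs() matches the prototype in stdlib.h.

```c
// Declared by hand, with the same prototype that stdlib.h provides.
int abs(int n);

// Hypothetical helper: the distance between two integers.
int distance(int a, int b)
{
    return abs(a - b);
}
```

This shortcut does not work for a function such as fopen(), whose declaration depends on the type FILE defined in stdio.h.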

The standard library functions are not guaranteed to be reentrant: two calls to the same standard library function may not safely be in execution concurrently in one process. One reason for this rule is that several of the functions use and modify static variables. As a result, you can't generally call standard library functions in signal handling routines. Signals are asynchronous, which means that a program may receive a signal at any time, even while it's executing a standard library function. If that happens, and the handler for that signal calls the same standard function, then the call is safe only if the function is reentrant. It is up to individual implementations to determine which functions are reentrant, or whether to provide a reentrant version of the whole standard library.
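A common way to stay within this restriction is to let the handler do nothing but set a flag of the type volatile sig_atomic_t, and to do the real work later in the main program. The following is a minimal sketch of that idea; all of the names in it are hypothetical.

```c
#include <signal.h>

// Written by the handler, read by the main program.
static volatile sig_atomic_t signal_seen = 0;

// The handler calls no library functions; it only records the event.
static void note_signal(int sig)
{
    (void)sig;
    signal_seen = 1;
}

// Hypothetical helper: installs the handler, raises SIGINT, reports the flag.
int raise_and_check(void)
{
    if (signal(SIGINT, note_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return signal_seen;       // 1 once the handler has run
}
```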

Contents of the Standard Headers

The following subsections list the standard headers in alphabetical order, with brief descriptions of their contents, including all the types and macros defined in them.

The standard functions are described in the next two chapters. Chapter 16 summarizes the functions that the standard library provides for each area of application: the mathematical functions, string manipulation functions, functions for time and date operations, and so on. Chapter 17 then provides a detailed description of each function individually, in alphabetical order, with examples illustrating their use.

The header assert.h defines only the function-like macro assert(), which tests whether the value of an expression is nonzero. If you define the macro NDEBUG before including assert.h, then calls to assert() have no effect.
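A typical use is to check a precondition at the start of a function, as in the following sketch. The function average() is a hypothetical example.

```c
#include <assert.h>

// Hypothetical helper: integer average of n values; n must be positive.
int average(int sum, int n)
{
    assert(n > 0);            // Aborts with a diagnostic if n <= 0, unless NDEBUG is defined.
    return sum / n;
}
```

Because the whole assert() call disappears when NDEBUG is defined, the tested expression should not have side effects that the program depends on.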

C99 supports arithmetic with complex numbers by introducing complex floating-point types and including appropriate functions in the math library. The header file complex.h contains the prototypes of the complex math functions and defines the related macros. For a brief description of complex numbers and their representation in C, see "Complex Floating-Point Types (C99)" in Chapter 2.

The names of the mathematical functions for complex numbers all begin with the letter c. For example, csin() is the complex sine function, and cexp() the complex exponential function. You can find a complete list of these functions in "Mathematical Functions" in Chapter 16. In addition, the following function names are reserved for future extensions:

    cerf()    cerfc()    cexp2()    cexpm1()    clog10()    clog1p()
    clog2()   clgamma()  ctgamma()

The same names with the suffixes f (float _Complex) and l (long double _Complex) are also reserved.

The header file complex.h defines the following macros:

complex: This is a synonym for the keyword _Complex.
_Complex_I: This macro represents an expression of type const float _Complex whose value is the imaginary unit, i.
I: This macro is a synonym for _Complex_I, and likewise represents the imaginary unit.
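The sketch below uses the macros complex and I to write complex constants in ordinary arithmetic notation. The function names are hypothetical, and the example deliberately uses only the built-in operators rather than the functions of the math library.

```c
#include <complex.h>

// Hypothetical helper: multiplies two fixed complex numbers.
double complex product(void)
{
    double complex a = 1.0 + 2.0 * I;
    double complex b = 3.0 - 1.0 * I;
    return a * b;             // (1 + 2i)(3 - i) = 5 + 5i
}

// The defining property of the imaginary unit: i * i equals -1.
int i_squared_is_minus_one(void)
{
    return I * I == -1.0;
}
```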

The header ctype.h contains the declarations of functions to classify and convert single characters. These include the following functions, which are usually also implemented as macros:

Chapter 16: Functions at a Glance

This chapter lists the functions in the standard library according to their respective areas of application, describing shared features of the functions and their relationships to one another. This compilation might help you to find the right function for your purposes while programming.

The individual functions are described in detail in Chapter 17, which explains them in alphabetical order, with examples.

We have dealt with this topic in detail in Chapter 13, which contains sections on I/O streams, sequential and random file access, formatted I/O, and error handling. A tabular list of the I/O functions will therefore suffice here. Table 16-1 lists general file access functions declared in the header stdio.h.

Table 16-1: General file access functions
Purpose	Functions
Rename a file, delete a file	`rename()`, `remove()`
Create and/or open a file	`fopen()`, `freopen()`, `tmpfile()`
Close a file	`fclose()`
Generate a unique filename	`tmpnam()`
Query or clear file access flags	`feof()`, `ferror()`, `clearerr()`
Query the current file access position	`ftell()`, `fgetpos()`
Change the current file access position	`rewind()`, `fseek()`, `fsetpos()`
Write buffer contents to file	`fflush()`
Control file buffering	`setbuf()`, `setvbuf()`
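To show how several of these functions fit together, here is a short sketch that writes a string to a temporary file, repositions the stream, and reads the text back. The function write_and_read_back() is a hypothetical helper, and error handling is kept to a minimum.

```c
#include <stdio.h>

// Hypothetical helper: writes text to a temporary file, reads the first
// line back into out, and returns the file position afterward (-1 on error).
long write_and_read_back(const char *text, char *out, int outsize)
{
    long pos = -1L;
    FILE *fp = tmpfile();                 // Opened for update, removed on close.
    if (fp == NULL)
        return -1L;

    if (fputs(text, fp) != EOF) {
        rewind(fp);                       // Back to the start before reading.
        if (fgets(out, outsize, fp) != NULL)
            pos = ftell(fp);              // Position after the data just read.
    }
    fclose(fp);
    return pos;
}
```

The call to rewind() is not optional: on a stream opened for update, a positioning function or fflush() must come between a write operation and a following read operation.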

There are two complete sets of functions for input and output of characters and strings: the byte-character and the wide-character I/O functions (see "Byte-Oriented and Wide-Oriented Streams" in Chapter 13 for more information). The wide-character functions operate on characters with the type wchar_t, and are declared in the header wchar.h. Table 16-2 lists both sets.

Input and Output

Table 16-2: File I/O functions
Purpose	Functions in stdio.h	Functions in wchar.h
Get/set stream orientation		`fwide()`
Write characters	`fputc()`, `putc()`, `putchar()`	`fputwc()`, `putwc()`, `putwchar()`
Read characters	`fgetc()`, `getc()`, `getchar()`	`fgetwc()`, `getwc()`, `getwchar()`
Put back characters read	`ungetc()`	`ungetwc()`
Write lines	`fputs()`, `puts()`	`fputws()`
Read lines	`fgets()`, `gets()`	`fgetws()`
Write blocks	`fwrite()`
Read blocks	`fread()`
Write formatted strings	`printf()`, `vprintf()`, `fprintf()`, `vfprintf()`, `sprintf()`, `vsprintf()`, `snprintf()`, `vsnprintf()`	`wprintf()`, `vwprintf()`, `fwprintf()`, `vfwprintf()`, `swprintf()`, `vswprintf()`
Read formatted strings	`scanf()`, `vscanf()`, `fscanf()`, `vfscanf()`, `sscanf()`, `vsscanf()`	`wscanf()`, `vwscanf()`, `fwscanf()`, `vfwscanf()`, `swscanf()`, `vswscanf()`

Mathematical Functions

The standard library provides many mathematical functions. Most of them operate on real or complex floating-point numbers. However, there are also several functions that operate on integer types, such as the functions to generate random numbers.

The functions to convert numeral strings into arithmetic types are listed in "String Processing," later in this chapter. The remaining math functions are described in the following subsections.

The math functions for the integer types are declared in the header stdlib.h. Two of these functions, abs() and div(), are declared in three variants to operate on the three signed integer types int, long, and long long. As Table 16-3 shows, the functions for the type long have names beginning with the letter l; those for long long with ll. Furthermore, the header inttypes.h declares function variants for the type intmax_t, with names that begin with imax.

Table 16-3: Integer arithmetic functions
Purpose	Functions declared in stdlib.h	Functions declared in inttypes.h
Absolute value	`abs()`, `labs()`, `llabs()`	`imaxabs()`
Division	`div()`, `ldiv()`, `lldiv()`	`imaxdiv()`
Random numbers	`rand()`, `srand()`
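The following sketch illustrates div() and imaxabs(). Both helper functions are hypothetical and exist only to show the calls.

```c
#include <stdlib.h>
#include <inttypes.h>

// Hypothetical helper: splits a duration in seconds into minutes and seconds.
void split_seconds(int total, int *minutes, int *seconds)
{
    div_t result = div(total, 60);        // Quotient and remainder in one call.
    *minutes = result.quot;
    *seconds = result.rem;
}

// Hypothetical helper: absolute difference for the widest integer type.
intmax_t widest_distance(intmax_t a, intmax_t b)
{
    return imaxabs(a - b);
}
```

For example, split_seconds(125, &m, &s) stores 2 in m and 5 in s.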

The functions for real floating-point types are declared in the header math.h, and those for complex floating-point types in complex.h. Table 16-4 lists the functions that are available for both real and complex floating-point types. The complex versions of these functions have names that start with the prefix c. Table 16-5 lists the functions that are only defined for the real types; and Table 16-6 lists the functions that are specific to complex types.

For the sake of readability, Tables 16-4 through 16-6 show only the names of the functions for the types double and double _Complex. Each of these functions also exists in variants for the types float (or float _Complex) and long double (or long double _Complex). The names of these variants end in the suffix f for float or l for long double. For example, the functions sin() and csin() listed in Table 16-4 also exist in the variants sinf(), sinl(), csinf(), and csinl() (but see also "Type-generic macros" in the next section).

Character Classification and Conversion

The standard library provides a number of functions to classify characters and to perform conversions on them. The header ctype.h declares such functions for byte characters, with character codes from 0 to 255. The header wctype.h declares similar functions for wide characters, which have the type wchar_t. These functions are commonly implemented as macros.

The results of these functions, except for isdigit() and isxdigit(), depend on the current locale setting for the locale category LC_CTYPE. You can query or change the locale using the setlocale() function.

The functions listed in Table 16-12 test whether a character belongs to a certain category. Their return value is nonzero, or true, if the argument is a character code in the given category.

Table 16-12: Character classification functions
Category	Functions in ctype.h	Functions in wctype.h
Letters	`isalpha()`	`iswalpha()`
Lowercase letters	`islower()`	`iswlower()`
Uppercase letters	`isupper()`	`iswupper()`
Decimal digits	`isdigit()`	`iswdigit()`
Hexadecimal digits	`isxdigit()`	`iswxdigit()`
Letters and decimal digits	`isalnum()`	`iswalnum()`
Printable characters (including whitespace)	`isprint()`	`iswprint()`
Printable, non-whitespace characters	`isgraph()`	`iswgraph()`
Whitespace characters	`isspace()`	`iswspace()`
Whitespace characters that separate words in a line of text	`isblank()`	`iswblank()`
Punctuation marks	`ispunct()`	`iswpunct()`
Control characters	`iscntrl()`	`iswcntrl()`
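As a brief illustration, the following sketch counts the decimal digits in a string. The function count_digits() is hypothetical. The cast to unsigned char matters because the functions in ctype.h expect an argument that is either EOF or representable as an unsigned char, and a plain char may be negative.

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper: counts the decimal digits in a string.
size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (isdigit((unsigned char)*s))   // Cast avoids negative char values.
            ++n;
    return n;
}
```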

The functions isgraph() and iswgraph() behave differently if the execution character set contains other byte-coded, printable, whitespace characters (that is, whitespace characters which are not control characters) in addition to the space character (' '). In that case, iswgraph() returns false for all such printable whitespace characters, while isgraph() returns false only for the space character (' ').

The header wctype.h also declares the two additional functions listed in Table 16-13 to test wide characters. These are called the extensible classification functions, which you can use to test whether a wide-character value belongs to an implementation-defined category designated by a string.

String Processing

A string is a continuous sequence of characters terminated by '\0', the string terminator character. The length of a string is considered to be the number of characters before the string terminator. Strings are stored in arrays whose elements have the type char or wchar_t. Strings of wide characters (that is, characters of the type wchar_t) are also called wide strings.

C does not have a basic type for strings, and hence has no operators to concatenate, compare, or assign values to strings. Instead, the standard library provides numerous functions, listed in Table 16-16, to perform these and other operations with strings. The header string.h declares the functions for conventional strings of char. The names of these functions begin with str. The header wchar.h declares the corresponding functions for strings of wide characters, with names beginning with wcs.

Like any other array, a string that occurs in an expression is implicitly converted into a pointer to its first element. Thus when you pass a string as an argument to a function, the function receives only a pointer to the first character, and can determine the length of the string only by the position of the string terminator character.
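The following sketch makes the point concrete. The hypothetical function length_of() works in the same way as strlen(): it receives only a pointer and has to search for the terminator. The second function contrasts the size of the array with the length of the string stored in it.

```c
#include <stddef.h>

// Hypothetical helper: finds the length by scanning for the terminator.
size_t length_of(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}

// The array has 16 elements, but the string in it has only 5 characters.
int sizes_differ(void)
{
    char word[16] = "hello";
    return sizeof word == 16 && length_of(word) == 5;
}
```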

Table 16-16: String-processing functions
Purpose	Functions in string.h	Functions in wchar.h
Find the length of a string.	`strlen()`	`wcslen()`
Copy a string.	`strcpy()`, `strncpy()`	`wcscpy()`, `wcsncpy()`
Concatenate strings.	`strcat()`, `strncat()`	`wcscat()`, `wcsncat()`
Compare strings.	`strcmp()`, `strncmp()`, `strcoll()`	`wcscmp()`, `wcsncmp()`, `wcscoll()`
Transform a string so that a comparison of two transformed strings using `strcmp()` yields the same result as a comparison of the original strings using the locale-sensitive function `strcoll()`.	`strxfrm()`	`wcsxfrm()`
In a string, find:
The first or last occurrence of a given character	`strchr()`, `strrchr()`	`wcschr()`, `wcsrchr()`
The first occurrence of another string	`strstr()`	`wcsstr()`
The first occurrence of any of a given set of characters	`strcspn()`, `strpbrk()`	`wcscspn()`, `wcspbrk()`
The first character that is not a member of a given set

Multibyte Characters

In multibyte character sets, each character is coded as a sequence of one or more bytes (see "Wide Characters and Multibyte Characters" in Chapter 1). Unlike wide characters, each of which is represented by a single object of the type wchar_t, individual multibyte characters may be represented by different numbers of bytes. However, the number of bytes that represent a multibyte character, including any necessary state-shift sequences, is never more than the value of the macro MB_CUR_MAX, which is defined in the header stdlib.h.

C provides standard functions to obtain the wide-character code, or wchar_t value, that corresponds to any given multibyte character, and to convert any wide character to its multibyte representation. Some multibyte encoding schemes are stateful; the interpretation of a given multibyte sequence may depend on its position with respect to control characters, called shift sequences, that are used in the multibyte stream or string. In such cases, the conversion of a multibyte character to a wide character, or the conversion of a multibyte string into a wide string, depends on the current shift state at the point where the first multibyte character is read. For the same reason, converting a wide character to a multibyte character, or a wide string to a multibyte string, may entail inserting appropriate shift sequences in the output.

Conversions between wide and multibyte characters or strings may be necessary when you read or write characters from a wide-oriented stream (see "Byte-Oriented and Wide-Oriented Streams" in Chapter 13).

Table 16-17 lists all of the standard library functions for handling multibyte characters.

Table 16-17: Multibyte character functions
Purpose	Functions in stdlib.h	Functions in wchar.h
Find the length of a multibyte character	`mblen()`	`mbrlen()`
Find the wide character corresponding to a given multibyte character	`mbtowc()`	`mbrtowc()`
Find the multibyte character corresponding to a given wide character	`wctomb()`	`wcrtomb()`
Convert a multibyte string into a wide string	`mbstowcs()`	`mbsrtowcs()`
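As an illustration, the following sketch wraps mbstowcs() in a hypothetical helper, to_wide(), that also rejects results too long to be stored together with their terminator. In the default "C" locale, the characters of the basic character set are each one byte long.

```c
#include <stdlib.h>
#include <wchar.h>

// Hypothetical helper: converts a multibyte string to a wide string.
// Returns the number of wide characters stored, or (size_t)-1 on error
// or if the result does not fit together with its terminator.
size_t to_wide(const char *mb, wchar_t *wide, size_t max)
{
    size_t n = mbstowcs(wide, mb, max);
    if (n == (size_t)-1 || n >= max)      // n == max: no room for L'\0'.
        return (size_t)-1;
    return n;
}
```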

Converting Between Numbers and Strings

The standard library provides a variety of functions to interpret a numeral string and return a numeric value. These functions are listed in Table 16-18. The numeral conversion functions differ both in their target types and in the string types they interpret. The functions for char strings are declared in the header stdlib.h, and those for wide strings in wchar.h.

Table 16-18: Conversion of numeral strings
Conversion	Functions in stdlib.h	Functions in wchar.h
String to `int`	`atoi()`
String to `long`	`atol()`, `strtol()`	`wcstol()`
String to `unsigned long`	`strtoul()`	`wcstoul()`
String to `long long`	`atoll()`, `strtoll()`	`wcstoll()`
String to `unsigned long long`	`strtoull()`	`wcstoull()`
String to `float`	`strtof()`	`wcstof()`
String to `double`	`atof()`, `strtod()`	`wcstod()`
String to `long double`	`strtold()`	`wcstold()`

The functions strtol(), strtoll(), and strtod() can be more practical to use than the corresponding functions atol(), atoll(), and atof(), as they return the position of the next character in the source string after the character sequence that was interpreted as a numeral.
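The sketch below shows how that end position can be used to detect input without digits and to continue parsing after the numeral. The function parse_long() and its return codes are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

// Hypothetical helper: parses a decimal integer at the start of s.
// Returns 0 on success, -1 if no digits were found, -2 on overflow.
// On success, *rest points to the first character after the numeral.
int parse_long(const char *s, long *out, const char **rest)
{
    char *end;
    errno = 0;
    long value = strtol(s, &end, 10);
    if (end == s)
        return -1;                        // Nothing was converted.
    if (errno == ERANGE)
        return -2;                        // Value out of range for long.
    *out = value;
    *rest = end;
    return 0;
}
```

By contrast, atol() returns 0 for a string without digits, which cannot be told apart from a valid "0".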

In addition to the functions listed in Table 16-18, you can also perform string-to-number conversions using one of the sscanf() functions with an appropriate format string. Similarly, you can use the sprintf() functions to perform the reverse conversion, generating a numeral string from a numeric argument. These functions are declared in the header stdio.h. Once again, the corresponding functions for wide strings are declared in the header wchar.h. Both sets of functions are listed in Table 16-19.

Table 16-19: Conversions between strings and numbers using format strings
Conversion	Functions in stdio.h	Functions in wchar.h
String to number	`sscanf()`, `vsscanf()`	`swscanf()`, `vswscanf()`
Number to string	`sprintf()`, `snprintf()`, `vsprintf()`, `vsnprintf()`	`swprintf()`, `vswprintf()`
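A round trip through both kinds of conversion might look like the following sketch. The function format_and_parse() is hypothetical. It uses snprintf() rather than sprintf() so that the output can never overflow the buffer.

```c
#include <stdio.h>

// Hypothetical helper: writes a number with two decimal places into buf,
// then reads it back into *out. Returns 0 on success, -1 on failure.
int format_and_parse(double in, double *out, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%.2f", in);
    if (n < 0 || (size_t)n >= size)
        return -1;                        // Encoding error or output truncated.
    return sscanf(buf, "%lf", out) == 1 ? 0 : -1;
}
```

In the default "C" locale, calling this function with the value 3.14159 leaves the string "3.14" in the buffer.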

Searching and Sorting

Table 16-20 lists the standard library's two general searching and sorting functions, which are declared in the header stdlib.h. The functions to search the contents of a string are listed in the section "String Processing," earlier in this chapter.

Table 16-20: Searching and sorting functions
Purpose	Function
Sort an array	`qsort()`
Search a sorted array	`bsearch()`

These functions feature an abstract interface that allows you to use them for arrays of any element type. One parameter of the qsort() function is a pointer to a call-back function that qsort() can use to compare pairs of array elements. Usually you will need to define this function yourself. The bsearch() function, which finds the array element designated by a "key" argument, uses the same technique, calling a user-defined function to compare array elements with the specified key.

The bsearch() function uses the binary search algorithm, and therefore requires that the array be sorted beforehand. Although the name of the qsort() function suggests that it implements the quick-sort algorithm, the standard does not specify which sorting algorithm it uses.
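The following sketch sorts an array of int and then searches it, using one comparison function for both calls. The names compare_ints() and sorted_index_of() are hypothetical.

```c
#include <stdlib.h>

// Comparison callback: negative, zero, or positive, as qsort() expects.
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);             // Avoids the overflow risk of x - y.
}

// Hypothetical helper: sorts arr, then returns the index of key, or -1.
int sorted_index_of(int *arr, size_t n, int key)
{
    qsort(arr, n, sizeof *arr, compare_ints);
    int *hit = bsearch(&key, arr, n, sizeof *arr, compare_ints);
    return hit != NULL ? (int)(hit - arr) : -1;
}
```

The same callback serves both functions here because the key and the array elements have the same type.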

Memory Block Handling

The functions listed in Table 16-21 initialize, copy, search, and compare blocks of memory. The functions declared in the header string.h access a memory block byte by byte, while those declared in wchar.h read and write units of the type wchar_t. Accordingly, the size parameter of each function indicates the size of a memory block as a number of bytes, or as a number of wide characters.

Table 16-21: Functions to manipulate blocks of memory
Purpose	Functions in string.h	Functions in wchar.h
Copy a memory block, where source and destination do not overlap	`memcpy()`	`wmemcpy()`
Copy a memory block, where source and destination may overlap	`memmove()`	`wmemmove()`
Compare two memory blocks	`memcmp()`	`wmemcmp()`
Find the first occurrence of a given character	`memchr()`	`wmemchr()`
Fill the memory block with a given character value	`memset()`	`wmemset()`
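The difference between memcpy() and memmove() matters whenever data is shifted within a single buffer, as in the following sketch. Both helper functions are hypothetical.

```c
#include <string.h>

// Hypothetical helper: removes the first n characters of s in place.
// Assumes that n is not greater than strlen(s).
void drop_prefix(char *s, size_t n)
{
    size_t rest = strlen(s + n) + 1;      // Remaining characters plus terminator.
    memmove(s, s + n, rest);              // Regions overlap, so memcpy() is not allowed.
}

// Hypothetical helper: overwrites a buffer with null bytes.
void wipe(void *buf, size_t size)
{
    memset(buf, 0, size);
}
```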

Dynamic Memory Management

Many programs, such as those that work with dynamic data structures, depend on the ability to allocate and release blocks of memory at runtime. C programs can do that by means of the four dynamic memory management functions declared in the header stdlib.h, which are listed in Table 16-22. The use of these functions is described in detail in Chapter 12.

Table 16-22: Dynamic memory management functions
Purpose	Function
Allocate a block of memory	`malloc()`
Allocate a memory block and fill it with null bytes	`calloc()`
Resize an allocated memory block	`realloc()`
Release a memory block	`free()`
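As a short illustration, the following sketch grows an array by one element at a time. The function append() is hypothetical, and a real program would usually enlarge the block in bigger steps.

```c
#include <stdlib.h>

// Hypothetical helper: appends value to a dynamically allocated array.
// Returns the (possibly moved) array, or NULL if no memory is available,
// in which case the original block is still valid and unchanged.
int *append(int *arr, size_t *count, int value)
{
    int *bigger = realloc(arr, (*count + 1) * sizeof *bigger);
    if (bigger == NULL)
        return NULL;
    bigger[*count] = value;
    ++*count;
    return bigger;
}
```

Starting with a null pointer and a count of 0 works because realloc() with a null pointer behaves like malloc(). The caller releases the array with free() when it is no longer needed.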

Date and Time

The header time.h declares the standard library functions to obtain the current date and time, to perform certain conversions on date and time information, and to format it for output. A key function is time(), which yields the current calendar time in the form of an arithmetic value of the type time_t. This is usually encoded as the number of seconds elapsed since a specified moment in the past, called the epoch. The Unix epoch is 00:00:00 on January 1, 1970, UTC (Coordinated Universal Time, formerly called Greenwich Mean Time or GMT).

There are also standard functions to convert a calendar time value with the type time_t into a string or a structure of type struct tm. The structure type has members of type int for the second, minute, hour, day, month, year, day of the week, day of the year, and a Daylight Saving Time flag (see the description of the gmtime() function in Chapter 17). Table 16-23 lists all the date and time functions.

Table 16-23: Date and time functions
Purpose	Function
Get the amount of CPU time used	`clock()`
Get the current calendar time	`time()`
Convert calendar time to `struct tm`	`gmtime()`
Convert calendar time to `struct tm` with local time values	`localtime()`
Normalize the values of a `struct tm` object and return the calendar time with type `time_t`	`mktime()`
Convert calendar time to a string	`ctime()`, `strftime()`, `wcsftime()`
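The normalizing behavior of mktime() is easiest to see with an out-of-range value. In the following sketch, the hypothetical function normalize_jan_32() asks for "January 32" and lets mktime() turn it into a valid date.

```c
#include <time.h>

// Hypothetical helper: reports the month (0 to 11) and day of the month
// that "January 32" of the given year is normalized to.
int normalize_jan_32(int year, int *month, int *day)
{
    struct tm t = { 0 };
    t.tm_year = year - 1900;              // Years are counted from 1900.
    t.tm_mon = 0;                         // January
    t.tm_mday = 32;                       // Deliberately out of range.
    t.tm_hour = 12;
    t.tm_isdst = -1;                      // Let mktime() determine DST.
    if (mktime(&t) == (time_t)-1)
        return -1;
    *month = t.tm_mon;                    // Now 1 (February).
    *day = t.tm_mday;                     // Now 1.
    return 0;
}
```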

The extremely flexible strftime() function uses a format string and the LC_TIME locale category to generate a date and time string. You can query or change the locale using the setlocale() function. The function wcsftime() is the wide-string version of strftime(), and is declared in the header wchar.h rather than time.h.
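The following sketch combines gmtime() and strftime() to format a calendar time as a UTC timestamp. The function format_utc() and the chosen format string are only one possibility.

```c
#include <time.h>

// Hypothetical helper: formats t as "YYYY-MM-DD hh:mm:ss" in UTC.
// Returns the number of characters written, or 0 on failure.
size_t format_utc(time_t t, char *buf, size_t size)
{
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", utc);
}
```

On a system whose time_t counts seconds from the Unix epoch, which is an assumption and not a requirement of the C standard, the value 0 is formatted as "1970-01-01 00:00:00".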

The diagram in Figure 16-1 offers an organized summary of the available date and time functions.

Figure 16-1: Date and time functions

Process Control

A process is a program that is being executed. Each process has a number of attributes, such as its open files. The exact attributes of processes are dependent on the given system. The standard library's process control features can be divided into two kinds: those for communication with the operating system, and those concerned with signals.

The functions in Table 16-24 are declared in the header stdlib.h, and allow programs to communicate with the operating system.

Table 16-24: Functions for communication with the operating system
Purpose	Function
Query the value of an environment variable	`getenv()`
Execute a system command	`system()`
Register a function to be executed when the program exits	`atexit()`
Exit the program normally	`exit()`, `_Exit()`
Exit the program abruptly	`abort()`

In Unix and Windows, one attribute of a process is the environment, which consists of a list of strings of the form name=value. Usually, a process inherits an environment generated by its parent process. The getenv() function is one way for a program to receive control information, such as the names of directories containing files to use.
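Because getenv() returns a null pointer when the variable is not defined, programs usually provide a fallback value, as in this sketch. The function env_or_default() is hypothetical.

```c
#include <stdlib.h>

// Hypothetical helper: returns the value of an environment variable,
// or the given fallback string if the variable is not set.
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}
```

The program must not modify the string that getenv() returns.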

In contrast to exit(), the _Exit() function does not call any functions registered by atexit(), nor any signal handlers registered by signal().

An operating system sends various signals to processes to notify them of unusual events. Such events typically include severe errors, such as illegal memory access, or hardware interrupts such as timer alarms. Signals may also be caused by a user at the console, however, or by the program itself calling the raise() function.

Each program may determine for itself how to react to specific signals. A program can choose to ignore signals, or let the default signal handler deal with them, or install its own signal handler function. A signal handler is a function that is executed automatically when the program receives a given type of signal.
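The following sketch shows the first of these choices: it tells the system to ignore SIGTERM, raises that signal, and then restores the previous handling. The function raise_ignored() is hypothetical.

```c
#include <signal.h>

// Hypothetical helper: ignores SIGTERM while raising it, then restores
// the previous handling. Returns 0 if raise() reported success.
int raise_ignored(void)
{
    void (*previous)(int) = signal(SIGTERM, SIG_IGN);
    if (previous == SIG_ERR)
        return -1;
    int result = raise(SIGTERM);          // Has no visible effect while ignored.
    signal(SIGTERM, previous);            // Restore the earlier handling.
    return result;
}
```

Passing SIG_DFL instead of SIG_IGN selects the default handling, and passing a pointer to a function installs that function as the signal handler.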

The two C functions that deal with signals are declared, along with macros to designate the signal types, in the header signal.h. The functions are listed in Table 16-25.
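
The following sketch shows one way a program might install its own handler with signal() and then trigger it with raise(). The names got_signal, on_interrupt(), and raise_and_check() are made up for this illustration. The handler only sets a flag of type volatile sig_atomic_t, which is about all a portable handler can safely do.

```c
#include <signal.h>

/* Flag set by the handler; sig_atomic_t can be written safely
   from within a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Signal handler: only records that the signal arrived. */
static void on_interrupt(int sig)
{
    (void)sig;          /* signal number is not needed here */
    got_signal = 1;
}

/* Install the handler, send SIGINT to this program, and report
   whether the handler ran (1) or not (0). Returns -1 on failure. */
int raise_and_check(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_interrupt) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    signal(SIGINT, SIG_DFL);    /* restore the default handling */
    return got_signal;
}
```

Because raise() does not return until the handler has finished, the flag can be tested immediately after the call.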

Internationalization

The standard library supports the development of C programs that are able to adapt to local cultural conventions. For example, programs may use locale-specific character sets or formats for currency information.

All programs start in the default locale, named "C", which contains no country- or language-specific information. At runtime, programs can change their locale or query information about the current locale. The information that makes up a locale is divided into categories, which you can query and set individually.

The functions that operate on the current locale are declared, along with the related types and macros, in the header locale.h. They are listed in Table 16-26.

Table 16-26: Locale functions
Purpose	Function
Query or set the locale for a specified category of information	`setlocale()`
Get information about the local formatting conventions for numeric and monetary strings	`localeconv()`
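
As a small sketch of the two functions in Table 16-26 working together, the helper below selects the default "C" locale for all categories and then asks localeconv() for the decimal-point string. The function name c_locale_decimal_point() is an assumption made for this example.

```c
#include <locale.h>
#include <stddef.h>

/* Select the "C" locale and return its decimal-point string.
   Returns NULL if the locale could not be set. */
const char *c_locale_decimal_point(void)
{
    const struct lconv *conv;

    if (setlocale(LC_ALL, "C") == NULL)   /* set every category */
        return NULL;

    conv = localeconv();                  /* formatting conventions */
    return conv->decimal_point;
}
```

In the "C" locale the decimal point is a period. Passing a different locale name to setlocale() may produce a different string, but which locale names are available depends on the system.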

Many functions make use of locale-specific information. The standard library function descriptions in Chapter 17 point out whenever a given function accesses locale settings. Such functions include the following:

Character classification and case mapping functions
Locale-sensitive string comparison (strcoll() and wcscoll())
Date and time formatting (strftime() and wcsftime())
Conversion of numeral strings
Conversions between multibyte and wide characters

Nonlocal Jumps

The goto statement in C can be used to jump only within a function. For greater freedom, the header setjmp.h declares a pair of functions that permit jumps to any point in a program. Table 16-27 lists these functions.

Table 16-27: Nonlocal jump functions
Purpose	Function
Save the current execution context as a jump target for the `longjmp()` function	`setjmp()`
Jump to a program context saved by a call to the `setjmp()` function	`longjmp()`

When you call the function-like macro setjmp(), it stores a value in its argument, which has the type jmp_buf, that acts as a bookmark to that point in the program. The jmp_buf object holds the necessary parts of the current execution state, typically register contents and the stack pointer rather than a copy of the stack itself. When you pass a jmp_buf object to longjmp(), longjmp() restores the saved state, and the program continues at the point following the earlier setjmp() call. For this reason, the longjmp() call must not occur after the function that called setjmp() has returned. Furthermore, if any variables with automatic storage duration in the function that called setjmp() were modified after the setjmp() call (and were not declared as volatile), then their values after the longjmp() call are indeterminate.

The return value of setjmp() indicates whether the program has reached that point after the original setjmp() call, or through a longjmp() call: setjmp() itself returns 0. If setjmp() appears to return any other value, then that point in the program was reached by calling longjmp(). If the second argument in the longjmp() call—that is, the requested return value—is 0, it is replaced with 1 as the apparent return value after the corresponding setjmp() call.
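
The return-value rules just described can be sketched as follows. The names checkpoint, bail_out(), and jump_demo() are hypothetical. The setjmp() call appears as the controlling expression of a switch statement, which is one of the contexts in which the C standard permits it.

```c
#include <setjmp.h>

static jmp_buf checkpoint;        /* bookmark filled in by setjmp() */

/* Jump back to the bookmark, requesting `value` as the
   apparent return value of setjmp(). Never returns. */
static void bail_out(int value)
{
    longjmp(checkpoint, value);
}

/* Report how setjmp() appeared to return after a longjmp()
   that requested the value `requested`. */
const char *jump_demo(int requested)
{
    switch (setjmp(checkpoint)) {
    case 0:                       /* direct return from setjmp() */
        bail_out(requested);
        return "unreachable";     /* bail_out() does not return */
    case 1:                       /* longjmp() with 1, or with 0 replaced by 1 */
        return "one";
    default:                      /* longjmp() with any other value */
        return "other";
    }
}
```

Calling jump_demo(0) reaches the case 1 branch, because a requested value of 0 is replaced with 1. The parameter requested is not modified after the setjmp() call, so its value remains well defined after the jump.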

Debugging

Using the macro assert() is a simple way to find logical mistakes during program development. This macro is defined in the header assert.h. It simply tests its scalar argument for a nonzero value. If the argument's value is zero, assert() prints an error message that lists the argument expression, the function, the filename, and the line number, and then calls abort() to stop the program. In the following example, the assert() calls perform some plausibility checks on the argument to be passed to free():

    #include <stdlib.h>
    #include <assert.h>
 
    char *buffers[64] = { NULL };   // An array of pointers
    int i;
 
    /* ... allocate some memory buffers; work with them ... */
 
      assert( i >= 0 && i < 64 );     // Index out of range?
      assert( buffers[i] != NULL );   // Was the pointer used at all?
      free( buffers[i] );

Rather than trying to free a nonexistent buffer, this code aborts the program (here compiled as assert.c) with the following diagnostic output:

    assert: assert.c:14: main: Assertion `buffers[i] != ((void *)0)' failed.
    Aborted

When you have finished testing, you can disable all assert() calls by defining the macro NDEBUG before the #include directive for assert.h. The macro does not need to have a replacement value. For example:

    #define NDEBUG
    #include <assert.h>
    /* ... */

Error Messages

Various standard library functions set the global variable errno to a value indicating the type of error encountered during execution (see the section on errno.h in Chapter 15). The functions in Table 16-28 generate an appropriate error message for the current value of errno.

Table 16-28: Error message functions
Purpose	Function	Header
Print an appropriate error message on `stderr` for the current value of `errno`	`perror()`	`stdio.h`
Return a pointer to the appropriate error message for a given error number	`strerror()`	`string.h`

The function perror() prints the string passed to it as an argument, followed by a colon and the error message that corresponds to the value of errno. This error message is the one that strerror() would return if called with the same value of errno as its argument. Here is an example:

    if ( remove("test1") != 0)  // If we can't delete the file ...
      perror( "Couldn't delete 'test1'" );

This perror() call produces the same output as the following statement:

    fprintf( stderr, "Couldn't delete 'test1': %s\n", strerror( errno ) );

In this example, if the file test1 does not exist, a program compiled with GCC prints the following message:

    Couldn't delete 'test1': No such file or directory
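
When a program needs the message text itself, for a log file for instance, it can call strerror() directly rather than perror(). The following is a minimal sketch under that assumption; the function name remove_or_explain() is hypothetical. Note that the C standard does not require remove() to set errno, although POSIX systems do, and the wording of each message is implementation-specific.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to delete the file `path`. Return NULL on success, or the
   message for the errno value left behind by remove() on failure. */
const char *remove_or_explain(const char *path)
{
    errno = 0;                  /* clear any stale error number */
    if (remove(path) == 0)
        return NULL;            /* file deleted */
    return strerror(errno);     /* e.g., "No such file or directory" */
}
```

The string returned by strerror() may be overwritten by a later call, so a program that needs to keep the message should copy it.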

Chapter 17: Standard Library Functions

This chapter describes in alphabetical order the functions available in the standard ANSI C libraries. Most of the functions described here were included in the 1989 ANSI standard or in the 1995 "Normative Addendum" and are currently supported by all major compilers. The ISO/IEC 9899:1999 standard introduced several new functions, which are not yet implemented in all compilers. These are labeled "C99" in this chapter.

Each description includes the function's purpose and return value, the function prototype, the header file in which the function is declared, and a brief example. For the sake of brevity, the examples do not always show a main() function or the #include directives indicating the header file with the function's declaration. When using the functions described in this chapter, remember that you must provide a declaration of each standard function used in your program by including the appropriate header file. Also, any filename may contain a relative or absolute directory path. For more information about errors and exceptions that can occur in standard function calls, see the sections on the standard headers math.h, fenv.h, and errno.h in Chapter 15.

_Exit

Ends program execution without calling atexit() functions or signal handlers

#include <stdlib.h>
void _Exit( int status );

The _Exit() function terminates the program normally, but without calling any cleanup functions that you have installed using atexit(), or signal handlers you have installed using signal(). _Exit() returns a status value to the operating system in the same way as the exit() function does.

Whether _Exit() flushes the program's file buffers or removes its temporary files may vary from one implementation to another.

Example

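#include <stdio.h>    // fprintf(), stderr
#include <stdlib.h>   // _Exit()

int main (int argc, char *argv[ ])
{
  if (argc < 3)
  {
    fprintf(stderr, "Missing required arguments.\n");
    _Exit(-1);   // Terminate at once, without running atexit() functions
  }
  /* ... */
}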

Section 18.3.1.1: Preprocessing

Before submitting the source code to the actual compiler, the preprocessor executes directives and expands macros in the source files (see steps 1 through 4 in the section "The C Compiler's Translation Phases" in Chapter 1). GCC ordinarily leaves no intermediate output file containing the results of this preprocessing stage. However, you can save the preprocessor output for diagnostic purposes by using the -E option, which directs GCC to stop after preprocessing. The preprocessor output is directed to the standard output stream, unless you indicate an output filename using the

C Dialects

When writing a C program, one of your first tasks is to decide which of the various definitions of the C language applies to your program. GCC's default dialect is "GNU C," which is largely the ISO/IEC 9899:1990 standard, with its published corrigenda, and with a number of language extensions. These extensions include many features that have since been standardized in C99—such as complex floating-point types and long long integers—as well as other features that have not been adopted, such as complex integer types and zero-length arrays. The full list of extensions is provided in the GCC documentation.

To turn off the GNU C extensions that conflict with the ISO C90 standard, use the command-line option -ansi. This book describes C as defined in ISO/IEC 9899:1999, or "C99." GCC adheres (not yet completely, but nearly so) to C99 if you use the command-line option -std=c99, and we have done so in testing the examples in this book.

GCC's language standardization options are:

-std=iso9899:1990, -std=c89, -ansi: These three options all mean the same thing: conform to ISO/IEC 9899:1990, including Technical Corrigenda of 1994 and 1996. They do not mean that no extensions are accepted: only those GNU extensions that conflict with the ISO standard are disabled, such as the typeof operator.
-std=iso9899:199409: Conform to "AMD1," the 1995 internationalization amendment to ISO/IEC 9899:1990.
-std=iso9899:1999, -std=c99: Conform to ISO/IEC 9899:1999, with the Technical Corrigendum of 2001. Note that support for all provisions of C99 is not yet complete. See https://gcc.gnu.org/c99status.html for the current development status.
-std=gnu89: Support ISO/IEC 9899:1990 and the GNU extensions. This dialect is GCC's default.
-std=gnu99: Support ISO/IEC 9899:1999 and the GNU extensions. This dialect is expected to become the default dialect of future GCC versions once C99 support has been completed.
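
A program can find out which of these language versions the compiler is applying by testing the standard macro __STDC_VERSION__. The following sketch assumes a hypothetical helper named stdc_version(). The macro was introduced by AMD1 with the value 199409L, and C99 sets it to 199901L; a pure C90 compiler does not define it at all.

```c
/* Return the value of __STDC_VERSION__, or 0 if the compiler
   does not define the macro (as in C90 without AMD1). */
long stdc_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;   /* 199409L for AMD1, 199901L for C99 */
#else
    return 0L;
#endif
}
```

Compiling this function with -std=c99 or -std=gnu99 should make it return 199901L, since the GNU dialects report the version of the ISO standard on which they are based. Later versions of the standard use larger values.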

With any of these options, you must also add the option -pedantic if you want GCC to issue all the warnings that are required by the given standard version, and to reject all extensions that are prohibited by the standard. The option -pedantic-errors

Compiler Warnings

You'll get two types of complaints from GCC when compiling a C program. Error messages refer to problems that make your program impossible to compile. Warnings refer to conditions in your program that you might want to know about and change—for stricter conformance to a given standard, for example—but that do not prevent the compiler from finishing its job. You may be able to compile and run a program in spite of some compiler warnings—although that doesn't mean it's a good idea to do so.

GCC gives you very fine control over the warning messages that it provides. For example, if you don't like the distinction between errors and warnings, you can use the -Werror option to make GCC stop compiling on any warning, as if it were an error. Other options let you request warnings about archaic or nonstandard usage, and about many kinds of C constructs in your programs that are considered hazardous, ambiguous, or sloppy.

You can enable most of GCC's warnings individually using options that begin with -W. For example, the option -Wswitch-default causes GCC to produce a warning message whenever you use a switch statement without a default label, and -Wsequence-point provides a warning when the value of an expression between two sequence points depends on a subexpression that is modified in the same interval (see "Side Effects and Sequence Points" in Chapter 5).

The easiest way to request these and many other warnings from GCC is to use the command-line option -Wall. However, the name of this option is somewhat misleading: -Wall does not enable all of the individual -W options. Quite a few more must be asked for specifically by name, such as -Wshadow: this option gives you a warning whenever you define a variable with block scope that has the same name as, and thus "shadows," another variable with a larger scope. Such warnings are not among those produced by -Wall.
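
To make two of these warnings concrete, the following sketch contains code that is valid C but that GCC should flag when compiled with the options -Wshadow and -Wswitch-default. The identifiers total, sum_to(), and weight() are invented for this example, and the exact wording of the warnings depends on the GCC version.

```c
int total = 100;                 /* variable with file scope */

/* Sum the integers from 1 to n. */
int sum_to(int n)
{
    int total = 0;               /* -Wshadow: hides the file-scope total */
    int i;
    for (i = 1; i <= n; ++i)
        total += i;
    return total;
}

/* Map a grade letter to a weight; unknown grades yield 0. */
int weight(char grade)
{
    int w = 0;
    switch (grade) {             /* -Wswitch-default: no default label */
    case 'a': w = 3; break;
    case 'b': w = 2; break;
    case 'c': w = 1; break;
    }
    return w;
}
```

Both functions behave as intended, which is exactly why such warnings are useful: the shadowed variable and the missing default label are legal, but either one can hide a mistake.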

If you use the -Wall option but want to disable a subset of the warnings it causes, you can insert no- after the -W in the names of individual warning options. Thus -Wno-switch-default turns off warnings about switch

Optimization

GCC can apply many techniques to make the executable program that it generates faster and/or smaller. These techniques all tend to reduce still further the "word-for-word" correspondence between the C program you write and the machine code that the computer reads. As a result, they can make debugging more difficult, and are usually applied only after a program has been tested and debugged without optimization.

There are two kinds of optimization options. You can apply individual optimization techniques by means of options beginning with -f (for flag), such as -fmerge-constants, which causes the compiler to place identical constants in a common location, even across different source files. You can also use the -O options (-O0, -O1, -O2, and -O3) to set an optimization level that cumulatively enables a number of techniques at once.

Each of the -O options represents a number of individual optimization techniques. The -O optimization levels are cumulative: -O2 includes all the optimizations in -O1, and -O3 includes -O2. For complete and detailed descriptions of the different levels, and the many -f optimization options that they represent, see the GCC reference manual. The following list offers a brief description of each level:

-O0: Turn off all optimization options.
-O, -O1: Try to make the executable program smaller and faster, but without increasing compiling time excessively. The techniques applied include merging identical constants, basic loop optimization, and grouping stack operations after successive function calls. An -O with no number is interpreted as -O1.
-O2: Apply almost all of the supported optimization techniques that do not involve a tradeoff between program size and execution speed. This option generally increases the time needed to compile. In addition to the optimizations enabled by -O1, the compiler performs common subexpression elimination, or CSE; this process involves detecting mathematically equivalent expressions in the program and rewriting the code to evaluate them only once, saving the value in an unnamed variable for reuse. Furthermore, instructions are reordered to reduce the time spent waiting for data moving between memory and CPU registers. Incidentally, the data flow analysis performed at this level of optimization also allows the compiler to provide additional warnings about the use of uninitialized variables.
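
As an illustration of the kind of source code to which common subexpression elimination applies, consider the following function, whose name is made up for this example. The product a * b is written twice, and an optimizing compiler is free to compute it once and reuse the result. Whether GCC actually does so depends on its version and on the options in effect.

```c
/* Compute (a*b + c) * (a*b - c). */
double product_of_sum_and_difference(double a, double b, double c)
{
    double sum  = a * b + c;     /* a * b appears here ... */
    double diff = a * b - c;     /* ... and again here */
    return sum * diff;
}
```

The function returns the same value at every optimization level; only the generated machine code differs. Comparing the assembly output of gcc -S at -O0 and at -O2 is one way to see what the compiler did.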

Debugging

Use the -g option to have GCC include symbol and source-line information in its object and executable output files. This information is used by debugging programs to display the contents of variables in registers and memory while stepping through the program. (For more on debugging, see Chapter 20.) There are a number of formats for this symbol information, and by default GCC uses your system's native format.

You can also use a suffix to the -g option to store the symbol information in a different format from your system's native format. You might want to do this in order to conform to the specific debugging program that you are using. For example, the option -ggdb chooses the best format available on your system for debugging with the GNU debugger, GDB.

Because the symbol information can increase and even multiply the size of your executable file, you will probably want to recompile without the -g option and link using the -s option when you have completed debugging and testing. However, some software packages are distributed with debugging information in the binaries for use in diagnosing subsequent users' problems.

Profiling

The -p option adds special functions to your program to output profiling information when you run it. Profiling is useful in resolving performance problems, because it lets you see which functions your program is spending its execution time on. The profiling output is saved in a file called mon.out. You can then use the prof utility to analyze the profiling information in a number of ways; see the prof manual for details.

For the GNU profiler, gprof, compile your program with the -pg option. The default output filename for the profiling information is then gmon.out. gprof with the -pg option can generate a call graph showing which functions in your program call which others. If you combine the -pg option with -g, the GCC option that provides source-line information for a debugger, then gprof can also provide line-by-line profiling.

Option and Environment Variable Summary

This section summarizes frequently used GCC options for quick reference, and lists the environment variables used by GCC.

-c: Preprocess, compile, and assemble only (i.e., don't link).
-C: Leave comments in when preprocessing.
-Dname[=definition]: Defines the symbol name.
-ename: Start program execution at name.
-E: Preprocess only; output to stdout, unless used with -o.
-ffast-math: Permit faster floating-point arithmetic methods at the cost of accuracy or precision.
-ffinite-math-only: Disregard infinities and NaN ("not a number") values.
-ffreestanding: Compile as a freestanding (not hosted) program.
-finline-functions, -fno-inline-functions: Enable/disable inline functions.
-fno-math-errno: Disable the errno variable for simple math functions.
-fno-trapping-math: Generate "nonstop" floating-point code.
-frounding-math: Don't disregard the rounding-mode features of the floating-point environment (experimental).
-fsignaling-nans: Allow all exceptions raised by signaling NaNs (experimental).
-fsyntax-only: Don't compile or link; just test input for syntax.
-funroll-loops, -fno-unroll-loops: Enable/disable loop unrolling.
-funsafe-math-optimizations: Permit optimizations that don't conform to standards and/or don't verify values.
-fverbose-asm: Include C variable names as comments in assembly language.
-g[format]: Compile for debugging.
-Idirectory[:directory[...]]: Search for "include" files in the specified path.
-I-: Distinguish between -Ipath for #include <file> and -Ipath for #include "file".
-lbasename: Link with library libbasename.so or libbasename.a.
-Ldirectory[:directory[...]]: Search for library files in the specified path.
-march=cpu: Intel x86: Generate model-specific code.
-mcpu=cpu: Sparc, ARM, and RS/6000-PowerPC: Generate model-specific code. Intel x86: Optimize scheduling for the specified CPU model.
-mtune=cpu: Optimize scheduling for the specified CPU model.
-nostartfiles: Don't link startup code.
-nostdlib: Don't link with the standard library.
-o file: Direct output to the specified file.
-O0: Turn off all optimization options.
-O, -O1: Perform some optimization without taking much time.
-O2: Perform more optimization, including data flow analysis.

Chapter 19: Using make to Build C Programs

As you saw in Chapter 18, the commands involved in compiling and linking C programs can be numerous and complex. The make utility automates and manages the process of compiling programs of any size and complexity, so that a single make command replaces hundreds of compiler and linker commands. Moreover, make compares the timestamps of related files to avoid having to repeat any previous work. And most importantly, make manages the individual rules that define how to build various targets, and automatically analyzes the dependency relationships between all the files involved.

There are a number of different versions of make, and their features and usage differ to varying degrees. They feature different sets of built-in variables and targets with special meanings. In this brief chapter, rather than trying to cover different varieties, we concentrate on GNU make, which is widely available. (On systems that use a different default make, GNU make is often available under the executable name gmake.) Furthermore, even as far as GNU make is concerned, this chapter sticks more or less to the basics: in this book, we want to use make only as a tool for building programs from C source code. If you want to go on to exploit the full capabilities of make, an inevitable step is to read the program's documentation itself. For a well-structured course in using make's advanced capabilities, see also Managing Projects with GNU make by Robert Mecklenburg (O'Reilly).

Before we describe the make solution, we will briefly review the problem. To make an executable program, we need to link compiled object files. To generate object files, we need to compile C source files. The source files in turn need to be preprocessed to include their header files. And whenever we have edited a source or header file, then any file that was directly or indirectly generated from it needs to be rebuilt.

Targets, Prerequisites, and Commands

The make utility organizes the work just described in the form of rules. For C programs, these rules generally take the following form: the executable file is a target that must be rebuilt whenever certain object files have changed; the object files are its prerequisites. At the same time, the object files are intermediate targets, which must be recompiled if the source and header files have changed. (Thus the executable depends indirectly on the source files. make manages such dependency chains elegantly, even when they become complex.) The rule for each target generally contains one or more commands, called the command script, that make executes to build it. For example, the rule for building the executable file says to run the linker, while the rule for building object files says to run the preprocessor and compiler. In other words, a rule's prerequisites say when to build the target, and the command script says how to build it.

The Makefile

The make program has a special syntax for its rules. Furthermore, the rules for all the operations that you want make to manage in your project generally need to be collected in a file for make to read. The command-line option -f filename tells make which file contains the rules you want it to apply. Usually, though, this option is omitted and make looks for a file with the default name makefile, or failing that, Makefile.

When you read makefiles, remember that they are not simply scripts to be executed in sequential order. Rather, make first analyzes an entire makefile to build a dependency tree of possible targets and their prerequisites, then iterates through that dependency tree to build the desired targets.

In addition to rules, makefiles also contain comments, variable assignments, macro definitions, include directives, and conditional directives. These will be discussed in later sections of this chapter, after we have taken a closer look at the meat of the makefile: the rules.

Rules

Example 19-1 shows a makefile that might be used to build the program in Example 1-2.

Example 19-1. A basic makefile

# A basic makefile for "circle".

CC = gcc
CFLAGS = -Wall -g -std=c99
LDFLAGS = -lm

circle : circle.o circulararea.o
	$(CC) $(LDFLAGS) -o $@ $^

circle.o : circle.c
	$(CC) $(CFLAGS) -o $@ -c $<

circulararea.o: circulararea.c
	$(CC) $(CFLAGS) -o $@ -c $<

The line that begins with the character # is a comment, which make ignores. This makefile begins by defining some variables, which are used in the statements that follow. The rest of the file consists of rules, whose general form is:

target [target [...]] : [prerequisite [prerequisite [...]]]
	[command
	[command
	[...]]]

The first target must be placed at the beginning of the line, with no whitespace to the left of it. Moreover, each command line must start with a tab character. (It would be simpler if all whitespace characters were permissible here, but that's not the case.)

Each rule in the makefile says, in effect: if any target is older than any prerequisite, then execute the command script. More importantly, make also checks whether the prerequisites have other prerequisites in turn before it starts executing commands.

Both the prerequisites and the command script are optional. A rule with no command script only tells make about a dependency relationship; and a rule with no prerequisites tells only how to build the target, not when to build it. You can also put the prerequisites for a given target in one rule, and the command script in another. For any target requested, whether on the make command line or as a prerequisite for another target, make collects all the pertinent information from all rules for that target before it acts on them.

Example 19-1 shows two different notations for variable references in the command script. Variable names that consist of more than one character—in this case, CC, CFLAGS, and LDFLAGS—must be prefixed with a dollar sign and enclosed in parentheses when referenced. Variables that consist of just one character—in our example, these happen to be the automatic variables

Comments

In a makefile, a hash mark (#) anywhere in a line begins a comment, unless the line is a command. make ignores comments, as if the text from the hash mark to the end of its line did not exist. Comments (like blank lines) between the lines of a rule do not interrupt its continuity. Leading whitespace before a hash mark is ignored.

If a line containing a hash mark is a command—that is, if it begins with a tab character—then it cannot contain a make comment. If the corresponding target needs to be built, make passes the entire command line, minus the leading tab character, to the shell for execution. (Some shells, such as the Bourne shell, also interpret the hash mark as introducing a comment, but that is beyond make's control.)

Variables

All variables in make are of the same type: they contain sequences of characters, never numeric values. Whenever make applies a rule, it evaluates all the variables contained in the targets, prerequisites, and commands. Variables in GNU make come in two "flavors," called recursively expanded and simply expanded variables. Which flavor a given variable has is determined by the specific assignment operator used in its definition. In a recursively expanded variable, nested variable references are stored verbatim until the variable is evaluated. In a simply expanded variable, on the other hand, variable references in the value are expanded immediately on assignment, and their expanded values are stored, not their names.

Variable names can include any character except :, =, #, and whitespace. However, for robust makefiles and compatibility with shell constraints, you should use only letters, digits, and the underscore character.

Which assignment operator you use in defining a variable determines whether it is a simply or a recursively expanded variable. The assignment operator = in the following example creates a recursively expanded variable:

DEBUGFLAGS = $(CFLAGS) -ggdb -DDEBUG -O0

make stores the character sequence to the right of the equals sign verbatim; the nested variable $(CFLAGS) is not expanded until $(DEBUGFLAGS) is used.

To create a simply expanded variable, use the assignment operator := as shown in the following example:

OBJ = circle.o circulararea.o
TESTOBJ := $(OBJ) profile.o

In this case make stores the character sequence circle.o circulararea.o profile.o as the value of $(TESTOBJ). If a subsequent assignment modifies the value of $(OBJ), $(TESTOBJ) is not affected.

You can define both recursively expanded and simply expanded variables not only in the makefile, but also on the make command line, as in the following example:

$ make CFLAGS=-ffinite-math-only circulararea.o

Each such assignment must be contained in a single command-line argument. If the assignment contains spaces, you must escape them or enclose the entire assignment in quotation marks. Any variable defined on the command line, or in the shell environment, can be cancelled out by an assignment in the makefile that starts with the optional

Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!

Phony Targets


The makefile in Example 19-4 also illustrates several different ways of using targets. The targets debug, testing, production, clean, and symbols are not names of files to be generated. Nonetheless, the rules clearly define the behavior of a command like make production or make clean symbols debug. Targets that are not the names of files to be generated are called phony targets.

In Example 19-4, the phony target clean has a command script, but no prerequisites. Furthermore, its command script doesn't actually build anything: on the contrary, it deletes files generated by other rules. We can use this target to clear the board before rebuilding the program from scratch. In this way, the phony targets testing and production ensure that the executable is linked from object files made with the desired compiler options by including clean as one of their prerequisites.

You can also think of a phony target as one that is never supposed to be up to date: its command script should be executed whenever the target is called for. This is the case with clean—as long as no file with the name clean happens to appear in the project directory.

Often, however, a phony target's name might really appear as a filename in the project directory. For example, if your project's products are built in subdirectories, such as bin and doc, you might want to use subdirectory names as targets. But you must make sure that make rebuilds the contents of a subdirectory when out of date, even if the subdirectory itself already exists.

For such cases, make lets you declare a target as phony regardless of whether a matching filename exists. To do so, add a line like this one to your makefile, making the target a prerequisite of the special built-in target .PHONY:

.PHONY: clean

Or, to use an example with a subdirectory name, suppose we added these lines to the makefile in Example 19-4:

.PHONY: bin
bin: circle
        $(MKDIR) $@
        $(CP) $< $@/
        $(CHMOD) 600 $@/$<

This rule for the target bin actually does create bin in the project directory. However, because bin is explicitly phony, it is never up to date.

Additional content appearing in this section has been removed.

Other Target Attributes

There are also other attributes that you can assign to certain targets in a makefile by making those targets prerequisites of other built-in targets like .PHONY. The most important of these built-in targets are listed here. Other special built-in targets that can be used in makefiles to alter make's runtime behavior in general are listed at the end of this chapter.

.PHONY: Any targets that are prerequisites of .PHONY are always treated as out of date.
.PRECIOUS: Normally, if you interrupt make while running a command script—if make receives any fatal signal, to be more precise—make deletes the target it was building before it exits. Any target you declare as a prerequisite of .PRECIOUS is not deleted in such cases, however.
Furthermore, when make builds a target by concatenating implicit rules, it normally deletes any intermediate files that it creates by one such rule as prerequisites for the next. However, if any such file is a prerequisite of .PRECIOUS (or matches a pattern that is a prerequisite of .PRECIOUS), make does not delete it.
.INTERMEDIATE: Ordinarily, when make needs to build a target whose prerequisites do not exist, it searches for an appropriate rule to build them first. If the absent prerequisites are not named anywhere in the makefile, and make has to resort to implicit rules to build them, then they are called intermediate files. make deletes any intermediate files after building its intended target (see the section "Implicit Rule Chains," earlier in this chapter). If you want certain files to be treated in this way even though they are mentioned in your makefile, declare them as prerequisites of .INTERMEDIATE.
.SECONDARY: Like .INTERMEDIATE, except that make does not automatically delete files that are prerequisites of .SECONDARY.
You can also put .SECONDARY in a makefile with no prerequisites at all. In this case, make treats all targets as prerequisites of .SECONDARY.
.IGNORE: For any target that is a prerequisite of .IGNORE, make ignores any errors that occur in executing the commands to build that target. .IGNORE itself does not take a command script.
You can also put

Additional content appearing in this section has been removed.

Macros

When we talk about macros in make, you should remember that there is really no difference between them and variables. Nonetheless, make provides a directive that allows you to define variables with both newline characters and references to other variables embedded in them. Programmers often use this capability to encapsulate multiline command sequences in a variable, so that the term macro is fairly appropriate. (The GNU make manual calls them "canned command sequences.")

To define a variable containing multiple lines, you must use the define directive. Its syntax is:

define macro_name
macro_value
endef

The line breaks shown in the syntax are significant: define and endef both need to be placed at the beginning of a line, and nothing may follow define on its line except the name of the macro. Within the macro_value, though, any number of newline characters may also occur. These are included literally, along with all other characters between the define and endef lines, in the value of the variable you are defining. Here is a simple example:

define installtarget
 @echo Installing $@ in $(USRBINDIR) ... ;\
 $(MKDIR) -m 7700 $(USRBINDIR)           ;\
 $(CP) $@ $(USRBINDIR)/                  ;\
 echo ... done.
endef

The variable references contained in the macro installtarget are stored literally as shown here, and expanded only when make expands $(installtarget) itself, in a rule like this for example:

circle: $(OBJ) $(LIB)
        $(CC) $(LDFLAGS) -o $@ $^
ifdef INSTALLTOO
        $(installtarget)
endif

Additional content appearing in this section has been removed.

Functions

GNU make goes beyond simple macro expansion to provide functions, both built-in and user-defined. By using parameters, conditions, and built-in functions, you can define quite powerful functions and use them anywhere in your makefiles.

The syntax of function invocations in makefiles, like that of macro references, uses the dollar sign and parentheses:

$(function_name argument[,argument[,...]])

Whitespace in the argument list is significant. make ignores any whitespace before the first argument, but if you include any whitespace characters before or after a comma, make treats them as part of the adjacent argument value.

The arguments themselves can contain any characters, except for embedded commas. Parentheses must occur in matched pairs; otherwise they will keep make from parsing the function call correctly. If necessary, you can avoid these restrictions by defining a variable to hold a comma or parenthesis character, and using a variable reference as the function argument.

GNU make provides more than 20 useful text-processing and flow-control functions, which are listed briefly in the following sections.

Text-processing functions

The text-processing functions listed here are useful in operating on the values of make variables, which are always sequences of characters:

$(subst find_text,replacement_text,original_text): Expands to the value of original_text, except that each occurrence of find_text in it is changed to replacement_text.
$(patsubst find_pattern,replacement_pattern,original_text): Expands to the value of original_text, except that each occurrence of find_pattern in it is changed to replacement_pattern. The find_pattern argument may contain a percent sign as a wildcard for any number of non-whitespace characters. If replacement_pattern also contains a percent sign, it is replaced with the characters represented by the wildcard in find_pattern. The patsubst function also collapses each unquoted whitespace sequence into a single space character.
$(strip original_text): Removes leading and trailing whitespace, and collapses each unquoted internal whitespace sequence into a single space character.
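To make the whitespace handling described for $(strip ...) concrete, the following C function sketches the same transformation on a C string. It is only an illustration of the described behavior, not GNU make's implementation: the name strip_ws and its in-place interface are assumptions made for this example, and the sketch pays no attention to quoting.

```c
#include <ctype.h>

/* strip_ws is a hypothetical name chosen for this sketch.
   It rewrites s in place: leading and trailing whitespace is removed,
   and each internal run of whitespace becomes a single space. */
char *strip_ws(char *s)
{
    char *dst = s;            /* next position to write */
    int pending_space = 0;    /* a whitespace run was seen after a word */

    for (const char *src = s; *src != '\0'; ++src) {
        if (isspace((unsigned char)*src)) {
            if (dst != s)             /* ignore leading whitespace */
                pending_space = 1;
        } else {
            if (pending_space) {      /* emit one space between words */
                *dst++ = ' ';
                pending_space = 0;
            }
            *dst++ = *src;
        }
    }
    *dst = '\0';                      /* a trailing run is never emitted */
    return s;
}
```

The string must be modifiable, so pass an array such as char list[] = "  circle.o   circulararea.o  "; rather than a string literal.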

Additional content appearing in this section has been removed.

Directives

We have already introduced the define directive, which produces a recursively expanded variable or a function. Other make directives allow you to influence the effective contents of your makefiles dynamically by making certain lines in a makefile dependent on variable conditions, or by inserting additional makefiles on the fly.

You can also make part of your makefile conditional upon the existence of a variable by using the ifdef or ifndef directive. They work the same as the C preprocessor directives of the same names, except that in make, an undefined variable is the same as one whose value is empty. Here is an example:

OBJ = circle.o
LIB = -lm
 
ifdef SHAREDLIBS
  LIB += circulararea.so
else
  OBJ += circulararea.o
endif
 
circle: $(OBJ) $(LIB)
        $(CC) -o $@ $^
 
%.so : %.o
        $(CC) -shared -o $@ $<

As the example shows, the variable name follows ifdef or ifndef without a dollar sign or parentheses. The makefile excerpt shown here defines a rule to link object files into a shared library if the variable SHAREDLIBS has been defined. You might define such a general build option in an environment variable, or on the command line, for example.
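The difference from the C preprocessor noted earlier can be sketched in C itself. In the fragment below, the macro EMPTY_OPTION and the function name are hypothetical, chosen only for this illustration. The macro is defined with an empty replacement list, and #ifdef still treats it as defined, whereas make's ifdef would treat a variable with an empty value like an undefined one.

```c
/* EMPTY_OPTION is a hypothetical macro with an empty replacement list. */
#define EMPTY_OPTION

/* Returns 1 if the preprocessor considers EMPTY_OPTION defined, else 0. */
int empty_option_is_defined(void)
{
#ifdef EMPTY_OPTION
    return 1;    /* taken: an empty definition still counts as defined in C */
#else
    return 0;
#endif
}
```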

You can also make certain lines of the makefile conditional upon whether two expressions—usually the value of a variable and a literal string—are equal. The ifeq and ifneq directives test this condition. The two operands whose equality is the condition to test are either enclosed together in parentheses and separated by a comma, or enclosed individually in quotation marks and separated by whitespace. Here is an example:

ifeq ($(MATHLIB), /usr/lib/libm.so)
  # ... Special provisions for this particular math library ...
endif

That conditional directive, with parentheses, is equivalent to this one with quotation marks:

ifeq "$(MATHLIB)" "/usr/lib/libm.so"
  # ... Special provisions for this particular math library ...
endif

The second version has one strong advantage: the quotation marks make it quite clear where each of the operands begins and ends. In the first version, you must remember that whitespace within the parentheses

Additional content appearing in this section has been removed.

Running make

This section explains how to add dependency information to the makefile automatically, and how to use make recursively. These two ways of using make are common and basic, but they do involve multiple features of the program. Finally, the remainder of this section is devoted to a reference list of GNU make's command-line options and the special pseudotargets that also function as runtime options.

The command-line syntax of make is as follows:

 make [options] [variable_assignments] [target [target [...]]]

If you don't specify any target on the command line, make behaves as though you had specified the default target; that is, whichever target is named first in the makefile. make builds other targets named in the makefile only if you request them on the command line, or if they need to be built as prerequisites of any target requested.

Our program executable circle depends on more files than those we have named in the sample makefile up to now. Just think of the standard headers included in our source code, to begin with—not to mention the implementation-specific header files they include in turn.

Most C source files include both standard and user-defined header files, and the compiled program should be considered out of date whenever any header file has been changed. Because you cannot reasonably be expected to know the full list of header files involved, the standard make technique to account for these dependencies is to let the C preprocessor analyze the #include directives in your C source and write the appropriate make rules. The makefile lines in Example 19-6 fulfill this purpose.

Example 19-6. Generating header dependencies

CC = gcc
OBJ = circle.o circulararea.o
LIB = -lm
 
circle: $(OBJ) $(LIB)
        $(CC) $(LDFLAGS) -o $@ $^
 
%.o: %.c
        $(CC) $(CFLAGS) $(CPPFLAGS) -c -o $@ $<
 
dependencies: $(OBJ:.o=.c)
        $(CC) -M $^ > $@
 
include dependencies

The third rule uses a special kind of make variable reference, called a substitution reference, to declare that the target dependencies depends on files like those named in the value of $(OBJ), but with the ending

Additional content appearing in this section has been removed.

Chapter 20: Debugging C Programs with GDB

An important part of software development is testing and troubleshooting. In a large program, programming errors, or bugs, are practically inevitable. Programs can deliver wrong results, get hung up in infinite loops, or crash due to illegal memory operations. The task of finding and eliminating such errors is called debugging a program.

Many bugs are not apparent by simply studying the source code. Extra output provided by a testing version of the program is one helpful diagnostic technique. You can add statements to display the contents of variables and other information during runtime. However, you can generally perform runtime diagnostics much more efficiently by using a debugger.
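As a minimal sketch of the diagnostic-output technique just mentioned, the following fragment prints the name and value of a variable to stderr, but only in a testing build compiled with -DDEBUG. The macro TRACE_INT and the function sum_to are hypothetical names invented for this illustration; they are not part of any library or of the book's examples.

```c
#include <stdio.h>

/* TRACE_INT is a hypothetical helper macro, not part of any library.
   It prints to stderr only if the program is compiled with -DDEBUG. */
#ifdef DEBUG
  #define TRACE_INT(var) \
      fprintf(stderr, "%s:%d: %s = %d\n", __FILE__, __LINE__, #var, (var))
#else
  #define TRACE_INT(var) ((void)0)
#endif

/* Add up the integers from 1 to n, reporting each partial sum
   in a testing build. */
int sum_to(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; ++i) {
        sum += i;
        TRACE_INT(sum);    /* expands to nothing useful without -DDEBUG */
    }
    return sum;
}
```

Because the trace statements compile away in a normal build, they can stay in the source. Even so, every new question about the program's state means editing and recompiling, which is the overhead a debugger avoids.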

A debugger is a program that runs another program in a finely controlled environment. For example, a debugger allows you to run the program step by step, observing the contents of variables, memory locations, and CPU registers after each statement. You can also analyze the sequence of function calls that lead to a given point in the program.

This chapter is an introduction to one powerful and widely used debugger, the GNU debugger or GDB. The sections that follow describe GDB's basic options and commands. Most of the features and working principles described here are similar to those of other debugging tools. For a complete description of GDB's capabilities, see the program manual "Debugging with GDB" by the Free Software Foundation, which is available in PDF and HTML at https://www.gnu.org/software/gdb/documentation/. If your system also has the GNU Texinfo system installed, you can browse the full manual by entering the shell command info gdb.

If the GNU C compiler, GCC, is available on your system, then GDB is probably already installed as well. You can tell by running the following command, which displays the debugger's version and copyright information:

$ gdb -version

As in the preceding chapters, the dollar sign character ($) followed by a space represents the shell command prompt.

If GDB is installed, a message like the following appears:

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i586-suse-linux".

Additional content appearing in this section has been removed.

Installing GDB

If GDB is not installed, you can download the source code and compile it (see https://www.gnu.org/software/gdb/download/). This is seldom necessary, though. Most Unix-like systems provide a convenient method to install a binary GDB package, including the documentation. On Windows systems, we recommend that you install the Cygwin software. Cygwin provides a standard Unix environment on Windows platforms, including the GCC compiler, the GDB debugger, and other GNU tools (see https://www.cygwin.com or https://www.redhat.com/software/cygwin/).

Additional content appearing in this section has been removed.

A Sample Debugging Session

This section describes a sample GDB session to illustrate the basic operation of the debugger. Many problems in C programs can be pinpointed using just a handful of debugger commands. The program in Example 20-1, gdb_example.c, contains a logical error. We'll use this program in the following subsections to show how GDB can be used to track down such errors.

Example 20-1. A program to be debugged in a GDB session

// gdb_example.c:
// Test the swap() function, which exchanges the contents of two int variables.
// -------------------------------------------------------------
#include <stdio.h>
 
void swap( int *p1, int *p2 );         // Exchange *p1 and *p2
 
int main()
{
  int a = 10, b = 20;
/* ... */
  printf( "The old values: a = %d; b = %d.\n", a, b );
 
  swap( &a, &b );
 
  printf( "The new values: a = %d; b = %d.\n", a, b );
/* ... */
  return 0;
}
 
void swap( int *p1, int *p2 )         // Exchange *p1 and *p2.
{
  int *p = p1;
  p1 = p2;
  p2 = p;
}

GDB is a symbolic command-line debugger. "Symbolic" here means that you can refer to variables and functions in the running program by the names you have given them in your C source code. In order to display and interpret these names, the debugger requires information about the types of the variables and functions in the program, and about which instructions in the executable file correspond to which lines in the source files. Such information takes the form of a symbol table, which the compiler and linker include in the executable file when you run GCC with the -g option:

$ gcc -g gdb_example.c

In a large program consisting of several source files, you must compile each module with the -g option.

The following command runs the program from Example 20-1:

$ ./a.out

The program produces the following output:

The old values: a = 10; b = 20.
The new values: a = 10; b = 20.

Although the swap() function call is plain to see in the source code, the contents of the variables a and b have not been swapped. We can look for the reason using GDB. To begin the debugging session, start GDB from the shell, specifying the name of the executable file as a command-line argument to the debugger:

Additional content appearing in this section has been removed.
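The logical error in Example 20-1 can also be seen by reading swap() closely: the function exchanges only its own local copies of the pointers p1 and p2, so the objects they point to are never modified. As a minimal sketch, and not necessarily the correction the book goes on to present, a version that exchanges the pointed-to values through a temporary int might look like this:

```c
#include <assert.h>

/* Exchange the int objects that p1 and p2 point to.
   Unlike the version in Example 20-1, this function dereferences
   the pointers instead of exchanging its local pointer copies. */
void swap( int *p1, int *p2 )
{
  assert( p1 && p2 );   // Precondition added for this sketch: no null pointers.

  int tmp = *p1;        // Save the first value,
  *p1 = *p2;            // overwrite it with the second,
  *p2 = tmp;            // and store the saved value in the second object.
}
```

With a = 10 and b = 20, calling swap( &a, &b ) leaves a equal to 20 and b equal to 10.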

Starting GDB

You can start GDB by entering gdb at the shell command prompt. GDB supports numerous command-line options and arguments:

gdb [options] [executable_file [core_file | process_id]]

For example, the following command starts the debugger without displaying its sign-on message:

$ gdb -silent
(gdb)

In this example, the command line does not name the executable file to be debugged. You can specify the program you want to test in GDB using the debugger's file command, described in the section "Using GDB Commands," later in this chapter.

Ordinarily, the program to be debugged is named on the GDB command line. In the following example, the GDB command loads the executable myprog for debugging:

$ gdb myprog
(gdb)

As an additional argument after the name of the program to be tested, you may specify a process ID number or the name of a core dump file. In the following example, the number after the program name is a process ID (or "PID"):

$ gdb myprog 1001
(gdb)

This command instructs GDB to connect to a process that is already running on the system, and has the program name myprog and the process ID 1001. If GDB finds such a process, you can interrupt its execution to begin debugging by pressing Ctrl+C. If the debugger finds a file in the current working directory named 1001, however, it will interpret that argument as the name of a core file rather than a process ID. For details about debugging with core files, see the section "Analyzing Core Files in GDB," at the end of this chapter.

Most of the command-line options for the GDB debugger have both short and long forms. The descriptions in the following list and subsections show both forms for the most frequently used options. You can also truncate the long form, if you type enough of it to be unambiguous. For options that take an argument, such as -tty device, the option and its argument can be separated either by a space or by an equals sign (=), as in -tty=/dev/tty6. Options may be introduced by either one or two hyphens: -quiet for example is synonymous with --quiet.

This section lists the most commonly used GDB options. For a complete list, see the program's documentation.

Additional content appearing in this section has been removed.

Using GDB Commands

Upon starting, the debugger prompts you to enter commands—for example, to set a breakpoint and run the program that you specified on the command line to load for debugging.

Each command you issue to GDB is a line of text beginning with a command keyword. The remainder of the line consists of the command's arguments. You can truncate any keyword, as long as you type enough of it to identify a command unambiguously. For example, you can enter q (or qu or qui) to exit the debugger with the quit command.

If you enter an empty command line—that is, if you press the return key immediately at the GDB command prompt—then GDB repeats your last command, if that action is plausible. For example, GDB automatically repeats step and next commands in this way, but not a run command.

If you enter an ambiguous or unknown abbreviation, or fail to specify required command arguments, GDB responds with an appropriate error message, as in this example:

(gdb) sh
Ambiguous command "sh": sharedlibrary, shell, show.

The GDB debugger can reduce your typing by completing the names of commands, variables, files, and functions. Type the first few characters of the desired word, then press the Tab key. For example, the program circle.c in Example 1-1 contains the function circularArea(). To display this function in a GDB session, all you have to enter is the following:

(gdb) list ci

and press the Tab key. Automatic completion yields this command line:

(gdb) list circularArea

Press Return to execute the command. If there are several possible completions for a word, GDB inserts the next letters that are common to all possible completions, then prompts you for more input. You can type another letter or two to make your entry more specific, then press the Tab key again. If you press the Tab key twice in a row, GDB displays all possible completions of the word. Here is an example of command completion in several steps:

(gdb) break ci<tab>

GDB appends two letters, then pauses for more input to resolve an ambiguity:

(gdb) break circ

If you press the Tab key twice, GDB displays the possible completions, then repeats the prompt:

Additional content appearing in this section has been removed.