Python - Built-in Functions

Built-in Functions

There are two built-in functions that create a new file or open an existing file.

file ( filename , [ mode , ] [ buffering ] ) → file object: Create a Python file object associated with an operating system file. filename is the name of the file. mode can be 'r', 'w' or 'a' for reading (default), writing or appending. The file will be created if it doesn't exist when opened for writing or appending; it will be truncated when opened for writing. Add a 'b' to the mode for binary files. Add a '+' to the mode to allow simultaneous reading and writing. If the buffering argument is given, 0 means unbuffered, 1 means line buffered, and larger numbers specify the buffer size.
open ( filename , [ mode , ] [ buffering ] ) → file object: Does the same thing as the file function. The file function is a standard factory function, like int, float, str, list, tuple, etc. The name of the function matches the class of the object being created. The open function, however, is more descriptive of what is really going on in the program.
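As a minimal sketch of the open function in use, the following creates a file and then reopens it with the default mode. The file name example.txt and the scratch directory are illustrative assumptions, not part of the text. The sketch uses open rather than file because the file function is not available in Python 3.

```python
import os
import tempfile

# Hypothetical file name inside a scratch directory, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# "w" creates the file (or truncates an existing one) for writing.
out = open(path, "w")
out.write("hello\n")
out.close()

# No mode is given, so the file is opened for reading ("r").
src = open(path)
content = src.read()
src.close()

print(content)
```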

Creating File Name Strings. A filename string can be given as a standard name, or it can use OS-specific punctuation. The standard is to use / to separate elements of a file path; Python can do the OS-specific translation. Windows, for example, uses \ for most levels of the path, but has a leading device character separated by a :. Rather than force your program to implement the punctuation rules of each operating system, Python provides modules to help you construct and process file names. The os.path module should be used to construct file names. Best practice is to use the os.path.join function to make file names from sequences of strings. We'll look at this in Chapter 33, File Handling Modules.
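The following short sketch shows os.path.join building a file name from a sequence of strings. The path elements are hypothetical and chosen only for illustration.

```python
import os.path

# Hypothetical path elements, for illustration only.
parts = ["data", "logs", "error.log"]

# os.path.join inserts the separator used by the current operating system.
name = os.path.join(*parts)

print(name)
```

The printed result depends on the operating system, which is the point: the program never spells out the separator itself.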

The filename string can be a simple file name, also called a relative path string, where the OS rules of applying a current working directory are used to create a full, absolute path. Or the filename string can be a full absolute path to the file.
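To make the distinction concrete, this sketch turns a relative name into an absolute path by applying the current working directory. The name notes.txt is a hypothetical example; the file does not need to exist for the path functions to work.

```python
import os

relative = "notes.txt"  # hypothetical simple file name

# os.path.abspath applies the current working directory to a relative path.
absolute = os.path.abspath(relative)

print(os.path.isabs(relative))
print(os.path.isabs(absolute))
print(absolute == os.path.join(os.getcwd(), relative))
```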

File Mode Strings. The mode string specifies how the file will be accessed by the program. There are three separate issues addressed by the mode string: opening, text handling and operations.

Opening. For the opening part of the mode string, there are three alternatives:

r: Open for reading. Start at the beginning of the file. If the file does not exist, raise an IOError exception. This is implied if nothing else is specified.

w: Open for writing. Start at the beginning of the file. If the file does not exist, create the file; if it does exist, truncate it.

a: Open for appending. Start at the end of the file. If the file does not exist, create the file.
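The three opening alternatives can be sketched as follows. The file name modes.txt and the scratch directory are assumptions made for illustration. The sketch uses open so that it also runs under Python 3, where IOError is an alias of OSError.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file name

# "r" on a missing file raises an IOError.
try:
    open(path, "r")
    missing_raised = False
except IOError:
    missing_raised = True

# "w" creates the file; opening with "w" again truncates it.
f = open(path, "w")
f.write("first\n")
f.close()
f = open(path, "w")
f.write("second\n")
f.close()

# "a" starts at the end of the file, so the existing content is kept.
f = open(path, "a")
f.write("third\n")
f.close()

f = open(path, "r")
lines = f.readlines()
f.close()

print(missing_raised, lines)
```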

Text Handling. For the text handling part of the mode string, there are three alternatives:

b: Do not interpret newlines as end of line. The file is simply a sequence of bytes.

(nothing): The default, if nothing is specified, is to interpret newlines as end of line. The file is a sequence of text lines.

U: The capital U mode enables "universal newline" processing. This allows your program to cope with the non-standard line-ending characters present in some Windows files. The standard end-of-line is a single newline character, \n. In Windows, an additional \r character may also be present.
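The difference between binary and text handling can be sketched by reading the same Windows-style file both ways. This sketch assumes Python 3, where text mode applies universal newline handling by default and so no U flag is used; under the Python 2 described here, the text-mode read would typically need "rU" to give the same result. The file name is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "windows.txt")  # hypothetical file name

# Write Windows-style line endings as raw bytes.
f = open(path, "wb")
f.write(b"one\r\ntwo\r\n")
f.close()

# Binary mode: the bytes come back exactly as stored.
f = open(path, "rb")
raw = f.read()
f.close()

# Text mode: under Python 3 (assumed here) each \r\n is read as \n.
f = open(path, "r")
text = f.read()
f.close()

print(repr(raw))
print(repr(text))
```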

Operations. For the additional operations part of the mode string, there are two alternatives:

+: Allow both read and write operations.

(nothing): If nothing is specified, allow only reads for files opened with "r"; allow only writes for files opened with "w" or "a".

Typical combinations include "rb" to read binary data and "w+" to create a file for reading and writing.
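The two combinations mentioned above can be sketched as follows. The file name scratch.txt is an assumption, and the seek call is a file method used here only to return to the start of the file before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "scratch.txt")  # hypothetical file name

# "w+" creates (or truncates) the file and allows both writes and reads.
f = open(path, "w+")
f.write("abc")
f.seek(0)  # move back to the start before reading
echoed = f.read()
f.close()

# "rb" reads the same file as a sequence of bytes.
f = open(path, "rb")
data = f.read()
f.close()

print(repr(echoed), repr(data))
```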

Examples. The following examples create file objects for further processing:

myLogin = file( ".login", "r" )
newSource = open( "somefile.c", "w" )
theErrors = file( "error.log", "a" )

Each of these opens a named file in the current working directory. The first example opens a file for reading. The second example creates a new file or truncates an existing file prior to writing. The third example creates a new file or opens an existing file for appending.

Buffering is typically left at the default by specifying nothing. In some situations, however, choosing the buffering explicitly helps. Error logs, for instance, are often unbuffered, so the data is available immediately. Large input files may have large buffer sizes specified to encourage the operating system to optimize input operations by reading a few large chunks of data instead of a large number of smaller chunks.
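A minimal sketch of both buffering choices follows. It assumes Python 3, which accepts a buffering value of 0 only for binary-mode files; the scratch directory and the 64 KB buffer size are illustrative assumptions, not recommendations.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "error.log")  # scratch copy, for illustration

# Buffering 0 means unbuffered: each write goes straight to the operating system.
log = open(path, "ab", 0)
log.write(b"disk full\n")

# The log is still open, yet a second reader already sees the data.
reader = open(path, "rb")
seen = reader.read()
reader.close()
log.close()

# A larger number requests a buffer of about that many bytes.
big = open(path, "rb", 64 * 1024)
contents = big.read()
big.close()

print(repr(seen))
```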

