Two built-in functions create a new file or open an existing
file.
-
file(filename [, mode] [, buffering]) → file object
-
Create a Python file object associated with an operating
system file. filename is the name of the file. mode can be 'r', 'w'
or 'a' for reading (default), writing or appending. The file will be
created if it doesn't exist when opened for writing or appending; it
will be truncated when opened for writing. Add a 'b' to the mode for
binary files. Add a '+' to the mode to allow simultaneous reading
and writing. If the buffering argument is given, 0 means unbuffered,
1 means line buffered, and larger numbers specify the buffer size.
-
open(filename [, mode] [, buffering]) → file object
-
Does the same thing as the file function. The file function is a
standard factory function, like int, float, str, list, tuple, etc.
The name of the function matches the class of the object being
created. The open function, however, is more descriptive of what is
really going on in the program.
Creating File Name Strings. A filename string can be given as a
standard name, or it can use OS-specific punctuation. The standard is
to use / to separate elements of a file path; Python can do
OS-specific translation. Windows, for example, uses \ for most levels
of the path, but has a leading device character separated by a :.
Rather than force your program to implement the various operating
system punctuation rules, Python provides modules to help you
construct and process file names. The os.path module should be used
to construct file names. Best practice is to use the os.path.join
function to make file names from sequences of strings. We'll look at
this in Chapter 33, File Handling Modules.
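As a minimal sketch of the os.path.join practice described above (the path elements are hypothetical, chosen only for illustration, and need not exist on disk):

```python
import os.path

# Hypothetical path elements; nothing here has to exist on disk.
parts = ["data", "logs", "error.log"]

# os.path.join inserts the separator that the current OS expects,
# so the program never hard-codes "/" or "\".
log_name = os.path.join(*parts)
print(log_name)
```

The printed result depends on the platform, which is the point: the same call yields a name with the local separator.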
The filename string can be a simple file name, also called a
relative path string, where the OS rules of applying a current
working directory are used to create a full, absolute path. Or the
filename string can be a full absolute path to the file.
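The difference between a relative and an absolute name can be sketched with two os.path functions, isabs and abspath. The file name below is hypothetical; the file is never opened.

```python
import os

# Hypothetical relative name; the file does not need to exist.
relative_name = "error.log"

# abspath applies the current working directory to build a full path.
absolute_name = os.path.abspath(relative_name)

print(os.path.isabs(relative_name))  # False
print(os.path.isabs(absolute_name))  # True
```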
File Mode Strings. The mode string specifies how the file will be
accessed by the program. There are three separate issues addressed by
the mode string: opening, text handling and operations.
Opening. For the opening part of the mode string, there are three
alternatives:
-
r
-
Open for reading. Start at the beginning of the file. If the
file does not exist, raise an IOError exception. This is implied
if nothing else is specified.
-
w
-
Open for writing. Start at the beginning of the file. If the
file does not exist, create the file; if it does exist, truncate it.
-
a
-
Open for appending. Start at the end of the file. If the
file does not exist, create the file.
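The three opening modes can be compared in a short sketch. It uses open rather than file, and a scratch file in a temporary directory; the names work_dir, path and demo.txt are assumptions made for illustration.

```python
import os
import shutil
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, "demo.txt")

# "r" on a file that does not exist raises an exception.
try:
    open(path, "r")
    missing_raised = False
except IOError:
    missing_raised = True

# "w" creates the file; opening with "w" again truncates it.
f = open(path, "w")
f.write("first\n")
f.close()
f = open(path, "w")
f.write("second\n")
f.close()

# "a" keeps the existing content and writes at the end.
f = open(path, "a")
f.write("third\n")
f.close()

f = open(path, "r")
contents = f.read()
f.close()
print(contents)

shutil.rmtree(work_dir)  # remove the scratch directory
```

Only the second and third lines survive: the second "w" open discarded the first line, and the "a" open added to what was left.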
Text Handling. For the text handling part of the mode string,
there are three alternatives:
-
b
-
Do not interpret newlines as end of line. The file is simply
a sequence of bytes.
-
(nothing)
-
The default, if nothing is specified, is to interpret
newlines as end of line. The file is a sequence of text
lines.
-
U
-
The capital U mode enables "universal newline" processing. This
allows your program to cope with the non-standard line-ending
characters present in some Windows files. The standard end-of-line
is a single newline character, \n. In Windows files, an additional
\r character may precede it.
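A small sketch of the b alternative: in binary mode the bytes come back exactly as written, including any \r characters. The scratch file name is hypothetical. The U mode is not shown here, because its availability varies between Python versions.

```python
import os
import shutil
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, "lines.dat")

# One Windows-style line ending and one standard line ending.
raw = b"one\r\ntwo\n"

f = open(path, "wb")
f.write(raw)
f.close()

# Binary mode performs no newline interpretation.
f = open(path, "rb")
data = f.read()
f.close()
print(data == raw)

shutil.rmtree(work_dir)  # remove the scratch directory
```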
Operations. For the additional operations part of the mode string,
there are two alternatives:
-
+
-
Allow both read and write operations.
-
(nothing)
-
If nothing is specified, allow only reads for files opened
with "r"; allow only writes for files opened with "w" or
"a".
Typical combinations include "rb" to read binary data and "w+" to
create a file for reading and writing.
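The "w+" combination might be used as in the following sketch. It relies on the file object's seek method, which this section has not introduced, to move back to the start before reading; the scratch file name is hypothetical.

```python
import os
import shutil
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, "scratch.txt")

# "w+" creates (or truncates) the file and allows reads as well as writes.
f = open(path, "w+")
f.write("alpha\nbeta\n")

# Return to the beginning so the read sees what was just written.
f.seek(0)
lines = f.readlines()
f.close()
print(lines)

shutil.rmtree(work_dir)  # remove the scratch directory
```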
Examples. The following examples create file objects for further
processing:
myLogin = file( ".login", "r" )
newSource = open( "somefile.c", "w" )
theErrors = file( "error.log", "a" )
Each of these opens a named file in the current working directory.
The first example opens a file for reading. The second example creates a
new file or truncates an existing file prior to writing. The third
example creates a new file or opens an existing file for
appending.
The buffering argument is typically omitted, which leaves the
default in place. In some situations, however, choosing the buffering
explicitly helps. Error logs, for instance, are often unbuffered, so
the data is available immediately. Large input files may be given
large buffer sizes to encourage the operating system to optimize
input operations by reading a few large chunks of data instead of a
large number of smaller chunks.
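The buffering argument could be supplied as in this sketch. The file name and the 64 KB buffer size are arbitrary assumptions. The log is opened line buffered (1) rather than unbuffered (0), since some Python versions accept unbuffered mode only for binary files.

```python
import os
import shutil
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
work_dir = tempfile.mkdtemp()
log_path = os.path.join(work_dir, "error.log")

# Third argument 1: line buffered, a common choice for a text log.
log = open(log_path, "a", 1)
log.write("something failed\n")
log.close()

# A larger number requests a buffer of that size, here 64 KB,
# which may suit a big input file read in binary mode.
big = open(log_path, "rb", 64 * 1024)
data = big.read()
big.close()
print(len(data) > 0)

shutil.rmtree(work_dir)  # remove the scratch directory
```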