Tab Files: Nothing Special
Tab-delimited files are text files organized around data that has
rows and columns. This format is used to exchange data between
spreadsheet programs or databases. A tab-delimited file uses just two
punctuation rules to encode the data.
- Each row is delimited by an ordinary newline character. This
  is usually the standard \n. If you are exchanging files
  across platforms, you may need to open files for reading using the
  "rU" mode to get universal newline handling.
- Within a row, columns are delimited by a single character,
  often \t. The column punctuation character is chosen so that it
  will never occur in the data. It is usually (but not always) an
  unprintable character like \t.
In the ideal case, a tab-delimited file will have the same number of
columns in each row, and the first row will be column titles. Almost as
pleasant is a file without column titles, but with a known sequence of
columns. In the more complex cases, the number of columns per row
varies.
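As a rough sketch of these cases, the function below treats the first row as column titles and pairs each later row with them. The name rows_as_dicts, the sample data, and the choice to raise ValueError when a row has the wrong number of columns are all assumptions made for illustration.

```python
def rows_as_dicts(lines, delimiter="\t"):
    """Yield one dict per data row, keyed by the titles in the first row."""
    rows = iter(lines)
    first = next(rows, None)
    if first is None:
        return  # no heading row, so there is nothing to yield
    titles = first.rstrip("\n").split(delimiter)
    for number, line in enumerate(rows, 2):
        values = line.rstrip("\n").split(delimiter)
        if len(values) != len(titles):
            # The "more complex" case: the column count varies.
            raise ValueError("line %d has %d columns, expected %d"
                             % (number, len(values), len(titles)))
        yield dict(zip(titles, values))

# Hypothetical sample: a heading row followed by one data row.
sample = ["name\trig\tlength\n", "Red Witch\tsloop\t28\n"]
boats = list(rows_as_dicts(sample))
```

A file without column titles but with a known column sequence could be handled the same way by supplying the titles from the program instead of reading them from the first row.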
When we have a single, standard punctuation mark, we can simply
use two string methods to process files. We use the split method of
a string to parse a row into a list of columns. We use the join
method of the delimiter string, applied to a list, to assemble a row.
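The two operations are inverses of each other, which a short round trip can illustrate. The sample row here is hypothetical.

```python
line = "Red Witch\tsloop\t28\n"   # hypothetical row of data

# Parse: drop the row delimiter, then split on the column delimiter.
columns = line.rstrip("\n").split("\t")

# Assemble: join the columns with tabs and restore the newline.
rebuilt = "\t".join(columns) + "\n"
```

Passing "\t" explicitly matters: split with no argument splits on any run of whitespace, so it would break "Red Witch" into two columns and silently drop empty columns.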
We don't actually need a separate module to handle tab-delimited
files. We looked at a related example in the section called “Reading a Text File”.
Reading. The most general case for reading tab-delimited data is shown in
the following example.
myFile = open("somefile", "rU")
for aRow in myFile:
    # Remove the row delimiter so it does not end up in the last column.
    print(aRow.rstrip("\n").split("\t"))
myFile.close()
Each row will be a list of column values.
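For comparison, here is a minimal sketch of the same reading loop wrapped in a function. It assumes a current Python 3, where opening a file in text mode already applies universal newline handling, so no "rU" mode is needed. The function name read_tab_file and the throwaway demonstration file are assumptions for illustration.

```python
import os
import tempfile

def read_tab_file(path):
    """Return the rows of a tab-delimited file as lists of column strings."""
    rows = []
    # The with statement closes the file even if a row fails to parse.
    with open(path, "r") as source:
        for line in source:
            rows.append(line.rstrip("\n").split("\t"))
    return rows

# Demonstration with a throwaway file that uses Windows-style line endings.
handle, path = tempfile.mkstemp(suffix=".tab")
with os.fdopen(handle, "wb") as target:
    target.write(b"name\trig\r\nRed Witch\tsloop\r\n")
rows = read_tab_file(path)
os.remove(path)
```

Every value comes back as a string, so numeric columns still need an explicit conversion such as int or float after the split.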
Writing. The writing case is the inverse of the reading case.
Essentially, we use "\t".join( someList ) to create the
tab-delimited row. Here's our sailboat example, done as tab-delimited
data.
test = open("boats.tab", "w")
# The first row holds the column titles.
test.write("\t".join(Boat.csvHeading))
test.write("\n")
for d in db:
    # Convert every value to a string before joining with tabs.
    test.write("\t".join(map(str, d.csvRow())))
    test.write("\n")
test.close()
Note that some elements of our data objects aren't string values.
In this case, the value for sails is a tuple, which needs to be
converted to a proper string. The expression map(str, someList)
applies the str function to each element of the original list,
creating a new list which will have all string values. See the section
called “Sequence Processing Functions: map, filter, reduce and zip”.
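The effect of map(str, someList) on mixed values can be seen in a self-contained sketch. The heading and rows below are hypothetical stand-ins for Boat.csvHeading and the csvRow() results, and write_tab is an illustrative name. An in-memory buffer replaces the output file so the result can be inspected directly.

```python
import io

# Hypothetical stand-in data: each row mixes strings and a tuple.
heading = ["name", "rig", "sails"]
rows = [
    ["Red Witch", "sloop", ("main", "jib")],
    ["Argo", "ketch", ("main", "mizzen", "jib")],
]

def write_tab(target, heading, rows):
    """Write the heading and rows to target as tab-delimited text."""
    target.write("\t".join(heading) + "\n")
    for row in rows:
        # map(str, row) converts every value, including the tuple, to text.
        target.write("\t".join(map(str, row)) + "\n")

buffer = io.StringIO()
write_tab(buffer, heading, rows)
text = buffer.getvalue()
```

The tuple is written in its str form, for example ('main', 'jib'), so a program reading the file back gets that text rather than a tuple. This approach also relies on the rule stated earlier: no value may contain the tab character itself.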