A useful technique within PostgreSQL is to use the COPY command to insert values
directly into tables from external files. Files used for input by COPY must either be
in standard ASCII text format, whose fields are delimited by a uniform symbol, or in PostgreSQL's binary table format.
Common delimiters for ASCII files are tabs and commas. When using an ASCII formatted input file with
COPY, each line within the file will be treated as a row of data to be inserted and
each delimited field will be treated as a column value.
The COPY FROM command operates much faster than a normal
INSERT command because the data is read as a single transaction directly
to the target table. On the other hand, it is a very strict format, and the entire COPY
procedure will fail if just one line is malformed.
The following is the syntax for using the COPY FROM command, where
table_name
is the table that you wish to insert values into and
filename
is
the absolute system path to the from which file to be read:
COPY [ BINARY ]
table_name
[ WITH OIDS ]
FROM { '
filename
' | stdin }
[ [USING] DELIMITERS '
delimiter
' ]
[ WITH NULL AS '
null_string
' ]
-
BINARY
-
Indicates that input will come from a binary
file previously created by the COPY TO command.
-
table_name
-
The name of the table you are copying.
-
WITH OIDS
-
Instructs PostgreSQL to retrieve all of the OIDs of the table represented by
filename
from the first line of the file.
-
FROM { '
filename
' | stdin }
-
Indicates that either the file specified with
filename
or standard input (stdin)
should be read by PostgreSQL.
-
[ USING ] DELIMITERS '
delimiter
'
-
Indicates the character provided with
delimiter
should be used as a delimiter when parsing input.
This clause is not applicable to files that were output in PostgreSQL's binary format.
-
WITH NULL AS '
null_string
'
-
Indicates that the character(s) provided with
null_string
should be interpreted as
NULL values. This clause is not applicable to files that were output in PostgreSQL's binary format.
When preparing to copy a file from the underlying operating system, remember that the file specified must be readable
by the
postmaster
process (i.e., the user which PostgreSQL is running as), since the
backend reads the file directly. Additionally, the filename must be provided with an absolute path; an attempt to use a
relative path will result in an error.
If you are using an ASCII formatted input file, a
delimiter
value may be passed to the
DELIMITERS clause, which defines the character which delimits columns on a single line in
the filename. If omitted, PostgreSQL will assume that the ASCII file is tab-delimited. The optional
WITH NULL clause allows you to specify in what form to expect
NULL values. If omitted, PostgreSQL interprets the \N
sequence as a NULL value to be inserted (e.g., blank fields in a source file will be
treated as blank string constants, rather than NULL, by default).
The stdin term may be supplied as the source for the FROM
clause if you wish to type values in manually or paste from another location directly into a terminal session. If you
choose to enter values from stdin, you must terminate the input stream with a \. sequence
(backslash-period) followed immediately by a newline.
Example 4-19 shows the contents of a file that was output in ASCII format by PostgreSQL.
The file in Example 4-19 is comma-delimited and uses \null to
represent NULL values. It contains row data from the Book Town
subjects table.
Example 4-19. An example ASCII copy file
1,Business,Productivity Ave
2,Children's Books,Kids Ct
3,Classics,Academic Rd
4,Computers,Productivity Ave
5,Cooking,Creativity St
12,Religion,\null
8,History,Academic Rd
9,Horror,Black Raven Dr
10,Mystery,Black Raven Dr
11,Poetry,Sunset Dr
13,Romance,Main St
14,Science,Productivity Ave
15,Science Fiction,Main St
0,Arts,Creativity St
6,Drama,Main St
7,Entertainment,Main St
The statement in Example 4-20 copies the file
(
/tmp/subjects.sql
) into a table within the booktown database's
subjects table.
Example 4-20. Copying an ASCII file
booktown=#
COPY subjects FROM '/tmp/subjects.sql'
booktown-#
USING DELIMITERS ',' WITH NULL AS '\null';
COPY
The COPY command can also input and output both binary formatted data. Specifying
to the COPY FROM command the BINARY keyword requires
that the input file specified was created with the COPY TO command in PostgreSQL's
binary format. Binary files can be read more quickly than ASCII files, but are not readable or modifiable with plain-text
editors as ASCII files are.
Example 4-21 uses the COPY command to insert the rows in
the binary output file from the subjects table within the
booktown database.
Example 4-21. Copying a binary file
booktown=#
COPY BINARY subjects FROM '/tmp/subjects.sql';
COPY
The COPY command is
not
the same as the
psql
\copy command. The \copy command
accepts the same syntax (though without a terminating semicolon), and therefore performs the operation via the
psql
client, rather than the
postmaster
server. The result is that
\copy operates with the permissions of the user running
psql
rather
than of the user the
postmaster
is running as.
The syntax of COPY FROM may be used with nearly identical syntax to send a
table's data to a file. You need only replace the FROM keyword with the
TO keyword. Additionally, the stdin keyword may be
replaced with stdout if you wish to redirect to standard output rather than to a file
(e.g., to the screen, in
psql
). Example 4-22 shows how you would
copy the books table to an ASCII formatted file.
Example 4-22. Copying the books table to an ASCII file
booktown=#
COPY books TO 'filename';
COPY
Files containing row data with object identifier values (created with the
COPY TO command, involving the WITH OIDS clause) can
be read by a COPY FROM command, if the WITH OIDS
clause is specified. Attempts to use the COPY FROM command with the
WITH OIDS clause on a file that wasn't given OIDs during its creation will
fail.
The ability to copy values into a table with object-identifiers is a special capability reserved for
COPY. This value cannot be modified by INSERT or
UPDATE, as it is a system value. If you are not careful, you may end up with two rows
which have the same OID, which potentially negates their usefulness.