GNU tar is able to create and read compressed archives. It supports
the gzip and bzip2 compression programs. For backward
compatibility, it also supports the compress command, although
we strongly recommend against using it, since there is a patent
covering the algorithm it uses and you could be sued for patent
infringement merely by running compress! Besides, it is less
effective than gzip and bzip2.
Creating a compressed archive is simple: you just specify a
compression option along with the usual archive creation
commands. The compression option is -z (--gzip) to
create a gzip-compressed archive, -j
(--bzip2) to create a bzip2-compressed archive, and
-Z (--compress) to use the compress program.
For example:
$ tar cfz archive.tar.gz .
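As a minimal sketch of the creation step, the script below builds a gzip-compressed archive and, if bzip2 happens to be installed, a bzip2-compressed one, then asks each compressor to verify the result with its -t test option. The scratch directory and the names project and hello.txt are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample data for this sketch.
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo "hello" > "$workdir/project/hello.txt"

# -z: filter the new archive through gzip.
tar cfz "$workdir/project.tar.gz" -C "$workdir" project
gz_check=fail
gzip -t "$workdir/project.tar.gz" && gz_check=ok

# -j: filter the new archive through bzip2 (skipped if bzip2 is missing).
bz_check=skipped
if command -v bzip2 >/dev/null 2>&1; then
    tar cfj "$workdir/project.tar.bz2" -C "$workdir" project
    bz_check=fail
    bzip2 -t "$workdir/project.tar.bz2" && bz_check=ok
fi

echo "gzip: $gz_check, bzip2: $bz_check"
rm -rf "$workdir"
```

The -C option is used here only so that member names are stored relative to the scratch directory.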
Reading a compressed archive is even simpler: you don't need to specify
any additional options, as GNU tar recognizes the archive format
automatically. Thus, the following commands will list and extract the
archive created in the previous example:
# List the compressed archive
$ tar tf archive.tar.gz
# Extract the compressed archive
$ tar xf archive.tar.gz
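The same round trip can be sketched as a self-contained script: create a gzip-compressed archive, then list and extract it without naming any compression option. The directory names project and restore are assumptions made for this illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample data for this sketch.
workdir=$(mktemp -d)
mkdir "$workdir/project" "$workdir/restore"
echo "hello" > "$workdir/project/hello.txt"
tar cfz "$workdir/project.tar.gz" -C "$workdir" project

# No -z here: tar detects the gzip format when reading a regular file.
listing=$(tar tf "$workdir/project.tar.gz")
tar xf "$workdir/project.tar.gz" -C "$workdir/restore"

# Read back the extracted file to confirm the contents survived.
restored=$(cat "$workdir/restore/project/hello.txt")
echo "$listing"
rm -rf "$workdir"
```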
The only case when you have to specify a decompression option while
reading the archive is when reading from a pipe or from a tape drive
that does not support random access. However, in this case GNU tar
will indicate which option you should use. For example:
$ cat archive.tar.gz | tar tf -
tar: Archive is compressed. Use -z option
tar: Error is not recoverable: exiting now
If you see such diagnostics, just add the suggested option to the
invocation of GNU tar:
$ cat archive.tar.gz | tar tfz -
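A short sketch of the pipe case follows. It names the decompression option explicitly, which works whether or not the tar in use can detect compression on a pipe, and also shows the equivalent form in which gzip does the decompression before tar sees the data. The file names are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample data for this sketch.
workdir=$(mktemp -d)
mkdir "$workdir/docs"
echo "piped" > "$workdir/docs/note.txt"
tar cfz "$workdir/docs.tar.gz" -C "$workdir" docs

# Reading from standard input: give tar the -z option explicitly.
piped_listing=$(cat "$workdir/docs.tar.gz" | tar tfz -)

# Equivalent: let gzip decompress, so tar receives a plain archive.
plain_listing=$(gzip -dc "$workdir/docs.tar.gz" | tar tf -)

echo "$piped_listing"
rm -rf "$workdir"
```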
Note also that there are several restrictions on operations on
compressed archives. First of all, compressed archives cannot be
modified, i.e., you cannot update (--update (-u)) them or delete
(--delete) members from them. Likewise, you cannot add to a
compressed archive, whether by appending files with --append (-r)
or by appending another tar archive with --concatenate (-A).
Secondly, multi-volume archives cannot be compressed.
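One way around the modification restriction, sketched here under the assumption that there is enough disk space for the uncompressed archive, is to decompress the archive, modify the plain tar file, and compress it again. The file names are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area and sample data for this sketch.
workdir=$(mktemp -d)
echo "one" > "$workdir/first.txt"
echo "two" > "$workdir/second.txt"
tar cfz "$workdir/archive.tar.gz" -C "$workdir" first.txt

# 1. Decompress in place; gzip replaces archive.tar.gz with archive.tar.
gzip -d "$workdir/archive.tar.gz"

# 2. Append to the uncompressed archive, which tar is able to modify.
tar rf "$workdir/archive.tar" -C "$workdir" second.txt

# 3. Compress again; gzip replaces archive.tar with archive.tar.gz.
gzip "$workdir/archive.tar"

members=$(tar tfz "$workdir/archive.tar.gz")
echo "$members"
rm -rf "$workdir"
```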
The following table summarizes the compression options used by GNU tar.
-z
--gzip
--gunzip
--ungzip
Filter the archive through gzip.
You can use --gzip and --gunzip on physical devices
(tape drives, etc.) and remote files as well as on normal files; data
to or from such devices or remote files is reblocked by another copy
of the tar program to enforce the specified (or default) record
size. The default compression parameters are used; if you need to
override them, set the GZIP environment variable, e.g.:
$ GZIP=--best tar cfz archive.tar.gz subdir
Another way would be to avoid the --gzip (--gunzip, --ungzip, -z) option
and run gzip explicitly, piping the output of tar into it:
$ tar cf - subdir | gzip --best -c - > archive.tar.gz
About corrupted compressed archives: gzip'ed files have no
redundancy, for maximum compression. The adaptive nature of the
compression scheme means that the compression tables are implicitly
spread all over the archive. If you lose a few blocks, the dynamic
construction of the compression tables becomes unsynchronized, and there
is little chance that you could recover later in the archive.
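Recovery aside, damage can at least be detected before extraction. The sketch below, which uses hypothetical file names, simulates a damaged archive by keeping only its first 100 bytes and then runs gzip's -t integrity test on both the intact and the damaged copy.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area with enough sample data to compress.
workdir=$(mktemp -d)
mkdir "$workdir/data"
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of sample data"
    i=$((i + 1))
done > "$workdir/data/sample.txt"
tar cfz "$workdir/data.tar.gz" -C "$workdir" data

# Simulate losing the tail of the archive: keep only the first 100 bytes.
dd if="$workdir/data.tar.gz" of="$workdir/damaged.tar.gz" bs=100 count=1 2>/dev/null

# gzip -t exits non-zero when the compressed stream is not intact.
if gzip -t "$workdir/data.tar.gz" 2>/dev/null; then intact=ok; else intact=bad; fi
if gzip -t "$workdir/damaged.tar.gz" 2>/dev/null; then damaged=ok; else damaged=bad; fi

echo "intact copy: $intact, damaged copy: $damaged"
rm -rf "$workdir"
```

This only reports that the stream is broken; it does not recover any of the members stored after the damaged point.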
There are pending suggestions for having a per-volume or per-file
compression in GNU tar. This would allow for viewing the
contents without decompression, and for resynchronizing decompression at
every volume or file, in case of corrupted archives. Doing so, we might
lose some compressibility. But this would make recovery easier.
So, there are pros and cons. We'll see!
-j
--bzip2
Filter the archive through bzip2. Otherwise like --gzip.
-Z
--compress
--uncompress
Filter the archive through compress. Otherwise like --gzip.
The GNU Project recommends you not use
compress, because there is a patent covering the algorithm it
uses. You could be sued for patent infringement merely by running
compress.
--use-compress-program=prog
Use the external compression program prog. Use this option if you
have a compression program that GNU tar does not support. There
are two requirements with which prog should comply:
First, when called without options, it should read data from standard
input, compress it, and write it to standard output.
Secondly, when called with the -d argument, it should do exactly
the opposite, i.e., read the compressed data from standard input
and produce uncompressed data on standard output.
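The two requirements can be checked by hand before handing a program to tar. In the sketch below, gzip stands in for the external program purely because it is widely installed and satisfies both requirements; the variable prog and the file names are assumptions made for this illustration, and GNU tar is assumed.

```shell
#!/bin/sh
set -eu

# Stand-in for the external compressor; substitute your own program.
prog=gzip

# Requirement check: compress from stdin to stdout without options,
# then reverse the operation with -d.
roundtrip=$(printf 'sample' | "$prog" | "$prog" -d)

# Hypothetical scratch area and sample data for this sketch.
workdir=$(mktemp -d)
mkdir "$workdir/src"
echo "data" > "$workdir/src/file.txt"

# Create and list the archive through the external program.
tar --use-compress-program="$prog" -cf "$workdir/src.tar.gz" -C "$workdir" src
listing=$(tar --use-compress-program="$prog" -tf "$workdir/src.tar.gz")

echo "$listing"
rm -rf "$workdir"
```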
Published under the terms of the GNU General Public License