What Comparison Means

There are several ways to think about the differences between two files.
One way to think of the differences is as a series of lines that were
deleted from, inserted in, or changed in one file to produce the other
file. diff compares two files line by line, finds groups of lines that
differ, and reports each group of differing lines. It can report the
differing lines in several formats, which have different purposes.
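
As a rough illustration of line-by-line comparison, the sketch below
builds two small sample files and runs diff on them in its default
output format. The file names old.txt and new.txt and the scratch
directory are hypothetical, and the sketch assumes that GNU diff and
mktemp are available on the PATH.

```shell
# Work in a scratch directory so that no real files are touched.
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n'        > "$dir/old.txt"
printf 'alpha\nBETA\ngamma\ndelta\n' > "$dir/new.txt"

# diff exits with status 1 when the files differ, so record the status
# instead of treating it as a failure.
diff_status=0
diff_out=$(diff "$dir/old.txt" "$dir/new.txt") || diff_status=$?

# Each group of differing lines starts with a line such as "2c2".
printf '%s\n' "$diff_out"
echo "exit status: $diff_status"

rm -rf "$dir"
```

For these sample files the report should contain two groups of
differing lines: a change at line 2 and an addition after line 3.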

GNU diff can show whether files are different without detailing the
differences. It also provides ways to suppress certain kinds of
differences that are not important to you. Most commonly, such
differences are changes in the amount of white space between words or
lines. diff also provides ways to suppress differences in alphabetic
case or in lines that match a regular expression that you provide.
These options can accumulate; for example, you can ignore changes in
both white space and alphabetic case.
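
The following sketch shows how such options might be combined. It
uses GNU diff's -q (--brief), -b (--ignore-space-change) and -i
(--ignore-case) options on two hypothetical sample files, a.txt and
b.txt, that differ only in spacing and letter case on their first
line. It assumes GNU diff and mktemp are installed.

```shell
dir=$(mktemp -d)
printf 'Hello   World\nsecond line\n' > "$dir/a.txt"
printf 'hello world\nsecond line\n'   > "$dir/b.txt"

# -q reports only whether the files differ, without listing the lines.
brief_out=$(diff -q "$dir/a.txt" "$dir/b.txt") || true

# -b alone still sees the case change; -i alone still sees the spacing.
space_only=0
diff -b "$dir/a.txt" "$dir/b.txt" > /dev/null || space_only=$?
case_only=0
diff -i "$dir/a.txt" "$dir/b.txt" > /dev/null || case_only=$?

# Given together, the options accumulate and the files compare as equal.
both=0
diff -b -i "$dir/a.txt" "$dir/b.txt" > /dev/null || both=$?

printf '%s\n' "$brief_out"
echo "-b: $space_only  -i: $case_only  -b -i: $both"

rm -rf "$dir"
```

An exit status of 0 means that no differences were reported, and 1
means that some were, so only the combined run treats these two files
as the same.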

Another way to think of the differences between two files is as a
sequence of pairs of bytes that can be either identical or different.
cmp reports the differences between two files byte by byte, instead of
line by line. As a result, it is often more useful than diff for
comparing binary files. For text files, cmp is useful mainly when you
want to know only whether two files are identical, or whether one file
is a prefix of the other.
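
A minimal sketch of these uses of cmp follows. The file names are
hypothetical, and the exact wording of cmp's messages varies between
versions, so the sketch relies mainly on exit statuses. It assumes
cmp and mktemp are installed.

```shell
dir=$(mktemp -d)
printf 'abcdef' > "$dir/full.bin"
printf 'abcdef' > "$dir/copy.bin"
printf 'abc'    > "$dir/prefix.bin"
printf 'abXdef' > "$dir/changed.bin"

# -s (--silent) prints nothing; the exit status alone says whether the
# files are identical (0) or not (1).
same=0
cmp -s "$dir/full.bin" "$dir/copy.bin" || same=$?
prefix=0
cmp -s "$dir/full.bin" "$dir/prefix.bin" || prefix=$?

# Without -s, cmp names the position of the first differing byte.
first_diff=$(cmp "$dir/full.bin" "$dir/changed.bin") || true

# When one file is a prefix of the other, cmp reports reaching end of
# file on the shorter one; the message goes to standard error.
prefix_msg=$(cmp "$dir/full.bin" "$dir/prefix.bin" 2>&1) || true

echo "identical copy: $same  prefix: $prefix"
printf '%s\n' "$first_diff" "$prefix_msg"

rm -rf "$dir"
```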

To illustrate the effect that considering changes byte by byte can
have compared with considering them line by line, think of what
happens if a single newline character is added to the beginning of a
file. If that file is then compared with an otherwise identical file
that lacks the newline at the beginning, diff will report that a blank
line has been added to the file, while cmp will report that almost
every byte of the two files differs.
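
The scenario just described can be tried with a sketch like the one
below. The sample file names orig.txt and shifted.txt are
hypothetical; cmp's -l (--verbose) option is used because it lists
every differing byte, one per line, so the lines can be counted. The
sketch assumes GNU diff, cmp and mktemp are installed.

```shell
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n'   > "$dir/orig.txt"
printf '\none\ntwo\nthree\n' > "$dir/shifted.txt"

# Line by line: a single blank line was added at the top.
diff_out=$(diff "$dir/orig.txt" "$dir/shifted.txt") || true

# Byte by byte: every later byte has shifted by one position.
# cmp exits with status 1 here, which is expected.
cmp -l "$dir/orig.txt" "$dir/shifted.txt" > "$dir/cmp.out" 2> /dev/null || true
cmp_lines=$(wc -l < "$dir/cmp.out" | tr -d ' ')

printf '%s\n' "$diff_out"
echo "bytes reported as different by cmp: $cmp_lines"

rm -rf "$dir"
```

For this sample, diff reports one added line, while cmp lists 13 of
the 14 bytes that the two files have in common by position.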

diff3 normally compares three input files line by line, finds groups
of lines that differ, and reports each group of differing lines. Its
output is designed to make it easy to inspect two different sets of
changes to the same file.
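
As a sketch of that use, the example below starts from a hypothetical
common ancestor file named older and two edited copies named mine and
yours, each changing a different line, and then runs diff3 on the
three. It assumes GNU diff3 and mktemp are installed.

```shell
dir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n6\n'    > "$dir/older"
printf '1\ntwo\n3\n4\n5\n6\n'  > "$dir/mine"
printf '1\n2\n3\n4\nfive\n6\n' > "$dir/yours"

# The common ancestor is conventionally given as the second file.
diff3_out=$(diff3 "$dir/mine" "$dir/older" "$dir/yours") || true

# Each group of differing lines starts with "====", followed by the
# number of the one file that differs from the other two, if any.
printf '%s\n' "$diff3_out"

rm -rf "$dir"
```

Here the group for the change made in mine should be marked ====1 and
the group for the change made in yours ====3.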
- Hunks: Groups of differing lines.
- White Space: Suppressing differences in white space.
- Blank Lines: Suppressing differences in blank lines.
- Case Folding: Suppressing differences in alphabetic case.
- Specified Folding: Suppressing differences that match regular expressions.
- Brief: Summarizing which files are different.
- Binary: Comparing binary files or forcing text comparisons.