CMP will compare files (or groups of files) and report any differences. Output is suitable for piping, or processing by other programs. A value returned in ERRORLEVEL lets batch files take action based on whether files are the same or differ.
The two versions operate the same and have the same features,
except that the 32-bit version supports long filenames and more lines per file
(about 2000 million versus about 32 thousand).
If you typically run CMP in a DOS box under Windows 9x or NT, the 32-bit
version is the one you want.
You may wish to rename the version you use more often to the simpler CMP.EXE. All the examples in this user guide will assume you've done that. Otherwise, just substitute CMP16 or CMP32 wherever you see CMP in the examples.
The full command form is one of

    cmp [options] file1 file2 [>reportfile]
    cmp [options] files directory [>reportfile]

In the second form, files may be any number of file specs, possibly containing wildcards, and directory may be a disk letter (with colon) or path (with or without trailing backslash). Please be aware that the 16-bit and 32-bit CMP programs expand wildcards slightly differently because the 32-bit version supports long filenames. Thus the 32-bit version would expand abc* to include all files, with any extension or none, whose names start with abc; with the 16-bit version you need abc*.* to get the same result.
Example:
    cmp -L5 zonk1 b:zonk2

will compare file zonk1 (on the current drive and directory) to file zonk2 in the current directory of drive b, limiting look-ahead to five lines (the -L5 option).
Another example:
    cmp a:*.doc xx.htm b:

will compare all the *.doc files in the current directory of drive a, plus xx.htm in the current directory of the current disk, to files of the same names in the current directory of drive b.
You have a lot of freedom about how you enter options. You can use a leading hyphen or slash mark; you can use upper- or lower-case letters. You can leave spaces between options or combine them. For instance, the following are just some of the different ways of turning on the W100 and B options:

    /w100 /b
    /w100/b
    /w100B
    -W100-B
    -W100 -b

This document will always use capital letters for the options, to make it easier to distinguish letter l and figure 1.
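To illustrate why these spellings are equivalent, here is a minimal Python sketch. It is not CMP's actual parser: the name `split_options` is hypothetical, and the sketch assumes options made of a single letter plus an optional number (such as W100 and B). It does not handle /QQ, /N, /F-4, /?, /0, or /1.

```python
import re

# Assumption: one option = optional leading "-" or "/", one letter,
# then optional digits. Spaces between options simply fail to match
# and are skipped.
_OPTION = re.compile(r"[-/]?([A-Za-z])(\d*)")

def split_options(text):
    """Return a list of (LETTER, number-or-None) pairs found in text."""
    result = []
    for letter, digits in _OPTION.findall(text):
        result.append((letter.upper(), int(digits) if digits else None))
    return result

for spelling in ["/w100 /b", "/w100/b", "/w100B", "-W100-B", "-W100 -b"]:
    print(spelling, "->", split_options(spelling))
```

Under those assumptions, every spelling yields the same two options, W with value 100 and B with no value.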
/?
    cmp /? >prn

will send the help text to the printer.
/0 and /1
/0 returns 0 if there are differences or 1 if there are no differences; /1 returns 1 for differences or 0 for no differences. For more details, see Return values below.
/D
/Z
If you use the /Z option on the command line, any options in the environment variable will be disregarded, and so will any preceding options on the command line. This can be useful in batch files, to make sure that the action of CMP is controlled only by the options on the command line, and not by any settings in the environment variable.
The /Z option is the only one whose effect can't be reversed. If you use /Z more than once, CMP disregards the environment variable and all command-line options up through the last /Z.
/B
Note that runs of spaces and/or tabs are compressed to a single space, not completely removed. Thus CMP will always consider "ab" (with no space between "a" and "b") different from "a b" (any spaces or tabs between "a" and "b").
Regardless of this option, CMP will always disregard spaces and tabs
at the ends of lines. Some more esoteric details are given below in
"How spaces and tabs are handled".
/E
CMP normally disregards completely blank lines. Use the /E option to make CMP keep track of blank lines and report added or deleted blank lines as differences.
/I
/L
look-ahead
The significance of look-ahead is this. Suppose CMP finds, after lines 28-31 of file 1 match lines 38-41 of file 2, that line 32 of file 1 doesn't match line 42 of file 2. In this case, CMP has to look ahead at line 33 of file 1 and line 43 of file 2.
          file 1                    file 2
    ==================        ==================
    (28) line a               (38) line a
    (29) line b               (39) line b
    (30) line c               (40) line c
    (31) line d               (41) line d
    (32) line e               (42) something different
    (33: look ahead)          (43: look ahead)

Maybe they match, or maybe line 43 of file 2 matches line 32 of file 1 (meaning that line 42 of file 2 is new in that file and doesn't exist in file 1). The /L option tells CMP how many lines to look ahead trying to find a match after lines that don't match. If CMP examines that number of lines from both files without finding a match, it will report that fact and stop processing. (If you wish, you can then re-run CMP with a higher /L value.)
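The idea of bounded look-ahead can be sketched in Python as follows. This is only one plausible way to resynchronize after a mismatch, not CMP's actual algorithm: the function name `resync` and the order in which candidate line pairs are tried are assumptions made for illustration.

```python
def resync(file1, file2, i, j, look_ahead):
    """After file1[i] != file2[j], search for the nearest matching pair.

    Tries offsets (di, dj), each at most look_ahead, in order of
    increasing di + dj. Returns (di, dj) for the first pair of equal
    lines, or None if no match exists within the look-ahead limit.
    """
    for total in range(1, 2 * look_ahead + 1):
        first = max(0, total - look_ahead)
        last = min(total, look_ahead)
        for di in range(first, last + 1):
            dj = total - di
            if i + di < len(file1) and j + dj < len(file2):
                if file1[i + di] == file2[j + dj]:
                    return di, dj
    return None

# The situation from the diagram: one new line in file 2.
file1 = ["line a", "line b", "line c", "line d", "line e", "line f"]
file2 = ["line a", "line b", "line c", "line d",
         "something different", "line e", "line f"]
print(resync(file1, file2, 4, 4, look_ahead=5))
```

In this example the sketch reports offsets (0, 1): the mismatched line of file 1 matches the line one position further on in file 2, so the line in between is new in file 2.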
There's no specific limit for look-ahead by itself, but /L and /W (below) have a combined limit.
In the 16-bit version, 64 K (65,536 bytes) is available for look-ahead, and look-ahead times (width+2) must not exceed that value.
In the 32-bit version, the look-ahead and width are limited only by available memory (including virtual memory).
In either version, if you exceed the available space with the combined /L and /W options, CMP will display a message inviting you to choose lower values.
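The 16-bit limit is simple arithmetic, and a short Python sketch can show which /L and /W combinations fit. The function names are hypothetical; the only facts used are the 65,536-byte figure and the look-ahead times (width+2) rule stated above.

```python
LOOKAHEAD_BYTES_16BIT = 65536  # 64 K, as stated for the 16-bit version

def fits_16bit(look_ahead, width):
    """True if look-ahead times (width + 2) does not exceed 64 K."""
    return look_ahead * (width + 2) <= LOOKAHEAD_BYTES_16BIT

def max_look_ahead_16bit(width):
    """Largest look-ahead that still fits for the given width."""
    return LOOKAHEAD_BYTES_16BIT // (width + 2)

print(fits_16bit(20, 100))        # 20 * 102 = 2040 bytes
print(max_look_ahead_16bit(254))  # 65536 // 256
```

For instance, with a width of 254 each line needs 256 bytes, so the largest look-ahead that fits in the 16-bit version is 256 lines.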
/T
The /T option has no effect unless you also use /B to turn off the compression of runs of spaces.
Some more esoteric details are given below in "How spaces and tabs are handled".
/W
width
/W
value if you suspect that some lines
contain differences beyond the original width.
You can suppress the messages about truncation of individual lines by using the /Q option, but CMP will still display the message at the end so you'll know that some lines were not examined completely and what you can do about it.
The effective width of a line, which is measured against /W width, may be different from that line's length in characters, depending on how spaces and tabs are handled (see below). If you want to know the actual maximum effective line width in a file, simply compare the file to itself with a small width value and the /Q option to suppress messages, like this:

    cmp /QW10 file1 file1

The maximum value for /W depends on the value given for /L (above).
The /B, /E, and /T options, above, will affect the formatting of the lines reported as different.
/F
n
n is a minimum field width. If you specify /F4, line numbers for any differences in lines 1 through 9999 will be right justified in a four-character field. Any larger line numbers will take additional positions to the right, like this:

    1.  99>text1a
    2.  99>text1b
    2. 100>text1c
    1.2398>text2a
    2.2399>text2b
    1.23468>text3a
    1.23469>text3b
    2.23469>text3c

If you prefer to left justify line numbers in a field of stated width, put a minus sign before n. For instance, the output under the /F-4 option would line up like the above, but spaces would appear after the short line numbers instead of before.
The default is /F0, which displays each line number with no padding, like this:

    1.99>text1a
    2.99>text1b
    2.100>text1c
    1.2398>text2a
    2.2399>text2b
    1.23468>text3a
    1.23469>text3b
    2.23469>text3c
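The padding rule for the line-number field can be mimicked in a few lines of Python. This is a sketch of the layout described above, not CMP's own code; the function name `format_line` and its parameters are hypothetical.

```python
def format_line(file_no, line_no, text, field=0, sep=">"):
    """Build one difference line: file number, period, line number, separator.

    field > 0 right-justifies the line number in that many characters,
    field < 0 left-justifies it, and field == 0 adds no padding.
    Numbers wider than the field simply take more room.
    """
    digits = str(line_no)
    if field < 0:
        digits = digits.ljust(-field)
    else:
        digits = digits.rjust(field)
    return f"{file_no}.{digits}{sep}{text}"

print(format_line(1, 99, "text1a", field=4))
print(format_line(1, 23468, "text3a", field=4))
print(format_line(2, 99, "text1b", field=-4))
print(format_line(2, 100, "text1c"))
```

The four calls correspond to /F4 with a short number, /F4 with a number too wide for the field, /F-4, and the default /F0.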
/N
str
If you want certain characters like =, |, <, or space in your separator, you can't simply type them because DOS gives them special meanings.
Use special "numeric escape sequences" to represent those characters in the /N option. For example, to make your output look like this:

    1. 99 : text1a
    2. 99 : text1b
    2.100 : text1c
    1.398 : text2a
    2.399 : text2b

use the sequence \32 to represent the space character, like this:

    cmp /N\32:\32 /F3 file1 file2

The numeric escape sequences are a backslash (\) followed by the numeric value of the character, up to three decimal digits. A leading 0 denotes octal; a leading 0x or 0X denotes hexadecimal. Here are some sample sequences:
instead of | use any of |
---|---|
(space) | \32 \0x20 \040 |
(tab) | \9 \0x09 \011 |
< (less) | \60 \0x3C \074 |
= (equal) | \61 \0x3D \075 |
> (greater) | \62 \0x3E \076 |
| (vertical bar) | \124 \0x7C \0174 |
" (double quote) | \34 \0x22 \042 |
The above are only examples: you can enter any character as a numeric sequence. For example, capital A would be \65, \0x41, or \0101.
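To check a separator string before using it, the decoding rules above can be sketched in Python. This is an illustration only, not CMP's parser. The name `decode_separator` is hypothetical, and the digit limits for the octal and hexadecimal forms (three octal digits after the leading 0, two hex digits after 0x) are assumptions based on the sample table.

```python
import re

# Backslash followed by: 0x/0X + hex digits, or 0 + octal digits,
# or up to three decimal digits (digit limits are assumptions).
_ESCAPE = re.compile(r"\\(0[xX][0-9A-Fa-f]{1,2}|0[0-7]{0,3}|[1-9][0-9]{0,2})")

def _to_char(match):
    token = match.group(1)
    if token[:2] in ("0x", "0X"):
        value = int(token, 16)   # leading 0x or 0X: hexadecimal
    elif token.startswith("0"):
        value = int(token, 8)    # leading 0: octal
    else:
        value = int(token, 10)   # otherwise decimal
    return chr(value)

def decode_separator(spec):
    """Replace each numeric escape sequence in spec with its character."""
    return _ESCAPE.sub(_to_char, spec)

print(repr(decode_separator(r"\32:\32")))
```

Here the argument of /N\32:\32 decodes to a colon with a space on each side, which is the separator shown in the sample output.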
/Q
/W
, above),
and the final display of line counts for the two files.
If any lines were truncated, a single message will still appear at the end of processing.
For even quieter operation, use the /QQ option, described immediately below. (Separate /Q and /QQ options exist for historical reasons. /QQ was added in response to user requests, rather than change the operation of /Q, which existing users might be depending on.)
/QQ
The /QQ option will turn on the /Q option, suppress the blank lines between difference blocks, and send the header (identification of files) and footer (summary counts of differences found) to stderr instead of stdout. The result is that, if you have the /QQ option turned on, you can redirect the output of CMP and you will get only the difference lines from the two files. You still get line numbers, but by using the /F option you can force them to a fixed format that is easily stripped away.
Example:

    cmp /QQ /F6 file1 file2 >output

will send just the different lines to the file called output. Non-essential messages will be suppressed, because the /QQ option turns on the /Q option. Essential messages will appear on your screen because they are written to stderr and are not redirected. Assuming each file has fewer than a million lines, each line written to the output file will have a 9-character prefix: file number (1 or 2), a period, a six-digit line number field, and the separator character >.
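As an illustration of how easily the fixed prefix can be stripped, here is a minimal Python sketch that splits one line of such output into its parts. The function name `split_prefix` is hypothetical. The sketch assumes the default separator > and line numbers that fit in the field, as in the example above.

```python
def split_prefix(line, field=6):
    """Split a difference line into (file number, line number, text).

    Expects the layout produced with a fixed field width: one digit for
    the file number, a period, `field` characters of line number, and
    the separator character ">".
    """
    end = 2 + field
    if len(line) <= end or line[1] != "." or line[end] != ">":
        raise ValueError("line does not have the expected prefix")
    file_no = int(line[0])
    line_no = int(line[2:end])  # int() ignores the padding spaces
    return file_no, line_no, line[end + 1:]

print(split_prefix("1.    99>text1a"))
```

Only the first 9 characters are examined, so any > characters in the text of the line itself are left alone.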
You can also put options in the ORS_CMP environment variable. You have the same freedom as on the command line: leading slashes or hyphens, space separation or options run together, caps or lower case.
CMP processes the environment variable before any command-line options, which means that an option on the command line will override the corresponding option in the environment variable.
The toggles, /B /E /I /Q /QQ /T, reverse their state every time you specify them. So if you usually want case-blind comparisons, put /I in the environment variable. Then, if you want case-sensitive comparisons for a particular run, simply put /I on the command line and that will reverse the setting from the environment variable. To alter the settings of other options, like /L and /F, simply put the option on the command line with the desired setting.
Particularly in a batch file, you may want to be sure that the environment variable, if set, doesn't affect the option settings. To ensure this, put the /Z option first on the command line.
If you have any question which options are in effect, simply use /D on the command line to display all option values.
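The interplay of the environment variable, the toggles, and /Z can be modeled with a short Python sketch. It is a simplified model, not CMP's code: the name `resolve_toggles` is hypothetical, every toggle is assumed to start in the off state, the options are assumed to be already split into separate names, and the fact that /QQ also turns on /Q is ignored.

```python
TOGGLES = ("B", "E", "I", "Q", "QQ", "T")

def resolve_toggles(env_options, cmdline_options):
    """Return the final on/off state of each toggle.

    Environment-variable options are processed first, then the command
    line. Each mention of a toggle reverses it; Z discards everything
    seen so far (assumption: all toggles start off).
    """
    state = dict.fromkeys(TOGGLES, False)
    for option in list(env_options) + list(cmdline_options):
        name = option.upper()
        if name == "Z":
            state = dict.fromkeys(TOGGLES, False)
        elif name in state:
            state[name] = not state[name]
    return state

print(resolve_toggles(["I"], [])["I"])     # set in the environment only
print(resolve_toggles(["I"], ["I"])["I"])  # reversed on the command line
```

The second call shows the case described above: /I in the environment variable plus /I on the command line leaves the option off for that run.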
CMP returns a value that you can test with IF ERRORLEVEL in a batch file.
255 | bad option, or other error on the command line |
254 | specified file not available |
253 | not enough memory for combination of /L and /W options |
2 | help message displayed (/? option, or no files specified on the command line) |
0 | program ran to completion (whether the files are the same or different) |
You might want to use CMP in a batch file or a makefile and take different actions depending on whether two files are the same or different. To do this, use the /0 or /1 option. The /1 option emulates UNIX diff by returning an error level of 1 if the files are different or 0 if they're the same. /0 is the opposite: it returns 0 if the files are different or 1 if they're the same. In other words, the /0 or /1 option gives the value CMP should return if differences are found.
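For a script that calls CMP and inspects the result, the return values above can be summarized in a small Python sketch. The function name `interpret` and its `diff_option` parameter are hypothetical. The values and meanings come from the table and the description of /0 and /1.

```python
ERRORS = {
    255: "bad option, or other error on the command line",
    254: "specified file not available",
    253: "not enough memory for combination of /L and /W options",
    2: "help message displayed",
}

def interpret(code, diff_option=None):
    """Describe a CMP return value.

    diff_option is "0" or "1" if /0 or /1 was given, else None.
    """
    if code in ERRORS:
        return ERRORS[code]
    if diff_option is None:
        if code == 0:
            return "program ran to completion"
    elif diff_option in ("0", "1"):
        differ_value = int(diff_option)  # value returned when files differ
        if code == differ_value:
            return "files differ"
        if code == 1 - differ_value:
            return "files are the same"
    return "unexpected return value"

print(interpret(1, diff_option="1"))
print(interpret(1, diff_option="0"))
```

The two calls show that the same return value of 1 means opposite things under /1 and /0.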
/B
and
/T
options, which control the
treatment of spaces and tabs within a line.
CMP applies the /B and /T option settings while reading each line from a file. In fact, CMP actually makes the changes to its own in-RAM copy of each line, so that when differences are found CMP displays the transformed line.
CMP always ignores any spaces and tabs at the end of a line, regardless of the options. CMP also ignores any difference between the UNIX line-ending convention (LF only) and the DOS convention (CR+LF).
There can be some interaction between the /B and /T option settings and the /W width setting. The /W option specifies the maximum effective line width, but the effective line width of a line can be less or greater than the actual length of that line in characters:

Without /B, runs of spaces and tabs are squeezed to a single space, so the effective line width can be less than the actual width.
With /B and not /T, tabs are expanded to a run of spaces, so the effective line width can be greater than the actual line length.
With /B and /T, tabs and spaces are treated as ordinary characters.

If any line is longer than the /W width, CMP will tell you the maximum effective width at the end of the run.
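The three cases can be sketched in Python to show how the effective width diverges from the raw line length. This is an illustration under stated assumptions, not CMP's implementation: the function names are hypothetical, and the tab stops are assumed to fall every 8 columns, which the text above does not specify.

```python
import re

def effective_line(line, b=False, t=False, tab_size=8):
    """Return the line as it would be compared under the /B and /T settings.

    Trailing spaces, tabs, and line endings are always dropped.
    Without b: runs of spaces and tabs become a single space.
    With b but not t: tabs are expanded to spaces (tab_size is assumed).
    With b and t: spaces and tabs are kept as ordinary characters.
    """
    line = line.rstrip(" \t\r\n")
    if not b:
        return re.sub(r"[ \t]+", " ", line)
    if not t:
        return line.expandtabs(tab_size)
    return line

def effective_width(line, b=False, t=False, tab_size=8):
    return len(effective_line(line, b, t, tab_size))

print(effective_width("a \t b"))        # narrower than the 5 raw characters
print(effective_width("a\tb", b=True))  # wider than the 3 raw characters
```

The first call gives a width smaller than the raw length because the run of spaces and tabs is squeezed; the second gives a larger one because the tab is expanded.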
Since CMP normally disregards the above differences in spacing within a line, as well as completely blank lines, if the program finds no differences it will report that the files are "effectively identical". If you want to compare for character-by-character identity, including spaces, tabs, and blank lines, specify the /BET options. Then if the program finds no differences it will report that the files are identical.
/Z option; update the logo message to use the URL for Oak Road Systems; expand the help message; suggest "cmp /?|more" when the user types cmp with no files
/F option, the /N option, and the /QQ option; send the help message to stdout instead of stderr as previously; reorganize this documentation file and add many hyperlinks and a few small clarifications
/I and /D options. Split the confusing three-valued /Bn option into separate /B and /T toggle-type options. Change the 32-bit default to /L100. Improve diagnostics for a bad option in the environment variable. Convert documentation to HTML from Word for Windows.
/0 and /1 options; systematize all return values. No longer require the trailing backslash on a directory argument. Instead of "effectively identical", report a more specific phrase when the files are not significantly different based on the /B and /E options.
/B option to control that feature and tab expansion. Add the /Q option. Make the format of command-line options more flexible, and scan the ORS_CMP environment variable for options.
/L and /W.
/L20 (previously /L10).