7. KNOWN BUGS AMONG SED VERSIONS
Most versions of GNU sed and ssed contain a "buglist" in the
archive source code of known errors or reported behaviors that may
be misconstrued as bugs. This portion of the sed FAQ does not
attempt to fully reproduce those buglist files. However, we do
seek to do some substantial reporting, particularly where certain
programs have no "buglist" of their own or are not being actively
maintained.
As a rule of thumb, if the bug "bites" someone on the sed-users
mailing list, I tend to report it.
7.1. ssed v3.59 (by Paolo Bonzini)
(1) N does not discard the contents of the pattern space upon
reaching the end of file; not a bug. See section 6.7.5.A, above.
(2) If \x26 is entered into the RHS of a substitution, it is
interpreted as an ampersand metacharacter, and the entire pattern
matched in the "find" portion is inserted at that point. A literal
ampersand should be inserted instead.
(3) Under Windows 2000, the -i switch doesn't create backup files
properly. When passed one or more files to process, the source
file(s) are unchanged, and the changed output files are given
filenames like sedDOSxyz, with no way to match them to the
names of the source files.
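Regarding item (1): scripts that must give the same result whether
or not N discards the pattern space at end of file commonly guard
the command with the address "$!", so that N is never attempted on
the last line. A minimal sketch, assuming a POSIX shell and any
reasonably standard sed (the three-line input is made up for
illustration):

```shell
# Join lines in pairs. "$!N" runs N on every line except the last,
# so the script never asks for a "next line" that does not exist.
out=$(printf 'a\nb\nc\n' | sed '$!N;s/\n/+/')
printf '%s\n' "$out"
# prints:
#   a+b
#   c
```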
7.2. GNU sed v4.0 - v4.0.5
(1) N does not discard the contents of the pattern space upon
reaching the end of file; not a bug. See section 6.7.5.A, above.
(2) If \x26 is entered into the RHS of a substitution, it is
interpreted as an ampersand metacharacter, and the entire pattern
matched in the "find" portion is inserted at that point. A literal
ampersand should be inserted instead.
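Item (2) turns on the special meaning of "&" in the RHS. For
comparison, the portable way to obtain a literal ampersand is to
escape it with a backslash. A short sketch with made-up input,
assuming a POSIX shell:

```shell
# An unescaped "&" stands for the text matched by the LHS.
matched=$(echo abc | sed 's/b/[&]/')
# "\&" inserts a literal ampersand instead.
literal=$(echo abc | sed 's/b/\&/')
printf '%s\n%s\n' "$matched" "$literal"
# prints:
#   a[b]c
#   a&c
```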
7.3. GNU sed v3.02.80
(1) N does not discard the contents of the pattern space upon
reaching the end of file; not a bug. See section 6.7.5.A, above.
(2) Same as #2 for GNU sed v4.0, above.
7.4. GNU sed v3.02
(1) Affects only v3.02 binaries compiled with DJGPP for MS-DOS and
MS-Windows: 'l' (list) command does not display a lone carriage
return (0x0D, ^M) embedded in a line.
(2) The expression "\<" causes problems when attempting the
following types of substitutions, which should print "+aaa +bbb":
echo aaa bbb | sed 's/\</+/g' # prints "+a+a+a +b+b+b"
echo aaa bbb | sed 's/\<./+&/g' # prints "+a+a+a +b+b+b"
(3) The N command no longer discards the contents of the pattern
space upon reaching the end of file. This is not a bug, it's a
feature. See section 6.7.5, "Commands which operate differently".
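Where "\<" is unreliable, as in item (2), the same result can
usually be had without it by matching each whole word and
re-inserting it with "&". This is only a sketch; it assumes a POSIX
shell and that words are separated by spaces:

```shell
# Prefix every run of non-space characters with "+".
out=$(echo aaa bbb | sed 's/[^ ][^ ]*/+&/g')
printf '%s\n' "$out"    # prints "+aaa +bbb"
```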
7.5. GNU sed v2.05
(1) If a number follows the substitute command (e.g., s/f/F/10) and
the number exceeds the possible matches on the pattern space, the
command 't label' always jumps to the specified label. 't' should
jump only if the substitution was successful (or returned "true").
(2) 'l' (list) command does not convert the following characters to
hex values, but passes them through unchanged: 0xF7, 0xFB, 0xFC,
0xFD, 0xFE.
(3) A range address like "/foo/,14" is supposed to match every line
from the first occurrence of "foo" until line 14, inclusive, and
then match only those lines containing "foo" thereafter. In gsed
v2.05, if "foo" occurs later in the file, every line from there to
the end of file will be matched (since gsed is looking for line 14
to occur again!).
(4) The regexes /\`/ and /\'/ are not interpreted as a backquote
and apostrophe, as might be expected. Instead, they are used to
represent the beginning-of-line and end-of-line (respectively), to
conform with similar regexes in the GNU versions of Emacs and awk.
As a consequence, there is no clear way to indicate an apostrophe,
since a bare apostrophe (') has special meaning to the Unix shell
and the quoted apostrophe (\') is interpreted as the EOL. A
doubly quoted apostrophe (\\') was interpreted as a backslash to sed
and a quote mark to the shell--again, not providing the expected
results. This syntax changed in the next version of gsed.
(5) Multiple occurrences of the 'w' command fail, as shown here,
given that both "aaa" and "bbb" occur within the file:
gsed -e "/aaa/w FILE" -e "/bbb/w FILE" input.txt
(6) The expression "\<" causes problems when attempting the
following type of substitution, which should print "+aaa +bbb":
echo aaa bbb | sed 's/\</+/g' # sed hangs up with no output
The syntax 's/\<./+&/g' issues the proper output.
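The intended behavior for items (1) and (3) can be checked against
whatever sed is at hand. The following sketch assumes a POSIX
shell; the input lines, the appended marker text, and the label
name "done" are invented for the test:

```shell
# Item (1): there is no 10th "f", so s/f/F/10 fails and 't' must
# NOT branch; the marker is therefore appended.
t_out=$(echo 'f f' | sed -e 's/f/F/10' -e 't done' \
                         -e 's/$/ (no 10th match)/' -e ':done')
printf '%s\n' "$t_out"    # prints "f f (no 10th match)"

# Item (3): build 20 lines, with "foo" on lines 3 and 17.
range_out=$(
  i=1
  while [ "$i" -le 20 ]; do
    case $i in
      (3|17) echo "foo $i" ;;
      (*)    echo "$i" ;;
    esac
    i=$((i + 1))
  done | sed -n '/foo/,14p'
)
printf '%s\n' "$range_out"
# A correct sed prints lines 3 through 14, then only "foo 17".
```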
7.6. GNU sed v1.18
(1) Same as #1 for GNU sed v2.05, above.
(2) The following command will lock the computer under Win95. Echos
is an echo command that does not issue a trailing newline:
echos any_word | gsed "s/[ ]*$//"
(3) Same as #3 for GNU sed v2.05, above.
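On systems without an "echos" utility, the input for item (2) can
be produced with printf, which emits no trailing newline unless
asked to. A sketch assuming a POSIX shell (the two trailing spaces
in the input are deliberate):

```shell
# Feed sed a line that ends in blanks and has no final newline.
out=$(printf 'any_word  ' | sed 's/[ ]*$//')
printf '[%s]\n' "$out"    # prints "[any_word]"
```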
7.7. GNU sed v1.03 (by Frank Whaley)
(1) The \w and \W escape sequences both match only nonword
characters. \w is misdefined and should match word characters.
(2) The underscore is defined as a nonword character; it should be
defined as a word character.
(3) Same as #3 for GNU sed v2.05, above.
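Where \w cannot be trusted, as in items (1) and (2), a bracket
expression that spells out the word characters, underscore
included, is a portable substitute. A sketch with made-up input,
assuming a POSIX shell and plain ASCII text:

```shell
# Replace each run of word characters (letters, digits, "_") with X.
out=$(echo 'foo_bar baz-qux' | sed 's/[A-Za-z0-9_][A-Za-z0-9_]*/X/g')
printf '%s\n' "$out"    # prints "X X-X"
```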
7.8. sed v1.6 (by Walter Briscoe) - still in beta version
(1) Duplicated subexpressions (still) do not match an empty set as
they should. This problem was inherited from HHsed15.
echo 123 | sed "s/\([a-z][a-z]\)*/=\1/" # does not return '='
(2) If grouping is followed by a + operator, nothing is matched.
This problem was inherited from HHsed; sed v1.6 fixed a bug with
the * operator, but the problem with the + operator persists.
echo aaa | sed "/\(a\)+/d" # nothing is deleted.
(3) With the interval expressions \{1,\} and +, there is a bug
related to the & replacement character. This affected the BETA
release, and it's not known if it affects the final release.
echo ab | sed "s/a[^a]*/&c/" # returns 'abc'. Okay.
echo ab | sed "s/a[^a]+/&c/" # returns 'ab'. Bug!
echo ab | sed "s/a[^a]\{1,\}/&c/" # returns 'ab'. Bug!
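One possible way around item (3) is to avoid both "+" and "\{1,\}"
and to spell out "one or more" as the expression followed by the
same expression with "*". This is a sketch of that rewrite under a
POSIX shell; it has not been verified against sed v1.6 itself:

```shell
# "[^a][^a]*" means one or more characters other than "a".
out=$(echo ab | sed 's/a[^a][^a]*/&c/')
printf '%s\n' "$out"    # prints "abc"
```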
7.9. HHsed v1.5 (by Howard Helman)
(1) If a number follows the substitute command (e.g., s/foo/bar/2),
in a sed script entered from the command line, two semicolons must
follow the number, or they must be separated by an -e switch.
Normally, only 1 semicolon is needed to separate commands.
echo bit bet | HHsed "s/b/n/2;;s/b/B/" # solution 1
echo bit bet | HHsed -e "s/b/n/2" -e "s/b/B/" # solution 2
(2) If the substitute command is followed by a number and a "p"
flag, when the -n switch is used, the "p" flag must occur first.
echo aaa | HHsed -n "s/./B/3p" # bug! nothing prints
echo aaa | HHsed -n "s/./B/p3" # prints "aaB" as expected
(3) The following commands will cause HHsed to lock the computer
under MS-DOS or Win95. Note that they occur because of malformed
regular expressions which will match no characters.
sed -n "p;s/\<//g;" file
sed -n "p;s/[char-set]*//g;" file
(4) The range command '/RE1/,/RE2/' in HHsed will match one line if
both regexes occur on the same line (see section 3.4(3), above).
Though this could be construed as a feature, it should probably be
considered a bug since its operation differs from every other
version of sed. For example, '/----/,/----/{s/^/>>/;}' should put
two angle brackets ">>" before every line sandwiched between two
rows of 4 or more hyphens, rows included. With HHsed, this command will
only prefix the hyphens themselves with the angle brackets.
(5) If the hold space is empty, the H command copies the pattern
space to the hold space but fails to prepend a leading newline. The
H command is supposed to add a newline, followed by the contents of
the pattern space, to the hold space at all times. A workaround is
"{G;s/^\(.*\)\(\n\)$/\2\1/;H;s/\n$//;}", but it requires knowing
that the hold space is empty and using the command only once.
Another alternative is to use the G or the h command alone at key
points in the script.
(6) If grouping is followed by an '*' or '+' operator, HHsed does
not match the pattern, but issues no warning. See below:
echo aaa | HHsed "/\(a\)*/d" # nothing is deleted
echo aaa | HHsed "/\(a\)+/d" # nothing is deleted
echo aaa | HHsed "s/\(a\)*/\1B/" # nothing is changed
echo aaa | HHsed "s/\(a\)+/\1B/" # nothing is changed
(7) If grouping is followed by an interval expression, HHsed halts
with the error message "garbled command", in all of the following
examples:
echo aaa | HHsed "/\(a\)\{3\}/d"
echo aaa | HHsed "/\(a\)\{1,5\}/d"
echo aaa | HHsed "s/\(a\)\{3\}/\1B/"
(8) In interval expressions, a lower bound of 0 is not supported.
E.g., \{0,3\}
7.10. sedmod v1.0 (by Hern Chen)
Technically, the following are limits (or features?) of sedmod, not
bugs, since the docs for sedmod do not claim to support these
missing features.
(1) sedmod does not support standard interval expressions \{...\}
present in nearly all versions of sed.
(2) If grouping is followed by an '*' or '+' operator, sedmod gives
a "garbled command" message. However, if the grouped expressions
are string literals with no metacharacters, a partial workaround
can be done like so:
\(string\)\1* # matches 1 or more instances of 'string'
\(string\)\1+ # matches 2 or more instances of 'string'
(3) sedmod does not support a numeric argument after the s///
command, as in 's/a/b/3', present in nearly all versions of sed.
The following are bugs in sedmod v1.0:
(4) When the -i (ignore case) switch is used, the '/regex/d'
command is not properly obeyed. Sedmod may miss one or more lines
matching the expression, regardless of where they occur in the
script. Workaround: use "/regex/{d;}" instead.
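The partial workaround in item (2) relies only on back-references,
so its effect can be tried with any sed that supports them. A
sketch with made-up input, assuming a POSIX shell:

```shell
# \(ab\)\1* matches one or more consecutive copies of "ab".
out=$(echo abababX | sed 's/\(ab\)\1*/<&>/')
printf '%s\n' "$out"    # prints "<ababab>X"
```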
7.11. HP-UX sed
(1) Versions of HP-UX sed up to and including version 10.20 are
buggy. According to the README file, which comes with the GNU cc
at <ftp://ftp.ntua.gr/pub/gnu/sed/sed-2.05.bin.README>:
"When building gcc on a hppa*-*-hpux10 platform, the `fixincludes'
step (which involves running a sed script) fails because of a bug
in the vendor's implementation of sed. Currently the only known
workaround is to install GNU sed before building gcc. The file
sed-2.05.bin.hpux10 is a precompiled binary for that platform."
7.12. SunOS sed v4.1
(1) Bug occurs in RE pattern matching when a non-null '[char-set]*'
is followed by a null '\NUM' pattern recall, illustrated here and
reported by Greg Ubben:
s/\(a\)\(b*\)cd\1[0-9]*\2foo/bar/ # between '[0-9]*' and '\2'
s/\(a\{0,1\}\).\{0,1\}\1/bar/ # between '.\{0,1\}' and '\1'
Workaround: add a do-nothing 'X*' expression which will not match
any characters on the line between the two components. E.g.,
s/\(a\)\(b*\)cd\1[0-9]*X*\2foo/bar/
s/\(a\{0,1\}\).\{0,1\}X*\1/bar/
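On a sed without this bug, the padded expression should behave just
like the original, so the workaround can stay in scripts meant to
run elsewhere. A sketch with invented input, assuming a POSIX shell
and a line with no "X" at that position:

```shell
# Non-null [0-9]* ("12") followed by a null \2 (no "b" captured).
out=$(echo acda12foo | sed 's/\(a\)\(b*\)cd\1[0-9]*X*\2foo/bar/')
printf '%s\n' "$out"    # prints "bar"
```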
7.13. SunOS sed v5.6
(1) If grouping is followed by an asterisk, SunOS sed does not match
the null string, which it should do. The following command:
echo foo | sed 's/f\(NO-MATCH\)*/g\1/'
should transform "foo" to "goo" under normal versions of sed.
7.14. Ultrix sed v4.3
(1) If grouping is followed by an asterisk, Ultrix sed replies with
"command garbled", as shown in the following example:
echo foo | sed 's/f\(NO-MATCH\)*/g\1/'
(2) If grouping is followed by a numeric operator such as \{0,9\},
Ultrix sed does not find the match.
7.15. Digital Unix sed
(1) The following comes from the man pages for sed distributed with
new, 1998 versions of Digital Unix (reformatted to fit our
margins):
[Digital] The h subcommand for sed does not work properly. When
you use the h subcommand to place text into the hold area, only
the last line of the specified text is saved. You can use the H
subcommand to append text to the hold area. The H subcommand and
all others dealing with the hold area work correctly.
(2) "$d" command issues an error message, "cannot parse". Reported
by Carlos Duarte on 8 June 1998.
[end-of-file]