4.3. How do I convert files with toggle characters, like +this+, to look like [i]this[/i]?
Input files, especially message-oriented text files, often contain
toggle characters for emphasis, like ~this~, *this*, or =this=. Sed
can make the same input pattern produce alternating output each
time it is encountered. Typical needs might be to generate HTML
codes or print codes for boldface, italic, or underscore. This
script accommodates multiple occurrences of the toggle pattern on
the same line, as well as cases where the pattern starts on one
line and finishes several lines later, even at the end of the file:
     # sed script to convert +this+ to [i]this[/i]
     :a
     /+/{ x;        # If "+" is found, switch hold and pattern space
       /^ON/{       # If "ON" is in the (former) hold space, then ..
         s///;      # .. delete it
         x;         # .. switch hold space and pattern space back
         s|+|[/i]|; # .. turn the next "+" into "[/i]"
         ba;        # .. jump back to label :a and start over
       }
     s/^/ON/;       # Else, "ON" was not in the hold space; create it
     x;             # Switch hold space and pattern space
     s|+|[i]|;      # Turn the first "+" into "[i]"
     ba;            # Branch to label :a to find another pattern
     }
     #---end of script---
This script uses the hold space to create a "flag" to indicate
whether the toggle is ON or not. We have added remarks to
illustrate the script logic, but in most versions of sed remarks
are not permitted after 'b'ranch commands or labels, so strip them
before running the script.
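As a rough illustration, here is one way to run a comment-free copy
of the script, with every command on its own line so that no sed
should object to the labels and branches. The file name toggle.sed
and the two sample input lines are our own assumptions, chosen to
show a toggle that opens on one line and closes on the next:

```shell
# Write a comment-free copy of the script to a file.
# "toggle.sed" is a hypothetical name; any file name will do.
cat > toggle.sed <<'EOF'
:a
/+/{
x
/^ON/{
s///
x
s|+|[/i]|
ba
}
s/^/ON/
x
s|+|[i]|
ba
}
EOF

# Sample input: the second toggle opens on line 1 and closes on line 2.
printf '%s\n' 'a +b+ and +c' 'd+ e' | sed -f toggle.sed
# Output:
#   a [i]b[/i] and [i]c
#   d[/i] e
```

The flag survives from one line to the next because sed does not
clear the hold space between input lines.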
If you are sure that the +toggle+ characters never cross line
boundaries (i.e., never begin on one line and end on another), this
script can be reduced to one line:
     s|+\([^+][^+]*\)+|[i]\1[/i]|g
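A quick way to check the one-line form from the shell is sketched
below. The sample text is made up for the illustration, and the
command assumes a sed that treats a bare "+" as a literal character
in its default regex syntax:

```shell
# Both toggled phrases sit entirely on one line, so the /g flag
# converts each pair in a single pass.
printf '%s\n' 'say +this+ and +that+ now' |
  sed 's|+\([^+][^+]*\)+|[i]\1[/i]|g'
# Output:
#   say [i]this[/i] and [i]that[/i] now
```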
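If your toggle pattern contains regex metacharacters (such as '*',
or '+' or '?' in versions of sed that treat them as special),
remember to quote them with backslashes. In GNU sed's default
syntax, a bare '+' or '?' is already literal and '\+' or '\?' is an
operator, so leave those two unescaped there.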
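For instance, a one-line version for an asterisk toggle might look
like the sketch below. The [b]...[/b] output codes and the sample
text are assumptions made for the illustration. The "*" is quoted
with a backslash wherever it would otherwise act as a repetition
operator, but not inside the bracket expressions, where it is
already literal:

```shell
# \* matches a literal asterisk; inside [^*] no backslash is needed.
printf '%s\n' 'make *this* bold' |
  sed 's|\*\([^*][^*]*\)\*|[b]\1[/b]|g'
# Output:
#   make [b]this[/b] bold
```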