
This document describes the Unix stream editor, sed(1), and provides examples.

1. Sed and perl regex

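Sed uses POSIX regular expressions (basic by default, extended with
the -E option). GNU sed also accepts some perl-style escapes such as
"\s" and "\w", and the POSIX character classes described below are
shared with perl regular expressions (perlre).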
A. Character classes
    Character classes are also available in perlre and are similar to
    escapes like "\s" for matching whitespace. A character class is
    written as "[:name:]" (like [:space:]) and is only valid inside a
    bracket expression, where it can be combined with ordinary
    characters. To use a class on its own, wrap it in its own
    brackets, as in "[[:space:]]". For example, consider the
    difference in the following 2 re's:
'[,[:space:]]'  - match one character that is a comma or whitespace
',[[:space:]]'  - match a comma followed by one whitespace character
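
The difference between placing a class inside a shared bracket expression and giving it its own brackets can be checked with a small sketch. The sample input and the variable names "either" and "pair" are made up for illustration and are not part of the original notes.

```shell
# Sample input (an assumption for illustration): "a, b,c"

# [,[:space:]] is ONE bracket expression: each match is a single
# character that is either a comma or a whitespace character.
either=$(printf 'a, b,c\n' | sed 's/[,[:space:]]/_/g')
echo "$either"    # a__b_c

# ,[[:space:]] is a literal comma followed by a bracket expression
# holding only the class, so it matches the two-character pair ", ".
pair=$(printf 'a, b,c\n' | sed 's/,[[:space:]]/_/g')
echo "$pair"      # a_b,c
```

In the first command the comma and the space are replaced separately, while in the second only the comma that is directly followed by whitespace is affected.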
B. Man pages
    For more info on perl regular expressions (perlre), see the
    following man pages:
perlunicode - For details about unicode and for details on "\pP",
              "\PP", and "\X" (e.g., "\x{85}", "\x{2028}",
              "\x{2029}").
perluniintro - Unicode in general.
perllocale - Localization, which affects, for example, the
             set of alphabetic characters matched by "\w".

2. Eric Pement's "One-Liners For sed"

The following sed document (Pement 2004) contains some useful sed
one-liners. See the local copy in #sed1line.txt or the web URL at
http://www.student.northpark.edu/pemente/sed/sed1line.txt

3. Multiple expressions

Sed can take multiple expressions and apply them, in order, to each line of its input stream. This is useful, for example, when removing text from the beginning and end of lines in the input stream. Consider an input file foo.txt, with the following content:

SOL This is line 1 EOL
SOL This is line 2 EOL

One way to remove the SOL and EOL symbols is to pass the contents of the file through an invocation of sed with 2 expressions separated by ";", one for removing the SOL symbol and the other for removing the EOL symbol. The following sed invocation does exactly that ("\W", which matches a non-word character, is a GNU sed extension):

bash $ cat foo.txt | sed 's@^SOL\W@@;s@\WEOL$@@'  
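
The same result can also be had by giving each expression its own -e option instead of joining them with ";". The following is only a sketch under some assumptions: it writes the sample lines to a scratch file created with mktemp (standing in for foo.txt), and it uses [[:space:]] in place of "\W" so that it does not depend on GNU sed. The names "sample" and "cleaned" are illustrative.

```shell
# Scratch file standing in for foo.txt (hypothetical, for illustration).
sample=$(mktemp)
printf 'SOL This is line 1 EOL\nSOL This is line 2 EOL\n' > "$sample"

# One -e option per expression; sed applies them in order to every line.
# The first strips the leading "SOL ", the second the trailing " EOL".
cleaned=$(sed -e 's@^SOL[[:space:]]@@' -e 's@[[:space:]]EOL$@@' "$sample")
echo "$cleaned"

# Remove the scratch file.
rm -f "$sample"
```

Reading the file directly as an argument to sed also avoids the extra cat process.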
