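1. How the sed stream editor works

We usually call sed a Linux command, but it is more accurate to call it a stream editor. Unlike an ordinary interactive text editor, a stream editor edits a data stream according to a set of rules supplied before the editor processes the data; an interactive editor such as vim instead inserts, deletes, or replaces text in response to keyboard commands. sed can process a data stream using commands entered on the command line or stored in a command file. It reads one line of input at a time, matches the data against the supplied editor commands, modifies the data as the commands specify, and writes the result to STDOUT. Once all commands have been matched against a line, sed reads the next line and repeats the process. After it has processed every line in the stream, it terminates.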
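The read, match, modify, print cycle described above can be pictured with a plain shell loop. This is only a conceptual sketch for the single rule s/dog/cat/, not how sed is implemented; the function name emulate_cycle is made up for illustration.

```shell
# Conceptual emulation of the sed cycle for the one rule s/dog/cat/.
emulate_cycle() {
    while IFS= read -r line; do                  # 1. read one line of input
        case $line in                            # 2. match the line against the rule
            *dog*)
                # 3. modify: replace only the first "dog" on the line
                line="${line%%dog*}cat${line#*dog}"
                ;;
        esac
        printf '%s\n' "$line"                    # 4. write the result to STDOUT
    done                                         # 5. repeat until the input ends
}

printf 'lazy dog\nquick fox\n' | emulate_cycle
```

For this input the loop should print the same text as sed 's/dog/cat/'.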

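2. sed options and commands

The sed command has the following format: sed options script file

2.1 sed options

Options change the behavior of the sed command. The most commonly used ones are:

Option      Description
-e script   Add the commands in script to the commands run while processing the input; repeat -e to supply several commands
-f file     Add the commands in file to the commands run while processing the input
-n          Do not print each line automatically; output is produced only by explicit commands such as print (p)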
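As a small illustration of these options, the following sketch repeats -e to pass two commands and uses -n to silence the automatic output. The sample text is invented for the example.

```shell
# Two commands supplied with two -e options; the input line is printed once, after both edits.
printf 'brown dog\n' | sed -e 's/brown/green/' -e 's/dog/cat/'

# With -n nothing is printed unless a command asks for it, so this prints nothing.
printf 'brown dog\n' | sed -n -e 's/brown/green/'
```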

2.2 sed commands

2.2.1 The substitute command

The substitute (s) command has the format s/from/to/

(1) Basic substitution

linux-osud:~/temp # echo "This is a test"  | sed 's/test/big test/'
This is a big test

sed does not modify the data in the text file itself; it only sends the modified data to STDOUT:

linux-osud:~/temp # cat data1 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed 's/dog/cat/' data1 
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
linux-osud:~/temp # cat data1 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
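
Since sed only writes to STDOUT, one common way to keep the edited text is to redirect the output to a new file. A minimal sketch, using a temporary directory and made-up file names so that nothing existing is touched:

```shell
workdir=$(mktemp -d)
printf 'The lazy dog.\nThe lazy dog.\n' > "$workdir/data1"

# Redirect to a different file; redirecting onto the input file itself would truncate it.
sed 's/dog/cat/' "$workdir/data1" > "$workdir/data1.new"

cat "$workdir/data1"       # the original, unchanged
cat "$workdir/data1.new"   # the edited copy
```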

With the -e option, several editor commands can be given on the command line:

linux-osud:~/temp # sed -e 's/brown/green/; s/dog/cat/' data1 
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
linux-osud:~/temp # sed -e '
> s/brown/green/
> s/fox/elephant/
> s/dog/cat/' data1
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.

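Notes:

  • When several commands are given on one line, they must be separated by semicolons, and it is safest to leave no space between the end of a command and the semicolon.
  • Instead of semicolons, you can use the bash secondary prompt to separate commands. Type the opening single quote and bash keeps prompting for more commands until you type the closing single quote. Remember that the command must end on the line that contains the closing single quote.

If there are many sed commands to run, put them in a file and pass that file with the -f option. No semicolon is needed after each command, because sed treats every line as a separate command: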

linux-osud:~/temp # cat script1 
s/brown/green/
s/fox/elephant/
s/dog/cat/
linux-osud:~/temp # sed -f script1 data1 
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
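
A command file can also carry comments, which helps once a script grows. The sketch below writes a hypothetical script file named script1.sed into a temporary directory and runs it with -f; sed treats lines that start with # as comments.

```shell
workdir=$(mktemp -d)

# One command per line, no semicolons needed.
cat > "$workdir/script1.sed" <<'EOF'
# change the colour, the animal, and the pet
s/brown/green/
s/fox/elephant/
s/dog/cat/
EOF

result=$(printf 'The quick brown fox jumps over the lazy dog.\n' | sed -f "$workdir/script1.sed")
printf '%s\n' "$result"
```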

(2) Substitution flags

Consider the following example:

linux-osud:~/temp # echo "This test is just a test." | sed 's/test/example/'
This example is just a test.

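As you can see, the substitute command replaced only the first occurrence of test. By default the substitute command visits every line but replaces only the first match on each line. To act on other occurrences within a line, use a substitution flag: s/from/to/flags

Four substitution flags are available:

  • a number, which selects which occurrence of the pattern on the line is replaced;
  • g, which replaces every occurrence;
  • p, which prints the line after a substitution has been made on it; it is usually combined with the -n option, which suppresses sed's normal output, so that only the lines changed by the substitute command are printed;
  • w file, which writes the result of the substitution (only the lines that matched the pattern) to file.
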
linux-osud:~/temp # sed 's/test/trial/' data5
This is a trial of the test script.
This is the second trial of the test script.
linux-osud:~/temp # sed 's/test/trial/2' data5
This is a test of the trial script.
This is the second test of the trial script.
linux-osud:~/temp # sed 's/test/trial/g' data5
This is a trial of the trial script.
This is the second trial of the trial script.

linux-osud:~/temp # cat data6
This is a test line.
This is a different line.
linux-osud:~/temp # sed -n 's/test/trial/p' data6
This is a trial line.

linux-osud:~/temp # sed 's/test/trial/w test' data6
This is a trial line.
This is a different line.
linux-osud:~/temp # cat test 
This is a trial line.
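
Flags can be combined. As a sketch, g together with p under -n prints only the lines that were changed, with every occurrence replaced, which also makes it easy to count how many lines a substitution touches. The sample input is invented.

```shell
input='a test of the test script
no match here
the last test'

# Print only the changed lines, with all occurrences replaced.
printf '%s\n' "$input" | sed -n 's/test/trial/gp'

# Count how many lines the substitution changed.
changed=$(printf '%s\n' "$input" | sed -n 's/test/trial/gp' | wc -l)
echo "changed lines: $((changed))"
```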

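(3) Alternative delimiter characters

Sometimes the text to be replaced contains forward slashes (/), which is also sed's default string delimiter, so each one has to be escaped with a backslash (\):

linux-osud:~/temp # sed 's/\/bin\/bash/\/bin\/csh/' /etc/passwd

This replaces /bin/bash with /bin/csh, but the many escape characters make the command hard to read. To avoid this, sed lets you choose a different character as the string delimiter of the substitute command, for example an exclamation mark (!):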

linux-osud:~/temp # sed 's!/bin/bash!/bin/csh!' /etc/passwd
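
The same idea helps when the search and replacement text come from shell variables that contain slashes. A sketch, assuming neither value contains the chosen delimiter (|) or characters that are special in a regular expression or replacement, such as & or \; the variable names and the passwd-style line are made up:

```shell
old_shell=/bin/bash
new_shell=/bin/csh

# Double quotes let the shell expand the variables before sed sees the script.
printf 'alice:x:1000:1000::/home/alice:/bin/bash\n' | sed "s|$old_shell|$new_shell|"
```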

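(4) Using addresses

By default, the commands used in sed apply to every line of the text data. To apply a command to one specific line or to a group of lines, use line addressing. sed has two forms of line addressing:

  • a numeric range of lines;
  • a text pattern that filters lines, written as /pattern/command

Both forms use the same format to specify the address: [address]command

Several commands can also be grouped under one address: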

address {
    command1
    command2
    command3
}

sed applies each of the listed commands only to the lines that match the address.

Numeric line addressing:

linux-osud:~/temp # sed '2s/dog/cat/' data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed '2,3s/dog/cat/' data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed '2,$s/dog/cat/' data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.

Text pattern filter:

linux-osud:~/temp # cat data0
This is just a test.
This is just a trial.
linux-osud:~/temp # sed '/trial/s/This/That/' data0
This is just a test.
That is just a trial.

Grouped commands:

linux-osud:~/temp # sed '2{
> s/fox/elephant/
> s/dog/cat/
> }' data1
The quick brown fox jumps over the lazy dog.
The quick brown elephant jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed '3,${
> s/brown/green/
> s/lazy/active/
> }' data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
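
The two address forms can also be mixed in one range. The sketch below starts at a line number, stops at the first later line that matches a pattern, and groups two commands under that range; the input is invented.

```shell
# Lines 2 through the first later line containing END get both substitutions.
printf 'line 1\nline 2\nline 3 END\nline 4\n' | sed '2,/END/{
s/line/LINE/
s/$/ (edited)/
}'
```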

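Of course, line addresses and text pattern filters are not limited to the substitute command; they work with the other sed commands as well, as the following sections show.

2.2.2 The delete command

The delete (d) command deletes every line that matches the given line address or text pattern filter. With no address at all, it deletes every line in the stream.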

linux-osud:~/temp # cat data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed 'd' data1
linux-osud:~/temp # 

linux-osud:~/temp # cat data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
linux-osud:~/temp # sed '3d' data7
This is line 1.
This is line 2.
This is line 4.
linux-osud:~/temp # sed '2,3d' data7
This is line 1.
This is line 4.
linux-osud:~/temp # sed '3,$d' data7
This is line 1.
This is line 2.

linux-osud:~/temp # sed '/line 1/d' data7
This is line 2.
This is line 3.
This is line 4.
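
A pattern address combined with d is a common way to filter out unwanted lines. As a sketch, the following removes empty lines and lines that start with #; ^ and $ are the usual regular-expression anchors, and the input is made up.

```shell
# Drop empty lines and comment lines, keep everything else.
printf '# comment\nkeep 1\n\nkeep 2\n' | sed -e '/^$/d' -e '/^#/d'
```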

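Note: sed does not modify the original file. The deleted lines only disappear from sed's output; they are still present in the original file.

You can also delete a range of lines delimited by two text patterns, separated by a comma. Be careful with this: the first pattern turns line deletion "on" and the second pattern turns it "off". sed deletes all lines between the two matching lines, including the matching lines themselves: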

linux-osud:~/temp # cat data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
linux-osud:~/temp # sed '/1/,/3/d' data7
This is line 4.
# A problematic example
linux-osud:~/temp #  cat data8
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # sed '/1/,/3/d' data8
This is line number 4.

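In the second example, /1/ matches again on a later line and turns deletion back on, but /3/ never matches again before the end of the input. Deletion is therefore never turned off, and all the remaining lines are deleted.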
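
One way to guard against this is to preview the range before deleting it: replacing d with p under -n shows exactly which lines the range selects. A sketch with invented input that reproduces the unterminated second range:

```shell
input='line 1
line 2
line 3
line 4
line 1 again
keep me'

# Preview: these are the lines that /1/,/3/d would remove.
printf '%s\n' "$input" | sed -n '/1/,/3/p'
```

Here the preview lists "keep me" as well, which signals that the second range never closes.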

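2.2.3 The insert and append commands

The insert (i) and append (a) commands add lines of text to the data stream:

  • the insert command i adds one or more new lines before the specified line;
  • the append command a adds one or more new lines after the specified line.

The command format is: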

sed '[address]command\
new line'

linux-osud:~/temp # echo "Test Line 2" | sed 'i\Test Line 1'
Test Line 1
Test Line 2
linux-osud:~/temp # echo "Test Line 2" | sed 'a\Test Line 1'
Test Line 2
Test Line 1
linux-osud:~/temp # echo "Test Line 2" | sed 'i\
> Test Line 1'
Test Line 1
Test Line 2
linux-osud:~/temp # sed '3i\
> This is an inserted line.' data7
This is line 1.
This is line 2.
This is an inserted line.
This is line 3.
This is line 4.
linux-osud:~/temp # sed '3a\
> This is an inserted line.' data7
This is line 1.
This is line 2.
This is line 3.
This is an inserted line.
This is line 4.
linux-osud:~/temp # sed '$a\
> This is an inserted line.' data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
This is an inserted line.
linux-osud:~/temp # sed '1i\
> This is one line of new text.\
> This is another line of new text.' data7
This is one line of new text.
This is another line of new text.
This is line 1.
This is line 2.
This is line 3.
This is line 4.
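
Addresses work with i and a in the same way as with other commands. A sketch that adds a header before the first line and a footer after the last one, written in the multi-line form with a trailing backslash; the text is invented.

```shell
# 1i\ inserts before line 1; $a\ appends after the last line.
printf 'row 1\nrow 2\n' | sed -e '1i\
BEGIN' -e '$a\
END'
```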

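2.2.4 The change command

The change (c) command replaces the entire text of a line in the data stream. It works the same way as the insert and append commands: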

linux-osud:~/temp # cat data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
linux-osud:~/temp # sed '3c\
> This is a changed line of test' data7
This is line 1.
This is line 2.
This is a changed line of test
This is line 4.

linux-osud:~/temp # sed '/number 1/c\
> This is a changed line of test' data8
This is a changed line of test
This is line number 2.
This is line number 3.
This is line number 4.
This is a changed line of test
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # sed '2,3c\
> This is a new line of text.' data7
This is line 1.
This is a new line of text.
This is line 4.
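
Because c accepts a pattern address, it can rewrite a whole setting line without knowing its old value. A sketch with made-up key=value input:

```shell
# Replace whichever line starts with "port=" by a fixed new line.
printf 'host=localhost\nport=80\n' | sed '/^port=/c\
port=8080'
```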

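2.2.5 The transform command

The transform (y) command is the only sed command that operates on individual characters. Its format is:

[address]y/inchars/outchars/

The transform command maps the characters of inchars one-to-one onto those of outchars: the first character of inchars is converted to the first character of outchars, the second to the second, and so on until all the listed characters have been processed. If inchars and outchars differ in length, sed reports an error.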

linux-osud:~/temp # cat data8
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # sed 'y/123/789/' data8
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 4.
This is line number 7 again.
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # sed '3y/123/789/' data8
This is line number 1.
This is line number 2.
This is line number 9.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # sed '3,$y/123/789/' data8
This is line number 1.
This is line number 2.
This is line number 9.
This is line number 4.
This is line number 7 again.
This is text you want to keep.
This is the last line in the file.
linux-osud:~/temp # echo "This 1 is a test of 1 try." | sed 'y/123/456/'
This 4 is a test of 4 try.
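
Because y maps characters one to one, it changes every occurrence of a listed character on the line. A typical use is case conversion; the sketch below upper-cases only the letters a through e, to keep the two lists short.

```shell
# Each of a, b, c, d, e is mapped to its upper-case form; other characters are untouched.
echo "a faded badge" | sed 'y/abcde/ABCDE/'
# prints: A fADED BADgE
```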

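2.2.6 Commands for reading and writing files

Besides the substitute command, sed has other commands that work with files. The w command writes the addressed lines to a file, and the r command reads a file and inserts its contents after the addressed line:

[address]w filename
[address]r filename
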
linux-osud:~/temp # sed '1,2w test' data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
linux-osud:~/temp # cat test 
This is line 1.
This is line 2.
linux-osud:~/temp # cat data11 
Blum. Katie	Chicago. IL
Mullen. Riley	West Lafayette. IN
Snell. Haley	Ft.	IN
Jim. Green	Grant. IL
linux-osud:~/temp # sed -n '/IN/w INcustomers' data11
linux-osud:~/temp # cat INcustomers 
Mullen. Riley	West Lafayette. IN
Snell. Haley	Ft.	IN
linux-osud:~/temp # vi data12
linux-osud:~/temp # cat data12
This is an added line.
This is the second added line.

linux-osud:~/temp # sed '3r data12' data7
This is line 1.
This is line 2.
This is line 3.
This is an added line.
This is the second added line.
This is line 4.
linux-osud:~/temp # cat letter 
Would the following people:
LIST
please report to the office.
linux-osud:~/temp # sed '/LIST/{
> r data11
> d
> }' letter
Would the following people:
Blum. Katie	Chicago. IL
Mullen. Riley	West Lafayette. IN
Snell. Haley	Ft.	IN
Jim. Green	Grant. IL
please report to the office.
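
Several w commands can route lines to different files in a single pass. Since the file name runs to the end of the line, each w goes in its own -e here. A sketch using a temporary directory and invented records:

```shell
workdir=$(mktemp -d)
printf 'Katie IL\nRiley IN\nHaley IN\nGreen IL\n' > "$workdir/customers"

# -n suppresses the normal output; matching lines go only to the two files.
sed -n -e "/IL\$/w $workdir/il.txt" -e "/IN\$/w $workdir/in.txt" "$workdir/customers"

cat "$workdir/il.txt"
cat "$workdir/in.txt"
```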

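2.2.7 Print commands

  • the lowercase p command prints a text line;
  • the equals sign (=) command prints the line number;
  • the l (lowercase L) command lists a line. Unlike p, l also shows the non-printable ASCII characters in the data stream: each one is displayed either as a backslash followed by its octal value or in standard C-style notation, such as \t for a tab.

These commands are rarely useful on their own and are normally combined with other commands or options: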

linux-osud:~/temp # echo "this is a test" | sed 'p'
this is a test
this is a test
linux-osud:~/temp # cat data7
This is line 1.
This is line 2.
This is line 3.
This is line 4.
linux-osud:~/temp # sed -n '/line 3/p' data7
This is line 3.
linux-osud:~/temp # sed -n '2,3p' data7
This is line 2.
This is line 3.

# Show both the original line and the line after substitution
linux-osud:~/temp # sed -n '/3/{
> p
> s/line/test/p
> }' data7
This is line 3.
This is test 3.
linux-osud:~/temp # sed '=' data1
1
The quick brown fox jumps over the lazy dog.
2
The quick brown fox jumps over the lazy dog.
3
The quick brown fox jumps over the lazy dog.
4
The quick brown fox jumps over the lazy dog.
linux-osud:~/temp # sed -n '/line 4/{
> =
> p
> }' data7
4
This is line 4.

linux-osud:~/temp # cat data9 
This	line	contains	tabs.
linux-osud:~/temp # sed -n 'l' data9
This\tline\tcontains\ttabs.$            # $ marks the end of the line
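
Two small sketches that build on these commands: with the address $ the = command runs only on the last line, which gives a line count, and l makes a tab visible. The input is invented.

```shell
# Count lines: = runs only on the last line ($); -n hides the lines themselves.
printf 'a\nb\nc\n' | sed -n '$='

# Reveal a tab character: l shows it as \t and marks the end of the line with $.
printf 'a\tb\n' | sed -n 'l'
```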

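That covers the basic usage of sed. sed also has some advanced features, but they are used less often in everyday work, so they are not covered here.

This article is summarized from Linux Command Line and Shell Scripting Bible, Second Edition (《Linux命令行与shell脚本编程大全》第二版).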