Sed question

Jonathan McKeown jonathan+freebsd-questions at
Mon Dec 22 08:28:00 UTC 2008

On Monday 22 December 2008 00:27:44 Gary Kline wrote:
> 	anyway, this is one for giiorgos, or another perl wiz. i've
> 	been using the perl subsitution cmd one-liner for years with
> 	unfailing success.  is there a way of deleting lines with perl
> 	using the same idea as:
> 	  perl -pi.bak -e 's/OLDSTRING/NEWSTRING/g' file1 file2 fileN

For a single file it's very easy:

perl -ne 'print unless 8..10' filename

will print every line except lines 8, 9 and 10.

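The .. or range operator (in scalar context) is a sort of flip-flop. It keeps
its own state, which is either true or false. While it's false it evaluates
only its left-hand argument; once that returns true the operator becomes
true, and from then on it evaluates only its right-hand argument until that
returns true, at which point it becomes false again. With two dots the
right-hand argument is also tested on the very line that made the operator
true, so a range can begin and end on the same line; use three dots (...) if
you want that test put off until the next line.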

If an argument is a literal number (a constant expression), it's treated as a
comparison against the current line number, $. ; so the first expression,
8..10, means

($. == 8) .. ($. == 10)

It's false to start with, until ($. == 8) returns true (on line 8); it becomes 
true and remains true until ($. == 10) returns true (on line 10), when it 
becomes false again and remains false until it next sees line number 8.

You can also use more complicated tests in the range operator:

perl -ne 'print unless /START/ .. /END/' filename

will find each line containing the string START anywhere, and delete from
that line to the next line containing END (inclusive of both endpoints). This
works for multiple occurrences of START and END in your file.

There are two problems if you string multiple files together on the command
line. First, if you're using line numbers, $. doesn't reset between files
unless you do an explicit close on each file, so 8..10 matches lines 8 to 10
of the combined input rather than of each file.

The bigger problem is a file in which the second condition never occurs (a
file with only 9 lines in the first example, or a file with a START and no
corresponding END in the second): the range operator stays true until it sees
the ending condition in the next file. In the numeric case you lose the start
of the next file up to the line where $. reaches 10 (the first ten lines,
once line numbers are being reset per file); in the second case you lose
every line from the top of the next file to its first END.

To get round these two problems, you need to test for eof in the range 
operator, and close each file when it hits eof to reset the line count.

perl -ne 'print unless 8 .. $. == 10 || eof; close ARGV if eof' file1 file2 fileN
perl -ne 'print unless /START/../END/ || eof; close ARGV if eof' file1 file2 fileN

There's some hairy precedence in the first range expression: a useful tip for 
checking that you've got it right (and indeed in general for checking that a 
bit of Perl does what you think it does) is the B::Deparse core module, which 
you call like this:

perl -MO=Deparse,-p -e 'print unless 8 .. $. == 10 || eof'

which outputs

((8 .. (($. == 10) || eof)) or print($_));
-e syntax OK

The ,-p argument to -MO=Deparse tells it to put in parentheses everywhere. If 
you're like me and like to leave them all out, feed your expression to 
Deparse with all the parens in and leave off the ,-p argument: Deparse will 
get rid of all the unnecessary ones:

$ perl -MO=Deparse -e 'print unless (8 .. (($. == 10) or eof))'
print $_ unless 8 .. $. == 10 || eof;
-e syntax OK

