Before and after match in awk

Posted by waldner on 18 July 2010, 11:49 am

The problem: print the line that lies n lines before or after each line matching a given pattern. How can this be done in awk?

The sample input

Here is a sample input file:

line1
foobar
line3
line4
line5
foobar
line7
foobar
line8

We want to look for matches of "foobar", and print the line that is 1, 2 or 3 lines before or after each matching line.

n lines after the match

This can be accomplished by maintaining a "queue" (implemented with an array) of the line numbers of the future lines to be printed, updated every time a match is found (ok, it's not a proper queue, but it has the same logical function). Merely referencing queue[NR+n] creates that key in the array, so no assignment is needed. The "NR in queue" membership test is the same one used in the "two files" awk idiom, despite there being a single input file here:

$ awk -v n=1 '/foobar/{queue[NR+n]} NR in queue' file.txt
line3
line7
line8
$ awk -v n=2 '/foobar/{queue[NR+n]} NR in queue' file.txt
line4
foobar
$ awk -v n=3 '/foobar/{queue[NR+n]} NR in queue' file.txt
line5
line8
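
Since the one-liner is quite terse, here is a sketch of the same logic spread over several lines with comments. The function name after_match and its argument order are made up for illustration only; the awk program itself is the one shown above, with the pattern passed in as a variable.

```shell
# after_match N PATTERN [FILE...]: print the line N lines after each match.
# (after_match is an illustrative name, not part of the original one-liner.)
after_match() {
    n=$1 pat=$2
    shift 2
    awk -v n="$n" -v pat="$pat" '
        # On a match, record the number of the future line to print.
        $0 ~ pat { queue[NR+n] }
        # A pattern without an action prints the current line
        # whenever its number was recorded earlier.
        NR in queue
    ' "$@"
}

# The sample input from above, fed on standard input.
printf '%s\n' line1 foobar line3 line4 line5 foobar line7 foobar line8 |
    after_match 2 foobar
```

With n=2 this should print line4 and foobar, as in the second run above.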

To avoid keeping already-used keys in the array, they can of course be deleted, although this should hardly be needed (thanks igli from #awk):

$ awk -v n=5 '/foobar/{queue[NR+n]} NR in queue {print; delete queue[NR]}' file.txt

n lines before the match

This is slightly more complicated. Basically, we need to maintain a "sliding window" of the last n lines and, when a match is found, print the line that came n lines earlier, that is, line number NR-n (which is the oldest line in the sliding window). Modular arithmetic comes in handy because NR%n always gives a number from 0 to n-1, which can be used as an index into the sliding window array. The current line is stored only after the match has been checked, so at that point slot (NR-n)%n still holds line NR-n.

$ awk -v n=1 '/foobar/ && NR>n {print window[(NR-n)%n]}{window[NR%n]=$0}' file.txt
line1
line5
line7
$ awk -v n=2 '/foobar/ && NR>n {print window[(NR-n)%n]}{window[NR%n]=$0}' file.txt
line4
foobar
$ awk -v n=3 '/foobar/ && NR>n {print window[(NR-n)%n]}{window[NR%n]=$0}' file.txt
line3
line5

We also need to check that NR is greater than n: a match within the first n lines has no line n lines before it, and without the check awk would print an empty line for it.
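
As a rough sketch, the same sliding window can be written out with comments. The function name before_match is hypothetical, and the index is written as NR % n rather than (NR-n) % n; the two are equal whenever NR is greater than n, so the behavior should be the same as the one-liner above.

```shell
# before_match N PATTERN [FILE...]: print the line N lines before each match.
# (before_match is an illustrative name, not part of the original one-liner.)
before_match() {
    n=$1 pat=$2
    shift 2
    awk -v n="$n" -v pat="$pat" '
        # window[] holds the last n lines. Before the current line is
        # stored, slot NR % n still contains line NR-n, the oldest one.
        $0 ~ pat && NR > n { print window[NR % n] }
        # Only now overwrite that slot with the current line.
        { window[NR % n] = $0 }
    ' "$@"
}

# The sample input from above, fed on standard input.
printf '%s\n' line1 foobar line3 line4 line5 foobar line7 foobar line8 |
    before_match 3 foobar
```

With n=3 this should print line3 and line5, matching the third run above. The match at line 2 prints nothing because of the NR > n check.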


One Comment

SML says:

December 17, 2010 at 00:43

very useful for pulling dates of events out of Oracle alert log
