
Smart ranges in sed

Since there still seem to be quite a few people who want to do this with sed, let's see how to select ranges of lines in the same way as with awk (explained here).

We should also avoid the same issue described there, that is, if other /BEGIN/ lines are found while we are inside a range, those lines should be printed. So with this input:

1 BEGIN
2 foo
3 bar
4 BEGIN
5 baz
6 END

at least lines 2 to 5 should be printed (line 1, or 6, or both may also be printed, depending on whether and which range endpoint we are including/excluding).

We're going to assume a sed with ERE (-E) support (as should be the norm these days anyway).

From BEGIN to END, inclusive

This is obviously the easy one:

# print lines from /BEGIN/ to /END/, inclusive
$ sed '/BEGIN/,/END/!d'
$ sed -n '/BEGIN/,/END/p'

No mysteries here. Let's get to the interesting cases.

From BEGIN to END, excluding END

# print lines from /BEGIN/ to /END/, excluding /END/
$ sed '/BEGIN/!d; :loop; n; /END/d; $!bloop'

We start a loop when we see a /BEGIN/, and keep looping until we see an /END/, at which point we delete the line so it's not printed.
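To watch the loop at work, the command can be run on the sample input. This is only a sketch: it assumes GNU sed, and the "0 pre" and "7 post" lines are added here to show that lines outside the range are dropped.

```shell
# Sample input from above, plus one line before and one after the range
# (the "0 pre" and "7 post" lines are additions for illustration only).
sample() {
  printf '%s\n' '0 pre' '1 BEGIN' '2 foo' '3 bar' '4 BEGIN' '5 baz' '6 END' '7 post'
}

# "n" prints the current line and fetches the next one, so every line
# from /BEGIN/ onwards is printed until an /END/ line is read and deleted.
begin_to_before_end() {
  sed '/BEGIN/!d; :loop; n; /END/d; $!bloop'
}

sample | begin_to_before_end
```

This prints lines 1 to 5; the inner "4 BEGIN" line is kept because the loop never tests for /BEGIN/ again.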

From BEGIN to END, excluding BEGIN

# print lines from /BEGIN/ to /END/, excluding /BEGIN/
$ sed -E '/BEGIN/!d; :loop; N; /END/{ s/^[^\n]*\n//; p; d;}; $!bloop'

Same loop, but the lines are accumulated in the pattern space, and the first of them is removed before printing the whole block (note that the "D" command cannot be used for that purpose here, as it starts a new cycle).
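The same kind of check can be done for this variant. Again a sketch assuming GNU sed, with the "0 pre" and "7 post" lines added for illustration.

```shell
# Sample input from above, plus one line before and one after the range
# (the "0 pre" and "7 post" lines are additions for illustration only).
sample() {
  printf '%s\n' '0 pre' '1 BEGIN' '2 foo' '3 bar' '4 BEGIN' '5 baz' '6 END' '7 post'
}

# "N" appends lines to the pattern space until /END/ shows up; the first
# line (the /BEGIN/ one) is then stripped and the rest is printed at once.
after_begin_to_end() {
  sed -E '/BEGIN/!d; :loop; N; /END/{ s/^[^\n]*\n//; p; d;}; $!bloop'
}

sample | after_begin_to_end
```

This prints lines 2 to 6 as a single block.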

From BEGIN to END, not inclusive

This is of course just a small variation on the preceding one, in that we delete both the first and the last line:

# print lines from /BEGIN/ to /END/, excluding both lines
$ sed -E '/BEGIN/!d; :loop; N; /END/{ s/^[^\n]*\n//; s/\n?[^\n]*$//; /./p; d;}; $!bloop'

Since we're excluding both the start and the end line, what's left after removing them may be empty, so we check that there's at least one character left and we only print the pattern space if that is the case.
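The empty-range case can be exercised with a small input in which the first /BEGIN/ is immediately followed by /END/. The input below is made up for this purpose, and GNU sed is assumed.

```shell
# Hypothetical input: the first range is empty, the second holds one line.
sample() {
  printf '%s\n' '1 BEGIN' '2 END' '3 BEGIN' '4 foo' '5 END'
}

# Both endpoints are stripped; "/./p" prints only if something is left.
between_begin_end() {
  sed -E '/BEGIN/!d; :loop; N; /END/{ s/^[^\n]*\n//; s/\n?[^\n]*$//; /./p; d;}; $!bloop'
}

sample | between_begin_end
```

Only "4 foo" is printed; without the /./ test, the first range would produce an empty line in the output.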

For anything more complex, just use awk!

Pulling out strings

This is a generic text-processing need that often occurs in different kinds of scripts. Simply put, you want to get a list of the strings in the file (or files) that match a certain pattern. Let's use this simple file as an example:

12345#foobar3#blah
xxxxxxx#foobar77#yyyyyy
foobar867#zzzzzzz
ooooooo#foobar12#ggggggg#foobar17#kkkkkkkk#foobar99
xxxxxxxxxxxxxxxxx
somefoobar12thatwedontwant

Our pattern is (using ERE syntax) "foobar[0-9]+", that is, "foobar" followed by one or more digits. We will refine it a bit later.

Using common shell tools, we have several possibilities.

GNU grep

Probably the simplest one, if GNU grep is available, is to use its -o option, to return only the part of the input that matches the pattern, so:

$ grep -Eo 'foobar[0-9]+' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

As said, this needs GNU grep due to the -o option.

GNU awk and BusyBox awk

These two awk implementations support, as a non-standard extension, the assignment of a regular expression to RS, and make whatever matched RS available in the special variable RT (mawk seems to support the former feature, but not the latter, which makes it unsuitable to be used in the way we describe here). So here's how to use these awks for the task:

$ gawk -v RS='foobar[0-9]+' 'RT{print RT}' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

Note that using RS/RT this way allows matching patterns that contain newlines, something that's not easily achieved with other tools (except Perl, see below).

These methods are easy and quick; however, if none of the above implementations is available, we need to use something more standard.

Standard awk

With standard awk, a way to extract all occurrences is to use a loop over each line, repeatedly using match():

$ cat matches.awk
{
  line = $0
  while (match(line, /foobar[0-9]+/) > 0) {
    print substr(line, RSTART, RLENGTH)
    line = substr(line, RSTART + RLENGTH)
  }
}
$ awk -f matches.awk test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

Here the original line is saved (in case it's needed for further processing) and a copy is used to find matches. Since match() only finds the first match in the string, when a match is found it's removed so running match() again can find the following occurrence (if any). For this reason, the above code will loop forever if it's given a pattern that can match the empty string, like for example a*. When you do that, you really want a+ instead anyway, so use the latter. The code above is a common awk idiom to find all matches of a pattern.
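If patterns that can match the empty string cannot be ruled out, one possible safeguard is to always advance by at least one character. The following is a variation on the idiom, not the code shown above; the all_matches function name and the re variable are made up for this sketch.

```shell
# Print every non-empty match of the ERE given as $1, reading lines from stdin.
all_matches() {
  awk -v re="$1" '{
    line = $0
    while (line != "" && match(line, re) > 0) {
      # empty matches are skipped rather than printed
      if (RLENGTH > 0) print substr(line, RSTART, RLENGTH)
      # advance by at least one character, so an empty match cannot stall the loop
      line = substr(line, RSTART + (RLENGTH > 0 ? RLENGTH : 1))
    }
  }'
}

printf 'xaabxa\n' | all_matches 'a*'
```

With the input above this prints "aa" and then "a", and terminates even though a* matches the empty string at several positions.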

Sed

With sed the task is a bit complicated. Basically, we need to somehow "mark" the parts of the data that match our pattern, so we can later delete everything that's not between markers, leaving thus only what we want.

A safe character to use as marker is the newline character (\n), since sed guarantees that, under normal conditions, no input line as seen in the pattern space will contain that character. For the first of the following solutions to work, a sed implementation that recognizes \n in the RHS and the special bracket expression [^\n] (any character except \n) is needed. And since our pattern is an ERE (though it could be rewritten as BRE), we need a sed that recognizes EREs. GNU sed has all these features, and we're going to assume it in the examples.

That said, let's see a couple of ways to solve the task with sed.

One somewhat laborious solution is as follows:

$ sed -E '
s/foobar[0-9]+/\n&/g
t ok
d
:ok
s/^[^\n]*\n//
s/(foobar[0-9]+)[^\n]*/\1/g' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

Here we prepend a \n to each match, then delete what's before the very first match in the line (zero or more non-\n followed by a \n at the beginning of the string). Finally we delete all the parts between matches, which leaves us with just the matches, nicely separated by \n characters.

Another approach to the problem is implemented with the following code (which also has the benefit of being close to standard syntax: changing the ERE into a BRE (foobar[0-9][0-9]*) and converting all the "\n" in the RHS to literal escaped newlines would allow this solution to be used with a standard sed):

$ sed -E '
/\n/!s/foobar[0-9]+/\n&\n/g
/^foobar[0-9]+\n/P
D' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

Here the approach is to "isolate" each match with a \n before and one after (if the pattern space doesn't already have one). If the line begins with a match, it's printed with "P" (up to the following \n, which is what we want). Regardless, the part up to and including the first \n is deleted (with "D"). If something is left, go to the beginning to do the previous steps again, until the whole pattern space is entirely consumed. If there were no matches in the original line, "D" will just delete it entirely and start a new cycle. Rinse and repeat for every input line.
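For reference, this is how the conversion to standard syntax mentioned earlier might look: the ERE becomes a BRE, and each "\n" in the RHS becomes a backslash followed by a literal newline. It is a sketch; the extract_posix function name is made up, and the sample data is the test file from above fed on standard input.

```shell
# The example file used throughout, produced on stdout.
sample() {
  printf '%s\n' \
    '12345#foobar3#blah' \
    'xxxxxxx#foobar77#yyyyyy' \
    'foobar867#zzzzzzz' \
    'ooooooo#foobar12#ggggggg#foobar17#kkkkkkkk#foobar99' \
    'xxxxxxxxxxxxxxxxx' \
    'somefoobar12thatwedontwant'
}

# BRE instead of ERE, and backslash-newline instead of \n in the RHS.
# \n is still used in the patterns, where it matches an embedded newline.
extract_posix() {
  sed '
/\n/!s/foobar[0-9][0-9]*/\
&\
/g
/^foobar[0-9][0-9]*\n/P
D' "$@"
}

sample | extract_posix
```

The output should be the same seven strings produced by the -E version.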

Perl

With perl we can do it pretty easily thanks to its powerful regular expression matching operators:

$ perl -ne 'print "$_\n" for (/foobar\d+/g);' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99
foobar12

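If the pattern we want has newlines in it, we can just tell perl to slurp the whole file with perl -0777 -ne and we're set (note that -00 would instead select paragraph mode, which splits the input at blank lines).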

Context comes to town

All the solutions seen so far strictly match a pattern, regardless of where it appears. In other words, they ignore the context of the matches. However there may be cases where this is important. In our example input data, we might want to match foobar[0-9]+ only if it's delimited, where "delimited" here is defined as "preceded by either a hash (#) or beginning of line, and followed by either a hash or end of line". Clearly, with these new requirements we don't want the foobar12 in the last line.

We thus need to consider the context in the regular expressions, making them include a larger text, so that matches only happen where there's data that we want; however, since the matched text will now be larger than what we need, we need to subsequently "clean up" the match, extracting only what we want from it. Our regular expression becomes now (ERE syntax)

(^|#)foobar[0-9]+(#|$)

Let's see how to modify the previous solutions to work with context.

GNU grep

Grep can't really edit text, so it would seem like it's out of the discussion here, but with a silly trick we can still use it:

$ grep -Eo '(^|#)foobar[0-9]+(#|$)' test.txt | grep -Eo 'foobar[0-9]+'
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99

The first grep prints all matches with their context, and the second one, operating only on the good data, strictly "extracts" the matches that we need.

GNU awk and BusyBox awk

Setting RS to a non-default value obviously causes awk to stop working in line-oriented mode, so the beginning-of-line and end-of-line anchors in our regular expression need to be augmented to consider the newline character.

Now, with the extended RS, RT will contain the full match with context, so we use gsub() to clean it up:

$ gawk -v RS='(^|#|\n)foobar[0-9]+(#|\n|$)' 'RT{gsub(/^(#|\n)|(#|\n)$/, "", RT); print RT}' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99

The critical part here is obviously the gsub(), which should be written carefully to remove the context stuff and only leave what we want.

Standard awk

Here we don't change RS so we're using the traditional line-oriented mode:

$ cat matches2.awk
{
  line = $0
  while (match(line, /(^|#)foobar[0-9]+(#|$)/)>0) {
    m = substr(line, RSTART, RLENGTH)
    gsub(/^#|#$/, "", m); print m
    line = substr(line, RSTART + RLENGTH)
  }
}
$ awk -f matches2.awk test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99

Sed

Things start to get complicated with sed if we want context. However we can still do it.

Of the two sed solutions presented previously, the easiest to adapt is the second one, so here it is:

$ sed -E '
/\n/!s/(^|#)foobar[0-9]+(#|$)/\n&\n/g
/^#?foobar[0-9]+#?\n/ {
  s/^#?(foobar[0-9]+)#?/\1/
  P
}
D' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99

Again, the critical bit is the part where the context (that we needed to match only the "correct" parts, but no longer want) is removed. This part will be highly dependent on the actual input data and problem requirements.

Perl

Perl is again an easy winner, as we can match with context and pull out only the interesting parts in a single go:

$ perl -ne 'print "$_\n" for (/(?:^|#)(foobar\d+)(?:#|$)/g);' test.txt
foobar3
foobar77
foobar867
foobar12
foobar17
foobar99

The regular expressions for what comes before and after are non-capturing, so the list returned by the overall match is already made of clean strings, which we thus just need to print.

Overlap problems

You might have noticed that at the same time we introduced context to the matches, we also introduced the potential for overlap. Consider the following sample input data:

12345#foobar3#foobar9999#blah
somefoobar12thatwedontwant

If we run for example the above GNU awk solution on this data (saved as test2.txt), we get:

$ gawk -v RS='(^|#|\n)foobar[0-9]+(#|\n|$)' 'RT{gsub(/^(#|\n)|(#|\n)$/, "", RT); print RT}' test2.txt
foobar3

The foobar9999 is missed since the regular expression that matches foobar3 also "consumes" its surrounding context (the leading and trailing hash) and thus applying the regex with context again on what's left fails to match the second occurrence of the pattern.

However, this does not happen with all the solutions; only with some of them. The standard awk and the sed solutions still work since the previous match is deleted from the line, and the extended pattern we use to include context works if the match is at the beginning of a line without a delimiter, too. In the example, once #foobar3# has been matched and removed what's left is "^foobar9999#blah$", and the expression we're using for the match can still match it again since the pattern is at the very beginning and ^ is a possible anchor.
Of course, this happens to work because of the specific combination of input data and regular expressions that we're using; generally speaking, this doesn't have to be the case. It will depend on the actual situation.
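Where lookaround is not available, one way to rely less on this coincidence is to hand the trailing delimiter back to the remaining text, so that it can serve as the leading delimiter of the next match. The sketch below is a variation on the standard awk solution under that assumption, not the code shown earlier; the extract_delimited function name and the back variable are made up.

```shell
# Print delimited foobar[0-9]+ strings, leaving a trailing "#" in place
# so that it is still available as context for the next attempt.
extract_delimited() {
  awk '{
    line = $0
    while (match(line, /(^|#)foobar[0-9]+(#|$)/) > 0) {
      m = substr(line, RSTART, RLENGTH)
      # back is 1 when the match ended on a "#", which is then not consumed
      back = (m ~ /#$/) ? 1 : 0
      gsub(/^#|#$/, "", m); print m
      line = substr(line, RSTART + RLENGTH - back)
    }
  }' "$@"
}

printf '%s\n' '12345#foobar3#foobar9999#blah' 'somefoobar12thatwedontwant' |
  extract_delimited
```

On the sample data this prints foobar3 and foobar9999. Since every match is at least seven characters long, giving one character back cannot cause an endless loop.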

The modern RE engine answer to safely solve the overlapping context problem is, naturally, lookaround, which turns actual consumed characters into zero-length assertions, and leaves them available for the next match attempt. This means that sed and awk are excluded, since their RE engines do not support lookaround.

What's left is GNU grep (with its -P option to match in PCRE mode, where available), and of course perl.

grep:

$ grep -Po '(?<=^|#)foobar[0-9]+(?=#|$)' test2.txt
foobar3
foobar9999

There's also a pcregrep utility that comes with the PCRE library, with a syntax similar to that of grep. In particular, it supports the -o option, so we can also do:

$ pcregrep -o '(?<=^|#)foobar[0-9]+(?=#|$)' test2.txt
foobar3
foobar9999

Let's try perl:

$ perl -ne 'print "$_\n" for (/(?<=^|#)(foobar\d+)(?=#|$)/g);' test2.txt
Variable length lookbehind not implemented in regex m/(?<=^|#)(foobar\d+)(?=#|$)/ at -e line 1.

Oops...it seems PCRE is more advanced than perl itself in this particular feature. As man pcrepattern informs us,

The contents of a lookbehind assertion are restricted such that all the strings it matches must have a fixed length. However, if there are several top-level alternatives, they do not all have to have the same fixed length. Thus

(?<=bullock|donkey)

is permitted, but

(?<!dogs?|cats?)

causes an error at compile time. Branches that match different length strings are permitted only at the top level of a lookbehind assertion. This is an extension compared with Perl, which requires all branches to match the same length of string. An assertion such as

(?<=ab(c|de))

is not permitted, because its single top-level branch can match two different lengths, but it is acceptable to PCRE if rewritten to use two top-level branches:

(?<=abc|abde)

So what can we do with perl? We have two possibilities.

We note that, strictly speaking, and in this particular case, only what follows the match has to be preserved for the next attempt; the lookbehind is not strictly needed, and we can replace it with a regular match. Thus:

$ perl -ne 'print "$_\n" for (/(?:^|#)(foobar\d+)(?=#|$)/g);' test2.txt
foobar3
foobar9999

Another way to solve the problem is a bit ugly, but it works: we can just move the ^ anchor outside the lookbehind and make it part of a regular alternation; since it's a zero-length match anyway, nothing is harmed:

$ perl -ne 'print "$_\n" for (/(?:^|(?<=#))(foobar\d+)(?=#|$)/g);' test2.txt
foobar3
foobar9999

It is important to understand that there's no generic rule here, and the solution will necessarily have to depend on the problem at hand. Depending on the actual situation, transforming a variable-length lookbehind into something accepted by perl may not always be so easy (or even possible).

Diskless iSCSI boot with PXE HOWTO

Here we will boot a machine (diskless or not, but even if it has a disk it won't be used) entirely from the network using PXE and the iSCSI protocol.

There are a few options to boot a system whose root partition is on iSCSI:

  • The machine could have a local bootloader that loads a local kernel and initrd. With suitable options, the initrd scripts are directed to log into an iSCSI LUN and use it as /. In this case, the LUN that is used as root filesystem does not need to have a kernel or bootloader installed.
  • Same as above, but the kernel and initrd are downloaded using PXE (via TFTP or HTTP).
  • The most interesting option (and the one that will be described here) is booting directly the iSCSI LUN via PXE. In this case, the LUN looks exactly like a local disk, with partitions, MBR, bootloader (grub) etc. The MBR is read and executed, which loads the second-stage bootloader and so on, just as if the disk were local.

A peculiar thing about iSCSI is that it doesn't really like the network going away while a session is connected. For this reason it is very important that the network be stable and reliable, but there are also a few specific boot-time tweaks to do in the Linux distribution that is being run from iSCSI. One of them is, of course, supplying the needed iSCSI information to the kernel; another one is preventing the initscripts from trying to (re)configure the network on the interface that is being used for the iSCSI session, as this may cause it to go down temporarily. In this case, the network is configured early, by the initrd, and should not be touched afterwards.

For this example, we will boot a Debian Wheezy over iSCSI, using PXE to read the LUN right from the very beginning (MBR and bootloader stage). For this to work, a PXE implementation that supports booting from iSCSI is obviously needed. iPXE is one such implementation (see here for more information on how to set up a more complete PXE infrastructure); here we will assume that the booting client is sent iPXE commands.

Installation

Debian does not (yet?) support direct installation to iSCSI, so there are two ways to do this: the first way is to transfer an existing installation to the LUN (eg using dd or rsync). The second (described here) is to use debootstrap on an existing helper machine to partition, install and prepare the LUN. The specific tweaks described starting from "iSCSI boot configuration" have to be performed regardless of whether it's an existing or a new install (if it's an existing installation, remember to chroot into it before).

When commands are shown, the prompt shows where they have to be run: "helper" is the helper machine, "client" is the chroot environment (ie the future iSCSI boot client).

Log into the LUN

We assume that our LUN is provided by the SAN at 10.10.10.10 (san.example.com), is called iqn.2007-08.com.example.san:rootp and has a size of 10G. So from a (possibly Debian or Ubuntu) machine with open-iscsi installed, we can log into it:

helper# iscsiadm -m discovery -t sendtargets -p 10.10.10.10
10.10.10.10:3260,1 iqn.2007-08.com.example.san:rootp
helper# iscsiadm -m node -T 'iqn.2007-08.com.example.san:rootp' -p 10.10.10.10 -l
Logging in to [iface: default, target: iqn.2007-08.com.example.san:rootp, portal: 10.10.10.10,3260] (multiple)
Login to [iface: default, target: iqn.2007-08.com.example.san:rootp, portal: 10.10.10.10,3260] successful.
helper# ls -l /dev/disk/by-path
...
lrwxrwxrwx 1 root root  9 Nov  2 15:03 ip-10.10.10.10:3260-iscsi-iqn.2007-08.com.example.san:rootp-lun-0 -> ../../sda
...

Partitioning

To make things more interesting (not much), we're going to use the newer GPT partitioning. For simplicity, here we'll create a 512MB swap partition and a 9.5G root partition. On BIOS systems, which are still the majority, GPT also needs a small partition at the beginning of the disk, the so-called "BIOS boot partition" (type EF02). See here, here and here for more info (all three documents are very interesting reads). So here's the disk layout:

helper# gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.5

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 20971520 sectors, 10.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 67D92849-CD16-4CB1-8B3B-0758E62227CA
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 20971486
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            8191   3.0 MiB     EF02  BIOS boot partition
   2            8192         1056767   512.0 MiB   8200  Linux swap
   3         1056768        20971486   9.5 GiB     8300  Linux filesystem
helper# mkfs.ext4 /dev/sda3
mke2fs 1.42.5 (29-Jul-2012)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
622592 inodes, 2489339 blocks
124466 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2550136832
76 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done 

helper# mkswap /dev/sda2
Setting up swapspace version 1, size = 524284 KiB
no label, UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6
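
The figures reported by gdisk, mkfs.ext4 and mkswap can be cross-checked with a bit of shell arithmetic on the start and end sectors. This is only a sanity check; the variable names are made up, and the 4 KiB subtracted for swap assumes a page size of 4 KiB, since mkswap keeps the first page for its header.

```shell
sector=512                             # logical sector size reported by gdisk
per_mib=$(( 1048576 / sector ))        # sectors per MiB
per_kib_block=$(( 4096 / sector ))     # sectors per 4 KiB filesystem block

bios_sectors=$(( 8191 - 2048 + 1 ))
swap_sectors=$(( 1056767 - 8192 + 1 ))
root_sectors=$(( 20971486 - 1056768 + 1 ))

bios_mib=$(( bios_sectors / per_mib ))                # BIOS boot partition, MiB
swap_mib=$(( swap_sectors / per_mib ))                # swap partition, MiB
swap_kib=$(( swap_sectors * sector / 1024 - 4 ))      # usable swap, KiB
root_blocks=$(( root_sectors / per_kib_block ))       # whole 4 KiB blocks in root

echo "BIOS boot: ${bios_mib} MiB"
echo "swap: ${swap_mib} MiB (${swap_kib} KiB usable)"
echo "root: ${root_blocks} blocks of 4 KiB"
```

The results (3 MiB, 512 MiB, 524284 KiB and 2489339 blocks) agree with the listings above.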

System installation

Let's mount the partition and install a minimal system with debootstrap:

helper# mkdir /mnt/chroot
helper# mount /dev/sda3 /mnt/chroot
helper# debootstrap wheezy /mnt/chroot
I: Retrieving Release
I: Retrieving Release.gpg
I: Checking Release signature
...
I: Configuring tasksel...
I: Configuring tasksel-data...
I: Base system installed successfully.

Now let's chroot into the system to finish the install:

helper# mount -t proc none /mnt/chroot/proc
helper# mount -t sysfs none /mnt/chroot/sys
helper# mount --bind /dev /mnt/chroot/dev
helper# chroot /mnt/chroot /bin/bash
client#

Let's create /etc/mtab which is needed by many programs:

client# cp /proc/mounts /etc/mtab
client# sed -i '\|^/dev/sda3|,$!d' /etc/mtab

The sed command removes the first lines from the file, which are not relevant for the chrooted system, and keeps only lines from the one starting with /dev/sda3 to the end (replace sda3 if your partition name is different, of course).
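To see what the range expression keeps, the same sed script can be tried on a few sample lines before touching the real file. The mount entries below are hypothetical and only meant to resemble what /proc/mounts may contain inside the chroot.

```shell
# Hypothetical /proc/mounts content as seen from inside the chroot.
sample_mounts() {
  printf '%s\n' \
    'sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0' \
    'proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0' \
    '/dev/sda3 / ext4 rw,relatime 0 0' \
    'proc /proc proc rw,relatime 0 0'
}

# Delete every line that is NOT in the range "first /dev/sda3 line to end".
sample_mounts | sed '\|^/dev/sda3|,$!d'
```

Only the last two lines survive: the /dev/sda3 entry and everything after it.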

Now let's create /etc/fstab. In this case, the best option is working with UUIDs, so let's find them:

client# blkid /dev/sda2 /dev/sda3
/dev/sda2: UUID="e4f25981-3886-4939-a5cf-b05a0c7058a6" TYPE="swap" 
/dev/sda3: UUID="6c816f51-0613-45e7-a15b-bc2d5cd00f88" TYPE="ext4"
client# echo 'UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 / ext4 errors=remount-ro 0 1' >> /etc/fstab
client# echo 'UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6 none swap sw 0 0' >> /etc/fstab
client# cat /etc/fstab
# UNCONFIGURED FSTAB FOR BASE SYSTEM
UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 / ext4 errors=remount-ro 0 1
UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6 none swap sw 0 0

Here we can install any extra package that we want:

client# apt-get install vim less openssh-server locales

This is also the time to do any other needed customization (eg localization, setting hostname, repositories, etc.).

Finally, we need to install a kernel, a bootloader and the initramfs utilities that we'll use later:

client# apt-get install linux-image-amd64 grub2 initramfs-tools

When prompted, we choose to install grub to /dev/sda, just as we'd do with a local hard disk.

iSCSI boot configuration

Now it's time to finally do what it takes for the actual boot process to work. Basically, we need a special initrd that configures the network, logs into the iSCSI target LUN, mounts it as / and calls pivot_root() on it. We will provide the needed information in the form of kernel command line arguments.

The open-iscsi package includes the necessary initrd hooks to do the above, so let's install it:

client# apt-get install open-iscsi

The relevant bits are in /usr/share/initramfs-tools/scripts/local-top/iscsi, where we learn that we can pass information by setting various ISCSI_* variables. We also want early (ie, kernel-level) IP configuration, which again can be done with special arguments to the kernel. We pass all this information by modifying the grub kernel command line, so we need the following lines in the client's /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="ISCSI_INITIATOR=iqn.2007-08.com.example.client:client ISCSI_TARGET_NAME=iqn.2007-08.com.example.san:rootp ISCSI_TARGET_IP=10.10.10.10 ISCSI_TARGET_PORT=3260 root=UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 ip=10.10.10.50::10.10.10.1:255.255.255.0:client:eth0:off"

Here we're using static IP configuration; use "ip=dhcp" for DHCP (here the full story). Also, the GRUB_CMDLINE_LINUX_DEFAULT variable is normally set to "quiet", but it's probably better to remove that to be able to see what happens at boot. It can be re-added later if wanted.
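
The static form of the ip= argument is a colon-separated list whose first seven fields are client IP, server IP, gateway, netmask, hostname, device and autoconfiguration method. As an illustration, the value used above can be split into those fields in the shell; the variable names are chosen for this sketch only.

```shell
# The ip= value from the kernel command line above.
ip_arg='10.10.10.50::10.10.10.1:255.255.255.0:client:eth0:off'

# Split on ":"; an empty field (here the server IP) stays empty.
IFS=: read -r client_ip server_ip gateway netmask hostname device autoconf <<EOF
$ip_arg
EOF

printf 'client=%s server=%s gateway=%s netmask=%s hostname=%s device=%s autoconf=%s\n' \
  "$client_ip" "$server_ip" "$gateway" "$netmask" "$hostname" "$device" "$autoconf"
```

The empty second field is the server IP, which is not needed here, and "off" as the last field disables autoconfiguration.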

Also note that if the SAN needs authentication more variables are needed, most likely ISCSI_USERNAME and ISCSI_PASSWORD.

Looking into /usr/share/initramfs-tools/hooks/iscsi, we learn that for the initrd update process to know that we want the iSCSI stuff included, we need to create the file /etc/iscsi/iscsi.initramfs:

client# touch /etc/iscsi/iscsi.initramfs

We also see that the file /etc/iscsi/initiatorname.iscsi gets copied into the initrd and sourced to learn the initiator name, so let's write it inside it in the expected format:

client# echo "InitiatorName=iqn.2007-08.com.example.client:client" > /etc/iscsi/initiatorname.iscsi

Now to apply all our changes, we regenerate grub config and the initrd:

client# update-grub
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-4-amd64
Found initrd image: /boot/initrd.img-3.2.0-4-amd64
done
client# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-3.2.0-4-amd64

We also need to set a root password, otherwise we won't be able to login:

client# passwd
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

Lastly, as we said, we don't want the Debian initscripts to try to configure eth0 at boot. This is achieved in a simple way by either removing any reference to eth0 in /etc/network/interfaces, or just telling Debian that the configuration is "manual":

#/etc/network/interfaces
auto eth0
    iface eth0 inet manual
# other interfaces here ...

We can finally exit the chroot environment and log out of the iSCSI LUN in the helper machine:

client# exit
helper# umount /mnt/chroot/{dev,proc,sys,}
helper# iscsiadm -m node -T 'iqn.2007-08.com.example.san:rootp' -p 10.10.10.10 -u

PXE

Let's summarize what happens when our client is booted:

  • iPXE configures the network (either via DHCP or statically)
  • iPXE logs into the iSCSI LUN, mapping it as a local disk.
  • The MBR is read, and the boot process is kickstarted, which loads the kernel and the initrd.
  • Early IP configuration is performed during the boot, and an initrd script logs into the iSCSI LUN as specified on the kernel command line (the kernel is unaware of the PXE login)
  • pivot_root() is called on the iSCSI partition specified on the command line with root=, and from there the boot process proceeds normally

So we need to configure the first three steps. Using iPXE, all that we have to do is send this iPXE script to the client:

#!ipxe
set initiator-iqn iqn.2007-08.com.example.client:client
sanboot iscsi:san.example.com:6:3260:0:iqn.2007-08.com.example.san:rootp

This is the bare minimum; if your SAN needs authentication, then username and password should also be set before attempting to boot (see the iPXE docs, and SAN URIs explained).
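
The argument to sanboot is a SAN URI of the form iscsi:<server>:<protocol>:<port>:<LUN>:<target name>, where protocol 6 means TCP. A small helper to assemble it could look like the following; the san_uri function and its argument order are made up for this sketch.

```shell
# Build an iSCSI SAN URI: iscsi:<server>:<protocol>:<port>:<LUN>:<target>
# Usage: san_uri SERVER TARGET [PORT] [LUN]   (hypothetical helper)
san_uri() {
  server=$1
  target=$2
  port=${3:-3260}   # default iSCSI port
  lun=${4:-0}       # first LUN
  # the protocol field is fixed to 6 (TCP)
  printf 'iscsi:%s:6:%s:%s:%s\n' "$server" "$port" "$lun" "$target"
}

san_uri san.example.com iqn.2007-08.com.example.san:rootp
```

With the values used in this example, the helper prints the same URI that appears in the iPXE script.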

Test it!

So if we boot our client, we should see that iPXE logs into the LUN and loads GRUB:

...
Registered as SAN device 0x80
Booting from SAN device 0x80
GRUB loading.
Welcome to GRUB!

and after GRUB has booted the kernel, something like this in the kernel messages:

...
[    2.073406] scsi2 : iSCSI Initiator over TCP/IP
[    2.335112] scsi 2:0:0:0: Direct-Access     EQLOGIC  100E-00          4.3  PQ: 0 ANSI: 5
[    2.337709] scsi 2:0:0:0: Attached scsi generic sg1 type 0
[    2.349859] sd 2:0:0:0: [sda] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
[    2.351322] sd 2:0:0:0: [sda] Write Protect is off
[    2.352271] sd 2:0:0:0: [sda] Mode Sense: 77 00 00 08
[    2.353451] sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    2.368450]  sda: sda1 sda2 sda3
[    2.370812] sd 2:0:0:0: [sda] Attached SCSI disk
[    3.396538] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
...
[    4.810052] Adding 524284k swap on /dev/sda2.  Priority:-1 extents:1 across:524284k 
[    4.824409] EXT4-fs (sda3): re-mounted. Opts: (null)
[    4.959888] EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro
...

At this point, we can use this machine and do all the normal administrative operations (add/remove packages, upgrades, kernel configuration, etc.) in the usual way, as if it had a local hard disk.

Update 07/11/2014: According to the README.Debian file that comes with open-iscsi, when passing iSCSI variables from grub their names should be lowercase, though uppercase seems to work just as well. It is also possible to set the various variables directly in the /etc/iscsi/iscsi.initramfs file (this time in uppercase).

If manual configuration is necessary, there are two ways to include iSCSI boot
options in your initramfs:

1) Touch /etc/iscsi/iscsi.initramfs and provide options on the command line.
This provides flexibility, but if passwords are used, is not very secure.
Available boot line options:
iscsi_initiator, iscsi_target_name, iscsi_target_ip,
iscsi_target_port, iscsi_target_group, iscsi_username,
iscsi_password, iscsi_in_username, iscsi_in_password
See iscsistart --help for a description of each option

2) Provide iSCSI option in /etc/iscsi/iscsi.initramfs.
Available options:
ISCSI_INITIATOR, ISCSI_TARGET_NAME, ISCSI_TARGET_IP,
ISCSI_TARGET_PORT, ISCSI_TARGET_GROUP, ISCSI_USERNAME
ISCSI_PASSWORD, ISCSI_IN_USERNAME, ISCSI_IN_PASSWORD

Example Syntax:

ISCSI_INITIATOR="iqn.1993-08.org.debian:01:9b3e5634fdb9"
ISCSI_TARGET_NAME=iqn.2008-01.com.example:storage.foo
ISCSI_TARGET_IP=192.168.1.1
ISCSI_TARGET_PORT=3160
ISCSI_USERNAME="username"
ISCSI_PASSWORD="password"
ISCSI_IN_USERNAME="in_username"
ISCSI_IN_PASSWORD="in_password"
ISCSI_TARGET_GROUP=1

Remember to set proper permissions if username/passwords are used.

Update2 29/01/2015: It seems iPXE has trouble booting from those iSCSI targets that use multiple IP addresses (eg most Dell Equallogic), where there is a "main" or "group" IP to which initiators connect, but the actual session is established against another IP to which the initiator is directed after the first contact with the main IP (typically, the IP of a specific iSCSI interface on the storage). iPXE seems to have trouble understanding this form of "redirection", and just fails the iSCSI login. On the other hand, connecting to single-IP targets like iSCSI Enterprise Target works fine.

PXE server with dnsmasq, apache and iPXE

Here we're going to set up a PXE server that would boot even cards with bad or buggy PXE firmware, without having to flash them.

First, some words about PXE.

PXE

PXE, acronym of Preboot eXecution Environment, is a specification originally developed by Intel that allows a computer to boot over the network. This has obvious applications in the case of diskless boxes (eg thin clients), but it can also be useful for normal machines, for example to temporarily boot a rescue disk, or (re)install the OS over the network without needing any physical medium.

Simplifying a bit, it goes like this:

  • A machine is turned on. In the BIOS, the boot order says to try PXE first (or a key can be pressed during the POST to the same effect, normally).
  • Its network card (NIC) has a chip with a special firmware, which implements a minimal stack of TCP/IP protocols (DHCP, TFTP, possibly DNS).
  • This firmware is loaded and performs a DHCP broadcast to get an IP address and other pieces of information.
  • If a suitable DHCP server sees the request, it selects an IP address and assigns it to the client.
  • Up to here, it's not different from normal DHCP. However, the server also sends two special pieces of information to the client in the DHCP offer: one is the name or IP address of a server (the so-called "next server", which may be the same DHCP server or not), the other one is the name of a file to download from there (so-called "boot filename" in DHCP speak, or "network boot program" (NBP) in PXE speak).
  • The PXE client configures its TCP/IP stack with the received information, then tries to download the boot filename from the next server via TFTP.
  • If it succeeds, it loads the NBP in memory and runs it. From now on, the NBP takes over and does whatever it takes to fully boot the machine.

Sounds simple, but as usual life isn't as simple as it seems. There are a few things to be noted.

First, while originally the NBP was downloaded via TFTP (and many times still is), some enhanced PXE implementations (like gPXE or iPXE) can use HTTP. They also support extra protocols like iSCSI or AoE (to boot from SANs).

Second, PXE isn't just a sequence of steps to bootstrap a machine; it also specifies an API. This means that the NBP runs in a special environment and can make use of many functionalities made available by the PXE that loaded it. In particular, if the calling PXE supports HTTP networking, this means that the NBP can too, via the PXE API, even if it wouldn't otherwise support it natively.

Let's take the case of pxelinux, probably the most used NBP for its flexibility. Only recent versions support HTTP natively; however, older versions (starting from 3.70, which is quite old) can use the PXE API and do HTTP if they are invoked from an HTTP-capable PXE implementation like gPXE or iPXE mentioned above. Since ideally we want our PXE server to serve stuff over HTTP as much as possible rather than TFTP, all this is quite good.

However, these enhanced PXE implementations are normally not found in consumer-end NICs, which instead tend to come with limited or buggy PXE implementations. There are a few workarounds for this:

  • Load the enhanced PXE firmware from a floppy, CDROM, or USB stick. So in the BIOS, the machine is configured to boot from the appropriate removable media, which loads the PXE firmware, which in turn boots from the network. In general, this is not very practical: the media can be lost or damaged, the reader can break, and many machines don't even have a floppy or CD reader anymore.
  • The NIC ROM can be flashed with the enhanced firmware. This is better, but it still requires some special action. For hundreds of machines, again this is not very practical.
  • The enhanced PXE firmware can be downloaded (chainloaded) by the buggy PXE as if it were an NBP (via TFTP), then take over and do the "real" PXE boot, downloading the "real" NBP which will then be able to use the API in the enhanced environment (with HTTP and all).

The last option is the easiest and most convenient to implement, since it does not require messing around with sneakernet or ROM flashing, and is what is described here.

The plan

So we are going to use dnsmasq as our DHCP and TFTP server, apache to serve HTTP (for no particular reason, just because it's easy to set up with PHP), and iPXE for the enhanced PXE firmware. All running on the same machine for convenience, but there's no reason why the web server could not run on another box.

Since the DHCP server will possibly see (at least) two different DHCP queries (first one from the buggy PXE firmware, then one from iPXE), and has to send different NBP strings to them, a way is needed to tell which query we are seeing.

This is quite straightforward: if we capture the traffic with tcpdump, we see that the requests coming from iPXE have at least two identifying characteristics that are not present in requests not coming from iPXE. The first is DHCP option number 175, which is used for iPXE/gPXE-specific information. The second is the iPXE user class, which again is not normally present.

15:14:41.719114 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto UDP (17), length 415)
    0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:12:34:56:78:90, length 387, xid 0x71ceb4, secs 4, Flags [none]
	  Client-Ethernet-Address 00:12:34:56:78:90
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Discover
	    MSZ Option 57, length 2: 1472
	    ARCH Option 93, length 2: 0
	    NDI Option 94, length 3: 1.2.1
	    Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
	    CLASS Option 77, length 4: "iPXE"
	    Parameter-Request Option 55, length 13: 
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
	      Hostname, Domain-Name, RP, Vendor-Option
	      Vendor-Class, TFTP, BF, Option 175
	      Option 203
	    T175 Option 175, length 45: 177.5.1.26.244.16.0.24.1.1.35.1.1.34.1.1.25.1.1.33.1.1.16.1.2.19.1.1.17.1.1.235.3.1.0.0.23.1.1.21.1.1.18.1.1
	    Client-ID Option 61, length 7: ether 00:12:34:56:78:90
	    GUID Option 97, length 17: 0.99.10.237.238.79.65.104.58.28.29.107.2.246.140.217.96

It's also easy to see the same information in the DHCP server log.

In dnsmasq, we set a tag if we detect that the request comes from iPXE, and do different things depending on whether or not the tag is set. If the request is from a non-enhanced PXE client, we send them the iPXE firmware; otherwise, it's iPXE so we direct it to an HTTP URL to continue the boot process (see below).

To have maximum flexibility, we want to be able to tell which client we're talking to, and possibly give different orders to different clients. ("Orders" here means "iPXE scripts", which are textual sequences of iPXE directives that tell the clients to do certain things.)

To this end, we direct iPXE to do an HTTP GET request containing various parameters that identify the client. On the server this runs a PHP script that decides what to do based on the received values. We thus send back an iPXE script containing further instructions to the client (eg "chainload pxelinux.0", "boot from iscsi", etc. See below for the examples).

This allows us to do things like (for example) "Client X: go get pxelinux from the local HTTP server to boot a rescue environment. Client Y: boot from iSCSI, here is the LUN URL. Client Z: boot pxelinux from another HTTP server to do an unattended Debian install..."

Configuration

Now that we have defined the plan, let's finally get to the practical bits. It is assumed that the PXE server (pxe.example.com) has IP address 10.188.0.10/24, the network's default gateway is 10.188.0.1, and the DNS server is 10.188.0.20. It is also assumed that no other DHCP servers are present in the network.

dnsmasq configuration

The configuration of dnsmasq is short (of course adapt as needed):

interface=eth0
domain=example.com
dhcp-range=10.188.0.60,10.188.0.70,12h
dhcp-option=option:router,10.188.0.1
dhcp-option=option:dns-server,10.188.0.20
dhcp-authoritative

# enable logging
log-queries
log-dhcp

# set tag "ENH" if request comes from iPXE ("iPXE" user class)
dhcp-userclass=set:ENH,iPXE

# alternative way, look for option 175
#dhcp-match=set:ENH,175

# if request comes from dumb firmware, send them iPXE (via TFTP)
dhcp-boot=tag:!ENH,undionly.kpxe,10.188.0.10

# if request comes from iPXE, direct it to boot from boot1.txt
dhcp-boot=tag:ENH,http://pxe.example.com/boot1.txt

dhcp-no-override

enable-tftp
tftp-root=/var/www

So we set the tag ENH (set:ENH) if the request comes from iPXE. The tag:!ENH syntax means "if the ENH tag is NOT set". Note that this syntax requires a reasonably recent version of dnsmasq; in older versions, "net:" had to be used instead of "tag:", and "#ENH" instead of "!ENH" (ie, "net:#ENH") to say "tag ENH not set".

The file undionly.kpxe (or a symlink to it) has to be in /var/www, and is the iPXE implementation used for chainloading, which is sent to the dumb clients via TFTP. This is the only TFTP transaction in the whole process. Once the client has loaded iPXE, everything happens over HTTP.

As a special case (in a positive sense), when PXE-booting a KVM virtual machine the very first request that the server sees already comes from iPXE, since that's what qemu uses to implement the VM's PXE "firmware". This means that in that case the process will be faster, since the chainloading phase will be skipped and the client sent directly to the HTTP URL.

Regardless of whether the client is originally dumb or not, it will eventually end up fetching boot1.txt (see below) via HTTP.

The last configuration lines enable dnsmasq's internal TFTP server, telling it to serve files (not coincidentally) from /var/www. And so...

Apache configuration

Any web server with PHP support would work, in fact; it's just that with apache, a running PHP is just two commands away with zero configuration.
And of course, it doesn't even have to be PHP: any server-side scripting language will do.

So our client (which is running iPXE, and can do HTTP) fetches boot1.txt, which lives in /var/www. Here's what it looks like:

#!ipxe

chain http://pxe.example.com/boot2.php?mac=${mac}&ip=${ip}&asset=${asset}&netmask=${netmask}&gateway=${gateway}&dns=${dns}&domain=${domain}&filename=${filename}&nextserver=${next-server}&hostname=${hostname}&uuid=${uuid}&userclass=${user-class}&manufacturer=${manufacturer}&product=${product}&serial=${serial}

This is an iPXE script that chainloads another URL. Basically, it's just a cheap trick to send as much information as possible about the client to the server via a gigantic HTTP GET, so the client can be identified for further processing (though 99% of the time only the MAC address will be looked at, it's good to have as many variables as possible). iPXE replaces the various ${mac}, ${ip} etc. variables with the actual values for the client and also does URL-encoding. The full list of available parameters is here in the docs.

The above URL could also be supplied directly from dnsmasq, by replacing the URL in the dhcp-boot=tag:ENH,http://pxe.example.com/boot1.txt line with the one in boot1.txt. However it looks like that way the URL gets truncated if it's too long, so it's better to be safe and put it in its own file.

Now, finally, let's look at what boot2.php (which must also be in /var/www) looks like. Here is where we actually decide what to do with each client.

<?php
 
# send a suitable iPXE script to a client

echo "#!ipxe\n";
 
switch ($_GET['mac']) {
 
  case '00:12:34:56:78:90':
    # boot pxelinux from this server
    echo "chain http://pxe.example.com/pxelinux.0\n";
    break;
 
  case '00:11:22:33:44:55':
    # boot from iSCSI
    echo "set initiator-iqn iqn.2007-08.com.example.initiator:initiator\n";
    # see http://ipxe.org/sanuri for the syntax
    echo "sanboot iscsi:san.example.com:6:3260:0:iqn.2007-08.com.example.san:sometarget\n";
    break;
 
  case '00:77:21:ab:cd:ee':
    # boot boot.salstar.sk's super cool boot menu      
    echo "chain http://boot.salstar.sk\n";
    break;
 
  default:
    # exit iPXE and let machine go on with BIOS boot sequence
    echo "exit\n";
    break;
}
 
 
?>

In short, each client will receive an iPXE script telling it what to do. Here clients are detected by their MACs, but any variable among those that we pass can be used, of course.
If a client has no specific treatment set up for it, it will end up in the "default" branch of the switch statement, which will just direct it to exit iPXE and try the next device in the BIOS boot sequence, which would normally mean it will boot from its local hard disk (again this can be changed, of course). Another option is to chainload another bootloader that is able to boot a local disk, for example GRUB4DOS as explained in this page.

Another thing that can be done here, in case the client is told to chainload pxelinux and pxelinux resides on the same server, is to generate some pieces of pxelinux config dynamically and write them to some file which will then be included by the main pxelinux configuration (since syslinux/pxelinux, to the best of my knowledge, do not allow variables in the configuration).
Typical examples are kernel parameters for the client (ie, those that are passed using APPEND in pxelinux), for example console port and speed definition or module parameters related to the actual client hardware, or syslinux/pxelinux menu customizations.

It's even possible to fetch and boot stuff off the Internet, as in the iPXE demo image, which can be loaded by directing the client to chain http://boot.ipxe.org/demo/boot.php. It really works. But the coolest service is, as shown for the third client in the above example, http://boot.salstar.sk, which allows booting and installing a lot of operating systems off the Internet. It's really impressive. Well done!

pxelinux

If we direct a client to load pxelinux, then there is another degree of flexibility there, since pxelinux will try to load several configuration files, named from the most specific to the most generic, until it succeeds. Normally the sequence of attempts looks something like this:

GET /pxelinux.cfg/44454c4c-3900-104e-804e-b9c04f4d344a
GET /pxelinux.cfg/01-00-26-b9-5e-30-3a
GET /pxelinux.cfg/C0A80744
GET /pxelinux.cfg/C0A8074
GET /pxelinux.cfg/C0A807
GET /pxelinux.cfg/C0A80
GET /pxelinux.cfg/C0A8
GET /pxelinux.cfg/C0A
GET /pxelinux.cfg/C0
GET /pxelinux.cfg/C
GET /pxelinux.cfg/default

So again what a given client does can be decided by assigning it a pxelinux configuration file with a name more specific than "default", which is what gets loaded if nothing better is found.
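
To work out which file names a given client will ask for, the search order can be reproduced with a few lines of shell. This is only a sketch: pxelinux_candidates is a made-up helper name, it assumes an Ethernet client with an IPv4 address, and it leaves out the UUID-based name that pxelinux tries first when the client has a UUID.

```shell
# pxelinux_candidates MAC IPV4  (hypothetical helper, not part of pxelinux)
# Prints the config file names to look for, most specific first.
pxelinux_candidates() {
  mac=$1 ip=$2
  # "01" is the ARP hardware type for Ethernet; the MAC goes in lowercase
  # hex with dashes instead of colons
  printf '01-%s\n' "$(printf '%s' "$mac" | tr 'A-F:' 'a-f-')"
  # split the dotted quad into $1..$4
  oldifs=$IFS; IFS=.
  set -- $ip
  IFS=$oldifs
  # IPv4 address as eight uppercase hex digits, then drop one digit at a time
  hex=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")
  while [ -n "$hex" ]; do
    printf '%s\n' "$hex"
    hex=${hex%?}
  done
  printf 'default\n'
}

candidates=$(pxelinux_candidates 00:26:B9:5E:30:3A 192.168.7.68)
printf '%s\n' "$candidates"
```

For the MAC and IP address used here, the names printed are the ones in the listing above, minus the UUID line.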

And of course, pxelinux.0 plus any other file needed by the configuration files (eg menu.c32 etc.) must be present in the document root of the web server (symlinks to them work too).

Since pxelinux is running with HTTP support thanks to iPXE, HTTP URLs can be used anywhere a file name would, eg

# ok, this doesn't make much sense
LINUX http://server1.example.com/vmlinuz
INITRD http://server2.example.com/initram.gz

and even if you don't explicitly specify http://server.name, it implicitly assumes that it has to use HTTP anyway (in that case, it automatically prepends the URL it's booting from to the names).

Conclusions

With this system it really becomes possible to do whatever one may imagine via PXE, and everything is controlled and managed from a single place.

Further reading (on the interactions between pxelinux and gPXE, but also relevant for iPXE):

Clarifying the relationship between PXELinux, Etherboot and gPXE/iPXE

Argument juggling with awk

This seems to be a sort of FAQ. A typical formulation goes like "I have a bash array, how do I pass it to awk so that it becomes an awk array"?

Leaving aside the fact that it may be possible to extend the awk code to do whatever one is doing with the shell array (in which case the problem goes away), let's focus on how to do strictly what is requested (and more).

ARGC and ARGV

Like many other languages, awk has two special variables ARGC and ARGV that give information on the arguments passed to the awk program. ARGC contains the total number of arguments (including the awk interpreter or script), and ARGV is an array of ARGC elements (indexed from 0 to ARGC - 1) that contains all the arguments (ARGV[0] is always the name of the awk interpreter or script).
Let's demonstrate this with a simple example:

awk 'BEGIN{print "ARGC is " ARGC; for(i = 0; i < ARGC; i++) print "ARGV["i"] is " ARGV[i]}' foo bar
ARGC is 3
ARGV[0] is awk
ARGV[1] is foo
ARGV[2] is bar

There are two important things to know:

  • Unlike in some other languages, in awk both ARGC and ARGV can be modified by the program.
  • When awk's main loop starts (and only then), awk processes whatever it finds in ARGV, starting from ARGV[1] up to ARGV[ARGC - 1].

Of course, these should normally be file names or variable assignments. But this is only relevant when the main loop starts; before then, in the BEGIN block we can manipulate ARGC and ARGV to our taste, and as long as what's left afterwards in ARGV is a list of files to process (or variable assignments), awk doesn't really care how those values got there.
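
As a minimal illustration of the point (the scratch directory and file names are made up for the demo), decrementing ARGC in the BEGIN block is enough to hide the last argument from the main loop:

```shell
# Scratch files, only for this demo
tmpdir=$(mktemp -d)
printf 'a\n' > "$tmpdir/one.txt"
printf 'b\n' > "$tmpdir/two.txt"

# BEGIN runs before any input is read, so by the time the main loop
# starts awk only looks at ARGV[1]; two.txt is never opened
argc_demo=$(awk 'BEGIN { ARGC-- } { print FILENAME ": " $0 }' \
  "$tmpdir/one.txt" "$tmpdir/two.txt")

printf '%s\n' "$argc_demo"
rm -rf "$tmpdir"
```

Only the line coming from one.txt is printed.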

So let's see some use cases for ARGC/ARGV manipulation.

Double pass over a file

Some code uses the two-file idiom to process the same file twice. So instead of doing

awk .... file.txt file.txt

we could just specify the file name once and double it in the BEGIN block so awk sees it twice:

# this is as if we said awk ..... file.txt file.txt
awk 'BEGIN{ARGV[ARGC++] = ARGV[1]} { ... }' file.txt
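
To make this concrete, here is a small sketch that prints each number in a file together with its share of the total, which needs the total to be known before the second pass. The file name and its contents are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf '10\n30\n60\n' > "$tmpdir/nums.txt"

shares=$(awk '
  BEGIN { ARGV[ARGC++] = ARGV[1] }                # queue the same file again
  NR == FNR { total += $1; next }                 # first pass: sum the values
  { printf "%s %d%%\n", $1, 100 * $1 / total }    # second pass: print shares
' "$tmpdir/nums.txt")

printf '%s\n' "$shares"
rm -rf "$tmpdir"
```

The usual NR == FNR test tells the two passes apart, exactly as it would if the file name had been typed twice.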

Fixed arguments

Let's assume that our awk code always has to process one or more files, whose names do not change. Of course, we could specify those names at each invocation of awk; nothing new here. However, for some reason we don't want to specify those names at each invocation, since they never change anyway; we only want to specify the variable file names. So if we have two never-changing files ("fixed1.txt" and "fixed2.txt"), we want to invoke our code with

process.awk file1 file2 file3 ...

but in fact we want awk to run as if we said

process.awk fixed1.txt fixed2.txt file1 file2 file3 ...

Let's see what the code to accomplish this may look like (of course it has to be adapted to the specific situation):

awk 'BEGIN {
  for(i = ARGC+1; i > 2; i--)
    ARGV[i] = ARGV[i - 2]
  ARGC += 2
  ARGV[1] = "fixed1.txt"
  ARGV[2] = "fixed2.txt"
}
# now awk processes fixed1.txt and fixed2.txt first, then whatever was specified on the command line
{
  ...
}' file1 file2 file3 ...
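
A quick way to check the resulting order is to print FILENAME for every line read. The sketch below does that in a scratch directory with one-line files (all file contents are invented for the demo):

```shell
tmpdir=$(mktemp -d)
printf 'F1\n' > "$tmpdir/fixed1.txt"
printf 'F2\n' > "$tmpdir/fixed2.txt"
printf 'A\n'  > "$tmpdir/file1"
printf 'B\n'  > "$tmpdir/file2"

# run in a subshell so the relative names inside the awk code resolve
order=$(cd "$tmpdir" && awk 'BEGIN {
  # shift the names given on the command line up by two slots, top first
  for (i = ARGC + 1; i > 2; i--)
    ARGV[i] = ARGV[i - 2]
  ARGC += 2
  ARGV[1] = "fixed1.txt"
  ARGV[2] = "fixed2.txt"
}
{ print FILENAME }' file1 file2)

printf '%s\n' "$order"
rm -rf "$tmpdir"
```

The fixed files come out first, followed by the files named on the command line.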

Passing a shell array (and more or less arbitrary data)

So, to get back to the original question, how can we take advantage of this juggling to pass in an array? A simple way would be to pass all the array elements as normal awk arguments, process them in the BEGIN block, then remove them so when the main loop starts awk is unaware of what happened. Let's see an example:

shellarr=( 'foo' 'bar' 'baz' 'xxx' 'yyy' )
 
awk 'BEGIN{
 
  # ARGV[1] is the number of elements we have
  arrlen = ARGV[1]
 
  for(i = 2; i <= arrlen + 1; i++)
    awkarr[i - 1] = ARGV[i]
 
  # clean up
  j = 1
  for(i = arrlen + 2; i < ARGC; i++)
    ARGV[j++] = ARGV[i]
  ARGC = j
}
 
# here awk starts processing from file1, unaware of what we did earlier
# but we have awkarr[] populated with the values from shellarr (and arrlen is its length)
{
  ...
}
 
' ${#shellarr[@]} "${shellarr[@]}" file1 file2

awkarr has its elements indexed starting from 1, as is customary in awk; it's easy to adapt the code to use 0-based indexing or any other starting index.
We could also pass the number of elements in the array as a normal value using -v, which simplifies processing somewhat:

shellarr=( 'foo' 'bar' 'baz' 'xxx' 'yyy' )
 
awk -v arrlen="${#shellarr[@]}" 'BEGIN{
 
  for(i = 1; i <= arrlen; i++)
    awkarr[i] = ARGV[i]
 
  # clean up
  for(i = arrlen + 1; i < ARGC; i++)
    ARGV[i - arrlen] = ARGV[i]
  ARGC -= arrlen
}
# ... as before
 
' "${shellarr[@]}" file1 file2

If the number of files to process is known (which should be the most common case), then it's even easier as we can specify them first and the array elements afterwards. Let's assume we know that we always process two files:

shellarr=( 'foo' 'bar' 'baz' 'xxx' 'yyy' )
 
awk -v nfiles=2 'BEGIN{
  for(i = nfiles + 1; i < ARGC; i++)
    awkarr[i - nfiles] = ARGV[i]
  arrlen = ARGC - (nfiles + 1)
  ARGC = nfiles + 1
}
# ... as before
 
' file1 file2 "${shellarr[@]}"

Finally, if we want to "pass" a shell associative array to awk (such that it exists with the same keys and values in the awk code), we could do this:

declare -A shellarr
shellarr=( [fook]='foov' [bark]='barv' [bazk]='bazv' [xxxk]='xxxv' [yyyk]='yyyv' )
 
awk -v nfiles=2 'BEGIN{
  arrlen = ( ARGC - (nfiles + 1) ) / 2
  for(i = nfiles + 1; i < nfiles + 1 + arrlen; i++)
    awkarr[ARGV[i]] = ARGV[i + arrlen]
  ARGC = nfiles + 1
}
# ... as before
 
' file1 file2 "${!shellarr[@]}" "${shellarr[@]}"

This works because in bash, the order of expansion of "${!shellarr[@]}" and "${shellarr[@]}" is the same (currently, at least). To be 100% sure, however, we could of course copy all the key, value pairs to another array and pass that one, as in the following example:

declare -A shellarr
shellarr=( [fook]='foov' [bark]='barv' [bazk]='bazv' [xxxk]='xxxv' [yyyk]='yyyv' )
 
declare -a temp
for key in "${!shellarr[@]}"; do
  temp+=( "$key" "${shellarr[$key]}" )
done
 
awk -v nfiles=2 'BEGIN{
  arrlen = ( ARGC - (nfiles + 1) ) / 2
  for(i = nfiles + 1; i < ARGC; i += 2)
    awkarr[ARGV[i]] = ARGV[i + 1]
  ARGC = nfiles + 1
}
# ... as before
 
' file1 file2 "${temp[@]}"

In the last two examples, it should be noted that, as usual with associative arrays, the concept of array "length" doesn't make much sense; it's just an indication of how many elements the hash has, and nothing more (in awk, all arrays are associative regardless, though they can be used as "normal" ones as we did in the first examples).
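
Here is a complete, runnable variation of the last example, as a sketch only: it needs bash 4 or later for associative arrays, and the keys.txt file and its contents are made up. The main block uses awkarr to translate each key found in the input file, printing "?" for keys that were not passed in.

```shell
# Requires bash >= 4 (associative arrays)
declare -A shellarr=( [fook]='foov' [bark]='barv' )

# Flatten to key, value, key, value, ... so the pairing cannot get out of sync
declare -a temp=()
for key in "${!shellarr[@]}"; do
  temp+=( "$key" "${shellarr[$key]}" )
done

tmpdir=$(mktemp -d)
printf 'fook\nbazk\nbark\n' > "$tmpdir/keys.txt"

lookup=$(awk -v nfiles=1 'BEGIN {
  for (i = nfiles + 1; i < ARGC; i += 2)
    awkarr[ARGV[i]] = ARGV[i + 1]
  ARGC = nfiles + 1          # hide the pairs from the main loop
}
{ print (($1 in awkarr) ? awkarr[$1] : "?") }' "$tmpdir/keys.txt" "${temp[@]}")

printf '%s\n' "$lookup"
rm -rf "$tmpdir"
```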

Update 31/10/2013: So there's always something new to learn, and in my case it was that if an element of ARGV is the empty string, awk just skips it. This simplifies the examples where the ARGV elements are moved down to fill the positions where the shell array elements were. In fact, all that's needed is to set those elements to "", and awk will naturally skip them. So the first two examples above become:

shellarr=( 'foo' 'bar' 'baz' 'xxx' 'yyy' )
 
awk 'BEGIN{
 
  # ARGV[1] is the number of elements we have
  arrlen = ARGV[1]
  ARGV[1] = ""
 
  for(i = 2; i <= arrlen + 1; i++) {
    awkarr[i - 1] = ARGV[i]
    ARGV[i] = ""
  }
}
...' ${#shellarr[@]} "${shellarr[@]}" file1 file2

Second example:

shellarr=( 'foo' 'bar' 'baz' 'xxx' 'yyy' )
 
awk -v arrlen="${#shellarr[@]}" 'BEGIN{
  for(i = 1; i <= arrlen; i++) {
    awkarr[i] = ARGV[i]
    ARGV[i] = ""
  }
}
...' "${shellarr[@]}" file1 file2
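
The skipping behaviour itself is easy to verify. In the sketch below (file names invented for the demo), the first argument names a file that does not exist; since it is blanked in the BEGIN block, awk never tries to open it and goes straight to the real file.

```shell
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/data.txt"

# "not-a-file" does not exist, but it is saved and then set to "" in BEGIN,
# so the main loop skips that slot without complaining
skipped=$(awk 'BEGIN { arg = ARGV[1]; ARGV[1] = "" }
  { print arg ": " $0 }' not-a-file "$tmpdir/data.txt")

printf '%s\n' "$skipped"
rm -rf "$tmpdir"
```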