
Diskless iSCSI boot with PXE HOWTO

Here we will boot a machine (diskless or not, but even if it has a disk it won't be used) entirely from the network using PXE and the iSCSI protocol.

There are a few options to boot a system whose root partition is on iSCSI:

  • The machine could have a local bootloader that loads a local kernel and initrd. With suitable options, the initrd scripts are directed to log into an iSCSI LUN and use it as /. In this case, the LUN that is used as root filesystem does not need to have a kernel or bootloader installed.
  • Same as above, but the kernel and initrd are downloaded using PXE (via TFTP or HTTP).
  • The most interesting option (and the one that will be described here) is booting directly from the iSCSI LUN via PXE. In this case, the LUN looks exactly like a local disk, with partitions, MBR, bootloader (grub) etc. The MBR is read and executed, which loads the second-stage bootloader and so on, just as if the disk were local.

A peculiar thing about iSCSI is that it doesn't really like the network going away while a session is connected. For this reason it is very important that the network be stable and reliable, but there are also a few specific boot-time tweaks to do in the Linux distribution that is being run from iSCSI. One of them is, of course, supplying the needed iSCSI information to the kernel; another one is preventing the initscripts from trying to (re)configure the network on the interface that is being used for the iSCSI session, as this may cause it to go down temporarily. In this case, the network is configured early, by the initrd, and should not be touched afterwards.

For this example, we will boot Debian Wheezy over iSCSI, using PXE to read the LUN right from the very beginning (MBR and bootloader stage). For this to work, a PXE implementation that supports booting from iSCSI is obviously needed. iPXE is one such implementation (see here for more information on how to set up a more complete PXE infrastructure); here we will assume that the booting client is sent iPXE commands.

Installation

Debian does not (yet?) support direct installation to iSCSI, so there are two ways to do this: the first way is to transfer an existing installation to the LUN (eg using dd or rsync). The second (described here) is to use debootstrap on an existing helper machine to partition, install and prepare the LUN. The specific tweaks described starting from "iSCSI boot configuration" have to be performed regardless of whether it's an existing or a new install (if it's an existing installation, remember to chroot into it first).

When commands are shown, the prompt shows where they have to be run: "helper" is the helper machine, "client" is the chroot environment (ie the future iSCSI boot client).

Log into the LUN

We assume that our LUN is provided by the SAN at 10.10.10.10 (san.example.com), is called iqn.2007-08.com.example.san:rootp and has a size of 10G. So from a (possibly Debian or Ubuntu) machine with open-iscsi installed, we can log into it:

helper# iscsiadm -m discovery -t sendtargets -p 10.10.10.10
10.10.10.10:3260,1 iqn.2007-08.com.example.san:rootp
helper# iscsiadm -m node -T 'iqn.2007-08.com.example.san:rootp' -p 10.10.10.10 -l
Logging in to [iface: default, target: iqn.2007-08.com.example.san:rootp, portal: 10.10.10.10,3260] (multiple)
Login to [iface: default, target: iqn.2007-08.com.example.san:rootp, portal: 10.10.10.10,3260] successful.
helper# ls -l /dev/disk/by-path
...
lrwxrwxrwx 1 root root  9 Nov  2 15:03 ip-10.10.10.10:3260-iscsi-iqn.2007-08.com.example.san:rootp-lun-0 -> ../../sda
...
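
Instead of reading the device name off the listing by eye, a script can build the by-path name and resolve it. This is only a sketch: the helper name iscsi_by_path is made up for illustration, and the naming pattern is simply the one visible in the listing above.

```shell
#!/bin/sh
# Build the udev by-path name for an iSCSI LUN, following the pattern
# seen in the listing above: ip-<portal>:<port>-iscsi-<iqn>-lun-<n>.
# iscsi_by_path is a hypothetical helper, not part of open-iscsi.
iscsi_by_path() {
    portal=$1 port=$2 iqn=$3 lun=$4
    printf '/dev/disk/by-path/ip-%s:%s-iscsi-%s-lun-%s\n' \
        "$portal" "$port" "$iqn" "$lun"
}

link=$(iscsi_by_path 10.10.10.10 3260 iqn.2007-08.com.example.san:rootp 0)
echo "$link"
# On the helper, after logging in, resolve it to the real device node:
#   readlink -f "$link"
```

Using the by-path link avoids guessing whether the LUN showed up as sda, sdb or something else on a helper that already has local disks.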

Partitioning

To make things more interesting (not much), we're going to use the newer GPT partitioning. For simplicity, here we'll create a 512MB swap partition and a 9.5G root partition. On BIOS systems, which are still the majority, GPT also needs a small partition at the beginning of the disk, the so-called "BIOS boot partition" (type EF02). See here, here and here for more info (all three documents are very interesting reads). So here's the disk layout:

helper# gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.5

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 20971520 sectors, 10.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 67D92849-CD16-4CB1-8B3B-0758E62227CA
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 20971486
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            8191   3.0 MiB     EF02  BIOS boot partition
   2            8192         1056767   512.0 MiB   8200  Linux swap
   3         1056768        20971486   9.5 GiB     8300  Linux filesystem
helper# mkfs.ext4 /dev/sda3
mke2fs 1.42.5 (29-Jul-2012)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
622592 inodes, 2489339 blocks
124466 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2550136832
76 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done 

helper# mkswap /dev/sda2
Setting up swapspace version 1, size = 524284 KiB
no label, UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6
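
The gdisk listing shows the resulting layout but not how it was created. One non-interactive way to get there is sgdisk, which normally ships in the same package as gdisk. The sketch below only prints the commands instead of running them, and checks the sector arithmetic on the way; the helper name end_sector is hypothetical, and 512-byte sectors are assumed, as reported by gdisk above.

```shell
#!/bin/sh
# end_sector START SIZE_MIB: last sector of a partition that starts at
# START and is SIZE_MIB MiB long, assuming 512-byte logical sectors
# (2048 sectors per MiB). Hypothetical helper for illustration.
end_sector() {
    echo $(( $1 + $2 * 2048 - 1 ))
}

disk=/dev/sda   # assumption: the LUN appeared as /dev/sda, as above

# Print (do not run) sgdisk commands that recreate the layout shown by
# "gdisk -l". An end value of 0 means "end of the largest free block".
echo "sgdisk -n 1:2048:$(end_sector 2048 3) -t 1:EF02 $disk"
echo "sgdisk -n 2:8192:$(end_sector 8192 512) -t 2:8200 $disk"
echo "sgdisk -n 3:1056768:0 -t 3:8300 $disk"
```

The computed end sectors (8191 and 1056767) match the listing, so the 3 MiB BIOS boot partition and the 512 MiB swap partition line up exactly on the 2048-sector boundaries.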

System installation

Let's mount the partition and install a minimal system with debootstrap:

helper# mkdir /mnt/chroot
helper# mount /dev/sda3 /mnt/chroot
helper# debootstrap wheezy /mnt/chroot
I: Retrieving Release
I: Retrieving Release.gpg
I: Checking Release signature
...
I: Configuring tasksel...
I: Configuring tasksel-data...
I: Base system installed successfully.

Now let's chroot into the system to finish the install:

helper# mount -t proc none /mnt/chroot/proc
helper# mount -t sysfs none /mnt/chroot/sys
helper# mount --bind /dev /mnt/chroot/dev
helper# chroot /mnt/chroot /bin/bash
client#

Let's create /etc/mtab which is needed by many programs:

client# cp /proc/mounts /etc/mtab
client# sed -i '\|^/dev/sda3|,$!d' /etc/mtab

The sed command removes the first lines from the file, which are not relevant for the chrooted system, and keeps only lines from the one starting with /dev/sda3 to the end (replace sda3 if your partition name is different, of course).
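
To see what that sed expression does without touching any real file, it can be tried on sample input. This is a small demo, not part of the procedure; keep_from_device is a hypothetical name and the sample lines only imitate what /proc/mounts might contain.

```shell
#!/bin/sh
# keep_from_device DEV: read a mounts-style listing on stdin and keep
# only the lines from the first one starting with DEV through the end.
# Same sed expression as above, without -i so nothing is modified.
# keep_from_device is a hypothetical name used for this demo.
keep_from_device() {
    sed -e '\|^'"$1"'|,$!d'
}

# Sample input imitating /proc/mounts as seen from inside the chroot.
printf '%s\n' \
    'rootfs / rootfs rw 0 0' \
    '/dev/sdb1 / ext4 rw 0 0' \
    '/dev/sda3 / ext4 rw,relatime 0 0' \
    'none /proc proc rw 0 0' |
    keep_from_device /dev/sda3
```

The address range "from the first match to the last line" is negated with !, so d deletes everything outside it, which is why only the helper's own mounts at the top disappear.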

Now let's create /etc/fstab. In this case, the best option is working with UUIDs, so let's find them:

client# blkid /dev/sda2 /dev/sda3
/dev/sda2: UUID="e4f25981-3886-4939-a5cf-b05a0c7058a6" TYPE="swap" 
/dev/sda3: UUID="6c816f51-0613-45e7-a15b-bc2d5cd00f88" TYPE="ext4"
client# echo 'UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 / ext4 errors=remount-ro 0 1' >> /etc/fstab
client# echo 'UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6 none swap sw 0 0' >> /etc/fstab
client# cat /etc/fstab
# UNCONFIGURED FSTAB FOR BASE SYSTEM
UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 / ext4 errors=remount-ro 0 1
UUID=e4f25981-3886-4939-a5cf-b05a0c7058a6 none swap sw 0 0
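
Copying UUIDs by hand is error-prone, so the fstab lines can also be generated. A minimal sketch follows; both helper names are hypothetical, and the parsing assumes blkid output in the format shown above.

```shell
#!/bin/sh
# uuid_of_line: extract the UUID value from one line of blkid output
# given on stdin. Hypothetical helper; on a live system
# "blkid -s UUID -o value /dev/sda3" gives the same value directly.
uuid_of_line() {
    sed -n 's/.* UUID="\([^"]*\)".*/\1/p'
}

# fstab_entry UUID MOUNTPOINT FSTYPE OPTIONS DUMP PASS
fstab_entry() {
    printf 'UUID=%s %s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

root_uuid=$(echo '/dev/sda3: UUID="6c816f51-0613-45e7-a15b-bc2d5cd00f88" TYPE="ext4"' | uuid_of_line)
fstab_entry "$root_uuid" / ext4 errors=remount-ro 0 1
```

In the chroot, the output of fstab_entry would be appended to /etc/fstab just like the echo commands above.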

Here we can install any extra package that we want:

client# apt-get install vim less openssh-server locales

This is also the time to do any other needed customization (eg localization, setting hostname, repositories, etc.).

Finally, we need to install a kernel, a bootloader and the initramfs utilities that we'll use later:

client# apt-get install linux-image-amd64 grub2 initramfs-tools

When prompted, we choose to install grub to /dev/sda, just as we'd do with a local hard disk.

iSCSI boot configuration

Now it's time to finally do what it takes for the actual boot process to work. Basically, we need a special initrd that configures the network, logs into the iSCSI target LUN, mounts it as / and calls pivot_root() on it. We will provide the needed information in the form of kernel command line arguments.

The open-iscsi package includes the necessary initrd hooks to do the above, so let's install it:

client# apt-get install open-iscsi

The relevant bits are in /usr/share/initramfs-tools/scripts/local-top/iscsi, where we learn that we can pass information by setting various ISCSI_* variables. We also want early (ie, kernel-level) IP configuration, which again can be done with special arguments to the kernel. We pass all this information by modifying the grub kernel command line, so we need the following lines in the client's /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="ISCSI_INITIATOR=iqn.2007-08.com.example.client:client ISCSI_TARGET_NAME=iqn.2007-08.com.example.san:rootp ISCSI_TARGET_IP=10.10.10.10 ISCSI_TARGET_PORT=3260 root=UUID=6c816f51-0613-45e7-a15b-bc2d5cd00f88 ip=10.10.10.50::10.10.10.1:255.255.255.0:client:eth0:off"

Here we're using static IP configuration; use "ip=dhcp" for DHCP (here is the full story). Also, the GRUB_CMDLINE_LINUX_DEFAULT variable is normally set to "quiet", but it's probably better to remove that to be able to see what happens at boot. It can be re-added later if wanted.

Also note that if the SAN needs authentication more variables are needed, most likely ISCSI_USERNAME and ISCSI_PASSWORD.
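
The long command line is easier to get right when it is assembled from its parts. The sketch below uses two hypothetical helpers; the field order of ip= is client IP, server IP, gateway, netmask, hostname, device, autoconf, with the unused server IP field left empty as in the line above.

```shell
#!/bin/sh
# kernel_ip_arg CLIENT_IP SERVER_IP GATEWAY NETMASK HOSTNAME DEVICE AUTOCONF
# Assemble the kernel's ip= argument; fields are colon-separated and an
# empty field (here the server IP) is simply left blank.
kernel_ip_arg() {
    printf 'ip=%s:%s:%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# iscsi_args INITIATOR TARGET_NAME TARGET_IP TARGET_PORT
# Assemble the ISCSI_* arguments read by the open-iscsi initrd script.
iscsi_args() {
    printf 'ISCSI_INITIATOR=%s ISCSI_TARGET_NAME=%s ISCSI_TARGET_IP=%s ISCSI_TARGET_PORT=%s\n' \
        "$1" "$2" "$3" "$4"
}

kernel_ip_arg 10.10.10.50 '' 10.10.10.1 255.255.255.0 client eth0 off
iscsi_args iqn.2007-08.com.example.client:client iqn.2007-08.com.example.san:rootp 10.10.10.10 3260
```

The two printed strings, plus root=UUID=..., are what goes inside GRUB_CMDLINE_LINUX.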

Looking into /usr/share/initramfs-tools/hooks/iscsi, we learn that for the initrd update process to know that we want the iSCSI stuff included, we need to create the file /etc/iscsi/iscsi.initramfs:

client# touch /etc/iscsi/iscsi.initramfs

We also see that the file /etc/iscsi/initiatorname.iscsi gets copied into the initrd and sourced to learn the initiator name, so let's write the name into it in the expected format:

client# echo "InitiatorName=iqn.2007-08.com.example.client:client" > /etc/iscsi/initiatorname.iscsi
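
Before regenerating anything, a quick sanity check of the three pieces set up so far can save a failed boot. The function below is a hypothetical helper, not something shipped with open-iscsi. This HOWTO uses the same initiator name on the kernel command line and in initiatorname.iscsi; treating a mismatch as an error is an assumption of this sketch.

```shell
#!/bin/sh
# check_iscsi_boot_config ROOT: verify, under the tree rooted at ROOT
# (use / inside the chroot), that
#   1. etc/iscsi/iscsi.initramfs exists,
#   2. etc/iscsi/initiatorname.iscsi has an InitiatorName= line,
#   3. the ISCSI_INITIATOR in etc/default/grub names the same initiator.
# Hypothetical helper, not part of open-iscsi.
check_iscsi_boot_config() {
    root=${1%/}
    [ -e "$root/etc/iscsi/iscsi.initramfs" ] ||
        { echo "missing iscsi.initramfs"; return 1; }
    file_iqn=$(sed -n 's/^InitiatorName=//p' \
        "$root/etc/iscsi/initiatorname.iscsi" 2>/dev/null)
    [ -n "$file_iqn" ] ||
        { echo "no InitiatorName set"; return 1; }
    grub_iqn=$(sed -n 's/^GRUB_CMDLINE_LINUX=.*ISCSI_INITIATOR=\([^ "]*\).*/\1/p' \
        "$root/etc/default/grub" 2>/dev/null)
    [ "$file_iqn" = "$grub_iqn" ] ||
        { echo "initiator mismatch: '$file_iqn' vs '$grub_iqn'"; return 1; }
    echo "ok: $file_iqn"
}

# Inside the chroot:   check_iscsi_boot_config /
# From the helper:     check_iscsi_boot_config /mnt/chroot
```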

Now to apply all our changes, we regenerate grub config and the initrd:

client# update-grub
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-4-amd64
Found initrd image: /boot/initrd.img-3.2.0-4-amd64
done
client# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-3.2.0-4-amd64

We also need to set a root password, otherwise we won't be able to login:

client# passwd
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

Lastly, as we said, we don't want the Debian initscripts to try to configure eth0 at boot. The simple way to achieve this is either to remove any reference to eth0 from /etc/network/interfaces, or just to tell Debian that the configuration is "manual":

#/etc/network/interfaces
auto eth0
    iface eth0 inet manual
# other interfaces here ...
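
Since a leftover "dhcp" or "static" stanza for the boot interface is easy to miss, the method can be checked mechanically. A small sketch, with iface_method being a hypothetical helper name:

```shell
#!/bin/sh
# iface_method FILE IFACE: print the method (manual, dhcp, static, ...)
# of the "iface IFACE inet METHOD" stanza in an interfaces(5) file.
# Prints nothing if the interface is not mentioned. Hypothetical helper.
iface_method() {
    awk -v ifc="$2" \
        '$1 == "iface" && $2 == ifc && $3 == "inet" { print $4 }' "$1"
}

# Inside the chroot, anything other than "manual" (or no output at all)
# means the initscripts may touch the iSCSI interface:
#   iface_method /etc/network/interfaces eth0
```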

We can finally exit the chroot environment and log out of the iSCSI LUN in the helper machine:

client# exit
helper# umount /mnt/chroot/{dev,proc,sys,}
helper# iscsiadm -m node -T 'iqn.2007-08.com.example.san:rootp' -p 10.10.10.10 -u

PXE

Let's summarize what happens when our client is booted:

  • iPXE configures the network (either via DHCP or statically)
  • iPXE logs into the iSCSI LUN, mapping it as a local disk.
  • The MBR is read, and the boot process is kickstarted, which loads the kernel and the initrd.
  • Early IP configuration is performed during the boot, and an initrd script logs into the iSCSI LUN as specified on the kernel command line (the kernel is unaware of the PXE login)
  • pivot_root() is called on the iSCSI partition specified on the command line with root=, and from there the boot process proceeds normally

So we need to configure the first three steps. Using iPXE, all that we have to do is send this iPXE script to the client:

#!ipxe
set initiator-iqn iqn.2007-08.com.example.client:client
sanboot iscsi:san.example.com:6:3260:0:iqn.2007-08.com.example.san:rootp

This is the bare minimum; if your SAN needs authentication, then username and password should also be set before attempting to boot (see the iPXE docs, and SAN URIs explained).
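
With more than one client, the script is usually generated rather than written by hand. The sketch below builds the SAN URI from its fields (server, protocol, port, LUN, target name; protocol 6 is TCP) and prints the same script as above. Both function names are hypothetical helpers for illustration.

```shell
#!/bin/sh
# ipxe_iscsi_uri SERVER PROTOCOL PORT LUN TARGET_IQN
# Assemble an iPXE SAN URI: iscsi:<server>:<protocol>:<port>:<lun>:<iqn>.
# Protocol 6 is TCP. Hypothetical helper for illustration.
ipxe_iscsi_uri() {
    printf 'iscsi:%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# ipxe_script INITIATOR_IQN SAN_URI: print the two-command boot script.
ipxe_script() {
    printf '#!ipxe\nset initiator-iqn %s\nsanboot %s\n' "$1" "$2"
}

uri=$(ipxe_iscsi_uri san.example.com 6 3260 0 iqn.2007-08.com.example.san:rootp)
ipxe_script iqn.2007-08.com.example.client:client "$uri"
```

How the generated script reaches the client (a file served over TFTP or HTTP, a CGI keyed on the MAC address, and so on) depends on the PXE infrastructure and is outside the scope of this sketch.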

Test it!

So if we boot our client, we should see that iPXE logs into the LUN and loads GRUB:

...
Registered as SAN device 0x80
Booting from SAN device 0x80
GRUB loading.
Welcome to GRUB!

and after GRUB has booted the kernel, something like this in the kernel messages:

...
[    2.073406] scsi2 : iSCSI Initiator over TCP/IP
[    2.335112] scsi 2:0:0:0: Direct-Access     EQLOGIC  100E-00          4.3  PQ: 0 ANSI: 5
[    2.337709] scsi 2:0:0:0: Attached scsi generic sg1 type 0
[    2.349859] sd 2:0:0:0: [sda] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
[    2.351322] sd 2:0:0:0: [sda] Write Protect is off
[    2.352271] sd 2:0:0:0: [sda] Mode Sense: 77 00 00 08
[    2.353451] sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[    2.368450]  sda: sda1 sda2 sda3
[    2.370812] sd 2:0:0:0: [sda] Attached SCSI disk
[    3.396538] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
...
[    4.810052] Adding 524284k swap on /dev/sda2.  Priority:-1 extents:1 across:524284k 
[    4.824409] EXT4-fs (sda3): re-mounted. Opts: (null)
[    4.959888] EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro
...

At this point, we can use this machine and do all the normal administrative operations (add/remove packages, upgrades, kernel configuration, etc.) in the usual way, as if it had a local hard disk.

Update 07/11/2014: According to the README.Debian file that comes with open-iscsi, when passing iSCSI variables from grub their names should be lowercase, though uppercase seems to work just as well. It is also possible to set the various variables directly in the /etc/iscsi/iscsi.initramfs file (this time in uppercase). The relevant part of the README reads:

If manual configuration is necessary, there are two ways to include iSCSI boot
options in your initramfs:

1) Touch /etc/iscsi/iscsi.initramfs and provide options on the command line.
This provides flexibility, but if passwords are used, is not very secure.
Available boot line options:
iscsi_initiator, iscsi_target_name, iscsi_target_ip,
iscsi_target_port, iscsi_target_group, iscsi_username,
iscsi_password, iscsi_in_username, iscsi_in_password
See iscsistart --help for a description of each option

2) Provide iSCSI option in /etc/iscsi/iscsi.initramfs.
Available options:
ISCSI_INITIATOR, ISCSI_TARGET_NAME, ISCSI_TARGET_IP,
ISCSI_TARGET_PORT, ISCSI_TARGET_GROUP, ISCSI_USERNAME
ISCSI_PASSWORD, ISCSI_IN_USERNAME, ISCSI_IN_PASSWORD

Example Syntax:

ISCSI_INITIATOR="iqn.1993-08.org.debian:01:9b3e5634fdb9"
ISCSI_TARGET_NAME=iqn.2008-01.com.example:storage.foo
ISCSI_TARGET_IP=192.168.1.1
ISCSI_TARGET_PORT=3160
ISCSI_USERNAME="username"
ISCSI_PASSWORD="password"
ISCSI_IN_USERNAME="in_username"
ISCSI_IN_PASSWORD="in_password"
ISCSI_TARGET_GROUP=1

Remember to set proper permissions if username/passwords are used.
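
As for the "proper permissions" remark, one possible reading is to make the file readable by root only. A minimal sketch under that assumption; write_iscsi_initramfs is a hypothetical helper and mode 600 is our choice, not something the README prescribes.

```shell
#!/bin/sh
# write_iscsi_initramfs FILE: write iSCSI options from stdin to FILE and
# make sure it is readable and writable by its owner only (mode 600).
# Hypothetical helper; FILE would be /etc/iscsi/iscsi.initramfs.
write_iscsi_initramfs() (
    umask 077              # subshell body: the umask change stays local
    cat > "$1" &&
        chmod 600 "$1"     # also covers a file that already existed
)

# Example (inside the chroot, as root):
#   write_iscsi_initramfs /etc/iscsi/iscsi.initramfs <<'EOF'
#   ISCSI_USERNAME="username"
#   ISCSI_PASSWORD="password"
#   EOF
```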

Update2 29/01/2015: It seems iPXE has trouble booting from those iSCSI targets that use multiple IP addresses (eg most Dell EqualLogic), where there is a "main" or "group" IP to which initiators connect, but the actual session is established against another IP to which the initiator is directed after the first contact with the main IP (typically, the IP of a specific iSCSI interface on the storage). iPXE seems to have trouble understanding this form of "redirection", and just fails the iSCSI login. On the other hand, connecting to single-IP targets like iSCSI Enterprise Target works fine.

7 Comments

  1. Berrie says:

    This page has been very helpful in gaining an understanding of iPXE with iSCSI.

    My world runs on a home grown LFS 8.1 Linux version that boots through syslinux 6.03 on regular disks

    This config has been moved to a Synology iSCSI Lun/Target.
    The Lun is partitioned using gpt with only 2 partitions for / (ext4) and /boot (fat32)
    The gptmbr is put in place
    The syslinux is put in place on /boot and extlinux -i . is run

    When powering up the machine, I get to the iPXE prompt
    = run dhcp -> route shows that I got IP/mask and gateway addresses
    = set the iSCSI initiator identifier
    = sanboot the iSCSI Lun on the Synology that was prepared above

    At that point I see:
    . the SAN device 0x80 gets registered
    . it tells me 'Booting from SAN device 0x80'
    . A "SYSLINUX 6.03...." banner is shown
    .... immediately followed by the machine resetting

    It looks as if syslinux/extlinux is barely starting and crashing for reasons it cannot tell on the console...

    Any pointers as to why the recipe on this page should not be used for my setup, or pointers how I could address this are appreciated :-)

  2. gas says:

    Thanks for this.
    I confirm this works also for ubuntu 12.04.
    What you need to do is to install ubuntu 12.04 to iscsi first. (login to iscsi target, use entire disk and lvm)
    When you boot up you will have a lot of ip-config problems.

    To solve that just drop into recovery mode, make file system r/w, and then perform the above modifications to grub:

    i.e. modify /etc/default/grub and add those two lines
    then update-grub.

  3. Franz says:

    Do you have problems with reboots with this configuration?
    I use Debian 7, and when I try to reboot the client I get "sync SCSI cache" and the system halts.

    • waldner says:

      I'm not seeing any problems. When I reboot I see that Debian correctly detects that / is on iSCSI and delays/avoids any umount or deactivation attempt. Other than that, which is expected and correct, the system reboots just fine.

      Sample output:

      [....] Unmounting iscsi-backed filesystems: Unmounting all devices marked _netdev
      [.ok
      [warn] /etc/iscsi/iscsi.initramfs present, not stopping iscsid yet ... (warning).
      [ ok ] Shutting down ALSA...done.
      [ ok ] Asking all remaining processes to terminate...done.
      [ ok ] All processes ended within 2 seconds...done.
      [ ok ] Stoppping enhanced syslogd: rsyslogd.
      [info] Saving the system clock.
      [warn] not deconfiguring network interfaces: ISCSI root is mounted. ... (warning).
      [info] Hardware Clock updated to Sun Dec  7 16:24:01 UTC 2014.
      [ ok ] Deactivating swap...done.
      [  971.426497] EXT4-fs (sda2): re-mounted. Opts: (null)
      [info] Will now restart.
      

      /etc/fstab on that machine:

      UUID=ff5ce2be-2f95-4ef0-a866-03de10290dbf / ext4 errors=remount-ro 0 1
      UUID=59df9489-2544-4631-abe4-e6e75590182c none swap sw 0 0
      
  4. John Meichle says:

    Great article. Worked more or less flawlessly with Debian Wheezy.

    When trying with Ubuntu Lucid I ran into the issue that installing open-iscsi in the chroot environment led to apt attempting to start the service (which fails in a chroot). This left dpkg in a bad state, despite being able to proceed with the rest of this doc. Upon trying to install any software, apt would attempt to finish the install, restarting iscsid and dropping your rootfs when booting this live.

    The solution was to temporarily modify /etc/init.d/open-iscsi, blank out the "start_daemon" function, finish the apt install, then restore the initscript (and disable it on boot using 'update-rc.d -f open-iscsi remove').

    Great article here, most of this stuff appears either undocumented or very sparsely documented, and this article pieced together information I found across the net into a clear explanation.

    A great follow-up article would be avoiding GRUB entirely, as without it the entire boot process could be dynamic based on kernel arguments (as the kernel and initramfs would not be loaded via iSCSI, booting could be done via ipxe+http and no "sanboot" is needed).

    • waldner says:

      Thanks for sharing your findings! In principle, booting without grub (ie getting the kernel + initramfs directly from PXE) should be easier. I might write a short howto for that in the future.