Access partitions in non-disk block devices with kpartx

Posted by waldner on 25 September 2010, 12:15 am

Ever wondered why for normal disk devices (eg /dev/sda), device files for the contained partitions are usually available (eg /dev/sad1 etc.), while for other non-disk devices (eg, disk images, LVM or software RAID volumes) there are no such device files? How to access such partitions?

A typical scenario is an LVM logical volume that is used as virtual disk by a guest VM, and the guest OS creates partitions on it. On the host, you just see, say, /dev/vg0/guestdisk, yet it does contain partitions:

# sfdisk -l /dev/mapper/vg0-guestdisk 

Disk /dev/mapper/vg0-guestdisk: 4568 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/mapper/vg0-guestdisk1   *      0+   4376    4377-  35158221   83  Linux
/dev/mapper/vg0-guestdisk2       4377    4567     191    1534207+  82  Linux swap / Solaris
/dev/mapper/vg0-guestdisk3          0       -       0          0    0  Empty
/dev/mapper/vg0-guestdisk4          0       -       0          0    0  Empty

But those mysterious devices /dev/mapper/vg0-guestdisk1 etc. are nowhere.

The same can happen for plain disk images:

# sfdisk -l guest.img
Disk guest.img: cannot get geometry

Disk guest.img: 1305 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
guest.img1   *      0+    497     498-   4000153+  83  Linux
guest.img2        498    1119     622    4996215   83  Linux
guest.img3       1120    1304     185    1486012+  82  Linux swap / Solaris
guest.img4          0       -       0          0    0  Empty

and also for some software (md) RAID devices.

Anyway, in all these cases, it sometimes happens that one needs to do "something" with the inner partitions (eg, mount them, or recreating or resizing a file system, etc.). That obviously needs a device node to use, to avoid losing sanity. Here's where the neat utility kpartx saves the day.

Basically, what kpartx does is to scan a device or file and apply some magic to detect the partition table in it, and create devices corresponding to those partitions. Since it uses the device mapper, the devices it creates go under /dev/mapper, which may be somewhat confusing because that's also where other devices created using the device mapper (LVM volumes, SAN multipath devices), and against which kpartx may be run, live.

Depending on the distribution, kpartx comes either as part of multipath-tools, or packaged separately.

Some examples

So let's take a partitioned LVM logical volume, the one shown in the previous example:

# kpartx -l /dev/mapper/vg0-guestdisk
vg0-guestdisk1 : 0 70316442 /dev/mapper/vg0-guestdisk 63
vg0-guestdisk2 : 0 3068415 /dev/mapper/vg0-guestdisk 70316505

With -l, kpartx only displays what it found and the devices it would create, but doesn't actually create them. To create them, use -a:

# kpartx -a /dev/mapper/vg0-guestdisk

Nothing seems to happen, but let's have a look under /dev/mapper:

# ls -l /dev/mapper/vg0-guestdisk*
brw-rw---- 1 root disk 251, 0 2010-09-24 18:57 /dev/mapper/vg0-guestdisk
brw-rw---- 1 root disk 251, 3 2010-09-24 18:54 /dev/mapper/vg0-guestdisk1
brw-rw---- 1 root disk 251, 4 2010-09-24 18:54 /dev/mapper/vg0-guestdisk2

And now we can access them just fine:

# mount /dev/mapper/vg0-guestdisk1 /mnt
# ls /mnt
bin  boot  cdrom  dev  etc  home  initrd  initrd.img  initrd.img.old  lib  lost+found  media  mnt  opt  proc  root  sbin  srv  sys  tmp  usr  var  vmlinuz  vmlinuz.old

But what happened? Let's have a look. After all, the new devices are just device maps (yes, on top of the main logical volume, which is itself a device map):

# dmsetup table /dev/mapper/vg0-guestdisk1
0 70316442 linear 251:0 63
# dmsetup table /dev/mapper/vg0-guestdisk2
0 3068415 linear 251:0 70316505

What the above fields mean is as follows (values for the first of the two maps):

0: starting block of the map
70316442: number of blocks in the map (in this case this is the total number of blocks in the "device")
linear: mapping mode. Linear just means that: blocks are mapped sequentially from the source to this map
251:0: mapped device; here it's the man logical volume vg0-guestdisk, as could be seen from the previous ls output
63: starting block on the mapped device; this means that block 0 in the vg0-guestdisk1 map corresponds to block 63 in the vg0-guestdisk logical volume, block 1 here corresponds to block 64 there, etc.

A block here is 512 bytes, which means that 70316442 blocks are 36002018304 bytes, or about 33GiB or 36GB, depending on whether you like binary or decimal units (in case anybody cares at all, that is).

As a small aside just for completeness, I said that the "partitioned" device (/dev/mapper/vg0-guestdisk) is itself a device map, so here it is:

# dmsetup table /dev/mapper/vg0-guestdisk
0 73400320 linear 104:3 384

Which shows that this logical volume is a linear a map (LVM also allows for striped maps) built on top of the device with major 104 and minor 3, which on this system is nothing else than /dev/cciss/c0d0p3, a partition in an HP hardware RAID volume, which was previously turned into an LVM physical volume and added to the volume group vg0.
For an excellent introduction to the device mapper, which is what LVM, multipath devices and some disk encryption technologies are built upon, I suggest this Linux Gazette article which is quite enlightening.

Disk images

For disk images, kpartx can still be used, but since they are not real block devices, a block device needs to be associated to the file first. This sounds like a job for loopback devices, and indeed kpartx is smart enough to associate a loopback device automatically if it sees that what it's being asked to use is not a real block device:

# losetup -a    # no loop devices in use now
# kpartx -a guest.img
# losetup -a
/dev/loop0: [6801]:131312 (guest.img)
# ls -l /dev/mapper/loop0*
brw-rw---- 1 root disk 251, 5 2010-09-24 23:22 /dev/mapper/loop0p1
brw-rw---- 1 root disk 251, 7 2010-09-24 23:22 /dev/mapper/loop0p2

No need to add that /dev/mapper/loop0p1 and /dev/mapper/loop0p2 are maps that reference /dev/loop0 (which in turn is associated to our image file).

Conclusion

When the devices created by kpartx are no longer needed, the maps can be removed (either manually using dmsetup remove, or with kpartx -d). The devices should also be removed before the partitioning is changed (with fdisk, etc.) because it seems that otherwise kpartx sometimes is not able to delete the old maps, giving errors like "ioctl: LOOP_CLR_FD: Device or resource busy" when trying to delete or update the old maps. So to be safe, it's better to run kpartx -d, change the partitions, then again kpartx -a. If old maps lie around and are accidentally used, disaster is likely as they will be referencing the start and end of partitions that no longer exist, resulting in mapping now-unrelated parts of the device.

kpartx makes working with embedded partitions much easier, a scenario especially common in virtualization.

kpartx can handle different types of partition tables besides the classical DOS format, including BSD, Solaris, Sun and GPT (not tried, but it would seem so by looking at the source).

Finally, kpartx can be used manually on the command line, but it can also be integrated in udev rules to run automatically when the main device is created, so the corresponding devices for the partitions are created too. For example, many distributions run kpartx in a udev rule when a multipath device (eg /dev/mapper/mpath1 etc.) is created, so its partitions will show up as well as the main device.

Filed under linux, tips, worksforme Tagged device mapper, kpartx, linux, LVM, partitions

Comments are closed | Permalink

6 Comments

thj2 says:

October 20, 2011 at 03:27

Ok, but do you know of any way to map the blocks of an LVM LV-hosted filesystem back to the underlying physical disk (presuming a sequential, contiguous LV hosting the filesystem). Basically, I'm trying to find a Linux LVM2 equivalent to doing a vxunroot.
- waldner says:
  
  October 22, 2011 at 17:20
  
  Sorry I'm not aware of such a tool. IIUC what it does, assuming a simple contiguous LV entirely contained in a single PV, it might be possible to do something by moving each LV block back by N blocks (where N is the size of the LVM metadata in the PV), but this is just wild speculation, I may be misunderstanding (and even if not, it may not be possible to do what I suggest or it may be wrong).
Rich says:

December 8, 2010 at 11:41

You might want to look into libguestfs. It doesn't need root, and doesn't open up the security holes of using mount on untrusted guest images.
- waldner says:
  
  December 8, 2010 at 13:23
  
  Thanks. Libguestfs is indeed one of those things that have been in my TODO list for quite a while, I hope I'll be able to give it a go soon.
  - Rich says:
    
    December 8, 2010 at 15:31
    
    No problem. We're keen to get people to try out the Debian and Ubuntu version too, although I guess from reading your blog you are using CentOS (which will ship with it in CentOS 6).
    - waldner says:
      
      December 8, 2010 at 18:09
      
      In fact I use many different distros, the article about CentOS was only because that happened on a time I had to install CentOS (and actually, CentOS isn't exactly my favorite, but I digress). It would be nice to see a Gentoo ebuild of libguestfs, but I see from the related bug that the road may be a bit longer (unless one is willing to experiment a bit).
      But I won't mind trying it out under Debian, Ubuntu or Fedora/RHEL (which I suppose have the best support, seeing where you work) when I have time.

\1