Backstory
My apologies in advance: I will likely use the names SoftLayer and IBM Bluemix interchangeably…
For years I have told people, and been told, that SoftLayer virtual servers do not support LVM. No matter who I asked, no one could give me a solid technical reason why. I decided I would just go ahead and try it myself. Either it won't boot, or it will…
So I built a custom CentOS 6 VHD. The partition scheme is:
/dev/xvda1                    /boot   ext4
/dev/xvda2                    LVM partition
/dev/mapper/vg_test-lv_root   /       ext4
/dev/mapper/vg_test-lv_home   /home   ext4
I drop the VHD in cloud storage using swift.
swift upload <container> </path/to/file>
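For example (hypothetical container and file names; this assumes your object storage credentials are already exported in the ST_AUTH/ST_USER/ST_KEY variables the swift client understands):
export ST_AUTH='https://<datacenter>.objectstorage.softlayer.net/auth/v1.0'
export ST_USER='<account>:<username>'
export ST_KEY='<api key>'
swift upload vhd-images centos6-lvm.vhd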
Attempt 1
I create an instance using my image. Unfortunately, it never makes it out of provisioning, and I get a ticket from the DC telling me they don't support LVM. Well, we kind of expected this. Luckily, a very nice developer was able to tell me the console said it could not find “/etc”. The silver lining here is that if it was looking for /etc, it must have at least booted. She was also nice enough to figure out how to get logged in, and she noticed the box did not have IPs. At this point it is pretty clear: the way the virtual servers get IP'd must be by mounting the partition and overwriting the network config. Not quite how I would have done it, but to each his own.
Well, this kind of sucks. I don't want my root file system on a normal partition; I want it in LVM. Otherwise it defeats the entire purpose of this exercise.
Attempt 2
Something IBM recently introduced was cloud-init. Someone mentioned that cloud-init provisioning might get me a bit further, so I added the version of cloud-init packaged by RightScale. Other than that, it was the exact same image referenced above.
Oddly enough, it blew through provisioning this time. Unfortunately, right at what I believe to be the end of the provisioning process, it just never came up. The usual ticket opens, and I inquire as to what happened. The same developer then tells me that the box is completely up, but it doesn't have any IPs. So she went ahead and set that up for me manually.
What do we know
- We can't have LVM on / or /etc, or the box won't get IPs
- We can get past the /etc error, but still not get IPs, if we select cloud-init in the portal (we don't necessarily need it installed)
This doesn't entirely add up. They wouldn't offer cloud-init provisioning if it just didn't work, so it must be because I am using LVM. My experience with cloud-init thus far has been passing in metadata, but in this case I don't know the IP early enough to pass it in. Not having IPs would also prevent the box from querying an API for them, the way it would on AWS. So I must be missing something. Let's dig through the logs!
The nice developer had also mentioned a MetaDisk in passing. While looking at the logs, holy shit, they referenced a MetaDisk!
Let's mount it and see what goodies we have:
[root@test ~]# cd /mnt
[root@test mnt]# find ./
./
./openstack
./openstack/latest
./openstack/latest/meta_data.json
./openstack/latest/network_data.json
./openstack/latest/vendor_data.json
[root@test mnt]#
Well, if it is right there, why didn't cloud-init work? It seems the RightScale build happened before cloud-init was supposed to consume network_data.json, so it does everything except what I needed it for. What about the latest build? TL;DR: I built it, and the same problem happened. There is (or was) a bug around getting network_data.json.
Attempt 3 (Solution)
At this point I have ditched cloud-init (even though I am still checking the cloud-init checkbox so I get a MetaDisk), and decided to do it myself.
We will need to make an init script that runs before networking. This init script will call softlayer.sh to get our network configuration in place from the MetaDisk.
The script (softlayer.sh) needs to do the following; a full sketch of the script follows this list:
- find the device and mount it.
DISK=`blkid -t TYPE="vfat" | cut -f1 -d':'`
MOUNT='/mnt'
mount $DISK $MOUNT
- parse the $MOUNT/openstack/latest/network_data.json file. (This is no easy task. Look at this file.)
{
  "links": [
    {"id": "interface_22475947", "name": "eth0", "mtu": null, "type": "phy", "ethernet_mac_address": "06:5b:90:a7:1f:8a"},
    {"id": "interface_22475945", "name": "eth1", "mtu": null, "type": "phy", "ethernet_mac_address": "06:88:85:0f:56:3a"}
  ],
  "networks": [
    {
      "id": "network_101290337",
      "link": "interface_22475947",
      "type": "ipv4",
      "ip_address": "10.28.5.103",
      "netmask": "255.255.255.192",
      "routes": [
        {"network": "0.0.0.0", "netmask": "0.0.0.0", "gateway": "50.23.183.57"},
        {"network": "10.0.0.0", "netmask": "255.0.0.0", "gateway": "10.28.5.65"},
        {"network": "161.26.0.0", "netmask": "255.255.0.0", "gateway": "10.28.5.65"}
      ]
    },
    {
      "id": "network_101290055",
      "link": "interface_22475945",
      "type": "ipv4",
      "ip_address": "50.23.183.62",
      "netmask": "255.255.255.248",
      "routes": [
        {"network": "0.0.0.0", "netmask": "0.0.0.0", "gateway": "50.23.183.57"}
      ]
    }
  ],
  "services": [
    {"type": "dns", "address": "10.0.80.11"},
    {"type": "dns", "address": "10.0.80.12"}
  ]
}
- use jq to parse it properly.
hint (a link to my code is at the end):
JQBIN='/bin/jq'
IP_ADDR=`$JQBIN -r ".networks | .[] | select(.link==\"$i\") | .ip_address" $MOUNT/openstack/latest/network_data.json`
- write your network configuration files. Keep in mind there are multiple routes you need to add for the internal interface; if you neglect to do this, provisioning will fail.
- don't forget to unmount at the end.
umount $MOUNT
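Putting those steps together, a minimal sketch of softlayer.sh could look like the following. This is not the exact script (that lives in the repo linked at the end); it assumes jq is installed at /bin/jq and that the MetaDisk is the only vfat device on the box:
#!/bin/bash
# Sketch of softlayer.sh -- see the repo for the real thing.

JQBIN='/bin/jq'
MOUNT='/mnt'
NETJSON="$MOUNT/openstack/latest/network_data.json"
SCRIPTS='/etc/sysconfig/network-scripts'

# Find the MetaDisk (the only vfat filesystem) and mount it.
DISK=`blkid -t TYPE="vfat" | cut -f1 -d':'`
mount $DISK $MOUNT

# For every link (interface), look up its network and write ifcfg-*/route-*.
for LINK in `$JQBIN -r '.links | .[] | .id' $NETJSON`; do
    ETH=`$JQBIN -r ".links | .[] | select(.id==\"$LINK\") | .name" $NETJSON`
    MAC=`$JQBIN -r ".links | .[] | select(.id==\"$LINK\") | .ethernet_mac_address" $NETJSON`
    IP_ADDR=`$JQBIN -r ".networks | .[] | select(.link==\"$LINK\") | .ip_address" $NETJSON`
    NETMASK=`$JQBIN -r ".networks | .[] | select(.link==\"$LINK\") | .netmask" $NETJSON`

    cat > $SCRIPTS/ifcfg-$ETH <<EOF
DEVICE=$ETH
HWADDR=$MAC
IPADDR=$IP_ADDR
NETMASK=$NETMASK
BOOTPROTO=none
ONBOOT=yes
EOF

    # Every route listed for this network goes into route-$ETH.  The internal
    # interface has several; provisioning fails if they are missing.
    # (You may want to skip the 0.0.0.0 default route on the private interface
    # and let the public interface carry it.)
    : > $SCRIPTS/route-$ETH
    N=0
    $JQBIN -r ".networks | .[] | select(.link==\"$LINK\") | .routes | .[] | \"\(.network) \(.netmask) \(.gateway)\"" $NETJSON |
    while read NET MASK GW; do
        echo "ADDRESS$N=$NET"  >> $SCRIPTS/route-$ETH
        echo "NETMASK$N=$MASK" >> $SCRIPTS/route-$ETH
        echo "GATEWAY$N=$GW"   >> $SCRIPTS/route-$ETH
        N=$((N + 1))
    done
done

# Don't forget to unmount at the end.
umount $MOUNT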
Put it all together
So we have an RPM:
[root@centos6-box-1 x86_64]# rpm -qlp softlayer-networking-1.0-2.x86_64.rpm
/bin/jq
/bin/softlayer.sh
/etc/init.d/softlayer-networking
[root@centos6-box-1 x86_64]#
The RPM installs an init script and uses chkconfig as part of the RPM %post to make it start on boot. The startup order comes from the chkconfig header in the init script. We want it to happen right before networking, so we set it to 09 (since networking is 10).
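In the spec file, that boils down to something like this (a hypothetical fragment; the real spec is in the repo):
%post
# Register the init script; chkconfig reads the header below to pick the start order.
/sbin/chkconfig --add softlayer-networking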
Our init script:
[root@centos6-box-1 softlayer-networking]# head softlayer-networking | grep chk
# chkconfig: 2345 09 89
[root@centos6-box-1 softlayer-networking]#
The networking init script, for comparison:
[root@centos6-box-1 x86_64]# head /etc/init.d/network | grep chk
# chkconfig: 2345 10 90
[root@centos6-box-1 x86_64]#
The softlayer-networking init script is just a normal init script template that calls softlayer.sh when start() is run.
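Something along these lines (a stripped-down sketch; the real one in the repo has the usual status/restart plumbing):
#!/bin/bash
# chkconfig: 2345 09 89
# description: write network config from the SoftLayer MetaDisk before networking starts

case "$1" in
  start)
    /bin/softlayer.sh
    ;;
  stop)
    # nothing to tear down
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac
exit 0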
Assuming we did it right, the system boots up and runs our init script, which populates /etc/sysconfig/network-scripts/ifcfg-$eth and /etc/sysconfig/network-scripts/route-$eth. Then the network init script runs and starts networking like normal.
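For the network_data.json shown earlier, the files for the private interface come out roughly like this (values taken straight from that JSON; the exact output of my script may differ slightly):
/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
HWADDR=06:5b:90:a7:1f:8a
IPADDR=10.28.5.103
NETMASK=255.255.255.192
BOOTPROTO=none
ONBOOT=yes

/etc/sysconfig/network-scripts/route-eth0:
ADDRESS0=10.0.0.0
NETMASK0=255.0.0.0
GATEWAY0=10.28.5.65
ADDRESS1=161.26.0.0
NETMASK1=255.255.0.0
GATEWAY1=10.28.5.65

And the proof that the LVM root actually boots: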
james@test-01 [ ~ ] (04:38 PM - Sun Oct 01)
$ mount -l
/dev/mapper/vg_test-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/xvda1 on /boot type ext4 (rw)
/dev/mapper/vg_test-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
james@test-01 [ ~ ] (04:39 PM - Sun Oct 01)
$
Isn’t it beautiful?
Conclusion
By handling our networking manually using the cloud-init MetaDisk, we are able to achieve LVM support in IBM Bluemix. Neat.
The code is available on my GitHub.
Alternate Solution
Of course, after you come up with a solution, someone will offer an easier one, though it does require prep work and doesn't offer a quick fix for everyone. A friend mentioned you could just set up a DHCP server in your VLAN and then leave your image set to DHCP. You will still need to account for this on both your internal and external interfaces for provisioning to finish successfully.
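To make that concrete, a dhcpd.conf fragment for the private VLAN in the example above might look something like this. This is purely hypothetical (MAC and addresses are lifted from the network_data.json earlier), and I would guess you need a host reservation so the box comes up on the exact IPs that were assigned to it:
subnet 10.28.5.64 netmask 255.255.255.192 {
  option routers 10.28.5.65;
  option domain-name-servers 10.0.80.11, 10.0.80.12;

  host lvm-test {
    hardware ethernet 06:5b:90:a7:1f:8a;
    fixed-address 10.28.5.103;
  }
}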