Upgrading Lustre

It’s been close to a year since I updated our cluster; I was going to do it over Christmas, but never quite got around to it. The period of social distancing (and procrastinating on my research) is a great time, right? The cluster is running CentOS 7, and the biggest issue with upgrading it is the Lustre file system. These are all my notes on the upgrade process. I’m hoping that by writing them down here, my life will be somewhat easier the next time I need to do this. Learning how Lustre works all over again every time I do an update is an involved process!

Lustre is very picky about the version of the Linux kernel. This means we can’t just do a blanket “sudo yum update” on the system. We need to upgrade to the specific kernel version that is required by the new version of Lustre we will be installing.

On wyeast, the Lustre server is installed across three different nodes: wyeast-lustre01, wyeast-lustre02, and wyeast-lustre03. The metadata server is on the first node, and the object storage targets are stored on lustre02 and lustre03.

First, update the list of updates that yum knows about:

sudo yum makecache

Next, look at the lustre-server repo and find the current version of the Lustre server and the Linux kernel it uses.

sudo yum repo-pkgs lustre-server list
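The listing is long, so a small filter can pull out just the version you care about. This is only a sketch: pkg_version is a name I made up, and it assumes the usual three-column yum layout of package, version, and repository on a single line. Real yum output sometimes wraps long package names onto a second line, which this does not handle.

```shell
# pkg_version NAME: read `yum list`-style text on stdin and print the
# version column for the package called NAME (e.g. lustre.x86_64).
# Assumes each package sits on one line: name, version, repo.
pkg_version() (
    awk -v name="$1" '$1 == name { print $2 }'
)

# Intended use on a Lustre server node (not run here):
#   sudo yum repo-pkgs lustre-server list | pkg_version lustre.x86_64
```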

From this, I found that the current Lustre server version is 2.12.4. I checked the changelog on lustre.org to determine the kernel version needed:

http://wiki.lustre.org/Lustre_2.12.4_Changelog

The Linux kernel needed is actually available in the lustre-server repo:

kernel-3.10.0-1062.9.1.el7_lustre

So I needed to make sure to install that particular version and not the most up-to-date kernel.

sudo yum repo-pkgs lustre-server update kernel-3.10.0-1062.9.1.el7_lustre kernel-devel-3.10.0-1062.9.1.el7_lustre kernel-headers-3.10.0-1062.9.1.el7_lustre
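Typing the same version string three times is an easy place to slip. A minimal sketch of one way around that, assuming the three package names used in the command above; kernel_pkgs is a hypothetical helper, not something that ships with yum or Lustre.

```shell
# kernel_pkgs VERSION: print the kernel package specs for one pinned
# version, one per line, matching the three packages installed above.
kernel_pkgs() (
    for pkg in kernel kernel-devel kernel-headers; do
        printf '%s-%s\n' "$pkg" "$1"
    done
)

# Intended use (not run here); the unquoted $(...) is deliberate so that
# each line becomes a separate argument to yum:
#   sudo yum repo-pkgs lustre-server update $(kernel_pkgs 3.10.0-1062.9.1.el7_lustre)
```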

After that, I checked the current list of other updates available in the Lustre server repository.

sudo yum repo-pkgs lustre-server list

Next, I updated all the Lustre packages that were already installed:

sudo yum repo-pkgs lustre-server update kmod-lustre.x86_64 kmod-lustre-osd-ldiskfs.x86_64 libnvpair1.x86_64 libuutil1.x86_64 libzfs2.x86_64 libzpool2.x86_64 lustre.x86_64 lustre-osd-ldiskfs-mount.x86_64 lustre-osd-zfs-mount.x86_64 lustre-resource-agents.x86_64 lustre-zfs-dkms.noarch spl.x86_64 spl-dkms.noarch zfs.x86_64 zfs-dkms.noarch

Finally, I updated all the other system software, carefully excluding the Linux kernel and Lustre packages:

sudo yum -x kernel,kernel-headers,kernel-debug-devel,kernel-tools,kernel-tools-libs,kmod-lustre.x86_64,kmod-lustre-osd-ldiskfs.x86_64,libnvpair1.x86_64,libuutil1.x86_64,libzfs2.x86_64,libzpool2.x86_64,lustre.x86_64,lustre-osd-ldiskfs-mount.x86_64,lustre-osd-zfs-mount.x86_64,lustre-resource-agents.x86_64,lustre-zfs-dkms.noarch,spl.x86_64,spl-dkms.noarch,zfs.x86_64,zfs-dkms.noarch,kernel-devel update
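An exclude list that long is hard to read and easy to mistype. One option, sketched here under the assumption that the package names are kept one per line in a text file (exclude.txt is a hypothetical file name), is to build the comma-separated argument from that file:

```shell
# exclude_list: read package names on stdin (one per line), drop blank
# lines and repeated names, and print the rest joined by commas, which
# is the form `yum -x` is given above.
exclude_list() (
    awk 'NF && !seen[$1]++ { print $1 }' | paste -s -d , -
)

# Intended use (not run here):
#   sudo yum -x "$(exclude_list < exclude.txt)" update
```

Keeping the list in a file also means the same list can be reused on wyeast-lustre02 and wyeast-lustre03.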

That completes all the software upgrades. The same process needs to be done on wyeast-lustre02 and wyeast-lustre03. I probably should have unmounted the Lustre mounts before this process, but I didn’t. So after the reboot, Lustre wasn’t quite working, and I had to fix it.

First, I had to fix the firewall again on the Lustre machines:

sudo iptables -F

Next, zfs (the file system used by Lustre) was messed up on wyeast-lustre01 and wyeast-lustre02.

The command:

zfs list

wasn’t working. It showed that zfs wasn’t loaded. So the first step is to do:

modprobe zfs

This loaded zfs. However, our zfs pools were missing. This command fixed that:

zpool import

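Run with no arguments, this finds the zpools and lists the ones that can be imported. Each pool is then imported by its pool name, which is the part before the slash in the dataset names. On wyeast-lustre02:

zpool import lustre-ost0

And on wyeast-lustre01:

zpool import lustre-mgsmdt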
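If you would rather not read the pool names off the listing by eye, a small filter can extract them. This is a sketch that assumes the listing labels each pool with a "pool:" line, as the zpool import output I have seen does; importable_pools is a hypothetical helper name.

```shell
# importable_pools: read `zpool import` listing text on stdin and print
# the name of each pool it reports, one per line.
importable_pools() (
    awk '$1 == "pool:" { print $2 }'
)

# Intended use as root on a server node (not run here):
#   zpool import | importable_pools
```

There is also zpool import -a, which imports every pool it finds in one step. Otherwise, each printed name can be passed to zpool import in turn.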

This loads the zfs pools, but I still need to remount the Lustre file system. This needs to be done on the object storage targets first (lustre02 and lustre03) before it is done on the metadata server (lustre01).

sudo mount -t lustre lustre-ost0/ost0 /lustre-ost0/ost0

sudo mount -t lustre lustre-ost1/ost1 /lustre-ost1/ost1

Lustre actually automounted correctly on lustre03, so I didn’t have to fix anything there. With the targets working, it was time to fix lustre01:

mount -t lustre lustre-mgsmdt/mgsmdt /lustre-mgsmdt/mgsmdt

Mounting the Lustre file system starts the Lustre service and we are off to the races.

Back on the compute side of the cluster, the head node wasn’t finding the Lustre mount, so I had to unmount and then remount Lustre.

First, when I tried to unmount Lustre, the file system was reported as busy. So I ran the following command to find the guilty processes:

sudo lsof +f -- /lustre
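The lsof output has one row per open file, so the same process can show up many times. A rough sketch of boiling it down to a list of distinct PIDs, assuming the default lsof column layout with a header row and the PID in the second column; busy_pids is a name I made up.

```shell
# busy_pids: read default `lsof` output on stdin and print each distinct
# PID once, in numeric order, skipping the header row.
busy_pids() (
    awk 'NR > 1 { print $2 }' | sort -un
)

# Intended use (not run here):
#   sudo lsof +f -- /lustre | busy_pids
```

lsof also has a -t option that prints bare PIDs on its own. Either way, it is worth looking at what the processes are before killing anything. Piping the lsof command above through busy_pids prints one PID per line.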

This gave me a list of processes that I was then able to kill off. After that:

sudo umount /lustre

Followed by:

sudo mount -t lustre 192.168.1.11@tcp:/lustre /lustre

Which worked! Although I hadn’t yet updated the Lustre client, it was still able to handle the updated Lustre server. The other nodes that didn’t have active shells attached to them didn’t have any trouble with the change; I didn’t even have to remount them; the file system just showed up without any trouble.
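To check which nodes actually have /lustre mounted, rather than finding out when a job fails, something like the following could be run on each node. It is a sketch that reads the kernel’s mount table in /proc/mounts, where the second field is the mount point and the third is the file system type; lustre_mounted is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a saved copy of the table.

```shell
# lustre_mounted MOUNTPOINT [MOUNTS_FILE]: succeed (exit 0) if MOUNTPOINT
# appears as a mount of type "lustre" in MOUNTS_FILE, which defaults to
# /proc/mounts. Fields there: device, mount point, fs type, options, ...
lustre_mounted() (
    awk -v mp="$1" '
        $2 == mp && $3 == "lustre" { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
)

# Intended use on a compute node (not run here):
#   lustre_mounted /lustre || sudo mount -t lustre 192.168.1.11@tcp:/lustre /lustre
```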

The next step is to update the software on the compute nodes. It is a similar process, except somewhat easier since we don’t have to deal with zfs. I still want to limit the install to the particular Linux kernel and the lustre-client repo. In this case, I had to download the rpms from rpmfind:

https://rpmfind.net/linux/rpm2html/search.php?query=kernel%28x86-64%29&submit=Search+…&system=&arch=

I downloaded RPMs for kernel, kernel-debug-devel, kernel-headers, kernel-tools, and kernel-tools-libs. This time, I remembered to unmount /lustre first. Then I installed the new kernel packages:

sudo yum localinstall kernel-3.10.0-1062.9.1.el7.x86_64.rpm kernel-debug-devel-3.10.0-1062.9.1.el7.x86_64.rpm kernel-headers-3.10.0-1062.9.1.el7.x86_64.rpm kernel-tools-3.10.0-1062.9.1.el7.x86_64.rpm kernel-tools-libs-3.10.0-1062.9.1.el7.x86_64.rpm

Next, update the Lustre client:

sudo yum repo-pkgs lustre-client update kmod-lustre-client.x86_64 lustre-client.x86_64

Then update everything else, excluding the kernel stuff:

sudo yum update -x kernel,kernel-debug-devel,kernel-headers,kernel-tools,kernel-tools-libs

Finally, reboot and then remount Lustre:

sudo mount -t lustre 192.168.1.11@tcp:/lustre /lustre

Unlike with the Lustre server, I didn’t encounter any trouble with the reboot. The Lustre partition survived the update just fine, and I was able to successfully update all the rest of the installed software on the system.
