Creating Ceph Bluestore OSDs with Spinning Drives and SSDs for DB/WAL

As a consultant I work with a downstream, supported version of Ceph, so every once in a while I like to catch up on new features and functions that have not yet reached that version. That process led me to rebuilding my homelab and using Ceph Nautilus as the base for its storage.

Using ceph-volume

Ceph ships with a deployment and inspection tool called ceph-volume. Much like the older ceph-deploy tool, ceph-volume lets you inspect, prepare, and activate object storage daemons (OSDs). Its advantages include support for LVM and dm-cache, and it no longer relies on or interacts with udev rules.

For my use case I have installed a single Fusion IOMemory card in each of my nodes in order to deploy OSDs with faster storage for the DB and WAL devices. It’s a very good idea to read the BlueStore configuration reference, since BlueStore is the default for new OSD deployments. Take careful note of the recommendations for the use of a DB and WAL device:

If there is only a small amount of fast storage available (e.g., less than a gigabyte), we recommend using it as a WAL device. If there is more, provisioning a DB device makes more sense. The BlueStore journal will always be placed on the fastest device available, so using a DB device will provide the same benefit that the WAL device would while also allowing additional metadata to be stored there (if it will fit).

Bluestore Configuration Reference

In my case, thanks to the Fusion IOMemory card, I want to create enough partitions to support 11 OSDs and make each one as large as possible for the DB device (which also puts the WAL on the same partition). My fast media offers 931 GB of usable storage; split evenly across all eleven OSDs, that works out to partitions of roughly 84 GB. I like round numbers, so those partitions are 80 GB each, and the deployment command looks something like this.
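The sizing above can be sketched with a little shell arithmetic. The sgdisk line is left commented out, and the /dev/fioa device name is from my lab, so adjust both before running anything for real.

```shell
# Per-OSD DB partition size: 931 GB of usable flash split across 11 OSDs.
FAST_GB=931
OSDS=11
PART_GB=$(( FAST_GB / OSDS ))
echo "raw split: ${PART_GB} GB per OSD"   # raw split: 84 GB per OSD
# Rounding down to 80 GB keeps the numbers tidy and leaves a little headroom.
# Carving the partitions could then look like this (device name is an
# assumption from my lab; sgdisk writes GPT partitions):
# for i in $(seq 1 "${OSDS}"); do sgdisk --new="${i}:0:+80G" /dev/fioa; done
```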

root@ganymede:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdd --block.db /dev/fioa5

Be sure to replace the --data argument with the storage device for the OSD, and point the --block.db argument at the partition on the fast storage you wish to use for that OSD. After that I run the activation command for all OSDs on the node.
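Repeating the prepare command for all eleven OSDs is easy to script. The device names below (sdb through sdl paired with fioa1 through fioa11) are assumptions for illustration, so the sketch echoes each command rather than executing it until the mapping looks right.

```shell
# Hypothetical device mapping: eleven spinning disks, one fioa partition each.
DISKS=(sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl)
for i in "${!DISKS[@]}"; do
  part=$(( i + 1 ))
  # Echo first; remove the echo to actually prepare the OSDs.
  echo ceph-volume lvm prepare --bluestore --dmcrypt \
    --data "/dev/${DISKS[$i]}" --block.db "/dev/fioa${part}"
done
```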

root@ganymede:~# ceph-volume lvm activate --all

Assuming everything has gone as expected, the OSDs will start up and join the cluster, and you’ll get all the speedy goodness of an SSD for the write-ahead log and RocksDB.

Mounting CephFS From Multiple Clusters to a Single Machine using FUSE

For my new homelab cluster I’ve built a fresh Ceph filesystem to store certain chunks of my data, and I found I needed to mount the filesystems from both the old and new clusters on one of my nodes. Normally I mount with ceph-fuse through /etc/fstab, so I simply modified it with the following.

root@storage:~# grep fuse /etc/fstab
none	/mnt/storage/ceph	fuse.ceph	ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults	0 0
none	/mnt/storage/ceph-old	fuse.ceph	ceph.conf=/etc/ceph-old/ceph.conf,_netdev,defaults	0 0

The /etc/ceph-old/ directory is a copy of the config files from the older cluster. In the /etc/ceph-old/ceph.conf file I added the following, since the keyring for that cluster is not in the default path.

keyring = /etc/ceph-old/ceph.client.admin.keyring
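For reference, the resulting /etc/ceph-old/ceph.conf stanza ends up looking something like this; the fsid and monitor addresses are placeholders for the old cluster’s actual values.

```
[global]
fsid = <old-cluster-fsid>
mon host = <old-cluster-monitor-addresses>
keyring = /etc/ceph-old/ceph.client.admin.keyring
```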

Any time the ceph.conf from the old cluster is used, so is the old keyring, and the filesystem mounts up just fine.

Filesystem     Type            Size  Used Avail Use% Mounted on
ceph-fuse      fuse.ceph-fuse  100T   91T  9.4T  91% /mnt/storage/ceph-old

FreeIPA Certificates Page Displays CertificateOperationError

A fresh install of FreeIPA using the Ubuntu Bionic package displays an error on the ‘Certificates’ page which reads:

IPA Error 4301: CertificateOperationError
Certificate operation cannot be completed: Unable to communicate with CMS (Start tag expected, '<' not found, line 1, column 1)

After doing some research on the problem, it seems to have already been resolved upstream, and in the Ubuntu Cosmic distribution; however, the backport has not yet hit Ubuntu Bionic. I was able to safely apply the upstream commit to the files under /usr/lib/python2.7/dist-packages/ipapython, then restart FreeIPA, and all was well.

root@ipa:~# ipactl restart
Stopping pki-tomcatd Service
Restarting Directory Service
Restarting krb5kdc Service
Restarting kadmin Service
Restarting named Service
Restarting httpd Service
Restarting ipa-custodia Service
Restarting pki-tomcatd Service
Restarting ipa-otpd Service
Restarting ipa-dnskeysyncd Service
ipa: INFO: The ipactl command was successful

Ubuntu Bionic (actually cloud-init) Reverting Hostname on Reboot

If you’ve changed the hostname on an Ubuntu Bionic install, restarted the node, and then found that the hostname has reverted, you may be wondering why. The problem actually stems from the cloud-init scripts and the ‘preserve_hostname’ option.

root@ipa:~# grep -H -n preserve /etc/cloud/cloud.cfg
/etc/cloud/cloud.cfg:15:preserve_hostname: false

Change the variable to true, and the next time you change the hostname and reboot, it will be left intact.
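The edit itself is a sed one-liner. The sketch below runs against a scratch copy of the file first (a stand-in for the stock /etc/cloud/cloud.cfg), which is a safe way to verify the substitution before touching the real thing.

```shell
# Scratch copy standing in for /etc/cloud/cloud.cfg; point cfg at the real
# file once the substitution looks right.
cfg=/tmp/cloud.cfg.test
printf 'preserve_hostname: false\n' > "$cfg"
sed -i 's/^preserve_hostname: false/preserve_hostname: true/' "$cfg"
grep preserve "$cfg"   # preserve_hostname: true
```

After flipping the real file, set the hostname with hostnamectl and it should survive the next reboot.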

FreeIPA WebUI Login Fails with “Login failed due to an unknown reason.”

I’ve been setting up a fresh install of my homelab and trying to get FreeIPA to work on Ubuntu Bionic. If you happen to see the “Login failed due to an unknown reason.” error while trying to log in through the web UI, try adding execute permission for all users on the “/var/lib/krb5kdc/” directory.

root@ipa:~# chmod a+x /var/lib/krb5kdc

Try to log in after that and, if the problem was the same as my own, you’ll find it working now.

Unable to Access WordPress Dashboard after Upgrades

I use DreamHost’s DreamPress product for this website, and as part of that product there are caching plugins installed. Normally this is perfectly fine, but after an upgrade that caching was preventing me from accessing the dashboard. This post explains how I got past it.

Step 1:
Make sure you are logged in and can see the admin bar at the top of your site. From that bar purge the page and database cache.

Step 2:
From the DreamPress dashboard, find your SSH credentials and use them to log in over SSH.

Step 3:
Once you are logged in over SSH, use ‘cd yourdomain.tld’ to access your website directory and execute ‘wp cache flush’ to clear out any remaining issues.

That’s it, go ahead and try to access your dashboard again.

Use GitLab Personal Access Token as Password

I’m finally writing some more code, so I’ve started making use of my GitLab instance. One of the first things I turned on was 2FA, but that also means that checking out over https:// could no longer authenticate with my password. The solution is to visit your profile page, click on the “Access Tokens” tab, and generate a token which can be used as a password.
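Using the token is just a matter of pasting it where a password would normally go; the hostname and project path below are placeholders. Caching it with git’s built-in credential helper saves retyping it on every push.

```shell
# At the prompt, the username is your GitLab login and the password is the
# personal access token (not your account password):
#   git clone https://gitlab.example.com/me/project.git
# Cache the credentials in memory for an hour so each push doesn't re-prompt:
git config --global credential.helper 'cache --timeout=3600'
git config --global credential.helper   # cache --timeout=3600
```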