Cookie Consent by Free Privacy Policy Generator Page 7 of 7 for Home | Igor Moiseev

Igor Moiseev Applied mathematician, Web Developer

Manage wisely your cache setting in Ceph

Today we discuss how to manage wisely your cache setting in Ceph (docs.ceph.com/en/latest/rbd/index.html). The type of cache discussed below is the user space implementation of the Ceph block device (i.e., librbd).

Suddenly after migration to Ceph we start observing the doubling of memory usage by every virtual machine.

Ceph memory consumption is doubling

After some research we found the configuration error in ceph.conf. The option rbd cache size describes the amount of cache reserved for every virtual disk, so the total amount of RAM reserved for cache is given by

total amount of ram = (number of virtual disks)  X  (rbd cache size)

After some rational reasoning of the kind - the standard hard drive usually possess 64M of cache - the final version of configuration will be the following

rbd cache = true
rbd cache size = 67108864 # (64MB)
rbd cache max dirty = 50331648 # (48MB)
rbd cache target dirty = 33554432 # (32MB)
rbd cache max dirty age = 2
rbd cache writethrough until flush = true

Now push the configuration to the virtual hosts

$ ceph-deploy --overwrite-conf config push host1 host2

To apply settings without virtual machine restart use KVM live migration. After the virtual machine is migrated from host1 to host2 the new process will start taking into account the modification to the ceph configuration.

If not sure about setting leave Ceph’s default values for amount of cache, simply enable it

rbd cache = true
rbd cache writethrough until flush = true

SSD - the definitive guide for Linux!

STEP 1:

Choose a disk with the most optimal ratio IOPS/price. Check out the list of disks in order of IOPS (Input/Output Operations Per Second) performance: en.wikipedia.org/wiki/IOPS.

Quite often SSD disks are released with firmware bugs or with non-optimal configurations, so before putting the system in production checkout the latest version.

STEP 2:

Check that AHCI - Advanced Host Controller Interface is enabled and working:

$ sudo dmesg | grep -i ahci
    ahci 0000:00:11.0: version 3.0
    ahci 0000:00:11.0: irq 43 for MSI/MSI-X
    ahci 0000:00:11.0: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
    ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part 
    scsi0 : ahci
    scsi1 : ahci
    scsi2 : ahci

Check whether your controller supports AHCI:

$ sudo lshw | grep -i ahci
    product: 82801JI (ICH10 Family) SATA AHCI Controller
    capabilities: storage msi pm ahci_1.0 bus_master cap_list emulated
    configuration: driver=ahci latency=0

Quite often the AHCI is disabled in BIOS, in this case reboot and enable it.

I observed unstable behavior of disks without AHCI enabled and even the inability to execute TRIM correctly.

Identify the type of SATA modes available (for ex. SATA-II: 3Gbps gives the theoretical limit of speed 375MB/s)

$ sudo dmesg | grep SATA
    ahci 0000:00:11.0: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
    ata1: SATA max UDMA/133 abar m1024@0xfddffc00 port 0xfddffd00 irq 43

Check what is supported by disk

$ sudo hdparm -I /dev/sda | grep SATA
    Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0

The ideal would be any revision higher than SATA Rev 3.0 which guaranties the 6Gbps or higher speeds.


STEP 2.5: Pause


STEP 3:

Check that disk TRIM (wikipedia.org/wiki/Trim) works fine

$ sudo hdparm -I /dev/sda | grep -i trim
       *    Data Set Management TRIM supported (limit 8 blocks)

It is very important to have TRIM functioning. Without TRIM the disk speed will degrade with time due to the fact that the SSD will have to erase the cell before every write operation.

Let’s test it simply by executing fstrim

$ sudo fstrim -v /
/: 98147174400 bytes were trimmed

Lets test whether the TRIM is really doing what it should

$ sudo wget -O /tmp/test_trim.sh "https://sites.google.com/site/lightrush/random-1/checkiftrimonext4isenabledandworking/test_trim.sh?attredirects=0&d=1"
$ sudo chmod +x /tmp/test_trim.sh
$ sudo /tmp/test_trim.sh <tempfile> 50 /dev/sdX

If TRIM is properly working the result of the last command should be a bunch of zeros, thanks to Nicolay Doytchev for this script.

STEP 4:

Be sure to format disk in the SSD friendly file system: EXT4, F2FS, BTRFS, XFS or any other from this list File systems optimized for flash memory.

If not, you’ll better migrate it to EXT4 at least with the help of this manual Migrating a live system from ext3 to ext4 filesystem or consider the complete re-installation of the operating system.

STEP 5:

Mount disk with correct parameters

$ cat /etc/fstab
/dev/pve/data /var/lib/vz ext4 discard,noatime,commit=600,defaults 0 1

Check whether the disk is identified as non rotational

$ sudo for f in /sys/block/sd?/queue/rotational; do printf "$f is "; cat $f; done
/sys/block/sda/queue/rotational is 0

If you see 1 on SSD, that means there are some problem with kernel or AHCI

The next is to check that the scheduler option is selected on deadline for our brand new SSD drive

$ sudo for f in /sys/block/sd?/queue/scheduler; do printf "$f is "; cat $f; done
/sys/block/sda/queue/scheduler is noop [deadline] cfq

If not execute the following

$ sudo echo deadline > /sys/block/sda/queue/scheduler

References

moiseevigor.github.io is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means to earn fees when linking to Amazon.com and affiliated sites.