Revisiting an article on how to set up Solid State Disks with Linux

Almost three years ago I wrote a lengthy article on how to align, partition, configure and benchmark Solid State Disks under Linux. Those were the early days of these NAND flash memory devices, and you had to jump through some hoops to get the best performance and durability out of them. Parts of that tutorial have been obsolete for some time now. Graphical partitioning tools, for example, handle the alignment of these devices correctly nowadays, which they did not do back then. I never found the time to go back to that article and bring it up to date.

So we are grateful that Don, who goes by dibl on the forum, took it upon himself to present us with a modernised version of this tutorial. Not only does he eliminate cruft that is obsolete or plain wrong nowadays, he also describes a different approach to trimming these devices. It replaces the discard parameter in your fstab, which takes care of the trimming for most of us, with the fstrim command from the package util-linux. A script run from cron, daily or weekly and preferably while you are away from the machine, calls fstrim. This avoids having discard issue a TRIM every time we delete a file, which slows things down. So, here we go, thanks again, Don.

Optimizing SSD-based System Performance

The goal of these configuration settings, generally, is to select and configure a suitable filesystem, to minimize erase/write cycles that don’t add to system performance, to enable TRIM, and otherwise to optimize the OS to provide a very responsive user experience.

1. Filesystem type and /etc/fstab configuration

We want to use ext4 and take advantage of the journaling feature, for data security, but we want to reduce the frequency of journal commits (writes to the SSD) from the default of once every 5 seconds to a slower rate, to extend the life of the SSD memory blocks. The “commit” mount option sets the interval between journal commits, in seconds. Understanding that a longer interval also raises the risk of losing data in a power failure or system freeze, choose an interval that you are comfortable with, like “120” for two minutes, a reduction of 24x. To avoid writes caused only by reading files, use the “noatime” option. As a result, the /etc/fstab line that mounts the OS will look like this:

UUID=bea3a748-3411-4024-acd0-39f3882ddaf9 / ext4 noatime,commit=120,errors=remount-ro 0 1
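After remounting or rebooting, it can be worth checking that the options are really active. The following is a minimal sketch, assuming findmnt from util-linux is installed; the helper name has_opt is made up for this example.

```shell
#!/bin/sh
# has_opt OPTIONS NAME: succeed if NAME appears in a comma-separated option list.
# (has_opt is a hypothetical helper for this sketch, not a standard tool.)
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Mount options of the running root filesystem; empty if findmnt is missing.
opts=$(findmnt -no OPTIONS / 2>/dev/null || true)

for want in noatime commit=120; do
    if has_opt "$opts" "$want"; then
        echo "$want: active"
    else
        echo "$want: not active"
    fi
done
```

If you chose a commit interval other than 120, adjust the value in the loop accordingly.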

For typical laptop and single-user computers, we want to mount selected filesystems as “tmpfs”, which lets the OS use memory rather than the SSD for logging and spooling. The wise user will wait for a reasonable period of time after initially installing on the SSD, before these changes are made to /etc/fstab, because until you are sure your system is stable, you should allow the logs to be written on the SSD, for later review. Logs written in memory will not survive a reboot. When you are satisfied that the system is stable and the logs can safely be lost at each reboot, add these lines to the end of /etc/fstab:

none /tmp tmpfs defaults,noatime,mode=1777 0 0
none /var/tmp tmpfs defaults,noatime 0 0
none /var/log tmpfs defaults,noatime 0 0
none /var/spool tmpfs defaults,noatime 0 0
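Once the tmpfs lines are in place and the system has been rebooted, a quick check like the one below shows which of the four directories really live in memory. It is only a sketch: it assumes GNU stat (coreutils), and fs_type_of is a name invented for this example.

```shell
#!/bin/sh
# fs_type_of DIR: print the filesystem type backing DIR, or "unknown".
# (hypothetical helper; relies on GNU stat's -f -c %T)
fs_type_of() {
    t=$(stat -f -c %T "$1" 2>/dev/null) || t=""
    echo "${t:-unknown}"
}

for d in /tmp /var/tmp /var/log /var/spool; do
    echo "$d: $(fs_type_of "$d")"
done
```

Directories mounted as described above should be reported as tmpfs.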

Alternatively, you could mount these directories on a hard disk drive if one is also installed in the system – a better approach for a server, for example, where you might need to periodically review older logs.

2. Enable TRIM

Recent guidance [4] recommends using the fstrim utility periodically, rather than using the “discard” filesystem mount option. At the conclusion of the linked blog, a very handy script for use as a cron job is shown, and it is repeated here for convenience:

#!/bin/sh
#
# To find which FS support trim, we check that DISC-MAX (discard max bytes)
# is greater than zero. Check discard_max_bytes documentation at
# https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt
#
for fs in $(lsblk -o MOUNTPOINT,DISC-MAX,FSTYPE | grep -E '^/.* [1-9]+.* ' | awk '{print $1}'); do
    fstrim "$fs"
done

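Save it as /etc/cron.weekly/fstrim_job and make it executable (chmod +x), because run-parts, which cron uses for this directory, only runs executable files.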
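To see what the filter in that script selects before letting cron run it, the selection can be tried on its own. The sketch below approximates the grep with a single awk test (mount point starts with /, DISC-MAX is not zero). The function name trim_capable is hypothetical, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# trim_capable: read "MOUNTPOINT DISC-MAX FSTYPE" lines on stdin and print
# the mount points whose DISC-MAX is non-zero (lsblk shows zero as 0B).
# (hypothetical helper for this sketch)
trim_capable() {
    awk '$1 ~ /^\// && $2 !~ /^0B?$/ { print $1 }'
}

# Typical use on a live system (no root needed for listing):
#   lsblk -o MOUNTPOINT,DISC-MAX,FSTYPE | trim_capable
```

Every mount point it prints is one that the cron script would pass to fstrim.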

Note that LVM systems and LUKS/dm encrypted filesystems add additional configuration tasks to the basic filesystem configuration described here – follow the guidance to enable TRIM at each level of your system configuration.

3. Outsource the browser cache to /run/user (guidance for single-user system, can be expanded for multi-user implementation)

Since Debian now has the shared directory /run/user/usernumber in RAM, we can outsource the cache generated during browsing to memory, and eliminate many SSD writes. For example, in the Firefox/Iceweasel address bar we enter “about:config” and confirm the warning. Now right-click in the white space and choose “New ==> String” and we create a new entry called:

browser.cache.disk.parent_directory

After double-clicking the new string, we assign it the value:

/run/user/1000/firefox-cache for the first user.

Now as user in the terminal create a directory:

mkdir -p /run/user/1000/firefox-cache

After a Firefox restart, browser caching happens in memory, not on the SSD.

For chromium-browser, the cache location is set with the --disk-cache-dir="DIRNAME" launch command option. So to outsource the chromium-browser cache:

mkdir -p /run/user/1000/chromium-cache

Open the chromium-browser launch icon for editing, change to the »Application« tab, and edit the start command to read as follows:

/usr/bin/chromium --disk-cache-dir=/run/user/1000/chromium-cache %U

Alternatively, in /usr/share/applications/chromium.desktop, find the line

Exec=/usr/bin/chromium %U

and edit it to read

Exec=/usr/bin/chromium --disk-cache-dir=/run/user/1000/chromium-cache %U

Note that this will need to be done again after each chromium package update overwrites it.
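One way to avoid repeating the edit is to keep a patched copy of the launcher in your home directory. On desktops that follow the XDG lookup order, a file in ~/.local/share/applications takes precedence over the one in /usr/share/applications and is not touched by package updates. The sketch below assumes the Exec line reads exactly as shown above; patch_chromium_desktop is a name made up for this example.

```shell
#!/bin/sh
# patch_chromium_desktop SRC DST CACHEDIR
# Copy SRC to DST, adding --disk-cache-dir=CACHEDIR to the Exec line.
# (hypothetical helper; assumes the Exec line is "Exec=/usr/bin/chromium %U")
patch_chromium_desktop() {
    src=$1
    dst=$2
    cache=$3
    sed "s|^Exec=/usr/bin/chromium %U\$|Exec=/usr/bin/chromium --disk-cache-dir=$cache %U|" \
        "$src" > "$dst"
}

# Example call:
#   mkdir -p ~/.local/share/applications
#   patch_chromium_desktop /usr/share/applications/chromium.desktop \
#       ~/.local/share/applications/chromium.desktop \
#       /run/user/1000/chromium-cache
```

If the packaged Exec line differs on your system, the sed pattern will not match and the copy stays unchanged, so check the result with grep Exec before relying on it.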

The new browser cache directory in /run/user will not survive a reboot. To recreate it automatically at login, put the following “auto_browser_cache.sh” script in your ~/.kde/Autostart folder (for KDE users), and then chmod +x to make it executable:

#!/bin/bash
# Recreate the RAM-backed browser cache directories at login.
mkdir -p /run/user/1000/chromium-cache
mkdir -p /run/user/1000/firefox-cache

Analogous cache outsourcing configuration can be made for other browsers, if they allow the user to specify the cache location, and the startup script can be adapted to add directories for each browser that the user wants to run.
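For a multi-user system, the hard-coded user number 1000 can be replaced by the runtime directory of whoever is logged in. The sketch below assumes the session exports XDG_RUNTIME_DIR (as systemd-based logins normally do) and falls back to /run/user/<uid>; the function names are invented for this example. Each user still has to enter their own path in the browser settings.

```shell
#!/bin/sh
# browser_cache_dir NAME: print the per-user cache path for browser NAME.
# (hypothetical helper for this sketch)
browser_cache_dir() {
    echo "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/$1-cache"
}

# make_browser_caches: create the cache directories for the browsers in use.
make_browser_caches() {
    for b in chromium firefox; do
        mkdir -p "$(browser_cache_dir "$b")"
    done
}

# In the autostart script, the only call needed would be:
#   make_browser_caches
```

Add further browser names to the loop as needed.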

For a desktop system that remains booted for long periods, and depending on the memory capacity and browsing activities, the outsourced browsing cache could grow to a problematic size and need to be manually cleared to avoid sending the system into swapping.

4. I/O Scheduler selection

Multiple sources that you can find with a google search indicate that, for SSDs, the “deadline” and “noop” schedulers perform better than the default “cfq” scheduler, with deadline getting the most recommendations. Set the scheduler in /etc/sysfs.conf (provided by the sysfsutils package) like so:

block/sda/queue/scheduler=deadline
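To confirm which scheduler is actually in use, read it back from sysfs, where the active one is shown in square brackets. The following sketch lists every block device with its rotational flag (0 usually means SSD) and active scheduler; active_scheduler is a hypothetical helper name. Note that newer kernels offer different scheduler names (for example mq-deadline and none), so the list you see may not contain deadline, noop or cfq.

```shell
#!/bin/sh
# active_scheduler: read a sysfs scheduler line such as "noop deadline [cfq]"
# on stdin and print the bracketed (active) entry.
# (hypothetical helper for this sketch)
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    rot=$(cat "/sys/block/$dev/queue/rotational" 2>/dev/null)
    echo "$dev rotational=${rot:-?} scheduler=$(active_scheduler < "$f")"
done
```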

5. Virtual memory settings

Depending on how much memory your system has, and how you use it, the same tweaks to vm (swappiness, vfs_cache_pressure, etc.) that you use for a hard disk drive installation can also be applied to a system installed on an SSD. Guidance is available via google search as well as the two excellent references below. Here are the lines added to /etc/sysctl.d/sysctl.conf on one of my SSD installations:

vm.swappiness=1
vm.vfs_cache_pressure=25
vm.dirty_ratio=50
vm.dirty_background_ratio=3
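The values currently in effect can be read back from /proc/sys, where each dot in a sysctl key corresponds to a directory level. A small sketch, with sysctl_path as a made-up helper name:

```shell
#!/bin/sh
# sysctl_path KEY: map a sysctl key such as vm.swappiness to its /proc/sys file.
# (hypothetical helper for this sketch)
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

for key in vm.swappiness vm.vfs_cache_pressure vm.dirty_ratio vm.dirty_background_ratio; do
    p=$(sysctl_path "$key")
    if [ -r "$p" ]; then
        echo "$key = $(cat "$p")"
    else
        echo "$key: not readable here"
    fi
done
```

Running it before and after a reboot shows whether the file in /etc/sysctl.d was picked up.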

Virtual Memory Tuning References:

https://business-asset.com/eng/wiki-blog/varia/the-linux-page-cache-and-pdflush-theory-of-operation-and-tuning-for-write-heavy-loads-2855.html

http://www.cyberciti.biz/faq/linux-kernel-tuning-virtual-memory-subsystem/

Performance Testing (from devil’s article)

Before you spend time on performance testing and benchmarking your SSD, you need to determine the firmware version you have, and then check the OEM’s website and learn whether a more recent version is available. Significant performance improvements can result merely from updating your SSD firmware — follow your OEM’s instruction to install updated firmware. To check your firmware version:

hdparm -iv /dev/sdx

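6. Verify that TRIM is working (this test relies on the “discard” mount option; the fstab line in #1 above does not set it, so if you trim with fstrim instead, run fstrim on the filesystem as root after deleting the file).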

# cd to some directory on the SSD, then
dd if=/dev/urandom of=tempfile bs=512k count=100 oflag=direct
hdparm --fibmap tempfile

# here we read the sectors from the tempfile

From the output we copy the number immediately under “begin_LBA” and insert it in the next command:
hdparm --read-sector 1234567 /dev/sdx
# 1234567 replaced with the number from the previous command and /dev/sdx with your device ID

The output should be a longer string. Next:

rm tempfile
sync
hdparm --read-sector 1234567 /dev/sdx

# replace 1234567 and /dev/sdx with your values

The sectors will not be cleared instantly due to caching, so wait for some seconds. Then repeat the last command (hdparm --read-sector …); after a short while it should come out all zeros. That means TRIM works! If you have problems with “discard” on your SSD and you have verified that your SSD does support TRIM, you can use fstrim, which is in the current util-linux package (check “man fstrim”), or use the tool “DiskTrim” from http://disktrim.sourceforge.net/.
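Copying the sector number by hand and eyeballing the dump are the error-prone parts of this test. The two helpers below are a sketch of how they could be scripted. They assume that hdparm --fibmap prints a header line containing begin_LBA followed by the extent rows, and that hdparm --read-sector prints the sector as lines of four-digit hex words; both function names are invented here.

```shell
#!/bin/sh
# first_lba: read "hdparm --fibmap FILE" output on stdin and print the
# begin_LBA of the first extent (second column of the first row after the header).
# (hypothetical helper; output layout is an assumption)
first_lba() {
    awk 'seen && NF >= 2 { print $2; exit } /begin_LBA/ { seen = 1 }'
}

# sector_is_zero: read "hdparm --read-sector" output on stdin; succeed only
# if hex dump lines are present and contain nothing but zeros.
# (hypothetical helper; output layout is an assumption)
sector_is_zero() {
    awk '/^[0-9a-f ]+$/ && /[0-9a-f]/ { hex = 1; if ($0 ~ /[1-9a-f]/) bad = 1 }
         END { exit ((hex && !bad) ? 0 : 1) }'
}

# Example use as root, with /dev/sdx replaced by your device:
#   lba=$(hdparm --fibmap tempfile | first_lba)
#   hdparm --read-sector "$lba" /dev/sdx | sector_is_zero && echo "sector is zeroed"
```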

7. Throughput Benchmarking

CAUTION: You can benchmark your SSD to a premature death by subjecting it to frequent comprehensive benchmark tests!

7a. Simple hdparm test:

hdparm -tT /dev/sdx

Run it twice in rapid succession — normally the second run is fastest.

7b. hdparm with O_DIRECT kernel flag:

hdparm --direct -tT /dev/sdx

7c. More reliable benchmark using dd:

# cd to some directory on the SSD, then
$ dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.18232 s, 492 MB/s

Now (as root) clear the buffer cache to force reading directly from disk:
# echo 3 > /proc/sys/vm/drop_caches
$ dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.55234 s, 421 MB/s

Now the file is in the buffer cache, so reading it again measures the speed of the cache:
$ dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.122594 s, 8.8 GB/s

For the most accurate possible value for your SSD, re-run the last command 5 times and average the results.
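Averaging by hand gets tedious, so here is a small sketch that computes the mean of one number per line. The function name mean is made up for this example, and the commented loop assumes that dd reports the same unit on every run.

```shell
#!/bin/sh
# mean: read one number per line on stdin and print the average
# with one decimal place. (hypothetical helper for this sketch)
mean() {
    awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Example: repeat a dd read test 5 times and average the reported speed.
# dd writes its summary to stderr; the speed is the next-to-last field
# of the last line.
#   for i in 1 2 3 4 5; do
#       dd if=tempfile of=/dev/null bs=1M count=1024 2>&1 | awk 'END { print $(NF-1) }'
#   done | mean
```

Keep the caution above in mind and do not loop the write test more often than necessary.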

7d. Other benchmarking tools are bonnie++ and compilebench. Have fun!

REFERENCES:

[1] Debian Wiki https://wiki.debian.org/SSDOptimization

[2] Arch Wiki https://wiki.archlinux.org/index.php/Solid_State_Drives

[3] SSD Endurance Testing http://techreport.com/review/25889/the-ssd-endurance-experiment-500tb-update

[4] Trim Guidance http://blog.neutrino.es/2013/howto-properly-activate-trim-for-your-ssd-on-linux-fstrim-lvm-and-dmcrypt/
