I'm always short on disk space on my laptop. I just like to carry lots of “you never know when you need it” stuff with me: ISO images, firmware for a multitude of network devices, and a good part of my music collection. All this data is mostly read-only, does not require fast access, and is not critical, so I can wait until it is restored from backup if it is lost. Putting it on the NVMe drive would be a waste of disk space. I opted to get a big SD card (a 512G micro SDXC SanDisk Ultra, to be precise), plug it permanently into my laptop, and use it to store rarely changing data. It is a slow card, but it is cheap.

My system is ZFS-only, so it makes sense to format the SD card with ZFS. The question is: what value of the all-important ashift should I use to reduce wear and not slow the already unimpressive write speed even more?

Below are just my speculations, backed up by flawed experiments with different ashift values, and they need to be taken with a grain of salt.

Like all flash storage devices, an SD card can't just replace data in a single 512-byte sector the way spinning-rust disks can. All operations are done on chunks of data at least one AU (Allocation Unit) in size. Replacing a sector can end up with the SD card controller doing 3 operations: read the old data, write the old + new data, erase the old block (refer to 4.13.1.3). ZFS copy-on-write does not help here, because we would need to write a whole AU of data to avoid the 3-step process, and the AU seems to be too big for ZFS.
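
That 3-step process can be sketched as a toy model (the 64K AU and the function below are made-up simplifications for illustration, not the card's real controller behaviour):

```python
# Toy model of the read-modify-write an SD controller does when a write
# is smaller than one AU. The 64K AU and this API are assumptions.

AU_SIZE = 64 * 1024  # assumed AU size for the toy model


def rewrite_sector(old_au: bytes, offset: int, new_data: bytes) -> bytes:
    """Replace a small range inside an AU, paying the full-AU cost."""
    # step 1: read the whole old AU from flash
    buf = bytearray(old_au)
    # step 2: write old + new data into a freshly programmed AU
    buf[offset:offset + len(new_data)] = new_data
    # step 3: the old AU is erased and returned to the free pool
    return bytes(buf)


au = bytes(AU_SIZE)                          # an AU full of zeroes
au = rewrite_sector(au, 512, b"\xff" * 512)  # a 512-byte write still touches the whole AU
```

However small the logical write, the card's unit of work stays one AU.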

So the main problem is to identify the AU size of a card and match ashift to it. It is not easy; I spent a good amount of time reading through the SD Specification but did not get a definitive answer.

There are a couple of sources of information. The first one is the CSD register. CSD stands for Card Specific Data.
I used the csdinfo script to decode the CSD:

cat /sys/devices/.../mmc_host/mmc0/mmc0:aaaa/csd | python3 csdinfo.py

There are some interesting values in the CSD. BTW: there are 2 versions of the CSD structure.

  • CSD_STRUCTURE Field structures of the CSD register differ depending on the Physical Layer Specification Version
    and Card Capacity.

    my card uses CSDv2

  • ERASE_BLK_EN
    • SDC v1 - ERASE_BLK_EN defines the granularity of the unit size of the data to be erased. If ERASE_BLK_EN=0, the host can erase one or multiple units of SECTOR_SIZE. If ERASE_BLK_EN=1, the host can erase one or multiple units of 512 bytes.
    • SDC v2 - This field is fixed to 1, which means the host can erase one or multiple units of 512 bytes. I read this as: TRIM may be supported.

    for my card it is 0, which should not be possible

  • SECTOR_SIZE
    • SDC v1 - The size of an erasable sector. The content of this register is a 7-bit binary coded value, defining the number of write blocks (see WRITE_BL_LEN). The actual size is computed by increasing this number by one. A value of zero means one write block, 127 means 128 write blocks.
    • SDC v2 - This field is fixed to 7Fh, which indicates 64 KBytes. This value is not related to erase operation. SDHC and SDXC Cards indicate memory boundary by AU size and this field should not be used.

    my card reports 128 blocks, which means 64K, but it seems to be useless for SDXC

  • READ_BL_LEN
    • SDC v1 - The maximum read data block length is computed as 2^READ_BL_LEN. The maximum block length might therefore be in the range 512…2048 bytes (see Chapter 0 for details). Note that in an SD Memory Card the WRITE_BL_LEN is always equal to READ_BL_LEN
    • SDC v2 - This field is fixed to 9h, which indicates READ_BL_LEN=512 Byte.

    my card reports 512 bytes, no surprises

  • WRITE_BL_LEN Same as READ_BL_LEN
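
For reference, those fields can be pulled out of the raw sysfs hex string with a few lines of Python. This is a sketch along the lines of what csdinfo.py does; the bit positions are my reading of the CSD layout in the SD Physical Layer spec:

```python
# Sketch of decoding a few CSD v2 fields from the sysfs hex string.
# Bit positions follow the SD Physical Layer spec's CSD layout.

def bits(raw: int, start: int, width: int) -> int:
    """Extract `width` bits starting at bit `start` (bit 0 = LSB)."""
    return (raw >> start) & ((1 << width) - 1)

def decode_csd(csd_hex: str) -> dict:
    raw = int(csd_hex, 16)  # the 128-bit register as one integer
    return {
        "CSD_STRUCTURE": bits(raw, 126, 2),    # [127:126], 1 means CSD v2
        "READ_BL_LEN": 1 << bits(raw, 80, 4),  # [83:80], 9 -> 512 bytes
        "ERASE_BLK_EN": bits(raw, 46, 1),      # [46]
        "SECTOR_SIZE": bits(raw, 39, 7) + 1,   # [45:39], in write blocks
    }
```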

So it seems that the CSD data is useless for determining the optimal ashift, and everything boils down to the AU size.

4.11 Memory Array Partitioning
AU (Allocation Unit): is a physical boundary of the card and consists of one or more blocks and its size depends on each card. The maximum AU size is defined for memory capacity. Furthermore AU is the minimal unit in which the card guarantees its performance for devices which complies with Speed Class Specification. The information about the size and the Speed Class are stored in the SD Status. AU is also used to calculate the erase timeout
13.2.1.1 AU
Capacities of up to 2TB and the UHS high speed interface require larger AU sizes. In the case of SDXC the maximum AU size is increased to 64MB. To record the stream data, a Speed Class host shall manage the memory area in units of an AU and use only completely free AUs (zero fragmentation) to record the data.
Note: that all AU sizes larger than 4MB are integer multiples of 4MB and performance is measured over each 4MB sub-unit of an AU.

The problem is I can't find a way to determine the size of the AU on my card. The card returns it as a part of 4.10.1 Card Status, which is part of 4.9.1 R1 (normal response command). I'm not sure what that means exactly, but the AU can be from 16KB to 64MB.

There is also UHS_AU_SIZE. This 4-bit field indicates the AU size for UHS-I cards, and it varies from 1MB to 64MB. My card is UHS-I, so I would assume that the AU for my card is at least 1MB. The biggest ashift gives us 64K chunks, so the AU is way bigger than ZFS can use.
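
To put that mismatch in numbers, here is a back-of-the-envelope worst case assuming a 4MB AU (the real AU size of my card is unknown; UHS_AU_SIZE only tells me it is at least 1MB):

```python
# Worst-case write amplification if every ZFS block update forces the
# card to rewrite one whole AU. The 4 MB AU size is an assumption.

AU = 4 * 1024 * 1024

for ashift in (12, 14, 16):
    block = 1 << ashift
    print(f"ashift={ashift}: {block // 1024}K block, "
          f"worst-case amplification x{AU // block}")
```

Bigger blocks don't eliminate the read-modify-write, they just shrink the factor, which is the whole argument for the largest supported ashift.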

In the end I decided to go with the biggest ashift supported, to reduce the number of copy-AU-and-add-some-data events. I also disabled auto-TRIM; I guess periodic TRIM would be better for card life. ashift=16 translates to 64K blocks.

sudo zpool create -m none -o ashift=16 -o autotrim=off sdcard /dev/mmcblk0

I also ran some speed tests on the empty card with different ashift values, and the same test on a nearly full card with ashift=16 only.

  • the write test was
    time cp /tmp/ubuntu-21.10-live-server-amd64.iso /mnt/tmp/sdcard/F1.iso && time zpool export sdcard
    

    I repeated it 3 times for each ashift, every time creating a new file on the SD card. zpool export guarantees that all data was written to the card.

  • the read test was run after the pool was exported/imported, so the ARC was empty.
     time cp /mnt/tmp/sdcard/F1.iso  /dev/null
    

Results (in seconds, for a 1229209 KB test file):

operation   ashift=12   ashift=14   ashift=16 (empty)   ashift=16 (90% full)
write       75          75          77                  83
read        18          18          18                  18
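
Converted to throughput (my arithmetic from the table above, not separately measured values):

```python
# Rough MB/s numbers derived from the timing table.

size_mb = 1229209 / 1024  # test file size in MB

for label, seconds in [
    ("write, ashift=12/14, empty", 75),
    ("write, ashift=16, 90% full", 83),
    ("read, any ashift", 18),
]:
    print(f"{label}: {size_mb / seconds:.1f} MB/s")
```

That works out to roughly 16 MB/s writes and 67 MB/s reads, which sounds about right for a cheap UHS-I card.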

The results are inconclusive; to make them useful I would need to repeat the same tests on a nearly full card with different ashift values, but that would take too long and I'm too lazy. So the only conclusion I can make from the experiment is that write performance is about the same when the card is nearly full.

Updated: