Filtering Devices with LVM Devices File
2026-02-18

To control which devices LVM can work with, it has always been possible to configure filtering in the devices section of the /etc/lvm/lvm.conf configuration file. Filtering devices this way was not very simple, though, and could lead to problems when using paths like /dev/sda that are not stable. Many users also didn't know this possibility existed, and while this type of filtering can be applied to a single command with the --config option, it is not very user friendly. This all changed recently with the introduction of the new configuration file /etc/lvm/devices/system.devices and the corresponding lvmdevices command in LVM 2.03.12. A new --devices option was also added to the existing LVM commands as a quick way to limit which devices one specific command can use.

LVM Devices File

As was said above, there is a new /etc/lvm/devices/system.devices configuration file. When this file exists, it controls which devices LVM is allowed to scan. Instead of relying on matching the device path, the devices file uses stable identifiers like WWID, serial number or UUID.

A devices file on a simple system with a single physical volume on a partition would look like this:

# LVM uses devices listed in this file.
# Created by LVM command vgimportdevices pid 187757 at Fri Feb 13 16:44:45 2026
# HASH=1524312511
PRODUCT_UUID=4d58d0c1-8b67-4fa6-a937-035d2bfbb220
VERSION=1.1.1
IDTYPE=devname IDNAME=/dev/sda2 DEVNAME=/dev/sda2 PVID=rYeMgwy0mO0THDagB6k8mZkoOSqAWfte PART=2

When the devices file is enabled, LVM will only scan and operate on devices listed in it. Any device not present in the file is invisible to LVM, even if it has a valid PV header.

This is the biggest change brought in with this feature. The old lvm.conf based filters were always optional and LVM always scanned all devices in the system, unless told otherwise. This could cause problems on systems with many disks, where LVM (especially during boot) could take a long time scanning devices that did not even “belong” to it.

By default, the LVM devices file is enabled with the latest versions of LVM and, on systems without preexisting volume groups, creating new LVM setups with commands like pvcreate or vgcreate will automatically add the new physical volumes to the devices file. If desired, this feature can be disabled by setting use_devicesfile=0 in lvm.conf or by simply removing the existing devices file. On systems without the devices file, LVM will simply scan all devices in the system the same way it did before the introduction of this configuration file.
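A quick way to see whether a particular machine uses the devices file is to look for the file itself; disabling the feature is a matter of one setting in lvm.conf (a minimal sketch, assuming the default paths):

# If this file exists (and use_devicesfile is not set to 0), LVM uses it
cat /etc/lvm/devices/system.devices

# To disable the feature, add this to the devices section of /etc/lvm/lvm.conf:
# devices {
#     use_devicesfile = 0
# }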

Managing Devices with lvmdevices and vgimportdevices

On most newly installed systems with LVM, the devices file should be already present and populated, but you might want to either create it later on systems installed with an older version of LVM, or manage some devices manually. It is possible to modify the system.devices manually, but a new command lvmdevices was added for simple management of the file.

To import all devices of an existing volume group, vgimportdevices <vgname> can be used; to import all volume groups in the system, use vgimportdevices -a.

A single physical volume can be added to the file with lvmdevices --adddev and removed with lvmdevices --deldev.

To check all entries in the devices file, lvmdevices --check can be used and any issues found by the check command can be fixed with lvmdevices --update.
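Put together, a typical session with these commands might look like the following sketch (the volume group and device names are just examples):

vgimportdevices -a                 # import the PVs of all existing volume groups
vgimportdevices myvg               # or import a single volume group

lvmdevices                         # list the current entries
lvmdevices --adddev /dev/sdd1      # add a single physical volume
lvmdevices --deldev /dev/sdd1      # remove it again

lvmdevices --check                 # report stale or mismatched entries
lvmdevices --update                # fix them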

Backups

In the sample devices file above, you might have noticed the VERSION field. This is the current version of the file. LVM automatically makes a backup of the file with every change, and old versions can be found in the /etc/lvm/devices/backup directory. So if you make a mistake when changing the file with lvmdevices, you can simply restore a previous version of it.
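Restoring simply means copying one of the saved versions back over system.devices; the backup file name below is only a placeholder, check the directory for the actual names:

ls /etc/lvm/devices/backup/
cp /etc/lvm/devices/backup/<saved-version> /etc/lvm/devices/system.devices
lvmdevices --check                 # verify the restored file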

Overriding the Devices File and Filtering with Commands

Together with the devices file feature, a new option --devices was added to all LVM commands. This option specifies which devices are visible to the command. It overrides the existing devices file, so it can be used either to restrict the command to a subset of the devices listed in the devices file, or to let it run on devices not listed in the file at all.

This option is also very useful when dealing with multiple volume groups with the same name. This is a known limitation of LVM – two volume groups with the same name cannot coexist in one system and LVM will refuse to work without renaming one of them. This can be a problem when dealing with cloned disks or backups. With --devices, commands like vgs can be restricted to “see” only one of the volume groups.
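For instance, if a cloned disk shows up as /dev/sdc (an example name) and carries a volume group named the same as the running system's, the clone can be inspected without renaming anything:

vgs --devices /dev/sdc             # only the VG on the clone is visible to this command
lvs --devices /dev/sdc             # the option can be repeated or given a comma-separated list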

Issue: Missing Volume Group

As mentioned above, when installing a new system with LVM, the devices used for the newly created volume groups will be added to the devices file. The Fedora (and RHEL) installer, Anaconda, will also add all other volume groups present during installation to the devices file, so these will also be visible in the installed system. The problems start when a device with a volume group is added to the system after installation. The volume group (and any logical volumes in it) is suddenly invisible: even commands like vgs will simply ignore it, because its physical volumes are not listed in the devices file.

This can be a problem on dual boot systems with encryption. Because the second system’s volume group is “hidden” by the encryption layer, it is not visible during installation and not added to the devices file. When the user unlocks the LUKS device in their newly installed system, they can’t access their second system. Unfortunately in this situation, the only solution is to manually add the second system’s volume group with vgimportdevices as described above.
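A minimal recovery sketch for that case (device, mapping and volume group names are only examples) could look like this:

cryptsetup open /dev/sdb3 other-root     # unlock the second system's LUKS device
vgimportdevices other_vg                 # scan for the VG and add its PVs to the devices file
vgchange -ay other_vg                    # the VG and its logical volumes are accessible again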

Conclusion

The LVM devices file provides a cleaner and more reliable way to control which devices LVM uses, replacing the old lvm.conf based filtering with stable device identifiers and simple management through the lvmdevices command. Overall, for most users the devices file should work transparently without any manual configuration needed.

Vojtech Trefny
ATA SMART in libblockdev and UDisks
2026-01-30

For a long time there was a need to modernize the way UDisks retrieves ATA SMART data. The ageing libatasmart project went unmaintained over time, yet there was no real alternative: the smartmontools project has its smartctl command, but its console output was rather clumsy to parse. It became apparent that we needed to decouple the SMART functionality and create an abstraction.

libblockdev-3.2.0 introduced a new smart plugin API tailored to UDisks' needs, first used by the udisks-2.10.90 public beta release. We didn't receive much feedback for this beta release, so the code was released as the final 2.11.0 release about a year later.

While the libblockdev-smart plugin API is the single public interface, we created two plugin implementations right away - the existing libatasmart-based solution (plugin name libbd_smart.so) that was mostly a straight port of the existing UDisks code, and a new libbd_smartmontools.so plugin based around smartctl JSON output.

Furthermore, there's a promising initiative going on: the libsmartmon library. If that ever materializes, we'd like to build a new plugin around it, likely deprecating the smartctl JSON-based implementation. Contributions are welcome; this effort deserves more public attention.

Which plugin actually gets used is controlled by the libblockdev plugin configuration: see /etc/libblockdev/3/conf.d/00-default.cfg or, if that file is absent, the built-in defaults at https://github.com/storaged-project/libblockdev/blob/master/data/conf.d/00-default.cfg. Distributors and sysadmins are free to change the preference, so be sure to check it. And whenever you're about to submit a bug report upstream, please specify which plugin you use.
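A quick way to check what the local configuration says (paths as shipped by default; just a sketch):

# Show the distribution/sysadmin override if one is installed
cat /etc/libblockdev/3/conf.d/00-default.cfg 2>/dev/null \
    || echo "no local configuration file, built-in defaults apply"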

Plugin differences

libatasmart plugin:

  • small library, small runtime I/O footprint
  • the preferred plugin, stable for decades
  • libatasmart unmaintained upstream
  • no internal drive/quirk database, possibly reporting false values for some attributes

smartmontools plugin:

  • well-maintained upstream
  • extensive drivedb, filtering out any false attribute interpretation
  • experimental plugin, possibly to be dropped in the future
  • heavy on runtime I/O due to additional device scanning and probing (ATA IDENTIFY)
  • forking and calling smartctl

Naturally, the available features vary across plugin implementations, and though we tried to abstract the differences away as much as possible, certain gaps remain.

The libblockdev-smart API

Please refer to our extensive public documentation: https://storaged.org/libblockdev/docs/libblockdev-SMART.html#libblockdev-SMART.description

Apart from ATA SMART, we also laid the foundation for SCSI/SAS SMART, though it is currently unused in UDisks and essentially untested. Note that NVMe Health Information has been available through the libblockdev-nvme plugin for a while and is not subject to this API.

Attribute names & validation

We spent a great deal of effort providing unified attribute naming, consistent data type interpretation and attribute validation. While libatasmart mostly provides raw values, smartmontools benefits from its drivedb and provides a better interpretation of each attribute value.

For the public API we had to make a decision about attribute naming style. While libatasmart provides only a single style with no variations, we discovered lots of inconsistencies just by grepping drivedb.h. For example, attribute ID 171 translates to program-fail-count with libatasmart, while smartctl may report variations like Program_Fail_Cnt, Program_Fail_Count, Program_Fail_Ct, etc. And since UDisks has historically provided untranslated libatasmart attribute names, we had to create a translation table from drivedb.h names to libatasmart names. Check this atrocity out in https://github.com/storaged-project/libblockdev/blob/master/src/plugins/smart/smart-private.h. The table is by no means complete, just a bunch of commonly used attributes.

Unknown attributes, or those that fail validation, are reported with a generic name such as attribute-171. For this reason, consumers of the new UDisks release (e.g. GNOME Disks) may spot some differences, and perhaps more attributes reported as unknown, compared to previous UDisks releases. Feel free to submit fixes for the mapping table; we've only tested this on a limited set of drives.

Oh, and we also fixed the notoriously broken libatasmart drive temperature reporting, though the fix is not 100% bulletproof either.

We’ve also created an experimental drivedb.h validator on top of libatasmart, mixing the best of both worlds, with uncertain results. This feature can be turned on by the --with-drivedb[=PATH] configure option.

Disabling ATA SMART functionality in UDisks

The UDisks 2.10.90 release also brought a new configure option, --disable-smart, to disable ATA SMART support completely. Exceptionally, this was possible without breaking the public ABI, because the API provides the Drive.Ata.SmartUpdated property, indicating the timestamp at which the data were last refreshed. When SMART is disabled at compile time, this property always remains set to zero.
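Clients can therefore detect whether SMART data are being collected at all by reading that property; a hedged sketch using busctl (the drive object path is an example, substitute your own):

busctl get-property org.freedesktop.UDisks2 \
    /org/freedesktop/UDisks2/drives/Example_Drive_1234 \
    org.freedesktop.UDisks2.Drive.Ata SmartUpdated
# "t 0" means the data were never refreshed (or SMART support was compiled out)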

We also made SMART data retrieval work with dm-multipath to avoid accessing particular device paths directly and tested that on a particularly large system.

Drive access methods

The ID_ATA_SMART_ACCESS udev property (see man udisks(8)) controls the access method for a drive. It used to be a very well hidden secret, only found by accident while reading the libatasmart code, even though it had been in place for over a decade. Only udisks-2.11.0 learned to respect this property in general, no matter which libblockdev-smart plugin is actually used.

Those who prefer UDisks to avoid accessing their drives at all may want to set this ID_ATA_SMART_ACCESS udev property to none. The effect is similar to compiling UDisks with ATA SMART disabled, though this allows fine-grained control with the usual udev rule match constructions.
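As a rough sketch (the rule file name and the ID_SERIAL match value are made up for illustration), such a rule could be installed like this:

cat > /etc/udev/rules.d/99-no-smart.rules <<'EOF'
# keep UDisks away from this particular drive's SMART data
SUBSYSTEM=="block", ENV{ID_SERIAL}=="Example_Drive_1234", ENV{ID_ATA_SMART_ACCESS}="none"
EOF
udevadm control --reload
udevadm trigger --subsystem-match=block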

Future plans, nice-to-haves

Apart from high hopes for the aforementioned libsmartmon library effort there are some more rough edges in UDisks.

For example, the housekeeping code could use refactoring to allow arbitrary intervals for specific jobs or even for particular drives, instead of the fixed 10-minute interval that is also used for SMART data polling. Furthermore, some kind of throttling or a constrained worker pool should be put in place, both to avoid spawning all jobs at once (think of spawning smartctl for a hundred drives at the same time) and to avoid bottlenecks where one slow housekeeping job blocks the rest of the queue.

Finally, we'd like to make SMART data retrieval via USB passthrough work. If that happened to work in the past, it was pure coincidence. After receiving dozens of bug reports citing spurious kernel failure messages that often led to a USB device being disconnected, we disabled our ATA device probes for USB devices. As a result, the org.freedesktop.UDisks2.Drive.Ata D-Bus interface never gets attached for USB devices.

Tomáš Bžatek
Partitioning with Ansible Storage Role: Partitions
2025-12-15

The storage role has always allowed creating and managing different storage technologies like LVM, LUKS encryption or MD RAID, but one technology seemed to be missing for a long time, and surprisingly it was the most basic one: actual partitioning. Support for partition management was always planned for the storage role, but it was never a high priority. From the start, the role could create partitions: when creating a more complex storage setup on an empty disk, for example a new LVM volume group or a new physical volume added to an existing LVM setup, the role would automatically create a single partition on the disk. But that was all the role could do, just one single partition spanning the entire disk.

The reason for this limitation was simple: creating multiple partitions is something usually reserved for the OS installation process, where users need the separate partitions required by the bootloader, like /boot and /boot/efi. The more advanced “partitioning” is then delegated to more complex storage technologies like LVM, which is where most changes are made in an existing system and where users will usually employ Ansible to make changes later.

But the requirement for more advanced partition management was always there, and since the 1.19 release, the role can now create and manage partitions in the Ansible way.

Partition Management with Storage Role

Using the role for partition management is simple and follows the same logic as the other storage technologies, with management divided into two parts: the storage_pools, which in the case of partitions represent the underlying disk (or, to be more precise, the partition table), and the volumes, which are the partitions themselves. A simple playbook to create two partitions on a disk can look like this:

  roles:
    - name: linux-system-roles.storage
      storage_pools:
        - name: sdb
          type: partition
          disks: sdb
          volumes:
            - name: sdb1
              type: partition
              size: 1 GiB
              fs_type: ext4
            - name: sdb2
              type: partition
              size: 10 GiB
              fs_type: ext4

and the partitions it creates will look like this:

NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS FSTYPE
sdb      8:16   0  20G  0 disk
├─sdb1   8:17   0   1G  0 part             ext4
└─sdb2   8:18   0  10G  0 part             ext4

Other filesystem-related properties (like mount_point or fs_label) can be specified, and these work in the same way as for any other volume type.

The only property specific to partitions is part_type, which allows you to choose a partition type when using the MBR/MSDOS partition table. Supported types are primary, logical and extended. If you don't specify the partition type, the role will create the first three partitions as primary; for the fourth one it will add an extended partition and create the fourth as a logical partition inside it. On GPT, which is used as the default partition table, the partition type is ignored.

Encrypted partitions can be created by adding the encryption: true option for the partition and setting the passphrase:

  roles:
    - name: linux-system-roles.storage
      storage_pools:
        - name: sdb
          type: partition
          disks: sdb
          volumes:
            - name: sdb1
              type: partition
              size: 1 GiB
              fs_type: ext4
              encryption: true
              encryption_password: "aaaaaaaaa"
            - name: sdb2
              type: partition
              size: 10 GiB
              fs_type: ext4
              encryption: true
              encryption_password: "aaaaaaaaa"

Don’t forget that adding the encryption layer is a destructive operation – if you run the two playbooks above one after another, the filesystems created by the first one will be removed, and all data on them will be lost. Adding the LUKS encryption layer to an existing filesystem without losing its data (so-called re-encryption) is currently not supported by the role.

Idempotency and Partition Numbers

One of the core principles of Ansible is idempotency: the ability to re-run the same playbook and, if the system is already in the state the playbook describes, have it make no changes.

This is true for partitioning with the storage role as well. When running the playbook from our example above a second time, the role will check the sdb disk and look for the two specified partitions, and if it finds two partitions of 1 and 10 GiB, it won't do anything. This is how the role works in general, but with partitions there is a new challenge: partitions don't have unique names, and using partition numbers for idempotency can be tricky.

Did you know that partition numbers for logical partitions are not stable? If you have two logical partitions sdb5 and sdb6, removing the sdb5 partition will automatically re-number the sdb6 partition to sdb5.
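This is easy to demonstrate with parted alone; the following sketch uses a placeholder /dev/sdX and throwaway sizes, so do not run it against a disk holding data:

parted -s /dev/sdX mklabel msdos
parted -s /dev/sdX mkpart extended 1MiB 100%
parted -s /dev/sdX mkpart logical ext4 2MiB 4GiB     # becomes sdX5
parted -s /dev/sdX mkpart logical ext4 5GiB 9GiB     # becomes sdX6
parted -s /dev/sdX rm 5                              # the former sdX6 is renumbered to sdX5
lsblk /dev/sdX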

Predicting the partition name is not always straightforward. For example, disks that end in a number (common with NVMe drives) require adding a p separator before the partition number (nvme0n1 becomes nvme0n1p1).

For these reasons, the role requires explicitly using the state: absent option to remove a partition, and partitions can be referred to by their numbers in the playbooks as well as their full names. So, for example, the following playbook will resize the sdb2 partition from our first example

  roles:
    - name: linux-system-roles.storage
      storage_pools:
        - name: sdb
          type: partition
          disks: sdb
          volumes:
            - name: 2
              type: partition
              size: 15 GiB
              fs_type: ext4

and the first partition won’t be removed, because it is not explicitly mentioned as absent, only omitted in the playbook:

NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS FSTYPE
sdb      8:16   0  20G  0 disk
├─sdb1   8:17   0   1G  0 part             ext4
└─sdb2   8:18   0  15G  0 part             ext4

Feedback and Future Features

With this change, the storage role can now manage all basic storage technologies. We are of course not yet covering all the potential features, but we are always looking for more ideas from our users. If there are any features you'd like to see in the role, please don't hesitate to let us know.

Vojtech Trefny
Partitioning with Ansible Storage Role: VDO
2023-10-05

This time we shall talk about the Storage Role's support for VDO. The abbreviation stands for Virtual Data Optimizer, and that is exactly what it does: it reduces the size of stored data to save space. To be precise, the Storage Role uses the LVM version of VDO, called (what a surprise) LVM VDO.

How Does VDO Do It

There are two main options VDO uses to reduce the data size:

  • Data compression
  • Data deduplication

Data compression works just like regular file compression, except that VDO packs and unpacks blocks of data automatically and at a lower level, so the user does not even know it is happening.

The same goes for data deduplication: VDO identifies duplicate blocks of data, removes the redundant copies and keeps a single remaining copy of the block that serves all references to it.

Compression and deduplication can each be turned on or off when the VDO device is created.

Using VDO in Storage Role

By now we should be pretty confident about how to use the Storage Role, and using VDO is not much different.

To use it, a storage_pool has to be created. Then set one or both of the compression and deduplication options to true on one of its volumes; this tells the role to use VDO. Please note that the Storage Role currently supports only one VDO volume per storage_pool.

You also want to set both the vdo_pool_size and size options.

Why two sizes? The first, represented by the vdo_pool_size option, is the actual physical space reserved for the compressed data.

The other option, size, tells the device how it should present itself to the outside. This value is virtual and can (and is supposed to) be larger than the reserved physical space. By how much is left to the user's discretion and should be based on an estimate of how compressible the data is.
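The distinction maps directly onto what LVM VDO itself does; creating a comparable device by hand (a rough sketch with example names, not necessarily the exact commands the role runs) would look roughly like this:

# 9 GiB of physical pool space presenting a 12 GiB virtual logical volume
lvcreate --type vdo --name test1 --size 9G --virtualsize 12G vg1
mkfs.ext4 /dev/vg1/test1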

The playbook for VDO creation then should look like this:

    ---
    - hosts: all
      become: true
      vars:
        storage_safe_mode: false

      tasks:
        - name: Create LVM VDO volume under volume group 'vg1'
          include_role:
            name: linux-system-roles.storage
          vars:
            storage_pools:
              - name: vg1
                disks:
                  - "/dev/sda"
                  - "/dev/sdb"
                  - "/dev/sdc"
                volumes:
                  - name: test1
                    compression: true
                    deduplication: true
                    vdo_pool_size: "9 GiB"  # space taken on disk
                    size: "12 GiB"          # virtual space
                    mount_point: "/opt/test1"
                    state: present

Things to Know Before Creating a VDO Device

As with all of the more advanced features the Storage Role provides, VDO is meant for specific use cases.

Some data, such as logs, are much easier to compress or deduplicate, which makes them much better candidates. On the other hand, using VDO with data that are modified often, or already scrambled by encryption, can result in nothing but additional strain on resources.

Data that cannot be easily deduplicated or compressed can also lead to a situation where the user runs out of physical storage space while VDO still shows plenty of free space left.
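It is therefore worth keeping an eye on how full the physical VDO pool actually is; one simple way to do that (a sketch, assuming the vg1 volume group from the playbook above) is:

lvs -a -o lv_name,lv_size,data_percent vg1    # data_percent shows how much of the physical pool is used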

Since the system has no way of telling what kind of data are eventually going to be put on which device, the responsibility of choosing wisely falls upon the user.

Couple of Tips at the End

And that’s it. As I already mentioned, the Storage Role uses LVM VDO, so its man pages are a good place to start if you want to know more about it. For more general information about VDO, you can also check the VDO project on GitHub.

Jan Pokorny