Intel® Virtual RAID on CPU (Intel® VROC) RAID Write Hole (RWH) Closure in Linux* Environments
Environment
Intel® VROC for Linux*
Intel® Virtual RAID on CPU (Intel® VROC) can protect RAID 5 data even when an unexpected power loss and RAID volume degradation occur at the same time. This double-fault condition is referred to as the RAID Write Hole (RWH). Intel® VROC can close the RWH in RAID 5 configurations using a feature called RWH Closure. This applies to Intel® VROC enabled platforms.
Note
The below information describes the specific behavior of the Intel® VROC RWH Closure feature in Linux* environments. To learn about the Intel® VROC RWH Closure feature in general, refer to Intel® Virtual RAID on CPU (Intel® VROC) RAID Write Hole (RWH) Closure.
Intel® VROC for Linux* implements distributed Partial Parity Logging (PPL) to close the RWH scenario. PPL is disabled by default. It can be enabled explicitly with the mdadm utility when the RAID 5 volume is created, and it can be enabled or disabled later on an active RAID 5 volume. With PPL enabled, the array does not need a resync after a dirty shutdown.
Enabling/Disabling the RWH Closure Feature
Intel® VROC for Linux* supports enabling or disabling the RWH protection feature through the mdadm utility when a RAID 5 volume is being created. The options can disable the feature or define the PPL configuration (with Intel® VROC 8.0 or higher, multiple PPL usage is automatic). The former --rwh-policy parameter no longer exists; it has been replaced by the --consistency-policy parameter (short form -k).
An example command to create the RAID volume with PPL configuration is the following:
$ mdadm --create /dev/md/volume -l5 --size=1G --consistency-policy=ppl -n3 /dev/sd[a-c]
To enable/disable the PPL configuration during runtime for an active RAID volume, the following command can be used:
$ mdadm --grow /dev/md/volume --consistency-policy=[ppl | resync]
In the above example command, use ppl to enable the PPL configuration or resync to disable it. On success, the command does not return any output. Verify the result by checking the details of the RAID volume: after PPL is enabled, the Consistency Policy value in the volume details should be ppl; after PPL is disabled, it should be resync. The default Consistency Policy value is resync.
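Because a successful --grow prints nothing, the check described above is easy to script. The following is a minimal sketch, not part of the Intel® VROC tooling: the function names consistency_policy and expect_policy are hypothetical, and the sketch assumes that the volume details contain a line of the form "Consistency Policy : <value>".

```shell
# Sketch only: the helper names below are illustrative, not mdadm features.

# Read `mdadm --detail` output on stdin and print the Consistency Policy value.
consistency_policy() {
    awk -F':' '/Consistency Policy/ { gsub(/[ \t]+/, "", $2); print $2; exit }'
}

# Compare the policy found on stdin with the expected one (ppl or resync).
expect_policy() {
    expected=$1
    actual=$(consistency_policy)
    if [ "$actual" = "$expected" ]; then
        echo "OK: consistency policy is $actual"
    else
        echo "MISMATCH: expected $expected, found ${actual:-nothing}" >&2
        return 1
    fi
}

# Example use (requires root and an existing volume):
#   mdadm --detail /dev/md/volume | expect_policy ppl
```

The helpers only parse text, so they can be tried against saved output before being used on a live system.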
RWH Recovery
The Intel® VROC driver for Linux* can recover a RAID 5 volume from an invalid state caused by any of the following:
- An RWH condition occurrence for all RAID 5 volumes in the system that were exposed to I/O interruption (such as dirty shutdown).
- An RWH condition occurrence when the RAID 5 volume is discovered by the driver after hot plug of all member drives except the failed drive.
- An RWH condition occurrence when the RAID 5 volume is discovered by the driver during the driver load process.
- An RWH condition occurrence when the RAID 5 volume is discovered by the driver after enabling all the member drives except the failed drive in the device management utility.
RWH Closure Considerations
Disable On-Device Cache for NVMe*
The RWH Closure feature is intended to be used with the NVMe* onboard volatile cache disabled. Disable the on-device cache in the NVMe* drive properties before enabling the RWH Closure feature. If the Intel® VROC driver for Linux* is installed and an attempt is made to enable on-device cache on a member drive of a RAID 5 volume that has RWH Closure enabled, a warning is added to syslog stating that PPL is meant to be used with the on-device volatile cache disabled.
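One way to check the cache state before enabling PPL is to read the block-layer write_cache attribute in sysfs. The sketch below is an assumption-laden illustration, not an Intel® VROC procedure. The function name write_cache_state is hypothetical, the path /sys/block/nvme0n1/queue/write_cache assumes the usual Linux sysfs layout, and the attribute reflects the kernel's view of the device, which may differ from the drive's own setting.

```shell
# Sketch only: write_cache_state is an illustrative helper name.
# Arg 1: path to a queue/write_cache attribute, for example
#        /sys/block/nvme0n1/queue/write_cache (assumed sysfs layout).
write_cache_state() {
    attr=$1
    if [ ! -r "$attr" ]; then
        echo "unknown"
        return 1
    fi
    case "$(cat "$attr")" in
        "write back")    echo "enabled" ;;   # volatile cache in use
        "write through") echo "disabled" ;;  # no volatile cache reported
        *)               echo "unknown" ;;
    esac
}

# Example use on each member drive:
#   write_cache_state /sys/block/nvme0n1/queue/write_cache
```

If nvme-cli is available, the Volatile Write Cache feature (feature ID 0x06) can also be queried with nvme get-feature. Treat that as an alternative to verify on the target platform rather than a documented Intel® VROC step.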
RWH Closure PPL Distributed Mode
The RWH Closure feature implementation in Intel® VROC for Linux* closes the RAID 5 RWH condition without the use of additional drives. This is referred to as the PPL Distributed mode of the RWH Closure feature.
Runtime Switching Between RWH Closure Modes
Intel® VROC for Linux* lets the user enable or disable the RWH Closure feature on existing RAID 5 volumes during normal operating system operation, through the mdadm utility. The options can disable the feature or define the PPL configuration (PPL or multiple PPLs).
Interrupted PPL Write
If the PPL write request has been interrupted and the PPL was not fully written, the RWH recovery process will not be performed for this particular RAID 5 I/O request.
Ability to Switch Between RWH Closure Modes for SATA
For SATA RAID 5 volumes, Intel® VROC for Linux* allows the user to switch between the following RWH Closure modes during normal operating system operation: PPL Distributed mode and the Off state.
RWH Closure Restrictions
The following are restrictions of the RWH Closure feature:
- Intel® VROC for Linux* will block the expansion of a RAID 5 volume that is being protected by the RWH Closure feature.
- Intel® VROC for Linux* will block the changing of the strip size of a RAID 5 volume that is being protected by the RWH Closure feature.
- Intel® VROC for Linux* will block adding a drive to an existing RAID 5 volume that is being protected by the RWH Closure feature.
- When the system discovers a RAID 5 volume with RWH Closure enabled using Journaling Drive mode, Intel® VROC for Linux* will disable the RWH Closure feature. The Linux* environment does not support RWH Closure using Journaling Drive mode.
RWH Closure Configuration Example
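To create a RAID 5 volume with RWH Closure enabled, run the following commands. The first command creates the IMSM container and the second creates the RAID 5 volume with PPL enabled. It is recommended to clear the metadata from the member drives first.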
$ mdadm -C /dev/md/imsm0 -e imsm -n4 /dev/nvme[0-3]n1
$ mdadm -C /dev/md/vol0 -l5 -n4 /dev/nvme[0-3]n1 --consistency-policy=ppl
To check the current RWH Closure policy, use the below command:
$ mdadm -D /dev/md/vol0
To enable the RWH Closure feature for a running array, execute the below command:
$ mdadm --grow /dev/md/vol0 --consistency-policy=ppl
To disable the RWH Closure feature for a running array, execute the below command:
$ mdadm --grow /dev/md/vol0 --consistency-policy=resync
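The enable and disable commands above differ only in the policy value, so they can be wrapped in one helper that rejects anything other than ppl or resync. This is a minimal sketch: the function name set_consistency_policy and the MDADM override variable are conventions of this example, not mdadm or Intel® VROC features.

```shell
# Sketch only. MDADM can be overridden (for example MDADM=echo) to preview
# the command instead of running it; this variable is specific to this sketch.
MDADM=${MDADM:-mdadm}

# Arg 1: RAID 5 volume, for example /dev/md/vol0
# Arg 2: ppl (enable RWH Closure) or resync (disable it)
set_consistency_policy() {
    volume=$1
    policy=$2
    case "$policy" in
        ppl|resync) ;;
        *)
            echo "unsupported policy: $policy (expected ppl or resync)" >&2
            return 2
            ;;
    esac
    "$MDADM" --grow "$volume" --consistency-policy="$policy"
}

# Example use (requires root):
#   set_consistency_policy /dev/md/vol0 ppl
```

Setting MDADM=echo before calling the function prints the arguments that would be passed to mdadm, which is a convenient way to review the command before running it on a live array.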