SUMMARY
Software RAID / mdadm
ISSUE
*** Caution: For advanced users only. *** May cause irreparable damage! ***
Software RAID
The Linux Software RAID system is used in place of 3ware hardware-based RAID controllers in the following DPUs:
- Recovery-171
- Recovery-300
- Recovery-610
- Recovery-710
In these systems, software RAID is used on both the operating system and storage partitions. Software RAIDs are also used to complement 3ware RAIDs on the following systems:
- Recovery-720
- Recovery-730
In these systems, software RAID is used to stripe the operating system drives, and traditional hardware RAID is used for the storage partition.

Gathering information about the RAID

If a DPU is equipped with a software RAID, its configuration and status are reported in real time by viewing the contents of the /proc/mdstat file:
[root@BTFTWUNI002 ~]# cat /proc/mdstat
Personalities : [raid1] [raid0]
md5 : active raid0 md3[0] md4[1]
      2891000832 blocks 64k chunks
md1 : active raid1 sdb3[1] sda3[0]
      4008128 blocks [2/2] [UU]
md3 : active raid1 sdb5[1] sda5[0]
      1445873984 blocks [2/2] [UU]
md2 : active raid1 sdd1[1] sdc1[0]
      20008832 blocks [2/2] [UU]
md4 : active raid1 sdd2[1] sdc2[0]
      1445126976 blocks [2/2] [UU]
md0 : active raid1 sdb2[1] sda2[0]
      15004608 blocks [2/2] [UU]
unused devices: <none>
Each mdX entry refers to an active software RAID device, and takes the following form:
RAID device : state type device[number] device[number] …
      SIZE blocks [active devices / total devices] [deviceState deviceState …]
Each of these is a virtual device which is either used directly, or used as a constituent of another RAID device.
- The active keyword describes the state of the RAID device. All devices should be active in a normally functioning DPU.
- The raid1/raid0 keyword describes the type of this RAID.
- The items in the format sdXN[M] describe the devices which compose the RAID. These can be physical devices (starting with sd) or other RAID devices (starting with md). All Unitrends software RAIDs as of this writing are composed of exactly two other devices.
- The number in brackets following each device indicates the position of that device in the RAID. This is determined automatically and is rarely significant.
- The [UU] indicates the health of the RAID. Each U corresponds to a constituent device of the RAID. Failed, offline, or missing devices would be indicated with an underscore, as in [U_] or [_U].
There should never be any unused devices, since we do not configure spare devices into any of our software RAIDs.
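For more detail on a single RAID device than /proc/mdstat provides, mdadm can also report the state of an individual array. This is an optional check, and the device name below is only an example:

# Print detailed state, member list, and UUID for one RAID device
mdadm --detail /dev/md0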
Degraded RAIDs

A degraded software RAID with a failed device looks like this:
[root@sp-vdpu5 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sde[1] sdd[0]
      1048512 blocks [2/2] [UU]
md0 : active raid1 sdc[1] sdb[2](F)
      1048512 blocks [2/1] [_U]
unused devices: <none>
Here, the virtual disk device /dev/md0 is active but degraded, because one of its disks, /dev/sdb, has failed, as indicated by the (F) and the missing U. Since this is a mirrored array (RAID1), /dev/md0 can still be used unless /dev/sdc also fails.
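If you need to see why the kernel failed the device, the kernel log can be searched for messages about it. This is an optional diagnostic step, and the device name is only an example:

# Look for I/O errors or link resets reported against the failed disk
dmesg | grep -i sdb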
A degraded software RAID with a missing device looks like this:
[root@sp-vdpu5 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sde[1] sdd[0]
      1048512 blocks [2/2] [UU]
md0 : active raid1 sdc[1]
      1048512 blocks [2/1] [_U]
unused devices: <none>
Here, the virtual disk device /dev/md0 is also active and degraded. It's running off of the physical device /dev/sdc, but the other half of the mirror is missing.
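A quick way to see which physical devices and partitions the kernel currently knows about is to list the partition table it holds in memory. This is an optional check, and the exact output will vary by system:

# List every block device and partition the kernel currently sees
cat /proc/partitions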
Repairing Degraded RAIDs
mdadm is the command for creating, manipulating, and repairing software RAIDs.
Failed drive in RAID
If a device has failed, it must be removed before it can be re-added. Use the following command to remove all failed disks from a RAID. Note you must specify the particular RAID device in question:
mdadm --remove /dev/md0 failed
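If a misbehaving member has not yet been marked failed by the kernel, it can be failed explicitly and then removed by name instead. This is an optional variation, and the device names are only examples:

# Mark a specific member as failed, then remove it from the RAID
mdadm --fail /dev/md0 /dev/sdb
mdadm --remove /dev/md0 /dev/sdb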
Replace the drive as necessary, then add it back into the appropriate RAID:
mdadm --re-add /dev/md0 /dev/sdb
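If --re-add refuses the device (for example, because a newly swapped-in disk carries no matching RAID superblock), a plain --add can be used instead. The device names are only examples:

# Add the replacement disk as a new member; a rebuild starts automatically
mdadm --add /dev/md0 /dev/sdb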
Missing drive in RAID

More often than not, when a software RAID is degraded, the nature of the problem causes the drive to no longer show up at all:
md0 : active raid1 sdc2[1]
      1048512 blocks [2/1] [_U]
When a device is missing, the RAID system will not tell you which device used to be a part of that RAID. To re-assemble the RAID, note the device that is still in the RAID - in this case, it's /dev/sdc2. Consult the RAID configuration document for the model of DPU with which you're working to determine the corresponding device in the RAID pair. Generally speaking, /dev/sdaX and /dev/sdbX are always associated, while /dev/sdcX and /dev/sddX are always associated. So, since /dev/sdc2 is in this RAID, the missing complementary device is /dev/sdd2.
Add the missing complementary device with this command:
mdadm --re-add /dev/md0 /dev/sdd2
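If you want to confirm that a partition actually belongs to this array before re-adding it, mdadm can print the md superblock stored on the partition, including the array UUID. This is an optional check, and the device name follows the example above:

# Print the md superblock (array UUID, role, update time) recorded on the partition
mdadm --examine /dev/sdd2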
Monitoring rebuild status
Use this command to view the status of a rebuild:
[root@sp-vdpu5 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sde[1] sdd[0]
      1048512 blocks [2/2] [UU]
md0 : active raid1 sdb[0] sdc[1]
      1048512 blocks [2/1] [_U]
      [==============>......] recovery = 70.2% (737152/1048512) finish=0.1min speed=40952K/sec
unused devices: <none>
If more than one RAID using the same physical device needs to be rebuilt, only one will rebuild at a time, and you'll see a DELAYED message on the others.
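To watch rebuild progress continuously rather than re-running the command by hand, the output can be refreshed on an interval. This is an optional convenience:

# Re-display /proc/mdstat every 5 seconds until interrupted with Ctrl-C
watch -n 5 cat /proc/mdstat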
About physical device naming
When the DPU boots up, each physical disk drive is given a device name as the bus is enumerated. Each disk is discovered in the order in which it is connected to the bus, so the device in SATA port 0 is found before the device in SATA port 1. If a RAID card is present, every unit configured on the hardware RAID is also given a device name in order. Device names proceed alphabetically from sda:
/dev/sda /dev/sdb /dev/sdc …
Under normal circumstances, drives will always be discovered in the same order, and so will always receive the same assignment. For example, the drive in the second port of a Recovery 710 should always be assigned /dev/sdb.
If we remove the second drive from a cold Recovery 710 and then boot it, the /dev/sdb assignment will be given to the drive in the THIRD port, and /dev/sdc will be assigned to the fourth.
If we then insert the drive back into the second port, the drive will be assigned the next available drive label - /dev/sdd.
This is not a bug. Device names are not intended to statically identify the location of a drive.
The software RAID system is not directly affected by this, since it uses partition UUIDs instead of device names to identify devices. A previously existing RAID can be reconstructed regardless of the names associated with its constituent devices. If the RAID is missing a device, however, the engineer needs to identify the name of the device which should be re-added.
You can do this with the following command:
ls -d1 /sys/bus/scsi/devices/*/block*
/sys/bus/scsi/devices/0:0:0:0/block:sda
/sys/bus/scsi/devices/1:0:0:0/block:sdb
/sys/bus/scsi/devices/2:0:0:0/block:sdf
/sys/bus/scsi/devices/3:0:0:0/block:sdd
/sys/bus/scsi/devices/4:0:0:0/block:sr0
/sys/bus/scsi/devices/6:0:0:0/block:sde
This shows that the device on SATA port 0 is currently assigned device name sda, while the device on SATA port 2 is currently assigned device name sdf. sr0 refers to the optical drive.
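On systems where it is available, a similar mapping from physical port to current device name can also be read from the persistent names udev creates. This is a hedged alternative, and the exact path names vary by hardware:

# Each by-path symlink points at the device name currently assigned to that port
ls -l /dev/disk/by-path/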
Links
- A nice mdadm cheat sheet: http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/