[RAID FAIL] How to restore a failed disk to a RAID array on a Linux server

isscbta

RAID (Redundant Array of Independent Disks) combines multiple hard drives or solid-state drives (SSDs) into one logical unit. Most RAID levels store redundant data, as mirrored copies or parity, so that the array survives a drive failure. Not every level provides redundancy, though: RAID 0, for example, only stripes data across the disks.

RAID can improve performance by spreading data across multiple disks and serving I/O operations concurrently. In the redundant levels, the failure of a single disk does not take the array down, which is what provides fault tolerance.

To the operating system, a RAID array presents itself as a single, unified drive.

In some cases, a member of an array gets marked as failed, for example during periods of high disk activity. The following steps show how to repair the array on Linux:

In an SSH session, as root, run:

Code: Select all

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md3 : active raid1 sdb1[1] sda1[0]
      3906886464 blocks super 1.2 [2/2] [UU]
      bitmap: 1/30 pages [4KB], 65536KB chunk

md0 : active raid1 nvme0n1p1[0] nvme1n1p1[1]
      33521664 blocks super 1.2 [2/2] [UU]
      
md2 : active raid1 nvme1n1p3[1] nvme0n1p3[0](F)    <------ the failed member is flagged: nvme0n1p3[0](F)
      965992768 blocks super 1.2 [2/1] [_U]        <------ the array is degraded: [_U]
      bitmap: 8/8 pages [32KB], 65536KB chunk

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0]
      523712 blocks super 1.2 [2/2] [UU]
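
On a server with many arrays, the degraded one can also be picked out with a small filter instead of by eye. This is only a sketch: the function name degraded_arrays is hypothetical, and it assumes the mdstat layout shown above (an "mdN :" line followed by a "blocks" line carrying the [UU]-style status).

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array whose status
# field contains an underscore (e.g. [_U] or [U_]), meaning a member is missing.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
degraded_arrays() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ :/ { name = $1 }
        # A status such as [_U] on the "blocks" line marks a degraded array.
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Called without arguments, degraded_arrays reads /proc/mdstat directly; for the output above it would print md2.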
To return the failed partition to the array, mark it as failed, remove it, and add it again:

Code: Select all

# mark it as failed
mdadm --manage /dev/md2 --fail /dev/nvme0n1p3

# remove from RAID
mdadm --manage /dev/md2 --remove /dev/nvme0n1p3

# add it back to the RAID
mdadm --manage /dev/md2 --add /dev/nvme0n1p3
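
If the same three steps have to be repeated for other arrays or partitions, they could be wrapped in a small function. The sketch below is an assumption, not part of the original procedure: readd_member and the DRY_RUN variable are hypothetical names, and by default the function only prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper: cycle one member out of an md array and back in.
# Usage: readd_member <array> <member>
# With DRY_RUN=1 (the default in this sketch) the commands are only printed;
# set DRY_RUN=0 to actually run them as root.
readd_member() {
    array=$1
    member=$2
    for action in --fail --remove --add; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "mdadm --manage $array $action $member"
        else
            # Stop at the first step that fails.
            mdadm --manage "$array" "$action" "$member" || return 1
        fi
    done
}
```

For the case above, the call would be readd_member /dev/md2 /dev/nvme0n1p3, first as a dry run and then with DRY_RUN=0.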
Check the rebuild progress:

Code: Select all

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md3 : active raid1 sdb1[1] sda1[0]
      3906886464 blocks super 1.2 [2/2] [UU]
      bitmap: 2/30 pages [8KB], 65536KB chunk

md0 : active raid1 nvme0n1p1[0] nvme1n1p1[1]
      33521664 blocks super 1.2 [2/2] [UU]
      
md2 : active raid1 nvme0n1p3[0] nvme1n1p3[1]
      965992768 blocks super 1.2 [2/1] [_U]
      [=====>...............]  recovery = 28.5% (275856000/965992768) finish=173.3min speed=66338K/sec
      bitmap: 7/8 pages [28KB], 65536KB chunk

md1 : active raid1 nvme1n1p2[1] nvme0n1p2[0]
      523712 blocks super 1.2 [2/2] [UU]
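
To follow only the rebuild percentage rather than the whole file, the recovery line can be extracted with awk. Again a sketch under assumptions: rebuild_progress is a hypothetical name, and the parsing relies on the "recovery = NN.N%" layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print "<array> <percent>" for each array that is rebuilding.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
rebuild_progress() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ :/ { name = $1 }
        # On a recovery line, the percentage is the field right after "recovery =".
        /recovery =/ {
            for (i = 2; i < NF; i++)
                if ($i == "=" && $(i - 1) == "recovery") print name, $(i + 1)
        }
    ' "${1:-/proc/mdstat}"
}
```

For the output above, rebuild_progress prints md2 28.5%. For a continuously refreshing view of the full file, watch -n 5 cat /proc/mdstat is an alternative.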
