thelastpsion, @thelastpsion@bitbang.social

It's RAID time. I've started the server and I'm just having a look around before I start poking around with mdadm.

The RAID10 has 4x 3TB drives. The first two (sda, sdb) are WD Red CMR; the second two (sdc, sdd) are ancient WD Greens that won't re-add to the array. The plan was to replace the Greens with more Reds.

SMART tests on all drives: No reallocated sectors, no sectors pending reallocation. The Greens aren't dead. This surprised me.

However... sdb has 113 raw read errors. Hmm...
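For anyone playing along at home, those figures come straight out of smartctl; something like this (device names as in this box, and attribute names vary a bit between vendors) pulls out the bits I care about:

smartctl -H -A /dev/sda

# Or just the relevant counters across all four drives:
for d in /dev/sd{a,b,c,d}; do
  echo "== $d =="
  smartctl -A "$d" | grep -E 'Raw_Read_Error_Rate|Reallocated_Sector_Ct|Current_Pending_Sector'
done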

thelastpsion, @thelastpsion@bitbang.social

Sidenote: When interviewing people for IT jobs, I have a #RAID question.

In a 4-drive array, which is the more reliable solution: RAID6 or RAID10?

The answer is RAID6, as you can lose any two drives. With RAID10, you can only lose two drives that aren't in the same mirror. (Yes, I know there are risks with vibration. That's why no RAID5.)

Once you get to 6 drives, the odds change, and RAID10 is statistically better.

Of course, for my own 4-drive array, I didn't follow my own advice.
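To put rough numbers on the 4-drive case: there are 6 possible two-drive failure combinations, and in RAID10 (two 2-way mirrors) 2 of those 6 take out both halves of the same mirror. So RAID10 survives 4 out of 6 double failures, while RAID6 survives all 6.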

thelastpsion, @thelastpsion@bitbang.social

Having said that, this isn't necessarily looking like a hardware fault, at least not in the way I was expecting. The WD Greens, in spite of how old they are (and their reputation for unreliability), could still be usable as spares.

It does mean that I might have to buy another drive, though. I don't like those read errors, even though SMART doesn't seem to think they're an issue yet.

thelastpsion, @thelastpsion@bitbang.social

So, running mdadm --assemble with all the drives, this is what I get.

When I first thought there was an issue last week, I tried re-adding sdc to the array. When it failed, I shut the server down. sdc now shows up as a spare, which is worrying.

sdd won't re-add because it's too old (in dmesg: "kicking non-fresh sdd1 from array!").

Bear in mind that I don't need any recent writes, so I don't really care about corruption on any data from the past, say, 3 months.

Any ideas, people?
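For context, the comparison I care about is the event count and update time on each member; roughly this (treat /dev/md0 as a placeholder for the real array device):

cat /proc/mdstat
mdadm --detail /dev/md0
mdadm --examine /dev/sd[abcd]1 | grep -E 'Update Time|Events|Device Role|Array State'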

#mdadm

thelastpsion, @thelastpsion@bitbang.social

Just done an mdadm --examine on the drives.

sda1, sdb1 and sdc1 all have Last Update times of Thu Mar 28 15:59. sdd1's Last Update time is Wed Mar 27 05:32. The only writes during that 34-hour gap would have been from Syncthing, so I have all that elsewhere.

However, sdc1 has a "bad blocks present" error.

This looks like a classic case of a drive dropping off an array and exposing corruption elsewhere.

Can I re-add sdd1 to the array read-only, without completely breaking it?
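My guess at the shape of the command, which I have not run yet (so treat it as a sketch; the md name is a placeholder, and I'm leaving sdc1 out since it now claims to be a spare):

mdadm --stop /dev/md0
mdadm --assemble --force --readonly /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1

...but I'd rather hear from someone who's done this before I point anything at the real disks.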

#mdadm

thelastpsion, @thelastpsion@bitbang.social

PRAISE INSERT DEITY! IT LIVES!

Overlays have saved my bacon and resurrected the mdraid array! At least I think so... I haven't tried copying files off yet.

Now that I know that all is not lost, I can back up anything I'm missing to multiple locations. I've got a stash of hard drives upstairs in various sizes, so I'll do health checks on them all and put them to use.

This is the article that worked for me, in case anyone ever needs it.

https://raid.wiki.kernel.org/index.php/Recovering_a_damaged_RAID#Overlay_manipulation_functions
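For anyone who doesn't want to read the whole page: the core trick is a device-mapper snapshot on top of each member, so everything the forced assemble writes lands in throwaway overlay files instead of on the real disks. Boiled right down (and simplified compared to the wiki's actual functions), it's roughly:

# One sparse copy-on-write file + dm snapshot per array member
for d in /dev/sd{a,b,c,d}1; do
  name=$(basename "$d")
  truncate -s 4G "/tmp/overlay-$name"              # needs to be big enough to hold any writes
  loop=$(losetup -f --show "/tmp/overlay-$name")   # attach the file as a loop device
  size=$(blockdev --getsz "$d")                    # member size in 512-byte sectors
  echo "0 $size snapshot $d $loop P 8" | dmsetup create "overlay-$name"
done

# Assemble from the overlays, not the raw partitions
mdadm --assemble --force /dev/md99 /dev/mapper/overlay-sd[abcd]1

If it all goes wrong, you tear the overlays down and the original drives are untouched.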
