{"id":419,"date":"2018-03-03T10:53:31","date_gmt":"2018-03-03T09:53:31","guid":{"rendered":"https:\/\/tollana.d-tor.org\/notes-to-self\/?p=419"},"modified":"2018-03-03T10:53:31","modified_gmt":"2018-03-03T09:53:31","slug":"dusting-off-the-array-part-7","status":"publish","type":"post","link":"https:\/\/tollana.d-tor.org\/notes-to-self\/?p=419","title":{"rendered":"Dusting off the Array! (Part 7)"},"content":{"rendered":"<h3>What happened<\/h3>\n<p>Oh my, I really hope this is the final chapter of this fucking story&#8230; On Feb. 20th, 2018, one of the HGST disks failed (what a surprise!). Since the serial number reported by hdparm bears absolutely <strong>no<\/strong> resemblance to the serial number printed on the label of the disk (thanks a bunch HGST, BTW), I pulled the wrong disk, inserted the spare, started the rebuild and&#8230; Lo and behold! The failing disk still failed! My attempts to recover the RAID destroyed it completely.<\/p>\n<p>The array was already kaputt when I thought of a way to identify the failing disk&#8217;s slot:<\/p>\n<pre># badblocks &lt;device reported by kernel&gt;<\/pre>\n<p>Since badblocks reads the whole disk, this should keep that slot&#8217;s LED on the external SATA casing lit for as long as the scan runs.<\/p>\n<p>I was so fed up that I ordered four 2TB SSDs shortly after that. Yesterday (Mar. 2nd, 2018) I finally had time to install them. Of course this setup has its quirks, too, but at least I can identify the disks via hdparm. The reported serial actually matches the one on the label:<\/p>\n<pre>HDD1:\u00a0SerialNo 1744197E67EE\r\nHDD2:\u00a0SerialNo 1744197E7B92\r\nHDD3:\u00a0SerialNo 1744197E7104\r\nHDD4:\u00a0SerialNo 1744197E836D<\/pre>\n<h3>The Quirks<\/h3>\n<p>Of course it didn&#8217;t just work out of the box\u2122. When I booted with the shiny new SSDs, hadante got stuck at the BIOS splash screen while HDD3 was showing a bright red light. I pulled HDD3 and HDD4, rebooted and got a login prompt. Since SATA is hot-pluggable, I inserted HDD3 and HDD4. 
Fortunately, they showed up on the SCSI bus (cries of joy!).<\/p>\n<p>I created a RAID5 with:<\/p>\n<pre># mdadm --create \/dev\/md1 --level=5 --raid-devices=4 \/dev\/sd[efgh]<\/pre>\n<p>and waited until today (Mar. 3rd, 2018) for the initial sync to finish. After that I tested the setup:<\/p>\n<ul>\n<li>Power off hadante<\/li>\n<li>Turn off the external casing<\/li>\n<li>Wait about 30 seconds<\/li>\n<li>Turn on the external casing and then hadante<\/li>\n<li>Wait eagerly&#8230;<\/li>\n<\/ul>\n<p>&#8230; and watch the kernel error messages scrolling down the screen \ud83d\ude41<\/p>\n<h3>The solution<\/h3>\n<p>Identify the (not really) failing drive by watching the LEDs of the external casing. Power off hadante and pull <strong>the failing drive and any other non-failing drive<\/strong>! It&#8217;s important to pull two drives: with two of its four members missing, the kernel cannot assemble the RAID5! Then reboot and stop the <strong>failing<\/strong> RAID:<\/p>\n<pre># mdadm -S \/dev\/md?<\/pre>\n<p>Now hot-plug the missing drives, reboot again and be amazed at how everything magically works again \ud83d\ude42<\/p>\n<p>From my observations, the drives in the external bay stay recognized until you <strong>cut the power<\/strong>, but that&#8217;s just a guess.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What happened Oh my, I really hope this is the final chapter of this fucking story&#8230; On Feb. 20th, 2018, one of the HGST disks failed (what a surprise!). Since the serial number reported by hdparm bears absolutely no resemblance to the serial number printed on the label of the disk (thanks a bunch HGST, &hellip; <a href=\"https:\/\/tollana.d-tor.org\/notes-to-self\/?p=419\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Dusting off the Array! 
(Part 7)<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[116,77],"tags":[58,26],"class_list":["post-419","post","type-post","status-publish","format-standard","hentry","category-dusting-off-the-array","category-linux","tag-mdadm","tag-ssd"],"_links":{"self":[{"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/posts\/419","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=419"}],"version-history":[{"count":5,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/posts\/419\/revisions"}],"predecessor-version":[{"id":424,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=\/wp\/v2\/posts\/419\/revisions\/424"}],"wp:attachment":[{"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=419"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=419"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tollana.d-tor.org\/notes-to-self\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=419"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}