Normally, if you detect an error (bad sector) on a SCSI disk in a heavy-duty server, it is highly recommended to replace the hard disk. You can also use the smartctl utility for further investigation. With the reservation that I might have done things slightly wrong, here are my actions: Check current status of units and drives: Code: sudo ./tw_cli /c0 show Unit UnitType Status %RCmpl %V/I/M

It will probably still have writes going to the remaining drives, and I don't know how this is handled.

Check that the drive is actually removed from the unit:

Code: sudo ./tw_cli /c0 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrify
------------------------------------------------------------------------------
u0    RAID-1    OK      -       -       -

Is there any option other than replacing the HDD?

Error logging capability: (0x01) Error logging supported. One of my drives started giving a SMART error: Offline uncorrectable sector. It "wraps" after 49.710 days.
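The 49.710-day figure is what you would get if the timestamp in the drive's error log were an unsigned 32-bit count of milliseconds since power-on. That is an assumption on my part, not something the output above states, and the helper names below are made up for illustration. A quick sketch of the arithmetic:

```python
# Assumption: the log timestamp is an unsigned 32-bit millisecond counter.
COUNTER_BITS = 32
MS_PER_DAY = 24 * 60 * 60 * 1000


def wrap_period_days(bits: int = COUNTER_BITS) -> float:
    """Days until an unsigned millisecond counter of `bits` bits overflows."""
    return (2 ** bits) / MS_PER_DAY


def format_powered_up_time(ms: int) -> str:
    """Render a millisecond count as DDd+hh:mm:ss.sss, wrapping at 2**32 ms."""
    ms %= 2 ** COUNTER_BITS
    days, rem = divmod(ms, MS_PER_DAY)
    hours, rem = divmod(rem, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{days:02d}d+{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


if __name__ == "__main__":
    print(f"{wrap_period_days():.3f} days")
```

So a timestamp in that log only tells you the time since power-on modulo roughly 49.7 days, which is why it cannot be compared directly with the power-on lifetime in hours.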

Self-test supported. Offline data collection capabilities: (0x73) SMART execute Offline immediate.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 31 eb 50 e4  Error: UNC at LBA

Total time to complete Offline data collection: ( 422) seconds.

Total time to complete Offline data collection: ( 584) seconds. If this assumption is wrong you don't have to read any further. SCT Feature Control supported. If the number of bad sectors keeps growing, the drive should be replaced.

SCT Feature Control supported. I read the manpages for lvm, and I didn't find anything that explained how to convert the lvm sizes and extents to physical sector numbers.
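For what it's worth, the conversion is plain arithmetic once you know three numbers for the physical volume. This is only a sketch under assumptions: a single physical volume with a linear (non-striped) mapping, and every value expressed in 512-byte sectors. The class and function names are hypothetical, and the three inputs (partition start, offset of the first extent, extent size) are things you would have to read from your partition table and LVM metadata yourself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhysicalVolumeLayout:
    """Hypothetical container; all values are in 512-byte sectors."""
    partition_start: int  # first sector of the partition on the disk
    pe_start: int         # offset from partition start to the first extent
    pe_size: int          # size of one physical extent


def extent_to_disk_sector(layout: PhysicalVolumeLayout,
                          extent: int, offset: int = 0) -> int:
    """Disk sector for `offset` sectors into physical extent `extent`."""
    if not 0 <= offset < layout.pe_size:
        raise ValueError("offset must lie inside one extent")
    return (layout.partition_start + layout.pe_start
            + extent * layout.pe_size + offset)


def disk_sector_to_extent(layout: PhysicalVolumeLayout,
                          sector: int) -> Tuple[int, int]:
    """Inverse mapping: (extent number, offset inside that extent)."""
    relative = sector - layout.partition_start - layout.pe_start
    if relative < 0:
        raise ValueError("sector lies before the first extent")
    return divmod(relative, layout.pe_size)
```

With the LBA from a SMART error you would run the inverse mapping to get the physical extent, then look up which logical volume owns that extent.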

SCT capabilities: (0x10b9) SCT Status supported. Error 25 occurred at disk power-on lifetime: 9481 hours (395 days + 1 hours) When the command that caused the error occurred, the device was active or idle. Hello, I have SMART monitoring set up and received two emails just recently that I'm ... If '+' is specified, a report is only printed if the number of sectors has increased since the last check cycle.

Auto Offline data collection on/off support.

Total time to complete Offline data collection: (2880) seconds. Note that you already have 1240 reallocated sectors; that's never a good thing. You better have good backups, 3TB are going to take a good while to resync, and with raid5 if

from syslog:

# grep sdb syslog
Nov 25 08:26:14 hostname smartd[3968]: Device: /dev/sdb, 1 Currently unreadable (pending) sectors
Nov 25 08:26:14 hostname smartd[3968]: Device: /dev/sdb, 1 Offline uncorrectable sectors
Nov 25

Follow the user guide and my advice is to not hot-swap the drive.
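If you want to track whether those counts grow over time, the smartd lines in syslog are regular enough to parse. A rough sketch, assuming the message format is exactly the one quoted above (other smartd versions may word it differently); the function names are my own:

```python
import re
from typing import Dict, Iterable, List, Tuple

# Pattern modelled only on the two syslog lines quoted above.
SMARTD_SECTOR_RE = re.compile(
    r"smartd\[\d+\]: Device: (?P<device>[^,\s]+), "
    r"(?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)

Counts = Dict[Tuple[str, str], int]


def latest_sector_counts(lines: Iterable[str]) -> Counts:
    """Map (device, kind) to the most recent count seen in `lines`."""
    counts: Counts = {}
    for line in lines:
        match = SMARTD_SECTOR_RE.search(line)
        if match:
            counts[(match["device"], match["kind"])] = int(match["count"])
    return counts


def growing(previous: Counts, current: Counts) -> List[Tuple[str, str]]:
    """Keys whose count increased since the previous check."""
    return [key for key, value in current.items()
            if value > previous.get(key, 0)]
```

Feed it the log from two different days and compare: a count that stays at 1 is one thing, a count that keeps climbing means the drive is on its way out.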

Suspend Offline collection upon new command.

The /proc/mdstat file currently contains the following:

Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sda[0] sdd[3] sdc[2] sdb[4]
      5860526592 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4]

So it is a very high risk not to replace the drive soon. But one more failure in one of the remaining drives and it will lead to data loss. etc ...
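The [4/4] at the end of the md127 entry is the part to watch: the first number is how many devices the array should have, the second how many are currently active. A small sketch for checking that from a script, assuming the layout matches the mdstat excerpt above (only "active raidN" arrays are handled, and the names are hypothetical):

```python
import re
from typing import List, NamedTuple


class ArrayStatus(NamedTuple):
    name: str
    level: str
    members: List[str]
    expected: int  # devices the array should have
    active: int    # devices currently in use

    @property
    def degraded(self) -> bool:
        return self.active < self.expected


# "md127 : active raid5 sda[0] sdd[3] ..." with optional flags such as (F)
ARRAY_RE = re.compile(
    r"(md\d+) : active (raid\d+)((?: \w+\[\d+\](?:\(\w\))?)+)")
# "[4/4]" means 4 expected, 4 active
COUNT_RE = re.compile(r"\[(\d+)/(\d+)\]")


def parse_mdstat(text: str) -> List[ArrayStatus]:
    """Extract array name, level, members and device counts from mdstat text."""
    arrays = []
    for match in ARRAY_RE.finditer(text):
        counts = COUNT_RE.search(text, match.end())
        if counts is None:
            continue  # no [n/m] found after this array header
        members = re.findall(r"(\w+)\[\d+\]", match.group(3))
        arrays.append(ArrayStatus(match.group(1), match.group(2), members,
                                  int(counts.group(1)), int(counts.group(2))))
    return arrays
```

Usage would be something like `parse_mdstat(open("/proc/mdstat").read())`, then alerting on any entry where `degraded` is true.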

I looked around the forums but I couldn't find a great match. Craig Henry replied Feb 3, 2012: You should definitely back up the system ASAP and consider replacing your drive, since this is a production machine. If it is under warranty, don't hesitate Short self-test routine recommended polling time: ( 1) minutes. If you do not know which drive is ada3, use the serial number of the drive.

SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity was completed without error.

I contacted WD and they gave me an RMA number. Good thing you have a RAIDZ2 but you will need to replace the failing drive.

It might be today or a few weeks from now.

SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Thanks, -Sean

subzero79 (Oct 13th 2014, 5:54pm): RAID5 gives you n-1 redundancy. smartd is able to send mails by itself, but it's most likely not configured this way so I assume that these mails are sent by logcheck.