[colug-432] I/O Error?
Bill Schwanitz
bilsch at bilsch.org
Thu Dec 6 11:15:39 EST 2012
On Dec 6, 2012, at 10:26 AM, Thomas W. cranston <thomas.w.cranston at gmail.com> wrote:
> I don't Know. Perhaps I do not know how to interpret the data.
>
> Looking at the man right now.
>
> What are you getting at?
>
> Tom
Tom,
If you want, email the output of smartctl -a <dev> to the list.
I put *** in front of the attributes you really want to look at. Note that not all drives report the same attributes.
I will also echo the comment from Rob Funk: if you have important stuff on that drive, I'd back that data up and get it off of that disk while you still can. Look towards the end of this email for logs from some really dead drives for comparison.
$ sudo smartctl -a /dev/sdb
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-2.6.32-71.29.1.el6.x86_64] (local build)
(snip)
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 060 052 006 Pre-fail Always - 4891588
3 Spin_Up_Time 0x0003 096 096 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 096 096 020 Old_age Always - 4758
*** 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 073 060 030 Pre-fail Always - 103678017425
9 Power_On_Hours 0x0032 050 050 000 Old_age Always - 44500
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 445
194 Temperature_Celsius 0x0022 035 058 000 Old_age Always - 35 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 060 051 000 Old_age Always - 4891588
*** 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
*** 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
*** 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
*** 200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 42335 -
# 2 Extended offline Completed without error 00% 30238 -
# 3 Extended offline Completed without error 00% 21211 -
# 4 Extended offline Completed without error 00% 19939 -
-- really dead drives --
(Note: these are from different hosts/drives. They are examples only, but real.)
smartctl output
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
*** 1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 2218
2 Throughput_Performance 0x0026 055 055 000 Old_age Always - 8480
3 Spin_Up_Time 0x0023 071 070 025 Pre-fail Always - 8984
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 24
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 8360
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 24
13 Read_Soft_Error_Rate 0x003a 100 100 000 Old_age Always - 0
191 G-Sense_Error_Rate 0x0022 252 252 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 25
194 Temperature_Celsius 0x0002 064 064 000 Old_age Always - 24 (Lifetime Min/Max 16/33)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
*** 197 Current_Pending_Sector 0x0032 002 002 000 Old_age Always - 8143
*** 198 Offline_Uncorrectable 0x0030 252 093 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 0
240 Head_Flying_Hours 0x0032 100 100 000 Old_age Always - 8360
241 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 16766
242 Total_LBAs_Read 0x0032 100 100 000 Old_age Always - 1913
254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 1
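If you would rather not eyeball the table, here is a rough Python sketch that pulls the raw values out of pasted smartctl attribute lines and reports which of the watched attributes are non-zero. The function names and the WATCHED_IDS set are my own assumptions (the set is just the IDs starred in the first listing), and a non-zero raw value is only a hint, not a verdict, since vendors encode raw values differently.

```python
# Attribute IDs to watch. Assumption: the ones starred in the first listing
# (5, 197, 198, 199, 200). Adjust for your drive; not all drives report them.
WATCHED_IDS = {5, 197, 198, 199, 200}


def parse_attributes(text):
    """Return {id: (name, raw_value)} from pasted smartctl attribute rows."""
    attrs = {}
    for line in text.splitlines():
        # Drop the "***" markers used in this email before splitting columns.
        fields = line.replace("***", " ").split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            raw = int(fields[9])
        except ValueError:
            continue  # some drives report non-integer raw values; skip those
        attrs[int(fields[0])] = (fields[1], raw)
    return attrs


def suspicious(attrs):
    """Return the sorted watched IDs whose raw value is greater than zero."""
    return sorted(i for i in WATCHED_IDS if i in attrs and attrs[i][1] > 0)


healthy = """\
*** 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
*** 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
"""
dead = """\
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
*** 197 Current_Pending_Sector 0x0032 002 002 000 Old_age Always - 8143
*** 198 Offline_Uncorrectable 0x0030 252 093 000 Old_age Offline - 0
"""

print(suspicious(parse_attributes(healthy)))  # []
print(suspicious(parse_attributes(dead)))     # [197]
```

On the dead drive above, the only watched attribute with a non-zero raw value is 197 (Current_Pending_Sector, 8143).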
$ dmesg
sd 4:0:0:0: [sde] CDB: Read(10): 28 00 6e 79 e3 90 00 00 08 00
end_request: I/O error, dev sde, sector 1853481872
sd 4:0:0:0: [sde] Unhandled error code
sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 4:0:0:0: [sde] CDB: Read(10): 28 00 6e 79 e3 90 00 00 08 00
end_request: I/O error, dev sde, sector 1853481872
sd 4:0:0:0: [sde] Unhandled error code
sd 4:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 4:0:0:0: [sde] CDB: Read(10): 28 00 6e 79 e3 98 00 00 08 00
end_request: I/O error, dev sde, sector 1853481880
sd 4:0:0:0: [sde] Unhandled error code
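As a side note on reading those dmesg lines: the CDB bytes are the SCSI READ(10) command the kernel sent, and the block address in them lines up with the sector number in the "end_request" line. A small Python sketch of the decoding follows; the function name is my own, and it assumes the standard READ(10) layout (opcode 0x28, bytes 2-5 the logical block address, bytes 7-8 the transfer length, both big-endian).

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (logical_block_address, number_of_blocks).
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


# CDBs copied from the dmesg output above.
print(decode_read10("28 00 6e 79 e3 90 00 00 08 00"))  # (1853481872, 8)
print(decode_read10("28 00 6e 79 e3 98 00 00 08 00"))  # (1853481880, 8)
```

Both decoded addresses match the sector numbers the kernel logged (1853481872 and 1853481880), so the failures are on adjacent 8-block reads of the same region.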
email from smartd
The following warning/error was logged by the smartd daemon:
Device: /dev/sdc [SAT], FAILED SMART self-check. BACK UP DATA NOW!
For details see host's SYSLOG (default: /var/log/messages).
You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.