The Community Forums

Is my HDD really going to fail?

Discussion in 'General Discussion' started by evisions, Feb 1, 2005.

  1. evisions

    evisions Well-Known Member

    Joined:
    Jan 25, 2004
    Messages:
    68
    Likes Received:
    0
    Trophy Points:
    6
    Back in December I had an issue with one of my hard drives and had to run chkfs to repair it. Since then I have been getting a daily email from cPanel telling me the hard drive is going to fail (see below). It gave me no warning before the problems I had in December. The content of the message hasn't changed in the month and a half since I started receiving it. So the question is: is the drive really going to fail? If not, how can I make this message stop?

    --------------------------------------------------------
    IMPORTANT: Do not ignore this email.
    You should backup all the data on the hard drives listed below and replace them as soon as possible.
    S.M.A.R.T has detected that they are not performing within normal operating parameters.

    Excessive ATA Errors on disk /dev/hdc. Please consider replacing this drive. Some Errors may be normal due to not 100% compatible IDE controllers and may be ignored.

    SMART Error Log:
    SMART Error Logging Version: 1
    Error Log Data Structure Pointer: 05
    ATA Error Count: 265
    Non-Fatal Count: 0

    Error Log Structure 1:
    DCR FR SC SN CL SH D/H CR Timestamp
    00 00 08 89 d2 d7 e0 35 33381
    00 00 08 71 0a d8 e0 35 33381
    00 00 08 49 53 d8 e0 35 33381
    00 00 08 c1 74 d8 e0 35 33381
    00 00 08 69 aa 2d e0 25 33385
    00 40 04 6d aa 2d e0 51 922746
    Error condition: 33 Error State: 3
    Number of Hours in Drive Life: 1872 (life of the drive in hours)

    Error Log Structure 2:
    DCR FR SC SN CL SH D/H CR Timestamp
    00 00 80 d1 1a c7 e0 35 33579
    00 00 80 51 1b c7 e0 35 33579
    00 00 18 d1 1b c7 e0 35 33584
    00 00 08 e9 1b c7 e0 35 33584
    00 00 08 69 aa 2d e0 25 33584
    00 40 04 6d aa 2d e0 51 922746
    Error condition: 33 Error State: 3
    Number of Hours in Drive Life: 1872 (life of the drive in hours)

    Error Log Structure 3:
    DCR FR SC SN CL SH D/H CR Timestamp
    00 00 08 e9 7e 2a e0 25 35094
    00 00 08 69 8a 2a e0 25 35094
    00 00 08 69 83 2b e0 25 35094
    00 00 08 f9 a0 2b e0 25 35094
    00 00 08 69 aa 2d e0 25 35094
    00 40 04 6d aa 2d e0 51 922746
    Error condition: 33 Error State: 3
    Number of Hours in Drive Life: 1872 (life of the drive in hours)

    Error Log Structure 4:
    DCR FR SC SN CL SH D/H CR Timestamp
    00 00 80 c9 ee c6 e0 35 35098
    00 00 10 49 ef c6 e0 35 35098
    00 00 08 24 46 87 e0 35 35098
    00 00 08 59 ef c6 e0 35 35098
    00 00 08 69 aa 2d e0 25 35098
    00 40 04 6d aa 2d e0 51 922746
    Error condition: 33 Error State: 3
    Number of Hours in Drive Life: 1872 (life of the drive in hours)

    Error Log Structure 5:
    DCR FR SC SN CL SH D/H CR Timestamp
    00 00 08 f9 f4 b3 e0 35 35331
    00 00 80 c1 79 c7 e0 35 35331
    00 00 10 41 7a c7 e0 35 35331
    00 00 08 51 7a c7 e0 35 35331
    00 00 08 69 aa 2d e0 25 35331
    00 40 04 6d aa 2d e0 51 922746
    Error condition: 33 Error State: 3
    Number of Hours in Drive Life: 1872 (life of the drive in hours)
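
One way to check whether the daily email really is unchanged is to pull the two figures that matter out of the pasted log: the total ATA error count and the drive-life hour recorded for each logged error. The sketch below is only an illustration. The function name `summarize_smart_log` is hypothetical, the regular expressions assume the exact wording shown in the email above, and the sample text is the first two error structures copied from it.

```python
import re

# First two error structures copied from the email above.
SAMPLE_LOG = """\
ATA Error Count: 265
Error Log Structure 1:
DCR FR SC SN CL SH D/H CR Timestamp
00 00 08 69 aa 2d e0 25 33385
00 40 04 6d aa 2d e0 51 922746
Error condition: 33 Error State: 3
Number of Hours in Drive Life: 1872 (life of the drive in hours)
Error Log Structure 2:
DCR FR SC SN CL SH D/H CR Timestamp
00 00 08 69 aa 2d e0 25 33584
00 40 04 6d aa 2d e0 51 922746
Error condition: 33 Error State: 3
Number of Hours in Drive Life: 1872 (life of the drive in hours)
"""


def summarize_smart_log(text):
    """Extract the total error count and the drive-life hour of each logged error."""
    count_match = re.search(r"ATA Error Count:\s*(\d+)", text)
    total = int(count_match.group(1)) if count_match else None
    # One "Number of Hours in Drive Life" line appears per error structure.
    hours = [int(h) for h in re.findall(r"Number of Hours in Drive Life:\s*(\d+)", text)]
    return {
        "ata_error_count": total,
        "structures": len(hours),
        "hours": hours,
        "distinct_hours": sorted(set(hours)),
    }


if __name__ == "__main__":
    print(summarize_smart_log(SAMPLE_LOG))
```

If the count and the hour values are the same from one day's email to the next, no new errors have been added to the log since the earlier ones. In the email above, every structure carries the same value (1872), so all of the listed errors were recorded within the same hour of drive life.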
     
  2. ntwaddel

    ntwaddel Well-Known Member

    Joined:
    Nov 3, 2003
    Messages:
    173
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Templeton, CA
    If you're getting SMART errors, the drive is a goner, I'm sure. What type of drives are you using?
     
  3. rpmws

    rpmws Well-Known Member

    Joined:
    Aug 14, 2001
    Messages:
    1,824
    Likes Received:
    5
    Trophy Points:
    38
    Location:
    back woods of NC, USA

    You know, no one really knows until it dies. I guess you have decided to wait and see whether that happens. What about your clients on that box? Shouldn't you avoid taking chances like that? That's just my opinion; run it however you want. But it's been six weeks. Are you waiting for it to go boom?
     
  4. dezignguy

    dezignguy Well-Known Member

    Joined:
    Sep 26, 2004
    Messages:
    534
    Likes Received:
    0
    Trophy Points:
    16
    Well, it's possible that your earlier problems pushed some of the error counts that SMART tracks over the thresholds for drive failure. It keeps track of how many errors build up over the drive's life, and when too many types of errors cross their thresholds, it assumes the drive will fail soon.

    However, I had a similar experience with a Windows drive that kept giving SMART errors. I assumed most of them came from drive cable issues I'd had in the past, and the manufacturer's test software showed the drive was still good, so I ignored the warnings for a few months. (I found that it had something to do with the internal NTFS structure table being slightly corrupt, or something like that.) The drive finally crashed and burned on me, though I was able to fully recover it after spending a full 24-hour day on it. It then failed completely a week later and was replaced under warranty. ;-)

    I should have known better with my drive, heh, but my parents ignored SMART's drive failure warnings for several months before their drive failed. (Not totally their fault, since the warnings were only logged in the event log and nowhere else.)

    But yeah... you never know when a drive will fail... SMART just tries to give you a little advance warning.
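
The threshold idea mentioned earlier in this post can be sketched in a few lines. This assumes the convention documented for smartmontools, where each attribute has a normalized value that counts down as the drive degrades and is considered failed once it is at or below its threshold. The `Attribute` type, the `failed_attributes` helper, and the sample numbers are all made up for illustration; none of them come from the drive in this thread.

```python
from typing import NamedTuple


class Attribute(NamedTuple):
    name: str
    value: int      # normalized current value (higher is healthier)
    threshold: int  # vendor-defined failure threshold


def failed_attributes(attrs):
    """Return the names of attributes whose normalized value is at or below the threshold."""
    return [a.name for a in attrs if a.value <= a.threshold]


# Made-up values for illustration only.
sample = [
    Attribute("Raw_Read_Error_Rate", 100, 51),
    Attribute("Reallocated_Sector_Ct", 30, 36),
    Attribute("Power_On_Hours", 98, 0),
]

if __name__ == "__main__":
    print(failed_attributes(sample))
```

With these sample numbers only `Reallocated_Sector_Ct` is flagged, because 30 is at or below its threshold of 36 while the other two values are still above theirs.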

    You may also have a bad IDE cable.

    Hmm, your drive doesn't seem to be that old either... my server's drive says it has Power_On_Hours = 8503. Seems about right... since my server has been up for about a year.
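
The age comparison above is simple arithmetic on the two hour counts quoted in this thread. As a quick sketch (the helper name `power_on_days` is hypothetical, and it assumes the 1872 figure in the email is the power-on time when the errors were recorded, so the drive's current age would be somewhat higher):

```python
HOURS_PER_DAY = 24


def power_on_days(hours):
    """Convert a power-on hour count into days of continuous running."""
    return hours / HOURS_PER_DAY


if __name__ == "__main__":
    readings = [
        ("hours recorded in the alert email", 1872),
        ("dezignguy's server drive", 8503),
    ]
    for label, hours in readings:
        print(f"{label}: {hours} h is about {power_on_days(hours):.0f} days")
```

That works out to 78 days for the figure in the email and roughly 354 days for the server drive, which matches the "about a year" estimate.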
     
    #4 dezignguy, Feb 3, 2005
    Last edited: Feb 3, 2005