The Community Forums


S.M.A.R.T Drive Failure

Discussion in 'General Discussion' started by procam, Jan 12, 2006.

  1. procam

    procam Well-Known Member

    Joined:
    Nov 24, 2003
    Messages:
    123
    Likes Received:
    0
    Trophy Points:
    16
    I got this fancy-schmancy email again this morning about one of my drives on one of my units,
    blah blah -- I panic, nearly spill my coffee on my keyboard, and choke on my granola breakfast bar while reading it ~ and then take a deep breath and two Xanax and send it to a tech ~
    S.M.A.R.T Errors on /dev/hda
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hda
    ATA Error Count: 1
    Error 1 occurred at disk power-on lifetime: 921 hours
    ----END /dev/hda--

    S.M.A.R.T Errors on /dev/hdb
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdb
    ATA Error Count: 4
    Error 4 occurred at disk power-on lifetime: 6760 hours
    Error 3 occurred at disk power-on lifetime: 6760 hours
    Error 2 occurred at disk power-on lifetime: 1394 hours
    Error 1 occurred at disk power-on lifetime: 3 hours
    ----END /dev/hdb--

    So I send this over to technical and tell them that I need to schedule a drive replacement, as I always have in the past. Every time I get one of these errors I promptly change the drive that day, or have one overnighted if it's not a common stock drive, so it's there for replacement the next morning ~

    Much to my surprise, I get this email back telling me that this is not a really reliable method of detecting a drive problem. Is this true? Anyone have opinions on this? I'd really like to hear them, 'cause if this is the case I've been wasting a LOT of money replacing drives over the years ~

    Email I got back from the tech: "I have performed extended tests with smartctl and checked for error logging in /var/log/messages to verify the integrity of your drive, and I am satisfied all is well.

    Despite the wording of the message, the drive is most likely fine. Every night, cPanel runs the script /scripts/smartcheck to read the SMART diagnostic information from the drive (using smartctl) and warn the system administrator if the drive is having problems, but smartcheck defines "problems" as including the case where the ATA Error Count is 100 or more.

    The ATA Error Count is the number of errors recorded by the SMART circuitry on the drive, and this count is cumulative over the life of the drive. Since all drives experience some errors in the course of normal operation, the ATA Error Count will always become greater than 100 at some point, regardless of whether the drive is failing or not. Thus, cPanel's smartcheck script will start to produce incorrect warnings when the error count reaches 100.

    To prevent these false warnings from being sent, you must disable smartcheck with this command: touch /var/cpanel/disablesmartcheck

    Once that is done, you can manually run /scripts/smartcheck once a week or so to keep an eye on the drive. The ATA Error Count should only be a concern if it increases by a large amount, or if it increases consistently. Smaller increases and sporadic jumps can normally be ignored.

    When you do suspect a problem with a drive, you should always perform other tests to confirm the problem. The ATA Error Count by itself is simply not conclusive evidence of a pending failure. Some of those other tests are badblocks and monitoring /var/log/messages for I/O errors. If your concerns about this drive continue to grow, please open a trouble ticket requesting a drive replacement. "
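
The tech's advice to watch whether the ATA Error Count grows, rather than reacting to its absolute value, can be sketched in Python. This is only an illustration under stated assumptions: the report format is copied from the email quoted above, the function names `parse_error_counts` and `flag_increases` are made up for this example, and the `max_jump` threshold of 10 is an arbitrary placeholder, not a value taken from cPanel or smartmontools.

```python
import re

# Patterns for the report format quoted in the email above (an assumption:
# other smartcheck versions may word these lines differently).
DEVICE_RE = re.compile(r"S\.M\.A\.R\.T Errors on (\S+)")
COUNT_RE = re.compile(r"ATA Error Count:\s*(\d+)")

SAMPLE_REPORT = """\
S.M.A.R.T Errors on /dev/hda
From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hda
ATA Error Count: 1
Error 1 occurred at disk power-on lifetime: 921 hours
----END /dev/hda--

S.M.A.R.T Errors on /dev/hdb
From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdb
ATA Error Count: 4
Error 4 occurred at disk power-on lifetime: 6760 hours
----END /dev/hdb--
"""


def parse_error_counts(report):
    """Return {device: ATA error count} parsed from a report like the one above."""
    counts = {}
    device = None
    for line in report.splitlines():
        line = line.strip()
        match = DEVICE_RE.search(line)
        if match:
            device = match.group(1)
            continue
        match = COUNT_RE.search(line)
        if match and device is not None:
            counts[device] = int(match.group(1))
    return counts


def flag_increases(previous, current, max_jump=10):
    """Return {device: increase} for devices whose count rose by more than max_jump.

    max_jump is a hypothetical threshold chosen for illustration only.
    A device with no previous reading is treated as unchanged.
    """
    flagged = {}
    for device, count in current.items():
        delta = count - previous.get(device, count)
        if delta > max_jump:
            flagged[device] = delta
    return flagged


if __name__ == "__main__":
    current = parse_error_counts(SAMPLE_REPORT)
    print(current)
    # Compare against counts saved from an earlier run (hard-coded here).
    print(flag_increases({"/dev/hda": 1, "/dev/hdb": 2}, current))
```

Saving the previous counts between weekly runs (for example in a small file) is left out here. A flagged device would still need the follow-up checks the tech mentions, such as badblocks and looking for I/O errors in /var/log/messages.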
     
  2. Izzee

    Izzee Well-Known Member

    Joined:
    Feb 6, 2004
    Messages:
    469
    Likes Received:
    0
    Trophy Points:
    16
    Seems to be a s.m.a.r.t. and well informed tech you got there.

    :)
     
  3. procam

    procam Well-Known Member

    Joined:
    Nov 24, 2003
    Messages:
    123
    Likes Received:
    0
    Trophy Points:
    16
    Well, thank you, but I would like someone to confirm that as gospel :cool:
    I'm one of those people that likes to gather lots of opinions and experiences so I can make goooood decisions instead of spillin' my morning coffee :D
     
  4. Izzee

    Izzee Well-Known Member

    Joined:
    Feb 6, 2004
    Messages:
    469
    Likes Received:
    0
    Trophy Points:
    16
    Try not spilling any coffee over the search facility either, as this has been discussed many times of late and there are many opinions for you to select from.

    :)
     
    #4 Izzee, Jan 12, 2006
    Last edited: Jan 12, 2006
  5. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess