  1. #1
    Member
    Join Date
    Nov 2003
    Posts
    129

    S.M.A.R.T Drive Failure

    I got this fancy-schmancy email again this morning about one of my drives on one of my units, blah blah. I panicked, nearly spilled my coffee on my keyboard, and choked on my granola breakfast bar while reading it, then took a deep breath and two Xanax and sent it to a tech:
    S.M.A.R.T Errors on /dev/hda
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hda
    ATA Error Count: 1
    Error 1 occurred at disk power-on lifetime: 921 hours
    ----END /dev/hda--

    S.M.A.R.T Errors on /dev/hdb
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdb
    ATA Error Count: 4
    Error 4 occurred at disk power-on lifetime: 6760 hours
    Error 3 occurred at disk power-on lifetime: 6760 hours
    Error 2 occurred at disk power-on lifetime: 1394 hours
    Error 1 occurred at disk power-on lifetime: 3 hours
    ----END /dev/hdb--
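
For anyone who would rather tally these reports than eyeball them, here is a rough Python sketch that pulls the ATA Error Count and the power-on hours out of text shaped like the email above. The function name `parse_smartcheck_report` is made up for illustration, and the patterns only assume the layout shown in this post; other smartcheck or smartctl versions may print something different.

```python
import re

def parse_smartcheck_report(text):
    """Return {device: {"error_count": int, "error_hours": [int, ...]}}.

    Assumes the report layout quoted in this thread; nothing here is
    part of cPanel or smartmontools.
    """
    results = {}
    device = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"S\.M\.A\.R\.T Errors on (\S+)", line)
        if m:
            # Start of a new per-device section
            device = m.group(1)
            results[device] = {"error_count": 0, "error_hours": []}
            continue
        if device is None:
            continue
        m = re.match(r"ATA Error Count: (\d+)", line)
        if m:
            results[device]["error_count"] = int(m.group(1))
            continue
        m = re.match(r"Error \d+ occurred at disk power-on lifetime: (\d+) hours", line)
        if m:
            results[device]["error_hours"].append(int(m.group(1)))
            continue
        if line.startswith("----END"):
            # End of this device's section
            device = None
    return results

SAMPLE = """\
S.M.A.R.T Errors on /dev/hda
From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hda
ATA Error Count: 1
Error 1 occurred at disk power-on lifetime: 921 hours
----END /dev/hda--

S.M.A.R.T Errors on /dev/hdb
From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdb
ATA Error Count: 4
Error 4 occurred at disk power-on lifetime: 6760 hours
Error 3 occurred at disk power-on lifetime: 6760 hours
Error 2 occurred at disk power-on lifetime: 1394 hours
Error 1 occurred at disk power-on lifetime: 3 hours
----END /dev/hdb--
"""

if __name__ == "__main__":
    for dev, info in parse_smartcheck_report(SAMPLE).items():
        print(dev, info["error_count"], info["error_hours"])
```

The hours are worth keeping because they show when the errors happened: on /dev/hdb two of the four were logged at the same power-on hour, and the others are thousands of hours apart.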

    So I sent this over to technical and told them I need to schedule a drive replacement, as I always have in the past. Every time I get one of these errors I promptly change the drive that day, or have one overnighted if it's not a common stock drive so it's there for replacement the next morning.

    Much to my surprise, I got an email back telling me that this is not a really reliable method of detecting a drive problem. Is this true? Does anyone have opinions on this? I'd really like to hear them, because if this is the case I've been wasting a LOT of money replacing drives over the years.

    The email I got back from the tech: "I have performed extended tests with smartctl and checked for error logging in /var/log/messages to verify the integrity of your drive, and I am satisfied all is well.

    Despite the wording of the message, the drive is most likely fine. Every night, cPanel runs the script /scripts/smartcheck to read the SMART diagnostic information from the drive (using smartctl) and warn the system administrator if the drive is having problems, but smartcheck defines "problems" as including the case where the ATA Error Count is 100 or more.

    The ATA Error Count is the number of errors recorded by the SMART circuitry on the drive, and this count is cumulative over the life of the drive. Since all drives experience some errors in the course of normal operation, the ATA Error Count will always reach 100 at some point, regardless of whether the drive is failing or not. Thus, cPanel's smartcheck script will start to produce incorrect warnings when the error count reaches 100.

    To prevent these false warnings from being sent, you must disable smartcheck with this command: touch /var/cpanel/disablesmartcheck

    Once that is done, you can manually run /scripts/smartcheck once a week or so to keep an eye on the drive. The ATA Error Count should only be a concern if it increases by a large amount, or if it increases consistently. Smaller increases and sporadic jumps can normally be ignored.

    When you do suspect a problem with a drive, you should always perform other tests to confirm the problem. The ATA Error Count by itself is simply not conclusive evidence of a pending failure. Some of those other tests are badblocks and monitoring /var/log/messages for I/O errors. If your concerns about this drive continue to grow, please open a trouble ticket requesting a drive replacement."
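
The rule of thumb in the tech's email (worry if the count jumps by a large amount or keeps climbing, ignore small sporadic bumps) could be sketched in Python roughly like this. The function name and both thresholds (`big_jump=20`, `streak=3`) are placeholders I picked for illustration; the email gives no numbers, so tune them to your own drives.

```python
def assess_error_counts(counts, big_jump=20, streak=3):
    """Check a chronological list of ATA Error Count readings.

    counts   -- readings taken at regular intervals (e.g. weekly)
    big_jump -- hypothetical threshold for a "large" single increase
    streak   -- hypothetical number of consecutive increases that
                counts as "increasing consistently"

    Returns a list of reasons for concern; an empty list means
    nothing stands out under these assumed thresholds.
    """
    reasons = []
    # Change between each reading and the next one
    deltas = [later - earlier for earlier, later in zip(counts, counts[1:])]

    if any(d >= big_jump for d in deltas):
        reasons.append("large jump")

    run = 0
    for d in deltas:
        # Count consecutive readings that went up; reset on a flat reading
        run = run + 1 if d > 0 else 0
        if run >= streak:
            reasons.append("consistent increase")
            break
    return reasons

if __name__ == "__main__":
    print(assess_error_counts([4, 4, 5, 5, 5]))   # one small bump
    print(assess_error_counts([4, 30]))           # one big jump
    print(assess_error_counts([4, 5, 6, 7]))      # climbing every reading
```

Feed it the counts from a manual weekly smartcheck run. An empty result is not a clean bill of health, so the other tests the tech mentions (badblocks, I/O errors in /var/log/messages) still apply when you suspect a drive.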

  2. #2
    Member
    Join Date
    Feb 2004
    Posts
    469

    Seems to be a s.m.a.r.t. and well informed tech you got there.


  3. #3
    Member
    Join Date
    Nov 2003
    Posts
    129

    Originally posted by Izzee:
    Seems to be a s.m.a.r.t. and well informed tech you got there.

    Well, thank you, but I would like someone to confirm that as gospel.
    I'm one of those people who likes to gather lots of opinions and experiences so I can make goooood decisions instead of spillin' my morning coffee.

  4. #4
    Member
    Join Date
    Feb 2004
    Posts
    469

    Try not to spill any coffee over the search facility either, as this has been discussed many times of late and there are many opinions for you to select from.

    Last edited by Izzee; 01-13-2006 at 12:29 AM.

  5. #5
    Super Moderator
    Join Date
    Jun 2002
    Location
    Go on, have a guess
    Posts
    13,495
    Jonathan Michaelson

    Need your cPanel servers secured and tuned?
    cPanel Server Configuration, Security, Recovery and Antivirus/AntiSpam Services
    Developers of the most effective (and free) Firewall & Security Solution for cPanel Servers - csf
    http://www.configserver.com
