[cPanel smartcheck] Possible Hard Drive Failure Soon

Discussion in 'General Discussion' started by sreevishnu, Dec 13, 2005.

  1. sreevishnu

    sreevishnu Member

    Joined:
    Aug 18, 2003
    Messages:
    23
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    india
    Hi,

    Since last night we've been getting this message from a lot of our servers, 6 so far, and more coming in. It's highly unlikely that all the hard drives would decide to have problems together :p , so I'm wondering if it might be a problem with cPanel's smartcheck.

    Is anybody else having this problem? :confused:

    And there aren't any errors logged either:

    S.M.A.R.T Errors on /dev/hda
    From Command: /usr/sbin/smartctl -l /dev/hda
    SMART Error Log:
    SMART Error Logging Version: 1
    No Errors Logged
    ----END /dev/hda--

    S.M.A.R.T Errors on /dev/hdb
    From Command: /usr/sbin/smartctl -l /dev/hdb
    SMART Error Log:
    SMART Error Logging Version: 1
    No Errors Logged
    ----END /dev/hdb--

    Any ideas?
     
  2. PanelGuy

    PanelGuy Well-Known Member

    Joined:
    Oct 13, 2004
    Messages:
    106
    Likes Received:
    0
    Trophy Points:
    16
    SMART Device may fail

    Me too! Although I have suspected HD problems, maybe we are both about to feel disaster!
     
  3. sreevishnu

    sreevishnu Member

    Joined:
    Aug 18, 2003
    Messages:
    23
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    india
    God! ...hard drive failures on 6 servers???!!!!! :eek: I sure hope not! :p
     
  4. PanelGuy

    PanelGuy Well-Known Member

    Joined:
    Oct 13, 2004
    Messages:
    106
    Likes Received:
    0
    Trophy Points:
    16
    SMART Warning

    Hmm, on 6 servers? Perhaps it's time to visit Bugzilla, what do you think?
     
  5. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess
    Not sure what the problem is that you're asking about. What you've posted shows no errors. If it's saying that there was a problem but displays no errors then, AFAIK, it's not a problem with cPanel's implementation of smartcheck. It's probably either:

    1. The disk(s) really are clocking SMART errors

    2. There's an incompatibility between smartcheck and your drives' SMART implementation

    Often it is 2; sometimes it is 1. The safest way to check is to run a SMART check yourself and note the number of errors. Try it again after a few days and see if the number of errors changes.

    If you want to disable smartcheck then look at the top of the script and it shows you a file you can create to have it skipped.
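
The manual check described above (run a SMART check and note the number of errors) could be sketched roughly as follows. This is only an illustration: the function names `ata_error_count` and `read_error_log` are made up for this example, the `smartctl` path is the one shown in the emails in this thread, and the parsing assumes the "ATA Error Count: N" / "No Errors Logged" wording seen in the posted output.

```python
import re
import subprocess
from typing import Optional

# Matches the summary line seen in this thread, e.g. "ATA Error Count: 77 (...)".
ERROR_COUNT_RE = re.compile(r"ATA Error Count:\s*(\d+)")


def ata_error_count(error_log: str) -> Optional[int]:
    """Return the error count found in smartctl error-log text.

    Returns 0 for "No Errors Logged" and None when neither form is found,
    so unreadable output is not mistaken for a healthy drive.
    """
    match = ERROR_COUNT_RE.search(error_log)
    if match:
        return int(match.group(1))
    if "No Errors Logged" in error_log:
        return 0
    return None


def read_error_log(device: str, smartctl: str = "/usr/sbin/smartctl") -> str:
    """Run `smartctl -l error <device>` and return its text output (needs root)."""
    # smartctl uses its exit status as a bit mask, so a non-zero code is
    # deliberately not treated as a failure here.
    result = subprocess.run(
        [smartctl, "-l", "error", device],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Sample text copied from output posted in this thread.
    sample = "ATA Error Count: 2\nError 2 occurred at disk power-on lifetime: 0 hours"
    print(ata_error_count(sample))
```

On a real server you would pass the result of `read_error_log("/dev/hda")` to `ata_error_count`, write the number down, and repeat a few days later.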
     
  6. myusername

    myusername Well-Known Member
    PartnerNOC

    Joined:
    Mar 6, 2003
    Messages:
    691
    Likes Received:
    1
    Trophy Points:
    18
    Location:
    chown -R us.*yourbase*
    cPanel Access Level:
    DataCenter Provider
    I think cPanel has been busy making a new version of smartcheck. If you search Bugzilla, it is in there. You might copy the script they have written and see if it clears up any of your problems, or lack thereof.

    http://bugzilla.cpanel.net/show_bug.cgi?id=336
     
  7. Salman75

    Salman75 Well-Known Member

    Joined:
    Jan 20, 2004
    Messages:
    102
    Likes Received:
    0
    Trophy Points:
    16
    We also got this email from one of our servers:

    S.M.A.R.T Errors on /dev/hdc
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdc
    ATA Error Count: 2
    Error 2 occurred at disk power-on lifetime: 7 hours
    Error 1 occurred at disk power-on lifetime: 1 hours
    ----END /dev/hdc--


    We have been seeing repeated Apache failures on this server, though :(
     
  8. bking

    bking Well-Known Member

    Joined:
    Mar 1, 2004
    Messages:
    206
    Likes Received:
    1
    Trophy Points:
    18
    Location:
    Sydney
    I am seeing the following errors, which have not occurred before...

    Using smartcheck config 5.32 for smartctl(5.1)
    Checking /dev/hda....
    Errors:
    ATA Error Count: 77 (device log contains only the most recent five errors)
    Error 77 occurred at disk power-on lifetime: 4248 hours
    Error 76 occurred at disk power-on lifetime: 2152 hours
    Error 75 occurred at disk power-on lifetime: 2092 hours
    Error 74 occurred at disk power-on lifetime: 2092 hours
    Error 73 occurred at disk power-on lifetime: 2092 hours

    Checking /dev/hdb....
    Errors:
    ATA Error Count: 2
    Error 2 occurred at disk power-on lifetime: 0 hours
    Error 1 occurred at disk power-on lifetime: 0 hours
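
Reports like the one above list when each logged error occurred, in power-on hours. A small sketch of pulling those values out, so that old errors can be told apart from recent ones, might look like this. The helper names `error_events` and `latest_error_hour` are invented for this example, and the pattern assumes the exact line wording shown in this thread.

```python
import re
from typing import List, Optional, Tuple

# One match per logged error; findall() also copes with reports where
# several "Error N occurred ..." entries were run together on one line.
ERROR_LINE_RE = re.compile(
    r"Error (\d+) occurred at disk power-on lifetime: (\d+) hours"
)


def error_events(report: str) -> List[Tuple[int, int]]:
    """Return (error number, power-on hours) pairs in the order they appear."""
    return [(int(num), int(hours)) for num, hours in ERROR_LINE_RE.findall(report)]


def latest_error_hour(report: str) -> Optional[int]:
    """Return the highest power-on hour at which an error was logged, if any."""
    events = error_events(report)
    return max(hours for _, hours in events) if events else None


if __name__ == "__main__":
    # Sample text copied from the /dev/hda report above.
    sample = """ATA Error Count: 77 (device log contains only the most recent five errors)
Error 77 occurred at disk power-on lifetime: 4248 hours
Error 76 occurred at disk power-on lifetime: 2152 hours
Error 75 occurred at disk power-on lifetime: 2092 hours
Error 74 occurred at disk power-on lifetime: 2092 hours
Error 73 occurred at disk power-on lifetime: 2092 hours"""
    print(error_events(sample))
    print(latest_error_hour(sample))
```

Comparing the latest error hour with the drive's current power-on hours (on drives that report it, the Power_On_Hours attribute in `smartctl -A` output) gives a rough idea of how long ago the last error was logged.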
     
  9. jackie46

    jackie46 BANNED

    Joined:
    Jul 25, 2005
    Messages:
    537
    Likes Received:
    0
    Trophy Points:
    0
    We have been getting these errors since the upgrade to R82.
     
  10. Salman75

    Salman75 Well-Known Member

    Joined:
    Jan 20, 2004
    Messages:
    102
    Likes Received:
    0
    Trophy Points:
    16
    Yes, we have seen these errors on MANY boxes. We had many drives checked by NOC teams. Here is their reply:

     
  11. DigiCrime

    DigiCrime Well-Known Member

    Joined:
    Nov 27, 2002
    Messages:
    399
    Likes Received:
    0
    Trophy Points:
    16
    I got a similar response as well. The question is, what if the drive is going bad?

     
  12. Salman75

    Salman75 Well-Known Member

    Joined:
    Jan 20, 2004
    Messages:
    102
    Likes Received:
    0
    Trophy Points:
    16
    Well, that's not a nice thought, is it? :(

    Maybe one of the cPanel staff can shed some light on what's going on.
     
  13. Murtaza_t

    Murtaza_t Well-Known Member

    Joined:
    Jan 24, 2005
    Messages:
    476
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Earth
    cPanel Access Level:
    Website Owner
    Well, we also got these errors on one of our servers but didn't bother, as we were already switching servers. Does it really have something to do with cPanel..? :confused:
     
  14. dropby23

    dropby23 Well-Known Member

    Joined:
    Jan 16, 2005
    Messages:
    155
    Likes Received:
    0
    Trophy Points:
    16
    Does not work properly for SCSI devices.
     
  15. pcsousa

    pcsousa Well-Known Member

    Joined:
    May 28, 2004
    Messages:
    63
    Likes Received:
    0
    Trophy Points:
    6
    Where are your servers located?
    Recently, 2 of our servers, both located in the SAME datacenter at EV1, needed hard drive replacements in the same week, 1 or 2 days apart. Now, with the new hard drives, I'm starting to receive these alerts. This one on one server:
    S.M.A.R.T Errors on /dev/hda
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hda
    ATA Error Count: 10 (device log contains only the most recent five errors)
    Error 10 occurred at disk power-on lifetime: 14925 hours
    Error 9 occurred at disk power-on lifetime: 14925 hours
    Error 8 occurred at disk power-on lifetime: 14925 hours
    Error 7 occurred at disk power-on lifetime: 14925 hours
    Error 6 occurred at disk power-on lifetime: 14925 hours
    ----END /dev/hda--

    and this one on the other:
    S.M.A.R.T Errors on /dev/hdb
    From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/hdb
    ATA Error Count: 1
    Error 1 occurred at disk power-on lifetime: 315 hours
    ----END /dev/hdb--

    I'd appreciate confirmation from cPanel on this bug. Otherwise I'll need to ask EV1 if they are playing baseball at IDC1!

    Regards.
     
  16. jamesbond

    jamesbond Well-Known Member

    Joined:
    Oct 9, 2002
    Messages:
    738
    Likes Received:
    1
    Trophy Points:
    18
    I upgraded to the latest release, and I noticed that in the upcp mail smartcheck only mentions one drive (there are 3 - hda, hdb and hdc):

    Using smartcheck config 5.32 for smartctl(5.1)
    Checking /dev/hdb....Ok

    I verified (smartctl -i /dev/hda) that SMART support is enabled:

    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    Do I need to manually configure smartd.conf?
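
The check described above (reading the "SMART support is:" lines from `smartctl -i`) could be repeated for each drive with a small helper such as the one below. This is a sketch only: `smart_status` is a name made up for this example, and it assumes the two-line wording quoted in this post.

```python
import re
from typing import Dict

# The two status lines quoted in this post, matched at the start of a line.
AVAILABLE_RE = re.compile(r"^\s*SMART support is:\s*Available\b", re.MULTILINE)
ENABLED_RE = re.compile(r"^\s*SMART support is:\s*Enabled\b", re.MULTILINE)


def smart_status(info_output: str) -> Dict[str, bool]:
    """Report whether `smartctl -i` text says SMART is available and enabled."""
    return {
        "available": bool(AVAILABLE_RE.search(info_output)),
        "enabled": bool(ENABLED_RE.search(info_output)),
    }


if __name__ == "__main__":
    # Sample text copied from the output quoted in this post.
    sample = (
        "SMART support is: Available - device has SMART capability.\n"
        "SMART support is: Enabled\n"
    )
    print(smart_status(sample))
```

Running this against the `smartctl -i` output of hda, hdb and hdc in turn would show whether the drives missing from the upcp mail differ in SMART status from the one that was checked.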
     
    #16 jamesbond, Jan 9, 2006
    Last edited: Jan 9, 2006
  17. jozeph

    jozeph Well-Known Member

    Joined:
    Apr 16, 2004
    Messages:
    59
    Likes Received:
    0
    Trophy Points:
    6
    begin...

    Hi,

    I began to receive these errors after the upgrade to _95.
     
  18. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess
    There's little that cPanel can do about incompatibilities between the utility and some S.M.A.R.T drives. I see many such issues, especially on EV1 servers, where the compatibility issue between the drives they use and the smartmontools application throws up errors. The important thing is to watch the error count. If it does not increase then there should not be a problem. If it does increase it's almost definitely a disk going bad.

    Whether you live with the error emails and keep track of the errors, or create the file to block the script is your choice.

    cPanel are caught between a rock and a hard place with this, really. The older version of the application that cPanel used for ages was broken and didn't work on as many drives. This new version (remember it's a GPL utility) works with more drives, but does kick up false-positives.
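
Keeping track of the error count between runs, as suggested above, could be automated along these lines. This is a hedged sketch rather than anything cPanel ships: the function names and the idea of a JSON state file are assumptions made for this example, and the counts themselves would come from whatever SMART check you already run.

```python
import json
from pathlib import Path
from typing import Dict, Tuple


def increased_devices(
    previous: Dict[str, int], current: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return {device: (old count, new count)} for counts that went up.

    Devices with no earlier reading are not flagged, because a single
    reading cannot show a trend.
    """
    grown = {}
    for device, count in current.items():
        old = previous.get(device)
        if old is not None and count > old:
            grown[device] = (old, count)
    return grown


def load_counts(path: str) -> Dict[str, int]:
    """Load the counts saved by an earlier run, or {} on the first run."""
    state = Path(path)
    return json.loads(state.read_text()) if state.exists() else {}


def save_counts(path: str, counts: Dict[str, int]) -> None:
    """Save the current counts so the next run can compare against them."""
    Path(path).write_text(json.dumps(counts, indent=2))


if __name__ == "__main__":
    # Illustrative numbers only; the second reading is invented.
    earlier = {"/dev/hda": 77, "/dev/hdb": 2}
    later = {"/dev/hda": 79, "/dev/hdb": 2}
    print(increased_devices(earlier, later))
```

A stable count, like /dev/hdb in the example, matches the false-positive case described above; a count that keeps climbing is the one to act on.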
     
  19. jackie46

    jackie46 BANNED

    Joined:
    Jul 25, 2005
    Messages:
    537
    Likes Received:
    0
    Trophy Points:
    0
    All my boxes are upgraded to R95 and only one box is sending me this error. It's my RHEL 3 server. It does not happen with RH 7.1, 7.2 or 9.
     
  20. jackie46

    jackie46 BANNED

    Joined:
    Jul 25, 2005
    Messages:
    537
    Likes Received:
    0
    Trophy Points:
    0
    I don't think it has anything to do with hard drives and controllers. My new drive has been in the box for the past 2 months and we never received a smartcheck notice until we upgraded to R80. Once the box was upgraded to R80 we started getting these notices. That just proves it is the smartcheck utility that was upgraded, SEE CHANGELOG, not the drive or controller.
     
