The Community Forums

Hard Disk Problems

Discussion in 'General Discussion' started by tarheel, Jan 30, 2005.

  1. tarheel

    tarheel Registered

    Joined:
    Dec 29, 2004
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    OK,

    Really hoping somebody can help here. Our dedicated server started locking up a few days ago - we could ping it, but had no ssh or http access. Rebooting solved the problem.

    It happened again, then again. Some investigation showed:
    1) All the filesystems were READ-ONLY
    2) The following messages appeared in /var/log/messages

    Jan 30 11:24:26 srv1 kernel: blk: queue c0402f40, I/O limit 4095Mb (mask 0xffffffff)
    Jan 30 11:24:26 srv1 kernel: blk: queue c0403080, I/O limit 4095Mb (mask 0xffffffff)
    Jan 30 11:24:26 srv1 kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
    Jan 30 11:24:26 srv1 kernel:
    Jan 30 11:24:26 srv1 kernel: hda: drive not ready for command
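For anyone reading the log lines above: the names in braces are just the bits set in the IDE status byte. Here is a minimal sketch of that decoding, assuming the standard ATA status register layout; `STATUS_BITS` and `decode_status` are names made up for this example, not anything from the kernel or from cPanel.

```python
# Bit masks of the IDE/ATA status register, highest bit first.
# The labels follow the names the kernel prints inside the braces.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of the bits that are set in a status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x58 is the value from the "status error" line in the log.
    print(decode_status(0x58))
```

So 0x58 has no Error bit set. As far as I can tell, the complaint is that DataRequest was still set when the kernel wanted to issue a new command, which would explain the "drive not ready for command" line that follows.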

    Has anybody seen this before? My guess is we have been hacked, or have a bad hard drive.

    HELP - any thoughts?

    Simon
     
  2. haze

    haze Well-Known Member

    Joined:
    Dec 21, 2001
    Messages:
    1,550
    Likes Received:
    3
    Trophy Points:
    38
    What OS is this? Have you clicked the option to enable DMA from within WHM? Have you tried turning off DMA to see if this resolves the problem? My guess is it may just be a misconfiguration.
     
  3. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess
  4. jester.ro

    jester.ro Well-Known Member
    PartnerNOC

    Joined:
    Feb 6, 2004
    Messages:
    304
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Bucharest, Romania
    cPanel Access Level:
    DataCenter Provider
    looks like a hdd that is about to fail.
    back up your data and ask the dc to replace the hard drive.
     
  5. haze

    haze Well-Known Member

    Joined:
    Dec 21, 2001
    Messages:
    1,550
    Likes Received:
    3
    Trophy Points:
    38
    Although that's a possibility, it's not a direct sign of failure. I've seen this sort of error most commonly with misconfigured DMA settings.
     
  6. jester.ro

    jester.ro Well-Known Member
    PartnerNOC

    Joined:
    Feb 6, 2004
    Messages:
    304
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Bucharest, Romania
    cPanel Access Level:
    DataCenter Provider
    maybe, but turning dma off drops performance to the point where you can't use the server anymore.

    do a hdparm /dev/hda (if your drive is hda)
    and paste the results here

    also, i only have fedora 1 on my cpanel servers, and i see that smartmontools is installed by default (never used it though)

    do a smartctl -a /dev/hda
    and look at the results

    mine look like this:



    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always   -           0
      3 Spin_Up_Time            0x0007   100   253   021    Pre-fail  Always   -           0
      4 Start_Stop_Count        0x0032   100   100   040    Old_age   Always   -           15
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
      7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always   -           0
      9 Power_On_Hours          0x0032   091   091   000    Old_age   Always   -           6686
     10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always   -           0
     11 Calibration_Retry_Count 0x0013   100   253   051    Pre-fail  Always   -           0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           15
    194 Temperature_Celsius     0x0022   109   006   000    Old_age   Always   -           34
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
    197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -           0
    198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always   -           0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always   -           0
    200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline  -           0


    so you want a raw value of "0" (the last column) for every id that represents an error, otherwise...
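If you would rather not eyeball the table, the check can be scripted. This is a rough sketch only, assuming the ten-column attribute layout shown above; `ERROR_ATTRIBUTES`, `parse_attributes` and `nonzero_error_attributes` are names invented for this example, and the choice of which attributes count as errors is my assumption, since vendors use the raw column differently.

```python
# Attributes whose raw value counts errors or remapped sectors.
# This selection is an assumption for illustration; vendors differ.
ERROR_ATTRIBUTES = {
    "Reallocated_Sector_Ct",
    "Spin_Retry_Count",
    "Calibration_Retry_Count",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def parse_attributes(text):
    """Return (id, name, raw_value) for each attribute row in the text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric ID,
        # so header lines and blank lines are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            rows.append((int(fields[0]), fields[1], fields[9]))
    return rows


def nonzero_error_attributes(text):
    """Return the error-type attributes whose raw value is above zero."""
    flagged = []
    for attr_id, name, raw in parse_attributes(text):
        if name in ERROR_ATTRIBUTES and raw.isdigit() and int(raw) > 0:
            flagged.append((attr_id, name, int(raw)))
    return flagged


# Three rows copied from the output posted above.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
9 Power_On_Hours 0x0032 091 091 000 Old_age Always - 6686
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 0
"""

if __name__ == "__main__":
    # An empty list means none of the listed error counters is above zero.
    print(nonzero_error_attributes(SAMPLE))
```

To use it on a live box, save the output of smartctl -A /dev/hda to a file and pass the contents to nonzero_error_attributes. Counters such as Power_On_Hours are ignored on purpose, since a nonzero value there is normal.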
     
  7. Blue|Fusion

    Blue|Fusion Well-Known Member

    Joined:
    Sep 12, 2004
    Messages:
    378
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Cleveland, Ohio
    This just happened to a friend of mine this past week. He updated to FC3, and a few days later LogWatch started reporting kernel errors about the hard drive very similar to those. I told managed about it and they confirmed that the drive is at fault. I got my friend to back up the accounts and submit a ticket to managed.com. It has been 2 days now and there is still no replacement or any acknowledgement of it (we have also submitted 2 tickets now, and no one is ever on the live support).
     