The Community Forums

Interact with an entire community of cPanel & WHM users!

High server loads - no idea about the reason

Discussion in 'General Discussion' started by hmm, Jan 24, 2006.

  1. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Hi,
    I am running a dual Xeon 2.4 GHz machine with 1 GB RAM and two 80 GB IDE hard disks.
    Around 400 sites are hosted on the server.

    The problem is that for the last few days the server load has been reaching 80+.

    When I checked the processes I could not find anything wrong.

    Following is the output of top:

    Code:
     03:09:12  up 13 days,  2:02,  1 user,  load average: 86.69, 75.87, 49.11
    635 processes: 630 sleeping, 1 running, 3 zombie, 1 stopped
    CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
               total    0.6%    1.1%    0.9%   0.4%     0.2%   96.5%    0.0%
               cpu00    2.7%    0.9%    3.7%   0.0%     0.0%   92.5%    0.0%
               cpu01    0.0%    2.7%    0.0%   1.8%     0.0%   95.3%    0.0%
               cpu02    0.0%    0.9%    0.0%   0.0%     0.9%   98.1%    0.0%
               cpu03    0.0%    0.0%    0.0%   0.0%     0.0%  100.0%    0.0%
    Mem:  1025320k av, 1007264k used,   18056k free,       0k shrd,   18092k buff
                        733688k actv,  137564k in_d,   14792k in_c
    Swap: 2040212k av,  761140k used, 1279072k free                   98572k cached
    
      PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
    30401 root      18   0  1600 1600   884 R     1.6  0.1   0:00   0 top
    17882 nobody    19   4 20240  14M  2896 D N   0.4  1.4   0:00   1 httpd
    17235 nobody    20   4 17160  11M  2852 D N   0.2  1.1   0:00   1 httpd
    17278 nobody    19   4 22024  15M  2940 D N   0.2  1.5   0:01   0 httpd
    17389 nobody    19   4 20188  14M  2916 D N   0.2  1.4   0:00   2 httpd
    
    For some reason vBulletin is not allowing me to post the whole output. Following is the link to download it:

    http://rapidshare.de/files/11710180/top.txt.html

    In the stats, iowait is very high; it touches 100% on the last CPU.

    Can anyone help me out with this issue?
     
    #1 hmm, Jan 24, 2006
    Last edited: Jan 24, 2006
  2. randomuser

    randomuser Well-Known Member

    Joined:
    Jun 25, 2005
    Messages:
    147
    Likes Received:
    0
    Trophy Points:
    16
    Is that the entire ps aux output you put on rapidshare? If so something is obviously horribly wrong:

    635 processes: 630 sleeping, 1 running, 3 zombie, 1 stopped

    That's quite a lot of processes. With 1.2G of swap left, I wouldn't say this is a memory issue, although you've only got 18M of physical memory free in that output, which could be a little better.

    With those iowait states, something is writing to your drive like crazy. I'd be curious to see a few lines of output from "vmstat 1". Try stopping a service, like MySQL, and see if that helps. If it doesn't, try stopping Apache, and so on. It could be that someone's error_log is getting written to constantly.

    Have you noticed a significant, sudden decrease in drive space recently? If so, which partition is losing space (/home, /var, etc.)? What I'd really like to see is all 630+ processes in "ps auxwwwf" format.
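
One hedged way to act on the process-count observation above is to tally processes by state. Many processes in state D (uninterruptible sleep, usually waiting on disk) would fit the iowait figures in the top output. This is only a sketch: it assumes a procps-style ps that accepts "-o stat=", and the function name count_states is purely illustrative.

```shell
# Tally processes by the first letter of their state (R, S, D, Z, T ...).
# Reads one STAT value per line on stdin, e.g. "D", "DN", "S<".
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live system (assumes procps ps):
#   ps -eo stat= | count_states
```

A large D count next to a near-zero R count would point at disk or swap contention rather than CPU-bound work.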
     
  3. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess
    Actually I'd be pretty worried about the amount of swap used (761140k used) as well as the ridiculously high process count. Although the latter is probably the cause of the former. Looks like you've got something spawning processes ad infinitum.
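
Whether the swap usage mentioned here is actively hurting can be judged from the swap-in and swap-out rates in the "vmstat 1" output requested earlier in the thread. The sketch below averages those two columns. It assumes vmstat prints its usual header row naming the si and so columns, and swap_activity is a made-up helper name.

```shell
# Average the swap-in (si) and swap-out (so) columns of vmstat output.
# Column positions are looked up from the header row, so the exact
# layout of the installed vmstat version does not matter.
swap_activity() {
    awk '
        $1 == "r" && $2 == "b" {              # header row naming the columns
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("si" in col) && $1 ~ /^[0-9]+$/ {    # numeric sample row
            si += $(col["si"]); so += $(col["so"]); n++
        }
        END {
            if (n) printf "samples=%d avg_si=%.1f avg_so=%.1f\n", n, si / n, so / n
        }
    '
}

# Usage on a live system:
#   vmstat 1 10 | swap_activity
# Note: the first sample vmstat prints is an average since boot.
```

Sustained non-zero si and so values during the load spike would suggest the box is thrashing, which would also explain high iowait.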
     
  4. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Actually, like chirpy said, someone might be trying to hack some forum or something. The problem was that I did not have enough resources to trace the problem. If I had known the correct command I would definitely have traced it. Anyway, back to the point.

    Currently the server is running fine. I will run the vmstat command once I see the high load again.

    About a sudden decrease in drive space: no, everything seems fine, without much change.

    If it happens again I will definitely run both the commands you provided and post the output.

    Thanks to both of you for helping me. I will update this thread if the problem occurs again.

    By the way, is there anything else I should do when it happens again?

    Thanks
    Deep
     
  5. compunet2

    compunet2 Well-Known Member

    Joined:
    Feb 21, 2003
    Messages:
    310
    Likes Received:
    0
    Trophy Points:
    16
    Do you happen to be running CentOS 4.2 with kernel version 2.6.9?
     
  6. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    It's RHEL 3 with kernel 2.4.21
     
  7. compunet2

    compunet2 Well-Known Member

    Joined:
    Feb 21, 2003
    Messages:
    310
    Likes Received:
    0
    Trophy Points:
    16
  8. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Here we go, the server is loaded again. This time I captured the output randomuser asked for:

    top output
    vmstat 1 output
    ps auxwwwf output (there were too many lines, so many lines from the top are missing)

    The load went down when I stopped httpd. Does this mean someone is attacking the server or a site hosted on it?

    Any solution to this?

    Thanks
    Deep

    Edit: It happens daily around 2:30 PM IST, but I do not see any cron jobs running at that time.
     
    #8 hmm, Jan 26, 2006
    Last edited: Jan 26, 2006
  9. mctDarren

    mctDarren Well-Known Member

    Joined:
    Jan 6, 2004
    Messages:
    664
    Likes Received:
    2
    Trophy Points:
    18
    Location:
    New Jersey
    cPanel Access Level:
    Root Administrator
    Check your domlogs to see who was requesting what pages during the high load period. Wondering if you have a spammer running a perl or php script and hogging httpd...?
     
  10. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India


    Edit: Sorry, I misread it.
    Actually checking the domlogs won't be practical here because I have around 400 sites, and checking the logs of each and every site would be next to impossible.

    Is there any way to trace it, maybe at the time when the load is high?

    Thanks
    Deep
     
    #10 hmm, Jan 26, 2006
    Last edited: Jan 26, 2006
  11. MMarko

    MMarko Well-Known Member

    Joined:
    Apr 18, 2005
    Messages:
    316
    Likes Received:
    0
    Trophy Points:
    16
    Something is wrong within Apache. I had problems with .htaccess. Have you optimized Apache?
     
  12. mctDarren

    mctDarren Well-Known Member

    Joined:
    Jan 6, 2004
    Messages:
    664
    Likes Received:
    2
    Trophy Points:
    18
    Location:
    New Jersey
    cPanel Access Level:
    Root Administrator
    To find your domain logs, run 'updatedb; locate domlogs'. Your domain logs are often located in /usr/local/apache/domlogs/, but not always. Next, run 'man grep' or google the grep command and learn how it can help you in this situation. Then grep the logs for the times/dates your load was heavy. You should at least be able to see what pages, if any, were being accessed at the time. Hope that helps.

    [edit] Aha, you edited your post while I was replying. :) You can still grep the logs to see what was going on. But you need to know how to use grep first. Then you can scan all the logs at once within a certain hour to see who was looking at what.[edit]
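
With hundreds of domain logs, a per-file hit count for the busy hour is easier to read than the raw matching lines. The following is a minimal sketch of that idea, not a cPanel tool: it assumes the logs use the common Apache timestamp form (for example 26/Jan/2006:14 for the 2 PM hour), and hits_per_log is a hypothetical helper name.

```shell
# Print "count path" for the logs in a directory that contain the given
# timestamp prefix, busiest first. Third argument limits the rows shown.
hits_per_log() {
    dir=$1
    stamp=$2
    grep -cFH -- "$stamp" "$dir"/* 2>/dev/null |
        awk -F: '$NF > 0 { print $NF, $1 }' |   # drop logs with no hits
        sort -rn |
        head -n "${3:-10}"
}

# Usage, with the directory mentioned in the post above:
#   hits_per_log /usr/local/apache/domlogs '26/Jan/2006:14'
```

The handful of sites at the top of that list are the ones worth opening and reading in detail.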
     
    #12 mctDarren, Jan 26, 2006
    Last edited: Jan 26, 2006
  13. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Hi,
    I think it will display too many lines if I grep the folder for the time string, because there are around 400 sites.

    Any other solution for this?

    Thanks
    Deep
     
    #13 hmm, Jan 27, 2006
    Last edited: Jan 27, 2006
  14. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    The load went up again.

    I restarted httpd and the load went down.

    I tried to save the log entries from that time using grep, but it was taking too much time and increasing the load, so I had to cancel it partway through.
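
When a full grep over the logs is too heavy for an already loaded box, one cheaper option is to look only at the tail of each log and run the search at low CPU priority. This is a sketch under assumptions: recent_hits is an invented name, the 5000-line default is arbitrary, and nice only lowers CPU priority, so most of the saving comes from reading just the end of the file.

```shell
# Count lines matching a timestamp prefix in the last N lines of one log.
recent_hits() {
    file=$1
    stamp=$2
    lines=${3:-5000}
    # grep exits non-zero when the count is 0; keep the count, ignore the status.
    nice -n 19 tail -n "$lines" "$file" | grep -cF -- "$stamp" || true
}

# Usage (the log path is only a placeholder):
#   recent_hits /usr/local/apache/domlogs/example.com '27/Jan/2006:14'
```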

    One thing I noticed: in my mod_security logs I found many attacks on one site during that time (Mambo attacks).

    I have taken the site offline for now and asked its owner to fix it.

    But now the load is high again, and I can see a pkgacct process run by root. When I go to that process ID's folder in /proc, it shows the following:

    Code:
    dr-xr-xr-x    3 root     root            0 Jan 27 03:49 ./
    dr-xr-xr-x  279 root     root            0 Jan 10 20:07 ../
    -r--r--r--    1 root     root            0 Jan 27 03:49 cmdline
    -r--r--r--    1 root     root            0 Jan 27 03:49 cpu
    lrwxrwxrwx    1 root     root            0 Jan 27 03:49 cwd -> /root/
    -r--------    1 root     root            0 Jan 27 03:49 environ
    lrwxrwxrwx    1 root     root            0 Jan 27 03:49 exe -> /usr/bin/perl*
    dr-x------    2 root     root            0 Jan 27 03:49 fd/
    -r--------    1 root     root            0 Jan 27 03:49 maps
    -rw-------    1 root     root            0 Jan 27 03:49 mem
    -r--r--r--    1 root     root            0 Jan 27 03:49 mounts
    lrwxrwxrwx    1 root     root            0 Jan 27 03:49 root -> //
    -r--r--r--    1 root     root            0 Jan 27 03:49 stat
    -r--r--r--    1 root     root            0 Jan 27 03:49 statm
    -r--r--r--    1 root     root            0 Jan 27 03:49 status
    
    Do you think this could be anything suspicious? I know pkgacct is used to tar up account files, but what puzzles me is that the working directory is /root/ and a perl script is doing it.

    My perl version is the latest, i.e. 5.8.7.

    Any ideas?

    Deep
     
    #14 hmm, Jan 27, 2006
    Last edited: Jan 27, 2006
  15. simplestar

    simplestar Well-Known Member

    Joined:
    Nov 15, 2005
    Messages:
    97
    Likes Received:
    0
    Trophy Points:
    6
    Looking at the resource usage is fine, but you really need to check your logs to find out what is happening around the times your server is having problems in order to fix it. I personally have been getting hit quite hard by various exploits, attacks, etc., and that might be the case for you too. Also, are you running any type of rootkit checks?


    Code:
    /var/log/messages
    /usr/local/apache/logs/error_log
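
To see quickly which script is flooding the Apache error log, the entries can be grouped by the file path they mention. The sketch below assumes PHP-style messages that end in "in /path/to/script.php on line N"; other message formats are simply ignored, and top_error_scripts is a hypothetical name.

```shell
# Read error log lines on stdin and print "count path" for the scripts
# named most often in "... in /path/file on line N" messages.
top_error_scripts() {
    sed -n 's|.* in \(/[^ ]*\) on line [0-9][0-9]*.*|\1|p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }' |      # normalise uniq's padded output
        head -n "${1:-10}"
}

# Usage (reads only the most recent entries to keep the load down):
#   tail -n 20000 /usr/local/apache/logs/error_log | top_error_scripts
```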
     
  16. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Hi,
    The server is protected by APF, BFD, rkhunter and mod_security, and is fully updated with the latest software versions.

    I will check the logs because this problem is not stopping at all. I have contacted chirpy's site for paid support but am still waiting for his reply.

    Deep
     
  17. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    YOU ROCK...
    I had overlooked the error_log file.
    Just checked it and found one script generating PHP errors and increasing the load.

    I have suspended that account. :)

    Thanks a bunch, the problem seems to be fixed now :)

    Deep
     
  18. simplestar

    simplestar Well-Known Member

    Joined:
    Nov 15, 2005
    Messages:
    97
    Likes Received:
    0
    Trophy Points:
    6
    Happy to hear you were able to fix the problem :)
     
  19. rip_curl

    rip_curl Well-Known Member

    Joined:
    Jan 30, 2005
    Messages:
    81
    Likes Received:
    0
    Trophy Points:
    6
    What script was that?
    What does Apache say about this script in your error_log?
     
  20. hmm

    hmm Well-Known Member

    Joined:
    Jan 11, 2006
    Messages:
    56
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    India
    Hi,
    It was a custom script causing problems with the feof and fgets functions.
    I deleted it.

    The second problem was a Mambo exploit: one site was being attacked every other minute by some bot, so I had to delete that site and ask the client to point their name servers somewhere else.

    Regards,
    Deep
     