The Community Forums

CPU Site slowdown - Help needed

Discussion in 'General Discussion' started by fuzioneer, Jan 23, 2005.

  1. fuzioneer

    fuzioneer Well-Known Member

    Joined:
    Dec 12, 2003
    Messages:
    98
    Likes Received:
    0
    Trophy Points:
    6
    Hiya all, my old WHM/cPanel setup was sweet, but unfortunately I outgrew the server and had to upgrade.

    My old server had 728MB RAM, 2 x 40GB HDs, and an AMD 2400+ CPU, running RH 9.0 with WHM/cPanel etc.

    It was running around 8 phpnuke sites; 3 are very busy and the rest are growing but not really hitting the server. It also hosted a few static sites.

    The new server is an Intel Celeron 2.53GHz with 512MB RAM at present (going up to 2GB in 2 days) and 2 x 180GB SATA drives.

    Now I have been handed my shiny new server, built by the hosting company with CentOS 3.3, Fantastico, RVSkin, cPanel/WHM etc., and my content has been restored onto it, but the CPU keeps maxing out.

    It will be fine for a while, then suddenly the CPU usage climbs and stays there. I managed to get a few stats from WHM -> System Health -> Show Current CPU Usage, and it's always either httpd or perl hogging the majority of the usage, with either one using >70% on its own.

    This post http://forums.cpanel.net/showthread.php?t=33854&highlight=perl+cpu+usage seems to have similar symptoms, but I don't have the same trace message. Mine is as follows:

    select(0, NULL, NULL, NULL, {0, 920000}) = 0 (Timeout)
    time(NULL) = 1106509641
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509642
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509643
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509644
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509645
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509649
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75e90c8) = 9865
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509650
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
    time(NULL) = 1106509651
    waitpid(-1, 0xbfff938c, WNOHANG) = 0
    select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)




    I have the following setup in httpd.conf:-
    KeepAlive Off
    MaxKeepAliveRequests 100
    MinSpareServers 5
    MaxSpareServers 10
    StartServers 5
    MaxClients 200


    NB: I had KeepAlive On originally, but it made no difference On or Off.

    Any comments / feedback gratefully accepted :)
     
  2. fuzioneer

    fuzioneer Well-Known Member

    I think I may have tracked this down a bit further.

    The server was running fine at 0900 this morning, then I went back 20 or so minutes later and it was dog slow.

    I did a top, results below:
    09:51:02 up 15:14, 2 users, load average: 167.59, 116.53, 55.61
    531 processes: 523 sleeping, 6 running, 0 zombie, 2 stopped
    CPU states:    cpu     user     nice   system      irq  softirq   iowait     idle
                 total     1.5%     0.0%     2.7%     0.0%     1.1%    94.4%     0.0%
    Mem: 502452k av, 496908k used, 5544k free, 0k shrd, 3116k buff
    365068k actv, 69360k in_d, 6660k in_c
    Swap: 1044216k av, 522108k used, 522108k free 20048k cached


    As you can see, a helluva lot of processes and the load sky high.

    I did nothing but research for 15 mins, and then all of a sudden the load dropped right off and so did the process count:

    09:56:06 up 15:19, 2 users, load average: 18.10, 89.59, 65.04
    104 processes: 100 sleeping, 2 running, 0 zombie, 2 stopped
    CPU states:    cpu     user     nice   system      irq  softirq   iowait     idle
                 total     0.0%     0.0%     0.0%     0.0%     0.0%     0.2%    99.8%
    Mem: 502452k av, 112044k used, 390408k free, 0k shrd, 6432k buff
    71080k actv, 7704k in_d, 560k in_c
    Swap: 1044216k av, 98932k used, 945284k free 32284k cached

    Another short time later and the server was back to normal:

    10:00:25 up 15:23, 2 users, load average: 0.23, 37.51, 49.14
    109 processes: 103 sleeping, 2 running, 2 zombie, 2 stopped
    CPU states:    cpu     user     nice   system      irq  softirq   iowait     idle
                 total     0.1%     0.0%     0.0%     0.0%     0.0%     0.0%    99.8%
    Mem: 502452k av, 116596k used, 385856k free, 0k shrd, 7880k buff
    75508k actv, 7740k in_d, 544k in_c
    Swap: 1044216k av, 98932k used, 945284k free 32764k cached


    Any idea what could be causing this, or how I can track it down?
     
  3. dezignguy

    dezignguy Well-Known Member

    Joined:
    Sep 26, 2004
    Messages:
    534
    Likes Received:
    0
    Trophy Points:
    16
    You seem to have some high IOWAIT percentages... there were some issues with recent Red Hat kernels on SMP machines causing high loads from iowaits backing up. Make sure you have the latest kernel. The latest for RHEL3/CentOS3 is 2.4.21-27.0.2.EL (if you don't have a multiprocessor machine).
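
    One way to check for this from saved top output is sketched below in Python. It parses the "CPU states" header and "total" row in the form they appear in this thread and flags a sample when iowait dominates. The function names and the 50% threshold are assumptions made for illustration, not anything top or cPanel defines.

```python
def parse_cpu_states(header: str, row: str) -> dict:
    """Map each column of top's 'CPU states' header to its percentage."""
    # Header looks like: "CPU states: cpu user nice system irq softirq iowait idle"
    names = header.split(":", 1)[1].split()[1:]  # drop the leading "cpu" label
    # Row looks like: "total 1.5% 0.0% 2.7% 0.0% 1.1% 94.4% 0.0%"
    values = [float(v.rstrip("%")) for v in row.split()[1:]]  # drop "total"
    return dict(zip(names, values))


def looks_io_bound(states: dict, threshold: float = 50.0) -> bool:
    """True when iowait is at or above the threshold (50% is an arbitrary assumption)."""
    return states.get("iowait", 0.0) >= threshold


if __name__ == "__main__":
    header = "CPU states: cpu user nice system irq softirq iowait idle"
    busy = "total 1.5% 0.0% 2.7% 0.0% 1.1% 94.4% 0.0%"   # the 09:51 sample
    calm = "total 0.0% 0.0% 0.0% 0.0% 0.0% 0.2% 99.8%"   # the 09:56 sample
    for row in (busy, calm):
        states = parse_cpu_states(header, row)
        print(states["iowait"], looks_io_bound(states))
```

    Run against the 09:51 sample, the CPU is almost entirely waiting on disk rather than doing work, which fits a box that is swapping rather than one that is short of CPU.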

    Otherwise, you have something running that is very disk intensive... could be apache/mysql etc.

    When you get your new RAM, I'd suggest optimizing MySQL... and increasing the Apache server limits.
     
  4. fuzioneer

    fuzioneer Well-Known Member

    Hi, thanks for replying.

    My kernel version is 2.4.21-27.0.1.EL on CentOS 3.3.

    Did that have the same issues?

    Where did you get this information from?
     
  5. gunmuse

    gunmuse Well-Known Member

    Joined:
    Jul 3, 2003
    Messages:
    98
    Likes Received:
    0
    Trophy Points:
    6
    Location:
    New Mexico
    You downgraded your CPU. The AMD is one of the best server processors out there, especially the Opterons.

    RHEL3 is just slower than RH 9.0. It's the funniest thing: RH started whining about Microsoft's bloat and then copied it, because they don't know when to say "No, we are not supporting that." Anytime you build an OS that is all things to all people, it's going to slow down. RHEL3 was a noticeable pig on our CPUs. In fact, this month we bought a box and put 7.3 back on it just so we could run games on it without all the extra garbage we don't need.


    The only answer we have come up with is watching the timeouts on everything. Reconfigure your httpd.conf, stop child processes, and turn off KeepAlive. It slows down graphic delivery but keeps all else moving quickly. Take your timeouts down to 2 seconds. Make the client reconnect every time they request a file. It actually increases speed unless you're on an internal network. The default httpd.conf is close to defunct for today's hardware and web needs.

    RAM
    Swap: 1044216k av, 522108k used, 522108k free 20048k cached
    Holy hell man, you need at least another gig of RAM just to keep up. 2 gigs total would help you out. A lot of your hang time is waiting on your hard drives to respond instead of RAM.
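
    The RAM point can be put into rough numbers. The Python sketch below estimates how many Apache children fit in memory before the box starts swapping. The 502452k figure comes from the top output earlier in the thread; the reserved amounts (150 MB and 300 MB for MySQL, Exim, the kernel and so on) and the 15 MB per httpd child are guesses for illustration only, and should be replaced with values measured on the actual server.

```python
def max_clients_that_fit(total_ram_mb: float, reserved_mb: float, per_child_mb: float) -> int:
    """How many Apache children fit in RAM after reserving memory for everything else."""
    usable = total_ram_mb - reserved_mb
    if usable <= 0 or per_child_mb <= 0:
        return 0
    return int(usable // per_child_mb)


if __name__ == "__main__":
    # 502452k from the top output is roughly 490 MB.
    # reserved_mb and per_child_mb are assumed values, not measurements.
    print("512MB box:", max_clients_that_fit(490, 150, 15))
    print("2GB box:  ", max_clients_that_fit(2048, 300, 15))
```

    Under those assumptions, the MaxClients 200 setting posted earlier is far more than 512MB can hold in RAM, so a traffic burst would push the server into swap well before Apache hits its own limit.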

    Also, on every new WHM we put together we have to install about half a dozen Perl modules that the scripts call for, usually fairly common ones at that, like DBI and MySQL.

    A Perl module monitor would be a nice host feature. If someone installs software that asks for a module, just install the damn thing, or at least give me a list of what they want to use; half the time the clients don't even know.

    One of these Exim and/or POP3 updates really started causing us some CPU drain spikes. Make sure the default address for your clients' accounts is set to :blackhole:. Most people don't use the default email, and anything other than that will cause power drains for no usable reason.
     
  6. fuzioneer

    fuzioneer Well-Known Member

    Hi gunmuse, thanks for giving me your take on this.

    I have upgraded the server to 2GB of RAM now. It's weird: it ticks along nicely and then I get sudden massive spikes.

    See the MRTG stats here:

    http://217.112.89.232/mrtg/processes.html - process count
    http://217.112.89.232/mrtg/load.html - load averages
    http://217.112.89.232/mrtg/cpu.html - cpu

    Now I know these may possibly be related to users hitting the site at peak times, but my gut feeling is that something is misconfigured, or some other variable is involved, like someone attacking the server.

    I have KeepAlive off and have already done a few of the tweaks to httpd you mentioned.

    Hmmm, setting the default accounts to blackhole: I did some research on that and most people were saying to set it to fail (which I have). Please explain why you think it should be otherwise.

    Thanks for the feedback all, getting there slowly.

    BTW, anyone know how to get damn MRTG to display in normal figures and not x100?
     
  7. gunmuse

    gunmuse Well-Known Member

    Fail acknowledges the mail versus just trashing it. A black hole is much less load than a fail.

    My next suggestion is one that most will disagree with because of what you read on the web: start more servers and keep a higher reserve.

    Now, everyone's objection to this is that it's a waste of resources when the server is not doing anything.

    It's a web server. If it's not doing anything, then it can waste all the resources it wants waiting for a visitor.

    When Apache needs a few more servers, it starts 2, waits 4 seconds, starts 4, waits 4 seconds, then starts 8, and so on.

    So if 10 guys hit you at once and have to wait for servers to crank up, you get a lot of httpd load, and perl load if it's a Perl script waiting for it to start.
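
    The ramp-up described above can be sketched as a small Python simulation. It assumes the batch size doubles after each wait, as the post says. The defaults (start with 1, wait 1 second, cap at 32 per batch) follow what the Apache 2 prefork documentation describes, while the second call plugs in the figures from the post (start with 2, wait 4 seconds). Which behaviour applies depends on the Apache version, which the thread does not state, so treat both sets of numbers as assumptions.

```python
def seconds_to_spawn(needed: int, first_batch: int = 1,
                     interval_s: float = 1.0, max_batch: int = 32) -> float:
    """Seconds until `needed` extra children exist when the batch size doubles each interval."""
    spawned, batch, elapsed = 0, first_batch, 0.0
    while spawned < needed:
        spawned += batch              # this batch starts immediately
        if spawned >= needed:
            break
        elapsed += interval_s         # wait before the next, larger batch
        batch = min(batch * 2, max_batch)
    return elapsed


if __name__ == "__main__":
    print(seconds_to_spawn(10))                               # documented prefork pacing
    print(seconds_to_spawn(10, first_batch=2, interval_s=4))  # pacing as described in the post
```

    Either way, a burst of visitors waits several seconds for workers that a larger StartServers / MinSpareServers reserve would already have running, which is the trade-off being argued for here.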

    FYI, a server of this type should run smooth until around a 20 load, and then MySQL will slow down.

    I have served html pages all the way up to a 700 load during a DoS attack.

    You're not being attacked, that is for certain. If you are, they are doing a poor job of it.

    One thing that can load a server with the nuke or e107 scripts is the search box built into their sites. They are looping server/MySQL pigs. The phpBB2 search function is horrible too.

    Advise high-use sites to get phpmysearch and crawl their own site; it's a much faster and more accurate type of search.
     
  8. fuzioneer

    fuzioneer Well-Known Member

    Thanks for the advice gunmuse, I'll switch it to blackhole (from the existing fail) and see if there is any improvement, and I'll remove the search function on the nuke sites.

    I am techy, but the problem is I am techy on Windoze, so big learning curve here ;)

    Any other suggestions, particularly with regard to the phpnuke/MySQL/Apache side of life?
     
  9. fuzioneer

    fuzioneer Well-Known Member

    hmm http://217.112.89.232/mrtg/cpu.html

    I changed to blackhole from fail; not much of a change.

    The CPU is being hammered so much now that I can't SSH to the server without getting a remote reboot first and then stopping httpd whilst I work.

    Any more recommendations on how I can get this server into a more stable state until I can get the CPU upgraded?

    I have KeepAlive off atm and timeouts reduced.

    I am sure that with a little tweaking of my.cnf and httpd.conf I can get this server into a state whereby it is at least usable whilst I cost up this CPU upgrade.
     
  10. dezignguy

    dezignguy Well-Known Member

    No, :fail: is the better setting... your server doesn't even have to touch the unwanted email then. Search the forums for more info, as this has been discussed to death.

    But it won't do anything if incoming mail isn't your problem.

    A couple of ideas... have you seen the recent discussions about a new or modified phpBB worm that is causing high loads and Apache crashes when it connects to servers... basically a DDoS worm. It apparently takes all the open slots and just sits there in the READING state... and the default timeout is too high to deal with it. So the suggestion there seems to be to reduce the Apache Timeout to somewhere between 10 and 30 seconds.
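
    A quick way to see whether slots are piling up in the reading state is to tally the mod_status scoreboard. The Python sketch below assumes mod_status is enabled and that the machine-readable status page (the "?auto" form of /server-status) has been saved as text; the sample string in the example is made up, and the function name is hypothetical. The letter meanings follow the key that mod_status prints.

```python
from collections import Counter

# Scoreboard key as printed by mod_status.
SCOREBOARD_KEYS = {
    "_": "waiting", "S": "starting", "R": "reading request", "W": "sending reply",
    "K": "keepalive", "D": "dns lookup", "C": "closing", "L": "logging",
    "G": "graceful finish", "I": "idle cleanup", ".": "open slot",
}


def summarize_scoreboard(status_text: str) -> Counter:
    """Count worker states from the 'Scoreboard:' line of mod_status ?auto output."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(SCOREBOARD_KEYS.get(ch, "unknown") for ch in board)
    return Counter()  # no scoreboard line found


if __name__ == "__main__":
    # Hypothetical sample, not taken from the server in this thread.
    sample = "Total Accesses: 1234\nScoreboard: RRRRRRRR_W.."
    counts = summarize_scoreboard(sample)
    print(counts["reading request"], "of", sum(counts.values()), "slots are reading")
```

    If most slots show as reading while traffic in the logs looks ordinary, that would match the connection-holding behaviour described above, and a lower Timeout should free the slots sooner.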

    Second... if perl usage is high and your sites don't use Perl, make sure that you don't have one of the Perl worms that have been going around... getting in through vulnerable scripts and possibly a vulnerable PHP install.

    I'd highly recommend that you hire a competent server administrator to look over your server and optimize and secure it. You can find a number of them here.
     
  11. jester.ro

    jester.ro Well-Known Member
    PartnerNOC

    Joined:
    Feb 6, 2004
    Messages:
    304
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Bucharest, Romania
    cPanel Access Level:
    DataCenter Provider
    First of all, AMDs suck as server processors :)
    Unstable and always heating too much.

    On the other hand, a Celeron is not a server processor. That Athlon was faster indeed.
    I would suggest getting a P4 Prescott with 1 MB cache. The cache is very important for a server.

    As for optimising:

    Search Google or freshmeat.net for a little program called mytop. It's like top, but for MySQL processes. Check to see if you have too many queries; also check the cache hits.

    You might want to optimize your my.cnf for MySQL-intensive servers. The default my.cnf doesn't have caching enabled, and trust me, it helps a lot.
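
    For the cache-hit check, a rough Python sketch is given here. It assumes a MySQL version with the query cache and that the Qcache_hits and Com_select counters have been read from SHOW STATUS by hand; the function name and the numbers in the example are invented for illustration.

```python
def query_cache_hit_rate(qcache_hits: int, com_select: int) -> float:
    """Share of SELECTs answered from the query cache.

    Com_select only counts SELECTs that were actually executed, so
    hits + Com_select approximates all SELECTs the server received.
    """
    total = qcache_hits + com_select
    return qcache_hits / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counter values, not from the server in this thread.
    rate = query_cache_hit_rate(qcache_hits=300, com_select=100)
    print(f"query cache hit rate: {rate:.0%}")
```

    A rate that stays near zero after enabling the cache suggests the queries are not cacheable or the cache is too small, so it is worth comparing the figure before and after any my.cnf change.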

    Search your hard drive for files called my-huge.cnf or my-large.cnf.

    Back up your existing /etc/my.cnf, replace it with one of those, restart MySQL, and see if there's any improvement.

    You could also try disabling MySQL persistent connections (in php.ini), or enabling them if they're not.
    You must experiment with the 2 numbers, persistent connections and total connections, and find the combination that suits your server best.
     