The Community Forums


Extreme Load Peaks on VMware Server

Discussion in 'General Discussion' started by mikelegg, Jun 8, 2010.

  1. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    I'm running cPanel 11.25.0-S46156 on CentOS 5.5 x86_64 on VMware. The virtual server has 4 CPUs and 8 GB of RAM and is hosting just under 200 sites.

    It runs like a dream 99% of the time, but once every weekday the load skyrockets, bringing the server to its knees. (I've seen the load average reach 160.)

    Once I restart the machine, everything returns to normal until the next day, when at some random time the load skyrockets again.

    The logs reveal nothing unusual. I've been able to watch top a few times while the load has been going through the ceiling and there were no visible processes contributing to the load in any significant way.

    The MySQL slow query log shows whatever queries happen to be running at the time of the high load. These are totally random and appear to be running slowly *because* of the high load, rather than being the cause of it.

    I currently have two separate server management teams monitoring the server as well as my own techs. Copious amounts of logging have been added to the system, but none of the logs have revealed anything significant. (When the server is under load many of the logs are not even written)

    I've added a bash script to PT_LOAD in CSF which restarts apache if the load average reaches 6, but the load spikes so rapidly that this doesn't even have enough resources to fire.
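For readers who want to try something similar, here is a minimal sketch of the kind of load check described above. It is not the script from this post: the function name `load_exceeds`, the optional file argument, and the `service httpd restart` action are all assumptions made for illustration.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the 1-minute load average is above a threshold.
# The optional second argument (default /proc/loadavg) is a hypothetical
# convenience so the check can be tried against a scratch file.
load_exceeds() {
    local threshold="$1" file="${2:-/proc/loadavg}"
    local one_min
    # The first field of /proc/loadavg is the 1-minute load average
    read -r one_min _ < "$file"
    # awk does the floating-point comparison that plain sh cannot
    awk -v l="$one_min" -v t="$threshold" 'BEGIN { exit !(l > t) }'
}

# Example use (assumed action, left commented out on purpose):
# if load_exceeds 6; then service httpd restart; fi
```

As the post notes, a check like this can only help if the machine still has enough free resources to run it once the spike has started.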

    On those rare occasions where I've been able to shell into the box, I've sometimes been able to reduce the load by restarting apache manually, but interestingly if I do this, another load spike will occur within a few hours. Only rebooting the server prevents a re-occurrence (until the next day).

    The host server does not appear to be causing the load either. While the guest machine was under extreme load, the host machine was running at about 25% capacity.

    So I'm posting this in the hope that someone else out there may have had a similar experience and might be able to offer some insights as to the cause of these "phantom" loads.
     
    #1 mikelegg, Jun 8, 2010
    Last edited: Jun 8, 2010
  2. radeonpower

    radeonpower Well-Known Member

    Joined:
    Jul 23, 2009
    Messages:
    129
    Likes Received:
    1
    Trophy Points:
    18
    cPanel Access Level:
    Root Administrator
    I had the same problem. Eventually I gave up and moved the server to a dedicated one. :)
     
  3. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    That's what some of the techs have suggested and I haven't ruled it out as the ultimate solution, but I see there are other users running cPanel/VMware servers without a problem.

    I love the "High Availability" benefits of running VMware on clustered hosts so I would really prefer it if I could just fix this one issue, rather than sacrifice all of the advantages that VMware provides by going back to stand-alone servers.
     
  4. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    I don't know what the problem actually was, but I've fixed it.

    Since restarting the VM each time the load peaked seemed to fix things for about a day, I decided to do a pre-emptive reboot early one morning.

    Since the server wasn't under load at the time I was able to access WHM and do a "Graceful Reboot".

    The problem has not occurred since and it's been about 2 weeks now since the reboot.

    To quote the IT Crowd ... "Have you tried turning it off and on again" :)
     
  5. headout

    headout Well-Known Member

    Joined:
    Aug 20, 2003
    Messages:
    78
    Likes Received:
    0
    Trophy Points:
    6
    Hi Mike,

    Has your VPS been running smoothly since this action? Or did your load peaks return?
     
  6. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    It's interesting that you ask that question, because everything has been fine until about a week ago when I started noticing some more load peaks.

    These are not as dramatic as the previous peaks and the load returns to normal within a few minutes.

    I'm monitoring the server at the moment to see if there's any pattern to the peaks. They appear to coincide with cpanellogd processing, but I'll give it a bit longer before I draw a final conclusion.
     
  7. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    The recent peaks were definitely caused by cpanellogd.

    I've used "Main >> Server Configuration >> Statistics Software Configuration" to prevent log processing between 9:00 AM and 5:00 PM, so the loads won't occur during local business hours.
     
  8. InterServed

    InterServed Well-Known Member

    Joined:
    Jul 10, 2007
    Messages:
    255
    Likes Received:
    2
    Trophy Points:
    18
    cPanel Access Level:
    DataCenter Provider
    Is your VM placed on a RAID array datastore? Does the array have a battery-backed cache? Does it have redundant power supplies? If so, then you might use the noop elevator on your VM storage (if you haven't already).

    That could be accomplished by editing the boot option of the kernel:

    This is just an example from one of my servers:
    Code:
    title CentOS (2.6.18-194.17.4.el5)
            root (hd0,0)
            kernel /vmlinuz-2.6.18-194.17.4.el5 ro root=/dev/VolGroup00/lv_root elevator=noop
            initrd /initrd-2.6.18-194.17.4.el5.img
    
    A reboot is required to activate the noop elevator set on the kernel line. It can also be activated immediately with the following (replace sda if needed):
    Code:
    echo noop > /sys/block/sda/queue/scheduler

    Step 2:
    Add the following to /etc/sysctl.conf:
    Code:
    vm.swappiness = 0
    vm.overcommit_memory = 1
    vm.dirty_background_ratio = 5
    vm.dirty_ratio = 10
    vm.dirty_expire_centisecs = 1000
    dev.rtc.max-user-freq = 1024
    then do:
    Code:
    sysctl -p
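To confirm that the values were actually picked up, something like the following sketch could be used. It is an illustration only: the function name `check_sysctl` and its optional second argument are hypothetical, and it assumes simple single-number values like the ones listed above.

```shell
#!/bin/sh
# Compare "key = value" lines from a sysctl-style file with the live values.
# The optional second argument (default /proc/sys) is a hypothetical
# convenience so the function can be pointed at a scratch directory.
check_sysctl() {
    local conf="$1" root="${2:-/proc/sys}"
    local key want path have
    while IFS='=' read -r key want; do
        # Strip all spaces and tabs (fine for single-number values)
        key=$(printf '%s' "$key" | tr -d ' \t')
        want=$(printf '%s' "$want" | tr -d ' \t')
        case "$key" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # vm.swappiness -> <root>/vm/swappiness
        path="$root/$(printf '%s' "$key" | tr '.' '/')"
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
        else
            have=$(tr -d ' \t\n' < "$path")
            if [ "$have" = "$want" ]; then
                echo "OK $key = $have"
            else
                echo "DIFF $key: file says $want, kernel has $have"
            fi
        fi
    done < "$conf"
}

# Example use: check_sysctl /etc/sysctl.conf
```

Keys that take several values per line would be mangled by the whitespace stripping, so this is only suitable for settings like the ones in this post.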
     
    #8 InterServed, Nov 9, 2010
    Last edited: Nov 9, 2010
  9. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    Yes, it's a RAID 5 array, it's an EMC fibre channel SAN.

    Yes

    Yes

    I'm not sure what noop elevator storage is. I found some references to it for Linux servers, but I don't think it applies to VMware ESX.
     
  10. InterServed

    InterServed Well-Known Member

    Joined:
    Jul 10, 2007
    Messages:
    255
    Likes Received:
    2
    Trophy Points:
    18
    cPanel Access Level:
    DataCenter Provider
    Sorry for my mistake, I was in a rush when I posted to this topic.

    The Linux kernel offers several I/O scheduling algorithms:
    noop, anticipatory, deadline and cfq

    Most Linux distributions use CFQ by default. Based on your answers I would suggest using the noop scheduler (I also use it on my own VMware Linux guests).

    You can see the current scheduler with the following command (replace sda with your block device):
    Code:
    cat /sys/block/sda/queue/scheduler
    In my case it looks like this:
    Code:
    root@server [~]# cat /sys/block/sda/queue/scheduler
    [noop] anticipatory deadline cfq
    
    As you can see, the active scheduler is marked with [ ]. In my case it shows that the noop scheduler is in use.

    You can read more about the I/O scheduling algorithms at the following link:
    http://www.redhat.com/magazine/008jun05/features/schedulers/
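With several block devices, the same check can be scripted. The sketch below pulls out the bracketed name for each device; the function names `active_scheduler` and `list_schedulers` and the optional directory argument are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Print the active scheduler (the name inside [ ]) from one scheduler file.
active_scheduler() {
    # sed keeps only the text between the square brackets
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Print "device: scheduler" for every block device under a sysfs root.
# The optional argument (default /sys/block) is a hypothetical convenience
# so the function can be tried against a scratch directory.
list_schedulers() {
    local base="${1:-/sys/block}"
    local f dev
    for f in "$base"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$base"/}    # e.g. sda/queue/scheduler
        dev=${dev%%/*}       # e.g. sda
        printf '%s: %s\n' "$dev" "$(active_scheduler "$f")"
    done
}

# Example use: list_schedulers
```

Devices whose scheduler file contains no bracketed entry are printed with an empty scheduler name.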
     
    #10 InterServed, Nov 11, 2010
    Last edited: Nov 11, 2010
  11. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    Thanks for the IO scheduler suggestion. This same problem just developed on another VM of ours, so I've changed the scheduler to noop.
    Code:
    echo noop > /sys/block/sda/queue/scheduler
    echo noop > /sys/block/sdb/queue/scheduler
    I'll post the results here
     
  12. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    Not only have the load peaks stopped since changing the IO scheduler to noop, the overall load has dropped considerably.

    Many thanks to InterServed for this tip!

    This advice should be written in mile-high letters in the sky so that everyone using VMware (or any virtual machine) as a shared web server can make this simple but critical adjustment.
     
  13. InterServed

    InterServed Well-Known Member

    Joined:
    Jul 10, 2007
    Messages:
    255
    Likes Received:
    2
    Trophy Points:
    18
    cPanel Access Level:
    DataCenter Provider
    Glad it worked out for you, but don't forget that a change made via:
    Code:
    echo noop > /sys/block/sda/queue/scheduler
    echo noop > /sys/block/sdb/queue/scheduler
    won't survive a server reboot, so you either need to run it again after a server/VM reboot or use the kernel elevator parameter as described in my first instructions post.
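If you go the "run it again" route, the runtime change can be wrapped in a small function, sketched below. The function name `set_scheduler` and its optional directory argument are hypothetical. Calling it from a boot script such as /etc/rc.local is one possible way to reapply it, but that is an assumption for illustration and not something tested in this thread.

```shell
#!/bin/sh
# Switch every sd* block device under a sysfs root to the given scheduler.
# The optional second argument (default /sys/block) is a hypothetical
# convenience so the function can be tried against a scratch directory.
set_scheduler() {
    local sched="$1" base="${2:-/sys/block}"
    local f
    for f in "$base"/sd*/queue/scheduler; do
        [ -w "$f" ] || continue
        # Drop the [ ] markers, put one name per line, and look for an
        # exact match so we only write a scheduler the kernel offers.
        if sed 's/[][]//g' "$f" | tr ' ' '\n' | grep -qx -- "$sched"; then
            echo "$sched" > "$f"
        else
            echo "skip $f: $sched not offered" >&2
        fi
    done
}

# Example use (needs root): set_scheduler noop
```

Unlike the kernel parameter, this only touches devices matching sd*, so other device names would need their own pattern.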
     
  14. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    I've added
    Code:
    elevator=noop
    to grub.conf.

    Will that suffice to persist the noop elevator after reboot?
     
  15. InterServed

    InterServed Well-Known Member

    Joined:
    Jul 10, 2007
    Messages:
    255
    Likes Received:
    2
    Trophy Points:
    18
    cPanel Access Level:
    DataCenter Provider
    Yes, that should do it, as long as you added it to the kernel line.
    Example:
    Code:
    title CentOS (2.6.18-238.9.1.el5)
            root (hd0,0)
            kernel /vmlinuz-2.6.18-238.9.1.el5 ro root=/dev/VolGroup00/lv_root elevator=noop
            initrd /initrd-2.6.18-238.9.1.el5.img
    I forgot to mention that using the kernel elevator parameter makes all block devices use the specified IO scheduler.
     
    #15 InterServed, May 26, 2011
    Last edited: May 26, 2011
  16. mikelegg

    mikelegg Well-Known Member

    Joined:
    Mar 29, 2005
    Messages:
    330
    Likes Received:
    0
    Trophy Points:
    16
    Gotcha - thanks.
     