Huge increase in Apache processes

Discussion in 'General Discussion' started by GoWilkes, Jun 1, 2019.

  1. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    I'm having a problem that I can't figure out, and I'm wondering if it's cPanel related? If not, maybe you guys will have an idea of how to narrow it down.

    Yesterday from around 5:30am until 7am, I had a huge increase in Apache processes that was causing my server to freeze up. I normally don't have more than 50 or so processes during my peak time, but this period was hitting the Server Limit that I had set in Apache configuration of 100.

    By the time I saw it, though, it had ended.

    Then at around 4pm, it started again. This time I was there to see it, but couldn't find any reason for it. I checked the number of connections using:

    Code:
    netstat -plan | grep :80 | awk '{print $5}' | cut -d : -f 1 | sort | uniq -c | sort -nr | head
    but didn't see anything unexpected. I rebooted Apache, then MySQL, then the entire server, but none of them had any impact.

    I was able to stop the server from freezing up by increasing Server Limit in Apache configuration to 256, but that's just a Band-aid. My number of Apache processes has stayed between 100 and 150 all night and all day, even when netstat showed that I only had 4 or 5 connections.

    It's also notable that "Individual Interrupts" and "Disk Latency" in Munin went crazy at the same time.

    I'm not sure what "Individual Interrupts" means, but an orange graph that's usually near 1e+02 dropped down below 1e-04.

    And under "Disk Latency", /dev/xvdb/ has a green graph that's usually at around 1e-02 that dropped down to 1e-04. That made me suspect hardware failure, but I messaged Softlayer (who has the worst service now) and they said that, since it's a virtual server, I wouldn't see hardware errors like that.

    So I'm not sure if the change in Interrupts and Latency is relevant, or just a symptom of another problem.

    I'm running CentOS 6.10 xen hvm, and WHM is v 76.0.20. I'm still running EasyApache 3, so WHM/cPanel hasn't updated to 78.

    Any suggestions you guys can give would be greatly appreciated!! Thanks in advance!
     
  2. GOT

    GOT Get Proactive! PartnerNOC

    Joined:
    Apr 8, 2003
    Messages:
    1,485
    Likes Received:
    187
    Trophy Points:
    193
    Location:
    Chesapeake, VA
    cPanel Access Level:
    DataCenter Provider
    Well, it sounds like you are getting some kind of DoS attack. This command:

    /usr/bin/lynx -dump -width 500 http://127.0.0.1/whm-server-status | grep GET | awk '{print $12}' | sort | uniq -c | sort -rn | head

    Will show you the number of connections each domain has active at that moment and this command:

    netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n

    Will show you the number of connections to your server per connecting IP.

    The problem with the command you used above is that it only reports port 80 traffic whereas if they are attacking on SSL it would not show those connections.
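
    To make the port 80 vs. SSL point concrete, here is a minimal sketch that counts connections per remote IP on both ports at once. The function name count_by_remote_ip is made up for illustration, and it assumes netstat -ntu style columns (local address in column 4, remote address in column 5).

```shell
#!/bin/sh
# Hypothetical helper: count connections per remote IP for the given local ports.
# Reads `netstat -ntu`-style lines on stdin:
#   proto recv-q send-q local-address foreign-address state
count_by_remote_ip() {
    ports_re="$1"    # alternation of local ports, e.g. '80|443'
    awk -v re=":(${ports_re})\$" '
        $4 ~ re {                      # local address ends in one of the ports
            ip = $5
            sub(/:[0-9]+$/, "", ip)    # strip the remote port
            n[ip]++
        }
        END { for (ip in n) print n[ip], ip }
    ' | sort -rn
}
```

    Usage would be something like: netstat -ntu | count_by_remote_ip '80|443' | head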
     
  3. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    Thanks for the commands, those are very helpful! I didn't think about changing it from :80 after I moved everything to HTTPS.

    But I'm still not seeing a high number of connections. From the first command, I have 28 connections right now, but Munin is showing about 100 Apache processes; roughly double the number that I had at this time on May 30.

    And using the second command, the IP with the highest number of connections is a local IP, with 13 connections (pretty much what I would expect).

    I even blocked all non-US IP addresses in CSF (firewall) using CC_ALLOW_FILTER (only allowing US), but it had no noticeable impact on the problem.
     
  4. GOT

    GOT Get Proactive! PartnerNOC

    Joined:
    Apr 8, 2003
    Messages:
    1,485
    Likes Received:
    187
    Trophy Points:
    193
    Location:
    Chesapeake, VA
    cPanel Access Level:
    DataCenter Provider
    You might want to read the docs on that CSF filter. If my memory serves me, I don't think it works like you're expecting.

    As for Munin, I would not necessarily use that for real-time diagnostics.

    ps axf|grep httpd|wc will give you a live count of Apache processes. From your numbers it doesn't sound like an attack, but you should look at your general Apache settings. I believe max children/servers is set to 150 by default, and if you are exceeding that then pages won't load.

    I would also look at Apache status in whm because sometimes your eyes can show you things that just getting numbers from commands doesn't reveal.
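
    One caveat with ps|grep counts is that the grep process itself can show up in the result. A minimal sketch that avoids this by matching the exact command name, assuming a Linux /proc filesystem (count_procs is a made-up name):

```shell
#!/bin/sh
# Hypothetical helper: count running processes whose command name is exactly $1.
# Reads /proc/<pid>/comm (Linux only), so no grep process can match itself.
count_procs() {
    # Processes may exit between the glob and the read, hence 2>/dev/null.
    cat /proc/[0-9]*/comm 2>/dev/null | grep -cxF -- "$1" || true
}
```

    Usage would be: count_procs httpd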
     
  5. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    This is what I was going by on CC_ALLOW_FILTER:

    And this:

    crybit.com/block-whole-countries-csf/

    I used the command you posted (ps axf|grep httpd|wc ) and this was the result:

    46 366 3106

    There wasn't a column header, though, so I'm not sure what I'm looking at here. It's 1:30am here right now, and the 46 matches what Munin shows for the current number of processes, but I'm not sure what the 366 or 3106 represent. Regardless, I would usually have 46 processes at peak time, not at 1:30. It should be more like 15-20 right now.
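
    On the three numbers: wc with no options prints lines, words, and bytes, in that order, so 46 366 3106 would be 46 lines (one per matching process line), 366 words, and 3106 bytes. A tiny sketch with made-up input:

```shell
#!/bin/sh
# wc with no options prints three columns: lines, words, bytes.
printf 'a b\nc d e\n' | wc       # 2 lines, 5 words, 10 bytes

# -l keeps only the line count, which is the number wanted for a process count.
printf 'a b\nc d e\n' | wc -l
```

    So ps axf|grep httpd|wc -l would print just the first number.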


    You're right, and that turned out to be why my site was freezing up. Raising the number stopped it from freezing, but I have no clue why it increased in the first place :-(

    Possibly because of the increase I made on Max Clients and Server Limit, but I do have about 100 of these:

    ::1 myservername.com OPTIONS * HTTP/1.0

    I'm guessing that's normal, though... 100+/- free slots?
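
    As far as I know, OPTIONS * HTTP/1.0 requests from ::1 are Apache's own internal dummy connections, which the parent process uses to wake up its children, so they are usually harmless. A rough sketch for separating them from real requests in a text dump of the status page; the function name and the sample line layout are assumptions, not the exact whm-server-status format:

```shell
#!/bin/sh
# Hypothetical helper: split server-status request lines (stdin) into
# Apache's internal dummy requests and everything else.
summarize_status() {
    awk '
        / (GET|POST|HEAD|OPTIONS) / {
            if ($0 ~ /OPTIONS \* HTTP\/1\.0/) dummy++
            else other++
        }
        END { printf "dummy=%d other=%d\n", dummy, other }
    '
}
```

    Usage would be along the lines of: /usr/bin/lynx -dump -width 500 http://127.0.0.1/whm-server-status | summarize_status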
     
    #5 GoWilkes, Jun 2, 2019
    Last edited by a moderator: Jun 2, 2019
  6. MaFt

    MaFt Registered

    Joined:
    Jun 3, 2019
    Messages:
    2
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    UK
    cPanel Access Level:
    Reseller Owner
    I'm following this as I've seen exactly the same. For years my sites averaged 4-6 Entry Processes and suddenly on Friday around 4-5pm UK time I was hitting "resource limit is reached" errors as these were limited to 20 on this server.

    I'm a reseller though and have no control over the limits. I've managed to minimise this by shutting down 1 site completely and using Cloudflare's "I'm under attack" to reduce the number of visitors. Not ideal though as it's meant a 40% loss of income over the weekend compared to normal - but at least the sites are online.

    The hosts are being painfully slow and keep saying they'll increase the limits. They still haven't. However, they've still not actually responded to my main query as to why the sites in question, with no changes at my end, are suddenly being reported as using a lot more processes than previously. Looking at the cPanel "concurrent usage" logs for 30 days you can see the sudden spike from Friday.

    It seems very weird that the only similar thing I can find is this post, and that the same issue also started on Friday, around the same time (assuming the original poster is in the US).

    I'm hopeful my hosts can find out what's going on and I'll certainly report back here if they find anything out.
     
    #6 MaFt, Jun 3, 2019
    Last edited: Jun 3, 2019
  7. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    You're right, MaFt, I'm in eastern US. That's too much to be a coincidence, I think.

    I ran ClamAV and rkhunter, and neither found anything, so I'm ruling out a virus on my end.

    Right now (roughly 2pm EST) I have 101 busy Apache servers, but only 46 connections. The IP with the most connections has 13, which is reasonable, so I think that I can rule out a DDoS attack.

    My RAM is high, too; I'm usually at around 3G at this time of day, but it's currently over 4G (I have 4G of RAM, so it's maxing out). My CPU load is fine, though: 0.87, and since I have 2 CPUs a load of 2 would be a normal-high.
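
    To see how much of that RAM the Apache children account for, a rough sketch that totals resident memory per process. summarize_rss is a made-up name; it expects one RSS value in KiB per line, and RSS counts shared pages in every process, so the total overstates real usage:

```shell
#!/bin/sh
# Hypothetical helper: sum RSS values (KiB, one per line on stdin) and
# print the process count, total MiB, and average MiB per process.
summarize_rss() {
    awk '
        NF { n++; kb += $1 }
        END {
            if (n) {
                printf "%d procs, %.1f MiB total, %.1f MiB avg\n", n, kb / 1024, kb / 1024 / n
            } else {
                print "0 procs"
            }
        }
    '
}
```

    Usage would be: ps -C httpd -o rss= | summarize_rss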

    MaFt, there's no excuse for your host to be dragging their feet on increasing the limits. It literally takes 30 seconds, and the restart of Apache might have a downtime of less than 1 second. It doesn't solve the problem, but it would definitely help with the symptom (and should bring your revenue back on track).
     
  8. cPanelMichael

    cPanelMichael Technical Support Community Manager Staff Member

    Joined:
    Apr 11, 2011
    Messages:
    47,555
    Likes Received:
    2,182
    Trophy Points:
    363
    cPanel Access Level:
    Root Administrator
    Hello Everyone,

    Can anyone affected by this issue verify if the Prefork MPM is enabled? You can execute the following command to check:

    Code:
    rpm -qa|grep mpm
    If so, verify if any recent entries like the one below exist in /usr/local/apache/logs/error_log:

    Code:
    AH00144: couldn't grab the accept mutex
    Thank you.
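
    A minimal sketch of that log check, matching on the message text rather than the AH00144 code, on the assumption that older Apache versions log the same message without an AH number (count_mutex_errors is a made-up name):

```shell
#!/bin/sh
# Hypothetical helper: count accept-mutex errors in the given Apache error log.
# Matches the message text so lines with or without an AHxxxxx code both count.
count_mutex_errors() {
    grep -c "couldn't grab the accept mutex" "$1" || true
}
```

    Usage would be: count_mutex_errors /usr/local/apache/logs/error_log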
     
  9. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    I SSH'ed in to my server as root via Putty, ran rpm -qa|grep mpm, and basically nothing happened. It ran for about 2 seconds, then just gave me the prompt again.
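
    An empty result from rpm may just mean the MPM isn't packaged as a separate RPM on this build. Another way to check, as a sketch, is to ask the httpd binary itself: httpd -V prints a "Server MPM:" line. The helper name mpm_name is made up; it only parses that output.

```shell
#!/bin/sh
# Hypothetical helper: extract the MPM name from `httpd -V` output on stdin.
# Looks for the "Server MPM:" line and prints its value in lower case.
mpm_name() {
    awk -F': *' 'tolower($1) ~ /^ *server mpm$/ { print tolower($2) }'
}
```

    Usage would be: /usr/local/apache/bin/httpd -V | mpm_name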

    In /usr/local/apache/conf/httpd.conf, though, the only reference to prefork is here:

    Code:
    Timeout 60
    TraceEnable Off
    ServerSignature Off
    ServerTokens ProductOnly
    FileETag None
    StartServers 15
    
    <IfModule prefork.c>
      MinSpareServers 10
      MaxSpareServers 20
    </IfModule>
    
    <IfModule itk.c>
      MinSpareServers 10
      MaxSpareServers 20
    </IfModule>
    
    ServerLimit 256
    MaxClients 150
    MaxRequestsPerChild 10000
    KeepAlive On
    KeepAliveTimeout 5
    MaxKeepAliveRequests 100
    I checked my error_log, anyway, but didn't find any reference to "mutex". The oldest entry was May 31, about 12 hours before this problem began the first time. I looked through, and don't see any errors other than attempts for pages that don't exist, and a handful of errors that I see all the time that I don't understand, but I doubt that they're related to this:

    Code:
    RewriteOptions: MaxRedirects option has been removed in favor of the global LimitInternalRecursion directive and will be ignored.
    Hostname X provided via SNI and hostname example.com provided via HTTP are different
    Thanks, Michael!
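
    One thing worth checking in a config like the one above: with prefork, MaxClients is what caps the number of simultaneous child processes, and ServerLimit only sets the ceiling that MaxClients may be raised to. A minimal sketch that reads both from a config file and prints the smaller one; the function names are made up, and the fallback of 256 for a missing directive is an assumption based on the prefork defaults:

```shell
#!/bin/sh
# Hypothetical helper: print the last value set for directive $1 in file $2.
directive_value() {
    awk -v name="$1" '
        tolower($1) == tolower(name) { v = $2 }
        END { if (v != "") print v }
    ' "$2"
}

# Hypothetical helper: effective process cap = min(MaxClients, ServerLimit).
effective_limit() {
    mc=$(directive_value MaxClients "$1")
    sl=$(directive_value ServerLimit "$1")
    mc=${mc:-256}    # assumed default when the directive is absent
    sl=${sl:-256}    # assumed default when the directive is absent
    if [ "$mc" -le "$sl" ]; then echo "$mc"; else echo "$sl"; fi
}
```

    Usage would be: effective_limit /usr/local/apache/conf/httpd.conf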
     
  10. dalem

    dalem Well-Known Member PartnerNOC

    Joined:
    Oct 24, 2003
    Messages:
    2,909
    Likes Received:
    127
    Trophy Points:
    368
    Location:
    SLC
    cPanel Access Level:
    DataCenter Provider
    Are you running a lot of WordPress sites?
    What you are describing sounds like just your run-of-the-mill Layer 7 attack, which happens 24/7, 365 days a year, non-stop from bots: the typical wp-login & xmlrpc attacks.

    I have noticed that some of the bots have a new plan: instead of rapid-fire brute force, they are connecting and reconnecting, or going one in and out and then switching to a new IP, which allows them to avoid getting banned as easily. So we did not notice right away what was going on.
    A good custom ModSecurity rule stops them in their tracks.

    One of our servers has been acting up as you described a couple of times a day, and we realized one of our clients' multiple Magento installs was getting hammered. We added a ModSecurity rule and all is well now (well, almost, as soon as all of the botnet IPs get banned).

    Also realized that for some reason our WordPress ModSecurity rule was not working, which did not help.
     
  11. MaFt

    MaFt Registered

    Joined:
    Jun 3, 2019
    Messages:
    2
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    UK
    cPanel Access Level:
    Reseller Owner
    I have 2 wordpress installs on the hosting I mentioned in my reply.

    Can you expand on what the "good mod security rule" would be?
     
  12. dalem

    dalem Well-Known Member PartnerNOC

    Joined:
    Oct 24, 2003
    Messages:
    2,909
    Likes Received:
    127
    Trophy Points:
    368
    Location:
    SLC
    cPanel Access Level:
    DataCenter Provider
    One that blocks bots ("connections with no referrer").

    Like this one:
    wp-login.php and mod security

    xmlrpc is the same rule; just change the wp-login.php to xmlrpc and change the ModSecurity ID.

    And set up your firewall to ban them. The time to ban will be entirely up to you and is server specific.
    We have ours set to block permanently after 1 hit, as the more WordPress sites on a server, the more connections there will be.

    Make sure it works as expected, as different server setups seem to behave differently.
     
  13. cPanelMichael

    cPanelMichael Technical Support Community Manager Staff Member

    Joined:
    Apr 11, 2011
    Messages:
    47,555
    Likes Received:
    2,182
    Trophy Points:
    363
    cPanel Access Level:
    Root Administrator
    Hello @GoWilkes,

    Thank you for sharing the additional information. The issue reported on this thread does not appear related to the case quoted below, but feel free to test out the temporary workaround if the affected system uses the Prefork MPM to see if it has any impact on the reported issue:

    If the workaround doesn't help, could you open a support ticket so we can rule out any issues with cPanel & WHM? Post the ticket number here and I'll link this thread to it.

    Thank you.
     
  14. dalem

    dalem Well-Known Member PartnerNOC

    Joined:
    Oct 24, 2003
    Messages:
    2,909
    Likes Received:
    127
    Trophy Points:
    368
    Location:
    SLC
    cPanel Access Level:
    DataCenter Provider
    PS: this was just a guess as to what your issue is; on our server it was definitely the issue.
    You can do a quick check and see how many foreign IPs are brute forcing:


    grep -ir wp-login.php /var/log/apache2/domlogs
    grep -ir wp-admin /var/log/apache2/domlogs
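
    The raw grep output can be long, so here is a minimal sketch that condenses it to hits per IP. It assumes the usual access log layout with the client IP as the first field; hits_per_ip is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: count log lines (stdin) containing the literal string $1,
# grouped by the first field (the client IP in common/combined log format).
hits_per_ip() {
    awk -v pat="$1" '
        index($0, pat) { n[$1]++ }
        END { for (ip in n) print n[ip], ip }
    ' | sort -rn
}
```

    Usage would be: grep -rh wp-login.php /var/log/apache2/domlogs | hits_per_ip wp-login.php | head (the -h keeps grep from prefixing file names, so the IP stays in the first field).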
     
    cPanelMichael likes this.
  15. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    Michael, it turns out that I don't have Prefork, after all. This was the result when I ran the commands you gave:

    Code:
    -bash: /etc/apache2/conf.modules.d/000_mod_mpm_prefork.conf: No such file or directory
    Built /usr/local/apache/conf/httpd.conf OK
    Waiting for âhttpdâhttpdâ
    
    Service Status
            httpd (/usr/local/apache/bin/httpd -k start) is running as root with PID 4923 (pidfile+/proc check method).
    
    Startup Log
            [Wed Jun 05 02:50:50 2019] [error] VirtualHost *:443 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results
    
    Log Messages
            [Wed Jun 05 02:50:51 2019] [notice] ModSecurity for Apache/2.9.0 (http://www.modsecurity.org/) configured.
            [Wed Jun 05 02:50:51 2019] [notice] suEXEC mechanism enabled (wrapper: /usr/local/apache/bin/suexec)
    
    httpd restarted successfully.
    I'm a tad concerned about the error message, considering that all of the accounts on the server were created with WHM and I haven't manually edited httpd.conf in years... probably not since I got this server, honestly. All of my sites seem to be running so I don't think it's a fatal error, but I definitely wasn't expecting it!


    @dalem, that was a great thought, but unfortunately not my issue :-( My log files were at:

    /usr/local/apache/domlogs/[USERNAME]/[DOMAIN.COM]

    I already test for references to wp-admin and wp-login via PHP and block IPs, but not at the firewall so it was an idea! But I only had 5 references to wp-login, and 2 to wp-admin. So that wasn't the culprit, either.

    @GOT, just FYI, it looks like CC_ALLOW_FILTER isn't blocking non-US IPs the way I'd hoped, so you could be right on that one. I was manually adding RIPE, APNIC, and LACNIC IP ranges but removed them in favor of CC_ALLOW_FILTER a few days ago. I didn't notice an increase in processes or anything, but I just now looked and saw that I have 7 RIPE connections.

    But anyway... no change on my end, I still have almost double the number of processes, my RAM usage is off the charts, etc. I'm at a complete loss.
     
  16. dalem

    dalem Well-Known Member PartnerNOC

    Joined:
    Oct 24, 2003
    Messages:
    2,909
    Likes Received:
    127
    Trophy Points:
    368
    Location:
    SLC
    cPanel Access Level:
    DataCenter Provider
    You are still running EasyApache 3 (EOL); best to think about upgrading to EasyApache 4.
     
  17. GoWilkes

    GoWilkes Well-Known Member

    Joined:
    Sep 26, 2006
    Messages:
    425
    Likes Received:
    7
    Trophy Points:
    168
    cPanel Access Level:
    Root Administrator
    I am... I'm procrastinating for 2 reasons:

    1. I always wait until the last minute for software updates, to let everyone else figure out the bugs before I deal with them; and

    2. Nothing in the documentation has commented on potential down time while waiting for it to update, so I'm waiting for a time when I have a few hours to possibly wait, and then another few hours to sort out bugs before the next business day.
     