Operating System & Version: CentOS 7.7
cPanel & WHM Version: v86.0.4

ardimardiana

Member
Feb 23, 2020
5
0
1
majalengka
cPanel Access Level
DataCenter Provider
First, allow me to introduce myself.
I have read this thread; we have the same problem, but the cause is different.

Prologue
We (my team and I, at a campus) have a dedicated server hosted on our own premises. We chose WHM as our tool to manage the server, and we have little (almost zero) knowledge of how to manage a server. Before we used WHM, we rented hosting for most of our websites, but we were concerned about our student data and decided to build our own server.

Part 1 (server specification)

Total processors: 32
Processor #1
Vendor: GenuineIntel
Name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
Speed: 2100.000 MHz
Cache: 11264 KB

Memory Information
[ 0.000000] Memory: 5717008k/35651584k available (7760k kernel code, 2419284k absent, 831520k reserved, 5967k data, 1984k init)

System Information
Linux server_name 3.10.0-1062.1.2.el7.x86_64 #1 SMP Mon Sep 30 14:19:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Physical Disks
[ 4.301984] sd 0:2:0:0: [sda] 2341994496 512-byte logical blocks: (1.19 TB/1.08 TiB)
[ 4.302055] sd 0:2:0:0: [sda] Write Protect is off
[ 4.302059] sd 0:2:0:0: [sda] Mode Sense: 22 00 00 0b
[ 4.302099] sd 0:2:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4.309591] sda: sda1 sda2
[ 4.311039] sd 0:2:0:0: [sda] Attached SCSI disk
[ 10.484707] sd 0:2:0:0: Attached scsi generic sg0 type 0

Current Memory Usage
              total        used        free      shared  buff/cache   available
Mem:       32424544     3429240     4857924      428340    24137380    28134456
Swap:      16318460         776    16317684
Total:     48743004     3430016    21175608

Current Disk Usage
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G  4.0K   16G   1% /dev/shm
tmpfs                     16G  294M   16G   2% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/centos-root   50G   17G   34G  34% /
/dev/sda1               1016M  269M  747M  27% /boot
/dev/mapper/centos-home  1.1T  139G  911G  14% /home
/dev/loop0               2.2G   71M  2.0G   4% /tmp
tmpfs                    3.1G     0  3.1G   0% /run/user/0

We also have a dedicated 50 Mbps connection, with MikroTik as the router.

Part 2 (based on this recommendation)
How many sites on the server?

Currently there are 44 accounts in our account list. Each account has about 2-3 websites, and the majority of the websites use WordPress.
Besides the sites, we have Open Journal Systems (OJS) 2 and 3 installed on the server (one of our most consistent sources of traffic),
and of course the Academic Information System (AIS).
Most (not all) of the AIS is built with CodeIgniter.
There is one REST API server built with Lumen.

What is the normal load averages?
In daily use we have only minor problems with server load. Only for about one month every semester do we have a high-load problem, because there is a lot of activity at the start of the semester: students make their study contracts, student grades are entered, class data is prepared, and study data is synced with our government's server.

How much ram is on the server?
32 GB.

What version of PHP are you using?
Mixed.
The majority use 7.3 and only a small portion use 5.6. Most of the 5.6 apps will stay (just to display data), but not for CRUD.

Do you have PHP opcache installed?
Yes, we have PHP OPcache installed.

What database version are you using?
Since early this month we have been using a remote MySQL server (a separate server, reached over a local IP).
Server: xxx.xxx.xxx.xxx via TCP/IP
Server type: MySQL
Server connection: SSL is not being used
Server version: 5.7.29 - MySQL Community Server (GPL)
Protocol version: 10

Apache Global Conf
Start Servers: 25
Minimum Spare Servers: 25
Maximum Spare Servers: 100
Server Limit (Maximum: 20,000): 500
Max Request Workers: 250
Max Connections Per Child: 10000
Keep-Alive: On
Keep-Alive Timeout: 5
Max Keep-Alive Requests: 100
Timeout: 300
Symlink Protection: On

PHP-FPM Pool Options

Max Requests: 250
Process Idle Timeout: 300
Max Children: 25

Problem

Every time our server gets stuck or crashes (when many processes are waiting in the Sending Reply state, i.e. lots of W in the scoreboard), we have to restart PHP-FPM, and sometimes both PHP-FPM and the HTTP server (Apache). Before we moved our DB out, we restarted the MySQL service too. This problem does not happen very often, only when we start a new semester. On an average day we need no more than two restarts. On a daily basis only the presence app and our websites are running, and not many users (students and lecturers) access the AIS.

Epilogue
Does this problem happen because we misconfigured our server, or (and) because our bandwidth is too small to handle the requests? If the bandwidth is too small, why can we still access WHM (or any other port besides 443 or 80) with no problem? We can still reset our server from the WHM interface on port 2087.

Please help us.
thank you.
 

rackaid

Well-Known Member
Jan 18, 2003
89
27
168
Jacksonville, FL
cPanel Access Level
DataCenter Provider
Due to stuck processes, you are filling up the connection queue in Apache. When this happens, web resources will not be available or will be very slow. This is controlled by the MaxRequestWorkers value (you may want to try 1024). If you look in the Apache error log, you will see a warning if you are hitting the MaxRequestWorkers value.
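
A minimal sketch of that log check, in Python. The log path is an assumption (a common location on cPanel servers, but yours may differ) and the function name is made up for illustration. It only counts lines containing the phrase Apache 2.4 logs when it runs out of workers.

```python
from pathlib import Path

# Phrase Apache 2.4 writes to the error log when every worker is busy.
MARKER = "server reached MaxRequestWorkers setting"


def count_worker_limit_hits(lines):
    """Return how many log lines report that MaxRequestWorkers was reached."""
    return sum(1 for line in lines if MARKER in line)


if __name__ == "__main__":
    # Assumed path: adjust it for your own system.
    log_path = Path("/usr/local/apache/logs/error_log")
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            print(count_worker_limit_hits(handle), "warning(s) found")
    else:
        print("log not found at", log_path)
```

Apache usually logs this warning only once per restart, so even a single hit is worth noting.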

Why is this happening?
The W's are usually due to PHP scripts waiting for something. The something can be difficult to identify but common items are:
  • Slow MySQL queries.
  • Slowly responding APIs.
  • Bad PHP loops that require a lot of processing.

If you check the SS column on the Apache Status page, you will see that these processes have been running for a long time. When you see this, check whether there is corresponding activity in the database.

We once had a WordPress site with a transients issue where the table ballooned to more than 10M rows. A plugin was trying to run a poorly optimized query on the table. We were able to see the query by using MySQL's slow query logging feature.

In addition to SQL issues, I have seen this when awaiting responses from 3rd party APIs. Slowly responding or blocked APIs can cause scripts to hang for 30s or more. Many apps do not set timeouts for these connections and in some cases, the timeout can be very long (minutes). As these build up, the Apache server stops responding.
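
To illustrate the timeout point, here is a small sketch in Python (the apps in this thread are PHP, but the idea is the same in any HTTP client). The function name and the 5-second default are assumptions chosen for the example, not values from this thread.

```python
import urllib.request


def fetch_with_timeout(url, timeout_seconds=5.0):
    """Return the response body, or None if the remote side is slow or unreachable."""
    try:
        # The timeout applies to connecting and to each blocking read,
        # so a stalled API cannot hold the worker for minutes.
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except OSError:
        # URLError, timeouts and connection resets are all OSError subclasses.
        return None
```

With an explicit timeout, a blocked third-party API costs the worker a few seconds instead of tying it up until Apache's own Timeout expires.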

Debugging these issues can be difficult, but if you start tracing through the layers of your app, you can usually find the issue.
 

ardimardiana

Thanks for the quick response.
I have changed MaxRequestWorkers to 1024.

I never paid attention to the SS column before.
Tomorrow, when the server gets a heavy load, I will note which processes have a long SS time.

Maybe it would be a good decision to limit the timeout of every app, to make sure the server load won't go too high.

Is bandwidth still related to my issue, given that the server is trying to send replies while the bandwidth can't handle them?
Early this month we were still using 20 Mbps, but on 02-16 we upgraded to 50 Mbps, yet I can't feel any difference under high server load.

The server is hard to reach only on port 80 or 443. I mean, when the server is under high load all websites are down, but not WHM.
 

cPanelLauren

Technical Support Community Manager
Staff member
Nov 14, 2017
13,237
1,232
313
Houston
You might also want to try benchmarking Apache with ab to determine what your needs will be in that respect. See "ab - Apache HTTP server benchmarking tool" in the Apache HTTP Server Version 2.4 documentation.

When you're seeing the high count of "W", which indicates "Sending Reply", are you also seeing errors in the Apache error log or the PHP-FPM error logs? If you were low on bandwidth I'd anticipate seeing a high count of "_" rather than "W", and if the timeout were the concern I would anticipate seeing a large number of connections with "K".
 

rackaid

Bandwidth is not likely an issue. Don't mistake Sending Reply for the server actually sending data. Apache will be in the Sending Reply state as soon as the system processes the request headers and determines what response to send. In this case, you likely have apache waiting on a PHP app to send data. So the server is stuck in the Sending Reply state. Since the issue is due to a slow application, more bandwidth will not help. If bandwidth were the issue, you would see 100% max utilization of your network link.
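
As a rough way to check the last point, the sketch below converts a measured outbound rate into a fraction of the link. The 50 Mbps figure is from this thread; the 2,000,000 bytes/s measurement is purely hypothetical, and the function name is illustrative.

```python
def link_utilization(bytes_per_second, link_mbps):
    """Return the fraction of the link in use (1.0 means saturated)."""
    bits_per_second = bytes_per_second * 8
    return bits_per_second / (link_mbps * 1_000_000)


# Hypothetical measurement: 2,000,000 bytes/s leaving the server on a 50 Mbps link.
usage = link_utilization(2_000_000, 50)
print(f"{usage:.0%} of the link in use")  # prints "32% of the link in use"
```

If that fraction stays well below 100% while the sites are down, the link is not the bottleneck.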
 

ardimardiana

Besides making my application faster, is there any suggestion for my PHP-FPM settings so that PHP uses all available resources to process my application faster?
We will compare the application running in cPanel (for global access) with local access, both using the same database.
Currently the Apache status still shows just 10 servers and up to 250 children, even though I have updated my PHP-FPM setting to 1024 as suggested.
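
One possible reading of the "10 servers, 250 children" figure: with a threaded MPM (worker or event), Apache caps the number of child processes at MaxRequestWorkers divided by ThreadsPerChild, and never above ServerLimit. The sketch below works through that arithmetic. ThreadsPerChild = 25 is an assumption (it is the Apache default), and the function name is illustrative only.

```python
def max_child_processes(max_request_workers, threads_per_child, server_limit):
    """Upper bound on Apache child processes for a threaded MPM (worker/event)."""
    return min(server_limit, max_request_workers // threads_per_child)


# Values from the Apache Global Conf in the first post, with the assumed 25 threads per child.
print(max_child_processes(250, 25, 500))   # prints 10
# The same calculation with the suggested MaxRequestWorkers of 1024.
print(max_child_processes(1024, 25, 500))  # prints 40
```

If the status page still shows 250 slots after the change, it may be that the new MaxRequestWorkers value has not taken effect in Apache yet; that is a guess, not something confirmed in this thread.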
 

ardimardiana

This is the ab run for all user data, with no library (raw SQL), on the Lumen framework by Laravel:
[[email protected]]# ab -n 500 -c 50 http://domain/all_user_data
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Finished 500 requests


Server Software: Apache
Server Hostname: domain
Server Port: 80

Document Path: all_user_data
Document Length: 26226047 bytes

Concurrency Level: 50
Time taken for tests: 135.480 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 13113121000 bytes
HTML transferred: 13113023500 bytes
Requests per second: 3.69 [#/sec] (mean)
Time per request: 13547.969 [ms] (mean)
Time per request: 270.959 [ms] (mean, across all concurrent requests)
Transfer rate: 94521.79 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 24 194.3 2 3008
Processing: 943 12992 2428.6 13083 18990
Waiting: 609 10155 2338.8 10602 14324
Total: 944 13016 2429.4 13122 18991

Percentage of the requests served within a certain time (ms)
50% 13122
66% 14285
75% 14838
80% 15152
90% 16004
95% 16306
98% 16958
99% 17291
100% 18991 (longest request)

This is the ab run for single user data, with no library (raw SQL), on the Lumen framework by Laravel:
[[email protected]]# ab -n 500 -c 50 http://domain/single_data
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Finished 500 requests


Server Software: Apache
Server Hostname: domain
Server Port: 80

Document Path: single_data
Document Length: 1559 bytes

Concurrency Level: 50
Time taken for tests: 0.283 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 877000 bytes
HTML transferred: 779500 bytes
Requests per second: 1766.40 [#/sec] (mean)
Time per request: 28.306 [ms] (mean)
Time per request: 0.566 [ms] (mean, across all concurrent requests)
Transfer rate: 3025.66 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 1.0 0 5
Processing: 18 24 5.5 23 60
Waiting: 18 24 5.5 23 60
Total: 18 25 6.0 24 63

Percentage of the requests served within a certain time (ms)
50% 24
66% 25
75% 26
80% 27
90% 29
95% 36
98% 46
99% 57
100% 63 (longest request)

I am still benchmarking some functions that use a library (the DataTables library in Lumen).
This one returns only 3 rows:
[[email protected]]# ab -n 500 -c 50 https://domain/datatable/single_user
[1] 202471
[[email protected]]# This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Finished 500 requests


Server Software: Apache
Server Hostname: domain
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Server Temp Key: X25519 253 bits
TLS Server Name: domain

Document Path: /datatable/single_user
Document Length: 1320500 bytes

Concurrency Level: 50
Time taken for tests: 110.795 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 660347500 bytes
HTML transferred: 660250000 bytes
Requests per second: 4.51 [#/sec] (mean)
Time per request: 11079.526 [ms] (mean)
Time per request: 221.591 [ms] (mean, across all concurrent requests)
Transfer rate: 5820.38 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 3 9 7.4 7 38
Processing: 1537 10569 2246.6 11446 17122
Waiting: 1455 10511 2245.6 11390 17043
Total: 1548 10577 2246.1 11452 17154

Percentage of the requests served within a certain time (ms)
50% 11452
66% 11764
75% 11884
80% 12122
90% 12598
95% 12909
98% 13929
99% 16497
100% 17154 (longest request)

Same library, same query, but returning 31 rows:
[[email protected]]# ab -n 250 -c 50 domain/datatable/single_user
[1] 203623
[[email protected]]# This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
Completed 200 requests
Finished 250 requests


Server Software: Apache
Server Hostname: domain
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Server Temp Key: X25519 253 bits
TLS Server Name: domain

Document Path: /datatable/single_user
Document Length: 447 bytes

Concurrency Level: 50
Time taken for tests: 50.731 seconds
Complete requests: 250
Failed requests: 226
(Connect: 0, Receive: 0, Length: 226, Exceptions: 0)
Non-2xx responses: 158
Total transferred: 150832 bytes
HTML transferred: 109172 bytes
Requests per second: 4.93 [#/sec] (mean)
Time per request: 10146.129 [ms] (mean)
Time per request: 202.923 [ms] (mean, across all concurrent requests)
Transfer rate: 2.90 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 3 19 11.6 20 42
Processing: 5 9699 10074.3 7279 30878
Waiting: 4 9696 10075.7 7279 30878
Total: 28 9718 10072.6 7284 30914

Percentage of the requests served within a certain time (ms)
50% 7284
66% 15910
75% 20395
80% 20983
90% 24086
95% 25843
98% 27574
99% 27983
100% 30914 (longest request)

Apparently my app does not run fast enough? I will optimize my app, but are there any suggestions for making my server run with maximum resources?
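
For reading the ab figures above, this sketch shows how the three headline numbers relate to each other. The inputs are taken from the all_user_data run (500 requests, 135.480 seconds, concurrency 50); the function name is made up for illustration, and small differences from ab's own output come from ab using the unrounded time.

```python
def ab_summary(complete_requests, seconds_taken, concurrency):
    """Recompute ab's headline numbers from its raw totals."""
    requests_per_second = complete_requests / seconds_taken
    # Mean time one client waits for one request.
    ms_per_request = concurrency * seconds_taken * 1000 / complete_requests
    # Mean time per request across all concurrent clients.
    ms_across_all = seconds_taken * 1000 / complete_requests
    return requests_per_second, ms_per_request, ms_across_all


rps, per_request, across_all = ab_summary(500, 135.480, 50)
print(f"{rps:.2f} req/s, {per_request:.0f} ms, {across_all:.2f} ms")
# prints "3.69 req/s, 13548 ms, 270.96 ms"
```

So at 50 concurrent clients each request to all_user_data takes about 13.5 seconds, which is the kind of long-running request that keeps workers in the W state.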
 

ardimardiana

Server Version: Apache/2.4.41 (cPanel) OpenSSL/1.1.1d mod_bwlimited/1.4 Phusion_Passenger/5.3.7
Server MPM: event
Server Built: Jan 28 2020 19:40:24
Current Time: Sunday, 01-Mar-2020 23:08:56 WIB
Restart Time: Sunday, 01-Mar-2020 21:15:20 WIB
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 hour 53 minutes 36 seconds
Server load: 0.21 0.16 0.13
Total accesses: 15949 - Total Traffic: 1.1 GB - Total Duration: 31563742
CPU Usage: u82.62 s18.57 cu0 cs0 - 1.48% CPU load
2.34 requests/sec - 174.0 kB/second - 74.4 kB/request - 1979.04 ms/request
55 requests currently being processed, 120 idle workers
Slot  PID     Stopping  Connections       Threads     Async connections
                        total  accepting  busy  idle  writing  keep-alive  closing
0     202209  no        2      yes        2     23    0        0           0
1     202210  no        3      yes        2     23    0        1           0
2     202211  no        8      yes        6     19    0        0           0
3     202212  no        4      yes        4     21    0        0           0
4     202213  no        12     yes        11    14    0        0           0
5     202487  no        8      yes        15    10    0        0           0
6     203843  no        9      yes        15    10    0        0           0
Sum   7       0         46                55    120   0        1           0
___________________W__W_____W_________________W________W____W___
W_WW_W__________W________W__W_____W___WW_WW____W___W_W___WWWWWWW
W_W___W__WW__W_W_WWWWW___WW_WWWWW___WWWW___WWWW.................
..........................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

This is my Apache status while processing this ab run:

[[email protected] ~]# ab -n 500 -c 50 https://domain/data_user_raw_sql_lot_of_join
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
apr_pollset_poll: The timeout specified has expired (70007)
Total of 314 requests completed

And our server lags while processing this ab run.

[[email protected] ~]# ab -n 500 -c 50 http://domain/data_user_raw_sql_lot_of_join
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking domain (be patient)
Completed 100 requests
apr_pollset_poll: The timeout specified has expired (70007)
Total of 126 requests completed
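
The scoreboard in the Apache status output above can be tallied by machine rather than by eye. This is a minimal sketch in Python; the function name is illustrative and the sample string is a short made-up scoreboard, not the real one.

```python
from collections import Counter


def tally_scoreboard(scoreboard):
    """Count each worker state in an Apache scoreboard string, ignoring line breaks."""
    return Counter(ch for ch in scoreboard if not ch.isspace())


# Short made-up scoreboard for demonstration.
counts = tally_scoreboard("__W_WK..\nW_..")
print(counts["W"], counts["_"], counts["."])  # prints 3 4 4
```

Run against the real scoreboard, the "W" and "_" counts should line up with the "55 requests currently being processed, 120 idle workers" line on the same status page.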