how to get a stack trace of php-fpm error AH01067: Failed to read FastCGI header ?

mec-forum

Member
Jun 18, 2019
23
4
3
Italy
cPanel Access Level
Root Administrator
Hello! I am getting a 503 error page on a certain web application (too big to debug without xdebug, but I cannot put xdebug on the server, unless there is a way to load the extension only on a specific domain). Nothing is appended to php error logs (i see no error_log file in the working directory).

Is there a way to find out what was the last executed position (file, line)?

I imagine it could be a sort of timeout between apache and php-fpm since the requests stall for 30-40 seconds before returning a 503 page, where it would normaly take a couple of seconds to complete. Thanks.
 
Last edited:

cPanelLauren

Product Owner II
Staff member
Nov 14, 2017
13,266
1,309
363
Houston
The error log for PHP-FPM per user should be at /home/$user/logs/domain_tld.php.error.log let me know if this is not enough to identify the cause of the error and I'll see what else there is that can be done.
 

mec-forum

Member
Jun 18, 2019
23
4
3
Italy
cPanel Access Level
Root Administrator
No stack trace :\ Last line is from a week ago but I we are experiencing this on some ajax requests seemigly at random.
I tried wrapping the whole web app in a try catch php block but it crashed anyway with no logs or output.

I disabled opcache but no change. I also increased the global apache connections to be greater than php-fpm child process count (that is 60).
About php fpm logs it is the same issue of this abandoned thread: Failed to read FastCGI header
I posted the ram usage in there, but it does not seem abnormal.
 

cPanelLauren

Product Owner II
Staff member
Nov 14, 2017
13,266
1,309
363
Houston
I actually removed that post, but I am curious what is in the PHP-FPM logs for that specific PHP version. You can find them here:

Code:
/opt/cpanel/ea-phpXX/root/usr/var/log/php-fpm/error.log
It may be possible to configure more in-depth logging with the following: Configuration Values of PHP-FPM | cPanel & WHM Documentation
 

mec-forum

Member
Jun 18, 2019
23
4
3
Italy
cPanel Access Level
Root Administrator
I actually removed that post, but I am curious what is in the PHP-FPM logs for that specific PHP version.
I did post in there since the log was exactly the same, with no apparent clues, beside the evidence of my manual reboots before.

Code:
[19-Jul-2020 03:39:01] NOTICE: error log file re-opened
[20-Jul-2020 09:26:54] WARNING: [pool website_com] child 14291 exited on signal 9 (SIGKILL) after 26.415603 seconds from start
[20-Jul-2020 09:26:54] NOTICE: [pool website_com] child 14292 started
[20-Jul-2020 10:01:01] WARNING: [pool website_com] child 15558 exited on signal 9 (SIGKILL) after 125.401374 seconds from start
[20-Jul-2020 10:01:01] NOTICE: [pool website_com] child 15753 started
[20-Jul-2020 10:02:15] WARNING: [pool website_com] child 15753 exited on signal 9 (SIGKILL) after 73.905925 seconds from start
[20-Jul-2020 10:02:15] NOTICE: [pool website_com] child 15781 started
[20-Jul-2020 10:02:52] WARNING: [pool website_com] child 15563 exited on signal 9 (SIGKILL) after 234.220034 seconds from start
[20-Jul-2020 10:02:52] NOTICE: [pool website_com] child 15785 started
[20-Jul-2020 10:17:16] WARNING: [pool website_com] child 15785 exited on signal 9 (SIGKILL) after 863.981561 seconds from start
[20-Jul-2020 10:17:16] NOTICE: [pool website_com] child 16353 started
[20-Jul-2020 10:24:10] WARNING: [pool website_com] child 15781 exited on signal 9 (SIGKILL) after 1315.300730 seconds from start
[20-Jul-2020 10:24:10] NOTICE: [pool website_com] child 16737 started
[20-Jul-2020 10:31:35] WARNING: [pool website_com] child 16619 exited on signal 9 (SIGKILL) after 466.013445 seconds from start
[20-Jul-2020 10:31:35] NOTICE: [pool website_com] child 17086 started
[20-Jul-2020 10:43:17] WARNING: [pool website_com] child 17086 exited on signal 9 (SIGKILL) after 702.189098 seconds from start
[20-Jul-2020 10:43:17] NOTICE: [pool website_com] child 17598 started
[20-Jul-2020 10:44:31] NOTICE: Terminating ...
[20-Jul-2020 10:44:31] NOTICE: exiting, bye-bye!
[20-Jul-2020 10:45:40] NOTICE: fpm is running, pid 855
[20-Jul-2020 10:45:40] NOTICE: ready to handle connections
[20-Jul-2020 10:45:40] NOTICE: systemd monitor interval set to 10000ms
[20-Jul-2020 10:48:01] WARNING: [pool website_com] child 1395 exited on signal 9 (SIGKILL) after 138.944209 seconds from start
[20-Jul-2020 10:48:01] NOTICE: [pool website_com] child 1948 started
[20-Jul-2020 10:49:42] WARNING: [pool website_com] child 1948 exited on signal 9 (SIGKILL) after 101.200377 seconds from start
[20-Jul-2020 10:49:42] NOTICE: [pool website_com] child 2105 started
[20-Jul-2020 10:57:22] WARNING: [pool website_com] child 2105 exited on signal 9 (SIGKILL) after 459.554757 seconds from start
[20-Jul-2020 10:57:22] NOTICE: [pool website_com] child 2609 started
[20-Jul-2020 11:00:31] WARNING: [pool website_com] child 1396 exited on signal 9 (SIGKILL) after 888.501843 seconds from start
[20-Jul-2020 11:00:31] NOTICE: [pool website_com] child 2856 started
[20-Jul-2020 12:18:22] WARNING: [pool website_com] child 2609 exited on signal 9 (SIGKILL) after 4859.812517 seconds from start
[20-Jul-2020 12:18:22] NOTICE: [pool website_com] child 7014 started
[20-Jul-2020 13:36:05] NOTICE: [pool website_com] child 1448 exited with code 0 after 10222.029839 seconds from start
[20-Jul-2020 13:36:05] NOTICE: [pool website_com] child 10601 started
[20-Jul-2020 14:02:32] NOTICE: [pool website_com] child 2856 exited with code 0 after 10921.016997 seconds from start
[20-Jul-2020 14:02:32] NOTICE: [pool website_com] child 11752 started
[20-Jul-2020 14:32:52] WARNING: [pool website_com] child 10601 exited on signal 9 (SIGKILL) after 3406.663971 seconds from start
[20-Jul-2020 14:32:52] NOTICE: [pool website_com] child 13473 started
[20-Jul-2020 14:41:08] NOTICE: [pool mec-group_it] child 1589 exited with code 0 after 14122.554547 seconds from start
[20-Jul-2020 14:41:08] NOTICE: [pool mec-group_it] child 13844 started
[20-Jul-2020 14:50:04] NOTICE: [pool mec-group_it] child 2060 exited with code 0 after 14450.654382 seconds from start
[20-Jul-2020 14:50:04] NOTICE: [pool mec-group_it] child 14108 started
[20-Jul-2020 14:50:24] NOTICE: [pool mec-group_it] child 2062 exited with code 0 after 14470.157946 seconds from start
[20-Jul-2020 14:50:24] NOTICE: [pool mec-group_it] child 14220 started
[20-Jul-2020 15:03:34] WARNING: [pool website_com] child 11752 exited on signal 9 (SIGKILL) after 3661.918113 seconds from start
[20-Jul-2020 15:03:34] NOTICE: [pool website_com] child 14642 started
[20-Jul-2020 15:04:13] WARNING: [pool website_com] child 13473 exited on signal 9 (SIGKILL) after 1880.886754 seconds from start
[20-Jul-2020 15:04:13] NOTICE: [pool website_com] child 14658 started
[20-Jul-2020 15:04:47] WARNING: [pool website_com] child 14642 exited on signal 9 (SIGKILL) after 73.380361 seconds from start
[20-Jul-2020 15:04:47] NOTICE: [pool website_com] child 14660 started
[20-Jul-2020 15:05:28] WARNING: [pool website_com] child 14658 exited on signal 9 (SIGKILL) after 75.180764 seconds from start
[20-Jul-2020 15:05:28] NOTICE: [pool website_com] child 14690 started
[20-Jul-2020 15:06:08] WARNING: [pool website_com] child 7014 exited on signal 9 (SIGKILL) after 10065.997246 seconds from start
[20-Jul-2020 15:06:08] NOTICE: [pool website_com] child 14711 started
[20-Jul-2020 15:06:51] WARNING: [pool website_com] child 14690 exited on signal 9 (SIGKILL) after 82.992763 seconds from start
[20-Jul-2020 15:06:51] NOTICE: [pool website_com] child 14724 started
[20-Jul-2020 15:07:33] WARNING: [pool website_com] child 14660 exited on signal 9 (SIGKILL) after 165.810749 seconds from start
[20-Jul-2020 15:07:33] NOTICE: [pool website_com] child 14831 started
[20-Jul-2020 15:09:05] WARNING: [pool website_com] child 14831 exited on signal 9 (SIGKILL) after 92.453188 seconds from start
[20-Jul-2020 15:09:05] NOTICE: [pool website_com] child 14899 started
[20-Jul-2020 15:09:56] WARNING: [pool website_com] child 14711 exited on signal 9 (SIGKILL) after 228.504433 seconds from start
[20-Jul-2020 15:09:56] NOTICE: [pool website_com] child 14910 started
[20-Jul-2020 15:12:08] WARNING: [pool website_com] child 14724 exited on signal 9 (SIGKILL) after 316.809745 seconds from start
[20-Jul-2020 15:12:08] NOTICE: [pool website_com] child 15020 started
[20-Jul-2020 15:12:49] WARNING: [pool website_com] child 14910 exited on signal 9 (SIGKILL) after 173.098154 seconds from start
[20-Jul-2020 15:12:49] NOTICE: [pool website_com] child 15026 started
[20-Jul-2020 15:13:22] WARNING: [pool website_com] child 14899 exited on signal 9 (SIGKILL) after 256.695732 seconds from start
[20-Jul-2020 15:13:22] NOTICE: [pool website_com] child 15036 started
[20-Jul-2020 15:15:59] WARNING: [pool website_com] child 15036 exited on signal 9 (SIGKILL) after 156.821707 seconds from start
[20-Jul-2020 15:15:59] NOTICE: [pool website_com] child 15208 started
[20-Jul-2020 15:22:19] WARNING: [pool website_com] child 15020 exited on signal 9 (SIGKILL) after 611.018698 seconds from start
[20-Jul-2020 15:22:19] NOTICE: [pool website_com] child 15482 started
[20-Jul-2020 15:23:39] WARNING: [pool website_com] child 15472 exited on signal 9 (SIGKILL) after 99.676116 seconds from start
[20-Jul-2020 15:23:39] NOTICE: [pool website_com] child 15508 started
[20-Jul-2020 15:29:18] WARNING: [pool website_com] child 15508 exited on signal 9 (SIGKILL) after 339.757153 seconds from start
[20-Jul-2020 15:29:18] NOTICE: [pool website_com] child 15709 started
[20-Jul-2020 15:30:34] NOTICE: Terminating ...
[20-Jul-2020 15:30:34] NOTICE: exiting, bye-bye!
[20-Jul-2020 15:30:34] NOTICE: fpm is running, pid 15936
[20-Jul-2020 15:30:34] NOTICE: ready to handle connections
[20-Jul-2020 15:30:34] NOTICE: systemd monitor interval set to 10000ms
[20-Jul-2020 15:31:11] WARNING: [pool website_com] child 15948 exited on signal 9 (SIGKILL) after 33.477142 seconds from start
[20-Jul-2020 15:31:11] NOTICE: [pool website_com] child 15964 started
[20-Jul-2020 15:35:36] WARNING: [pool website_com] child 15964 exited on signal 9 (SIGKILL) after 264.989420 seconds from start
[20-Jul-2020 15:35:36] NOTICE: [pool website_com] child 16162 started
[20-Jul-2020 15:35:55] WARNING: [pool website_com] child 15949 exited on signal 9 (SIGKILL) after 312.459127 seconds from start
[20-Jul-2020 15:35:55] NOTICE: [pool website_com] child 16165 started
[20-Jul-2020 15:36:44] WARNING: [pool website_com] child 16162 exited on signal 9 (SIGKILL) after 67.537397 seconds from start
[20-Jul-2020 15:36:44] NOTICE: [pool website_com] child 16208 started
[20-Jul-2020 15:37:10] WARNING: [pool website_com] child 16163 exited on signal 9 (SIGKILL) after 75.328750 seconds from start
[20-Jul-2020 15:37:10] NOTICE: [pool website_com] child 16223 started
[20-Jul-2020 15:37:19] WARNING: [pool website_com] child 16165 exited on signal 9 (SIGKILL) after 83.797250 seconds from start
[20-Jul-2020 15:37:19] NOTICE: [pool website_com] child 16224 started
[20-Jul-2020 15:37:46] WARNING: [pool website_com] child 16208 exited on signal 9 (SIGKILL) after 62.336825 seconds from start
[20-Jul-2020 15:37:46] NOTICE: [pool website_com] child 16225 started
[20-Jul-2020 15:43:47] WARNING: [pool website_com] child 16225 exited on signal 9 (SIGKILL) after 360.996396 seconds from start
[20-Jul-2020 15:43:47] NOTICE: [pool website_com] child 16484 started
[20-Jul-2020 15:44:53] WARNING: [pool website_com] child 16160 exited on signal 9 (SIGKILL) after 560.211278 seconds from start
[20-Jul-2020 15:44:53] NOTICE: [pool website_com] child 16504 started
[20-Jul-2020 15:45:26] WARNING: [pool website_com] child 16224 exited on signal 9 (SIGKILL) after 487.227068 seconds from start
[20-Jul-2020 15:45:26] NOTICE: [pool website_com] child 16532 started
[20-Jul-2020 16:12:58] WARNING: [pool website_com] child 16532 exited on signal 9 (SIGKILL) after 1651.636826 seconds from start
[20-Jul-2020 16:12:58] NOTICE: [pool website_com] child 17682 started
[20-Jul-2020 16:54:04] NOTICE: [pool website_com] child 16223 exited with code 0 after 4613.658408 seconds from start
[20-Jul-2020 16:54:04] NOTICE: [pool website_com] child 19567 started
[20-Jul-2020 16:59:40] NOTICE: Terminating ...
[20-Jul-2020 16:59:40] NOTICE: exiting, bye-bye!
[20-Jul-2020 16:59:40] NOTICE: fpm is running, pid 19971
[20-Jul-2020 16:59:40] NOTICE: ready to handle connections
[20-Jul-2020 16:59:40] NOTICE: systemd monitor interval set to 10000ms
[20-Jul-2020 17:01:06] WARNING: [pool website_com] child 19976 exited on signal 9 (SIGKILL) after 86.013725 seconds from start
[20-Jul-2020 17:01:06] NOTICE: [pool website_com] child 20183 started
[20-Jul-2020 17:06:01] NOTICE: Terminating ...
[20-Jul-2020 17:06:01] NOTICE: exiting, bye-bye!
[20-Jul-2020 17:06:44] NOTICE: fpm is running, pid 875
[20-Jul-2020 17:06:44] NOTICE: ready to handle connections
[20-Jul-2020 17:06:44] NOTICE: systemd monitor interval set to 10000ms
[20-Jul-2020 17:08:13] WARNING: [pool website_com] child 1453 exited on signal 9 (SIGKILL) after 86.487068 seconds from start
[20-Jul-2020 17:08:13] NOTICE: [pool website_com] child 1830 started
[20-Jul-2020 18:58:54] WARNING: [pool website_com] child 7394 exited on signal 9 (SIGKILL) after 63.249500 seconds from start
[20-Jul-2020 18:58:54] NOTICE: [pool website_com] child 7416 started
It may be possible to configure more in-depth logging with the following: Configuration Values of PHP-FPM | cPanel & WHM Documentation
This is the log file after I changed the php-fpm log level to debug. I uploaded to mega since it is too big for attachment or pasting...
 
Last edited by a moderator:

cPanelLauren

Product Owner II
Staff member
Nov 14, 2017
13,266
1,309
363
Houston
Your output really isn't the same as the other users there. They were reaching their pool limits, I don't see the same behavior here. Nothing in the output you provided, including the debug output, indicates any kind of issue. It might at this point be best to open a ticket where the configuration can be looked at by one of our analysts - it's really difficult to troubleshoot this kind of issue over a forum with no access. If you do open a ticket update here with the ticket ID and I'll follow up with the outcome.
 

mec-forum

Member
Jun 18, 2019
23
4
3
Italy
cPanel Access Level
Root Administrator
Well, what I meant is that it is similar to the last log they posted, after solving the pool limit issue. I am now trying to check better if the php side is somehow involved... but I do not think so, since I ended up isolating the occurrence of the error to whether some code is run or not, but maybe it's just a timing coincidence.

I am going to try this, before opening a ticket tough: How to find Which Process Is Killing mysqld With SIGKILL or SIGTERM on Linux – The Geek Diary

Update: I tried that. According on how the article tells to interpret these log lines php-fpm is killing the child process (I 'm not sure of understanding properly) as soon as I send the AJAX request that triggers the error 503. I think I will open a ticket now, but if you have any insight on this log entries please add a reply to thisthread or to the ticket.

Code:
> cat /var/log/audit/audit.log

type=CRED_DISP msg=audit(1595368388.304:44195): pid=80479 uid=0 auid=1002 ses=4705 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_END msg=audit(1595368388.305:44196): pid=80479 uid=0 auid=1002 ses=4705 msg='op=PAM:session_close grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=SYSCALL msg=audit(1595368416.744:44197): arch=c000003e syscall=62 success=yes exit=0 a0=13a43 a1=3 a2=64 a3=2e8 items=0 ppid=1 pid=19057 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="php-fpm" exe="/opt/cpanel/ea-php73/root/usr/sbin/php-fpm" key="audit_kill"
type=OBJ_PID msg=audit(1595368416.744:44197): opid=80451 oauid=-1 ouid=1002 oses=-1 ocomm="php-fpm"
type=PROCTITLE msg=audit(1595368416.744:44197): proctitle=7068702D66706D3A206D61737465722070726F6365737320282F6F70742F6370616E656C2F65612D70687037332F726F6F742F6574632F7068702D66706D2E636F6E6629
type=USER_ACCT msg=audit(1595368441.352:44198): pid=80499 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_localuser acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_ACQ msg=audit(1595368441.352:44199): pid=80499 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=LOGIN msg=audit(1595368441.352:44200): pid=80499 uid=0 old-auid=4294967295 auid=1002 tty=(none) old-ses=4294967295 ses=4706 res=1
type=USER_ACCT msg=audit(1595368441.355:44201): pid=80500 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_localuser acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_ACQ msg=audit(1595368441.355:44202): pid=80500 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=LOGIN msg=audit(1595368441.355:44203): pid=80500 uid=0 old-auid=4294967295 auid=1002 tty=(none) old-ses=4294967295 ses=4707 res=1
type=USER_START msg=audit(1595368441.355:44204): pid=80500 uid=0 auid=1002 ses=4707 msg='op=PAM:session_open grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_START msg=audit(1595368441.355:44205): pid=80499 uid=0 auid=1002 ses=4706 msg='op=PAM:session_open grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_REFR msg=audit(1595368441.355:44206): pid=80500 uid=0 auid=1002 ses=4707 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_REFR msg=audit(1595368441.385:44207): pid=80499 uid=0 auid=1002 ses=4706 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_DISP msg=audit(1595368444.031:44208): pid=80499 uid=0 auid=1002 ses=4706 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_END msg=audit(1595368444.034:44209): pid=80499 uid=0 auid=1002 ses=4706 msg='op=PAM:session_close grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_DISP msg=audit(1595368448.024:44210): pid=80500 uid=0 auid=1002 ses=4707 msg='op=PAM:setcred grantors=pam_env,pam_hulk,pam_unix acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_END msg=audit(1595368448.027:44211): pid=80500 uid=0 auid=1002 ses=4707 msg='op=PAM:session_close grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="mecgroup" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
 
Last edited:
  • Like
Reactions: cPanelLauren

cPanelLauren

Product Owner II
Staff member
Nov 14, 2017
13,266
1,309
363
Houston
Hello,


Thanks for providing that log output I made sure I noted the ticket with the log output information - it does look like they are looking for a way to replicate this behavior reliably, it may be that they need to know what the request is you're making that is causing the issue.
 

mec-forum

Member
Jun 18, 2019
23
4
3
Italy
cPanel Access Level
Root Administrator
Hello,


Thanks for providing that log output I made sure I noted the ticket with the log output information - it does look like they are looking for a way to replicate this behavior reliably, it may be that they need to know what the request is you're making that is causing the issue.
I will try to create a script that can be executed in-browser without problematic steps like login into the production crm.

Update (Debriefing): it was a php recursion problem (caused by an edge case of a value-changing middleware that cause it to recurse infinitely), which in turn caused ram exhaustion eating ram until no free ram remained after ~30 seconds from the beginning of the request. Having no memory_limit it would just crash. I discovered the root of the problem because i bit the bullet and installed xdebug, which raises an exception when the call stack depth reaches a certain limit. I hope this helps other people that incurs in this kind of problem.
 
Last edited:
  • Like
Reactions: cPanelLauren