HTTPD Keeps Restarting Every 10 Minutes

md201

Active Member
Aug 13, 2001
[quote:fc2bb1c404][i:fc2bb1c404]Originally posted by myros[/i:fc2bb1c404]

Anton over at VO came up with this temporary fix to the "restarting apache every 10 minutes" problem -

1. Moved bin/httpd to bin/httpd.real
2. Wrote perl script with the following and named it bin/httpd:

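#!/usr/bin/perl

# Ignore SIGUSR1 (the handler name must be the string 'IGNORE')
$SIG{USR1} = 'IGNORE';

# Hand off to the real binary, passing the arguments through unchanged
exec('/usr/local/apache/bin/httpd.real', @ARGV);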

3. Chmodded it 755 and restarted Apache.


This causes Apache to ignore the graceful-restart signal (SIGUSR1), so a manual restart is required whenever something changes that would normally require an Apache restart.
Apache has been running 7 hours for me now without a problem, so it seems something was sending it an unneeded SIGUSR1. Hopefully somebody will figure out why pretty soon :)

Cheers,
Myros[/quote:fc2bb1c404]
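For comparison, the wrapper idea from the quoted post can be sketched in Python. This is only an illustration of the technique, not a recommended fix: the path is the one given in the post, `ignore_usr1` and `exec_real_httpd` are hypothetical names, and the sketch assumes the wrapped program does not install its own SIGUSR1 handler after it starts (a program that does will override the ignored disposition).

```python
import os
import signal

# Path taken from the quoted post; adjust for your own layout.
REAL_HTTPD = "/usr/local/apache/bin/httpd.real"


def ignore_usr1():
    """Set SIGUSR1 to 'ignore'; an ignored signal stays ignored across exec."""
    signal.signal(signal.SIGUSR1, signal.SIG_IGN)


def exec_real_httpd(args, real_path=REAL_HTTPD):
    """Replace this process with the real binary, passing args unchanged."""
    ignore_usr1()
    # execv takes the argument vector as a list, so no shell quoting is involved.
    os.execv(real_path, [real_path] + list(args))
```

A wrapper script built on this would end by calling `exec_real_httpd(sys.argv[1:])`.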

I tried the above script, but it doesn't work on my system:




[Fri Jul 19 00:48:26 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:06:00 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:16:01 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:26:35 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:36:35 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:46:35 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 01:57:10 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:07:10 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:17:20 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:22:20 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:29:48 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:41:37 2002] [notice] SIGUSR1 received. Doing graceful restart
[Fri Jul 19 02:51:46 2002] [notice] SIGUSR1 received. Doing graceful restart

******** INSTALL SCRIPT ***********

[Fri Jul 19 03:01:50 2002] [notice] SIGUSR1 received. Doing graceful restart


^^^^^^^^^^^^^^^^^ Problem not solved.
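To check how regular the restarts really are, the gaps between the log entries can be computed. The following is a rough sketch, assuming error_log lines in the format shown above; `restart_times` and `restart_gaps` are hypothetical helper names, and the sample lines are copied from the log in this post.

```python
from datetime import datetime
import re

# Captures the timestamp of an error_log line that reports a graceful restart.
RESTART_RE = re.compile(r"^\[(.+?)\] \[notice\] SIGUSR1 received")


def restart_times(lines):
    """Return the timestamps of all graceful-restart log lines, in order."""
    times = []
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y"))
    return times


def restart_gaps(lines):
    """Return the gap in seconds between consecutive graceful restarts."""
    times = restart_times(lines)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


sample = [
    "[Fri Jul 19 01:06:00 2002] [notice] SIGUSR1 received. Doing graceful restart",
    "[Fri Jul 19 01:16:01 2002] [notice] SIGUSR1 received. Doing graceful restart",
    "[Fri Jul 19 01:26:35 2002] [notice] SIGUSR1 received. Doing graceful restart",
]
print(restart_gaps(sample))  # prints [601, 634]
```

Gaps clustered around 600 seconds point at something scheduled, rather than at random crashes.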
 

hst

Well-Known Member
Feb 24, 2002
Check your httpd.conf:

/etc/rc.d/init.d/httpd configtest

Run this and see if there are any errors.
If there are, fix the httpd.conf file.

That's just an idea, but if it keeps restarting there may be a syntax error somewhere in there. You might also try restarting crond, which may clear up something that is stuck in a loop.

/etc/rc.d/init.d/crond restart

I know my crond gets messed up sometimes.
 

hooper

Registered
Apr 12, 2002
Has this been figured out?

I checked my logs and I have tons of graceful restarts.

Is this still an issue for you folks?
 

alwaysweb

Well-Known Member
Mar 8, 2002
Dallas, TX
cPanel Access Level
Root Administrator
Background info: http://httpd.apache.org/docs/stopping.html


"USR1 Signal: graceful restart
Note: prior to release 1.2b9 this code is quite unstable and shouldn't be used at all.

The USR1 signal causes the parent process to advise the children to exit after their current request (or to exit immediately if they're not serving anything). The parent re-reads its configuration files and re-opens its log files. As each child dies off the parent replaces it with a child from the new generation of the configuration, which begins serving new requests immediately.

This code is designed to always respect the MaxClients, MinSpareServers, and MaxSpareServers settings. Furthermore, it respects StartServers in the following manner: if after one second at least StartServers new children have not been created, then create enough to pick up the slack. This is to say that the code tries to maintain both the number of children appropriate for the current load on the server, and respect your wishes with the StartServers parameter.

Users of the status module will notice that the server statistics are not set to zero when a USR1 is sent. The code was written to both minimize the time in which the server is unable to serve new requests (they will be queued up by the operating system, so they're not lost in any event) and to respect your tuning parameters. In order to do this it has to keep the scoreboard used to keep track of all children across generations.

The status module will also use a G to indicate those children which are still serving requests started before the graceful restart was given.

At present there is no way for a log rotation script using USR1 to know for certain that all children writing the pre-restart log have finished. We suggest that you use a suitable delay after sending the USR1 signal before you do anything with the old log. For example if most of your hits take less than 10 minutes to complete for users on low bandwidth links then you could wait 15 minutes before doing anything with the old log.

Note: If your configuration file has errors in it when you issue a restart then your parent will not restart, it will exit with an error. In the case of graceful restarts it will also leave children running when it exits. (These are the children which are "gracefully exiting" by handling their last request.) This will cause problems if you attempt to restart the server -- it will not be able to bind to its listening ports. Before doing a restart, you can check the syntax of the configuration files with the -t command line argument (see httpd). This still will not guarantee that the server will restart correctly. To check the semantics of the configuration files as well as the syntax, you can try starting httpd as a non-root user. If there are no errors it will attempt to open its sockets and logs and fail because it's not root (or because the currently running httpd already has those ports bound). If it fails for any other reason then it's probably a config file error and the error should be fixed before issuing the graceful restart."
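The advice in the last note (check the configuration before restarting) could be scripted roughly as follows. This is a sketch under assumptions, not part of Apache or cPanel: the binary path is the cPanel-style location seen elsewhere in this thread, `config_ok` and `graceful_restart_if_valid` are hypothetical names, and it assumes `httpd -t` exits with a non-zero status when the syntax check fails.

```python
import os
import signal
import subprocess

# Assumed location of the httpd binary; adjust for your own layout.
HTTPD = "/usr/local/apache/bin/httpd"


def config_ok(command=(HTTPD, "-t")):
    """Run a syntax-check command and report whether it exited with status 0."""
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except OSError:
        # A missing or non-executable binary counts as a failed check.
        return False
    return result.returncode == 0


def graceful_restart_if_valid(parent_pid, check=config_ok):
    """Send SIGUSR1 to the parent httpd only when the config check passes."""
    if not check():
        return False
    os.kill(parent_pid, signal.SIGUSR1)
    return True
```

As the quoted text says, a passing syntax check still does not guarantee a clean restart; it only rules out one class of failure.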
 

porcupine

Well-Known Member
PartnerNOC
Apr 18, 2002
Toronto, Ontario
cPanel Access Level
DataCenter Provider
I too have been having this problem on multiple servers. Some are bare Red Hat 7.3, some are Red Hat 7.2 with the 2.4.19 kernel. Ever since we upgraded the kernel on a few of them (I even recompiled Apache just in case), they started doing this. The network monitor has also been reporting httpd down, after which cPanel restarts Apache and returns:

apache failed @ Mon Aug 26 14:28:06 2002. A restart was attempted automagicly.

I have noticed this in the logs, though there is nothing I could find indicating *why* Apache was stopping/crashing:

Warning: /boot/System.map has an incorrect kernel version.
Warning: /boot/System.map has an incorrect kernel version.
[Mon Aug 26 15:06:43 2002] [notice] Apache/1.3.26 (Unix) AuthMySQL/2.20 mod_log_bytes/0.3 mod_bwlimited/1.0 PHP/4.2.2 FrontPage/5.0.2.2510 mod_ssl/2.8.9 OpenSSL/0.9.6b configured -- resuming normal operations
[Mon Aug 26 15:06:43 2002] [notice] suEXEC mechanism enabled (wrapper: /usr/local/apache/bin/suexec)
[Mon Aug 26 15:06:43 2002] [notice] Accept mutex: sysvsem (Default: sysvsem)
 

cyberlin

Registered
Aug 29, 2002
Same Thing

WHM 4.9.0
Cpanel 4.9.0-24
RedHat 7.2


I am having the same problem... It is causing problems with a live e-commerce site. The post data is lost.

Hope there is a fix soon
 

nitromax

Well-Known Member
Feb 12, 2002
I am having the same problem. We have people running the ShoppingQ e-commerce software on our server, and from time to time various database files get corrupted which causes the store to appear that it is under construction.

We set up a cron job that checks whether the database files are up and running; if they aren't, the cron job restores them. When that happens, it sends us an email with a timestamp to let us know when the store went down.

In checking the error logs, it seems that the database files are corrupted within the same minute that the "SIGUSR1 received. Doing graceful restart" message comes up. I suspect that's why the store's databases are crashing.
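A correlation like this can be checked mechanically instead of by eye. Below is a minimal sketch, assuming both sets of timestamps have already been parsed into `datetime` objects; `events_near_restarts` is a hypothetical helper, and the timestamps in the usage lines are made up purely for illustration.

```python
from datetime import datetime, timedelta


def events_near_restarts(events, restarts, window_seconds=60):
    """Return (event, restart) pairs whose timestamps fall within the window."""
    window = timedelta(seconds=window_seconds)
    pairs = []
    for event in events:
        for restart in restarts:
            if abs(event - restart) <= window:
                pairs.append((event, restart))
                break  # one matching restart per event is enough
    return pairs


# Made-up timestamps, for illustration only.
restarts = [datetime(2002, 9, 1, 10, 0, 5), datetime(2002, 9, 1, 10, 10, 6)]
corruptions = [datetime(2002, 9, 1, 10, 10, 40), datetime(2002, 9, 1, 10, 15, 0)]
matches = events_near_restarts(corruptions, restarts)
print(len(matches))  # prints 1
```

If nearly every corruption event has a matching restart, the restart is a strong suspect; a match does not by itself prove the cause.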

This was Nick's answer on page 4 of this post: "It's only going to gracefully restart apache if it processes a log file, so apache will reopen the log file and begin writing to it again so you don't lose your stats. If you are getting segfaults, try commenting out any extra apache modules you have added, i.e. mod_perl, mod_headers, etc., as some of them are not compatible."


I commented out various mod_* modules and have only received one Seg fault 11 error. But I guess I don't see or understand the fix for the graceful restart problem in his answer.

I notice this problem started to be reported about two and a half months ago. I sure wish there was a solution! :)

Anyone figure this out yet??? :)
 

WebHostPro

Well-Known Member
PartnerNOC
Jul 28, 2002
LA, Costa RIca
cPanel Access Level
Root Administrator
Hey,

I hate to join the bandwagon but I got the bug!

But here it is for me:

[Sat Sep 21 12:14:47 2002] [notice] SIGUSR1 received. Doing graceful restart
[Sat Sep 21 12:14:48 2002] [error] mod_ssl: Child could not open SSLMutex lockfile /usr/local/apache/logs/ssl_mutex.30358 (Sy$
[Sat Sep 21 12:14:48 2002] [error] System: No such file or directory (errno: 2)
Failed loading /usr/local/Zend/lib/ZendOptimizer.so: /usr/local/Zend/lib/ZendOptimizer.so: cannot open shared object file: N$
[Sat Sep 21 12:14:48 2002] [notice] Apache/1.3.26 (Unix) mod_log_bytes/0.3 mod_bwlimited/1.0 PHP/4.2.2 FrontPage/5.0.2.2510 m$
[Sat Sep 21 12:14:48 2002] [notice] suEXEC mechanism enabled (wrapper: /usr/local/apache/bin/suexec)
[Sat Sep 21 12:14:48 2002] [notice] Accept mutex: sysvsem (Default: sysvsem)


Every ten minutes :)

Any help is much appreciated.

Thanks

Charles
 

Annette

Well-Known Member
PartnerNOC
Aug 12, 2001
We're also starting to see this more explicitly across numerous boxes, although in the past it really hasn't been that big a deal since apache always came back without issues. No tomcat, no mod_perl, no mod_gzip, nothing terribly special in the way of compiles.

In addition to this thread, these other threads indicate the problem has been going on for some time and is affecting quite a number of people:

http://forums.cpanel.net/read.php?TID=3402
http://forums.cpanel.net/read.php?TID=4434
http://forums.cpanel.net/read.php?TID=4853

On some servers under our management, the restart is every 10 minutes without fail, on others, it varies from minutes to hours between restarts. This is fairly typical:

[Wed Sep 25 08:04:37 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 08:14:38 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 08:24:39 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 08:34:50 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 08:45:02 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 08:55:03 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 09:05:03 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 09:29:01 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 09:39:01 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 09:49:02 2002] [notice] SIGUSR1 received. Doing graceful restart
[Wed Sep 25 09:59:03 2002] [notice] SIGUSR1 received. Doing graceful restart

On some of these boxes, there are random segfaults here and there, but nothing serious, and not immediately prior to the restarts. There are no serious errors in the logs indicating that someone is pounding away at the box, and no oddball errors indicating someone is intentionally trying to force failure. There is no correlation between adding or removing accounts from the boxes and the random fails/restarts (that is, there is nothing being done on the server from a management standpoint when the restarts occur).

I haven't been able to find anything really useful about this problem - and it is a problem, as there is absolutely no reason for apache to do this. If it's related to the rotation of logs for counting, then perhaps a new routine is in order to do this less frequently. I'd prefer to have apache run as the solid process that it is rather than have the occasional page not served because all the child processes have been allowed to complete but no new processes have yet spawned.
 

hostbet

Well-Known Member
Aug 13, 2001
this is what I get from apache:
===============
Warning: DocumentRoot [/dev/null] does not exis
Syntax error on line 2126 of /usr/local/apache/conf/httpd.conf:
Invalid command 'SSLVerifyClient', perhaps mis-spelled or defined by a module not included in the server configuration
===============

The server stops working at about 8:30 PM (every day now), and a reboot is needed.

I tried to create a folder at /dev/null, but null is not a folder; it is a device file that is part of Linux, and somehow Apache thinks it is a folder.

I renamed the null device to test and created a folder named null in its place. Apache was happy and found the folder, but WHM did not work properly, so I had to put everything back to normal.
 

bdraco

Guest
SIGUSR1 is NORMAL. This is required for apache to reopen log files. If your server is crashing when getting this signal there is something very wrong.

99.9999% of the time it is because mod_auth_mysql is installed with PHP. They are not compatible... get rid of mod_auth_mysql.
 

moronhead

Well-Known Member
Aug 12, 2001
[quote:b64111832e][i:b64111832e]Originally posted by bdraco[/i:b64111832e]

SIGUSR1 is NORMAL. This is required for apache to reopen log files. If your server is crashing when getting this signal there is something very wrong. [/quote:b64111832e]
If there are 10 processes running at the time of the SIGUSR1 request and it takes, say, about 10 seconds to complete all 10 jobs, does that mean any new web processes are put on hold during those 10 seconds?
 

bdraco

Guest
[quote:1196567755][i:1196567755]Originally posted by moronhead[/i:1196567755]

[quote:1196567755][i:1196567755]Originally posted by bdraco[/i:1196567755]

SIGUSR1 is NORMAL. This is required for apache to reopen log files. If your server is crashing when getting this signal there is something very wrong. [/quote:1196567755]
If there are 10 processes running at the time of the SIGUSR1 request and it takes, say, about 10 seconds to complete all 10 jobs, does that mean any new web processes are put on hold during those 10 seconds?[/quote:1196567755]

Nope.. from the apache site:

The USR1 signal causes the parent process to advise the children to exit after their current request (or to exit immediately if they're not serving anything). The parent re-reads its configuration files and re-opens its log files. As each child dies off the parent replaces it with a child from the new generation of the configuration, which begins serving new requests immediately.
 

moronhead

Well-Known Member
Aug 12, 2001
Well, this kind of explains the reason for this heart-warming message in the error logs after a SIGUSR1:

[Thu Sep 26 10:29:19 2002] [warn] long lost child came home! (pid 7772)

It looks like this child has been taken away from its parent's control without consent at the SIGUSR1.

Although it still isn't clear from that apache quote what happens to the new job requests received during the span between SIGUSR1 and the time when the parent becomes available to receive those requests.
 

itf

Well-Known Member
May 9, 2002
[quote:cb15a68fc8][i:cb15a68fc8]Originally posted by moronhead[/i:cb15a68fc8]

Well, this kind of explains the reason for this heart-warming message in the error logs after a SIGUSR1:

[Thu Sep 26 10:29:19 2002] [warn] long lost child came home! (pid 7772)

It looks like this child has been taken away from its parent's control without consent at the SIGUSR1.

Although it still isn't clear from that apache quote what happens to the new job requests received during the span between SIGUSR1 and the time when the parent becomes available to receive those requests.[/quote:cb15a68fc8]
[Thu Sep 26 10:29:19 2002] [warn] long lost child came home!
I've seen this kind of error on non-cPanel servers with Cronolog sub-processes.
Sometimes a child process can get out of the parent's control, but this does not mean that you will lose requests. The USR1 signal does not shut the program down completely; it just reloads it.
The OS can also queue up requests.
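The point about the OS queueing requests can be seen with a plain TCP socket. This sketch shows general kernel behaviour on a listening socket, not anything specific to Apache; `backlog_demo` is a hypothetical name, and the port is chosen by the OS.

```python
import socket


def backlog_demo(message=b"GET / HTTP/1.0\r\n\r\n"):
    """Show that a client can connect and send before the server calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(5)                # ask the kernel to queue pending connections
    port = server.getsockname()[1]

    # The server has not called accept() yet, but the kernel completes the
    # TCP handshake and holds the connection and its data in the queue.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(message)

    # Only now does the "server" pick the connection up and read the data.
    conn, _ = server.accept()
    conn.settimeout(5)
    received = b""
    while len(received) < len(message):
        chunk = conn.recv(len(message) - len(received))
        if not chunk:
            break
        received += chunk

    for sock in (conn, client, server):
        sock.close()
    return received
```

The queue is finite, so this only covers short gaps; a server that stays away from accept() for long enough will still cause clients to time out or be refused.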
 

hostbet

Well-Known Member
Aug 13, 2001
Nick, can we have an automatic server reboot feature?

If Apache has failed a few times, allow the entire server to reboot itself (optional).

It has happened to me a few times, and the server stayed offline until a few users got mad at me and let me know it was down.
 

itf

Well-Known Member
May 9, 2002
[quote:6047d27c15][i:6047d27c15]Originally posted by hostbet[/i:6047d27c15]

Nick, can we have an automatic server reboot feature?

If Apache has failed a few times, allow the entire server to reboot itself (optional).

It has happened to me a few times, and the server stayed offline until a few users got mad at me and let me know it was down.[/quote:6047d27c15]
It is not necessary to reboot the server just because Apache has failed; there is a running daemon, chkservd, which monitors and restarts failed services.
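The two ideas in this exchange (restart the service first, escalate only after repeated failures) can be sketched as a small watchdog helper. This is a hypothetical illustration and not how chkservd is implemented; `port_is_open`, `FailureCounter`, and the threshold of 3 are all assumptions.

```python
import socket


def port_is_open(host="127.0.0.1", port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailureCounter:
    """Track consecutive failed checks and say when to escalate."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def record(self, ok):
        """Record one check result; return 'ok', 'restart', or 'escalate'."""
        if ok:
            self.failures = 0
            return "ok"
        self.failures += 1
        return "escalate" if self.failures >= self.threshold else "restart"
```

A monitoring loop would call `counter.record(port_is_open())` on a timer, restart the service on "restart", and page someone (or reboot, if that is really wanted) on "escalate".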
 

Vital

Active Member
Nov 17, 2001
Nick, it seems like we have a kinda big problem with SIGUSR1 restarts here... Two of our dedicated server customers use persistent MySQL connections from Perl CGI scripts, and here's what we have:

"filename.pl: DBD::mysql::st execute failed: Server shutdown in progress at a filename.pl line XX."

The application is time-critical, i.e., you can't just reload the page if it fails. Can I set this thing to restart Apache once a day, for example, and if so, how? Also, all PHP chats with persistent connections and transparent sessions get broken this way... I don't think it was a good idea. But from your point of view as a system administrator, I don't see a way around it either, except for decreasing the restart rate.
 

lightnin

Member
Jul 11, 2002
Kinda seems like a HUGE problem to me. The load shoots up like crazy when apache restarts and by the time it's worked its way back down to .3 where it generally sits, the damn thing restarts again.

This sucks!