#1 (permalink)  
Old 02-05-2008, 02:51 PM
s_2_s
SOS: Apache fails every few seconds

Hello,
I have a few servers where I deleted the domlogs folders because they had grown to 10-20 GB.
A few hours later I discovered that Apache fails every few seconds.

I then recreated domlogs and made sure that it has the same permissions and owner:

# ll
total 76
drwxr-xr-x 15 root root 4096 Feb 5 22:00 ./
drwxr-xr-x 23 root root 4096 Jan 10 20:00 ../
drwxr-xr-x 2 root root 4096 Jan 21 00:33 bin/
drwxr-xr-x 2 root root 4096 Jan 21 01:04 cgi-bin/
drwxr-xr-x 11 root root 4096 Feb 5 22:35 conf/
drwxr-xr-x 9 root root 4096 Nov 23 14:02 conf_pre_ea3/
drwx--x--x 2 root wheel 12288 Feb 5 22:42 domlogs/
drwxr-xr-x 70 root wheel 12288 Feb 5 21:45 domlogsx/
drwxr-xr-x 4 root root 4096 Feb 4 17:06 htdocs/
drwxr-xr-x 3 root root 4096 Jan 21 00:33 icons/
drwxr-xr-x 3 root root 4096 Jan 21 00:33 include/
drwxr-xr-x 2 root root 4096 Jan 21 00:48 libexec/
drwxr-xr-x 2 root root 4096 Feb 5 22:48 logs/
drwxr-xr-x 4 root root 4096 Jan 21 00:33 man/
drwxr-xr-x 2 nobody nobody 4096 Jan 21 00:33 proxy/


Then I restarted Apache and the problem repeated.

Then I commented out these directives:
#CustomLog
#BytesLog
and restarted, but the problem stayed the same.

I rebuilt Apache and PHP several times with different versions, and that didn't help either.
I have the same problem on all the servers from which I deleted domlogs.

There is nothing in /var/log/messages.
The Apache error log contains:

[Tue Feb 4 12:16:04 2008] [notice] caught SIGTERM, shutting down

Apache usually dies right after this line appears.
Please help me.
  #2 (permalink)  
Old 02-05-2008, 03:48 PM
troxalias
1. Do any files get created in the domlogs directory?
2. Does the process die immediately?
3. Have you tried running strace on the Apache process?
__________________
------
CPanel Tips and Solutions
http://www.cphelp.gr
------
  #3 (permalink)  
Old 02-05-2008, 03:59 PM
s_2_s
Thank you for your reply.
1. Yes, and their owner is root:root or root:user.
2. It dies within a few seconds.
3. No, I haven't, and I don't know how to either.

Waiting for your update.
  #4 (permalink)  
Old 02-05-2008, 04:07 PM
s_2_s
I also saw this on one of the affected servers just before Apache died:

[Wed Feb 6 00:05:03 2008] [error] Bad pid (7747) in scoreboard slot 23
[Wed Feb 6 00:05:03 2008] [error] Bad pid (7809) in scoreboard slot 24
[Wed Feb 6 00:05:03 2008] [error] Bad pid (9910) in scoreboard slot 25
[Wed Feb 6 00:05:03 2008] [error] Bad pid (10105) in scoreboard slot 26
[Wed Feb 6 00:05:03 2008] [error] Bad pid (7461) in scoreboard slot 22
[Wed Feb 6 00:05:03 2008] [error] Bad pid (7747) in scoreboard slot 23
[Wed Feb 6 00:05:03 2008] [error] Bad pid (7809) in scoreboard slot 24
[Wed Feb 6 00:05:03 2008] [error] Bad pid (9910) in scoreboard slot 25
[Wed Feb 6 00:05:03 2008] [error] Bad pid (10105) in scoreboard slot 26
[Wed Feb 6 00:05:03 2008] [notice] caught SIGTERM, shutting down



I am running your command now.
  #5 (permalink)  
Old 02-05-2008, 04:07 PM
nyjimbo
Quote:
Originally Posted by s_2_s

The Apache error log contains:

[Tue Feb 4 12:16:04 2008] [notice] caught SIGTERM, shutting down

Apache usually dies right after this line appears.
Please help me.

Are there any lines above it that give you any more ideas? Can you manually turn up the debug level in apache.conf to LogLevel debug and see if more info comes out?
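A minimal sketch of that change, in case it helps. It assumes the main configuration file is /usr/local/apache/conf/httpd.conf, which is only a guess for this build, and the helper name preview_loglevel_debug is made up for illustration. It prints the edited configuration so it can be reviewed before anything is overwritten:

```shell
#!/bin/sh
# preview_loglevel_debug CONF
# Print CONF with every active "LogLevel ..." line rewritten to
# "LogLevel debug". Commented-out lines are left alone and the file
# itself is not modified.
preview_loglevel_debug() {
    sed 's/^\([[:space:]]*\)LogLevel[[:space:]].*$/\1LogLevel debug/' "$1"
}

# Example (the path is an assumption, adjust it to your install):
#   preview_loglevel_debug /usr/local/apache/conf/httpd.conf | grep -n LogLevel
```

Debug logging is verbose, so the level should be set back once the problem has been captured.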
__________________
"A dog has raised it’s hind leg on the age of nevermore !"
-- Rolf
  #6 (permalink)  
Old 02-05-2008, 04:08 PM
troxalias
Try to run the following command:

Code:
strace -o /tmp/lala -f /usr/local/apache/bin/httpd -DSSL
It will probably take some time until Apache dies. Then check the file /tmp/lala (especially the last 50-100 lines) for any files that could not be accessed, with either the error ENOENT or EACCES. I know it is not an easy way to debug this, but it's the best I can suggest. If you can compress the file /tmp/lala and post it as an attachment, that would be perfect.
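As a rough sketch of that check, the tail of the trace can be filtered for failed file accesses. The function name scan_trace and the default of 100 lines are illustrative choices, not part of strace. Note that strace prints the permission error as EACCES:

```shell
#!/bin/sh
# scan_trace FILE [LINES]
# Print the lines among the last LINES (default 100) lines of an strace
# output file that mention ENOENT (no such file) or EACCES (permission
# denied). Returns non-zero if nothing matches.
scan_trace() {
    trace_file=$1
    line_count=${2:-100}
    tail -n "$line_count" "$trace_file" | grep -E 'ENOENT|EACCES'
}

# Example:
#   scan_trace /tmp/lala 100
```

Some ENOENT results are normal during startup and shutdown, so the hits still need to be read in context.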
  #7 (permalink)  
Old 02-05-2008, 04:10 PM
s_2_s
I also caught this in the trace:

25821 bind(16, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

It appears every once in a while, but Apache has not died yet.
  #8 (permalink)  
Old 02-05-2008, 04:16 PM
troxalias
Quote:
Originally Posted by s_2_s
I also caught this in the trace:

25821 bind(16, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

It appears every once in a while, but Apache has not died yet.

Is Apache already running? This error means that a process is already bound to TCP port 443. Try
Code:
lsof -i tcp |grep https
to check which process is bound to port 443.
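If many processes show up, the lsof lines can be grouped by command and user to make them easier to read. This is only a sketch: the function name summarize_listeners is invented here, and it assumes the usual lsof column order (COMMAND, PID, USER, ...):

```shell
#!/bin/sh
# summarize_listeners
# Read lsof-style lines on stdin and print one "command user count" row
# for each distinct (command, user) pair, sorted alphabetically.
summarize_listeners() {
    awk 'NF >= 3 { count[$1 " " $3]++ }
         END { for (key in count) print key, count[key] }' | sort
}

# Example:
#   lsof -i tcp | grep https | summarize_listeners
```

Anything other than httpd in the first column would be the process to look at.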
  #9 (permalink)  
Old 02-05-2008, 04:23 PM
s_2_s
Apache has died again.

The lala file is attached.

At the time of the Apache failure, this reappeared in the error log:

[Wed Feb 6 00:10:07 2008] [error] Bad pid (28416) in scoreboard slot 16
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22008) in scoreboard slot 18
[Wed Feb 6 00:10:07 2008] [error] Bad pid (28574) in scoreboard slot 19
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22136) in scoreboard slot 21
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22518) in scoreboard slot 22
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23590) in scoreboard slot 23
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23634) in scoreboard slot 25
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23664) in scoreboard slot 26
[Wed Feb 6 00:10:07 2008] [error] Bad pid (28416) in scoreboard slot 16
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22008) in scoreboard slot 18
[Wed Feb 6 00:10:07 2008] [error] Bad pid (28574) in scoreboard slot 19
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22136) in scoreboard slot 21
[Wed Feb 6 00:10:07 2008] [error] Bad pid (22518) in scoreboard slot 22
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23590) in scoreboard slot 23
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23634) in scoreboard slot 25
[Wed Feb 6 00:10:07 2008] [error] Bad pid (23664) in scoreboard slot 26
[Wed Feb 6 00:10:07 2008] [notice] caught SIGTERM, shutting down
Attached Files
File Type: zip lala.zip (26.4 KB, 4 views)
  #10 (permalink)  
Old 02-05-2008, 04:25 PM
s_2_s
Thanks for your interest in helping me.

While httpd was down, nothing was listening on the https port.

But after I restarted it, only httpd is listening on https:

httpd 27962 root 16u IPv4 29004321 TCP *:https (LISTEN)
httpd 28034 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
httpd 28035 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
httpd 28036 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
httpd 28037 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
httpd 28038 nobody 16u IPv4 29004321 TCP *:https (LISTEN)


Also, Apache dies only after this line: [Wed Feb 6 00:10:07 2008] [notice] caught SIGTERM, shutting down
The bind error on the SSL port does happen, but it doesn't kill Apache.
  #11 (permalink)  
Old 02-05-2008, 04:28 PM
troxalias
Can you please post the file again, but this time run it with the command below? Please make sure first that no Apache processes are still running, and also post the results of the lsof command I asked you for before:

Code:
strace -s 512 -o /tmp/lala -f /usr/local/apache/bin/httpd -DSSL
I would also suggest disabling any eAccelerator entry in the /usr/local/lib/php.ini file before you try to start Apache.
  #12 (permalink)  
Old 02-05-2008, 04:52 PM
s_2_s
It died again.

This time I see only:

[Wed Feb 6 00:35:03 2008] [notice] caught SIGTERM, shutting down

I commented out all PHP extensions.
The trace file is 200+ MB this time, so here are the last 100 lines of it:
Quote:
11380 waitpid(13539, 0xbf85a0e0, WNOHANG) = 0
11380 select(0, NULL, NULL, NULL, {0, 65536} <unfinished ...>
13539 exit_group(0) = ?
11738 fcntl64(6, F_SETFL, O_RDWR <unfinished ...>
11785 close(20 <unfinished ...>
11380 <... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
11380 --- SIGCHLD (Child exited) @ 0 (0) ---
11380 select(0, NULL, NULL, NULL, {0, 61000} <unfinished ...>
11738 <... fcntl64 resumed> ) = 0
11738 setsockopt(6, SOL_SOCKET, SO_SNDTIMEO, "\2003\341\1\0\0\0\0", 8) = 0
11738 write(6, "\1\0\0\0\1", 5) = 5
11738 shutdown(6, 2 /* send and receive */) = 0
11738 close(6) = 0
11738 brk(0x82e9000) = 0x82e9000
11738 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
12231 close(20 <unfinished ...>
11785 <... close resumed> ) = 0
11738 close(20) = 0
11785 exit_group(0) = ?
11738 exit_group(0) = ?
11380 <... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
12231 <... close resumed> ) = 0
11380 --- SIGCHLD (Child exited) @ 0 (0) ---
12231 exit_group(0) = ?
11380 select(0, NULL, NULL, NULL, {0, 51000}) = ? ERESTARTNOHAND (To be restarted)
11380 --- SIGCHLD (Child exited) @ 0 (0) ---
11380 select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
11380 waitpid(11736, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11736
11380 waitpid(11738, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11738
11380 waitpid(11739, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11739
11380 waitpid(11784, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11784
11380 waitpid(11785, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11785
11380 waitpid(11790, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11790
11380 waitpid(12031, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12031
11380 waitpid(12068, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12068
11380 waitpid(12123, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12123
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12228) in scoreboard slot 29\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12229) in scoreboard slot 30\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12230) in scoreboard slot 31\n", 73) = 73
11380 waitpid(12231, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12231
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13396) in scoreboard slot 33\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13425) in scoreboard slot 35\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13477) in scoreboard slot 36\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13478) in scoreboard slot 37\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13479) in scoreboard slot 38\n", 73) = 73
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13480) in scoreboard slot 39\n", 73) = 73
11380 waitpid(13539, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 13539
11380 unlink("/usr/local/apache/logs/httpd.pid") = 0
11380 time(NULL) = 1202251204
11380 write(15, "[Wed Feb 6 00:40:04 2008] [notice] caught SIGTERM, shutting down\n", 66) = 66
11380 close(15) = 0
11380 munmap(0xb7f4e000, 4096) = 0
11380 semctl(1474574, 0, IPC_64|IPC_RMID, 0xbf859ff4) = 0
11380 close(7) = 0
11380 munmap(0xb7671000, 4096) = 0
11380 close(4) = 0
11380 munmap(0xb7f48000, 4096) = 0
11380 close(5) = 0
11380 munmap(0xb7f4d000, 4096) = 0
11380 close(19) = 0
11380 close(18) = 0
11380 unlink("/usr/local/apache/logs/ssl_scache.dir") = 0
11380 unlink("/usr/local/apache/logs/ssl_scache.pag") = 0
11380 unlink("/usr/local/apache/logs/ssl_scache.dir") = -1 ENOENT (No such file or directory)
11380 unlink("/usr/local/apache/logs/ssl_scache.pag") = -1 ENOENT (No such file or directory)
11380 unlink("/usr/local/apache/logs/ssl_scache.db") = -1 ENOENT (No such file or directory)
11380 unlink("/usr/local/apache/logs/ssl_scache") = -1 ENOENT (No such file or directory)
11380 unlink("/usr/local/apache/logs/ssl_mutex.11367") = 0
11380 close(17) = 0
11380 close(16) = 0
11380 munmap(0xb78e2000, 2998976) = 0
11380 munmap(0xb781a000, 818920) = 0
11380 munmap(0xb76e3000, 1273472) = 0
11380 munmap(0xb76a9000, 235120) = 0
11380 munmap(0xb7679000, 196352) = 0
11380 munmap(0xb7672000, 26200) = 0
11380 munmap(0xb7662000, 33620) = 0
11380 munmap(0xb75f7000, 437856) = 0
11380 munmap(0xb7509000, 58244) = 0
11380 munmap(0xb7470000, 55444) = 0
11380 munmap(0xb7518000, 911768) = 0
11380 munmap(0xb74e5000, 145732) = 0
11380 munmap(0xb74c7000, 120820) = 0
11380 munmap(0xb74b0000, 92328) = 0
11380 munmap(0xb74a6000, 37340) = 0
11380 munmap(0xb7496000, 61720) = 0
11380 munmap(0xb7bbf000, 8092) = 0
11380 munmap(0xb7f49000, 7932) = 0
11380 munmap(0xb7f4b000, 7964) = 0
11380 waitpid(11692, NULL, WNOHANG) = 11692
11380 exit_group(0) = ?
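Since the moment the signal arrives is probably earlier than the last 100 lines, one option is to search the whole trace for it. A sketch, assuming the trace is still at /tmp/lala and that strace marks signal delivery with lines of the form "PID --- SIGNAME ... ---", like the SIGCHLD lines in the quote (the helper name find_sigterm is hypothetical):

```shell
#!/bin/sh
# find_sigterm FILE
# Print, with line numbers, every line of an strace output file that
# records a SIGTERM being delivered to a traced process.
find_sigterm() {
    # -e is needed because the pattern itself starts with dashes.
    grep -n -e '--- SIGTERM' "$1"
}

# Example:
#   find_sigterm /tmp/lala
```

The reported line numbers can then be used to print the surrounding context, for example with sed -n '100,140p' /tmp/lala for lines 100 to 140.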
  #13 (permalink)  
Old 02-05-2008, 04:57 PM
troxalias
Not very helpful. How big is the file if you zip it? Please try to run it again under strace, but instead of 512 (in -s 512) try 64 to see if the trace file gets smaller. Also try to compress it with gzip -9 in order to get maximum compression.
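If the whole file is still too big to attach, another option (not the same as lowering -s) is to keep only the end of the trace and compress that. A sketch under that assumption; the function name pack_trace_tail and the default of 2000 lines are arbitrary choices for illustration:

```shell
#!/bin/sh
# pack_trace_tail FILE [LINES]
# Keep only the last LINES (default 2000) lines of a trace file and
# compress them with gzip -9 into FILE.tail.gz. The original file is
# left untouched.
pack_trace_tail() {
    trace_file=$1
    line_count=${2:-2000}
    tail -n "$line_count" "$trace_file" | gzip -9 > "$trace_file.tail.gz"
}

# Example:
#   pack_trace_tail /tmp/lala 2000    # writes /tmp/lala.tail.gz
```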
  #14 (permalink)  
Old 02-05-2008, 05:04 PM
s_2_s
It is 115 MB when zipped at maximum compression.
  #15 (permalink)  
Old 02-05-2008, 05:15 PM
troxalias
That's not good. One last try to see if the file gets smaller:
Code:
strace -s 64 -o /tmp/lala -f /usr/local/apache/bin/httpd -F -DSSL

If the file is huge again, I will have to guess:
1. Disable any php modules in your configuration file.
2. Disable any other loaded modules (one at a time).

All times are GMT -5.