  1. #1
    Member
    Join Date
    Aug 2004
    Posts
    215

    SOS apache fails every few seconds

    Hello,
    I have a few servers where I deleted the domlogs folders because they had grown past 10-20 GB.
    A few hours later I discovered that Apache fails every few seconds.

    I then recreated domlogs and made sure it has the same permissions and owner as before:

    # ll
    total 76
    drwxr-xr-x 15 root root 4096 Feb 5 22:00 ./
    drwxr-xr-x 23 root root 4096 Jan 10 20:00 ../
    drwxr-xr-x 2 root root 4096 Jan 21 00:33 bin/
    drwxr-xr-x 2 root root 4096 Jan 21 01:04 cgi-bin/
    drwxr-xr-x 11 root root 4096 Feb 5 22:35 conf/
    drwxr-xr-x 9 root root 4096 Nov 23 14:02 conf_pre_ea3/
    drwx--x--x 2 root wheel 12288 Feb 5 22:42 domlogs/
    drwxr-xr-x 70 root wheel 12288 Feb 5 21:45 domlogsx/
    drwxr-xr-x 4 root root 4096 Feb 4 17:06 htdocs/
    drwxr-xr-x 3 root root 4096 Jan 21 00:33 icons/
    drwxr-xr-x 3 root root 4096 Jan 21 00:33 include/
    drwxr-xr-x 2 root root 4096 Jan 21 00:48 libexec/
    drwxr-xr-x 2 root root 4096 Feb 5 22:48 logs/
    drwxr-xr-x 4 root root 4096 Jan 21 00:33 man/
    drwxr-xr-x 2 nobody nobody 4096 Jan 21 00:33 proxy/


    I then restarted Apache and the problem came back.

    Next I commented out the
    #CustomLog
    #BytesLog
    directives and restarted, but the problem is the same.

    I rebuilt Apache and PHP several times with different versions, and that didn't help either.
    I have the same problem on every server where I deleted domlogs.

    There is nothing in /var/log/messages.
    The Apache error log shows:

    [Tue Feb 4 12:16:04 2008] [notice] caught SIGTERM, shutting down

    Apache usually dies right after this line appears.
    Please help me.
    Waiting...

  2. #2
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    1. Do any files get created in the domlogs directory?
    2. Does the process die immediately?
    3. Have you tried to strace the Apache process?
    ------
    CPanel Tips and Solutions
    http://www.cphelp.gr
    ------

  3. #3
    Member
    Join Date
    Aug 2004
    Posts
    215

    Thank you for your reply.
    1. Yes, and their owner is root:root or root:user.
    2. It dies within a few seconds.
    3. No, I haven't, and I don't know how to either.

    Waiting for your update.

  4. #4
    Member
    Join Date
    Aug 2004
    Posts
    215

    I also saw this on one of the affected servers just before Apache died:

    [Wed Feb 6 00:05:03 2008] [error] Bad pid (7747) in scoreboard slot 23
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (7809) in scoreboard slot 24
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (9910) in scoreboard slot 25
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (10105) in scoreboard slot 26
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (7461) in scoreboard slot 22
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (7747) in scoreboard slot 23
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (7809) in scoreboard slot 24
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (9910) in scoreboard slot 25
    [Wed Feb 6 00:05:03 2008] [error] Bad pid (10105) in scoreboard slot 26
    [Wed Feb 6 00:05:03 2008] [notice] caught SIGTERM, shutting down



    Running your command now.

  5. #5
    Member nyjimbo
    Join Date
    Jan 2003
    Location
    New York
    Posts
    1,105

    Originally posted by s_2_s:

    The Apache error log shows:

    [Tue Feb 4 12:16:04 2008] [notice] caught SIGTERM, shutting down

    Apache usually dies right after this line appears.
    Please help me.
    Waiting...

    Are there any lines above it that give you more to go on? Can you manually raise the log level in Apache's configuration file to LogLevel debug and see whether more information comes out?
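
    A minimal sketch for checking which LogLevel is currently active before changing it. The helper name show_loglevel and the config path in the usage note are assumptions for illustration, not details given in this thread.

```shell
# Hypothetical helper: print every active (uncommented) LogLevel directive
# in an Apache config file, prefixed with its line number.
# $1 = path to the config file
show_loglevel() {
    grep -niE '^[[:space:]]*LogLevel[[:space:]]+' "$1"
}
```

    Usage might look like `show_loglevel /usr/local/apache/conf/httpd.conf` (path assumed from the /usr/local/apache layout in the first post). After changing the reported line to `LogLevel debug` and restarting, the error log should become more verbose.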
    "A dog has raised it’s hind leg on the age of nevermore !"
    -- Rolf

  6. #6
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    Try to run the following command:

    Code:
    strace -o /tmp/lala -f /usr/local/apache/bin/httpd -DSSL
    It will probably take some time until Apache dies. Then check the file /tmp/lala (especially the last 50-100 lines) for any files that could not be accessed, indicated by the error ENOENT or EACCES. I know it is not an easy way to debug this, but it's the best I can suggest. If you can compress /tmp/lala and post it as an attachment, that would be perfect.
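
    The check described above can be sketched as a small filter over the tail of the trace. The function name trace_failed_access is hypothetical; the default of 100 lines mirrors the "last 50-100 lines" suggestion.

```shell
# Hypothetical helper: show failed file accesses near the end of an
# strace output file.
# $1 = trace file (for example /tmp/lala)
# $2 = how many trailing lines to inspect (defaults to 100)
trace_failed_access() {
    tail -n "${2:-100}" "$1" | grep -E 'ENOENT|EACCES'
}
```

    For example, `trace_failed_access /tmp/lala 100` prints only the calls that failed with "No such file or directory" or "Permission denied". Some ENOENT results are harmless (a program probing for optional files), so each hit still needs to be read in context.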

  7. #7
    Member
    Join Date
    Aug 2004
    Posts
    215

    I also caught this in the trace:

    25821 bind(16, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

    It shows up every so often, but Apache has not died yet.

  8. #8
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    Originally posted by s_2_s:
    I also caught this in the trace:

    25821 bind(16, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

    It shows up every so often, but Apache has not died yet.

    Is Apache already running? This error means that a process is already bound to TCP port 443. Try
    Code:
    lsof -i tcp | grep https
    to check which process is bound to port 443.
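
    When lsof returns many lines, a short summary can make it easier to see who holds the port. This is a sketch only; the function name listeners_summary is hypothetical, and it assumes the usual lsof column order (command, PID, user, ...).

```shell
# Hypothetical helper: read `lsof -i` output on stdin and print how many
# listening sockets each command/user pair holds, as "count command user".
listeners_summary() {
    awk '/\(LISTEN\)/ { n[$1 " " $3]++ } END { for (k in n) print n[k], k }' | sort
}
```

    It could be fed with `lsof -i tcp:443 | listeners_summary`. Anything other than httpd in the output would point to another program holding port 443.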

  9. #9
    Member
    Join Date
    Aug 2004
    Posts
    215

    Now Apache has died again.

    The lala file is attached.

    At the time of the failure this reappeared in the error log:

    [Wed Feb 6 00:10:07 2008] [error] Bad pid (28416) in scoreboard slot 16
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22008) in scoreboard slot 18
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (28574) in scoreboard slot 19
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22136) in scoreboard slot 21
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22518) in scoreboard slot 22
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23590) in scoreboard slot 23
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23634) in scoreboard slot 25
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23664) in scoreboard slot 26
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (28416) in scoreboard slot 16
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22008) in scoreboard slot 18
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (28574) in scoreboard slot 19
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22136) in scoreboard slot 21
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (22518) in scoreboard slot 22
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23590) in scoreboard slot 23
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23634) in scoreboard slot 25
    [Wed Feb 6 00:10:07 2008] [error] Bad pid (23664) in scoreboard slot 26
    [Wed Feb 6 00:10:07 2008] [notice] caught SIGTERM, shutting down

  10. #10
    Member
    Join Date
    Aug 2004
    Posts
    215

    Thanks for your interest in helping me.

    While httpd was down, nothing was listening on the https port.

    After I restarted it, only httpd is on https:

    httpd 27962 root 16u IPv4 29004321 TCP *:https (LISTEN)
    httpd 28034 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
    httpd 28035 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
    httpd 28036 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
    httpd 28037 nobody 16u IPv4 29004321 TCP *:https (LISTEN)
    httpd 28038 nobody 16u IPv4 29004321 TCP *:https (LISTEN)


    Also, Apache only dies after [Wed Feb 6 00:10:07 2008] [notice] caught SIGTERM, shutting down
    The error about listening on the SSL port does occur, but it doesn't kill Apache.
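
    A minimal sketch for listing when each shutdown notice was logged, which may help show whether the shutdowns follow a pattern. The function name sigterm_times is hypothetical, the error log path in the usage note is an assumption, and the field position relies on the timestamp format of the log lines quoted in this thread.

```shell
# Hypothetical helper: print the time-of-day of every
# "caught SIGTERM" notice in an Apache error log.
# $1 = path to the error log
sigterm_times() {
    # Fields look like: [Wed Feb 6 00:10:07 2008] [notice] caught SIGTERM, ...
    awk '/caught SIGTERM/ { print $4 }' "$1"
}
```

    For example, `sigterm_times /usr/local/apache/logs/error_log` (path assumed). Evenly spaced times would suggest something external is stopping Apache on a schedule, though that is only one possible reading.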

  11. #11
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    Can you please run it again with the command below and post the new file? Please make sure first that no Apache processes are still running, and also post the output of the lsof command I asked you for before.

    Code:
    strace -s 512 -o /tmp/lala -f /usr/local/apache/bin/httpd -DSSL
    I would also suggest disabling any eAccelerator entry in /usr/local/lib/php.ini before you try to start Apache.

  12. #12
    Member
    Join Date
    Aug 2004
    Posts
    215

    It died again.

    This time I see only:


    [Wed Feb 6 00:35:03 2008] [notice] caught SIGTERM, shutting down

    I commented out all PHP extensions.
    The trace file is 200+ MB this time, so here are its last 100 lines:
    11380 waitpid(13539, 0xbf85a0e0, WNOHANG) = 0
    11380 select(0, NULL, NULL, NULL, {0, 65536} <unfinished ...>
    13539 exit_group(0) = ?
    11738 fcntl64(6, F_SETFL, O_RDWR <unfinished ...>
    11785 close(20 <unfinished ...>
    11380 <... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
    11380 --- SIGCHLD (Child exited) @ 0 (0) ---
    11380 select(0, NULL, NULL, NULL, {0, 61000} <unfinished ...>
    11738 <... fcntl64 resumed> ) = 0
    11738 setsockopt(6, SOL_SOCKET, SO_SNDTIMEO, "\2003\341\1\0\0\0\0", 8) = 0
    11738 write(6, "\1\0\0\0\1", 5) = 5
    11738 shutdown(6, 2 /* send and receive */) = 0
    11738 close(6) = 0
    11738 brk(0x82e9000) = 0x82e9000
    11738 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
    12231 close(20 <unfinished ...>
    11785 <... close resumed> ) = 0
    11738 close(20) = 0
    11785 exit_group(0) = ?
    11738 exit_group(0) = ?
    11380 <... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
    12231 <... close resumed> ) = 0
    11380 --- SIGCHLD (Child exited) @ 0 (0) ---
    12231 exit_group(0) = ?
    11380 select(0, NULL, NULL, NULL, {0, 51000}) = ? ERESTARTNOHAND (To be restarted)
    11380 --- SIGCHLD (Child exited) @ 0 (0) ---
    11380 select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
    11380 waitpid(11736, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11736
    11380 waitpid(11738, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11738
    11380 waitpid(11739, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11739
    11380 waitpid(11784, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11784
    11380 waitpid(11785, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11785
    11380 waitpid(11790, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 11790
    11380 waitpid(12031, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12031
    11380 waitpid(12068, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12068
    11380 waitpid(12123, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12123
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12228) in scoreboard slot 29\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12229) in scoreboard slot 30\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (12230) in scoreboard slot 31\n", 73) = 73
    11380 waitpid(12231, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12231
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13396) in scoreboard slot 33\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13425) in scoreboard slot 35\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13477) in scoreboard slot 36\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13478) in scoreboard slot 37\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13479) in scoreboard slot 38\n", 73) = 73
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [error] Bad pid (13480) in scoreboard slot 39\n", 73) = 73
    11380 waitpid(13539, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 13539
    11380 unlink("/usr/local/apache/logs/httpd.pid") = 0
    11380 time(NULL) = 1202251204
    11380 write(15, "[Wed Feb 6 00:40:04 2008] [notice] caught SIGTERM, shutting down\n", 66) = 66
    11380 close(15) = 0
    11380 munmap(0xb7f4e000, 4096) = 0
    11380 semctl(1474574, 0, IPC_64|IPC_RMID, 0xbf859ff4) = 0
    11380 close(7) = 0
    11380 munmap(0xb7671000, 4096) = 0
    11380 close(4) = 0
    11380 munmap(0xb7f48000, 4096) = 0
    11380 close(5) = 0
    11380 munmap(0xb7f4d000, 4096) = 0
    11380 close(19) = 0
    11380 close(18) = 0
    11380 unlink("/usr/local/apache/logs/ssl_scache.dir") = 0
    11380 unlink("/usr/local/apache/logs/ssl_scache.pag") = 0
    11380 unlink("/usr/local/apache/logs/ssl_scache.dir") = -1 ENOENT (No such file or directory)
    11380 unlink("/usr/local/apache/logs/ssl_scache.pag") = -1 ENOENT (No such file or directory)
    11380 unlink("/usr/local/apache/logs/ssl_scache.db") = -1 ENOENT (No such file or directory)
    11380 unlink("/usr/local/apache/logs/ssl_scache") = -1 ENOENT (No such file or directory)
    11380 unlink("/usr/local/apache/logs/ssl_mutex.11367") = 0
    11380 close(17) = 0
    11380 close(16) = 0
    11380 munmap(0xb78e2000, 2998976) = 0
    11380 munmap(0xb781a000, 818920) = 0
    11380 munmap(0xb76e3000, 1273472) = 0
    11380 munmap(0xb76a9000, 235120) = 0
    11380 munmap(0xb7679000, 196352) = 0
    11380 munmap(0xb7672000, 26200) = 0
    11380 munmap(0xb7662000, 33620) = 0
    11380 munmap(0xb75f7000, 437856) = 0
    11380 munmap(0xb7509000, 58244) = 0
    11380 munmap(0xb7470000, 55444) = 0
    11380 munmap(0xb7518000, 911768) = 0
    11380 munmap(0xb74e5000, 145732) = 0
    11380 munmap(0xb74c7000, 120820) = 0
    11380 munmap(0xb74b0000, 92328) = 0
    11380 munmap(0xb74a6000, 37340) = 0
    11380 munmap(0xb7496000, 61720) = 0
    11380 munmap(0xb7bbf000, 8092) = 0
    11380 munmap(0xb7f49000, 7932) = 0
    11380 munmap(0xb7f4b000, 7964) = 0
    11380 waitpid(11692, NULL, WNOHANG) = 11692
    11380 exit_group(0) = ?

  13. #13
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    Not very helpful. How big is the file if you zip it? Please run it again under strace, but use 64 instead of 512 (in -s 512) to see if the trace file gets smaller. Also compress it with gzip -9 to get maximum compression.
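
    Another way to keep the excerpt small is to pull out only the lines that deal with the SIGTERM itself. This is a sketch under assumptions: the function name trace_sigterm_lines is hypothetical, the `--- SIGTERM` pattern follows the format of the SIGCHLD lines in the trace above, and a `kill(...)` line will only be present if the sender is one of the traced processes.

```shell
# Hypothetical helper: print, with line numbers, the lines of an strace
# output file where SIGTERM is delivered or sent via kill().
# $1 = trace file written by `strace -o`
trace_sigterm_lines() {
    grep -nE -- '--- SIGTERM|kill\(.*SIGTERM' "$1"
}
```

    Running `trace_sigterm_lines /tmp/lala` gives the line numbers to look around in the full file. If no kill() call shows up, the signal most likely came from a process outside the traced httpd tree.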

  14. #14
    Member
    Join Date
    Aug 2004
    Posts
    215

    It's 115 MB at maximum compression.

  15. #15
    Member
    Join Date
    Nov 2001
    Location
    Athens - Greece
    Posts
    98

    That's not good. One last try to see if the file gets smaller:
    Code:
    strace -s 64 -o /tmp/lala -f /usr/local/apache/bin/httpd -F -DSSL

    If the file is still huge, I will have to guess:
    1. Disable any php modules in your configuration file.
    2. Disable any other loaded modules (one at a time).
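
    To work through the modules one at a time, it helps to list the active ones first. A minimal sketch, assuming modules are loaded through LoadModule lines in the main config file; the function name list_loaded_modules and the path in the usage note are hypothetical.

```shell
# Hypothetical helper: print every active (uncommented) LoadModule line
# in an Apache config file, prefixed with its line number.
# $1 = path to the config file
list_loaded_modules() {
    grep -niE '^[[:space:]]*LoadModule[[:space:]]+' "$1"
}
```

    For example, `list_loaded_modules /usr/local/apache/conf/httpd.conf` (path assumed). Comment out one reported line, restart Apache, and watch whether the SIGTERM shutdowns continue before moving to the next.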
