httpd or mysql problem?

PattyO

Member
May 6, 2012
14
0
51
Brazil
cPanel Access Level
Root Administrator
Hello all. My first time here. :)

I'm completely new to server management and trying to learn, mostly with the help of my good friend Mr. Google. But I've been stuck on this problem for days and don't know what else to do.

Last weekend my client's server had a serious hardware problem, and the motherboard and HD were replaced. After that I had to copy all the data from the old HD to the new one. Mr. Google helped me here, since support wanted to charge me US$ 60 for it, and I knew that I'd end up doing everything myself anyway.

I did OK, I guess, for a first-timer. But the next morning all sites were down, although the server itself was running. I restarted Apache and everything went back to normal, but only for a few hours. All week the sites have been going offline every few hours, and each time I have to restart Apache. I tried a cronjob to restart Apache every 30 minutes, but it isn't working: Apache has now been up for over 2 hours without a restart. I know this isn't a real solution to the problem, but at least it would keep the sites up overnight while I try to get some sleep.

This is the cronjob: */30 * * * * /etc/init.d/httpd restart >/dev/null 2>&1
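
As an alternative to restarting blindly on a timer, a watchdog can restart Apache only when it stops answering. The following is a minimal sketch under stated assumptions, not a tested fix for this server: it assumes curl is installed, that probing http://127.0.0.1/ is a fair health check, and that /etc/init.d/httpd is the init script (taken from the cron line above). The function names and the `--run` switch are hypothetical.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart Apache only when it stops answering.
# Assumptions: curl is installed; /etc/init.d/httpd is the init script.

# Return 0 (true) when the HTTP status code means "no usable answer".
# curl prints 000 when it could not connect or the request timed out.
should_restart() {
    case "$1" in
        000|"") return 0 ;;
        *)      return 1 ;;
    esac
}

# Probe the local web server and print the HTTP status code (000 on failure).
probe() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "http://127.0.0.1/" 2>/dev/null
}

# Only act when called with --run, so the functions can be tested on their own.
if [ "${1:-}" = "--run" ]; then
    code=$(probe)
    if should_restart "$code"; then
        /etc/init.d/httpd restart >/dev/null 2>&1
    fi
fi
```

Saved under a path of your choice (for example the hypothetical /root/httpd_watchdog.sh), it could be scheduled with a line such as `*/5 * * * * /root/httpd_watchdog.sh --run`. Like the timed restart, this only hides the symptom; it does not explain why the workers hang.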

The server isn't overloaded before the sites go down, or at any other time for that matter. Apart from this problem, it seems to be running pretty smoothly.

I also checked my.cnf and was puzzled to find nothing there. So I copied the settings from a similar server I have at another DC (this one managed by them, thank God!) and restarted mysql.

Since I'm new to all this, I have no idea where to start looking, and the host's support won't help because the problem is software related, not hardware or power.

So can someone point me in the right direction, or maybe lend a hand and guide me through this?
I really appreciate any help I can get. :)

Tks in advance, guys.

Patty
 

PattyO

I think I've found what's causing the problem:

Bug: Apache 2.2 Child Processes in “G” Status
Problem

Apache 2.2 has a bug affecting some systems, wherein a graceful restart does not fully clean up any running processes.

The result is that child processes are stuck in "G" status, and more file descriptors are used, which can push Apache over its limit and crash it.
Fix

Apache is aware of this bug and patches have been submitted; however, they are all problematic. For more information, see http://issues.apache.org/bugzilla/show_bug.cgi?id=42829.

If you experience this issue on a server, your best and only supported option is to run Apache 2.0 instead of 2.2. This is because the patches are not ready for production, as they have known issues and need further refining by Apache.
I did notice a lot of those "G"s on Apache Status:

Current Time: Sunday, 06-May-2012 16:28:23 BRT
Restart Time: Sunday, 06-May-2012 08:56:41 BRT
Parent Server Generation: 5
Server uptime: 7 hours 31 minutes 42 seconds
Total accesses: 156995 - Total Traffic: 1.8 GB
CPU Usage: u1.99 s1.38 cu45.02 cs0 - .179% CPU load
5.79 requests/sec - 70.3 kB/second - 12.1 kB/request
178 requests currently being processed, 0 idle workers

GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGWGGGGGGGGGGGGGGCGGGGCGGG
WGCGGGGGGGGWWGCGGCGCCGWWGCCWGCCWWGCGGCCWW..W.W.W.W..WW.WG...W...
................................................................

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
0-2 14133 1/28/2520 G 10.76 17966 300061 0.0 0.34 30.50 189.47.121.228 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=thmb&var1=%2
1-3 23122 1/27/4238 G 0.11 12310 900001 0.7 0.28 49.09 66.249.73.72 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=setquality&v
2-3 23809 1/144/4631 G 1.30 11743 900001 0.7 1.11 52.73 66.249.73.72 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=setquality&v
3-2 18225 1/4/3331 G 0.00 14957 1052369 0.7 0.00 40.64 66.249.73.72 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=imageview&va
4-2 16250 1/2/3035 G 0.00 16272 900001 0.7 0.02 40.56 207.46.204.240 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=imageview&va
5-3 19099 1/49/3447 G 0.23 14577 900000 0.7 0.50 41.47 66.249.73.72 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=setquality&v
6-2 14217 1/19/2569 G 0.78 17966 300112 0.0 0.17 30.10 189.47.121.228 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=thmb&var1=%2
7-3 27729 1/14/5029 G 0.14 9728 900002 0.7 0.19 59.85 200.220.180.66 jaunabalada.com.br GET /fotos%20jau%20na%20balada%202010/main.php?cmd=thmb&var1=%2
And so on.
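
To put numbers on a scoreboard like the one above, the state characters can be tallied. This is a small sketch using only awk and sort; `tally_scoreboard` is a hypothetical helper name, and it simply counts every character it is given on stdin.

```shell
# Count how often each scoreboard character occurs in the text on stdin.
# Prints "<count> <char>" lines, highest count first.
tally_scoreboard() {
    awk '{
        # Walk the line one character at a time.
        for (i = 1; i <= length($0); i++) count[substr($0, i, 1)]++
    }
    END {
        for (c in count) print count[c], c
    }' | sort -rn
}
```

For example, `printf 'GGGW\nG.C\n' | tally_scoreboard` lists `4 G` first. To feed it live data, one option is the machine-readable status page, e.g. `curl -s "http://127.0.0.1/whm-server-status?auto" | sed -n 's/^Scoreboard: //p' | tally_scoreboard`; the URL here is an assumption about how the status page is exposed on a cPanel server and may differ on yours.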

So I'm reverting Apache to 2.0 and am compiling it right now. Let's see if it works.

I'd appreciate it if someone from cPanel could comment on that.
 

PattyO

I think I'm finding my way on my own, so I'll post my findings here; maybe they can help others in a similar situation.

In my case, I found out that one of the causes of the issue is a known Apache bug:
Known Issues (Bug: Apache 2.2 Child Processes in “G” Status)

So I followed cPanel's suggestion and recompiled Apache back to v2.0. The "G"s are now gone! So that helped a lot.

But I still noticed a lot of "C"s in the Apache status, all caused by the same script used by one of the accounts. Bingo! The client has been notified to optimize his script so that it closes connections properly.

I'm hoping this is it, but I'll keep monitoring the server and will update this thread.
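
To see which site holds the workers in a given state, the rows of the extended status table can be grouped by virtual host. A minimal sketch, assuming the column order shown in the status output earlier in this thread (Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request), so the state is field 4 and the virtual host is field 12; `workers_by_vhost` is a hypothetical helper name.

```shell
# Summarise workers in one state per virtual host.
# stdin: rows of the extended status table
# $1:    state letter from the M column, e.g. G or C
workers_by_vhost() {
    awk -v state="$1" '
        $4 == state { n[$12]++ }              # field 4 = M, field 12 = VHost
        END { for (v in n) print n[v], v }
    ' | sort -rn
}
```

Piping the saved status rows through `workers_by_vhost C` prints one `<count> <vhost>` line per site, busiest first, which makes it easy to confirm whether a single account is responsible.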
 

cPanelTristan

Quality Assurance Analyst
Staff member
Oct 2, 2010
7,607
42
348
somewhere over the rainbow
cPanel Access Level
Root Administrator
What PHP handler are you using and is it the same one you had on the old machine? You can see the handler in WHM > Apache Configuration > PHP and SuExec Configuration area or by running the following in SSH as root user:

Code:
/usr/local/cpanel/bin/rebuild_phpconf --current
Thanks!
 

PattyO

Hello, Tristan.
Tks for your reply.

This is it:

Option                              Configured Value
Default PHP Version (.php files)    5
PHP 5 Handler                       suphp
PHP 4 Handler                       suphp
Apache suEXEC                       on
Apache Ruid2                        off


And yes, it's the same configuration as on the old drive.

Even after recompiling Apache back to v2.0, the "G"s keep coming back. The problem persists. I really don't know what to do about that.

Any clues?

TIA
 

cPanelTristan


PattyO

Tks for your reply, Tristan.

Yeah, I checked that thread before and it was very helpful indeed. But it's not a DDoS attack, thank God. As far as I understand, the problem really is that script on that domain, which is not closing its connections.

The funny thing is that it's the same script that was in use before the drive replacement, and it didn't cause this problem then. Could this be caused by some configuration on the server after the drive replacement?
 

cPanelTristan

So, when you say it isn't a DoS, have you checked at the time the issue happens and run the commands noted, including checking for MaxClients errors and for a high number of SYN_RECV entries in netstat? If you haven't run the actual commands (all of them, since they check for various types of attack), please run them regardless of what you believe the issue is. If the script were being attacked, you would see a high number of connections that won't close, because those connections are held open on purpose as part of the attack. In that situation the script or site would appear to be the problem, but the real cause would be IPs deliberately trying to bring down the server by hitting the site or script. That's what a DoS does.
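
The kind of check described here can be sketched as follows. These are not necessarily the commands from the thread mentioned earlier; they are a minimal illustration that summarises `netstat -ant` output by TCP state and by remote IP. The helper names `count_tcp_states` and `top_clients` are hypothetical, and the column layout assumed is the usual Linux one (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```shell
# Count connections per TCP state from `netstat -ant` output on stdin.
# Prints "<count> <state>" lines, highest count first.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# List the remote addresses with the most connections to local port 80.
# $1: how many addresses to show (default 10).
top_clients() {
    awk '$1 ~ /^tcp/ && $4 ~ /:80$/ && $6 != "LISTEN" {
            ip = $5
            sub(/:[0-9]+$/, "", ip)    # strip the remote port
            n[ip]++
         }
         END { for (ip in n) print n[ip], ip }' | sort -rn | head -n "${1:-10}"
}
```

Typical use would be `netstat -ant | count_tcp_states` and `netstat -ant | top_clients 20`, run while the problem is happening. A large SYN_RECV count, or one address holding far more connections than the rest, would point toward an attack. For the MaxClients check, searching the Apache error log for the word MaxClients (for example with `grep -c MaxClients` on the error log, whose location depends on the installation) shows how often the limit was reported as reached.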
 

PattyO

Hi, Tristan.

Yeah, I ran them all even before you suggested it, and the results didn't indicate a DoS as far as I could tell.
In any case, the client moved the account to another server outside our network, so now there's nothing else I can do. :(
Pity, because I really wanted to get to the bottom of this.

Anyway, tks for your time and efforts. :)