monkey64

Well-Known Member
Nov 6, 2011
124
5
68
cPanel Access Level
Root Administrator
My server crashed without any warning.
Looking in /var/log/messages, I found the following:

Code:
Aug 14 14:23:36 server kernel: Call Trace:
Aug 14 14:23:36 server kernel:  [<c0167fbf>] T.648+0x55/0x15f
Aug 14 14:23:36 server kernel:  [<c0405b3d>] ? _raw_spin_unlock_irqrestore+0xf/0x11
Aug 14 14:23:36 server kernel:  [<c02a3fbd>] ? ___ratelimit+0xc9/0xdc
Aug 14 14:23:36 server kernel:  [<c01680f8>] T.647+0x2f/0x205
Aug 14 14:23:36 server kernel:  [<c0133881>] ? has_capability_noaudit+0x18/0x21
Aug 14 14:23:36 server kernel:  [<c01684c6>] out_of_memory+0x1f8/0x27b
Aug 14 14:23:36 server kernel:  [<c016adf8>] __alloc_pages_nodemask+0x3f9/0x512
Aug 14 14:23:36 server kernel:  [<c016c22f>] __do_page_cache_readahead+0xa6/0x187
Aug 14 14:23:36 server kernel:  [<c0166144>] ? wait_on_page_bit+0x78/0x81
...but it doesn't really help me understand what happened.

WHM > Daily Process log does not show any CPU usage above 1%.
Here are the Top Processes for the day the crash occurred:


Code:
User      Domain       %     Process

root                   83.0  gzip
root                   4.8   /bin/gtar -t -v -f /backup/cpbackup/daily/site1/homedir.tar --utc
root                   3.0   /usr/bin/perl /usr/local/cpanel/whostmgr/bin/dnsqueue
site2     site2.co.uk  3.0   php -q /home/site2/public_html/php/cron_basket.php
site1     site1.co.uk  24.1  /usr/bin/php public_html/generator/runcrawl.php
site1     site1.co.uk  22.6  /usr/bin/php public_html/generator/runcrawl.php
mysql                  2.8   /usr/sbin/mysqld --basedir/ --datadir/var/lib/mysql --usermysql --log-error/var/lib/mysql/myserver1err --pid-file/var/lib/mysql/myserver1pid
mysql                  2.0   /usr/sbin/mysqld --basedir/ --datadir/var/lib/mysql --usermysql --log-error/var/lib/mysql/myserver1err --pid-file/var/lib/mysql/myserver1pid
site1     site1.co.uk  10.0  [php]
mysql                  1.8   /usr/sbin/mysqld --basedir/ --datadir/var/lib/mysql --usermysql --log-error/var/lib/mysql/myserver1err --pid-file/var/lib/mysql/myserver1pid
site2     site2.co.uk  1.5   /usr/bin/php /home/site2/public_html/parts.php
nobody                 1.0   /usr/local/apache/bin/httpd -k start -DSSL
mailnull               1.0   /usr/sbin/exim -bd -q60m
nobody                 0.5   [httpd]
site2     site2.co.uk  0.4   httpd [site2] [/]
nobody                 0.3   /usr/local/apache/bin/httpd -k start -DSSL
68                     0.1   hald
I'm a bit lost really because I can't find any info on what caused the crash.
The only changes I have made recently are:

1. Turned CpHulk on.
2. I run a script each morning to upload backup files to Amazon S3 using S3cmd.

Server Spec:

1 x 0.40GHz
1024 MBytes

Where should I look for more info regarding the OOM message?
Any ideas?
 

monkey64

Quote: "Manually reboot your server, this always helps the case. This will clear the overload on your server. Also I suggest upgrading your hardware."
Thanks for your input, but I was looking for something more technical than just "Manually reboot your server". As an admin for a couple of multi-million pound companies, I can't just hit reboot when there's an issue. And as for "upgrading your hardware", well, that's not really any help at all...

Every error needs to be investigated... that's how we learn.
 

cPanelTristan

Quality Assurance Analyst
Staff member
Oct 2, 2010
7,607
43
348
somewhere over the rainbow
cPanel Access Level
Root Administrator
Hello,

Before the call trace, the log should show which process exceeded memory. If it's httpd, a high number of connections may have caused excessive latency and exhausted the memory, especially if you are using any type of PHP opcode caching such as eAccelerator or XCache.
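To make that check concrete, here is a minimal Python sketch that scans syslog lines for the OOM-killer messages that normally surround a call trace like the one posted. The exact wording of these kernel messages varies between kernel versions, so the two patterns below are assumptions, and the sample lines are made up for illustration rather than taken from the poster's log. The function name `find_oom_events` is hypothetical.

```python
import re

# Assumed message shapes; the wording differs between kernel versions.
INVOKED = re.compile(r"(\S+) invoked oom-killer")
KILLED = re.compile(r"Killed process (\d+),?.*?\(([^)]+)\)")

def find_oom_events(lines):
    """Return (kind, process_name) tuples for OOM-related log lines."""
    events = []
    for line in lines:
        m = INVOKED.search(line)
        if m:
            # The process whose allocation triggered the OOM killer.
            events.append(("invoked", m.group(1)))
            continue
        m = KILLED.search(line)
        if m:
            # The process the kernel chose to kill.
            events.append(("killed", m.group(2)))
    return events

# Illustrative lines only (not from the real /var/log/messages).
sample = [
    "Aug 14 14:23:36 server kernel: httpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0",
    "Aug 14 14:23:36 server kernel: Call Trace:",
    "Aug 14 14:23:37 server kernel: Out of memory: Kill process 2345 (httpd) score 120 or sacrifice child",
    "Aug 14 14:23:37 server kernel: Killed process 2345 (httpd) total-vm:98000kB, anon-rss:41000kB, file-rss:500kB",
]
events = find_oom_events(sample)
print(events)
```

On a live server the same function could be fed the log directly, for example `find_oom_events(open("/var/log/messages"))`. Note that the process that invoked the OOM killer is not always the one using the most memory, so both kinds of event are worth reading.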

If it was Apache, you would likewise want to check the Apache error log at /usr/local/apache/logs/error_log during the time this occurred.

Next, if it was MySQL, it might need to be optimized. There are a slew of guides on the forum for optimization. Check the MySQL error log at /var/lib/mysql/hostname.err (where hostname is the server's fully qualified hostname) for the times in question to see if any errors were logged.

Without knowing the exact process or processes that were the cause of exceeding memory, it's far more difficult to provide suggestions.

Thanks!
 

monkey64

Tristan

The Apache error log at /usr/local/apache/logs/error_log shows the following just before the crash:

Code:
[Tue Aug 14 14:23:38 2012] [error] server reached MaxClients setting, consider raising the MaxClients setting
After some research, it seems that I should increase the following, which are set to the default settings:

Server Limit:
Max Clients:

from Main >> Service Configuration >> Apache Configuration >> Global Configuration.

I could change the above values, but the load on the server is really low (see below).
Is this really the answer?

Typical System information
Server load 0.03 (2 CPUs)
Memory Used 11.35% (235,160 of 2,071,304)
Swap Used 0% (0 of 1)
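
Before raising MaxClients, it may help to estimate how many Apache children actually fit in RAM, since each child holds memory whether or not the CPU is busy. The sketch below uses a common rule of thumb (memory left for Apache divided by the average size of one httpd child). The 1024 MB figure is from the server spec in the first post; the 400 MB reserved for MySQL, Exim and the OS, the 25 MB per child, and the MaxClients value of 150 are all hypothetical placeholders to be replaced with measured values.

```python
def safe_max_clients(total_mb, reserved_mb, avg_child_mb):
    """Rule-of-thumb ceiling for MaxClients that should fit in RAM."""
    return max(1, int((total_mb - reserved_mb) // avg_child_mb))

def worst_case_mb(max_clients, avg_child_mb):
    """Memory Apache could use if every child slot is busy."""
    return max_clients * avg_child_mb

total_mb = 1024      # from the posted server spec
reserved_mb = 400    # hypothetical: MySQL, Exim, kernel, cron jobs
avg_child_mb = 25    # hypothetical: measure real httpd RSS instead

limit = safe_max_clients(total_mb, reserved_mb, avg_child_mb)
print(limit)                            # 24
print(worst_case_mb(150, avg_child_mb)) # 3750
```

With these assumed numbers, a MaxClients of 150 could ask for far more memory than the machine has, so raising the setting would make an out-of-memory crash more likely rather than less.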

I've since added Munin so I can track the Server.
Could it have been an attack?

Any ideas?
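
One rough way to look into the attack question is to count requests per client IP in the Apache access log for the minutes before the crash. This is only a sketch: it assumes Common Log Format lines where the client IP is the first field, the function name `top_clients` is hypothetical, and the sample lines use documentation IP addresses rather than real traffic.

```python
from collections import Counter

def top_clients(access_lines, time_prefix, n=5):
    """Count requests per client IP for lines containing time_prefix."""
    counts = Counter()
    for line in access_lines:
        if time_prefix in line:
            # In Common Log Format the client IP is the first field.
            counts[line.split(" ", 1)[0]] += 1
    return counts.most_common(n)

# Illustrative lines only.
sample = [
    '203.0.113.5 - - [14/Aug/2012:14:22:58 +0100] "GET / HTTP/1.1" 200 512',
    '203.0.113.5 - - [14/Aug/2012:14:22:59 +0100] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [14/Aug/2012:14:23:01 +0100] "GET /a.php HTTP/1.1" 200 900',
    '203.0.113.5 - - [14/Aug/2012:14:23:02 +0100] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [14/Aug/2012:09:00:00 +0100] "GET / HTTP/1.1" 200 512',
]
busiest = top_clients(sample, "[14/Aug/2012:14:2")
print(busiest)
```

A single address dominating the count just before 14:23 would point towards abuse or a runaway crawler; an even spread would point more towards ordinary load or the local cron jobs.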
 

cPanelTristan

Yes, it could have been an attack. The load isn't the same as the memory usage. You could run out of memory and have a low load. The current memory is already at 11.35%