CPanel crashes every single day

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by shaun
None of you guys' boxes should have run anywhere close to 100+ days this year... There have been multiple kernel vulns out this year...
I agree :) Maybe now we know why the new boxes crash and the old unpatched ones don't.
 

sexy_guy

Well-Known Member
Mar 19, 2003
847
0
166
Originally posted by shaun
None of you guys' boxes should have run anywhere close to 100+ days this year... There have been multiple kernel vulns out this year...
Umm, we don't, or at least I don't. My last kernel upgrade was 40 days ago, and that's why the system's uptime was only 40 days. As for the 400+ day uptime, that was more than 10 months ago.
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
It should also depend on what security risks are involved in each patch. At times a box with one site on it, where you know what's being run, may not have an issue and can be left alone for the sake of no downtime.
 

hostcp3

Well-Known Member
Jun 18, 2002
155
0
166
To Reduce your top load

When in top, hit s (delay between updates) and change the value to 5 or 10 seconds. It will cut the load down.
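
The same delay can also be passed when starting top, which saves changing it by hand each time. A minimal sketch, assuming the procps version of top found on most Linux systems; the slow_top name is made up for illustration:

```shell
# slow_top is a hypothetical helper name, not part of top itself.
# -d sets the delay between screen updates, in seconds (default here: 10).
slow_top() {
    top -d "${1:-10}"
}
```

Run slow_top for a 10 second refresh, or slow_top 5 for a 5 second one.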

Yeah I Like to watch!
 

imagic

Well-Known Member
Verified Vendor
Jan 16, 2003
155
0
166
BEWARE!

Originally posted by Sash
Here is the solution that fixed the problem for us and rpmws:

----------------------
Add the following to /etc/rc.d/rc.local

echo 64000 > /proc/sys/fs/file-max
ulimit -u unlimited
ulimit -S -H -n 4096

Reboot the server

Recompile apache
----------------------
We had a server that looked like it was starting into this cycle of going down for no apparent reason, so we tried this fix.

We added the code to the file and before the server could be rebooted, it crashed. It needed a hard reboot at the NOC and came back with this error:
Error failed to mount dir or dir not found.

The NOC said the hard drive would only come back up in read-only.

12 hours later, we finally had the server back online after changing out the hard drive, re-installing cpanel and all of the accounts.

This is what our NOC had to say about the reason for the catastrophe:
"Yes, it was that code. I'm not 100% sure, but I think that is dependent on the size and blocks of the hard drive. If you have the same size drive they did, it would probably work fine. They might have had an 80 GB or something like that, or a smaller drive with a different block size."

Here is what a systems administrator had to say about the situation:
"We have access to several 80 GB and 40 GB servers, and all have the same file-max limits (so it cannot be drive dependent, or one would have crashed). Below is exactly what file-max is for; it is kernel and RAM related, not hard drive related.

The file-max file /proc/sys/fs/file-max sets the maximum number of file handles that the Linux kernel will allocate. We generally tune this file to raise the number of open files, increasing the value of /proc/sys/fs/file-max to something reasonable like 256 for every 4 MB of RAM we have. For example, for a machine with 128 MB of RAM, set it to 8192 (128/4 = 32, and 32 * 256 = 8192).

The default setup for the file-max parameter under Red Hat Linux is: "4096" "
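
As a rough illustration of the rule of thumb quoted above (256 file handles for every 4 MB of RAM), the arithmetic can be sketched in shell. The function name suggest_file_max is hypothetical, and the rule itself is only the one stated in the quote, not a kernel requirement:

```shell
# suggest_file_max is a hypothetical helper: RAM in MB -> suggested file-max,
# using the quoted rule of 256 handles per 4 MB of RAM.
suggest_file_max() {
    echo $(( $1 / 4 * 256 ))
}

suggest_file_max 128    # prints 8192
suggest_file_max 1024   # prints 65536
```

By this rule a 1 GB machine lands at 65536, which is in line with the 64000 used in the fix.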

Here is a quote from one of the Unix Engineers that I asked about this issue:
"It's not hard drive related. The maximum number of file handles has to do with memory: if you want to use it, you set 256 for every 4 MB of memory. In order to use 64000, you need 1 GB of memory. That applies to kernel 2.2.x only.

For kernel 2.4.x (Red Hat) this is the best way to set it:

edit /etc/sysctl.conf, and add the following line:

e.g. fs.file-max = 64000"
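
For anyone who wants to script the sysctl.conf change described in the quote, here is a minimal sketch. The set_file_max name is hypothetical, GNU sed is assumed for the -i flag, and it has to be run as root to touch /etc/sysctl.conf:

```shell
# set_file_max is a hypothetical helper: set fs.file-max in a sysctl-style
# config file, replacing an existing entry or appending a new one.
set_file_max() {
    value="$1"
    conf="${2:-/etc/sysctl.conf}"
    if grep -q '^fs\.file-max' "$conf" 2>/dev/null; then
        # Rewrite the existing line in place (GNU sed assumed for -i)
        sed -i "s/^fs\.file-max.*/fs.file-max = $value/" "$conf"
    else
        # No entry yet, so append one in key = value form
        echo "fs.file-max = $value" >> "$conf"
    fi
}
```

Usage would be something like set_file_max 64000 followed by sysctl -p to load the file, then cat /proc/sys/fs/file-max to confirm the value took.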



I'm not convinced at all that it was this change in the code that took the server down, but it is very odd that it happened just after the file was modified.

Can anybody (Sash?) shed any light on what happened to us?

cPanel.net Support Ticket Number:
 

Sash

Well-Known Member
Feb 18, 2003
252
0
166
We've done this procedure on ~25 servers without any problems.
All our servers run either 36 or 73 SCSI drives, with RAID1 or RAID5, and have between 512 meg and 2 gig of DDR RAM.

I'd be interested in hearing other people's experiences.

RPMWS: How has the fix worked for you?

Mike

 

promak

Well-Known Member
Oct 6, 2001
248
0
316
I just installed cPanel on a new server 3 days ago. On the first day it went down. I checked, and when I rebooted I found antirelayd down (failed). I also updated to the new kernel. Server load is OK, but RAM use is too high.

See:
             total       used       free     shared    buffers     cached
Mem:       1030200     975332      54868          0     174532     629740
-/+ buffers/cache:     171060     859140
Swap:      2096472        852    2095620
Total:     3126672     976184    2150488

My god, it was restarted only a day ago, and there are fewer than 5 customers on this server!

My new server is a Celeron 1.8 with 1 GB DDR RAM and a 60 GB HD!
What can I do? I never want the server to go down! :confused:
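
One note on reading the free output above: on Linux the "used" figure in the Mem: row includes buffers and page cache, which the kernel gives back when applications need the memory. The -/+ buffers/cache row already subtracts them. A small sketch of that arithmetic with the numbers from the post (app_used_kb is a made-up name):

```shell
# app_used_kb is a hypothetical helper: used - buffers - cached, all in KB
# as printed by free.
app_used_kb() {
    echo $(( $1 - $2 - $3 ))
}

app_used_kb 975332 174532 629740   # prints 171060
```

That matches the 171060 in the -/+ buffers/cache row, so applications are holding about 171060 KB of the roughly 1 GB installed.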

 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Running 25 plus days, finally, with this fix. I also added an httpd restart in cron nightly.

 

cPanelNick

Administrator
Staff member
Mar 9, 2015
3,481
35
208
cPanel Access Level
DataCenter Provider
If your system does crash, you can see what processes were eating up RAM/CPU before it did by looking at /var/log/dcpumon/boot.[boot time here in unix time]/*.gz
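
A minimal sketch for finding those logs, assuming the directory layout is exactly as described above (boot.<unix time> directories under /var/log/dcpumon). The list_boot_dirs name is hypothetical:

```shell
# list_boot_dirs is a hypothetical helper: print the boot.<unix time>
# directories under a base directory, oldest first.
list_boot_dirs() {
    base="${1:-/var/log/dcpumon}"
    for d in "$base"/boot.*; do
        [ -d "$d" ] || continue
        # Prefix each path with its timestamp so sort can order numerically
        printf '%s %s\n' "${d##*/boot.}" "$d"
    done | sort -n | cut -d' ' -f2-
}
```

Something like zcat "$(list_boot_dirs | tail -n 1)"/*.gz | less would then page through the newest set. After a crash and reboot, the session of interest is probably the one before the newest, so tail -n 2 | head -n 1 may be the better pick.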

 

sgemma

Member
Nov 17, 2002
24
0
151
Just to let you know, I have now used this fix on 5 servers and this has been the only thing that has solved their runaway problem. Thanks, Mike!

 

sgemma

Member
Nov 17, 2002
24
0
151
Odd.

I implemented this fix. It worked well for about a week. Now for some reason all the settings have been reset. For instance, I type ulimit -u and it's no longer set to unlimited. The file /proc/sys/fs/file-max is set way too high. What would run after the /etc/rc.d/rc.local script that would reset these options? The server is running away again.
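
One thing worth checking here: ulimit values are per process and inherited by child processes, so values set in rc.local apply to whatever that script starts, while a later login shell reports its own limits. That may account for part of what is described above, though it is not confirmed as the cause. A minimal sketch for printing the current values from whatever shell it is run in (show_limits is a made-up name):

```shell
# show_limits is a hypothetical helper: print the kernel-wide file-max and
# the limits of the current shell.
show_limits() {
    if [ -r /proc/sys/fs/file-max ]; then
        echo "file-max: $(cat /proc/sys/fs/file-max)"
    fi
    # -u is not supported by every shell, so errors are silenced
    echo "max user processes (soft): $(ulimit -S -u 2>/dev/null)"
    echo "open files (soft): $(ulimit -S -n)"
    echo "open files (hard): $(ulimit -H -n)"
}

show_limits
```

Running it at the end of rc.local and again from a login shell would show whether the two really differ.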

 

fmalekpour

Well-Known Member
PartnerNOC
Dec 4, 2002
85
1
158
Recently we have had this problem as well, but only with one of our boxes, running Linux 8 and the latest version of cPanel.

:confused:

 

fmalekpour

Well-Known Member
PartnerNOC
Dec 4, 2002
85
1
158
Any news?

We have lots of:
filelimits: Increasing file system limits succeeded
in boot.log. No heavy processes, no high memory usage, it just crashes.

:mad:

 

shazad

Active Member
Sep 1, 2001
38
0
306
Originally posted by sgemma
Odd.

I implemented this fix. It worked well for about a week. Now for some reason all the settings have been reset. For instance, I type ulimit -u and it's no longer set to unlimited. The file /proc/sys/fs/file-max is set way too high. What would run after the /etc/rc.d/rc.local script that would reset these options? The server is running away again.

Did you happen to revert this back and, if so, did it fix the issue again?
 

Imai

Well-Known Member
Aug 11, 2003
45
0
156
RPMUP2?????

Before trying the fixes suggested here I thought I would look at the processes that run before the crash.

In my case it is RPMUP2.

This process starts at 5 am using 65% CPU and about 70% MEM. It goes on for 208 minutes and finally crashes the server at about 8 am with 99.9% CPU and 95% MEM.

Can I disable rpmup2, or at least run it at desired intervals, in /etc/cpupdate.conf?

Is this a bug in rpmup?

Please help. My clients come to work at 8 am and they are yelling.

 

Joshfrom

Well-Known Member
Jun 3, 2003
151
0
166
White Haven, PA, US
Re: RPMUP2?????

Originally posted by Imai
Before trying the fixes suggested here I thought I would look at the processes that run before the crash.

In my case it is RPMUP2.

This process starts at 5 am using 65% CPU and about 70% MEM. It goes on for 208 minutes and finally crashes the server at about 8 am with 99.9% CPU and 95% MEM.

Can I disable rpmup2, or at least run it at desired intervals, in /etc/cpupdate.conf?

Is this a bug in rpmup?

Please help. My clients come to work at 8 am and they are yelling.

Try the latest version

/scripts/updatenow

 

MattDr2

Well-Known Member
PartnerNOC
Feb 19, 2003
84
0
156
Norman, OK
As fmalekpour noted, my systems are also reverting back to their original settings.

I set the correct lines in rc.local, and upon rebooting the server, the limits (aside from file-max) are not set as they were.

I placed "fs.file-max 64000" in sysctl.conf now, just in case it decides to revert back, also. =)

But, for the others, I'm a bit sketchy on how they would (or could) be added to sysctl.conf.

One thing to note, though: it looks like the open files limit is reset when httpd is started (see /etc/rc.d/init.d/httpd).
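
A quick way to check whether a startup script changes limits is to search it for ulimit calls. A minimal sketch, with find_ulimit_calls as a made-up helper name:

```shell
# find_ulimit_calls is a hypothetical helper: show every line (with its
# line number) that mentions ulimit in the given files.
find_ulimit_calls() {
    grep -n 'ulimit' "$@"
}
```

For example, find_ulimit_calls /etc/rc.d/init.d/httpd /etc/rc.d/rc.local would list any such lines in both files, each prefixed with the file name.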

This issue has plagued all of our servers since day 1.

Regards,
Matt
 

Imai

Well-Known Member
Aug 11, 2003
45
0
156
Thank you Josh

I did the update as suggested.
There was no problem today.

I hope it stays.

Imai

 

MattDr2

Well-Known Member
PartnerNOC
Feb 19, 2003
84
0
156
Norman, OK
I've been having this problem across all of my servers but two for the past 8 months. Loads are not abnormal and crashes are common leaving everything down except for being able to ping the server. Oddly enough, the two servers that were running ProFTPD had been running fine for about 4 months, only problems were memory-related and resolved quickly and crashes ceased. The rest of the 10 servers, however, were running Pure-FTPD, and crashed on a regular basis.

I have tried the fix noted in this thread, but the same day I completed the modification to a certain server, the settings went back to normal and the machine crashed.

When I'd check the logs after the server came back online, there would always be something relating to the FTP daemon within one or two lines of when the crash actually took place. There were also notes of these filesystem limit increases. There was never anything directly relating to why the server crashed in the messages log or within the top logs, everything looked normal.

On average, we have a server crash every couple of days. Sometimes they'll get mad at me (heh) and crash a couple times during a single day (as one server did this week). On all but two of my servers, I am now running ProFTPD, the build linked below from Nick. So far, so good.

http://ftp.cpanel.net/proftpd-1.2.9rc2tls-3_linuxprivs.i386.rpm

Anyone else notice anything similar? Or have I just had a bit of luck over the past 48 hours? heh :(

The crashes I've had are oddly similar to the ones described in this thread. I can't say this -is- the fix, but it's worked for me so far. The fix described previously in this thread was overridden on all of my servers after a reboot (tried in /etc/rc.d/rc.local and in /etc/sysctl.conf ... no luck making it stick).

Another note, the filesystem file totally fubars my one server running Redhat 9, every few hours. I had to rename the file to keep my server from being inaccessible.

Regards,
Matt
 