CPanel crashes every single day

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by Hoster2k
Is the onboard sound disabled in the bios?
Hmm .. never thought of an issue like that.

1st box was an ABIT VP6 dual 933 P3 .. crashed every week.

New box built by RS is a Dell dual Xeon 2GHz running a ServerWorks chipset, I think.

Why? And what's the easiest way to tell?
 

Hoster2k

Well-Known Member
Jun 17, 2002
131
0
166
UK
Just from past experience: random crashes, no entries in the logs, no high load. It turned out the onboard AC97 sound was doing it. Once it was disabled the box flew. Ask RS or whoever to check that any onboard sound is disabled.
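For anyone wondering how to tell from a running box: one rough check is to look for an audio controller in the PCI device list. The sketch below assumes `lspci` (from pciutils) is installed, and the helper name `audio_devices` is made up for illustration. A device that has been disabled in the BIOS usually does not show up in the listing, but that is an assumption worth confirming on your own hardware.

```shell
# Filter lspci-style output (read from stdin) down to audio/AC'97 devices.
# audio_devices is a hypothetical helper name, not a standard tool.
audio_devices() {
    grep -iE "audio|ac'?97" || true
}

# Only query the hardware if lspci is actually available.
if command -v lspci >/dev/null 2>&1; then
    lspci | audio_devices
fi
```

If this prints nothing, no audio controller is visible to the OS. If it prints a line, the onboard sound is probably still enabled.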
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by shaun
rpmws: did you try my smartcheck fix? Or is it staying up on its own?
It's still up. If I try all the fixes I won't know what was wrong :(
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by Sash
Hello

So what did you do that has fixed so far?

Mike
Only what you suggested in that PM to me. Why... has yours crashed now after 17 days?
 

Sash

Well-Known Member
Feb 18, 2003
252
0
166
Originally posted by rpmws
Only what you suggested in that PM to me. Why... has yours crashed now after 17 days?
Nah... still up and running.

Just wanted to know what worked.

Mike
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by shaun
Please post what you did so others will know...

Remember, people search these forums for answers to their problems.
I would leave that up to Sash. He PM'd me and told me what he did and that it was working for him. I think he was concerned that everyone would try it without us **knowing** yet if it was the solution. So far it's worked for him for 17 days and me about 4, so there hasn't been a whole lot of testing. I can't believe others see this issue also. Eric at cPanel acted like he had never heard of such a strange thing. I am starting to wonder if many people keep this quiet and just accept a box going on strike twice a week for no reason.

Sash ..should we post it?
 

Sash

Well-Known Member
Feb 18, 2003
252
0
166
Originally posted by rpmws

Sash ..should we post it?
Hello Everyone

We've brought 5 cPanel servers online in about the past month. 3 out of the 5 would crash every couple of days. No warning, no high usage, and nothing in the logs. Sometimes the servers were pingable, sometimes they weren't. Only a power cycle would bring them back online.

Here is the solution that fixed the problem for us and rpmws:

----------------------
Add the following to /etc/rc.d/rc.local

echo 64000 > /proc/sys/fs/file-max
ulimit -u unlimited
ulimit -S -H -n 4096

Reboot the server

Recompile apache
----------------------
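To see whether a box is actually anywhere near the file-handle ceiling, before or after applying the settings above, the kernel's counters can be read directly. This is a minimal sketch, assuming a Linux kernel that exposes /proc/sys/fs/file-nr as three numbers (allocated, unused, maximum); `fd_usage_percent` is a made-up helper name. Whether file-handle exhaustion is really the cause of these crashes is not confirmed anywhere in this thread.

```shell
# Percentage of the system-wide file-handle limit currently in use.
# Arguments: allocated handles, unused (free) handles, maximum handles.
# fd_usage_percent is a hypothetical helper name used for illustration.
fd_usage_percent() {
    allocated=$1
    unused=$2
    max=$3
    # Integer arithmetic: the result is rounded down.
    echo $(( (allocated - unused) * 100 / max ))
}

# Only read the counters if the kernel exposes them.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "file handles in use: $(fd_usage_percent "$allocated" "$unused" "$max")% of $max"
fi
```

Running it a few times a day (for example from cron) and logging the output would show whether usage climbs toward the limit before a crash.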


If you run into the same problem and the above fixes it please post here. I'd like to know how many people are encountering the problem.

Mike
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
I have been up now for almost 5 days. 400 accounts. 107GB of transfer so far for May 1-13. Loads seem lower also. I would say the server has been used harder the last couple of days than it was while it was crashing every 30 hours or so. So far so good for me :)
 

jamesbond

Well-Known Member
Oct 9, 2002
737
1
168
I'm not experiencing any problems, but I did notice the following in my /var/log/messages

May 12 07:44:42 host filelimits: Increasing file system limits succeeded
May 12 14:04:42 host filelimits: Increasing file system limits succeeded
May 12 19:16:40 host filelimits: Increasing file system limits succeeded


Should I be concerned about this? Is this in any way related to the problems you are experiencing?

I couldn't find anything relevant in google.
 

jamesbond

Well-Known Member
Oct 9, 2002
737
1
168
OK, I looked at the file "filelimits" on my server, and apparently this script increases the file limits when it runs.

cat /etc/rc.d/init.d/filelimits

And I saw this segment where the /proc/sys/fs/file-max value is set:

action "Increasing file system limits" /bin/true
echo 131072 > /proc/sys/fs/file-max
echo 16384 > /proc/sys/fs/dquot-max


So that explains my log messages :)
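A quick way to see how often that script has run is to count its messages in the syslog. This is a rough sketch, assuming the message text matches the lines quoted above and that the log lives at /var/log/messages; `count_filelimits_runs` is a made-up helper name.

```shell
# Count how many times the filelimits init script logged a successful run.
# Argument: path to a syslog file (defaults to /var/log/messages).
# count_filelimits_runs is a hypothetical helper name used for illustration.
count_filelimits_runs() {
    logfile=${1:-/var/log/messages}
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'filelimits: Increasing file system limits succeeded' "$logfile"
}
```

Calling `count_filelimits_runs /var/log/messages` prints the number of matching entries, and `cat /proc/sys/fs/file-max` shows the limit currently in effect.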
 

sexy_guy

Well-Known Member
Mar 19, 2003
847
0
166
My server crashed to the ground last night. There is no evidence in the logs of a reason why, but she went down, and hard. I'm not impressed. 200+ users lost their sites for 35 min. The problem is, the damn thing never rebooted either. :mad: Now I'm wondering why all of a sudden, because she was stable for 40 days and then down she goes. Accidental reboot by my NOC? Possible but highly unlikely, although it has happened 4 times over the past year. So what should I do? Wait for the next crash to determine whether it was cPanel and not my NOC?
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by sexy_guy
My server crashed to the ground last night. There is no evidence in the logs of a reason why, but she went down, and hard. I'm not impressed. 200+ users lost their sites for 35 min. The problem is, the damn thing never rebooted either. :mad: Now I'm wondering why all of a sudden, because she was stable for 40 days and then down she goes. Accidental reboot by my NOC? Possible but highly unlikely, although it has happened 4 times over the past year. So what should I do? Wait for the next crash to determine whether it was cPanel and not my NOC?
I have had mine also not want to come alive. I think in most cases it's fsck, but I'm not sure. Mine (both boxes) would crash and not respond to anything but pings. So far so good. I can tell you I have 80 sites on an Ensim box that does 150GB a month and has been up for 231 days.
 

sexy_guy

Well-Known Member
Mar 19, 2003
847
0
166
Originally posted by rpmws
I have had mine also not want to come alive. I think in most cases it's fsck, but I'm not sure. Mine (both boxes) would crash and not respond to anything but pings. So far so good. I can tell you I have 80 sites on an Ensim box that does 150GB a month and has been up for 231 days.
I can beat your uptime. My Ensim box ran for 422 days before I converted it to a cPanel box. Big mistake! Anyway, I'm still looking through the logs for what happened last night. The problem is MRTG was showing 10% load when it crashed. So unless my NOC rebooted my server instead of somebody else's, I can't see any other reason for it to have just crashed like that. BTW, running TOP all day on your 19-inch takes up quite a bit of resources.

One strange thing I found in my kernel log after my NOC rebooted my box:

May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 230025
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 2982067
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 3638083
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 2982066
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 2490396
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 2212488
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 4325697
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 2343708
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 721728
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 1966626
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 1737206
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 327774
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 213331
May 12 20:53:36 srv05 kernel: ext3_orphan_cleanup: deleting unreferenced inode 213320

and

May 13 01:42:00 srv05 kernel: VFS: find_free_dqentry(): Data block full but it shouldn't.
May 13 01:42:00 srv05 kernel: VFS: Error -5 occured while creating quota.
May 13 04:45:34 srv05 kernel: VFS: Quota for id 32130 referenced but not present.
May 13 04:45:34 srv05 kernel: VFS: Can't read quota structure for id 32130.
May 13 04:55:39 srv05 kernel: VFS: Quota for id 32143 referenced but not present.
May 13 04:55:39 srv05 kernel: VFS: Can't read quota structure for id 32143.
May 13 04:55:40 srv05 kernel: VFS: Quota for id 32131 referenced but not present.
May 13 04:55:40 srv05 kernel: VFS: Can't read quota structure for id 32131.
May 13 06:02:23 srv05 kernel: VFS: Quota for id 32139 referenced but not present.
May 13 06:02:23 srv05 kernel: VFS: Can't read quota structure for id 32139.
May 13 06:02:26 srv05 kernel: VFS: Quota for id 32140 referenced but not present.
May 13 06:02:26 srv05 kernel: VFS: Can't read quota structure for id 32140.

What is all this? I see this in the logs and it is part of the server's initial boot process. This looks like some kind of quota problem, and I don't believe it was the reason for the crash. Since the system crashed so abruptly, I think these entries were a result of the system going down.
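To see which accounts those quota messages refer to, the IDs can be pulled out of the kernel log and listed once each. This is a rough sketch, assuming the message wording matches the lines quoted above; `missing_quota_ids` is a made-up helper name and the log path is whatever file your syslog writes kernel messages to.

```shell
# Print each user/group ID that the kernel reported as
# "Quota for id N referenced but not present", once, in numeric order.
# Argument: path to the kernel log file.
# missing_quota_ids is a hypothetical helper name used for illustration.
missing_quota_ids() {
    sed -n 's/.*VFS: Quota for id \([0-9][0-9]*\) referenced but not present.*/\1/p' "$1" | sort -un
}
```

Each ID printed can then be looked up with `getent passwd <id>` to see which account it belongs to.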
 

rpmws

Well-Known Member
Aug 14, 2001
1,787
10
318
back woods of NC, USA
Originally posted by sexy_guy
I can beat your uptime. My Ensim box ran for 422 days before I converted it to a cPanel box. Big mistake! Anyway, I'm still looking through the logs for what happened last night. The problem is MRTG was showing 10% load when it crashed. So unless my NOC rebooted my server instead of somebody else's, I can't see any other reason for it to have just crashed like that. BTW, running TOP all day on your 19-inch takes up quite a bit of resources.
TOP doesn't seem to bother it. I mean, my loads are less than .50 most of the time. When the ssh session with top running stops, the loads are always normal. Whatever causes it, it happens fast!!! BTW that Ensim box I am talking about was started then. It wasn't a reboot or a problem. Otherwise it would have been longer. I am not knocking cPanel. I love it. Not sure it's a cPanel issue at all. Hell, no telling .. no logs!!!