Silent Ninja

Well-Known Member
Apr 18, 2006
196
0
166
Buenos Aires, Argentina
Code:
root     11640  0.0  0.0   8676   928 ?        S    Nov20   0:00  \_ /bin/bash ./force-stats.sh
root     11642  0.0  0.0   4032   484 ?        S    Nov20   0:00      \_ xargs -iUSER /scripts/runweblogs USER
root     24648  0.0  0.0  22728  1484 ?        S    Nov21   0:00          \_ /usr/bin/perl /scripts/runweblogs martincl
root     24649 75.6  0.6  73396 13620 ?        R    Nov21 1166:47              \_ cpanellogd - updating bandwidth for martincl
root     28829  4.6  0.8 131936 16896 ?        RN   Nov21  80:42 cpanellogd - updating bandwidth for ruben
root     29807  0.0  0.0   8676   940 ?        S    Nov21   0:00 /bin/bash ./cron-diario.sh
root     30942  0.0  0.0   8676   928 ?        S    Nov21   0:00  \_ /bin/bash ./force-stats.sh
root     30944  0.0  0.0   4028   480 ?        S    Nov21   0:00      \_ xargs -iUSER /scripts/runweblogs USER
root     10012  0.0  0.0  22724  1484 ?        S    08:26   0:00          \_ /usr/bin/perl /scripts/runweblogs naguera
root     10013 60.8  0.6  74072 14236 ?        R    08:26 323:26              \_ cpanellogd - updating bandwidth for naguera
I've written a script so that at 00:00 every day runweblogs updates every user's stats, but it ALWAYS hangs on a random user, and I don't know WHY.

If I restart cpanellogd (by going to Statistics Configuration in WHM and setting any option that restarts it), tadá, it keeps going until some other user gets stuck on something along the way. Then it NEVER ends, so the other users never get their stats updated until I restart cpanellogd again, and so on.

WHY does this happen?

PS: The cron works on ALL the other servers, I've already run upcp, and the cpanellogd script in /usr/local/cpanel/etc/init/scripts/centos doesn't do anything except tell me that it doesn't work...
[email protected] [/usr/local/cpanel/etc/init/scripts/centos]# ./cpanellogd restart
./cpanellogd: line 9: /etc/rc.d/init.d/cpfunctions: No such file or directory
Shutting down cpanellogd:
./cpanellogd: line 27: killproc: command not found
./cpanellogd: line 17: action: command not found
 

Silent Ninja

It seems that this user has a LOT to update, but why doesn't runweblogs die after cpanellogd dies?

Code:
root		143.94	11.68	2.0
Top Process	%CPU 85.6	cpanellogd - updating bandwidth for martincl
Top Process	%CPU 85.4	cpanellogd - updating bandwidth for martincl
Top Process	%CPU 85.3	cpanellogd - updating bandwidth for martincl
 

Silent Ninja

Would it help to use the "runlogsnow" script instead of running "runweblogs" once per user via xargs and an ls script, as I'm doing now? Or should I use some kind of timeout script?
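
A rough sketch of the timeout idea, assuming GNU coreutils `timeout` is installed on the server. The function names, the 1800-second limit and the `STATS_CMD` override are made up for illustration and are not part of cPanel. Whether the signal also reaches a cpanellogd child depends on how runweblogs starts it, which has not been verified here.

```bash
#!/bin/bash
# Sketch only: cap each per-user stats run so one stuck user cannot block the rest.
# LIMIT and STATS_CMD are hypothetical knobs, not cPanel settings.

LIMIT="${LIMIT:-1800}"   # seconds allowed per user (30 minutes)

# Run a command under a time limit. GNU `timeout` exits with status 124
# when the limit is reached and it had to terminate the command.
run_with_limit() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
}

# Read user names from stdin, one per line, and run the stats command for
# each of them. Users whose run hit the limit are reported on stderr.
process_users() {
    local user status
    while read -r user; do
        [ -n "$user" ] || continue
        # </dev/null keeps the command from swallowing the rest of the user list
        run_with_limit "$LIMIT" "${STATS_CMD:-/scripts/runweblogs}" "$user" </dev/null
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "timed out: $user" >&2
        fi
    done
}
```

It could be driven with something like `ls /var/cpanel/users | process_users`, where the directory is an assumption about where the user list comes from. A user that times out is skipped for that night instead of holding up everyone after it.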
 

cPanelKenneth

cPanel Development
Staff member
Apr 7, 2006
4,607
79
458
cPanel Access Level
Root Administrator
You can call cpanellogd directly:

Code:
/usr/local/cpanel/cpanellogd USER
That at least removes an extra process/script from the equation.

Have you attached, via strace, to the log running process to determine what it was doing?
 

Silent Ninja

It hung again (I have set cpanellogd to stop when the CPU load reaches 30)...

Code:
root		89.51	7.48	2.0
Top Process	%CPU 89.4	cpanellogd - updating bandwidth for sierras
Top Process	%CPU 89.3	cpanellogd - updating bandwidth for sierras
Top Process	%CPU 89.2	cpanellogd - updating bandwidth for sierras
But cpanellogd completes correctly for the user sierras... (attached file)

[EDIT] I've now run both runweblogs and cpanellogd, and both completed correctly.
 


Silent Ninja

It would help to know how to reload cpanellogd via SSH (as happens when you change something in the Statistics options of WHM), so I can run it from cron every hour to restart whichever stats run has stalled.
 

cPanelKenneth

Code:
/scripts/ckillall -HUP cpanellogd
 

Silent Ninja

It keeps stopping on some random user, using about 95 / 97 / 100% CPU, and if I don't kill it, it just keeps hanging there. Is there any way to check whether a cpanellogd process for any user has been running for more than 30 minutes and automatically kill it?

This one is about to stall the stats process... check the CPU usage.
Code:
root     21590  0.0  0.0   8676   928 ?        S    Dec04   0:00  \_ /bin/bash ./force-stats.sh
root     21592  0.0  0.0   4028   480 ?        S    Dec04   0:00      \_ xargs -iUSER /scripts/runweblogs USER
root      9979  0.0  0.0  22728  1488 ?        S    12:09   0:00          \_ /usr/bin/perl /scripts/runweblogs economic
root      9980 96.4  0.5  71144 11452 ?        R    12:09   2:59              \_ cpanellogd - updating bandwidth for economic
But the bytes log is OK... I can upload it if you want, but it's always a random user.
I've noticed, though, that it usually tends to stop on the bandwidth update... is there any secondary script for this type of log that can be updated?
Like rrdtool, for example.

It would also be nice to know whether there's a way to force the awstats update while skipping the bandwidth update, so awstats can be updated while we deal with the bandwidth script.
 

cPanelKenneth

We are working on various fixes for our RRDTool integration which may relieve this issue.

Is this server on 11.23 or 11.24? There are a number of log processing improvements in 11.24.

As for your specific question, you could create a cron job that executes every 30 minutes, compares the current process table for cpanellogd with the one from the prior run, and terminates the processes that are still running.
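
A minimal sketch of such a watchdog. It swaps in a simpler technique than the one described above: instead of diffing two snapshots of the process table, it reads each process's elapsed time from `ps`. It assumes a `ps` that supports the `etimes` field (elapsed seconds), which older procps releases lack. The function names and the 1800-second default are illustrative only.

```bash
#!/bin/bash
# Sketch only: find cpanellogd bandwidth workers older than MAX_AGE seconds.
MAX_AGE="${MAX_AGE:-1800}"   # hypothetical default: 30 minutes

# Reads "PID ELAPSED_SECONDS COMMAND..." lines on stdin and prints the PID of
# every line that mentions the bandwidth update and is older than the limit ($1).
stale_pids() {
    awk -v max="$1" '$2 + 0 > max + 0 && /cpanellogd - updating bandwidth/ { print $1 }'
}

# Snapshot the live process table in the format stale_pids expects.
# `etimes` is an assumption about the installed ps; older versions only offer `etime`.
snapshot() {
    ps -eo pid=,etimes=,args=
}
```

From cron this could be run as `snapshot | stale_pids "$MAX_AGE" | xargs -r kill` (the `-r` flag is GNU xargs). Unlike a blanket `pkill`, it only touches workers that have outlived the limit.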
 

Silent Ninja

I'm running cPanel CURRENT 11.24. I've also run fixquotas and manually installed (via rpm) perl-rrdtool and rrdtool, to see whether that fixes any odd dependencies. I also ran checkperlmodules and re-ran my stats updater script (it's just an ls and a runweblogs for each user).

The script was fully verbose, really a LOT better than the latest runweblogs script, but it still hangs on a random user while updating bandwidth logs. If I kill that cpanellogd (the one updating the bandwidth), the stats carry on and finish normally (if some other user hangs, just repeat the process).

The only weird thing about my computer is that the motherboard BIOS is really outdated. I don't know if it has anything to do with the kernel, but the CPU is a Core Duo 3.0 and because of the outdated BIOS it's only running at 2.4 GHz... I really don't see any connection, but I'll update it anyway.

Do you have any other ideas?
This is one of the times it hung:
Code:
[email protected] [~]# screen -r
Phase 1 : First bypass old records, searching new record...
Direct access to last remembered record has fallen on another record.
So searching new records from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 2107
 Found 0 dropped records,
 Found 0 corrupted records,
 Found 0 old records,
 Found 2107 new qualified records.
Child [22450]: exited with signal 0
Complete
Log checker loaded ok..
==> cPanel Log Daemon version 24.0
==> Loaded RRDs: version 1.3004
==> WARNING: The configured processor count does not match the
==> actual processor count (2)! Running log analysis programs
==> on this system may cause excessive load! You should set "extracpus"
==> to "0" in /var/cpanel/cpanel.config if this is not ok.
[sepxferlog]
[sepxferlog] complete
Processing kalisol...
Run Logs domain: kalisol.com.ar BW Limit: unlimited Domains: []
This is the process:
Code:
root     22455 98.7  0.7 162332 14444 pts/1    R+   22:42   7:12              \_ cpanellogd - updating bandwidth for kalisol
I killed it and tadá, the script moved on to the next user... the catch is that it leaves that user's bandwidth at 0 MB used, and some users are starting to notice and take advantage of it, assigning a limit of only 2 MB to an account that actually uses several gigabytes...


I think that sooner or later I'll end up formatting this computer... if you don't have any other ideas I might send a ticket to cPanel, but I suspect I'll end up with the same answer... I'll do it after the BIOS update, just to be sure.
 

Silent Ninja

I've added a cron job that runs this every 15 minutes:
/usr/bin/pkill -9 cpanellogd

It works and kills the process that's stalling the stats... the downside is that it kills cpanellogd even when it isn't stalling anything.

This is an error I've seen for martincl, a user that ALWAYS hangs (the one that started this issue):
Code:
Run Logs domain: closeloops.com BW Limit: 3145728000 Domains: [ar.closeloops.com cl.closeloops.com]
Update status for /var/cpanel/bandwidth/martincl-http.rrd: (222) /var/cpanel/bandwidth/martincl-http.rrd: illegal attempt to update using time 1228058902 when last update time is 1228058901228107264 (minimum one second step)
Processing exim stats for martincl.......Done
(The bandwidth updater hung right after this...)

Would it help to update all the stats that can be updated and then flush all the domlog files?
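
The error above complains about a "last update time" far beyond any real date. A small sketch for spotting .rrd files in that state, assuming the `rrdtool` command-line tool is installed (`rrdtool last FILE` prints the Unix timestamp of the most recent update). The function names are illustrative, and this only detects the condition. It does not repair anything.

```bash
#!/bin/bash
# Sketch only: flag RRD files whose recorded last-update time lies in the future.

# Succeeds (exit status 0) when timestamp $1 is later than reference time $2.
# awk is used so that very large or negative values still compare numerically.
is_future() {
    awk -v ts="$1" -v now="$2" 'BEGIN { exit !(ts + 0 > now + 0) }'
}

# Print a warning for one RRD file if its last update is ahead of the clock.
# Returns 2 if rrdtool could not read the file.
check_rrd() {
    local file="$1" last
    last=$(rrdtool last "$file") || return 2
    if is_future "$last" "$(date +%s)"; then
        echo "suspicious last-update time in $file: $last"
    fi
}
```

For example: `for f in /var/cpanel/bandwidth/*.rrd; do check_rrd "$f"; done`. A file flagged this way would reject every normal update as older than its last one, which matches the message shown for martincl-http.rrd.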
 

cPanelKenneth

After you kill the process, how long does it take for another to cause log processing to hang? Is it rather reliable?

The reason for the questions is I'm wondering if you would be willing to open a ticket with us for QA to investigate what is happening so we can possibly rectify this once and for all.
 

Silent Ninja

It woooorks !!

FIXED ! :D

I found that /usr/local/apache was filled with logs D: !
Almost 98% of it (4 GBytes!) was the apache error_log; thankfully cPanel 11.24 has log rotation for it. So I warned the account that was causing the most errors, deleted the error_log, finished all the stats that could be processed, and deleted all the remaining domlogs.

Then (just to be sure) I did an httpd restart, upcp --force and fixquotas, and... the other night the bash script finished successfully and all stats are now updated :D

Thanks for all of your help :)
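
Since the root cause turned out to be a nearly full filesystem, a check like the following could catch it earlier. This is only a sketch: it parses the POSIX-format output of `df -P`, the function name and the 90% threshold are arbitrary, and mount points containing spaces are not handled.

```bash
#!/bin/bash
# Sketch only: report filesystems at or above a usage threshold.

# Reads `df -P` output on stdin and prints "MOUNTPOINT USE%" for every
# filesystem whose capacity column is at or above the limit given as $1.
full_mounts() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # strip the percent sign from the capacity column
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

Run as `df -P | full_mounts 90`, for instance from the same nightly cron that starts the stats update, so a log directory filling up shows up before log processing stalls.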
 

cPanelKenneth

Well, yeah that would do it. I'm glad you discovered the root cause.
 

Kent Brockman

Well-Known Member
PartnerNOC
Jan 20, 2008
1,287
65
178
Buenos Aires, Argentina
cPanel Access Level
Root Administrator
Hi, I have the same problem. I'm running STABLE 11.23.6-S27698. Is this issue already solved in the new STABLE 33345? I mean, does that new version rotate these heavy log files?
Thanks
 

Silent Ninja

It has started happening again on one of our cPanel servers: cpanellogd hangs while accounting a user's bandwidth, and we didn't notice for about a week.

The solution is to force a statistics update for all users and kill each bandwidth-log process that hangs at 99% CPU, until it has finished accounting as many users as possible. Then delete all domlogs and restart Apache (so that they are re-created automatically; if that doesn't happen you may have to re-create them manually), and it will all start working again.

It would be good to add some sort of timeout to the bandwidth accounting, so that if it takes more than... let's say something huge... half an hour, it gets killed and the hang affects only that user instead of all users. It could report that those users are behind so you can fix them. As it is, I ended up with a week of unprocessed logs and statistics, because it seems the domlogs are being deleted without being accounted while this happens.

PS: I've noticed that one of the bandwidth files that hung the process has more than 23,000 lines (because of the backlog, all the pending information accumulated in it). Maybe there is a limit on how much can be processed, because smaller files caused no problems with the same cpanellogd script.

PS2: Some stats reported a last-update date later than the date of the first entry in the domlog. It may be that a wrong last-update time in cpanellogd is stopping them from being updated, or vice versa: the ones that didn't fail may not be getting updated at all because of this.

Code:
Update status for /var/cpanel/bandwidth/comagolf-http.rrd: (912) /var/cpanel/bandwidth/comagolf-http.rrd: illegal attempt to update using time -9223372036854775808 when last update time is 1239199361 (minimum one second step)
Processing exim stats for comagolf.......Done
Update status for /var/cpanel/bandwidth/comagolf-pop3.rrd: (138) 
Update status for /var/cpanel/bandwidth/comagolf-all.rrd: (1050) /var/cpanel/bandwidth/comagolf-all.rrd: illegal attempt to update using time -9223372036854775808 when last update time is 1239199361 (minimum one second step)
 

cPanelDavidG

Technical Product Specialist
Nov 29, 2006
11,212
13
313
Houston, TX
cPanel Access Level
Root Administrator
Please let our technical analysts take a look at this server for you. They can test your hypothesis of the log file sizes causing the issue. Also, if this is something that needs to be optimized in our code, our technical analysts will be able to discuss that with our quality assurance team who can then formulate a long-term solution.