Large logfile, merge into existing stats

Andrew Boring

Member
Sep 27, 2006
20
0
151
Hi all,

I've got a customer with a large 4GB domlog file. Since cpanellogd wasn't processing it for several months, I renamed it and recreated an empty logfile to start the processing again.

Now, I've got a huge file covering stats for two months that I would like to process without killing the server for other customers.

I attempted to split it into 10MB chunks so I could merge each small chunk back into the current domlog file and then execute /scripts/runweblogs on that user. However, the logfile gets processed and goes away, but Awstats doesn't reflect any changes.

Here are the steps that I did:

cd /usr/local/apache/domlogs
mv example.com example.com.bak
touch example.com

split -b 10m example.com.bak example_
(this creates the files example_aa, example_ab, example_ac, etc.)
cat example_aa >> example.com
/scripts/runweblogs username

This results in no Awstats update for the data.

Is something missing when I merge the chunk back into the current domlog or is there a better way to do this?
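
One thing worth ruling out in the steps above: `split -b` cuts on byte boundaries, so a chunk can start or end in the middle of a request line. A minimal sketch of a line-based split follows. The helper name `split_log_by_lines` and the default of 50000 lines are illustrative assumptions, not part of cPanel.

```shell
#!/bin/sh
# Split a log into chunks of at most N whole lines, so that no request
# line is cut in half at a chunk boundary.
# split_log_by_lines is a hypothetical helper name, not a cPanel script.
split_log_by_lines() {
    src=$1              # e.g. example.com.bak
    prefix=$2           # e.g. example_
    lines=${3:-50000}   # lines per chunk; arbitrary illustrative default
    split -l "$lines" "$src" "$prefix"
}

# Usage (file names are illustrative):
#   split_log_by_lines example.com.bak example_ 50000
#   cat example_* | cmp - example.com.bak   # chunks should rejoin to the original
```

GNU split also has `-C SIZE`, which caps each chunk at SIZE bytes while still keeping lines whole, if a byte limit is preferred over a line count.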
 

nyjimbo

Well-Known Member
Jan 25, 2003
1,137
1
168
New York
Not to be a smart ass, but why not just delete it and let the log start over again? If the customer complains NOW, after 4 months of no logs, make up some excuse and just let it start from today. Why kill yourself over this?
 

Andrew Boring

Because the customer wasn't being an ass about it. I'm more inclined to go out of my way for someone who asks politely and is patient while waiting for a resolution. And there may come a time when a service agreement or something similar is in force and I'll need to give stats a higher priority.

Plus, I'm really and truly curious about how to address this. There must be something else that cpanellogd or awstats processes to bring in the new information when it runs.

-Andrew
 

sparek-3

Well-Known Member
Aug 10, 2002
2,019
226
368
cPanel Access Level
Root Administrator
Not sure if this pertains to your situation or not. Keep in mind that there are hard-links for domlog statistics in:

/usr/local/apache/domlogs/username/domain.com

So if you delete the log file:

/usr/local/apache/domlogs/domain.com

You also need to recreate the hard-link:

rm -f /usr/local/apache/domlogs/username/domain.com
ln /usr/local/apache/domlogs/domain.com /usr/local/apache/domlogs/username/domain.com
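
To check that the two paths really are the same file after relinking, the inode numbers can be compared. This is only a sketch: `relink_domlog` and `same_inode` are hypothetical helper names, and the paths in the usage comment are the ones from this post.

```shell
#!/bin/sh
# Recreate the per-user hard link so both paths point at the same file.
# relink_domlog and same_inode are hypothetical helper names.
relink_domlog() {
    main=$1   # e.g. /usr/local/apache/domlogs/domain.com
    link=$2   # e.g. /usr/local/apache/domlogs/username/domain.com
    rm -f "$link"
    ln "$main" "$link"
}

# Succeeds only if both paths share one inode, i.e. they are hard links
# to the same file.
same_inode() {
    [ "$(ls -i "$1" | awk '{print $1}')" = "$(ls -i "$2" | awk '{print $1}')" ]
}

# Usage:
#   relink_domlog /usr/local/apache/domlogs/domain.com \
#                 /usr/local/apache/domlogs/username/domain.com
#   same_inode /usr/local/apache/domlogs/domain.com \
#              /usr/local/apache/domlogs/username/domain.com && echo linked
```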


The only way I would know to accomplish this would be to split the large log file that you have into smaller files. Store the new file in /usr/local/apache/domlogs/username/domain.com and recreate the hard-links. Then run the weblogs scripts. Repeat this process for each split.

The downside is that you are going to get a lot of statistics blips, because each run will cover both the date range of that split (say, January 20 through January 28) and the current date, as visitors hit the website today. The only way I know to stop this would be to firewall off port 80 while you run these statistics or, if the account is on a dedicated IP, to firewall off that IP on port 80.

You can probably process the log files without interrupting the live access log for this domain; I just don't know exactly how to do it. You might read through the script at:

/usr/local/cpanel/cpanellogd

which is the script that is actually responsible for creating statistics.
 

Andrew Boring

Sparek-3,

Thanks for the suggestions.

I did try this:

cd /usr/local/apache/domlogs
rm -f user/example.com
cp example_ab-ac example.com
ln example.com user/example.com
/scripts/runweblogs user

Pretty much the same result: the log file goes away, but Awstats doesn't reflect any changes (this is a file that spans a day in March, which has no stats data yet).
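
One possibility, not confirmed anywhere in this thread, is that the stats history already holds records newer than the chunk. AWStats is generally documented as expecting log records in chronological order, so records older than what it has already processed may be skipped. A quick way to see what a chunk covers is to print its first and last timestamps. The sketch assumes the Apache common/combined log format, where the timestamp is the fourth whitespace-separated field; `log_date_range` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the first and last timestamps found in an Apache access log chunk.
# Assumes common/combined log format: field 4 looks like "[10/Mar/2007:13:55:36".
# log_date_range is a hypothetical helper name.
log_date_range() {
    awk '{ ts = substr($4, 2) }        # strip the leading "["
         NR == 1 { first = ts }
         END { if (NR) print first " -> " ts }' "$1"
}

# Usage (file name is illustrative):
#   log_date_range example_ab
```

If the range printed for a chunk is entirely older than the newest data already in the stats, that would be consistent with the "file goes away, nothing changes" behaviour.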

I did look through the cpanellogd script itself, and though I can follow Perl code, I'm not a competent Perl programmer. I did notice several functions that process RRDtool data and stats. I'm wondering whether something else tells cpanellogd what range of dates and times to expect in the domlogs, so that it ignores the old data I'm merging in.

I did notice this bit of code:

Code:
if ( $ENV{'IGNORELASTRUN'} eq '1' ) {
    print "==> Ignoring lastrun files and running all stats now\n";
    print "==> cpanellogd will exit after stats have run\n";

so I copied everything over again, but set the environment variable IGNORELASTRUN=1 in the shell before running "/scripts/runweblogs user".

I'm not sure if the script didn't get the env variable or if it reset it from loading a configuration file (loadConfs() function) from somewhere, but the result was the same (no stats update for that chunk of March).
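
One way to narrow this down is to check whether the variable was exported at all. A plain `IGNORELASTRUN=1` in a POSIX shell sets a shell variable only, and child processes do not see it until it is exported. The sketch below uses `sh -c` as a stand-in child process (an assumption for illustration). Whether /scripts/runweblogs passes its environment on to cpanellogd is not something this thread confirms.

```shell
#!/bin/sh
# Show what a child process sees for IGNORELASTRUN.
# The "sh -c" child is a stand-in for /scripts/runweblogs (illustration only).
show_child_view() {
    sh -c 'echo "${IGNORELASTRUN:-unset}"'
}

unset IGNORELASTRUN
IGNORELASTRUN=1        # plain shell variable: not passed to child processes
show_child_view        # child reports "unset"

export IGNORELASTRUN   # now part of the environment
show_child_view        # child reports "1"

# One-shot form that sets it for a single command only:
#   IGNORELASTRUN=1 /scripts/runweblogs user
```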

Am I on the right track? Any other suggestions?