HOW TO: make full backups use --rsyncable w/ gzip

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
HOW TO: make full backups work with rsync

Latest edit: rsyncable option is now included in cpbackup by cPanel, so all the instructions below are moot for version 11 and up.

Edits: The simplified version after much testing and trial is provided below. Thanks to Chirpy for finding the exact method that works so simply! Also added a simple rotation script.

Modify the root crontab changing the line shown as follows:

0 1 * * * /scripts/cpbackup

to:

0 1 * * * export GZIP="--rsyncable" ; /scripts/cpbackup
That's it!
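
For anyone wondering why the one-line change is enough: gzip reads default options from the GZIP environment variable, and an exported variable is inherited by whatever the cron line runs next. Here is a minimal sketch of that inheritance. The child process (`sh -c`) is only a stand-in for /scripts/cpbackup so the example can run anywhere.

```shell
#!/bin/sh
# Same shape as the crontab entry; the child here is a stand-in for /scripts/cpbackup.
export GZIP="--rsyncable" ; seen_by_child=$(sh -c 'printf "%s" "$GZIP"')

echo "child process sees GZIP=$seen_by_child"

# Clean up so later gzip calls in this shell are unaffected.
unset GZIP
```

If the child did not see the value, the most likely cause would be a missing `export`.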

Here's a simple script to rotate the backups:
Code:
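#!/bin/sh

# This script rotates the backup files daily, keeping 7 days' worth.
# One directory per weekday (Mon, Tue, ...), overwritten a week later.

DATEFORMAT=$(date +%a)
BACKUPSOURCE=/backup
BACKUPDEST=/Data/backup/rotation

# Replace last week's copy for this weekday with today's backup.
rm -Rf "$BACKUPDEST/$DATEFORMAT"
mkdir -p "$BACKUPDEST/$DATEFORMAT"
cp -Rp "$BACKUPSOURCE"/* "$BACKUPDEST/$DATEFORMAT"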
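
To try the rotation without touching real backups, here is a dry-run sketch of the same steps against throwaway directories. The temporary directories and the file names `user1.tar.gz` and `old.tar.gz` are made up for the test; only the three rotation commands mirror the script.

```shell
#!/bin/sh
# Dry run of the weekday rotation against throwaway directories.
BACKUPSOURCE=$(mktemp -d)   # stand-in for /backup
BACKUPDEST=$(mktemp -d)     # stand-in for /Data/backup/rotation
DATEFORMAT=$(date +%a)

echo "account data" > "$BACKUPSOURCE/user1.tar.gz"

# Leftover from the same weekday last week; it should disappear.
mkdir -p "$BACKUPDEST/$DATEFORMAT"
echo "stale" > "$BACKUPDEST/$DATEFORMAT/old.tar.gz"

# The rotation itself. ${BACKUPDEST:?} aborts if the variable is ever empty,
# so the rm can never run against "/<weekday>".
rm -Rf "${BACKUPDEST:?}/$DATEFORMAT"
mkdir -p "$BACKUPDEST/$DATEFORMAT"
cp -Rp "$BACKUPSOURCE"/* "$BACKUPDEST/$DATEFORMAT"

ls "$BACKUPDEST/$DATEFORMAT"
```

After the run, today's weekday directory should hold only the fresh copy.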
The original post is below.

******** Original Post ************


Note: This post may become obsolete if it turns out that setting a permanent environment variable fixes the issue. If so, this will get updated to reflect that.

Using the Full Backup feature built into WHM is nice, because it grabs pretty much everything and creates stand-alone backups. The problem with this backup, however, is that because it uses gzip compression, it isn't rsync-friendly.

Rsync is a nifty file transfer tool that only transfers the changes in a file. So, once you have downloaded a backup, rather than download the whole backup again to update it, rsync only sends the changed parts and incorporates them into the existing file. With files compressed using the default gzip settings, though, almost the entire file is transferred again: the compression is adaptive, so a small change early in the input alters nearly all of the compressed output that follows.

There is good news, though. I just saved a bunch of money on my car insurance by switching to Geico ;)

Ok, seriously, the good news is that later versions of gzip have the capability of playing nice with rsync by way of the '--rsyncable' switch. With an rsyncable gzip file, rsync can process it much more efficiently. (You can check whether your version of gzip supports the switch by looking at the help: 'gzip --help')
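
The help check can also be scripted. The sketch below looks for the switch in the help text, compresses a throwaway file, and confirms that the result still decompresses to the original. The sample data and file names are assumptions for the demo, and it falls back to plain gzip if the switch is missing.

```shell
#!/bin/sh
# Sketch: detect --rsyncable support, then confirm the archive still decompresses cleanly.
workdir=$(mktemp -d)
seq 1 20000 > "$workdir/sample.txt"   # throwaway test data

if gzip --help 2>&1 | grep -q -e '--rsyncable'; then
    rsyncable_supported=yes
    gzip --rsyncable -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"
else
    rsyncable_supported=no
    gzip -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"
fi

# Decompress to stdout and compare byte for byte with the original.
if gzip -dc "$workdir/sample.txt.gz" | cmp -s - "$workdir/sample.txt"; then
    roundtrip=ok
else
    roundtrip=failed
fi
echo "rsyncable supported: $rsyncable_supported, round trip: $roundtrip"
```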

So how do we make that happen? The good news is it's a simple, one-line modification to the 'cpbackup' script located in the /scripts directory. The bad news is that cpanel will overwrite this file every time it runs the 'upcp' script. This means we have to protect it somehow. Another nifty script will take care of that.

So, first, let's modify the 'cpbackup' script:

1) make a copy of the script and call it 'cpbackupunedited'
cp /scripts/cpbackup /scripts/cpbackupunedited
2) Open the 'cpbackup' script and add the following line just after the remarks at the top, but before the 'BEGIN' statement
$ENV{'GZIP'} .= "--rsyncable";
3) Save the script. Now, make another copy of the file, this time calling it 'cpbackupedited'
cp /scripts/cpbackup /scripts/cpbackupedited
Your backups are now going to work with rsync rather nicely! For instance, on the first test I made using a 1.5gig set of gzip files, 35 megs were transferred, in contrast to the 1.2gigs transferred without the --rsyncable switch. That speeds up transfer times quite a bit!
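
If you want to measure the effect yourself before trusting it with real backups, something along these lines should work. It is only a sketch on generated data: the file names and the `seq` test data are made up, and it skips itself when rsync or the --rsyncable switch is unavailable. The `--stats` output of rsync reports "Literal data" (bytes actually sent) and "Matched data" (bytes reused from the old copy).

```shell
#!/bin/sh
# Sketch: see how much of a re-compressed file rsync can reuse.
result=skipped
if command -v rsync >/dev/null 2>&1 && gzip --help 2>&1 | grep -q -e '--rsyncable'; then
    work=$(mktemp -d)
    mkdir "$work/src" "$work/dst"

    # Day 1: compress and transfer.
    seq 1 400000 > "$work/data.txt"
    gzip --rsyncable -c "$work/data.txt" > "$work/src/data.txt.gz"
    rsync -a "$work/src/" "$work/dst/"

    # Day 2: a small change near the start, then compress and transfer again.
    { echo "new first line"; cat "$work/data.txt"; } > "$work/data2.txt"
    gzip --rsyncable -c "$work/data2.txt" > "$work/src/data.txt.gz"

    # --no-whole-file forces the delta algorithm even for a local copy;
    # -I makes rsync look at the file even if size and time happen to match.
    result=$(rsync -aI --no-whole-file --stats "$work/src/" "$work/dst/" \
        | grep -E 'Literal data|Matched data')
fi
echo "$result"
```

With rsyncable archives on both ends, most of the bytes should show up under "Matched data". Dropping `--rsyncable` from both gzip calls gives the comparison figure.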

Now, we need to protect our cpbackup file. While we could 'chattr' it so that it can't be overwritten, there is another, more refined method that will let us know if there are changes in the script that need to be looked at. This is a modification of the 'watchwwwacct' script you can read more about elsewhere on the forum. I'll post just the modified script for the sake of brevity:

#!/usr/bin/perl
# You may need to change this path to /usr/local/bin/perl

$mailprog = "/usr/sbin/sendmail";

#**************************************************************
#
# Script to monitor cpbackup script customizations: watchcpbackup V1.1
#
# Written by:
# Premier Website Solutions - http://www.premierwebsitesolutions.com
# Created - in 2003 (V1.0)
# Modified - September 12, 2004 (V1.1)
# - minor modifications to make usable by others
#
#
# Set a few variables below and upload this script to anywhere on your server,
# then set a cron job to run the script every hour. I put mine in a subfolder
# of the servers scripts folder. (/scripts/custom)
#
# To set the cronjob, in shell, type crontab -e, then enter what's between the quotes
# on the following line as a new line in your cron listings:
# "0 * * * * /scripts/custom/watchcpbackup.cgi"
#
# Ownership and permissions of the script should be root:root, and 0700.
#
# This script only needs to run after a cpanel upgrade, so you could set
# the cron job to run it 1 hour after upcp runs, but then if you change
# when upcp runs, you will need to change this also. This script is fast,
# so it's easier to just run it every hour.
#
# **** Using this script ****
# --------------------------------------------------------------------------------------
# |To use this script, you need to make a copy of the real cpbackup script and call it |
# |cpbackupunedited, then after your customizations, make a copy of your custom one and |
# |call it cpbackupedited. These copies need to be in the same directory as cpbackup. |
# | |
# |What this script does is compare your custom cpbackup script to a copy of it. |
# |If a cpanel update changes the cpbackup script, this script will notice the change. |
# | |
# |Now the fun part. Many upgrades only change the cpbackup script back to what it was |
# |originally. If it's changed, this script compares the new cpbackup to a copy of the |
# |original one. If the update merely wrote the original one back, it would match the |
# |copy. This script would then take the copy of your custom one and reuse it. Now, if |
# |the update altered the script, you would be emailed and told that your customizations |
# |are lost and that you will need to redo them. The script does tell you where changes |
# |were made to help with reapplying your customizations. |
# | |
# |I customized my cpbackup script over a year ago and have only had to redo the changes |
# |maybe half a dozen times. |
# --------------------------------------------------------------------------------------
#
#
# Registered users of this script will be notified of any future updates.
# If you registered this copy with me, put your email here for future reference.
# This copy is registered to:
#
#**************************************************************


# This is where the email will be sent when a change is detected.
# If you use spamassassin, you should include a name, like this:
# $sendto_email = 'YourName <[email protected]>';
$sendto_email = 'YourName <[email protected]>';

# This is the sender for the email message.
# Change it if you wish.
$sender_email = 'YourName <[email protected]>';

# This is where your cpbackup file is located.
# It shouldn't need to be changed.
$path = "/scripts";



$diff1 = system("cmp $path/cpbackup $path/cpbackupedited");

if ($diff1 eq "0") {
exit;
}
else {
$diff2 = system("cmp $path/cpbackup $path/cpbackupunedited");
}

if ($diff2 eq "0") {
system("cp -f $path/cpbackupedited $path/cpbackup");

# Open The Mail Program
open(MAIL,"|$mailprog -t");
print MAIL "Content-Type: text/html; charset=iso-8859-1\n";

print MAIL "To: $sendto_email\n";
print MAIL "From: $sender_email\n";

print MAIL "Subject: cpbackup file changed and restored\n\n"; # the blank line ends the headers

print MAIL "<b>The cpbackup file was changed back to the original and has been automatically replaced with the edited version.</b><br><br>\n\n";

close (MAIL);

}

if ($diff2 ne "0") {

# Open The Mail Program
open(MAIL,"|$mailprog -t");
print MAIL "Content-Type: text/html; charset=iso-8859-1\n";

print MAIL "To: $sendto_email\n";
print MAIL "From: $sender_email\n";

print MAIL "Subject: cpbackup file changed\n\n"; # the blank line ends the headers

print MAIL "<b>The cpbackup file has been changed and no longer matches the original file. You will need to redo your custom work.</b><br><br>\n\n";

close (MAIL);

}
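
For those who find Perl hard to follow, the decision logic of the watcher can be sketched in shell as below. It runs against throwaway files in a temporary directory (a stand-in for /scripts), and the file contents are made up. The mail step is left out and replaced by a message in `action`.

```shell
#!/bin/sh
# Sketch of the watcher's decision logic, run against throwaway files.
path=$(mktemp -d)   # stand-in for /scripts

printf 'stock script\n'              > "$path/cpbackupunedited"
printf 'stock script\n# my change\n' > "$path/cpbackupedited"

# Simulate an update that put the stock version back.
cp "$path/cpbackupunedited" "$path/cpbackup"

if cmp -s "$path/cpbackup" "$path/cpbackupedited"; then
    # Still our customized version.
    action="nothing to do"
elif cmp -s "$path/cpbackup" "$path/cpbackupunedited"; then
    # Update only restored the stock script: reapply our copy.
    cp -f "$path/cpbackupedited" "$path/cpbackup"
    action="restored edited copy"
else
    # Update changed the script itself: a human has to look.
    action="manual merge needed"
fi
echo "$action"
```

With the simulated update above, this takes the middle branch and prints "restored edited copy".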
Lastly, we need to set up our crontab to run this watcher script *after* 'upcp' runs, but *before* the cpbackup script runs. Edit the crontab (as root, obviously):

crontab -e
Look for the line with '/scripts/upcp' and change it:
40 0 * * * /scripts/upcp

becomes

40 0 * * * /scripts/upcp; /scripts/watchcpbackup
Save it, and you're done!


If you want to automate the rsyncing of your files, take a peek at the following HOW-TO, which is clear and works great:

http://www.jdmz.net/ssh/
 

CoolMike

Well-Known Member
Sep 6, 2001
313
0
316
Hi

Did you test the restore already? If this works, it would be a great solution for us, because we copy all the backup files to an external server every day.

Michael
 

Lyttek

I've not had time to do an actual restore test, but I have extracted the things on the backup server and it all looks well.

What I need to do is flesh this out a bit so that the md5 checksums on both systems are compared to determine if there were any errors.
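
One possible shape for that check, as a rough sketch only: build a checksum manifest next to the backups, then verify it against the copy. The two temporary directories stand in for the two servers, and the file names and the `MD5SUMS` manifest name are made up. In practice the manifest would travel to the backup server along with the archives.

```shell
#!/bin/sh
# Sketch: build a checksum manifest on the source, verify it on the copy.
src=$(mktemp -d)    # stand-in for the backup directory on the cPanel server
copy=$(mktemp -d)   # stand-in for the directory on the backup server

echo "account one" > "$src/user1.tar.gz"
echo "account two" > "$src/user2.tar.gz"
cp "$src"/*.tar.gz "$copy/"

# On the source: record one checksum per file, using relative names.
( cd "$src" && md5sum *.tar.gz > "$src/MD5SUMS" )

# On the copy: re-compute and compare; a non-zero exit means a mismatch.
if ( cd "$copy" && md5sum -c "$src/MD5SUMS" >/dev/null ); then
    verify=match
else
    verify=mismatch
fi
echo "checksums: $verify"
```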
 

alwaysweb

Well-Known Member
Mar 8, 2002
97
0
306
Dallas, TX
cPanel Access Level
Root Administrator
Thanks for this cool post! (We previously had to adjust our rsyncing of our server's backups due to the bandwidth used.)

I tried it out, but wasn't successful in actually seeing any rsync savings in the transfer... I installed as indicated, and ran /scripts/cpbackup manually, and then rsync'd that to my offsite backup server. Easy.

Then, I dropped a few 5mb files into a few domains' www folders... then I renamed /backup/cpbackup/daily to /backup/cpbackup/daily.old (so cpbackup would run again) and ran /scripts/cpbackup let it finish, and did the rsync again...

It seems to have transmitted the full .tar.gz for each user in full all over again.

Any ideas? Could you perhaps help me on this on forum or off? Is there another way you can think of to test this? Etc.

(We would be glad to pay you for your time) I PM'd you my contact info, thanks.
 

dgbaker

Well-Known Member
PartnerNOC
Sep 20, 2002
2,531
10
343
Toronto, Ontario Canada
cPanel Access Level
DataCenter Provider
Because you renamed the daily directory to something else, there is nothing for it to rsync with. The script only works to copy to an existing gz backup file. If no gz file exists the entire file is copied.

Nice script though BTW.
 

Lyttek

Here's what I think has happened:

1) original cpbackup ran and created all the new backup files under /cpbackup/daily

2) rsync'd, pulling down /cpbackup/daily

3) renamed /cpbackup/daily to /cpbackup/daily.old

4) ran cpbackup again, recreating /cpbackup/daily

At this point you have /cpbackup/daily and /cpbackup/daily.old. Assuming you used the same rsync script that I did:

5) rsync'd again, pulling down minor changes from /cpbackup/daily and the full /cpbackup/daily.old directory, since it didn't exist the first time.

The script pulls down changes recursively from /cpbackup.
 

alwaysweb

dgbaker said:
Because you renamed the daily directory to something else, there is nothing for it to rsync with. The script only works to copy to an existing gz backup file. If no gz file exists the entire file is copied.

Nice script though BTW.
Hi, I did the renaming on the local machine (not the backup server) so that I could run the cpbackup "daily" again and rsync that...
 

alwaysweb

Lyttek said:
Here's what I think has happened:

1) original cpbackup ran and created all the new backup files under /cpbackup/daily

2) rsync'd, pulling down /cpbackup/daily

3) renamed /cpbackup/daily to /cpbackup/daily.old

4) ran cpbackup again, recreating /cpbackup/daily

At this point you have /cpbackup/daily and /cpbackup/daily.old. Assuming you used the same rsync script that I did:

5) rsync'd again, pulling down minor changes from /cpbackup/daily and the full /cpbackup/daily.old directory, since it didn't exist the first time.

The script pulls down changes recursively from /cpbackup.
Hi, I don't think I retransmitted the .old, as I used "rsync-exclude" to ignore daily.old in my remote backup script. I think the reason it did a full transfer was that the .tar.gz files on the backup server were not made with "gzip --rsyncable". Apparently the file on both ends needs to be rsync-friendly to make a difference, right? Since the ones on the backup server were older and not rsync-friendly, it did a full transmit even though the local daily backup .tar.gz for a site WAS rsync-friendly. That's at least how it seems to me!

By letting the full rsync finish (so the backup server has rsync-friendly .tar.gz files) and then letting the next day's cpbackup run, my remote backup script only transmitted changes. If this is the case, it may not be a bad idea to mention it in the original post. Thanks all!



P.S. I explored the "global environment variables" you mentioned, and by editing:

/etc/profile

and above this line (approx line 58):

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC

I added:

GZIP="--rsyncable"

And then added GZIP to the end of the export line to look like this:

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC GZIP


And, to make it go into effect right away I did this from the command line with:

# GZIP="--rsyncable"; export GZIP

I believe it has accomplished the same thing. I took the line out of the /scripts/cpbackup and it appears to successfully do a rsync-friendly backup. Can anyone else test and confirm this?
 

CoolMike

Is the restore function in WHM still possible with this solution? Because if yes, maybe it would make sense for cPanel to implement this change globally...

Michael
 

Lyttek

alwaysweb said:
/etc/profile

and above this line (approx line 58):

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC

I added:

GZIP="--rsyncable"

And then added GZIP to the end of the export line to look like this:

export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE INPUTRC GZIP


And, to make it go into effect right away I did this from the command line with:

# GZIP="--rsyncable"; export GZIP

I believe it has accomplished the same thing. I took the line out of the /scripts/cpbackup and it appears to successfully do a rsync-friendly backup. Can anyone else test and confirm this?
I'm testing a similar route. I've also reverted back to the original cpbackup script. Rather than edit /etc/profile I dropped a new script in /etc/profile.d named gzipenv.sh that is as follows:

#!/bin/sh
# Sets environment variable to make gzip rsync-friendly

GZIP="--rsyncable";
export GZIP
I'll post back with results. How's your test going?
 

CoolMike

Lyttek said:
I'll post back with results. How's your test going?
This looks even simpler. Please let us know if this works; I will implement it immediately, and I'm sure I can save a lot of traffic on my network.

Michael
 

Lyttek

Well...

The script ran, and the following shows up when I use the 'env' command:

GZIP=--rsyncable

However, the sync took 1.2 gigs, so something didn't work correctly. Not sure why.

Here's something I don't understand yet, and it may have bearing:

In the HOW TO section, the actual command to set the GZIP variable is
$ENV{'GZIP'} .= "--rsyncable"
How does this differ in results from
#!/bin/sh
# Sets environment variable to make gzip rsync-friendly

GZIP="--rsyncable";
export GZIP
In particular, does the . before the = in the former statement mean anything?
 

Lyttek

Ok, did some more testing thusly:

I left the latest script in /etc/profile.d, AND reverted back to the edited cpbackup script. When I try to manually run it, it errors out because now it's trying to use gzip with a switch of --rsyncable--rsyncable

This tells me that both scripts are adding the requisite switch... so why didn't it work? Maybe a fluke, so I've gone back to the original, unedited version of cpbackup and the /etc/profile.d script and we'll see what happens tonight.
 

alwaysweb

Lyttek said:
Well...

The script ran, and the following shows up when I use the 'env' command:

GZIP=--rsyncable

However, the sync took 1.2 gigs, so something didn't work correctly. Not sure why.

Here's something I don't understand yet, and it may have bearing:

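In the HOW TO section, the actual command to set the GZIP variable is
$ENV{'GZIP'} .= "--rsyncable"
How does this differ in results from
#!/bin/sh
# Sets environment variable to make gzip rsync-friendly

GZIP="--rsyncable";
export GZIP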
In particular, does the . before the = in the former statement mean anything?
I think you're right, the env only doesn't seem to be cutting it --
Coming from a programming background: ".=" versus "=" means append to the previous value of the variable. So if test = "abcd" and you then did test .= "efgh", test would be "abcdefgh".

The second one is obvious, a normal equals sign. It just overwrites the variable with the value. Hope this helps!

Perhaps it's worth testing with ".=" vs. "=" to see which behaves :)
 

alwaysweb

Lyttek said:
Ok, did some more testing thusly:

I left the latest script in /etc/profile.d, AND reverted back to the edited cpbackup script. When I try to manually run it, it errors out because now it's trying to use gzip with a switch of --rsyncable--rsyncable

This tells me that both scripts are adding the requisite switch... so why didn't it work? Maybe a fluke, so I've gone back to the original, unedited version of cpbackup and the /etc/profile.d script and we'll see what happens tonight.
I tried this myself as well and got the --rsyncable--rsyncable messages on the command line...

Try this: If you change the quoted "--rsyncable" to " --rsyncable" with a space before the double dash (in both locations)... and if they BOTH apply... and it's:

" --rsyncable --rsyncable"

that still is a valid argument to gzip. :)
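
The same append-versus-assign point can be shown in plain shell. This is just an illustration: `GZIP_OPTS` is a stand-in name chosen so the demo does not touch the real GZIP variable, and the `${VAR:+...}` form is one way to add the separating space only when the variable already holds something.

```shell
#!/bin/sh
# GZIP_OPTS is a stand-in name so the demo does not touch the real GZIP variable.
GZIP_OPTS="--rsyncable"            # already set once, e.g. by /etc/profile.d

# Blind append, which is what .= does in Perl:
blind="${GZIP_OPTS}--rsyncable"

# Append with a separating space only when something is already there:
spaced="${GZIP_OPTS:+$GZIP_OPTS }--rsyncable"

echo "blind:  $blind"
echo "spaced: $spaced"
```

The first form reproduces the `--rsyncable--rsyncable` error from the thread. The second yields two separate switches, and also gives a clean single switch if the variable started out empty.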
 

Lyttek

alwaysweb said:
From a programming background, The ".=" versus "=" means append to... the previous settings of the variable. So if test = "abcd" and then you did test .= "efgh" then test would be "abcdefgh".
Ok, I see what caused the problem with the double switch, which makes sense. Thanks for that important bit of info! Still not sure why the env. variable didn't seem to work. Still waiting for test to finish.
 

Lyttek

Success!!

It must have been some type of fluke. Just finished the second round of tests, and it updated 1.8 gigs with only 39 megs of transfer!

Guess I'll update the original post with the new instructions.