HOW TO: make full backups use --rsyncable w/ gzip

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
Drew Nichols said:
So are you having to manually change your cpbackup script each night or have you locked it?
At the moment I'm copying the 'cpbackupedited' script to 'cpbackup' every night just before 'cpbackup' runs, as I found that in some cases the 'watchcpbackup' script doesn't copy it back if there are changes. That meant that when 'cpbackup' was actually changed by cPanel, the new, non-rsync-friendly 'cpbackup' was used, wiping out all the benefits.
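A sketch of that nightly restore, using throwaway paths (the /tmp locations and the cron time are assumptions for illustration; on a real server the files would live in /scripts):

```shell
# Demo of the nightly "restore the edited script" workaround.
# /tmp/scripts-demo stands in for /scripts on a real server.
SCRIPTS_DIR="/tmp/scripts-demo"
mkdir -p "$SCRIPTS_DIR"
printf '#!/bin/sh\n# rsync-friendly edition of cpbackup\n' > "$SCRIPTS_DIR/cpbackupedited"
# The cron entry would run a few minutes before the backup window, e.g.:
#   55 1 * * * cp -p /scripts/cpbackupedited /scripts/cpbackup
cp -p "$SCRIPTS_DIR/cpbackupedited" "$SCRIPTS_DIR/cpbackup"
cmp -s "$SCRIPTS_DIR/cpbackupedited" "$SCRIPTS_DIR/cpbackup" && echo "scripts match"
```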

bleah!
 

chirpy

Well-Known Member
Verified Vendor
Jun 15, 2002
13,437
33
473
Go on, have a guess
It's definitely working for me. For those testing this for the first time, remember that you'll have to wait for two backup runs to pass before you'll see speedups. Using the export in the crontab I'm seeing speedup factors of between 2 and 36 (a lot will depend on the number of sites and the amount of data changing, obviously), which is a huge improvement.
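For reference, the crontab approach amounts to the sketch below (the 02:00 schedule is an example); the important part is that GZIP is exported in the same shell that launches cpbackup, so every gzip process it spawns inherits the option:

```shell
# Crontab line as discussed in this thread (time of day is an example):
#   0 2 * * * export GZIP=--rsyncable ; /scripts/cpbackup
# gzip reads extra options from an exported GZIP variable, so any child
# process started from this shell sees it too:
export GZIP="--rsyncable"
sh -c 'echo "child process sees: $GZIP"'
```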
 

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
Working here as well. I evidently didn't use that exact method; it seems that putting the export command on its own line (instead of on the same line as the cpbackup command) doesn't work, nor did putting it into a script and calling that, even on the same line. Guess the first post needs updating now! :)
 

Drew Nichols

Well-Known Member
May 5, 2003
96
0
156
SC
Love the parsimony of the new solution. Here's my rsync string; anybody have a better one? This one seems to work.

rsync -vrplogDtH /home/bu/cpbackup/daily/ -e ssh [email protected]:/home/nas/xeon/

Where [email protected] is the user nas at the SSH host nas on our network.

This is run on the server being backed up hours after the cpbackup script has completed.
 

chirpy

Well-Known Member
Verified Vendor
Jun 15, 2002
13,437
33
473
Go on, have a guess
Similar to mine, except I pull instead of push and I use a non-standard SSH port and am sure to delete files at the destination that are not part of the source (e.g. removed accounts):

rsync --delete --stats -vae 'ssh -p NNN' [email protected]:/backup/cpbackup/daily /backup/otherserver
 

Drew Nichols

Well-Known Member
May 5, 2003
96
0
156
SC
chirpy said:
Similar to mine, except I pull instead of push and I use a non-standard SSH port and am sure to delete files at the destination that are not part of the source (e.g. removed accounts):

rsync --delete --stats -vae 'ssh -p NNN' [email protected]:/backup/cpbackup/daily /backup/otherserver
We actually keep removed accounts (unless they're so big they get on the disk radar screen) just in case a former client needs something. It happens. :)
 

lloyd_tennison

Well-Known Member
Mar 12, 2004
697
1
168
I add a "--partial" so that if the process is interrupted, the partially transferred file is kept and the next run can pick up from it.

Hmm, I wonder if adding a "--compress" would make a big difference. I know that uses gzip-style compression, and since we already made the files gzip-compressed (and rsyncable) when creating them, it would not gain as much as it would on uncompressed data. Have to test the CPU overhead versus the transfer savings.
 

Shuriken1

Member
Apr 5, 2005
15
0
151
Hi,

Lyttek, can you confirm that the method now at the start of this thread is up to date as per all of the comments? Knowing my luck I'll make a mess of it anyway; it would be good to know that it's definitely me rather than the method that's getting it wrong.

Thanks and excellent work with this, seems like a great solution.

Shuriken1.
 

lloyd_tennison

Well-Known Member
Mar 12, 2004
697
1
168
I am not Lyttek, but I can confirm that is all you need to do. It works well. I added a "-P" (or just --progress if you do not want --partial) to the rsync command while testing it, as you can see exactly what it does.
 

Shuriken1

Member
Apr 5, 2005
15
0
151
Thanks for the response, would I be right in assuming that if I wanted Chirpy's rsync string:

rsync --delete --stats -vae 'ssh -p NNN' [email protected]:/backup/cpbackup/daily /backup/otherserver

to push rather than pull, I could simply swap the two addresses over at the end of the line, like so?

rsync --delete --stats -vae 'ssh -p NNN' /backup/otherserver [email protected]:/backup/cpbackup/daily

I'm trying to back up to a Windows box, preferably over FTP, but haven't found much info; has anyone done this already? I understand the above is for a *nix-to-*nix setup, I've just wandered off on a different tangent without warning. :)

Thanks,

Shuriken1.
 

chirpy

Well-Known Member
Verified Vendor
Jun 15, 2002
13,437
33
473
Go on, have a guess
lloyd_tennison said:
I add a "--partial" so if the process is interrupted..

Hmm, I wonder if adding a "--compress" would make a big difference. I know that uses gzip format, and since we made the gzip format rsyncable the first time when creating the files - it would not be as efficient as standard gzip format. Have to test overhead versus transfer.
You'll probably find that if you use rsync compression:

1. It'll make little difference for the reasons you mention

2. It'll fail - it seems to be a very flaky option in rsync and I find that the transfer will nearly always fail to complete

Better off without it.
 

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
Looks like they beat me to it :)

Yep, the first post is up to date. I'm also looking at using cwrsync, which is rsync running on Cygwin and should (in theory) let Windows use rsync. Haven't had a chance to play with it yet.

Anyone have experience with cwrsync?
 

lloyd_tennison

Well-Known Member
Mar 12, 2004
697
1
168
I tried it and could never get their version of sshd to work when pushing. I installed full Cygwin and it worked fine. Pulling would probably have run fine, but I was trying it before this thread started, so I was using the uncompressed files, and many would not transfer because of differences between *nix and Windows naming conventions, e.g. case sensitivity.
 

fmalekpour

Well-Known Member
PartnerNOC
Dec 4, 2002
85
1
158
Can we add something like this to prevent gzip from eating so much CPU:

Code:
export GZIP="--rsyncable -3" ; /scripts/cpbackup
This will create bigger files, but faster. Backup storage is much cheaper than processor time.

Is the syntax correct? Any ideas?
 

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
Interesting thought... in theory it should work, as the rsyncable option just tells gzip to reset its compression tables every so many bytes, independently of the compression level.

Whether or not it actually works... well, hopefully you'll fill us in after trying, yes? ;)
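The size/CPU tradeoff behind "-3" is easy to measure locally (a sketch; the sample data and paths are made up, and real backup tarballs will behave differently):

```shell
# Compare compression levels on a sample file; -3 trades size for speed.
seq 1 200000 > /tmp/lvl.txt                 # ~1.3 MB of sample text
echo "level 3: $(gzip -c -3 /tmp/lvl.txt | wc -c) bytes"
echo "level 9: $(gzip -c -9 /tmp/lvl.txt | wc -c) bytes"
# Prefixing each with 'time' shows the CPU side of the tradeoff.
```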
 

areha

Well-Known Member
Oct 30, 2002
52
0
156
Can't get it to work. It looked like it did on the first runs, but it transferred the entire file whenever any file inside the .tar.gz had changed. If there was no change at all, it transferred a partial file, which fooled me into thinking it worked.

I'm testing this because I can't see that it works correctly. My backups are 800 MB and bandwidth is expensive here.

What I have done:

Created a file named testfile.txt with a lot of text. I also added the gzip rsyncable option.

Run #1
Ran it a first time. The entire file is transferred.
gzip="--rsyncable" tar zcvf test.tar.gz testfile.txt

Run #2
Now, just run gzip="--rsyncable" tar zcvf test.tar.gz testfile.txt again, without any change to the file. Partial file transferred, which I think is correct, since the gzip process can change some metadata, I guess. You can see the log below.

Run #3
Edited testfile.txt, adding some lines of text.
Entire file transferred. This should not happen.


LOG OUTPUT

RUN #2
---------------------------------
test.tar.gz
2762 100% 2.63MB/s 0:00:00 (1, 100.0% of 1)

Number of files: 1
Number of files transferred: 1
Total file size: 2762 bytes
Total transferred file size: 2762 bytes
Literal data: 700 bytes
Matched data: 2062 bytes
File list size: 34
Total bytes sent: 60
Total bytes received: 806

sent 60 bytes received 806 bytes 133.23 bytes/sec
total size is 2762 speedup is 3.19

RUN #3
---------------------------------
test.tar.gz
2767 100% 2.64MB/s 0:00:00 (1, 100.0% of 1)

Number of files: 1
Number of files transferred: 1
Total file size: 2767 bytes
Total transferred file size: 2767 bytes
Literal data: 2767 bytes
Matched data: 0 bytes
File list size: 34
Total bytes sent: 60
Total bytes received: 2861

sent 60 bytes received 2861 bytes 449.38 bytes/sec
total size is 2767 speedup is 0.95
 

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
I think there's a flaw in the way you're trying to use it.

The GZIP variable must be set AND exported before the gzip program will see it. Simply setting it on the command line, as it appears you're doing, is probably where it fails.

If you type 'env' at a command prompt, you should see 'GZIP=--rsyncable' somewhere. If you don't see it, gzip won't either.

Issuing the command 'export GZIP=--rsyncable' will set it. Note that gzip looks for the variable name GZIP, in uppercase, specifically; a lowercase 'gzip=' assignment won't be seen by it.

If you were issuing a gzip command directly, rather than through the 'tar' command as shown, you could use --rsyncable as a switch.
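The difference is easy to see in a shell (a sketch; gzip consults the uppercase name GZIP specifically):

```shell
# Lowercase sets a shell variable gzip never reads; exported uppercase
# GZIP is what gzip actually consults for extra options.
gzip="--rsyncable"          # wrong: invisible to the gzip program
export GZIP="--rsyncable"   # right: shows up in 'env', seen by gzip
env | grep '^GZIP='         # prints: GZIP=--rsyncable
# Calling gzip directly (not via tar -z), the switch form also works:
#   gzip --rsyncable somefile
```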
 

areha

Well-Known Member
Oct 30, 2002
52
0
156
Lyttek said:
The GZIP variable must be set AND exported before the gzip program will see it. Simply setting in the command line as it appears you're doing is probably where it fails.
I have done that too; I've tried almost everything. But I did it again now, just to be sure. The --rsyncable does appear as an environment variable, and I get the same results over and over again. What also makes me think this doesn't work today is that I have added a cron job as described above, and my backups are transferred in full even after three runs IF there is any change in the real files. If it is just a new backup with no changes within the files, it only transfers a small amount of data, and you could be fooled into thinking it's transferring just the changed data and working as it should for that file.


According to the manual for gzip (the quick help, gzip -h), it's perfectly allowed to do it on the command line. It doesn't describe environment variables at all, so doing it on the command line would be preferable, if it weren't for the fact that cPanel overwrites scripts on updates.

Snippet from the gzip help:
-1 --fast compress faster
-9 --best compress better
--rsyncable Make rsync-friendly archive
 
Last edited:

Lyttek

Well-Known Member
Jan 2, 2004
775
5
168
areha said:
According to the manual for gzip (the quick help gzip -h), it´s perfectly allowed to do it at command line.
True, but the command you were running isn't 'gzip', it's 'tar', which calls gzip via the -z switch... not the same thing. For instance, try running your tar command with --rsyncable in the switch list... it will fail, because 'tar' doesn't know what that is, so it can't pass it to 'gzip'.

From my experience, if you rsync a file multiple times and there are no changes to that file, the rsync run completes almost immediately and no 'speedup' is reported, because nothing was transferred. That the report shows a speedup tells you the file isn't the same.

Also, looking more closely at your log files, I don't think you're working with large enough files to get a true reading; 2.7 KB is a very small file. Having said all the above, this is probably the source of the problems you're reporting. Try testing with a 1-5 MB file.
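A quick way to build a test file of a more realistic size (the size and paths are examples; the transfer steps are left as comments since they need the exported variable and a remote host):

```shell
# Generate ~3 MB of varied, compressible text for a more meaningful test.
seq 1 500000 > /tmp/testfile.txt
ls -l /tmp/testfile.txt
# Then repeat the three runs, with the variable exported and uppercase:
#   export GZIP="--rsyncable"
#   tar zcvf test.tar.gz testfile.txt
#   rsync -av --stats test.tar.gz user@host:/some/dir/
```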