Remote incremental backups - timeouts

sp3ctre69

Well-Known Member
Aug 14, 2006
107
5
168
I have implemented remote incremental backups, and by and large they work well. However, I get timeouts during the pruning phase. Interestingly, when I check the remote server, the pruning does appear to have completed.

My concern is not so much the timeouts themselves as their effect.

With my old backup method, if a timeout left a backup incomplete, the next day's run would still put a full backup in place. Now that the remote incremental system is in place, do I still get the same effect? That is, if the folder with yesterday's date is incomplete on the remote server, will there be a gap in my backups, or will the gap be filled during the next day's cycle?

Hopefully I am being clear, if not let me know and I'll try to elaborate.
 

sp3ctre69

Not quite... my issue is that the errors occur in the backup transport phase. Presumably the backups are correct on the server but incorrect on the remote server. My question is: if yesterday's backup is incorrect on the remote, will it be corrected on the next transport? That is, will it rsync all the days or just the current one? It seems that a transport error on one day could potentially compromise your backup integrity unless it is corrected.
 

sp3ctre69

"Have you adjusted the time out settings?"
Yes I did, they are quite high... I have 3 servers on the new system, 1 fails every time, but works when I re-run it (like I said though, when logging in to check the initial failure it does appear to have done the work). The other 2 servers work every time, apart from last night when all 3 failed (due to either the version update or a local network issue).

Either way, the timeout is not currently my concern, it is how a potential transport issue could affect the remote copy of the incremental backups.
 

sp3ctre69

Just to clarify my question a little... consider the following few days... (backup retention set to 3)

Mon - Backup works, stored on server, remote transport works (backup files exist remotely)
Tue - Backup works, stored on server, remote transport works (backup files exist remotely)
Wed - Backup works, stored on server, remote transport FAILS (backup files DO NOT exist remotely)

My question is what happens on Thursday....

Tue - Backup works, stored on server, remote transport works (backup files exist remotely)
Wed - Backup works, stored on server, remote transport FAILS (backup files DO NOT exist remotely) ** here is the problem
Thu - Backup works, stored on server, remote transport works (backup files exist remotely)

If Wednesday's files are not repopulated and only Thursday's files are copied to the remote, there is still a hole in my remote backup. It looks to me like backup integrity is fine on the server, but the remote copy keeps the hole unless the next transport session fixes it.
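One way to check for this kind of hole is to compare the dated folders on the server with those on the remote. The following is only a minimal sketch: the function name and the ISO-date folder names are hypothetical and are not taken from cPanel, and the folder lists would have to be gathered by whatever means you use to inspect each side.

```python
def missing_remote_days(local_days, remote_days):
    """Return the dated backup folders present locally but absent remotely."""
    return sorted(set(local_days) - set(remote_days))


# Hypothetical folder names standing in for the Tue/Wed/Thu window above.
local = ["2017-09-12", "2017-09-13", "2017-09-14"]
remote = ["2017-09-12", "2017-09-14"]  # Wednesday's transport failed

print(missing_remote_days(local, remote))
```

A non-empty result means the remote still has a gap for those days, even though every local backup succeeded.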
 

cPanelMichael

Administrator
Staff member
Apr 11, 2011
47,884
2,250
463
Hello @sp3ctre69,

It depends on whether "Strictly enforce retention, regardless of backup success." is enabled in "WHM >> Backup Configuration". You can read more about the retention behavior at:

Backup Retention

Could you confirm whether it's behaving differently from what that document describes?

Thank you.
 

sp3ctre69

I am obviously struggling to articulate the problem here... The way the system handles failed backups is fine. The problem is that the backups themselves run fine, and each night I get a "backup succeeded" email. After that comes the transport phase, where the backup is sent to the remote destination, and that is the part that sometimes fails. With full backups, a failed transport on Wed would be fixed by a successful transport on Thu, but with incrementals the failed Wed would still cause problems. Of course the files are all there on the local file system, but that doesn't help if the remote copies are wrong and you need to use them.

Does that make more sense?
 

cPanelMichael

Hello,

Would you mind opening a support ticket using the link in my signature so we can take a closer look and see if this is a flaw in how remote incremental backups are transported and retained?

Thank you.
 

sp3ctre69

Thanks... done, ticket number is 8760865
 

sp3ctre69

I have done some investigation into how a failed transport is handled, and I think the system works as it should... There is definitely some confusion though, even in some of the discussions I have had with cPanel staff.

Last night I wiped my remote server and re-installed it, just in case there were issues. This left me with 7 days of backups on the server and none on the remote.

I ran a backup and the result was a full backup on the remote server, which matched the size of last night's backup on the source. On the source server the files were hard-linked (hence incremental), but on the remote server there was nothing to link to (as the earlier backups had been deleted). It appears to have dealt with this by copying the files from the source server.

Can someone comment on my understanding of this? I just want some reassurance about the process it goes through when the previous day's backup exists on the source server but not on the remote server when using rsync incremental backups.

Thanks
 

sp3ctre69

Interestingly, I am getting timeouts while pruning on 2 of my servers every night. When I look, the oldest folder has been pruned. These are the 2 servers with the biggest sites. All timeouts are set to the maximum, so I wonder if there is another one we are missing? I never had any failed transports when using full backups.
 

cPanelMichael

I see that ticket number 8762433 is open for this issue. I'll monitor the ticket and update this thread with the outcome.

The remote incremental backup process (with rsync) is designed to check if the files associated with the account backup on the cPanel server also exist on the remote backup destination. If the files do not exist on the remote backup destination (or if the files have changed), then it copies the actual files. Otherwise, it makes use of hard links when the files already exist on the remote backup destination. The following document is available if you'd like to read more about how account backup information is stored:

Metadata for Backups - Version 66 Documentation - cPanel Documentation

Thank you.
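The link-or-copy decision described above can be sketched roughly as follows. This is an illustration only, not cPanel's or rsync's actual implementation: `transfer_snapshot` and the day-named folders are hypothetical, local temporary directories stand in for the remote destination, and only a flat directory of files is handled.

```python
import filecmp
import os
import shutil
import tempfile


def transfer_snapshot(source_dir, remote_root, today, previous=None):
    """Fill remote_root/today from source_dir, hard-linking files that are
    unchanged in remote_root/previous and copying everything else."""
    dest_dir = os.path.join(remote_root, today)
    os.makedirs(dest_dir, exist_ok=True)
    actions = {}
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dest = os.path.join(dest_dir, name)
        prev = os.path.join(remote_root, previous, name) if previous else None
        if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
            os.link(prev, dest)      # present and unchanged remotely: hard link
            actions[name] = "link"
        else:
            shutil.copy2(src, dest)  # missing or changed remotely: real copy
            actions[name] = "copy"
    return actions


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source")
    remote = os.path.join(tmp, "remote")
    os.makedirs(source)
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(source, name), "w") as handle:
            handle.write("data for " + name)

    # Wednesday's folder never arrived, so Thursday finds nothing to link to.
    after_gap = transfer_snapshot(source, remote, "thu", previous="wed")
    # Friday can link against Thursday's complete copy.
    next_day = transfer_snapshot(source, remote, "fri", previous="thu")
    fri_files = sorted(os.listdir(os.path.join(remote, "fri")))

print(after_gap)
print(next_day)
```

Under these assumptions, the day after a missing snapshot ends up with full copies of every file, which matches the full-size backup observed after the remote server was wiped.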
 

sp3ctre69

Thanks Michael, I am getting a better understanding of the process now.

Where I am at present: 2 of my servers fail every night during the pruning process. The transport invokes pruning and fails almost exactly 3 minutes later with "ssh slave failed: timed out". The timeout on the cPanel side is 5 minutes, but I can't see anything on the SFTP server that would time out after 3 minutes. I can also SSH from the server to the SFTP server and keep the connection idle for a long time without it dropping. Any ideas much appreciated.
 

cPanelMichael

Hello,

To update, a couple of internal cases were opened as part of the support ticket.

Internal case CPANEL-15398 was opened to ensure that the operations for the rsync backend use the default timeout value of 300, so that destination servers with extremely slow disks (e.g. with disk caching disabled) can delete directories before timing out. I'll update this thread once the resolution is published.
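To illustrate why the timeout value matters for a slow delete, here is a minimal sketch, assuming a local child process that simply sleeps as a stand-in for a slow remote directory removal. `run_with_timeout` is a hypothetical helper, and the durations are scaled down for demonstration.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv; return True if it finished within the timeout, else False."""
    try:
        subprocess.run(argv, check=True, timeout=timeout_seconds)
        return True
    except subprocess.TimeoutExpired:
        return False


# Stand-in for a slow remote delete: a child process that sleeps for 1 second.
slow_delete = [sys.executable, "-c", "import time; time.sleep(1)"]

print(run_with_timeout(slow_delete, 0.2))  # too short for the slow operation
print(run_with_timeout(slow_delete, 5))    # generous enough to finish
```

One difference from what was reported in this thread: `subprocess.run` kills the local child when the timeout expires, whereas here the oldest folder was found pruned on the remote despite the reported timeout.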

Internal case CPANEL-15309 is open to address an issue where using a relative path (i.e. a path that does not start with a slash character) as the backup directory for an Rsync transport prevents incremental backups from hard-linking on the remote destination. I'll monitor this case and update this thread with the outcome. In the meantime, the workaround is to change the backup directory path for the rsync destination to an absolute path (e.g. /home/user/path/to/).
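As a small sketch of the workaround, the check for a leading slash and the conversion to an absolute path might look like this. `to_absolute` is a hypothetical helper, and it assumes that a relative path on the destination resolves against the remote user's home directory, which you should verify for your own destination.

```python
import posixpath


def to_absolute(backup_dir, remote_home):
    """Return backup_dir unchanged if it is absolute, else anchor it at remote_home."""
    if posixpath.isabs(backup_dir):
        return backup_dir
    return posixpath.join(remote_home, backup_dir)


# Hypothetical values; substitute the path configured for your destination.
print(to_absolute("backups/server1", "/home/user"))
print(to_absolute("/home/user/path/to/", "/home/user"))
```

The resulting absolute path is what would be entered as the backup directory for the rsync destination.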

Thank you.
 

cPanelMichael

Hello,

To update, the resolutions associated with internal cases CPANEL-15398 and CPANEL-15309 are scheduled for inclusion with cPanel version 68.

Thank you.
 

brt

Well-Known Member
Jul 9, 2015
104
10
68
US
cPanel Access Level
Root Administrator
This is a nuisance, with directories being abandoned on a daily basis. Any chance this can get bumped to a 66 update? (Specifically CPANEL-15398)
 

cPanelMichael

There are currently no plans to backport the patch to cPanel version 66. However, the following steps are available as a temporary workaround:

1. Open /usr/local/cpanel/Cpanel/Transport/Files/Rsync.pm via the command line in the text editor of your preference.

2. Locate the following line:

Code:
my $res = $self->{'rsync_obj'}->capture( { timeout => 10, tty => 1 }, "rm -rf $path" );
3. Modify the "10" entry in this line to "300":

Code:
my $res = $self->{'rsync_obj'}->capture( { timeout => 300, tty => 1 }, "rm -rf $path" );
4. Save the file.
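If you would rather script the change than edit by hand, the substitution in steps 2 and 3 can be sketched as below. This is an unofficial illustration: `raise_prune_timeout` is a hypothetical helper that works on a string holding the file's contents, it assumes the line matches the one quoted above, and you should back up Rsync.pm before writing anything to it.

```python
import re

# The line quoted in step 2, used here as sample input.
ORIGINAL = '''my $res = $self->{'rsync_obj'}->capture( { timeout => 10, tty => 1 }, "rm -rf $path" );'''

PRUNE_CALL = re.compile(
    r'(capture\(\s*\{\s*timeout\s*=>\s*)10'
    r'(\s*,\s*tty\s*=>\s*1\s*\},\s*"rm -rf \$path")'
)


def raise_prune_timeout(perl_source, new_timeout=300):
    """Return (patched_source, replacements) with the prune timeout raised."""
    return PRUNE_CALL.subn(
        lambda m: f"{m.group(1)}{new_timeout}{m.group(2)}", perl_source
    )


patched, count = raise_prune_timeout(ORIGINAL)
print(count)
print(patched)
```

A replacement count of 0 would mean the line was not found in that form, in which case the file should be left alone and inspected manually.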

Let us know if this helps.

Thank you.
 

brt

This hasn't helped at all. *No* backups on the remote server are deleted (and I'm obviously considering the retention settings -- they're obeyed properly on the primary server).

Backups complete just fine, but I have to manually delete previous backups every single time or they just stick around.