Final state is Backup Partial Failure error

Bretas

Active Member
PartnerNOC
cPanel Access Level: Root Administrator
Hi there!

Over the past couple of weeks one of our servers has been unable to successfully run its weekly backup even though nothing changed on our end. Our latest backup log isn't very helpful:

Code:
[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24463
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] info [backup] checking backup for user01
[2019-03-18 04:53:51 -0300] info [backup] Skipping suspended account user01.
[2019-03-18 04:53:51 -0300] info [backup] checking backup for user02
[2019-03-18 04:53:51 -0300] info [backup] Skipping suspended account user02.
[2019-03-18 04:53:51 -0300] info [backup] Queuing transport of meta file: /backup/weekly/2019-03-17/accounts/.master.meta
[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24464
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] info [backup] Queuing transport of file: /backup/weekly/2019-03-17/backup_incomplete
[2019-03-18 04:53:51 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24465
[2019-03-18 04:53:51 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:53:51 -0300] warn [backup] Pruning of backup files skipped due to errors. at /usr/local/cpanel/bin/backup line 394.
        bin::backup::run("bin::backup") called at /usr/local/cpanel/bin/backup line 122
[2019-03-18 04:54:12 -0300] info [backup] Scheduling backup metadata vacuum
[2019-03-18 04:54:12 -0300] info [backup] Queuing transport reporter
[2019-03-18 04:54:12 -0300] info [backup] no_transport = 0 .. and queueid = TQ:TaskQueue:24466
[2019-03-18 04:54:12 -0300] info [backup] leaving queue_backup_transport_item
[2019-03-18 04:54:12 -0300] info [backup] Completed at Mon Mar 18 04:54:12 2019
[2019-03-18 04:54:12 -0300] info [backup] Final state is Backup::PartialFailure (0)
[2019-03-18 04:54:12 -0300] info [backup] Sent Backup::PartialFailure notification.
However, last week's log does suggest something killed the process:

Code:
Skipping access-logs
Skipping .cpanel/caches
Skipping .cpanel/datastore
Skipping .cagefs
.........
.........
.........
.........
.........
.........
.........
.........
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(638) [generator=3.1.2]
rsync error: received SIGUSR1 (code 19) at main.c(1429) [receiver=3.1.2]
[2019-03-10 00:40:37 -0300] info [backup] Final state is Backup::Failure (HUP)
[2019-03-10 00:40:37 -0300] info [backup] Sent Backup::Failure notification.
Given how different the two log outputs are, I'm not sure whether they were caused by the same issue. Our dmesg output no longer reaches back to when this occurred, so I can't confirm whether the OOM killer was triggered in either case. Our memory usage graphs don't show anything out of the ordinary.
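For anyone wanting to rule the OOM killer in or out after dmesg has rolled over, here is a minimal sketch in Python. It assumes kernel messages are also being written to /var/log/messages (and its rotated copies), which depends on the syslog setup, and the search strings are only the usual OOM-killer wording, which varies between kernel versions. The function names are made up for illustration.

```python
import glob
import gzip
import re

# Phrases the kernel usually logs when the OOM killer acts.
# Assumption: exact wording differs between kernel versions.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_file(path):
    """Scan one log file; rotated copies ending in .gz are read via gzip."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return find_oom_lines(handle)


if __name__ == "__main__":
    # Assumed location of the current and rotated syslog files.
    for path in sorted(glob.glob("/var/log/messages*")):
        try:
            for hit in scan_file(path):
                print(f"{path}: {hit}")
        except OSError as exc:
            print(f"could not read {path}: {exc}")
```

No output only means none of these phrases were found in the files that still exist; it does not prove the process was not killed some other way.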

There's 1.6TB of free space available in the backup NFS mount and I can write data to it just fine.
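The write check can be repeated with a small probe like the one below. This is only a sketch: the default directory /backup is taken from the paths in the log above, the 1 MiB size is arbitrary, and `probe_write` is a hypothetical helper name. It writes a temporary file, forces it to storage, reads it back, and reports how long that took.

```python
import os
import sys
import tempfile
import time


def probe_write(directory, size_bytes=1024 * 1024):
    """Write, sync, and read back a temp file; return (matches, seconds)."""
    payload = os.urandom(size_bytes)
    start = time.monotonic()
    fd, path = tempfile.mkstemp(dir=directory, prefix=".write_probe_")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out to the mount
        with open(path, "rb") as handle:
            matches = handle.read() == payload
    finally:
        os.unlink(path)  # always clean up the probe file
    return matches, time.monotonic() - start


if __name__ == "__main__":
    # Assumption: /backup is the NFS mount point, as in the log paths above.
    target = sys.argv[1] if len(sys.argv) > 1 else "/backup"
    try:
        ok, seconds = probe_write(target)
        print(f"{target}: read-back ok={ok}, took {seconds:.2f}s")
    except OSError as exc:
        print(f"{target}: write probe failed: {exc}")
```

A small probe like this only shows the mount is writable right now; it says nothing about how the mount behaved during the backup window.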

Is there anything in cPanel v78 that causes its backup process to use more memory than it used to? The only recent change I can think of is cPanel's upgrade from v76 to v78.

Any assistance is greatly appreciated.

Thanks, everyone!
 

cPanelLauren

Product Owner
Staff member
I double-checked the release notes ("78 Release Notes - Version 78 Documentation - cPanel Documentation") and don't see anything that would cause the behavior you're seeing. In fact, these errors don't tell us a whole lot.

- Around the time of the errors in both instances, do you see anything noted in either of the following logs?

/var/log/messages
/usr/local/cpanel/logs/error_log
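
To pull out the entries around each failure from those two logs, something like the following Python sketch may help. The timestamps are the ones shown in the backup logs above. The line formats are assumptions: bracketed `[YYYY-MM-DD HH:MM:SS ...]` stamps as in the backup log, and classic syslog `Mon DD HH:MM:SS` stamps with no year. The helper names and the 10-minute window are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp layouts; adjust if the logs on the server differ.
BRACKETED = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
SYSLOG = re.compile(r"^([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2})")


def parse_timestamp(line, year):
    """Return the line's local timestamp, or None if none is recognised."""
    match = BRACKETED.match(line)
    if match:
        return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    match = SYSLOG.match(line)
    if match:
        text = f"{year} {match.group(1)} {match.group(2)} {match.group(3)}"
        try:
            return datetime.strptime(text, "%Y %b %d %H:%M:%S")
        except ValueError:
            return None
    return None


def lines_near(lines, when, minutes=10):
    """Keep timestamped lines within +/- minutes of `when`.

    Lines without a recognised timestamp (e.g. stack trace continuations)
    are skipped. Syslog lines carry no year, so `when.year` is assumed.
    """
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        stamp = parse_timestamp(line, when.year)
        if stamp is not None and abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    failures = [datetime(2019, 3, 10, 0, 40, 37), datetime(2019, 3, 18, 4, 53, 51)]
    for path in ("/var/log/messages", "/usr/local/cpanel/logs/error_log"):
        try:
            with open(path, errors="replace") as handle:
                content = handle.readlines()
        except OSError as exc:
            print(f"could not read {path}: {exc}")
            continue
        for when in failures:
            for hit in lines_near(content, when):
                print(f"{path}: {hit}")
```

If /var/log/messages has been rotated since the failure, the same function can be pointed at the rotated copy instead.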