Missing accounts in "List Accounts" are not being backed up - nearly cost me dear.

spaceman

I recently completed the transfer of 168 hosting accounts from two old servers into a new whizz-bang server.

I considered the list of accounts shown in WHM > List Accounts to be authoritative for the accounts that needed to be transferred to the new server from the old. THIS WAS MY FIRST MISTAKE.

So having successfully transferred all (or so I thought) hosting accounts, I instructed the data center to decommission the old servers. THIS WAS MY SECOND MISTAKE.

About 8 hours after the old servers were turned off, we got a call from one of our hosting clients wondering where their website had gone. Checking WHM > List Accounts on the new server, I quickly worked out that it wasn't there. The only possible explanation was that the account had not appeared in WHM > List Accounts on the old server where it used to be hosted, because I'd checked and re-checked that absolutely, definitely, all sites listed in WHM > List Accounts had been successfully transferred. I've actually had this issue a couple of times in the past, but (stupidly) forgot about even the remote possibility when planning my site transfer strategy.

"No worries" I thought. In my infinite wisdom I'd copied ALL (yes, definitely 'all' this time) the latest cPanel daily backups from both the old servers to the new. So even though I'd not copied the 'live' site across, at least it could be restored from the most recent daily backup file, right? WRONG. Well, about 95% wrong anyway. The backup was there, but when I restored it, the most recent date associated with any file in the backup was Sept 26th - nearly 2 months older than expected/hoped. So from this I can only assume that the same bug that stopped this particular (otherwise 100% fully functional) website from appearing in WHM > List Accounts *also* meant that it was not being picked up for backup along with all the other accounts that were being listed there.

BUGGER! (can I say that?).

About the only 'positive' to come out of this sorry tale of human and software error was that the client was surprisingly good humoured about this major stuff up, and reckoned it wouldn't take them too many hours to add the missing page content via their CMS.

So reviewing my mistakes...

MISTAKE #1 - assuming that WHM > List Accounts is authoritative for all active accounts hosted on my server. Well, call me old fashioned, but I think I have a right to assume this, so the failure was more WHM's than mine. I think I know the reason for this bug. I favour deleting DNS zone files where the zone file on my server is NOT authoritative for the domain, just to make absolutely sure that there's no misunderstanding between my server and the external authoritative DNS name servers. I don't think WHM fully knows how to deal with this (and it should), and I think that one manifestation of WHM's inability to manage this situation gracefully is that (sometimes - not always) the account with no matching local zone file gets removed from WHM > List Accounts *and* (I've just deduced) the same account also stops getting backed up.
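One way to avoid trusting a single list is to cross-check it against what is actually on disk. The following is only a rough sketch, not anything WHM provides: the function names are made up for illustration, and it assumes the server keeps a domain-to-user map in "domain: user" lines (on many cPanel servers this is /etc/trueuserdomains) and stores zone files as "<domain>.db" in a single directory (often /var/named). Both locations are assumptions to verify on your own server, and the listed user names are assumed to come from a manual export of WHM > List Accounts.

```python
from pathlib import Path


def parse_user_domains(text):
    """Parse 'domain: user' lines into a {domain: user} dict, skipping malformed lines."""
    mapping = {}
    for line in text.splitlines():
        domain, sep, user = line.partition(":")
        if sep and domain.strip() and user.strip():
            mapping[domain.strip()] = user.strip()
    return mapping


def unlisted_accounts(listed_users, user_domains):
    """Return users that own a domain on disk but are missing from the listed set."""
    return sorted(set(user_domains.values()) - set(listed_users))


def domains_without_zone(user_domains, zone_dir):
    """Return domains that have no '<domain>.db' file in zone_dir (assumed naming)."""
    zone_dir = Path(zone_dir)
    return sorted(
        domain for domain in user_domains
        if not (zone_dir / f"{domain}.db").is_file()
    )
```

Anything returned by unlisted_accounts is an account worth chasing before a migration, and domains_without_zone shows which accounts are candidates for the suspected "no local zone file" problem.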

MISTAKE #2 - fully decommissioning the servers straight away. What I should have done is leave the servers turned on for a few days - maybe even a week or two - before the final decommissioning, but with pretty much all services disabled. This would/should have brought any outstanding issues to my attention while retrieving the data from the old server was still easy. 'Course, if all the accounts had been listed in WHM > List Accounts AND were being backed up, then this would never have happened and I wouldn't be sharing this perhaps overly-cautious server decommissioning advice with others.
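A check for stale backups before switching anything off might also have caught this. The sketch below uses the same heuristic as in the story: it looks at the newest file timestamp inside each archive. It assumes the backups are per-account "*.tar.gz" files in one directory; the function names and the two-day threshold are illustrative only. A quiet site can legitimately contain only old files, so a hit is a prompt to investigate rather than proof of a broken backup.

```python
import tarfile
import time
from pathlib import Path


def newest_member_mtime(archive_path):
    """Return the latest modification time (epoch seconds) of any file in a tar archive."""
    with tarfile.open(archive_path) as archive:
        return max((member.mtime for member in archive), default=0)


def stale_backups(backup_dir, max_age_days=2, now=None):
    """Map archive name -> age in days for archives whose newest file is too old."""
    now = time.time() if now is None else now
    stale = {}
    for archive_path in sorted(Path(backup_dir).glob("*.tar.gz")):
        age_days = (now - newest_member_mtime(archive_path)) / 86400
        if age_days > max_age_days:
            stale[archive_path.name] = round(age_days, 1)
    return stale
```

Running stale_backups against the daily backup directory (for example /backup/cpbackup/daily, which is an assumed path and will differ between setups) on the old servers would list any archive that has quietly stopped being refreshed.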

Right, I'm done. Just wanted to get that off my chest. Feel free to laugh, comment, empathise - as you like ;)

http://bugzilla.cpanel.net/show_bug.cgi?id=4843
 