I have two WHM/cPanel servers, A and B. The accounts are kept in sync with rsync.
Primary nameserver is on A. Clustered nameserver is on B.
Secondary mailserver is on A. Primary mailserver is on B.
Websites are served from A.
Hence server A handles websites and B mostly handles email. Nameservice is shared. So if server B fails, no problem: server A will handle all the nameserver, website and mail traffic automatically.
But if server A fails, nameservice and email carry on fine, but there are no websites!
My thinking was to have a cron job on B that polls A, say every 5 minutes, and on the second consecutive failure does a global find and replace of server A's webserver IP with server B's in the named zone directory, then restarts BIND. B would then handle web traffic too, automatically. Maybe I could get clever and have it switch them back after two consecutive good polls. This, plus email notification, might ease the need for 24/7 monitoring (monitoring is easy; always having someone available to take action is harder).
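To make the idea concrete, here is a rough sketch of the logic in Python. It is only an illustration, not a tested failover tool: the IP addresses, the threshold constant and the function names (poll, swap_ip, next_state) are all made up for the example, and it assumes a plain TCP connect to port 80 is a good enough health check. It does not touch zone serial numbers or TTLs.

```python
import re
import socket

# Hypothetical addresses, for illustration only.
SERVER_A_IP = "192.0.2.10"
SERVER_B_IP = "192.0.2.20"
THRESHOLD = 2  # consecutive polls needed before switching either way


def poll(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def swap_ip(zone_text, old_ip, new_ip):
    """Replace whole occurrences of old_ip in zone file text.

    The lookarounds stop 192.0.2.10 from matching inside 192.0.2.100.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\.?\d)")
    return pattern.sub(new_ip, zone_text)


def next_state(state, poll_ok):
    """Advance the failover state by one poll result.

    state is (failed_over, streak), where streak counts consecutive polls
    that contradict the current mode. Returns (new_state, switch_now).
    """
    failed_over, streak = state
    # A is down while we serve from A, or A is up while we serve from B.
    contradicts = poll_ok == failed_over
    streak = streak + 1 if contradicts else 0
    if streak >= THRESHOLD:
        return (not failed_over, 0), True
    return (failed_over, streak), False
```

The cron job itself would call poll(SERVER_A_IP), feed the result to next_state, and keep the state in a small file between runs. When switch_now comes back True it would rewrite each zone file with swap_ip (A to B when failing over, B to A when switching back), reload BIND (for example with rndc reload) and send the notification email.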
Has this been done? The only references to failover I can find make it out to be a complex task handled at a premium. This solution seems rather more obvious and simple. There must be a catch?