Wrong site after transfer with cached IP address

LoadFactor

Well-Known Member
Jul 12, 2013
89
18
133
cPanel Access Level
Root Administrator
I've seen this problem more than once now. Here's the sequence:

1. Site X is on server A.
2. Request site X, so its address lands in the local DNS cache.
3. Transfer site X to server B in the same DNS cluster.
4. Remove site X from server A, retaining DNS.
5. Request site X, get site Y from server A.

Restarting Apache has no effect.

Site Y is unpredictable but consistent across multiple transfers. This is obviously a short-term problem while the DNS updates, but it's still seriously undesirable, especially to the person who owns the site!

Other than leaving a placeholder on "server A" until DNS has updated, anyone know a way to prevent this?
 

LucasRolff

Well-Known Member
Community Guide Contributor
May 27, 2013
142
95
78
cPanel Access Level
Root Administrator
Sure, move the IP from the old server to the new server :)

You can’t fix DNS propagation. As much as we want the world to follow the defined TTL of our DNS records, there will always be someone who likes to break the rules. That someone tends to be major ISPs :)

Now, technically, if you migrate things in the evening, it’s usually OK when people wake up in the morning.

So, a few possibilities:
- move the IP
- migrate during evening/night

Another method is to forward old traffic to the new server using iptables :)
 

LoadFactor

Moving the IP to the new server would strand all the sites remaining on the old server. Same with an iptables forward, so that would be a much larger problem. :D

FYI this is an Apache configuration question, not a network question. I really don't think requests for a site should resolve to some semi-random domain on the old server while the DNS updates. :D

Usually we just leave a copy on the old server until DNS updates, but for a transactional site that doesn't work. We could have the old server remote SQL to the new server, but that would require punching a hole in the firewall and even then that's a kludge not a fix.

Sure we could go into the account that's getting all these requests and add a rule in .htaccess that says "if the requested host doesn't match the host that this account covers, toss a 403", but IMO that account shouldn't be getting the request in the first place. It should be falling back to the default webpage.

The question really is "why the heck isn't this coming up with the default webpage and how do we fix it."
 

LoadFactor

Unfortunately if the servers are in the same DNS cluster, then the A record has to refer to the new server.

I must be missing something... I can't see how it's propagation related.

- A request for some-random-site.com that resolves to a cPanel server gets the default webpage.
- If I edit my local hosts file to force some-site-in-my-dns-cluster.com to a server that has never hosted the site, I get a default webpage.

It's only the case where the site was recently on the old server that the request returns a page from some-unrelated-customer.com (and a certificate error for HTTPS requests). It seems to me that something local to the server is cached, which is why I tried an Apache restart.

For me the expected result is that the request return the default webpage, not that of a random customer. I'm open to explanations as to why that expectation is incorrect...
 

cPanelLauren

Product Owner II
Staff member
Nov 14, 2017
13,266
1,301
363
Houston
LoadFactor said: "Unfortunately if the servers are in the same DNS cluster, then the A record has to refer to the new server."
Whether the servers were clustered or not, this would be true. Unfortunately, their being in the same DNS cluster makes this option unusable.

LoadFactor said: "A request for some-random-site.com that resolves to a cPanel server gets the default webpage."
Being clustered does not make the servers immune to propagation; any time you change a site's IP address, you're subject to it. Clustering does not change how propagation behaves across the internet: the nameservers may still be the same, but the IP address the domain resolves to is not, and that change needs time to reach more places than just your server. You're getting the default page because, when you request the site (due to some potential DNS caching and propagation), you're still being sent to the old IP address, and because the site isn't present on that IP you get the default page. It's as simple as that.


LoadFactor said: "If I edit my local hosts file to force some-site-in-my-dns-cluster.com to a server that has never hosted the site, I get a default webpage."
I'm not sure I'm following here, how is this relevant?

LoadFactor said: "It's only the case where the site was recently on the old server that the request returns a page from some-unrelated-customer.com (and a certificate error for HTTPS requests). It seems to me that something local to the server is cached, which is why I tried an Apache restart.

For me the expected result is that the request return the default webpage, not that of a random customer. I'm open to explanations as to why that expectation is incorrect..."
HTTPS requests are a bit different. Apache's default behavior for an HTTPS request, when no VirtualHost for the requested name is present in the configuration, is to load the first SSL VirtualHost in the configuration that uses that IP, which is why you might get a seemingly "random" site.
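
To make that fallback concrete, here is a minimal Python sketch of the selection logic. It is a simplified model for illustration only, not Apache's actual code: the `VHost` class, the `select_vhost` function, the 192.0.2.10 address, and the `.example` hostnames are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VHost:
    address: str          # "ip:port" the VirtualHost is bound to (hypothetical model)
    server_name: str      # stands in for ServerName
    aliases: tuple = ()   # stands in for ServerAlias entries


def select_vhost(vhosts, address, requested_host):
    """Pick a vhost roughly the way name-based virtual hosting does.

    1. Narrow to vhosts bound to the address the connection arrived on.
    2. Among those, return the first whose name or alias matches.
    3. If none matches, fall back to the FIRST vhost listed for that address.
    """
    candidates = [v for v in vhosts if v.address == address]
    if not candidates:
        return None
    wanted = requested_host.lower()
    for vhost in candidates:
        names = (vhost.server_name,) + tuple(vhost.aliases)
        if wanted in (name.lower() for name in names):
            return vhost
    # No name matched: the first vhost defined for this address wins.
    return candidates[0]


# Old server after the migrated account was removed (all names hypothetical).
old_server = [
    VHost("192.0.2.10:443", "unrelated-blog.example"),
    VHost("192.0.2.10:443", "shop.example", ("www.shop.example",)),
]

# A request for a site that is still hosted matches by name.
still_hosted = select_vhost(old_server, "192.0.2.10:443", "www.shop.example")

# A request for the removed site matches nothing, so the first vhost answers.
removed_site = select_vhost(old_server, "192.0.2.10:443", "moved-site.example")
```

In this model `removed_site` is the `unrelated-blog.example` entry simply because it is listed first. By the same logic, whatever vhost is listed first for that address acts as the catch-all for unmatched names.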
 

LoadFactor

cPanelLauren said: "I'm not sure I'm following here, how is this relevant?"
I'm attempting to illustrate the point that I seem to be unable to make clearly. I will try again...

This isn't about propagation itself, it's about what happens during propagation under a specific set of conditions. It seems to me that a request for a site that
  • is no longer on a server, or that has never been on that server,
  • independent of whether or not it is in the DNS, clustered or otherwise, or
  • basically under any other circumstances whatsoever
should result in the default webpage, not that of another (indeterminate) site hosted on the server.

Instead, I moved the site for a large supplier of telecom equipment, with an 80GB disk footprint, from one server to another, then immediately removed it from the old server (via terminate account, keep DNS zones). For the period of time that the domain was still resolving to the old server, we were serving up a counterculture blog instead of the default webpage. Unsurprisingly, my client found this to be unacceptable. At least the SSL error alerted visitors that what they asked for isn't what they were getting. Worse yet, I couldn't create an emergency maintenance page on the old server, because it was still in the DNS cluster.

Obviously if something like this comes up again, I'll put the site on the old server into maintenance mode instead of terminating it and wait for propagation before removing it, but that still doesn't mean that serving up someone else's site is what I would consider the correct response, under any circumstances.
 

cPanelLauren

The resolution to this in the future would be that you should absolutely not remove a site from the server until propagation is complete if you don't want to experience downtime.
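
As a rough way to check whether a domain still resolves to the old server from a given machine, something like the following Python sketch could be used. It only reflects the resolver of the machine it runs on, so it cannot prove that every ISP cache has caught up. The function names and the 192.0.2.x addresses are hypothetical.

```python
import socket


def resolved_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted set of IPv4 addresses the local resolver gives for hostname."""
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def still_points_to_old_server(hostname, old_ip, resolver=socket.getaddrinfo):
    """True if this machine's resolver still hands out the old server's IP."""
    return old_ip in resolved_ipv4(hostname, resolver)


# Example call (hypothetical domain and old-server IP):
#   still_points_to_old_server("moved-site.example", "192.0.2.10")
```

The `resolver` parameter exists only so the lookup can be swapped out for testing; by default the system resolver is used.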

LoadFactor said: "but that still doesn't mean that serving up someone else's site is what I would consider the correct response, under any circumstances."
Had the domain not been removed, this wouldn't have occurred. Instead, the domain was removed, meaning its SSL VirtualHost no longer existed on that IP address on that server. Understanding how Apache processes these requests should help keep something of this nature from occurring in the future.


I'd suggest fully understanding the implications of a server move prior to making a migration, especially in sensitive circumstances. The following may be helpful to explain how SSL requests work with Apache:

NameBasedSSLVHostsWithSNI - HTTPD - Apache Software Foundation
Specifically:
Detailed Processing
Before there is even an SSL handshake, Apache finds the best match for the IP address and TCP port the connection is established on (IP-based virtual hosting)

If there is a NameVirtualHost directive that has the same literal arguments as this best-matching VirtualHost, Apache will instead consider ALL VirtualHost entries with identical arguments to the matched VirtualHost. Otherwise, SNI processing has no selection to perform.

If the client sends a hostname along with its TLS handshake request, Apache will compare this TLS hostname to the ServerName/ServerAlias of the candidate VirtualHost set determined in the preceding steps.

Whichever VirtualHost is selected on the preceding basis will have its SSL configuration used to continue the handshake. Notably, the contents of the certificates are not used in any comparison.

This process mimics the normal (albeit misunderstood) consecutive application of IP-based, then name-based, vhost matching algorithms used with HTTP, except that the input is the TLS data and not an HTTP header.

Standard practice (especially in environments like shared hosting) when you're moving between servers within the same DNS cluster is to do something akin to the following:

- Perform initial migration of site data
- Check for issues using a local hosts file
- Update DNS to point to the new server
- If you choose not to put the site on the old server into maintenance mode until propagation is complete, and you have sensitive database transactions (like purchases, signups, etc.), you may want to perform an additional manual transfer of information (such as rsync for data like mail). This is a fairly common practice among web hosting providers who frequently migrate accounts between shared servers that use the same nameservers.
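
As an alternative to editing a hosts file for the "check for issues" step, a plain HTTP request can be sent straight to a chosen server's IP while naming the site in the `Host` header. The Python sketch below shows the idea; `fetch_from_server` is a hypothetical helper and the address in the example is a placeholder. It covers plain HTTP only, since an HTTPS check would also need the hostname sent via SNI, which this sketch does not attempt.

```python
import http.client


def fetch_from_server(ip, hostname, path="/", port=80, timeout=10):
    """Request `path` from the server at `ip`, asking for the site `hostname`.

    Connecting by IP bypasses DNS entirely; the explicit Host header tells the
    web server which name-based virtual host is being requested.
    """
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        body = response.read()
        return response.status, body
    finally:
        conn.close()


# Example call (hypothetical new-server IP and domain):
#   status, body = fetch_from_server("192.0.2.20", "moved-site.example")
# Comparing the result against the same request sent to the old server's IP
# shows what each server would serve for that name before DNS is switched.
```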