Googlebot is putting a heavy load on one of my servers by hitting some strange URLs for one particular customer (a poorly written shopping cart). A new cart is being investigated; meanwhile, I thought we could simply catch these bad URLs and redirect them to the home page or something.
However, the "gotcha" is that these are HTTPS URLs, and as far as I can tell %{REQUEST_URI} does not work for me on HTTPS.
For example, here's a bad URL it's trying to hit:
https://www.example.com/cart/https://www.example.com/cart/checkout/selectAddressshop/Blow-Out-Deal!-Extra-Loud-Alarm-Clock-with-Green-LED-3-for-19-99-Shipped.207Acer-KG-UXH1P-Dual-Band-VHF-Plus-200-MHZ-Handheld-220-Special!-129-95-Shipped-With-Programming-Cable-and-Software!.137shop/Accessories.23YT34010X3-SMA-FEMALE-to-UHF-female-Fits-Sony-and-more.221acer.info.htmlorder?returnPath=
If this wasn't HTTPS, I'd do something like:
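RewriteEngine on
# Match the host with or without "www."; dots are escaped so they match literally
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# Only requests whose path starts with /cart/https
RewriteCond %{REQUEST_URI} ^/cart/https
# Send them to the home page rather than re-appending the bogus path
RewriteRule ^ http://www.example.com/ [R=301,L]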
The syntax may not be right, but what I'm trying to say is: if anyone tries to go to a URL that starts with /cart/https, that is bogus, so redirect them.
But %{REQUEST_URI} doesn't seem to work with HTTPS.
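To sanity-check the pattern itself outside Apache, here is a rough Python sketch. It only tests the same ^/cart/https prefix against a URL's path; it says nothing about how mod_rewrite behaves on the HTTPS side. The helper name is_bogus_cart_url is made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Same prefix pattern as the RewriteCond on %{REQUEST_URI}.
BOGUS_CART = re.compile(r"^/cart/https")


def is_bogus_cart_url(url: str) -> bool:
    """Return True when the URL's path starts with /cart/https (hypothetical helper)."""
    path = urlsplit(url).path  # path only: no scheme, host, or query string
    return BOGUS_CART.match(path) is not None


if __name__ == "__main__":
    bad = "https://www.example.com/cart/https://www.example.com/cart/checkout/selectAddress"
    good = "https://www.example.com/cart/checkout/selectAddress"
    print(is_bogus_cart_url(bad), is_bogus_cart_url(good))
```

The bad URL should be flagged and the normal checkout URL should not, which at least suggests the pattern is right and the problem lies elsewhere.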
Any ideas, either to solve this, or where to go for a "consultant" to help figure out a workaround?
- Scott