  1. #1
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Prevent cPanel login pages from being indexed in Google

    Sorry if this may have already been asked and answered, but I couldn't find anything about it.

    Recently I noticed that Google has somehow managed to find and index some cPanel login pages. I find them when I run a site: query in Google for some of my domains.

    In fact, you can find loads of them with a simple search:
    "Click Here to load cPanel" - Google Search

    I notice that such a page does not have a robots noindex meta tag, and there is no robots.txt file at http://www.example.com:2082/robots.txt.

    I can't seem to do anything about this myself. My own website's .htaccess file does not affect requests on port 2082 in any way, so I have no means to block such a URL from being crawled and indexed by search engines.

    I feel it's also a security risk, even though a password is of course required to log in. Still, I don't appreciate my cPanel login page being indexed for all to see when searching my site.


    Is there any solution to this that I may have missed?

    Thanks for any tips.

  2. #2
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    Do you happen to have root access to the server on which your website resides? If so you can SSH in and add a robots.txt file to

    /usr/local/cpanel/base/unprotected/

    Let me know if that resolves the issue.
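
    For anyone with root access, the step above can be sketched in Python. This is a minimal sketch, not cPanel's own tooling: the deny-all rules and the helper name write_robots_txt are assumptions, and only the directory comes from the post above.

```python
from pathlib import Path

# Directory named in the post above; adjust it if your installation differs.
CPANEL_UNPROTECTED_DIR = Path("/usr/local/cpanel/base/unprotected")

# Assumed content: a blanket rule asking every crawler to stay off every path.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def write_robots_txt(directory: Path = CPANEL_UNPROTECTED_DIR) -> Path:
    """Write a deny-all robots.txt into `directory` and return its path."""
    target = Path(directory) / "robots.txt"
    target.write_text(ROBOTS_TXT, encoding="ascii")
    return target
```

    Called with no arguments (as root), write_robots_txt() targets the cPanel directory. Passing a scratch directory instead gives a harmless dry run.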

  3. #3
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Default

    Thanks David. Personally I cannot do it, but I know somebody who could.

    However, the robots.txt file should be in the root folder and accessible at http://www.example.com:2082/robots.txt. It serves no purpose at all if it's in the folder /unprotected/, unless /usr/local/cpanel/base/unprotected/ actually corresponds to http://www.example.com:2082/.

    Also, robots noindex meta tags ought to be added to the head section of all those pages.

    <meta name="robots" content="noindex,nofollow,noarchive">
    That should take care of everything.

    I think this is something only you guys can (dare I say should?) do.

    Thanks for your help.
    Last edited by webado; 07-07-2009 at 02:34 PM.

  4. #4
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    Actually, with the way our login themes work, the files in unprotected/ are the root directory of the cPanel and WHM ports.

    My problem is that I am unable to replicate the issue you are experiencing. None of the sites I have access to have their cPanel/WHM/Webmail interfaces indexed by Google. In principle, placing a robots.txt in the cPanel, WHM and Webmail interfaces should stop the spidering and remove that content, but it's best to confirm this before I submit a proposal to our developers.
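
    One way to confirm what a given robots.txt tells a crawler is Python's standard urllib.robotparser module. The sketch below is only an illustration: the deny-all rules, the example.com URL and the helper name is_crawl_allowed are assumptions, not the file cPanel actually ships.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical deny-all rules, as a robots.txt on a cPanel port might contain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in `robots_txt` let `user_agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

    With the rules above, is_crawl_allowed(ROBOTS_TXT, "http://www.example.com:2082/") returns False, while an empty robots.txt allows everything. To test a live server, RobotFileParser can also be given the robots.txt URL and asked to read() it.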

  5. #5
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Default

    That's why I gave this search: "Click Here to load cPanel" - Google Search

    You will see a ton are indexed.

    You can find an instance from one of my sites near the bottom of these search results: site:melinas-music.com - Google Search

    Many other sites too.

  6. #6
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Default

    Not sure what happened to my earlier post - it seemed to go into moderation.

    URLs like the ones I described are most definitely getting indexed in Google.

    I found at least one for one of my websites. How Google found such a link is a little mystery in itself, because I have never posted any such URL. But it did find it, and indexed it.

    You can see one such URL indexed if you look at the search results for this site query:

    site:melinas-music.com - Google Search

    It's the last one before the omitted results.

  7. #7
    Staff Member cpanelben's Avatar
    Join Date
    Feb 2004
    Location
    Houston, Texas USA
    Posts
    598
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    I've committed a change to prevent the indexing. After some review, this change will propagate to all versions.

  8. #8
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Default

    Quote Originally Posted by cpanelben View Post
    I've committed a change to prevent the indexing. After some review, this change will propagate to all versions.
    Thank you so very much

    This will be a fantastic help.

  9. #9
    Registered User
    Join Date
    Jun 2006
    Posts
    3

    This problem still exists!


  10. #10
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    Quote Originally Posted by g-force View Post
    It seems search engines are ignoring the robots.txt files that should be preventing this. I recommend clicking the "Bugs" link on the top-right corner of this page so we can take a closer look at this issue. Feel welcome to PM me the ticket number assigned to your report.

  11. #11
    Member
    Join Date
    Nov 2006
    Location
    Montreal, Canada
    Posts
    6

    Default

    The robots.txt file is not generally accessible at the root URL for those ports.

    In any case, the robots.txt file ONLY specifies what is not to be crawled, not what must not appear in the index. So even if the pages are disallowed in robots.txt (assuming a robots.txt file is available there), the bare URLs will still appear in the index.

    What is needed is a robots noindex meta tag on those pages, instead of disallowing them in robots.txt.
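
    Whether a page already carries such a tag can be checked with Python's standard html.parser module. This is a rough sketch: the class name RobotsMetaFinder and the helper has_noindex are made up for illustration, and the sample markup is not taken from a real cPanel page.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

    A page containing <meta name="robots" content="noindex,nofollow,noarchive"> makes has_noindex return True, and a page without any robots meta tag returns False.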

  12. #12
    Registered User
    Join Date
    Jun 2006
    Posts
    3

    Default

    Yep, webado is right; it looks like the cPanel team didn't do their SEO homework.

    As previously mentioned, you should use the robots meta tag.

    See these well-known, publicly documented details (by that I mean these are not some kind of "SEO guru secrets"):
    Block or remove pages using a robots.txt file - Webmaster Tools Help
    Using meta tags to block access to your site - Webmaster Tools Help

  13. #13
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    Quote Originally Posted by webado View Post
    The robots.txt file is not generally accessible at the root url for those ports.
    Actually, I double-checked on some of the URLs given in the above Google queries and http://SERVER:20xx/robots.txt was accessible. However, as you go on to state, the meta tags that would entirely prevent indexing are not currently present (Confirmed in 11.27.3).

    Quote Originally Posted by webado View Post
    In any case, the robots.txt file ONLY specifies what is not to be crawled, not what must not appear in the index. So even if the pages are disallowed in robots.txt (assuming a robots.txt file is available there), the bare URLs will still appear in the index.

    What is needed is a robots noindex meta tag on those pages, instead of disallowing them in robots.txt.
    I'm moving this thread to the Feature Request section so this thread can be more readily updated as this is addressed.

  14. #14
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Request for Comments

    Below is the report I plan to file with our developers regarding this. Please confirm this reflects the desired course of action:

    Prevent cPanel, WHM and Webmail pages from being indexed by search engines

    Currently, cPanel/WHM serves a robots.txt, accessible to search engines, that tells them not to crawl any pages. However, this is ineffective at preventing the login pages from appearing in search results, as the following search queries demonstrate:

    - http://www.google.com/search?q=inurl...095%22+webmail
    - http://www.google.com/search?q=inurl...096%22+webmail
    - http://www.google.com/search?q=inurl...2082%22+cpanel
    - http://www.google.com/search?q=inurl...2083%22+cpanel
    - http://www.google.com/search?q=inurl:%22com:2086%22+whm
    - http://www.google.com/search?q=inurl:%22com:2087%22+whm

    This is because "Disallow" in robots.txt is different from "noindex", which should be (but is not) present in a meta tag on the index page. While "Disallow" prevents crawling of the pages, it does not prevent indexing (the appearance of that page in search engine results). As a result, current measures are insufficient to keep the login pages out of search engines.

    To prevent the login pages from appearing in popular search engines, the following code should be added to the index files the search engines see:

    HTML Code:
    <meta name="robots" content="noindex">
    Furthermore, this meta tag will prevent links to the login pages from causing the login page to appear in search engine results.
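
    As an illustration of the proposed change, the sketch below adds the tag to a page's head when it is missing. It is only a sketch under stated assumptions: the helper name add_noindex_meta and the regex-based approach are not how cPanel's login templates are actually built, and the sample page is hypothetical.

```python
import re

# The tag proposed in the report above.
NOINDEX_META = '<meta name="robots" content="noindex">'

_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
_ROBOTS_META = re.compile(r"<meta\b[^>]*\bname\s*=\s*[\"']robots[\"']", re.IGNORECASE)


def add_noindex_meta(html: str) -> str:
    """Return `html` with the noindex meta tag placed right after <head>.

    Pages that already carry a robots meta tag, or that have no <head>,
    are returned unchanged.
    """
    if _ROBOTS_META.search(html):
        return html
    match = _HEAD_OPEN.search(html)
    if match is None:
        return html
    return html[: match.end()] + "\n" + NOINDEX_META + html[match.end():]
```

    Running the function twice on the same page adds the tag only once, so it is safe to apply to templates that may already have been updated.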

    Related Sources:
    - The Web Robots Pages
    - Using meta tags to block access to your site - Webmaster Tools Help
    - http://blog.beacontechnologies.com/r...he-difference/

  15. #15
    Technical Product Specialist cPanelDavidG's Avatar
    Join Date
    Nov 2006
    Location
    Houston, TX
    Posts
    11,189
    cPanel/Enkompass Access Level

    Root Administrator

    Default

    I have filed the above report with our developers and will post any news regarding this request in this thread.
