The Community Forums


I need to add RewriteCond... to httpd.conf to block bots.

Discussion in 'Security' started by jols, Mar 27, 2012.

  1. jols

    jols Well-Known Member

    Joined:
    Mar 13, 2004
    Messages:
    1,111
    Likes Received:
    2
    Trophy Points:
    38
    Hi. I've struggled for years to block the MJ12 bot with various mod_security rules, but nothing seems to work, and we just had another server's access all but killed by the MJ12 bot.

    Okay, so now I would like to try to apply the following in the httpd.conf file:

    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} (crawler|Ezooms|MJ12|Nutch|Sogou|spider|Yandex) [NC]
    RewriteRule .* - [F]

    But where and how do I apply this so that it affects every account on the server, without being overwritten or removed at some point by the system software?

    Anyone? Greatly appreciate any response on this one. Thanks.
     
  2. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Joined:
    Oct 2, 2010
    Messages:
    7,623
    Likes Received:
    21
    Trophy Points:
    38
    Location:
    somewhere over the rainbow
    cPanel Access Level:
    Root Administrator
    Actually, if you place it in /home/.htaccess, the rewrite rule should affect every account on the machine, because /home is read before /home/username when these rewrites are parsed.
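
    A minimal sketch of what that /home/.htaccess might contain, reusing the rule from the first post. It assumes mod_rewrite is loaded and that the AllowOverride setting covering /home permits FileInfo; neither is confirmed in this thread. Per the mod_rewrite documentation, a per-account .htaccess that configures its own rewrite rules replaces the parent's rules unless it sets RewriteOptions Inherit, so accounts with their own rewrites may bypass this block.

    ```apache
    # /home/.htaccess (sketch; only honored if AllowOverride allows FileInfo here)
    <IfModule mod_rewrite.c>
        RewriteEngine On
        # Case-insensitive substring match on the User-Agent header
        RewriteCond %{HTTP_USER_AGENT} (crawler|Ezooms|MJ12|Nutch|Sogou|spider|Yandex) [NC]
        # Leave the URL unchanged and answer 403 Forbidden
        RewriteRule .* - [F]
    </IfModule>
    ```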
     
  3. jols

    Ha! (He slaps forehead.) That just makes too much sense.

    Thanks for pointing that out. Much appreciated.
     
  4. jols

    Well, that would have been nice had it worked, but regrettably the bots in my user agent list still have free and wild access to every account on the server.

    Does anyone have any idea how exactly I could block agents like the following from hitting all the accounts on the server?

    R6_CommentReader(Radian6 Crawler FAQ)
    BoardReader/1.0 (http://boardreader.com/info/robots.htm)-CommentCrawler5
    FeedMyInbox/2.0 (http://www.FeedMyInbox.com)
    UniversalFeedParser/5.0.1 +http://feedparser.org/
    RockMeltService
    Reeder/1.5.5 CFNetwork/548.1.4 Darwin/11.0.0
    aggregator:Spinn3r (Spinn3r 3.1)
    MJ12bot/v1.4.2; http://www.majestic12.co.uk/bot.php?+)
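
    One way agents like these could be matched server-wide is mod_setenvif plus an access rule inside a <Directory> block in the main server configuration. Unlike server-level rewrite rules, which virtual hosts do not inherit by default, such a block is merged into every virtual host. The sketch below uses Apache 2.2 syntax to match the era of this thread. The include file named in the comment is an assumption about the cPanel layout and should be checked on the actual server, and the patterns are shortened substrings of the agents listed above.

    ```apache
    # Assumed location: a cPanel-managed include such as
    # /usr/local/apache/conf/includes/pre_virtualhost_global.conf (verify before use)
    <Directory "/home">
        # Flag requests whose User-Agent contains one of these substrings
        SetEnvIfNoCase User-Agent "R6_CommentReader"    bad_bot
        SetEnvIfNoCase User-Agent "BoardReader"         bad_bot
        SetEnvIfNoCase User-Agent "FeedMyInbox"         bad_bot
        SetEnvIfNoCase User-Agent "UniversalFeedParser" bad_bot
        SetEnvIfNoCase User-Agent "RockMeltService"     bad_bot
        SetEnvIfNoCase User-Agent "Reeder/"             bad_bot
        SetEnvIfNoCase User-Agent "Spinn3r"             bad_bot
        SetEnvIfNoCase User-Agent "MJ12bot"             bad_bot

        # Allow everyone, then deny flagged requests (Deny is evaluated last)
        Order Allow,Deny
        Allow from all
        Deny from env=bad_bot
    </Directory>
    ```

    An account's own .htaccess can still override this if it sets its own Order/Allow/Deny directives and AllowOverride permits Limit.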
     
  5. tvcnet

    tvcnet Well-Known Member
    PartnerNOC

    Joined:
    Aug 15, 2003
    Messages:
    116
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    San Diego
    cPanel Access Level:
    DataCenter Provider
    Interested as well.

    I generally recommend the rules below to clients to help reduce unwanted bandwidth usage and processes. I've had dedicated-server clients who wanted to block server-wide as well, but I've never really tested this as a /home/* solution.

    # Begin Bad Bot Blocking
    BrowserMatchNoCase OmniExplorer_Bot/6.11.1 bad_bot
    BrowserMatchNoCase omniexplorer_bot bad_bot
    BrowserMatchNoCase Baiduspider bad_bot
    BrowserMatchNoCase Baiduspider/2.0 bad_bot
    BrowserMatchNoCase yandex bad_bot
    BrowserMatchNoCase yandeximages bad_bot
    BrowserMatchNoCase Spinn3r bad_bot
    BrowserMatchNoCase sogou bad_bot
    BrowserMatchNoCase Sogouwebspider/3.0 bad_bot
    BrowserMatchNoCase Sogouwebspider/4.0 bad_bot
    BrowserMatchNoCase sosospider+ bad_bot
    BrowserMatchNoCase jikespider bad_bot
    BrowserMatchNoCase ia_archiver bad_bot
    BrowserMatchNoCase PaperLiBot bad_bot
    BrowserMatchNoCase ahrefsbot bad_bot
    BrowserMatchNoCase ahrefsbot/1.0 bad_bot
    BrowserMatchNoCase SiteBot/0.1 bad_bot
    BrowserMatchNoCase DNS-Digger/1.0 bad_bot
    BrowserMatchNoCase DNS-Digger-Explorer/1.0 bad_bot
    BrowserMatchNoCase boardreader bad_bot
    BrowserMatchNoCase radian6 bad_bot
    BrowserMatchNoCase R6_FeedFetcher bad_bot
    BrowserMatchNoCase R6_CommentReader bad_bot
    BrowserMatchNoCase ScoutJet bad_bot
    BrowserMatchNoCase ezooms bad_bot
    BrowserMatchNoCase CC-rget/5.818 bad_bot
    BrowserMatchNoCase libwww-perl/5.813 bad_bot
    BrowserMatchNoCase magpie-crawler bad_bot
    BrowserMatchNoCase jakarta bad_bot
    BrowserMatchNoCase discobot/1.0 bad_bot
    BrowserMatchNoCase MJ12bot bad_bot
    BrowserMatchNoCase MJ12bot/v1.2.0 bad_bot
    BrowserMatchNoCase MJ12bot/v1.2.5 bad_bot
    BrowserMatchNoCase SemrushBot/0.9 bad_bot
    BrowserMatchNoCase MLBot bad_bot
    BrowserMatchNoCase butterfly bad_bot
    BrowserMatchNoCase SeznamBot/3.0 bad_bot
    BrowserMatchNoCase HuaweiSymantecSpider bad_bot
    BrowserMatchNoCase Exabot/2.0 bad_bot
    BrowserMatchNoCase netseer/0.1 bad_bot
    BrowserMatchNoCase "NetSeer crawler/2.0" bad_bot
    BrowserMatchNoCase NetSeer/Nutch-0.9 bad_bot
    BrowserMatchNoCase psbot/0.1 bad_bot
    BrowserMatchNoCase moreoverbot/5.0 bad_bot
    BrowserMatchNoCase "Jakarta Commons-HttpClient/3.0" bad_bot
    BrowserMatchNoCase SocialSpider-Finder/0.2 bad_bot

    Order Deny,Allow
    Deny from env=bad_bot
    # End Bad Bot Blocking
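
    On Apache 2.4, Order and Deny are only available through mod_access_compat. Below is a hedged sketch of the same idea using mod_authz_core instead. The three patterns are only a sample from the list above. Because each pattern is an unanchored regular expression, a version-less entry such as MJ12bot already matches MJ12bot/v1.2.0 and any later version, so the version-specific lines are not strictly needed.

    ```apache
    # Sample patterns only; extend with the rest of the list as required
    BrowserMatchNoCase "MJ12bot"   bad_bot
    BrowserMatchNoCase "ahrefsbot" bad_bot
    BrowserMatchNoCase "Sogou"     bad_bot

    # Apache 2.4 counterpart of "Order Deny,Allow" + "Deny from env=bad_bot"
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
    ```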



    Soapbox: I've always wondered why cPanel doesn't come up with a better bot blocking/allow management system. It seems like a perfect and insanely sought-after "plugin" project. I mean, this discussion is literally a decade old. I know 'cause I started a similar thread back in 20 aught 1.
     
    #5 tvcnet, Apr 1, 2012
    Last edited: Apr 1, 2012