Very slow smtp response from one service provider

Discussion in 'E-mail Discussions' started by Lyttek, Oct 13, 2014.

  1. Lyttek

    Lyttek Well-Known Member

    Joined:
    Jan 2, 2004
    Messages:
    770
    Likes Received:
    3
    Trophy Points:
    18
    Weird issue this morning:

    Two different clients in the same geographic area using the same service provider (Time Warner) both reported the same problem with email... they can receive, but not send. They are not using their ISP's mail server for SMTP.

    Testing via telnet shows that they can connect to other SMTP servers normally (smtp.gmail.com, for instance), but connections to their own email server time out in Outlook. With telnet, the initial banner takes nearly 45-55 seconds to appear, whether connecting by DNS name or by IP.

    Once the initial banner did appear, subsequent commands were processed in realtime with no delay.
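
    For anyone who wants to repeat the telnet test with a stopwatch built in, here is a minimal sketch that times how long the greeting banner takes to arrive. The function name time_to_banner and the host in the usage comment are made up for illustration; the 120-second default timeout is just a guess at "longer than the delay seen here".

```python
import socket
import time


def time_to_banner(host, port=25, timeout=120.0):
    """Connect to an SMTP port and return (seconds_waited, first_banner_text)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # Keep reading until a complete line has arrived or the peer closes.
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    elapsed = time.monotonic() - start
    return elapsed, data.decode("ascii", errors="replace").strip()


# Usage (hypothetical host name, run from an affected client):
#   secs, banner = time_to_banner("mail.example.com", 587)
#   print(f"{secs:.1f}s  {banner}")
```

    Running it from both an affected and an unaffected connection should make the difference obvious without having to watch a clock.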

    CSF firewall and related scripts are installed. I'm not seeing anything in maillog; exim_mainlog shows the initial SMTP connection at the same moment I initiate the test, then nothing else.

    There are no blocks, temporary or permanent, that I can find.
     
  2. Lyttek

    Lyttek Well-Known Member

    btw, rfc1413_query_timeout = 0
     
  3. cPanelMichael

    cPanelMichael Forums Analyst
    Staff Member

    Joined:
    Apr 11, 2011
    Messages:
    30,678
    Likes Received:
    648
    Trophy Points:
    113
    cPanel Access Level:
    Root Administrator
    Hello :)

    Have the users reported this issue to their ISP to see if it's an issue with their connection? You may want to enable Exim on an additional port and have the users try that port for sending. You can configure Exim to run on an additional port via:

    "WHM Home » Service Configuration » Service Manager"

    Thank you.
     
  4. SamTheMan

    SamTheMan Registered

    Joined:
    Oct 14, 2014
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    cPanel Access Level:
    Root Administrator
    I am having this problem with five different offices full of PCs (so far) in Kansas City, using 5 different cPanel servers. All of them are on Time Warner "residential" connections -- this doesn't seem to be an issue for "business class" customers. The problem started yesterday mid-day, and none of the customers changed anything on their end. Of course TW hotly denies they did anything, says they're not responsible, etc. etc. The Tier 3 tech I talked to yesterday (for more than an hour) told me there was nothing wrong and nothing he could do, and said I should tell my customer to call Comcast!

    What I'm seeing is this: any connections to an SMTP port take a little more than a minute for the greeting banner to appear. The port number doesn't matter -- 25, 587, 2525, even 465 (SSL) all behave the same. For customers using Outlook, the default timeout is 60 seconds, so no messages can be delivered. Increasing the timeout to 3-4 minutes works around this issue (messages are delivered but each one takes a minute to go). I've tested this repeatedly from the command prompt; it's not an Outlook issue. I've also disabled firewalls, antivirus, etc on the PCs with no effect. Bypassing the router (CAT5 straight into the TW modem) changes nothing. However, changing a laptop from its TW connection to a Verizon wifi fixes it immediately.

    ALSO very important -- this only seems to be a problem with cPanel servers. When I test connections from TW to Gmail, Yahoo or other 3rd party servers, no problems at all. But 5 different cPanel servers I've tested are all having this problem.

    I think it's some kind of timeout, but I can't figure out what. My servers all have the "rfc1413_query_timeout" set to 0 and I've even configured the iptables firewalls to block IDENT requests with an immediate "connection refused", no change. I've tried disabling all the RBLs in exim, no change. This doesn't feel like a DNS timeout to me.

    I'd welcome any suggestions at all. My customers are SCREAMING for a fix to this, so I'm really under the gun.
     
  5. SamTheMan

    SamTheMan Registered

    New clue -- after some debugging with exim, it looks like the Big Delay is in DNS. When exim looks up reverse DNS for the client's IP address (w.x.y.z), it comes back with something like cpe-W-X-Y-Z.kc.res.rr.com. Then exim tries to do a forward lookup on that name to find its IP address, and the query times out several times (for a total of over one minute delay).
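
    To see the same two-step lookup outside of exim, here is a rough sketch that times the reverse lookup and then the forward lookup on whatever name comes back. It goes through the system resolver via Python's socket module, which is an assumption on my part and may not behave exactly like exim's own lookups; the names timed and check_forward_confirmed are invented for this example.

```python
import socket
import time


def timed(fn, *args):
    """Call fn(*args); return (seconds_taken, result_or_exception)."""
    start = time.monotonic()
    try:
        result = fn(*args)
    except OSError as exc:  # socket.herror and socket.gaierror are OSError subclasses
        result = exc
    return time.monotonic() - start, result


def check_forward_confirmed(ip, reverse=socket.gethostbyaddr,
                            forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, then forward-resolve the name and compare."""
    ptr_secs, rev = timed(reverse, ip)
    if isinstance(rev, Exception):
        return {"ptr": None, "ptr_secs": ptr_secs,
                "addrs": [], "fwd_secs": 0.0, "confirmed": False}
    hostname = rev[0]
    fwd_secs, fwd = timed(forward, hostname)
    # A failed or timed-out forward lookup leaves the address list empty.
    addrs = [] if isinstance(fwd, Exception) else list(fwd[2])
    return {"ptr": hostname, "ptr_secs": ptr_secs,
            "addrs": addrs, "fwd_secs": fwd_secs, "confirmed": ip in addrs}


# Usage: run on the mail server with a client's public IP, e.g.
#   print(check_forward_confirmed("w.x.y.z"))   # substitute the real address
```

    If fwd_secs comes back in the tens of seconds while ptr_secs is quick, that matches the delay described above.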

    Looks like TW changed their internal network so the nameserver for kc.res.rr.com is unreachable from the outside world. That server is named device-dns1.rr.com AND device-dns2.rr.com with IP 65.24.6.70.

    I'm attempting to configure named to hijack the kc.res.rr.com zone (for my server only) so it'll return fast results for those queries. I'll report back if I get it working.
     
  6. SamTheMan

    SamTheMan Registered

    FIXED! Hijacking the DNS zone fixes the problem. In my case, the zone is kc.res.rr.com, but I'm sure there are others...

    To fix it, log in as root through SSH. If you're not comfortable editing config files through SSH, STOP and DO NOT continue.

    Create a file named /var/named/kc.res.rr.com.db (or whatever zone is giving you problems). Put this in the file (again, change kc.res.rr.com as needed):
    Change ownership on that file:
    Then edit your /etc/named.conf file. Find the section that begins with:
    Within that section, just below the "recursion" command, insert these lines:
    Also insert those same lines in the section that begins:
    Restart named (signaling it doesn't seem to be enough):
    That should do it. In your new-found free time, call TW and tell them how they're all a bunch of lying sacks of crap.

    - - - Updated - - -

    Forgot to mention -- this only works if your cPanel server is using itself as the only nameserver. Check your /etc/resolv.conf!
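
    A quick way to check that precondition is sketched below. It parses the text of resolv.conf and reports whether every nameserver entry is a local address. The function names and the default local_addrs tuple are assumptions for illustration; if your server lists its own public IP instead of 127.0.0.1, pass that in.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines, ignoring comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Strip '#' and ';' comments before splitting into fields.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def uses_only_local_resolver(resolv_conf_text, local_addrs=("127.0.0.1", "::1")):
    """True only if at least one nameserver is listed and all are local."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and all(s in local_addrs for s in servers)


# Usage on the server itself:
#   with open("/etc/resolv.conf") as fh:
#       print(uses_only_local_resolver(fh.read()))
```

    If this prints False, lookups may be answered by an outside resolver that never sees the local zone, so the workaround would not take effect.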
     
  7. jhitesma

    jhitesma Member

    Joined:
    Jun 17, 2007
    Messages:
    23
    Likes Received:
    0
    Trophy Points:
    1
    We're experiencing this in AZ as well the past several days. A DNS workaround isn't an option for us since we're not running our own nameservers.
     
  8. mo-jord

    mo-jord Registered

    Joined:
    Oct 15, 2014
    Messages:
    1
    Likes Received:
    0
    Trophy Points:
    1
    cPanel Access Level:
    Reseller Owner
    I am having the exact same issue in Kansas City

    - - - Updated - - -

    I should also add that this is through TWC and cPanel websites with any SMTP port - when I switch to AT&T the issue goes away.
    I was able to increase the SMTP timeout and this fixed the problem, although it takes forever to send emails. I can't change the nameservers, so I am stuck too.
     
  9. jhitesma

    jhitesma Member

    I opened a ticket with cpanel regarding this and sadly the response has been far less helpful than I've come to expect from their support staff :(

    Their only suggestion has been to enable bind on the servers and use the hack above to hijack the DNS requests. Quite frankly I really don't want to enable additional services we don't need and don't want which will take additional server resources, provide additional attack vectors for hackers, and run the risk of causing further problems down the road since we use external nameservers and have absolutely no reason to be running bind on our servers. Not to mention the history of security issues associated with bind even when it's configured correctly.

    In the past I've always gotten prompt, professional, and truly helpful support from cPanel, and this is so out of character it shocks me. Given that other hosts aren't having the same issue, it truly seems to be something related to cPanel's configuration of exim, and the response I'm getting basically sounds like "Sorry, we don't want to deal with that."

    I tried whitelisting affected IPs in both "Sender verification bypass IP addresses" and "Trusted SMTP IP addresses" but that had no effect either.

    Very disappointed in cPanel's response to this issue.
     
  10. Lyttek

    Lyttek Well-Known Member

    Thank you SamTheMan for that info!! Will be trying that out shortly!
     
  11. Lyttek

    Lyttek Well-Known Member

    Worked for me, so thanks very much!
     
  12. jhitesma

    jhitesma Member

    We're still trying to find a solution to this since we don't run nameservers on our servers and can't do a workaround as posted above. cPanel has been very good about documenting the problem with Time Warner's DNS configuration, but has been extremely unhelpful in finding a way to reconfigure exim to deal with this.

    Usually we get outstanding tech support from cpanel but I've had a ticket open for over a week and cpanel has been dropping the ball repeatedly on dealing with it. From techs who didn't even bother to read the description of the problem, to techs who despite having access to our server ask what settings are set to in our exim config then suggest trying things I already tried earlier in the ticket and finally suggest configuration changes that "may help but may make things worse for people not having problems" (which didn't work when I did try them.)

    I get that the root problem is TW's screwed up DNS. But we've got a ticket open with TW that has supposedly been escalated repeatedly to the national level but still isn't getting any attention so it's unlikely TW is going to fix their DNS anytime soon and in the meantime their tech support is telling our clients it's a misconfiguration on our server that's causing the problem.

    Since people with TW (we have a TW line here in our office as a backup so "people with TW" includes us to some extent) aren't able to reliably connect to our server but are able to reliably connect to just about any other server they try to connect to (google, yahoo, bing, exchange based servers...) they have no problem believing TW that it's our fault and not actually TW's.

    Very very disappointed in cpanel's lack of response to this which is now actively costing us clients. Yes it's TW's fault - but other servers are able to deal with TW's messed up DNS without preventing SMTP connections. That cpanel doesn't even seem to consider this a problem and isn't interested in trying to fix the configuration of exim that's causing it to refuse connections other servers are accepting is deeply troubling.
     
  13. cPanelMichael

    cPanelMichael Forums Analyst
    Staff Member

  14. jhitesma

    jhitesma Member

    Ticket #5574307. It's been reported to management twice, which were the only points where it seemed anyone bothered to give it any real attention. Though the response is still "tough luck, install bind on your servers and try to figure out how to get the workaround above to work without using itself as the only nameserver" (oh, and you'll be on your own trying that) and the ever helpful "Contact Time Warner and ask for an escalation", even though we've been in contact with TW for over a week, have a regional manager working with us and have escalated to a national level....but still are getting replies from TW like:

    Which believe it or not is not from an outsourced overseas tech but one here in the US who still apparently can't handle simple English or really understand how DNS works.
     
  15. cPanelMichael

    cPanelMichael Forums Analyst
    Staff Member

    Hello,

    I see your support ticket with us is still open so you can expect further correspondence from our staff. However, I did want to address one of your previous comments:

    Is there a specific vulnerability with Bind you are concerned about?

    Thank you.
     
  16. jhitesma

    jhitesma Member

    I honestly have no idea currently, as we have no interest in running bind on our servers now or in the future, so I haven't been following it in years. Even with no currently known vulnerabilities, adding another service just adds more potential attack vectors and uses more system resources, and we're not currently prepared to accept that additional risk on our servers. We also have multiple clients who require notification of new services being added on our servers due to their own policies and agreements with 3rd parties not wanting nameservers running on the same physical server as their website. (In some cases this is simply that they don't want a single point of failure - and I understand that adding bind just to hijack TW's domain wouldn't add an extra point of failure...but try explaining that to a client who no longer understands the why of their requirement, only the way it's written.)

    There are many reasons we don't have and don't want bind running on our servers.

    I would fully agree that it's entirely TW's problem if it wasn't that other SMTP servers are dealing with the broken TW DNS with no issue. They're facing the same broken DNS system but they aren't experiencing connectivity issues. I've reached out to our nameserver administrators but they are completely unwilling to hijack the res.rr.com domain even if it's broken. A response which also reinforces that installing bind just to hijack a domain is hardly a suitable response to this issue.
     
  17. jhitesma

    jhitesma Member

    Good news. TW seems to have fixed their DNS as I'm now able to get resolution on those domains:
    root@host5 [~]# dig +trace cpe-76-178-74-167.natsow.res.rr.com ANY

    ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.23.rc1.el6_5.1 <<>> +trace cpe-76-178-74-167.natsow.res.rr.com ANY
    ;; global options: +cmd
    . 65051 IN NS e.root-servers.net.
    . 65051 IN NS b.root-servers.net.
    . 65051 IN NS f.root-servers.net.
    . 65051 IN NS k.root-servers.net.
    . 65051 IN NS h.root-servers.net.
    . 65051 IN NS m.root-servers.net.
    . 65051 IN NS g.root-servers.net.
    . 65051 IN NS i.root-servers.net.
    . 65051 IN NS j.root-servers.net.
    . 65051 IN NS l.root-servers.net.
    . 65051 IN NS d.root-servers.net.
    . 65051 IN NS c.root-servers.net.
    . 65051 IN NS a.root-servers.net.
    ;; Received 228 bytes from 10.0.80.11#53(10.0.80.11) in 318 ms

    com. 172800 IN NS m.gtld-servers.net.
    com. 172800 IN NS l.gtld-servers.net.
    com. 172800 IN NS k.gtld-servers.net.
    com. 172800 IN NS j.gtld-servers.net.
    com. 172800 IN NS i.gtld-servers.net.
    com. 172800 IN NS h.gtld-servers.net.
    com. 172800 IN NS g.gtld-servers.net.
    com. 172800 IN NS f.gtld-servers.net.
    com. 172800 IN NS e.gtld-servers.net.
    com. 172800 IN NS d.gtld-servers.net.
    com. 172800 IN NS c.gtld-servers.net.
    com. 172800 IN NS b.gtld-servers.net.
    com. 172800 IN NS a.gtld-servers.net.
    ;; Received 497 bytes from 198.41.0.4#53(198.41.0.4) in 210 ms

    rr.com. 172800 IN NS dns1.rr.com.
    rr.com. 172800 IN NS dns2.rr.com.
    rr.com. 172800 IN NS dns3.rr.com.
    rr.com. 172800 IN NS dns6.rr.com.
    rr.com. 172800 IN NS dns5.rr.com.
    ;; Received 228 bytes from 192.26.92.30#53(192.26.92.30) in 100 ms

    natsow.res.rr.com. 7200 IN NS dns-sec-01.peakview.rr.com.
    natsow.res.rr.com. 7200 IN NS dns-pri-01.peakview.rr.com.
    ;; Received 144 bytes from 65.24.0.171#53(65.24.0.171) in 151 ms

    cpe-76-178-74-167.natsow.res.rr.com. 3600 IN A 76.178.74.167
    natsow.res.rr.com. 3600 IN NS dns-pri-01.peakview.rr.com.
    natsow.res.rr.com. 3600 IN NS dns-sec-01.peakview.rr.com.
    ;; Received 160 bytes from 76.85.232.130#53(76.85.232.130) in 29 ms


    Bad news - I'm still getting abnormally slow connections coming from TW. More than twice as long as it takes coming from our CenturyLink connection. Don't know if it's long enough to keep causing problems yet, but it still seems abnormally long even though that DNS response doesn't seem slow.
     
  18. jhitesma

    jhitesma Member

    Looks like it was some bad caching along the way somewhere. After a few more hours things have cleared out fully and connections from TW are going through as normal again.
     