The Community Forums

DNS Cluster sending loads to 200+

Discussion in 'Bind / DNS / Nameserver Issues' started by nappa, Jul 24, 2005.

  1. nappa

    nappa Active Member

    Joined:
    Aug 2, 2003
    Messages:
    32
    Likes Received:
    0
    Trophy Points:
    6
    I'm running a DNS cluster on the CURRENT build of cPanel/WHM, version 10. The machines have been secured and optimised, and we've checked that the problem isn't related to either of those.

    This problem started a few hours ago. We have had to disable the DNS cluster, but that doesn't fix the matter altogether. After a while, we believe, a cron job runs and launches the following files: dnsadmin and killdns. We ended up scheduling a cron job to run every minute and kill those processes before we got control of the server back. This is happening on one server. On the other server the load goes high as well, but killing the processes off on the main server stops the problem on the other. The load on the other server is around 20 to 40 at this point.

    Has anyone else been experiencing this problem, or anything like it? I talked to my provider, and they told me that it's something they are aware of, and so they don't recommend that their clients do it.

    Thanks.
     
    #1 nappa, Jul 24, 2005
    Last edited: Jul 24, 2005
  2. chirpy

    chirpy Well-Known Member

    Joined:
    Jun 15, 2002
    Messages:
    13,475
    Likes Received:
    20
    Trophy Points:
    38
    Location:
    Go on, have a guess
    Yes, I've seen this. Remove the problem server from the DNS cluster and then clear out any files in:

    /var/cpanel/clusterqueue/requests
    /var/cpanel/clusterqueue/retry

    Then add that server back into the DNS cluster and sync all the DNS records to it.
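
For anyone repeating this, the clean-out step could be scripted roughly as below. This is only a sketch: the function name `clear_cluster_queue` is made up for this example, the default path is the one given above, and it assumes the queued entries are plain files that are safe to delete once the server is out of the cluster.

```shell
#!/bin/sh
# Sketch: empty the two DNS cluster queue directories named in the post.
# The optional first argument overrides the base path, so the function
# can be rehearsed against a scratch directory before touching a server.
clear_cluster_queue() {
    base="${1:-/var/cpanel/clusterqueue}"
    for sub in requests retry; do
        dir="$base/$sub"
        # Skip quietly if the directory does not exist on this machine.
        [ -d "$dir" ] || continue
        # Delete regular files only; the directories themselves stay in place.
        find "$dir" -type f -exec rm -f {} +
    done
}
```

Calling `clear_cluster_queue` with no argument targets the real queue, so it is worth trying it on a copy first.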
     
  3. thedavid

    thedavid Well-Known Member

    Joined:
    Nov 22, 2002
    Messages:
    124
    Likes Received:
    0
    Trophy Points:
    16
    We had the exact same thing happen on our server cluster just recently. *All* of the servers were disabled because hundreds upon hundreds of 'dnsadmin' processes were being spawned.

    When I got access to a server to see what it was doing, it was looping in a wait cycle for something.

    To fix it, I had to:

    1) Go in and 'killall dnsadmin' in one window repeatedly whilst shutting down WHM in another
    2) Change the remote access key on the server so that other servers in the cluster don't have access to get in
    3) Restart WHM and disable the dns clustering system altogether

    It was an exciting time, let me tell you. Right now, we're still running on the old system because of this - it works fine, really, just a bit slower...

    Now the above instructions might fix it for this time, but what can be done to prevent it in the future? I'm still not entirely sure what kicked off the outage of all of our servers, which is troubling. There just weren't enough spare CPU cycles to do anything but the most rudimentary checking...
     
  4. nappa

    nappa Active Member

    Chirpy: Thanks for the help. It's all back to normal, and the problem hasn't returned. You were right about the problem.

    I think to prevent it, you would need to edit the program that runs dnsadmin and design it to kill itself off if there is too much load and notify the administrator. For my system, since I'm not conversant enough in Perl to make these kinds of changes, I've configured my system integrity program to reboot the server if the load is over 40, and there is a cron job that clears out the problem directories.

    Important to note - it hasn't done it again.
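
The load check described above might look something like the following. It is a rough sketch, not what the integrity program actually does: the function name `load_exceeds` is invented here, the threshold of 40 is the figure from the post, and reading `/proc/loadavg` assumes a Linux machine (a FreeBSD box would need another source, such as the output of `uptime`).

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) when the 1-minute load average is above
# a threshold. An optional second argument supplies the load value
# directly, which makes the check easy to try without a loaded server.
load_exceeds() {
    threshold="$1"
    # Assumption: Linux, where the first field of /proc/loadavg is the
    # 1-minute load average.
    load="${2:-$(cut -d ' ' -f 1 /proc/loadavg)}"
    # awk does the floating-point comparison that plain sh cannot.
    awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# Possible cron-driven use (left commented out on purpose):
# if load_exceeds 40; then
#     killall dnsadmin
#     echo "dnsadmin killed: load above 40" | mail -s "load alert" root
# fi
```

Killing the processes and mailing root is gentler than a reboot, but it still only treats the symptom.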
     
  5. blahrus

    blahrus Member
    PartnerNOC

    Joined:
    Jul 18, 2005
    Messages:
    6
    Likes Received:
    0
    Trophy Points:
    1
    I am having the same issues. Does anyone know why the load goes so high?


    Thanks,
    Clint
     
  6. Curious Too

    Curious Too Well-Known Member

    Joined:
    Aug 31, 2001
    Messages:
    427
    Likes Received:
    0
    Trophy Points:
    16
    cPanel Access Level:
    Root Administrator
    I had the same problem and had to downgrade my DNS servers to a Release version of cPanel. The problem was that dnsadmin could not write to the named.conf file, so all of the DNS changes were hanging.
     
  7. chirpy

    chirpy Well-Known Member

    I would guess there's a loop in there. Probably the best you or anyone that finds the problem can do is to log a ticket with cPanel and have them investigate it if you can bear the high loads for a while. Otherwise, try the workarounds above.
     
  8. JamesSmith

    JamesSmith Well-Known Member

    Joined:
    Sep 17, 2003
    Messages:
    185
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    UK, Luton
    We too have experienced this same issue over the past week.

    When it all goes Pete Tong, it looks similar to this:

    root 64217 0.0 0.1 3652 1336 p3 IJ 3:13PM 0:00.04 killdns - domain.com (perl)
    root 64216 0.0 0.0 4328 0 p3 IWJ - 0:00.00 dnsadmin - REMOVEZONE - 1rx77mc04QOInITzQuhNKLFzNVFHvepR (perl)
    root 64213 0.0 0.1 4328 1560 p3 SJ 3:13PM 0:00.17 dnsadmin - REMOVEZONE - 1rx77mc04QOInITzQuhNKLFzNVFHvepR (perl)
    root 64212 0.0 0.1 3652 1336 p3 IJ 3:13PM 0:00.04 killdns - domain.com (perl)
    root 64211 0.0 0.0 4328 0 p3 IWJ - 0:00.00 dnsadmin - REMOVEZONE - KwBbIssmPiSxfR1VtH1a2RJZ4PSjtIWt (perl)
    root 64209 0.0 0.1 4328 1560 p3 SJ 3:13PM 0:00.17 dnsadmin - REMOVEZONE - KwBbIssmPiSxfR1VtH1a2RJZ4PSjtIWt (perl)
    root 64208 0.0 0.1 3652 1336 p3 IJ 3:13PM 0:00.04 killdns - domain.com (perl)
    root 64207 0.0 0.0 4328 0 p3 IWJ - 0:00.00 dnsadmin - REMOVEZONE - opLTLuVufnhWh0xAgXoLf8ZYdpls6DlE (perl)
    root 64205 0.0 0.1 4328 1564 p3 SJ 3:13PM 0:00.17 dnsadmin - REMOVEZONE - opLTLuVufnhWh0xAgXoLf8ZYdpls6DlE (perl)
    root 64204 0.0 0.1 3652 1336 p3 IJ 3:13PM 0:00.04 killdns - domain.com (perl)
    root 64203 0.0 0.0 4328 0 p3 IWJ - 0:00.00 dnsadmin - REMOVEZONE - l1YWzC4bfB1wthsR52pjrEJuDA_riUVq (perl)
    root 64201 0.0 0.1 4328 1564 p3 SJ 3:13PM 0:00.17 dnsadmin - REMOVEZONE - l1YWzC4bfB1wthsR52pjrEJuDA_riUVq (perl)
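
To put a number on a listing like the one above, the lines can be tallied per command. This is a sketch only: `count_dns_procs` is a made-up name, and it assumes the column layout shown here, where the command name is the 11th whitespace-separated field.

```shell
#!/bin/sh
# Sketch: read ps-style lines on stdin and count how many belong to
# killdns and how many to dnsadmin. Matching on field 11 (the command
# name in the layout above) keeps the awk process from counting itself.
count_dns_procs() {
    awk '
        $11 == "killdns"  { k++ }
        $11 == "dnsadmin" { d++ }
        END { printf "killdns=%d dnsadmin=%d\n", k, d }
    '
}
```

On a live machine it would be fed from something like `ps auxww | count_dns_procs`, provided that the local ps output uses the same columns.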

    The backup DNS machine is running the FreeBSD DNSONLY installation. The primary is a regular cPanel machine.

    I've taken out the upcp and dnsqueue crons to prevent this happening every god damned 15 mins. DNS clustering is also disabled on all machines.

    I'll give this suggestion a try to see if it works.

    Edit: When this does occur, and once we can eventually log in to the machine, su to root, etc., we're finding that running killall perl repeatedly brings it under control.
     
  9. JamesSmith

    JamesSmith Well-Known Member

    ... and it's not fixed.

    I can add zones. But I cannot delete zones.

    Getting a tad desperate on this now :) - I've just sent a ticket to cPanel support regarding the issue.

    Ticket ID: 107013
     
    #9 JamesSmith, Jul 30, 2005
    Last edited: Jul 30, 2005
  10. DigitalN

    DigitalN Well-Known Member

    Joined:
    Sep 23, 2004
    Messages:
    420
    Likes Received:
    1
    Trophy Points:
    18
    Just had the same issue (running the CURRENT release).

    Loads up to 200 when deleting DNS zones.

    Downgraded to STABLE 83 and it appears *fixed*.

    There were basically dnsadmin processes spawning everywhere, much the same as your issue.
     
  11. JamesSmith

    JamesSmith Well-Known Member

    Nope, not fixed here on the stable version.

    WHM 10.1.0 cPanel 10.2.0-S83
     
  12. JamesSmith

    JamesSmith Well-Known Member

    The issue has been fixed by running upcp on the secondary server, which runs the DNSONLY installation of cPanel.

    cPanel support seemed to believe it's to do with incompatible versions of the files /usr/local/cpanel/whostmgr/bin/dnsadmin and /scripts/killdns.

    If anyone else has this issue, do a /scripts/upcp --force on all the machines having the problem.
     
  13. leorevenda

    leorevenda Active Member
    PartnerNOC

    Joined:
    Jan 24, 2004
    Messages:
    30
    Likes Received:
    0
    Trophy Points:
    6
    Hello,

    Today this bug is occurring on 2 of our servers, and these servers don't use a DNS cluster.

    I tried to fix it by upgrading cPanel, but it's possible that killdns and dnsadmin are having problems on all servers today.
     
  14. bjdea1

    bjdea1 Well-Known Member

    Joined:
    Mar 6, 2003
    Messages:
    83
    Likes Received:
    1
    Trophy Points:
    8
    Holy Crap

    Holy Crap !!

    Exact Same issue here.

    Had 3 ssh shells open and was running "killall dnsadmin" in one, "top" in another, and using the third to actually do something else. Loads went up to 200+ and everything ground to a halt a few times, but I managed to get in there and get the "killall dnsadmin" command to kick in.

    I had to disable DNS clustering also - but even then it kept coming - WHAT A NIGHT !!! WHOO !!

    Seems to have calmed down now - will try the fixes you guys have mentioned above - damn, another late night for me!!! At least it's not a server compromise.

    I'm going to set up a cron to run every minute with the following command:
    * * * * * killall -KILL dnsadmin >/dev/null 2>&1

    This is just so I can get some sleep tonight :) I'll then fix it in the morning.
     
    #14 bjdea1, Aug 19, 2005
    Last edited: Aug 19, 2005
  15. leorevenda

    leorevenda Active Member
    PartnerNOC

    Hello,

    I also ran killall dnsadmin and that fixed it, but this isn't a very good option; the issue occurs when a customer kills an account in WHM.

    I changed killdns to chmod 0000 and the problem no longer occurs, but customers can't delete domains. Note in the listing below that the file was updated today:
    ---------- 1 root root 725 Aug 19 00:39 killdns
     
  16. bjdea1

    bjdea1 Well-Known Member

    Yes

    Good point. That's given me an idea.

    I've just done a "mv /scripts/killdns /scripts/KILLDNS" command now - should do the same thing. Swap it back in the morning once this prob is fixed.
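
A reversible version of that rename could be wrapped up as below. This is just a sketch: the function names and the `.disabled` suffix are arbitrary choices for the example, not anything cPanel provides. Presumably the side effect mentioned above still applies, so customers can't delete domains while the script is parked.

```shell
#!/bin/sh
# Sketch: park a script under a different name, then put it back later.
# The ".disabled" suffix is an arbitrary choice for this example.
disable_script() {
    if [ -e "$1" ]; then
        mv "$1" "$1.disabled"
    fi
}

restore_script() {
    if [ -e "$1.disabled" ]; then
        mv "$1.disabled" "$1"
    fi
}
```

For the case in this thread that would be `disable_script /scripts/killdns` at night and `restore_script /scripts/killdns` in the morning.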
     
  17. chirpy

    chirpy Well-Known Member

    Have you logged a ticket with cPanel regarding the problem so that they are aware and can investigate, in case it's peculiar to your server config?
     
  18. leorevenda

    leorevenda Active Member
    PartnerNOC

    Hi Jonathan,

    Is the best option to open a Bugzilla ticket? This problem may affect more servers; my 3rd server is now affected, probably because of the automatic update of the WHM software.

    Thanks
     
  19. chirpy

    chirpy Well-Known Member

    Probably best to log a support ticket if you've just upgraded to the new RELEASE tree, but you could also search Bugzilla in case it's already logged in there.
     
  20. cretu

    cretu Well-Known Member

    Joined:
    Jul 21, 2002
    Messages:
    208
    Likes Received:
    0
    Trophy Points:
    16
    Hi,

    I have 5 servers affected by high load created by dnsadmin after the recent release update to 10.6.0-RELEASE_4.
    Is there a permanent solution besides killing "dnsadmin" constantly?
    Should I downgrade to Stable?

    Cretu
     