AlaskanWolf

Well-Known Member
Aug 11, 2001
537
0
316
Fremont CA
For quite a while now (pretty much since we got this server), BIND has been failing on a regular basis. We have had to set up a cron job to restart BIND every 30 minutes, but I think it's causing some server issues and contributing to the server freezing up. Judging by the messages log, the system locks up or becomes unresponsive right when the cron job restarts BIND.


Has anyone else had this problem, and do you know of a fix?
 

AlaskanWolf

Thanks for the help. This is what I got when I ran it.

Named failed again earlier this evening (checkservice did not restart it), so while I was doing last-minute Xmas shopping, BIND was down for a good hour or so.


xxxxxxxxxxxxx
Type help -or- /h if you need help.
ndc> /t
tracing now on
ndc> /d
debugging now on
ndc> restart
ndc: [isc/ctl_clnt::new_state: initializing -> connecting]
ndc: [isc/ctl_clnt::new_state: connecting -> connected]
ndc: [isc/ctl_clnt::readable: read 15, used 15]
ndc: [220 8.2.3-REL]
ndc: [isc/ctl_clnt::readable: read 34, used 34]
ndc: [250 my arguments are < -u named>]
ndc: [isc/ctl_clnt::new_state: connected -> destroyed]
ndc: [isc/ctl_clnt::new_state: initializing -> connecting]
ndc: [isc/ctl_clnt::new_state: connecting -> connected]
ndc: [isc/ctl_clnt::readable: read 15, used 15]
ndc: [220 8.2.3-REL]
ndc: [isc/ctl_clnt::readable: read 23, used 23]
ndc: [250 my pid is <15518>]
ndc: [isc/ctl_clnt::new_state: connected -> destroyed]
pid 15518 is running
ndc: [stopping named (pid 15518)]
ndc: [isc/ctl_clnt::new_state: initializing -> connecting]
ndc: [isc/ctl_clnt::new_state: connecting -> connected]
ndc: [isc/ctl_clnt::readable: read 15, used 15]
ndc: [220 8.2.3-REL]
ndc: [isc/ctl_clnt::readable: read: Unexpected EOF]
ndc: [EOF]
ndc: [isc/ctl_clnt::new_state: connected -> destroyed]
ndc: [named (pid 15518) is dead]
ndc: [isc/ctl_clnt::new_state: initializing -> connecting]
ndc: [isc/ctl_clnt::new_state: connecting -> connected]
ndc: [isc/ctl_clnt::readable: read 15, used 15]
ndc: [220 8.2.3-REL]
ndc: [isc/ctl_clnt::readable: read 23, used 23]
ndc: [250 my pid is <16105>]
ndc: [isc/ctl_clnt::new_state: connected -> destroyed]
pid 16105 is running
new pid is 16105

[Edited on 12/24/01 by AlaskanWolf]
 

shaun

Well-Known Member
PartnerNOC
Verified Vendor
Nov 9, 2001
708
1
318
San Clemente, Ca
cPanel Access Level
DataCenter Provider
Twitter
Erm, that didn't give us much info. Have you tried running tail -f on the syslog file and then doing a restart? There is going to be a lot of noise, but look for any errors. Also, if you have the exact time it died, go into the logs, find that date and time, and check the entries for up to 30 minutes before the time it was reported down. You may see some errors (hopefully).
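The "check the 30 minutes before it died" step can be sketched in Python. This is only a rough illustration, not anything from cPanel or BIND: it assumes classic syslog lines of the form "Dec 24 15:31:07 wolf named[2807]: ...", and the function names and the list of error keywords are made up for the example.

```python
import re
from datetime import datetime, timedelta

# Start of a classic syslog line written by named, e.g.
# "Dec 24 15:31:07 wolf named[2807]: /etc/named.conf:8: syntax error near allow"
SYSLOG_LINE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ named\[\d+\]: (.*)$"
)


def named_lines_before(lines, died_at, minutes=30):
    """Return (timestamp, message) pairs for named entries logged in the
    `minutes` leading up to `died_at` (both ends inclusive)."""
    start = died_at - timedelta(minutes=minutes)
    hits = []
    for line in lines:
        match = SYSLOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # not a named entry
        # Syslog timestamps carry no year, so borrow it from died_at.
        stamp = datetime.strptime(
            f"{died_at.year} {match.group(1)}", "%Y %b %d %H:%M:%S"
        )
        if start <= stamp <= died_at:
            hits.append((stamp, match.group(2)))
    return hits


def looks_like_error(message):
    """Crude filter (assumed keyword list): flag likely error messages."""
    return re.search(r"error|rejected|unknown type", message, re.IGNORECASE) is not None
```

Something like named_lines_before(open("/var/log/messages"), datetime(2001, 12, 24, 15, 31, 7)) would then return only the named entries from the half hour before the crash; the log path here is an assumption and may differ on your box.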
 

AlaskanWolf

I just caught named dying. Looking in the logs (after running your debug steps), here are the named entries that caught my eye:


(LATEST FIRST)

Dec 24 15:31:27 wolf named[2819]: Zone "nomdeplum.com\032" (file /var/named/nomdeplum.com .db): No default TTL ($TTL <value>) set, $
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 10: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:10: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 11: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:11: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 12: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:12: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 13: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:13: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 15: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:15: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db: Line 17: Unknown type: ..
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:17: Database error near (.)
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:19: Database error near ()
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:20: Database error near ()
Dec 24 15:31:27 wolf named[2819]: /var/named/nomdeplum.com .db:21: Database error near ()
Dec 24 15:31:27 wolf named[2819]: Zone "nomdeplum.com\032" (file /var/named/nomdeplum.com .db): no NS RRs found at zone top
Dec 24 15:31:27 wolf named[2819]: master zone "nomdeplum.com\032" (IN) rejected due to errors (serial 1009162985)

xxxxxxxxxxx

Dec 24 15:31:08 wolf named[2807]: Zone "nomdeplum.com\032" (file /var/named/nomdeplum.com .db): no NS RRs found at zone top
Dec 24 15:31:08 wolf named[2807]: master zone "nomdeplum.com\032" (IN) rejected due to errors (serial 1009162985)

xxxxxxxxxxxxxxx
Dec 24 15:31:07 wolf named[1142]: named shutting down
Dec 24 15:31:07 wolf named[1142]: USAGE 1009236667 1009235842 CPU=0.66u/4.28s CHILDCPU=0u/0s
Dec 24 15:31:07 wolf named[1142]: NSTATS 1009236667 1009235842 A=307 SOA=2 PTR=523 MX=193 AAAA=12 38=2 ANY=2
Dec 24 15:31:07 wolf named[1142]: XSTATS 1009236667 1009235842 RR=703 RNXD=21 RFwdR=323 RDupR=0 RFail=63 RFErr=0 RErr=4 RAXFR=0 RLa$
Dec 24 15:31:07 wolf named[2807]: starting (/etc/named.conf). named 8.2.3-REL Sat Jan 27 05:32:51 EST 2001 ^[email protected]$
Dec 24 15:31:07 wolf named[2807]: /etc/named.conf:8: syntax error near allow
Dec 24 15:31:07 wolf named[2807]: hint zone "" (IN) loaded (serial 0)
 

AlaskanWolf

OK, I deleted nomdeplum.com.db (it didn't have any records in it), but I'm not sure what to do about these:

Dec 24 15:31:07 wolf named[2807]: /etc/named.conf:8: syntax error near allow
Dec 24 15:31:07 wolf named[2807]: hint zone "" (IN) loaded (serial 0)
 

shaun

Well, you shouldn't have deleted it. Rule of thumb: move files, don't delete them, unless you are absolutely positive that you don't need them. But since it was empty, I guess it's OK.

OK, here's what you do next.

Edit /etc/named.conf with whatever editor you prefer. (I prefer vi, and I recommend you learn it :) ).

Look for a block that looks similar to this. I can't give you the exact lines because yours may look a little different, but they usually look something like this:

zone "domain.com" {
    type master;
    file "domain.com.db";
};

Look for the one that has the name nomdeplum.com in it.

For example, by the looks of it, it should look similar to this:

zone "nomdeplum.com" {
    type master;
    file "nomdeplum.com.db";
};

Basically, just delete that whole zone block: everything from the zone "nomdeplum.com" { line through its closing brace. Once you've done that, save the file and run ndc restart.

Watch the logs and see what comes out now.

You are removing that domain from your nameserver, so if that domain needs to be on it, you will need to re-add it.


Before you do any of this, make a copy of the named.conf file, just in case.
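For anyone who would rather script the "copy named.conf, then cut out the zone block" steps, here is a minimal Python sketch. It is an illustration under assumptions, not a cPanel or BIND tool: the function names and the .bak suffix are made up, and the brace matching ignores comments and braces inside quoted strings.

```python
import re
import shutil


def remove_zone(conf_text, zone_name):
    """Return conf_text without the `zone "<zone_name>" { ... };` block.

    A stray space after the name inside the quotes is tolerated.
    If no such zone is found, the text is returned unchanged.
    """
    head = re.search(
        r'\bzone\s+"' + re.escape(zone_name) + r'\s*"[^{;]*\{',
        conf_text,
        re.IGNORECASE,
    )
    if head is None:
        return conf_text
    # Walk forward until the brace opened by the zone statement is closed,
    # so nested blocks such as allow-transfer { ... }; are removed as well.
    depth, i = 1, head.end()
    while i < len(conf_text) and depth > 0:
        if conf_text[i] == "{":
            depth += 1
        elif conf_text[i] == "}":
            depth -= 1
        i += 1
    # Swallow the terminating ";" and the rest of that line.
    tail = re.match(r"\s*;[ \t]*\n?", conf_text[i:])
    if tail:
        i += tail.end()
    return conf_text[: head.start()] + conf_text[i:]


def edit_with_backup(path, zone_name):
    """Copy `path` to `path + ".bak"`, then rewrite it without the zone."""
    backup = path + ".bak"
    shutil.copy2(path, backup)  # keep the untouched original
    with open(path) as fh:
        text = fh.read()
    with open(path, "w") as fh:
        fh.write(remove_zone(text, zone_name))
    return backup
```

Calling edit_with_backup("/etc/named.conf", "nomdeplum.com") would leave the original in /etc/named.conf.bak; you still need to run ndc restart afterwards and watch the logs.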
 

AlaskanWolf

Those errors went away, but I still get:

Dec 24 15:31:07 wolf named[2807]: /etc/named.conf:8: syntax error near allow
Dec 24 15:31:07 wolf named[2807]: hint zone "" (IN) loaded (serial 0)

I found the same error on my other servers, so I am thinking this is not the central problem behind BIND failing.


I found another thread that more or less describes what's happening to this server. As you can see from it, my old partner reported the problem back in August:

http://support.cpanel.net/new/viewthread.php?tid=222
 

shaun

Is it still dying? Looks like we're both online here. If you want to IM me, my s/n is octekshaun.

That will be easier.

I don't run BIND on my cPanel boxes, and I just noticed it's running BIND 9. This version of BIND is very unforgiving when it comes to making mistakes. We run BIND 8 here on the external nameservers that I manage, and I was thinking about upgrading until I read the dos and don'ts of upgrading to 9. Also, look through your named.conf file and look for any corrupt zones. There are some DNS utilities in /scripts as well. I'm not sure what they do, but one is called dnsclean, I believe.
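One kind of corrupt zone is visible in the log excerpt earlier in the thread: the rejected zone is printed as "nomdeplum.com\032" and its file as "nomdeplum.com .db". The \032 appears to be named's decimal escape for a space, which suggests a stray space after the domain name in named.conf. A rough Python sketch for spotting entries like that follows; the function name is made up, and it only looks at quoted zone and file names, one line at a time.

```python
import re

# Quoted names following the zone and file keywords in named.conf.
ZONE_STMT = re.compile(r'\bzone\s+"([^"]*)"', re.IGNORECASE)
FILE_STMT = re.compile(r'\bfile\s+"([^"]*)"', re.IGNORECASE)


def suspicious_entries(conf_text):
    """Return (line_number, kind, value) for every quoted zone or file
    name that contains whitespace, such as "nomdeplum.com "."""
    found = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        for kind, pattern in (("zone", ZONE_STMT), ("file", FILE_STMT)):
            for value in pattern.findall(line):
                if re.search(r"\s", value):
                    found.append((lineno, kind, value))
    return found
```

Running suspicious_entries(open("/etc/named.conf").read()) and getting an empty list back only means no names with embedded spaces were found; other kinds of damage (missing semicolons, unbalanced braces) would need a different check.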