Elizabeta

Well-Known Member
Mar 21, 2018
245
33
28
Mostar
cPanel Access Level
Root Administrator
Hello,

I have restarted my DNS-only server and now I've got some messages:


Service Name: dnsadmin
Service Status: failed
Notification: The service “dnsadmin” appears to be down.
Service Check Method: The system’s command to check or to restart this service failed.
Number of Restart Attempts: 1
Service Check Raw Output: (XID pteh4m) The “dnsadmin” service is down. The subprocess “/usr/local/cpanel/scripts/restartsrv_dnsadmin” reported error number 255 when it ended.

Startup Log

Aug 16 13:53:39 dns2.example.com systemd[1]: Starting cPanel DNS admin service...
Aug 16 13:53:39 dns2.example.com restartsrv_dnsadmin[26916]: Starting PID 26920: dnsadmin-dormant
Aug 16 13:53:39 dns2.example.com systemd[1]: Started cPanel DNS admin service.
Aug 16 14:40:43 dns2.example.com systemd[1]: Stopping cPanel DNS admin service...
Aug 16 14:40:44 dns2.example.com systemd[1]: Stopped cPanel DNS admin service.

I also got similar messages for the services crond, rsyslogd, sshd, nameserver, lmtp, mysql...

But in the output of ps aux I see that the processes are up.
What is dnsadmin dormant mode?

Best regards,
Elizabeta
 

Elizabeta

Hello,

One more piece of information. On the same machine I manually stopped exim and got a message from cPanel Monitoring that the service was down. After two minutes exim was recovered and I got a message that the service is operational. So that seems OK now. But what happened before?

Thank you!

Best regards,
Elizabeta
 

Elizabeta

Hello,

Now I have restarted cPanel (after a kernel upgrade), and about one hour after the machine restarted I got many messages from cPanel saying that the services lmtp, mysql, exim, crond and ftpd were down. But I see in the output of ps aux that all services are up.
Also, I have not received any messages from cPanel Monitoring saying that the services are operational again. Why?

Best regards,
Elizabeta
 

Elizabeta

Hello,

Sorry for the many messages. I have looked in detail at the logs from after the machine restart (18:57):

Code:
2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Lq-Er
2018-08-16 18:57:06 1fqLa6-0000Lq-Er == [email protected] R=lookuphost defer (-36): host lookup for mail.net.ba did not complete (DNS timeout?)
2018-08-16 18:58:32 SMTP connection from [127.0.0.1]:35638 (TCP/IP connection count = 1)
2018-08-16 18:58:32 SMTP connection from (localhost) [127.0.0.1]:35638 closed by QUIT
2018-08-16 19:03:26 SMTP connection from [127.0.0.1]:35672 (TCP/IP connection count = 1)
2018-08-16 19:03:26 SMTP connection from (localhost) [127.0.0.1]:35672 closed by QUIT
2018-08-16 19:08:28 SMTP connection from [127.0.0.1]:35700 (TCP/IP connection count = 1)
2018-08-16 19:08:28 SMTP connection from (localhost) [127.0.0.1]:35700 closed by QUIT
2018-08-16 19:13:29 SMTP connection from [127.0.0.1]:35744 (TCP/IP connection count = 1)
2018-08-16 19:13:29 SMTP connection from (localhost) [127.0.0.1]:35744 closed by QUIT
2018-08-16 19:18:30 SMTP connection from [127.0.0.1]:35774 (TCP/IP connection count = 1)
2018-08-16 19:18:30 SMTP connection from (localhost) [127.0.0.1]:35774 closed by QUIT
2018-08-16 19:23:32 SMTP connection from [127.0.0.1]:35802 (TCP/IP connection count = 1)
2018-08-16 19:23:32 SMTP connection from (localhost) [127.0.0.1]:35802 closed by QUIT
2018-08-16 19:28:32 SMTP connection from [127.0.0.1]:35834 (TCP/IP connection count = 1)
2018-08-16 19:28:32 SMTP connection from (localhost) [127.0.0.1]:35834 closed by QUIT
2018-08-16 19:33:33 SMTP connection from [127.0.0.1]:35876 (TCP/IP connection count = 1)
2018-08-16 19:33:33 SMTP connection from (localhost) [127.0.0.1]:35876 closed by QUIT
2018-08-16 19:38:34 SMTP connection from [127.0.0.1]:35914 (TCP/IP connection count = 1)
2018-08-16 19:38:34 SMTP connection from (localhost) [127.0.0.1]:35914 closed by QUIT
2018-08-16 19:42:18 SMTP connection from [139.162.109.245]:41050 (TCP/IP connection count = 1)
2018-08-16 19:42:22 SMTP connection from scan-7.security.ipip.net [139.162.109.245]:41050 lost D=3s
2018-08-16 19:43:35 SMTP connection from [127.0.0.1]:35942 (TCP/IP connection count = 1)
2018-08-16 19:43:35 SMTP connection from (localhost) [127.0.0.1]:35942 closed by QUIT
2018-08-16 19:48:36 SMTP connection from [127.0.0.1]:35974 (TCP/IP connection count = 1)
2018-08-16 19:48:36 SMTP connection from (localhost) [127.0.0.1]:35974 closed by QUIT
2018-08-16 19:53:38 SMTP connection from [127.0.0.1]:36012 (TCP/IP connection count = 1)
2018-08-16 19:53:38 SMTP connection from (localhost) [127.0.0.1]:36012 closed by QUIT
2018-08-16 19:56:59 cwd=/var/spool/exim 2 args: /usr/sbin/exim -qG
2018-08-16 19:56:59 Start queue run: pid=17224
2018-08-16 19:56:59 1fqLa6-0000Lg-4W => [email protected] R=lookuphost T=remote_smtp H=mail.net.ba [212.20.31.50] C="250 ok:  Message 2642517 accepted"
OK, I understand: cPanel Monitoring attempted to send the mail at 18:57, that attempt failed, and one hour later it successfully sent the mail saying that the services had failed.
I would like to know where this option is set in Exim (the retry time for failed messages in the queue).

But the main question is: why did cPanel Monitoring not send messages that the services are operational?

Thank you!

Best regards,
Elizabeta
 

cPanelMichael

Administrator
Staff member
Apr 11, 2011
47,909
2,229
463
Hello @Elizabeta,

You can read more information about why this happens on the following thread:

[CPANEL-21627] Chkservd reports service failures during graceful reboots

But the main question is: why did cPanel Monitoring not send messages that the services are operational?
Can you verify if the following option is enabled under the System tab in WHM >> Tweak Settings?

The option to enable or disable ChkServd recovery notifications

Per its description:

Disabling this option will suppress notification of service recovery from ChkServd.

Thank you.
 

Elizabeta

Hello Michael,

I have read everything at this link: [CPANEL-21627] Chkservd reports service failures during graceful reboots
and I have increased the default value of "3" to "5" for the following option under the System tab in WHM >> Tweak Settings:
ChkServd TCP check failure threshold

After a graceful restart of cPanel, cPanel Monitoring did not send messages that services were down.
The previous time cPanel was restarted, cPanel Monitoring sent many messages. At first it could not send them, and they were only retried one hour later.
Where in Exim can I configure messages in the queue that cannot be delivered immediately to be retried sooner than after one hour?

Thank you!

Best regards,
Elizabeta
 

Elizabeta

Hello,

My exim.conf contains this retry configuration: * * F,2h,15m; G,16h,1h,1.5; F,4d,8h

How can I change this setting via WHM?

I would like a message that cannot be delivered to be retried frequently (e.g. every 15 minutes) in the first 2 hours, and less often after that.
Currently, when a message cannot be delivered, it is only retried after an hour.
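
For reference, a rough reading of that retry rule: F means fixed intervals and G means geometrically growing intervals, and each stage applies until the given time has passed since the first failure. The Python sketch below is a simplified model of that reading, not Exim's own algorithm; the helper names parse_time and retry_offsets are made up for illustration. It only computes when a retry becomes due, which is not necessarily when Exim actually makes the next delivery attempt.

```python
# Simplified model of an Exim-style retry rule (assumption: not Exim's real code).
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_time(text):
    """Convert a time such as '15m' or '1h30m' to seconds."""
    total, digits = 0, ""
    for ch in text.strip():
        if ch.isdigit():
            digits += ch
        else:
            total += int(digits) * UNITS[ch]
            digits = ""
    return total


def retry_offsets(rule):
    """Return the seconds after the first failure at which each retry is due."""
    offsets, elapsed = [], 0.0
    for stage in rule.split(";"):
        parts = [p.strip() for p in stage.split(",")]
        kind = parts[0]                      # "F" (fixed) or "G" (geometric)
        cutoff = parse_time(parts[1])        # stage applies until this age
        interval = float(parse_time(parts[2]))
        factor = float(parts[3]) if kind == "G" else 1.0
        while elapsed < cutoff:
            elapsed += interval
            offsets.append(elapsed)
            interval *= factor               # only grows for "G" stages
    return offsets


if __name__ == "__main__":
    # The rule from exim.conf, without the two leading "*" pattern fields.
    schedule = retry_offsets("F,2h,15m; G,16h,1h,1.5; F,4d,8h")
    for seconds in schedule[:12]:
        print(f"retry due {seconds / 60:.1f} minutes after first failure")
```

Under this model the first eight retries fall due every 15 minutes, so the rule already describes the "often at first, then rarely" behavior asked for above.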

Thanks in advance,
Elizabeta
 

Elizabeta

Hello,

In the logs:

Code:
2018-08-16 18:57:04 1fqLa4-0000Km-Ik == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:04 1fqLa4-0000Kr-Tk <= [email protected] U=root P=local S=48472 [email protected] T="[cpanel.example.com] FAILED \342\233\224: lmtp (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa4-0000Kr-Tk
2018-08-16 18:57:05 1fqLa4-0000Kr-Tk == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:05 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:05 1fqLa5-0000Kw-2p <= [email protected] U=root P=local S=48396 [email protected] T="[cpanel.example.com] FAILED \342\233\224: imap (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa5-0000Kw-2p
2018-08-16 18:57:05 1fqLa5-0000Kw-2p == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:05 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:05 1fqLa5-0000LB-9X <= [email protected] U=root P=local S=49044 [email protected] T="[cpanel.example.com] FAILED \342\233\224: ftpd (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa5-0000LB-9X
2018-08-16 18:57:05 1fqLa5-0000LB-9X == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:05 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:05 1fqLa5-0000LQ-GH <= [email protected] U=root P=local S=50792 [email protected] T="[cpanel.example.com] FAILED \342\233\224: exim (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa5-0000LQ-GH
2018-08-16 18:57:05 1fqLa5-0000LQ-GH == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:05 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:05 1fqLa5-0000LW-Kw <= [email protected] U=root P=local S=49408 [email protected] T="[cpanel.example.com] FAILED \342\233\224: dnsadmin (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa5-0000LW-Kw
2018-08-16 18:57:05 1fqLa5-0000LW-Kw == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:05 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:05 1fqLa5-0000Lb-Sm <= [email protected] U=root P=local S=51082 [email protected] T="[cpanel.example.com] FAILED \342\233\224: crond (212.20.31.57)" for [email protected]
2018-08-16 18:57:05 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa5-0000Lb-Sm
2018-08-16 18:57:06 1fqLa5-0000Lb-Sm == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:06 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:06 1fqLa6-0000Lg-4W <= [email protected] U=root P=local S=48053 [email protected] T="[cpanel.example.com] FAILED \342\233\224: cpanellogd (212.20.31.57)" for [email protected]
2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Lg-4W
2018-08-16 18:57:06 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:06 1fqLa6-0000Lg-4W == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:06 1fqLa6-0000Ll-AA <= [email protected] U=root P=local S=50225 [email protected] T="[cpanel.example.com] FAILED \342\233\224: cpanel-dovecot-solr (212.20.31.57)" for [email protected]
2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Ll-AA
2018-08-16 18:57:06 cwd=/ 3 args: /usr/sbin/sendmail -odb -ti
2018-08-16 18:57:06 1fqLa6-0000Ll-AA == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 18:57:06 1fqLa6-0000Lq-Er <= [email protected] U=root P=local S=47826 [email protected] T="[cpanel.example.com] FAILED \342\233\224: clamd (212.20.31.57)" for [email protected]
2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Lq-Er
2018-08-16 18:57:06 1fqLa6-0000Lq-Er == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 19:56:59 1fqLa6-0000Lg-4W => [email protected] R=lookuphost T=remote_smtp H=mail.example.com [212.20.31.50] C="250 ok: Message 2642517 accepted"
2018-08-16 19:56:59 1fqLa6-0000Lg-4W Completed
18:56 is the time the machine was restarted.

Exim tried to send the mail 9 times at 18:57
and only sent it successfully an hour later.

But my exim.conf has this retry configuration: * * F,2h,15m; G,16h,1h,1.5; F,4d,8h
which specifies
# retries every 15 minutes for 2 hours


Why does it not behave that way?

Best regards,
Elizabeta
 

cPanelMichael

Hello @Elizabeta,

Where in Exim can I configure messages in the queue that cannot be delivered immediately to be retried sooner than after one hour?
Exim tried to send the mail 9 times at 18:57
and only sent it successfully an hour later.

But my exim.conf has this retry configuration: * * F,2h,15m; G,16h,1h,1.5; F,4d,8h
which specifies
# retries every 15 minutes for 2 hours
Can you run the following command and let us know the output so we can confirm when the first delivery attempt for message ID 1fqLa6-0000Lg-4W was initiated?

Code:
exigrep 1fqLa6-0000Lg-4W /var/log/exim_mainlog
Please post the output in CODE tags and remove any identifying information.

Thank you.
 

Elizabeta

Hello Michael,

Thank you for your answer. The output of the command is:

Code:
[[email protected] log]# exigrep 1fqLa6-0000Lg-4W /var/log/exim_mainlog
2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Lg-4W

2018-08-16 18:57:06 1fqLa6-0000Lg-4W <= [email protected] U=root P=local S=48053 [email protected] T="[cpanel.example.com] FAILED \342\233\224: cpanellogd (x.x.x.x)" for [email protected]
2018-08-16 18:57:06 1fqLa6-0000Lg-4W == [email protected] R=lookuphost defer (-36): host lookup for mail.example.com did not complete (DNS timeout?)
2018-08-16 19:56:59 1fqLa6-0000Lg-4W => [email protected] R=lookuphost T=remote_smtp H=mail.net.ba [x.x.x.x] C="250 ok: Message 2642517 accepted"
2018-08-16 19:56:59 1fqLa6-0000Lg-4W Completed
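
To put a number on the gap in that output, here is a small Python sketch that reads the timestamps for one message ID and measures the time between the deferral (==) and the delivery (=>). The log lines in it are shortened copies of the output above with the addresses trimmed, and the function name first_events is only illustrative.

```python
from datetime import datetime

# Shortened copies of the exigrep output (addresses and details trimmed).
LOG_LINES = [
    "2018-08-16 18:57:06 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1fqLa6-0000Lg-4W",
    "2018-08-16 18:57:06 1fqLa6-0000Lg-4W <= (arrival, trimmed)",
    "2018-08-16 18:57:06 1fqLa6-0000Lg-4W == (defer, trimmed)",
    "2018-08-16 19:56:59 1fqLa6-0000Lg-4W => (delivery, trimmed)",
    "2018-08-16 19:56:59 1fqLa6-0000Lg-4W Completed",
]

# Exim main log flags: "<=" arrival, "==" deferral, "=>" delivery.
FLAGS = {"<=": "arrived", "==": "deferred", "=>": "delivered"}


def first_events(lines, msg_id):
    """Return the first timestamp of each event type for one message ID."""
    events = {}
    for line in lines:
        fields = line.split(None, 4)
        if len(fields) < 4 or fields[2] != msg_id or fields[3] not in FLAGS:
            continue  # skips "cwd=..." and "Completed" lines
        stamp = datetime.strptime(f"{fields[0]} {fields[1]}", "%Y-%m-%d %H:%M:%S")
        events.setdefault(FLAGS[fields[3]], stamp)
    return events


if __name__ == "__main__":
    events = first_events(LOG_LINES, "1fqLa6-0000Lg-4W")
    delay = events["delivered"] - events["deferred"]
    print(f"deferred at {events['deferred']}, delivered at {events['delivered']}")
    print(f"delay between deferral and delivery: {delay}")
```

For this message the delay comes out at just under one hour, with no attempt logged in between.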
Best regards,
Elizabeta
 

cPanelMichael

Hello @Elizabeta,

Can you open a support ticket so we can take a closer look at the affected system and verify why the retry attempt occurred one hour later instead of 15 minutes later? You can post the ticket number here and we will link it to this thread.

Thank you.
 

cPanelMichael

Hello,

To update, here's part of the response in the support ticket regarding this issue:

After researching even further, the cause of this behavior is Exim's handling of retry attempts and 'queue runs'. When a deferred message is added to Exim's retry database, it is provided a timestamp for when it is safe to re-attempt delivery.

However, the next delivery attempt is handled when Exim performs a routine queue run, which is when Exim attempts to process any undelivered messages still sitting in its queue. By default, this queue run is set to process once per hour via the '-q1h' flag:

=====
# ps aux | grep [e]xim
mailnull 5035 0.0 0.0 77508 3976 ? Ss Aug24 0:00 /usr/sbin/exim -bd -q1h -oP /var/spool/exim/exim-daemon.pid
=====

As such, while a message may be marked as safe to re-attempt delivery after 15 or 30 minutes, the message will not be processed until Exim's next queue run is processed.

For additional insight, you can also read Exim's official documentation on its handling of retry rules here:
32. Retry configuration

Here is a quote from this page that summarizes the functionality:

"If such a delivery suffers a temporary failure, the retry data is updated as normal, and subsequent delivery attempts from queue runs occur only when the retry time for the local address is reached."

This behavior resulted in some inaccuracies during my testing, as Exim seems to perform a new queue run immediately upon starting. When observing this behavior, and observing the hour-long period of no activity in the server's Exim log during the time of this issue, I theorized that Exim was 'online' but 'unresponsive' due to a server issue that was occurring outside of the Exim service.

However, with this new understanding in mind, we can now see that this hour-long delay is part of the default Exim behavior.

In short, the earlier messages were added to the retry database with a retry time of 15 minutes at 18:57, but the messages were not processed until 19:56:59 due to the scheduling of Exim's queue runs at this time.
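
As a rough illustration of that timing, the Python sketch below picks the first queue run that falls at or after the retry time. It assumes that a queue run happened when the daemon started at about 18:56:59 (inferred from the 19:56:59 run and the -q1h flag) and that later runs follow at exact intervals; the function name next_attempt and the 15-minute comparison value are hypothetical and only there to show the effect of a shorter interval.

```python
from datetime import datetime, timedelta


def next_attempt(retry_due, first_queue_run, queue_interval):
    """Return the first queue run at or after the time the retry is due."""
    run = first_queue_run
    while run < retry_due:
        run += queue_interval
    return run


if __name__ == "__main__":
    deferred_at = datetime(2018, 8, 16, 18, 57, 6)
    retry_due = deferred_at + timedelta(minutes=15)   # from F,2h,15m

    # Assumption: the daemon ran its first queue run at startup.
    first_run = datetime(2018, 8, 16, 18, 56, 59)

    hourly = next_attempt(retry_due, first_run, timedelta(hours=1))
    print(f"retry due {retry_due}, attempted at {hourly} with -q1h")

    # Hypothetical comparison: the same message with a 15-minute queue interval.
    shorter = next_attempt(retry_due, first_run, timedelta(minutes=15))
    print(f"retry due {retry_due}, attempted at {shorter} with a 15m interval")
```

With hourly runs the message is due at 19:12:06 but has to wait for the 19:56:59 run, which matches the log.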
Thank you.