spamd falls over every few days

ItsMattSon

Well-Known Member
Sep 5, 2016
182
38
103
Perth
cPanel Access Level
Root Administrator
Hi cPanel,

spamd falls over once every day or two and I'm having difficulty determining why. Is it normal for services to have to be restarted that often?

Is there anything in particular I should review when this service fails this often, or can I safely ignore the notifications?
Code:
Service Name spamd
Service Status failed
Notification The service “spamd” appears to be down.
Service Check Method The system’s command to check or to restart this service failed.
Number of Restart Attempts 1
Service Check Raw Output
The subprocess “/usr/local/cpanel/scripts/restartsrv_spamd” reported error number 9 when it ended.
Startup Log No startup log
Memory Information
Used 2.98 GB
Available 5.02 GB
Installed 8 GB
Load Information 0.35 0.20 0.17
Uptime 19 days, 10 hours, 12 minutes, and 43 seconds
IOStat Information
avg-cpu: %user %nice %system %iowait %steal %idle
         0.17  0.04  0.06    0.01    0.00   99.72
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
Top Processes
PID Owner CPU% Memory % Command
21424 root 21.84 0.02 spamd-dormant: waiting for connections --max-spare=1 --max-children=3 --allowed-ips=127.0.0.1,::1 --pidfile=/var/run/spamd.pid --listen=5
21393 root 1.00 0.32 tailwatchd - chkservd - ftpd check
1757 mysql 0.10 1.06 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=srv.domain.com.err --open-files-limit=10000 --pid-file=/var/lib/mysql/srv.domain.com.pid
1427 cpanelsolr 0.05 3.30 /usr/lib/jvm/jre-1.8.0/bin/java -server -Xms512m -Xmx512m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:-OmitStackTraceInFastThrow -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/home/cpanelsolr/server/logs/solr_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M -Dsolr.log.dir=/home/cpanelsolr/server/logs -Djetty.port=8984 -DSTOP.PORT=7984 -DSTOP.KEY=solrrocks -Dhost=127.0.0.1 -Duser.timezone=UTC -Djetty.home=/home/cpanelsolr/server -Dsolr.solr.home=/home/cpanelsolr/server/solr -Dsolr.install.dir=/home/cpanelsolr -Xss256k -Dsolr.autoSoftCommit.maxTime=3000 -Dsolr.log.muteconsole -XX:OnOutOfMemoryError=/home/cpanelsolr/bin/oom_solr.sh 8984 /home/cpanelsolr/server/logs -jar start.jar --module=http
27948 root 0.05 0.35 lfd - sleeping
 

ItsMattSon

I've pulled this out of /var/log/exim_mainlog. I'm not sure whether it points to the cause, but it does match the times of the failure and the recovery.

I've stripped out my email, server IP (XX.XX.XXX.XX) and domain, but hopefully it still helps. If I missed anything, admins, please remove it for me :$
Code:
2017-09-19 03:25:02 cwd=/home/munin 9 args: /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
2017-09-19 03:25:02 1du1fC-0005ZC-8c <= [email protected] U=munin P=local S=815 T="Cron <[email protected]> /usr/local/cpanel/3rdparty/perl/524/bin/munin-cron" for munin
2017-09-19 03:25:02 cwd=/var/spool/exim 4 args: /usr/sbin/exim -odi -Mc 1du1fC-0005ZC-8c
2017-09-19 03:25:02 1du1fC-0005ZC-8c => munin <[email protected]> R=localuser T=dovecot_delivery C="250 2.0.0 <[email protected]> EMGTFY4drFmjUwYYb9rQoQ Saved"
2017-09-19 03:25:02 1du1fC-0005ZC-8c Completed
2017-09-19 03:25:10 cwd=/etc/csf 2 args: /usr/sbin/exim -bpc
2017-09-19 03:25:13 SMTP connection from [127.0.0.1]:43371 (TCP/IP connection count = 1)
2017-09-19 03:25:13 SMTP connection from (localhost) [127.0.0.1]:43371 closed by QUIT
2017-09-19 03:25:13 SMTP connection from [127.0.0.1]:43378 (TCP/IP connection count = 1)
2017-09-19 03:25:13 send: Connection refused at /etc/exim.pl.local line 2271.
2017-09-19 03:25:13 SMTP connection identification H=localhost A=127.0.0.1 P=43378 U=root ID=0 S=root B=identify_local_connection
2017-09-19 03:25:13 1du1fN-0005aP-Q5 <= [email protected] H=(localhost.localdomain) [127.0.0.1]:43378 P=esmtp S=49380 T="[srv.domain.com] FAILED \342\233\224: spamd (XX.XX.XXX.XX)" for [email protected]
2017-09-19 03:25:13 SMTP connection from (localhost.localdomain) [127.0.0.1]:43378 lost
2017-09-19 03:25:13 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1du1fN-0005aP-Q5
2017-09-19 03:25:13 1du1fN-0005aP-Q5 unable to open private key file for reading: /var/cpanel/domain_keys/private/srv.domain.com
2017-09-19 03:25:13 1du1fN-0005aP-Q5 => [email protected] R=send_to_smart_host T=dkim_remote_smtp H=dedrelay.secureserver.net [208.109.80.210] C="250 2.0.0 u1fNdko4rIdip mail accepted for delivery"
2017-09-19 03:25:13 1du1fN-0005aP-Q5 Completed
2017-09-19 03:30:05 cwd=/ 2 args: /usr/sbin/exim -bpu
2017-09-19 03:30:10 cwd=/etc/csf 2 args: /usr/sbin/exim -bpc
2017-09-19 03:30:11 SMTP connection from [127.0.0.1]:43953 (TCP/IP connection count = 1)
2017-09-19 03:30:11 send: Connection refused at /etc/exim.pl.local line 2271.
2017-09-19 03:30:11 SMTP connection identification H=localhost A=127.0.0.1 P=43953 U=root ID=0 S=root B=identify_local_connection
2017-09-19 03:30:11 1du1kB-0005jr-9w <= [email protected] H=(localhost.localdomain) [127.0.0.1]:43953 P=esmtp S=47556 T="[srv.domain.com] RECOVERED \342\235\207: spamd (XX.XX.XXX.XX)" for [email protected]
2017-09-19 03:30:11 SMTP connection from (localhost.localdomain) [127.0.0.1]:43953 lost
2017-09-19 03:30:11 cwd=/var/spool/exim 3 args: /usr/sbin/exim -Mc 1du1kB-0005jr-9w
2017-09-19 03:30:11 1du1kB-0005jr-9w unable to open private key file for reading: /var/cpanel/domain_keys/private/srv.domain.com
2017-09-19 03:30:11 1du1kB-0005jr-9w => [email protected] R=send_to_smart_host T=dkim_remote_smtp H=dedrelay.secureserver.net [208.109.80.54] C="250 2.0.0 u1kBdnPvHtpIc mail accepted for delivery"
2017-09-19 03:30:11 1du1kB-0005jr-9w Completed
2017-09-19 03:30:11 SMTP connection from [127.0.0.1]:43961 (TCP/IP connection count = 1)
2017-09-19 03:30:12 SMTP connection from (localhost) [127.0.0.1]:43961 closed by QUIT
2017-09-19 03:35:04 cwd=/ 2 args: /usr/sbin/exim -bpu
2017-09-19 03:35:11 cwd=/etc/csf 2 args: /usr/sbin/exim -bpc
2017-09-19 03:35:18 SMTP connection from [127.0.0.1]:44620 (TCP/IP connection count = 1)
2017-09-19 03:35:19 SMTP connection from (localhost) [127.0.0.1]:44620 closed by QUIT
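
For anyone who wants to line up the FAILED and RECOVERED notices without reading the log by eye, here is a minimal Python sketch. It assumes the notification subjects are logged in the same form as in the excerpt above; the name `find_outages` and the regular expression are illustrative and not part of cPanel or Exim.

```python
import re
from datetime import datetime

# Matches the arrival line of a notification email as it appears in the
# excerpt above, e.g. ... <= ... T="[host] FAILED ...: spamd (ip)".
# Assumption: other servers may log the subject differently.
NOTICE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .* <= .*'
    r'T="\[[^\]]*\] (?P<state>FAILED|RECOVERED)[^:]*: (?P<service>\S+) '
)


def find_outages(lines):
    """Pair each FAILED notice with the next RECOVERED notice for the same service.

    Returns a list of (service, failed_at, recovered_at) tuples.
    """
    open_failures = {}
    outages = []
    for line in lines:
        match = NOTICE.match(line)
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        service = match["service"]
        if match["state"] == "FAILED":
            # Keep the earliest unresolved failure for this service.
            open_failures.setdefault(service, ts)
        elif service in open_failures:
            outages.append((service, open_failures.pop(service), ts))
    return outages
```

Reading the real log would look something like `find_outages(open("/var/log/exim_mainlog", errors="replace"))`. Each returned tuple gives the service name and the two timestamps, so the gap between failure and recovery is simply the difference between them.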
 

cPanelMichael

Administrator
Staff member
Apr 11, 2011
47,909
2,228
463
Hello,

Can you also check /var/log/exim_paniclog and let us know if you see any activity around the time spamd fails?

Thank you.
 

ItsMattSon

Hi @cPanelMichael

I've checked the paniclog (it goes back 3 days in total) and matched most entries against the last 3 days of failed-service emails. They appear to line up, so there may be something here...

An example: cpsrvd failed at 10:25 on the 17th and recovered at 10:35 the same day, 10 minutes later. (With other log entries the gap is usually only 5 minutes, but it is 10 in this example for some reason.)

2017-09-17 10:25:36 1dtPH6-0002GO-H5 unable to open private key file for reading: /var/cpanel/domain_keys/private/srv.domain.com
2017-09-17 10:35:13 1dtPQP-0002ZV-5f unable to open private key file for reading: /var/cpanel/domain_keys/private/srv.domain.com

Does this shed any light on the matter perhaps?
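
The matching described above can also be done mechanically. This is a rough Python sketch under my own assumptions: each paniclog line starts with a `YYYY-MM-DD HH:MM:SS` timestamp (as in the two lines quoted), and the failure and recovery times are typed in by hand from the notification emails. The helper names `parse_ts` and `entries_near` are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_ts(line):
    """Return the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line, or None."""
    try:
        return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def entries_near(log_lines, events, tolerance=timedelta(minutes=1)):
    """Map each event time to the log lines stamped within `tolerance` of it."""
    hits = {event: [] for event in events}
    for line in log_lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # Skip lines without a leading timestamp.
        for event in events:
            if abs(ts - event) <= tolerance:
                hits[event].append(line.rstrip("\n"))
    return hits
```

For the cpsrvd example, the events would be `datetime(2017, 9, 17, 10, 25)` and `datetime(2017, 9, 17, 10, 35)`, with the lines read from /var/log/exim_paniclog. An event with an empty list had no paniclog activity within the tolerance.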

NOTE - It's not just spamd that has been falling over since Aug 27. It's all of these:
- spamd
- cpsrvd
- tailwatchd
- lfd (ConfigServer Firewall daemon)

All of my failed-service emails only go back to August 27th, so perhaps something on that date is the cause. Before that it was smooth sailing since the server came online on Feb 20th.

I don't want to take your attention away from the private key log entry above, so just as an FYI: the only issue on or before Aug 27 (the first since Feb) was, according to emails from cPanel Monitoring, a failed backup for my main website. I haven't had any failed backup emails since, however.
 

cPanelMichael

Hello,

Can you open a support ticket using the link in my signature so we can take a closer look?

Thank you.
 

ItsMattSon

Hi @cPanelMichael

I'm not sure that's an option for me because I didn't purchase my license directly through cPanel. I'm a GoDaddy customer and my license was provided as part of one of their packages.

I understand that without access to my server you are limited in your ability to support this issue.

What is the pricing like to have this looked at? I couldn't find the service fees mentioned anywhere, only that I'd be liable for them if I went ahead and submitted the support ticket.

Thanks in advance.

By the way, do these permissions look correct?

File: `/var/cpanel/domain_keys/private'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 2a30b601h/707835393d Inode: 1706256 Links: 2
Access: (0710/drwx--x---) Uid: ( 0/ root) Gid: ( 12/ mail)
Access: 2017-09-21 02:00:21.713944326 +0800
Modify: 2017-03-13 15:52:04.490104244 +0800
Change: 2017-04-18 05:27:54.765246488 +0800
File: `/var/cpanel/domain_keys/public'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 2a30b601h/707835393d Inode: 1706255 Links: 2
Access: (0711/drwx--x--x) Uid: ( 0/ root) Gid: ( 10/ wheel)
Access: 2017-09-21 02:00:21.739944890 +0800
Modify: 2017-03-14 06:28:28.474089907 +0800
Change: 2017-04-18 05:27:54.765246488 +0800

Just in case you're also curious: the files inside private are 0640 root:mail
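
To collect the same details for every key file in one go, something like this Python sketch could be used. It only reports the mode, owner, group and group permission bits; it does not know what cPanel considers correct, and the helper names `describe` and `group_bits` are hypothetical.

```python
import grp
import os
import pwd
import stat


def _name(lookup, ident, attr):
    """Resolve a uid/gid to a name, falling back to the number if unknown."""
    try:
        return getattr(lookup(ident), attr)
    except KeyError:
        return str(ident)


def describe(path):
    """Return (octal mode, owner, group) for path, similar to what `stat` prints."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    owner = _name(pwd.getpwuid, st.st_uid, "pw_name")
    group = _name(grp.getgrgid, st.st_gid, "gr_name")
    return f"{mode:04o}", owner, group


def group_bits(path):
    """Return the group permission bits as a string such as 'r--' or '--x'."""
    mode = os.stat(path).st_mode
    return "".join(
        flag if mode & bit else "-"
        for flag, bit in (("r", stat.S_IRGRP), ("w", stat.S_IWGRP), ("x", stat.S_IXGRP))
    )
```

With the values posted above, the private directory (0710) gives the group `--x` and the key files (0640) give it `r--`, so a process in the mail group can enter the directory and read the keys. Whether Exim runs with that group on this server is an assumption I haven't verified.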
 

ItsMattSon

Hi guys,

Thanks for trying to help on this one. I appreciate the efforts as always.

I uninstalled/reinstalled Munin from WHM Plugins on the 26th and haven't had any emails of service failures since. It was at 2-3 per day, every day. I don't actually think the services fell over, despite the emails, so maybe they were just false alarms.

I think this is resolved? :)