named.service killed after clamd fail

marjwyatt

Well-Known Member
Jun 23, 2014
48
5
58
cPanel Access Level
Reseller Owner
Lately, and with increasing frequency, I've woken up to discover that named.service has failed and not restarted. This always follows a failure of clamd.

It may have begun when I initiated this thread:
Increase in clamd failures after manually updating

One of my clients notified me early this morning that all of their sites hosted with me were unavailable. I rebooted the server and all sites are back online, but I would like to get to the bottom of the issue and fix whatever caused it.

Here are the entries from /var/log/messages leading up to named.service being killed.
Code:
Oct 26 20:50:43 server pure-ftpd: ([email protected]) [INFO] New connection from 127.0.0.1
Oct 26 20:50:43 server pure-ftpd: ([email protected]) [INFO] __cpanel__service__auth__ftpd__4HEcC7DM32YN10E_Zjcyv_mfGG6uCxsxrr3HQvLDtU_eqXjqG5DIv4uxJuDqFKfQ is now logged in
Oct 26 20:50:43 server pure-ftpd: (__cpanel__service__auth__ftpd__4[email protected]127.0.0.1) [INFO] Logout.
Oct 26 20:55:01 server systemd: Created slice User Slice of root.
Oct 26 20:55:01 server systemd: Starting User Slice of root.
Oct 26 20:55:01 server systemd: Started Session 15936679 of user root.
Oct 26 20:55:01 server systemd: Starting Session 15936679 of user root.
Oct 26 20:55:04 server systemd: Removed slice User Slice of root.
Oct 26 20:55:04 server systemd: Stopping User Slice of root.
Oct 26 20:55:43 server pure-ftpd: ([email protected]) [INFO] New connection from 127.0.0.1
Oct 26 20:55:43 server pure-ftpd: ([email protected]) [INFO] __cpanel__service__auth__ftpd__4HEcC7DM32YN10E_Zjcyv_mfGG6uCxsxrr3HQvLDtU_eqXjqG5DIv4uxJuDqFKfQ is now logged in
Oct 26 20:55:43 server pure-ftpd: (__cpanel__service__auth__ftpd__4[email protected]127.0.0.1) [INFO] Logout.
Oct 26 20:58:15 server systemd: clamd.service: main process exited, code=killed, status=9/KILL
Oct 26 20:58:15 server systemd: Unit clamd.service entered failed state.
Oct 26 20:58:15 server systemd: clamd.service failed.
Oct 26 20:58:18 server mysqld_safe: /usr/bin/mysqld_safe: line 183: 21725 Killed                  nohup /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=server.websitehosting101.com.err --open-files-limit=10000 --pid-file=server.websitehosting101.com.pid < /dev/null >> /var/lib/mysql/server.websitehosting101.com.err 2>&1 >> /var/lib/mysql/server.websitehosting101.com.err 2>&1
Oct 26 20:58:21 server mysqld_safe: 181026 20:58:21 mysqld_safe Number of processes running now: 0
Oct 26 20:58:21 server systemd: named.service: main process exited, code=killed, status=9/KILL
I searched journalctl specifically for named, and the only entries it held were from after this morning's reboot of the VPS. I searched this forum for similar issues, and the first thread that came up was the one I started in September and linked above. I have also searched the internet for exact-match errors and found no help there, either.

As I mentioned above, I would like to find and resolve the root cause of this problem. I'm not sure where to begin so I'm seeking some guidance from folks who might know.
 

GOT

Get Proactive!
PartnerNOC
Apr 8, 2003
1,740
300
363
Chesapeake, VA
cPanel Access Level
DataCenter Provider
My gut suspicion is that you are running out of RAM and processes are getting killed. Have you been monitoring your RAM utilization? I ask because, according to the logs, MySQL was also killed off.
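For anyone who wants to check this against their own logs, here is a rough Python sketch that pulls out every service systemd reports as killed, plus any kernel out-of-memory lines. The function names are made up for illustration, the systemd pattern is modelled only on the log lines posted above, and the OOM marker strings are an assumption about typical kernel wording that may differ between kernel versions.

```python
import re

# Modelled on lines like:
#   Oct 26 20:58:15 server systemd: clamd.service: main process exited, code=killed, status=9/KILL
KILL_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+systemd:\s+"
    r"(?P<unit>\S+\.service): main process exited, code=killed, "
    r"status=(?P<status>\S+)"
)

# Assumption: substrings the kernel OOM killer commonly writes to the log.
OOM_MARKERS = ("Out of memory", "oom-killer")


def find_killed_units(lines):
    """Return (timestamp, unit, status) for each service systemd saw killed."""
    hits = []
    for line in lines:
        match = KILL_RE.match(line)
        if match:
            hits.append((match.group("ts"), match.group("unit"), match.group("status")))
    return hits


def find_oom_lines(lines):
    """Return the log lines that contain one of the assumed OOM markers."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in OOM_MARKERS)]


def scan_log(path="/var/log/messages"):
    """Read a syslog-style file and return (killed_units, oom_lines)."""
    with open(path, errors="replace") as handle:
        lines = handle.readlines()
    return find_killed_units(lines), find_oom_lines(lines)
```

Running `scan_log()` as root and comparing the timestamps of the two lists should show whether the kills line up with out-of-memory events. Several unrelated services dying within a few seconds of each other, as in the log above, would be consistent with that.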
 

marjwyatt

When I opened the support thread in September, @cPanelLauren and I discussed a possible RAM shortage, and she indicated that she didn't think that was the problem. I asked A2 to add 2 GB of RAM just to eliminate this as the potential cause.

About every 2 hours since I rebooted the server, clamd has failed and recovered. I still think the named failure is related to that somehow. The email notification that I receive shows that the top process is spamd, and that it is only using around 4% of available memory.

You ask if I am monitoring RAM utilization. Can you tell me what tools or scripts you use to do this?
 

GOT

Lacking an external monitoring system, you could install Munin and check the memory graphs at the times you see the clamd failures.

You might consider removing clamd for a few days to see if the problem goes away.

You might consider switching the DNS server over to NSD or PowerDNS, which take up less memory.

You should take a look at Service Manager and make sure that the name server is set to restart on detected failure.
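If installing a full monitoring package is not an option, a very small script can record memory use at intervals. This is only a sketch: it assumes a Linux /proc/meminfo, the function names are hypothetical, and it falls back to a rougher estimate when the kernel does not report MemAvailable.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of field name -> integer value.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def used_percent(info):
    """Percentage of RAM that is not available for new allocations.

    Uses MemAvailable when present; otherwise approximates it as
    MemFree + Buffers + Cached, which is a rougher figure.
    """
    total = info["MemTotal"]
    available = info.get("MemAvailable")
    if available is None:
        available = info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
    return 100.0 * (total - available) / total


def snapshot(path="/proc/meminfo"):
    """Read the given meminfo file and return the used percentage."""
    with open(path) as handle:
        return used_percent(parse_meminfo(handle.read()))
```

Calling `snapshot()` from a cron job once a minute and appending the result with a timestamp to a file would give a record to compare against the times of the clamd failures, at a much lower memory cost than a graphing package.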
 

marjwyatt

Thanks for the info and suggestions.
 

cPanelMichael

Administrator
Staff member
Apr 11, 2011
47,913
2,201
363
Hello @marjwyatt,

You may also want to consider opening a support ticket so we can take a closer look at your system to see what's happening. You can post the ticket number here and we will link this thread to it.

Thank you.
 

marjwyatt

I have initiated a request for a memory upgrade through A2. The billing has been authorized but I don't know when they will be performing that upgrade.

Thanks for suggesting that I open a support ticket, @cPanelMichael. The support ticket number is: 10597881. I hope I did an adequate job of explaining everything.
 

rpvw

Well-Known Member
Jul 18, 2013
1,101
457
113
UK
cPanel Access Level
Root Administrator
I have been trying to make some sense of the issues that clamd seems to be provoking for a few users.

I use CentOS on the desktop with clamd running as a daemon, and often see it using in excess of 1 GB of memory whilst sitting idle.
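For anyone wanting to check the same figure on their own box, this is a minimal Python sketch that sums the resident memory (VmRSS) of every process with a given name by reading /proc/<pid>/status. The function names are hypothetical, and it assumes a Linux /proc layout; kernel threads, which report no VmRSS, are skipped.

```python
import os


def parse_status(text):
    """Return (name, rss_kb) from /proc/<pid>/status content.

    rss_kb is None when the file has no VmRSS line (e.g. kernel threads).
    """
    name, rss_kb = None, None
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def rss_by_name(target, proc_root="/proc"):
    """Sum VmRSS in kB over all processes whose name equals target."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                name, rss_kb = parse_status(handle.read())
        except OSError:
            continue  # the process went away, or the file is unreadable
        if name == target and rss_kb is not None:
            total += rss_kb
    return total
```

Something like `rss_by_name("clamd") / 1024` gives the figure in MB. Logging it shortly after a signature update and again a few hours later would show whether the footprint is growing.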

I have not profiled this to see what the peak usage might be under load (either on my desktop box or my servers), since they all have enough RAM to make such testing unnecessary (so far...).

Searching through a number of reports of high clamd memory usage, I eventually found an explanation that, to me at least, begins to make some sense, but what I am hypothesising here may be way off track and may not be a fair representation of the issue at all!

If we accept that clamd used to work without falling over, and assume that the server is hosting the same number of websites with the same traffic, then something must have changed to provoke the out-of-memory events that lead to the daemon crashes (and any other related or collateral damage), and we must examine what those changes might have been.

There is no single simple answer, but a significant contributory cause would seem to be the ever-growing data set of malware signatures.

Freshclam updates the main, daily and bytecode signature lists (normally every 4 hours). I am not sure how often this happens on a cPanel installation, but assuredly at least once a day.

Combine that with a ClamAV version that is almost always behind the latest release, and therefore possibly unable to make the most efficient use of the newest signatures, and with the ever-growing demands from the operating system and from cPanel and its controlled software and daemons (FTP, mail, Apache, DNS, etc.). The result may be significant out-of-memory events when a perfect storm of incoming mail needs to be scanned at the same time as some other event, like backups or heavy SQL queries.

I stumbled upon a number of articles expressing the desire for the signatures to be swapped out of memory when the daemon is not using them, or for a facility to only load a partial data set so as to save on the total memory footprint - but the philosophy of the software developers would seem to be that any loading/unloading of the malware signatures would itself create unnecessary disk I/O, and probably use more system resources than just having the signatures permanently loaded into memory. I kinda/sorta see their point, and I have to confess that I don't know enough detail about how the daemon works to make any sensible comments or suggestions.

To reiterate a comment I made elsewhere: "Nothing stays the same". Even normal updates and upgrades may contribute to a changed memory footprint that will need to be periodically reassessed. And if you have bothered to read this far down the post... thank you!

Hope this helps.
 

marjwyatt

@rpvw, thanks for sharing your analysis with this thread. What you've offered makes sense. On my local computers, I suppose I could compare it to the amount of resources Malwarebytes uses during its database updates and scans.

My woes on this issue began in mid-May of this year. I may have added a couple of development sites, since then, but traffic is not impacted by these sites. Since most of my development work is done offline using Xampp, the new sites on my VPS are there merely for review and tweaking prior to moving them over to their final destination.

Yesterday (October 28, 2018), I uninstalled the ClamAV plugin and reinstalled the latest version through WHM. This morning, there were issues with a duplicate clam database so I resolved that by following the advice on this thread in the forum.
CLAMD duplicate database?

I won't know if that solved the problem until tomorrow morning.

Another correspondent on this thread suggested installing Munin. I would but I'm afraid it will exacerbate the overuse of memory.

I have contacted A2 Hosting and requested a memory upgrade for my Dynamic VPS. I'm not sure if that is the root cause or when the memory upgrade will be scheduled.

If there are any revelations to share with the public after cPanel Support has looked things over, I will do so in order to assist others who might be experiencing similar issues.
 