CloudLinux 8 + CageFS + Munin Plugin issues

mtindor

Well-Known Member
Sep 14, 2004
1,394
72
178
inside a catfish
cPanel Access Level
Root Administrator
I just set up a CloudLinux 8 box with the latest WHM and CageFS, LVE Manager, alt-PHP, etc. I installed the Munin plugin, and munin itself is functioning, but apparently it does not have access to certain items in /proc that it needs in order to show Firewall Throughput, Processes, VMstat, etc.

groups munin
munin : munin clsupergid

As you can see, the munin user is in the clsupergid group and should have access. The munin user is not caged by CageFS; in fact, it is excluded by being listed in one of the files in /etc/cagefs/exclude/
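
For anyone who wants to repeat these two checks on their own box, here is a minimal sketch. The function names are made up for illustration, and the default directory /etc/cagefs/exclude is taken from the path mentioned later in this thread; adjust it if your layout differs.

```shell
# Succeeds if USER is a member of GROUP (primary or supplementary).
in_group() {
    local user="$1" group="$2"
    # id -nG prints the group names separated by spaces; compare whole names only
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

# Succeeds if USER appears on a line by itself in any file under DIR.
# DIR defaults to the CageFS exclude directory named in this thread.
listed_in_exclude() {
    local user="$1" dir="${2:-/etc/cagefs/exclude}"
    grep -rqxF -- "$user" "$dir" 2>/dev/null
}
```

Example use: `in_group munin clsupergid && listed_in_exclude munin && echo "munin: in clsupergid and excluded from CageFS"`.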

/var/log/munin/munin-node.log

2021/09/12-08:45:01 CONNECT TCP Peer: "[::ffff:127.0.0.1]:33554" Local: "[::ffff:127.0.0.1]:4949"
2021/09/12-08:45:02 [200549] Error output from vmstat:
2021/09/12-08:45:02 [200549] Error: /proc must be mounted
2021/09/12-08:45:02 [200549] To mount /proc at boot you need an /etc/fstab line like:
2021/09/12-08:45:02 [200549] proc /proc proc defaults
2021/09/12-08:45:02 [200549] In the meantime, run "mount proc /proc -t proc"
2021/09/12-08:45:03 [200549] Error output from swap:
2021/09/12-08:45:03 [200549] /etc/munin/plugins/swap: line 65: /proc/vmstat: Operation not permitted
2021/09/12-08:45:03 [200549] Service 'swap' exited with status 1/0.
2021/09/12-08:45:03 [200549] Error output from irqstats:
2021/09/12-08:45:03 [200549] Can't open /proc/interrupts: Operation not permitted
2021/09/12-08:45:03 [200549] Service 'irqstats' exited with status 1/0.
2021/09/12-08:45:04 [200549] Error output from irqstats:
2021/09/12-08:45:04 [200549] Can't open /proc/interrupts: Operation not permitted
2021/09/12-08:45:04 [200549] Service 'irqstats' exited with status 1/0.
2021/09/12-08:45:04 [200549] Error output from fw_packets:
2021/09/12-08:45:04 [200549] Cannot read /proc/net/snmp: Operation not permitted
2021/09/12-08:45:04 [200549] Service 'fw_packets' exited with status 1/0.
2021/09/12-08:45:04 [200549] Error output from netstat:
2021/09/12-08:45:04 [200549] cannot open /proc/net/snmp: Operation not permitted
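
To get a quick list of which plugins are failing without reading the whole log, something like the following sketch may help. It assumes the "Error output from <plugin>:" line format shown in the excerpt above; the function name is hypothetical.

```shell
# Print each plugin that produced error output in a munin-node log, once each,
# in the order first seen.
failing_plugins() {
    local log="${1:-/var/log/munin/munin-node.log}"
    awk '/Error output from / {
             name = $NF                 # last field, e.g. "vmstat:"
             sub(/:$/, "", name)        # strip the trailing colon
             if (!seen[name]++) print name
         }' "$log"
}
```

Run against the excerpt above, this lists vmstat, swap, irqstats, fw_packets and netstat.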

The installation order was AlmaLinux, then WHM, then conversion to CloudLinux, followed by adding CageFS, LVE Manager, alt-PHP, etc.

When running under plain AlmaLinux, Munin worked fine. It didn't stop working until the OS was converted to CloudLinux. And yes, I opened a ticket with CL, but they didn't seem to find the issue. I'm just wondering if there is anyone out there running CL 8, CageFS, LVE Manager, alt-PHP (the same setup as mine) who has installed the Munin plugin only to see the same issue. I'm basically trying to determine if it's just my machine, or if it's any machine running CL 8 + WHM + CageFS.

Mike
 

cPJustinD

Administrator
Staff member
Jan 12, 2021
286
51
103
Houston
cPanel Access Level
Root Administrator
Hello! Unfortunately, I do not see any similar reports looking through our ticket history. I would suggest uninstalling Munin and then reinstalling it using the process described here:

How to install Munin on cPanel

Once reinstalled, I would also suggest remounting CageFS for all users:

cagefsctl -M

I hope that this helps. If not, then it may be best to open a support ticket so that our analysts can review the issue more thoroughly and determine what exactly is occurring. You can submit a support request using the "Submit a ticket" link in my signature below.

Please be sure to link this thread when opening the ticket and provide the ticket number here so that we can track the issue appropriately. If possible, please post the resolution on this thread as it may help other community members with similar issues.
 

mtindor

If I didn't already say it, Munin was working fine until I converted from AlmaLinux to CloudLinux 8, which was my last step: install AlmaLinux / install cPanel / configure the whole server to my liking / convert to CloudLinux 8.

I already had a CL tech look at it (and in fact still have an open ticket). His only suggestions were to boot into an earlier kernel, or to disable CageFS completely, test, and re-enable it. I don't have an earlier kernel to boot into except a non-CL kernel, and I have no desire to boot into a non-CL kernel on a [now] production server. I did not do the latter either, because I don't know what happens if I blanket-disable CageFS and then re-enable it after a test, or whether I would have to go in and reconfigure everything related to CageFS.

And, although I'd like to have Munin running and reporting on everything it should be reporting on, I don't want to put a ton of effort into it if it ends up simply being a case where there is a flaw in the most current CL 8 kernel that causes it.

That's why I was hoping for people who run CloudLinux 8 + cPanel + CageFS + Munin (and only that specific subset of admins) to chime in and say whether or not their Munin is having the same issues. I have no clue how many CL 8 + cPanel + CageFS boxes exist in production, and especially do not know how many are also running Munin. And out of that subset, I'm not sure how many admins would have looked at their Munin graphs and noticed that various statistics were not available and that the process count shown covered only the processes owned by munin (like 3-4 processes on a busy server). Figured I'd try to get an idea from others running the same setup.

Since I already have asked the CL folk, and I'm not willing to do what they suggested I do to test, do you still think I should open a ticket with cPanel? I certainly can / would. But I don't want to waste anybody's time unnecessarily.

Mike
 

cPJustinD

Hello again! I was able to replicate the error reported in a test environment, but I could not determine exactly why this was occurring.

I think it would be best to open a support ticket so that our analysts can review the issue more thoroughly and determine what exactly is occurring. You can submit a support request using the "Submit a ticket" link in my signature below.

Please be sure to link this thread when opening the ticket and provide the ticket number here to track the issue appropriately. If possible, please post the resolution on this thread as it may help other community members with similar issues.
 

mtindor

Thanks, Justin. I'm glad to know I'm not the only one seeing it. I suspect it's due to the current kernel version or current version of something CL related.

I will indeed open up a ticket with you guys, but I won't do it til later this evening as I have to run some errands.

Thanks!

Mike
 

mtindor

Of note, I think I tested other things as part of this process too, such as setting fs.proc_can_see_other_uid = 1 in /etc/sysctl.d/90-cloudlinux.conf to see if my caged users were then able to see all processes. They weren't, even though they should have been. Of course, the "munin" user isn't caged at all. It was put into /etc/cagefs/exclude/cpaneluserlist by some process (not me).

At any rate, I'll open a ticket this evening.

Mike
 

mtindor

@cPJustinD The issue with CloudLinux and the Munin plugin has been figured out by CloudLinux Support. See my ticket for details.

Basically, munin-node runs as the user root, and the cron job that fires off the 5-minute data collection / graph updates runs as the user munin. I think we all knew this much. But the kicker is that when munin actually processes the plugins that collect the various data, it executes those plugins as user nobody, which is astounding to me. Since nobody is not in the clsupergid group, those plugins can't read the restricted /proc entries.

One could do something foolish like add user nobody to the clsupergid group and it all would work fine. But that should never be done because you really don't want to be granting the extended privileges of the clsupergid group to user nobody.

So those of us running CL and munin are either just going to have to live without much of the useful information we like, or wait for Munin to be updated so that it processes everything under the munin user rather than processing the gathering plugins as user nobody.

Mike
 

cPJustinD

Thank you for that explanation, mtindor! I can certainly understand the inconvenience here. However, I do agree that it would be best to wait for Munin to be updated so that it can process its plugins as a user that can be acceptably added to the clsupergid group.

If you have any other questions or concerns, please let us know!
 

mtindor

I was doing some investigating, figuring out where in the world the various munin data-gathering processes might have picked up the idea to run as 'nobody'. I found out where. But that's irrelevant.

/etc/munin/plugin-conf.d/cpanel.conf is an interesting file

For all of the stats that would not work (which were stats gathered from the swap, vmstat, netstat, processes, irqstats and fw_packets plugins), you can enter a section into /etc/munin/plugin-conf.d/cpanel.conf to tell it what user / group to process those plugins as. By default, in that file there are no stanzas for the plugins that were not working for me, and so they were being processed by user nobody.
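
To see which user a given plugin would end up running as, a rough lookup over the stanzas can be sketched like this. It is a simplification and not Munin's own parser: it assumes the last matching "user" line wins, which may not match how Munin resolves overlapping wildcard sections, and the function name is hypothetical.

```shell
# Print the user a plugin would run as according to plugin-conf.d stanzas.
# Falls back to "nobody" when no matching stanza sets a user.
plugin_user() {
    local plugin="$1" dir="${2:-/etc/munin/plugin-conf.d}"
    local file key value pattern match user=nobody
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        match=0                               # no section is open yet
        while read -r key value || [ -n "$key" ]; do
            case $key in
                ''|'#'*) continue ;;          # blank line or comment
                '['*']')                      # section header, e.g. [if_*]
                    pattern=${key#\[}
                    pattern=${pattern%\]}
                    # $pattern is left unquoted so that * acts as a wildcard
                    case $plugin in
                        $pattern) match=1 ;;
                        *)        match=0 ;;
                    esac ;;
                user)
                    if [ "$match" -eq 1 ]; then user=$value; fi ;;
            esac
        done < "$file"
    done
    printf '%s\n' "$user"
}
```

For example, `plugin_user irqstats` prints nobody when no stanza covers irqstats, which is the situation described above.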

As an example, I added this to /etc/munin/plugin-conf.d/cpanel.conf and then restarted munin-node:

[irqstats]
user munin
group clsupergid

[fw_packets]
user munin
group clsupergid

And now everything works fine for those plugins. It works for some others as well, but some of the remaining ones would likely require root/wheel permissions, and I didn't feel I needed them enough to set them to run as root. Of course, it's clear from that cpanel.conf file that plenty of other plugins were already set to run as user: root / group: wheel. But for all of the plugins that I had an interest in (except vmstat), adding the stanzas to have them processed as user: munin / group: clsupergid worked.
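
If several plugins need the same treatment, the stanzas can be generated rather than typed by hand. This is only a convenience sketch with a made-up function name; it prints to stdout so the result can be reviewed before anything is added to the conf file.

```shell
# Emit one plugin-conf.d stanza per plugin name.
# Usage: emit_stanzas USER GROUP PLUGIN...
emit_stanzas() {
    local user="$1" group="$2" plugin
    shift 2
    for plugin in "$@"; do
        printf '[%s]\nuser %s\ngroup %s\n\n' "$plugin" "$user" "$group"
    done
}
```

For example, `emit_stanzas munin clsupergid irqstats fw_packets` reproduces the two stanzas shown above. Append the output to the conf file only once, then restart munin-node. As noted, this approach did not fix vmstat.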

Maybe that conf file will get obliterated with an update. If so, I have no problem with that. I know where I can go to put the stanzas back in.

Mike
 