checkd annoyance during upcp: 2 bugs

Trane Francks

Well-Known Member
Jun 19, 2012
102
10
18
Machida, Tokyo, Japan
cPanel Access Level
Root Administrator
Last night, checkd fired off a warning that named had disappeared. So, at midnight my time, I was forced to climb out of bed and troubleshoot the problem. Much to my satisfaction, things recovered within a minute. That's the good part.

The bad part: checkd fired off its warning as a response to upcp updating bind packages and restarting the service. This is not the first time I've experienced such false warnings. When server packages are updated as a normal process, I do not expect monitors to wake me up. checkd needs to be updated such that it can be told to ignore a service for a moment so that cell phones all over the planet (literally, in our case) don't sound the alarm for no reason.

As a sidebar, and possibly as a consequence of the bind update, the upcp e-mail notification did not get delivered for this update, meaning that the only way I learned of the bind update was by combing through /var/log/messages on a busy server. So, two things:

1. checkd should not trigger notifications for updates or cPanel-triggered service restarts;
2. upcp notifications should be delayed in case e-mail cannot be processed due to package updates.
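
To make item 1 concrete, here is a minimal sketch in Python of a monitor that can be told to ignore a service for a short window. This is not how chkservd is implemented; the class name ServiceMonitor, its methods, and the 120-second window are all hypothetical and only illustrate the behaviour being requested.

```python
import time


class ServiceMonitor:
    """Toy monitor that can be told to ignore a service for a short window.

    Hypothetical illustration only; not chkservd's real interface.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._suppressed_until = {}  # service name -> deadline (clock units)

    def suppress(self, service, seconds):
        """Called by the updater just before it restarts a service."""
        self._suppressed_until[service] = self._clock() + seconds

    def should_alert(self, service, is_up):
        """Return True only for an outage outside any suppression window."""
        if is_up:
            return False
        deadline = self._suppressed_until.get(service)
        if deadline is not None and self._clock() < deadline:
            return False  # expected restart in progress, stay quiet
        return True
```

In this sketch the updater would call suppress("named", 120) before restarting bind. A service that is still down once the window expires triggers an alert as usual, so real failures are not hidden.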

Thanks for letting me whine. I'm grumpy after my beauty sleep has been interrupted.
 

Trane Francks

Hi, Kenneth.

We're on Stable and were upgraded to 11.34.1.6. At the time upcp for that day fired up, we had been running 11.34.0.11. And, sorry, yes, I mean chkservd, the cPanel Service Monitor.

Edited to add: It would appear that the upcp notification wasn't mailed because exim packages were also upgraded (4.80-3 to 4.80-5). This seems to have been a really big update: cPanel, bind, exim, mysql ... Of all times for the upcp notification to not arrive, this was a bad one. ;)
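
A rough sketch of the delayed notification idea, assuming the mail transport is briefly unavailable while its own packages are being upgraded. The names DeferredNotifier, notify, and flush are hypothetical, and the send callable stands in for whatever actually hands the message to the mail server; this is not upcp's real mechanism.

```python
from collections import deque


class DeferredNotifier:
    """Queue notifications and retry until the mail transport accepts them.

    Hypothetical illustration only; not part of upcp.
    """

    def __init__(self, send):
        # send(message) is assumed to raise OSError while mail is unavailable.
        self._send = send
        self._pending = deque()

    def notify(self, message):
        """Queue a message and attempt delivery immediately."""
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Try to deliver queued messages in order; return how many were sent."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except OSError:
                break  # transport still down; keep the message for later
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self):
        """Number of messages still waiting for delivery."""
        return len(self._pending)
```

Calling flush() again once the update run has finished (for example, after the mail server restart completes) would deliver the update log instead of silently dropping it.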
 

Trane Francks

Last night's Stable upgrade from 11.34.x to 11.36.x produced another ChkServd alarm, this time for an imap upgrade. Add the update log not being mailed, mysql updates, etc., and it's clear you guys are failing to dot the i's and cross the t's on your upgrade processing.

I really don't want to be awoken for no good reason. An expected upgrade and restart of a service is NOT a good reason.
 

Infopro

Well-Known Member
May 20, 2003
17,113
507
613
Pennsylvania
cPanel Access Level
Root Administrator
Are you on any of the cPanel email lists? Do you follow the news about updates?

Quite a few changes have been going on in recent months. Upgrading from 11.34.x to 11.36.x is a big change.


Controlling when you do that sort of important update is not a workaround. It's good business sense.
 

Trane Francks

I appreciate your advice here, but discussions of business sense distract from discussions of bugs. We agree that important updates should be handled with all due caution and consideration. Your caution that 11.34 to 11.36 is a big deal makes me aware that cPanel does not use true semantic versioning (major.minor{.maintenance}). We'll revisit how to handle our upgrade strategies in future. That said ....

It's a reproducible problem that when an upgrade triggers a restart of IMAP, ChkServd spams my admins' phones with failure alarms. It's also a reproducible problem that when a certain combination of cPanel and OS package upgrades happens during the same upgrade run, logs are not mailed. Business sense aside, those bugs should be addressed. I noticed it last year and reported it in January. It's now April.

Agile that ain't.