I need help with troubleshooting repeated downtime

skrl

Active Member
Mar 18, 2021
37
6
8
Kingston
cPanel Access Level
Website Owner
I have a couple of servers. The first runs cPanel and is a DNS server and a mail server. The second runs Plesk and is a web server and a MySQL server, hosting approximately 10 sites. I also have UptimeMonitor sending a GET request to every site once a minute. For the past couple of weeks, I have been getting a notification every other night that one of the sites is experiencing downtime. Each notification includes an incident start date/time and an end date/time. On average the reported downtime lasts 30 to 40 minutes, and it takes place during the early morning hours, so no one is active at that time to notice it. If it weren't for UptimeMonitor, I would probably not have known about it either.
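With DNS on one server and the sites on the other, a failed GET could originate on either machine. The following is a minimal sketch, not how any particular monitoring service works, of a probe that reports which stage of a request failed. The function name `probe` and its return labels are made up for illustration, and it assumes plain HTTP on port 80; an HTTPS site would need `http.client.HTTPSConnection` instead.

```python
import http.client
import socket


def probe(host: str, port: int = 80, path: str = "/", timeout: float = 10.0) -> str:
    """Return the stage at which a plain-HTTP GET fails.

    'dns'     : the host name could not be resolved
    'connect' : resolved, but the TCP connection or transfer failed
    'http'    : the server answered with a malformed reply or a 4xx/5xx status
    'ok'      : the server answered with a status below 400
    """
    try:
        # Name resolution is tested on its own so a DNS outage
        # is not mistaken for a web server outage.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"

    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except OSError:
        # Refused, reset, or timed-out connections all end up here.
        return "connect"
    except http.client.HTTPException:
        return "http"
    finally:
        conn.close()
    return "ok" if status < 400 else "http"
```

Run from cron on a third machine during the usual incident hours, a result of 'dns' would point toward the cPanel server, while 'connect' or 'http' would point toward the Plesk server.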

Anyway, I have SSH'ed into the cPanel server and run uptime, which shows the server has been up for days since the last restart.

I have checked the journalctl output. There are some records in red lettering, like "systemd-logind[652]: Failed to remove runtime directory /run/user/1005: Device or resource busy". Otherwise it is mostly white-font records: imap-login entries, "failed password for root from [some IPv4]" followed by "received disconnect from [said IPv4] Bye Bye", and "Firewall TCP_IN Blocked" and "Firewall: UDP_IN Blocked" entries.

I have verified that the time zones are set correctly on both the cPanel server and UptimeRobot, so that I am not checking logs that are X hours off. Just to be on the safe side, I have also checked the log entries before and after the reported timestamps, converted to my time zone.
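One way to take time zone guesswork out of this check is to convert the alert window to UTC and query the journal in UTC. The sketch below assumes the monitor reports times at a fixed UTC offset; the helper name `journal_window`, the offset, and the incident times are hypothetical. It relies on journalctl accepting a timestamp with a "UTC" suffix for `--since` and `--until`, and on its `--utc` flag for displaying times in UTC.

```python
from datetime import datetime, timedelta, timezone


def journal_window(start_local, end_local, utc_offset_hours, padding_minutes=15):
    """Convert a monitor's local incident window to padded UTC strings."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    pad = timedelta(minutes=padding_minutes)
    # Attach the monitor's offset, move to UTC, then widen the window
    # so entries just before and after the incident are included.
    start = start_local.replace(tzinfo=tz).astimezone(timezone.utc) - pad
    end = end_local.replace(tzinfo=tz).astimezone(timezone.utc) + pad
    fmt = "%Y-%m-%d %H:%M:%S UTC"
    return start.strftime(fmt), end.strftime(fmt)


# Hypothetical incident: 03:00 to 03:40 as shown by a monitor set to UTC-5.
since, until = journal_window(
    datetime(2023, 1, 17, 3, 0), datetime(2023, 1, 17, 3, 40), utc_offset_hours=-5
)
command = ["journalctl", "--utc", "--since", since, "--until", until]
print(" ".join(repr(part) if " " in part else part for part in command))
```

The same window can then be used on both servers, regardless of how each one's local time zone is configured.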

I have been unable to trace any indication that the cPanel server is the culprit here.

Would you kindly advise what else I could/should check prior to ruling out this server, and steering my attention to the web server next?
 

cPRex

Jurassic Moderator
Staff member
Oct 19, 2014
16,598
2,620
363
cPanel Access Level
Root Administrator
Hey there! Does running "last reboot" also confirm the time you see from the "uptime" command? If so, it seems something is rebooting the machine - or possibly a failed monitoring tool is telling someone the system needs to be rebooted - but that wouldn't be something that is initiated by cPanel tools.
 

skrl

Indeed it does confirm that.

"last reboot" was at "Fri Dec 30 16:07 - 10:27 (17+18:20)", and "uptime" yields "10:33:44 up 17 days, 18:26, 1 user, load average: 0.14, 0.12, 0.07"

It is good to know that it cannot be cPanel's fault. That is one less thing to worry about.

Thanks!
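The two outputs quoted above can be cross-checked with a little arithmetic: subtracting the uptime duration from the moment uptime was run should give the boot time that "last reboot" reports. A minimal sketch follows. The function names are made up for illustration, and the calendar date is an assumption, since uptime prints only the time of day. The incident window at the end is hypothetical as well.

```python
import re
from datetime import datetime, timedelta


def parse_uptime_duration(uptime_line: str) -> timedelta:
    """Read the "up N days, HH:MM" part of an uptime line.

    Only this layout is handled; short uptimes printed in
    minutes ("up 5 min") raise ValueError.
    """
    match = re.search(r"up\s+(?:(\d+)\s+days?,\s+)?(\d+):(\d+)", uptime_line)
    if match is None:
        raise ValueError(f"unrecognised uptime format: {uptime_line!r}")
    days = int(match.group(1) or 0)
    return timedelta(days=days, hours=int(match.group(2)), minutes=int(match.group(3)))


def boot_time(sampled_at: datetime, uptime_line: str) -> datetime:
    """Boot time implied by an uptime line captured at sampled_at."""
    return sampled_at - parse_uptime_duration(uptime_line)


def rebooted_during(boot: datetime, incident_start: datetime, incident_end: datetime) -> bool:
    """True if the boot time falls inside the reported incident window."""
    return incident_start <= boot <= incident_end


line = "10:33:44 up 17 days, 18:26, 1 user, load average: 0.14, 0.12, 0.07"
sampled_at = datetime(2023, 1, 17, 10, 33, 44)  # assumed date
boot = boot_time(sampled_at, line)
print(boot)  # 2022-12-30 16:07:44

# Hypothetical alert window, for illustration only.
print(rebooted_during(boot, datetime(2023, 1, 15, 3, 0), datetime(2023, 1, 15, 3, 40)))
```

The computed boot time agrees with the "Fri Dec 30 16:07" entry from "last reboot". Comparing each alert window against that boot time shows whether a reboot could account for it.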
 

cPRex

I'm glad that helped. I just wanted to clarify that there isn't a situation where cPanel would ever automatically reboot a server.

This at least confirms that there are entire server reboots happening, so hopefully your host will be able to help track down more details if you aren't able to see anything on your end.
 

skrl

It's very helpful to know that a cPanel process can't automatically trigger a server reboot.

I've also checked the web server logs from Plesk, and there seems to be no record of a restart there either.

At this time there is some fog blocking my view of why I received that UptimeRobot monitor alert, but that is out of the scope of this forum, and I'll have to look elsewhere next.

Thanks again. :D
 