Hello,
Got a stumper for y'all:
- CentOS 7.9 Xen HVM [server]
- V94.0.4
- Load Averages: 5.12 5.34 3.67
- 16 core VDS (Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz)
- 16GB RAM (less than 25% utilized)
- 900GB disk (60% utilized)
- ~60 reseller accounts with hosting
- ~15 reseller accounts with combined hosting and email
But at random, the entire WHM instance grinds to a halt and stops serving requests entirely, both internally and externally. It doesn't even throw an error, and the logs haven't shown what the issue is. Packet captures suggest that spurious UDP traffic from the outside world, aimed at customer cPanel accounts, may be the culprit, but we don't know why. What reinforces the UDP theory is that I can drop a global UDP traffic block on the upstream firewall for just 5 seconds... and the server instantly springs back to life like nothing happened. It then runs for a while before the problem comes back.
Our suspicion was something DNS related (WHM is using PowerDNS), but my team and I are coming up dry on the "why". I don't think it's a DNS resolution issue... the server gives no response whatsoever, even to locally served traffic.
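For anyone who wants to compare notes, here is a rough sketch of one way to check whether a UDP socket's receive queue is backing up during a stall. It is not something we have run in production, only an illustration. It assumes the usual Linux `/proc/net/udp` column layout (hex `tx_queue:rx_queue` in the fifth field, `drops` as the last field), and the function names `parse_proc_net_udp` and `busiest` are made up for this post.

```python
from pathlib import Path


def parse_proc_net_udp(text):
    """Parse /proc/net/udp text into dicts of port, rx_queue and drops.

    Assumes the standard layout: the header is the first line, the local
    address is hex IP:PORT, queues are hex tx:rx, and drops is the last column.
    """
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 13:
            continue  # ignore blank or truncated lines
        port = int(fields[1].rsplit(":", 1)[1], 16)
        rx_queue = int(fields[4].split(":")[1], 16)
        drops = int(fields[12])
        rows.append({"port": port, "rx_queue": rx_queue, "drops": drops})
    return rows


def busiest(rows, top=5):
    """Return the sockets with the most drops, then the largest receive queue."""
    return sorted(rows, key=lambda r: (r["drops"], r["rx_queue"]), reverse=True)[:top]


if __name__ == "__main__":
    path = Path("/proc/net/udp")
    if path.exists():
        for row in busiest(parse_proc_net_udp(path.read_text())):
            print(row)
```

If a port shows a large `rx_queue` or a climbing `drops` count while the box is hung, that would at least narrow down which listener is choking on the UDP traffic.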
Any ideas we should be looking at?