Strange "fake" CPU usage during Backup

Discussion in 'Data Protection' started by Rogerio, Oct 5, 2018.

  1. Rogerio

    Rogerio Well-Known Member

    Joined:
    Sep 26, 2016
    Messages:
    74
    Likes Received:
    13
    Trophy Points:
    8
    Location:
    Sao Paulo, Brazil
    cPanel Access Level:
    Root Administrator
    Hello,

    I'm having a problem with "fake" CPU usage during the cPanel backup. It does not always happen, only about once every 3-4 months and for no apparent reason, but each time I need to reboot the server and I lose that day's backups.

    Hard to explain, but... let's go.

    The server is a VPS on RamNode, VDS package, with only 1 dedicated 3.4 GHz CPU and 4 GB RAM. The VPS runs under KVM (host).

    The backup is set to run at 01:30. It runs every day.

    At 03:00 AM local time, which is 00:00 UTC, I start receiving alerts from my Nagios saying that the CPU is high. When I run "top", the CPU is 100% idle but the load average counters are crazy, like: 1.02 (1 min), 1.03 (5 min), 1.02 (15 min). In fact, nothing is using the CPU except the basic 2-3% for services and the OS.

    But at 23:55 UTC the counters are like 0.02 0.03 0.02. In the real world, the 1.00 (15 min) cannot be true. So I believe that the counters just move to 1.0 1.0 1.0 immediately at 00:00 UTC.

    Logs...

    and stuck... then, a reboot, and the next line:

    The "System load is currently 0.88" message is normal and occurs every day, but the backup normally proceeds after a few seconds.

    Ideas: something related to the daily cron? The server is 95% a basic cPanel install on a minimal CentOS install, with everything as recommended by the docs... I only installed other packages like mrtg, sysstat, iotop and nrpe (nagios) afterwards.

    Note: at first this appeared to be related to CentOS 7 or KVM, but... the problem only happens on cPanel servers AND during the backup AND at 00:00 UTC (always). And it is not related to RamNode, because the problem happened once on another cPanel server (but I don't remember if it was KVM, Xen or OpenVZ).

    So... any ideas? :(
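
One way to check whether the counters really jump at 00:00 UTC is to record the load averages with a UTC timestamp around that time. A minimal sketch, assuming a Linux host that exposes /proc/loadavg; the function name log_loadavg is hypothetical, and the optional file argument exists only so the function can be exercised without a live system.

```shell
# Print one line: a UTC timestamp followed by the 1-, 5- and 15-minute load averages.
# The optional argument defaults to /proc/loadavg, whose content looks like:
#   1.02 1.03 1.02 1/123 4567
log_loadavg() {
    src="${1:-/proc/loadavg}"
    # Only the first three fields are kept; the remainder of the line is ignored.
    read -r one five fifteen _rest < "$src" || return 1
    printf '%s %s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$one" "$five" "$fifteen"
}
```

Calling this once a minute and appending the output to a file would give a record to compare against the alert times.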
     
  2. cPanelLauren

    cPanelLauren Forums Analyst II Staff Member

    Joined:
    Nov 14, 2017
    Messages:
    6,161
    Likes Received:
    474
    Trophy Points:
    233
    Location:
    Houston
    cPanel Access Level:
    DataCenter Provider
    Hi @Rogerio

    Is this happening on multiple servers you have, or only one? To be honest, this isn't behavior I've seen, but if you're experiencing it on several servers I would suggest opening a ticket using the link in my signature so that we can observe the behavior while it's occurring. Once it is open, please reply with the Ticket ID here so that we can update this thread with the resolution once the ticket is resolved.


    Thanks!
     
  3. Rogerio

    Rogerio Well-Known Member

    Hello Lauren,

    The problem is rare, so it is hard to monitor. I don't know how to fix it other than by rebooting. Do you know any way to force Linux to "reload" the load average counters? Or a way to kill a stuck cPanel backup and force it to re-run?
     
  4. cPanelLauren

    cPanelLauren Forums Analyst II Staff Member

    It really sounds like I/O wait on the host node to me. This is a common occurrence, especially if you're running VPS servers.
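
On Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on I/O) as well as runnable ones, so a single task blocked on I/O can hold the load near 1.0 while the CPU stays idle. A minimal sketch for listing such tasks, assuming the procps version of ps; the function names are hypothetical.

```shell
# Keep only tasks whose state starts with D (uninterruptible sleep).
# Expects "STATE PID COMMAND" lines on stdin.
filter_dstate() {
    awk '$1 ~ /^D/'
}

# List the D-state tasks on the running system (empty headers suppress the title row).
list_dstate() {
    ps -e -o state= -o pid= -o comm= | filter_dstate
}
```

Running list_dstate while the load is stuck at 1.0 would show whether a blocked task is behind it.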

    As for forcing a reload of the counters: there is not a way that I am aware of, no.

    If it truly is stuck you can always kill the process and then manually restart it by running:
    Code:
    /usr/local/cpanel/bin/backup --force
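
The "kill the process" step can be sketched as a polite stop followed by a forced one. This is only an illustration: the function name stop_pid and the default 10 second grace period are assumptions, and how the stuck backup appears in the process list has to be checked on the server itself.

```shell
# Ask a process to exit with SIGTERM, wait up to GRACE seconds, then fall back to SIGKILL.
# Returns 0 if SIGTERM was enough (or the process was already gone), 1 if SIGKILL was sent.
stop_pid() {
    pid="$1"
    grace="${2:-10}"
    kill -TERM "$pid" 2>/dev/null || return 0
    i=0
    while [ "$i" -lt "$grace" ]; do
        # kill -0 sends no signal; it only tests whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    kill -KILL "$pid" 2>/dev/null
    return 1
}
```

Note that a task in uninterruptible sleep generally does not act on either signal until its I/O completes, so this may not help if the backup is blocked on the host's storage. Once the old process is gone, the backup can be started again with the command above.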
     
  5. LucasRolff

    LucasRolff Well-Known Member

    Joined:
    May 27, 2013
    Messages:
    59
    Likes Received:
    45
    Trophy Points:
    18
    cPanel Access Level:
    Root Administrator
    I can also recommend trying to install netdata (github.com/netdata/netdata) or possibly the Munin plugin in cPanel. It might be that you get a high load alert and, by the time you log in, the process that caused the load is already "done", thus giving you 100% idle CPU.

    It's important to remember that the Unix load is a relative figure, averaged over intervals of 1, 5 and 15 minutes, so if you had a big spike in load 50 seconds ago and nothing now, the 1 minute load will still be "high".

    The benefit of netdata or Munin is that you have historical data to check. I personally prefer netdata because it goes down to a 1 second resolution for a whole lot of system metrics (it keeps them for 1 hour by default, and doesn't consume too much memory).

    It will probably help you narrow down the problem a lot more easily.
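
The averaging can be made concrete with a little arithmetic. On Linux the three figures are exponentially damped averages with time constants of 1, 5 and 15 minutes, so a reading decays (or rises) gradually. A rough sketch using a continuous approximation (the kernel actually updates in discrete steps); the function name load_after and the example numbers are illustrative only.

```shell
# Value of an exponentially damped average after SECONDS, starting at START,
# with the instantaneous load held constant at TARGET.
# WINDOW is the time constant in seconds: 60, 300 or 900.
# Usage: load_after START TARGET SECONDS WINDOW
load_after() {
    LC_ALL=C awk -v s="$1" -v n="$2" -v t="$3" -v w="$4" \
        'BEGIN { printf "%.2f\n", n + (s - n) * exp(-t / w) }'
}

load_after 4 0 50 60      # 1-minute average 50 s after a spike to 4.0 ends: prints 1.74
load_after 0 1 900 900    # 15-minute average after 15 min of constant load 1: prints 0.63
load_after 0 1 3600 900   # the same after 60 min: prints 0.98
```

Under this approximation the 15-minute figure needs roughly an hour of sustained load 1 to get close to 1.0.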
     
    #5 LucasRolff, Oct 9, 2018
    Last edited by a moderator: Oct 12, 2018
    cPanelLauren likes this.
  6. Rogerio

    Rogerio Well-Known Member

    Hello @cPanelLauren, thanks for the info.

    @LucasRolff I already use Munin but, as I said, the problem is only the counters. The server was idle for sure until the cPanel backup. I run MRTG too, every 5 minutes, and it shows no CPU usage until the problem. I also run NRPE (Nagios), which monitors CPU usage every 2 minutes and notifies me by SMS and Pushover.
     
  7. cPanelLauren

    cPanelLauren Forums Analyst II Staff Member

    If and when it happens again, I'd be curious to see some sysstat info, specifically the output of the following:

    Code:
    sar -p 
    Or for historic usage (pinpoint a specific time/date)

    Code:
    sar -p -f /var/log/sa/saXX
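
To narrow the historic view to the window around the incident, sar also accepts a start and end time, and its -q and -u reports cover load averages and %iowait respectively. A minimal sketch, assuming GNU date and the /var/log/sa/saDD file layout from the command above; the function names sa_file_for and sar_window are hypothetical.

```shell
# Path of the sysstat data file for a given date (YYYY-MM-DD).
sa_file_for() {
    printf '/var/log/sa/sa%s\n' "$(date -d "$1" +%d)"
}

# Show run queue and load averages (-q) and CPU utilization including %iowait (-u)
# between two times of day (hh:mm:ss) on that date.
sar_window() {
    file="$(sa_file_for "$1")"
    sar -q -f "$file" -s "$2" -e "$3"
    sar -u -f "$file" -s "$2" -e "$3"
}
```

For example, sar_window 2018-10-05 01:30:00 03:30:00 would cover the period from the backup start to the first alerts on that day. A window that crosses midnight spans two data files and needs two calls.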
     
    Rogerio likes this.
  8. Rogerio

    Rogerio Well-Known Member

    Hello @cPanelLauren, OK, I'll update this thread when it happens again. Fortunately, the problem is rare, only every 3-4 months.
     
    cPanelLauren likes this.