The Community Forums

High CPU - I can't find why

Discussion in 'Workarounds and Optimization' started by abdelhost77, Jul 21, 2013.

  1. abdelhost77

    abdelhost77 Well-Known Member

    Joined:
    Apr 25, 2012
    Messages:
    81
    Likes Received:
    1
    Trophy Points:
    8
    cPanel Access Level:
    Root Administrator
    Hello,

    Since yesterday my CPU load has been increasing and I can't work out why. Here is the output of some commands:


    top - 11:18:17 up 13:03, 1 user, load average: 16.92, 19.96, 19.63
    Tasks: 253 total, 2 running, 248 sleeping, 0 stopped, 3 zombie
    Cpu(s): 4.7%us, 1.3%sy, 0.0%ni, 42.5%id, 51.2%wa, 0.0%hi, 0.3%si, 0.0%st
    Mem: 8023892k total, 7535632k used, 488260k free, 1523304k buffers
    Swap: 10239992k total, 0k used, 10239992k free, 4117108k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    22453 mysql 20 0 2386m 348m 5364 S 1.0 4.4 25:11.13 mysqld
    6864 nobody 20 0 209m 34m 2368 S 0.3 0.4 0:00.05 httpd
    6921 nobody 20 0 209m 34m 2208 S 0.3 0.4 0:00.04 httpd
    6942 nobody 20 0 209m 34m 2292 S 0.3 0.4 0:00.02 httpd
    7027 nobody 20 0 209m 34m 2348 S 0.3 0.4 0:00.02 httpd
    7116 nobody 20 0 209m 34m 2332 S 0.3 0.4 0:00.03 httpd
    7513 xxxx 20 0 111m 10m 6220 S 0.3 0.1 0:00.01 php
    22578 root 20 0 14768 696 480 S 0.3 0.0 0:02.92 dovecot
    23248 root 20 0 135m 35m 3464 S 0.3 0.5 0:29.52 httpd
    1 root 20 0 19352 1528 1212 S 0.0 0.0 0:00.86 init
    2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
    3 root RT 0 0 0 0 S 0.0 0.0 0:01.60 migration/0
    4 root 20 0 0 0 0 S 0.0 0.0 0:06.92 ksoftirqd/0
    5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
    6 root RT 0 0 0 0 S 0.0 0.0 0:00.19 watchdog/0
    7 root RT 0 0 0 0 S 0.0 0.0 0:00.59 migration/1
    8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
    9 root 20 0 0 0 0 S 0.0 0.0 0:03.55 ksoftirqd/1
    10 root RT 0 0 0 0 S 0.0 0.0 0:00.06 watchdog/1
    11 root RT 0 0 0 0 S 0.0 0.0 0:00.70 migration/2
    12 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
    13 root 20 0 0 0 0 S 0.0 0.0 0:04.55 ksoftirqd/2
    14 root RT 0 0 0 0 S 0.0 0.0 0:00.10 watchdog/2




    iotop


    Total DISK READ: 381.54 K/s | Total DISK WRITE: 0.00 B/s
    TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
    6195 be/4 nobody 253.06 K/s 0.00 B/s 0.00 % 2.15 % httpd -k start -DSSL
    6403 be/4 nobody 3.89 K/s 0.00 B/s 0.00 % 1.03 % httpd -k start -DSSL
    6256 be/4 nobody 124.59 K/s 0.00 B/s 0.00 % 0.00 % httpd -k start -DSSL
    6270 be/4 nobody 0.00 B/s 7.79 K/s 0.00 % 0.00 % httpd -k start -DSSL
    6169 be/4 nobody 0.00 B/s 7.79 K/s 0.00 % 0.00 % httpd -k start -DSSL
    6145 be/4 nobody 0.00 B/s 7.79 K/s 0.00 % 0.00 % httpd -k start -DSSL
    382 be/4 biocuir 0.00 B/s 15.57 K/s 0.00 % 0.00 % pure-ftpd (UPLOAD)
    1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
    2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
    3 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
    4 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
    5 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
    6 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [watchdog/0]
    7 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/1]
    8 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/1]
    9 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/1]
    10 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [watchdog/1]



    df -h

    Filesystem Size Used Avail Use% Mounted on
    /dev/sda2 39G 8.7G 28G 24% /
    tmpfs 3.9G 0 3.9G 0% /dev/shm
    /dev/sda1 985M 51M 884M 6% /boot
    /dev/sda7 386G 69G 298G 19% /home
    /dev/sda6 4.9G 271M 4.3G 6% /tmp
    /dev/sda3 20G 7.5G 11G 41% /usr





    free -m


    total used free shared buffers cached
    Mem: 7835 7350 485 0 1486 4010
    -/+ buffers/cache: 1852 5983
    Swap: 9999 0 9999




    I also found this in /var/log/messages:


    Jul 21 11:20:29 hiver kernel: ata1: EH complete
    Jul 21 11:20:30 hiver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Jul 21 11:20:30 hiver kernel: ata1.00: BMDMA stat 0x24
    Jul 21 11:20:30 hiver kernel: ata1.00: failed command: READ DMA
    Jul 21 11:20:30 hiver kernel: ata1.00: cmd c8/00:08:10:7d:35/00:00:00:00:00/e5 tag 0 dma 4096 in
    Jul 21 11:20:30 hiver kernel: res 51/40:07:11:7d:35/00:00:00:00:00/e5 Emask 0x9 (media error)
    Jul 21 11:20:30 hiver kernel: ata1.00: status: { DRDY ERR }
    Jul 21 11:20:30 hiver kernel: ata1.00: error: { UNC }
    Jul 21 11:20:30 hiver kernel: ata1.00: configured for UDMA/133
    Jul 21 11:20:30 hiver kernel: sd 0:0:0:0: [sda] Unhandled sense code
    Jul 21 11:20:30 hiver kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    Jul 21 11:20:30 hiver kernel: sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
    Jul 21 11:20:30 hiver kernel: Descriptor sense data with sense descriptors (in hex):
    Jul 21 11:20:30 hiver kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
    Jul 21 11:20:30 hiver kernel: 05 35 7d 11
    Jul 21 11:20:30 hiver kernel: sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
    Jul 21 11:20:30 hiver kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 05 35 7d 10 00 00 08 00





    Please help me!
     
    #1 abdelhost77, Jul 21, 2013
    Last edited: Jul 21, 2013
  2. thinkbot

    thinkbot Well-Known Member

    Joined:
    Oct 30, 2012
    Messages:
    326
    Likes Received:
    0
    Trophy Points:
    16
    cPanel Access Level:
    Root Administrator
    51.2%wa


    That is very high I/O wait. Is this a VPS or a dedicated server?
    Did you have a backup running at the time, or some slow MySQL queries?
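
To make the figure quoted above easier to check, here is a minimal Python sketch that pulls the wa (I/O wait) percentage out of a top summary line like the one in the first post. The function name parse_top_cpu_line and the 20% warning threshold are assumptions chosen for illustration; top itself defines no such threshold.

```python
import re


def parse_top_cpu_line(line):
    """Map each field suffix of a top 'Cpu(s):' line (us, sy, wa, ...) to its percentage."""
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


# Summary line copied from the top output in the first post.
sample = "Cpu(s): 4.7%us, 1.3%sy, 0.0%ni, 42.5%id, 51.2%wa, 0.0%hi, 0.3%si, 0.0%st"
fields = parse_top_cpu_line(sample)

IOWAIT_WARN = 20.0  # assumed threshold, not an official value

if fields["wa"] > IOWAIT_WARN:
    print(f"I/O wait is {fields['wa']}%: the CPUs are mostly waiting on the disk")
```

With user and system time this low and wa this high, a large load average points at processes blocked on disk I/O rather than at CPU-bound work.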
     
  3. abdelhost77

    abdelhost77 Well-Known Member



    It is a dedicated server,
    and no, there are no slow MySQL queries.



    The I/O wait is not always high; see the top output below:

    top - 15:02:44 up 16:47, 1 user, load average: 27.13, 28.10, 25.36
    Tasks: 296 total, 3 running, 291 sleeping, 0 stopped, 2 zombie
    Cpu(s): 17.1%us, 4.1%sy, 0.0%ni, 75.7%id, 2.1%wa, 0.0%hi, 1.1%si, 0.0%st
    Mem: 8023892k total, 7565348k used, 458544k free, 794264k buffers
    Swap: 10239992k total, 172k used, 10239820k free, 5130108k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    22453 mysql 20 0 2450m 379m 5384 S 17.2 4.8 30:38.50 mysqld
    31301 xxxx 20 0 136m 14m 6532 R 1.0 0.2 0:00.03 php-cgi
    22589 dovecot 20 0 37612 2628 1804 S 0.7 0.0 0:02.43 imap-login
    30339 nobody 20 0 209m 34m 2356 S 0.7 0.4 0:00.07 httpd
    31265 xxxx 20 0 98060 15m 2416 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31273 xxxx 20 0 97988 15m 2416 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31275 xxxx 20 0 97980 15m 2416 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31279 xxxx 20 0 98136 15m 2416 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31281 xxxx 20 0 98136 15m 2416 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31285 xxxx 20 0 98348 15m 2408 S 0.7 0.2 0:00.02 cpsrvd-ssl
    31299 xxxx 20 0 0 0 0 Z 0.7 0.0 0:00.02 php <defunct>
    1336 root 20 0 0 0 0 R 0.3 0.0 0:59.18 kondemand/0
    13543 xxxx 20 0 135m 1600 708 S 0.3 0.0 0:01.59 pure-ftpd
    23539 root 20 0 97500 13m 1796 S 0.3 0.2 0:08.15 cpsrvd-ssl
    25875 nobody 20 0 209m 35m 3012 S 0.3 0.5 0:00.26 httpd
    28264 nobody 20 0 209m 34m 2356 S 0.3 0.4 0:00.08 httpd



    I think it may be a hard disk issue.
    Is it safe to reboot with the fsck option? => shutdown -Fr now
     
    #3 abdelhost77, Jul 21, 2013
    Last edited: Jul 21, 2013
  4. thinkbot

    thinkbot Well-Known Member

    Yes.

    You can also run some HDD tests with smartctl.
     
  5. abdelhost77

    abdelhost77 Well-Known Member




    /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/sda


    SMART overall-health self-assessment test result: FAILED!
    Drive failure expected in less than 24 hours. SAVE ALL DATA.
    Failed Attributes:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate 0x002f 001 001 051 Pre-fail Always FAILING_NOW 36753

    ATA Error Count: 12694 (device log contains only the most recent five errors)
    Error 12694 occurred at disk power-on lifetime: 16743 hours (697 days + 15 hours)
    Error 12693 occurred at disk power-on lifetime: 16743 hours (697 days + 15 hours)
    Error 12692 occurred at disk power-on lifetime: 16743 hours (697 days + 15 hours)
    Error 12691 occurred at disk power-on lifetime: 16743 hours (697 days + 15 hours)
    Error 12690 occurred at disk power-on lifetime: 16743 hours (697 days + 15 hours)

    Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
    # 1 Short offline Completed: read failure 90% 16731 87391503



    What do you think?
     
  6. ilaurens

    ilaurens Active Member

    Joined:
    Jul 13, 2013
    Messages:
    28
    Likes Received:
    0
    Trophy Points:
    1
    cPanel Access Level:
    Root Administrator
    Make a backup ASAP.

    That hard disk will fail soon, and you will lose all the data on it.
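
The FAILING_NOW flag in the smartctl output above follows from the attribute row itself: smartmontools treats an attribute as failed when its normalized VALUE is at or below THRESH. Here is a minimal Python sketch of that comparison, assuming a row in the ten-column layout quoted above. The SmartAttribute type and both function names are hypothetical helpers, not part of smartmontools.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int        # normalized value, higher is healthier
    worst: int
    thresh: int
    attr_type: str    # "Pre-fail" or "Old_age"
    when_failed: str
    raw_value: str


def parse_attribute_row(row):
    """Split one row of the smartctl attribute table into named fields."""
    parts = row.split()
    return SmartAttribute(
        attr_id=int(parts[0]),
        name=parts[1],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        attr_type=parts[6],
        when_failed=parts[8],
        raw_value=" ".join(parts[9:]),
    )


def has_failed(attr):
    """An attribute has failed when its normalized value is at or below its threshold."""
    return attr.value <= attr.thresh


# Row copied from the smartctl output posted above.
row = "1 Raw_Read_Error_Rate 0x002f 001 001 051 Pre-fail Always FAILING_NOW 36753"
attr = parse_attribute_row(row)
if has_failed(attr) and attr.attr_type == "Pre-fail":
    print(f"{attr.name}: value {attr.value} <= threshold {attr.thresh}, replace the drive")
```

Here VALUE is 1 against a threshold of 51 on a Pre-fail attribute, which is why smartctl reports the overall health test as FAILED.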
     
  7. abdelhost77

    abdelhost77 Well-Known Member

    Fortunately I have a backup from two days ago.

    The DC will not replace the hard disk before tomorrow, and the server seems to work fine as far as website loading speed is concerned (not very slow), even though the load average is above 20! (Yes, I agree it is strange :) )

    I'm thinking of rebooting with fsck (shutdown -Fr now). Maybe there are just some corrupted filesystems that need to be repaired, but I read somewhere that fsck may get stuck, in which case we would need KVM or console access to unblock it.

    I do not have console access, the DC is not responsive today (Sunday), and I don't know which option is best. My goal is minimum downtime, as we have 500+ live cPanel accounts on that server.
     
    #7 abdelhost77, Jul 21, 2013
    Last edited: Jul 21, 2013
  8. Infopro

    Infopro cPanel Sr. Product Evangelist
    Staff Member

    Joined:
    May 20, 2003
    Messages:
    14,478
    Likes Received:
    203
    Trophy Points:
    63
    Location:
    Pennsylvania
    cPanel Access Level:
    Root Administrator
    Make another set of backups just to be safe while you wait.
     
  9. ilaurens

    ilaurens Active Member

    It might be, but the disk has trouble reading sectors, and that is the reason for the high CPU load. It has to make multiple attempts to read the data.
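
The sectors in question can be located from the kernel log in the first post. Here is a minimal Python sketch, assuming the "CDB: Read(10)" line is a standard 10-byte SCSI READ(10) command (bytes 2-5 hold the starting LBA, bytes 7-8 the sector count) and that the trailing bytes of the sense dump ("05 35 7d 11") carry the failing LBA. The function name decode_read10_cdb is a hypothetical helper.

```python
def decode_read10_cdb(cdb_hex):
    """Return (start_lba, sector_count) from a READ(10) CDB given as spaced hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command block")
    start_lba = int.from_bytes(cdb[2:6], "big")
    sector_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, sector_count


# Values copied from the /var/log/messages excerpt in the first post.
start_lba, sector_count = decode_read10_cdb("28 00 05 35 7d 10 00 00 08 00")
failing_lba = int.from_bytes(bytes.fromhex("05 35 7d 11"), "big")

print(f"read of {sector_count} sectors starting at LBA {start_lba}")  # 8 sectors from 87391504
print(f"unrecovered read error at LBA {failing_lba}")                 # 87391505
```

Under those assumptions the failing read lands within two sectors of the LBA_of_first_error (87391503) that the smartctl self-test reported, which suggests both tools are hitting the same damaged area of the disk.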
     
  10. abdelhost77

    abdelhost77 Well-Known Member


    The CPU load went back to normal. I think maybe it was only a corrupted filesystem that got corrected? I do not know; I did not take any action except backing up the hard disk to an external server.

    Please see the top output below. Everything seems to work fine now, so I'm wondering whether it is still a good idea to replace the hard disk?


    top - 23:55:17 up 1 day, 1:40, 2 users, load average: 0.24, 0.41, 0.43
    Tasks: 240 total, 3 running, 235 sleeping, 0 stopped, 2 zombie
    Cpu(s): 11.1%us, 2.8%sy, 0.0%ni, 82.2%id, 2.7%wa, 0.0%hi, 1.2%si, 0.0%st
    Mem: 8023892k total, 7469112k used, 554780k free, 1318128k buffers
    Swap: 10239992k total, 4004k used, 10235988k free, 3808156k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    30002 user 20 0 0 0 0 Z 13.6 0.0 0:00.41 php <defunct>
    22453 mysql 20 0 2578m 501m 4988 S 6.0 6.4 49:24.66 mysqld
    30017 user 20 0 111m 10m 6152 R 1.0 0.1 0:00.03 php
    28946 nobody 20 0 0 0 0 Z 0.7 0.0 0:00.16 httpd <defunct>
    29562 nobody 20 0 210m 35m 2336 S 0.7 0.5 0:00.07 httpd
    17 root 20 0 0 0 0 S 0.3 0.0 0:13.46 ksoftirqd/3
    19 root 20 0 0 0 0 S 0.3 0.0 0:33.40 events/0
    34 root 20 0 0 0 0 S 0.3 0.0 0:59.12 kblockd/0
    14318 root 20 0 99920 4336 3328 S 0.3 0.1 0:02.08 sshd
    14583 root 20 0 15164 1456 952 R 0.3 0.0 0:29.02 top
    23248 root 20 0 135m 36m 3464 S 0.3 0.5 1:08.38 httpd
    25225 user 20 0 135m 1528 684 S 0.3 0.0 0:00.72 pure-ftpd
    27066 nobody 20 0 210m 35m 2356 S 0.3 0.5 0:00.07 httpd
    28281 nobody 20 0 210m 35m 2392 S 0.3 0.5 0:00.23 httpd
    28969 nobody 20 0 210m 35m 2364 S 0.3 0.5 0:00.10 httpd
    28972 nobody 20 0 210m 35m 2376 S 0.3 0.5 0:00.12 httpd
    29552 nobody 20 0 210m 35m 2300 S 0.3 0.5 0:00.07 httpd
    29688 nobody 20 0 210m 35m 2340 S 0.3 0.5 0:00.03 httpd
    29712 nobody 20 0 210m 35m 2296 S 0.3 0.4 0:00.03 httpd
    29758 nobody 20 0 210m 35m 2352 S 0.3 0.5 0:00.03 httpd
    29772 nobody 20 0 210m 35m 2348 S 0.3 0.5 0:00.05 httpd
    29785 nobody 20 0 210m 35m 2280 S 0.3 0.4 0:00.02 httpd
    29791 nobody 20 0 210m 35m 2292 S 0.3 0.4 0:00.02 httpd
    1 root 20 0 19352 1136 908 S 0.0 0.0 0:01.08 init
     
    #10 abdelhost77, Jul 21, 2013
    Last edited: Jul 21, 2013
  11. abdelhost77

    abdelhost77 Well-Known Member

    Just to be sure,

    we also rebooted with the fsck option; here are the boot logs related to the filesystems:

    ...........
    Checking filesystems
    /dev/sda2: clean, 175011/2564096 files, 2430862/10240000 blocks
    /dev/sda1: clean, 46/64000 files, 16938/256000 blocks
    /dev/sda7: clean, 2871827/25665536 files, 19465422/102639616 blocks
    /dev/sda6: clean, 30943/320000 files, 88745/1280000 blocks
    /dev/sda3: clean, 156109/1281120 files, 1998422/5120000 blocks
    [ OK ]
    Remounting root filesystem in read-write mode: [ OK ]
    Mounting local filesystems: [ OK ]
    Enabling local filesystem quotas: [ OK ]
    Enabling /etc/fstab swaps: [ OK ]
    Entering non-interactive startup
    Calling the system activity data collector (sadc):
    Starting securetmp: *** Notice *** No loop module detected
    If the loopback block device is built as a module, try running `modprobe loop` as root via ssh and running this script again.
    If the loopback block device is built into the kernel itself, you can ignore this message.
    Securing /tmp & /var/tmp
    Securing /tmp... Done
    Setting up /var/tmp... Done
    Checking fstab for entries ...Done
    Logrotate TMPDIR already configured
    Process Complete
     
  12. abdelhost77

    abdelhost77 Well-Known Member

    In case anyone else encounters the same problem:
    I think it may just be related to some RPM update from cPanel (a cPanel expert could confirm this), because when checking the crontab I found that the time at which the disk errors disappeared and the CPU load went down is almost the same as the execution time of:

    /usr/local/cpanel/scripts/upcp

    So it seems that the cPanel update somehow fixed the issue.

    If someone else encounters the same problem, it may be a good idea to run
    /usr/local/cpanel/scripts/upcp

    What is your opinion?
     
    #12 abdelhost77, Jul 21, 2013
    Last edited: Jul 21, 2013
  13. thinkbot

    thinkbot Well-Known Member

    What's the smartctl output on HDD health?
     
  14. ilaurens

    ilaurens Active Member

    I second this question, because the previous SMART error meant that the hard disk will break soon. The 24-hour indication does not really mean it will break after one day; it is just an indicator that it will very likely break soon. That is why we said to make a backup of the data, to avoid any data loss.

    Whether you follow it is up to you; it's just a suggestion. If the data is not important, then do not make a backup.
     