Problem with high load, high CPU iowait

Berbox

Well-Known Member
Apr 4, 2005
55
0
156
Belgium
Hello,

I have a huge problem with a server.
For about 3 days now the server has been showing high loads at random. The periods of high load keep getting longer; now the load stays high for several hours.

I checked all the sar parameters, and they are all normal except for the CPU iowait. Below you can see it is 78%, but mostly it is about 90-100%.

top - 03:18:34 up 1:39, 4 users, load average: 6.94, 6.95, 8.25
Tasks: 126 total, 3 running, 120 sleeping, 1 stopped, 2 zombie
Cpu(s): 19.3% us, 2.7% sy, 0.0% ni, 0.0% id, 78.1% wa, 0.0% hi, 0.0% si
Mem: 2073956k total, 1701132k used, 372824k free, 120984k buffers
Swap: 2096440k total, 0k used, 2096440k free, 1180004k cached

PID   USER   PR  NI  VIRT RES  SHR S %CPU %MEM   TIME+ COMMAND
22559 nobody 15   0 31396 19m  20m S  8.6  0.9 0:00.66 httpd
22360 nobody 15   0 31368 19m  20m S  4.3  0.9 0:00.74 httpd
22560 nobody 15   0 31532 19m  21m S  2.7  0.9 0:00.46 httpd
22362 nobody 16   0 31864 19m  21m S  2.3  1.0 0:12.36 httpd
22359 nobody 15   0 31500 19m  21m S  0.7  0.9 0:00.38 httpd
22378 nobody 15   0 31688 19m  21m S  0.7  1.0 0:00.65 httpd
22636 root   16   0  2524 968 1684 R  0.3  0.0 0:00.13 top
1     root   16   0  2224 548 1424 S  0.0  0.0 0:00.97 init
2     root   34  19     0   0    0 S  0.0  0.0 0:00.00 ksoftirqd/0
3     root    5 -10     0   0    0 S  0.0  0.0 0:00.00 events/0
4     root   15 -10     0   0    0 S  0.0  0.0 0:00.00 khelper
5     root   15 -10     0   0    0 S  0.0  0.0 0:00.00 kacpid
26    root    5 -10     0   0    0 S  0.0  0.0 0:00.00 kblockd/0
36    root   20   0     0   0    0 S  0.0  0.0 0:00.00 pdflush
37    root   15   0     0   0    0 S  0.0  0.0 0:00.29 pdflush
39    root   14 -10     0   0    0 S  0.0  0.0 0:00.00 aio/0
27    root   15   0     0   0    0 S  0.0  0.0 0:00.00 khubd


Can somebody give me some advice on what is causing this? I disabled all services in cPanel, but this did not lower the CPU iowait. It's urgent. Thank you.

I am trying to move the accounts to another server, but there is not much CPU capacity left, so the transfers are really slow.
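Editor's sketch: the top summary in this post can be read mechanically. The Python snippet below is only an illustration, and the function name parse_top_cpu is made up for this example. It pulls the percentages out of the "Cpu(s):" line quoted above. A high "wa" value together with "id" at zero suggests the CPU is mostly waiting on I/O rather than doing work, which matches what the poster describes.

```python
import re

def parse_top_cpu(line):
    """Return a dict such as {'us': 19.3, 'wa': 78.1} from a top 'Cpu(s):' line."""
    # Each field looks like "78.1% wa": a number, a percent sign, then a short label.
    fields = re.findall(r"([\d.]+)%\s*(\w+)", line)
    return {name: float(value) for value, name in fields}

# The summary line exactly as quoted in the post.
line = "Cpu(s): 19.3% us, 2.7% sy, 0.0% ni, 0.0% id, 78.1% wa, 0.0% hi, 0.0% si"
cpu = parse_top_cpu(line)
print(cpu["wa"], cpu["id"])  # prints: 78.1 0.0
```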
 

AndyReed

Well-Known Member
PartnerNOC
May 29, 2004
2,217
4
193
Minneapolis, MN
Berbox said: "I checked all the sar parameters, and they are all normal except for the CPU iowait. Below you can see it is 78%, but mostly it is about 90-100%."
Try to optimize your drive(s) as much as possible. Are they PATA, SATA, or SCSI? If they're PATA, use hdparm to set the best possible settings for the drives and your chipset.

You'll also need to find the processes that need the resources and work out why they're not getting them. You might also want to renice the MySQL server to -10 if you're running it. This gives it more priority so it can do its thing faster, but still less priority than vital system processes.
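Editor's sketch: one way to look for the processes that are stuck waiting is to list those in uninterruptible sleep (state "D" in ps), since on Linux these count toward the load average while they wait on I/O. The Python snippet below assumes text in the shape produced by "ps -eo pid,stat,comm". The sample data and the function name blocked_processes are hypothetical, not output from the server in this thread.

```python
def blocked_processes(ps_output):
    """Return (pid, command) pairs for processes in uninterruptible sleep (state D)."""
    blocked = []
    for row in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, stat, command = row.split(None, 2)
        if stat.startswith("D"):
            blocked.append((int(pid), command))
    return blocked

# Hypothetical sample in the shape of `ps -eo pid,stat,comm` output.
SAMPLE = """\
  PID STAT COMMAND
    1 S    init
   37 D    pdflush
22362 D    httpd
22636 R+   top
"""
print(blocked_processes(SAMPLE))  # prints: [(37, 'pdflush'), (22362, 'httpd')]
```

If the same commands show up in state D across repeated runs, they are reasonable candidates for a closer look.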

Also look at the output of "dmesg", "ps auxwww", "free", and "netstat -vap".

You may also check "dstat -cd -f", or "hdparm -tT" followed by the drive's device name, if it's CentOS.
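Editor's sketch: tools that report iowait derive it from the counters on the aggregate "cpu" line of /proc/stat, where the fifth number is time spent in iowait. The Python snippet below shows the arithmetic on two snapshots. The counter values are made up for illustration, and the function name iowait_percent is hypothetical.

```python
def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' lines from /proc/stat."""
    a = [int(x) for x in before.split()[1:]]
    b = [int(x) for x in after.split()[1:]]
    # Fields: user nice system idle iowait irq softirq steal (later fields are
    # ignored here because guest time is already included in user time).
    deltas = [y - x for x, y in zip(a, b)][:8]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0

# Hypothetical snapshots taken a few seconds apart.
BEFORE = "cpu 1000 0 200 5000 3000 0 0"
AFTER = "cpu 1193 0 227 5000 3780 0 0"
print(iowait_percent(BEFORE, AFTER))  # prints: 78.0
```

On a live system the two snapshots would come from reading the first line of /proc/stat twice with a short sleep in between.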
 

Berbox

Hello Andy

Thanks for the reply.

I analyzed the sar statistics and did a filesystem and memory check, but everything was normal.
The system also stays stable. I also disabled HyperThreading in the BIOS and did a kernel upgrade.

The CPU's cooling system is probably broken, because I heard that a P4 CPU with HT will clock itself down to a lower speed. That would explain the high iowait.
I'm investigating this now, and everything points to this problem.
I managed to bring the load below 1 by moving some heavy accounts and disabling SpamAssassin.
I only have to check in the datacenter whether my conclusion is right.
 

AndyReed

Berbox said: "I also disabled HyperThreading in the BIOS and did a kernel upgrade."
Did you compile the kernel by hand? If so, did you enable the inotify module in the kernel?
 

Berbox

After a month, the problem with the high iowait showed up again. This time the problem did not disappear. The server was so slow I couldn't move the accounts to another server.

I did every possible check (hard disk, motherboard, memory, file system) but didn't find anything until I made one final attempt: switching the network controller. All of a sudden the problem was gone. Everything works OK now.

Sometimes you have to look really hard. Here is a list of possible causes of high server loads:

  • High server loads could be caused by just one or several resource-intensive application(s). Examples include very high-traffic Web sites, database-driven Web sites, forums, gaming sites, file download sites and so on.
  • A high server load can also be caused by a malicious script or a "runaway script" which can continuously loop, dragging down the server's resources.
  • Too many websites on one server, with their cumulative resource usage resulting in high server load.
  • Running out of memory and swapping to the swap file.
  • Server backups or server updates are taking place.
  • Mis-configured software causing errors.
  • Users sending large mailing lists.
  • Users trying to bounce spam.
  • Users/spammers sending spam email.
  • Hardware issues, including faulty memory, a bad hard drive, or a bad network card.
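Editor's sketch: since the fix in this thread turned out to be the network controller, one place to look for that kind of hardware issue is the per-interface error and drop counters in /proc/net/dev. The thread does not say whether the faulty controller actually showed errors there, so treat this as an assumption. The Python snippet below parses text in that file's layout; the sample numbers and the function name nic_error_counters are hypothetical.

```python
def nic_error_counters(proc_net_dev):
    """Map interface name to its rx/tx error and drop counters from /proc/net/dev text."""
    counters = {}
    for row in proc_net_dev.splitlines()[2:]:  # the first two lines are headers
        if ":" not in row:
            continue
        name, data = row.split(":", 1)
        f = [int(x) for x in data.split()]
        # Receive columns come first (8 of them), then transmit columns.
        counters[name.strip()] = {
            "rx_errs": f[2], "rx_drop": f[3],
            "tx_errs": f[10], "tx_drop": f[11],
        }
    return counters

# Hypothetical sample in the layout of /proc/net/dev.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000   40000  312   25    0   312          0         0  9000000   52000   18    0    0     0      18          0
"""
stats = nic_error_counters(SAMPLE)
suspects = [name for name, c in stats.items() if any(c.values())]
print(suspects)  # prints: ['eth0']
```

Counters that keep climbing between two readings are more telling than a single non-zero value.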