deseweb

Member
Aug 4, 2010
11
0
51
cPanel Access Level
Root Administrator
Hi

Before anyone replies, I want to mention that we have already tried to solve this problem with our admin and with companies and professionals in the field, but none of them was able to identify and solve it.

To avoid spending more money without getting a solution, I wonder if anyone on the forum could help us identify the process that is causing the disk I/O problem.

The server has a high load at various times of the day and night, and I can see that when the load is high, disk utilization is at 100%, as follows:

I got this result by monitoring I/O every 15 seconds:

Code:
Time: 02:40:27 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     2.86     0.47     0.27     3.73     2.66     8.73    95.27 34027.00  1363.82    99.95

Time: 02:40:42 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.73     0.47     0.80     4.27    12.80    13.47    98.33 48013.47   789.58   100.01

Time: 02:40:57 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.00     0.47     0.47     3.73     4.80     9.14    99.22 27893.64  1071.57   100.01

Time: 02:41:12 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.00     0.47     0.33     3.73     3.73     9.33    96.26 47267.17  1250.17   100.01

Time: 02:41:27 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.00     0.47     0.27     3.73     2.13     8.00    94.25 44034.09  1363.64   100.00

Time: 02:41:42 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.00     0.53     0.53     6.40    45.87    49.00    89.86 70475.62   937.69   100.02

Time: 02:41:57 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00     0.00     0.47     0.27     4.80     2.13     9.45    84.25 54866.55  1363.82   100.01

Time: 02:42:12 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda        13.33   187.40    65.93    30.00  1609.60  1849.07    36.05    33.66  7428.83    10.43   100.01

Time: 02:42:27 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda        15.40   199.53    91.40    70.67  1420.80  2176.53    22.20     5.98    38.75     6.16    99.86

Time: 02:42:42 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         1.40   120.45    35.38   113.19   353.36  1869.15    14.96    13.91    93.68     3.20    47.48

Time: 02:42:57 AM
Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda         0.00    12.60     0.67     6.80     5.33   155.20    21.50     0.13    16.88     7.71     5.76
I cannot identify the process that is causing this problem. I was told it could be excessive connections in Dovecot, Apache or MySQL, but I can't work out which one is the real problem, or even whether any of them is.

We asked our datacenter to check the server disks for errors or problems, but according to the datacenter's experts the disks are fine.

Where do I go from here? Bear in mind that I am not an expert on the subject; I am trying to solve this myself because the professionals who tried have failed.

Thanks
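For anyone repeating this measurement, here is a minimal sketch of how the slow samples could be picked out of such a log automatically. It assumes the older sysstat layout pasted above ("Time:" lines, with await in column 10 and %util in column 12); newer iostat versions print different columns. The name flag_slow_samples is made up for this example.

```shell
# Sketch: print only the iostat -x samples whose await exceeds a threshold.
# Assumes the column layout shown in this post: await is field 10, %util is field 12.
# flag_slow_samples is a hypothetical helper name, not part of sysstat.
flag_slow_samples() {
    threshold_ms="${1:-1000}"                         # default threshold: 1000 ms
    awk -v limit="$threshold_ms" '
        $1 == "Time:"   { stamp = $2 " " $3; next }   # remember the latest timestamp
        $1 == "Device:" { next }                      # skip the repeated header row
        NF == 12 && $10 + 0 > limit + 0 {
            printf "%s %s await=%s ms util=%s%%\n", stamp, $1, $10, $12
        }
    '
}
```

Something like `iostat -dxt 15 | flag_slow_samples 1000` would then show only the intervals where requests waited longer than a second, which makes it easier to line them up with cron jobs or backups.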
 

deseweb

Quote:
Hey deseweb,

That output looks very much like the output from iostat rather than iotop. As alphawolf50 suggested, if you install and run iotop, it will list your processes and show how much read/write I/O each one is using.

Sorry, I confused the two commands.

The problem with 'iotop' is the display: it refreshes very fast, and sometimes I cannot pin down the processes.

I am using the command 'iotop --only'.

Thanks
 

LDHosting

Well-Known Member
Jan 19, 2008
93
2
58
cPanel Access Level
Root Administrator
It's no problem, they are rather similar, easy enough to mistake them at a glance.

If the output is going a little quick for you, there are a few things you could try to make it more readable:

Slow the refresh rate down (example would refresh every 5 seconds):
Code:
iotop -od 5
Output to a text file instead (example would output 20 times to /root/iotop.txt)
Code:
iotop -obn 20 > /root/iotop.txt
You may also want to check the output from vmstat
Code:
vmstat 5 5
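Once the batch output is in a text file, it can be summarised instead of read line by line. The sketch below counts how often each command appears with a high IO percentage. It assumes the iotop column layout TID, PRIO, USER, DISK READ, DISK WRITE, SWAPIN, IO, COMMAND (so the IO value is field 10 and the command starts at field 12); other iotop versions may differ. The name top_io_commands is made up for this example.

```shell
# Sketch: count how often each command shows a high IO% in an iotop batch log.
# Assumes the IO percentage is field 10 and the command name is field 12.
# top_io_commands is a hypothetical helper name, not part of iotop.
top_io_commands() {
    min_io="${1:-50}"                                 # default: count samples at 50% IO or more
    awk -v min="$min_io" '
        # Only per-thread rows start with a numeric TID; headers and totals are skipped.
        $1 ~ /^[0-9]+$/ && $11 == "%" && $10 + 0 >= min + 0 { hits[$12]++ }
        END { for (cmd in hits) printf "%d %s\n", hits[cmd], cmd }
    ' | sort -k1,1nr -k2
}
```

For example, `top_io_commands 50 < /root/iotop.txt` prints one line per command, most frequent first.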
 

deseweb

Hi,

Based on the information below, how can I fix the problem?

# iotop -od 5

Code:
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN IO      COMMAND
  671 be/3 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [kjournald]
    8 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.08 % [migration/2]
   14 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.08 % [migration/4]
25210 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.06 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
   11 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.06 % [migration/3]
   20 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.06 % [migration/6]
   23 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.02 % [migration/7]
 6305 be/4 named       0.00 B/s    0.00 B/s  0.00 %  0.02 % named -u named
 6390 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.02 % xinetd -stayalive -pidfile /var/run/xinetd.pid
21302 be/4 nobody      0.00 B/s    0.00 B/s  0.00 %  0.02 % httpd -k start -DSSL
 6019 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kondemand/7]
 6016 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kondemand/4]
 6053 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6054 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6015 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kondemand/3]
28041 be/5 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6051 be/5 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6017 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kondemand/5]
 6018 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kondemand/6]
 7099 be/6 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % pure-uploadscript -B -r /usr/share/ilabs_antimalware/pure-ftpd-inspector.php
 6216 be/4 dbus        0.00 B/s    0.00 B/s  0.00 %  0.00 % dbus-daemon --system
10647 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % -bash
 5926 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % syslogd -m 0
 5929 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % klogd -x
 6066 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
19110 be/4 nobody      0.00 B/s 1619.05 B/s  0.00 %  0.00 % httpd -k start -DSSL
 6056 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6058 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6059 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6060 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6061 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6062 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6064 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6065 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
 6057 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
29179 be/4 inodontc    0.00 B/s    0.00 B/s  0.00 %  0.00 % imap [[email protected] 189.48.249.146]
21303 be/4 nobody      0.00 B/s    0.00 B/s  0.00 %  0.00 % httpd -k start -DSSL
19097 be/5 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % perl /usr/local/cpanel/bin/leechprotect
19107 be/4 nobody      0.00 B/s    2.37 K/s  0.00 %  0.00 % httpd -k start -DSSL
 6076 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
   10 rt/3 root        0.00 B/s    0.00 B/s  0.08 %  0.00 % [watchdog/2]
   22 rt/3 root        0.00 B/s    0.00 B/s  0.06 %  0.00 % [watchdog/6]
   29 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [events/3]
 5707 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
19113 be/4 nobody      0.00 B/s    0.00 B/s  0.00 %  0.00 % httpd -k start -DSSL
28976 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % python /usr/bin/iotop -od 5
 6067 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
25528 be/4 dovecot     0.00 B/s    0.00 B/s  0.00 %  0.00 % pop3-login
    7 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/1]
   16 rt/3 root        0.00 B/s    0.00 B/s  0.08 %  0.00 % [watchdog/4]
   19 rt/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/5]
   25 rt/3 root        0.00 B/s    0.00 B/s  0.02 %  0.00 % [watchdog/7]
   28 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [events/2]
   34 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [khelper]
   12 be/7 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/3]
   13 rt/3 root        0.00 B/s    0.00 B/s  0.06 %  0.00 % [watchdog/3]
   30 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [events/4]
   31 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [events/5]
22117 be/5 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % spamd child
Note: I replaced the real hostname with 'hostname.domain.com.br' for security reasons.

Thanks!
 

alphawolf50

Well-Known Member
Apr 28, 2011
186
2
68
cPanel Access Level
Root Administrator
The process with the highest I/O seems to be "kjournald", the kernel thread that writes the journal on journaling filesystems such as ext3. You *might* be able to reduce the load by mounting your partitions with "noatime", so that reading a file no longer triggers an access-time write. If you're not sure how to do that, please post the output of the following command:
Code:
cat /etc/fstab
Please paste the output inside of "code" tags rather than "quote" tags, as this makes it much easier to read.
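As a rough illustration of the "noatime" change, the sketch below prints a copy of an fstab with noatime appended to every ext3 entry that lacks it. It only reads stdin and writes stdout, so it never touches /etc/fstab itself. The name add_noatime is made up for this example, and the lines it changes are re-joined with single spaces, so column alignment is lost on them.

```shell
# Sketch: add noatime to the mount options of ext3 entries in fstab text.
# Reads fstab content on stdin, prints the modified content on stdout.
# add_noatime is a hypothetical helper name.
add_noatime() {
    awk '
        /^[ \t]*#/ || NF == 0 { print; next }          # keep comments and blank lines as-is
        $3 == "ext3" && ("," $4 ",") !~ /,noatime,/ {
            $4 = $4 ",noatime"                         # field 4 holds the mount options
        }
        { print }
    '
}
```

A cautious way to use it would be `add_noatime < /etc/fstab > /root/fstab.new`, then review the new file by hand before copying it into place. An already mounted filesystem can pick up the option without a reboot via `mount -o remount,noatime /`.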
 

deseweb

Hi Alphawolf50,

Result:

/dev/VolGroup89/ROOT / ext3 usrjquota=quota.user,jqf mt=vfsv0 1 1
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults,noexec,nosuid 0 0
/tmp /var/tmp ext3 defaults,bind,noauto,usr quota,noexec,nosuid,nodiratime,noatime 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/VolGroup89/SWAP swap swap defaults 0 0
/usr/tmpDSK /tmp ext3 defaults,noauto 0 0
/dev/sdb1 /hd2 ext3 defaults 1 2
Thanks!
 

alphawolf50

Two things:

  1. Either there is an error in your /etc/fstab, or you accidentally added a space where there shouldn't be one. In the first line, there should be no space in "jqf mt=vfsv0". Please run the command again and confirm that there is no space in the output.
  2. **Please paste the results in [ CODE ] tags, instead of [ QUOTE ] tags.** :) It will make it easier for both of us.
 

deseweb

Hi Alphawolf50,

Result:

Code:
/dev/VolGroup89/ROOT	/                       ext3	usrjquota=quota.user,jqfmt=vfsv0 1 1
LABEL=/boot             /boot                   ext3    defaults 1 2
tmpfs                   /dev/shm                tmpfs   defaults,noexec,nosuid 0 0
/tmp                    /var/tmp                ext3    defaults,bind,noauto,usrquota,noexec,nosuid,nodiratime,noatime 0 0
devpts                  /dev/pts                devpts  gid=5,mode=620 0 0
sysfs                   /sys                    sysfs   defaults 0 0
proc                    /proc                   proc    defaults 0 0
/dev/VolGroup89/SWAP    swap                    swap    defaults 0 0
/usr/tmpDSK             /tmp                    ext3    defaults,noauto 0 0
/dev/sdb1	        /hd2	                ext3	defaults 1 2
Thanks!
 

InterServed

Well-Known Member
Jul 10, 2007
267
8
68
cPanel Access Level
DataCenter Provider
Hi,

Is this a virtual machine, or is there no virtualization layer involved?

Please show the output of the following:
Code:
cat /proc/sys/vm/dirty_background_ratio
cat /proc/sys/vm/dirty_ratio
cat /proc/sys/vm/dirty_expire_centisecs
cat /proc/sys/vm/dirty_writeback_centisecs
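To collect all four values in one go, labelled with their sysctl names, a small sketch along these lines may help. The function name show_dirty_settings is made up, and the optional directory argument is there only so it can be tried against a test directory instead of /proc/sys/vm.

```shell
# Sketch: print the four writeback tunables as "vm.name = value" lines.
# show_dirty_settings is a hypothetical helper name.
show_dirty_settings() {
    dir="${1:-/proc/sys/vm}"                           # default to the live kernel settings
    for name in dirty_background_ratio dirty_ratio \
                dirty_expire_centisecs dirty_writeback_centisecs; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'vm.%s = (not readable)\n' "$name"
        fi
    done
}
```

Running `show_dirty_settings` with no argument reads the current kernel values, and the output can be pasted straight into a reply.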
 

deseweb

Hi InterServed,

Results:

Code:
cat /proc/sys/vm/dirty_background_ratio -> 2
cat /proc/sys/vm/dirty_ratio -> 8
cat /proc/sys/vm/dirty_expire_centisecs -> 3000
cat /proc/sys/vm/dirty_writeback_centisecs -> 500
It is a dedicated server without virtualization, used for shared hosting.

I've been watching with 'iotop', and it seems that other processes also reach 100% I/O, for example 'pdflush'.

Thanks!
 

InterServed

Hi,

If you wish, you can test the following changes.
Edit /etc/sysctl.conf and add the following:
Code:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 15
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
execute:
Code:
sync; echo 3 > /proc/sys/vm/drop_caches; sysctl -p
Then monitor the I/O and see how it behaves after these changes.
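To get a feel for what these percentages mean in practice, the ratios can be converted into an approximate amount of memory. This is only a sketch: the kernel applies the ratios to reclaimable ("dirtyable") memory rather than total RAM, so using MemTotal as the base gives a rough upper bound. The name dirty_limit_kb is made up for this example.

```shell
# Sketch: turn a vm.dirty_* percentage into an approximate size in kB.
# Uses the supplied memory size as the base, which overestimates the real limit.
# dirty_limit_kb is a hypothetical helper name.
dirty_limit_kb() {
    mem_kb="$1"      # memory size in kB, e.g. MemTotal from /proc/meminfo
    ratio="$2"       # percentage, e.g. the value of vm.dirty_ratio
    echo $(( mem_kb * ratio / 100 ))
}
```

On a machine with 16623420 kB of RAM, `dirty_limit_kb 16623420 15` prints 2493513 (roughly 2.4 GB of dirty pages before writers are throttled), and `dirty_limit_kb 16623420 5` prints 831171 for the background threshold.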
 

deseweb

I added the lines as suggested.

I will now monitor the I/O.

Thanks

- - - Updated - - -

After adding the lines you suggested, do I need to run any command or restart the server?

Thanks!
 

deseweb

Hi,

Unfortunately, the problem is not resolved!

I/O:

Code:
Total DISK READ: 2.39 K/s | Total DISK WRITE: 15.91 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 7729 be/4 nobody    407.22 B/s    0.00 B/s  0.00 % 99.99 % httpd -k start -DSSL
26579 be/5 mysql       0.00 B/s 2036.12 B/s  0.00 % 99.99 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
26065 be/5 mysql     407.22 B/s    0.00 B/s  0.00 % 99.99 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysq~en-files-limit=50000 --pid-file=/var/lib/mysql/hostname.domain.com.br.pid
32592 be/4 mailnull  407.22 B/s    0.00 B/s -0.01 % 99.99 % exim -bd -q60m
32605 be/4 mailnull  407.22 B/s    0.00 B/s  0.00 % 92.73 % exim -bd -q60m
Load:

Code:
top - 17:02:08 up 3 days, 13:42,  2 users,  load average: 18.43, 7.30, 5.40
Tasks: 285 total,   1 running, 282 sleeping,   1 stopped,   1 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 40.5%id, 59.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16623420k total,  2505780k used, 14117640k free,    67608k buffers
Swap:  5144568k total,       92k used,  5144476k free,   426152k cached
Thanks!
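A load average of 18 with the CPUs almost entirely idle or in I/O wait usually means many tasks are blocked in uninterruptible sleep (state D), since Linux counts those toward the load. A minimal sketch for listing them is below. The name blocked_tasks is made up, and it expects input in the format printed by `ps -eo state,pid,user,comm`.

```shell
# Sketch: keep only tasks in uninterruptible sleep (state D) from ps output.
# Expects "ps -eo state,pid,user,comm" style lines on stdin.
# blocked_tasks is a hypothetical helper name.
blocked_tasks() {
    awk '$1 ~ /^D/ { print }'                          # field 1 is the process state
}
```

Running `ps -eo state,pid,user,comm | blocked_tasks` a few times during a load spike shows which processes are stuck waiting on the disk at that moment.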