The Community Forums


Most services are failing repeatedly (notably chkservd)

Discussion in 'General Discussion' started by UmbraHosting, Mar 22, 2008.

  1. UmbraHosting

    UmbraHosting Member

    Joined:
    Mar 12, 2006
    Messages:
    13
    Likes Received:
    0
    Trophy Points:
    1
    Hey everyone!

    I have been looking through the log files most of the day today and I can't find anything that would cause this. I recently acquired a VPS so that I can test out RoR hosting. The VPS has CentOS and cPanel/WHM is installed:

    WHM 11.15.0 cPanel 11.18.3-R21703
    CENTOS Enterprise 4.6 i686 on virtuozzo - WHM X v3.1.0

    I have ~40 *very* low traffic domains hosted on the server and none are currently running RoR. I have logged in numerous times today and found this:

    root@jenkins [~]# ps aux |grep exim
    root 19950 0.0 0.0 1664 476 pts/0 S+ 16:48 0:00 grep exim
    root@jenkins [~]# ps aux |grep cppop
    root 20053 0.0 0.0 1668 480 pts/0 S+ 16:48 0:00 grep cppop
    root@jenkins [~]# w
    16:48:29 up 10 days, 17:44, 1 user, load average: 0.06, 0.02, 0.00
    USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
    root pts/0 c-98-223-89-85.h 16:47 0.00s 0.06s 0.00s w
    root@jenkins [~]# /etc/init.d/chkservd restart
    Stopping chkservd: [FAILED]
    Starting chkservd: [ OK ]

    The services that are repeatedly failing are:

    named
    imap
    ftpd
    eximstats
    exim
    cpsrvd
    antirelayd
    chkservd

    Apache has been online for 4 days, however, and the server has been online for 10 days. I searched through the forums but didn't see anything that would lead me down the right path. Does anyone know where I should start?

    Thanks!
    Tom
     
  2. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    The first thing I would check is how much actual RAM you have. Also, what does "top" show, and how much swapping/paging is going on? Don't assume you have enough RAM; make sure you can see how much is free and whether anything in the logs mentions memory issues.

    You might also check whether you are running out of space on any drives. We had a drive run out of space on one partition, and different tasks would die.
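
The memory and disk checks suggested above can be sketched as a small script. This is only a sketch under assumptions: the function names and the 90% threshold are made up for illustration, and the `/proc/user_beancounters` check applies only if the VPS is a Virtuozzo/OpenVZ container that exposes that file (a non-zero `failcnt` there would mean a container limit was hit, even when `top` reports plenty of free memory).

```shell
#!/bin/sh
# Hypothetical helpers for a quick memory/disk sanity check.

# Print "mountpoint use%" for filesystems at or above a usage threshold.
# Reads `df -P` style output on stdin so it can be fed canned data.
full_filesystems() {
    threshold="${1:-90}"   # 90% is an arbitrary default
    awk -v t="$threshold" \
        'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= t) print $6 " " $5 }'
}

# Print "resource failcnt" for beancounters whose failcnt is non-zero.
# Reads the contents of /proc/user_beancounters on stdin.
failed_beancounters() {
    awk 'NR > 2 && $NF + 0 > 0 { print $(NF - 5), $NF }'
}

free -m 2>/dev/null || true           # overall RAM and swap, in MB
df -P | full_filesystems 90           # any partition that is nearly full
if [ -r /proc/user_beancounters ]; then
    failed_beancounters < /proc/user_beancounters
fi
```

If the script prints nothing after the `free` output, no filesystem is over the threshold and no container limit has been hit.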
     
  3. UmbraHosting

    UmbraHosting Member

    Joined:
    Mar 12, 2006
    Messages:
    13
    Likes Received:
    0
    Trophy Points:
    1
    First, thank you for taking the time to reply!

    Second, I asked my host if they recommend adding more RAM to the VPS, but they did not really answer the question. Here's what top and my disk space look like:

    root@jenkins [~]# top
    top - 21:34:30 up 10 days, 22:30, 2 users, load average: 0.38, 0.54, 0.31
    Tasks: 56 total, 1 running, 55 sleeping, 0 stopped, 0 zombie
    Cpu(s): 0.4% us, 0.3% sy, 0.0% ni, 99.3% id, 0.0% wa, 0.0% hi, 0.0% si
    Mem: 1048576k total, 161936k used, 886640k free, 0k buffers
    Swap: 0k total, 0k used, 0k free, 0k cached

    root@jenkins [~]# df -h
    Filesystem Size Used Avail Use% Mounted on
    /dev/vzfs 20G 7.3G 13G 37% /
    simfs 20G 7.3G 13G 37% /tmp
    simfs 20G 7.3G 13G 37% /var/tmp

    It doesn't seem to be using too much memory and disk space is not an issue. I thought it might have been CSF+LFD but I disabled them yesterday and it is still happening :eek:(
     
  4. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    OK, is that the top output with everything running and working, or with things dead?

    I ask because it shows "Tasks: 56 total", which is very few compared to what a normal *nix box would have when you consider all the normal system tasks plus Python, Courier, Apache, stunnel, etc. I just have a feeling that you see a lot of free RAM because nothing is running. Can you do a "ps ax" and see whether normal jobs like Apache, Courier, etc. are actually up?
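
One way to run the "ps ax" check suggested above without `grep` matching its own process (as happened in the `ps aux |grep exim` output earlier in the thread) is to compare exact command names. The sketch below is illustrative only: `missing_services` is a hypothetical helper, and the command names passed to it are assumptions about how the daemons appear in the process table on this server.

```shell
#!/bin/sh
# Print each expected name that does not appear as an exact command
# name in the process list read from stdin (one command name per line).
missing_services() {
    procs=$(cat)
    for name in "$@"; do
        printf '%s\n' "$procs" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Command names below are assumed; adjust them to match the server.
ps -e -o comm= | missing_services httpd exim named pure-ftpd cpsrvd
```

Any name printed is a service with no matching process; empty output means every listed name was found.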
     
  5. UmbraHosting

    UmbraHosting Member

    Joined:
    Mar 12, 2006
    Messages:
    13
    Likes Received:
    0
    Trophy Points:
    1
    Surprisingly, the top is with all of the services up and running. I was sure CSF+LFD were disabled but LFD was still running on the server. I stopped it right after your post last night and I haven't experienced any further issues, so I think LFD was killing off the processes. Here is the list that it wasn't supposed to kill:

    exe:/usr/local/cpanel/3rdparty/bin/english/webalizer
    exe:/usr/lib/courier-imap/bin/pop3d
    exe:/usr/lib/courier-imap/bin/imapd
    exe:/usr/sbin/pure-ftpd
    exe:/usr/local/cpanel/cpsrvd
    exe:/usr/local/cpanel/3rdparty/bin/imapd
    exe:/usr/local/apache/bin/httpd
    exe:/usr/local/cpanel/bin/cppop
    exe:/usr/sbin/sshd
    exe:/usr/sbin/proftpd
    exe:/usr/local/cpanel/3rdparty/bin/php
    exe:/usr/local/cpanel/3rdparty/bin/analog
    exe:/usr/local/urchin/bin/urchinwebd
    exe:/usr/local/cpanel/cpsrvd-ssl
    exe:/usr/bin/spamc
    exe:/usr/local/cpanel/bin/cppop-ssl
    exe:/usr/local/apache1/bin/httpd
    exe:/usr/local/apache2/bin/httpd
    exe:/usr/local/cpanel/bin/logrunner
    exe:/usr/local/cpanel/cpdavd
    exe:/usr/local/cpanel/bin/cpwrap
    user:umbra
    exe:/usr/libexec/gam_server

    Named, chkservd, antirelayd, eximstats, exim, etc. are not in that list :eek: I ran locate for those programs on the server, but I do not know what to include in the LFD list so that it doesn't kill the *good* processes.
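
Since the entries in the list above are `exe:` lines holding full executable paths, one possible way to find what to add is to ask the kernel which binary each running daemon was started from. The sketch below assumes a Linux `/proc` filesystem, a `pgrep` that supports `-x`, and root privileges (needed to read `/proc/PID/exe` for other users' processes). The helper names and the daemon names are hypothetical, and whether LFD's ignore list on this server is the file typically found at `/etc/csf/csf.pignore` is also an assumption.

```shell
#!/bin/sh
# Print an "exe:" line for the binary behind one process ID.
exe_line_for_pid() {
    exe=$(readlink "/proc/$1/exe") && printf 'exe:%s\n' "$exe"
}

# Print de-duplicated "exe:" lines for every running process whose
# command name exactly matches one of the given names.
pignore_lines() {
    for name in "$@"; do
        for pid in $(pgrep -x "$name"); do
            exe_line_for_pid "$pid"
        done
    done | sort -u
}

# Daemon names are assumed; run while the services are up.
pignore_lines exim named
```

The output is a candidate list to review by hand before adding anything to the ignore file; LFD would then need a restart to pick up the change.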
     
  6. seby

    seby Well-Known Member

    Joined:
    Oct 18, 2005
    Messages:
    46
    Likes Received:
    0
    Trophy Points:
    6
    Thanks for sharing. I had the same issue and solved it by restarting lfd and upgrading cPanel.
     