The Community Forums

monitoring service with chkservd

Discussion in 'cPanel Developers' started by hgbas, Dec 23, 2011.

  1. hgbas

    hgbas Registered

    Joined:
    Dec 23, 2011
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    cPanel Access Level:
    Root Administrator
    Hello,

    I am trying to get chkservd to monitor a service called openfire, but it continually shows it as down and restarts it.

    Here's my running process:
    Code:
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    daemon    5787  1.2  8.5 370152 67132 ?        Sl   08:31   0:12 /opt/openfire/jre/bin/java -server -DopenfireHome=/opt/openfire -Dopenfire.lib.dir=/opt/openfire/lib -classpath /opt/openfire/lib/startup.jar -jar /opt/openfire/lib/startup.jar
    cPanel version:
    Code:
    11.30.5.3
    And chkservd setting (service based):
    Code:
    /etc/chkserv.d/openfire
    service[openfire]=x,x,x,/etc/init.d/openfire restart,java,daemon
    Running PID:
    Code:
    cat /var/run/openfire.pid 
    5787
    chkservd status:
    Code:
    cat /var/run/chkservd/openfire 
    +
    What am I doing wrong?

    Thanks
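
    For reference, the entry above can be split into its comma-separated fields. This is only a sketch, and it assumes the field order implied in this thread: three leading fields (all x here), then the restart command, the process name to look for, and the user the process runs as. The function name parse_chkservd_entry is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split a chkserv.d entry into labeled fields.
# Assumed layout (inferred from this thread, not from official docs):
#   service[NAME]=f1,f2,f3,restart command,process name,user
parse_chkservd_entry() {
    entry=$1
    name=${entry#service\[}   # drop the leading "service["
    name=${name%%\]*}         # keep everything before the first "]"
    # Split the part after "=" on commas; the first three fields are ignored.
    IFS=, read -r _f1 _f2 _f3 restart command user <<EOF
${entry#*=}
EOF
    printf 'name=%s\nrestart=%s\ncommand=%s\nuser=%s\n' \
        "$name" "$restart" "$command" "$user"
}
```

    Calling it with the line from /etc/chkserv.d/openfire makes it easy to see which value ends up being treated as the process name and which as the user.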
     
  2. cPanelDavidG

    cPanelDavidG Technical Product Specialist

    Joined:
    Nov 29, 2006
    Messages:
    11,279
    Likes Received:
    8
    Trophy Points:
    38
    Location:
    Houston, TX
    cPanel Access Level:
    Root Administrator
    If it's restarting it, then the restart command is correct.

    Is the process name in ps: java (case sensitive)?

    Is it running as user: daemon (case sensitive)?
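
    One way to answer both questions at once is to ask ps for just the owner and the short command name of the running PID. This is a minimal sketch; show_proc_identity is a hypothetical name, and the pid file path in the usage comment is the one quoted in the first post.

```shell
#!/bin/sh
# Hypothetical helper: print the owner and short command name that ps
# reports for one PID, for comparison with the chkserv.d entry.
show_proc_identity() {
    pid=$1
    # "user=" and "comm=" select the columns and suppress the header line.
    ps -o user= -o comm= -p "$pid"
}

# Usage (on the affected server):
#   show_proc_identity "$(cat /var/run/openfire.pid)"
```

    Both values are case sensitive, so they should match the last two fields of the entry exactly.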
     
  3. cPanelDavidG

    cPanelDavidG Technical Product Specialist

    Okay, the user is correct, but that's a long process name, and I'm not sure myself what to put here. Typically chkservd monitors executables rather than interpreted scripts or bytecode.
     
  4. hgbas

    hgbas Registered

    Hello David,

    Thanks for the reply.

    Yes, it does restart, but that's the problem :) chkservd restarts it over and over, at every 8-minute monitoring interval, and the log shows the command check failing (-):

    Code:
    ..openfire [[check command:-][tcp connect:N/A][fail count:1]Restarting openfire....
    So, I have to disable it until I can figure it out.
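
    To see how often this happens, the failed checks can be counted from the log. The sketch below assumes that log entries look like the line quoted above and that the log is /var/log/chkservd.log; count_failed_checks is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print how many lines
# record a failed command check ("check command:-") for the given service.
count_failed_checks() {
    service=$1
    # -F: match the text literally; -c: print the number of matching lines.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function's status at 0 while still printing "0".
    grep -cF -- "$service [[check command:-]" || true
}

# Usage:
#   count_failed_checks openfire < /var/log/chkservd.log
```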

    I checked the "sub _servicecmdcheck" call in /usr/local/cpanel/Cpanel/TailWatch/ChkServd.pm, and I'm not entirely certain what the `ps` + grep actually returns into @RUN.

    I tried enabling --debug in tailwatchd, but it did not shed any more light on why it is failing.

    Here's what I get from the abbreviated listing:
    Code:
    ps -U daemon
      PID TTY          TIME CMD
     5787 ?        00:00:24 java
    ps xuww -U daemon returns the full CMD as given previously + a bunch of normal processes owned by root, so the list is large.
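
    A ps + grep style check can be approximated by hand to see which candidate strings would match. This is a rough sketch only and not the actual logic in ChkServd.pm; both function names are hypothetical.

```shell
#!/bin/sh
# Rough approximation of a "ps + grep" process check (NOT chkservd's code).

# Print the full command line of every process owned by the given user.
list_user_commands() {
    ps -U "$1" -o args=
}

# Read command lines on stdin; succeed if any line contains the given text.
any_line_contains() {
    grep -qF -- "$1"
}

# Usage: capture the ps output first, so the grep process cannot show up
# in the listing and match itself.
#   cmds=$(list_user_commands daemon)
#   printf '%s\n' "$cmds" | any_line_contains /opt/openfire/jre/bin/java \
#       && echo "seen" || echo "not seen"
```

    Trying both "java" and the full path to the binary this way shows which form appears in the listing for that user.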

    Any help or suggestions would be appreciated.
     
  5. cPanelDavidG

    cPanelDavidG Technical Product Specialist

    Thinking on this with a fresh mind (yay holidays), I would say just try "java" as the command to look for. We don't list all the parameters passed to a command; you can look at our own entries in /etc/chkserv.d/ for comparison. The best (and perhaps most complex) example is ftpd.

    Unfortunately, because every Java app runs under the same "java" executable, any running Java app will satisfy this check, so chkservd can report openfire as up when only some other Java app is running.
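
    Whether that ambiguity matters on a given server can be checked by counting how many processes of that user are named exactly "java". A minimal sketch, with count_named as a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: read one command name per line on stdin and print
# how many lines are exactly equal to the given name.
count_named() {
    # -x: whole-line match; -F: literal text; -c: count matching lines.
    grep -cxF -- "$1" || true
}

# Usage:
#   ps -U daemon -o comm= | count_named java
```

    A result greater than 1 means a check on the bare name cannot tell openfire apart from the other Java processes of that user.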
     
  6. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Joined:
    Oct 2, 2010
    Messages:
    7,623
    Likes Received:
    21
    Trophy Points:
    38
    Location:
    somewhere over the rainbow
    cPanel Access Level:
    Root Administrator
    I just installed openfire onto my machine and set it up for monitoring. It isn't failing on service checks. Note that my process runs as the root user, and /usr/local/jdk/bin/java is what shows for the process.

    Here are the steps I used to add monitoring for it:

    1. Created /etc/chkserv.d/openfire file:

    Code:
    echo "service[openfire]=x,x,x,/etc/init.d/openfire restart,/usr/local/jdk/bin/java,root" > /etc/chkserv.d/openfire
    2. Added monitoring for openfire in chkservd:

    Code:
    vi /etc/chkserv.d/chkservd.conf
    Put this line alphabetically into the file:

    Code:
    openfire:1
    Saved the file (:wq).

    3. Created the status file that marks openfire as up:

    Code:
    echo "+" > /var/run/chkservd/openfire
    4. Restarted chkservd

    Code:
    /scripts/restartsrv_chkservd
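
    Steps 1 to 3 above can be collected into one helper. This is a sketch under assumptions: add_chkservd_service and its first argument (a directory prefix, empty for the real filesystem) are made up here so the steps can be rehearsed in a scratch directory, and the name:1 line is appended at the end of chkservd.conf rather than inserted alphabetically.

```shell
#!/bin/sh
# Hypothetical helper collecting steps 1-3. The first argument is a
# directory prefix: pass "" to write to the real /etc and /var/run.
add_chkservd_service() {
    root=$1 name=$2 restart=$3 command=$4 user=$5
    mkdir -p "$root/etc/chkserv.d" "$root/var/run/chkservd"

    # Step 1: the service definition file.
    printf 'service[%s]=x,x,x,%s,%s,%s\n' "$name" "$restart" "$command" "$user" \
        > "$root/etc/chkserv.d/$name"

    # Step 2: enable monitoring, unless the service is already listed.
    conf="$root/etc/chkserv.d/chkservd.conf"
    touch "$conf"
    grep -q "^$name:" "$conf" || printf '%s:1\n' "$name" >> "$conf"

    # Step 3: the status file, initially marked as up.
    printf '+\n' > "$root/var/run/chkservd/$name"
}

# Usage (rehearsal in a scratch directory):
#   add_chkservd_service /tmp/chk-test openfire \
#       "/etc/init.d/openfire restart" /usr/local/jdk/bin/java root
```

    Step 4, restarting chkservd, is left to run by hand.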
    At that point, I tailed /var/log/chkservd.log to see the results for 10 minutes. I have had no failures:

    Code:
    tail -fn0 /var/log/chkservd.log
    Here are my success results:

     
  7. hgbas

    hgbas Registered

    Hello,

    Thanks so much for the great help.
    I had already tried many iterations of CMD & USER to no avail, but never with the full path to the binary, as in your last reply.
    I will give it a shot and let you know.
     
  8. PbG

    PbG Well-Known Member

    Joined:
    Mar 11, 2003
    Messages:
    241
    Likes Received:
    0
    Trophy Points:
    16
    I successfully added nagios to chkservd without the status file step, which I do not understand. Could you clarify where you added that file and what it is named?

     
  9. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Hello,

    The files in /var/run/chkservd are named after the services being monitored; in this instance, openfire was the name of the service being created. Each file contains either + or -, depending on whether the service was seen as up (+) or down (-) on the last service check. A service marked + appears as up in the WHM > Server Status > Service Status area. You would name your file nagios if that is what you've called the service in the /etc/chkserv.d/chkservd.conf file.
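
    To read all of those status files at once, a small loop is enough. A minimal sketch, assuming each file holds a single + or - as described above; report_chkservd_status is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print "SERVICE up|down|unknown" for every status
# file in a directory (default: /var/run/chkservd).
report_chkservd_status() {
    dir=${1:-/var/run/chkservd}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue        # skip non-files and an empty glob
        state=$(cat "$f")
        case $state in
            +) label=up ;;
            -) label=down ;;
            *) label=unknown ;;
        esac
        printf '%s %s\n' "${f##*/}" "$label"   # file name is the service name
    done
}

# Usage:
#   report_chkservd_status
```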

    You are very welcome!
     
  10. jimpic

    jimpic Member

    Joined:
    Aug 23, 2011
    Messages:
    7
    Likes Received:
    0
    Trophy Points:
    1
    cPanel Access Level:
    Root Administrator
    This topic is now a year old, but it's precisely what I am trying to do: monitor openfire with chkservd.
    Openfire is running well on the server, but when I follow the steps cPanelTristan wrote in a post above, openfire stops and restarts every 5 minutes (at every check, in fact). I must have done something wrong.
    So I disabled openfire monitoring.
    Any clue on this matter?
     