The Community Forums

tailwatchd down across multiple servers

Discussion in 'General Discussion' started by katmai, Nov 4, 2016.

  1. katmai

    katmai Well-Known Member

    Joined:
    Mar 13, 2006
    Messages:
    530
    Likes Received:
    0
    Trophy Points:
    166
    Location:
    Brno, Czech Republic
    so i upgraded to cpanel 11.60 on a bunch of servers and i keep getting messages that tailwatchd is down
    Code:
    [root@msrv04 ~]# tail -f /usr/local/cpanel/logs/tailwatchd_log
    [5905] [2016-11-04 02:39:42 +0100] [Cpanel::TailWatch::Eximstats] Loading email sending limits from 1478221200 - 1478224800
    [5905] [2016-11-04 03:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478224800
    [5905] [2016-11-04 04:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478228400
    [5905] [2016-11-04 05:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478232000
    [5905] [2016-11-04 06:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478235600
    [5905] [2016-11-04 07:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478239200
    [5905] [2016-11-04 08:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478242800
    [5905] [2016-11-04 09:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478246400
    [5905] [2016-11-04 10:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478250000
    [5905] [2016-11-04 11:00:00 +0100] [Cpanel::TailWatch::Eximstats] Resetting email limits to new starttime of 1478253600
    
    [root@msrv04 ~]# /scripts/restartsrv_tailwatchd --status
    (XID pvhjzz) The “tailwatchd” service is down.
    
    the process seems to be running though

    Code:
    [root@msrv04 ~]# ps aux |grep tail
    root      5905  0.0  0.4 106172 15868 ?        S    Nov01   0:06 tailwatchd
    
    [root@msrv04 ~]# systemctl status tailwatchd.service
    ● tailwatchd.service - tailwatchd
       Loaded: loaded (/etc/systemd/system/tailwatchd.service; disabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Fri 2016-11-04 11:05:31 CET; 10s ago
      Process: 32236 ExecStart=/scripts/restartsrv_tailwatchd --no-verbose (code=exited, status=1/FAILURE)
    Main PID: 16200 (code=exited, status=0/SUCCESS)
    
    Nov 04 11:05:31 msrv04.example.nl systemd[1]: Starting tailwatchd...
    Nov 04 11:05:31 msrv04.example.nl restartsrv_tailwatchd[32236]: tailwatchd is already running (tailwatchd) with PID 5905 by root
    Nov 04 11:05:31 msrv04.example.nl systemd[1]: tailwatchd.service: control process exited, code=exited status=1
    Nov 04 11:05:31 msrv04.example.nl systemd[1]: Failed to start tailwatchd.
    Nov 04 11:05:31 msrv04.example.nl systemd[1]: Unit tailwatchd.service entered failed state.
    Nov 04 11:05:31 msrv04.example.nl systemd[1]: tailwatchd.service failed.
    
    [root@msrv04 ~]# /scripts/restartsrv_chkservd
    Waiting for “tailwatchd” to start ……Job for tailwatchd.service failed because the control process exited with error code. See "systemctl status tailwatchd.service" and "journalctl -xe" for details.
    …failed.
    
    Service Error
            (XID ttdda4) The “tailwatchd” service failed to start.
    
    Startup Log
            Nov 04 11:05:31 msrv04.storetech.nl systemd[1]: Starting tailwatchd...
            Nov 04 11:05:31 msrv04.storetech.nl restartsrv_tailwatchd[32236]: tailwatchd is already running (tailwatchd) with PID 5905 by root
            Nov 04 11:05:31 msrv04.storetech.nl systemd[1]: tailwatchd.service: control process exited, code=exited status=1
            Nov 04 11:05:31 msrv04.storetech.nl systemd[1]: Failed to start tailwatchd.
            Nov 04 11:05:31 msrv04.storetech.nl systemd[1]: Unit tailwatchd.service entered failed state.
            Nov 04 11:05:31 msrv04.storetech.nl systemd[1]: tailwatchd.service failed.
    
    tailwatchd has failed. Contact your system administrator if the service does not automagically recover.
    
    
    any ideas what's up?

    [root@msrv04 ~]# cat /usr/local/cpanel/version
    11.60.0.15
     
    #1 katmai, Nov 4, 2016
    Last edited by a moderator: Nov 4, 2016
  2. Infopro

    Infopro cPanel Sr. Product Evangelist
    Staff Member

    Joined:
    May 20, 2003
    Messages:
    15,620
    Likes Received:
    296
    Trophy Points:
    433
    Location:
    Pennsylvania
    cPanel Access Level:
    Root Administrator
  3. katmai

    err, i fail to see how this helps. my servers are sending e-mail alerts that tailwatchd is down, but the service clearly isn't. the service tries to restart, errors out, and i get an e-mail about it. i am not getting e-mails about the exim thing, since the thread you linked shows that's just informational.

    i am getting e-mails about a service that isn't down however.

    am i understanding this wrong? i kinda don't think so.

    i think it's related more to these than to what you linked:

    tailwatchd_log Problem with recent_authed_mail_ips

    Implemented case CPANEL-7723: Prevent multiple copies of tailwatchd from being started.
    Fixed case CPANEL-6436: Prevent tailwatchd from starting multiple processes.
     
  4. katmai

    i just manually killed the process and ran /scripts/restartsrv_chkservd

    everything seems ok after doing this. pretty weird for this to happen, though.

    i just looked closely, and on another server i had 2 instances of tailwatchd running:

    root@jaws [~]# ps aux |grep tail
    root 411 0.0 0.8 402604 263800 ? S Oct12 11:20 tailwatchd
    root 5242 0.0 0.0 112652 972 pts/0 S+ 06:38 0:00 grep --color=auto tail
    root 24796 0.0 0.1 120564 39436 ? S Nov03 0:53 tailwatchd
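
For anyone hitting the same thing, here is a minimal sketch of how the duplicate check above could be scripted. It assumes the `ps aux` column layout shown in this thread (PID in column 2, START in column 9, command in column 11) and that the daemon's command shows up as exactly `tailwatchd`. The function names are made up for illustration and are not part of cPanel.

```shell
#!/bin/sh
# Hypothetical helpers (not cPanel tools). Both read `ps aux`-style lines on stdin.

# Print "PID START" for every line whose command column is exactly "tailwatchd".
# Matching on column 11 skips the "grep ... tail" line that a plain grep picks up.
tailwatchd_procs() {
    awk '$11 == "tailwatchd" { print $2, $9 }'
}

# Exit 0 if more than one tailwatchd process is listed, 1 otherwise.
has_duplicate_tailwatchd() {
    awk '$11 == "tailwatchd" { n++ } END { exit ((n > 1) ? 0 : 1) }'
}
```

On a live box this would be run as `ps aux | tailwatchd_procs` to see the PIDs and start dates, and `ps aux | has_duplicate_tailwatchd && echo duplicate` to flag a server. Which PID to kill is still a manual decision; after killing it, /scripts/restartsrv_chkservd restarts the service as described above.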
     
  5. cPanelMichael

    cPanelMichael Forums Analyst
    Staff Member

    Joined:
    Apr 11, 2011
    Messages:
    37,086
    Likes Received:
    1,288
    Trophy Points:
    363
    cPanel Access Level:
    Root Administrator
    Hello,

    The following cases were recently included with cPanel version 60 to help prevent these types of issues with tailwatchd:

    Fixed case CPANEL-9515: Tail-check: ensure tailwatchd is restarted using systemd.
    Fixed case CPANEL-9392: Harden tailwatchd's dupe process check.

    It looks like the process list you reported shows the duplicate tailwatchd process was started on October 12th, before the server was updated to include the resolutions in cPanel version 60. Could you let us know if you see this issue on a server where the duplicate process was started after the system was updated to cPanel version 60?

    Thank you.
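
One rough way to check this is sketched below, under a few assumptions: procps `ps` (for the `lstart` field) and GNU `date -d` are available, and the update time is supplied by you as epoch seconds. The function names are hypothetical and are not cPanel tooling.

```shell
#!/bin/sh
# Hypothetical helpers (not cPanel tools).

# Succeed if a process start time is later than a reference time.
# $1 = process start time, $2 = reference (update) time, both in epoch seconds.
started_after() {
    [ "$1" -gt "$2" ]
}

# Print the start time of PID $1 as epoch seconds.
# Assumes procps `ps -o lstart=` and GNU `date -d` can parse its output.
proc_start_epoch() {
    date -d "$(ps -o lstart= -p "$1")" +%s
}
```

For example, with a duplicate PID such as 24796 and an update time you have recorded in `$update_epoch`, `started_after "$(proc_start_epoch 24796)" "$update_epoch" && echo "started after the update"` would report whether that process postdates the update.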
     
  6. katmai

    i did tell you in my post that it's after 11.60 that i experienced this issue :/ ...

    - Removed -
     
    #6 katmai, Nov 4, 2016
    Last edited by a moderator: Nov 4, 2016
  7. cPanelMichael

    Hello,

    The resolutions are intended to prevent duplicate tailwatchd processes from starting, but they won't automatically kill any existing processes that were started before the system was updated to include them.

    Could you let us know if any system where this is happening shows a duplicate or hanging tailwatchd process that was started after the update to version 60?

    Thank you.
     
  8. katmai

    well, 2 of the boxes didn't have duplicate tailwatchd processes; only 1 of them did. they stopped failing after i killed the process.

    i don't think it's a massive issue though, because i have about 25 boxes and the others aren't alerting. now i know what to do anyway.

    thanks.
     
  9. cPanelMichael

    Could you open a support ticket using the link in my signature if you notice this happening again? We're happy to take a closer look to determine what went wrong.

    Thank you.
     