The Community Forums


New server, Apache and mySQL failed overnight, chkservd hangs

Discussion in 'EasyApache' started by ttremain, Jan 14, 2013.

  1. ttremain

    ttremain Well-Known Member

    Joined:
    Feb 16, 2003
    Messages:
    212
    Likes Received:
    0
    Trophy Points:
    16
    I have a new server in service (well, sort of "in service").

    Apache and MySQL both seem to have failed overnight.

    The chkservd HTML alerts say it is hanging.


    /usr/local/cpanel/logs/tailwatchd_log
    [17003] [2013-01-14 07:38:21 -0800] [Cpanel::TailWatch::Eximstats] [SQLERR] Could not prepare query, logging SQL to /var/cpanel/sql
    DBI connect('eximstats:localhost','eximstats',...) failed: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) at /usr/local/cpanel/Cpanel/TailWatch/Eximstats.pm line 862.


    /var/log/chkservd.log

    [2013-01-14 07:06:03 -0800] Disk check .... /tmp (/var/tmp) [3%] ... /dev/sda3 (/) [6%] ... /usr/tmpDSK (/tmp) [3%] ... /dev/sda1 (/boot) [23%] ... /dev/sdb1 (/backup) [1%] ... {status:ok} ... Done
    [2013-01-14 07:06:03 -0800] Service check ....tomcat [[check command:+][socket connect:N/A]]...sshd [[check command:+][socket connect:N/A]]...spamd [[check command:+][socket connect:N/A]]...queueprocd [[check command:+][socket connect:N/A]]...named [[check command:+][socket connect:N/A]]...mysql [Service Check Started
    The previous service check is still running (302 second). It will be terminated if still hanging after 3 check intervals. (1/3)
    Service Check Started
    The previous service check is still running (603 second). It will be terminated if still hanging after 3 check intervals. (2/3)
    Service Check Started
    The previous service check was still running (904 second). It was terminated.
    Loading services .....cpanellogd....cpdavd....cpsrvd....exim....exim-26....ftpd....httpd....imap....ipaliases....lfd....mailman....mysql....named....queueprocd....spamd....sshd....tomcat..Done


    I am not sure what to check next. I did find that MySQL was not configured for enough open files, which I have adjusted.
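
For anyone reading a similar chkservd.log, here is a rough Python sketch that summarizes where a check stalled. It assumes the log lines look exactly like the excerpt above. The function name, the regular expressions, and the abridged sample text are illustrative assumptions, not anything shipped with cPanel.

```python
import re

# Matches the "still running" notices seen in the chkservd.log excerpt.
HUNG_RE = re.compile(r"previous service check (?:is|was) still running \((\d+) second\)")
# Matches the service name printed just before a check that never finished its line.
STALLED_ON_RE = re.compile(r"\.\.\.([\w-]+) \[Service Check Started\s*$")

def summarize_hangs(log_text):
    """Report which service a check stalled on and how long it hung."""
    stalled_service = None
    longest = 0
    terminated = False
    for line in log_text.splitlines():
        m = STALLED_ON_RE.search(line)
        if m:
            stalled_service = m.group(1)
        m = HUNG_RE.search(line)
        if m:
            longest = max(longest, int(m.group(1)))
            if "It was terminated" in line:
                terminated = True
    return {"service": stalled_service, "longest_seconds": longest, "terminated": terminated}

# Abridged copy of the log lines quoted in the post (fewer services listed).
SAMPLE = """[2013-01-14 07:06:03 -0800] Service check ....named [[check command:+][socket connect:N/A]]...mysql [Service Check Started
The previous service check is still running (302 second). It will be terminated if still hanging after 3 check intervals. (1/3)
Service Check Started
The previous service check is still running (603 second). It will be terminated if still hanging after 3 check intervals. (2/3)
Service Check Started
The previous service check was still running (904 second). It was terminated.
"""

if __name__ == "__main__":
    print(summarize_hangs(SAMPLE))
```

On the sample above, this points at mysql as the service the check was stuck on when it was terminated.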
     
  2. SB-Nick

    SB-Nick Well-Known Member

    Joined:
    Aug 26, 2008
    Messages:
    134
    Likes Received:
    0
    Trophy Points:
    16
    cPanel Access Level:
    Root Administrator
    Hi,

    There's no Apache error in the output you provided. Have you tried searching /etc/httpd/logs/error_log around the time the daemon was restarted by chkservd?
     
  3. ttremain

    ttremain Well-Known Member

    Well, it just sat there hung for 6 hours...

    [Mon Jan 14 01:26:59 2013] [error] server reached MaxClients setting, consider raising the MaxClients setting
    [Mon Jan 14 07:37:48 2013] [notice] caught SIGTERM, shutting down

    Obviously I should look into raising the MaxClients setting, but that does not explain why Apache wasn't restarted by chkservd.

    I have a theory; maybe someone can confirm or dispel it. Chkservd could not get MySQL running (which is what it appears to have been trying to do over and over), so it never got around to trying to restart Apache.

    MySQL was hung because the config did not allow enough open files.
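
One way to check that part of the theory is to look at the limit the running mysqld process actually has, rather than what the config file asks for. The sketch below reads the "Max open files" row from /proc/<pid>/limits on Linux. The function names are made up for illustration, and the sample text uses hypothetical numbers, not values from this server.

```python
import re

def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        m = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if m:
            # A limit may be the word "unlimited"; report that as None.
            return tuple(None if v == "unlimited" else int(v) for v in m.groups())
    return None

def read_limits(pid):
    """Read the limits of a live process; needs permission to read /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())

# Hypothetical sample in the /proc/<pid>/limits layout (values are placeholders).
SAMPLE = """Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""

if __name__ == "__main__":
    print(parse_max_open_files(SAMPLE))
```

If the soft limit reported for the mysqld PID is lower than the value in my.cnf, the setting is not reaching the running process.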
     
  4. SB-Nick

    SB-Nick Well-Known Member

    Hi Thomas,

    The MaxClients error is probably the reason chkservd is showing httpd as failing.
    Increasing the open file limit for mysql should also solve the chkservd and mysqld restart issues.
     
  5. ttremain

    ttremain Well-Known Member

    I have doubled the mysql open files, from 2048 to 4096... Yet still:

    /var/lib/mysql/<hostname>.err:
    130115 8:33:58 [ERROR] /usr/sbin/mysqld: Can't open file: './xxxxxxxx_xxxxxxx/xxxxxxx.frm' (errno: 24)
    130115 8:38:29 [ERROR] Error in accept: Too many open files
    130115 8:42:45 [ERROR] Error in accept: Too many open files
    130115 8:47:01 [ERROR] Error in accept: Too many open files
    130115 8:51:17 [ERROR] Error in accept: Too many open files
    130115 8:55:33 [ERROR] Error in accept: Too many open files
    130115 8:59:49 [ERROR] Error in accept: Too many open files
    130115 9:04:05 [ERROR] Error in accept: Too many open files
    130115 9:08:21 [ERROR] Error in accept: Too many open files
    130115 9:12:37 [ERROR] Error in accept: Too many open files
    130115 9:16:53 [ERROR] Error in accept: Too many open files
    130115 9:21:09 [ERROR] Error in accept: Too many open files
    130115 9:25:25 [ERROR] Error in accept: Too many open files
    130115 9:29:41 [ERROR] Error in accept: Too many open files
    130115 9:33:57 [ERROR] Error in accept: Too many open files
    130115 9:36:20 [Note] /usr/sbin/mysqld: Normal shutdown

    130115 9:36:20 [Note] Event Scheduler: Purging the queue. 0 events
    130115 9:36:23 InnoDB: Starting shutdown...
    130115 9:36:26 InnoDB: Shutdown completed; log sequence number 0 15991140
    130115 9:36:26 [Note] /usr/sbin/mysqld: Shutdown complete

    130115 09:36:26 mysqld_safe mysqld from pid file /var/lib/mysql/raptor.likeit.net.pid ended
    130115 09:36:27 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    130115 9:36:27 [Warning] '--log_slow_queries' is deprecated and will be removed in a future release. Please use ''--slow_query_log'/'--slow_query_log_file'' instead.
    130115 9:36:27 [Note] Plugin 'FEDERATED' is disabled.
    130115 9:36:27 InnoDB: Initializing buffer pool, size = 128.0M
    130115 9:36:27 InnoDB: Completed initialization of buffer pool
    130115 9:36:27 InnoDB: Started; log sequence number 0 15991140
    130115 9:36:27 [Note] Event Scheduler: Loaded 0 events
    130115 9:36:27 [Note] /usr/sbin/mysqld: ready for connections.
    Version: '5.1.66-cll' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)

    It ran the "Can't open file" error several times, then chkservd started trying to reset it...


    At 9:36 I ran /scripts/restartsrv_mysql.

    So how come /scripts/restartsrv_mysql can reset it, but chkservd can't?

    I have now doubled open files again.
    I've seen conflicting info, so I'm trying both syntaxes:
    open_files_limit=8192
    open-files-limit=8192
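
As far as I know, MySQL treats dashes and underscores in option names as interchangeable, so the two lines above should set the same option twice, with the later one winning. Here is a small Python sketch of that normalization. It is a simplified stand-in written for illustration, not MySQL's real option-file parser, and the function name and sample text are assumptions.

```python
def effective_mysqld_options(cnf_text):
    """Collect [mysqld] options, treating '-' and '_' in names as the same character."""
    section = None
    options = {}
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section != "mysqld":
            continue
        name, _, value = line.partition("=")
        # A later occurrence overwrites an earlier one under the normalized name.
        options[name.strip().replace("-", "_")] = value.strip()
    return options

# Sample my.cnf fragment using both spellings from the post.
SAMPLE = """[mysqld]
open_files_limit=8192
open-files-limit=8192
"""

if __name__ == "__main__":
    print(effective_mysqld_options(SAMPLE))
```

Both spellings collapse to a single open_files_limit entry here, which suggests only one of the two lines is needed.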
     
  6. SB-Nick

    SB-Nick Well-Known Member

    Hi,

    I can't see any obvious message from chkservd showing that mysql could not be restarted; you may need to provide further log output.
    The "can't open file" message is certainly because of the ulimit values. Try setting a large value such as the following (make sure the setting is in the [mysqld] section):

    open-files-limit=120000
     
  7. ttremain

    ttremain Well-Known Member

    I just have more of this in the chkservd.log:


    [2013-01-15 09:13:30 -0800] Service check ....tomcat [[check command:+][socket connect:N/A]]...sshd [[check command:+][socket connect:N/A]]...spamd [[check command:+][socket connect:N/A]]...queueprocd [[check command:+][socket connect:N/A]]...named [[check command:+][socket connect:N/A]]...mysql [Service Check Started
    The previous service check is still running (301 second). It will be terminated if still hanging after 3 check intervals. (1/3)
    Service Check Started
    The previous service check is still running (602 second). It will be terminated if still hanging after 3 check intervals. (2/3)
    Service Check Started
    The previous service check was still running (903 second). It was terminated.
    Loading services .....cpanellogd....cpdavd....cpsrvd....exim....exim-26....ftpd....httpd....imap....ipaliases....lfd....mailman....mysql....named....queueprocd....spamd....sshd....tomcat..Done
    [2013-01-15 09:28:33 -0800] Disk check .... /tmp (/var/tmp) [3%] ... /dev/sda3 (/) [6%] ... /usr/tmpDSK (/tmp) [3%] ... /dev/sda1 (/boot) [23%] ... /dev/sdb1 (/backup) [4%] ... {status:ok} ... Done

    Any other logs I should be looking at?

    I have added the open-files-limit you suggested trying, and I am reinstituting some off-server monitoring (my bad)
     
  8. SB-Nick

    SB-Nick Well-Known Member

    Hi Thomas,

    Nothing odd in that output either. See how the new open file limit setting goes from now on.
     
  9. ttremain

    ttremain Well-Known Member

    mysql> SHOW GLOBAL STATUS LIKE '%open%';
    +--------------------------+-------+
    | Variable_name            | Value |
    +--------------------------+-------+
    | Com_ha_open              | 0     |
    | Com_show_open_tables     | 0     |
    | Open_files               | 3178  |
    | Open_streams             | 0     |
    | Open_table_definitions   | 1795  |
    | Open_tables              | 1800  |
    | Opened_files             | 28295 |
    | Opened_table_definitions | 9496  |
    | Opened_tables            | 9540  |
    | Slave_open_temp_tables   | 0     |
    +--------------------------+-------+
    10 rows in set (0.00 sec)
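
To put the Open_files figure in context, here is a rough Python sketch that parses the status table and compares it with the limits tried earlier in the thread (2048, 4096, 8192). The function names and the abridged sample are illustrative assumptions. Note that Open_files counts files opened by the server itself and does not include client sockets, so the real descriptor usage is likely higher.

```python
def parse_status(table_text):
    """Turn the ASCII table printed by the mysql client into a {name: int} dict."""
    values = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Keep only two-column data rows; this skips borders, the header and the footer.
        if len(cells) == 2 and cells[1].isdigit():
            values[cells[0]] = int(cells[1])
    return values

def headroom(open_files, limit):
    """File handles left before the limit; a negative result means the limit is too low."""
    return limit - open_files

# Abridged copy of the table in the post.
SAMPLE = """+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| Open_files | 3178 |
| Open_tables | 1800 |
+--------------------------+-------+
10 rows in set (0.00 sec)
"""

if __name__ == "__main__":
    status = parse_status(SAMPLE)
    for limit in (2048, 4096, 8192):
        print(limit, headroom(status["Open_files"], limit))
```

With 3178 files open, the original 2048 limit is already exceeded, and 4096 leaves fewer than a thousand handles to spare.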
     