The Community Forums


Exim restarting several times a day

Discussion in 'E-mail Discussions' started by nyjimbo, Aug 30, 2007.

  1. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    We have been running cPanel on one specific server for several years. We recently upgraded the OS to FreeBSD 6.2 and then cPanel to 11 Stable. For the past few weeks since the upgrade it has been rock solid, but suddenly we are seeing Exim restart several times a day, each time with this email:

    exim failed . A restart was attempted automagically.

    It used to happen maybe once a month but now happens 6-8 times a day. It's not causing any real problems with the machine, but we can't seem to hunt down what is causing it.

    There are no errors in the logs and nothing about core dumps, and by the time we get the warning email and walk to the machine to see what's going on, the load is very low and mail traffic is normal, with no obvious indication of a problem.

    Has anyone seen this sort of thing in the past week or so? What would cause Exim to keep restarting when we don't see any load issues, failures, cores, error log entries, etc.? We are using Exim 4.67 and had been for weeks before these failures started.
     
    #1 nyjimbo, Aug 30, 2007
    Last edited: Aug 30, 2007
  2. ggooden

    ggooden Member

    Joined:
    Dec 9, 2002
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    Pasadena, California, United States
    cPanel Access Level:
    Root Administrator
    I am seeing exactly the same thing on my box. It's been doing it for 3 days. I'm getting over 20 failure/restarts per day now.
     
  3. michutsg

    michutsg Member

    Joined:
    May 15, 2007
    Messages:
    6
    Likes Received:
    0
    Trophy Points:
    1
    I have the same problem. I run a script from cron every minute to log the load on the server:

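    #!/bin/bash
    # Append a timestamp and the 1-minute load average to a log file.
    # Splitting on the "load average(s):" label keeps this working whether
    # or not uptime prints a "days," field before it.

    LOAD=$(/usr/bin/uptime | awk -F'load averages?: ' '{print $2}' | awk -F'[, ]+' '{print $1}')
    TIME=$(/bin/date "+%d/%m/%Y %H:%M:%S")

    echo "${TIME} ${LOAD}" >> /checkload/load.log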



    Whenever exim goes down, the load is higher than 1 and stays there for longer than 2 minutes.
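For anyone who wants to check their own load.log for the same pattern, here is a rough sketch. It assumes the log has the three-column format written by the script above (date, time, load) and one sample per minute, so "longer than 2 minutes" is approximated as 3 or more consecutive samples. The function name sustained_load and its defaults are made up for this example.

```shell
# Print the start and end timestamps of every run of consecutive samples
# whose load is above a threshold and that lasts at least N samples.
# Input lines look like: "31/08/2007 12:07:01 1.25" (date, time, load).
# $1 = load threshold (default 1), $2 = minimum run length (default 3)
sustained_load() {
    threshold=${1:-1}
    min_samples=${2:-3}
    awk -v t="$threshold" -v n="$min_samples" '
        # Report the current run if it is long enough, then reset it.
        function emit_run() {
            if (run >= n) print start " -> " last " (" run " samples)"
            run = 0
        }
        # A well-formed sample above the threshold extends the run.
        NF == 3 && ($3 + 0) > t {
            if (run == 0) start = $1 " " $2
            last = $1 " " $2
            run++
            next
        }
        # Anything else (low load or a malformed line) ends the run.
        { emit_run() }
        END { emit_run() }
    '
}
```

Run it as `sustained_load 1 3 < /checkload/load.log` and compare the printed time ranges with the timestamps of the restart emails.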

    Yesterday a guy from cPanel worked on my box, but without success.
     
  4. hostspring

    hostspring Member

    Joined:
    Jul 26, 2004
    Messages:
    5
    Likes Received:
    0
    Trophy Points:
    1
    I'm also having the exact same problem. Been happening since Tuesday (8/28).

    Log is filled with "refused: too many connections"
     
  5. bazzi

    bazzi Well-Known Member

    Joined:
    May 23, 2004
    Messages:
    119
    Likes Received:
    0
    Trophy Points:
    16
    I have the same problem but sometimes it is prm that kills exim...
     
  6. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    This ("refused: too many connections") is the only thing I notice before each "restart". Could it be that the program that is monitoring Exim thinks it is down, when it is simply not getting a "connection", and so cPanel/WHM restarts Exim?
     
  7. fcitrolo

    fcitrolo Active Member

    Joined:
    Dec 31, 2003
    Messages:
    26
    Likes Received:
    0
    Trophy Points:
    1
    We are experiencing the same issue with two boxes.

    We were fine for weeks and now exim is failing the same way about 6 to 8 times daily.

    Obviously there is something that is not working properly.

    ------

    Wanted to add we are running exim-4.66-3
     
    #7 fcitrolo, Aug 31, 2007
    Last edited: Aug 31, 2007
  8. rmackay

    rmackay Well-Known Member

    Joined:
    Nov 26, 2002
    Messages:
    75
    Likes Received:
    0
    Trophy Points:
    6
    I am seeing the same thing. The chkservd log shows:

    [Fri Aug 31 12:07:37 2007] Service check ....cpsrvd [+]...exim [exim: [421 != 220]

    Then it restarts exim. This happens quite often but not at exact regular intervals.
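The "421 != 220" in that line suggests the check compares the three-digit SMTP reply code in the greeting against 220 (service ready). 421 is the SMTP code for "service not available", which is different from the daemon being dead. How chkservd works internally is an assumption here, not something confirmed in this thread. A small sketch of telling "busy" apart from "down", using a hypothetical helper classify_smtp_banner:

```shell
# Classify the first line of an SMTP greeting by its three-digit reply code.
# $1 = the greeting line as received from port 25 (may be empty).
classify_smtp_banner() {
    # The reply code is the first three characters of the line.
    code=$(printf '%s' "$1" | cut -c1-3)
    case "$code" in
        220) echo "ok" ;;            # service ready
        421) echo "busy" ;;          # alive, but refusing new connections
        "")  echo "no-response" ;;   # nothing came back at all
        *)   echo "unknown" ;;       # some other reply code
    esac
}
```

A monitor built this way could restart the service only on "no-response" and merely log "busy", which would avoid restarting a daemon that is simply at its connection limit.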

    Any help is much appreciated.
     
  9. graham_w

    graham_w Well-Known Member

    Joined:
    May 25, 2004
    Messages:
    50
    Likes Received:
    0
    Trophy Points:
    6
    I am seeing this too. Right now Exim is running 92 processes - a restart of Exim brings it right down to normal but it just climbs. Normally when it gets too high I see the "Too many connections" in the logs and then Exim restarts.

    I'm going to log a ticket with cPanel.
     
  10. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    From observing a few things it seems that this may be more of a combination of configuration and EXIM efficiency rather than a genuine bug or error.

    Lately it seems that more Exim tasks can be found running when I do a "ps ax" or use some other method like "exiwhat". This gives me the feeling that perhaps Exim is taking more time to process each individual message stream. I am not noticing a heavier load on my server; rather, it appears lighter, as if the Exim tasks might be doing nothing or not eating the processor as I would expect them to. With cPanel 11 we have several nice new features in the Exim configuration menu that would appear to require more time per message to handle (seconds, really).

    Since by default the exim.conf is set to "smtp_accept_max = 100", any connection above 100 is going to get a 421 or some similar error. If Exim is in fact taking a bit more time to handle each email/connection, then you are going to see more active connections, and many of us will see the "smtp_accept_max = 100" limit kick in.

    If cPanel interrogates Exim and looks for a good port 25 response, rather than just checking whether it is live in memory, it will likely encounter the 421 or similar error, and that might be the reason it restarts Exim. A cPanel tech might be able to address how it handles this.

    For many of us this appears to be happening only a few (6-8 or a little more) times a day. If "smtp_accept_max = 100" were bumped up to, say, "smtp_accept_max = 125" and the machine monitored for 24 hours of normal operation (a weekday), and the restarts then occur less frequently or disappear completely, this might be all that is needed.

    This would be done in the EXIM configuration ADVANCED editor and you would put the line:

    smtp_accept_max = 125

    in the first white box and save it. Please do not do this unless you are comfortable in the exim editor.

    It is VERY important to note that you will now be allowing Exim to launch more "children", which eats more memory and CPU, so you should monitor your machine quite often for the first couple of days to see whether this has a big impact or possibly kills your machine if it runs out of RAM.

    I would not suggest going too high on smtp_accept_max at first; add 20 or 30 to see what that does to your memory and CPU. Perhaps some of you can do this, as I have done on one of my machines today, and report back your findings. Right now we are going into a weekend, so it might not be an accurate test, as mail tends to be lighter now.

    Until we find out why Exim seems to be sitting on email longer, this might be the fix we need to eliminate (or at least reduce) the Exim restarts and 421/try later/connection refused errors.
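To keep an eye on how close a box is running to the limit after a change like this, something along these lines might help. It is only a sketch: parse_accept_max and accept_headroom are hypothetical helper names, and the count you feed in is whatever you trust as "current connections" on your system.

```shell
# Extract the numeric value from a line such as "smtp_accept_max = 125"
# read on standard input.
parse_accept_max() {
    awk -F'=' '$1 ~ /^[ \t]*smtp_accept_max[ \t]*$/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the value
        print $2
        exit
    }'
}

# Report how close the current connection count is to the limit.
# $1 = current number of SMTP connections, $2 = smtp_accept_max
accept_headroom() {
    awk -v cur="$1" -v max="$2" 'BEGIN {
        # Exim treats a value of zero as "no limit".
        if (max + 0 == 0) { print cur " in use, no limit set"; exit }
        printf "%d/%d in use (%d%%), %d free\n", cur, max, int(cur * 100 / max), max - cur
    }'
}
```

For example, `accept_headroom "$(ps ax | grep -c '[e]xim')" "$(exim -bP smtp_accept_max | parse_accept_max)"` gives a rough reading. The ps count also includes the main daemon and any queue runners, so it overstates the number of SMTP connections slightly.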
     
  11. Sash

    Sash Well-Known Member

    Joined:
    Feb 18, 2003
    Messages:
    252
    Likes Received:
    0
    Trophy Points:
    16
    Does anyone else have a resolution or work around? It seems raising the "smtp_accept_max" setting from 100 to 125 is not going to resolve the problem, as an increase of 25 is small if you're already having a problem.

    It seems exim is not closing connections. I'm unsure what was changed in exim during the cpanel 11 upgrade process.

    Thanks,
    Mike
     
  12. zigzam

    zigzam Well-Known Member

    Joined:
    May 9, 2005
    Messages:
    206
    Likes Received:
    0
    Trophy Points:
    16
    Have this issue as well.
     
  13. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    I think it's holding them open longer, but I'm not sure why. Our increase to 125 cut the restarts down to only 1 in the past 18 hours or so, but we are still seeing that Exim does not seem to drop connections quickly enough. I did notice that the standard Exim config with the built-in RBL checking seems to make Exim more "chatty": it sends back a very detailed message to the SMTP server, along with what looks like a reverse DNS lookup of the offending IP. This could be one reason it's sitting on connections so long. I think the cPanel team needs to review what exactly they are doing differently in the config and see if it can be reduced for each transaction/connection.
     
    #13 nyjimbo, Sep 1, 2007
    Last edited: Sep 1, 2007
  14. Sash

    Sash Well-Known Member

    Joined:
    Feb 18, 2003
    Messages:
    252
    Likes Received:
    0
    Trophy Points:
    16
    Has anyone opened a ticket for this problem?

    What version of cPanel is everyone running? We're running 11.10.0-S16448.
    Mike
     
  15. CptDecker

    CptDecker Registered

    Joined:
    Nov 11, 2006
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    Same problem here. 20 plus messages a day. We have a ticket open.
     
  16. artera

    artera Registered
    PartnerNOC

    Joined:
    Sep 27, 2005
    Messages:
    2
    Likes Received:
    1
    Trophy Points:
    3
    Same problem

    I have the same problem, only for one box ...
     
  17. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    It's likely that the affected box is getting more spam than the others and Exim is not processing the messages quickly enough or not dropping the connections properly.

    We are not seeing it on one of our machines only because it never gets to the point where it hits the limit, but I can still see the Exim tasks taking a long time to die off.
     
  18. rpmws

    rpmws Well-Known Member

    Joined:
    Aug 14, 2001
    Messages:
    1,824
    Likes Received:
    5
    Trophy Points:
    38
    Location:
    back woods of NC, USA
    Do any of you run MRTG like I do? Take a look at open connections, specifically the yearly or monthly graphs. Can you see a spike?

    I don't get the spike on anything else, like emails and such and loads are normal.

    look at this yearly graph

    it's like this on all boxes.

    2007-09-01 15:47:41 Connection from [83.5.154.45] refused: too many connections
    2007-09-01 15:47:41 Connection from [84.10.125.125] refused: too many connections
    2007-09-01 15:47:42 Connection from [218.157.141.190] refused: too many connections
    2007-09-01 15:47:42 Connection from [85.101.11.186] refused: too many connections
    2007-09-01 15:47:43 Connection from [88.238.47.124] refused: too many connections

    3 boxes (**yearly** graphs for "connections"); otherwise the boxes are fine.
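If you don't run MRTG, a rough picture of the same thing can be pulled from the Exim main log. The sketch below assumes log lines in the format quoted above (date, time, then the message); count_refused_per_hour is a made-up name for this example.

```shell
# Count "refused: too many connections" log entries per hour.
# Reads Exim main log lines on standard input and prints
# "YYYY-MM-DD HH:00 <count>" lines in chronological order.
count_refused_per_hour() {
    awk '/refused: too many connections/ {
             # $1 is the date, $2 the time; keep only the hour.
             hour = $1 " " substr($2, 1, 2) ":00"
             count[hour]++
         }
         END { for (h in count) print h, count[h] }' | sort
}
```

On a cPanel box the main log is usually /var/log/exim_mainlog (check your own path), so `count_refused_per_hour < /var/log/exim_mainlog` would show whether the refusals cluster at particular times of day.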
     


    #18 rpmws, Sep 1, 2007
    Last edited: Sep 2, 2007
  19. nyjimbo

    nyjimbo Well-Known Member

    Joined:
    Jan 25, 2003
    Messages:
    1,125
    Likes Received:
    0
    Trophy Points:
    36
    Location:
    New York
    That's the problem: it's not a true load issue or network bandwidth or something. It's just that too many Exim tasks seem to remain open/connected, so the next ones that come in must be refused due to one or more rules in the exim.conf.

    I think we are getting near to a true fix and it will likely be some kind of configuration tweak or how new features of eximconfig are being implemented.
     
  20. EcoHosting

    EcoHosting Member

    Joined:
    Mar 6, 2004
    Messages:
    23
    Likes Received:
    0
    Trophy Points:
    1
    Location:
    Montreal
    We've also been experiencing the same problem for about 4 days now. I don't think it is a coincidence, and I do not think it is an increase in spam. As mentioned before, it is likely an Exim process management issue. The processes are not terminating in a timely manner, which results in excessive processes, which in turn get killed off when the configured threshold is hit.

    At the moment it is not a mission critical problem as cPanel just restarts the service but this is worrisome and should be dealt with asap by cPanel.

    As a ticket has already been started all we can do is wait. Let's hope it's not too long a wait!

    G
     