The Community Forums


ipaliases failed being restarted over and over again

Discussion in 'General Discussion' started by adtastichosting, Apr 1, 2012.

  1. adtastichosting

    adtastichosting Active Member

    Joined:
    Sep 13, 2008
    Messages:
    31
    Likes Received:
    0
    Trophy Points:
    6
    I keep getting this notice:

    ipaliases failed
    A restart was attempted automagically.
    Service Check Method: [check command]
    Number of Restart Attempts: 1
    Cmd Service Check Raw Output: ipaliases has missing ips

    It happens over and over, at least a dozen times a day, on my dedicated server running CentOS 5.8 with WHM 11.30.6.

    This only started in the last few days. The server has been running for a long time without any issues until now. To my knowledge, nothing has been modified or changed on it other than standard cPanel updates and so forth.

    I've looked through the logs and see nothing that tells me anything. But of course, I'm not an expert, just fair-to-middling skills, so I might not know what to look for specifically.

    I have 11 IPs on the server, 10 of them dedicated to certain accounts with SSL. All of a sudden some of them will fail. Running /etc/rc.d/init.d/ipaliases restart makes everything normal again, for a while. Then, zap, ipaliases fails again.

    Ideas here?
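
When the "ipaliases has missing ips" notice fires, it can help to see exactly which addresses have dropped off the interface. The following is a minimal sketch, assuming /etc/ips holds one ip:netmask:broadcast entry per line and that the iproute2 `ip` command is available; the function names are made up for illustration.

```shell
# Print the IPv4 addresses currently bound on this host, one per line.
bound_ips() {
    ip -4 -o addr show | awk '{ split($4, a, "/"); print a[1] }'
}

# List IPs from a cPanel-style ips file (ip:netmask:broadcast per line)
# that do not appear in a file of currently bound addresses.
# $1 = ips file, $2 = file with one bound IPv4 address per line
missing_ips() {
    local ips_file=$1 bound_file=$2 ip
    while IFS=: read -r ip _ || [ -n "$ip" ]; do
        [ -z "$ip" ] && continue
        # -x: whole-line match, -F: treat the IP as a literal string
        grep -qxF -- "$ip" "$bound_file" || echo "$ip"
    done < "$ips_file"
}
```

Something like `bound_ips > /tmp/bound.txt; missing_ips /etc/ips /tmp/bound.txt` run right after a failure notice would then print only the aliases that are currently down.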
     
  2. adtastichosting

    As an addendum to this post, I opened a support ticket with cPanel. This was their analysis:
    "The only errors I see are those reported by chkservd. This usually indicates an IP conflict. It looks like it's dying every 10 to 15 minutes which would make sense if there was an IP conflict on your network. My first suggestion is that you contact your datacenter or webhost and have them check their switch ARP logs and see if they are finding multiple MAC addresses arping with any of those IP addresses. If not there, I recommend that you disable csf to see if it happens then. It's possible that CSF is blocking the service checks for ipaliases. I'm fairly certain that you'll find it in one of those two places as the service check isn't able to connect to the IP addresses when the service is reported as down."

    I had the NOC look into it and they've offered nothing; everything seems to be in order. We captured traffic on eth0 with Wireshark, but nothing looked out of the ordinary. Of course, we have no way to see what is happening at the moment of failure, since nothing shows up in the logs. I disabled CSF to see if that was causing the issue, but it made no difference. I've checked all the associated files and settings, such as /etc/nameserverips, /etc/ips, /etc/init.d/ipaliases, chkconfig --list ipaliases and so forth, and found nothing out of the ordinary. There have been over 20 failures today alone, and each time the server is unresponsive for a good while until it resets. Often it will reboot and then ipaliases will fail again within minutes; sometimes there is no problem for a while.

    I'm totally at a loss here. Some good ideas about what to check or do would be most welcome.
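
The IP-conflict theory from the support ticket can also be probed from the server itself, rather than only from the switch side. This is a rough sketch, assuming the iputils version of `arping` (whose -D option does duplicate address detection and exits 0 when no other host replies), an interface named eth0, and the same ip:netmask:broadcast layout for /etc/ips. The function name and the ARPING override variable are hypothetical.

```shell
# Probe each IP in a cPanel-style ips file for another host answering ARP.
# $1 = ips file, $2 = interface (defaults to eth0, an assumption)
# Set ARPING to substitute a different command, e.g. for a dry run.
check_conflicts() {
    local ips_file=$1 iface=${2:-eth0} ip
    while IFS=: read -r ip _ || [ -n "$ip" ]; do
        [ -z "$ip" ] && continue
        # A non-zero exit means a reply was seen (or arping itself failed)
        if ${ARPING:-arping} -D -q -I "$iface" -c 3 "$ip" </dev/null; then
            echo "$ip: no reply"
        else
            echo "$ip: POSSIBLE CONFLICT"
        fi
    done < "$ips_file"
}
```

Run as root with `check_conflicts /etc/ips eth0`. A "POSSIBLE CONFLICT" line is only a hint, since arping also exits non-zero on errors such as missing privileges, so it is worth re-running the single arping command by hand before chasing the datacenter.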
     
  3. cPanelTristan

    cPanelTristan Quality Assurance Analyst
    Staff Member

    Joined:
    Oct 2, 2010
    Messages:
    7,623
    Likes Received:
    21
    Trophy Points:
    38
    Location:
    somewhere over the rainbow
    cPanel Access Level:
    Root Administrator
    Hello,

    Can you check /var/log/messages for one of the IPs that has recently failed to work to see what it shows?

    Code:
    grep IP# /var/log/messages
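
Since the dots in an IP address are regex wildcards, and a short address can also match inside a longer one, a fixed-string, whole-word match is a bit tighter than a plain grep. A minimal sketch (the function name is made up for illustration, and GNU grep is assumed):

```shell
# Show log lines that mention an exact IP address.
# $1 = IP address, $2 = log file (defaults to /var/log/messages)
grep_ip() {
    local ip=$1 log=${2:-/var/log/messages}
    # -F: match the IP literally; -w: keep 10.0.0.1 from matching 10.0.0.10
    grep -Fw -- "$ip" "$log"
}
```

For example, `grep_ip 10.0.0.1` would search /var/log/messages, where 10.0.0.1 stands in for one of the failed IPs.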
    You might well see avahi-daemon withdrawing the IP. If it isn't configured correctly with the IPs registered, it can withdraw the IP on the network interface. If you see entries for it, please ensure you have physical access to the machine (or someone with physical access standing by) and then check the config run levels for it and disable avahi-daemon entirely:

    Code:
    chkconfig --list avahi-daemon
    chkconfig avahi-daemon off
    chkconfig --list avahi-daemon
    Next, shut it down: find its processes with ps aux and then run kill -9 on them, along the lines of the following example:

    Code:
    # ps aux | grep avahi
    avahi    24652  0.1  0.0  23300  1408 pts/0    S    23:28   0:00 avahi-daemon: running [sloth.local]
    avahi    24653  0.0  0.0  23168   336 ?        Ss   23:28   0:00 avahi-daemon: chroot helper
    
    # kill -9 24652
    # kill -9 24653
    -bash: kill: (24653) - No such process
    [1]+  Killed                  /usr/sbin/avahi-daemon
    
    # ps aux | grep avahi
    root     24666  0.0  0.0  61196   764 pts/0    S+   23:29   0:00 grep --color avahi
    The server may go down at this point. If it stays up, I would still recommend a reboot to ensure everything is working properly. If it does go down, make sure it comes back online by monitoring it at the console.

    Also, if it does end up being avahi and the above corrects the situation, could you then post the ticket number here?

    Thanks!
     
  4. adtastichosting

    Thanks for stepping in and providing some info! Actually, we'd already been through all the logs trying to find the culprit and found nothing, I tell you. I put in a ticket with cPanel for assistance and they spent the last couple of days going through everything, to no avail. The NOC couldn't find anything either. Then, out of the blue, get this, they "discovered" a fan was dead. The fan was replaced and there hasn't been an issue since. Go figure. I guess my lesson is the one thing I didn't think of in this instance: check the temps. :) But who would've thunk it? Maybe the hard drive was getting too hot and causing the problem? Seems reasonable, I suppose: a new fan and the problem goes away. Of course, my wife attributes it simply to "Mercury retrograde." Yeah, that's it. Naw....
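
For anyone landing here with the same symptoms, one rough way to check drive temperature is to read the SMART attribute table. The sketch below assumes smartmontools is installed, that the drive is /dev/sda, and that it reports an attribute named Temperature_Celsius (not every drive does); the function name is made up for illustration.

```shell
# Print the raw Temperature_Celsius value from `smartctl -A` output on stdin.
drive_temp() {
    # In the attribute table, column 2 is the name and column 10 the raw value
    awk '$2 == "Temperature_Celsius" { print $10 }'
}
```

Usage would be along the lines of `smartctl -A /dev/sda | drive_temp`, run as root. An empty result just means the drive does not expose that attribute under that name.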
     