Comodo OCSP Outage

ruiz

Well-Known Member
Feb 13, 2008
50
4
58
Hi there,

Last Thursday we had a couple of hours of downtime during working hours on a large dedicated server with over 800 accounts. Needless to say, it was bad, but the worst part is that all services came back up and we still don't know why! Here's what happened:

Right before noon, Apache and Exim stopped responding correctly, with browsers and e-mail clients receiving a "time-out" response. WHM and SSH were still working (responding) perfectly, and the server load was low.

At that moment I tried restarting Apache and Exim, and when that didn't work I tried stopping the firewall, because it seemed like a network issue... But no change.

Finally I gave up and restarted the whole server... Still no change.

After that I logged into another server at the same hosting company (this one was working with no hiccups) and tried to reach a website on the problematic server from the command line using "wget"... It worked instantly on every page.
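For anyone who wants to repeat this kind of check from a second machine, here is a minimal Python sketch that tries a plain TCP connection to each service port with a timeout. The port list and the helper names (`port_is_reachable`, `probe_services`, `SERVICE_PORTS`) are assumptions for illustration, not something cPanel provides.

```python
import socket

# Hypothetical set of ports to probe: web, TLS web, and mail (Exim).
SERVICE_PORTS = {"http": 80, "https": 443, "smtp": 25}


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def probe_services(host: str, ports: dict, timeout: float = 5.0) -> dict:
    """Probe each named port and return a {name: reachable} mapping."""
    return {name: port_is_reachable(host, port, timeout)
            for name, port in ports.items()}
```

Running `probe_services("server.example.com", SERVICE_PORTS)` (hostname is a placeholder) from both an outside machine and a neighbouring server in the same data center would show whether the ports are unreachable only from certain networks.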

From that moment I assumed it was some kind of filter or bug on the hosting company's network, so I contacted them. Unfortunately they said there was no problem with their network, so it had to be something with my server.

After a couple of hours the server started responding normally again without any change from me, or from my hosting company (allegedly). I checked all my logs and everything indicates that those services were working with no problems, but network traffic to those ports stopped during the downtime. As far as I can tell, there's no problem with the server.

The question is... Is it possible that something malfunctioned at my hosting company and caused the downtime? Any idea what it might be? Or should I keep looking for something on my server?

Thanks!
 

cPanelMichael

Administrator
Staff member
Apr 11, 2011
47,880
2,268
463
Hello,

Did you notice anything unusual in /var/log/messages or /var/log/dmesg during the downtime? It seems like a network issue based on the information you provided. You may want to follow up with your provider and let them know you reviewed the logs and don't see anything that suggests a server-level issue.

Thank you.
 

ruiz

Thanks cPanelMichael!

The same problem happened again yesterday for a few minutes, and I think I found the source. It wasn't the network, but (probably) our SSL certificate issued by Comodo.

Some websites without SSL were working correctly, so I used this service to analyze our SSL certificate:
SSL Server Test (Powered by Qualys SSL Labs)

Here is the result:
ibb.co/j8T8Pv

My main concern was the line that says:
OCSP ERROR: Exception: connect timed out [http://ocsp.comodoca.com]

Since the OCSP responder was offline, is it normal that all SSL websites on my server stopped responding? Is there a workaround? Since AutoSSL uses Comodo, did no one else notice this problem? Thanks!
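One way to confirm the symptom itself (the port accepts connections but the TLS handshake hangs) is to time the handshake separately from the TCP connect. The sketch below is only an illustration under that assumption; `time_tls_handshake` is a made-up helper name, and it says nothing about why a handshake stalls.

```python
import socket
import ssl
import time


def time_tls_handshake(host: str, port: int = 443, timeout: float = 10.0):
    """Attempt a TLS handshake and return (outcome, elapsed_seconds).

    outcome is "ok", "timeout", or "error".
    """
    context = ssl.create_default_context()
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket performs the handshake immediately on a connected socket.
            with context.wrap_socket(raw, server_hostname=host):
                outcome = "ok"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        # ssl.SSLError is an OSError; some versions report timeouts this way.
        outcome = "timeout" if "timed out" in str(exc) else "error"
    return outcome, time.monotonic() - start
```

A result of `("timeout", ...)` on port 443 while a plain connection to port 80 on the same host succeeds would match the behaviour described above.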
 

rpvw

Well-Known Member
Jul 18, 2013
1,100
477
113
UK
cPanel Access Level
Root Administrator
Couple of things to take into account:

The OCSP requirement is more likely a setting in the configuration of the browser you are using (e.g. in Firefox you can see it in Preferences > Advanced > Certificates, or by searching for the string ocsp in about:config).

There is a possibility that the OCSP server was down, overloaded or unreachable at the time you experienced the issues.

It has also been suggested in various forums that an unsynchronized time/date on the calling device (the computer you are calling the site FROM) may sometimes provoke this response.
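To rule out the clock theory quickly, a rough check is to compare the local clock with the `Date` header returned by any well-run web server. This is a minimal sketch, not a replacement for NTP; the function names are made up for illustration and the URL you pass in is your own choice.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def fetch_date_header(url: str, timeout: float = 5.0) -> str:
    """Return the Date header from a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers["Date"]


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Return local time minus remote time, in seconds.

    local_now must be timezone-aware, e.g. datetime.now(timezone.utc).
    A positive result means the local clock is ahead of the remote one.
    """
    remote = parsedate_to_datetime(date_header)
    return (local_now - remote).total_seconds()
```

For example, `clock_skew_seconds(fetch_date_header("https://example.com/"), datetime.now(timezone.utc))` should come back within a few seconds of zero on a machine with a healthy clock.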

Hope this helps
 

cPanelMichael

Hello,

This was actually due to a Comodo outage yesterday:

Comodo Certificate Authority Status

These types of outages can result in websites failing to open when the browser (e.g. Firefox) is unable to directly connect to the OCSP server. Note that we did implement the following case back in June:

EA-6302: Add SSLStaplingResponderTimeout to help when OCSP is down

This helps to ensure the connection fails sooner when the OCSP server is down, whereas before the connection would hang. I recommend using the "Subscribe" button in the Comodo status URL referenced above so you are alerted when there's a Comodo outage in order to better identify when this issue might appear.
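For reference, this is roughly what the relevant mod_ssl stapling directives look like in an Apache configuration. Treat it as a sketch: the directive names are standard mod_ssl, but the values and the cache path are illustrative assumptions, not the settings cPanel actually ships, and on a cPanel server changes like this normally belong in an include file rather than in httpd.conf directly.

```apache
# Illustrative values only; adjust to your environment.
SSLUseStapling On
# The cache path and size below are assumptions for this example.
SSLStaplingCache "shmcb:/var/run/ocsp_stapling(128000)"
# Give up on the OCSP responder after 5 seconds instead of waiting longer.
SSLStaplingResponderTimeout 5
# Do not pass responder failures on to the client.
SSLStaplingReturnResponderErrors Off
SSLStaplingFakeTryLater Off
```

A shorter responder timeout limits how long a request can stall while the server waits on the OCSP responder.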

Thank you.
 