VoIP Devices Unregistered
Incident Report for MeloTel Network Operations (NOC)
Postmortem

Yesterday (Wednesday), we experienced a problem with VoIP devices losing registration. The root cause was not identified immediately, but we were able to restore services.

  • Wednesday Evening: 14-Minute Interruption

Last night we performed some maintenance tasks on the environment in an effort to mitigate the problem.

This morning, devices were again flapping registration, causing customers to lose calls and preventing them from making new ones.

Once we involved our switch vendor, they clearly identified the problem as stemming from a security measure we have in place as part of our intrusion and toll-fraud IP blocking processes.

Technically, a cache was growing too large and needed to be pruned. Whenever it exceeded capacity, the firewall stopped all connections as a fail-safe.

The log was pruned and services were restarted, which restored service.
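
For readers curious about the mechanism, the following is a minimal sketch of the general idea behind pruning a bounded cache of blocked IPs so that it never reaches capacity. It is an illustration only and not the firewall's or our vendor's actual implementation; the class name `BlockedIpCache`, the `capacity` and `ttl_seconds` parameters, and the example addresses are all hypothetical.

```python
import time
from collections import OrderedDict


class BlockedIpCache:
    """Hypothetical bounded cache of blocked IPs (illustration only)."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity        # assumed maximum number of entries
        self.ttl_seconds = ttl_seconds  # assumed lifetime of a block entry
        self._entries = OrderedDict()   # ip -> time the block was recorded

    def block(self, ip, now=None):
        """Record a blocked IP, pruning first so capacity is never exceeded."""
        now = time.time() if now is None else now
        self.prune(now)
        # Re-blocking an IP refreshes its timestamp and moves it to the end.
        self._entries.pop(ip, None)
        self._entries[ip] = now
        # Evict the oldest entries instead of growing past capacity.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def prune(self, now=None):
        """Drop entries older than the TTL and return how many were removed."""
        now = time.time() if now is None else now
        cutoff = now - self.ttl_seconds
        removed = 0
        # Entries are ordered oldest first, so stop at the first fresh one.
        while self._entries:
            _, added = next(iter(self._entries.items()))
            if added > cutoff:
                break
            self._entries.popitem(last=False)
            removed += 1
        return removed

    def is_blocked(self, ip):
        """Report whether the IP is currently held in the cache."""
        return ip in self._entries

    def __len__(self):
        return len(self._entries)


# Usage with explicit timestamps so the behaviour is easy to follow.
cache = BlockedIpCache(capacity=2, ttl_seconds=60)
cache.block("203.0.113.1", now=0)
cache.block("203.0.113.2", now=10)
cache.block("203.0.113.3", now=20)   # evicts the oldest entry
expired = cache.prune(now=75)        # drops entries recorded at or before t=15
```

In a design along these lines, running `prune` on a regular schedule keeps the cache below its limit, so a capacity-triggered fail-safe would not be reached through normal growth.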

  • Thursday Morning: 32-Minute Interruption

Engineers are confident that the adjustment to the log pruning rotation has remedied the problem once and for all.

We learned an important lesson from this event and remain committed to providing our customers with a very high standard of reliability.

We sincerely apologize for the inconvenience caused during this event.

From the technical support perspective, this ticket has been resolved. If anything related to this ticket changes, please reply to this email to re-open. Otherwise, please open a new ticket by emailing us at support@melotel.com.

Posted Sep 06, 2018 - 10:22 EDT

Resolved
This problem has been identified and resolved. Devices are back online and this issue will not recur. RFO to follow.
Posted Sep 06, 2018 - 09:59 EDT
Update
Services should be restored at this time. We are monitoring very closely.
Posted Sep 06, 2018 - 09:52 EDT
Update
The problem with devices failing registration was caused by our firewall. It should be resolved shortly.
Posted Sep 06, 2018 - 09:50 EDT
Identified
We have identified the root cause of this problem and it is being worked on right now. Details to follow.
Posted Sep 06, 2018 - 09:46 EDT
Update
We are experiencing an ongoing problem with our environment that is causing a critical process to hang. As a result, devices unregister. Once the service is restarted, registration is permitted again. We have already engaged our switch vendor to investigate the situation, and all hands are on deck to remedy this issue. We apologize for the inconvenience. Details will follow.
Posted Sep 06, 2018 - 09:43 EDT
Update
Please bear with us as some customers experience their lines unregistering and re-registering. It seems devices have stabilized. If your device is not registered, please restart it. This ticket will remain open while we investigate.
Posted Sep 06, 2018 - 09:35 EDT
Update
We see lines are registering again. We're continuing diagnosis.
Posted Sep 06, 2018 - 09:31 EDT
Investigating
We are seeing an issue with lines unregistering again this morning. They are back online. We are investigating.
Posted Sep 06, 2018 - 09:27 EDT
This incident affected: VoIP Services (SIP Registration).