Domain Lookups Issue

Incident Report for OpenSRS

Postmortem

Incident Date: October 27, 2021
Incident Number: PR-2513

On October 27, 2021 at 1:07 PM ET, Tucows experienced a service interruption impacting the Hover, Ting Mobile and MSE platforms.

The service interruption was caused by a hardware failure in one of the core routers.

At 1:10 PM ET, services automatically failed over to the secondary router and the majority of them started to recover.

At 2:18 PM ET, the engineering team restarted the impacted systems to restore all affected services.

Tucows is in contact with external vendors to investigate the cause of the hardware failure and will implement a solution based on their recommendations to prevent future interruptions.

Tucows will also review and enhance its monitoring to improve visibility and to address such issues more quickly.

Thank you,

Tucows Engineering Team

Posted Oct 30, 2021 - 05:45 UTC

Resolved

This issue has now been resolved.

Incident Start Time: 10-27-2021 17:07:00 UTC
Incident End Time: 10-27-2021 18:18:00 UTC
Total Duration: 1 hour 11 minutes.
Posted Oct 27, 2021 - 18:33 UTC

Identified

The engineering team has identified the issue and is working to resolve it.
Posted Oct 27, 2021 - 17:46 UTC

Update

We are currently investigating an issue that is impacting domain lookups within the reseller control panel. The engineering team has been engaged and is investigating.
Posted Oct 27, 2021 - 17:38 UTC

Investigating

We are currently investigating this issue.
Posted Oct 27, 2021 - 17:23 UTC
This incident affected: Control Panels (Reseller Control Panel, Classic RWI).