Domain Availability Lookups - .UK
Incident Report for OpenSRS
Postmortem

Incident Date: October 1, 2020
Incident Number: PR-1372

On October 1, 2020 at 11:13 AM ET, we experienced intermittent issues with Tucows’ Domains platform in which Domain Availability Lookups for *.UK domains showed available domains as taken.

Tucows Engineers were engaged and started investigating the issue together with the registry. During these consultations, it was identified that the issue was on Tucows’ end.

On October 2, 2020 at 7:00 PM ET, the engineering team performed emergency maintenance to enable debug logging and rebooted the servers connecting to the registry. The services recovered successfully after the change was implemented and have been stable since.

The root cause of the service interruption is under investigation.

Tucows will revise and enhance monitoring for better visibility.

Tucows has enabled additional logging to facilitate in-depth investigation and improve time to recover.

Tucows will revise the escalation process with the vendor for timely resolution.

Thank you,

Tucows Engineering Team

Posted Oct 07, 2020 - 14:28 UTC

Resolved
This incident is resolved. We will publish a postmortem once the root cause is determined.
Posted Oct 02, 2020 - 23:52 UTC
Monitoring
Our engineering team has implemented a fix by updating our logging configuration and restarting the registry connection nodes. Domain Availability Lookups are functioning normally at this time. Our engineering team will continue to monitor the logs over the weekend to determine the root cause.

We appreciate your patience during this investigation.

Incident Start Time: 10-01-2020 15:13:00 UTC
Incident End Time: 10-02-2020 23:00:00 UTC
Total Duration: 1 day 7 hours 47 minutes
Posted Oct 02, 2020 - 23:27 UTC
Update
Our engineering team is gathering further data on the issue. Updates will be provided as we work towards a resolution.
Posted Oct 02, 2020 - 21:26 UTC
Update
Our engineering team executed a change to implement a workaround to restore services. However, the issue persisted so the change was reverted. The engineering team continues to troubleshoot the issue.
Posted Oct 02, 2020 - 19:22 UTC
Update
We are continuing to work with the Nominet technical team on the issue. We will update here as soon as we make any progress.
Posted Oct 02, 2020 - 16:10 UTC
Update
The .UK registry (Nominet) has updated us: they suspect that a number of queries could be the culprit due to a coding issue, and they are investigating. Our engineering team has been engaged to respond to their inquiries.
Posted Oct 02, 2020 - 14:04 UTC
Update
The .UK registry (Nominet) is aware of the situation, and it has been raised with their technical team. The issue is still being investigated, and we will update as soon as we have more information. We thank you for your patience.
Posted Oct 02, 2020 - 10:45 UTC
Update
We have been trying to reach the .UK registry (Nominet) but have not received a response. We will continue to follow up for an update.
Posted Oct 01, 2020 - 20:29 UTC
Update
We are continuing our efforts to make contact with the Nominet registry regarding Domain Availability Lookups.

Next Update: Within 60 mins
Posted Oct 01, 2020 - 19:25 UTC
Update
We have reached out to the Nominet registry regarding this issue but have not heard back yet.

Next Update: Within 60 minutes
Posted Oct 01, 2020 - 18:14 UTC
Identified
The engineering team has identified the issue to be on the registry side. We have reached out to the Nominet registry to investigate further.
Posted Oct 01, 2020 - 17:05 UTC
Update
Our investigation continues. Domain Availability Lookups are intermittently showing domains as taken when they are available.

Next Update: Within 30 minutes
Posted Oct 01, 2020 - 16:30 UTC
Investigating
We are currently investigating an issue with false positives being returned when searching for .UK domains. This has affected both domain registrations and transfers for the .UK TLD. Our development teams have been engaged.
Posted Oct 01, 2020 - 15:44 UTC
This incident affected: Domain Services (Core ccTLDs).