Cluster A - email delays
Incident Report for OpenSRS
Postmortem

Incident Date: July 26, 2020
Incident Number: PR-1215

On July 26, 2020, at 10:50 AM ET, the Tucows Hosted Email platform experienced service degradation and email delivery delays impacting Cluster A.

The service interruption was caused by high load on a network storage device.

At 1:35 PM ET, the engineering team successfully stopped all the processes that were causing the high load and brought the services back online.

Tucows is in the process of increasing resources to further spread the load and eliminate future interruptions.

Thank you,

Tucows Engineering Team

Posted Jul 28, 2020 - 15:31 UTC

Resolved
Our engineering team has resolved the issue on Cluster A that was causing longer load times and email delays.

Incident Start Time: 07-26-2020 14:50:00 UTC
Incident End Time: 07-26-2020 17:35:00 UTC
Total Duration: 2 hours 45 mins
Posted Jul 26, 2020 - 18:21 UTC
Update
The engineering team is still working on resolving the identified issues to alleviate the load. We will provide further updates shortly.

Customers may experience longer load times and email delays.
Posted Jul 26, 2020 - 17:21 UTC
Identified
The engineering team has identified issues that were causing high load on one of our storage devices. They are currently working on resolving them to alleviate the load.

Customers may experience longer load times and email delays.
Posted Jul 26, 2020 - 16:21 UTC
Investigating
We are experiencing high load on one of our storage devices, which is causing email delays in Cluster A. We have engaged the engineering team, and they are currently investigating the issue.
Posted Jul 26, 2020 - 15:24 UTC
This incident affected: Hosted Email (Cluster A, Webmail).