Cluster A - Outbound Mail and Webmail Login
Incident Report for OpenSRS
Postmortem

Incident Date: November 2, 2021
Incident Number: PR-2528 

On November 2, 2021, at 2:22 PM ET, Tucows’ hosted email platform experienced a service interruption impacting POP, IMAP, and Webmail for Prod A.

The service interruption was caused by a high number of established connections on the Webmail database in the legacy hosted email platform.

At 2:35 PM ET, the engineering team restarted the Webmail database to reset its connections. All services were restored successfully.

At 3:40 PM ET, Tucows encountered another service interruption impacting email services in Prod A. 

At 3:57 PM ET, all the affected services recovered successfully without any intervention.

Tucows is investigating the root cause and developing a plan to roll out a permanent solution to the issue.

Tucows is committed to continuing the hosted email migration to the new cloud in order to maintain a scalable and stable hosted email environment.

Thank you,

Tucows Engineering Team

Posted Nov 08, 2021 - 16:30 UTC

Resolved
Engineers were able to restart services. Cluster A outbound mail and webmail are fully recovered.

Incident Start Time: 11-02-2021 18:22:00
Incident End Time: 11-02-2021 18:35:00
Total Duration: 13 minutes
Posted Nov 02, 2021 - 18:48 UTC
Investigating
Some users may experience issues with outbound mail on Cluster A. Our engineering team is investigating the issue.

We will provide an update once we have additional information.
Posted Nov 02, 2021 - 18:35 UTC
This incident affected: Hosted Email (Cluster A).