Incident Date: July 15, 2021
Incident Number: PR-2178
On July 15, 2021, at 6:06 AM ET, Tucows’ hosted email platform experienced a service interruption, causing email delays and login issues in Prod A.
The service interruption was due to a file system error that caused a high load on the network storage pair.
The Tucows Engineering team increased the severity of the incident upon observing the external impact.
At 9:20 AM ET, the engineering team disabled the IMAP nodes to alleviate the high load causing login failures for a subset of users.
At 11:30 AM ET, the engineering team successfully stabilized the storage devices and restored services in a controlled manner.
At 2:50 PM ET, the engineering team restored all services after stabilizing the hosted email environment.
The Tucows Engineering team has successfully upgraded the core software on the affected systems to prevent this incident from happening again.
Tucows Engineering Team