Email Cluster A Status Archive

Email Cluster A is Online

Updated Wednesday, March 28th, 2012 at 5:03 PM ET
2012-03-28 at 21:03 UTC

We have a bit more information on the email issue we reported earlier. It turns out the issue affected only our internal email here at Tucows (mail sent and received by Tucows personnel).

We sounded the alarm in case our own experience represented a broader problem with the service. In fact, the Email service was not degraded.

In a related story, "Wolf!!!".

Email Cluster A is Online

Updated Wednesday, March 28th, 2012 at 3:27 PM ET
2012-03-28 at 19:27 UTC

The mail flow issues identified earlier have now been resolved by our Operations and Development teams. Upon review, we determined that there was no impact to customer email service.

Email Cluster A is Degraded

Updated Wednesday, March 28th, 2012 at 2:21 PM ET
2012-03-28 at 18:21 UTC

We are currently experiencing an intermittent issue affecting our Cluster A Email Service. Sending and receiving of email may be disrupted in some cases.

Our Operations team is working to resolve this issue.

Email Cluster A is Online

Updated Saturday, March 3rd, 2012 at 2:33 AM ET
2012-03-03 at 7:33 UTC

The maintenance on OpenSRS Email Cluster 'A' finished ahead of schedule and all services have been restored.

Email Cluster A is In Maintenance

Updated Saturday, March 3rd, 2012 at 12:55 AM ET
2012-03-03 at 5:55 UTC

Scheduled maintenance on OpenSRS Email Service Cluster 'A' will begin very shortly.

During this four-hour maintenance period, OpenSRS Email Service Cluster 'A' will be unavailable, including POP, IMAP, Webmail, and SMTP services, as well as provisioning.

PLEASE NOTE: End users of Email Service Cluster 'A' will not have access to their mailboxes during the maintenance period. All inbound messages will be queued for delivery after the maintenance is complete.

Email Cluster A is Online

Updated Friday, March 2nd, 2012 at 8:16 PM ET
2012-03-03 at 1:16 UTC

The previous notice was sent in error. The planned Cluster 'A' maintenance is scheduled to begin at 01:00 EST. Cluster 'A' is fully available at this time.

Email Cluster A is Online

Updated Monday, September 26th, 2011 at 2:04 PM ET
2011-09-26 at 18:04 UTC

Our earlier mitigation efforts appear to have helped, and we have not seen any recurrence of this weekend's intermittent outage.

It's been nearly 24 hours since our last event and we're continuing to work with our storage vendor to bring this issue to full resolution.

In the meantime, we feel confident that our emergency changes will bypass the problem until a permanent fix can be implemented, so we're changing the status to Online and will continue to monitor the platform closely.

Once again, we appreciate your patience and apologize for the trouble this event may have caused you and your end users.

Email Cluster A is Degraded

Updated Monday, September 26th, 2011 at 10:55 AM ET
2011-09-26 at 14:55 UTC

We've been closely monitoring the intermittent performance issues affecting Mail Cluster A. After exhaustive testing, we believe we can rule out the load balancer as the cause of the behaviour and have focused our efforts on the storage service for the Mail Cluster.

Although we haven't seen the symptoms since 14:22 ET yesterday afternoon, we know that load is a factor in these events. To mitigate the effects of load and reduce the chance of a recurrence, we worked through the night on changes intended to keep write latency to a minimum and let disk writes run as quickly and efficiently as possible during peak load.

Our current focus is to continue working with our vendor's kernel/filesystem experts to identify and resolve the root cause affecting the storage service.

We sincerely apologize for the inconvenience this issue has caused you and your customers.

Email Cluster A is Degraded

Updated Sunday, September 25th, 2011 at 9:25 PM ET
2011-09-26 at 1:25 UTC

We continue to monitor Cluster A closely.

Users may experience short periods (less than 10 minutes) during which access via POP, IMAP and Webmail is unavailable. We've taken steps to minimize the impact of these periods and to reduce their frequency.

At the same time, we are working to determine the root cause of the issue.

Once again, we're very sorry about the inconvenience to you and your customers.

Email Cluster A is Degraded

Updated Sunday, September 25th, 2011 at 4:01 PM ET
2011-09-25 at 20:01 UTC

We've made some progress toward resolving the intermittent POP3/IMAP/Webmail issues on Cluster A.

While working to rule out the load balancer update as the root cause, we identified some unusual behaviour in the storage cluster. Our logs show high-latency network filesystem writes on the storage cluster that appear to coincide with the intermittent outage events.

As we continue working to rule out the load balancer update as the root cause, we're also investigating in parallel whether the storage cluster is a contributing factor to the intermittent connectivity issue.

Impact to end user mailboxes is improving, with the 5-10 minute periods of intermittent connection problems and slowness occurring less often. Earlier today, they recurred about every 45 minutes; as of this update, the interval is closer to 1 hour 45 minutes.