[Engage] Image delivery issues
Incident Report for Voyado
Postmortem

Summary

On August 18, 2024, at 13:00, we encountered a critical issue: the SSL/TLS certificate for the custom domain images.eclub.se, which is hosted and managed by our cloud provider's CDN service, had expired. The certificate expired without prior notification from the cloud provider, causing security warnings and preventing images in customer emails from loading, which negatively impacted the user experience.

Customer Impact

Images did not load properly in emails. This affected all tenants.

Root Cause and Mitigation

The root cause of the issue was twofold:

  1. Failure to Renew Certificate: The cloud provider did not automatically renew the SSL/TLS certificate for the custom domain images.eclub.se due to a domain validation issue.
  2. Lack of Notification: The cloud provider failed to notify us about the impending expiration, leaving the issue unnoticed until the certificate had already expired.

Mitigation:

Re-enabling Encryption: We re-enabled encryption on the affected endpoint to force the initiation of a new certificate validation process.

Cloud Provider Communication: A ticket was submitted to the cloud provider to escalate the issue and expedite the resolution process. Additionally, the third-party certificate provider was contacted to request that the Voyado team responsible for cloud resources be added to the notification list for any future certificate expirations.

Next Steps

Enhanced Monitoring: Implement more focused monitoring that tracks SSL/TLS certificate expirations independently of the cloud provider, so that we receive timely alerts.
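
As an illustration of what provider-independent expiry monitoring could look like, here is a minimal Python sketch using only the standard library. It is not the monitoring Voyado has implemented. The 30-day threshold, the function names, and the status labels are assumptions made for this example.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # hypothetical alert threshold, not taken from the incident report


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp."""
    # notAfter uses the format "Aug 18 00:00:00 2024 GMT"
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def status_for(days_left: float) -> str:
    """Map remaining lifetime to a status label (labels are hypothetical)."""
    if days_left <= 0:
        return "EXPIRED"
    if days_left < WARN_DAYS:
        return "WARN"
    return "OK"


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the served certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host: str) -> str:
    """Return a one-line status for `host`, suitable for a scheduled job."""
    try:
        not_after = fetch_not_after(host)
    except ssl.SSLCertVerificationError as exc:
        # An already expired (or otherwise invalid) certificate fails the handshake.
        return f"EXPIRED {host}: certificate verification failed ({exc.verify_message})"
    days_left = days_until_expiry(not_after, datetime.now(timezone.utc))
    return f"{status_for(days_left)} {host}: {days_left:.1f} days left"
```

Running `check_host("images.eclub.se")` from a scheduled job and alerting on any result that does not start with `OK` would flag a certificate that is close to expiry, regardless of whether the cloud provider sends a notification.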

Proactive Communication with Providers: Establish direct lines of communication with both the cloud provider and the certificate authority to ensure that we are promptly informed of any potential issues with certificate renewals.

Posted Aug 19, 2024 - 15:09 CEST

Resolved
This incident has been resolved.
A post mortem will be provided as soon as it's ready.
Posted Aug 19, 2024 - 09:49 CEST
Monitoring
The affected resources have been functional throughout the night and we are currently monitoring.
Posted Aug 19, 2024 - 08:06 CEST
Update
Actions to mitigate the issue have been taken on multiple fronts.
Given the nature of the problem we expect some delay before the actions taken come into effect.
A ticket has also been raised with our cloud provider to help mitigate the issue.
Posted Aug 18, 2024 - 17:10 CEST
Identified
Investigations point toward an expired certificate managed by our cloud provider causing security warnings and/or failure to load images from our CDN.
We are working to mitigate the issue.
Posted Aug 18, 2024 - 13:49 CEST
Investigating
We are currently investigating reported problems where images hosted on Engage CDNs are failing to load in emails.
Reports indicate the issue only affects Engage-hosted images, not images from external hosts.
Posted Aug 18, 2024 - 13:30 CEST
This incident affected: Engage (Messaging).