[Engage] FTP is unreachable for some customers

Incident Report for Voyado

Postmortem

Summary
On January 15th, 2025, our Engage FTP service became unavailable. The issue was quickly identified; engineers restarted the service, monitored it, and verified that it was available again.

Customer Impact
The FTP service was unavailable for about 30-40 minutes, during which customers relying on it could not use it.

Root Cause and Mitigation
Root cause
A sudden high load on the FTP server contributed to the service consuming all of its available resources. This created a bottleneck in which multiple processes were blocked or delayed for lack of resources. The accumulating blocked processes overwhelmed the server and eventually caused it to crash. While we cannot definitively confirm that the high load was the sole trigger, it was very likely a significant contributing factor.
Mitigation
Restarting the affected server remediated the issue.

Next Steps
We have identified actions to reduce the risk of this issue recurring and will take them into consideration going forward. These include, but are not limited to, adjusting the resources available to the service and reducing the risk of overuse of the service.

Posted Jan 28, 2025 - 13:19 CET

Resolved

This incident has been resolved.
Posted Jan 15, 2025 - 17:17 CET

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jan 15, 2025 - 16:18 CET

Investigating

We are currently investigating this issue.
Posted Jan 15, 2025 - 16:11 CET
This incident affected: Engage (FTP).