On November 18, 2024, at 15:37 UTC, we received alerts that workspace resource allocation wasn't functioning correctly. This issue affected users' ability to access models in Anaplan Data Centers: US East, US West, Germany, and the Netherlands.
We quickly validated that the suspected component was healthy. Further investigation identified that there was a loss in communication and a drop in network traffic to one of the backend subcomponents. The backend subcomponent was restarted, and full service was restored at 16:02 UTC.
We reviewed the logs to understand why the backend subcomponent wasn't functioning correctly. The subcomponent was attempting to connect to a specific infrastructure resource, and the request timed out because that resource was unavailable. Rather than attempting a new connection to a different resource, the subcomponent targeted the same resource again, and that request also failed. This occurred several times, and the subcomponent ultimately failed because it was unable to complete the connection request. In its failed state, the subcomponent was unable to communicate with the frontend component, which caused the platform incident.
To prevent a recurrence, we are improving how the subcomponent responds to errors, which will make it less likely to fail when timeouts occur. We are also increasing our observability of the subcomponent to enable early detection and preemptive mitigation of potential issues before they impact customer experience.
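As a general illustration of this kind of error handling, and not a description of the actual change, the sketch below shows a connection routine that moves on to a different resource after a timeout instead of retrying the same one. The function name `connect_with_failover`, the `connect` callable, the endpoint names, and the `max_rounds` parameter are all hypothetical and chosen only for this example.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(
    endpoints: Sequence[str],
    connect: Callable[[str], T],
    max_rounds: int = 2,
) -> T:
    """Try each endpoint in turn, moving to the next one after a timeout.

    All names here are hypothetical; `connect` stands in for whatever
    call opens a connection to an infrastructure resource.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")

    last_error: Optional[Exception] = None
    for _ in range(max_rounds):
        for endpoint in endpoints:
            try:
                return connect(endpoint)
            except TimeoutError as exc:
                # Do not retry the same resource immediately;
                # fall through to the next endpoint in the list.
                last_error = exc

    # Every endpoint timed out in every round: report a clear error
    # rather than leaving the caller in an undefined state.
    raise ConnectionError(
        f"all {len(endpoints)} endpoints timed out after {max_rounds} rounds"
    ) from last_error


def fake_connect(endpoint: str) -> str:
    """Stand-in connector: 'resource-a' is unavailable, others succeed."""
    if endpoint == "resource-a":
        raise TimeoutError(f"{endpoint} did not respond")
    return f"connected:{endpoint}"


result = connect_with_failover(["resource-a", "resource-b"], fake_connect)
print(result)
```

In this sketch a timeout on one resource leads to an attempt on the next, and a failure across all resources surfaces as a single explicit error that the caller can handle.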
We apologize for any impact this issue may have had on your business operations. We are continuously strengthening our systems and procedures to ensure we avoid future disruptions to your business and users.
If you have any further questions or concerns, please contact Anaplan Customer Care. Thank you for your patience during this situation and thank you for being an Anaplan customer.