On 05/13, customers experienced issues on some screens: certain sections, such as office locations, went missing or stopped working.
All times are in PST.
|2022-05-12 09:13||A separate issue prompts the DevOps team to release a mid-day hotfix to all servers, rolled out at a slower pace to prevent significant customer impact. The rollout causes some issues for customers accessing the site while it is in progress.|
|2022-05-13 07:00||The errors are found to be pages that do not render correctly. Because the affected pages are cached, they do not recover when the user navigates to another page. Users are instructed to perform a hard refresh, which fixes the issue.|
|2022-05-15 07:51||The engineering team is alerted by reports of users having problems when they open the calendar screen. An incident is posted to the status page.|
|2022-05-18 06:25||The root cause is traced to the previous deployment. Process changes are put in place so that future deployments do not cause a similar issue.|
|2022-05-20 08:44||Status page is updated to monitoring.|
|2022-05-25 08:27||A member of the DevOps team puts mitigation measures in place that should prevent this issue going forward.|
Our deployment steps do not currently account for frontend files that change during a deployment but still need to be served for a while afterwards. Requests for those files can fail while the deployment is in progress, and the failed responses are then cached and persist for a long time.
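To illustrate this class of problem, here is a minimal sketch of two common safeguards. It is not a description of our actual mitigation. It assumes a hypothetical naming convention in which frontend assets carry a content fingerprint (for example `app.3f9a1c2b.js`), and the function names `cache_control_for` and `files_to_serve` are made up for this example. The first function makes sure failed responses are never cached; the second keeps the previous release's files available while a rollout is in progress.

```python
import re

# Hypothetical convention: fingerprinted assets look like "app.3f9a1c2b.js".
FINGERPRINT = re.compile(r"\.[0-9a-f]{8,}\.(?:js|css)$")


def cache_control_for(path, status):
    """Pick a Cache-Control value so that failed responses are never cached."""
    if status >= 400:
        # A 404 served mid-deployment must not be stored by browsers or CDNs.
        return "no-store"
    if FINGERPRINT.search(path):
        # Fingerprinted files never change, so they are safe to cache long-term.
        return "public, max-age=31536000, immutable"
    # Everything else (e.g. index.html) is revalidated on each request.
    return "no-cache"


def files_to_serve(previous_release, current_release):
    """During a rollout, keep serving the previous release's files as well."""
    return set(previous_release) | set(current_release)
```

With this approach, a page loaded just before a deployment can still fetch the older assets it references, and any request that does fail is retried on the next navigation instead of being served from cache.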
No system actions were performed immediately. Processes have since been changed so that our deployments minimize any potential customer impact.
Customers had trouble loading offices on the calendar page and encountered other UI problems throughout the application.
The following internal process improvements have been implemented: