Customers experienced intermittent failures when loading appointments and other data.
All times are EST.
Date/Time | Activity |
---|---|
2022-02-18 08:14 | Support team started receiving tickets related to appointment creation/updates, exam room moves, and an inability to save faxes to charts. |
2022-02-18 08:17 | DevOps team verified that the database metrics were normal. |
2022-02-18 08:18 | Investigation of software errors as a potential cause was started. |
2022-02-18 08:19 | Additional examples of the behavior were provided by the support team. |
2022-02-18 08:19 | Investigation into software and infrastructure continued. |
2022-02-18 08:30 | Zoom meeting created to discuss the issue and the lack of an obvious cause. |
2022-02-18 09:25 | Status page created. |
2022-02-18 09:30 | Issue identified and mitigated. |
2022-02-18 09:33 | After manual verification of the resolution, the QA team started automated testing to verify the fix. |
2022-02-18 09:56 | Status page updated to Monitoring. |
2022-02-18 10:16 | QA reported that the issue was fixed after completion of automated testing. |
2022-02-18 11:05 | Status page resolved. |
One of the replica databases was missing data. However, none of the other metrics associated with replication issues were reporting an error state. The issue was identified through manual querying of all of the replicas.
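The manual check described above can be sketched as a comparison of each replica against the primary. This is a minimal illustration only, not the queries that were actually run: the `appointments` table, the replica names, and the use of in-memory SQLite databases as stand-ins for the real servers are all assumptions made for the example.

```python
import sqlite3


def make_db(appointment_ids):
    """Create an in-memory database standing in for one server (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE appointments (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO appointments (id) VALUES (?)",
        [(i,) for i in appointment_ids],
    )
    conn.commit()
    return conn


def missing_ids(primary, replica):
    """Return IDs present on the primary but absent from the replica."""
    query = "SELECT id FROM appointments"
    primary_ids = {row[0] for row in primary.execute(query)}
    replica_ids = {row[0] for row in replica.execute(query)}
    return sorted(primary_ids - replica_ids)


def find_divergent_replicas(primary, replicas):
    """Map each divergent replica name to its missing IDs; healthy ones are omitted."""
    report = {}
    for name, conn in replicas.items():
        gap = missing_ids(primary, conn)
        if gap:
            report[name] = gap
    return report


# Hypothetical topology: one primary and two replicas, one of them missing rows.
primary = make_db(range(1, 6))
replicas = {
    "replica-a": make_db(range(1, 6)),
    "replica-b": make_db([1, 2, 3]),  # simulates the replica that was missing data
}
divergent = find_divergent_replicas(primary, replicas)
print(divergent)  # {'replica-b': [4, 5]}
```

A content comparison like this can flag a replica whose replication metrics all look healthy, which is the situation described above. On a production-sized table, comparing row counts or checksums over a recent ID range would likely be more practical than loading every ID.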
Once the server was identified, it was removed from the load-balancing pool so that it no longer served traffic.
All of our customers could have experienced inconsistent results with appointments and notes during the incident. The data was saved correctly on the primary server, but when it was loaded from the failing database, the application threw a "not found" error.
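The intermittent nature of the errors follows from how reads are spread across replicas. The simulation below is a sketch under stated assumptions: the `ReadPool` class, the round-robin selection, and the dictionary "databases" are hypothetical and do not describe the actual load balancer or schema.

```python
class NotFound(Exception):
    """Raised when a record is absent from the replica that served the read."""


class ReadPool:
    """Hypothetical round-robin pool of read replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._next = 0

    def remove(self, replica):
        # Take a replica out of rotation so it no longer serves reads.
        self.replicas = [r for r in self.replicas if r is not replica]

    def load(self, key):
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        if key not in replica:
            raise NotFound(key)
        return replica[key]


primary = {}
healthy_replica = {}
failing_replica = {}  # never receives the replicated rows


def save_appointment(appt_id, data):
    primary[appt_id] = data          # the write succeeds on the primary
    healthy_replica[appt_id] = data  # replication succeeds here
    # failing_replica silently misses the write


def read_four_times(pool, appt_id):
    results = []
    for _ in range(4):
        try:
            results.append(pool.load(appt_id))
        except NotFound:
            results.append("not found")
    return results


pool = ReadPool([healthy_replica, failing_replica])
save_appointment(1, "checkup")

before = read_four_times(pool, 1)
pool.remove(failing_replica)  # the mitigation: remove the bad server from the pool
after = read_four_times(pool, 1)

print(before)  # ['checkup', 'not found', 'checkup', 'not found']
print(after)   # ['checkup', 'checkup', 'checkup', 'checkup']
```

In this sketch the same record alternates between loading and failing until the bad replica is removed, which matches the inconsistent behavior customers reported even though every write had succeeded.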
The stabilization steps served as the immediate corrective action. In addition, the problematic server will be rebuilt, and we are evaluating our database configurations to minimize the likelihood of this behavior recurring. We are also researching improvements to our monitoring and alert notifications so that this behavior can be identified proactively.