On December 4, 2016, unexpected errors occurred in production during a planned PingID database migration from MySQL to Cassandra. Approximately 0.5% of PingID users were temporarily affected, and less than 0.51% of PingID users experienced configuration data corruption as a result of the migration issues.
Single sign-on (SSO) was unavailable for approximately 0.5% of PingID MFA users as a result of this migration issue. Because MFA was blocked for the affected users, they could not SSO to applications that use PingID. Because of the configuration data corruption, administrators could not unpair these users or remove them from PingID.
Incident Timeline (MST)
- 11:04 - Development teams discover errors during database migration
- 11:49 - Discovery is escalated to the Site Reliability Team for assistance with investigation
- 13:27 - Technical Support escalates to Site Reliability Team reporting PingID users are unexpectedly unpaired/disabled
- 15:27 - Site Reliability Team validates the database restore process in our non-production environment
- 15:56 - Site Reliability Team completes the production database restore
- 17:15 - Status posted with a status of “Identified”
- 19:23 - Development teams start deploying a fix to production
- 19:37 - Code fix fully deployed
- 20:13 - Status post updated to “Monitoring”
- 08:21 (12/5) - Incident updated to “Resolved”
Affected Services
- PingID Services
- PingID Server (North America)
- PingID Server (Europe)
- PingID Server (Australia)
A code fix was deployed that allowed impacted users to re-pair their devices to restore PingID functionality.