The provisioning service was impacted by an extremely high rate of Directory API usage, which caused the PingOne Directory cloud service to queue and hold up requests.
Customer Impact
Customers who used PingOne Directory to perform administrative functions, such as creating or modifying users and groups, during this period would have seen long delays before changes took effect.
Incident Timeline - March 20, 2017 (MDT)
1610 - First internal reports of increased load for provisioning.
1630 - Operations and Directory team start investigation.
1713 - First reports of customers seeing updates not appear for extended periods.
1731 - Internal teams determined that there were several large concurrent updates from multiple customers.
1805 - Status page updated to note the degraded service.
1830 - Added 4 servers to attempt faster processing.
1858 - Issue is determined to be large queue sizes. Operations team sees it start to recover.
1900 - Removed the extra nodes because they did not increase processing speed.
1903 - Operations team monitoring to ensure system behavior is returning to normal and not degrading.
2005 - Monitoring indicates everything has returned to normal, no further delays detected.
2032 - Status posting updated to indicate degraded status has cleared.
Affected Services
PingOne Services
Provisioning of PingOne Directory users and groups.
Resolution
The issue resolved on its own once the queues had enough time to catch up.
Ping Action Items
Prioritize non-batch traffic over batch jobs, which are less sensitive to turnaround time, so that batch jobs no longer delay it.
Adjust alerting levels to allow earlier detection of queue growth.
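The two action items above can be illustrated with a minimal sketch. It is not the PingOne implementation: the class name ProvisioningQueue, the INTERACTIVE and BATCH priority classes, and the alert_depth threshold are all hypothetical names chosen for illustration. It assumes requests can be tagged as interactive or batch when they are enqueued, and that queue depth is the signal used for alerting.

```python
import heapq
import itertools

# Hypothetical priority classes: the lower number is served first.
INTERACTIVE = 0  # e.g. an administrator editing a single user
BATCH = 1        # e.g. a large bulk update


class ProvisioningQueue:
    """Illustrative queue that serves interactive requests before batch ones."""

    def __init__(self, alert_depth):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that requests with the
        # same priority are served in arrival (FIFO) order.
        self._order = itertools.count()
        # Hypothetical threshold: alert once this many requests are waiting.
        self.alert_depth = alert_depth

    def put(self, priority, request):
        heapq.heappush(self._heap, (priority, next(self._order), request))

    def get(self):
        _priority, _seq, request = heapq.heappop(self._heap)
        return request

    def depth(self):
        return len(self._heap)

    def should_alert(self):
        # A lower alert_depth detects queue growth earlier, at the cost of
        # more frequent alerts.
        return self.depth() >= self.alert_depth


queue = ProvisioningQueue(alert_depth=3)
queue.put(BATCH, "bulk-1")
queue.put(BATCH, "bulk-2")
queue.put(INTERACTIVE, "admin-edit")
alert_at_peak = queue.should_alert()
served = [queue.get(), queue.get(), queue.get()]
```

In this sketch the interactive request is served ahead of the two batch requests that arrived before it. Strict priority of this kind can starve batch work under sustained interactive load, so a real system would likely also need a fairness or aging policy.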
Posted Mar 30, 2017 - 17:13 UTC
Resolved
Delays with provisioning of cloud directory users and groups have been resolved. The Directory service is back to normal.
Posted Mar 21, 2017 - 02:33 UTC
Investigating
SRE has detected an issue causing delays with provisioning of cloud directory users and groups. SRE is currently investigating and will post an update when we know more.
Posted Mar 21, 2017 - 00:06 UTC
This incident affected: PingOne for Enterprise - United States (.com services) (Directory API).