PingOne Dock Service Impairment (.com)
Incident Report for Ping Identity
Postmortem

Incident Summary

Beginning on May 29th, 2018, recently added users may have experienced a delay between account creation and the ability to authenticate to the PingOne Dock. A batch migration process increased the replication lag between our master and replica database nodes. New accounts were provisioned in the master database, but lookups against the read-only replica failed until the new user record had been replicated.
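The failure mode described above can be illustrated with a minimal sketch. It assumes two in-memory stores standing in for the master and read-only replica nodes. The names UserStore and find_user are hypothetical and are not taken from the PingOne code base. The fallback to the master on a replica miss is one common mitigation for stale reads, shown here for illustration only. It is not the fix applied during this incident, which repointed the applications to the master.

```python
from typing import Dict, Optional


class UserStore:
    """Hypothetical in-memory stand-in for a single database node."""

    def __init__(self) -> None:
        self._users: Dict[str, dict] = {}

    def put(self, user_id: str, record: dict) -> None:
        self._users[user_id] = record

    def get(self, user_id: str) -> Optional[dict]:
        return self._users.get(user_id)


def find_user(user_id: str, replica: UserStore, master: UserStore) -> Optional[dict]:
    """Look up a user on the replica first; on a miss, retry against the master.

    A miss on a lagging replica does not prove the user is absent, so the
    master (the source of truth) is consulted before giving up.
    """
    record = replica.get(user_id)
    if record is None:
        record = master.get(user_id)
    return record


# Simulate the incident: the account exists on the master, but replication
# has not yet delivered it to the read-only replica.
master = UserStore()
replica = UserStore()
master.put("new-user", {"name": "New User"})

replica_only = replica.get("new-user")                    # stale read: not found
with_fallback = find_user("new-user", replica, master)   # found via the master
```

A fallback like this trades extra load on the master for correctness during lag, so it would normally be limited to lookups that miss on the replica.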

Incident Timeline (all times in UTC)

May 29, 2018

  • 16:00 - Replica lag starts increasing between master and read-only replica databases.

May 31, 2018

  • 13:42 - Issue escalated to Incident Commander. SRE identified replica lag as likely cause of the issue.

  • 14:29 - SRE begins repointing applications to master database.

  • 15:55 - Services repointed. Manual verification confirms issue resolved.

  • 19:54 - Investigation determines batch migration process is responsible for lag. Process stopped.

June 1, 2018

  • 02:21 - Read-only database in sync with master. No lag reported.

  • 16:25 - SRE repoints applications back to read-only database.

Affected Services

  • PingOne Dock (.com)

Resolution

Service was restored after the applications were repointed to the master database node.

Ping Action Items

Increase the severity of the replica lag alert so that it pages the on-call SRE.
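As a rough sketch of what such an alert rule could look like, the function below maps a measured replica lag to a severity level. The thresholds (60 and 300 seconds) and the names classify_replica_lag, warn_after and page_after are assumptions chosen for illustration. The report does not state the actual thresholds or the monitoring tooling in use.

```python
def classify_replica_lag(lag_seconds: float,
                         warn_after: float = 60.0,
                         page_after: float = 300.0) -> str:
    """Map replica lag to an alert severity.

    Thresholds are hypothetical defaults, not values from the incident report.
    Returns "ok", "warn" (notify only), or "page" (page the on-call SRE).
    """
    if lag_seconds < 0:
        raise ValueError("lag_seconds must be non-negative")
    if lag_seconds >= page_after:
        return "page"
    if lag_seconds >= warn_after:
        return "warn"
    return "ok"
```

With these assumed defaults, a lag of 30 seconds is "ok", 120 seconds is "warn", and anything at or above 300 seconds pages the on-call SRE.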

Posted Jun 06, 2018 - 13:48 UTC

Resolved
This incident has been resolved.
Posted May 31, 2018 - 15:32 UTC
Investigating
Our Site Reliability Engineering team is investigating reports of recently added users being unable to log in to the PingOne Dock.

For additional questions please contact support@pingidentity.com, or follow this incident on https://status.pingidentity.com for real-time service updates.
Posted May 31, 2018 - 14:23 UTC
This incident affected: PingOne Services (PingOne dock - North America (.com)).