Welcome to Ping Identity's system status site.

PingID Service Interruption

Incident Report for Ping Identity

Postmortem

Incident Summary

PingID stopped functioning correctly, which prevented users from performing second-factor authentication. The root cause of the outage was a data replication failure in the session management system. In an unusual sequence of events, a failed node was replaced and the subsequent rebalancing of session data stalled the entire system. Mitigation actions were taken and system functionality was restored (see below).

Customer Impact

Customers were unable to use PingID for the duration of the outage.

Incident Timeline - Apr 17, 2017 (MDT)

1745 - PingID errors reported
1750 - Operations Team begins investigation
1751 - System monitoring indicates spike in HttpServerError 500
1753 - Web server stack trace shows problem connecting to the session management system
1756 - Internal escalation process initiated
1806 - Synthetic testing validates problem
1815 - Restarting web services reduces error rate
1817 - Load balancing mechanism marks all nodes as down
1822 - Heartbeats for all nodes return to normal
1832 - Status monitoring page updated
1836 - Service is restored
1851 - Status monitoring page updated
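
As a reading aid, the intervals between key events in the timeline can be worked out with a few lines of Python. The timestamps are copied from the timeline above. The dictionary keys, the helper name `minutes_between`, and the choice of which intervals to report are illustrative assumptions, not part of the original report.

```python
from datetime import datetime

# Times copied from the timeline (Apr 17, 2017, MDT, 24-hour clock).
# The key names are hypothetical labels chosen for this sketch.
TIMELINE = {
    "errors_reported": "1745",
    "investigation_begins": "1750",
    "status_page_first_update": "1832",
    "service_restored": "1836",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HHMM timestamps."""
    fmt = "%H%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


start = TIMELINE["errors_reported"]
time_to_investigate = minutes_between(start, TIMELINE["investigation_begins"])
time_to_status_update = minutes_between(start, TIMELINE["status_page_first_update"])
time_to_restore = minutes_between(start, TIMELINE["service_restored"])

print(time_to_investigate, time_to_status_update, time_to_restore)  # 5 47 51
```

By this reckoning, roughly 51 minutes passed between the first reported errors and service restoration, and the first status page update came about 47 minutes after the first report.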

Affected Services

PingID Services
PingID App
PingID Authenticator
PingID Server

Resolution

Restarting the web services allowed the stateless session management system to fully recover.

Ping Action Items

Improve error monitoring of synthetic tests to detect this type of failure sooner.
Improve status update process and method.
Change the PingID session management system implementation to be more resilient.
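
The report does not describe how the synthetic tests are implemented. As one hypothetical illustration of the first action item, the sketch below flags sustained HTTP 5xx responses across a sliding window of recent probe results. The class name `SyntheticCheckMonitor`, the window size, and the failure threshold are all assumptions made for this example and do not reflect Ping Identity's actual monitoring.

```python
from collections import deque


class SyntheticCheckMonitor:
    """Track recent synthetic probe results and flag sustained server errors.

    Hypothetical sketch: the window and threshold are illustrative values.
    """

    def __init__(self, window: int = 5, max_failures: int = 3):
        # Keep only the most recent `window` results (True means a 5xx response).
        self.results = deque(maxlen=window)
        self.max_failures = max_failures

    def record(self, status_code: int) -> bool:
        """Record one probe result; return True if an alert should fire."""
        self.results.append(status_code >= 500)
        return sum(self.results) >= self.max_failures


# Example: the alert fires once three of the last five probes returned 5xx.
monitor = SyntheticCheckMonitor(window=5, max_failures=3)
alerts = [monitor.record(code) for code in [200, 500, 500, 200, 500]]
print(alerts)  # [False, False, False, False, True]
```

A windowed threshold of this kind trades a little detection latency for fewer false alarms from a single failed probe. The right values depend on the probe interval, which the report does not state.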

Posted Apr 20, 2017 - 22:23 UTC

Resolved

This incident has been resolved. PingID service in all regions is back to normal.

Posted Apr 18, 2017 - 00:51 UTC

Investigating

Monitoring systems have detected an issue with the PingID service. The Site Reliability Engineering team has been notified and is currently working on the issue. We will update this message when the cause of the incident has been identified. Automated monitoring systems will update affected components and will resolve operational status as systems recover.

For additional questions please contact support@pingidentity.com, or follow this incident on https://status.pingidentity.com for real-time service updates.

Posted Apr 18, 2017 - 00:32 UTC

This incident affected: PingOne for Enterprise - Global (Administration API, AD Connect & Routing Service, Administration Portal, OAuth Configuration Service, Single Sign-on), PingOne for Enterprise - United States (.com services) (Directory API, Directory Login, Office365 Service, PingOne Dock, SCIM Provisioning), PingOne for Enterprise - Europe (.eu services) (Directory API, Directory Login, Office365 Service, PingOne Dock, SCIM Provisioning), PingOne for Enterprise - Australia (.com.au services) (Directory API, Directory Login, Office365 Service, PingOne Dock, SCIM Provisioning), PingID - Europe (.eu services) (PingID Authenticator, PingID Server), PingID - Australia (.com.au services) (PingID Authenticator, PingID Server), PingID - United States (.com services) (PingID Authenticator, PingID Server), PingID Global (PingID App), and Twilio (SMS).
