Service disruption

Incident Report for Greenlight Guru Services

Postmortem

Summary

Between 05:40 PM and 06:28 PM UTC on Thursday, Aug 7th, 2025, customers experienced a service disruption with the Greenlight Guru - Quality Management system due to an outage with a 3rd party vendor. All other products remained fully functional.

Root Cause

Due to an outage with a 3rd party vendor, the Greenlight Guru - Quality Management system failed to resolve certain functionality, which caused various errors and subsequent failures within the product. This outage was the first of its type and scale with the vendor, and it exposed a gap in the fault tolerance and mitigation plans already in place for that vendor.

Impact

During the incident, some customers were unable to perform regular actions with the Greenlight Guru - Quality Management system.

Timeline

  • 05:40 PM UTC - Internal monitoring system alerted team of elevated errors
  • 05:42 PM UTC - Internal response team began investigation of errors
  • 05:50 PM UTC - Internal response team confirmed issue with 3rd party vendor
  • 05:53 PM UTC - Communication with 3rd party vendor established, collaborated on resolution path
  • 06:00 PM UTC - Met with 3rd party vendor for status update
  • 06:10 PM UTC - Internal response team flagged gap in fault tolerance with 3rd party vendor
  • 06:21 PM UTC - Received communication from 3rd party vendor that service was being restored incrementally
  • 06:28 PM UTC - Confirmed service fully restored
  • 06:30 PM UTC - Internal team monitored functionality and 3rd party vendor status for next 3 hours

Remediation

Immediate: The Greenlight Guru team worked with the vendor to ensure a timely restoration.

Post: The Greenlight Guru team closely monitored functionality for several hours after the incident.

Action Items

  1. Improve Fault Tolerance with 3rd Party Vendors

    1. Strengthen system resilience to third-party failures
    2. Improve client error handling to account for fault tolerance scenarios
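
As an illustration of the kind of fault tolerance these action items describe, the following is a minimal sketch of a circuit breaker with a fallback, one common pattern for containing third-party failures. It is not Greenlight Guru's implementation: the report does not name the vendor or the chosen mitigation, and the class name `CircuitBreaker`, its parameters, and the fallback behavior are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical helper: stop calling a failing dependency for a cool-down period."""

    def __init__(
        self,
        max_failures: int = 3,          # assumed threshold, not from the report
        reset_after_s: float = 30.0,    # assumed cool-down, not from the report
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], fallback: Callable[[], T]) -> T:
        # While the breaker is open, skip the vendor until the cool-down elapses.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()
            # Cool-down elapsed: allow a single trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            # Count the failure and open the breaker once the threshold is hit.
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        # A successful call resets the failure count.
        self.failures = 0
        return result
```

In use, the vendor call would be wrapped as `breaker.call(call_vendor, degraded_response)`, where both function names are placeholders. The client then receives a degraded but well-formed response during a vendor outage instead of an unhandled error.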
Posted Aug 14, 2025 - 18:17 EDT

Resolved

This incident has been resolved.
Posted Aug 07, 2025 - 14:52 EDT

Monitoring

The third-party vendor has resolved the issue, and access to our application is being restored. We will continue to monitor the situation.
Posted Aug 07, 2025 - 14:26 EDT

Update

We are continuing to work on a fix for this issue.
Posted Aug 07, 2025 - 14:18 EDT

Identified

We’re currently experiencing a service disruption due to an outage with a third-party vendor. We’re monitoring their progress and will provide updates as we learn more.
Posted Aug 07, 2025 - 14:03 EDT

Investigating

We are currently investigating this issue.
Posted Aug 07, 2025 - 13:49 EDT
This incident affected: Greenlight - QMS (Greenlight - US, Greenlight - EU, Greenlight - AUS, Greenlight - Singapore).