ReadMe Status Page

Developer Platforms & Tools · monitored by Alert24

readme.com
All Systems Operational

Current Status

All Systems Operational

Components

ReadMe Hubs
Operational
ReadMe Knowledge Base
Operational
Admin Dashboard
Operational
Cloudflare CDN/Cache
Operational
Developer Metrics
Operational
Cloudflare SSL for SaaS Provisioning
Operational
Auth0 User Authentication
Operational
Owlbot AI
Operational
Auth0 Machine to Machine Authentication
Operational
ReadMe Micro
Operational
Auth0 Management API
Operational

Recent Incidents

API Reference Degradation

none

Jun 17, 2026 · resolved Jun 17

This incident has been resolved, and normal site function has been restored.

Refactored Hubs Down

major

May 28, 2026 · resolved Jun 1

Our systems have been stable since Friday, May 29th. We're monitoring closely, improving spike detection to prevent recurrence, and adding proactive capacity monitoring to detect admin dashboard degradation before it impacts customers. A full root cause analysis is coming soon. Thanks for your patience and understanding.

ReadMe Refactored Outage

critical

May 27, 2026 · resolved May 27

### What Happened

Beginning Tuesday, May 26, 2026, customers were experiencing slow loading and 503 errors across ReadMe-hosted docs and the admin dashboard. The outage was intermittent but recurring, with the worst periods hitting during business hours when traffic spiked.

### Root Cause

An internal data backup process was generating excessive I/O on our storage layer. Under normal traffic conditions, this additional load was manageable. But when it coincided with peak customer traffic and elevated bot activity, total I/O demand exceeded system capacity, causing cascading request timeouts. The maintenance process ran on a recurring schedule, which is why the degradation followed a predictable pattern of spikes throughout each day.

Separately, a surge in bot traffic to non-existent pages (404s) amplified the problem because those requests were not being served from cache.

### Resolution

**Immediate fix:** We identified and disabled the maintenance process causing the excess I/O load. Within three hours, storage utilization returned to normal levels and remained stable.
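
To illustrate why uncached 404s can amplify backend load, here is a minimal Python sketch of negative caching: once a path is known to be missing, repeat requests are answered from memory for a short time instead of reaching storage. This is an illustration only, not ReadMe's implementation; `NegativeCache`, `handle_request`, `backend_lookup`, and the TTL value are all hypothetical names and parameters chosen for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class NegativeCache:
    """Remembers paths that recently returned 404 so repeats skip the backend."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._missing: Dict[str, float] = {}  # path -> expiry timestamp

    def is_known_missing(self, path: str) -> bool:
        expiry = self._missing.get(path)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Entry is stale: forget it so the next request re-checks the backend.
            del self._missing[path]
            return False
        return True

    def remember_missing(self, path: str) -> None:
        self._missing[path] = self.clock() + self.ttl_seconds


def handle_request(
    path: str,
    cache: NegativeCache,
    backend_lookup: Callable[[str], Optional[str]],
) -> Tuple[int, str]:
    """Return (status, body); backend_lookup is a hypothetical storage call."""
    if cache.is_known_missing(path):
        return 404, "Not Found (cached)"  # no backend I/O for this request
    body = backend_lookup(path)
    if body is None:
        cache.remember_missing(path)
        return 404, "Not Found"
    return 200, body
```

With a 60-second TTL, a bot requesting the same missing path many times per second would cost the backend roughly one lookup per minute for that path. The dictionary here is unbounded, so a production version would also need a size limit or eviction policy, and in practice this kind of caching is often done at the CDN layer rather than in application code.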

**Additional improvements shipped during the incident:**

* Expanded caching across multiple layers, significantly reducing load on backend storage
* Hardened 404 handling to serve error pages from cache instead of hitting the backend
* Implemented rate limiting and IP-based protections against abusive bot traffic
* Optimized several high-traffic API endpoints to reduce redundant backend calls
* Added new monitoring and alerting for storage I/O thresholds

### Timeline

May 26 - June 1, 2026

| Time | Status | Details |
| --- | --- | --- |
| Tue 5/26, 6:39 AM PDT | Investigating | Issue reported |
| Tue 5/26, 7:26 AM PDT | Monitoring | Fix implemented, monitoring results |
| Tue 5/26, 8:37 AM PDT | Resolved | Admin hub incident resolved |
| Tue 5/26, 10:41 AM PDT | Investigating | Slow performance across customer hubs |
| Tue 5/26, 10:49 AM PDT | Monitoring | Quick fix applied, investigating thorough fix |
| Tue 5/26, 8:54 PM PDT | Resolved | Customer hub incident resolved |
| Wed 5/27, 6:51 AM PDT | Investigating | Issue reported |
| Wed 5/27, 7:52 AM PDT | Identified | Fix in progress |
| Wed 5/27, 8:57 AM PDT | Monitoring | Slowly recovering |
| Wed 5/27, 10:17 AM PDT | Update | Updated 6/3: A routine configuration update coincided with the downtime window, which led us to initially identify it as the cause. Further investigation confirmed the two were unrelated. See root cause and resolution above. |
| Wed 5/27, 1:54 PM PDT | Update | Performance and loading still affected, rolling out fixes |
| Wed 5/27, 3:37 PM PDT | Resolved | Incident resolved |
| Thu 5/28, 6:40 AM PDT | Investigating | Issue reported |
| Thu 5/28, 6:43 AM PDT | Identified | Fix being implemented |
| Thu 5/28, 7:35 AM PDT | Update | Systems coming back up, working on permanent fix |
| Thu 5/28, 8:27 AM PDT | Monitoring | Fix implemented, monitoring results |
| Thu 5/28, 10:53 AM PDT | Update | Reports of degraded performance, actively investigating |
| Thu 5/28, 1:04 PM PDT | Update | Systems appear stable, rolling out fixes |
| Thu 5/28, 6:49 PM PDT | Monitoring | Improvements and fixes deployed, continuing to monitor |
| Fri 5/29, 10:03 AM PDT | Monitoring | Systems stable. Bi-directional sync maintenance 9:50–11:59 AM ET. |
| Fri 5/29, 11:36 AM PDT | Monitoring | Deploying targeted fixes. Serving all 404s from cache. Degraded read performance. |
| Sat 5/30 | Monitoring | Monitoring continues |
| Sun 5/31 | Monitoring | Weekend monitoring |
| Mon 6/1, 11:16 AM PDT | Resolved | Root cause identified and resolved. Systems stable since Friday, May 29. |

### Path Forward

We are using this incident to make lasting improvements to reliability and incident response:

* **Storage capacity and isolation:** Restructuring how background processes interact with production storage to eliminate contention under load.
* **Caching and performance:** The caching improvements shipped during the incident are permanent. We are continuing to expand cache coverage across additional endpoints and page types.
* **Bot and traffic protection:** Strengthening rate limiting and abuse detection to prevent bot traffic from contributing to backend load.
* **Monitoring and alerting:** Adding proactive capacity monitoring with earlier thresholds so the team can intervene before customers are affected.
* **Incident response:** Improving our internal processes for faster escalation and more frequent status page updates during multi-day incidents.

### Final Note

During the incident, we posted an update referencing a platform update. That was our initial hypothesis based on timing. Further investigation confirmed it was unrelated. The change in question was a routine, isolated configuration update and had no impact on the outage or any other customers. We should have waited for confirmation before publishing it, and we're tightening our internal process for status page updates as a result.

We know how critical your documentation is to your customers, and this level of disruption is not acceptable. We have already shipped meaningful improvements to prevent recurrence, and the work outlined above will continue through the coming weeks. If you have questions, reach out to your account team or contact [support@readme.io](mailto:support@readme.io).

Slow performance across customer hubs

major

May 26, 2026 · resolved May 27

This incident has been resolved.

Degraded performance across Admin Hubs

major

May 26, 2026 · resolved May 26

This incident has been resolved.

Get alerted when ReadMe goes down

Alert24 monitors ReadMe and 3,700+ other cloud and SaaS providers. When an outage is detected, it updates your status page automatically and pages your on-call team. No manual updates at 2 AM.