Uptime Monitor User Guide

1. Executive Summary

The Monitor module is a lightweight health-check worker. It pings live routes and infrastructure dependencies on a schedule to confirm that the Aegis platform and its active landing pages are reachable and responding within acceptable latency.

2. Routine Health Checks

The Monitor runs at scheduled intervals (Cron triggers) and performs the following checks:

  • Endpoint Pinging: It sends HTTP requests to live landing pages and critical API endpoints.
  • Latency Tracking: It measures response times to ensure that backend Lovable SPA instances and Shopify integrations are not suffering from severe latency degradation.
  • Status Code Verification: It validates that routes are returning expected 200 OK statuses, rather than 500 errors or unintended 404s.
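
The three checks above can be sketched as a single probe step. This is a minimal illustration and not the Monitor's actual code: the names ProbeResult, classify, probe, and Fetcher are hypothetical, and the 2000 ms latency budget is an assumed value, since this guide does not state a threshold.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
interface ProbeResult {
  url: string;
  status: number; // HTTP status, or 0 if the request itself failed
  latencyMs: number;
  healthy: boolean;
  reason: string;
}

// Minimal fetch-like signature so a fake can be injected in tests.
type Fetcher = (url: string) => Promise<{ status: number }>;

// Assumed latency budget; the guide does not specify one.
const LATENCY_BUDGET_MS = 2000;

// Pure classification: check the status code first, then the latency.
function classify(
  url: string,
  status: number,
  latencyMs: number,
  budgetMs: number = LATENCY_BUDGET_MS
): ProbeResult {
  if (status !== 200) {
    return { url, status, latencyMs, healthy: false, reason: `unexpected status ${status}` };
  }
  if (latencyMs > budgetMs) {
    return { url, status, latencyMs, healthy: false, reason: `latency ${latencyMs}ms exceeds ${budgetMs}ms` };
  }
  return { url, status, latencyMs, healthy: true, reason: "ok" };
}

// Sends one request and times it. A thrown network error is reported as status 0.
async function probe(url: string, fetcher: Fetcher): Promise<ProbeResult> {
  const started = Date.now();
  try {
    const res = await fetcher(url);
    return classify(url, res.status, Date.now() - started);
  } catch {
    return classify(url, 0, Date.now() - started);
  }
}
```

In a runtime that provides a global fetch, the fetcher argument could be as simple as (u) => fetch(u). Keeping classify separate from the network call makes the status and latency rules easy to test without real traffic.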

3. Incident Alerting

When the Monitor detects an anomaly, it escalates as follows:

  • If a critical route fails on consecutive checks (which rules out momentary network blips), the Monitor constructs an incident report.
  • It dispatches real-time alerts to the engineering team via dedicated Slack channels, providing the exact URL, the error code received, and the time of failure.
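
The consecutive-failure rule and the alert contents can be sketched as follows. Everything here is illustrative: FailureTracker, Incident, and buildSlackPayload are hypothetical names, and the threshold of 3 is an assumption because this guide does not say how many consecutive failures trigger an incident. The payload uses the plain { "text": ... } JSON body that Slack incoming webhooks accept; whether the Monitor delivers alerts through a webhook is also an assumption.

```typescript
// Hypothetical incident shape carrying the three fields named above.
interface Incident {
  url: string;
  status: number;
  failedAt: string; // ISO 8601 timestamp
  consecutiveFailures: number;
}

// Assumed threshold; the guide does not specify a number.
const FAILURE_THRESHOLD = 3;

class FailureTracker {
  private counts = new Map<string, number>();
  private threshold: number;

  constructor(threshold: number = FAILURE_THRESHOLD) {
    this.threshold = threshold;
  }

  // Returns an Incident only on the check where the failure streak
  // reaches the threshold, so one outage produces one alert.
  record(url: string, status: number, healthy: boolean, now: Date = new Date()): Incident | null {
    if (healthy) {
      this.counts.delete(url); // any success resets the streak
      return null;
    }
    const streak = (this.counts.get(url) ?? 0) + 1;
    this.counts.set(url, streak);
    if (streak !== this.threshold) {
      return null;
    }
    return { url, status, failedAt: now.toISOString(), consecutiveFailures: streak };
  }
}

// Builds a minimal Slack message body with the URL, error code, and failure time.
function buildSlackPayload(incident: Incident): { text: string } {
  return {
    text:
      `Incident: ${incident.url} returned ${incident.status} ` +
      `(${incident.consecutiveFailures} consecutive failures) at ${incident.failedAt}`,
  };
}
```

The in-memory Map keeps the sketch short. A scheduled worker generally cannot rely on memory surviving between runs, so a real deployment would likely keep the streak counts in persistent storage.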

4. Fallback Validation

The Monitor is also responsible for testing the platform's disaster recovery mechanisms.

  • It routinely validates the integrity of the fallbackConfig.json system used by the Router.
  • By testing the fallback paths, the Monitor verifies that if the D1 Database suffers a catastrophic outage, Edge memory routing can take over without dropping active ad traffic.
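
An integrity check on fallbackConfig.json might look like the sketch below. The schema of that file is not described in this guide, so the shape used here (a top-level "routes" object mapping a path to an https target) is purely an assumption, and validateFallbackConfig is a hypothetical helper name.

```typescript
// Assumed shape, for illustration only:
// { "routes": { "/landing-a": "https://fallback.example.com/a" } }
function validateFallbackConfig(raw: string): string[] {
  const problems: string[] = [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["not valid JSON"];
  }

  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["top level must be an object"];
  }

  const routes = (parsed as Record<string, unknown>)["routes"];
  if (typeof routes !== "object" || routes === null || Array.isArray(routes)) {
    return ['missing "routes" object'];
  }

  const table = routes as Record<string, unknown>;
  const paths = Object.keys(table);
  if (paths.length === 0) {
    problems.push("no fallback routes defined");
  }

  // Every path must be absolute and every target must be an https URL string.
  for (const path of paths) {
    const target = table[path];
    if (!path.startsWith("/")) {
      problems.push(`route "${path}" must start with "/"`);
    }
    if (typeof target !== "string" || !target.startsWith("https://")) {
      problems.push(`route "${path}" has an invalid target`);
    }
  }

  return problems;
}
```

An empty result means the file passed these structural checks. Confirming that each fallback target actually responds would be a separate step, for example by requesting each target and checking for a 200 status.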