K0nsult relies on Fly.io's built-in backup infrastructure for all persistent data.
fly secrets. A secure offline copy is maintained by 0n40i4.
K0nsult runs with two machines in the same Fly.io region. If one machine fails, Fly.io automatically routes traffic to the surviving instance.
The /health endpoint is used for liveness checks.
Follow these steps in order when a service disruption is detected:
Determine whether the issue is platform-wide or application-specific.
fly status -a k0nsult
If the database is unresponsive, restart the Postgres cluster.
fly postgres restart -a k0nsult-db
If the app container is corrupt or misconfigured, redeploy from the latest git commit.
fly deploy -a k0nsult
Confirm the application is responding correctly after recovery.
curl https://k0nsult.fly.dev/health
Expected response: {"ok":true,"status":"operational"}
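The verification step can be scripted so that recovery is only declared complete once the endpoint returns the expected body. The following is a minimal POSIX shell sketch, not an existing K0nsult tool: the function names, retry count, and delay are illustrative assumptions, and the string match presumes the response is compact JSON exactly as shown above.

```shell
#!/bin/sh
# Illustrative helpers; names and defaults are assumptions, not part of the runbook.

HEALTH_URL="https://k0nsult.fly.dev/health"

# Succeeds if the given response body contains both expected fields.
health_body_ok() {
    case "$1" in
        *'"ok":true'*) ;;
        *) return 1 ;;
    esac
    case "$1" in
        *'"status":"operational"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Polls the health endpoint until it reports healthy or attempts run out.
# Usage: wait_for_health [attempts] [delay_seconds]
wait_for_health() {
    attempts="${1:-5}"
    delay="${2:-3}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors; -sS: quiet but report errors; --max-time: bound each request
        body="$(curl -fsS --max-time 5 "$HEALTH_URL" 2>/dev/null)" || body=""
        if health_body_ok "$body"; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    echo "still unhealthy after $attempts attempt(s)" >&2
    return 1
}
```

After a restart or redeploy, calling wait_for_health 10 5 would poll for up to roughly a minute before reporting failure.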
fly postgres restore and expect up to 24 hours of data loss (RPO). Coordinate with 0n40i4 before restoring.
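To make the 24-hour RPO concrete: the data lost in a restore is everything written after the snapshot was taken, so the loss window equals the snapshot's age at restore time. The sketch below does that arithmetic in POSIX shell. The function names are hypothetical, and it assumes the snapshot's creation time is already known as a Unix timestamp; how that timestamp is obtained from Fly.io is deliberately left out.

```shell
#!/bin/sh
# Illustrative RPO check; function names are assumptions, not part of the runbook.

RPO_HOURS=24

# Prints the whole hours elapsed between a snapshot and "now" (rounded down).
# Arguments: snapshot Unix timestamp, current Unix timestamp (both in seconds).
snapshot_age_hours() {
    echo $(( ($2 - $1) / 3600 ))
}

# Succeeds if restoring this snapshot keeps data loss within the RPO.
# Compares in seconds so that 24h01m is not rounded down to 24h.
within_rpo() {
    [ $(( $2 - $1 )) -le $(( RPO_HOURS * 3600 )) ]
}
```

For example, within_rpo "$snapshot_ts" "$(date +%s)" fails when the chosen snapshot is older than 24 hours, which is a signal to discuss the restore with 0n40i4 before proceeding.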
If automated recovery fails or manual intervention is needed, contact the following in order:
Fly.io automatically reroutes to the second machine. No manual action needed. Monitor logs for root cause.
Step 1: Check fly postgres status. Step 2: Restart Postgres. Step 3: If the problem persists, check the connection string in fly secrets.
Run fly deploy to redeploy from git. If the latest commit is broken, deploy a known-good commit: fly deploy --image registry.fly.io/k0nsult:sha-xxxxxxx
Restore from the latest Fly.io Postgres snapshot. Contact 0n40i4 immediately. Maximum data loss: 24 hours.
Check Fly.io certificate status: fly certs show k0nsult.fly.dev -a k0nsult. Fly.io normally renews certificates automatically via Let's Encrypt. Force renewal: fly certs add k0nsult.fly.dev -a k0nsult
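For the broken-deploy scenario above, the rollback image reference can be derived from a commit hash rather than typed by hand. This sketch assumes images are tagged sha- followed by the first seven characters of the commit SHA; that convention is only inferred from the sha-xxxxxxx placeholder and should be checked against the actual registry tags. The function names are hypothetical.

```shell
#!/bin/sh
# Illustrative rollback helpers; the tag convention is an unverified assumption.

APP="k0nsult"

# Prints the registry image reference for a commit SHA.
# %.7s truncates the SHA to its first 7 characters.
image_ref_for_sha() {
    printf 'registry.fly.io/%s:sha-%.7s\n' "$APP" "$1"
}

# Prints the rollback command instead of running it, so it can be reviewed first.
print_rollback_command() {
    echo "fly deploy -a $APP --image $(image_ref_for_sha "$1")"
}
```

Calling print_rollback_command "$(git rev-parse HEAD~1)" would print the deploy command for the previous commit, provided an image with that tag was actually pushed.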
/health endpoint responds from both machines
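A plain request to /health is answered by whichever machine the proxy selects, so it does not by itself show that both machines are healthy. One possible approach, sketched below, is to pin each request to a machine with the fly-force-instance-id request header. The function names are hypothetical, the machine IDs must be supplied by the operator, and the header's behaviour should be confirmed against current Fly.io documentation before relying on this check.

```shell
#!/bin/sh
# Illustrative per-machine check; names are assumptions, not part of the runbook.

HEALTH_URL="https://k0nsult.fly.dev/health"

# Requests /health from one specific machine and checks the body.
check_machine_health() {
    machine_id="$1"
    body="$(curl -fsS --max-time 5 \
        -H "fly-force-instance-id: $machine_id" \
        "$HEALTH_URL" 2>/dev/null)" || return 1
    case "$body" in
        *'"ok":true'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Checks every machine ID given as an argument; fails if any is unhealthy.
check_all_machines() {
    failed=0
    for id in "$@"; do
        if check_machine_health "$id"; then
            echo "$id: ok"
        else
            echo "$id: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

It would be invoked with the two machine IDs as arguments, for example check_all_machines <machine-id-1> <machine-id-2>, with the IDs taken from the fly status -a k0nsult output.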