Ops Event Logs

Structured operational logs for deployments, cron executions, incidents, config changes, and maintenance tasks across the Eko platform. Each log captures context, timing, results, and follow-ups for auditability and debugging.

Directory Structure

logs/
  README.md              # This file
  YYYY-MM/
    index.md             # Monthly summary table
    YYYY-MM-DDThh-mm--<event-type>--<title-slug>.md

Naming Convention

YYYY-MM-DDThh-mm--<event-type>--<title-slug>.md

| Segment | Description | Examples |
|---------|-------------|----------|
| YYYY-MM-DDThh-mm | UTC timestamp when the event started | 2026-02-18T20-18 |
| <event-type> | Category of operational event | deploy, cron-run, incident, config-change, maintenance, worker-issue, queue-drain |
| <title-slug> | Short description of the event | prod-release, payment-reminders-fail, evergreen-enabled, dlq-cleanup |
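
The convention above can be sketched in a few lines of Python. This is an illustrative helper, not part of any existing Eko tooling: the names `slugify` and `build_log_filename` are hypothetical, and the slug rule (lowercase, alphanumeric runs joined by single hyphens) is an assumption inferred from the example slugs.

```python
import re
from datetime import datetime, timezone


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_log_filename(started: datetime, event_type: str, title: str) -> str:
    """Return YYYY-MM-DDThh-mm--<event-type>--<title-slug>.md for an event."""
    if started.tzinfo is None:
        # The convention uses UTC, so a naive datetime would be ambiguous.
        raise ValueError("started must be timezone-aware so it can be converted to UTC")
    stamp = started.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M")
    return f"{stamp}--{event_type}--{slugify(title)}.md"


if __name__ == "__main__":
    started = datetime(2026, 2, 18, 20, 18, tzinfo=timezone.utc)
    print(build_log_filename(started, "config-change", "Evergreen enabled"))
    # 2026-02-18T20-18--config-change--evergreen-enabled.md
```

Requiring a timezone-aware datetime keeps the UTC conversion explicit rather than silently assuming the local timezone.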

Event Types

| Event Type | Description | Typical Trigger |
|------------|-------------|-----------------|
| deploy | Production or preview deployment | Vercel deploy, manual promote |
| cron-run | Manual cron trigger or notable scheduled execution | Manual curl, Vercel scheduler |
| incident | Service degradation, errors, or outage | Alert, user report, monitoring |
| config-change | Environment variable or feature flag change | .env.local edit, Vercel env update |
| maintenance | Planned cleanup, migration, or deprecation removal | Scheduled work |
| worker-issue | Worker crash, stuck queue, or processing failure | Health check failure, DLQ growth |
| queue-drain | Manual DLQ drain or queue reprocessing | Ops decision |
| billing-event | Stripe webhook issue, subscription anomaly | Webhook logs, Stripe dashboard |
| cost-alert | AI spend cap reached or budget anomaly | daily-cost-report cron, manual check |
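
A filename can be checked against the naming convention and the event types listed above with a small parser. The sketch below is a hedged example in Python: `parse_log_filename`, `FILENAME_RE`, and `EVENT_TYPES` are hypothetical names, and the regular expression assumes slugs contain only lowercase letters, digits, and single hyphens.

```python
import re

# The nine event types from the table above.
EVENT_TYPES = frozenset({
    "deploy", "cron-run", "incident", "config-change", "maintenance",
    "worker-issue", "queue-drain", "billing-event", "cost-alert",
})

# Assumed shape: timestamp, event type, and slug separated by double hyphens.
FILENAME_RE = re.compile(
    r"^(?P<started>\d{4}-\d{2}-\d{2}T\d{2}-\d{2})"
    r"--(?P<event_type>[a-z]+(?:-[a-z]+)*)"
    r"--(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)\.md$"
)


def parse_log_filename(name: str) -> dict[str, str]:
    """Split a log filename into its segments, rejecting unknown event types."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"filename does not follow the naming convention: {name!r}")
    parts = match.groupdict()
    if parts["event_type"] not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {parts['event_type']!r}")
    return parts


if __name__ == "__main__":
    print(parse_log_filename("2026-02-18T20-18--queue-drain--dlq-cleanup.md"))
```

A check like this could run in CI or a pre-commit hook to catch misnamed logs, though whether the project uses either is not stated here.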

Log Template

Use this template when creating a new ops log. Copy the block below into a new file.

---
title: "<Event Type><Description>"
status: completed | in-progress | failed | investigating
event_type: deploy | cron-run | incident | config-change | maintenance | worker-issue | queue-drain | billing-event | cost-alert
severity: info | warning | error | critical
started: YYYY-MM-DDThh:mm UTC
finished: YYYY-MM-DDThh:mm UTC
duration_minutes: N
environment: production | preview | local
components: [cron/payment-reminders, worker-facts, stripe-webhooks]
trigger: manual | scheduled | alert | user-report
resolution: resolved | monitoring | escalated | deferred
---

# <Event Type><Description>

## Context

What happened and why this event was logged:

- **Trigger:** What initiated this event
- **Environment:** Which environment was affected
- **Impact:** User-facing impact (if any)

## Timeline

| Time (UTC) | Action | Result |
|------------|--------|--------|
| hh:mm | First observation / trigger | Description |
| hh:mm | Investigation step | Finding |
| hh:mm | Resolution action | Outcome |

## Details

### What Changed / What Happened

Describe the specific changes, errors, or observations.

### Root Cause

<!-- For incidents only. Omit for routine events. -->

Root cause analysis (if applicable).

### Impact Assessment

| Metric | Value |
|--------|-------|
| Users affected | N |
| Duration | Nm |
| Data loss | none / description |
| Revenue impact | none / $N |

## Resolution

What was done to resolve the issue or complete the task.

## Follow-ups

- [ ] Items needing attention after this event

Severity Levels

| Level | Meaning | Example |
|-------|---------|---------|
| info | Routine operational event | Successful deploy, config change, scheduled maintenance |
| warning | Notable but not impacting users | DLQ growth, budget approaching cap, cron skip |
| error | Service degradation or partial failure | Cron failures, worker crashes, webhook rejections |
| critical | User-facing outage or data integrity risk | Production down, payment processing failure, data corruption |
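
If logs are filtered or sorted by severity, the four levels can be modeled as an ordered enum. The Python sketch below assumes the table order (info, warning, error, critical) is also the intended ranking. `Severity`, `parse_severity`, and `needs_attention` are hypothetical names, and the default threshold of `error` is only an illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels, ranked in the order the table lists them (assumed)."""
    INFO = 0
    WARNING = 1
    ERROR = 2
    CRITICAL = 3


def parse_severity(value: str) -> Severity:
    """Map a frontmatter value such as 'warning' to a Severity member."""
    try:
        return Severity[value.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown severity: {value!r}") from None


def needs_attention(value: str, threshold: Severity = Severity.ERROR) -> bool:
    """Return True when the severity is at or above the threshold."""
    return parse_severity(value) >= threshold


if __name__ == "__main__":
    for level in ("info", "warning", "error", "critical"):
        print(level, needs_attention(level))
```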

Creating a Log

  1. Copy the template above into a new file at logs/YYYY-MM/<filename>.md
  2. Fill in the frontmatter, at minimum: title, status, event_type, severity, started
  3. Update sections as the event progresses
  4. Update the monthly index.md with a summary row
  5. Set status to completed or failed once the event ends, or to investigating if the cause is still open
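
The minimum-field requirement in step 2 can be checked mechanically. The following Python sketch uses a deliberately naive line-based reader rather than a real YAML parser, so it only handles flat `key: value` frontmatter like the template. `read_frontmatter` and `missing_required` are hypothetical names.

```python
# Fields that step 2 lists as the minimum.
REQUIRED_FIELDS = ("title", "status", "event_type", "severity", "started")


def read_frontmatter(text: str) -> dict[str, str]:
    """Read flat 'key: value' pairs between the opening and closing '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("log must start with a '---' frontmatter delimiter")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    raise ValueError("frontmatter is missing its closing '---'")


def missing_required(text: str) -> list[str]:
    """List required fields that are absent or empty in the frontmatter."""
    fields = read_frontmatter(text)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


if __name__ == "__main__":
    sample = (
        "---\n"
        'title: "Deploy: prod release"\n'
        "status: completed\n"
        "event_type: deploy\n"
        "started: 2026-02-18T20:18 UTC\n"
        "---\n"
        "# Deploy: prod release\n"
    )
    print(missing_required(sample))  # ['severity']
```

For anything beyond flat fields, a proper YAML library would be the safer choice.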

Monthly Index Format

Each month's index.md contains a summary table:

| Date | Event Type | Severity | Components | Status | Duration | Log |
|------|-----------|----------|------------|--------|----------|-----|
| Feb 18 | config-change | info | env/EVERGREEN_ENABLED | completed | 5m | [link](file.md) |
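
A summary row in this format can be generated from a log's metadata instead of typed by hand. The Python sketch below is one possible approach. `index_row` and `MONTHS` are hypothetical names, and the `Feb 18` date style and `5m` duration style are inferred from the single example row above.

```python
from datetime import datetime

# Fixed English abbreviations so the output does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def index_row(started: datetime, event_type: str, severity: str,
              components: list[str], status: str,
              duration_minutes: int, filename: str) -> str:
    """Format one row of the monthly index.md summary table."""
    cells = [
        f"{MONTHS[started.month - 1]} {started.day}",
        event_type,
        severity,
        ", ".join(components),
        status,
        f"{duration_minutes}m",
        f"[link]({filename})",
    ]
    return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    print(index_row(datetime(2026, 2, 18, 20, 18), "config-change", "info",
                    ["env/EVERGREEN_ENABLED"], "completed", 5, "file.md"))
    # | Feb 18 | config-change | info | env/EVERGREEN_ENABLED | completed | 5m | [link](file.md) |
```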