Pipeline Prompts

Prompts for operational pipeline tasks: queue management, worker operations, cron jobs, cost tracking, and monitoring.

Prompts

#	Prompt	Cost	Duration
1	Start Workers	$0	~2 min
2	Trigger News Ingestion	Negligible	~2 min
3	Check Queue Health	$0	~2 min
4	Drain Dead Letter Queues	$0	5-15 min
5	Recover Stuck Queues	$0	10-30 min
6	Trigger Validation Retry	~$1	~2 min
7	Trigger Evergreen Generation	~$1	~2 min
8	Run Workers High Concurrency	Varies	~5 min
9	Trigger Challenge Images	$0	~2 min

FAQ

Queues

What are all 12 queue types and which worker consumes each one?

worker-ingest (4): INGEST_NEWS, CLUSTER_STORIES, RESOLVE_IMAGE, RESOLVE_CHALLENGE_IMAGE.
worker-facts (6): EXTRACT_FACTS, IMPORT_FACTS (stub), GENERATE_EVERGREEN, EXPLODE_CATEGORY_ENTRY, FIND_SUPER_FACTS, GENERATE_CHALLENGE_CONTENT.
worker-validate (1): VALIDATE_FACT.
Deprecated (1): SEND_SMS -- no active consumer.
Queue client: packages/queue/src/index.ts. Message schemas: packages/shared/src/schemas.ts.

How does the retry and dead-letter queue (DLQ) system work?

Backend: Upstash Redis (REST API).
Messages are retried up to 3 times before moving to the DLQ.
Nack retry backoff: 5s→30s→60s with 15% jitter. Polling backoff: 250ms→5min.
DLQ naming convention: queue:<type>:dlq (e.g., queue:validate_fact:dlq).
Messages in the DLQ require manual inspection and reprocessing.
Config: packages/queue/src/index.ts.
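
The retry schedule above can be sketched as a small function. This is only an illustration, not the code in packages/queue/src/index.ts: the helper name nackDelayMs is hypothetical, and the jitter is assumed to be symmetric (plus or minus 15%), which the source does not specify.

```typescript
// Nack retry delays from the FAQ: 5s, then 30s, then 60s.
const NACK_DELAYS_MS = [5000, 30000, 60000];
const JITTER_RATIO = 0.15; // 15% jitter
const MAX_RETRIES = 3; // after this many retries the message moves to the DLQ

// Hypothetical helper: returns the delay before retry `attempt` (1-based),
// or null when retries are exhausted and the message belongs in the DLQ.
// Assumption: jitter is symmetric around the base delay.
function nackDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number | null {
  if (attempt < 1 || attempt > MAX_RETRIES) return null;
  const base = NACK_DELAYS_MS[attempt - 1];
  const jitter = (random() * 2 - 1) * JITTER_RATIO; // in [-0.15, +0.15)
  return Math.round(base * (1 + jitter));
}
```

Passing the random source as a parameter keeps the schedule deterministic in tests while defaulting to Math.random in normal use.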

How do I inspect queue depth and drain a DLQ?

Admin dashboard: admin.eko.day/queue shows live depths and DLQ counts.
Upstash dashboard provides direct Redis inspection for deeper debugging.
Queue names follow the pattern queue:ingest_news, queue:cluster_stories, etc.
DLQ names follow the pattern queue:ingest_news:dlq, queue:validate_fact:dlq, etc.
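
For scripted checks outside the dashboards, a depth lookup might look like the sketch below. It relies on the Upstash REST API accepting a Redis command as a JSON array in a POST body. The helper names are hypothetical, and it assumes each queue is stored as a Redis list so that LLEN reports its depth; verify that against packages/queue/src/index.ts before relying on it.

```typescript
interface UpstashRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds an Upstash REST call that sends one Redis command as a JSON array.
function buildCommand(
  restUrl: string,
  token: string,
  command: string[],
): UpstashRequest {
  return {
    url: restUrl.replace(/\/+$/, ""), // drop trailing slashes
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${token}` },
      body: JSON.stringify(command),
    },
  };
}

// Assumption: queues are Redis lists, so LLEN returns the number of messages.
async function queueDepth(
  restUrl: string,
  token: string,
  key: string,
): Promise<number> {
  const { url, init } = buildCommand(restUrl, token, ["LLEN", key]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Upstash returned HTTP ${res.status}`);
  const data = (await res.json()) as { result: number };
  return data.result;
}
```

A call such as queueDepth(UPSTASH_REDIS_REST_URL, UPSTASH_REDIS_REST_TOKEN, "queue:validate_fact:dlq") would then report the DLQ depth under those assumptions.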

What is SOAK_QUEUE_SUFFIX and how do I test queues in isolation?

SOAK_QUEUE_SUFFIX is an env var that appends a suffix to all queue names for test isolation.
Example: SOAK_QUEUE_SUFFIX=_test turns queue:ingest_news into queue:ingest_news_test.
This prevents test messages from polluting production queues.
Set it in .env.local during development or soak testing.
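
The naming rules can be sketched as two small helpers. Both function names are hypothetical, and the placement of the :dlq marker after the suffix is an assumption; the actual logic lives in packages/queue/src/index.ts.

```typescript
// Hypothetical helper: builds a queue name from a message type such as
// "INGEST_NEWS" plus an optional suffix (e.g. the SOAK_QUEUE_SUFFIX value).
function queueName(type: string, suffix: string = ""): string {
  return `queue:${type.toLowerCase()}${suffix}`;
}

// Assumption: the DLQ marker is appended after the suffix.
function dlqName(type: string, suffix: string = ""): string {
  return `${queueName(type, suffix)}:dlq`;
}
```

With an empty suffix these reproduce the production names listed earlier, for example queue:validate_fact:dlq.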

Workers

How do I start workers locally for development?

bun run dev:worker-ingest: news fetch, clustering, images.
bun run dev:worker-facts: extraction, evergreen, challenges, seeding.
bun run dev:worker-validate: multi-phase fact validation.
Run each in a separate terminal for full pipeline coverage.
Requires: UPSTASH_REDIS_REST_URL, UPSTASH_REDIS_REST_TOKEN, SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY.

What does the worker health endpoint return?

GET http://localhost:8080/health returns { status: "ok", timestamp: "..." }.
Port: 8080 (configurable via PORT env var).
All workers expose the same /health endpoint.
Heartbeat interval: 30 seconds. Lease duration: 2 minutes.
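
A script that waits for a worker to come up could validate the response shape described above. This is a sketch; isHealthy and probeWorker are hypothetical names, not part of the workers.

```typescript
interface HealthResponse {
  status: "ok";
  timestamp: string;
}

// Type guard for the documented body: { status: "ok", timestamp: "..." }.
function isHealthy(body: unknown): body is HealthResponse {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return candidate.status === "ok" && typeof candidate.timestamp === "string";
}

// Hypothetical probe: returns false on connection errors or unexpected bodies.
async function probeWorker(port: number = 8080): Promise<boolean> {
  try {
    const res = await fetch(`http://localhost:${port}/health`);
    return res.ok && isHealthy(await res.json());
  } catch {
    return false;
  }
}
```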

How does worker concurrency work and when should I increase it?

WORKER_CONCURRENCY env var controls parallel message handlers per queue (default: 1).
Increase to 3-5 for seeding pipeline runs where high throughput is needed.
Keep at 1 for normal operations to maintain cost control.
Set per-worker in the terminal: WORKER_CONCURRENCY=5 bun run dev:worker-facts.
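
To make "parallel message handlers" concrete, here is a minimal sketch of how a worker might read the setting and cap in-flight work. Both functions are hypothetical illustrations, not the workers' actual code.

```typescript
// Parses WORKER_CONCURRENCY, falling back to the documented default of 1
// for missing, non-numeric, or non-positive values.
function resolveConcurrency(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 1 ? parsed : 1;
}

// Runs `handler` over `items` with at most `limit` handlers in flight.
async function runWithConcurrency<T>(
  items: T[],
  limit: number,
  handler: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  // Each lane pulls the next unclaimed item until none remain.
  const lane = async (): Promise<void> => {
    while (next < items.length) {
      const item = items[next++];
      await handler(item);
    }
  };
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
}
```

With a limit of 1 the lanes collapse into strictly sequential processing, which matches the default behavior described above.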

How does graceful shutdown work?

Workers implement abort signal handling via SIGINT/SIGTERM.
On signal: stop accepting new messages, finish processing current messages, then exit.
Lease duration (2 minutes) is the maximum time to finish current work before forced exit.
This prevents message loss during restarts and deploys.
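
The shutdown sequence can be sketched with an AbortController. The names below are hypothetical, and the SignalSource interface exists only so the example does not depend on Node's process object directly; the real workers may structure this differently.

```typescript
// Minimal shape of anything that emits signals, such as Node's `process`.
interface SignalSource {
  on(event: string, handler: () => void): unknown;
}

// Returns an AbortSignal that fires on SIGINT or SIGTERM.
function createShutdownSignal(source: SignalSource): AbortSignal {
  const controller = new AbortController();
  for (const sig of ["SIGINT", "SIGTERM"]) {
    source.on(sig, () => controller.abort());
  }
  return controller.signal;
}

const LEASE_MS = 2 * 60 * 1000; // lease duration from the FAQ: 2 minutes

// Waits for in-flight handlers to settle, but no longer than the deadline.
async function drain(
  inFlight: Promise<unknown>[],
  deadlineMs: number = LEASE_MS,
): Promise<"drained" | "timed_out"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<"timed_out">((resolve) => {
    timer = setTimeout(() => resolve("timed_out"), deadlineMs);
  });
  // Swallow handler errors here; only completion matters for shutdown.
  const settled = Promise.all(
    inFlight.map((p) => p.catch(() => undefined)),
  ).then(() => "drained" as const);
  const outcome = await Promise.race([settled, deadline]);
  clearTimeout(timer);
  return outcome;
}
```

In a worker entry point this might be wired as createShutdownSignal(process), with the polling loop checking signal.aborted before fetching more messages and calling drain once it stops.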

Crons

Which crons are scheduled in vercel.json and which are unscheduled?

Scheduled (5): payment-reminders, payment-escalation, account-anniversaries, daily-cost-report, monthly-usage-report (deprecated stub).
Unscheduled (8): ingest-news, cluster-sweep, generate-evergreen, validation-retry, archive-content, topic-quotas, import-facts (stub), daily-digest (deprecated).
Config: apps/web/vercel.json.
Unscheduled crons must be triggered manually or added to vercel.json (OPS-004).

Why are crons not firing and how do I trigger them manually?

Vercel crons only fire on production deployments -- all current deploys are preview (OPS-005).
Manual trigger: curl -X POST http://localhost:3000/api/cron/<name> -H "Authorization: Bearer $CRON_SECRET".
Example: curl -X POST http://localhost:3000/api/cron/ingest-news -H "Authorization: Bearer $CRON_SECRET".
Cron routes live in apps/web/app/api/cron/.

How do I add a new cron job?

Create a route handler at apps/web/app/api/cron/<name>/route.ts.
Add an auth check: verify that the request's Authorization header equals Bearer $CRON_SECRET.
Add a schedule entry to apps/web/vercel.json for production firing.
Document the new cron in docs/APP-CONTROL.md.
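
A route handler with the auth check might look like the sketch below. It uses only the standard Request and Response types; makeCronHandler and isAuthorizedCron are hypothetical names, and existing routes in apps/web/app/api/cron/ may follow a different pattern, including which HTTP method they export.

```typescript
// Returns true only when the header matches "Bearer <secret>".
// Fails closed when the secret is not configured.
function isAuthorizedCron(
  authHeader: string | null,
  secret: string | undefined,
): boolean {
  if (!secret) return false;
  return authHeader === `Bearer ${secret}`;
}

// Wraps a job in the auth check and a JSON response.
function makeCronHandler(
  secret: string | undefined,
  job: () => Promise<unknown>,
): (request: Request) => Promise<Response> {
  return async (request: Request): Promise<Response> => {
    if (!isAuthorizedCron(request.headers.get("authorization"), secret)) {
      return new Response("Unauthorized", { status: 401 });
    }
    const result = await job();
    return new Response(JSON.stringify({ ok: true, result }), {
      status: 200,
      headers: { "content-type": "application/json" },
    });
  };
}

// In route.ts (hypothetical job name):
// export const POST = makeCronHandler(process.env.CRON_SECRET, runMyJob);
```

Failing closed on a missing secret avoids accidentally accepting the literal header "Bearer undefined" when CRON_SECRET is unset.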

Cost Tracking and Monitoring

How does AI cost tracking work and where is spend recorded?

Every AI call records cost to the ai_cost_log table.
Fields: model, feature, input_tokens, output_tokens, reasoning_tokens_total, cost_usd, timestamp.
Cost tracker: packages/ai/src/cost-tracker.ts.
Model selection: the ai_model_tier_config DB table maps tiers (default/mid/high) to models; lookups are cached for 60 seconds. The TASK_TIER_MAP in packages/ai/src/fact-engine.ts maps each pipeline task to a tier.
Budget enforcement: ANTHROPIC_DAILY_SPEND_CAP_USD (default $5/day).
When the cap is exceeded, the model router falls back to a cheaper model tier.

What does the daily cost report contain and who receives it?

Cron: apps/web/app/api/cron/daily-cost-report/route.ts -- runs daily at 6 AM UTC.
Emails admins an AI cost breakdown by model and feature with a 7-day trend.
Recipients: configured via ADMIN_EMAIL_ALLOWLIST env var.
Uses the @eko/email package for delivery.
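
The breakdown by model and feature amounts to a grouped sum over ai_cost_log rows. The sketch below assumes the column types shown in the interface (the source lists only the field names), and summarizeCosts is a hypothetical helper, not the report's actual code.

```typescript
// Assumed row shape for ai_cost_log; field names come from the FAQ,
// the types are guesses.
interface AiCostLogRow {
  model: string;
  feature: string;
  input_tokens: number;
  output_tokens: number;
  reasoning_tokens_total: number;
  cost_usd: number;
  timestamp: string;
}

// Sums cost_usd per "model / feature" pair.
function summarizeCosts(rows: AiCostLogRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const key = `${row.model} / ${row.feature}`;
    totals.set(key, (totals.get(key) ?? 0) + row.cost_usd);
  }
  return totals;
}
```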

What happens when the daily spend cap is exceeded?

ANTHROPIC_DAILY_SPEND_CAP_USD (default $5) and GOOGLE_DAILY_SPEND_CAP_USD (default $3) are checked before each AI call.
When exceeded: the model router falls back to the cheapest available model (typically GPT-4o-mini or Gemini Flash).
No hard stop -- the pipeline continues with cheaper models to avoid data loss.
Cap resets at midnight UTC.
Config: packages/config/src/index.ts.
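
The cap check can be sketched as follows. Both helpers are hypothetical, and two assumptions are baked in: that "default" is the cheapest of the three tiers, and that spend exactly equal to the cap does not yet count as exceeded. The real router in the codebase may differ on both points.

```typescript
type Tier = "default" | "mid" | "high";

// Parses a cap such as ANTHROPIC_DAILY_SPEND_CAP_USD, using the documented
// default when the variable is unset or invalid.
function parseCap(raw: string | undefined, fallbackUsd: number): number {
  const parsed = Number.parseFloat(raw ?? "");
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackUsd;
}

// Assumption: "default" is the cheapest tier, and the cap is exceeded only
// when spend is strictly greater than it. No hard stop: a tier is always returned.
function resolveTier(
  requested: Tier,
  spentTodayUsd: number,
  capUsd: number,
): Tier {
  return spentTodayUsd > capUsd ? "default" : requested;
}
```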
