Seed Job Logs

Structured operational logs for every seed job run against the Eko fact engine. Each log captures configuration, timing, results, costs, and errors for auditability and debugging.

Directory Structure

logs/
  README.md              # This file
  YYYY-MM/
    index.md             # Monthly summary table
    YYYY-MM-DDThh-mm--<job-type>--<title-slug>.md

Naming Convention

YYYY-MM-DDThh-mm--<job-type>--<title-slug>.md

| Segment | Description | Examples |
|---------|-------------|----------|
| `YYYY-MM-DDThh-mm` | UTC timestamp when the job started | `2026-02-18T20-18` |
| `<job-type>` | Pipeline stage or script that ran | `curated-seed`, `challenge-gen`, `cleanup`, `backfill`, `news-ingestion`, `explosion`, `validation` |
| `<title-slug>` | Short description of what was seeded | `science`, `full-regen`, `geography-expansion`, `daily` |
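
The naming convention can be sketched in code as follows. This is a minimal illustration, not part of the pipeline: `formatTimestamp`, `buildLogFilename`, and `parseLogFilename` are hypothetical helper names, and the sketch assumes that job types and title slugs are lowercase alphanumeric words joined by single hyphens, as in the examples above.

```typescript
// Hypothetical helpers for the log naming convention; names are illustrative only.

interface LogFilenameParts {
  timestamp: string; // UTC start time, formatted YYYY-MM-DDThh-mm
  jobType: string;   // e.g. "curated-seed"
  titleSlug: string; // e.g. "geography-expansion"
}

// Format a Date as the UTC YYYY-MM-DDThh-mm segment (no colons in filenames).
function formatTimestamp(date: Date): string {
  // toISOString() always returns UTC, e.g. "2026-02-18T20:18:00.000Z".
  // The first 16 characters contain exactly one colon, which becomes a hyphen.
  return date.toISOString().slice(0, 16).replace(":", "-");
}

function buildLogFilename(started: Date, jobType: string, titleSlug: string): string {
  return `${formatTimestamp(started)}--${jobType}--${titleSlug}.md`;
}

// Segments are separated by a double hyphen; single hyphens may appear inside a segment.
// Assumption: slugs are lowercase alphanumeric words joined by single hyphens.
const LOG_FILENAME =
  /^(\d{4}-\d{2}-\d{2}T\d{2}-\d{2})--([a-z0-9]+(?:-[a-z0-9]+)*)--([a-z0-9]+(?:-[a-z0-9]+)*)\.md$/;

// Returns the three segments, or null when the name does not follow the convention.
function parseLogFilename(filename: string): LogFilenameParts | null {
  const match = LOG_FILENAME.exec(filename);
  if (match === null) return null;
  return { timestamp: match[1], jobType: match[2], titleSlug: match[3] };
}
```

Because the separator is a double hyphen, a name such as `2026-02-18T20-18--challenge-gen--full-regen.md` splits unambiguously even though both the job type and the slug contain hyphens.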

Job Types

| Job Type | Script / Component | Description |
|----------|--------------------|-------------|
| `curated-seed` | `generate-curated-entries.ts` | AI-generated entity names for a topic |
| `explosion` | `bulk-enqueue.ts` + `worker-facts` | Entity → fact expansion via AI |
| `challenge-gen` | `generate-challenge-content.ts` | Fact → 6 quiz styles via AI |
| `cleanup` | `cleanup-content.ts` | Rewrite titles/context for quality |
| `backfill` | `backfill-fact-nulls.ts` | Fill NULL metadata on existing facts |
| `validation` | `worker-validate` | Multi-tier fact verification |
| `news-ingestion` | `worker-ingest` + crons | Automated news article processing |
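
If log tooling needs to check a `<job-type>` segment, the table above could be mirrored as a typed lookup. The sketch below is an assumption about how one might do that; `JOB_TYPE_COMPONENTS`, `JobType`, and `isJobType` are hypothetical names, and only the keys and values are taken from the table.

```typescript
// Job type -> script/component, copied from the Job Types table.
// The identifier names are hypothetical and not part of the Eko codebase.
const JOB_TYPE_COMPONENTS = {
  "curated-seed": "generate-curated-entries.ts",
  "explosion": "bulk-enqueue.ts + worker-facts",
  "challenge-gen": "generate-challenge-content.ts",
  "cleanup": "cleanup-content.ts",
  "backfill": "backfill-fact-nulls.ts",
  "validation": "worker-validate",
  "news-ingestion": "worker-ingest + crons",
} as const;

// Union of the seven documented job types.
type JobType = keyof typeof JOB_TYPE_COMPONENTS;

// Type guard: true only for keys defined directly on the lookup,
// so inherited names such as "toString" are rejected.
function isJobType(value: string): value is JobType {
  return Object.prototype.hasOwnProperty.call(JOB_TYPE_COMPONENTS, value);
}
```

Deriving the `JobType` union from the lookup keeps the list of valid job types in one place, so adding a row to the table means adding one entry here.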

Log Template

Use this template when creating a new seed log. Copy the block below into a new file.

---
title: "<Job Type> — <Topic/Description>"
status: completed | failed | partial
mode: curated-seed | news-only | evergreen-boost | full-pipeline
started: YYYY-MM-DDThh:mm
finished: YYYY-MM-DDThh:mm
duration_minutes: N
topics: [topic-slug-1, topic-slug-2]
entities: N
facts_generated: N
facts_failed: N
challenges_generated: N
cost_usd: N.NN
errors: N
model: gpt-5-mini | gemini-2.5-flash | claude-haiku-4-5
partitions: N
concurrency: N
---

# <Job Type> — <Topic/Description>

## Config Snapshot

Relevant SEED.md settings or CLI flags used for this run:

```yaml
# Paste the active SEED.md sections or CLI flags here
mode: curated-seed
topics:
  science:
    subcategories:
      - name: "Physics & Space"
        count: 100
volume:
  richness_tier: medium
  challenge_difficulty: 1
execution:
  concurrency: 5
  partitions: 4
```

## Execution Timeline

| Phase | Started | Duration | Result |
|-------|---------|----------|--------|
| Export | hh:mm | Nm | N facts exported |
| Generate | hh:mm | Nh Nm | N challenges created |
| Upload | hh:mm | Nm | N rows upserted |
| Validate | hh:mm | Nm | N% pass rate |

## Results

| Metric | Value |
|--------|-------|
| Entities seeded | N |
| Facts generated | N |
| Facts failed | N |
| Challenges generated | N |
| CQ-002 pass rate | N% |
| Total cost | $N.NN |

## Errors

<!-- Omit this section if errors = 0 -->

| Time | Phase | Error | Resolution |
|------|-------|-------|------------|
| hh:mm | Generate | Rate limit (TPM) | Auto-retry succeeded |

## Follow-ups

- [ ] Items needing attention after this run

Creating a Log

1. Copy the template above into a new file at `logs/YYYY-MM/<filename>.md`.
2. Fill in the frontmatter and sections as the job runs.
3. Update the monthly `index.md` with a summary row.
4. Mark `status` as `completed`, `failed`, or `partial` when done.
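
When filling in the frontmatter, `duration_minutes` can be derived from `started` and `finished` rather than computed by hand. The sketch below shows one way to do that. It assumes both frontmatter timestamps are UTC, which this README states only for the filename timestamp, and the function names `parseUtcMinute` and `durationMinutes` are hypothetical.

```typescript
// Parse a frontmatter timestamp (YYYY-MM-DDThh:mm) into a Date.
// Assumption: the value is UTC, matching the filename convention.
function parseUtcMinute(value: string): Date {
  if (!/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}$/.test(value)) {
    throw new Error(`Expected YYYY-MM-DDThh:mm, got "${value}"`);
  }
  // Appending seconds and "Z" yields a full ISO 8601 UTC timestamp.
  const date = new Date(`${value}:00Z`);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid date: "${value}"`);
  }
  return date;
}

// Whole minutes between the `started` and `finished` frontmatter values.
function durationMinutes(started: string, finished: string): number {
  const elapsedMs = parseUtcMinute(finished).getTime() - parseUtcMinute(started).getTime();
  if (elapsedMs < 0) {
    throw new Error("finished is earlier than started");
  }
  return Math.round(elapsedMs / 60_000);
}

// A run from 20:18 to 22:03 on the same day lasts 105 minutes.
console.log(durationMinutes("2026-02-18T20:18", "2026-02-18T22:03"));
```

Working from full timestamps rather than clock times alone keeps the result correct for runs that cross midnight.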

Monthly Index Format

Each month's index.md contains a summary table:

| Date | Job Type | Topics | Status | Facts | Challenges | Cost | Log |
|------|----------|--------|--------|-------|------------|------|-----|
| Feb 18 | challenge-gen | all | completed | — | 373,937 | ~$93 | [link](file.md) |
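
A summary row in this format could be generated from a log's frontmatter. The following is a minimal sketch under stated assumptions: the `IndexRow` shape and the `formatCount` and `formatIndexRow` names are hypothetical, a missing count is rendered with the same dash placeholder as the example row, and the cost is passed in preformatted because the example uses an approximate value.

```typescript
// Hypothetical shape for one row of the monthly index table.
interface IndexRow {
  date: string;              // e.g. "Feb 18"
  jobType: string;           // e.g. "challenge-gen"
  topics: string;            // e.g. "all"
  status: string;            // completed | failed | partial
  facts: number | null;      // null when the job produced no facts
  challenges: number | null; // null when the job produced no challenges
  cost: string;              // preformatted, e.g. "~$93"
  logFile: string;           // log filename, relative to the month directory
}

// Render a count with thousands separators, or the dash placeholder when absent.
function formatCount(value: number | null): string {
  return value === null ? "\u2014" : value.toLocaleString("en-US");
}

// Build one pipe-delimited table row matching the header shown above.
function formatIndexRow(row: IndexRow): string {
  const cells = [
    row.date,
    row.jobType,
    row.topics,
    row.status,
    formatCount(row.facts),
    formatCount(row.challenges),
    row.cost,
    `[link](${row.logFile})`,
  ];
  return `| ${cells.join(" | ")} |`;
}
```

Keeping the cell order in a single array makes it easy to check that the row lines up with the eight header columns.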

Related Documents

- SEED.md: Seeding control prompt
- README.md: Pipeline documentation
- runbook.md: Operational procedures
- TODO.md: Progress tracker
- seeding-best-practices.md: Strategies and cost management
- APP-CONTROL.md: App control manifest (crons, workers, queues, APIs)
- Ops Logs: Operational event logging (parallel system)
