4. Update Seed Controls
Purpose: Adjust seeding pipeline directives like mode, volume, quality thresholds, and cost caps before a seed run. Also covers adding or expanding CategorySpec definitions (44 specs across 33 root categories, ~49,000 seed entries total).
Prerequisites:
- Bun installed with `bun install` completed
- For new categories: target category must exist in the `topic_categories` DB table
- For seed generation: AI model must be configured (check `ai_model_tier_config`; the `seed_explosion` task uses the `default` tier)
Cost / Duration: $0 for directives | $1-10 for seed generation (depends on entity count and model)
Prompt
Update seed controls for the Eko seeding pipeline.
I need to change: [describe what you want to update, e.g.,
"increase volume target to 500 facts",
"raise quality threshold to 0.8",
"switch to curated mode",
"increase cost cap to $50"]
Option A — Update seed control directives:
Open `docs/projects/seeding/SEED.md` and update the relevant
control values in the directives section:
- mode (curated vs. automated)
- volume targets
- quality_threshold
- cost_cap
- Any other pipeline control parameters
This file serves as the single source of truth for seeding
directives. A Claude session reads this file and interprets
the directives when running seeding operations.
Option B — Add or expand CategorySpec definitions:
Open `packages/ai/src/config/categories.ts` to add or modify
CategorySpec entries. Each spec defines:
- `slug`: matches the root category slug in `topic_categories`
- `subcategories`: array of { name, count, prompt } objects
  - `name`: must match a depth-1 subcategory name
  - `count`: number of entity names to generate (25-200)
  - `prompt`: AI prompt describing what entities to generate
There are currently 44 CategorySpec definitions covering all
33 active root categories. Some roots have multiple specs
(e.g., sports has per-sport specs like baseball, basketball).
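As a rough illustration of the fields listed above, a spec might look like the following sketch. The interface names, the `validateSpec` helper, and the example slug, subcategory names, and prompts are all hypothetical; the actual types in `packages/ai/src/config/categories.ts` may differ.

```typescript
// Hypothetical shape inferred from the field list above; the real
// definitions live in packages/ai/src/config/categories.ts.
interface SubcategorySpec {
  name: string; // must match a depth-1 subcategory name
  count: number; // number of entity names to generate (25-200)
  prompt: string; // AI prompt describing what entities to generate
}

interface CategorySpec {
  slug: string; // root category slug in topic_categories
  subcategories: SubcategorySpec[];
}

// Illustrative entry only: slug, names, and prompts are made up.
const exampleSpec: CategorySpec = {
  slug: "example-root",
  subcategories: [
    { name: "Example Sub A", count: 100, prompt: "Well-known entities in example area A" },
    { name: "Example Sub B", count: 50, prompt: "Well-known entities in example area B" },
  ],
};

// Returns a list of human-readable problems; an empty list means the
// spec passes these basic checks (it does not check the database).
function validateSpec(spec: CategorySpec): string[] {
  const problems: string[] = [];
  if (spec.slug.trim() === "") problems.push("slug is empty");
  if (spec.subcategories.length === 0) problems.push("no subcategories");
  for (const sub of spec.subcategories) {
    if (!Number.isInteger(sub.count) || sub.count < 25 || sub.count > 200) {
      problems.push(`${sub.name}: count ${sub.count} is outside 25-200`);
    }
    if (sub.prompt.trim() === "") problems.push(`${sub.name}: prompt is empty`);
  }
  return problems;
}
```

A check like this only catches shape errors before the script runs; whether `slug` and `name` actually match rows in `topic_categories` still has to be confirmed against the database.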
After editing, generate seed entries:
```bash
bun scripts/seed/generate-curated-entries.ts --category <slug> --insert
```
The script resolves subcategories via DB lookup, generates
entity names using AI (routed through `seed_explosion` task →
`default` tier), and inserts into `seed_entry_queue`.
Option C — Add PATH_CATEGORY_MAP entries:
For new root categories, add keyword patterns to
`scripts/seed/lib/category-mapper.ts` so that URLs can be
auto-mapped to the correct category during ingestion.
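The idea of keyword-based URL mapping can be sketched as below. This is not the actual `PATH_CATEGORY_MAP` structure, which is not shown here; the `PathRule` type, the `EXAMPLE_PATH_RULES` entries, and the `mapUrlToCategory` function are assumptions used only to illustrate the technique.

```typescript
// Hypothetical rule shape: a path pattern plus the root category slug
// it maps to. The real PATH_CATEGORY_MAP may be structured differently.
type PathRule = { pattern: RegExp; slug: string };

// Made-up rules for illustration; the slugs are not real categories.
const EXAMPLE_PATH_RULES: PathRule[] = [
  { pattern: /\/(astronomy|space)\//i, slug: "space" },
  { pattern: /\/(cooking|recipes)\//i, slug: "food" },
];

// Returns the slug of the first rule whose pattern matches the URL's
// path, or null when no rule matches.
function mapUrlToCategory(url: string, rules: PathRule[]): string | null {
  const path = new URL(url).pathname;
  for (const rule of rules) {
    if (rule.pattern.test(path)) return rule.slug;
  }
  return null;
}
```

In this sketch the first matching rule wins, so more specific patterns would need to be listed before broader ones.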
Verify the updated values make sense:
- Cost cap is sufficient for the target volume
- Quality threshold is appropriate for the seeding mode
- Volume targets are achievable within budget
- AI model is available for the `default` tier (check
`ai_model_tier_config` — the `seed_explosion` task maps to
`default` tier via `TASK_TIER_MAP` in `packages/ai/src/fact-engine.ts`)
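The consistency checks above can be approximated with a small script before a run. This is a minimal sketch under stated assumptions: the `SeedControls` field names, the 0-1 quality scale, and the per-fact cost estimate are hypothetical and are not read from `SEED.md` or any real config.

```typescript
// Hypothetical in-memory view of the directives; field names are
// assumptions, not the actual keys used in SEED.md.
interface SeedControls {
  mode: "curated" | "automated";
  volumeTarget: number; // number of facts to generate
  qualityThreshold: number; // assumed to be on a 0-1 scale
  costCapUsd: number;
}

// Returns a list of issues; an empty list means the values look
// internally consistent. estCostPerFactUsd is a caller-supplied guess.
function checkControls(c: SeedControls, estCostPerFactUsd: number): string[] {
  const issues: string[] = [];
  if (c.volumeTarget <= 0) issues.push("volume target must be positive");
  if (c.qualityThreshold < 0 || c.qualityThreshold > 1) {
    issues.push("quality threshold should be between 0 and 1");
  }
  const projectedUsd = c.volumeTarget * estCostPerFactUsd;
  if (projectedUsd > c.costCapUsd) {
    issues.push(
      `projected cost $${projectedUsd.toFixed(2)} exceeds cap $${c.costCapUsd.toFixed(2)}`,
    );
  }
  return issues;
}

// Example: 500 facts at an assumed $0.02 each against a $50 cap.
const issues = checkControls(
  { mode: "curated", volumeTarget: 500, qualityThreshold: 0.8, costCapUsd: 50 },
  0.02,
);
console.log(issues.length === 0 ? "controls look consistent" : issues.join("\n"));
```

The per-fact cost depends on the model behind the `default` tier, so the estimate should be revisited whenever the tier configuration changes.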
Key references:
- Seed control directives: `docs/projects/seeding/SEED.md`
- CategorySpec definitions: `packages/ai/src/config/categories.ts`
- Seed generation script: `scripts/seed/generate-curated-entries.ts`
- Category mapper: `scripts/seed/lib/category-mapper.ts`
- Seeding scripts: `scripts/seed/`
- Environment cost controls: `packages/config/src/index.ts`
- Model routing for seeds: `packages/ai/src/fact-engine.ts` (TASK_TIER_MAP)
Verification
- Seed control values updated in `docs/projects/seeding/SEED.md` (if changing directives)
- CategorySpec added/expanded in `categories.ts` (if adding categories)
- PATH_CATEGORY_MAP updated in `category-mapper.ts` (if adding new roots)
- Values are internally consistent (cost cap supports volume target)
- `generate-curated-entries.ts --category <slug> --insert` succeeds (if generating seeds)
- Seed counts verified: `SELECT count(*) FROM seed_entry_queue WHERE topic_category_id = ...`
- Next seed run picks up the new directives
Related Prompts
- Seed the Database — Run a seeding operation with updated controls
- Update Model Registry — Change model or tier before a seed run for cost optimization