1. Seed the Database
Purpose: Run the full seeding pipeline — generate entities, enqueue, explode into facts, validate, and generate challenge content.
Prerequisites:
- Full Supabase credentials (`SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`, `SUPABASE_ANON_KEY`)
- AI model API key for the provider backing the `default` tier (check the `ai_model_tier_config` DB table; the `seed_explosion` task routes through the `default` tier)
- Review docs/projects/seeding/SEED.md and docs/projects/seeding/runbook.md
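The Supabase credentials listed above can be checked before any paid work starts. The following is a minimal sketch, not part of the seeding tooling: `missingSeedEnv` is a hypothetical helper, and it covers only the three Supabase variables, since the name of the AI provider key depends on which provider backs the tier.

```typescript
// Variable names taken from the prerequisites list above.
const REQUIRED_SUPABASE_VARS = [
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "SUPABASE_ANON_KEY",
] as const;

// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment map.
function missingSeedEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_SUPABASE_VARS.filter((name) => !env[name]?.trim());
}

// Example with a deliberately incomplete environment:
// the service role key is absent and the anon key is blank.
const missingVars = missingSeedEnv({
  SUPABASE_URL: "https://example.supabase.co",
  SUPABASE_ANON_KEY: "",
});
console.log(missingVars);
```

In a Node script the same function could be called as `missingSeedEnv(process.env)`, aborting the run if the returned list is non-empty.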
Cost / Duration: ~$68 for a typical run (500 entities, medium richness) | 1-3 hours
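For budgeting smaller runs, the quoted figure can be scaled down. This is a rough sketch that assumes cost grows linearly with entity count at medium richness; that linearity is an assumption, not something the pipeline guarantees, and `estimateSeedCostUsd` is a hypothetical name.

```typescript
// Reference point quoted above: ~$68 for 500 entities at medium richness.
const REFERENCE_COST_USD = 68;
const REFERENCE_ENTITIES = 500;

// Assumption: cost scales linearly with entity count at fixed richness.
function estimateSeedCostUsd(entities: number): number {
  if (!Number.isInteger(entities) || entities < 0) {
    throw new RangeError("entities must be a non-negative integer");
  }
  const raw = (entities * REFERENCE_COST_USD) / REFERENCE_ENTITIES;
  return Math.round(raw * 100) / 100; // round to cents
}

// Print an estimate for each of the three volume presets.
for (const volume of [50, 200, 500]) {
  console.log(`${volume} entities: ~$${estimateSeedCostUsd(volume).toFixed(2)}`);
}
```

Under that assumption a light run (50 entities) comes to about $6.80 and a medium run (200 entities) to about $27.20. Treat these as order-of-magnitude figures and rely on the per-phase cost reports for actual spend.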
Prompt
I need to run a seeding operation. Before executing, help me configure these parameters:
1. **Mode:** One of:
- `curated-seed` — Generate curated entity entries for specified topics
- `news-only` — Only automated news ingestion (low cost)
- `evergreen-boost` — Generate evergreen facts for existing entities
- `backfill` — Fill gaps in existing coverage
2. **Topics:** Which topic categories to seed? (33 active roots, 44 CategorySpecs in `categories.ts`; e.g., sports, science, entertainment, health-medicine, space-astronomy)
3. **Volume:** How many entities? (50 = light, 200 = medium, 500 = full)
4. **Quality:** Richness level — `minimal`, `medium`, or `rich`?
5. **Model:** Which model for extraction? (routed via `ai_model_tier_config`; e.g., gemini-3-flash-preview, gemini-2.5-flash)
6. **Execution:** Concurrency level (default: 3), dry-run first?
Read `docs/projects/seeding/SEED.md` for the full seeding control interface
and `docs/projects/seeding/runbook.md` for step-by-step execution guidance.
Configure the SEED.md directives based on my answers above, then execute
the pipeline step by step, reporting costs after each phase.
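The six parameters in the prompt above can be captured as a typed object and sanity-checked before anything is written to SEED.md. This is a sketch only: the `SeedAnswers` shape and `checkSeedAnswers` are hypothetical and do not reflect the actual SEED.md directive format; only the allowed mode and richness values and the default concurrency come from the prompt.

```typescript
// Allowed values copied from the parameter list in the prompt.
const MODES: readonly string[] = ["curated-seed", "news-only", "evergreen-boost", "backfill"];
const RICHNESS: readonly string[] = ["minimal", "medium", "rich"];

// Hypothetical shape for the answers; not the SEED.md directive format.
interface SeedAnswers {
  mode: string;
  topics: string[];      // e.g. ["sports", "science"]
  volume: number;        // 50 = light, 200 = medium, 500 = full
  richness: string;
  model: string;         // e.g. "gemini-2.5-flash"
  concurrency?: number;  // falls back to 3 when omitted
  dryRun?: boolean;
}

// Returns human-readable problems; an empty list means the answers look usable.
function checkSeedAnswers(a: SeedAnswers): string[] {
  const problems: string[] = [];
  if (MODES.indexOf(a.mode) === -1) problems.push(`unknown mode: ${a.mode}`);
  if (RICHNESS.indexOf(a.richness) === -1) problems.push(`unknown richness: ${a.richness}`);
  if (!Number.isInteger(a.volume) || a.volume <= 0) {
    problems.push("volume must be a positive integer");
  }
  const concurrency = a.concurrency ?? 3;
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    problems.push("concurrency must be an integer >= 1");
  }
  return problems;
}

// Example: a medium curated run with the default concurrency.
const seedProblems = checkSeedAnswers({
  mode: "curated-seed",
  topics: ["sports", "science"],
  volume: 200,
  richness: "medium",
  model: "gemini-2.5-flash",
  dryRun: true,
});
console.log(seedProblems.length === 0 ? "answers look valid" : seedProblems.join("; "));
```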
Verification
- SEED.md directives configured before execution
- Entity generation completed for target topics
- Facts exploded and stored in `facts` table
- Validation pass completed (structural + cross-model)
- Challenge content generated for validated facts
- Cost summary reported (total and per-phase)
- Topic distribution matches requested categories
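The last check in the list above, topic distribution, can be sketched as a simple comparison. The input shape here is an assumption: one topic slug per stored fact, however those are read from the `facts` table. `checkTopicDistribution` is a hypothetical helper, not part of the pipeline.

```typescript
interface TopicReport {
  counts: Map<string, number>; // facts per topic
  unexpected: string[];        // topics present in the data but not requested
  missing: string[];           // requested topics with no facts at all
}

// Hypothetical helper: compares observed fact topics with the requested ones.
function checkTopicDistribution(factTopics: string[], requested: string[]): TopicReport {
  const counts = new Map<string, number>();
  for (const topic of factTopics) {
    counts.set(topic, (counts.get(topic) ?? 0) + 1);
  }
  const unexpected = Array.from(counts.keys()).filter((t) => requested.indexOf(t) === -1);
  const missing = requested.filter((t) => !counts.has(t));
  return { counts, unexpected, missing };
}

// Example: one off-target topic and one requested topic with no facts.
const topicReport = checkTopicDistribution(
  ["sports", "sports", "science", "health-medicine"],
  ["sports", "science", "space-astronomy"],
);
console.log("unexpected:", topicReport.unexpected, "missing:", topicReport.missing);
```

An empty `unexpected` and an empty `missing` list together suggest the run stayed within the requested categories. Whether the per-topic counts are balanced enough is a judgment call this sketch does not make.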
Related Prompts
- Audit Fact Quality — Check corpus health after seeding
- Backfill Null Metadata — Patch any missing metadata post-seed