Enrichment Pipeline (WI-9)
Context
Retroactive challenge document for the enrichment pipeline work that landed post-Wave 2. It covers the enrichment orchestrator (13 API clients), the challenge content generation pipeline, the challenge image anti-spoiler pipeline, the drift coordinators, and subcategory activations. All work is complete; this doc exists for rollout tracking consistency.
Current State
- All 5 challenges implemented and verified
- Enrichment orchestrator operational with 13 API clients
- Challenge content generation producing 8 format types x 6+ styles
- ~95 subcategories activated with domain-specific schemas
Challenges
Challenge 9.1: Enrichment Orchestrator
Requirement: Parallel API resolution with fail-open design for fact enrichment.
Acceptance Criteria:
- 13 API clients integrated (Wikipedia, Wikidata, OpenLibrary, MusicBrainz, TMDB, GeoNames, etc.)
- Parallel resolution with independent failure handling
- Fail-open design — enrichment failure does not block fact publication
- Cache layer for repeated lookups
Evaluation: PASS
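The criteria above (parallel resolution, independent failure handling, fail-open, caching) can be sketched as follows. This is a minimal illustration, not the code in packages/ai/: the `EnrichmentClient` interface, the `enrich` function, and the in-memory `lookupCache` are all hypothetical names chosen for the example.

```typescript
// Hypothetical shapes: the real client interface may differ.
type Enrichment = Record<string, unknown>;

interface EnrichmentClient {
  name: string;
  lookup(query: string): Promise<Enrichment>;
}

interface EnrichmentResult {
  data: Record<string, Enrichment>; // successful lookups, keyed by client name
  failed: string[];                 // names of clients that threw or rejected
}

// Assumed in-memory cache keyed by "client:query". A production cache would
// likely be shared across workers and expire entries.
const lookupCache = new Map<string, Enrichment>();

async function enrich(
  query: string,
  clients: EnrichmentClient[],
): Promise<EnrichmentResult> {
  const result: EnrichmentResult = { data: {}, failed: [] };

  // Start every lookup at once. Each call has its own try/catch, so one
  // failing client never rejects the combined Promise.all.
  await Promise.all(
    clients.map(async (client) => {
      const key = `${client.name}:${query}`;
      try {
        let value = lookupCache.get(key);
        if (value === undefined) {
          value = await client.lookup(query);
          lookupCache.set(key, value); // only successful lookups are cached
        }
        result.data[client.name] = value;
      } catch {
        // Fail open: record the failure and carry on with the other clients.
        result.failed.push(client.name);
      }
    }),
  );

  return result;
}
```

The caller always gets a result object back, even if every client fails, so fact publication can proceed with whatever enrichment data arrived.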
Challenge 9.2: Challenge Content Generation
Requirement: AI-powered challenge generation across multiple format and style combinations.
Acceptance Criteria:
- 8 format types supported (multiple-choice, fill-blank, true-false, etc.)
- 6+ style variations per format
- Micro-batching for cost amortization
- Voice constitution enforcement (challenge tone prefix)
- GENERATE_CHALLENGE_CONTENT queue integration
Evaluation: PASS
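One plausible reading of micro-batching across format and style combinations is sketched below: expand a fact into one job per combination, then group jobs so each model call shares a single tone prefix. Everything here is illustrative. Only three of the eight formats are named in this document, and the style names, `expandJobs`, `microBatch`, and `buildPrompt` are hypothetical.

```typescript
// Only three of the eight formats are named in this document; the style
// names are placeholders, not the pipeline's real style list.
const FORMATS = ["multiple-choice", "fill-blank", "true-false"] as const;
const STYLES = ["style-a", "style-b"] as const;

interface ChallengeJob {
  factId: string;
  format: (typeof FORMATS)[number];
  style: (typeof STYLES)[number];
}

// Expand one fact into one job per format/style combination.
function expandJobs(factId: string): ChallengeJob[] {
  const jobs: ChallengeJob[] = [];
  for (const format of FORMATS) {
    for (const style of STYLES) {
      jobs.push({ factId, format, style });
    }
  }
  return jobs;
}

// Split items into micro-batches of at most `size` items each.
function microBatch<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build one prompt per batch. The tone prefix is sent once per batch instead
// of once per job, which is where the assumed cost saving comes from.
function buildPrompt(tonePrefix: string, batch: ChallengeJob[]): string {
  const lines = batch.map(
    (job, i) => `${i + 1}. fact=${job.factId} format=${job.format} style=${job.style}`,
  );
  return [tonePrefix, ...lines].join("\n");
}
```

With the full 8 formats and 6 styles, one fact would expand to at least 48 jobs, so the batch size controls how many model calls that becomes.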
Challenge 9.3: Challenge Image Pipeline
Requirement: Image resolution for challenges with anti-spoiler safeguards.
Acceptance Criteria:
- RESOLVE_CHALLENGE_IMAGE queue type operational
- Anti-spoiler logic prevents answer-revealing images
- Fallback to generic category images when the specific image is unavailable or poses a spoiler risk
Evaluation: PASS
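The anti-spoiler check and fallback could look roughly like this. The document does not describe how spoilers are detected, so the caption-matching heuristic below is an assumption made for illustration, as are the names `ImageCandidate`, `isSpoiler`, and `resolveChallengeImage`.

```typescript
interface ImageCandidate {
  url: string;
  caption: string; // alt text, title, or other text attached to the image
}

// Lowercase and collapse punctuation so "Mona-Lisa" matches "mona lisa".
// ASCII-only for brevity; real text would need Unicode-aware normalization.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Assumed heuristic: an image is a spoiler if its caption contains the
// answer as a whole phrase.
function isSpoiler(candidate: ImageCandidate, answer: string): boolean {
  const needle = normalize(answer);
  if (needle === "") return false;
  return ` ${normalize(candidate.caption)} `.includes(` ${needle} `);
}

// Return the first non-spoiler candidate, or the generic category image
// when no candidate exists or every candidate is a spoiler risk.
function resolveChallengeImage(
  candidates: ImageCandidate[],
  answer: string,
  categoryFallbackUrl: string,
): string {
  const safe = candidates.find((c) => !isSpoiler(c, answer));
  return safe ? safe.url : categoryFallbackUrl;
}
```

Padding both strings with spaces keeps short answers from matching inside longer words, so an answer of "art" does not flag a caption containing "party".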
Challenge 9.4: Drift Coordinators
Requirement: Semantic validators ensuring AI output quality.
Acceptance Criteria:
- 7 drift coordinators implemented: voice, structure, schema, taxonomy, difficulty, reveal, textbook
- Post-generation patching pipeline (passive voice, generic reveal, textbook register)
- Validators run on every generated challenge before publication
Evaluation: PASS
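A composable validator chain with post-generation patching might be shaped like the sketch below. The `Challenge` fields, the `DriftResult` shape, and the two toy rules are hypothetical; they stand in for the seven real coordinators, whose actual checks are not described here.

```typescript
interface Challenge {
  prompt: string;
  reveal: string;
}

interface DriftResult {
  coordinator: string;
  pass: boolean;
  patch?: (c: Challenge) => Challenge; // suggested fix, if one exists
}

type DriftCoordinator = (c: Challenge) => DriftResult;

// Toy rule (assumption): a prompt must end with a question mark.
const structureCoordinator: DriftCoordinator = (c) => {
  const pass = c.prompt.trim().endsWith("?");
  if (pass) return { coordinator: "structure", pass };
  return {
    coordinator: "structure",
    pass,
    patch: (x) => ({ ...x, prompt: x.prompt.trim().replace(/[.!]+$/, "") + "?" }),
  };
};

// Toy rule (assumption): the reveal must not be empty. No patch is offered
// because the text cannot be invented mechanically.
const revealCoordinator: DriftCoordinator = (c) => ({
  coordinator: "reveal",
  pass: c.reveal.trim().length > 0,
});

// Run each coordinator in order. On failure with a patch, apply the patch
// and re-check once. A later patch is not re-checked against earlier rules.
function runCoordinators(
  challenge: Challenge,
  coordinators: DriftCoordinator[],
): { challenge: Challenge; results: DriftResult[]; publishable: boolean } {
  let current = challenge;
  const results: DriftResult[] = [];
  for (const check of coordinators) {
    let result = check(current);
    if (!result.pass && result.patch) {
      current = result.patch(current);
      result = check(current);
    }
    results.push(result);
  }
  return { challenge: current, results, publishable: results.every((r) => r.pass) };
}
```

Because every coordinator shares one function signature, adding or reordering validators is a change to the array passed in, not to the runner.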
Challenge 9.5: Subcategory Activation
Requirement: Activate subcategories with domain-specific schemas for pipeline routing.
Acceptance Criteria:
- ~95 subcategories activated
- Domain-specific schemas per subcategory
- Auto-inheritance trigger for new subcategories
- EXPLODE_CATEGORY_ENTRY queue type for category expansion
- Fact routing respects subcategory schema constraints
Evaluation: PASS
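Schema inheritance and schema-constrained routing can be illustrated as below. The actual inheritance trigger and storage layout are not detailed in this document, so the sketch assumes a new subcategory without its own fact keys inherits them from its parent. `SchemaRow`, `FactKey`, `resolveFactKeys`, and `validateFact` are hypothetical names.

```typescript
interface FactKey {
  name: string;
  required: boolean;
}

// Hypothetical row shape: a subcategory may define its own fact keys or
// leave them unset and inherit from its parent.
interface SchemaRow {
  slug: string;
  parent?: string;
  factKeys?: FactKey[];
}

// Walk up the parent chain until a row with fact keys is found.
// The `seen` set guards against accidental cycles in the hierarchy.
function resolveFactKeys(slug: string, rows: Map<string, SchemaRow>): FactKey[] {
  const seen = new Set<string>();
  let current: string | undefined = slug;
  while (current !== undefined && !seen.has(current)) {
    seen.add(current);
    const row = rows.get(current);
    if (!row) break;
    if (row.factKeys) return row.factKeys;
    current = row.parent;
  }
  return [];
}

// Check an extracted fact against the resolved keys: every required key
// must be present, and no key outside the schema is allowed.
function validateFact(
  fact: Record<string, unknown>,
  keys: FactKey[],
): { valid: boolean; missing: string[]; unknown: string[] } {
  const missing = keys
    .filter((k) => k.required && !(k.name in fact))
    .map((k) => k.name);
  const unknown = Object.keys(fact).filter(
    (name) => !keys.some((k) => k.name === name),
  );
  return { valid: missing.length === 0 && unknown.length === 0, missing, unknown };
}
```

A fact that fails validation would be held back from routing rather than published under a subcategory whose schema it does not satisfy.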
Implementation Notes
- Enrichment orchestrator lives in packages/ai/
- Challenge content generation uses Vercel AI SDK v6 generateObject with Zod schemas
- Drift coordinators are composable; each returns pass/fail with patch suggestions
- Subcategory schemas are stored in fact_record_schemas.fact_keys and validated at extraction time
- Queue types added: RESOLVE_CHALLENGE_IMAGE, GENERATE_CHALLENGE_CONTENT, EXPLODE_CATEGORY_ENTRY, FIND_SUPER_FACTS