4. Generate Challenge Content
Purpose: Generate pre-built challenge content (6 quiz styles x 5 difficulty levels) for validated facts using a multi-phase JSONL pipeline.
Prerequisites:
- `.env.local` has `DATABASE_URL` and an API key for the routed model (see available models)
- Validated facts exist in `fact_records` (run `worker-validate` first if facts are still `pending_validation`)
- Sufficient API budget for the run (model selected by tier routing via `selectModelForTask`)
Cost / Duration: $6-$60 depending on fact count (see cost table below) | 4-12 hours
Prompt
I need to generate challenge content for validated facts. Walk me through the
6-phase pipeline, reporting progress and costs after each phase.
### Phase 1: Audit current coverage
```bash
bun scripts/seed/generate-challenge-content.ts --audit
```
Report how many validated facts have 0, 1, 2, ... 6 challenge styles.
This tells us the scope of the run.
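The shape of the audit report can be sketched as a histogram over style counts. This is a minimal illustration, not the script's actual implementation: the `ChallengeRow` type, the `coverageHistogram` function, and the sample IDs are all hypothetical.

```typescript
// Hypothetical row shape: one entry per (fact, style) pair that already has content.
type ChallengeRow = { factId: string; style: string };

// Count how many validated facts have 0, 1, ..., totalStyles distinct challenge styles.
function coverageHistogram(
  validatedFactIds: string[],
  rows: ChallengeRow[],
  totalStyles = 6,
): number[] {
  const stylesByFact = new Map<string, Set<string>>();
  for (const id of validatedFactIds) stylesByFact.set(id, new Set<string>());
  for (const row of rows) {
    // Rows for facts outside the validated set are ignored.
    stylesByFact.get(row.factId)?.add(row.style);
  }
  const histogram = new Array<number>(totalStyles + 1).fill(0);
  stylesByFact.forEach((styles) => {
    const bucket = Math.min(styles.size, totalStyles);
    histogram[bucket] = (histogram[bucket] ?? 0) + 1;
  });
  return histogram;
}

// Index i holds the number of facts with exactly i styles.
const histogram = coverageHistogram(
  ["f1", "f2", "f3"],
  [
    { factId: "f1", style: "multiple_choice" },
    { factId: "f1", style: "free_text" },
    { factId: "f2", style: "multiple_choice" },
  ],
);
console.log(histogram);
```

Facts in the low buckets (especially index 0) drive the size and cost of the run.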
### Phase 2: Export facts needing content
```bash
bun scripts/seed/generate-challenge-content.ts --export
```
This dumps validated facts missing challenge content to
`.challenge-data/facts-export.jsonl` (gitignored).
### Phase 3: Generate challenge content
First, do a dry run on a small sample:
```bash
bun scripts/seed/generate-challenge-content.ts --generate --dry-run --limit 10
```
Review the sample output. If it looks good, run in partitions for parallel execution:
```bash
bun scripts/seed/generate-challenge-content.ts --generate --concurrency 5 --partition 1/4
bun scripts/seed/generate-challenge-content.ts --generate --concurrency 5 --partition 2/4
bun scripts/seed/generate-challenge-content.ts --generate --concurrency 5 --partition 3/4
bun scripts/seed/generate-challenge-content.ts --generate --concurrency 5 --partition 4/4
```
Each partition generates 6 styles per fact: multiple_choice, direct_question,
fill_the_gap, statement_blank, reverse_lookup, free_text. Each challenge gets
its own per-style `challenge_title` (theatrical, cinematic — generated alongside
the question/answer, not inherited from the fact record). Results go to
`.challenge-data/challenges-generated.jsonl` (partition-aware filenames).
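To show why the four partition commands cover every fact exactly once, here is a rough sketch of one way an `N/M` spec could be applied. The round-robin assignment by position is an assumption; the real script may split facts differently (for example, by hashing IDs). `parsePartition` and `selectPartition` are hypothetical names.

```typescript
// Parse an "N/M" partition spec (1-based N), as passed to --partition.
function parsePartition(spec: string): { index: number; total: number } {
  const match = /^(\d+)\/(\d+)$/.exec(spec);
  if (!match) throw new Error(`Invalid partition spec: ${spec}`);
  const index = Number(match[1]);
  const total = Number(match[2]);
  if (total < 1 || index < 1 || index > total) {
    throw new Error(`Partition out of range: ${spec}`);
  }
  return { index, total };
}

// Assumption: facts are assigned round-robin by their position in the export file.
function selectPartition<T>(items: T[], spec: string): T[] {
  const { index, total } = parsePartition(spec);
  return items.filter((_, i) => i % total === index - 1);
}

const facts = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
const slices = ["1/4", "2/4", "3/4", "4/4"].map((spec) =>
  selectPartition(facts, spec),
);
console.log(slices);
```

Under this scheme the slices are disjoint and their union is the full export, so the partitions can run in separate terminals without duplicating work.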
### Phase 4: Upload to database
Preview first, then upload:
```bash
bun scripts/seed/generate-challenge-content.ts --upload --dry-run
bun scripts/seed/generate-challenge-content.ts --upload
```
Bulk upserts to `fact_challenge_content` using
`onConflictDoUpdate` on `(fact_record_id, challenge_style, target_fact_key, difficulty)`.
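The effect of upserting on that four-column conflict target can be illustrated with an in-memory stand-in for the table. This is only a sketch of the semantics, not the upload code: the `ChallengeContentRow` type, the `question` payload column, and both helper functions are hypothetical.

```typescript
// Hypothetical row shape mirroring the conflict target columns named above.
type ChallengeContentRow = {
  fact_record_id: string;
  challenge_style: string;
  target_fact_key: string;
  difficulty: number;
  question: string; // illustrative payload column
};

// Build the composite key that decides whether a row inserts or updates.
function conflictKey(row: ChallengeContentRow): string {
  return JSON.stringify([
    row.fact_record_id,
    row.challenge_style,
    row.target_fact_key,
    row.difficulty,
  ]);
}

// In-memory stand-in for the table: a row with an existing key replaces the old row.
function upsertAll(
  table: Map<string, ChallengeContentRow>,
  rows: ChallengeContentRow[],
): { inserted: number; updated: number } {
  let inserted = 0;
  let updated = 0;
  for (const row of rows) {
    const key = conflictKey(row);
    if (table.has(key)) updated += 1;
    else inserted += 1;
    table.set(key, row);
  }
  return { inserted, updated };
}

const table = new Map<string, ChallengeContentRow>();
const base = {
  fact_record_id: "fact-1",
  challenge_style: "multiple_choice",
  target_fact_key: "capital",
};
const firstUpload = upsertAll(table, [
  { ...base, difficulty: 1, question: "v1 easy" },
  { ...base, difficulty: 2, question: "v1 harder" },
]);
// Re-uploading the same (fact, style, key, difficulty) overwrites instead of duplicating.
const secondUpload = upsertAll(table, [
  { ...base, difficulty: 1, question: "v2 easy" },
]);
console.log(firstUpload, secondUpload, table.size);
```

Because re-uploads overwrite rather than duplicate, running `--upload` again after a `--recover` pass should be safe under these semantics.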
### Phase 5: Validate quality
```bash
bun scripts/seed/generate-challenge-content.ts --validate
```
Post-upload quality check against CC/CQ rules. Samples 20 random facts for
manual review and reports CQ-002 pass rate.
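The two outputs of this phase, a per-rule pass rate and a random sample for manual review, can be sketched as follows. The `CheckResult` shape and both functions are hypothetical; the actual validator's data model is not shown in this document.

```typescript
// Hypothetical result shape: one entry per rule check on a challenge.
type CheckResult = { factId: string; rule: string; passed: boolean };

// Fraction of checks for one rule that passed, or null if the rule never ran.
function passRate(results: CheckResult[], rule: string): number | null {
  const relevant = results.filter((r) => r.rule === rule);
  if (relevant.length === 0) return null;
  return relevant.filter((r) => r.passed).length / relevant.length;
}

// Partial Fisher-Yates shuffle: pick `count` distinct items uniformly at random.
function sampleWithoutReplacement<T>(
  items: T[],
  count: number,
  random: () => number = Math.random,
): T[] {
  const pool = items.slice();
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

const results: CheckResult[] = Array.from({ length: 20 }, (_, i) => ({
  factId: `fact-${i}`,
  rule: "CQ-002",
  passed: i !== 0, // one failure out of twenty
}));
const rate = passRate(results, "CQ-002");
const factIds = Array.from({ length: 50 }, (_, i) => `fact-${i}`);
const reviewSample = sampleWithoutReplacement(factIds, 20);
console.log(rate, reviewSample.length);
```

Returning `null` for a rule with no results keeps "nothing was checked" distinct from a perfect score.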
### Phase 6: Recover weak outputs (optional)
If validation reports issues:
```bash
bun scripts/seed/generate-challenge-content.ts --recover
```
Re-processes facts with validation issues. Supports `--partition N/M` for
parallel execution.
After all phases, run `--audit` again to confirm full coverage.
### Fact Challenge Groups (FCG)
For group-based generation (5 challenges per fact in a single API call), use:
```bash
bun scripts/seed/test-fcg-diversity.ts --limit 10 --models gpt-5.4-nano
```
FCG generation uses `generateGroupChallenges()` which:
- Produces 5 challenges (C1-C5) with increasing difficulty in one call
- C1-C2: multiple_choice; C3-C5: AI-selected from fill_the_gap, direct_question, reverse_lookup, free_text
- Enforces key diversity via `validateGroupDiversity()`: each group needs at least 3-4 unique keys, with no key used by more than 2 challenges
- Retries with explicit round-robin key assignments if diversity validation fails
- Reports per-entity key distribution in `docs/test-results/challenge-groups/test-group-3/`
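The diversity rule and the round-robin retry described in the list above might look roughly like this. It is a sketch under stated assumptions, not the real `validateGroupDiversity()`: the function names, the `GroupChallenge` type, the default threshold of 3 unique keys, and the example keys are all hypothetical.

```typescript
// Hypothetical shape: which fact key each challenge slot (C1-C5) targets.
type GroupChallenge = { slot: string; targetFactKey: string };

// Check the two constraints: enough distinct keys, and no key overused.
function checkGroupDiversity(
  group: GroupChallenge[],
  minUniqueKeys = 3,
  maxPerKey = 2,
): { ok: boolean; issues: string[] } {
  const counts = new Map<string, number>();
  for (const c of group) {
    counts.set(c.targetFactKey, (counts.get(c.targetFactKey) ?? 0) + 1);
  }
  const issues: string[] = [];
  if (counts.size < minUniqueKeys) {
    issues.push(`only ${counts.size} unique keys, need ${minUniqueKeys}`);
  }
  counts.forEach((count, key) => {
    if (count > maxPerKey) {
      issues.push(`key "${key}" used ${count} times, max ${maxPerKey}`);
    }
  });
  return { ok: issues.length === 0, issues };
}

// Retry strategy: hand out keys in rotation so no key is overused.
function assignKeysRoundRobin(slots: string[], keys: string[]): GroupChallenge[] {
  if (keys.length === 0) throw new Error("No keys available");
  return slots.map((slot, i) => ({
    slot,
    targetFactKey: keys[i % keys.length] as string,
  }));
}

const slots = ["C1", "C2", "C3", "C4", "C5"];
const skewed: GroupChallenge[] = [
  { slot: "C1", targetFactKey: "capital" },
  { slot: "C2", targetFactKey: "capital" },
  { slot: "C3", targetFactKey: "capital" },
  { slot: "C4", targetFactKey: "population" },
  { slot: "C5", targetFactKey: "population" },
];
const firstAttempt = checkGroupDiversity(skewed);
const retry = checkGroupDiversity(
  assignKeysRoundRobin(slots, ["capital", "population", "currency"]),
);
console.log(firstAttempt, retry);
```

With five slots, rotating through three or more keys always satisfies both constraints, which is presumably why explicit assignment works as a fallback. With only two keys available, one key would be used three times and the check would still fail.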
Cost Table
| Facts | Styles | Estimated Cost |
|---|---|---|
| 1,000 | 6 (all) | ~$6 |
| 5,000 | 6 (all) | ~$30 |
| 10,000 | 6 (all) | ~$60 |
| 10,000 | 3 (multiple_choice, direct_question, fill_the_gap) | ~$30 |
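The table rows are consistent with a flat rate of roughly $0.001 per fact per style (1,000 facts x 6 styles = ~$6). A small estimator under that assumption is sketched below; the rate is inferred from the table rather than taken from provider pricing, and `estimateCostUsd` is a hypothetical helper.

```typescript
// Assumption inferred from the cost table: about one tenth of a cent per (fact, style) pair.
const FACT_STYLE_PAIRS_PER_DOLLAR = 1000;

// Rough budget estimate in US dollars for a generation run.
function estimateCostUsd(factCount: number, styleCount = 6): number {
  if (factCount < 0 || styleCount < 0) {
    throw new Error("Counts must be non-negative");
  }
  return (factCount * styleCount) / FACT_STYLE_PAIRS_PER_DOLLAR;
}

const estimates = [
  estimateCostUsd(1000),
  estimateCostUsd(5000),
  estimateCostUsd(10000),
  estimateCostUsd(10000, 3),
];
console.log(estimates);
```

Actual spend depends on the model chosen by tier routing and on retries, so treat the result as a planning figure only.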
Verification
- `--audit` shows all validated facts have 6 challenge styles
- Zero facts with 0 styles remaining
- CQ-002 pass rate above 95% on `--validate`
- Upload completed without conflict errors
- Cost summary reported for the run
- FCG diversity test passes (`bun scripts/seed/test-fcg-diversity.ts`): all groups use 3+ unique keys
Related Prompts
- Seed the Database -- Full seeding pipeline
- Tune Challenge Allocation -- Configure style/difficulty matrix
- Rewrite Challenge Defects -- Fix validation failures
- Run Voice Enforcement -- Style-specific voice pass