9. Tune Challenge Allocation

Purpose: Adjust the matrix of challenge styles and difficulty levels for a seeding run, trading depth (more variants per fact) against breadth (more facts at lower cost).

Prerequisites:

  • Familiarity with the challenge content pipeline (scripts/seed/generate-challenge-content.ts)
  • Access to Supabase for audit queries (or use the --audit phase instead)

Cost / Duration: $0 (planning only) | 5-10 minutes

Context

Challenge content generation is the most expensive step in the seeding pipeline (~88% of total cost). Each fact can have up to 30 challenge variants (6 styles × 5 difficulty levels), so reducing this matrix is the single highest-leverage cost optimization.

Available Styles

| Abbreviation | Full Name | Description |
|---|---|---|
| mc | multiple_choice | 4-option pick-one |
| dq | direct_question | Open-ended factual question |
| ftg | fill_the_gap | Complete the sentence with blanked term |
| sb | statement_blank | Complete a factual statement |
| rl | reverse_lookup | Identify the subject from clues |
| ft | free_text | Open synthesis / explanation |

Available Difficulty Levels

| Level | Label | Description |
|---|---|---|
| 1 | Easy | Direct recall of the most prominent fact |
| 2 | Medium | Recall with some contextual reasoning |
| 3 | Hard | Requires connecting multiple pieces of information |
| 4 | Expert | Deep knowledge or cross-domain reasoning |
| 5 | Master | Expert synthesis across multiple facts or domains |

Cost Formula

cost_per_fact ≈ $0.006 × (styles_count / 6) × (difficulty_count / 5)

For 10,000 facts:

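| Config | Styles | Difficulties | Challenges/Fact | Est. Cost |
|---|---|---|---|---|
| Full (default) | 6 | 5 | 30 | ~$60 |
| Core styles, all diff | 3 | 5 | 15 | ~$30 |
| All styles, key diff | 6 | 3 | 18 | ~$36 |
| Core styles, key diff | 3 | 3 | 9 | ~$18 |
| Minimal | 2 | 2 | 4 | ~$8 |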
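
The formula and table above can be checked with a few lines of TypeScript. This is a minimal sketch, not part of the pipeline: `estimateAllocation` and its field names are hypothetical, and the $0.006 constant is simply the full-matrix per-fact cost taken from the formula.

```typescript
// Assumption: $0.006 is the per-fact cost of the full 6 x 5 matrix,
// taken directly from the cost formula above.
const FULL_MATRIX_COST_PER_FACT = 0.006;
const TOTAL_STYLES = 6;
const TOTAL_DIFFICULTIES = 5;

interface AllocationEstimate {
  challengesPerFact: number;
  totalChallenges: number;
  costPerFact: number;
  totalCost: number;
}

// Hypothetical helper: applies the cost formula to one allocation.
function estimateAllocation(
  stylesCount: number,
  difficultyCount: number,
  factCount: number,
): AllocationEstimate {
  if (!Number.isInteger(stylesCount) || stylesCount < 1 || stylesCount > TOTAL_STYLES) {
    throw new RangeError(`stylesCount must be an integer from 1 to ${TOTAL_STYLES}`);
  }
  if (
    !Number.isInteger(difficultyCount) ||
    difficultyCount < 1 ||
    difficultyCount > TOTAL_DIFFICULTIES
  ) {
    throw new RangeError(`difficultyCount must be an integer from 1 to ${TOTAL_DIFFICULTIES}`);
  }
  const challengesPerFact = stylesCount * difficultyCount;
  const costPerFact =
    FULL_MATRIX_COST_PER_FACT *
    (stylesCount / TOTAL_STYLES) *
    (difficultyCount / TOTAL_DIFFICULTIES);
  return {
    challengesPerFact,
    totalChallenges: challengesPerFact * factCount,
    costPerFact,
    totalCost: costPerFact * factCount,
  };
}

// Reproduce the "Core styles, key diff" row for 10,000 facts.
const coreKey = estimateAllocation(3, 3, 10000);
console.log(
  `${coreKey.challengesPerFact} challenges/fact, ~$${coreKey.totalCost.toFixed(2)} total`,
);
```

Because cost scales linearly with both counts, halving the styles or dropping two difficulty levels gives a predictable saving before any generation is run.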

Prompt

I need to tune the challenge content allocation for a seeding run.

Read `docs/prompts/seeding/09-tune-challenge-allocation.md` for the full reference,
then help me configure based on these inputs:

1. **Goal** (pick one):
   - `broad` — Maximize fact coverage (more unique facts, fewer challenge variants each)
   - `deep` — Maximize challenge variety per fact (fewer facts, more variants each)
   - `balanced` — Middle ground

2. **Budget:** $_____ total for challenge generation

3. **Estimated fact count:** _____ facts to generate challenges for

4. **Style priority** (rank or select):
   - Which styles are essential for launch? (mc, dq, ftg are the 3 core styles)
   - Which styles can be deferred for backfill later?

5. **Difficulty priority:**
   - Which levels matter most for the current user base?
   - Do we need all 5 levels, or can we start with a subset and backfill?

Based on my answers, do the following:

### A. Calculate the allocation

Show me a cost comparison table for 2-3 allocation options that fit my budget,
including:
- Styles selected and their abbreviations
- Difficulty levels selected
- Challenges per fact
- Total challenges
- Estimated cost
- Cost per fact

### B. Generate the CLI commands

Produce the exact commands for each phase:

```bash
# Export
bun scripts/seed/generate-challenge-content.ts --export --difficulty <N>

# Generate (single difficulty)
bun scripts/seed/generate-challenge-content.ts --generate \
  --difficulty <N> --styles <list> --concurrency 5

# Generate (balanced spread across selected difficulties)
# Run each difficulty level as a separate pass:
bun scripts/seed/generate-challenge-content.ts --generate \
  --difficulty 1 --styles <list> --concurrency 5
bun scripts/seed/generate-challenge-content.ts --generate \
  --difficulty 3 --styles <list> --concurrency 5
# ... etc.

# Upload all results
bun scripts/seed/generate-challenge-content.ts --upload

# Validate
bun scripts/seed/generate-challenge-content.ts --validate --drift-check
```

### C. Backfill plan

For any styles or difficulty levels I'm skipping now, outline the backfill
commands I'd run later when budget allows. Note: `--export-all` re-exports
facts that already have some challenge content so additional styles/levels
can be generated.

### D. Audit current state (if I have existing data)

If I already have challenge content in the database, first run:
```bash
bun scripts/seed/generate-challenge-content.ts --audit
```
Then show me what's already covered and what gaps the new allocation fills.

Verification

  • Cost estimate fits within stated budget
  • Selected styles include all 3 core styles (CC-001 compliance: mc, dq, ftg)
  • Selected difficulties include level 1 (Easy) for onboarding
  • CLI commands are syntactically correct with valid abbreviations
  • Backfill plan covers deferred styles/levels
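
The mechanical items in this checklist could be automated along the following lines. This is a sketch under assumptions: `verifyAllocation` and the `Allocation` shape are hypothetical, the cost uses the formula from the Cost Formula section, and the backfill-plan check is left out because it requires human review.

```typescript
// Style abbreviations and core styles as listed in this document.
const VALID_STYLES = ["mc", "dq", "ftg", "sb", "rl", "ft"] as const;
const CORE_STYLES = ["mc", "dq", "ftg"] as const;

// Hypothetical shape describing one proposed allocation.
interface Allocation {
  styles: string[];
  difficulties: number[];
  factCount: number;
  budget: number; // dollars available for challenge generation
}

// Returns a list of problems; an empty list means every automated check passed.
function verifyAllocation(a: Allocation): string[] {
  const problems: string[] = [];
  const styles = [...new Set(a.styles)];
  const difficulties = [...new Set(a.difficulties)];

  const unknown = styles.filter((s) => !(VALID_STYLES as readonly string[]).includes(s));
  if (unknown.length > 0) {
    problems.push(`unknown style abbreviations: ${unknown.join(", ")}`);
  }

  const missingCore = CORE_STYLES.filter((s) => !styles.includes(s));
  if (missingCore.length > 0) {
    problems.push(`missing core styles: ${missingCore.join(", ")}`);
  }

  const badLevels = difficulties.filter((d) => !Number.isInteger(d) || d < 1 || d > 5);
  if (badLevels.length > 0) {
    problems.push(`invalid difficulty levels: ${badLevels.join(", ")}`);
  }
  if (!difficulties.includes(1)) {
    problems.push("level 1 (Easy) is not selected");
  }

  // Cost formula from this document, rounded to cents to avoid float noise.
  const rawCost = 0.006 * (styles.length / 6) * (difficulties.length / 5) * a.factCount;
  const cost = Math.round(rawCost * 100) / 100;
  if (cost > a.budget) {
    problems.push(`estimated cost $${cost.toFixed(2)} exceeds budget $${a.budget.toFixed(2)}`);
  }
  return problems;
}

const report = verifyAllocation({
  styles: ["mc", "dq", "ftg"],
  difficulties: [1, 3, 5],
  factCount: 10000,
  budget: 20,
});
console.log(report.length === 0 ? "all checks passed" : report.join("\n"));
```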

Common Presets

Broad Coverage (maximize facts)

--styles mc,dq,ftg --difficulty 1
# 3 challenges/fact | ~$0.0006/fact | Best for initial corpus

Engagement-Ready (launch minimum)

--styles mc,dq,ftg --difficulty 0
# 15 challenges/fact (3 styles × 5 levels) | ~$0.003/fact
# --difficulty 0 = balanced spread across all 5 levels

Key Difficulties Only

--styles mc,dq,ftg,rl --difficulty 1
# then separate passes for:
--styles mc,dq,ftg,rl --difficulty 3
--styles mc,dq,ftg,rl --difficulty 5
# 12 challenges/fact | ~$0.0024/fact | Skip middle tiers

Full Matrix (no compromises)

--styles mc,dq,ftg,sb,rl,ft --difficulty 0
# 30 challenges/fact | ~$0.006/fact | Expensive but complete
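
Presets that need one pass per difficulty level, such as Key Difficulties Only, are easy to mistype by hand. The sketch below builds the command strings from a style list and a list of levels. `buildGenerateCommands` is a hypothetical helper; it only uses flags that already appear in this document and does not run anything.

```typescript
const SCRIPT = "bun scripts/seed/generate-challenge-content.ts";

// Hypothetical helper: one --generate command per difficulty level.
// Passing [0] yields the single balanced-spread pass described above.
function buildGenerateCommands(
  styles: string[],
  difficulties: number[],
  concurrency = 5,
): string[] {
  const styleList = styles.join(",");
  return difficulties.map(
    (level) =>
      `${SCRIPT} --generate --difficulty ${level} --styles ${styleList} --concurrency ${concurrency}`,
  );
}

// Key Difficulties Only: 4 styles, levels 1, 3 and 5 as separate passes.
for (const command of buildGenerateCommands(["mc", "dq", "ftg", "rl"], [1, 3, 5])) {
  console.log(command);
}
```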

Parallel Execution

For large runs, split across partitions:

# 4 parallel workers, each handling 1/4 of the facts
for i in 1 2 3 4; do
  bun scripts/seed/generate-challenge-content.ts --generate \
    --partition $i/4 --output-suffix p$i \
    --styles mc,dq,ftg --difficulty 1 --concurrency 5 &
done
wait
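
When the worker count varies between runs, the partition arguments from the loop above can be generated instead of hard-coded. The following is a sketch only: `buildPartitionArgs` is a hypothetical helper that mirrors the `--partition i/N` and `--output-suffix p<i>` convention shown above and returns argument arrays without launching any process.

```typescript
// Hypothetical helper: one argument array per partition worker.
function buildPartitionArgs(
  workers: number,
  styles: string[],
  difficulty: number,
  concurrency = 5,
): string[][] {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new RangeError("workers must be a positive integer");
  }
  return Array.from({ length: workers }, (_, index) => {
    const i = index + 1; // partitions are 1-based, matching the shell loop
    return [
      "scripts/seed/generate-challenge-content.ts",
      "--generate",
      "--partition", `${i}/${workers}`,
      "--output-suffix", `p${i}`,
      "--styles", styles.join(","),
      "--difficulty", String(difficulty),
      "--concurrency", String(concurrency),
    ];
  });
}

// Print the four commands equivalent to the loop above.
for (const args of buildPartitionArgs(4, ["mc", "dq", "ftg"], 1)) {
  console.log(["bun", ...args].join(" "));
}
```

Each array can be handed to whatever process launcher the environment provides; keeping arguments as separate elements avoids shell quoting issues.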
