# Model Code Isolation

Per-model prompt optimization via the `ModelAdapter` strategy pattern. Each AI model gets a dedicated adapter that tailors prompts, generation parameters, and quality-review guidance to exploit that model's strengths and mitigate its weaknesses.
## Why Per-Model Optimization Matters
Different LLMs have distinct failure modes. GPT-5 Mini tends toward quiz-show register in challenge titles. Gemini 2.5 Flash gravitates to textbook/encyclopedia tone. Claude Haiku 4.5 can be verbose. Rather than tuning prompts to the lowest common denominator, the adapter pattern lets each model receive targeted corrections injected into the system prompt at generation time.
## Directory Structure

```
packages/ai/src/models/
├── types.ts      # ModelAdapter, PromptCustomization, AdaptableTask
├── registry.ts   # getModelAdapter(), eligibility tracking (JSONL)
└── adapters/
    ├── _default.ts               # Null-object pass-through (no customization)
    ├── gpt-5-mini.ts             # OpenAI GPT-5 Mini
    ├── gemini-2.5-flash.ts       # Google Gemini 2.5 Flash
    ├── gemini-3-flash-preview.ts # Google Gemini 3 Flash Preview
    └── claude-haiku-4-5.ts       # Anthropic Claude Haiku 4.5
```
## Interface Contract

### ModelAdapter

```typescript
interface ModelAdapter {
  modelId: string
  provider: string
  displayName: string
  getPromptCustomization(task: AdaptableTask): PromptCustomization
  getKnownWeaknesses(): string[]
  getSignoffGuidance(): string
}
```
### PromptCustomization

```typescript
interface PromptCustomization {
  systemPromptSuffix?: string   // Appended after base prompt (default mode)
  systemPromptPrefix?: string   // Prepended before base prompt
  systemPromptOverride?: string // Replaces entire system prompt (use sparingly)
  temperature?: number          // Override default temperature
  maxRetries?: number           // Override default retry count
}
```
### AdaptableTask (7 tasks)

| Task | Consumer |
|---|---|
| `fact_extraction` | `fact-engine.ts` — news story extraction |
| `evergreen_generation` | `fact-engine.ts` — timeless knowledge facts |
| `seed_explosion` | `seed-explosion.ts` — entity-to-facts expansion |
| `challenge_content_generation` | `challenge-content.ts` — quiz/recall content |
| `notability_scoring` | `fact-engine.ts` — fact importance scoring |
| `fact_validation` | `fact-engine.ts` — AI plausibility check |
| `signoff_review` | `llm-fact-quality-testing.ts`, `news-ingestion-test.ts` — quality signoff (dedicated: `gemini-3-flash-preview`) |
## Three Customization Modes

Priority order (highest wins):

- **Override** — `systemPromptOverride` replaces the entire system prompt. Use sparingly, for models that need fundamentally different prompting.
- **Prefix** — `systemPromptPrefix` is prepended before the base prompt. Useful for setting context before instructions.
- **Suffix** (default) — `systemPromptSuffix` is appended after the base prompt. Most adapters use this mode to add model-specific corrections at the end.
Applied by `applyPromptCustomization()` in `packages/ai/src/fact-engine.ts`:

```typescript
function applyPromptCustomization(basePrompt: string, customization: PromptCustomization): string {
  if (customization.systemPromptOverride) return customization.systemPromptOverride
  const prefix = customization.systemPromptPrefix ?? ''
  const suffix = customization.systemPromptSuffix ?? ''
  return `${prefix}${prefix ? '\n\n' : ''}${basePrompt}${suffix ? '\n\n' : ''}${suffix}`
}
```
## Adapter Registry

`getModelAdapter(modelId, provider?)` in `registry.ts`:

- Looks up the model ID in `ADAPTER_FACTORIES` (a `Record<string, () => ModelAdapter>`)
- If found, calls the factory to instantiate the adapter
- If not found, returns a default pass-through adapter (null-object pattern)
- All adapters are cached in a `Map<string, ModelAdapter>` after first instantiation

Helper functions:

- `hasModelAdapter(modelId)` — checks if a dedicated adapter exists
- `listAdaptedModels()` — returns all model IDs with dedicated adapters
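The lookup-plus-cache behavior can be sketched roughly like this. This is a simplified assumption about the shape of `registry.ts` (the real module also tracks eligibility, and whether default adapters are cached is an implementation detail not stated above):

```typescript
interface ModelAdapter {
  modelId: string
  getPromptCustomization(task: string): Record<string, unknown>
  getKnownWeaknesses(): string[]
  getSignoffGuidance(): string
}

// Null-object default: every method is a harmless no-op.
function createDefaultAdapter(modelId: string): ModelAdapter {
  return {
    modelId,
    getPromptCustomization: () => ({}),
    getKnownWeaknesses: () => [],
    getSignoffGuidance: () => '',
  }
}

// Factory table; the real entries live in registry.ts.
const ADAPTER_FACTORIES: Record<string, () => ModelAdapter> = {
  // 'gpt-5-mini': createGpt5MiniAdapter, ...
}

const cache = new Map<string, ModelAdapter>()

function getModelAdapter(modelId: string): ModelAdapter {
  const cached = cache.get(modelId)
  if (cached) return cached
  const factory = ADAPTER_FACTORIES[modelId]
  const adapter = factory ? factory() : createDefaultAdapter(modelId)
  cache.set(modelId, adapter)
  return adapter
}

function hasModelAdapter(modelId: string): boolean {
  return modelId in ADAPTER_FACTORIES
}

function listAdaptedModels(): string[] {
  return Object.keys(ADAPTER_FACTORIES)
}
```

Callers never need to null-check: an unknown model ID still yields a working (if inert) adapter.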
## Adding a New Model

1. **Create adapter file** at `packages/ai/src/models/adapters/{model-id}.ts`:
   - Export a `create{ModelName}Adapter()` factory function
   - Implement `getPromptCustomization()` for each relevant task
   - Document known weaknesses and signoff guidance
2. **Register the factory** in `registry.ts`:

   ```typescript
   import { createNewModelAdapter } from './adapters/new-model'

   const ADAPTER_FACTORIES: Record<string, () => ModelAdapter> = {
     // ... existing adapters
     'new-model-id': createNewModelAdapter,
   }
   ```

3. **Add to model registry** in `packages/config/src/model-registry.ts` (if not already registered)
4. **Run quality tests** via `bun scripts/seed/llm-fact-quality-testing.ts --models new-model-id`:
   - The test pipeline automatically picks up the adapter's signoff guidance
   - Review the generated report for dimension scores
5. **Verify eligibility** — the model must meet per-dimension thresholds (97%+ structural, 90%+ subjective) before production use
## Eligibility Tracking

Models must demonstrate quality before production seeding. Eligibility is tracked in a JSONL file at `scripts/seed/.llm-test-data/eligibility.jsonl`.
### 7 Quality Dimensions

| Dimension | What it measures |
|---|---|
| `validation` | Facts pass the multi-phase validation pipeline |
| `evidence` | Facts corroborated by external evidence |
| `challenges` | Challenge content passes CQ rules (CQ-001 through CQ-013) |
| `schema_adherence` | Output conforms to topic-specific schemas |
| `voice_adherence` | Matches Eko voice constitution and taxonomy voice |
| `style_adherence` | Follows per-style rules (setup arc, reveal structure) |
| `token_efficiency` | Output stays within token budget (no verbose bloat) |
### Thresholds

Dimensions use tiered thresholds (`ELIGIBILITY_THRESHOLDS` in `registry.ts`):
| Tier | Dimensions | Threshold | Rationale |
|---|---|---|---|
| Structural | validation, evidence, challenges | 97% | Deterministic checks — failures are unambiguous |
| Subjective | schema, voice, style, token_efficiency | 90% | LLM-scored with 5-8pt variance per run — 90% captures genuine quality issues without penalizing reviewer noise |
A model that fails either tier is not production-eligible.
### JSONL Format

Each entry is a single JSON line:

```json
{"modelId":"gpt-5-mini","timestamp":"2026-02-21T10:30:00Z","dimensions":{"validation":98,"evidence":97,"challenges":99,"schema_adherence":100,"voice_adherence":98,"style_adherence":97,"token_efficiency":99},"eligible":true}
```
### API

- `loadEligibility()` — parse all entries from JSONL
- `saveEligibility(entry)` — append a new entry
- `isModelEligible(modelId)` — check latest entry for a model
- `getEligibleModels()` — list all currently eligible models
- `computeEligibility(dimensions)` — check if all dimensions meet threshold
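As a sketch of how the tiered check might look: the threshold values come from the table above, but the data shapes and the exact logic in `registry.ts` are assumptions here, not the real implementation.

```typescript
type Dimensions = Record<string, number>

// Tiered thresholds from the eligibility table: structural dimensions
// (deterministic checks) require 97%, subjective dimensions
// (LLM-scored, with per-run variance) require 90%.
const ELIGIBILITY_THRESHOLDS: Record<string, number> = {
  validation: 97,
  evidence: 97,
  challenges: 97,
  schema_adherence: 90,
  voice_adherence: 90,
  style_adherence: 90,
  token_efficiency: 90,
}

// A model is eligible only if every dimension clears its tier;
// a missing dimension counts as 0 and therefore fails.
function computeEligibility(dimensions: Dimensions): boolean {
  return Object.entries(ELIGIBILITY_THRESHOLDS).every(
    ([dim, threshold]) => (dimensions[dim] ?? 0) >= threshold,
  )
}

// The scores from the example JSONL entry above all clear their tiers.
computeEligibility({
  validation: 98, evidence: 97, challenges: 99,
  schema_adherence: 100, voice_adherence: 98,
  style_adherence: 97, token_efficiency: 99,
}) // → true
```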
## Integration Points

### Fact Extraction (`fact-engine.ts`)

```typescript
const { resolved, model, adapter } = await selectModelForTask('fact_extraction')
// ... build system prompt ...
const customization = adapter.getPromptCustomization('fact_extraction')
systemPrompt = applyPromptCustomization(systemPrompt, customization)
```
The same pattern applies to `scoreNotability()`, `generateEvergreenFacts()`, and `validateFact()`.

### Seed Explosion (`seed-explosion.ts`)

`explodeCategoryEntry()` retrieves the adapter for the active model and applies `seed_explosion` customizations.

### Challenge Content (`challenge-content.ts`)

`generateChallengeContent()` retrieves the adapter and applies `challenge_content_generation` customizations.

### Test Pipeline (`scripts/seed/llm-fact-quality-testing.ts`)

The test harness calls `adapter.getSignoffGuidance()` and feeds it to the quality-reviewer model, so the reviewer knows which model-specific weaknesses to look for.
## Current Adapters

| Model | Provider | Focus Areas | Known Weaknesses |
|---|---|---|---|
| `gpt-5-mini` | OpenAI | Active voice, cinematic titles, second-person address | Quiz-show register in titles, passive voice, drops "you/your" |
| `gemini-2.5-flash` | Google | Conversational tone, title length, narrative arc | Textbook/encyclopedia register, long titles (>80 chars), flat reveals |
| `gemini-3-flash-preview` | Google | Default model — fact generation, challenges, signoff review. Full adapter with voice, schema, and factual-accuracy guardrails | May inherit Gemini-family textbook register; titles occasionally short |
| `claude-haiku-4-5` | Anthropic | Conciseness, specificity, theatrical energy | Verbose context (>8 sentences), over-explains reveals |
## Validation Pipeline Models

The multi-phase validation pipeline (Phases 3 and 4c) uses Gemini 2.5 Flash directly (hardcoded `ResolvedModel` config, not adapter-driven):
| Phase | Purpose | Model | Cost |
|---|---|---|---|
| Phase 1: Structural | Schema, types, injection detection | None (code-only) | $0 |
| Phase 2: Consistency | Contradictions, taxonomy rules | None (code-only) | $0 |
| Phase 3: Cross-Model | AI adversarial verification | gemini-2.5-flash | ~$0.001 |
| Phase 4: Evidence | APIs + AI reasoner corroboration | gemini-2.5-flash (4c) | ~$0.002-0.005 |
## Key Files

| File | Purpose |
|---|---|
| `packages/ai/src/models/types.ts` | Interface definitions |
| `packages/ai/src/models/registry.ts` | Adapter lookup + eligibility tracking |
| `packages/ai/src/models/adapters/*.ts` | Individual model adapters |
| `packages/ai/src/fact-engine.ts` | `applyPromptCustomization()`, `selectModelForTask()` |
| `packages/ai/src/seed-explosion.ts` | Seed-explosion adapter integration |
| `packages/ai/src/challenge-content.ts` | Challenge-content adapter integration |
| `scripts/seed/llm-fact-quality-testing.ts` | Quality test pipeline with signoff guidance |
| `scripts/seed/.llm-test-data/eligibility.jsonl` | Eligibility tracking data |