Model Code Isolation

Per-model prompt optimization via the ModelAdapter strategy pattern. Each AI model gets a dedicated adapter that tailors prompts, generation parameters, and quality review guidance to exploit that model's strengths and mitigate its weaknesses.

Why Per-Model Optimization Matters

Different LLMs have distinct failure modes. GPT-5 Mini tends toward quiz-show register in challenge titles. Gemini 2.5 Flash gravitates to textbook/encyclopedia tone. Claude Haiku 4.5 can be verbose. Rather than tuning prompts to the lowest common denominator, the adapter pattern lets each model receive targeted corrections injected into the system prompt at generation time.

Directory Structure

packages/ai/src/models/
├── types.ts                     # ModelAdapter, PromptCustomization, AdaptableTask
├── registry.ts                  # getModelAdapter(), eligibility tracking (JSONL)
└── adapters/
    ├── _default.ts              # Null-object pass-through (no customization)
    ├── gpt-5-mini.ts            # OpenAI GPT-5 Mini
    ├── gemini-2.5-flash.ts      # Google Gemini 2.5 Flash
    ├── gemini-3-flash-preview.ts # Google Gemini 3 Flash Preview
    └── claude-haiku-4-5.ts      # Anthropic Claude Haiku 4.5

Interface Contract

ModelAdapter

interface ModelAdapter {
  modelId: string
  provider: string
  displayName: string
  getPromptCustomization(task: AdaptableTask): PromptCustomization
  getKnownWeaknesses(): string[]
  getSignoffGuidance(): string
}

PromptCustomization

interface PromptCustomization {
  systemPromptSuffix?: string    // Appended after base prompt (default mode)
  systemPromptPrefix?: string    // Prepended before base prompt
  systemPromptOverride?: string  // Replaces entire system prompt (use sparingly)
  temperature?: number           // Override default temperature
  maxRetries?: number            // Override default retry count
}
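To make the contract concrete, here is a minimal hypothetical adapter satisfying both interfaces. The types are restated inline so the sketch is self-contained; `createExampleAdapter`, its model id, and its guidance strings are illustrative, not a real adapter from the codebase.

```typescript
// AdaptableTask as a string union of the seven task names (assumed shape).
type AdaptableTask =
  | 'fact_extraction'
  | 'evergreen_generation'
  | 'seed_explosion'
  | 'challenge_content_generation'
  | 'notability_scoring'
  | 'fact_validation'
  | 'signoff_review'

interface PromptCustomization {
  systemPromptSuffix?: string
  systemPromptPrefix?: string
  systemPromptOverride?: string
  temperature?: number
  maxRetries?: number
}

interface ModelAdapter {
  modelId: string
  provider: string
  displayName: string
  getPromptCustomization(task: AdaptableTask): PromptCustomization
  getKnownWeaknesses(): string[]
  getSignoffGuidance(): string
}

// Hypothetical adapter that corrects one weakness via suffix mode and
// leaves all other tasks uncustomized.
function createExampleAdapter(): ModelAdapter {
  return {
    modelId: 'example-model',
    provider: 'example',
    displayName: 'Example Model',
    getPromptCustomization(task) {
      if (task === 'challenge_content_generation') {
        return { systemPromptSuffix: 'Avoid quiz-show phrasing in titles.' }
      }
      return {} // no customization for other tasks
    },
    getKnownWeaknesses() {
      return ['quiz-show register in challenge titles']
    },
    getSignoffGuidance() {
      return 'Check challenge titles for quiz-show phrasing.'
    },
  }
}
```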

AdaptableTask (7 tasks)

| Task                         | Consumer                                                  |
|------------------------------|-----------------------------------------------------------|
| fact_extraction              | fact-engine.ts — news story extraction                    |
| evergreen_generation         | fact-engine.ts — timeless knowledge facts                 |
| seed_explosion               | seed-explosion.ts — entity-to-facts expansion             |
| challenge_content_generation | challenge-content.ts — quiz/recall content                |
| notability_scoring           | fact-engine.ts — fact importance scoring                  |
| fact_validation              | fact-engine.ts — AI plausibility check                    |
| signoff_review               | llm-fact-quality-testing.ts, news-ingestion-test.ts — quality signoff (dedicated: gemini-3-flash-preview) |

Three Customization Modes

Priority order (highest wins):

  1. Override — systemPromptOverride replaces the entire system prompt. Use sparingly for models that need fundamentally different prompting.
  2. Prefix — systemPromptPrefix is prepended before the base prompt. Useful for setting context before instructions.
  3. Suffix (default) — systemPromptSuffix is appended after the base prompt. Most adapters use this mode to add model-specific corrections at the end.

Applied by applyPromptCustomization() in packages/ai/src/fact-engine.ts:

function applyPromptCustomization(basePrompt: string, customization: PromptCustomization): string {
  if (customization.systemPromptOverride) return customization.systemPromptOverride
  const prefix = customization.systemPromptPrefix ?? ''
  const suffix = customization.systemPromptSuffix ?? ''
  return `${prefix}${prefix ? '\n\n' : ''}${basePrompt}${suffix ? '\n\n' : ''}${suffix}`
}
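A quick sketch of how the three modes resolve. The function body repeats the logic shown above so the example is self-contained; the base prompt and customization strings are illustrative.

```typescript
interface PromptCustomization {
  systemPromptSuffix?: string
  systemPromptPrefix?: string
  systemPromptOverride?: string
}

// Same resolution logic as applyPromptCustomization in fact-engine.ts.
function applyPromptCustomization(basePrompt: string, customization: PromptCustomization): string {
  if (customization.systemPromptOverride) return customization.systemPromptOverride
  const prefix = customization.systemPromptPrefix ?? ''
  const suffix = customization.systemPromptSuffix ?? ''
  return `${prefix}${prefix ? '\n\n' : ''}${basePrompt}${suffix ? '\n\n' : ''}${suffix}`
}

const base = 'You extract facts.'

applyPromptCustomization(base, { systemPromptSuffix: 'Use active voice.' })
// → 'You extract facts.\n\nUse active voice.'

applyPromptCustomization(base, { systemPromptPrefix: 'Context: daily news.' })
// → 'Context: daily news.\n\nYou extract facts.'

applyPromptCustomization(base, { systemPromptOverride: 'Entirely new prompt.' })
// → 'Entirely new prompt.'
```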

Adapter Registry

getModelAdapter(modelId, provider?) in registry.ts:

  • Looks up the model ID in ADAPTER_FACTORIES (a Record<string, () => ModelAdapter>)
  • If found, calls the factory to instantiate the adapter
  • If not found, returns a default pass-through adapter (null-object pattern)
  • All adapters are cached in a Map<string, ModelAdapter> after first instantiation

Helper functions:

  • hasModelAdapter(modelId) — checks if a dedicated adapter exists
  • listAdaptedModels() — returns all model IDs with dedicated adapters
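The lookup, caching, and null-object fallback described above can be sketched as follows. Function names mirror the registry description, but the bodies are illustrative, not the real registry.ts; the `ModelAdapter` shape is trimmed to the members used here.

```typescript
interface ModelAdapter {
  modelId: string
  getPromptCustomization(task: string): Record<string, unknown>
  getKnownWeaknesses(): string[]
  getSignoffGuidance(): string
}

// Null-object pattern: a pass-through adapter that customizes nothing.
function createDefaultAdapter(modelId: string): ModelAdapter {
  return {
    modelId,
    getPromptCustomization: () => ({}),
    getKnownWeaknesses: () => [],
    getSignoffGuidance: () => '',
  }
}

const ADAPTER_FACTORIES: Record<string, () => ModelAdapter> = {
  // 'gpt-5-mini': createGpt5MiniAdapter, ...
}

const adapterCache = new Map<string, ModelAdapter>()

function getModelAdapter(modelId: string): ModelAdapter {
  const cached = adapterCache.get(modelId)
  if (cached) return cached
  // Dedicated factory if registered, otherwise the pass-through default.
  const factory = ADAPTER_FACTORIES[modelId]
  const adapter = factory ? factory() : createDefaultAdapter(modelId)
  adapterCache.set(modelId, adapter)
  return adapter
}

function hasModelAdapter(modelId: string): boolean {
  return modelId in ADAPTER_FACTORIES
}

function listAdaptedModels(): string[] {
  return Object.keys(ADAPTER_FACTORIES)
}
```

The cache means each adapter is instantiated once per process, and unknown model IDs are always safe to pass in.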

Adding a New Model

  1. Create adapter file at packages/ai/src/models/adapters/{model-id}.ts

    • Export a create{ModelName}Adapter() factory function
    • Implement getPromptCustomization() for each relevant task
    • Document known weaknesses and signoff guidance
  2. Register the factory in registry.ts:

    import { createNewModelAdapter } from './adapters/new-model'
    const ADAPTER_FACTORIES: Record<string, () => ModelAdapter> = {
      // ... existing adapters
      'new-model-id': createNewModelAdapter,
    }
    
  3. Add to model registry in packages/config/src/model-registry.ts (if not already registered)

  4. Run quality tests via bun scripts/seed/llm-fact-quality-testing.ts --models new-model-id

    • The test pipeline automatically picks up the adapter's signoff guidance
    • Review the generated report for dimension scores
  5. Verify eligibility — the model must meet per-dimension thresholds (97%+ structural, 90%+ subjective) before production use

Eligibility Tracking

Models must demonstrate quality before production seeding. Eligibility is tracked in a JSONL file at scripts/seed/.llm-test-data/eligibility.jsonl.

7 Quality Dimensions

| Dimension        | What it measures                                          |
|------------------|-----------------------------------------------------------|
| validation       | Facts pass multi-phase validation pipeline                |
| evidence         | Facts corroborated by external evidence                   |
| challenges       | Challenge content passes CQ rules (CQ-001 through CQ-013) |
| schema_adherence | Output conforms to topic-specific schemas                 |
| voice_adherence  | Matches Eko voice constitution and taxonomy voice         |
| style_adherence  | Follows per-style rules (setup arc, reveal structure)     |
| token_efficiency | Output stays within token budget (no verbose bloat)       |

Thresholds

Dimensions use tiered thresholds (ELIGIBILITY_THRESHOLDS in registry.ts):

| Tier       | Dimensions                             | Threshold | Rationale |
|------------|----------------------------------------|-----------|-----------|
| Structural | validation, evidence, challenges       | 97%       | Deterministic checks — failures are unambiguous |
| Subjective | schema, voice, style, token_efficiency | 90%       | LLM-scored with 5-8pt variance per run — 90% captures genuine quality issues without penalizing reviewer noise |

A model that fails either tier is not production-eligible.
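The tiered check can be sketched as below. The threshold values follow the table above; the actual ELIGIBILITY_THRESHOLDS constant lives in registry.ts and may differ in shape.

```typescript
// Per-dimension thresholds (percentages), one entry per quality dimension.
const ELIGIBILITY_THRESHOLDS: Record<string, number> = {
  // Structural tier: deterministic checks.
  validation: 97,
  evidence: 97,
  challenges: 97,
  // Subjective tier: LLM-scored, tolerant of reviewer noise.
  schema_adherence: 90,
  voice_adherence: 90,
  style_adherence: 90,
  token_efficiency: 90,
}

function computeEligibility(dimensions: Record<string, number>): boolean {
  // Every dimension must meet its threshold; a single miss fails the model.
  return Object.entries(ELIGIBILITY_THRESHOLDS).every(
    ([dimension, threshold]) => (dimensions[dimension] ?? 0) >= threshold,
  )
}
```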

JSONL Format

Each entry is a single JSON line:

{"modelId":"gpt-5-mini","timestamp":"2026-02-21T10:30:00Z","dimensions":{"validation":98,"evidence":97,"challenges":99,"schema_adherence":100,"voice_adherence":98,"style_adherence":97,"token_efficiency":99},"eligible":true}

API

  • loadEligibility() — parse all entries from JSONL
  • saveEligibility(entry) — append a new entry
  • isModelEligible(modelId) — check latest entry for a model
  • getEligibleModels() — list all currently eligible models
  • computeEligibility(dimensions) — check if all dimensions meet threshold
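A sketch of the read-side helpers, operating on the JSONL file contents as a string so file I/O stays out of the example. Field names follow the entry format shown above; signatures here are illustrative and may differ from the real API.

```typescript
interface EligibilityEntry {
  modelId: string
  timestamp: string
  dimensions: Record<string, number>
  eligible: boolean
}

// Parse one JSON object per non-empty line.
function loadEligibility(jsonl: string): EligibilityEntry[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as EligibilityEntry)
}

// Entries are append-only, so the latest verdict for a model is the last
// matching line; scan from the end.
function isModelEligible(entries: EligibilityEntry[], modelId: string): boolean {
  for (let i = entries.length - 1; i >= 0; i--) {
    if (entries[i].modelId === modelId) return entries[i].eligible
  }
  return false
}

function getEligibleModels(entries: EligibilityEntry[]): string[] {
  const seen = new Set(entries.map((e) => e.modelId))
  return [...seen].filter((id) => isModelEligible(entries, id))
}
```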

Integration Points

Fact Extraction (fact-engine.ts)

const { resolved, model, adapter } = await selectModelForTask('fact_extraction')
// ... build system prompt ...
const customization = adapter.getPromptCustomization('fact_extraction')
systemPrompt = applyPromptCustomization(systemPrompt, customization)

The same pattern applies to scoreNotability(), generateEvergreenFacts(), and validateFact().

Seed Explosion (seed-explosion.ts)

explodeCategoryEntry() retrieves the adapter for the active model and applies seed_explosion customizations.

Challenge Content (challenge-content.ts)

generateChallengeContent() retrieves the adapter and applies challenge_content_generation customizations.

Test Pipeline (scripts/seed/llm-fact-quality-testing.ts)

The test harness calls adapter.getSignoffGuidance() and feeds it to the quality reviewer model, so the reviewer knows what model-specific weaknesses to look for.
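A minimal sketch of that wiring, assuming the guidance is injected as a labeled section of the reviewer prompt; `buildReviewerPrompt` and the section heading are hypothetical, and the real assembly lives in llm-fact-quality-testing.ts.

```typescript
// Append a model's signoff guidance to the reviewer's base prompt, if any.
function buildReviewerPrompt(
  basePrompt: string,
  adapter: { getSignoffGuidance(): string },
): string {
  const guidance = adapter.getSignoffGuidance()
  return guidance
    ? `${basePrompt}\n\nModel-specific weaknesses to check:\n${guidance}`
    : basePrompt
}
```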

Current Adapters

| Model | Provider | Focus Areas | Known Weaknesses |
|-------|----------|-------------|------------------|
| gpt-5-mini | OpenAI | Active voice, cinematic titles, second-person address | Quiz-show register in titles, passive voice, drops "you/your" |
| gemini-2.5-flash | Google | Conversational tone, title length, narrative arc | Textbook/encyclopedia register, long titles (>80 chars), flat reveals |
| gemini-3-flash-preview | Google | Default model — fact generation, challenges, signoff review. Full adapter with voice, schema, factual accuracy guardrails | May inherit Gemini-family textbook register, titles occasionally short |
| claude-haiku-4-5 | Anthropic | Conciseness, specificity, theatrical energy | Verbose context (>8 sentences), over-explains reveals |

Validation Pipeline Models

The multi-phase validation pipeline (Phases 3 and 4c) uses Gemini 2.5 Flash directly (hardcoded ResolvedModel config, not adapter-driven):

| Phase | Purpose | Model | Cost |
|-------|---------|-------|------|
| Phase 1: Structural | Schema, types, injection detection | None (code-only) | $0 |
| Phase 2: Consistency | Contradictions, taxonomy rules | None (code-only) | $0 |
| Phase 3: Cross-Model | AI adversarial verification | gemini-2.5-flash | ~$0.001 |
| Phase 4: Evidence | APIs + AI reasoner corroboration | gemini-2.5-flash (4c) | ~$0.002-0.005 |

Key Files

| File | Purpose |
|------|---------|
| packages/ai/src/models/types.ts | Interface definitions |
| packages/ai/src/models/registry.ts | Adapter lookup + eligibility tracking |
| packages/ai/src/models/adapters/*.ts | Individual model adapters |
| packages/ai/src/fact-engine.ts | applyPromptCustomization(), selectModelForTask() |
| packages/ai/src/seed-explosion.ts | Seed explosion adapter integration |
| packages/ai/src/challenge-content.ts | Challenge content adapter integration |
| scripts/seed/llm-fact-quality-testing.ts | Quality test pipeline with signoff guidance |
| scripts/seed/.llm-test-data/eligibility.jsonl | Eligibility tracking data |