Fact & Taxonomy Schema Map

Reference map of the fact and taxonomy schemas, rules, and voice-related files and functions, each listed with its intended targets.

1. Database Schema (Drizzle ORM)

File: packages/db/src/drizzle/schema.ts

Table / Enum	Purpose	Targets
`topicCategories`	Hierarchical taxonomy tree (slug, name, depth 0-3, path, dailyQuota, percentTarget, `deprecated_at`, `replaced_by`, `deprecation_reason`). Depth levels: 0=root, 1=primary, 2=secondary, 3=leaf. `MAX_TAXONOMY_DEPTH = 5`.	Feed filtering, quota cron, AI extraction routing
`topicCategoryAliases`	Maps external provider slugs to canonical topics	`resolveTopicCategory()` in ingestion pipeline
`unmappedCategoryLog`	Logs categories that couldn't be resolved	Admin dashboard analysis
`factRecordSchemas`	Per-topic fact shape definitions (`factKeys` JSONB, `cardFormats` array)	`buildFactObjectSchema()` in AI extraction, validation
`factRecords`	Core fact storage (`facts` JSONB, title, context, notabilityScore, status, validation)	Feed API, card detail, challenge generation
`factChallengeContent`	Pre-generated challenge text per style/difficulty. Includes `challenge_title` for display.	Card UI rendering, challenge sessions
`challengeFormats`	8 named formats (big_fan_of, know_a_lot_about, etc.) with knowledgeType, tone	Challenge session selection, format eligibility
`challengeFormatStyles`	Format to eligible style mapping	Challenge content generation
`challengeFormatTopics`	Format to eligible topic mapping	Topic-scoped format selection
`challengeSessions`	Active challenge game state with conversation history	Conversational challenge flow
`cardInteractions`	User answer/view/bookmark tracking	Spaced repetition, engagement metrics
`aiCostTracking`	Per-model, per-feature daily cost aggregation	Budget monitoring, model routing decisions

Enums: factRecordStatusEnum, cardFormatEnum, challengeStyleEnum, challengeFormatSlugEnum, triviaDifficultyEnum, validationStrategyEnum, interactionTypeEnum, aiModelEnum, newsProviderEnum, storyStatusEnum, useCaseAudienceEnum, domainSourceEnum, domainVerificationStatusEnum, pageTypeEnum, systemNotificationTypeEnum, brandPageTypeEnum, brandConfidenceLevelEnum

2. Zod Validation Schemas

File: packages/shared/src/schemas.ts

Schema	Purpose	Targets
`FactKeySchema`	Validates fact key definitions (key, label, type, required)	`factRecordSchemas.factKeys` column
`CardFormatSchema` / `DifficultySchema`	Enum validation for card formats and difficulty	Queue messages, API inputs
`ChallengeStyleSchema` / `ChallengeFormatSlugSchema`	Enum validation for styles and named formats	Challenge generation, session creation
`KnowledgeTypeSchema`	fandom, skill, memory, temporal, association, career, visual, creator	`challengeFormats.knowledgeType`
`ValidationResultSchema`	Single-phase validation output (strategy, sources, confidence, flags)	`factRecords.validation` JSONB column
`MultiPhaseValidationResultSchema`	Full pipeline result with phase array	`factRecords.validation` JSONB column
`FactDetailSchema`	Public-facing fact detail with topic, image, validation	Web app card detail API
`CardVariationSchema`	Computed card with question, visible/hidden facts, options	Card UI rendering
`ChallengeSessionSchema` / `ConversationMessageSchema`	Challenge game state and dialogue turns	Conversational challenge flow

Queue Message Schemas:

Schema	Targets
`ExtractFactsMessageSchema`	`worker-facts/handlers/extract-facts.ts`
`ImportFactsMessageSchema`	`worker-facts/handlers/import-facts.ts`
`ValidateFactMessageSchema`	`worker-validate/handlers/validate-fact.ts`
`GenerateEvergreenMessageSchema`	`worker-facts/handlers/generate-evergreen.ts`
`ExplodeCategoryEntryMessageSchema`	`worker-facts/handlers/explode-entry.ts`
`FindSuperFactsMessageSchema`	`worker-facts/handlers/find-super-facts.ts`
`GenerateChallengeContentMessageSchema`	`worker-facts/handlers/generate-challenge-content.ts`
`ResolveImageMessageSchema`	`worker-ingest/handlers/resolve-image.ts`
`IngestNewsMessageSchema`	`worker-ingest/handlers/ingest-news.ts`
`ClusterStoriesMessageSchema`	`worker-ingest/handlers/cluster-stories.ts`
`ResolveChallengeImageMessageSchema`	`worker-ingest/handlers/resolve-challenge-image.ts`

3. AI Type Definitions

Fact Engine

File: packages/ai/src/fact-engine.ts

Type / Function	Purpose	Targets
`FactEngineTask`	Union of all AI task names (14 tasks)	Model routing via `selectModelForTask()`
`ExtractFactsInput` / `ExtractFactsResult`	News story to structured facts	`worker-facts/handlers/extract-facts.ts`
`ScoreNotabilityInput` / `ScoreNotabilityResult`	Notability scoring (0-1)	Fact insertion threshold (>= 0.6)
`ValidateFactInput` / `ValidateFactResult`	AI cross-check validation	`worker-validate/handlers/validate-fact.ts`
`GenerateEvergreenInput` / `GenerateEvergreenResult`	Evergreen fact generation	`worker-facts/handlers/generate-evergreen.ts`
`extractFactsFromStory()`	Orchestrates extraction with tone injection	Called by extract-facts handler
`selectModelForTask()`	Tier-based model routing	All AI calls
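
The notability gate noted in the table (scores in the 0-1 range, insertion at >= 0.6) could look roughly like the sketch below. The `ScoredFact` shape and the `passesNotabilityGate` helper are hypothetical names for illustration; only the range and the threshold come from this document.

```typescript
// Hypothetical helper: only the 0-1 range and the 0.6 threshold are from the table above.
const NOTABILITY_THRESHOLD = 0.6;

interface ScoredFact {
  title: string;
  notabilityScore: number;
}

// Returns true when the score is a valid 0-1 value at or above the threshold.
function passesNotabilityGate(
  fact: ScoredFact,
  threshold: number = NOTABILITY_THRESHOLD,
): boolean {
  const score = fact.notabilityScore;
  // Reject NaN, Infinity, and out-of-range scores instead of clamping them.
  if (!Number.isFinite(score) || score < 0 || score > 1) return false;
  return score >= threshold;
}
```

Rejecting out-of-range scores outright, rather than clamping them, is a design choice of this sketch: a score outside 0-1 more likely signals a malformed model response than a very notable fact.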

Seed Explosion

File: packages/ai/src/seed-explosion.ts

Type / Function	Purpose	Targets
`ExplodeCategoryEntryInput` / `ExplodeCategoryEntryResult`	Seed entry to many facts + spinoffs	`worker-facts/handlers/explode-entry.ts`
`SuperFactCandidate` / `FindSuperFactsResult`	Cross-entry correlation discovery	`worker-facts/handlers/find-super-facts.ts`

Schema Utilities

File: packages/ai/src/schema-utils.ts

Function	Purpose	Targets
`buildFactObjectSchema()`	Converts `fact_record_schemas.fact_keys` to concrete `z.object()`	AI extraction prompts, validation
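
To illustrate the idea of turning per-topic fact key definitions into a concrete object validator, here is a dependency-free sketch. It is not the actual `buildFactObjectSchema()` implementation, which produces a `z.object()`. The field names (`key`, `label`, `type`, `required`) follow `FactKeySchema`; the allowed `type` values and the `buildFactValidator` name are assumptions.

```typescript
// Assumption: fact keys use primitive type names; the real set may differ.
type FactKeyType = "string" | "number" | "boolean";

interface FactKey {
  key: string;
  label: string;
  type: FactKeyType;
  required: boolean;
}

type ValidationOutcome =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; errors: string[] };

// Builds a validator closure from fact key definitions, similar in spirit to
// deriving an object schema from the same definitions.
function buildFactValidator(factKeys: FactKey[]) {
  return (input: Record<string, unknown>): ValidationOutcome => {
    const errors: string[] = [];
    const value: Record<string, unknown> = {};
    for (const def of factKeys) {
      const raw = input[def.key];
      if (raw === undefined || raw === null) {
        if (def.required) errors.push(`${def.label} (${def.key}) is required`);
        continue;
      }
      if (typeof raw !== def.type) {
        errors.push(`${def.key} must be a ${def.type}`);
        continue;
      }
      // Only declared keys are copied, so unknown keys are dropped.
      value[def.key] = raw;
    }
    return errors.length === 0 ? { ok: true, value } : { ok: false, errors };
  };
}
```

A validator built from `[{ key: "year", label: "Year", type: "number", required: true }]` would accept `{ year: 1969 }` and reject `{ year: "1969" }` with a single error.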

Model Adapters

File: packages/ai/src/models/types.ts

Type	Purpose	Targets
`ModelAdapter`	Model-specific prompt customization interface	Per-model tone adjustments in generation
`PromptCustomization`	systemPromptPrefix/Suffix/Override, temperature	`applyPromptCustomization()` in fact-engine.ts

4. Validation Pipeline

File	Phase	Cost	Targets
`packages/ai/src/validation/structural.ts`	Phase 1: Required keys, types, injection detection	$0	All new fact records
`packages/ai/src/validation/consistency.ts`	Phase 2: Numeric ordering, temporal logic, coherence	$0	All facts passing Phase 1
`packages/ai/src/validation/cross-model.ts`	Phase 3: Adversarial AI verification (Gemini 2.5 Flash)	~$0.001	Facts needing higher confidence
`packages/ai/src/validation/evidence.ts`	Phase 4: API + reasoner evidence corroboration	~$0.002-0.005	Facts needing external verification
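
The cost ordering in the table suggests running the free phases first and stopping as soon as one fails, so that paid phases only see facts that survived the cheap checks. The runner below is a minimal sketch of that idea. The `Phase` and `PhaseResult` shapes and the `runPhases` name are assumptions, and the sketch is synchronous for brevity, whereas phases that call a model or an external API would presumably be asynchronous.

```typescript
// Phase names mirror the table above; everything else here is hypothetical.
type PhaseName = "structural" | "consistency" | "cross-model" | "evidence";

interface PhaseResult {
  phase: PhaseName;
  passed: boolean;
  reason?: string;
}

interface Phase {
  name: PhaseName;
  run: (fact: Record<string, unknown>) => PhaseResult;
}

interface PipelineResult {
  passed: boolean;
  phases: PhaseResult[];
}

// Runs phases in the given order and stops at the first failure, so later
// (more expensive) phases are never invoked for a fact that already failed.
function runPhases(fact: Record<string, unknown>, phases: Phase[]): PipelineResult {
  const results: PhaseResult[] = [];
  for (const phase of phases) {
    const result = phase.run(fact);
    results.push(result);
    if (!result.passed) return { passed: false, phases: results };
  }
  return { passed: true, phases: results };
}
```

Keeping every phase result in the returned array matches the phase-array result shape described for `MultiPhaseValidationResultSchema`, though the exact fields stored there are not shown in this document.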

5. Voice & Tone Constants

Challenge Content Rules

File: packages/ai/src/challenge-content-rules.ts

Constant / Function	Purpose	Targets
`CHALLENGE_TONE_PREFIX`	Universal 3-line Eko voice identity	Injected into all user-facing AI prompts (fact-engine.ts, challenge-content.ts)
`CHALLENGE_VOICE_CONSTITUTION`	70+ line detailed voice principles, forbidden patterns, anti-pattern words	`buildSystemPrompt()` in challenge-content.ts
`STYLE_RULES`	Per-style structural requirements (6 styles x 5 field rules each)	Challenge content generation prompts
`STYLE_VOICE`	Per-style personality definitions (personality, register, phrasing examples, reveal tone)	Challenge content generation prompts
`FORMAT_VOICE`	Per-format personality definitions (8 formats: personality, register, energy, excitement driver, reveal tone, emphasis)	Challenge content generation prompts (layer 3.7)
`FORMAT_RULES`	Per-format structural refinements (8 formats: setup, challenge, reveal, correct_answer, style_data)	Challenge content generation prompts (layer 3.7)
`DIFFICULTY_LEVELS`	Per-difficulty level definitions (5 levels with cognitive demand descriptors)	Difficulty guidance in generation prompts
`SUPER_FACT_RULES`	Guidance for cross-entry super-fact challenge content	Super-fact challenge generation
`PRE_GENERATED_STYLES`	List of 6 styles pre-generated at content creation time	Challenge content batch generation
`BANNED_PATTERNS`	Regex list for anti-pattern word detection	`validateChallengeContent()`
`CQ002_REGEX`	Second-person enforcement regex	`patchCq002()` post-processing
`validateChallengeContent()`	Enforces all CQ-001 through CQ-013 rules	Generation-time + upload-time validation
`patchCq002()`	Post-generation second-person pronoun enforcement	Challenge content post-processing
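
As a rough illustration of how a regex list and a second-person check might feed a content validator, consider the sketch below. The patterns are hypothetical stand-ins: the actual contents of `BANNED_PATTERNS` and `CQ002_REGEX` are not shown in this document, and `checkChallengeText` is not the real `validateChallengeContent()`.

```typescript
// Hypothetical stand-ins for BANNED_PATTERNS and CQ002_REGEX.
const EXAMPLE_BANNED_PATTERNS: RegExp[] = [/\bdid you know\b/i, /\bfun fact\b/i];
const EXAMPLE_SECOND_PERSON = /\b(you|your|yours|you're)\b/i;

interface ContentIssue {
  rule: string;
  detail: string;
}

// Collects every issue instead of failing fast, so a caller can report
// all problems for a piece of content in one pass.
function checkChallengeText(text: string): ContentIssue[] {
  const issues: ContentIssue[] = [];
  for (const pattern of EXAMPLE_BANNED_PATTERNS) {
    // No `g` flag on the patterns, so test() carries no state between calls.
    if (pattern.test(text)) {
      issues.push({ rule: "banned-pattern", detail: pattern.source });
    }
  }
  if (!EXAMPLE_SECOND_PERSON.test(text)) {
    issues.push({ rule: "second-person", detail: "no second-person pronoun found" });
  }
  return issues;
}
```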

Taxonomy Content Rules

File: packages/ai/src/taxonomy-content-rules.ts

Constant / Function	Purpose	Targets
`TAXONOMY_CONTENT_RULES`	Per-taxonomy extraction guidance, challenge guidance, anti-patterns, domain vocabulary (32 taxonomies)	AI extraction prompts in `extractFactsFromStory()`, seed explosion, evergreen generation
`TAXONOMY_VOICE`	Per-taxonomy emotional register, energy, excitement driver, pitfalls	Injected into challenge-content.ts `buildSystemPrompt()`, seed explosion, evergreen generation
`formatVocabularyForPrompt()`	Serializes `TaxonomyVocabulary` into concise prompt block (~120-200 tokens)	All 4 prompt injection sites (DRY helper)

6. Rules Documentation

File	Purpose	Targets
`docs/marketing/CHALLENGE_TONE.md`	Canonical voice specification (source of truth)	All voice constants in code
`docs/rules/challenge-content.md`	System rules CC-001 through CC-011, quality rules CQ-001 through CQ-013	`validateChallengeContent()`, AI prompt instructions
`docs/rules/README.md`	Full rules index with enforcement and ownership	Developer reference

7. Database Query Functions

File: packages/db/src/drizzle/fact-engine-queries.ts

Function	Purpose	Targets
`getTopicCategoryBySlug()` / `getTopicCategoryById()`	Topic lookup	Feed, extraction routing
`resolveTopicCategory()`	Alias-aware category resolution with inactive-category fallback via `resolveActiveCategory()`	Ingestion pipeline (story to topic)
`resolveActiveCategory()`	Follows deprecation replacement chains (max 5 hops) to find the current active successor	Inactive-category fallback in `resolveTopicCategory()`
`getDescendantCategoriesForParent()`	Recursive 3-level descendant loading for a parent category	Taxonomy tree views, admin dashboard
`getActiveTopicCategoriesWithSchemas()`	Taxonomy tree with schemas; supports `maxDepth` filtering	Admin dashboard, evergreen generation
`getFactRecordSchemaByTopicId()`	Schema lookup for a topic	AI extraction (build dynamic Zod schema)
`insertFactRecord()`	Create fact record	Extract + import handlers
`updateFactRecordStatus()`	Set validated/rejected with validation JSONB	Validation handler
`getPublishedFacts()` / `getFactsByTopic()`	Paginated feed queries	Web app feed API
`getFactDetail()`	Full fact with relations	Card detail page
`getTopicFactCountToday()`	Quota monitoring	Topic-quotas cron route
`archiveExpiredFactRecords()`	30-day news fact expiry	Archival cron
`promoteHighEngagementFacts()`	Promote popular facts by engagement	Promotion cron
`recordInteraction()` / `getUserInteractions()`	Card interaction tracking	Spaced repetition, engagement metrics
`getLatestAnswerInteraction()` / `getReviewDueFacts()`	Spaced repetition scheduling	Card review flow
`addBookmark()` / `removeBookmark()` / `getUserBookmarks()`	User bookmark management	Card detail, bookmarks page
`createIngestionRun()` / `updateIngestionRun()`	Pipeline observability	Ingestion cron routes
`createStory()` / `updateStory()` / `getStoryWithSources()`	Story lifecycle	Ingestion pipeline
`insertNewsSources()` / `getNewsSourcesByIds()`	News source management	Ingestion pipeline
`findStoriesByTitleSimilarity()`	Story deduplication	Clustering handler
`getExistingTitlesForTopic()`	Fact deduplication	Extract + import handlers
`getUnclusteredNewsSources()`	Unclustered article discovery	Clustering cron
`getStuckPendingValidations()`	Stuck validation recovery	Validation monitor cron
`createChallengeSession()` / `getChallengeSession()` / `updateChallengeSession()`	Challenge session lifecycle	Conversational challenge flow
`getActiveChallengeFormats()` / `getChallengeFormatBySlug()`	Format lookup	Challenge session creation
`getFormatsForTopic()` / `getPairedFormat()`	Topic-scoped format selection	Challenge format eligibility
`getActiveSessionCount()` / `abandonStaleSessions()`	Session rate limiting and cleanup	Challenge session management
`createScoreDispute()` / `getScoreDispute()` / `getUserMonthlyDisputeCount()`	Score dispute lifecycle	Dispute flow, rate limiting
`updateInteractionScore()`	Score adjustment post-dispute	Dispute resolution
`getAvailableMilestones()` / `claimMilestone()`	Reward milestone system	Gamification
`getUserFreeTextCountToday()`	Free-text answer rate limiting	Subscription gating
`createTrialSubscription()` / `getPlanTrialDays()`	Trial subscription management	Onboarding
`getUserCumulativeScore()`	Cumulative score aggregation	Profile, leaderboard
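
The replacement-chain walk described for `resolveActiveCategory()` (at most 5 hops) can be sketched in memory as follows. This is an illustration under assumptions, not the actual query: it assumes `replaced_by` holds the successor's id, uses a `Map` in place of database lookups, and returns `null` when no active successor is found. All names here are hypothetical.

```typescript
// Hypothetical in-memory shape; the real rows live in `topicCategories`.
interface CategoryNode {
  id: string;
  slug: string;
  deprecatedAt: Date | null;
  replacedBy: string | null; // assumption: id of the successor category
}

const MAX_REPLACEMENT_HOPS = 5;

// Follows replacedBy links until an active (non-deprecated) category is found.
// The hop cap also guarantees termination if the chain contains a cycle.
function resolveActive(
  start: CategoryNode,
  byId: Map<string, CategoryNode>,
  maxHops: number = MAX_REPLACEMENT_HOPS,
): CategoryNode | null {
  let current = start;
  for (let hops = 0; hops <= maxHops; hops++) {
    if (current.deprecatedAt === null) return current;
    if (hops === maxHops || current.replacedBy === null) return null;
    const next = byId.get(current.replacedBy);
    if (next === undefined) return null;
    current = next;
  }
  return null;
}
```

With this loop, a category that is already active is returned after zero hops, and a chain that needs a sixth hop yields `null` rather than continuing.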

8. Worker Handlers

File	Queue Message	Consumes From
`apps/worker-facts/src/handlers/extract-facts.ts`	`EXTRACT_FACTS`	fact-engine.ts `extractFactsFromStory()`
`apps/worker-facts/src/handlers/generate-evergreen.ts`	`GENERATE_EVERGREEN`	fact-engine.ts `generateEvergreenFacts()`
`apps/worker-facts/src/handlers/import-facts.ts`	`IMPORT_FACTS`	Bulk structured imports (ESPN, WikiQuote, etc.)
`apps/worker-facts/src/handlers/explode-entry.ts`	`EXPLODE_CATEGORY_ENTRY`	seed-explosion.ts `explodeCategoryEntry()`
`apps/worker-facts/src/handlers/find-super-facts.ts`	`FIND_SUPER_FACTS`	seed-explosion.ts `findSuperFacts()`
`apps/worker-facts/src/handlers/generate-challenge-content.ts`	`GENERATE_CHALLENGE_CONTENT`	challenge-content.ts
`apps/worker-validate/src/handlers/validate-fact.ts`	`VALIDATE_FACT`	4-phase validation pipeline
`apps/worker-ingest/src/handlers/ingest-news.ts`	`INGEST_NEWS`	News API provider fetch
`apps/worker-ingest/src/handlers/cluster-stories.ts`	`CLUSTER_STORIES`	Story deduplication and clustering
`apps/worker-ingest/src/handlers/resolve-image.ts`	`RESOLVE_IMAGE`	Image resolution cascade (Wikipedia, SportsDB, Unsplash, Pexels)
`apps/worker-ingest/src/handlers/resolve-challenge-image.ts`	`RESOLVE_CHALLENGE_IMAGE`	Anti-spoiler image resolution for challenges where the fact image reveals the answer

Prompt Layering Hierarchy

The order in which voice controls are injected into AI generation prompts:

1.   CHALLENGE_TONE_PREFIX          <- Universal Eko identity (3 lines)
2.   CHALLENGE_VOICE_CONSTITUTION   <- Detailed principles & forbidden patterns
3.   TAXONOMY_VOICE[slug]           <- Domain register & energy (e.g. "Nature documentary narrator")
3.5  DOMAIN_VOCABULARY[slug]        <- Domain terminology & expert phrasing (via formatVocabularyForPrompt)
3.7  FORMAT_VOICE[formatSlug]       <- Format posture & energy (e.g. "enthusiastic superfan")
     FORMAT_RULES[formatSlug]       <- Format structural refinements (setup/challenge/reveal/answer)
4.   STYLE_VOICE[style]             <- Interaction mechanics (e.g. "curious gallery guide")
5.   STYLE_RULES[style]             <- Structural field requirements
6.   TAXONOMY_CONTENT_RULES[slug]   <- Domain extraction/challenge guidance
7.   Difficulty guidance             <- Per-level prompt adjustments
8.   Super-fact rules (if applicable)
9.   Generation instructions
10.  ModelAdapter customization     <- Per-model prefix/suffix/temperature

Each layer adds specificity without overriding the previous one. The universal tone sets the floor, taxonomy voice sets the domain energy, domain vocabulary teaches expert-level language, format voice sets the challenge format posture, and style voice sets the interaction mechanics.
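
The layering above can be sketched as an ordered assembly step. This is a minimal illustration, not the actual `buildSystemPrompt()`: the `PromptLayer` shape and `assemblePrompt` name are assumptions, and layer 10 (`ModelAdapter` customization) is left out because it adjusts the prompt through prefix, suffix, override, and temperature settings rather than contributing one more text block.

```typescript
// Hypothetical layer shape; `order` uses the numbering from the hierarchy (e.g. 3.5, 3.7).
interface PromptLayer {
  order: number;
  label: string;
  text: string | undefined; // undefined when the layer does not apply
}

// Sorts layers by their position in the hierarchy, skips empty or
// inapplicable ones, and joins the rest with blank lines.
function assemblePrompt(layers: PromptLayer[]): string {
  const parts: string[] = [];
  // Array.prototype.sort is stable, so layers sharing an order value
  // (FORMAT_VOICE and FORMAT_RULES at 3.7) keep their input order.
  for (const layer of [...layers].sort((a, b) => a.order - b.order)) {
    const text = layer.text?.trim();
    if (text) parts.push(text);
  }
  return parts.join("\n\n");
}
```

Because layers are appended rather than merged, each one adds to the prompt without rewriting earlier layers, which matches the "adds specificity without overriding" behavior described above.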

Vocabulary is auto-generated when adding new categories via `bun run taxonomy:add` (see `generateVocabulary()` in `scripts/taxonomy/add-taxonomy.ts`).
