Fact & Taxonomy Schema Map

Reference map of all fact and taxonomy schemas, rules, and voice-related files and functions, with their intended targets.

1. Database Schema (Drizzle ORM)

File: packages/db/src/drizzle/schema.ts

| Table / Enum | Purpose | Targets |
| --- | --- | --- |
| topicCategories | Hierarchical taxonomy tree (slug, name, depth 0-3, path, dailyQuota, percentTarget, deprecated_at, replaced_by, deprecation_reason). Depth levels: 0=root, 1=primary, 2=secondary, 3=leaf. MAX_TAXONOMY_DEPTH = 5. | Feed filtering, quota cron, AI extraction routing |
| topicCategoryAliases | Maps external provider slugs to canonical topics | resolveTopicCategory() in ingestion pipeline |
| unmappedCategoryLog | Logs categories that couldn't be resolved | Admin dashboard analysis |
| factRecordSchemas | Per-topic fact shape definitions (factKeys JSONB, cardFormats array) | buildFactObjectSchema() in AI extraction, validation |
| factRecords | Core fact storage (facts JSONB, title, context, notabilityScore, status, validation) | Feed API, card detail, challenge generation |
| factChallengeContent | Pre-generated challenge text per style/difficulty. Includes challenge_title for display. | Card UI rendering, challenge sessions |
| challengeFormats | 8 named formats (big_fan_of, know_a_lot_about, etc.) with knowledgeType, tone | Challenge session selection, format eligibility |
| challengeFormatStyles | Format to eligible style mapping | Challenge content generation |
| challengeFormatTopics | Format to eligible topic mapping | Topic-scoped format selection |
| challengeSessions | Active challenge game state with conversation history | Conversational challenge flow |
| cardInteractions | User answer/view/bookmark tracking | Spaced repetition, engagement metrics |
| aiCostTracking | Per-model, per-feature daily cost aggregation | Budget monitoring, model routing decisions |

Enums: factRecordStatusEnum, cardFormatEnum, challengeStyleEnum, challengeFormatSlugEnum, triviaDifficultyEnum, validationStrategyEnum, interactionTypeEnum, aiModelEnum, newsProviderEnum, storyStatusEnum, useCaseAudienceEnum, domainSourceEnum, domainVerificationStatusEnum, pageTypeEnum, systemNotificationTypeEnum, brandPageTypeEnum, brandConfidenceLevelEnum


2. Zod Validation Schemas

File: packages/shared/src/schemas.ts

| Schema | Purpose | Targets |
| --- | --- | --- |
| FactKeySchema | Validates fact key definitions (key, label, type, required) | factRecordSchemas.factKeys column |
| CardFormatSchema / DifficultySchema | Enum validation for card formats and difficulty | Queue messages, API inputs |
| ChallengeStyleSchema / ChallengeFormatSlugSchema | Enum validation for styles and named formats | Challenge generation, session creation |
| KnowledgeTypeSchema | fandom, skill, memory, temporal, association, career, visual, creator | challengeFormats.knowledgeType |
| ValidationResultSchema | Single-phase validation output (strategy, sources, confidence, flags) | factRecords.validation JSONB column |
| MultiPhaseValidationResultSchema | Full pipeline result with phase array | factRecords.validation JSONB column |
| FactDetailSchema | Public-facing fact detail with topic, image, validation | Web app card detail API |
| CardVariationSchema | Computed card with question, visible/hidden facts, options | Card UI rendering |
| ChallengeSessionSchema / ConversationMessageSchema | Challenge game state and dialogue turns | Conversational challenge flow |

Queue Message Schemas:

| Schema | Targets |
| --- | --- |
| ExtractFactsMessageSchema | worker-facts/handlers/extract-facts.ts |
| ImportFactsMessageSchema | worker-facts/handlers/import-facts.ts |
| ValidateFactMessageSchema | worker-validate/handlers/validate-fact.ts |
| GenerateEvergreenMessageSchema | worker-facts/handlers/generate-evergreen.ts |
| ExplodeCategoryEntryMessageSchema | worker-facts/handlers/explode-entry.ts |
| FindSuperFactsMessageSchema | worker-facts/handlers/find-super-facts.ts |
| GenerateChallengeContentMessageSchema | worker-facts/handlers/generate-challenge-content.ts |
| ResolveImageMessageSchema | worker-ingest/handlers/resolve-image.ts |
| IngestNewsMessageSchema | worker-ingest/handlers/ingest-news.ts |
| ClusterStoriesMessageSchema | worker-ingest/handlers/cluster-stories.ts |
| ResolveChallengeImageMessageSchema | worker-ingest/handlers/resolve-challenge-image.ts |

3. AI Type Definitions

Fact Engine

File: packages/ai/src/fact-engine.ts

| Type / Function | Purpose | Targets |
| --- | --- | --- |
| FactEngineTask | Union of all AI task names (14 tasks) | Model routing via selectModelForTask() |
| ExtractFactsInput / ExtractFactsResult | News story to structured facts | worker-facts/handlers/extract-facts.ts |
| ScoreNotabilityInput / ScoreNotabilityResult | Notability scoring (0-1) | Fact insertion threshold (>= 0.6) |
| ValidateFactInput / ValidateFactResult | AI cross-check validation | worker-validate/handlers/validate-fact.ts |
| GenerateEvergreenInput / GenerateEvergreenResult | Evergreen fact generation | worker-facts/handlers/generate-evergreen.ts |
| extractFactsFromStory() | Orchestrates extraction with tone injection | Called by extract-facts handler |
| selectModelForTask() | Tier-based model routing | All AI calls |

Seed Explosion

File: packages/ai/src/seed-explosion.ts

| Type / Function | Purpose | Targets |
| --- | --- | --- |
| ExplodeCategoryEntryInput / ExplodeCategoryEntryResult | Seed entry to many facts + spinoffs | worker-facts/handlers/explode-entry.ts |
| SuperFactCandidate / FindSuperFactsResult | Cross-entry correlation discovery | worker-facts/handlers/find-super-facts.ts |

Schema Utilities

File: packages/ai/src/schema-utils.ts

| Function | Purpose | Targets |
| --- | --- | --- |
| buildFactObjectSchema() | Converts fact_record_schemas.fact_keys to concrete z.object() | AI extraction prompts, validation |

Model Adapters

File: packages/ai/src/models/types.ts

| Type | Purpose | Targets |
| --- | --- | --- |
| ModelAdapter | Model-specific prompt customization interface | Per-model tone adjustments in generation |
| PromptCustomization | systemPromptPrefix/Suffix/Override, temperature | applyPromptCustomization() in fact-engine.ts |
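
As a rough illustration of how a PromptCustomization might be applied, here is a minimal sketch. The expanded field names (systemPromptSuffix, systemPromptOverride), the PromptRequest type, the function name, and the precedence rule (an override replaces the base prompt; otherwise prefix and suffix wrap it) are assumptions made for this example, not the actual applyPromptCustomization() in fact-engine.ts.

```typescript
// Hypothetical shape, inferred from the field names listed in the table.
interface PromptCustomization {
  systemPromptPrefix?: string;
  systemPromptSuffix?: string;
  systemPromptOverride?: string;
  temperature?: number;
}

// Hypothetical request type used only by this sketch.
interface PromptRequest {
  systemPrompt: string;
  temperature: number;
}

// Assumed semantics: an override replaces the base prompt outright;
// otherwise the prefix and suffix are wrapped around it.
function applyPromptCustomizationSketch(
  base: PromptRequest,
  custom: PromptCustomization = {},
): PromptRequest {
  const systemPrompt =
    custom.systemPromptOverride ??
    [custom.systemPromptPrefix, base.systemPrompt, custom.systemPromptSuffix]
      .filter((part): part is string => Boolean(part))
      .join("\n\n");
  return {
    systemPrompt,
    temperature: custom.temperature ?? base.temperature,
  };
}
```

With no customization the request passes through unchanged, so adapters for models that need no adjustment can return an empty object.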

4. Validation Pipeline

| File | Phase | Cost | Targets |
| --- | --- | --- | --- |
| packages/ai/src/validation/structural.ts | Phase 1: Required keys, types, injection detection | $0 | All new fact records |
| packages/ai/src/validation/consistency.ts | Phase 2: Numeric ordering, temporal logic, coherence | $0 | All facts passing Phase 1 |
| packages/ai/src/validation/cross-model.ts | Phase 3: Adversarial AI verification (Gemini 2.5 Flash) | ~$0.001 | Facts needing higher confidence |
| packages/ai/src/validation/evidence.ts | Phase 4: API + reasoner evidence corroboration | ~$0.002-0.005 | Facts needing external verification |
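
To make the zero-cost Phase 1 concrete, the following sketch shows what a required-key and type check could look like. It is not the contents of structural.ts: the FactKey shape is inferred from the fields listed for FactKeySchema, the allowed type values and the flag strings are assumptions, and injection detection is left out.

```typescript
// Assumed fact key shape, based on the fields listed for FactKeySchema.
// The set of allowed `type` values is a guess for illustration.
interface FactKey {
  key: string;
  label: string;
  type: "string" | "number" | "boolean";
  required: boolean;
}

interface StructuralResult {
  passed: boolean;
  flags: string[];
}

// Phase 1 style check: required keys are present and every supplied
// value has the declared primitive type. No AI call, so no cost.
function checkStructure(
  facts: Record<string, unknown>,
  factKeys: FactKey[],
): StructuralResult {
  const flags: string[] = [];
  for (const { key, type, required } of factKeys) {
    const value = facts[key];
    if (value === undefined || value === null) {
      if (required) flags.push(`missing:${key}`);
      continue;
    }
    if (typeof value !== type) flags.push(`type:${key}`);
  }
  return { passed: flags.length === 0, flags };
}
```

Running the free checks first means the paid phases only ever see records that are already structurally sound.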

5. Voice & Tone Constants

Challenge Content Rules

File: packages/ai/src/challenge-content-rules.ts

| Constant / Function | Purpose | Targets |
| --- | --- | --- |
| CHALLENGE_TONE_PREFIX | Universal 3-line Eko voice identity | Injected into all user-facing AI prompts (fact-engine.ts, challenge-content.ts) |
| CHALLENGE_VOICE_CONSTITUTION | 70+ line detailed voice principles, forbidden patterns, anti-pattern words | buildSystemPrompt() in challenge-content.ts |
| STYLE_RULES | Per-style structural requirements (6 styles x 5 field rules each) | Challenge content generation prompts |
| STYLE_VOICE | Per-style personality definitions (personality, register, phrasing examples, reveal tone) | Challenge content generation prompts |
| FORMAT_VOICE | Per-format personality definitions (8 formats: personality, register, energy, excitement driver, reveal tone, emphasis) | Challenge content generation prompts (layer 3.7) |
| FORMAT_RULES | Per-format structural refinements (8 formats: setup, challenge, reveal, correct_answer, style_data) | Challenge content generation prompts (layer 3.7) |
| DIFFICULTY_LEVELS | Per-difficulty level definitions (5 levels with cognitive demand descriptors) | Difficulty guidance in generation prompts |
| SUPER_FACT_RULES | Guidance for cross-entry super-fact challenge content | Super-fact challenge generation |
| PRE_GENERATED_STYLES | List of 6 styles pre-generated at content creation time | Challenge content batch generation |
| BANNED_PATTERNS | Regex list for anti-pattern word detection | validateChallengeContent() |
| CQ002_REGEX | Second-person enforcement regex | patchCq002() post-processing |
| validateChallengeContent() | Enforces all CQ-001 through CQ-013 rules | Generation-time + upload-time validation |
| patchCq002() | Post-generation second-person pronoun enforcement | Challenge content post-processing |
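
The sketch below shows the general shape of a regex-driven content check in the spirit of BANNED_PATTERNS and CQ002_REGEX. The patterns, constant names, function name, and violation strings are all hypothetical; the real lists live in challenge-content-rules.ts and are not reproduced in this map.

```typescript
// Hypothetical patterns for illustration only; the real BANNED_PATTERNS
// and CQ002_REGEX are not shown in this document.
const BANNED_PATTERNS_SKETCH: RegExp[] = [
  /\bdid you know\b/i,
  /\bfun fact\b/i,
];
const SECOND_PERSON_SKETCH = /\b(you|your|you're)\b/i;

// Returns a list of violations; an empty list means the text passed.
// None of the regexes use the global flag, so test() is stateless.
function validateChallengeTextSketch(text: string): string[] {
  const violations: string[] = [];
  for (const pattern of BANNED_PATTERNS_SKETCH) {
    if (pattern.test(text)) violations.push(`banned:${pattern.source}`);
  }
  if (!SECOND_PERSON_SKETCH.test(text)) {
    violations.push("CQ-002:no-second-person");
  }
  return violations;
}
```

Returning every violation, rather than stopping at the first, lets the caller decide whether to patch the text or reject it.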

Taxonomy Content Rules

File: packages/ai/src/taxonomy-content-rules.ts

| Constant / Function | Purpose | Targets |
| --- | --- | --- |
| TAXONOMY_CONTENT_RULES | Per-taxonomy extraction guidance, challenge guidance, anti-patterns, domain vocabulary (32 taxonomies) | AI extraction prompts in extractFactsFromStory(), seed explosion, evergreen generation |
| TAXONOMY_VOICE | Per-taxonomy emotional register, energy, excitement driver, pitfalls | Injected into challenge-content.ts buildSystemPrompt(), seed explosion, evergreen generation |
| formatVocabularyForPrompt() | Serializes TaxonomyVocabulary into concise prompt block (~120-200 tokens) | All 4 prompt injection sites (DRY helper) |

6. Rules Documentation

| File | Purpose | Targets |
| --- | --- | --- |
| docs/marketing/CHALLENGE_TONE.md | Canonical voice specification (source of truth) | All voice constants in code |
| docs/rules/challenge-content.md | System rules CC-001 through CC-011, quality rules CQ-001 through CQ-013 | validateChallengeContent(), AI prompt instructions |
| docs/rules/README.md | Full rules index with enforcement and ownership | Developer reference |

7. Database Query Functions

File: packages/db/src/drizzle/fact-engine-queries.ts

| Function | Purpose | Targets |
| --- | --- | --- |
| getTopicCategoryBySlug() / getTopicCategoryById() | Topic lookup | Feed, extraction routing |
| resolveTopicCategory() | Alias-aware category resolution with inactive-category fallback via resolveActiveCategory() | Ingestion pipeline (story to topic) |
| resolveActiveCategory() | Follows deprecation replacement chains (max 5 hops) to find the current active successor | Inactive-category fallback in resolveTopicCategory() |
| getDescendantCategoriesForParent() | Recursive 3-level descendant loading for a parent category | Taxonomy tree views, admin dashboard |
| getActiveTopicCategoriesWithSchemas() | Taxonomy tree with schemas; supports maxDepth filtering | Admin dashboard, evergreen generation |
| getFactRecordSchemaByTopicId() | Schema lookup for a topic | AI extraction (build dynamic Zod schema) |
| insertFactRecord() | Create fact record | Extract + import handlers |
| updateFactRecordStatus() | Set validated/rejected with validation JSONB | Validation handler |
| getPublishedFacts() / getFactsByTopic() | Paginated feed queries | Web app feed API |
| getFactDetail() | Full fact with relations | Card detail page |
| getTopicFactCountToday() | Quota monitoring | Topic-quotas cron route |
| archiveExpiredFactRecords() | 30-day news fact expiry | Archival cron |
| promoteHighEngagementFacts() | Promote popular facts by engagement | Promotion cron |
| recordInteraction() / getUserInteractions() | Card interaction tracking | Spaced repetition, engagement metrics |
| getLatestAnswerInteraction() / getReviewDueFacts() | Spaced repetition scheduling | Card review flow |
| addBookmark() / removeBookmark() / getUserBookmarks() | User bookmark management | Card detail, bookmarks page |
| createIngestionRun() / updateIngestionRun() | Pipeline observability | Ingestion cron routes |
| createStory() / updateStory() / getStoryWithSources() | Story lifecycle | Ingestion pipeline |
| insertNewsSources() / getNewsSourcesByIds() | News source management | Ingestion pipeline |
| findStoriesByTitleSimilarity() | Story deduplication | Clustering handler |
| getExistingTitlesForTopic() | Fact deduplication | Extract + import handlers |
| getUnclusteredNewsSources() | Unclustered article discovery | Clustering cron |
| getStuckPendingValidations() | Stuck validation recovery | Validation monitor cron |
| createChallengeSession() / getChallengeSession() / updateChallengeSession() | Challenge session lifecycle | Conversational challenge flow |
| getActiveChallengeFormats() / getChallengeFormatBySlug() | Format lookup | Challenge session creation |
| getFormatsForTopic() / getPairedFormat() | Topic-scoped format selection | Challenge format eligibility |
| getActiveSessionCount() / abandonStaleSessions() | Session rate limiting and cleanup | Challenge session management |
| createScoreDispute() / getScoreDispute() / getUserMonthlyDisputeCount() | Score dispute lifecycle | Dispute flow, rate limiting |
| updateInteractionScore() | Score adjustment post-dispute | Dispute resolution |
| getAvailableMilestones() / claimMilestone() | Reward milestone system | Gamification |
| getUserFreeTextCountToday() | Free-text answer rate limiting | Subscription gating |
| createTrialSubscription() / getPlanTrialDays() | Trial subscription management | Onboarding |
| getUserCumulativeScore() | Cumulative score aggregation | Profile, leaderboard |
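
The chain-following behaviour described for resolveActiveCategory() can be sketched roughly as follows. This is a hypothetical in-memory version: the CategoryRow shape (camelCase versions of deprecated_at and replaced_by), the Map lookup, and the function name are assumptions, and the real function presumably queries the database through Drizzle instead.

```typescript
// Hypothetical row shape; property names mirror the deprecated_at and
// replaced_by columns on topicCategories.
interface CategoryRow {
  id: string;
  deprecatedAt: Date | null;
  replacedBy: string | null;
}

const MAX_HOPS = 5;

// Follows replacedBy links until an active (non-deprecated) category is
// found. Gives up with null on a dead end, an unknown id, or after
// MAX_HOPS hops, which also guards against replacement cycles.
function resolveActiveCategorySketch(
  startId: string,
  byId: Map<string, CategoryRow>,
): CategoryRow | null {
  let current = byId.get(startId);
  let hops = 0;
  while (current !== undefined && current.deprecatedAt !== null) {
    if (current.replacedBy === null || hops >= MAX_HOPS) return null;
    current = byId.get(current.replacedBy);
    hops++;
  }
  return current ?? null;
}
```

The hop limit matters mainly as a safety net: a misconfigured pair of categories that replace each other would otherwise loop forever.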

8. Worker Handlers

| File | Queue Message | Consumes From |
| --- | --- | --- |
| apps/worker-facts/src/handlers/extract-facts.ts | EXTRACT_FACTS | fact-engine.ts extractFactsFromStory() |
| apps/worker-facts/src/handlers/generate-evergreen.ts | GENERATE_EVERGREEN | fact-engine.ts generateEvergreenFacts() |
| apps/worker-facts/src/handlers/import-facts.ts | IMPORT_FACTS | Bulk structured imports (ESPN, WikiQuote, etc.) |
| apps/worker-facts/src/handlers/explode-entry.ts | EXPLODE_CATEGORY_ENTRY | seed-explosion.ts explodeCategoryEntry() |
| apps/worker-facts/src/handlers/find-super-facts.ts | FIND_SUPER_FACTS | seed-explosion.ts findSuperFacts() |
| apps/worker-facts/src/handlers/generate-challenge-content.ts | GENERATE_CHALLENGE_CONTENT | challenge-content.ts |
| apps/worker-validate/src/handlers/validate-fact.ts | VALIDATE_FACT | 4-phase validation pipeline |
| apps/worker-ingest/src/handlers/ingest-news.ts | INGEST_NEWS | News API provider fetch |
| apps/worker-ingest/src/handlers/cluster-stories.ts | CLUSTER_STORIES | Story deduplication and clustering |
| apps/worker-ingest/src/handlers/resolve-image.ts | RESOLVE_IMAGE | Image resolution cascade (Wikipedia, SportsDB, Unsplash, Pexels) |
| apps/worker-ingest/src/handlers/resolve-challenge-image.ts | RESOLVE_CHALLENGE_IMAGE | Anti-spoiler image resolution for challenges where the fact image reveals the answer |

Prompt Layering Hierarchy

The order in which voice controls are injected into AI generation prompts:

1.   CHALLENGE_TONE_PREFIX          <- Universal Eko identity (3 lines)
2.   CHALLENGE_VOICE_CONSTITUTION   <- Detailed principles & forbidden patterns
3.   TAXONOMY_VOICE[slug]           <- Domain register & energy (e.g. "Nature documentary narrator")
3.5  DOMAIN_VOCABULARY[slug]        <- Domain terminology & expert phrasing (via formatVocabularyForPrompt)
3.7  FORMAT_VOICE[formatSlug]       <- Format posture & energy (e.g. "enthusiastic superfan")
     FORMAT_RULES[formatSlug]       <- Format structural refinements (setup/challenge/reveal/answer)
4.   STYLE_VOICE[style]             <- Interaction mechanics (e.g. "curious gallery guide")
5.   STYLE_RULES[style]             <- Structural field requirements
6.   TAXONOMY_CONTENT_RULES[slug]   <- Domain extraction/challenge guidance
7.   Difficulty guidance             <- Per-level prompt adjustments
8.   Super-fact rules (if applicable)
9.   Generation instructions
10.  ModelAdapter customization     <- Per-model prefix/suffix/temperature

Each layer adds specificity without overriding the previous one. The universal tone sets the floor, taxonomy voice sets the domain energy, domain vocabulary teaches expert-level language, format voice sets the challenge format posture, and style voice sets the interaction mechanics.
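
One way to picture the layering is as an ordered list of optional text blocks joined into a single system prompt. The sketch below assumes exactly that; the PromptLayer type, the numeric order field, and assembleSystemPrompt() are illustrative names, not the actual buildSystemPrompt() in challenge-content.ts.

```typescript
// Hypothetical layer descriptor for this sketch.
interface PromptLayer {
  order: number; // position in the hierarchy, e.g. 3.5 or 3.7
  label: string; // e.g. "TAXONOMY_VOICE"
  text: string | undefined; // undefined when the layer does not apply
}

// Sorts layers by hierarchy position, drops layers with no text, and
// joins the rest with blank lines. Array.prototype.sort is stable, so
// layers sharing an order (FORMAT_VOICE and FORMAT_RULES at 3.7) keep
// the order in which they were supplied.
function assembleSystemPrompt(layers: PromptLayer[]): string {
  const sorted = [...layers].sort((a, b) => a.order - b.order);
  const parts: string[] = [];
  for (const layer of sorted) {
    if (layer.text !== undefined && layer.text.trim() !== "") {
      parts.push(layer.text);
    }
  }
  return parts.join("\n\n");
}
```

Optional layers such as the super-fact rules can then be passed with text left undefined, and they simply drop out of the final prompt.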

Vocabulary is auto-generated when adding new categories via bun run taxonomy:add (see generateVocabulary() in scripts/taxonomy/add-taxonomy.ts).