# Architecture Overview
Eko is a knowledge platform that processes three content pipelines — news, evergreen, and seed — into verified, structured knowledge cards. Users learn through interactive challenges powered by spaced repetition.
## Core Invariants

| Invariant | Description |
|---|---|
| Fact-first | Facts are the atomic unit; everything flows from structured, schema-validated fact_records |
| Verification before publication | No fact reaches the public feed without at least one validation tier pass |
| Source attribution | Every fact traces back to source articles and validation evidence |
| Schema conformance | Fact output validates against fact_record_schemas.fact_keys per topic category |
| Cost-bounded AI | All AI calls go through a model router with tier selection, daily spend caps, and per-call cost tracking |
| Public feed / gated detail | The feed is public; full card detail and interactions require a Free or Eko+ subscription |
| Topic balance | Daily quotas per topic category prevent content monoculture |
| Challenge answer isolation | Challenge URLs, shared links, and metadata never reveal answer content |
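
To illustrate how the "Verification before publication" and "Topic balance" invariants might combine when a feed is assembled, here is a minimal sketch. The `FactCandidate` shape, the `validationTiersPassed` field, and the `selectForFeed` function are hypothetical names chosen for this example, not Eko's actual code.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface FactCandidate {
  id: string;
  topicSlug: string;
  validationTiersPassed: number; // how many validation tiers this fact has passed
}

// Keep only verified facts, and stop admitting a topic once its daily quota is used.
function selectForFeed(
  candidates: FactCandidate[],
  dailyQuota: Record<string, number>,
  alreadyPublished: Record<string, number> = {},
): FactCandidate[] {
  const used: Record<string, number> = { ...alreadyPublished };
  const selected: FactCandidate[] = [];
  for (const fact of candidates) {
    // Verification before publication: at least one tier must have passed.
    if (fact.validationTiersPassed < 1) continue;
    // Topic balance: topics without a configured quota get no slots here.
    const quota = dailyQuota[fact.topicSlug] ?? 0;
    const count = used[fact.topicSlug] ?? 0;
    if (count >= quota) continue;
    used[fact.topicSlug] = count + 1;
    selected.push(fact);
  }
  return selected;
}
```

Treating an unknown topic as having a quota of zero is a design choice of this sketch; a real implementation could just as well fall back to a default quota.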
## Data Flow

    News APIs ─┐
               ├──▶ [INGEST_NEWS] ──▶ worker-ingest ──▶ news_sources table
               │                                             │
               │        ┌────────────────────────────────────┘
               │        ▼
               │    [CLUSTER_STORIES] ──▶ worker-ingest ──▶ stories
               │                                             │
               │        ┌────────────────────────────────────┘
               │        ▼
               │    [EXTRACT_FACTS] ──▶ worker-facts ──▶ fact_records
               │                                             │
               │        ┌─────────────────┬──────────────────┘
               │        ▼                 ▼
               │    [VALIDATE_FACT]  [RESOLVE_IMAGE]
               │        │                 │
               │        ▼                 ▼
               │    worker-validate  worker-ingest
               │        │                 │
               │        ▼                 ▼
               │    fact verified    image cached
               │
    Seed Data ─┤
               ├──▶ [EXPLODE_CATEGORY_ENTRY] ──▶ worker-facts ──▶ fact_records
               ├──▶ [FIND_SUPER_FACTS] ──▶ worker-facts ──▶ cross-correlations
               └──▶ [GENERATE_CHALLENGE_CONTENT] ──▶ worker-facts ──▶ fact_challenge_content
                                                                                │
                        ┌───────────────────────────────────────────────────────┘
                        ▼ (if image would spoil answer)
                    [RESOLVE_CHALLENGE_IMAGE] ──▶ worker-ingest ──▶ anti-spoiler image

    Evergreen ────▶ [GENERATE_EVERGREEN] ──▶ worker-facts ──▶ fact_records

## Database Schema
PostgreSQL via Supabase with Drizzle ORM. 32+ active tables across the 7 groups listed below. See Fact & Taxonomy Schema Map for the full reference.
### Taxonomy

| Table | Purpose |
|---|---|
| topic_categories | Hierarchical taxonomy tree (depth 0-3, slug, path, dailyQuota). Supports deprecation via deprecated_at and replaced_by. |
| topic_category_aliases | Maps external provider slugs to canonical topic category IDs |
| unmapped_category_log | Logs unresolved category slugs for audit |
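
The alias and audit tables above suggest a lookup-then-log flow for incoming provider slugs. The following is a minimal in-memory sketch of that idea. The `AliasRow` and `UnmappedLogEntry` shapes, their field names, and `resolveCategory` are assumptions for illustration; the real column names in `topic_category_aliases` and `unmapped_category_log` may differ.

```typescript
// Hypothetical in-memory stand-ins for topic_category_aliases and
// unmapped_category_log; the field names are assumptions.
interface AliasRow {
  provider: string;
  externalSlug: string;
  topicCategoryId: string;
}

interface UnmappedLogEntry {
  provider: string;
  externalSlug: string;
  seenAt: Date;
}

// Resolve a provider slug to a canonical category ID, or log it and return null.
function resolveCategory(
  provider: string,
  externalSlug: string,
  aliases: AliasRow[],
  unmappedLog: UnmappedLogEntry[],
): string | null {
  // Normalizing case and whitespace is an assumption of this sketch.
  const normalized = externalSlug.trim().toLowerCase();
  const match = aliases.find(
    (a) => a.provider === provider && a.externalSlug === normalized,
  );
  if (match) return match.topicCategoryId;
  // Unresolved slugs are recorded for later audit instead of being dropped silently.
  unmappedLog.push({ provider, externalSlug: normalized, seenAt: new Date() });
  return null;
}
```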
### Fact Storage

| Table | Purpose |
|---|---|
| fact_records | Core fact storage: facts JSONB with key-value pairs, title, context, notabilityScore, status, validation |
| fact_record_schemas | Per-topic fact shape definitions: factKeys JSONB, cardFormats array |
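
As a rough picture of schema conformance, the sketch below checks a `facts` object against a per-topic key definition. The structure of `factKeys` is not documented here, so the `FactKeySpec` shape (a type name plus an optional `required` flag) and the `validateFacts` function are assumptions made for the example.

```typescript
// Assumed shape of one factKeys entry; the real JSONB layout may differ.
type FactKeyType = "string" | "number" | "boolean";

interface FactKeySpec {
  type: FactKeyType;
  required?: boolean;
}

type FactKeys = Record<string, FactKeySpec>;

// Return a list of human-readable problems; an empty list means the facts conform.
function validateFacts(
  facts: Record<string, unknown>,
  factKeys: FactKeys,
): string[] {
  const errors: string[] = [];
  for (const [key, spec] of Object.entries(factKeys)) {
    const value = facts[key];
    if (value === undefined || value === null) {
      if (spec.required) errors.push(`missing required key: ${key}`);
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`key ${key}: expected ${spec.type}, got ${typeof value}`);
    }
  }
  // Keys that the schema does not define are reported as well.
  for (const key of Object.keys(facts)) {
    if (!(key in factKeys)) errors.push(`unexpected key: ${key}`);
  }
  return errors;
}
```

Returning a list of problems rather than throwing on the first one makes it easier to log every schema violation for a rejected fact in a single pass.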
### Challenge System

| Table | Purpose |
|---|---|
| fact_challenge_content | Pre-generated challenge text per style/difficulty with challenge_title |
| challenge_formats | 8 named formats with knowledgeType and tone |
| challenge_format_styles | Format → eligible style mapping |
| challenge_format_topics | Format → eligible topic mapping |
| challenge_sessions | Multi-turn conversational challenge state with conversation history |
| card_interactions | User engagement tracking (views, answers, bookmarks, shares) |
| challenge_group_progress | Per-position progress within challenge groups |
| card_bookmarks | User bookmarked cards |
| score_disputes | User score dispute submissions for AI re-evaluation |
### Ingestion & Observability

| Table | Purpose |
|---|---|
| stories | Clustered news articles grouped by TF-IDF cosine similarity |
| news_sources | Raw articles fetched from news APIs |
| ai_cost_tracking | Per-model, per-feature daily cost aggregation |
| ingestion_runs | Pipeline observability: tracks each cron invocation |
| content_operations_log | Operational event log for pipeline debugging |
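
The `stories` table groups articles by TF-IDF cosine similarity. The sketch below shows that measure on plain strings. The tokenizer, the smoothed IDF formula, and the function names are assumptions for illustration; the document does not specify Eko's exact weighting or the similarity threshold used for clustering.

```typescript
// Lowercase alphanumeric tokens; a real tokenizer would likely also drop stop words.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Build one sparse TF-IDF vector per document.
function tfidfVectors(docs: string[]): Map<string, number>[] {
  const tokenized = docs.map(tokenize);
  const docFreq = new Map<string, number>();
  for (const tokens of tokenized) {
    for (const term of new Set(tokens)) {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    }
  }
  return tokenized.map((tokens) => {
    const vec = new Map<string, number>();
    for (const term of tokens) vec.set(term, (vec.get(term) ?? 0) + 1);
    for (const [term, count] of vec) {
      // Smoothed IDF (an assumption): ln((1 + N) / (1 + df)) + 1.
      const idf = Math.log((1 + docs.length) / (1 + (docFreq.get(term) ?? 0))) + 1;
      vec.set(term, (count / tokens.length) * idf);
    }
    return vec;
  });
}

// Cosine similarity of two sparse vectors; 0 when either vector is empty.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, weight] of a) {
    dot += weight * (b.get(term) ?? 0);
    normA += weight * weight;
  }
  for (const weight of b.values()) normB += weight * weight;
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

Two headlines about the same event share weighted terms and score above zero, while headlines with no terms in common score exactly zero. A clustering step would then group articles whose pairwise score exceeds some chosen threshold.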
### Brand & Domain

| Table | Purpose |
|---|---|
| brands | Canonical brand entities |
| brand_categories | Brand categorization taxonomy |
| brand_category_assignments | Brand → category mappings |
| domains | Domain records with verification status |
### User & Subscription

| Table | Purpose |
|---|---|
| profiles | User profiles synced from Supabase Auth |
| plan_definitions | Plan configuration (free, base, pro, team, plus) |
| user_subscriptions | Maps users to plans with Stripe integration |
| notification_preferences | Per-user notification settings |
| system_notifications | System-generated notifications |
| feature_flags | Runtime feature flag toggles |
| ai_model_tier_config | AI model routing configuration per tier |
| reward_milestones | Achievement milestones for gamification |
| user_reward_claims | Claimed milestone rewards |
| user_quality_grades | Per-topic quality grade aggregation |
### Seed Pipeline

| Table | Purpose |
|---|---|
| seed_entry_queue | Batch seed entry processing queue |
| super_fact_links | Cross-entry correlation links |
| seed_entry_links | Entry-to-fact provenance links |
## Queue System

Backend: Upstash Redis (REST API). Each message is leased for 2 minutes and retried with exponential backoff and jitter; after a maximum of 3 attempts it moves to the dead-letter queue.
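
The retry policy described above can be sketched as a few pure functions. The 2-minute lease and the 3-attempt limit come from the text; the base delay, the delay cap, the "full jitter" variant, and all function names are assumptions, since the document does not specify them.

```typescript
const LEASE_MS = 2 * 60 * 1000; // 2-minute lease, from the text above
const MAX_ATTEMPTS = 3; // attempts before dead-lettering, from the text above
const BASE_DELAY_MS = 1_000; // assumption: the base delay is not documented
const MAX_DELAY_MS = 60_000; // assumption: the delay cap is not documented

// Full jitter: a random delay in [0, min(cap, base * 2^(attempt - 1))).
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

type FailureAction =
  | { kind: "retry"; delayMs: number }
  | { kind: "dead-letter" };

// Decide what to do after a failed attempt (attemptsSoFar counts from 1).
function onFailure(
  attemptsSoFar: number,
  random: () => number = Math.random,
): FailureAction {
  if (attemptsSoFar >= MAX_ATTEMPTS) return { kind: "dead-letter" };
  return { kind: "retry", delayMs: backoffDelayMs(attemptsSoFar, random) };
}

// A leased message becomes visible to other consumers once its lease runs out.
function leaseExpired(leasedAtMs: number, nowMs: number): boolean {
  return nowMs - leasedAtMs >= LEASE_MS;
}
```

Passing the random source as a parameter keeps the policy deterministic under test. Jitter spreads out retries so that a burst of failures does not retry in lockstep.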
### Message Types

| Queue | Consumer | Trigger |
|---|---|---|
| INGEST_NEWS | worker-ingest | cron/ingest-news (every 15m) |
| CLUSTER_STORIES | worker-ingest | cron/cluster-sweep (hourly) |
| RESOLVE_IMAGE | worker-ingest | post-extraction (automatic) |
| RESOLVE_CHALLENGE_IMAGE | worker-ingest | post-challenge-generation (automatic) |
| EXTRACT_FACTS | worker-facts | post-clustering (automatic) |
| IMPORT_FACTS | worker-facts | cron/import-facts (stub) |
| GENERATE_EVERGREEN | worker-facts | cron/generate-evergreen (daily) |
| EXPLODE_CATEGORY_ENTRY | worker-facts | seed pipeline (manual/batch) |
| FIND_SUPER_FACTS | worker-facts | seed pipeline (manual/batch) |
| GENERATE_CHALLENGE_CONTENT | worker-facts | post-extraction or seed pipeline |
| VALIDATE_FACT | worker-validate | post-extraction + cron/validation-retry (every 4h) |
| SEND_SMS | worker-sms | SMS delivery via Twilio |
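
One way to keep the routing in this table checkable at compile time is a typed registry. The sketch below copies the queue and consumer names from the table; the `QUEUE_CONSUMERS` constant and the `queuesFor` helper are hypothetical names, not part of the codebase as documented here.

```typescript
// Queue-to-consumer routing, copied from the table above.
const QUEUE_CONSUMERS = {
  INGEST_NEWS: "worker-ingest",
  CLUSTER_STORIES: "worker-ingest",
  RESOLVE_IMAGE: "worker-ingest",
  RESOLVE_CHALLENGE_IMAGE: "worker-ingest",
  EXTRACT_FACTS: "worker-facts",
  IMPORT_FACTS: "worker-facts",
  GENERATE_EVERGREEN: "worker-facts",
  EXPLODE_CATEGORY_ENTRY: "worker-facts",
  FIND_SUPER_FACTS: "worker-facts",
  GENERATE_CHALLENGE_CONTENT: "worker-facts",
  VALIDATE_FACT: "worker-validate",
  SEND_SMS: "worker-sms",
} as const;

type QueueName = keyof typeof QUEUE_CONSUMERS;
type WorkerName = (typeof QUEUE_CONSUMERS)[QueueName];

// List the queues a given worker should subscribe to.
function queuesFor(worker: WorkerName): QueueName[] {
  return (Object.keys(QUEUE_CONSUMERS) as QueueName[]).filter(
    (queue) => QUEUE_CONSUMERS[queue] === worker,
  );
}
```

With this shape, a misspelled queue or worker name fails type checking instead of producing a message that no consumer ever picks up.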
## Workers

Five Bun-based queue consumer processes. Each exposes /health and implements graceful shutdown.

| Worker | Queues Consumed | Purpose |
|---|---|---|
| worker-ingest | INGEST_NEWS, CLUSTER_STORIES, RESOLVE_IMAGE, RESOLVE_CHALLENGE_IMAGE | News fetch, article clustering, image resolution |
| worker-facts | EXTRACT_FACTS, IMPORT_FACTS, GENERATE_EVERGREEN, EXPLODE_CATEGORY_ENTRY, FIND_SUPER_FACTS, GENERATE_CHALLENGE_CONTENT | Fact extraction, generation, challenges |
| worker-validate | VALIDATE_FACT | 4-phase verification (structural → consistency → cross-model → evidence) |
| worker-reel-render | (R2/internal triggers) | Render reel videos from fact records |
| worker-sms | SEND_SMS | SMS notifications via Twilio |
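
The 4-phase verification in worker-validate can be pictured as an ordered pipeline. The phase names and their order come from the table above; stopping at the first failed phase, the `PhaseCheck` signature, and the `runValidation` function are assumptions of this sketch rather than documented behavior.

```typescript
type Phase = "structural" | "consistency" | "cross-model" | "evidence";

// Order taken from the worker-validate row above.
const PHASE_ORDER: Phase[] = ["structural", "consistency", "cross-model", "evidence"];

interface PhaseResult {
  passed: boolean;
}

// Each check is async because later phases would typically call external services.
type PhaseCheck<T> = (fact: T) => Promise<PhaseResult>;

interface ValidationOutcome {
  verified: boolean;
  failedPhase: Phase | null;
  phasesRun: Phase[];
}

// Run the phases in order and stop at the first failure (an assumption).
async function runValidation<T>(
  fact: T,
  checks: Record<Phase, PhaseCheck<T>>,
): Promise<ValidationOutcome> {
  const phasesRun: Phase[] = [];
  for (const phase of PHASE_ORDER) {
    phasesRun.push(phase);
    const result = await checks[phase](fact);
    if (!result.passed) {
      return { verified: false, failedPhase: phase, phasesRun };
    }
  }
  return { verified: true, failedPhase: null, phasesRun };
}
```

Short-circuiting would let a fact that fails a cheap structural check skip the later phases, which fits the cost-bounded AI invariant, but the real worker may run or record all phases.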
## AI Model Router

All AI calls go through a tier-based model router with 27 models across 6 providers.

| Provider | Models |
|---|---|
| OpenAI | gpt-5.4-nano, gpt-5.4-mini, gpt-5-nano, gpt-5-mini, gpt-4o-mini, gpt-4o, gpt-4-turbo, gpt-4, gpt-3.5-turbo |
| Anthropic | claude-haiku-4-5, claude-sonnet-4-5, claude-opus-4-5, claude-opus-4-6 |
| Google | gemini-2.0-flash, gemini-2.0-flash-lite, gemini-2.5-pro |
| MiniMax | MiniMax-M2, MiniMax-M2.1, MiniMax-M2.5 |
| xAI | grok-4-1-fast-reasoning, grok-4-1-fast-non-reasoning, grok-4 |
Costs are tracked in the ai_cost_tracking table and bounded by daily spend caps (for example, ANTHROPIC_DAILY_SPEND_CAP_USD).
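
A minimal sketch of tier selection under a daily spend cap follows. The tier names, the per-call cost estimates, the "cheapest model in the tier" rule, and `createRouter` itself are assumptions for illustration; the actual routing configuration lives in `ai_model_tier_config` and is not described in detail here.

```typescript
// Tier names are hypothetical; the document does not list them.
type Tier = "economy" | "standard" | "premium";

interface ModelOption {
  provider: string;
  model: string;
  tier: Tier;
  estCostUsd: number; // assumed estimate for one call
}

function createRouter(models: ModelOption[], dailyCapUsd: number) {
  let spentTodayUsd = 0; // a real router would reset this each day
  return {
    // Pick the cheapest model in the tier, or null if the call would exceed the cap.
    select(tier: Tier): ModelOption | null {
      const candidates = models.filter((m) => m.tier === tier);
      if (candidates.length === 0) return null;
      const cheapest = candidates.reduce((a, b) =>
        b.estCostUsd < a.estCostUsd ? b : a,
      );
      return spentTodayUsd + cheapest.estCostUsd > dailyCapUsd ? null : cheapest;
    },
    // Record the actual cost of a completed call (per-call cost tracking).
    record(actualCostUsd: number): void {
      spentTodayUsd += actualCostUsd;
    },
    spentToday(): number {
      return spentTodayUsd;
    },
  };
}
```

Returning `null` when the cap would be exceeded leaves the caller to decide whether to defer the job or drop to a cheaper tier.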
## Legacy (V1)
The original V1 URL change tracking system and its vNext global URL library have been fully superseded by the V2 fact engine. 41 V1 legacy tables were dropped in migrations 0057-0060. V1 architecture decisions (ADR-001 through ADR-012) are preserved in decisions.md for historical reference. V1 documentation is archived in docs_archive/.
## Dependency Exceptions
All dependencies use @latest unless noted here:
(No exceptions currently)