Architecture Decisions (ADR-Lite)

Purpose: Capture the why behind Eko's foundational architectural choices.
Audience: Founders, engineers, AI agents
Scope: V1, vNext, and V2 fact-engine architecture decisions.
Relationship: Complements STACK.md (how) and Project Instructions (output behavior).


ADR‑001 — URL‑Scoped Execution (V1 legacy)

Decision Each system execution is scoped to exactly one URL.

Rationale

  • Users intentionally select specific URLs for signal value
  • Site‑wide crawling introduces noise, cost, and legal risk
  • URL scope enforces precision and interpretability

Consequences

  • No cross‑URL inference at the system layer
  • No implicit site context
  • Summaries must stand on a single‑page delta

Status: Accepted


ADR‑002 — Change‑First, Not Content‑First (V1 legacy)

Decision Eko detects change before invoking intelligence.

Rationale

  • Most URLs are static most of the time
  • AI summarization without a delta is wasteful and misleading
  • Users care about what changed, not restated content

Consequences

  • AI runs only after meaningful change
  • Static pages produce no output
  • Cost scales with change frequency, not page size
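
A minimal sketch of the change-first gate, assuming change is detected by comparing fingerprints of normalized page text. The names Snapshot, normalize, fingerprint, and summarizeIfChanged are hypothetical, and the FNV-1a-style hash is a dependency-free stand-in for whatever content hash Eko actually uses. The sketch treats any difference in normalized text as meaningful; a real threshold would be stricter.

```typescript
// Hypothetical snapshot shape; not taken from Eko's schema.
interface Snapshot {
  url: string;
  text: string; // extracted page text
}

// Collapse whitespace so formatting-only edits do not count as change (an assumption).
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// FNV-1a-style 32-bit fingerprint over UTF-16 code units (stand-in for a real content hash).
function fingerprint(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

type Summarize = (previous: Snapshot, current: Snapshot) => string;

// Invokes the (expensive) summarizer only when the fingerprint changed.
// Static pages return null, so they produce no output and no AI cost.
function summarizeIfChanged(
  previous: Snapshot,
  current: Snapshot,
  summarize: Summarize
): string | null {
  const before = fingerprint(normalize(previous.text));
  const after = fingerprint(normalize(current.text));
  if (before === after) return null;
  return summarize(previous, current);
}
```

Because the summarizer is passed in, the gate can be tested without any AI call.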

Status: Accepted


ADR‑003 — Logically Always‑On, Physically Scheduled (V1 legacy)

Decision Eko is logically always‑on but physically executed on schedules.

Rationale

  • Continuous rendering is cost‑prohibitive at scale
  • Scheduled checks provide near‑real‑time perception
  • Cadence control enables predictable economics

Consequences

  • No persistent browser sessions
  • Cadence selection is product‑level, not infra‑level
  • “Real‑time” is an illusion created by frequency, not persistence
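
The scheduling model can be sketched as a due-check over tracked URLs. The cadence names and intervals below are assumptions for illustration; this document does not list Eko's actual cadences.

```typescript
// Hypothetical cadence tiers; real product cadences may differ.
type Cadence = "hourly" | "daily" | "weekly";

const HOUR_MS = 60 * 60 * 1000;
const CADENCE_MS: Record<Cadence, number> = {
  hourly: HOUR_MS,
  daily: 24 * HOUR_MS,
  weekly: 7 * 24 * HOUR_MS,
};

interface TrackedUrl {
  url: string;
  cadence: Cadence;
  lastCheckedAt: number | null; // epoch milliseconds, null if never checked
}

// A URL is due when it has never been checked or its cadence interval has elapsed.
function isDue(tracked: TrackedUrl, now: number): boolean {
  if (tracked.lastCheckedAt === null) return true;
  return now - tracked.lastCheckedAt >= CADENCE_MS[tracked.cadence];
}

// Each scheduler tick selects due URLs; nothing stays open between ticks.
function selectDue(tracked: TrackedUrl[], now: number): TrackedUrl[] {
  return tracked.filter((t) => isDue(t, now));
}
```

No browser session or connection is held between ticks, which is the sense in which the system is always-on only logically.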

Status: Accepted


ADR‑004 — Snapshot Comparison Over Live Observation (V1 legacy)

Decision Eko compares snapshots rather than observing pages live.

Rationale

  • Snapshot diffs are auditable and reproducible
  • Live observation complicates attribution of change
  • Snapshots enable historical reasoning

Consequences

  • Each change is traceable to two concrete states
  • Historical deltas can be re‑evaluated if logic improves
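
A sketch of a delta that stays traceable to two concrete states. The StoredSnapshot and Delta shapes are hypothetical, and the set-based line comparison is a deliberately simple stand-in for a real diff algorithm: it ignores line order and duplicate lines.

```typescript
// Hypothetical stored snapshot; field names are assumptions.
interface StoredSnapshot {
  id: string;
  capturedAt: string; // ISO timestamp
  lines: string[];
}

// A delta always names the two snapshots it was computed from.
interface Delta {
  fromSnapshotId: string;
  toSnapshotId: string;
  added: string[];
  removed: string[];
}

// Set-based line comparison: reports lines present in only one of the two snapshots.
function diffSnapshots(before: StoredSnapshot, after: StoredSnapshot): Delta {
  const beforeSet = new Set(before.lines);
  const afterSet = new Set(after.lines);
  return {
    fromSnapshotId: before.id,
    toSnapshotId: after.id,
    added: after.lines.filter((line) => !beforeSet.has(line)),
    removed: before.lines.filter((line) => !afterSet.has(line)),
  };
}
```

Because both snapshots are stored, the same pair can be passed through an improved diffSnapshots later to re-evaluate a historical delta.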

Status: Accepted


ADR‑005 — Conditional Rendering Only When Necessary (V1 legacy)

Decision Headless rendering and screenshots are used only when required.

Rationale

  • Most meaningful change is text‑detectable
  • Rendering is expensive and brittle
  • Rendering is best used as a diagnostic tool

Consequences

  • Default path is lightweight fetch
  • Rendering is triggered by ambiguity, not preference
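
One way to express "triggered by ambiguity" is a fetch-first capture with a rendering fallback. The ambiguity heuristic below (script tags present but very little extracted text) and the 200-character threshold are assumptions, not Eko's actual rule. The fetch and render functions are injected and synchronous here for brevity; real I/O would be asynchronous.

```typescript
interface FetchResult {
  html: string;
  text: string; // text extracted from the raw HTML
}

// Hypothetical heuristic: a script-driven page that yields almost no text is ambiguous.
function needsRendering(result: FetchResult, minTextLength: number = 200): boolean {
  const hasScripts = /<script\b/i.test(result.html);
  return hasScripts && result.text.trim().length < minTextLength;
}

interface Capture {
  text: string;
  rendered: boolean;
}

// Default path is the lightweight fetch; rendering runs only when the result is ambiguous.
function capture(
  url: string,
  fetchPage: (url: string) => FetchResult,
  renderPage: (url: string) => string
): Capture {
  const fetched = fetchPage(url);
  if (!needsRendering(fetched)) {
    return { text: fetched.text, rendered: false };
  }
  return { text: renderPage(url), rendered: true };
}
```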

Status: Accepted


ADR‑006 — Non‑Substitutive Intelligence (V1 legacy, principle retained in V2)

Decision Eko summaries must never replace the source page.

Rationale

  • Legal and ethical constraints
  • User trust depends on transparency
  • The goal is awareness, not consumption

Consequences

  • Abstracted summaries only
  • No full‑text reproduction
  • Clear confidence signaling

Status: Accepted


ADR‑007 — Cost Scales With URLs, Not Users (V1 legacy)

Decision Infrastructure cost is driven by tracked URLs and cadence.

Rationale

  • URL monitoring is the unit of work
  • User count is orthogonal to system load
  • Prevents distorted pricing models

Consequences

  • Pricing plans map cleanly to cost drivers
  • High user counts do not destabilize infra
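
The cost driver can be written as simple arithmetic: daily checks are the sum of 24 divided by each URL's cadence in hours. The per-check cost and the field name cadenceHours are hypothetical placeholders; note that user count does not appear anywhere in the calculation.

```typescript
// Hypothetical load description: one entry per tracked URL.
interface UrlLoad {
  cadenceHours: number; // hours between checks
}

// Checks per day depend only on how many URLs are tracked and how often.
function dailyChecks(urls: UrlLoad[]): number {
  return urls.reduce((sum, u) => sum + 24 / u.cadenceHours, 0);
}

// costPerCheckUsd is an illustrative parameter, not a real Eko figure.
function dailyCostUsd(urls: UrlLoad[], costPerCheckUsd: number): number {
  return dailyChecks(urls) * costPerCheckUsd;
}
```

For example, one hourly URL plus two daily URLs gives 24 + 1 + 1 = 26 checks per day, whether one user or a thousand users follow them.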

Status: Accepted


ADR‑008 — No Predictive or Anticipatory Modeling (V1 legacy)

Decision Eko does not predict future changes in V1.

Rationale

  • Prediction introduces speculation
  • Requires broader context than URL scope allows
  • Reduces trust in early product stages

Consequences

  • Outputs remain observational
  • Future‑looking signals must be explicit and opt‑in

Status: Accepted



ADR‑009 — Global URL Library (vNext, partially superseded by V2)

Decision URLs are stored globally with a single canonical instance per URL, shared across all users.

Rationale

  • Per-user URL tracking duplicates checks for the same URL
  • Global library enables single check per URL (cost efficiency)
  • Brand-centric navigation requires URL-to-brand relationships
  • Trend analysis requires aggregating changes across users

Consequences

  • urls table holds exactly one row per canonical URL
  • Users link to global URLs via user_url_library
  • History gating controls visibility per user (free vs paid)
  • Workers check URLs globally, not per user
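
An in-memory sketch of the one-row-per-canonical-URL rule. The canonicalization rules (drop the fragment, trim trailing slashes) are assumptions, and the Map and Set stand in for the urls and user_url_library tables; the class and method names are hypothetical.

```typescript
// Hypothetical canonicalization; Eko's real rules are not specified in this document.
function canonicalize(rawUrl: string): string {
  const parsed = new URL(rawUrl); // parsing also lowercases scheme and host
  parsed.hash = "";
  if (parsed.pathname.length > 1) {
    parsed.pathname = parsed.pathname.replace(/\/+$/, "");
  }
  return parsed.toString();
}

class UrlLibrary {
  private urls: Map<string, number> = new Map(); // canonical URL -> id (stands in for urls)
  private links: Set<string> = new Set(); // "userId:urlId" (stands in for user_url_library)
  private nextId: number = 1;

  // Reuses the existing global row when the canonical URL is already known.
  addForUser(userId: string, rawUrl: string): number {
    const canonical = canonicalize(rawUrl);
    let id = this.urls.get(canonical);
    if (id === undefined) {
      id = this.nextId++;
      this.urls.set(canonical, id);
    }
    this.links.add(userId + ":" + id);
    return id;
  }

  // Workers iterate global URLs, so each URL is checked once regardless of follower count.
  urlsToCheck(): string[] {
    return Array.from(this.urls.keys());
  }

  linkCount(): number {
    return this.links.size;
  }
}
```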

Status: Accepted (partially superseded by V2)


ADR‑010 — Write‑Through Compatibility (vNext, partially superseded by V2)

Decision V1 tables remain the worker write surface during vNext migration. AFTER INSERT triggers mirror data to vNext tables.

Rationale

  • Workers continue unchanged during migration
  • No big-bang cutover required
  • Data flows automatically to vNext tables
  • Rollback is trivial (disable triggers)

Consequences

  • Both V1 and vNext tables contain data
  • vNext reads come from the new tables; workers keep writing to the V1 tables
  • Workers eventually migrate to writing vNext tables directly

Status: Accepted (partially superseded by V2)


ADR‑011 — History Gating by Subscription (vNext, partially superseded by V2)

Decision Free users see change history only from when they added a URL. Paid users see full history.

Rationale

  • Incentivizes upgrades with tangible value
  • Free users have not paid for history recorded before they added the URL
  • Simplifies billing: access is time-scoped, not feature-gated

Consequences

  • user_url_library.history_start_at controls visibility
  • NULL = full history (paid), timestamp = gated (free)
  • RLS and query layer both enforce gating
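
A query-layer sketch of the gating rule, mirroring the NULL-versus-timestamp semantics of history_start_at. The ChangeEvent shape and the detectedAt field are hypothetical, and treating the boundary as inclusive is an assumption. RLS enforcement would live in the database and is not shown.

```typescript
// Mirrors user_url_library.history_start_at: null = full history (paid), Date = gated (free).
interface LibraryEntry {
  historyStartAt: Date | null;
}

// Hypothetical change record.
interface ChangeEvent {
  id: string;
  detectedAt: Date;
}

// Returns only the changes this user is allowed to see for the URL.
function visibleHistory(entry: LibraryEntry, events: ChangeEvent[]): ChangeEvent[] {
  if (entry.historyStartAt === null) return events;
  const start = entry.historyStartAt.getTime();
  return events.filter((e) => e.detectedAt.getTime() >= start);
}
```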

Status: Accepted (partially superseded by V2)


ADR‑012 — URL Policy Enforcement (vNext, partially superseded by V2)

Decision All URLs are evaluated against a policy before entering the global library.

Rationale

  • Prevents abuse (adult content, malware, etc.)
  • Protects system from uncheckable URLs (auth walls)
  • Creates audit trail for blocked URLs

Consequences

  • url_policy_decisions logs all decisions
  • Blocked URLs never enter urls table
  • URLs that require review are queued for admin action
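
The admission flow can be sketched as evaluate, log, then route. The blocked host and path hints below are placeholder rules, not Eko's policy; the arrays and Set stand in for url_policy_decisions, the review queue, and the urls table.

```typescript
type PolicyOutcome = "allow" | "block" | "review";

interface PolicyDecision {
  url: string;
  outcome: PolicyOutcome;
  reason: string;
}

// Placeholder rules for illustration only.
const BLOCKED_HOSTS: Set<string> = new Set(["malware.example"]);
const REVIEW_PATH_HINTS: string[] = ["/login", "/signin"];

function evaluateUrl(rawUrl: string): PolicyDecision {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return { url: rawUrl, outcome: "block", reason: "unparseable URL" };
  }
  if (BLOCKED_HOSTS.has(parsed.hostname)) {
    return { url: rawUrl, outcome: "block", reason: "blocked host" };
  }
  if (REVIEW_PATH_HINTS.some((hint) => parsed.pathname.startsWith(hint))) {
    return { url: rawUrl, outcome: "review", reason: "possible auth wall" };
  }
  return { url: rawUrl, outcome: "allow", reason: "no rule matched" };
}

// Every decision is logged; only allowed URLs reach the library.
function submitUrl(
  rawUrl: string,
  library: Set<string>,
  decisionLog: PolicyDecision[],
  reviewQueue: string[]
): PolicyOutcome {
  const decision = evaluateUrl(rawUrl);
  decisionLog.push(decision);
  if (decision.outcome === "allow") library.add(rawUrl);
  else if (decision.outcome === "review") reviewQueue.push(rawUrl);
  return decision.outcome;
}
```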

Status: Accepted (partially superseded by V2)


ADR-013 — Fact-First Architecture (V2)

Decision Facts are the atomic unit of the V2 system. Everything — feed, cards, challenges, validation — derives from structured fact_records.

Rationale

  • URL-scoped execution (V1) does not apply to the news/knowledge domain
  • Facts are independently verifiable, quotable, and quizzable
  • Single atomic unit simplifies the entire downstream pipeline
  • Schema validation ensures consistency across all fact sources

Consequences

  • fact_records table is the central data structure
  • All pipelines (news, evergreen, seed) converge on fact output
  • UI components render from fact schema, not raw article content
  • V1 URL-monitoring ADRs (001-008) still apply to the legacy URL tracking domain

Status: Accepted


ADR-014 — Multi-Pipeline Convergence

Decision News ingestion, evergreen generation, and seed explosion all produce fact_records that flow through the same validation and challenge-generation pipeline.

Rationale

  • Single validation path regardless of source ensures consistent quality
  • Reduces code duplication across pipelines
  • Enables unified quality metrics and cost tracking
  • New fact sources only need to output fact_records to integrate

Consequences

  • worker-validate handles all fact validation regardless of origin
  • Challenge generation is decoupled from fact source
  • Quality metrics are comparable across pipelines
  • Adding a new fact source requires only a new ingestion handler
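
A sketch of convergence: each source contributes only an ingestion handler, and every record passes through the same validation step. The FactRecord fields, handler bodies, and validation rule are hypothetical; validateFact merely stands in for worker-validate.

```typescript
// Hypothetical fact shape; the real fact_records columns are not given in this document.
interface FactRecord {
  source: string;
  topicCategory: string;
  title: string;
}

// A source integrates by supplying one handler that outputs fact records.
type IngestionHandler = (input: string) => FactRecord[];

const handlers: Map<string, IngestionHandler> = new Map();
handlers.set("news", (article) => [
  { source: "news", topicCategory: "general", title: article },
]);
handlers.set("evergreen", (topic) => [
  { source: "evergreen", topicCategory: topic, title: "About " + topic },
]);
handlers.set("seed", (seeds) =>
  seeds.split(",").map((s) => ({ source: "seed", topicCategory: "general", title: s.trim() }))
);

// Single validation path for every source (stand-in for worker-validate).
function validateFact(record: FactRecord): boolean {
  return record.title.length > 0 && record.topicCategory.length > 0;
}

function ingest(source: string, input: string): FactRecord[] {
  const handler = handlers.get(source);
  if (!handler) throw new Error("unknown fact source: " + source);
  return handler(input).filter(validateFact);
}
```

Under these assumptions, adding a source is a single handlers.set call; validation and everything downstream stay untouched.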

Status: Accepted


ADR-015 — Schema-Driven Fact Extraction

Decision fact_record_schemas.fact_keys defines per-topic-category shape. AI extraction is constrained by schema, not freeform.

Rationale

  • Freeform extraction produces inconsistent output across models and prompts
  • Schema constraint enables reliable validation and UI rendering
  • Per-category schemas capture domain-specific fields (e.g., sports stats vs. science findings)
  • Zod schemas provide runtime validation at extraction time

Consequences

  • Each topic category has a defined fact key schema
  • AI prompts include schema constraints via generateObject
  • Invalid extractions fail fast at the schema level
  • New topic categories require schema definition before extraction
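
A dependency-free sketch of schema-constrained extraction. In Eko this role is played by Zod schemas passed to generateObject; the hand-rolled validator below only illustrates the fail-fast behavior and is not the actual implementation. The category names and fact keys are hypothetical.

```typescript
type FieldType = "string" | "number";
type FactKeySchema = Record<string, FieldType>;

// Hypothetical per-category fact keys.
const SCHEMAS: Map<string, FactKeySchema> = new Map([
  ["sports", { team: "string", score: "number" } as FactKeySchema],
  ["science", { finding: "string", journal: "string" } as FactKeySchema],
]);

// Returns a list of problems; an empty list means the extraction matches the schema.
function validateExtraction(category: string, candidate: Record<string, unknown>): string[] {
  const schema = SCHEMAS.get(category);
  if (schema === undefined) {
    return ['no schema defined for category "' + category + '"'];
  }
  const errors: string[] = [];
  for (const key of Object.keys(schema)) {
    if (typeof candidate[key] !== schema[key]) {
      errors.push(key + ": expected " + schema[key]);
    }
  }
  for (const key of Object.keys(candidate)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      errors.push(key + ": unexpected key");
    }
  }
  return errors;
}
```

An unknown category is rejected before any field is inspected, which matches the rule that a schema must exist before extraction can run for a new category.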

Status: Accepted


ADR-016 — Cost-Bounded AI with Model Routing

Decision All AI calls go through a model router with tier selection (default/mid/high), daily spend caps, and per-call cost tracking.

Rationale

  • Unbounded AI spend is an existential risk for the product
  • Different tasks have different quality requirements (validation needs higher quality than image captioning)
  • Per-call cost tracking enables optimization and budgeting
  • Model routing allows switching providers without changing calling code

Consequences

  • ai_cost_log tracks every AI invocation with model, tokens, and cost
  • ANTHROPIC_DAILY_SPEND_CAP_USD enforces hard daily limits
  • Model router selects provider/model based on task tier
  • Escalation to expensive models (Opus) requires explicit opt-in and has separate caps
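
A minimal sketch of tier routing with a hard daily cap and per-call logging. The model names and prices are placeholders, the class and method names are hypothetical, and the daily reset and the separate escalation cap are deliberately left out.

```typescript
type Tier = "default" | "mid" | "high";

// Placeholder model ids and prices; not real provider pricing.
const MODEL_BY_TIER: Record<Tier, { model: string; usdPer1kTokens: number }> = {
  default: { model: "model-small", usdPer1kTokens: 0.125 },
  mid: { model: "model-medium", usdPer1kTokens: 0.25 },
  high: { model: "model-large", usdPer1kTokens: 0.5 },
};

// One entry per AI invocation (stands in for a row in ai_cost_log).
interface CostLogEntry {
  model: string;
  tokens: number;
  costUsd: number;
}

class ModelRouter {
  private spentTodayUsd: number = 0;
  private dailyCapUsd: number;
  costLog: CostLogEntry[] = [];

  constructor(dailyCapUsd: number) {
    this.dailyCapUsd = dailyCapUsd; // e.g. read from ANTHROPIC_DAILY_SPEND_CAP_USD
  }

  // Picks the model for the tier, refuses the call if it would exceed the cap, logs the cost.
  route(tier: Tier, estimatedTokens: number): CostLogEntry {
    const { model, usdPer1kTokens } = MODEL_BY_TIER[tier];
    const costUsd = (estimatedTokens / 1000) * usdPer1kTokens;
    if (this.spentTodayUsd + costUsd > this.dailyCapUsd) {
      throw new Error("daily AI spend cap reached");
    }
    this.spentTodayUsd += costUsd;
    const entry: CostLogEntry = { model, tokens: estimatedTokens, costUsd };
    this.costLog.push(entry);
    return entry;
  }
}
```

Callers name a tier rather than a model, so swapping providers only changes the MODEL_BY_TIER table.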

Status: Accepted


ADR-017 — Challenge Answer Isolation

Decision Challenge URLs, shared links, and metadata must never reveal answer content.

Rationale

  • Spoiled answers destroy learning value and user trust
  • Shared links and social media previews must be safe to distribute
  • Challenge metadata (OG tags, slugs, API responses) must not leak answers

Consequences

  • An anti-spoiler image is resolved whenever the fact image would reveal the answer
  • Challenge content API responses omit answer data until the user submits
  • URL slugs derived from fact titles, not answer content
  • OG/social card metadata uses question text only
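
A sketch of the public payload builder, assuming a challenge carries a fact title, a question, and an answer. All field and function names are hypothetical. The answer is added only after submission; the slug and social metadata never read the answer field.

```typescript
// Hypothetical challenge shape.
interface Challenge {
  id: string;
  factTitle: string;
  question: string;
  answer: string;
}

interface PublicChallenge {
  id: string;
  slug: string;
  question: string;
  ogDescription: string;
  answer?: string;
}

// Slug is derived from the fact title only.
function slugFromTitle(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// The answer field is attached only once the user has submitted.
function toPublicPayload(challenge: Challenge, hasSubmitted: boolean): PublicChallenge {
  const payload: PublicChallenge = {
    id: challenge.id,
    slug: slugFromTitle(challenge.factTitle),
    question: challenge.question,
    ogDescription: challenge.question, // social cards use question text only
  };
  if (hasSubmitted) {
    payload.answer = challenge.answer;
  }
  return payload;
}
```

This only guarantees isolation at the field level; a fact title or question that itself contains the answer would still need editorial or automated checks.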

Status: Accepted


Revision Policy

  • New decisions should be appended, not rewritten
  • Superseded decisions must be marked with status changes
  • This document favors clarity over completeness

Summary

These decisions define Eko's architecture:

V2 Fact Engine (Active):

  • Fact-first atomic unit (ADR-013)
  • Multi-pipeline convergence (ADR-014)
  • Schema-driven extraction (ADR-015)
  • Cost-bounded AI with model routing (ADR-016)
  • Challenge answer isolation (ADR-017)

vNext (Historical — partially superseded by V2):

  • Global URL library (ADR-009)
  • Write-through compatibility (ADR-010)
  • History gating (ADR-011)
  • Policy enforcement (ADR-012)

V1 (Historical — deprecated URL tracking):

  • URL-scoped execution (ADR-001)
  • Change-first output (ADR-002)
  • Scheduled execution (ADR-003)
  • Snapshot comparison (ADR-004)
  • Conditional rendering (ADR-005)
  • Non-substitutive intelligence (ADR-006, principle retained in V2)
  • Cost scales with URLs (ADR-007)
  • No predictive modeling (ADR-008)

If a proposal violates an ADR, it requires an explicit revision — not a workaround.