Architecture Decisions (ADR-Lite)

Purpose: Capture the why behind Eko's foundational architectural choices.
Audience: Founders, engineers, AI agents
Scope: V1, vNext, and V2 fact-engine architecture decisions.
Relationship: Complements STACK.md (how) and Project Instructions (output behavior).

ADR‑001 — URL‑Scoped Execution (V1 legacy)

Decision: Each system execution is scoped to exactly one URL.

Rationale

Users intentionally select specific URLs for signal value
Site‑wide crawling introduces noise, cost, and legal risk
URL scope enforces precision and interpretability

Consequences

No cross‑URL inference at the system layer
No implicit site context
Summaries must stand on a single‑page delta

Status: Accepted

ADR‑002 — Change‑First, Not Content‑First (V1 legacy)

Decision: Eko detects change before invoking intelligence.

Rationale

Most URLs are static most of the time
AI summarization without delta is wasteful and misleading
Users care about what changed, not restated content

Consequences

AI runs only after meaningful change
Static pages produce no output
Cost scales with change frequency, not page size

Status: Accepted
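
A minimal sketch of the change-first gate in ADR-002, assuming change is detected by comparing fingerprints of whitespace-normalized text. The names normalize, fingerprint, detectChange, and runCheck are hypothetical, and the FNV-1a hash is only a dependency-free stand-in for whatever digest the real system uses.

```typescript
// Collapse whitespace so purely cosmetic edits do not count as change (assumed rule).
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// FNV-1a (32-bit): dependency-free stand-in for a real digest such as SHA-256.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface CheckResult {
  changed: boolean;
  fingerprint: string;
}

// The first snapshot is a baseline: with no prior state there is no delta to report.
function detectChange(previous: string | null, fetchedText: string): CheckResult {
  const current = fingerprint(normalize(fetchedText));
  return { changed: previous !== null && previous !== current, fingerprint: current };
}

// The expensive summarizer runs only when a change was detected.
function runCheck(
  previous: string | null,
  fetchedText: string,
  summarize: (text: string) => string,
): string | null {
  return detectChange(previous, fetchedText).changed ? summarize(fetchedText) : null;
}
```

Because the summarizer is passed in and called only on change, a static page costs one fetch and one hash per check, which is the property the consequences above rely on.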

ADR‑003 — Logically Always‑On, Physically Scheduled (V1 legacy)

Decision: Eko is logically always‑on but physically executed on schedules.

Rationale

Continuous rendering is cost‑prohibitive at scale
Scheduled checks provide near‑real‑time perception
Cadence control enables predictable economics

Consequences

No persistent browser sessions
Cadence selection is product‑level, not infra‑level
“Real‑time” is an illusion created by frequency, not persistence

Status: Accepted
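
As an illustration of ADR-003, the sketch below shows how a periodic scheduler tick might pick the URLs that are due. The TrackedUrl shape and the isDue and selectDue helpers are assumptions for illustration, not the actual worker code.

```typescript
interface TrackedUrl {
  url: string;
  cadenceMinutes: number;       // product-level cadence setting
  lastCheckedAt: number | null; // epoch milliseconds; null means never checked
}

function isDue(tracked: TrackedUrl, now: number): boolean {
  if (tracked.lastCheckedAt === null) return true;
  return now - tracked.lastCheckedAt >= tracked.cadenceMinutes * 60000;
}

// Called on each scheduler tick; nothing persists between ticks except timestamps.
function selectDue(all: TrackedUrl[], now: number): TrackedUrl[] {
  return all.filter((t) => isDue(t, now));
}
```

Shortening cadenceMinutes is what makes the product feel more "real-time"; no session is held open between checks.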

ADR‑004 — Snapshot Comparison Over Live Observation (V1 legacy)

Decision: Eko compares snapshots rather than observing pages live.

Rationale

Snapshot diffs are auditable and reproducible
Live observation complicates attribution of change
Snapshots enable historical reasoning

Consequences

Each change is traceable to two concrete states
Historical deltas can be re‑evaluated if logic improves

Status: Accepted
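
A minimal sketch of the snapshot comparison in ADR-004, assuming snapshots are stored as arrays of text lines. The Snapshot and Delta shapes are hypothetical, and the set-based comparison is a deliberately simple stand-in for a real diff algorithm: it ignores reordering and duplicate lines.

```typescript
interface Snapshot {
  id: string;
  capturedAt: string; // ISO timestamp
  lines: string[];
}

interface Delta {
  fromSnapshotId: string;
  toSnapshotId: string;
  added: string[];
  removed: string[];
}

// Every delta records the two concrete states it was computed from.
function diffSnapshots(before: Snapshot, after: Snapshot): Delta {
  const beforeLines = new Set(before.lines);
  const afterLines = new Set(after.lines);
  return {
    fromSnapshotId: before.id,
    toSnapshotId: after.id,
    added: after.lines.filter((line) => !beforeLines.has(line)),
    removed: before.lines.filter((line) => !afterLines.has(line)),
  };
}
```

Because both snapshot IDs travel with the delta, the same pair can be diffed again later if the comparison logic improves.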

ADR‑005 — Conditional Rendering Only When Necessary (V1 legacy)

Decision: Headless rendering and screenshots are used only when required.

Rationale

Most meaningful change is text‑detectable
Rendering is expensive and brittle
Rendering is best used as a diagnostic tool

Consequences

Default path is lightweight fetch
Rendering is triggered by ambiguity, not preference

Status: Accepted
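
One way to picture the escalation path in ADR-005 is sketched below. The ambiguity heuristic (very little extracted text) and the 200-character threshold are assumptions made for illustration; the ADR does not specify how ambiguity is detected. The fetchers are injected and synchronous here for brevity, whereas real ones would be asynchronous.

```typescript
// Hypothetical heuristic: a near-empty extraction suggests the page builds its
// content client-side, so the lightweight result is inconclusive.
function isAmbiguous(extractedText: string, minChars: number = 200): boolean {
  return extractedText.trim().length < minChars;
}

interface Capture {
  text: string;
  rendered: boolean;
}

// Default path is the lightweight fetch; rendering is the fallback, not a preference.
function capturePage(
  url: string,
  lightweightFetch: (url: string) => string,
  headlessRender: (url: string) => string,
): Capture {
  const text = lightweightFetch(url);
  if (!isAmbiguous(text)) return { text, rendered: false };
  return { text: headlessRender(url), rendered: true };
}
```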

ADR‑006 — Non‑Substitutive Intelligence (V1 legacy, principle retained in V2)

Decision: Eko summaries must never replace the source page.

Rationale

Legal and ethical constraints
User trust depends on transparency
The goal is awareness, not consumption

Consequences

Abstracted summaries only
No full‑text reproduction
Clear confidence signaling

Status: Accepted

ADR‑007 — Cost Scales With URLs, Not Users (V1 legacy)

Decision: Infrastructure cost is driven by tracked URLs and cadence.

Rationale

URL monitoring is the unit of work
User count is orthogonal to system load
Prevents distorted pricing models

Consequences

Pricing plans map cleanly to cost drivers
High user counts do not destabilize infra

Status: Accepted
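
The cost driver in ADR-007 can be written down as a small calculation. This is a sketch under the assumption that every check costs roughly the same; MonitoredUrl, checksPerDay, and dailyCostUsd are hypothetical names, and the per-check cost is an input rather than a known figure.

```typescript
interface MonitoredUrl {
  url: string;
  cadenceMinutes: number;
}

const MINUTES_PER_DAY = 1440;

function checksPerDay(urls: MonitoredUrl[]): number {
  return urls.reduce((sum, u) => sum + MINUTES_PER_DAY / u.cadenceMinutes, 0);
}

// User count is deliberately absent from the signature.
function dailyCostUsd(urls: MonitoredUrl[], costPerCheckUsd: number): number {
  return checksPerDay(urls) * costPerCheckUsd;
}
```

An hourly URL contributes 24 checks per day and a daily URL contributes 1, regardless of how many users follow either one.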

ADR‑008 — No Predictive or Anticipatory Modeling (V1 legacy)

Decision: Eko does not predict future changes in V1.

Rationale

Prediction introduces speculation
Requires broader context than URL scope allows
Reduces trust in early product stages

Consequences

Outputs remain observational
Future‑looking signals must be explicit and opt‑in

Status: Accepted

ADR‑009 — Global URL Library (vNext, partially superseded by V2)

Decision: URLs are stored globally with a single canonical instance per URL, shared across all users.

Rationale

Per-user URL tracking duplicates checks for the same URL
Global library enables single check per URL (cost efficiency)
Brand-centric navigation requires URL-to-brand relationships
Trend analysis requires aggregating changes across users

Consequences

urls table holds exactly one row per canonical URL
Users link to global URLs via user_url_library
History gating controls visibility per user (free vs paid)
Workers check URLs globally, not per user

Status: Accepted (partially superseded by V2)
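
The one-row-per-canonical-URL rule in ADR-009 can be sketched in memory as follows. The canonicalization rules shown (lowercase scheme and host, drop the fragment, drop a bare trailing slash) are assumptions, since the ADR does not define them, and the Library structure only loosely mirrors the urls and user_url_library tables.

```typescript
interface UrlRow {
  id: number;
  canonicalUrl: string;
}

interface UserUrlLink {
  userId: string;
  urlId: number;
}

interface Library {
  urls: Map<string, UrlRow>; // keyed by canonical URL: at most one row each
  links: UserUrlLink[];      // stands in for user_url_library
}

// Simplified canonicalization under the assumed rules described above.
function canonicalize(rawUrl: string): string {
  const trimmed = rawUrl.trim();
  const match = /^([a-z][a-z0-9+.-]*:\/\/)([^\/?#]*)(.*)$/i.exec(trimmed);
  if (match === null) return trimmed;
  const rest = match[3].split("#")[0];
  return match[1].toLowerCase() + match[2].toLowerCase() + (rest === "/" ? "" : rest);
}

function addUrlForUser(lib: Library, userId: string, rawUrl: string): UrlRow {
  const key = canonicalize(rawUrl);
  let row = lib.urls.get(key);
  if (row === undefined) {
    row = { id: lib.urls.size + 1, canonicalUrl: key };
    lib.urls.set(key, row);
  }
  const urlId = row.id;
  if (!lib.links.some((l) => l.userId === userId && l.urlId === urlId)) {
    lib.links.push({ userId, urlId });
  }
  return row;
}
```

Two users who add differently spelled variants of the same page end up linked to one row, so a worker checks that page once.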

ADR‑010 — Write‑Through Compatibility (vNext, partially superseded by V2)

Decision: V1 tables remain the worker write surface during the vNext migration. AFTER INSERT triggers mirror data to vNext tables.

Rationale

Workers continue unchanged during migration
No big-bang cutover required
Data flows automatically to vNext tables
Rollback is trivial (disable triggers)

Consequences

Both V1 and vNext tables contain data
vNext reads from new tables, V1 writes to old tables
Eventually workers migrate to vNext tables directly

Status: Accepted (partially superseded by V2)

ADR‑011 — History Gating by Subscription (vNext, partially superseded by V2)

Decision: Free users see change history only from when they added a URL. Paid users see full history.

Rationale

Incentivizes upgrades with tangible value
Treats history from before a free user added the URL as paid value, not a default entitlement
Simplifies billing: access is time-scoped, not feature-gated

Consequences

user_url_library.history_start_at controls visibility
NULL = full history (paid), timestamp = gated (free)
RLS and query layer both enforce gating

Status: Accepted (partially superseded by V2)
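
The gating rule in ADR-011 reduces to a single filter at the query layer. The sketch below assumes ISO timestamps and treats the boundary as inclusive, which the ADR does not specify; LibraryEntry, ChangeEvent, and visibleHistory are hypothetical names.

```typescript
interface LibraryEntry {
  userId: string;
  urlId: number;
  historyStartAt: string | null; // null = full history (paid); timestamp = gated (free)
}

interface ChangeEvent {
  urlId: number;
  detectedAt: string; // ISO timestamp
}

function visibleHistory(entry: LibraryEntry, events: ChangeEvent[]): ChangeEvent[] {
  const start = entry.historyStartAt === null ? null : Date.parse(entry.historyStartAt);
  return events.filter(
    (e) => e.urlId === entry.urlId && (start === null || Date.parse(e.detectedAt) >= start),
  );
}
```

Upgrading a user then amounts to setting historyStartAt to null, with no change to the stored events.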

ADR‑012 — URL Policy Enforcement (vNext, partially superseded by V2)

Decision: All URLs are evaluated against a policy before entering the global library.

Rationale

Prevents abuse (adult content, malware, etc.)
Protects system from uncheckable URLs (auth walls)
Creates audit trail for blocked URLs

Consequences

url_policy_decisions logs all decisions
Blocked URLs never enter urls table
URLs requiring review are queued for admin action

Status: Accepted (partially superseded by V2)
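
A rough sketch of the admission flow in ADR-012. The category labels, the "unknown" review rule, and the function names are assumptions for illustration; only the three outcomes and the log-everything behaviour come from the ADR.

```typescript
type PolicyOutcome = "allowed" | "blocked" | "needs_review";

interface UrlCandidate {
  url: string;
  category: string;      // e.g. output of a hypothetical content classifier
  requiresAuth: boolean; // behind an auth wall, so it cannot be checked
}

interface PolicyDecision {
  url: string;
  outcome: PolicyOutcome;
  reason: string;
}

const BLOCKED_CATEGORIES = new Set(["adult", "malware"]); // the ADR's examples
const REVIEW_CATEGORIES = new Set(["unknown"]);           // assumed rule

function evaluate(c: UrlCandidate): PolicyDecision {
  if (BLOCKED_CATEGORIES.has(c.category)) {
    return { url: c.url, outcome: "blocked", reason: "category:" + c.category };
  }
  if (c.requiresAuth) return { url: c.url, outcome: "blocked", reason: "auth_wall" };
  if (REVIEW_CATEGORIES.has(c.category)) {
    return { url: c.url, outcome: "needs_review", reason: "category:" + c.category };
  }
  return { url: c.url, outcome: "allowed", reason: "ok" };
}

// Every decision is logged; only allowed URLs reach the library.
function submit(c: UrlCandidate, decisionLog: PolicyDecision[], urls: string[]): PolicyOutcome {
  const decision = evaluate(c);
  decisionLog.push(decision);
  if (decision.outcome === "allowed") urls.push(c.url);
  return decision.outcome;
}
```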

ADR-013 — Fact-First Architecture (V2)

Decision: Facts are the atomic unit of the V2 system. Everything (feed, cards, challenges, validation) derives from structured fact_records.

Rationale

URL-scoped execution (V1) doesn't apply to the news/knowledge domain
Facts are independently verifiable, quotable, and quizzable
Single atomic unit simplifies the entire downstream pipeline
Schema validation ensures consistency across all fact sources

Consequences

fact_records table is the central data structure
All pipelines (news, evergreen, seed) converge on fact output
UI components render from fact schema, not raw article content
V1 URL-monitoring ADRs (001-008) still apply to the legacy URL tracking domain

Status: Accepted

ADR-014 — Multi-Pipeline Convergence

Decision: News ingestion, evergreen generation, and seed explosion all produce fact_records that flow through the same validation and challenge-generation pipeline.

Rationale

Single validation path regardless of source ensures consistent quality
Reduces code duplication across pipelines
Enables unified quality metrics and cost tracking
New fact sources only need to output fact_records to integrate

Consequences

worker-validate handles all fact validation regardless of origin
Challenge generation is decoupled from fact source
Quality metrics are comparable across pipelines
Adding a new fact source requires only a new ingestion handler

Status: Accepted
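
The convergence in ADR-014 might look like the sketch below. The FactRecord fields and the validation rule are placeholders, not the real fact_records schema; the point is only that every source passes through one validateFact function, which here stands in for worker-validate.

```typescript
type FactSource = "news" | "evergreen" | "seed";

// Hypothetical minimal shape; the real fact_records table has more fields.
interface FactRecord {
  source: FactSource;
  topicCategory: string;
  claim: string;
}

// A pipeline's only obligation is to emit fact records.
type IngestionHandler = () => FactRecord[];

// Single validation path shared by every source.
function validateFact(fact: FactRecord): boolean {
  return fact.claim.trim().length > 0 && fact.topicCategory.trim().length > 0;
}

function runPipelines(handlers: IngestionHandler[]): { accepted: FactRecord[]; rejected: FactRecord[] } {
  const accepted: FactRecord[] = [];
  const rejected: FactRecord[] = [];
  for (const handler of handlers) {
    for (const fact of handler()) {
      (validateFact(fact) ? accepted : rejected).push(fact);
    }
  }
  return { accepted, rejected };
}
```

Adding a fourth source under this shape means writing one more IngestionHandler; validation and everything downstream stay untouched.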

ADR-015 — Schema-Driven Fact Extraction

Decision: fact_record_schemas.fact_keys defines per-topic-category shape. AI extraction is constrained by schema, not freeform.

Rationale

Freeform extraction produces inconsistent output across models and prompts
Schema constraint enables reliable validation and UI rendering
Per-category schemas capture domain-specific fields (e.g., sports stats vs. science findings)
Zod schemas provide runtime validation at extraction time

Consequences

Each topic category has a defined fact key schema
AI prompts include schema constraints via generateObject
Invalid extractions fail fast at the schema level
New topic categories require schema definition before extraction

Status: Accepted
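
To illustrate the fail-fast behaviour in ADR-015 without pulling in dependencies, the sketch below uses a hand-rolled validator in place of Zod and generateObject. The category names and keys are invented examples; the real shapes live in fact_record_schemas.fact_keys.

```typescript
type FieldType = "string" | "number";
type FactKeys = { [key: string]: FieldType };

// Hypothetical per-category schemas, for illustration only.
const schemas = new Map<string, FactKeys>([
  ["sports", { team: "string", score: "number" }],
  ["science", { finding: "string", journal: "string" }],
]);

// Throws on the first unknown category, missing key, wrong type, or extra key.
function validateExtraction(category: string, extracted: { [key: string]: unknown }): void {
  const schema = schemas.get(category);
  if (schema === undefined) throw new Error("no schema for category: " + category);
  for (const key of Object.keys(schema)) {
    if (typeof extracted[key] !== schema[key]) {
      throw new Error(key + " must be a " + schema[key]);
    }
  }
  for (const key of Object.keys(extracted)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      throw new Error("unexpected key: " + key);
    }
  }
}
```

A category with no entry in the map is rejected outright, which matches the rule that a schema must exist before extraction can run for a new topic category.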

ADR-016 — Cost-Bounded AI with Model Routing

Decision: All AI calls go through a model router with tier selection (default/mid/high), daily spend caps, and per-call cost tracking.

Rationale

Unbounded AI spend is an existential risk for the product
Different tasks have different quality requirements (validation needs higher quality than image captioning)
Per-call cost tracking enables optimization and budgeting
Model routing allows switching providers without changing calling code

Consequences

ai_cost_log tracks every AI invocation with model, tokens, and cost
ANTHROPIC_DAILY_SPEND_CAP_USD enforces hard daily limits
Model router selects provider/model based on task tier
Escalation to expensive models (Opus) requires explicit opt-in and has separate caps

Status: Accepted
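
A minimal sketch of the routing and cap logic in ADR-016. Model names and prices are placeholders chosen so the arithmetic is easy to follow, not real provider figures, and the separate escalation cap mentioned above is omitted for brevity. RouterState and recordCall are hypothetical names; costLog stands in for ai_cost_log.

```typescript
type Tier = "default" | "mid" | "high";

interface ModelConfig {
  model: string;
  usdPer1kTokens: number;
}

interface CostLogEntry {
  model: string;
  tokens: number;
  costUsd: number;
}

interface RouterState {
  dailyCapUsd: number;
  spentTodayUsd: number;
  costLog: CostLogEntry[];
}

// Placeholder models and prices, for illustration only.
const MODELS: Record<Tier, ModelConfig> = {
  default: { model: "small-model", usdPer1kTokens: 1 },
  mid: { model: "medium-model", usdPer1kTokens: 4 },
  high: { model: "large-model", usdPer1kTokens: 16 },
};

// Authorizes and logs one call; throws rather than exceeding the daily cap.
function recordCall(
  state: RouterState,
  tier: Tier,
  tokens: number,
  allowEscalation: boolean = false,
): CostLogEntry {
  if (tier === "high" && !allowEscalation) {
    throw new Error("high tier requires explicit opt-in");
  }
  const config = MODELS[tier];
  const costUsd = (tokens / 1000) * config.usdPer1kTokens;
  if (state.spentTodayUsd + costUsd > state.dailyCapUsd) {
    throw new Error("daily spend cap reached");
  }
  const entry: CostLogEntry = { model: config.model, tokens, costUsd };
  state.spentTodayUsd += costUsd;
  state.costLog.push(entry);
  return entry;
}
```

Callers name a tier rather than a model, so swapping the provider behind a tier is a change to MODELS alone.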

ADR-017 — Challenge Answer Isolation

Decision: Challenge URLs, shared links, and metadata must never reveal answer content.

Rationale

Spoiled answers destroy learning value and user trust
Shared links and social media previews must be safe to distribute
Challenge metadata (OG tags, slugs, API responses) must not leak answers

Consequences

Image resolution falls back to an anti-spoiler image when the fact image would reveal the answer
Challenge content API responses omit answer data until user submits
URL slugs are derived from fact titles, not answer content
OG/social card metadata uses question text only

Status: Accepted
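
The isolation rules in ADR-017 can be sketched as a projection from the stored challenge to what is exposed publicly. The Challenge shape, the /challenge/ path, and the function names are hypothetical. The sketch also assumes fact titles themselves do not contain the answer, which the slug rule depends on.

```typescript
interface Challenge {
  id: string;
  factTitle: string;
  question: string;
  answer: string;
}

interface PublicChallenge {
  id: string;
  slug: string;
  question: string;
  answer?: string; // present only after the user has submitted
}

function slugify(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function toPublicChallenge(c: Challenge, hasSubmitted: boolean): PublicChallenge {
  const view: PublicChallenge = { id: c.id, slug: slugify(c.factTitle), question: c.question };
  if (hasSubmitted) view.answer = c.answer;
  return view;
}

// Social card metadata is built from the question and the title slug only.
function ogMetadata(c: Challenge): { title: string; url: string } {
  return { title: c.question, url: "/challenge/" + slugify(c.factTitle) };
}
```

Building the public view by copying allowed fields, rather than deleting forbidden ones, means a field added to Challenge later stays private unless someone exposes it deliberately.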

Revision Policy

New decisions should be appended, not rewritten
Superseded decisions must be marked with status changes
This document favors clarity over completeness

Summary

These decisions define Eko's architecture:

V2 Fact Engine (Active):

Fact-first atomic unit (ADR-013)
Multi-pipeline convergence (ADR-014)
Schema-driven extraction (ADR-015)
Cost-bounded AI with model routing (ADR-016)
Challenge answer isolation (ADR-017)

vNext (Historical — partially superseded by V2):

Global URL library (ADR-009)
Write-through compatibility (ADR-010)
History gating (ADR-011)
Policy enforcement (ADR-012)

V1 (Historical — deprecated URL tracking):

URL-scoped execution (ADR-001)
Change-first output (ADR-002)
Scheduled execution (ADR-003)
Snapshot comparison (ADR-004)
Conditional rendering (ADR-005)
Non-substitutive intelligence (ADR-006, principle retained in V2)
Cost scales with URLs (ADR-007)
No predictive modeling (ADR-008)

If a proposal violates an ADR, it requires an explicit revision — not a workaround.
