Architecture Decisions (ADR-Lite)
Purpose: Capture the why behind Eko's foundational architectural choices.
Audience: Founders, engineers, AI agents
Scope: V1, vNext, and V2 fact-engine architecture decisions.
Relationship: Complements STACK.md (how) and Project Instructions (output behavior).
ADR‑001 — URL‑Scoped Execution (V1 legacy)
Decision Each system execution is scoped to exactly one URL.
Rationale
- Users intentionally select specific URLs for signal value
- Site‑wide crawling introduces noise, cost, and legal risk
- URL scope enforces precision and interpretability
Consequences
- No cross‑URL inference at the system layer
- No implicit site context
- Summaries must stand on a single‑page delta
Status: Accepted
ADR‑002 — Change‑First, Not Content‑First (V1 legacy)
Decision Eko detects change before invoking intelligence.
Rationale
- Most URLs are static most of the time
- AI summarization without delta is wasteful and misleading
- Users care about what changed, not restated content
Consequences
- AI runs only after meaningful change
- Static pages produce no output
- Cost scales with change frequency, not page size
Status: Accepted
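A minimal sketch of the change-first gate, assuming snapshots are compared as whitespace-normalized text. The Snapshot shape, normalize, and processCheck are hypothetical names for illustration, not Eko's actual code, and the summarize callback stands in for whatever AI call is used.

```typescript
// Hypothetical snapshot shape; field names are assumptions.
interface Snapshot {
  url: string;
  fetchedAt: string; // ISO timestamp
  text: string; // extracted page text
}

// Collapse whitespace so trivial formatting noise is not treated as change.
// (What counts as "meaningful" change is an assumption of this sketch.)
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Invokes the (expensive) summarizer only when the two snapshots differ.
// Static pages return null, so they produce no output and no AI cost.
function processCheck(
  previous: Snapshot,
  current: Snapshot,
  summarize: (before: string, after: string) => string
): string | null {
  if (normalize(previous.text) === normalize(current.text)) {
    return null;
  }
  return summarize(previous.text, current.text);
}
```

Because the gate runs before the summarizer, the number of AI calls tracks how often a page changes rather than how large it is.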
ADR‑003 — Logically Always‑On, Physically Scheduled (V1 legacy)
Decision Eko is logically always‑on but physically executed on schedules.
Rationale
- Continuous rendering is cost‑prohibitive at scale
- Scheduled checks provide near‑real‑time perception
- Cadence control enables predictable economics
Consequences
- No persistent browser sessions
- Cadence selection is product‑level, not infra‑level
- “Real‑time” is an illusion created by frequency, not persistence
Status: Accepted
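One way to picture scheduled execution is a periodic job that selects the URLs whose cadence has elapsed. This is a sketch only: TrackedUrl, cadenceMinutes, and dueUrls are hypothetical names, and the real scheduler may work differently.

```typescript
// Hypothetical record for a tracked URL; field names are assumptions.
interface TrackedUrl {
  url: string;
  cadenceMinutes: number; // product-level cadence choice
  lastCheckedAt: Date | null; // null means never checked
}

// Returns the URLs that are due at `now`. No session is kept open between
// runs; "always-on" is approximated purely by how often this is called.
function dueUrls(urls: TrackedUrl[], now: Date): TrackedUrl[] {
  return urls.filter((u) => {
    if (u.lastCheckedAt === null) return true;
    const elapsedMs = now.getTime() - u.lastCheckedAt.getTime();
    return elapsedMs >= u.cadenceMinutes * 60000;
  });
}
```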
ADR‑004 — Snapshot Comparison Over Live Observation (V1 legacy)
Decision Eko compares snapshots rather than observing pages live.
Rationale
- Snapshot diffs are auditable and reproducible
- Live observation complicates attribution of change
- Snapshots enable historical reasoning
Consequences
- Each change is traceable to two concrete states
- Historical deltas can be re‑evaluated if logic improves
Status: Accepted
ADR‑005 — Conditional Rendering Only When Necessary (V1 legacy)
Decision Headless rendering and screenshots are used only when required.
Rationale
- Most meaningful change is text‑detectable
- Rendering is expensive and brittle
- Rendering is best used as a diagnostic tool
Consequences
- Default path is lightweight fetch
- Rendering is triggered by ambiguity, not preference
Status: Accepted
ADR‑006 — Non‑Substitutive Intelligence (V1 legacy, principle retained in V2)
Decision Eko summaries must never replace the source page.
Rationale
- Legal and ethical constraints
- User trust depends on transparency
- The goal is awareness, not consumption
Consequences
- Abstracted summaries only
- No full‑text reproduction
- Clear confidence signaling
Status: Accepted
ADR‑007 — Cost Scales With URLs, Not Users (V1 legacy)
Decision Infrastructure cost is driven by tracked URLs and cadence.
Rationale
- URL monitoring is the unit of work
- User count is orthogonal to system load
- Prevents distorted pricing models
Consequences
- Pricing plans map cleanly to cost drivers
- High user counts do not destabilize infra
Status: Accepted
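The cost driver can be written as simple arithmetic: checks per month = tracked URLs × checks per day × days. The sketch below uses made-up plan figures and a hypothetical monthlyChecks helper; note that user count does not appear anywhere in the calculation.

```typescript
// Hypothetical cadence bucket: how many URLs are checked how often.
interface CadenceBucket {
  urls: number;
  checksPerDay: number;
}

// Total scheduled checks per month across all cadence buckets.
function monthlyChecks(buckets: CadenceBucket[], daysPerMonth: number = 30): number {
  return buckets.reduce((sum, b) => sum + b.urls * b.checksPerDay * daysPerMonth, 0);
}

// Illustrative numbers only: 100 URLs checked hourly plus 1,000 checked daily.
const exampleLoad = monthlyChecks([
  { urls: 100, checksPerDay: 24 },
  { urls: 1000, checksPerDay: 1 },
]);
// (100 * 24 + 1000 * 1) * 30 = 102000 checks per month
```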
ADR‑008 — No Predictive or Anticipatory Modeling (V1 legacy)
Decision Eko does not predict future changes in V1.
Rationale
- Prediction introduces speculation
- Requires broader context than URL scope allows
- Reduces trust in early product stages
Consequences
- Outputs remain observational
- Future‑looking signals must be explicit and opt‑in
Status: Accepted
ADR‑009 — Global URL Library (vNext, partially superseded by V2)
Decision URLs are stored globally with a single canonical instance per URL, shared across all users.
Rationale
- Per-user URL tracking duplicates checks for the same URL
- Global library enables single check per URL (cost efficiency)
- Brand-centric navigation requires URL-to-brand relationships
- Trend analysis requires aggregating changes across users
Consequences
- urls table holds exactly one row per canonical URL
- Users link to global URLs via user_url_library
- History gating controls visibility per user (free vs paid)
- Workers check URLs globally, not per user
Status: Accepted
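The single-canonical-instance idea can be sketched in memory as follows. The canonicalization rules here (drop the fragment, strip trailing slashes) are assumptions for illustration; the ADR does not specify them. UrlLibrary stands in for the urls and user_url_library tables.

```typescript
// Assumed canonicalization: the URL parser lowercases the host; we also drop
// the fragment and any trailing slash. Real rules may differ.
function canonicalize(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.hash = "";
  if (url.pathname !== "/") {
    url.pathname = url.pathname.replace(/\/+$/, "");
  }
  return url.href;
}

// In-memory stand-in for the urls table plus user_url_library links.
class UrlLibrary {
  private readonly urls = new Map<string, number>(); // canonical URL -> id
  private readonly links = new Set<string>(); // "userId:urlId"
  private nextId = 1;

  // Links a user to the canonical row, creating the row only if it is new.
  add(userId: string, rawUrl: string): number {
    const canonical = canonicalize(rawUrl);
    let id = this.urls.get(canonical);
    if (id === undefined) {
      id = this.nextId++;
      this.urls.set(canonical, id);
    }
    this.links.add(userId + ":" + id);
    return id;
  }

  urlCount(): number {
    return this.urls.size;
  }
}
```

Two users adding variants of the same address end up linked to one row, so a worker checks that URL once regardless of how many users track it.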
ADR‑010 — Write‑Through Compatibility (vNext, partially superseded by V2)
Decision V1 tables remain the worker write surface during vNext migration. AFTER INSERT triggers mirror data to vNext tables.
Rationale
- Workers continue unchanged during migration
- No big-bang cutover required
- Data flows automatically to vNext tables
- Rollback is trivial (disable triggers)
Consequences
- Both V1 and vNext tables contain data
- vNext reads come from the new tables while workers keep writing to the V1 tables
- Workers will eventually write to vNext tables directly
Status: Accepted
ADR‑011 — History Gating by Subscription (vNext, partially superseded by V2)
Decision Free users see change history only from when they added a URL. Paid users see full history.
Rationale
- Incentivizes upgrades with tangible value
- Free users have not paid for history that predates their tracking of a URL
- Simplifies billing: access is time-scoped, not feature-gated
Consequences
- user_url_library.history_start_at controls visibility
- NULL = full history (paid), timestamp = gated (free)
- RLS and query layer both enforce gating
Status: Accepted
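The gating rule can be sketched as a query-layer filter. The shapes below are hypothetical mirrors of user_url_library rows and change events, and treating the boundary timestamp as inclusive is an assumption.

```typescript
// Hypothetical mirror of a user_url_library row.
interface LibraryLink {
  urlId: number;
  historyStartAt: Date | null; // null = full history (paid)
}

// Hypothetical change event for a global URL.
interface UrlChange {
  urlId: number;
  detectedAt: Date;
}

// Returns only the changes this link is allowed to see.
function visibleHistory(link: LibraryLink, changes: UrlChange[]): UrlChange[] {
  const start = link.historyStartAt;
  return changes.filter(
    (c) =>
      c.urlId === link.urlId &&
      (start === null || c.detectedAt.getTime() >= start.getTime())
  );
}
```

This covers only the query-layer half of the enforcement; the RLS half would live in the database and is not shown.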
ADR‑012 — URL Policy Enforcement (vNext, partially superseded by V2)
Decision All URLs are evaluated against a policy before entering the global library.
Rationale
- Prevents abuse (adult content, malware, etc.)
- Protects system from uncheckable URLs (auth walls)
- Creates audit trail for blocked URLs
Consequences
- url_policy_decisions logs all decisions
- Blocked URLs never enter urls table
- URLs requiring review queue for admin action
Status: Accepted
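A rough sketch of the policy gate, assuming a host-based blocklist and review list. The UrlPolicy shape and the host lists are invented for illustration; the actual policy criteria are not described in this document. The arrays stand in for url_policy_decisions and the global library.

```typescript
type PolicyOutcome = "allow" | "block" | "review";

interface PolicyDecision {
  url: string;
  outcome: PolicyOutcome;
  reason: string;
}

// Hypothetical policy shape: plain host lists.
interface UrlPolicy {
  blockedHosts: string[];
  reviewHosts: string[];
}

const policyDecisions: PolicyDecision[] = []; // stand-in for url_policy_decisions
const globalLibrary = new Set<string>(); // stand-in for the urls table

function evaluateUrl(rawUrl: string, policy: UrlPolicy): PolicyDecision {
  let host: string;
  try {
    host = new URL(rawUrl).hostname;
  } catch {
    return { url: rawUrl, outcome: "block", reason: "unparseable URL" };
  }
  if (policy.blockedHosts.indexOf(host) !== -1) {
    return { url: rawUrl, outcome: "block", reason: "host on blocklist" };
  }
  if (policy.reviewHosts.indexOf(host) !== -1) {
    return { url: rawUrl, outcome: "review", reason: "host needs admin review" };
  }
  return { url: rawUrl, outcome: "allow", reason: "no policy match" };
}

// Every decision is logged; only allowed URLs enter the library.
function submitUrl(rawUrl: string, policy: UrlPolicy): PolicyOutcome {
  const decision = evaluateUrl(rawUrl, policy);
  policyDecisions.push(decision);
  if (decision.outcome === "allow") {
    globalLibrary.add(rawUrl);
  }
  return decision.outcome;
}
```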
ADR-013 — Fact-First Architecture (V2)
Decision
Facts are the atomic unit of the V2 system. Everything — feed, cards, challenges, validation — derives from structured fact_records.
Rationale
- URL-scoped execution (V1) doesn't apply to news/knowledge domain
- Facts are independently verifiable, quotable, and quizzable
- Single atomic unit simplifies the entire downstream pipeline
- Schema validation ensures consistency across all fact sources
Consequences
- fact_records table is the central data structure
- All pipelines (news, evergreen, seed) converge on fact output
- UI components render from fact schema, not raw article content
- V1 URL-monitoring ADRs (001-008) still apply to the legacy URL tracking domain
Status: Accepted
ADR-014 — Multi-Pipeline Convergence
Decision
News ingestion, evergreen generation, and seed explosion all produce fact_records that flow through the same validation and challenge-generation pipeline.
Rationale
- Single validation path regardless of source ensures consistent quality
- Reduces code duplication across pipelines
- Enables unified quality metrics and cost tracking
- New fact sources only need to output fact_records to integrate
Consequences
- worker-validate handles all fact validation regardless of origin
- Challenge generation is decoupled from fact source
- Quality metrics are comparable across pipelines
- Adding a new fact source requires only a new ingestion handler
Status: Accepted
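Convergence can be sketched as source-specific handlers that all emit the same record type, followed by one shared validation step. The FactRecord fields, the handler names, and the validation rule below are illustrative assumptions, not the real fact_records schema or worker-validate logic.

```typescript
// Hypothetical, heavily simplified fact record.
interface FactRecord {
  source: "news" | "evergreen" | "seed";
  topic: string;
  claim: string;
}

// A fact source only has to map its own input to FactRecord[].
type IngestionHandler<T> = (input: T) => FactRecord[];

const fromNews: IngestionHandler<{ topic: string; headline: string }> = (article) => [
  { source: "news", topic: article.topic, claim: article.headline },
];

const fromSeed: IngestionHandler<{ topic: string; claims: string[] }> = (seed) =>
  seed.claims.map((claim): FactRecord => ({ source: "seed", topic: seed.topic, claim }));

// Single validation path, independent of where the fact came from.
function validateFact(fact: FactRecord): boolean {
  return fact.topic.trim().length > 0 && fact.claim.trim().length > 0;
}

function runPipeline<T>(handler: IngestionHandler<T>, input: T): FactRecord[] {
  return handler(input).filter(validateFact);
}
```

Adding another source in this sketch means writing one more IngestionHandler; validateFact and everything downstream stay untouched.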
ADR-015 — Schema-Driven Fact Extraction
Decision
fact_record_schemas.fact_keys defines per-topic-category shape. AI extraction is constrained by schema, not freeform.
Rationale
- Freeform extraction produces inconsistent output across models and prompts
- Schema constraint enables reliable validation and UI rendering
- Per-category schemas capture domain-specific fields (e.g., sports stats vs. science findings)
- Zod schemas provide runtime validation at extraction time
Consequences
- Each topic category has a defined fact key schema
- AI prompts include schema constraints via generateObject
- Invalid extractions fail fast at the schema level
- New topic categories require schema definition before extraction
Status: Accepted
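To illustrate the fail-fast behaviour without pulling in dependencies, the sketch below checks an extraction against a per-category key map in plain TypeScript. The ADR names Zod and generateObject for this role; the hand-rolled validator, the category names, and the keys shown here are assumptions standing in for them.

```typescript
type FieldType = "string" | "number";
type FactKeys = Record<string, FieldType>;

// Hypothetical stand-in for fact_record_schemas.fact_keys.
const factKeysByCategory: Record<string, FactKeys> = {
  sports: { team: "string", score: "number" },
  science: { finding: "string", year: "number" },
};

// Returns a list of problems; an empty list means the extraction is valid.
function validateExtraction(
  category: string,
  candidate: Record<string, unknown>
): string[] {
  const schema = factKeysByCategory[category];
  if (!schema) {
    return ["no schema defined for category " + category];
  }
  const errors: string[] = [];
  for (const key of Object.keys(schema)) {
    const expected = schema[key];
    if (typeof candidate[key] !== expected) {
      errors.push(key + ": expected " + expected);
    }
  }
  for (const key of Object.keys(candidate)) {
    if (!(key in schema)) {
      errors.push(key + ": not in schema");
    }
  }
  return errors;
}
```

An unknown category is rejected outright, which mirrors the consequence that a new topic category needs a schema before extraction can run.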
ADR-016 — Cost-Bounded AI with Model Routing
Decision All AI calls go through a model router with tier selection (default/mid/high), daily spend caps, and per-call cost tracking.
Rationale
- Unbounded AI spend is an existential risk for the product
- Different tasks have different quality requirements (validation needs higher quality than image captioning)
- Per-call cost tracking enables optimization and budgeting
- Model routing allows switching providers without changing calling code
Consequences
- ai_cost_log tracks every AI invocation with model, tokens, and cost
- ANTHROPIC_DAILY_SPEND_CAP_USD enforces hard daily limits
- Model router selects provider/model based on task tier
- Escalation to expensive models (Opus) requires explicit opt-in and has separate caps
Status: Accepted
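A simplified sketch of tier routing with a daily cap and per-call cost logging. The model names and prices are placeholders, a single blended per-token price is an assumption (real pricing usually separates input and output tokens), and the in-memory log covers one day only.

```typescript
type Tier = "default" | "mid" | "high";

interface ModelConfig {
  model: string; // placeholder model name
  usdPerMillionTokens: number; // assumed blended price
}

interface CostLogEntry {
  model: string;
  tier: Tier;
  tokens: number;
  costUsd: number;
}

class ModelRouter {
  private readonly models: Record<Tier, ModelConfig>;
  private readonly dailyCapUsd: number;
  readonly costLog: CostLogEntry[] = []; // stand-in for ai_cost_log (today only)

  constructor(models: Record<Tier, ModelConfig>, dailyCapUsd: number) {
    this.models = models;
    this.dailyCapUsd = dailyCapUsd;
  }

  spentToday(): number {
    return this.costLog.reduce((sum, entry) => sum + entry.costUsd, 0);
  }

  // Picks the model for a tier, refusing once the daily cap is reached.
  route(tier: Tier): ModelConfig {
    if (this.spentToday() >= this.dailyCapUsd) {
      throw new Error("daily spend cap reached");
    }
    return this.models[tier];
  }

  // Records the cost of a completed call.
  record(tier: Tier, tokens: number): CostLogEntry {
    const config = this.models[tier];
    const costUsd = (tokens / 1000000) * config.usdPerMillionTokens;
    const entry: CostLogEntry = { model: config.model, tier, tokens, costUsd };
    this.costLog.push(entry);
    return entry;
  }
}
```

Callers ask for a tier rather than a model, so swapping the model behind a tier is a configuration change and does not touch calling code.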
ADR-017 — Challenge Answer Isolation
Decision Challenge URLs, shared links, and metadata must never reveal answer content.
Rationale
- Spoiled answers destroy learning value and user trust
- Shared links and social media previews must be safe to distribute
- Challenge metadata (OG tags, slugs, API responses) must not leak answers
Consequences
- Anti-spoiler image resolution when the fact image would reveal the answer
- Challenge content API responses omit answer data until user submits
- URL slugs derived from fact titles, not answer content
- OG/social card metadata uses question text only
Status: Accepted
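Answer isolation can be sketched as a serializer that builds the public payload from question-side fields only and attaches the answer after submission. The Challenge shape, slugify, and toPublic are hypothetical names, and the sketch assumes the fact title itself does not contain the answer.

```typescript
// Hypothetical internal challenge record.
interface Challenge {
  id: string;
  factTitle: string;
  question: string;
  answer: string;
}

// What leaves the server: slug, question, and OG title never use the answer.
interface PublicChallenge {
  id: string;
  slug: string;
  question: string;
  ogTitle: string;
  answer?: string; // present only after the user has submitted
}

function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function toPublic(challenge: Challenge, hasSubmitted: boolean): PublicChallenge {
  const base: PublicChallenge = {
    id: challenge.id,
    slug: slugify(challenge.factTitle), // derived from the fact title
    question: challenge.question,
    ogTitle: challenge.question, // social card uses question text only
  };
  return hasSubmitted ? { ...base, answer: challenge.answer } : base;
}
```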
Revision Policy
- New decisions should be appended, not rewritten
- Superseded decisions must be marked with status changes
- This document favors clarity over completeness
Summary
These decisions define Eko's architecture:
V2 Fact Engine (Active):
- Fact-first atomic unit (ADR-013)
- Multi-pipeline convergence (ADR-014)
- Schema-driven extraction (ADR-015)
- Cost-bounded AI with model routing (ADR-016)
- Challenge answer isolation (ADR-017)
vNext (Historical — partially superseded by V2):
- Global URL library (ADR-009)
- Write-through compatibility (ADR-010)
- History gating (ADR-011)
- Policy enforcement (ADR-012)
V1 (Historical — deprecated URL tracking):
- URL-scoped execution (ADR-001)
- Change-first output (ADR-002)
- Scheduled execution (ADR-003)
- Snapshot comparison (ADR-004)
- Conditional rendering (ADR-005)
- Non-substitutive intelligence (ADR-006, principle retained in V2)
- Cost scales with URLs (ADR-007)
- No predictive modeling (ADR-008)
If a proposal violates an ADR, it requires an explicit revision — not a workaround.