Eko Glossary

This glossary defines the core terms used throughout Eko’s product, documentation, and codebase.

These definitions are authoritative. If a term is used inconsistently, this document takes precedence.


URL

A URL is a single, exact web address that Eko monitors as an independent signal source.

  • Each URL is treated in isolation
  • Eko does not infer meaning from parent or child pages
  • Monitoring a site requires adding each relevant URL separately

Check

A check is a scheduled evaluation of a URL at a specific point in time.

During a check, Eko:

  • Fetches the page
  • Normalizes its content
  • Compares it to the previous known state

Checks are system-controlled and cannot be triggered manually by users.


Cadence

Cadence is the frequency at which a URL is checked.

Examples:

  • Daily
  • Weekly

Cadence options are predefined by Eko and chosen to balance signal quality, cost, and fairness.


Render

A render is Eko’s stored representation of a URL at the time of a check.

Renders:

  • Are normalized versions of the page
  • Exclude irrelevant or unstable markup
  • Are retained for historical comparison

Renders are not copies of the original page and are not user-facing.


Normalization

Normalization is the process of transforming raw page content into a stable, comparable structure.

Normalization removes or stabilizes:

  • Dynamic IDs
  • Volatile markup
  • Cosmetic layout differences

The goal is to detect meaningful change, not superficial variation.
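
As a rough illustration of the idea, here is a minimal normalization sketch. The function name and the specific regexes are assumptions for this example; the real pipeline is more involved.

```typescript
// Minimal normalization sketch. Strips volatile attributes and
// cosmetic whitespace so two fetches of the same page compare equal.
function normalizeContent(rawHtml: string): string {
  return rawHtml
    // Strip volatile attributes such as auto-generated ids and inline styles
    .replace(/\s(?:id|style|data-[\w-]+)="[^"]*"/g, "")
    // Drop HTML comments, which often carry timestamps or build info
    .replace(/<!--[\s\S]*?-->/g, "")
    // Collapse cosmetic whitespace differences
    .replace(/\s+/g, " ")
    .trim();
}
```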


Section

A section is a logical portion of a page identified during normalization.

Sections:

  • Represent coherent blocks of information
  • Are assigned stable identifiers
  • Allow Eko to track changes at a granular level

Section Hash

A section hash is a deterministic fingerprint of a section’s normalized content.

Section hashes allow Eko to:

  • Detect additions, removals, and revisions
  • Ignore unchanged sections
  • Reduce false positives from reordering
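
A deterministic fingerprint can be as simple as a cryptographic hash over the normalized section text. SHA-256 here is an assumption; the glossary does not specify Eko's actual hash algorithm or input shape.

```typescript
import { createHash } from "node:crypto";

// Sketch of a deterministic section fingerprint: identical normalized
// content always yields the same hex digest.
function sectionHash(normalizedContent: string): string {
  return createHash("sha256").update(normalizedContent, "utf8").digest("hex");
}
```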

Diff

A diff is the comparison between two renders of the same URL.

Diffs are used to determine:

  • What changed
  • How much changed
  • Whether the change is meaningful
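
Combining sections and section hashes, a diff can be sketched as a comparison of two id-to-hash maps. The type names and shapes below are illustrative, not Eko's actual data model.

```typescript
// Sketch of a section-level diff between two renders of the same URL,
// keyed by stable section identifiers.
type Render = Map<string, string>; // sectionId -> sectionHash

interface SectionDiff {
  added: string[];   // section ids present only in the newer render
  removed: string[]; // section ids present only in the older render
  revised: string[]; // same id, different hash
}

function diffRenders(prev: Render, next: Render): SectionDiff {
  const diff: SectionDiff = { added: [], removed: [], revised: [] };
  for (const [id, hash] of next) {
    if (!prev.has(id)) diff.added.push(id);
    else if (prev.get(id) !== hash) diff.revised.push(id);
  }
  for (const id of prev.keys()) {
    if (!next.has(id)) diff.removed.push(id);
  }
  return diff;
}
```

Because sections are compared by stable identifier rather than position, reordering alone produces no additions or revisions.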

Meaningful Change

A meaningful change is a modification that alters the information value of a page in a way that could affect a user’s understanding, decision, or action.

This concept is defined in detail in definition-of-meaningful-change.md.


Change Event

A change event occurs when a diff meets the meaningful change criteria.

Only change events are eligible to surface to users.


Summary

A summary is Eko’s user-facing explanation of a change event.

Every summary follows the same structure:

  • What changed
  • Why it matters
  • Confidence level

Confidence Level

The confidence level expresses how clear and unambiguous a detected change is.

  • High — strong, explicit signal
  • Medium — likely change with some ambiguity
  • Low — weak or indirect signal

Confidence reflects signal clarity, not importance.


URL Library

The URL library is a curated collection of high-signal URLs provided by Eko.

Library URLs:

  • Are selected for relevance and stability
  • Follow the same rules as custom URLs once added

Custom URL

A custom URL is a user-added URL not included in the shared library.

Custom URLs are subject to the same cadence, normalization, and change rules as library URLs.


Digest

A digest is a grouped delivery of summaries over a period of time.

Digests may be delivered:

  • In-app
  • By email

They are designed to be scannable and action-oriented.


Suppression

Suppression is the deliberate choice to withhold a detected difference from surfacing to users.

Suppression occurs when a change:

  • Does not meet the meaningful change threshold
  • Is too ambiguous
  • Is likely to create noise

Silence is considered a valid and often correct outcome.


Fair-Use Safe Output

Fair-use safe output refers to summaries that:

  • Are transformative
  • Avoid long quotations
  • Do not replace the original page

Users are expected to visit the source for full context.


Plan

A plan defines what capabilities and limits a user has within Eko.

Plans include:

  • Maximum number of tracked URLs
  • Allowed cadences (e.g., daily, weekly)
  • Access to trends and lifetime history

Default plans: Free, Base, Pro, Team.


Subscription

A subscription links a user to a specific plan.

Subscriptions:

  • Track current plan assignment
  • Integrate with Stripe for paid tiers
  • Have a status of active, canceled, or past_due

URL Cap

The URL cap is the maximum number of active tracked URLs a user can have.

  • Enforced at database level via trigger
  • Varies by plan (e.g., Free = 10, Pro = 125)
  • Only counts active URLs
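
The authoritative enforcement lives in the database trigger, but an application-side pre-check might look like the following sketch. The cap values are the examples given above; the function name is an assumption.

```typescript
// Illustrative app-side pre-check mirroring the database-level cap.
// Unknown plans are treated as having no allowance.
const URL_CAPS: Record<string, number> = { free: 10, pro: 125 };

function canAddUrl(plan: string, activeUrlCount: number): boolean {
  const cap = URL_CAPS[plan];
  return cap !== undefined && activeUrlCount < cap;
}
```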

Notification Preferences

Notification preferences control how and when a user receives change alerts.

Options include:

  • Email enabled/disabled
  • Digest frequency (immediate or daily digest)
  • Quiet hours (times when notifications are suppressed)

Notification Delivery

A notification delivery is an audit record of a sent notification.

Each delivery tracks:

  • Channel (currently email only)
  • Status (pending, sent, failed)
  • Timestamp and any error message

Tracking Suggestion

A tracking suggestion is a discovery template shown during onboarding.

Suggestions:

  • Are organized by category
  • Include example URL patterns
  • Help users understand what Eko can monitor

AI Provider

The AI provider identifies which AI service generated a summary or fact.

Examples: openai, anthropic, google.

Used for observability, cost attribution, and model routing.


Global URL Library (vNext)

The Global URL Library is the vNext architecture where each URL exists exactly once in the system, shared across all users.

Key characteristics:

  • URLs are unique by canonical_url
  • One check pipeline per URL globally
  • Users link to global URLs via user_url_library
  • Reduces redundant checks and costs

User URL Library (vNext)

The User URL Library is the user-specific relationship to global URLs.

The user_url_library table:

  • Links users to URLs in the global library
  • Stores user-specific notes
  • Controls history gating via history_start_at

History Gating (vNext)

History gating controls how far back a user can see change history for a URL.

  • Free users: See changes only from when they added the URL
  • Paid users: See full change history (lifetime access)

Implemented via the history_start_at column in user_url_library.
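
The gating rule can be sketched as a filter over change events, where a null history_start_at stands for lifetime access. The field names mirror the glossary; the function is illustrative.

```typescript
// Sketch of history gating: free users only see change events dated
// at or after their history_start_at; paid users (null) see everything.
interface GatedChangeEvent { occurredAt: Date }

function visibleHistory(
  events: GatedChangeEvent[],
  historyStartAt: Date | null // null = lifetime access (paid)
): GatedChangeEvent[] {
  if (historyStartAt === null) return events;
  return events.filter((e) => e.occurredAt >= historyStartAt);
}
```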


Observation (vNext)

An observation is a global check record in the vNext architecture.

The url_observations table replaces per-user url_checks:

  • One observation per URL per day globally
  • All users share the same observation data
  • Populated via write-through triggers from V1

Change Event (vNext)

A change event is a global change record in the vNext architecture.

The url_change_events table:

  • Links to url_observations
  • Stores diff metadata
  • Shared across all users tracking the URL

URL Submission (vNext)

A URL submission is a user request to add a new URL to the system.

The submission workflow:

  1. User submits URL
  2. Policy evaluation (allow/block/review)
  3. Create global URL or link to existing
  4. Add to user's library

Policy Decision (vNext)

A policy decision determines whether a URL is allowed in the system.

Decision types:

  • allow — URL proceeds to global library
  • block — URL rejected (adult content, malware, etc.)
  • review — URL queued for admin review
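
The three decision types map naturally onto a union type. The routing targets below are paraphrases of the bullets above, not real queue or function names.

```typescript
// Sketch of routing a policy decision to its next step.
type PolicyDecision = "allow" | "block" | "review";

function routeSubmission(decision: PolicyDecision): string {
  switch (decision) {
    case "allow":
      return "add-to-global-library";
    case "block":
      return "reject-submission";
    case "review":
      return "queue-for-admin-review";
  }
}
```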

Trend (vNext)

A trend is an aggregated view of change patterns across URLs.

Trend types:

  • Category trends (e.g., "SaaS pricing changes")
  • Brand vs brand comparisons
  • Seasonal patterns
  • URL type trends

Requires paid subscription (can_access_trends = true).


Write-Through Trigger (vNext)

A write-through trigger automatically mirrors V1 data to vNext tables.

Used during migration:

  • url_checks → url_observations
  • url_changes → url_change_events
  • summaries → url_change_summaries

Allows V1 workers to continue unchanged while populating vNext.


Brand Site (V1)

A brand site is a domain grouping for tracked URLs in V1.

Key characteristics:

  • Identified by canonical domain (eTLD+1) extracted from URLs
  • Used for navigation and organization only
  • Does NOT affect meaningful change detection, summarization, or notifications
  • Server-derived — clients never pass brand_site_id

Example: A tracked URL https://docs.stripe.com/api belongs to brand site stripe.com.


Registrable Domain

The registrable domain (also called eTLD+1) is the domain extracted from a URL for grouping.

Examples:

  • docs.stripe.com → stripe.com
  • status.github.io → status.github.io (github.io is a public suffix)
  • www.example.co.uk → example.co.uk

Extracted using the tldts library with public suffix list awareness.
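
The examples above can be reproduced with a simplified extractor. Production code uses the tldts library with the full public suffix list; the tiny hardcoded suffix set below is an assumption purely for this sketch.

```typescript
// Simplified registrable-domain (eTLD+1) extraction, for illustration
// only. Real code delegates to tldts.
const PUBLIC_SUFFIXES = new Set(["com", "co.uk", "github.io"]);

function registrableDomain(hostname: string): string {
  const labels = hostname.split(".");
  // Find the longest suffix match, then keep one additional label.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (PUBLIC_SUFFIXES.has(candidate)) {
      return labels.slice(Math.max(i - 1, 0)).join(".");
    }
  }
  return hostname;
}
```

Note that because github.io is itself a public suffix, status.github.io is already a registrable domain and is returned unchanged.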


ModelAdapter

A ModelAdapter is a strategy pattern interface for per-model prompt customization. Each adapter encapsulates model-specific knowledge:

  • Prompt modifications (suffix, prefix, or override modes)
  • Known weaknesses surfaced to quality reviewers
  • Signoff guidance injected into test pipeline review prompts
  • Optional temperature and retry overrides

Adapters live in packages/ai/src/models/adapters/ and are looked up via getModelAdapter(modelId) in the adapter registry.
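
A minimal version of the strategy interface and registry might look like this. The field names follow the glossary's description, but the exact interface in packages/ai/src/models/adapters/ may differ.

```typescript
// Sketch of the ModelAdapter strategy interface and its registry.
interface ModelAdapter {
  modelId: string;
  promptMode: "suffix" | "prefix" | "override";
  promptText: string;
  temperature?: number; // optional override
  maxRetries?: number;  // optional override
}

const adapters = new Map<string, ModelAdapter>();

function registerAdapter(adapter: ModelAdapter): void {
  adapters.set(adapter.modelId, adapter);
}

function getModelAdapter(modelId: string): ModelAdapter | undefined {
  return adapters.get(modelId);
}
```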


Eligibility (Model)

Eligibility is a model's qualification for production use. A model must score 97% or higher across 7 quality dimensions before it can be used for production seeding:

  • validation — facts pass multi-phase validation
  • evidence — facts corroborated by external sources
  • challenges — challenge content passes CQ rules
  • schema_adherence — output conforms to schemas
  • voice_adherence — matches Eko voice constitution
  • style_adherence — follows per-style rules
  • token_efficiency — stays within token budget

Tracked in scripts/seed/.llm-test-data/eligibility.jsonl.


Prompt Customization

Prompt customization is an adapter-provided modification to system prompts. Three modes exist, applied in priority order:

  1. Override — replaces the entire system prompt (highest priority)
  2. Prefix — prepended before the base prompt
  3. Suffix — appended after the base prompt (default, most common)

Customizations can also include temperature and retry overrides.
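
The priority order can be sketched as follows. The shape of the customization object is an assumption; the precedence (override, then prefix, then suffix) is as described above.

```typescript
// Sketch of applying a prompt customization in priority order:
// override replaces everything, otherwise prefix/suffix wrap the base.
interface PromptCustomization {
  override?: string;
  prefix?: string;
  suffix?: string;
}

function applyCustomization(basePrompt: string, c: PromptCustomization): string {
  if (c.override !== undefined) return c.override; // highest priority
  let prompt = basePrompt;
  if (c.prefix !== undefined) prompt = `${c.prefix}\n${prompt}`;
  if (c.suffix !== undefined) prompt = `${prompt}\n${c.suffix}`;
  return prompt;
}
```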


Topic Category

A topic category is a node in the hierarchical taxonomy that organizes facts by subject area.

  • Identified by a unique slug (e.g., sports, basketball, nba)
  • Has a depth level (0=root, 1=primary, 2=secondary, 3=leaf)
  • Stores a path representing its full ancestry (e.g., sports/basketball/nba)
  • Carries a dailyQuota and percentTarget controlling fact volume per topic
  • Links to fact_record_schemas defining what structured fields facts in that category must have

Taxonomy Depth

Taxonomy depth describes the hierarchical level of a topic category in the taxonomy tree.

  • Depth 0 — Root categories (e.g., sports, science, history)
  • Depth 1 — Primary subcategories (e.g., basketball, physics)
  • Depth 2 — Secondary subcategories (e.g., nba, quantum-mechanics)
  • Depth 3 — Leaf categories (e.g., nba-records, particle-physics)
  • The system supports up to MAX_TAXONOMY_DEPTH = 5 levels

Category Deprecation

Category deprecation is the mechanism for retiring topic categories without data loss.

  • Deprecated categories have a deprecated_at timestamp, a replaced_by reference to the successor category, and a deprecation_reason text
  • Deprecation preserves existing fact records — they remain linked to the original category
  • The resolveActiveCategory() function follows replacement chains to find the current active successor
  • Deprecated categories are excluded from active queries via isActive = false

Active Category Resolution

Active category resolution is the process of following deprecation replacement chains to find the current active category.

  • Implemented by resolveActiveCategory() in packages/db/src/drizzle/fact-engine-queries.ts
  • Follows replaced_by references up to 5 hops to prevent infinite loops
  • Returns the first active category found in the chain, or null if the chain is broken
  • Used as a fallback in resolveTopicCategory() when a matched category is inactive
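
The chain-following behavior can be sketched as below. The data shape is illustrative; the hop limit and null-on-broken-chain behavior are as described above.

```typescript
// Sketch of active category resolution: follow replaced_by references
// until an active category is found, capped at 5 hops.
interface TopicCategory {
  slug: string;
  isActive: boolean;
  replacedBy: string | null; // slug of the successor, if deprecated
}

function resolveActiveCategory(
  start: TopicCategory,
  bySlug: Map<string, TopicCategory>
): TopicCategory | null {
  let current: TopicCategory | undefined = start;
  let hops = 0;
  while (current && hops <= 5) {
    if (current.isActive) return current;
    current = current.replacedBy !== null ? bySlug.get(current.replacedBy) : undefined;
    hops++;
  }
  return null; // broken chain, fully deprecated, or too many hops
}
```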

Fact Record

A fact record is the atomic unit of Eko — a structured, schema-validated piece of knowledge.

  • Stored in the fact_records table with a facts JSONB column containing key-value pairs
  • Each fact has a title, context, topic category, notability score, and validation status
  • Facts are extracted from news stories, generated as evergreen content, or created via seed pipelines
  • Only facts with status = 'validated' appear in the public feed
  • Source types: news_extraction, ai_generated, file_seed, spinoff_discovery, ai_super_fact, api_import

Fact Record Schema

A fact record schema defines the expected structure of facts within a topic category.

  • Stored in the fact_record_schemas table with a factKeys JSONB column
  • Each key has a name, label, type (text, number, date, boolean), and required flag
  • The cardFormats array specifies which card display formats are available
  • buildFactObjectSchema() converts fact keys into a concrete Zod schema for AI extraction and validation
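
The key-to-validator mapping can be illustrated without Zod. The real buildFactObjectSchema() emits a Zod schema; this plain-TypeScript checker is a stand-in sketch of the same idea.

```typescript
// Illustrative validation of a facts object against a schema's keys.
interface FactKey {
  name: string;
  type: "text" | "number" | "date" | "boolean";
  required: boolean;
}

function validateFacts(keys: FactKey[], facts: Record<string, unknown>): boolean {
  return keys.every((key) => {
    const value = facts[key.name];
    if (value === undefined) return !key.required; // missing is ok only if optional
    switch (key.type) {
      case "text":
        return typeof value === "string";
      case "number":
        return typeof value === "number";
      case "boolean":
        return typeof value === "boolean";
      case "date":
        return !Number.isNaN(Date.parse(String(value)));
    }
  });
}
```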

Evergreen Fact

An evergreen fact is timeless knowledge not tied to current events.

  • Source type: ai_generated
  • Generated daily via the GENERATE_EVERGREEN queue at the mid model tier for higher quality
  • Examples: "The speed of light is 299,792,458 m/s", "The Eiffel Tower was originally intended to be temporary"
  • Deduplicated against existing fact titles in the same topic
  • A co-equal content pillar alongside news facts, ensuring the feed has content even during slow news cycles

Taxonomy Voice

Taxonomy voice is the per-domain emotional register injected into AI generation prompts.

  • Defined in TAXONOMY_VOICE in packages/ai/src/taxonomy-content-rules.ts
  • Each taxonomy has a personality, energy level, excitement driver, and pitfall guidance
  • Examples: "Nature documentary narrator" for nature, "enthusiastic superfan" for sports
  • Injected at layer 3 of the prompt layering hierarchy, between the universal voice constitution and format/style voice

Taxonomy Content Rules

Taxonomy content rules are per-domain extraction and challenge guidance for AI prompts.

  • Defined in TAXONOMY_CONTENT_RULES in packages/ai/src/taxonomy-content-rules.ts
  • Include extraction guidance, challenge guidance, anti-patterns, and domain vocabulary
  • Applied at layer 6 of the prompt layering hierarchy
  • formatVocabularyForPrompt() serializes domain vocabulary into a concise prompt block (~120-200 tokens)

Challenge Content

Challenge content is pre-generated quiz text for a fact across multiple interaction styles.

  • Stored in the fact_challenge_content table, linked to a fact_record
  • Generated for 6 pre-generated styles: fill_the_gap, direct_question, statement_blank, reverse_lookup, free_text, multiple_choice
  • Each row contains five fields: setup_text, challenge_text, reveal_correct, reveal_wrong, correct_answer
  • Includes a challenge_title for display and a style_data JSONB column with style-specific structured data

Challenge Image

A challenge image is an anti-spoiler image used when a fact's primary image would reveal the answer.

  • Resolved via the RESOLVE_CHALLENGE_IMAGE queue, consumed by worker-ingest
  • Triggers when challenge content generation detects the fact image contains the answer
  • Falls back to a topic-appropriate stock image that does not spoil the challenge
  • Stored alongside the challenge content for use in card rendering

Notability Score

The notability score is a 0.0-1.0 value indicating how noteworthy a fact is.

  • Scored by AI during fact extraction or generation
  • Facts below the threshold (default 0.6, configured via NOTABILITY_THRESHOLD) are discarded
  • Higher scores indicate greater significance, novelty, or interest value
  • Used to prioritize facts in the feed and filter low-value content
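
The threshold check itself is simple. The 0.6 default is the value named above; reading it from a NOTABILITY_THRESHOLD constant is an assumption about how the configuration is wired.

```typescript
// Sketch of the notability filter: facts scoring below the threshold
// are discarded; facts at or above it are kept.
const NOTABILITY_THRESHOLD = 0.6;

function passesNotability(score: number): boolean {
  return score >= NOTABILITY_THRESHOLD;
}
```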

Final Note

If a term used in Eko documentation or UI is unclear, it should be added or clarified here.

A shared language is essential to maintaining product clarity and user trust.