Global URL Library Contract
Deprecated: This document describes the V1/vNext URL library system which has been fully superseded by the V2 fact engine. The V1/vNext tables described here have been dropped (migrations 0057-0060). See Architecture Overview for the current system and Fact & Taxonomy Schema Map for the V2 schema.
This document defines the invariants and rules governing the vNext Global URL Library system.
Overview
The vNext architecture transforms Eko from per-user URL tracking to a shared global URL library. This enables:
- Single check per URL across all users (cost efficiency)
- Brand-centric screens and navigation
- Trend analysis across URLs and categories
- Simpler subscription model based on library access
URL Uniqueness Rule
Invariant: Each URL exists exactly once in the urls table, keyed by canonical_url.
Canonical URL Definition
A canonical URL is the normalized form of a submitted URL:
- Fragment (
#...) removed - Hostname lowercased
- Tracking parameters stripped (
utm_*,ref,fbclid,gclid, etc.) - Query parameters sorted alphabetically
- Trailing slash removed (except for root path
/)
Enforcement
urls.canonical_urlhas aUNIQUEconstraintgetOrCreateGlobalUrlByCanonical()usesON CONFLICT DO NOTHING+ select pattern- Duplicate submissions link to existing global URL rather than creating new entry
User Add Flow
When a user submits a URL to track:
1. User submits URL → url_submissions record created (status: pending)
2. URL normalized → canonical_url computed
3. Policy evaluation → url_policy_decisions record created
4. Decision routing:
- BLOCK → submission.status = blocked, no global URL created
- REVIEW → submission.status = needs_review, awaits admin
- ALLOW → proceed to step 5
5. Global URL check:
- EXISTS → submission.status = approved_linked_existing, matched_url_id set
- NEW → urls record created, submission.status = approved_created_global
6. User library link → user_url_library record created
- history_start_at set based on plan (NULL for paid, now() for free)
Tables Involved
| Table | Purpose |
|---|---|
url_submissions | Audit trail of all submission attempts |
url_policy_decisions | Policy decision audit log |
urls | Global URL entity (unique by canonical_url) |
user_url_library | User-URL relationship |
Global-Only Checks Invariant
Invariant: There is exactly one check pipeline per URL globally, not per user.
V1 Compatibility
During migration, V1 workers continue writing to url_checks, url_changes, etc.
Write-through triggers mirror this data to vNext tables:
| V1 Table | vNext Table | Trigger |
|---|---|---|
url_checks | url_observations | sync_check_to_observation |
url_changes | url_change_events | sync_change_to_event |
summaries | url_change_summaries | sync_summary_to_vnext |
url_renders | url_render_artifacts | sync_render_to_artifact |
Future State
Workers will write directly to vNext tables (url_observations, etc.). The urls table's next_check_at and check_frequency fields drive scheduling.
History Gating Rule
Invariant: Free users see change history only from when they added the URL. Paid users see full history.
Implementation
-
user_url_library.history_start_at:
NULL= full history access (paid plans)timestamp= only see changes after this time (free plans)
-
Database trigger (
set_library_history_gating):- Sets
history_start_aton INSERT based on user's current plan
- Sets
-
RLS policy (
user_url_library_history_gated_select):- Filters
url_change_eventsbased onhistory_start_at
- Filters
-
Query layer (
listUrlUpdateInstances):- Accepts
gatingSinceparameter for explicit filtering - Defense in depth alongside RLS
- Accepts
Plan Upgrade Behavior
When a user upgrades from free to paid:
- Existing
history_start_atvalues remain (conservative approach) - Admin can manually reset to
NULLif business decides to unlock history retroactively
URL Policy Categories
URLs are evaluated against policy rules before being allowed into the system.
Policy Decisions
| Decision | Effect |
|---|---|
allow | URL proceeds to global library |
block | URL rejected, reason recorded |
review | URL queued for admin review |
Reason Codes
| Code | Description |
|---|---|
adult_content | Pornography, explicit material |
hate_or_extremism | Hate speech, extremist content |
malware_or_phishing | Known malicious URLs |
spam_or_low_quality | Spam, suspicious TLDs |
illegal_or_regulated | Gambling, drugs, weapons |
auth_wall_or_uncheckable | Login required, uncheckable |
duplicate | Already exists in system |
unknown | Unclassified rejection |
Evidence Logging
All policy decisions record evidence for audit:
{
"hostname": "example.com",
"pattern": "pattern that matched",
"reason": "human-readable explanation"
}
URL Cap Enforcement
Invariant: Users cannot exceed their plan's max_tracked_urls limit.
Enforcement Points
- Pre-check (UX):
checkUrlCapEntitlement()in application code - Authoritative (DB):
check_library_url_cap()trigger onuser_url_library
Trigger Behavior
-- Counts active entries in user_url_library
-- Compares against plan_definitions.max_tracked_urls
-- Raises exception if limit exceeded
Screen-to-Table Mapping
Dashboard Screen
Purpose: User's tracked URLs with latest updates
| Data | Source |
|---|---|
| User's URLs | user_url_library JOIN urls |
| Latest update per URL | url_change_events (with history gating) |
| Change summaries | url_change_summaries |
Brand Screen
Purpose: Brand overview with all trackable URLs
| Data | Source |
|---|---|
| Brand info | brand_library_brands (via brands view) |
| Brand's URLs | urls WHERE brand_id = ? |
| Latest updates | url_change_events for brand's URLs |
Brand URL Screen (Update History)
Purpose: Single URL with change history
| Data | Source |
|---|---|
| URL info | urls |
| Brand info | brand_library_brands |
| Change history | url_change_events (history gated) |
| Summaries | url_change_summaries |
| Gating status | user_url_library.history_start_at |
Update Instance Screen
Purpose: Single change event detail
| Data | Source |
|---|---|
| Change event | url_change_events |
| Summary | url_change_summaries |
| Observation | url_observations |
| URL + Brand | urls JOIN brand_library_brands |
RLS Policy Summary
Global Tables (service_role writes, user reads via library)
urls: SELECT viauser_url_librarymembershipurl_observations: SELECT via URL membershipurl_change_events: SELECT via URL membership + history gatingurl_change_summaries: SELECT via event membershipurl_render_artifacts: SELECT via URL membership
User Tables (user CRUD on own rows)
user_url_library: Full CRUD foruser_id = auth.uid()url_submissions: INSERT for self, SELECT own rows
Policy Tables
url_policy_decisions: SELECT via own submissions- Writes via service_role only
Trends (read-only for authenticated)
trend_definitions: SELECT active trendstrend_points: SELECT via trend membership
Billing/Onboarding
- Own rows only, service_role for billing writes
Migration Notes
Backfill Strategy
The migration (0007_global-url-library-vnext.sql) performs idempotent backfill:
tracked_urls→urls(dedupe by canonical_url, first created wins)tracked_urls→user_url_library(with history gating based on plan)url_checks→url_observations(dedupe by url_id + checked_day)url_changes→url_change_eventssummaries→url_change_summariesurl_renders→url_render_artifacts
All backfill statements use ON CONFLICT DO NOTHING for safety.
Write-Through Triggers
During transition, V1 tables remain the worker write surface. AFTER INSERT triggers mirror data to vNext tables in real-time. This allows gradual migration of workers without data loss.