Brand Library Schema (V1)

Deprecated: The V1 brand_library_* tables have been superseded by the V2 brands, domains, and brand_categories tables. See Architecture Overview for the current schema.

The Brand Library is a parallel schema for curated brand identity data and high-signal tracked URLs. It is system-owned seed data with no user ownership or RLS policies.

Note: Brand Library (brand_library_* tables) is distinct from Brand Sites (brand_sites table). Brand Library contains curated, admin-maintained brand metadata used for tracking suggestions. Brand Sites is a user-facing domain grouping derived automatically from tracked URLs. See v1-business-rules.md for brand sites documentation (archived v1 spec).

Purpose

  • Power Tracking Suggestions for user onboarding
  • Provide consistent brand context for URL change summaries
  • Maintain auditable provenance for all ingested data
  • Support confidence-aware querying and filtering

Design Principles

PrincipleImplementation
URL-scoped onlyNo site crawling or link following
Non-substitutiveSummaries describe purpose, not content
Confidence-awareEvery derived field carries explicit confidence
System-ownedNo RLS policies; service role access only

Schema Overview

┌─────────────────────────┐
│  brand_library_sources  │  ← Provenance tracking
└─────────────────────────┘
           │
           ▼
┌─────────────────────────┐
│  brand_library_brands   │  ← Canonical brand entities
└─────────────────────────┘
           │
           ▼
┌─────────────────────────┐
│   brand_library_urls    │  ← Validated trackable URLs
└─────────────────────────┘
           │
           ▼
┌─────────────────────────┐
│ brand_library_review_queue │  ← Ambiguity/QA queue
└─────────────────────────┘

Enums

brand_url_type

URL page types supported in V1:

ValueDescription
pricingPricing/plans pages
statusService status pages
privacyPrivacy policy pages
termsTerms of service pages
securitySecurity/trust pages

brand_confidence_level

Confidence indicators for derived fields:

ValueMeaning
highStrong signal, multiple confirmations
mediumReasonable confidence, may need review
lowWeak signal, flagged for review

brand_entity_type

Entity types for the review queue:

ValueDescription
brandBrand identity record
urlURL record

brand_review_status

Review queue item status:

ValueDescription
openPending review
fixedIssue resolved
ignoredIntentionally skipped

brand_review_reason

Reasons for review queue entries:

Brand-related:

ValueTrigger
DOMAIN_AMBIGUOUSMultiple possible canonical domains
DOMAIN_MISSINGNo domain could be determined
CONFLICTING_DOMAINSSources disagree on canonical domain
SUMMARY_LOW_CONFIDENCELLM summary confidence too low
SUMMARY_TOO_MARKETINGSummary contains promotional language
INSUFFICIENT_INPUTSNot enough data to generate summary

URL-related:

ValueTrigger
NO_MATCH_FOR_URL_TYPENo valid URL found for type
MULTIPLE_STRONG_CANDIDATESMultiple URLs scored equally
OFF_DOMAIN_REDIRECTFinal URL on different domain
URL_TYPE_MISMATCHPage content doesn't match URL type

Tables

brand_library_sources

Tracks provenance of all ingested data.

ColumnTypeNullableDescription
idUUIDNoPrimary key
nameTEXTNoSource name (e.g., "PDL Companies")
source_urlTEXTYesURL to source dataset
licenseTEXTYesLicense terms
notesTEXTYesAdditional notes
created_atTIMESTAMPTZNoRecord creation time
updated_atTIMESTAMPTZNoLast update time

brand_library_brands

Canonical brand entities with identity and categorization data.

ColumnTypeNullableDescription
idUUIDNoPrimary key
brand_nameTEXTNoDisplay name
canonical_domainTEXTNoPrimary domain (unique)
domain_aliasesJSONBYesAlternate domains array
category_pathTEXTNoTaxonomy path (e.g., "software/saas")
hq_cityTEXTYesHeadquarters city
hq_regionTEXTYesHeadquarters region/state
hq_countryTEXTYesHeadquarters country
business_summaryTEXTYes1-2 sentence description
business_modelTEXTYesSaaS, retailer, marketplace, etc.
audienceTEXTYesB2B, B2C, hybrid
logo_urlTEXTYesLogo image URL
confidence_identitybrand_confidence_levelNoIdentity confidence
confidence_categorybrand_confidence_levelNoCategory confidence
source_refsJSONBNoSource attribution
created_atTIMESTAMPTZNoRecord creation time
updated_atTIMESTAMPTZNoLast update time

Constraints:

  • canonical_domain is UNIQUE

Indexes:

  • confidence_identity
  • confidence_category

brand_library_urls

Validated, trackable URLs associated with brands.

ColumnTypeNullableDescription
idUUIDNoPrimary key
brand_idUUIDNoFK to brand_library_brands
url_typebrand_url_typeNoPage type enum
tracked_urlTEXTNoOriginal URL
final_urlTEXTNoResolved URL after redirects
http_statusINTEGERNoHTTP status code
titleTEXTYesPage title
h1TEXTYesFirst H1 element
why_trackTEXTNo1-2 sentences on change value
summaryTEXTNo2-3 sentences on page purpose
confidence_url_matchbrand_confidence_levelNoURL type match confidence
confidence_summarybrand_confidence_levelNoSummary confidence
source_refsJSONBNoSource attribution
created_atTIMESTAMPTZNoRecord creation time
updated_atTIMESTAMPTZNoLast update time

Constraints:

  • brand_id references brand_library_brands(id) ON DELETE CASCADE

Indexes:

  • brand_id
  • url_type
  • confidence_url_match
  • confidence_summary

brand_library_review_queue

Centralized queue for ambiguous or low-confidence items requiring manual review.

ColumnTypeNullableDescription
idUUIDNoPrimary key
entity_typebrand_entity_typeNobrand or url
entity_idUUIDNoID of the entity
reasonbrand_review_reasonNoWhy flagged
detailsJSONBYesAdditional context
statusbrand_review_statusNoopen, fixed, ignored
created_atTIMESTAMPTZNoRecord creation time
updated_atTIMESTAMPTZNoLast update time

Indexes:

  • status
  • (entity_type, entity_id)

Relationships

brand_library_sources
        │
        │ (referenced via source_refs JSONB)
        ▼
brand_library_brands ──────────────────────┐
        │                                  │
        │ 1:N                              │
        ▼                                  │
brand_library_urls                         │
        │                                  │
        │ (polymorphic via entity_type)    │
        ▼                                  │
brand_library_review_queue ◄───────────────┘

Migration

File: supabase/migrations/0005_brand_library_v1.sql

The migration:

  1. Creates 5 Postgres enums
  2. Creates 4 tables with timestamps
  3. Adds foreign key constraints
  4. Creates performance indexes
  5. Attaches update_updated_at triggers

TypeScript Types

Location: packages/db/src/client.ts

type BrandUrlType = 'pricing' | 'status' | 'privacy' | 'terms' | 'security'
type BrandConfidenceLevel = 'high' | 'medium' | 'low'
type BrandEntityType = 'brand' | 'url'
type BrandReviewStatus = 'open' | 'fixed' | 'ignored'
type BrandReviewReason =
  | 'DOMAIN_AMBIGUOUS' | 'DOMAIN_MISSING' | 'CONFLICTING_DOMAINS'
  | 'SUMMARY_LOW_CONFIDENCE' | 'SUMMARY_TOO_MARKETING' | 'INSUFFICIENT_INPUTS'
  | 'NO_MATCH_FOR_URL_TYPE' | 'MULTIPLE_STRONG_CANDIDATES'
  | 'OFF_DOMAIN_REDIRECT' | 'URL_TYPE_MISMATCH'

Location: packages/shared/src/types.ts

Interfaces: BrandLibrarySource, BrandLibraryBrand, BrandLibraryUrl, BrandLibraryReviewQueueItem

Location: packages/shared/src/schemas.ts

Zod schemas for runtime validation of all entities.


Access Patterns

OperationAccess Method
Seed data ingestionService role (bypasses RLS)
Admin review queueService role
Tracking suggestionsService role read
User-facing brand displayService role read

What This Schema Does NOT Include

Per V1 scope, these are explicitly excluded:

  • RLS policies (system-owned data)
  • Ingestion scripts
  • Brandfetch API integration
  • LLM prompt definitions
  • Admin UI
  • Seed data