Backfill and Fix NULL Columns in fact_records
An audit of fact_records revealed several columns that are systematically NULL because of pipeline gaps and missing wiring.
Challenges
Wave 1: Schema & Type Expansion
| # | Challenge | PASS Criteria |
|---|---|---|
| 1.1 | Add notability fields to ImportFactsMessageSchema | Zod schema includes notability_score, notability_reason, context as optional fields |
| 1.2 | Update createImportFactsMessage TS type | TS type includes challenge_title?, notability_score?, notability_reason?, context? |
| 1.3 | Add generationCostUsd to insertFactRecord | Function accepts and writes generationCostUsd to DB |
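
The optional fields from 1.1 and 1.2 can be sketched as a plain TypeScript type. This is only an illustration, not the actual Zod schema: the field types (number for the score, string for the rest), the `title` field, and the helper `hasValidNotabilityFields` are assumptions introduced here.

```typescript
// Hypothetical shape of one IMPORT_FACTS item. Field types are assumed,
// and `title` exists only to give the example a required field.
interface ImportFactItem {
  title: string;
  challenge_title?: string;
  notability_score?: number;
  notability_reason?: string;
  context?: string;
}

// Checks only the optional notability fields: each one may be absent,
// but if present it must have the assumed type.
function hasValidNotabilityFields(item: ImportFactItem): boolean {
  const { notability_score, notability_reason, context } = item;
  const scoreOk =
    notability_score === undefined || Number.isFinite(notability_score);
  const reasonOk =
    notability_reason === undefined || typeof notability_reason === "string";
  const contextOk = context === undefined || typeof context === "string";
  return scoreOk && reasonOk && contextOk;
}
```

Keeping every new field optional means messages already in the queue, which carry none of them, still pass validation.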
Wave 2: Pipeline Wiring Fixes
| # | Challenge | PASS Criteria |
|---|---|---|
| 2.1 | Enqueue GENERATE_CHALLENGE_CONTENT after validation | validate-fact.ts enqueues both RESOLVE_IMAGE and GENERATE_CHALLENGE_CONTENT for validated facts |
| 2.2 | Remove duplicate enqueue from import-facts.ts | import-facts.ts no longer imports or enqueues GENERATE_CHALLENGE_CONTENT; explanatory comment present |
| 2.3 | Wire notability fields through import pipeline | import-facts.ts passes notability_score, notability_reason, context from message to insertFactRecord; explode-entry.ts passes notability_score and context in IMPORT_FACTS items |
| 2.4 | Wire generationCostUsd in extraction handlers | extract-facts.ts computes cost via estimateCost() and passes to insertFactRecord; generate-evergreen.ts splits cost across records |
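
Two of the wiring changes above can be sketched in isolation: the fan-out after validation (2.1) and the cost split across records (2.4). The message payload shape, the function names `followUpMessages` and `splitCostEvenly`, and the choice of an even split are assumptions for illustration; the real handlers may build messages and divide cost differently.

```typescript
type QueueMessageType = "RESOLVE_IMAGE" | "GENERATE_CHALLENGE_CONTENT";

interface QueueMessage {
  type: QueueMessageType;
  factRecordId: string; // hypothetical payload field
}

// Builds both follow-up messages for a fact that has just passed validation,
// so challenge content is requested in one place only.
function followUpMessages(factRecordId: string): QueueMessage[] {
  return [
    { type: "RESOLVE_IMAGE", factRecordId },
    { type: "GENERATE_CHALLENGE_CONTENT", factRecordId },
  ];
}

// Divides the cost of one generation call evenly across the records it
// produced. Returns an empty array when there are no records.
function splitCostEvenly(totalCostUsd: number, recordCount: number): number[] {
  if (recordCount <= 0) return [];
  const perRecord = totalCostUsd / recordCount;
  return Array.from({ length: recordCount }, () => perRecord);
}
```

Under this sketch, a call that costs 1 USD and yields four records would store 0.25 in generationCostUsd for each record.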
Wave 3: Backfill Existing Data
| # | Challenge | PASS Criteria |
|---|---|---|
| 3.1 | Backfill notabilityScore for NULL records | --dry-run shows counts; --notability sets source-type defaults; verified when SELECT COUNT(*) FROM fact_records WHERE notability_score IS NULL AND status = 'validated' returns 0 |
| 3.2 | Enqueue challenge content for gaps | --challenge-content enqueues GENERATE_CHALLENGE_CONTENT for validated facts with zero fact_challenge_content rows |
| 3.3 | Audit data integrity | --audit reports anomalies: NULL validation JSONB, NULL publishedAt on validated records |
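
The notability backfill in 3.1 can be sketched as a pure planning step that a dry run only prints and a real run applies. The row shape, the source-type names, the default scores, and the fallback value below are all hypothetical; the actual defaults live in the backfill script.

```typescript
interface FactRow {
  id: string;
  sourceType: string; // hypothetical column name
  status: string;
  notabilityScore: number | null;
}

interface BackfillUpdate {
  id: string;
  notabilityScore: number;
}

// Hypothetical per-source defaults, for illustration only.
const DEFAULT_SCORES = new Map<string, number>([
  ["news", 0.6],
  ["evergreen", 0.5],
]);

// Selects validated rows with a NULL score and assigns the default for their
// source type, falling back to `fallback` for unknown source types.
function planNotabilityBackfill(
  rows: FactRow[],
  defaults: Map<string, number>,
  fallback: number,
): BackfillUpdate[] {
  return rows
    .filter((r) => r.status === "validated" && r.notabilityScore === null)
    .map((r) => ({
      id: r.id,
      notabilityScore: defaults.get(r.sourceType) ?? fallback,
    }));
}
```

Separating planning from writing keeps --dry-run and the real run on the same selection logic, so the counts a dry run reports match what a subsequent run would change.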
Verification
bun run typecheck
bun run test
bun run lint
bun scripts/seed/backfill-fact-nulls.ts --dry-run
Files Modified
| File | Change |
|---|---|
| packages/shared/src/schemas.ts | Added optional notability fields to ImportFactsMessageSchema |
| packages/queue/src/index.ts | Updated createImportFactsMessage TS type |
| packages/db/src/drizzle/fact-engine-queries.ts | Added generationCostUsd to insertFactRecord |
| apps/worker-validate/src/handlers/validate-fact.ts | Enqueue GENERATE_CHALLENGE_CONTENT after validation |
| apps/worker-facts/src/handlers/import-facts.ts | Wire notability fields, remove duplicate challenge enqueue |
| apps/worker-facts/src/handlers/explode-entry.ts | Pass notability/context to IMPORT_FACTS message |
| apps/worker-facts/src/handlers/extract-facts.ts | Wire generationCostUsd |
| apps/worker-facts/src/handlers/generate-evergreen.ts | Wire generationCostUsd |
| scripts/seed/backfill-fact-nulls.ts | NEW: Backfill script |