5. Improve Titles

Purpose: Run a focused title improvement pass on fact records, rewriting only the title and challenge_title fields without touching other content.

Prerequisites:

  • .env.local has DATABASE_URL and an API key for the routed model (see available models)
  • Facts exist in the database (run seeding pipeline first if empty)

Cost / Duration: ~$5-$10 for 5,000-10,000 facts | 1-3 hours

Prompt

I need to improve fact record titles across the corpus. This is a targeted pass
that rewrites only `title` and `challenge_title` fields -- all other fields
(context, notability_score, notability_reason) remain untouched.

### Step 1: Export facts (if not already exported)

Check whether `.cleanup-data/facts-export.jsonl` exists. If not, export:

```bash
bun scripts/seed/cleanup-content.ts --export
```
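
The existence check can also be scripted. The following is a minimal sketch rather than part of the repository: `needs_export` and `ensure_export` are hypothetical helper names, and treating an empty file the same as a missing one is an assumption.

```bash
# Export path as given in this step.
EXPORT_FILE=".cleanup-data/facts-export.jsonl"

# Succeeds (exit 0) when the file is missing or empty, i.e. an export is needed.
# Assumption: an empty export file is as good as no export.
needs_export() {
  [ ! -s "$1" ]
}

# Runs the export only when needed; otherwise reuses the existing file.
ensure_export() {
  if needs_export "$EXPORT_FILE"; then
    bun scripts/seed/cleanup-content.ts --export
  else
    echo "Reusing existing export: $EXPORT_FILE"
  fi
}
```

Calling `ensure_export` from the project root then replaces the manual check.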

### Step 2: Run focused title improvement

```bash
bun scripts/seed/cleanup-content.ts --fix --fields title,challenge_title
```

This rewrites only `title` and `challenge_title`. The script is resume-safe -- if
interrupted, it picks up where it left off on the next run.
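
To see how far an interrupted run got before resuming, comparing record counts is one option. This is a rough sketch under two assumptions: both files hold one JSON record per line, and the output file gains one line per rewritten fact. `count_records` and `report_progress` are hypothetical helpers, not flags or functions of the script.

```bash
# Prints the number of non-empty lines in a file, or 0 if it does not exist yet.
count_records() {
  if [ -f "$1" ]; then
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c . "$1" || true
  else
    echo 0
  fi
}

# Prints "<rewritten> / <exported> records rewritten".
# $1 = export file, $2 = output file of the --fix pass.
report_progress() {
  exported_count=$(count_records "$1")
  fixed_count=$(count_records "$2")
  echo "$fixed_count / $exported_count records rewritten"
}
```

For example: `report_progress .cleanup-data/facts-export.jsonl path/to/facts-fixed.jsonl`, substituting the location where the script actually writes `facts-fixed.jsonl`.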

### Step 3: Upload improved titles

```bash
bun scripts/seed/cleanup-content.ts --upload
```

Bulk uploads the rewritten title fields to the database.

### Step 4: Validate

```bash
bun scripts/seed/cleanup-content.ts --validate
```

Report whether title quality metrics improved. Sample a few titles for manual review
and check for:
- Concise, specific wording (not overlong or generic)
- Topic-relevant context included
- Consistent formatting across the corpus
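
The length check in the list above can be partly automated during the manual review. The sketch below assumes titles arrive one per line on standard input; `flag_long_titles` is a hypothetical helper, and the character limit is whatever threshold the reviewer picks, since the script defines none that this document states.

```bash
# Reads titles (one per line) from stdin and prints those longer than $1,
# each prefixed with its length. Length is measured as awk reports it, which
# may be bytes rather than characters in some awk implementations.
flag_long_titles() {
  awk -v max="$1" 'length($0) > max { print length($0) ": " $0 }'
}
```

One way to feed it, assuming `jq` and GNU `shuf` are installed and the records expose a top-level `title` field: `jq -r '.title' facts-fixed.jsonl | shuf -n 20 | flag_long_titles 80`. The same pipeline with `.challenge_title` covers the second field.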

Verification

  • Export file exists at .cleanup-data/facts-export.jsonl
  • Title rewrites completed (check facts-fixed.jsonl for output)
  • Upload succeeded with non-zero rows written
  • Validation shows improved title quality metrics
  • Sample titles are concise, relevant, and consistently formatted
