API-Sports.io Evidence Pipeline Integration
Motivation
gpt-5.4-mini scores 90/97 on validation; the last gap is sports fact verification. The evidence pipeline (Phase 4b) currently uses the TheSportsDB v1 free API, which returns only team/player metadata, not match results or game statistics. When facts claim "Jordan scored 63 points against the Celtics" or "Argentina won the World Cup final on penalties after a 3-3 draw," the pipeline can't verify or refute them, so it falls back to Wikipedia summaries that rarely cover specific games.
API-Sports.io provides 12 sport-specific APIs with 15+ years of historical data including fixtures, player stats, lineups, and game results. Free tier: 100 requests/day per API.
Service Overview
| API | Base URL | Key Endpoints |
|---|---|---|
| Football | v3.football.api-sports.io | /fixtures, /players, /teams |
| NBA | v2.nba.api-sports.io | /games, /players/statistics, /teams |
| Basketball | v1.basketball.api-sports.io | /games, /statistics |
| Baseball | v1.baseball.api-sports.io | /games, /players |
| NFL/NCAA | v1.american-football.api-sports.io | /games, /players |
| Formula 1 | v1.formula-1.api-sports.io | /races, /rankings/drivers |
| Hockey | v1.hockey.api-sports.io | /games, /players |
Auth: Header x-apisports-key: {API_SPORTS_API_KEY}
Free tier: 100 requests/day per sport API, resets 00:00 UTC
Rate limit: None documented on free tier beyond daily cap
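As a minimal sketch of the auth convention above, the helper below builds a request URL and the x-apisports-key header for one of the first three sports. The function name buildApiSportsRequest, the https:// scheme, and the search query parameter are assumptions for illustration; parameter names should be checked against each API's documentation.

```typescript
// Hypothetical helper: names and shapes are illustrative, not part of the plan.
type Sport = "football" | "nba" | "basketball";

// Hosts are taken from the Service Overview table; the https scheme is assumed.
const BASE_URLS: Record<Sport, string> = {
  football: "https://v3.football.api-sports.io",
  nba: "https://v2.nba.api-sports.io",
  basketball: "https://v1.basketball.api-sports.io",
};

interface ApiSportsRequest {
  url: string;
  headers: Record<string, string>;
}

function buildApiSportsRequest(
  sport: Sport,
  path: string,
  params: Record<string, string | number>,
  apiKey: string,
): ApiSportsRequest {
  // Encode each query parameter so names with spaces or accents stay valid.
  const query = Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(String(v))}`)
    .join("&");
  const url = `${BASE_URLS[sport]}${path}${query ? `?${query}` : ""}`;
  // The key travels only in the header, never in the URL.
  return { url, headers: { "x-apisports-key": apiKey } };
}

// Example: the "search" parameter name is an assumption.
const exampleRequest = buildApiSportsRequest("football", "/teams", { search: "Argentina" }, "test-key");
```

Keeping request construction pure like this makes it easy to unit test without network access; the actual fetch call, timeout, and caching can wrap it.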
Implementation Scope
Challenge 1: Config & Environment
Files: packages/config/src/index.ts, .env.example
- Add API_SPORTS_API_KEY to env schema (optional, string)
- Add getter getApiSportsKey(): string | undefined
- Add to .env.example with comment
Acceptance: bun run env:check-example passes, key available via config getter.
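One possible shape for the optional getter is sketched below. It takes the environment as an argument purely so it can be exercised in isolation; the planned getApiSportsKey() takes no arguments, and the real schema in packages/config may rely on a validation library this sketch does not assume.

```typescript
// Hypothetical sketch: the env parameter exists only for testability.
type Env = Record<string, string | undefined>;

function getApiSportsKey(env: Env): string | undefined {
  const raw = env["API_SPORTS_API_KEY"];
  if (raw === undefined) return undefined;
  const trimmed = raw.trim();
  // Treat empty or whitespace-only values as "not configured",
  // so callers can simply check for undefined.
  return trimmed.length > 0 ? trimmed : undefined;
}
```

Returning undefined for blank values keeps the sports branch of the pipeline disabled when the key is present in .env.example but left unfilled.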
Challenge 2: API-Sports Client
Files: packages/ai/src/api-sports-client.ts (new)
Multi-sport client with:
- Shared auth/fetch layer (x-apisports-key header, 10s timeout, abort controller)
- Sport-specific sub-clients: football, nba, basketball (start with these 3)
- In-memory cache (1h TTL, 3K max entries per sport, matching the TheSportsDB pattern)
- Rate tracking (log daily usage, warn at 80/100 requests)
- Metrics integration (api_sports.api_calls, api_sports.cache_hit, api_sports.api_errors)
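The cache and rate-tracking bullets above could be sketched roughly as follows. The class names TtlCache and DailyUsageTracker, the oldest-first eviction policy, and the injectable clock are all assumptions made for illustration; the existing TheSportsDB cache may behave differently.

```typescript
// Hypothetical sketch: names, eviction policy, and the clock parameter are assumptions.
type Clock = () => number; // milliseconds since the Unix epoch

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: Clock;

  constructor(ttlMs: number, maxEntries: number, now: Clock = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}

class DailyUsageTracker {
  private day = "";
  private count = 0;
  private readonly dailyCap: number;
  private readonly warnAt: number;
  private readonly now: Clock;

  constructor(dailyCap = 100, warnAt = 80, now: Clock = () => Date.now()) {
    this.dailyCap = dailyCap;
    this.warnAt = warnAt;
    this.now = now;
  }

  // Records one request and reports whether the warn threshold or cap is reached.
  record(): { count: number; warn: boolean; exhausted: boolean } {
    // toISOString() is always UTC, which lines up with the 00:00 UTC reset.
    const today = new Date(this.now()).toISOString().slice(0, 10);
    if (today !== this.day) {
      this.day = today;
      this.count = 0;
    }
    this.count += 1;
    return {
      count: this.count,
      warn: this.count >= this.warnAt,
      exhausted: this.count >= this.dailyCap,
    };
  }
}
```

Under these assumptions, one cache per sport would be created as new TtlCache(60 * 60 * 1000, 3000), and record() would be called only on cache misses so cached lookups do not count against the daily cap.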
Key methods:
// Football (soccer)
searchFootballPlayer(name: string): Promise<FootballPlayer | null>
getFootballPlayerStats(playerId: number, season: number): Promise<FootballPlayerStats | null>
searchFootballFixture(team1: string, team2: string, season?: number): Promise<FootballFixture | null>
getFixtureById(fixtureId: number): Promise<FootballFixtureDetail | null>
// NBA
searchNbaPlayer(name: string): Promise<NbaPlayer | null>
getNbaPlayerStats(playerId: number, season: number): Promise<NbaPlayerStats | null>
searchNbaGame(team1: string, team2: string, season?: number): Promise<NbaGame | null>
Response types: Typed interfaces for each sport's player, fixture/game, and stats responses.
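To illustrate what a typed response might look like, here is a sketch of a flattened FootballFixture type and a defensive parser. The raw field layout (fixture.id, fixture.date, teams.home.name, teams.away.name, goals.home, goals.away) is an assumption about the football API's response items and should be verified against its documentation; the flattened field names mirror those used in the pipeline snippet in Challenge 3.

```typescript
// Assumed raw shape per fixture item; verify field names against the API docs.
interface FootballFixture {
  id: number;
  date: string;
  home: string;
  away: string;
  homeScore: number | null; // null when the match has no recorded score
  awayScore: number | null;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Returns null instead of throwing when the payload does not match expectations.
function parseFootballFixture(raw: unknown): FootballFixture | null {
  if (!isRecord(raw)) return null;
  const fixture = raw["fixture"];
  const teams = raw["teams"];
  const goals = raw["goals"];
  if (!isRecord(fixture) || !isRecord(teams) || !isRecord(goals)) return null;

  const home = teams["home"];
  const away = teams["away"];
  if (!isRecord(home) || !isRecord(away)) return null;

  const id = fixture["id"];
  const date = fixture["date"];
  const homeName = home["name"];
  const awayName = away["name"];
  if (typeof id !== "number" || typeof homeName !== "string" || typeof awayName !== "string") {
    return null;
  }

  const homeGoals = goals["home"];
  const awayGoals = goals["away"];
  return {
    id,
    date: typeof date === "string" ? date : "",
    home: homeName,
    away: awayName,
    homeScore: typeof homeGoals === "number" ? homeGoals : null,
    awayScore: typeof awayGoals === "number" ? awayGoals : null,
  };
}
```

Parsing from unknown rather than casting keeps malformed or changed API responses from leaking into evidence findings as undefined values.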
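Acceptance: Client can look up "Lionel Messi" → player ID → 2022 season fixtures → World Cup final score. Client can look up "Michael Jordan" → player ID → 1986 playoff stats, provided the NBA API's historical coverage reaches that season; the 15+ years of history cited above would not by itself cover 1986, so confirm coverage before relying on this case.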
Challenge 3: Evidence Pipeline Integration
Files: packages/ai/src/validation/evidence.ts
Wire API-Sports into Phase 4b lookupExternalApis():
// In the sports branch of lookupExternalApis():
if (topicRoot === 'sports' && apiSportsKey) {
  const sport = detectSportFromTopic(topicPath) // 'football' | 'nba' | 'basketball' | null
  if (sport) {
    const playerResult = await searchPlayer(entityName, sport)
    if (playerResult) {
      // Add player metadata as evidence
      findings.push(`API-Sports: ${playerResult.name}, ${playerResult.team}`)
      // If fact contains match/game claims, look up fixture
      if (hasMatchClaim(factTitle, factContext)) {
        const fixture = await searchRelevantFixture(playerResult, factContext)
        if (fixture) {
          findings.push(`API-Sports fixture: ${fixture.home} ${fixture.homeScore}-${fixture.awayScore} ${fixture.away}`)
          // Hard evidence: actual score data to compare against fact claims
        }
      }
    }
  }
}
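The snippet above calls hasMatchClaim without defining it. One way it might work, sketched under the assumption that a simple keyword and score-pattern heuristic is good enough for a first pass, is shown below. The keyword list, the score regex, and the extractClaimedScore helper are all hypothetical and would need tuning against real facts.

```typescript
// Hypothetical heuristics: patterns are illustrative, not the plan's definition.
// Matches scores such as "3-3" or "112 - 104" (hyphen or en dash).
const SCORE_PATTERN = /\b(\d{1,3})\s*[-\u2013]\s*(\d{1,3})\b/;
const MATCH_KEYWORDS = /\b(final|match|game|beat|defeated|won|lost|scored|against|playoffs?)\b/i;

interface ClaimedScore {
  a: number;
  b: number;
}

// Pulls the first score-like pair out of the text, if any.
function extractClaimedScore(text: string): ClaimedScore | null {
  const m = SCORE_PATTERN.exec(text);
  return m ? { a: Number(m[1]), b: Number(m[2]) } : null;
}

// True when the fact mentions a score or uses match-related wording.
function hasMatchClaim(factTitle: string, factContext: string): boolean {
  const text = `${factTitle} ${factContext}`;
  return SCORE_PATTERN.test(text) || MATCH_KEYWORDS.test(text);
}
```

A heuristic this loose will also fire on things like day-month pairs written with hyphens, so it is best treated as a cheap gate before spending an API call, not as proof that a fact contains a verifiable score.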
Sport detection from topic path:
- sports/soccer/* → football
- sports/basketball/* → check if NBA entity → nba, else basketball
- sports/baseball/* → baseball
- sports/american-football/* → american-football
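The topic-path mapping above can be sketched as a small pure function. The isNbaEntity flag is a stand-in for the "check if NBA entity" step, whose actual mechanism the plan does not specify; the ApiSport type name is likewise an assumption.

```typescript
type ApiSport = "football" | "nba" | "basketball" | "baseball" | "american-football";

// Hypothetical signature: isNbaEntity stands in for the unspecified NBA entity check.
function detectSportFromTopic(topicPath: string, isNbaEntity: boolean = false): ApiSport | null {
  const [root, sport] = topicPath.toLowerCase().split("/");
  if (root !== "sports") return null;
  switch (sport) {
    case "soccer":
      return "football";
    case "basketball":
      return isNbaEntity ? "nba" : "basketball";
    case "baseball":
      return "baseball";
    case "american-football":
      return "american-football";
    default:
      return null; // unmapped sports fall through to the existing lookup path
  }
}
```

For example, detectSportFromTopic("sports/basketball/legends", true) yields "nba", while an unmapped path such as "sports/cricket/ashes" yields null.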
Acceptance: Evidence pipeline produces corroborating or contradicting findings from API-Sports when processing sports facts. Messi World Cup final fact gets verified with actual fixture data.
Challenge 4: Evidence Confidence Scoring
Files: packages/ai/src/validation/evidence.ts
Update Phase 4b confidence aggregation for API-Sports results:
- API-Sports match found with matching score → confidence 0.85 (strong corroboration)
- API-Sports match found with contradicting score → confidence 0.15 (strong contradiction)
- API-Sports player found but no matching fixture → confidence 0.6 (partial corroboration)
- API-Sports no results → fall through to existing Wikipedia/Wikidata path
Acceptance: Facts with API-Sports corroboration get higher evidence confidence than Wikipedia-only lookups.
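The confidence tiers above could be expressed as a single function, sketched here. The ApiSportsOutcome type and function name are hypothetical, and two behaviors are assumptions the plan does not state: scores are compared in either order (home/away or away/home), and a fixture found for a fact with no parseable score is treated like the partial-corroboration tier.

```typescript
// Hypothetical types: the plan does not define these shapes.
type ApiSportsOutcome =
  | { kind: "fixture"; homeScore: number; awayScore: number }
  | { kind: "player-only" }
  | { kind: "none" };

interface ScoreClaim {
  a: number;
  b: number;
}

// Returns a confidence value, or null to signal fall-through to Wikipedia/Wikidata.
function scoreApiSportsEvidence(outcome: ApiSportsOutcome, claim: ScoreClaim | null): number | null {
  switch (outcome.kind) {
    case "none":
      return null;
    case "player-only":
      return 0.6; // partial corroboration
    case "fixture": {
      // Assumption: without a parseable claimed score, treat as partial corroboration.
      if (claim === null) return 0.6;
      // Assumption: the fact may list the scores in either order.
      const matches =
        (claim.a === outcome.homeScore && claim.b === outcome.awayScore) ||
        (claim.a === outcome.awayScore && claim.b === outcome.homeScore);
      return matches ? 0.85 : 0.15;
    }
  }
}
```

Returning null for the no-results case keeps the fall-through explicit, so the caller cannot mistake "no API-Sports data" for a low-confidence contradiction.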
Challenge 5: Tests
Files: packages/ai/src/__tests__/api-sports-client.test.ts (new)
- Unit tests for entity name → sport detection
- Unit tests for response parsing/typing
- Mock-based tests for cache behavior
- Integration test: lookup known player + fixture (can skip in CI with env guard)
Acceptance: bun run test passes with new test file.
Challenge 6: Tier 1 Rerun
Run gpt-5.4-mini Tier 1 canary with API-Sports wired in:
bun scripts/seed/llm-fact-quality-testing.ts --all --models gpt-5.4-mini --limit 10
Acceptance: validation ≥95 (up from 90), evidence ≥80 (up from 50).
Cost Estimate
- Free tier: 100 req/day per sport → ~50 facts/day verified (2 calls per fact)
- Testing pipeline (10-50 facts/run): well within free tier
- Production: evaluate whether 100 requests/day per sport is sufficient or whether the $10/month paid tier is needed
Dependencies
- API_SPORTS_API_KEY in .env.local (already done)
- No new packages needed (uses fetch)
Non-Goals
- TheSportsDB v2 upgrade (separate, lower priority now)
- Real-time/live score features
- Non-sports API integrations (separate workstream)
- Formula 1, hockey, rugby (defer until football + NBA proven)