Error Handling Standards
Standards for consistent, observable, and recoverable error handling across the Eko codebase.
Core Rules
EH-001: Structured Error Types
All custom errors must extend a base error class with structured properties.
// Good: Structured error with code and context
class FetchError extends Error {
constructor(
message: string,
public readonly code: 'FETCH_TIMEOUT' | 'FETCH_NETWORK' | 'FETCH_HTTP',
public readonly context: { url: string; statusCode?: number }
) {
super(message)
this.name = 'FetchError'
}
}
// Bad: Generic error without context
throw new Error('Failed to fetch URL')
Required properties:
- code: Machine-readable error code (SCREAMING_SNAKE_CASE)
- message: Human-readable description
- context: Structured metadata object for debugging
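The rule refers to a base error class, while the example above extends Error directly. A minimal sketch of what such a base class could look like follows. The name AppError and its constructor signature are assumptions for illustration, not an existing Eko API.

```typescript
// Hypothetical base class: the name AppError and its shape are assumptions.
class AppError<Code extends string = string, Ctx extends object = object> extends Error {
  readonly code: Code // machine-readable, SCREAMING_SNAKE_CASE
  readonly context: Ctx // structured metadata for debugging

  constructor(name: string, code: Code, message: string, context: Ctx) {
    super(message)
    this.name = name
    this.code = code
    this.context = context
  }
}

type FetchErrorCode = 'FETCH_TIMEOUT' | 'FETCH_NETWORK' | 'FETCH_HTTP'
type FetchErrorContext = { url: string; statusCode?: number }

// Concrete errors narrow the code union and the context shape.
// The argument order (message, code, context) matches the FetchError example above.
class FetchError extends AppError<FetchErrorCode, FetchErrorContext> {
  constructor(message: string, code: FetchErrorCode, context: FetchErrorContext) {
    super('FetchError', code, message, context)
  }
}
```

With a shared base, logging and response code can read code and context from any custom error without knowing its concrete class.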
EH-002: Never Swallow Errors
Catch blocks must handle errors explicitly. Never silently swallow errors.
// Good: Log and continue
try {
await riskyOperation()
} catch (error) {
logger.error('Operation failed', { error, context: { userId } })
// Explicit decision to continue
}
// Good: Rethrow with context
try {
await riskyOperation()
} catch (error) {
throw new OperationError('Failed during processing', { cause: error })
}
// Good: Return Result type
try {
const result = await riskyOperation()
return { success: true, data: result }
} catch (error) {
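// Caught values are `unknown` under strict TypeScript, so narrow before reading `.message`
return { success: false, error: error instanceof Error ? error.message : String(error) }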
}
// Bad: Silent swallow
try {
await riskyOperation()
} catch (error) {
// Nothing here - error is lost
}
// Bad: Unstructured log that discards the error, with no explicit decision to continue
try {
await riskyOperation()
} catch {
console.log('failed')
}
Allowed patterns:
- Log and continue: Use logger.error() with context
- Rethrow with context: Wrap in a more specific error
- Return Result: Use { success, data/error } pattern
- Explicit ignore: Comment explaining why (rare)
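The Return Result pattern can be typed as a discriminated union so callers must check success before touching data. The sketch below is one possible shape. The names Result, toErrorMessage, and toResult are illustrative assumptions rather than existing helpers.

```typescript
// A discriminated union: `success` tells the compiler which fields exist.
type Result<T> =
  | { success: true; data: T }
  | { success: false; error: string }

// Caught values are `unknown` under strict TypeScript settings,
// so narrow to Error before reading `.message`.
function toErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error)
}

// Wraps an async operation and converts a thrown error into a failed Result.
async function toResult<T>(operation: () => Promise<T>): Promise<Result<T>> {
  try {
    return { success: true, data: await operation() }
  } catch (error) {
    return { success: false, error: toErrorMessage(error) }
  }
}
```

Callers then branch on result.success instead of wrapping each call in its own try/catch.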
EH-003: Worker Error Boundaries
Workers must wrap message processing in try/catch and never crash the process.
// Good: Worker error boundary
async function processMessage(message: QueueMessage) {
const startTime = Date.now()
try {
await handleMessage(message)
metrics.increment('worker.message.success')
} catch (error) {
logger.error('Worker message processing failed', {
error,
messageId: message.id,
storyId: message.data.story_id,
durationMs: Date.now() - startTime,
})
metrics.increment('worker.message.error')
// Don't rethrow - worker continues processing other messages
}
}
// Bad: Unhandled error crashes worker
async function processMessage(message: QueueMessage) {
await handleMessage(message) // Throws, worker crashes
}
Worker requirements:
- Wrap all message processing in try/catch
- Log errors with entity context (story_id, fact_record_id) when available
- Increment error metrics
- Never let errors crash the worker process
EH-004: API Error Responses
HTTP error responses must return a consistent JSON structure.
// Good: Consistent error response
export function errorResponse(
status: number,
code: string,
message: string
): NextResponse {
return NextResponse.json(
{ error: message, code },
{ status }
)
}
// Usage
return errorResponse(404, 'URL_NOT_FOUND', 'The requested URL does not exist')
return errorResponse(400, 'INVALID_INPUT', 'URL parameter is required')
return errorResponse(500, 'INTERNAL_ERROR', 'An unexpected error occurred')
Response format:
{
"error": "Human-readable error message",
"code": "MACHINE_READABLE_CODE"
}
Status code guidelines:
| Code | When |
|---|---|
| 400 | Invalid input, validation failure |
| 401 | Authentication required |
| 403 | Forbidden (auth ok, permission denied) |
| 404 | Resource not found |
| 429 | Rate limited |
| 500 | Internal server error |
Security: Never expose internal error details in production responses.
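One way to enforce the security note is to map every caught error to a client-safe body in a single place. The sketch below assumes a hypothetical PublicError class that marks messages as safe to return. It builds plain objects rather than a NextResponse so that it stays self-contained.

```typescript
type ErrorBody = { error: string; code: string }
type SafeError = { status: number; body: ErrorBody }

// Hypothetical marker class: only errors of this type expose their message to clients.
class PublicError extends Error {
  readonly status: number
  readonly code: string

  constructor(status: number, code: string, message: string) {
    super(message)
    this.name = 'PublicError'
    this.status = status
    this.code = code
  }
}

// Converts any thrown value into a status and body that are safe to send.
function toSafeError(error: unknown): SafeError {
  if (error instanceof PublicError) {
    return { status: error.status, body: { error: error.message, code: error.code } }
  }
  // Anything else may contain stack traces, hostnames, or query text.
  // Log it server-side and return only a generic body.
  return {
    status: 500,
    body: { error: 'An unexpected error occurred', code: 'INTERNAL_ERROR' },
  }
}
```

A route handler could pass the result to the errorResponse() helper above, so internal messages never reach the response.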
Decision Framework
Use this table to decide how to handle different error scenarios:
| Scenario | Pattern | Example |
|---|---|---|
| Recoverable external error | Log, metric, continue | Network timeout on optional fetch |
| Fatal config error | Fail fast at startup | Missing required env var |
| Transient failure | Retry with backoff | Database connection lost |
| Data integrity error | Log, skip record, alert | Malformed queue message |
| User input error | Return 4xx with message | Invalid URL format |
| Rate limit hit | Return 429, log | Too many API calls |
| Unexpected error | Log full stack, return 500 | Null pointer |
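The "fail fast at startup" row can be sketched as a single validation pass over required environment variables. The names ConfigError, requireEnv, and CONFIG_MISSING_ENV, as well as the variable names in the comment, are assumptions chosen for illustration.

```typescript
// Hypothetical structured error for startup configuration failures.
class ConfigError extends Error {
  readonly code: string
  readonly context: { missing: string[] }

  constructor(missing: string[]) {
    super(`Missing required environment variables: ${missing.join(', ')}`)
    this.name = 'ConfigError'
    this.code = 'CONFIG_MISSING_ENV'
    this.context = { missing }
  }
}

type Env = { [name: string]: string | undefined }

// Checks every required variable in one pass so the error lists all gaps at once.
// Unset and empty values both count as missing.
function requireEnv(env: Env, names: string[]): { [name: string]: string } {
  const missing = names.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new ConfigError(missing)
  }
  const values: { [name: string]: string } = {}
  for (const name of names) {
    values[name] = env[name] as string
  }
  return values
}

// At process startup in Node.js this might be called as:
//   const config = requireEnv(process.env, ['DATABASE_URL', 'QUEUE_URL'])
```

Throwing before the service accepts traffic turns a missing variable into an immediate, visible failure instead of a runtime error on the first request that needs it.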
Retry Patterns
For transient failures, use exponential backoff:
import { retry } from '@eko/shared'
// Good: Retry with exponential backoff
const result = await retry(
() => fetchUrl(url),
{
maxAttempts: 3,
baseDelayMs: 100,
}
)
Retry guidelines:
- Max 3 attempts for network operations
- Initial delay: 100-500ms
- Max delay: 5-10 seconds
- Only retry transient errors (timeouts, 5xx)
- Never retry 4xx or validation errors
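To make the guidelines concrete, here is a minimal sketch of a retry helper with capped exponential backoff. It is not the implementation of retry from @eko/shared. The maxDelayMs and isRetryable options, and every function name below, are assumptions for illustration.

```typescript
type RetryOptions = {
  maxAttempts: number
  baseDelayMs: number
  maxDelayMs: number // assumed option: upper bound on any single delay
  isRetryable: (error: unknown) => boolean // assumed option: transient-error check
}

// Delay after a failed attempt: base * 2^(attempt - 1), capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs)
}

// Covers 5xx responses only. Timeouts usually surface as thrown errors
// and need their own check inside isRetryable.
function isTransientStatus(status: number): boolean {
  return status >= 500 && status <= 599
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation()
    } catch (error) {
      // Give up on the last attempt or on a non-transient error (4xx, validation).
      if (attempt >= options.maxAttempts || !options.isRetryable(error)) {
        throw error
      }
      await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs))
    }
  }
}
```

With maxAttempts: 3 and baseDelayMs: 100, this sketch waits 100 ms after the first failure and 200 ms after the second, then rethrows the third failure to the caller.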
Logging Errors
Use structured logging with appropriate log levels:
import { logger } from '@eko/observability'
// Good: Structured error logging
logger.error('Story clustering failed', {
error, // Full error object
code: error.code,
storyId: story.id,
category: story.category,
attempt: currentAttempt,
durationMs: Date.now() - startTime,
})
// Bad: Console logging
console.error('Failed:', error.message)
Log levels:
| Level | When |
|---|---|
| error | Unexpected failures requiring investigation |
| warn | Expected but unusual conditions |
| info | Normal operation events |
| debug | Detailed debugging info (dev only) |
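A plain Error serializes to {} under JSON.stringify because message and stack are non-enumerable. Whether the logger from @eko/observability already handles this is not stated here, so the sketch below shows one way to flatten an unknown caught value into loggable fields. The name serializeError and the field layout are assumptions.

```typescript
type SerializedError = {
  name: string
  message: string
  code: string | undefined
  stack: string | undefined
}

// Flattens any thrown value into plain fields that survive JSON serialization.
function serializeError(error: unknown): SerializedError {
  if (error instanceof Error) {
    // Structured errors carry a string `code`; plain Errors do not.
    const code = (error as Error & { code?: unknown }).code
    return {
      name: error.name,
      message: error.message,
      code: typeof code === 'string' ? code : undefined,
      stack: error.stack,
    }
  }
  // Non-Error values (strings, objects) can be thrown too.
  return { name: 'NonErrorThrown', message: String(error), code: undefined, stack: undefined }
}
```

Passing serializeError(error) as a log field keeps the message, code, and stack visible even when the log transport serializes with JSON.stringify.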
Anti-Patterns
1. Empty Catch Blocks
// NEVER do this
try {
await operation()
} catch {
// Silent failure
}
2. Generic Error Messages
// Bad
throw new Error('Something went wrong')
// Good
throw new FetchError('URL returned HTTP 404', 'FETCH_HTTP', { url, statusCode: 404 })
3. Logging Sensitive Data
// Bad: Logs the whole request, including headers that may carry the API key
logger.error('API call failed', { request })
// Good: Redact sensitive fields
logger.error('API call failed', {
url: request.url,
method: request.method,
// headers intentionally omitted
})
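When some headers are useful for debugging, an alternative to omitting them all is to redact known sensitive names before logging. This is a sketch under stated assumptions: the header list, the RequestLike shape, and the function names are illustrative, and the list is not exhaustive.

```typescript
// Assumed list of header names that must never reach logs.
const SENSITIVE_HEADERS = ['authorization', 'cookie', 'x-api-key']

type Headers = { [name: string]: string }
type RequestLike = { url: string; method: string; headers: Headers }

// Returns a copy of the headers with sensitive values replaced.
// Header names are compared case-insensitively.
function redactHeaders(headers: Headers): Headers {
  const safe: Headers = {}
  for (const name of Object.keys(headers)) {
    const isSensitive = SENSITIVE_HEADERS.indexOf(name.toLowerCase()) !== -1
    safe[name] = isSensitive ? '[REDACTED]' : headers[name]
  }
  return safe
}

// Picks only the fields that are safe to log. The original request is not mutated.
function toLoggableRequest(request: RequestLike): RequestLike {
  return {
    url: request.url,
    method: request.method,
    headers: redactHeaders(request.headers),
  }
}
```

Building a new object from an allowlist of fields means a property added to the request later is not logged by accident.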
4. Inconsistent Error Responses
// Bad: Different formats in different routes
return { msg: 'Error!' }
return { error: { message: 'Error' } }
return { success: false, reason: 'Error' }
// Good: Use errorResponse() helper everywhere
return errorResponse(400, 'INVALID_INPUT', 'Missing required field')
Related
- Observability Guide — Logging and metrics
- Rules Index — All rules