8. Run Workers High Concurrency
Purpose: Start multiple worker instances with elevated concurrency to process large seeding pipeline runs faster.
Prerequisites:
- All standard worker prerequisites (see Start Workers)
- Sufficient API rate limits for the target concurrency level
- Seed controls configured for the run
- AI provider API key for the model backing each tier (check the
ai_model_tier_config DB table; e.g., GOOGLE_API_KEY for Gemini models)
Cost / Duration: $0 (worker startup; AI costs depend on workload) | ~5 minutes to start
Prompt
Start workers in high-concurrency mode for a large pipeline run (e.g., seeding).
This spins up multiple instances to parallelize fact extraction and validation.
Step 1 -- Start 5 facts worker instances, each processing 10 messages concurrently:
```bash
# Launch one background facts worker per port.
for port in 4010 4011 4012 4013 4014; do
  WORKER_CONCURRENCY=10 PORT=$port bun run dev:worker-facts &
done
```
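As a rough sizing check against the rate-limit prerequisite, the Step 1 settings bound how many messages can be in flight at once. The sketch below multiplies the instance count by WORKER_CONCURRENCY. The in_flight helper is a hypothetical name, and the sketch assumes each in-flight message may trigger at least one AI provider call, which this document does not confirm.

```bash
# in_flight <instances> <concurrency>
# Prints the upper bound on messages processed at the same time.
in_flight() {
  echo $(( $1 * $2 ))
}

# Step 1: 5 facts instances at WORKER_CONCURRENCY=10.
echo "facts worker messages in flight (max): $(in_flight 5 10)"
```

The same arithmetic applies to any other worker type: multiply its instance count by its WORKER_CONCURRENCY and compare the total with the provider's rate limit.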
Step 2 -- Start 3 validation worker instances (optional, for parallel validation):
```bash
# Launch one background validation worker per port.
for port in 4020 4021 4022; do
  WORKER_CONCURRENCY=10 PORT=$port bun run dev:worker-validate &
done
```
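The loops in Steps 1 and 2 leave every instance running as a background job. One way to stop them together afterwards is to record each PID at launch. This is a minimal sketch, not part of the project's tooling: start_instances, stop_instances, WORKER_PIDS, and WORKER_RUNNER are hypothetical names, and it assumes the workers exit when sent SIGTERM.

```bash
# Space-separated PIDs of the instances launched by start_instances.
WORKER_PIDS=""

# start_instances <script> <port>...
# Launches one background instance per port and records its PID.
# WORKER_RUNNER (hypothetical) defaults to "bun run"; it is left unquoted
# on purpose so that "bun run" splits into two words.
start_instances() {
  local script=$1
  shift
  local port
  for port in "$@"; do
    env WORKER_CONCURRENCY=10 PORT="$port" ${WORKER_RUNNER:-bun run} "$script" &
    WORKER_PIDS="$WORKER_PIDS $!"
  done
}

# stop_instances
# Sends SIGTERM to every recorded PID, waits for them to exit, and clears the list.
stop_instances() {
  if [ -n "$WORKER_PIDS" ]; then
    # Unquoted on purpose: WORKER_PIDS is a space-separated list.
    kill $WORKER_PIDS 2>/dev/null
    wait $WORKER_PIDS 2>/dev/null
  fi
  WORKER_PIDS=""
}

# Usage (same ports as Step 1):
#   start_instances dev:worker-facts 4010 4011 4012 4013 4014
#   ...
#   stop_instances
```

Both functions must run in the same shell session, because the PID list lives in a shell variable.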
Step 3 -- Start a single ingestion worker (one instance is usually sufficient):
```bash
bun run dev:worker-ingest
```
Step 4 (optional) -- If you have multiple API keys, assign them to different
instances for higher rate limits:
```bash
# Example with Google keys (adjust var name for current provider):
GOOGLE_API_KEY=$KEY_1 WORKER_CONCURRENCY=10 PORT=4010 bun run dev:worker-facts &
GOOGLE_API_KEY=$KEY_2 WORKER_CONCURRENCY=10 PORT=4011 bun run dev:worker-facts &
```
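With more instances than keys, the keys can be cycled round-robin across ports. The sketch below assumes the keys are exported as KEY_1, KEY_2, and so on, as in the example above; plan_key_assignments is a hypothetical helper. It only prints the commands it would run, so the plan can be reviewed before launching anything.

```bash
# plan_key_assignments <num_keys> <port>...
# Prints one "port key_var" pair per line, cycling through KEY_1..KEY_<num_keys>.
plan_key_assignments() {
  local num_keys=$1
  shift
  local i=0
  local port
  for port in "$@"; do
    echo "$port KEY_$(( i % num_keys + 1 ))"
    i=$(( i + 1 ))
  done
}

# Dry run: print the launch command for each instance instead of executing it.
# Adjust GOOGLE_API_KEY to the variable name your provider expects.
plan_key_assignments 2 4010 4011 4012 4013 4014 | while read -r port key_var; do
  echo "GOOGLE_API_KEY=\$$key_var WORKER_CONCURRENCY=10 PORT=$port bun run dev:worker-facts &"
done
```

Once the printed plan looks right, the lines can be pasted into a shell where the KEY_n variables are set.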
Monitor processing with:
```bash
bun scripts/diagnose-pipeline.ts
```
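To watch the trend over time rather than a single snapshot, the diagnostic can be re-run on an interval. A minimal sketch follows; poll is a hypothetical helper, and the interval and iteration count in the usage comment are arbitrary placeholders.

```bash
# poll <interval_seconds> <iterations> <command>...
# Runs the command <iterations> times, sleeping <interval_seconds> between runs.
poll() {
  local interval=$1
  local iterations=$2
  shift 2
  local i=0
  while [ "$i" -lt "$iterations" ]; do
    "$@"
    i=$(( i + 1 ))
    # Sleep between runs, but not after the final one.
    if [ "$i" -lt "$iterations" ]; then
      sleep "$interval"
    fi
  done
}

# Usage: run the diagnostic every 30 seconds, 20 times.
#   poll 30 20 bun scripts/diagnose-pipeline.ts
```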
Queue depths should decrease steadily. If you see 429 rate limit errors,
reduce WORKER_CONCURRENCY or spread across more API keys. If you see port
conflicts, ensure each instance uses a unique PORT.
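A quick pre-flight check can catch accidental port reuse across the worker types before anything is launched. The sketch below only looks for duplicates within the planned list; it does not detect ports already held by unrelated processes. duplicate_ports is a hypothetical helper.

```bash
# duplicate_ports <port>...
# Prints each port that appears more than once in the argument list.
duplicate_ports() {
  printf '%s\n' "$@" | sort | uniq -d
}

# Planned ports from Steps 1 and 2.
dupes=$(duplicate_ports 4010 4011 4012 4013 4014 4020 4021 4022)
if [ -n "$dupes" ]; then
  echo "Duplicate ports in plan: $dupes" >&2
else
  echo "All planned ports are unique."
fi
```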
Verification
- All worker instances started without port conflicts
- Queue depths are decreasing steadily
- No rate limit errors (429) in worker logs
- Diagnostic script confirms all instances are processing messages
- No memory pressure from too many instances
Related Prompts
- Start Workers -- Standard single-instance worker startup
- Check Queue Health -- Monitor processing rates during the run
- Seed the Database -- The primary use case for high concurrency