Railway Environments: Dev/Preview/Production Workflow

How Railway environments work, why they matter for Eko, and how to use them efficiently.

Why Environments Matter

Railway's environment model gives Eko something Vercel never could: full-stack preview environments. On Vercel, a preview deployment only covers the Next.js app — workers, crons, and queue consumers are invisible. On Railway, every environment is a complete isolated copy of the entire project — web apps, workers, queues, networking, and variables — all spun up together.

This means a PR that changes the fact extraction prompt can be tested end-to-end: the worker-facts instance in that PR environment consumes from its own queue context, with its own variables, completely isolated from production.

Environment Types

Railway provides three categories of environments:

Production

Created automatically with every project
The primary, always-on deployment
Connected to main branch for auto-deploy
Custom domains (app.eko.day, admin.eko.day, eko.day) point here
This is the only environment that should serve real users

Persistent Environments (Staging/Development)

Manually created, long-lived environments
Typically connected to a specific branch (e.g., staging, develop)
Isolated variables, networking, and configuration
Useful for pre-production validation before merging to main
Persist indefinitely until manually deleted

PR Environments (Ephemeral)

Temporary — created automatically when a PR is opened
Isolated — own services, variables, URLs, and networking
Self-cleaning — destroyed when the PR is merged or closed
Cost-efficient — usage-based billing means you only pay while the PR is open

How Environment Isolation Works

Each environment is fully isolated:

Layer	Isolation
Services	Each environment gets its own instances of every service
Variables	Environment-specific values (copied from the source environment at creation, then managed independently)
Networking	WireGuard mesh scoped per project+environment
DNS	Internal DNS resolver is environment-scoped
Domains	PR environments get auto-generated Railway domains
Databases	If using Railway-hosted DBs, each env gets its own (Eko uses external Supabase, so this is N/A)

Changes made to a service in one environment never affect other environments.

Setup Guide

1. Enable PR Environments

In the Railway dashboard:

Go to Project Settings > Environments
Toggle Enable PR Environments
Optionally toggle Enable Focused PR Environments (recommended for monorepos)
Optionally toggle Enable Bot PR Environments (for Dependabot, Renovate, Claude Code PRs)

2. Enable Focused PR Environments (Monorepo Optimization)

This is important for Eko. Without this, every PR spins up all 8 services even if the PR only touches one file. With focused PR environments:

Only services affected by changed files are deployed
Affected services are determined by watch paths or root directory configuration
Dependency services (referenced via ${{service.URL}} variables) also deploy
Undeployed services can be manually triggered from the canvas if needed

Example: A PR that only changes packages/ai/src/prompts.ts would deploy worker-facts and worker-validate (which depend on @eko/ai) but skip worker-ingest, eko-web, eko-admin, and eko-public.

3. Ensure Base Services Have Railway Domains

PR environments only receive auto-generated domains when their corresponding base (production) services use Railway-provided domains. Make sure production services have Railway domains assigned — even if you also use custom domains via Cloudflare.

4. Create a Persistent Staging Environment (Optional)

If you want a long-lived staging environment:

Open the environment dropdown in the top navigation
Select New Environment > Duplicate Environment
Choose production as the source
Name it staging
Configure it to auto-deploy from a staging branch
Override any variables that should differ (e.g., NODE_ENV=staging, LOG_LEVEL=debug)

Variable Management Across Environments

Inheritance Model

Railway does not have automatic variable inheritance between environments. Each environment has its own set of variables. When you create an environment by duplicating another, variables are copied at creation time — but they don't stay in sync.

Recommended Approach for Eko

Variable Type	Strategy
External service keys (Supabase, Upstash, Stripe, AI providers)	Copy to each environment. Use production keys in production, test keys in staging/PR
Operational config (`NODE_ENV`, `LOG_LEVEL`)	Set per environment (`production` / `staging` / `development`)
Sentry	Same DSN everywhere, differentiate by `SENTRY_ENVIRONMENT` variable
Secrets (`CRON_SECRET`)	Generate unique values per environment

Caveats

Supabase is external — PR environments will hit the same Supabase database unless you configure separate projects or use branch databases. For Eko, this means PR environment workers could write to production data if pointed at the production Supabase URL. Consider:
- Using a separate Supabase project for staging/PR environments
- Or restricting PR environments to read-only tests against production data (no worker writes)
- Or using Supabase branching (if available on your plan)
Upstash queues are shared — Same concern. PR environment workers would compete with production workers for queue messages unless you provision separate Upstash queues per environment. Mitigation:
- Set a different UPSTASH_REDIS_REST_URL per environment
- Or disable queue consumption in PR environments (health check only)
- Or use queue name prefixes per environment
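
The queue-name-prefix option can be sketched as a small helper. This is a minimal illustration, not Eko's actual queue code: `queueNameFor` and `slugify` are hypothetical names, and it assumes the worker can read the environment name from a variable such as `RAILWAY_ENVIRONMENT_NAME`.

```typescript
// Hypothetical helper: derive an environment-scoped queue name so that
// PR and staging workers never consume production messages.
type Env = Record<string, string | undefined>;

function slugify(name: string): string {
  // Lowercase, collapse anything outside [a-z0-9] into single dashes,
  // and trim dashes from both ends.
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function queueNameFor(baseName: string, env: Env): string {
  // Assumption: the environment name is exposed as RAILWAY_ENVIRONMENT_NAME.
  const envName = env["RAILWAY_ENVIRONMENT_NAME"] ?? "local";
  // Production keeps the bare name so existing producers keep working.
  if (envName === "production") return baseName;
  return `${slugify(envName)}:${baseName}`;
}
```

Producers and consumers in the same environment must both use the helper, otherwise messages land on a queue nobody reads.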

Workflows

Daily Development Workflow

1. Create feature branch locally
2. Develop and test with `bun run dev`
3. Push branch and open PR
4. Railway auto-creates PR environment
5. PR environment builds and deploys all affected services
6. Test against the PR environment's auto-generated URLs
7. Merge PR → Railway destroys PR environment
8. Production auto-deploys from `main`

Manual Deploy via CLI

# Deploy to production
railway up --service eko-web --environment production

# Deploy to staging
railway up --service eko-web --environment staging

# Deploy a specific worker
railway up --service worker-ingest --environment production

# Detached mode (don't wait for build logs)
railway up --service eko-web -d

# CI mode (stream build logs only, exit on build complete)
railway up --service eko-web -c

Rollback

# Via dashboard: click on a previous deployment → Rollback
# This redeploys the old deployment's source code

Rollbacks use the previous deployment's original source. They do not revert environment variable changes.

Redeploy (Apply Variable Changes)

railway redeploy --service worker-facts --environment production

Redeploy rebuilds without uploading new code — useful after changing environment variables.

Config as Code

Railway supports railway.json or railway.toml in the repo root for declarative configuration. Code config always overrides dashboard settings.

Example `railway.json` for Eko

{
  "$schema": "https://railway.com/railway.schema.json",
  "build": {
    "builder": "DOCKERFILE"
  },
  "deploy": {
    "healthcheckPath": "/health",
    "healthcheckTimeout": 30,
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 5,
    "drainingSeconds": 30
  },
  "environments": {
    "pr": {
      "deploy": {
        "restartPolicyMaxRetries": 2
      }
    }
  }
}

Environment-Specific Overrides

The environments block supports named environments and the special pr identifier for ephemeral environments. Resolution priority, highest first:

1. Named ephemeral environment config (e.g., environments.my-pr-123)
2. Hardcoded pr environment config
3. Base config (top-level build/deploy)
4. Dashboard service settings

This means you can configure PR environments to behave differently — e.g., fewer restart retries, different health check timeouts, or different scaling.
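
To make the priority order concrete, here is a rough model of how the layers combine. It is a sketch of the ordering described above, not Railway's implementation. `resolveDeployConfig` and `ConfigSources` are hypothetical, and the dashboard values in the example are made up.

```typescript
// Only the deploy keys used in this document are modelled.
type DeployConfig = {
  restartPolicyMaxRetries?: number;
  healthcheckTimeout?: number;
  drainingSeconds?: number;
};

interface ConfigSources {
  dashboard: DeployConfig; // service settings, lowest priority
  base: DeployConfig; // top-level "deploy" in railway.json
  environments: Record<string, DeployConfig>; // "environments" block
}

function resolveDeployConfig(
  src: ConfigSources,
  envName: string,
  isEphemeral: boolean,
): DeployConfig {
  const named: DeployConfig = src.environments[envName] ?? {};
  const pr: DeployConfig = isEphemeral ? (src.environments["pr"] ?? {}) : {};
  // Later spreads win, so sources are listed from lowest to highest priority.
  return { ...src.dashboard, ...src.base, ...pr, ...named };
}

// Values taken from the example railway.json; the dashboard entry is invented.
const sources: ConfigSources = {
  dashboard: { healthcheckTimeout: 300 },
  base: { healthcheckTimeout: 30, restartPolicyMaxRetries: 5, drainingSeconds: 30 },
  environments: { pr: { restartPolicyMaxRetries: 2 } },
};

const prConfig = resolveDeployConfig(sources, "eko-pr-123", true);
const prodConfig = resolveDeployConfig(sources, "production", false);
```

In this model the PR environment ends up with 2 restart retries while production keeps 5, and both take the 30-second health check timeout from the file rather than from the dashboard.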

Per-Service Config

Each service can have its own railway.json in its directory. For Eko's monorepo:

apps/worker-ingest/railway.json    # Worker-specific config
apps/worker-facts/railway.json     # Worker-specific config
apps/web/railway.json              # Web app config

Deployment Lifecycle

Every deployment progresses through these states:

Initializing → Building → Deploying → Active
                                    → Crashed (runtime failure)
                          → Failed (build failure)

Zero-Downtime Deploys

Railway uses singleton deploys by default. When a new deployment goes active:

1. New deployment builds and starts
2. Health check passes on new deployment
3. Old deployment receives SIGTERM
4. Configurable draining period (drainingSeconds) allows graceful shutdown
5. Old deployment is removed

For Eko's workers, the drainingSeconds: 30 setting gives the worker 30 seconds to finish processing its current queue message before being killed.
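
The draining period only helps if the worker stops picking up new messages once SIGTERM arrives. The sketch below is a simplified, synchronous model of that idea. `DrainableWorker` is a hypothetical class, not Eko's worker code, and a real worker would be asynchronous and would need to finish its in-flight message within drainingSeconds.

```typescript
// Minimal model of a drain-aware worker loop (hypothetical names throughout).
class DrainableWorker {
  private stopping = false;

  // Intended to be called from the SIGTERM handler, for example:
  //   process.on("SIGTERM", () => worker.requestStop());
  requestStop(): void {
    this.stopping = true;
  }

  // Handles messages in order. The stop flag is checked only between
  // messages, so the message being processed always runs to completion.
  run<T>(messages: T[], handle: (message: T) => void): T[] {
    const handled: T[] = [];
    for (const message of messages) {
      if (this.stopping) break;
      handle(message);
      handled.push(message);
    }
    return handled;
  }
}
```

Messages that were not picked up stay on the queue for the new deployment to consume.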

Efficiency Methods

1. Use Focused PR Environments

Enable this immediately. Without it, every PR deploys all 8 services — expensive and slow. With it, only affected services deploy.

2. Configure Watch Paths

Set watch paths per service so Railway knows which file changes affect which services:

Service	Watch Paths
`eko-web`	`apps/web/**`, `packages/ui/**`, `packages/db/**`, `packages/shared/**`
`worker-ingest`	`apps/worker-ingest/**`, `packages/queue/**`, `packages/db/**`, `packages/shared/**`, `packages/config/**`, `packages/observability/**`
`worker-facts`	`apps/worker-facts/**`, `packages/ai/**`, `packages/queue/**`, `packages/db/**`, `packages/shared/**`
`worker-validate`	`apps/worker-validate/**`, `packages/ai/**`, `packages/queue/**`, `packages/db/**`, `packages/shared/**`
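
To sanity-check a watch path table before relying on it, the mapping can be modelled locally. The sketch below is an approximation for reasoning about the table, not Railway's matcher: it treats each entry as a directory glob ending in `/**` and does a plain prefix match, which is narrower than the patterns Railway accepts. `affectedServices` and `matches` are hypothetical names.

```typescript
// Watch paths based on the table above, written as directory globs.
const watchPaths: Record<string, string[]> = {
  "eko-web": ["apps/web/**", "packages/ui/**", "packages/db/**", "packages/shared/**"],
  "worker-ingest": [
    "apps/worker-ingest/**", "packages/queue/**", "packages/db/**",
    "packages/shared/**", "packages/config/**", "packages/observability/**",
  ],
  "worker-facts": [
    "apps/worker-facts/**", "packages/ai/**", "packages/queue/**",
    "packages/db/**", "packages/shared/**",
  ],
  "worker-validate": [
    "apps/worker-validate/**", "packages/ai/**", "packages/queue/**",
    "packages/db/**", "packages/shared/**",
  ],
};

// Simplified matching: "dir/**" matches any file under dir/;
// anything else must match the file path exactly.
function matches(pattern: string, file: string): boolean {
  if (pattern.endsWith("/**")) return file.startsWith(pattern.slice(0, -2));
  return file === pattern;
}

// Returns the services whose watch paths match at least one changed file.
function affectedServices(changedFiles: string[]): string[] {
  return Object.entries(watchPaths)
    .filter(([, patterns]) =>
      patterns.some((p) => changedFiles.some((f) => matches(p, f))),
    )
    .map(([service]) => service)
    .sort();
}
```

Under this model, a change to packages/ai/src/prompts.ts selects worker-facts and worker-validate, and a change under docs/ selects nothing.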

3. Skip PR Environments for Docs-Only PRs

Watch paths naturally handle this — if a PR only touches docs/**, no services match, and no PR environment is created.

4. Use Detached Deploys for Non-Blocking Workflow

railway up -d --service eko-web
# Returns immediately — check dashboard for status

5. Use CI Mode in GitHub Actions

RAILWAY_TOKEN=${{ secrets.RAILWAY_TOKEN }} railway up -c --service eko-web

The -c flag streams only build logs and exits on completion — perfect for CI pipelines.

6. Rollback Instead of Hotfix

If a production deploy goes wrong, rolling back to the previous deployment is instant. Fix the code, open a PR, and deploy the fix through the normal flow instead of rushing a hotfix.

7. Use `railway redeploy` for Config Changes

After changing environment variables, use railway redeploy instead of pushing an empty commit. This rebuilds with the same source but picks up new variable values.

Comparison: Vercel vs Railway Environments

Feature	Vercel	Railway
Auto-preview per commit	Yes (every push)	PR-level only
Full-stack preview	No (web app only)	Yes (all services)
Worker preview	Not supported	Included automatically
Environment isolation	Partial (shared DB, no workers)	Full (network, DNS, variables)
Cleanup	Manual domain management	Automatic on PR close
Cost model	Per-seat + bandwidth	Usage-based
Config as code	`vercel.json`	`railway.json` / `railway.toml`
Monorepo optimization	Root directory config	Focused PR environments + watch paths
Rollback	Instant (redeploy image)	Instant (redeploy image)

Caveats and Gotchas

External Services Are Shared

Railway isolates its own services, but Eko's data layer is external:

Supabase — All environments hit the same database unless you configure separate Supabase projects per environment
Upstash Redis — All environments share queues unless you provision separate Redis instances
Stripe — Use test mode keys in non-production environments
Sentry — Same DSN is fine; tag with environment name

Recommendation: For PR environments, either point workers at a staging Supabase/Upstash instance, or disable queue consumption entirely and only test web app changes.
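
One way to enforce this is a startup check that refuses to run a non-production service with production credentials. The sketch below is illustrative only. `checkExternalConfig` is hypothetical, the variable names `STRIPE_SECRET_KEY`, `SUPABASE_URL`, and `RAILWAY_ENVIRONMENT_NAME` are assumptions about how Eko's services are configured, and the production Supabase URL is passed in rather than guessed.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of problems; an empty list means nothing suspicious was found.
function checkExternalConfig(env: Env, productionSupabaseUrl: string): string[] {
  const problems: string[] = [];
  const envName = env["RAILWAY_ENVIRONMENT_NAME"] ?? "local";
  // Production is allowed to use production credentials.
  if (envName === "production") return problems;

  // Stripe live-mode secret keys start with "sk_live_".
  if ((env["STRIPE_SECRET_KEY"] ?? "").startsWith("sk_live_")) {
    problems.push("Stripe live key used outside production");
  }
  if (env["SUPABASE_URL"] === productionSupabaseUrl) {
    problems.push("Supabase URL points at the production project");
  }
  // Same DSN everywhere is fine, but events should be tagged per environment.
  const sentryEnv = env["SENTRY_ENVIRONMENT"];
  if (!sentryEnv || sentryEnv === "production") {
    problems.push("SENTRY_ENVIRONMENT is missing or set to production");
  }
  return problems;
}
```

A worker could call this at boot and exit with a non-zero status when the list is non-empty, so a misconfigured PR environment fails its deploy instead of writing to production data.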

PR Environments Need Railway Domains on Base

If your production services only have custom domains (via Cloudflare), PR environments won't get auto-generated URLs. Always keep Railway-provided domains enabled on production services, even if they're not user-facing.

Config as Code Overrides Dashboard

Settings in railway.json silently override dashboard values. The dashboard won't reflect what's in your config file. This can be confusing when debugging — always check both.

Bot PRs Need Explicit Opt-In

Dependabot, Renovate, and Claude Code bot PRs don't trigger PR environments by default. Enable Bot PR Environments in project settings if you want CI-like environments for automated PRs.

Rollbacks Don't Revert Variables

Rolling back a deployment uses the old source code but the current environment variables. If a bad deploy was caused by a variable change, you need to manually revert the variable too.

No Automatic Variable Sync

When you update a variable in production, it does not propagate to staging or other environments. You must update each environment independently. Consider maintaining a checklist or script for cross-environment variable updates.
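
A script for this can be as simple as comparing variable names between two environments. The sketch below assumes you have already exported each environment's variables into a key/value map (how you export them is outside this example). `variableDrift` is a hypothetical name, and only keys are compared because values are expected to differ per environment.

```typescript
type Vars = Record<string, string>;

interface Drift {
  missingInTarget: string[]; // defined in source but not in target
  extraInTarget: string[]; // defined in target but not in source
}

// Compares variable names only; values are intentionally ignored.
function variableDrift(source: Vars, target: Vars): Drift {
  const sourceKeys = new Set(Object.keys(source));
  const targetKeys = new Set(Object.keys(target));
  return {
    missingInTarget: [...sourceKeys].filter((k) => !targetKeys.has(k)).sort(),
    extraInTarget: [...targetKeys].filter((k) => !sourceKeys.has(k)).sort(),
  };
}
```

Running this with production as the source and staging as the target after each variable change gives a quick list of names that still need to be added by hand.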
