
AWS Deployment Guide

Single EC2 instance deployment running Next.js, FastAPI, PostgreSQL, and Nginx.

Target: t3.small (2 vCPU, 2GB RAM) — suitable for up to ~25 concurrent users.

Architecture

┌──────────────────────────────────────┐
│  EC2 t3.small (Docker Compose)       │
│                                      │
│  ┌──────────┐  ┌──────────────────┐  │
│  │ Next.js  │  │  FastAPI         │  │
│  │ :3000    │  │  :9898           │  │
│  └────┬─────┘  └───────┬──────────┘  │
│       │                │             │
│  ┌────┴────────────────┴──────────┐  │
│  │  Nginx (:80 / :443)            │  │
│  └────────────────────────────────┘  │
│                                      │
│  ┌────────────────────────────────┐  │
│  │  PostgreSQL :5432              │  │
│  │  (containerized, local-only)   │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘

Prerequisites

  • AWS account with EC2 and S3 access
  • A domain name (optional, but needed for SSL)
  • SSH key pair for EC2

EC2 Instance Setup

1. Launch Instance

  • AMI: Amazon Linux 2023 or Ubuntu 22.04
  • Type: t3.small
  • Storage: 30GB gp3
  • Security Group:
      • SSH (22) — your IP only
      • HTTP (80) — 0.0.0.0/0
      • HTTPS (443) — 0.0.0.0/0

2. Install Docker

# Amazon Linux 2023
sudo dnf update -y
sudo dnf install -y docker git
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker ec2-user

# Install Docker Compose plugin
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 \
  -o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

# Install Docker Buildx (required for multi-stage builds)
sudo curl -SL https://github.com/docker/buildx/releases/download/v0.20.1/buildx-v0.20.1.linux-amd64 \
  -o /usr/local/lib/docker/cli-plugins/docker-buildx
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx

# Log out and back in for group changes
exit
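
After logging back in, confirm both CLI plugins are visible (each subcommand just prints its version):

docker compose version
docker buildx version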

3. Clone and Configure

git clone <backend-repo-url> ~/ai-tutor-backend
git clone <frontend-repo-url> ~/ai-tutor-ui
cd ~/ai-tutor-backend/deploy

cp .env.production.example .env.production

Edit .env.production with your actual values:

nano .env.production

Required changes:

  • POSTGRES_PASSWORD — strong random password
  • SECRET_KEY — generate with openssl rand -hex 32
  • BACKEND_CORS_ORIGINS — your domain
  • ANTHROPIC_API_KEY or Bedrock config
  • BACKUP_S3_BUCKET — your S3 bucket name
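
The two secrets can come straight from openssl (watch for characters that need URL-escaping if the password ends up inside a connection string):

openssl rand -hex 32      # SECRET_KEY
openssl rand -base64 24   # POSTGRES_PASSWORD candidate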

4. Deploy

Important: The env file is named .env.production (not .env), so --env-file .env.production is required on every docker compose command.
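
Optional convenience: an alias in ~/.bashrc saves retyping the flag (the alias name is arbitrary; run it from the deploy directory, since the env-file path is relative):

alias dcp='docker compose --env-file .env.production'
# then: dcp ps, dcp logs -f backend, etc.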

cd ~/ai-tutor-backend/deploy
docker compose --env-file .env.production up -d --build

Verify everything is running:

docker compose --env-file .env.production ps
curl http://localhost/health
curl http://localhost/
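
The stack can take a little while on first boot; a small poll loop saves guessing (assumes /health returns 200 once everything is ready, as the check above implies):

until curl -fsS http://localhost/health >/dev/null 2>&1; do sleep 2; done; echo "stack is up"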

5. Run Database Migrations

docker compose --env-file .env.production exec backend alembic upgrade head
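
To see which revision the database is now on:

docker compose --env-file .env.production exec backend alembic current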

6. Set Up Backups (Optional — requires S3 bucket)

Skip this step if you don't have an S3 bucket configured yet. The app runs fine without backups — you can set this up later.

# Install AWS CLI
sudo dnf install -y aws-cli

# Configure credentials
aws configure

# Test backup
cd ~/ai-tutor-backend/deploy
source .env.production
chmod +x scripts/backup-postgres.sh
./scripts/backup-postgres.sh

# Schedule daily backup at 3 AM
(crontab -l 2>/dev/null; echo "0 3 * * * cd ~/ai-tutor-backend/deploy && source .env.production && ./scripts/backup-postgres.sh >> /var/log/pg-backup.log 2>&1") | crontab -
7. Set Up SSL (Optional — requires a domain)

sudo dnf install -y certbot
sudo certbot certonly --standalone -d your-domain.com

# Then uncomment the SSL sections in:
# - deploy/nginx/nginx.conf (the server blocks at the bottom)
# - deploy/docker-compose.yml (port 443 and letsencrypt volume)

docker compose --env-file .env.production restart nginx
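
Certificates from certbot expire after 90 days. Standalone renewal needs port 80 free, so stop nginx around the renewal. A sketch for root's crontab (sudo crontab -e); the container name deploy-nginx-1 follows Compose's default naming and may differ on your host:

# Runs weekly; certbot only renews certs within 30 days of expiry
0 4 * * 1 certbot renew --pre-hook "docker stop deploy-nginx-1" --post-hook "docker start deploy-nginx-1"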

Common Operations

View logs

docker compose --env-file .env.production logs -f backend
docker compose --env-file .env.production logs -f frontend
docker compose --env-file .env.production logs -f postgres

Restart a service

docker compose --env-file .env.production restart backend

Redeploy after code changes

Automatic (default): Merge or push to staging branch — GitHub Actions auto-deploys to EC2. No manual steps needed.

git checkout staging && git merge main && git push && git checkout main

Manual fallback (if Actions is unavailable):

cd ~/ai-tutor-backend && git checkout staging && git pull
cd ~/ai-tutor-ui && git checkout staging && git pull
cd ~/ai-tutor-backend/deploy
docker compose --env-file .env.production up -d --build
docker compose --env-file .env.production exec -T backend alembic upgrade head

Manual database backup

cd ~/ai-tutor-backend/deploy
source .env.production
./scripts/backup-postgres.sh

Restore from backup

aws s3 cp s3://your-bucket/backups/postgres/ai_tutor_20260217.sql.gz /tmp/
gunzip /tmp/ai_tutor_20260217.sql.gz
docker compose --env-file .env.production exec -T postgres psql -U ai_tutor -d ai_tutor < /tmp/ai_tutor_20260217.sql
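
If the backend is serving traffic during a restore you can hit write conflicts; a cautious sequence is to stop it first:

docker compose --env-file .env.production stop backend
# ...run the restore above...
docker compose --env-file .env.production start backend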

Monitoring

Health check: curl http://3.151.25.120/health

Metrics (Prometheus): curl http://3.151.25.120/metrics

Resource usage: docker stats
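
Disk tends to fill before CPU on this setup, since every rebuild leaves old image layers behind. Two read-only checks worth running alongside docker stats:

df -h /            # overall usage of the 30GB root volume
docker system df   # Docker's share: images, build cache, volumes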

Cost Estimate

Item                      Monthly
EC2 t3.small on-demand    ~$15
30GB EBS gp3              ~$2.40
S3 backups                ~$0.50
Total                     ~$18/mo

With 1-year Reserved Instance: ~$11/mo


Future Expansion Guide

As your user base grows, here's the upgrade path — each step is independent.

Phase 1: Move Docker Builds to GitHub Actions + GHCR ($0 extra)

When: Deploys are slowing down the server, or you want faster/safer deploys.

Why: Right now, Docker images are built on EC2 during deploy. The Next.js build uses most of the server's 2GB RAM for ~2 minutes. Moving builds to GitHub Actions means EC2 just pulls a pre-built image — deploys drop from 2-3 min to ~10 seconds with zero server impact.

How it works:

Current:  Push → GitHub Actions → SSH → git pull → docker build ON EC2 → restart
Future:   Push → GitHub Actions builds image → pushes to GHCR → SSH → docker pull → restart

Steps:

  1. Enable GHCR (GitHub Container Registry) — it's free with your GitHub account (500MB storage on free plan, plenty for 2 images)

  2. Update .github/workflows/deploy.yml in both repos to:
      • Check out the code on the GitHub runner
      • Build the Docker image on the runner (7GB RAM, much faster)
      • Tag it as ghcr.io/ai-teacher-poc/ai-tutor-backend:staging
      • Push it to GHCR
      • SSH into EC2 and run docker compose pull && docker compose up -d

  3. Update deploy/docker-compose.yml to use registry images instead of local builds:

    backend:
      # Replace this:
      # build:
      #   context: ..
      #   dockerfile: Dockerfile
      # With this:
      image: ghcr.io/ai-teacher-poc/ai-tutor-backend:staging

  4. Add a GitHub Actions secret GHCR_TOKEN (a Personal Access Token with write:packages scope) — or use the built-in GITHUB_TOKEN which has automatic GHCR access

  5. Log into GHCR on EC2 (one-time setup):

    echo "YOUR_GITHUB_PAT" | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin

Result: EC2 never builds images again. Deploys are ~10 seconds. No memory pressure during deploys. Instant rollback by pulling a previous image tag.
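
Under this setup, a manual deploy (should Actions be unavailable) collapses to a pull, assuming the compose file already references the GHCR images from step 3:

cd ~/ai-tutor-backend/deploy
docker compose --env-file .env.production pull
docker compose --env-file .env.production up -d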

Phase 2: Move PostgreSQL to RDS (~$15/mo extra)

When: 50+ users, or you want automated backups/failover.

Steps:

  1. Create an RDS PostgreSQL instance (db.t4g.micro or db.t4g.small)
  2. Dump your local database:
    docker compose --env-file .env.production exec -T postgres pg_dump -U ai_tutor -d ai_tutor > dump.sql
    
  3. Import into RDS:
    psql -h your-rds-endpoint.amazonaws.com -U ai_tutor -d ai_tutor < dump.sql
    
  4. Update deploy/.env.production:
    # Comment out local postgres vars
    # POSTGRES_USER=...
    # POSTGRES_PASSWORD=...
    
    # Add RDS connection
    RDS_USER=ai_tutor
    RDS_PASSWORD=your_rds_password
    RDS_ENDPOINT=your-db.xxxx.us-east-1.rds.amazonaws.com
    RDS_DB=ai_tutor
    
  5. In deploy/docker-compose.yml:
      • Switch the DATABASE_URL line (commented instructions are already in the file)
      • Remove the postgres service
      • Remove postgres_data volume
      • Remove the depends_on: postgres from backend
  6. Redeploy: docker compose --env-file .env.production up -d
  7. Remove the backup cron job (RDS handles backups automatically)
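
Before cutting over, confirm the instance can reach RDS at all. A quick check, assuming the psql client is installed on the host (sudo dnf install -y postgresql15) and the RDS security group allows the EC2 instance on port 5432:

source .env.production   # loads the RDS_* variables from step 4
PGPASSWORD="$RDS_PASSWORD" psql "host=$RDS_ENDPOINT user=$RDS_USER dbname=$RDS_DB sslmode=require" -c "SELECT 1;"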

Phase 3: Add SSL and a Domain

When: Going to production / sharing with real users.

Steps:

  1. Point your domain's DNS A record to the EC2 Elastic IP
  2. Run certbot: sudo certbot certonly --standalone -d your-domain.com
  3. Uncomment the SSL sections in deploy/nginx/nginx.conf
  4. Uncomment port 443 and letsencrypt volume in deploy/docker-compose.yml
  5. Restart: docker compose --env-file .env.production restart nginx
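
Certbot will fail if DNS hasn't propagated yet, so a quick sanity check first (dig ships in the bind-utils package on Amazon Linux):

dig +short your-domain.com   # should print the Elastic IP from step 1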

Phase 4: Separate Frontend to Vercel/Amplify

When: Need CDN, edge caching, or faster global page loads.

Steps:

  1. Deploy ai-tutor-ui to Vercel with NEXT_PUBLIC_API_BASE_URL=https://api.your-domain.com/api/v1
  2. Remove the frontend service from docker-compose
  3. Update Nginx to only proxy /api/* to the backend (remove the frontend location block)
  4. Update CORS in .env.production to allow Vercel domain
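
A quick CORS spot-check after the cutover (the Vercel URL and endpoint path here are illustrative; substitute your own):

curl -si -H "Origin: https://your-app.vercel.app" https://api.your-domain.com/health | grep -i access-control-allow-origin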

Phase 5: Containerize with ECS/Fargate

When: 100+ concurrent users, need auto-scaling.

Steps:

  1. Push Docker images to ECR (Elastic Container Registry)
  2. Create ECS task definitions using the existing Dockerfiles
  3. Set up an ALB (Application Load Balancer) to replace Nginx
  4. Configure auto-scaling policies based on CPU/memory
  5. The existing Dockerfiles and health checks work as-is with ECS

Phase 6: Add Caching with ElastiCache

When: Database queries becoming a bottleneck (unlikely at <100 users).

Add a Redis instance for:

  • Session caching
  • LLM response caching (for repeated questions)
  • Rate limiting
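
The caching pattern itself is small; sketched with redis-cli and illustrative key names:

redis-cli SET "llm:resp:<question-hash>" "<model answer>" EX 3600   # cache an answer for 1 hour
redis-cli GET "llm:resp:<question-hash>"                            # hit on repeated questions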

Decision Matrix

Users     Recommended Setup             Monthly Cost
1–25      Single EC2 (current)          ~$18
25–50     EC2 + RDS                     ~$35
50–100    EC2 + RDS + Vercel            ~$35 + Vercel free tier
100+      ECS Fargate + RDS + Vercel    ~$80–150