AWS Deployment Guide¶
Single EC2 instance deployment running Next.js, FastAPI, PostgreSQL, and Nginx. Docker images are built on GitHub Actions and pushed to GHCR — EC2 only pulls pre-built images.
Target: t3.small (2 vCPU, 2GB RAM) — suitable for up to ~25 concurrent users.
Architecture¶
┌──────────────────────────────────────┐
│ EC2 t3.small (Docker Compose) │
│ │
│ ┌──────────┐ ┌──────────────────┐ │
│ │ Next.js │ │ FastAPI │ │
│ │ :3000 │ │ :9898 │ │
│ └────┬─────┘ └───────┬─────────┘ │
│ │ │ │
│ ┌────┴────────────────┴──────────┐ │
│ │ Nginx (:80 / :443) │ │
│ └────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ PostgreSQL :5432 │ │
│ │ (containerized, local-only) │ │
│ └────────────────────────────────┘ │
└──────────────────────────────────────┘
Prerequisites¶
- AWS account with EC2 and S3 access
- A domain name (optional, but needed for SSL)
- SSH key pair for EC2
EC2 Instance Setup¶
1. Launch Instance¶
- AMI: Amazon Linux 2023 or Ubuntu 22.04
- Type: t3.small
- Storage: 30GB gp3
- Security Group:
- SSH (22) — your IP only
- HTTP (80) — 0.0.0.0/0
- HTTPS (443) — 0.0.0.0/0
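If you prefer the CLI to the console, the same security group can be sketched with the AWS CLI (the group name is illustrative; replace 203.0.113.10/32 with your own IP):

```shell
# Hypothetical group name — adjust to your naming convention
aws ec2 create-security-group \
  --group-name ai-tutor-sg \
  --description "AI Tutor single-instance deployment"

# SSH from your IP only
aws ec2 authorize-security-group-ingress --group-name ai-tutor-sg \
  --protocol tcp --port 22 --cidr 203.0.113.10/32

# HTTP/HTTPS open to the world
aws ec2 authorize-security-group-ingress --group-name ai-tutor-sg \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name ai-tutor-sg \
  --protocol tcp --port 443 --cidr 0.0.0.0/0
```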
2. Install Docker¶
# Amazon Linux 2023
sudo dnf update -y
sudo dnf install -y docker git
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker ec2-user
# Install Docker Compose plugin
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 \
-o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
# Install Docker Buildx (needed for BuildKit features such as layer caching)
# Note: pin the release in the path — "releases/latest/download/" only works
# if the asset name matches the actual latest release
sudo curl -SL https://github.com/docker/buildx/releases/download/v0.20.1/buildx-v0.20.1.linux-amd64 \
-o /usr/local/lib/docker/cli-plugins/docker-buildx
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx
# Log out and back in for group changes
exit
3. Clone and Configure¶
git clone <backend-repo-url> ~/ai-tutor-backend
git clone <frontend-repo-url> ~/ai-tutor-ui
cd ~/ai-tutor-backend/deploy
cp .env.production.example .env.production
Edit .env.production with your actual values:
Required changes:
- POSTGRES_PASSWORD — strong random password
- SECRET_KEY — openssl rand -hex 32
- BACKEND_CORS_ORIGINS — your domain
- ANTHROPIC_API_KEY or Bedrock config
- BACKUP_S3_BUCKET — your S3 bucket name
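For example, generating the two secrets and sketching the resulting file (values are illustrative; the authoritative key list comes from .env.production.example):

```shell
# Generate strong random values (openssl is preinstalled on Amazon Linux 2023)
POSTGRES_PASSWORD=$(openssl rand -base64 24)
SECRET_KEY=$(openssl rand -hex 32)   # 64 hex characters

# Sketch of the resulting .env.production — domain and bucket are placeholders
cat <<EOF
POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
SECRET_KEY=${SECRET_KEY}
BACKEND_CORS_ORIGINS=https://your-domain.com
BACKUP_S3_BUCKET=your-backup-bucket
EOF
```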
4. Deploy¶
Important: The env file is named .env.production (not .env), so --env-file .env.production is required on every docker compose command.
cd ~/ai-tutor-backend/deploy
# First deploy: login to GHCR, pull images, and start
# (subsequent deploys are handled automatically by GitHub Actions)
echo "YOUR_GITHUB_PAT" | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin
docker compose --env-file .env.production pull
docker compose --env-file .env.production up -d
# Or build locally if GHCR images aren't available yet:
# docker compose --env-file .env.production up -d --build
Verify everything is running:
docker compose --env-file .env.production ps
5. Run Database Migrations¶
docker compose --env-file .env.production exec -T backend alembic upgrade head
6. Set Up Backups (Optional — requires S3 bucket)¶
Skip this step if you don't have an S3 bucket configured yet. The app runs fine without backups — you can set this up later.
# Install AWS CLI
sudo dnf install -y aws-cli
# Configure credentials
aws configure
# Test backup
cd ~/ai-tutor-backend/deploy
source .env.production
chmod +x scripts/backup-postgres.sh
./scripts/backup-postgres.sh
# Schedule daily backup at 3 AM (log under $HOME — ec2-user cannot write to /var/log)
(crontab -l 2>/dev/null; echo "0 3 * * * cd ~/ai-tutor-backend/deploy && source .env.production && ./scripts/backup-postgres.sh >> \$HOME/pg-backup.log 2>&1") | crontab -
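The repo's deploy/scripts/backup-postgres.sh is the source of truth; conceptually it does something like the following (a hypothetical sketch — names and paths are illustrative, matched to the restore example later in this guide):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a pg_dump-to-S3 backup; the real script may differ.
set -euo pipefail

STAMP=$(date +%Y%m%d)
FILE="ai_tutor_${STAMP}.sql.gz"

# Dump from the running postgres container and compress
docker compose --env-file .env.production exec -T postgres \
  pg_dump -U ai_tutor ai_tutor | gzip > "/tmp/${FILE}"

# Ship to S3 (BACKUP_S3_BUCKET comes from .env.production)
aws s3 cp "/tmp/${FILE}" "s3://${BACKUP_S3_BUCKET}/backups/postgres/${FILE}"
rm -f "/tmp/${FILE}"
```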
7. SSL with Let's Encrypt (optional but recommended)¶
sudo dnf install -y certbot
# Standalone mode binds port 80 itself, so stop nginx for a moment
docker compose --env-file .env.production stop nginx
sudo certbot certonly --standalone -d your-domain.com
# Then uncomment the SSL sections in:
# - deploy/nginx/nginx.conf (the server blocks at the bottom)
# - deploy/docker-compose.yml (port 443 and letsencrypt volume)
# Use up -d (not restart) so the new port 443 mapping takes effect
docker compose --env-file .env.production up -d nginx
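Once uncommented, the SSL block will have roughly this shape (a sketch — the real server blocks are already in deploy/nginx/nginx.conf; only the domain needs substituting, and the proxy targets assume the compose service names):

```nginx
# Illustrative shape of the SSL server block in deploy/nginx/nginx.conf
server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate     /etc/letsencrypt/live/your-domain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;

    location /api/ {
        proxy_pass http://backend:9898;
        proxy_set_header Host $host;
    }
    location / {
        proxy_pass http://frontend:3000;
        proxy_set_header Host $host;
    }
}
```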
Common Operations¶
View logs¶
docker compose --env-file .env.production logs -f backend
docker compose --env-file .env.production logs -f frontend
docker compose --env-file .env.production logs -f postgres
Restart a service¶
docker compose --env-file .env.production restart backend
Redeploy after code changes¶
Automatic (default): Merge or push to staging branch — GitHub Actions builds a Docker image on its runner (7GB RAM), pushes it to GHCR, then SSHs into EC2 to pull and restart. EC2 never builds images.
Manual fallback (if Actions is unavailable):
# Pull pre-built images from GHCR
cd ~/ai-tutor-backend && git checkout staging && git pull
cd ~/ai-tutor-backend/deploy
docker compose --env-file .env.production pull
docker compose --env-file .env.production up -d
docker compose --env-file .env.production exec -T backend alembic upgrade head
# Or build locally on EC2 (slow, avoid if possible)
docker compose --env-file .env.production up -d --build
Rollback to a previous version¶
Each deploy tags the image with the git commit SHA. To rollback:
cd ~/ai-tutor-backend/deploy
# Find available tags at: https://github.com/orgs/AI-Teacher-POC/packages
# Then edit docker-compose.yml to pin the image tag, e.g.:
# image: ghcr.io/ai-teacher-poc/ai-tutor-backend:staging-abc1234
# Or pull a specific tag directly:
docker pull ghcr.io/ai-teacher-poc/ai-tutor-backend:staging-<commit-sha>
docker tag ghcr.io/ai-teacher-poc/ai-tutor-backend:staging-<commit-sha> ghcr.io/ai-teacher-poc/ai-tutor-backend:staging
docker compose --env-file .env.production up -d backend
Manual database backup¶
cd ~/ai-tutor-backend/deploy && ./scripts/backup-postgres.sh
Restore from backup¶
aws s3 cp s3://your-bucket/backups/postgres/ai_tutor_20260217.sql.gz /tmp/
gunzip /tmp/ai_tutor_20260217.sql.gz
docker compose --env-file .env.production exec -T postgres psql -U ai_tutor -d ai_tutor < /tmp/ai_tutor_20260217.sql
Monitoring & Observability¶
The application uses a multi-layer observability stack. For the complete guide, see Observability Guide.
Quick reference:
| Layer | Tool | What It Catches |
|---|---|---|
| Error tracking | Sentry | Application crashes, exceptions, slow queries |
| Uptime monitoring | UptimeRobot | Site completely unreachable |
| Container logs | CloudWatch Logs | All stdout/stderr from backend, frontend, nginx |
| Infrastructure | CloudWatch Alarms | CPU spikes, EC2 status check failures |
Quick checks:
# Health check (API lives on learn subdomain)
curl https://learn.2sigma.io/health
# Prometheus metrics (raw)
curl https://learn.2sigma.io/metrics
# View container logs on EC2
docker compose --env-file .env.production logs -f backend
# View CloudWatch logs (from dev machine)
aws logs tail /ai-tutor/backend --follow --profile 2sigma
Dashboards:
- Sentry: https://sentry.io (org: 2sigma)
- UptimeRobot: https://uptimerobot.com
- CloudWatch: AWS Console → CloudWatch → Log Groups / Alarms
Cost Estimate¶
| Item | Monthly |
|---|---|
| EC2 t3.small on-demand | ~$15 |
| 30GB EBS gp3 | ~$2.40 |
| S3 backups | ~$0.50 |
| Total | ~$18/mo |
With 1-year Reserved Instance: ~$11/mo
Future Expansion Guide¶
As your user base grows, here's the upgrade path — each step is independent.
Phase 1: Move Docker Builds to GitHub Actions + GHCR ($0 extra) — DONE¶
Status: Implemented. Docker images are built on GitHub Actions runners and pushed to GHCR. EC2 only pulls pre-built images.
How it works:
Push to staging → GitHub Actions builds image (7GB RAM runner) → pushes to GHCR → SSH → docker pull → restart
Components:
- `.github/workflows/deploy.yml` in both repos — builds, pushes to GHCR, deploys via SSH
- `deploy/docker-compose.yml` — `image:` directives point to `ghcr.io/ai-teacher-poc/`
- GHCR authentication: `GITHUB_TOKEN` (automatic) pushes images; the same token is passed via SSH for EC2 pulls
- Image tags: `:staging` (latest) and `:staging-{commit-sha}` (for rollback)
- Docker layer caching via GitHub Actions cache (`type=gha`) speeds up subsequent builds
Result: EC2 never builds images. Deploys take ~10 seconds on EC2. No memory pressure during deploys. Instant rollback by pulling a previous image tag.
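The workflow has roughly this shape (an abridged sketch — the authoritative file is .github/workflows/deploy.yml in each repo; action versions and secret names here are illustrative):

```yaml
# Abridged sketch of .github/workflows/deploy.yml — illustrative, not verbatim
name: Deploy
on:
  push:
    branches: [staging]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    permissions:
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: |
            ghcr.io/ai-teacher-poc/ai-tutor-backend:staging
            ghcr.io/ai-teacher-poc/ai-tutor-backend:staging-${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Deploy over SSH
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.EC2_HOST }}
          username: ec2-user
          key: ${{ secrets.EC2_SSH_KEY }}
          script: |
            cd ~/ai-tutor-backend/deploy
            docker compose --env-file .env.production pull
            docker compose --env-file .env.production up -d
```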
One-time setup for Sentry source maps (optional):
Add NEXT_PUBLIC_SENTRY_DSN as a GitHub Actions variable (Settings → Variables → Actions) in the ai-tutor-ui repo so the Sentry DSN is embedded during the frontend build. This is a public identifier (not a secret).
Adding new NEXT_PUBLIC_* environment variables:
Next.js inlines NEXT_PUBLIC_* values at build time — they cannot be set at container runtime. When adding a new one, all three of these steps are required:
- Dockerfile (`ai-tutor-ui/Dockerfile`) — add `ARG` and include it in the `ENV` block
- GitHub Actions workflow (`ai-tutor-ui/.github/workflows/deploy.yml`) — add to the `build-args` list, referencing `${{ vars.YOUR_VAR_NAME }}`
- GitHub repo variable — set the value in the `ai-tutor-ui` repo under Settings → Variables → Actions
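Wiring a hypothetical NEXT_PUBLIC_FOO through the Dockerfile side might look like this (illustrative name; the exact stage layout depends on ai-tutor-ui/Dockerfile):

```dockerfile
# In ai-tutor-ui/Dockerfile, build stage — NEXT_PUBLIC_FOO is a hypothetical example
ARG NEXT_PUBLIC_FOO
ENV NEXT_PUBLIC_FOO=${NEXT_PUBLIC_FOO}
```

The matching workflow entry would then be a `build-args` line such as `NEXT_PUBLIC_FOO=${{ vars.NEXT_PUBLIC_FOO }}`, with the value itself stored as a repo variable.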
Current NEXT_PUBLIC_* variables configured as build args:
| Variable | Purpose |
|---|---|
| NEXT_PUBLIC_API_BASE_URL | API base path (hardcoded /api/v1) |
| NEXT_PUBLIC_SENTRY_DSN | Sentry error tracking DSN |
| NEXT_PUBLIC_SUPPORT_WHATSAPP | WhatsApp number for support widget |
| NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY | Stripe publishable key for client-side payments |
Non-NEXT_PUBLIC runtime variables (set in docker-compose.yml environment):
| Variable | Purpose |
|---|---|
| INTERNAL_API_URL | Server-side API URL bypassing nginx (http://backend:9898/api/v1) |
Phase 2: Move PostgreSQL to RDS (~$15/mo extra)¶
When: 50+ users, or you want automated backups/failover.
Steps:
- Create an RDS PostgreSQL instance (`db.t4g.micro` or `db.t4g.small`)
- Dump your local database:
  docker compose --env-file .env.production exec -T postgres pg_dump -U ai_tutor ai_tutor > /tmp/ai_tutor.sql
- Import into RDS:
  psql -h <rds-endpoint> -U ai_tutor -d ai_tutor < /tmp/ai_tutor.sql
- Update deploy/.env.production:
  - Switch the `DATABASE_URL` line (commented instructions are already in the file)
- In deploy/docker-compose.yml:
  - Remove the `postgres` service
  - Remove the `postgres_data` volume
  - Remove the `depends_on: postgres` from backend
- Redeploy: docker compose --env-file .env.production up -d
- Remove the backup cron job (RDS handles backups automatically)
Phase 3: Add SSL and a Domain¶
When: Going to production / sharing with real users.
Steps:
- Point your domain's DNS A record to the EC2 Elastic IP
- Run certbot:
sudo certbot certonly --standalone -d your-domain.com - Uncomment the SSL sections in
deploy/nginx/nginx.conf - Uncomment port 443 and letsencrypt volume in
deploy/docker-compose.yml - Restart:
docker compose --env-file .env.production restart nginx
Phase 4: Separate Frontend to Vercel/Amplify¶
When: Need CDN, edge caching, or faster global page loads.
Steps:
- Deploy ai-tutor-ui to Vercel with NEXT_PUBLIC_API_BASE_URL=https://api.your-domain.com/api/v1
- Remove the frontend service from docker-compose
- Update Nginx to only proxy /api/* to the backend (remove the frontend location block)
- Update CORS in .env.production to allow the Vercel domain
Phase 5: Containerize with ECS/Fargate¶
When: 100+ concurrent users, need auto-scaling.
Steps:
- Push Docker images to ECR (Elastic Container Registry)
- Create ECS task definitions using the existing Dockerfiles
- Set up an ALB (Application Load Balancer) to replace Nginx
- Configure auto-scaling policies based on CPU/memory
- The existing Dockerfiles and health checks work as-is with ECS
Phase 6: Add Caching with ElastiCache¶
When: Database queries becoming a bottleneck (unlikely at <100 users).
Add a Redis instance for:
- Session caching
- LLM response caching (for repeated questions)
- Rate limiting
Decision Matrix¶
| Users | Recommended Setup | Monthly Cost |
|---|---|---|
| 1–25 | Single EC2 (current) | ~$18 |
| 25–50 | EC2 + RDS | ~$35 |
| 50–100 | EC2 + RDS + Vercel | ~$35 + Vercel free tier |
| 100+ | ECS Fargate + RDS + Vercel | ~$80–150 |