Architecture

Understanding the system design, patterns, and key technical decisions.

System Overview

2Sigma Backend is a FastAPI-based learning management system with AI tutoring capabilities. It's designed for multi-tenancy, scalability, and extensibility.

Key Characteristics

  • Async-first: Fully async architecture using FastAPI and async SQLAlchemy
  • Multi-tenant: Supports multiple universities with isolation
  • Hierarchical content: Tree-structured modules and content
  • AI-powered: Integrated Claude LLM for tutoring
  • RESTful API: Standard REST endpoints with OpenAPI docs
  • Type-safe: Python type hints throughout

Architecture Diagram

graph TB
    Client[Frontend/API Client]
    API[FastAPI Application]
    Auth[Auth Layer]
    Routes[API Routes]
    CRUD[CRUD Layer]
    Models[SQLAlchemy Models]
    DB[(PostgreSQL)]
    LLM[LLM Service]
    Anthropic[Anthropic API]
    Bedrock[AWS Bedrock]

    Client --> API
    API --> Auth
    Auth --> Routes
    Routes --> CRUD
    CRUD --> Models
    Models --> DB
    Routes --> LLM
    LLM --> Anthropic
    LLM --> Bedrock

Layered Architecture

1. Presentation Layer (API)

Location: app/api/v1/

  • RESTful HTTP endpoints
  • Request validation (Pydantic)
  • Response serialization
  • Error handling
  • Authentication/authorization checks

Example:

@router.get("/me", response_model=schemas.UserResponse)
async def get_current_user(
    current_user = Depends(deps.get_current_active_user)
):
    return current_user

2. Business Logic Layer (CRUD + Services)

CRUD (Data Access): app/crud/

  • Repository pattern for database operations
  • Generic base class for common operations
  • Specialized methods per entity

Services: app/services/

  • External integrations (LLM)
  • Complex business logic
  • Cross-entity operations

Example CRUD:

class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db: AsyncSession, email: str):
        result = await db.execute(
            select(User).where(User.email == email)
        )
        return result.scalars().first()

3. Data Layer (Models)

Location: app/models/

  • SQLAlchemy ORM models
  • Database schema definitions
  • Relationships and constraints

4. Data Transfer Layer (Schemas)

Location: app/schemas/

  • Pydantic models for API contracts
  • Request validation
  • Response serialization
  • Separation from ORM models

Design Patterns

Repository Pattern

CRUD classes abstract database access:

# Base repository with generic operations
class CRUDBase(Generic[ModelType, CreateSchemaType, UpdateSchemaType]):
    async def get(self, db, id): ...
    async def create(self, db, obj_in): ...
    async def update(self, db, db_obj, obj_in): ...
    async def delete(self, db, id): ...

# Specialized repository
class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db, email): ...

Benefits:

  • Testable (easy to mock)
  • Reusable operations
  • Consistent interface

Dependency Injection

FastAPI's Depends() for loose coupling:

# Define dependencies
async def get_db():
    async with AsyncSession(engine) as session:
        yield session

async def get_current_user(
    db: AsyncSession = Depends(get_db),
    token: str = Depends(oauth2_scheme)
) -> User:
    # Validate token, return user
    ...

# Use in routes
@router.get("/me")
async def get_profile(
    current_user: User = Depends(get_current_user)
):
    return current_user

Benefits:

  • Testable (inject mocks)
  • Reusable logic
  • Clear dependencies

Strategy Pattern

Pluggable LLM providers:

class LLMService:
    def __init__(self):
        if settings.LLM_PROVIDER == "anthropic":
            self.llm = ChatAnthropic(...)
        elif settings.LLM_PROVIDER == "aws_bedrock":
            self.llm = ChatBedrock(...)
        else:
            raise ValueError(f"Unknown LLM provider: {settings.LLM_PROVIDER}")

    async def generate_response(self, messages):
        # Unified streaming interface regardless of provider
        async for chunk in self.llm.astream(messages):
            yield chunk

Prompt Composition Pattern

Langfuse-style inline prompt composition:

# Prompt with inline dependency reference
prompt = Prompt(
    name="Video Tutor",
    version=3,
    prompt="You are a tutor for {{title}}.\n@@@langfusePrompt:name=General Protocol|label=production@@@",
    labels=["production"]
)

# Dependencies are parsed from @@@langfusePrompt:...@@@ tags
# and resolved recursively (max depth 5) at runtime
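A minimal sketch of how such tag resolution could work (the regex and function names are illustrative, not the actual implementation):

```python
import re

# Matches @@@langfusePrompt:name=X|label=Y@@@ tags (illustrative pattern)
TAG_RE = re.compile(r"@@@langfusePrompt:name=([^|@]+)\|label=([^@]+)@@@")

def resolve_prompt(text: str, registry: dict, depth: int = 0, max_depth: int = 5) -> str:
    """Recursively inline referenced prompts, up to max_depth levels."""
    if depth >= max_depth:
        return text  # stop expanding; any remaining tags are left as-is

    def replace(match: re.Match) -> str:
        name, label = match.group(1), match.group(2)
        # Unknown references fall back to the literal tag text
        body = registry.get((name, label), match.group(0))
        return resolve_prompt(body, registry, depth + 1, max_depth)

    return TAG_RE.sub(replace, text)

registry = {
    ("General Protocol", "production"): "Always cite the source material.",
}
composed = resolve_prompt(
    "You are a tutor for {{title}}.\n"
    "@@@langfusePrompt:name=General Protocol|label=production@@@",
    registry,
)
```

The depth cap prevents runaway expansion when prompts reference each other cyclically.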

Multi-Tenancy Design

Tenant Isolation

  • Tenant: University (organization)
  • Isolation: Data scoped by university_id
  • RBAC: Roles per university (UserUniversityRole)
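The RBAC check could be sketched as a small pure helper (the dataclass shape and helper name are assumptions for illustration; the real UserUniversityRole model differs):

```python
from dataclasses import dataclass

# Illustrative shape of a per-university role assignment
@dataclass(frozen=True)
class UniversityRole:
    university_id: int
    role: str

def has_role(roles: list[UniversityRole], university_id: int, allowed: set[str]) -> bool:
    """Return True if the user holds one of the allowed roles in this tenant."""
    return any(r.university_id == university_id and r.role in allowed for r in roles)

roles = [
    UniversityRole(university_id=1, role="instructor"),
    UniversityRole(university_id=2, role="student"),
]
```

In the API layer this would typically live behind a `Depends()` dependency that raises a 403 when the check fails.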

Row-Level Security

Queries filter by university:

# Courses for a specific university
courses = await db.execute(
    select(Course).where(Course.university_id == university_id)
)

Shared vs Tenant Data

Shared:

  • User accounts (global)
  • Content types (platform-wide)
  • Prompts (reusable, name-based)

Tenant-specific:

  • Courses
  • Enrollments
  • Content items
  • Departments

Hierarchical Content Model

Tree Structure

Modules form a recursive tree:

Course
└── Module (UNIT: "Week 1")
    ├── Module (LESSON: "Introduction")
    │   └── ContentItem (VIDEO)
    └── Module (LESSON: "Concepts")
        ├── ContentItem (TEXT)
        └── ContentItem (QUIZ)

Design Decisions

Why hierarchical?

  • Flexible course structure
  • Nested navigation
  • Modular content reuse

Implementation:

  • parent_module_id (self-referencing FK)
  • depth field for querying by level
  • sequence_index for ordering
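The flat rows these fields produce can be assembled into a nested tree in memory; a sketch (the function name and row shape are illustrative):

```python
def build_tree(rows: list[dict]) -> list[dict]:
    """Group flat module rows into a nested tree.

    Each row has: id, parent_module_id (None for roots), sequence_index.
    """
    by_id = {row["id"]: {**row, "children": []} for row in rows}
    roots: list[dict] = []
    for node in by_id.values():
        parent_id = node["parent_module_id"]
        if parent_id is None:
            roots.append(node)
        else:
            by_id[parent_id]["children"].append(node)

    # Order siblings by sequence_index at every level
    def sort_children(nodes):
        nodes.sort(key=lambda n: n["sequence_index"])
        for n in nodes:
            sort_children(n["children"])

    sort_children(roots)
    return roots

rows = [
    {"id": 1, "parent_module_id": None, "sequence_index": 0, "title": "Week 1"},
    {"id": 2, "parent_module_id": 1, "sequence_index": 1, "title": "Concepts"},
    {"id": 3, "parent_module_id": 1, "sequence_index": 0, "title": "Introduction"},
]
tree = build_tree(rows)
```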

Content Polymorphism

ContentItem uses JSONB for flexible data:

# Video content
{
    "content_type_id": 1,  # video
    "data_json": {
        "video_url": "https://...",
        "duration": 600,
        "subtitles": [...]
    }
}

# Quiz content
{
    "content_type_id": 2,  # quiz
    "data_json": {
        "questions": [...],
        "time_limit": 3600
    }
}

AI Chat Architecture

Context-Aware Conversations

Each chat session is tied to a content item:

ChatSession:
    user_id: User who owns session
    content_item_id: Associated content
    content_context: Snapshot of content (JSONB)
    messages: Conversation history

Streaming Responses

Server-Sent Events (SSE) stream tokens to the client in real time:

from fastapi.responses import StreamingResponse

@router.post("/sessions/{session_id}/messages")
async def send_message(...):
    async def event_stream():
        async for chunk in llm_service.stream_response(...):
            yield f"data: {chunk}\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")

Prompt Management

Langfuse-style versioned prompts with label-based resolution:

  • Prompts identified by name + version, resolved by label (default: production)
  • Inline composition via @@@langfusePrompt:name=X|label=Y@@@ tags
  • {{variable}} placeholders replaced at runtime with content context
  • Content items reference prompts by prompt_name + prompt_label
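Placeholder substitution can be sketched with a regex (a minimal illustration, not the actual implementation):

```python
import re

def fill_placeholders(template: str, context: dict[str, str]) -> str:
    """Replace {{variable}} placeholders; unknown variables are left intact."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: context.get(m.group(1), m.group(0)),
        template,
    )

rendered = fill_placeholders("You are a tutor for {{title}}.", {"title": "Linear Algebra"})
```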

Session Sharing

Public sharing via token:

# Enable sharing
session.share_enabled = True
session.share_token = generate_unique_token()

# Public access (no auth)
GET /chat/share/{share_token}

Authentication & Security

JWT Token Strategy

Access Token:

  • Short-lived (30 min default)
  • Used for API requests
  • Contains user_id

Refresh Token:

  • Long-lived (7 days default)
  • Used to get new access tokens
  • Separate expiration

Password Security

  • Hashing: bcrypt with automatic salt
  • Normalization: Handles bcrypt 72-byte limit
  • Validation: On every login
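The normalization step can be sketched as truncating to 72 bytes without splitting a UTF-8 character (helper name is illustrative; the hashing itself would use the bcrypt library):

```python
BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond the first 72 bytes

def normalize_password(password: str) -> bytes:
    """Truncate to bcrypt's 72-byte limit at a valid UTF-8 boundary."""
    raw = password.encode("utf-8")[:BCRYPT_MAX_BYTES]
    # Re-decoding with errors="ignore" drops any partial multi-byte
    # sequence cut at the boundary, then re-encodes clean bytes
    return raw.decode("utf-8", errors="ignore").encode("utf-8")
```

The normalized bytes would then be passed to `bcrypt.hashpw(normalize_password(pw), bcrypt.gensalt())`.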

Token Flow

1. User registers/logs in
   → Returns access + refresh tokens

2. Client stores tokens
   → Access token in memory
   → Refresh token in httpOnly cookie

3. API requests use access token
   → Bearer token in Authorization header

4. When access token expires
   → Use refresh token to get new pair
   → Repeat cycle
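The token pair from step 1 can be illustrated with a stdlib-only HMAC-SHA256 JWT (a sketch: the real app would use a JWT library such as PyJWT or python-jose, and the secret would come from settings; lifetimes match the defaults above):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # assumption: loaded from settings in the real app

def _b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(user_id: int, ttl_seconds: int, token_type: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({
        "sub": str(user_id),
        "type": token_type,
        "exp": int(time.time()) + ttl_seconds,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

# The pair returned on login
access_token = make_token(user_id=42, ttl_seconds=30 * 60, token_type="access")
refresh_token = make_token(user_id=42, ttl_seconds=7 * 24 * 3600, token_type="refresh")
```

Tagging each token with a `type` claim lets the refresh endpoint reject access tokens presented as refresh tokens.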

Performance Considerations

Async All the Way

# Async database
async with AsyncSession(engine) as session:
    result = await session.execute(query)

# Async routes
@router.get("/users")
async def list_users(...):
    return await crud.user.get_multi(db)

# Async LLM calls
async for chunk in llm.astream(messages):
    yield chunk

Database Optimizations

  • Indexes: All FKs, unique constraints, composite keys
  • GIN indexes: On JSONB columns and ARRAY columns (prompts labels/tags) for fast queries
  • Eager loading: Relationships loaded in single query
  • Connection pooling: SQLAlchemy manages pool

Caching Strategy (Future)

Recommended caching targets:

  • Course catalog (Redis)
  • Prompts (in-memory or Redis)
  • User sessions (Redis)

Error Handling

Centralized Exception Handling

FastAPI exception handlers:

@app.exception_handler(HTTPException)
async def http_exception_handler(request, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail}
    )

Validation Errors

Pydantic provides detailed errors:

{
  "detail": [
    {
      "loc": ["body", "email"],
      "msg": "value is not a valid email address",
      "type": "value_error.email"
    }
  ]
}

Observability

LLM Tracing (Langfuse)

Optional integration tracks:

  • LLM calls and responses
  • Token usage
  • Latency
  • Cost estimation
  • Prompt versions

Logging

Structured logging (future enhancement):

# structlog-style API; the stdlib logger does not accept arbitrary kwargs
logger.info(
    "User enrolled",
    user_id=user.id,
    course_id=course.id,
    timestamp=datetime.now()
)

Deployment Architecture

Development

[Developer Machine]
  ├── FastAPI (uvicorn --reload)
  └── PostgreSQL (local)

Production

[Load Balancer]
[Multiple FastAPI Workers] (uvicorn workers)
[PostgreSQL Primary]
  └── [Read Replicas]

[Redis] (caching, session store)

[AWS Bedrock / Anthropic API] (LLM)

[Langfuse] (observability)

Technology Choices

Why FastAPI?

  • Async support (high concurrency)
  • Automatic OpenAPI docs
  • Type hints & validation
  • High performance
  • Modern Python

Why PostgreSQL?

  • Robust relational database
  • JSONB for flexible schema
  • Full-text search
  • Mature ecosystem
  • ACID guarantees

Why SQLAlchemy?

  • ORM abstraction
  • Async support (2.0+)
  • Migration tools (Alembic)
  • Type safety
  • Flexible queries

Why Async?

  • Handle many concurrent connections
  • Non-blocking I/O (DB, LLM, etc.)
  • Better resource utilization
  • Required for SSE streaming

Trade-offs & Decisions

Stateless JWT vs Sessions

Chose: Stateless JWT

Pros:

  • Scalable (no session store)
  • Simple deployment

Cons:

  • Can't revoke tokens (except via expiration)
  • Larger payload

Mitigation: short-lived access tokens.

JSONB vs Separate Tables

Chose: JSONB for ContentItem.data_json

Pros:

  • Flexible schema per content type
  • No migrations for new fields
  • Easy to add content types

Cons:

  • Less type safety
  • Harder to query

Use case: polymorphic content with varying fields.

Monolith vs Microservices

Chose: Monolithic FastAPI app

Reasoning:

  • Simpler development
  • Easier transactions
  • Single deployment
  • Can split into services later if needed

Future Enhancements

  1. Caching layer (Redis)
  2. Full-text search (PostgreSQL FTS or Elasticsearch)
  3. WebSocket support (real-time collaboration)
  4. File upload service (S3)
  5. Background jobs (Celery)
  6. GraphQL API (alongside REST)
  7. Rate limiting
  8. Audit logging

Next Steps