Architecture

Understanding the system design, patterns, and key technical decisions.

System Overview

2Sigma Backend is a FastAPI-based learning management system with AI tutoring capabilities. It's designed for multi-tenancy, scalability, and extensibility.

Key Characteristics

  • Async-first: Fully async architecture using FastAPI and async SQLAlchemy
  • Multi-tenant: Supports multiple universities with isolation
  • Hierarchical content: Tree-structured modules and content
  • AI-powered: Integrated Claude LLM for tutoring
  • RESTful API: Standard REST endpoints with OpenAPI docs
  • Type-safe: Python type hints throughout

Architecture Diagram

graph TB
    Client[Frontend/API Client]
    API[FastAPI Application]
    Auth[Auth Layer]
    Routes[API Routes]
    CRUD[CRUD Layer]
    Models[SQLAlchemy Models]
    DB[(PostgreSQL)]
    LLM[LLM Service]
    Anthropic[Anthropic API]
    Bedrock[AWS Bedrock]

    Client --> API
    API --> Auth
    Auth --> Routes
    Routes --> CRUD
    CRUD --> Models
    Models --> DB
    Routes --> LLM
    LLM --> Anthropic
    LLM --> Bedrock

Layered Architecture

1. Presentation Layer (API)

Location: app/api/v1/

  • RESTful HTTP endpoints
  • Request validation (Pydantic)
  • Response serialization
  • Error handling
  • Authentication/authorization checks

Example:

@router.get("/me", response_model=schemas.UserResponse)
async def get_current_user(
    current_user = Depends(deps.get_current_active_user)
):
    return current_user

2. Business Logic Layer (CRUD + Services)

CRUD (Data Access): app/crud/

  • Repository pattern for database operations
  • Generic base class for common operations
  • Specialized methods per entity

Services: app/services/

  • External integrations (LLM)
  • Complex business logic
  • Cross-entity operations

Example CRUD:

class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db: AsyncSession, email: str):
        result = await db.execute(
            select(User).where(User.email == email)
        )
        return result.scalars().first()

3. Data Layer (Models)

Location: app/models/

  • SQLAlchemy ORM models
  • Database schema definitions
  • Relationships and constraints

4. Data Transfer Layer (Schemas)

Location: app/schemas/

  • Pydantic models for API contracts
  • Request validation
  • Response serialization
  • Separation from ORM models

Design Patterns

Repository Pattern

CRUD classes abstract database access:

# Base repository with generic operations
class CRUDBase(Generic[ModelType, CreateSchemaType, UpdateSchemaType]):
    async def get(self, db, id): ...
    async def create(self, db, obj_in): ...
    async def update(self, db, db_obj, obj_in): ...
    async def delete(self, db, id): ...

# Specialized repository
class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db, email): ...

Benefits:

  • Testable (easy to mock)
  • Reusable operations
  • Consistent interface

Dependency Injection

FastAPI's Depends() for loose coupling:

# Define dependencies
async def get_db():
    async with AsyncSession(engine) as session:
        yield session

async def get_current_user(
    db: AsyncSession = Depends(get_db),
    token: str = Depends(oauth2_scheme)
) -> User:
    # Validate token, return user
    ...

# Use in routes
@router.get("/me")
async def get_profile(
    current_user: User = Depends(get_current_user)
):
    return current_user

Benefits:

  • Testable (inject mocks)
  • Reusable logic
  • Clear dependencies

Strategy Pattern

Pluggable LLM providers:

class LLMService:
    def __init__(self):
        if settings.LLM_PROVIDER == "anthropic":
            self.llm = ChatAnthropic(...)
        elif settings.LLM_PROVIDER == "aws_bedrock":
            self.llm = ChatBedrock(...)
        else:
            raise ValueError(f"Unknown LLM provider: {settings.LLM_PROVIDER}")

    async def generate_response(self, messages):
        # Unified streaming interface regardless of provider
        async for chunk in self.llm.astream(messages):
            yield chunk

Prompt Composition Pattern

Langfuse-style inline prompt composition:

# Prompt with inline dependency reference
prompt = Prompt(
    name="Video Tutor",
    version=3,
    prompt="You are a tutor for {{title}}.\n@@@langfusePrompt:name=General Protocol|label=production@@@",
    labels=["production"]
)

# Dependencies are parsed from @@@langfusePrompt:...@@@ tags
# and resolved recursively (max depth 5) at runtime
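A minimal sketch of how such tag resolution could work (the regex and function names are illustrative, not the actual implementation):

```python
import re

# Matches @@@langfusePrompt:name=X|label=Y@@@ tags (illustrative pattern)
TAG_RE = re.compile(r"@@@langfusePrompt:name=([^|@]+)\|label=([^@]+)@@@")

def resolve_prompt(text: str, registry: dict, depth: int = 0, max_depth: int = 5) -> str:
    """Recursively inline referenced prompts, up to max_depth levels."""
    if depth >= max_depth:
        return text  # stop expanding; any remaining tags are left as-is

    def replace(match: re.Match) -> str:
        name, label = match.group(1), match.group(2)
        # Unknown references fall back to the literal tag text
        body = registry.get((name, label), match.group(0))
        return resolve_prompt(body, registry, depth + 1, max_depth)

    return TAG_RE.sub(replace, text)

registry = {
    ("General Protocol", "production"): "Always cite the source material.",
}
composed = resolve_prompt(
    "You are a tutor for {{title}}.\n"
    "@@@langfusePrompt:name=General Protocol|label=production@@@",
    registry,
)
```

The depth cap prevents runaway expansion when prompts reference each other cyclically.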

Multi-Tenancy Design

Tenant Isolation

  • Tenant: University (organization)
  • Isolation: Data scoped by university_id
  • RBAC: Roles per university (UserUniversityRole)
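The RBAC check could be sketched as a small pure helper (the dataclass shape and helper name are assumptions for illustration; the real UserUniversityRole model differs):

```python
from dataclasses import dataclass

# Illustrative shape of a per-university role assignment
@dataclass(frozen=True)
class UniversityRole:
    university_id: int
    role: str

def has_role(roles: list[UniversityRole], university_id: int, allowed: set[str]) -> bool:
    """Return True if the user holds one of the allowed roles in this tenant."""
    return any(r.university_id == university_id and r.role in allowed for r in roles)

roles = [
    UniversityRole(university_id=1, role="instructor"),
    UniversityRole(university_id=2, role="student"),
]
```

In the API layer this would typically live behind a `Depends()` dependency that raises a 403 when the check fails.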

Row-Level Security

Queries filter by university:

# Courses for a specific university
courses = await db.execute(
    select(Course).where(Course.university_id == university_id)
)

Shared vs Tenant Data

Shared:

  • User accounts (global)
  • Content types (platform-wide)
  • Prompts (reusable, name-based)

Tenant-specific:

  • Courses
  • Enrollments
  • Content items
  • Departments

Hierarchical Content Model

Tree Structure

Modules form a recursive tree:

Course
└── Module (UNIT: "Week 1")
    ├── Module (LESSON: "Introduction")
    │   └── ContentItem (VIDEO)
    └── Module (LESSON: "Concepts")
        ├── ContentItem (TEXT)
        └── ContentItem (QUIZ)

Design Decisions

Why hierarchical?

  • Flexible course structure
  • Nested navigation
  • Modular content reuse

Implementation:

  • parent_module_id (self-referencing FK)
  • depth field for querying by level
  • sequence_index for ordering
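The flat rows these fields produce can be assembled into a nested tree in memory; a sketch (the function name and row shape are illustrative):

```python
def build_tree(rows: list[dict]) -> list[dict]:
    """Group flat module rows into a nested tree.

    Each row has: id, parent_module_id (None for roots), sequence_index.
    """
    by_id = {row["id"]: {**row, "children": []} for row in rows}
    roots: list[dict] = []
    for node in by_id.values():
        parent_id = node["parent_module_id"]
        if parent_id is None:
            roots.append(node)
        else:
            by_id[parent_id]["children"].append(node)

    # Order siblings by sequence_index at every level
    def sort_children(nodes):
        nodes.sort(key=lambda n: n["sequence_index"])
        for n in nodes:
            sort_children(n["children"])

    sort_children(roots)
    return roots

rows = [
    {"id": 1, "parent_module_id": None, "sequence_index": 0, "title": "Week 1"},
    {"id": 2, "parent_module_id": 1, "sequence_index": 1, "title": "Concepts"},
    {"id": 3, "parent_module_id": 1, "sequence_index": 0, "title": "Introduction"},
]
tree = build_tree(rows)
```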

Content Polymorphism

ContentItem uses JSONB for flexible data:

# Video content
{
    "content_type_id": 1,  # video
    "data_json": {
        "video_url": "https://...",
        "duration": 600,
        "subtitles": [...]
    }
}

# Quiz content
{
    "content_type_id": 2,  # quiz
    "data_json": {
        "questions": [...],
        "time_limit": 3600
    }
}

AI Chat Architecture

Context-Aware Conversations

Each chat session is tied to a content item:

ChatSession:
    user_id: User who owns session
    content_item_id: Associated content
    content_context: Snapshot of content (JSONB)
    messages: Conversation history

Streaming Responses

Server-Sent Events (SSE) stream tokens to the client in real time:

from fastapi.responses import StreamingResponse

@router.post("/sessions/{session_id}/messages")
async def send_message(...):
    async def event_stream():
        async for chunk in llm_service.stream_response(...):
            yield f"data: {chunk}\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")

Prompt Management

Langfuse-style versioned prompts with label-based resolution:

  • Prompts identified by name + version, resolved by label (default: production)
  • Inline composition via @@@langfusePrompt:name=X|label=Y@@@ tags
  • {{variable}} placeholders replaced at runtime with content context
  • Content items reference prompts by prompt_name + prompt_label
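Placeholder substitution can be sketched with a regex (a minimal illustration, not the actual implementation):

```python
import re

def fill_placeholders(template: str, context: dict[str, str]) -> str:
    """Replace {{variable}} placeholders; unknown variables are left intact."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: context.get(m.group(1), m.group(0)),
        template,
    )

rendered = fill_placeholders("You are a tutor for {{title}}.", {"title": "Linear Algebra"})
```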

Session Sharing

Public sharing via token:

# Enable sharing
session.share_enabled = True
session.share_token = generate_unique_token()

# Public access (no auth)
GET /chat/share/{share_token}

Authentication & Security

JWT Token Strategy

Access Token:

  • Short-lived (30 min default)
  • Used for API requests
  • Contains user_id

Refresh Token:

  • Long-lived (7 days default)
  • Used to get new access tokens
  • Separate expiration

Password Security

  • Hashing: bcrypt with automatic salt
  • Normalization: Handles bcrypt 72-byte limit
  • Validation: On every login
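The normalization step can be sketched as truncating to 72 bytes without splitting a UTF-8 character (helper name is illustrative; the hashing itself would use the bcrypt library):

```python
BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond the first 72 bytes

def normalize_password(password: str) -> bytes:
    """Truncate to bcrypt's 72-byte limit at a valid UTF-8 boundary."""
    raw = password.encode("utf-8")[:BCRYPT_MAX_BYTES]
    # Re-decoding with errors="ignore" drops any partial multi-byte
    # sequence cut at the boundary, then re-encodes clean bytes
    return raw.decode("utf-8", errors="ignore").encode("utf-8")
```

The normalized bytes would then be passed to `bcrypt.hashpw(normalize_password(pw), bcrypt.gensalt())`.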

Token Flow

1. User registers/logs in
   → Returns access + refresh tokens

2. Client stores tokens
   → Access token in memory
   → Refresh token in httpOnly cookie

3. API requests use access token
   → Bearer token in Authorization header

4. When access token expires
   → Use refresh token to get new pair
   → Repeat cycle
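The token pair from step 1 can be illustrated with a stdlib-only HMAC-SHA256 JWT (a sketch: the real app would use a JWT library such as PyJWT or python-jose, and the secret would come from settings; lifetimes match the defaults above):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # assumption: loaded from settings in the real app

def _b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(user_id: int, ttl_seconds: int, token_type: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({
        "sub": str(user_id),
        "type": token_type,
        "exp": int(time.time()) + ttl_seconds,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

# The pair returned on login
access_token = make_token(user_id=42, ttl_seconds=30 * 60, token_type="access")
refresh_token = make_token(user_id=42, ttl_seconds=7 * 24 * 3600, token_type="refresh")
```

Tagging each token with a `type` claim lets the refresh endpoint reject access tokens presented as refresh tokens.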

Performance Considerations

Async All the Way

# Async database
async with AsyncSession(engine) as session:
    result = await session.execute(query)

# Async routes
@router.get("/users")
async def list_users(...):
    return await crud.user.get_multi(db)

# Async LLM calls
async for chunk in llm.astream(messages):
    yield chunk

Database Optimizations

  • Indexes: All FKs, unique constraints, composite keys
  • GIN indexes: On JSONB columns and ARRAY columns (prompts labels/tags) for fast queries
  • Eager loading: Relationships loaded in single query
  • Connection pooling: SQLAlchemy manages pool

Caching Strategy (Future)

Recommended caching targets:

  • Course catalog (Redis)
  • Prompts (in-memory or Redis)
  • User sessions (Redis)

Error Handling

Centralized Exception Handling

FastAPI exception handlers:

@app.exception_handler(HTTPException)
async def http_exception_handler(request, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail}
    )

Validation Errors

Pydantic provides detailed errors:

{
  "detail": [
    {
      "loc": ["body", "email"],
      "msg": "value is not a valid email address",
      "type": "value_error.email"
    }
  ]
}

Observability

LLM Tracing (Langfuse)

Optional integration tracks:

  • LLM calls and responses
  • Token usage
  • Latency
  • Cost estimation
  • Prompt versions

Logging

Structured logging (future enhancement):

# structlog-style API; the stdlib logger does not accept arbitrary kwargs
logger.info(
    "User enrolled",
    user_id=user.id,
    course_id=course.id,
    timestamp=datetime.now()
)

Deployment Architecture

Development

[Developer Machine]
  ├── FastAPI (uvicorn --reload)
  └── PostgreSQL (local)

Production

[Load Balancer]
[Multiple FastAPI Workers] (uvicorn workers)
[PostgreSQL Primary]
  └── [Read Replicas]

[Redis] (caching, session store)

[AWS Bedrock / Anthropic API] (LLM)

[Langfuse] (observability)

Technology Choices

Why FastAPI?

  • Async support (high concurrency)
  • Automatic OpenAPI docs
  • Type hints & validation
  • High performance
  • Modern Python

Why PostgreSQL?

  • Robust relational database
  • JSONB for flexible schema
  • Full-text search
  • Mature ecosystem
  • ACID guarantees

Why SQLAlchemy?

  • ORM abstraction
  • Async support (2.0+)
  • Migration tools (Alembic)
  • Type safety
  • Flexible queries

Why Async?

  • Handle many concurrent connections
  • Non-blocking I/O (DB, LLM, etc.)
  • Better resource utilization
  • Required for SSE streaming

Trade-offs & Decisions

Stateless JWT vs Sessions

Chose: Stateless JWT

Pros:

  • Scalable (no session store)
  • Simple deployment

Cons:

  • Can't revoke tokens (except via expiration)
  • Larger payload

Mitigation: short-lived access tokens.

JSONB vs Separate Tables

Chose: JSONB for ContentItem.data_json

Pros:

  • Flexible schema per content type
  • No migrations for new fields
  • Easy to add content types

Cons:

  • Less type safety
  • Harder to query

Use case: polymorphic content with varying fields.

Monolith vs Microservices

Chose: Monolithic FastAPI app

Reasoning:

  • Simpler development
  • Easier transactions
  • Single deployment
  • Can split into services later if needed

Future Enhancements

  1. Caching layer (Redis)
  2. Full-text search (PostgreSQL FTS or Elasticsearch)
  3. WebSocket support (real-time collaboration)
  4. File upload service (S3)
  5. Background jobs (Celery)
  6. GraphQL API (alongside REST)
  7. Rate limiting
  8. Audit logging

Next Steps