Architecture¶
Understanding the system design, patterns, and key technical decisions.
System Overview¶
2Sigma Backend is a FastAPI-based learning management system with AI tutoring capabilities. It's designed for multi-tenancy, scalability, and extensibility.
Key Characteristics¶
- Async-first: Fully async architecture using FastAPI and async SQLAlchemy
- Multi-tenant: Supports multiple universities with isolation
- Hierarchical content: Tree-structured modules and content
- AI-powered: Integrated Claude LLM for tutoring
- RESTful API: Standard REST endpoints with OpenAPI docs
- Type-safe: Python type hints throughout
Architecture Diagram¶
```mermaid
graph TB
    Client[Frontend/API Client]
    API[FastAPI Application]
    Auth[Auth Layer]
    Routes[API Routes]
    CRUD[CRUD Layer]
    Models[SQLAlchemy Models]
    DB[(PostgreSQL)]
    LLM[LLM Service]
    Anthropic[Anthropic API]
    Bedrock[AWS Bedrock]

    Client --> API
    API --> Auth
    Auth --> Routes
    Routes --> CRUD
    CRUD --> Models
    Models --> DB
    Routes --> LLM
    LLM --> Anthropic
    LLM --> Bedrock
```
Layered Architecture¶
1. Presentation Layer (API)¶
Location: app/api/v1/
- RESTful HTTP endpoints
- Request validation (Pydantic)
- Response serialization
- Error handling
- Authentication/authorization checks
Example:
```python
@router.get("/me", response_model=schemas.UserResponse)
async def get_current_user(
    current_user = Depends(deps.get_current_active_user)
):
    return current_user
```
2. Business Logic Layer (CRUD + Services)¶
CRUD (Data Access): app/crud/
- Repository pattern for database operations
- Generic base class for common operations
- Specialized methods per entity
Services: app/services/
- External integrations (LLM)
- Complex business logic
- Cross-entity operations
Example CRUD:
```python
class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db: AsyncSession, email: str):
        result = await db.execute(
            select(User).where(User.email == email)
        )
        return result.scalars().first()
```
3. Data Layer (Models)¶
Location: app/models/
- SQLAlchemy ORM models
- Database schema definitions
- Relationships and constraints
4. Data Transfer Layer (Schemas)¶
Location: app/schemas/
- Pydantic models for API contracts
- Request validation
- Response serialization
- Separation from ORM models
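To make that separation concrete, here is a minimal sketch using stdlib dataclasses as stand-ins for the real SQLAlchemy model and Pydantic schema; the `User`/`UserResponse` fields shown are illustrative, not the project's actual columns:

```python
from dataclasses import dataclass

# Stand-in for the SQLAlchemy ORM model (app/models/)
@dataclass
class User:
    id: int
    email: str
    hashed_password: str  # internal field; must never reach clients

# Stand-in for the Pydantic response schema (app/schemas/)
@dataclass
class UserResponse:
    id: int
    email: str

    @classmethod
    def from_orm(cls, user: User) -> "UserResponse":
        # Copy only the fields that belong in the API contract
        return cls(id=user.id, email=user.email)

user = User(id=1, email="ada@example.com", hashed_password="$2b$12$...")
response = UserResponse.from_orm(user)
```

The point of the pattern: even if the ORM model gains sensitive or internal columns, the response schema defines exactly what the API exposes.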
Design Patterns¶
Repository Pattern¶
CRUD classes abstract database access:
```python
# Base repository with generic operations
class CRUDBase(Generic[ModelType, CreateSchemaType, UpdateSchemaType]):
    async def get(self, db, id): ...
    async def create(self, db, obj_in): ...
    async def update(self, db, db_obj, obj_in): ...
    async def delete(self, db, id): ...

# Specialized repository
class CRUDUser(CRUDBase[User, UserCreate, UserUpdate]):
    async def get_by_email(self, db, email): ...
```
Benefits:

- Testable (easy to mock)
- Reusable operations
- Consistent interface
Dependency Injection¶
FastAPI's Depends() for loose coupling:
```python
# Define dependencies
async def get_db():
    async with AsyncSession(engine) as session:
        yield session

async def get_current_user(
    db: AsyncSession = Depends(get_db),
    token: str = Depends(oauth2_scheme)
) -> User:
    # Validate token, return user
    ...

# Use in routes
@router.get("/me")
async def get_profile(
    current_user: User = Depends(get_current_user)
):
    return current_user
```
Benefits:

- Testable (inject mocks)
- Reusable logic
- Clear dependencies
Strategy Pattern¶
Pluggable LLM providers:
```python
class LLMService:
    def __init__(self):
        if settings.LLM_PROVIDER == "anthropic":
            self.llm = ChatAnthropic(...)
        elif settings.LLM_PROVIDER == "aws_bedrock":
            self.llm = ChatBedrock(...)

    def generate_response(self, messages):
        # Unified interface regardless of provider;
        # astream() returns an async iterator, so it is not awaited here
        return self.llm.astream(messages)
```
Prompt Composition Pattern¶
Langfuse-style inline prompt composition:
```python
# Prompt with inline dependency reference
prompt = Prompt(
    name="Video Tutor",
    version=3,
    prompt="You are a tutor for {{title}}.\n@@@langfusePrompt:name=General Protocol|label=production@@@",
    labels=["production"]
)

# Dependencies are parsed from @@@langfusePrompt:...@@@ tags
# and resolved recursively (max depth 5) at runtime
```
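The recursive tag resolution can be sketched as follows; the in-memory `PROMPTS` store, the `resolve` helper, and its exact regex are hypothetical stand-ins for the real name + label lookup:

```python
import re

# Hypothetical in-memory prompt store, keyed by (name, label)
PROMPTS = {
    ("General Protocol", "production"): "Always be encouraging.",
    ("Video Tutor", "production"):
        "You are a tutor for {{title}}.\n"
        "@@@langfusePrompt:name=General Protocol|label=production@@@",
}

TAG = re.compile(r"@@@langfusePrompt:name=([^|@]+)\|label=([^@]+)@@@")

def resolve(text: str, depth: int = 0, max_depth: int = 5) -> str:
    """Recursively replace dependency tags with the referenced prompt text."""
    if depth >= max_depth:
        return text  # stop runaway recursion at the documented depth limit

    def substitute(match: re.Match) -> str:
        body = PROMPTS[(match.group(1), match.group(2))]
        return resolve(body, depth + 1, max_depth)

    return TAG.sub(substitute, text)

resolved = resolve(PROMPTS[("Video Tutor", "production")])
```

Note that `{{variable}}` placeholders survive resolution; they are filled in later with content context.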
Multi-Tenancy Design¶
Tenant Isolation¶
- Tenant: University (organization)
- Isolation: Data scoped by `university_id`
- RBAC: Roles per university (`UserUniversityRole`)
Row-Level Security¶
Queries filter by university:
```python
# Courses for a specific university
courses = await db.execute(
    select(Course).where(Course.university_id == university_id)
)
```
Shared vs Tenant Data¶
Shared:

- User accounts (global)
- Content types (platform-wide)
- Prompts (reusable, name-based)

Tenant-specific:

- Courses
- Enrollments
- Content items
- Departments
Hierarchical Content Model¶
Tree Structure¶
Modules form a recursive tree:
```
Course
└── Module (UNIT: "Week 1")
    ├── Module (LESSON: "Introduction")
    │   └── ContentItem (VIDEO)
    └── Module (LESSON: "Concepts")
        ├── ContentItem (TEXT)
        └── ContentItem (QUIZ)
```
Design Decisions¶
Why hierarchical?

- Flexible course structure
- Nested navigation
- Modular content reuse
Implementation:
- `parent_module_id` (self-referencing FK)
- `depth` field for querying by level
- `sequence_index` for ordering
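A sketch of how these three fields support tree assembly from a flat query result; the `rows` tuples and the `build_tree` helper are illustrative, not the actual ORM shapes:

```python
from collections import defaultdict

# Hypothetical flat rows: (id, parent_module_id, depth, sequence_index, title)
rows = [
    (1, None, 0, 0, "Week 1"),
    (2, 1,    1, 0, "Introduction"),
    (3, 1,    1, 1, "Concepts"),
]

def build_tree(rows):
    """Group modules by parent and order siblings by sequence_index."""
    children = defaultdict(list)
    for id_, parent, depth, seq, title in rows:
        children[parent].append((seq, id_, title))

    def subtree(parent):
        return [
            {"id": id_, "title": title, "children": subtree(id_)}
            for seq, id_, title in sorted(children[parent])
        ]

    return subtree(None)  # roots have parent_module_id = NULL

tree = build_tree(rows)
```

The `depth` column is not needed for assembly itself, but lets queries fetch a single level (e.g. all units) without walking the tree.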
Content Polymorphism¶
ContentItem uses JSONB for flexible data:
```python
# Video content
{
    "content_type_id": 1,  # video
    "data_json": {
        "video_url": "https://...",
        "duration": 600,
        "subtitles": [...]
    }
}

# Quiz content
{
    "content_type_id": 2,  # quiz
    "data_json": {
        "questions": [...],
        "time_limit": 3600
    }
}
```
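One way application code typically handles this polymorphism is per-type validation dispatched on `content_type_id`; the validators below are hypothetical sketches, not the project's actual checks:

```python
# Hypothetical per-type validators for data_json payloads
def validate_video(data: dict) -> bool:
    return {"video_url", "duration"} <= data.keys()

def validate_quiz(data: dict) -> bool:
    return {"questions"} <= data.keys()

# content_type_id -> validator (ids mirror the examples above)
VALIDATORS = {1: validate_video, 2: validate_quiz}

def validate_content_item(item: dict) -> bool:
    """Dispatch to the right validator based on the content type."""
    check = VALIDATORS.get(item["content_type_id"])
    return check is not None and check(item["data_json"])

video_item = {
    "content_type_id": 1,
    "data_json": {"video_url": "https://...", "duration": 600, "subtitles": []},
}
```

This keeps the schema flexibility of JSONB while recovering some of the type safety lost relative to separate tables.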
AI Chat Architecture¶
Context-Aware Conversations¶
Each chat session is tied to a content item:
```
ChatSession:
    user_id: User who owns session
    content_item_id: Associated content
    content_context: Snapshot of content (JSONB)
    messages: Conversation history
```
Streaming Responses¶
Uses Server-Sent Events (SSE) for real-time streaming:

```python
@router.post("/sessions/{session_id}/messages")
async def send_message(...):
    async def event_stream():
        async for chunk in llm_service.stream_response(...):
            yield f"data: {chunk}\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```
Prompt Management¶
Langfuse-style versioned prompts with label-based resolution:
- Prompts identified by name + version, resolved by label (default: `production`)
- Inline composition via `@@@langfusePrompt:name=X|label=Y@@@` tags
- `{{variable}}` placeholders replaced at runtime with content context
- Content items reference prompts by `prompt_name` + `prompt_label`
Session Sharing¶
Public sharing via token:
```python
# Enable sharing
session.share_enabled = True
session.share_token = generate_unique_token()

# Public access (no auth):
#   GET /chat/share/{share_token}
```
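A hedged sketch of what `generate_unique_token()` might look like using the stdlib `secrets` module; the actual token length and format may differ:

```python
import secrets

def generate_unique_token(nbytes: int = 32) -> str:
    """Return an unguessable, URL-safe share token.

    secrets.token_urlsafe draws from the OS CSPRNG, so tokens are
    suitable for capability-style public links. nbytes is an assumption.
    """
    return secrets.token_urlsafe(nbytes)

token = generate_unique_token()
```

Because the token itself is the only credential for the shared view, it must be generated with a cryptographic source rather than `random`.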
Authentication & Security¶
JWT Token Strategy¶
Access Token:

- Short-lived (30 min default)
- Used for API requests
- Contains `user_id`

Refresh Token:

- Long-lived (7 days default)
- Used to get new access tokens
- Separate expiration
Password Security¶
- Hashing: bcrypt with automatic salt
- Normalization: Handles bcrypt 72-byte limit
- Validation: On every login
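The 72-byte normalization can be sketched as an explicit truncation before hashing; this is an assumption about the implementation, and the result would be passed to bcrypt rather than stored directly:

```python
def normalize_password(password: str) -> bytes:
    """Truncate to bcrypt's 72-byte input limit.

    bcrypt silently ignores bytes past 72; truncating explicitly makes
    the behavior deterministic and avoids errors in libraries that
    reject longer inputs. (Sketch only: real code feeds this to bcrypt.)
    """
    return password.encode("utf-8")[:72]
```

Note the limit is in bytes, not characters, so multi-byte UTF-8 passwords hit it sooner than their character count suggests.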
Token Flow¶
```
1. User registers/logs in
   → Returns access + refresh tokens

2. Client stores tokens
   → Access token in memory
   → Refresh token in httpOnly cookie

3. API requests use access token
   → Bearer token in Authorization header

4. When access token expires
   → Use refresh token to get new pair
   → Repeat cycle
```
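To make the token mechanics concrete, here is a stdlib-only sketch of HS256 JWT encode/decode; real code would use a library such as PyJWT, and the `SECRET` value and claim names are illustrative:

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-secret"  # illustration only; the real key comes from settings

def b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_jwt(payload: dict) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def decode_jwt(token: str) -> dict:
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = body + "=" * (-len(body) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))

# Access token carrying user_id, 30-minute expiry as in the docs above
access = encode_jwt({"sub": "user-123", "exp": int(time.time()) + 1800})
```

The decode step verifies the signature with a constant-time comparison; a production library would additionally check `exp` and other registered claims.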
Performance Considerations¶
Async All the Way¶
```python
# Async database
async with AsyncSession(engine) as session:
    result = await session.execute(query)

# Async routes
@router.get("/users")
async def list_users(...):
    return await crud.user.get_multi(db)

# Async LLM calls
async for chunk in llm.astream(messages):
    yield chunk
```
Database Optimizations¶
- Indexes: All FKs, unique constraints, composite keys
- GIN indexes: On JSONB columns and ARRAY columns (prompts labels/tags) for fast queries
- Eager loading: Relationships loaded in single query
- Connection pooling: SQLAlchemy manages pool
Caching Strategy (Future)¶
Recommended caching:

- Course catalog (Redis)
- Prompts (in-memory or Redis)
- User sessions (Redis)
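As a sketch of the idea, a minimal in-memory TTL cache could stand in for Redis during development; the class name and TTL value below are hypothetical:

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry expiry (Redis stand-in)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict = {}

    def set(self, key, value):
        # Record the value together with its absolute expiry time
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires = entry
        if time.monotonic() >= expires:
            del self._store[key]  # lazily evict expired entries
            return default
        return value

cache = TTLCache(ttl_seconds=60)
cache.set("course_catalog", ["Math 101"])
```

Swapping this for Redis later keeps the same get/set contract while adding cross-worker sharing and persistence.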
Error Handling¶
Centralized Exception Handling¶
FastAPI exception handlers:
```python
@app.exception_handler(HTTPException)
async def http_exception_handler(request, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail}
    )
```
Validation Errors¶
Pydantic provides detailed errors:
```json
{
    "detail": [
        {
            "loc": ["body", "email"],
            "msg": "value is not a valid email address",
            "type": "value_error.email"
        }
    ]
}
```
Observability¶
LLM Tracing (Langfuse)¶
Optional integration tracks:

- LLM calls and responses
- Token usage
- Latency
- Cost estimation
- Prompt versions
Logging¶
Structured logging is planned as a future enhancement.
Deployment Architecture¶
Development¶
Production (Recommended)¶
```
[Load Balancer]
      ↓
[Multiple FastAPI Workers] (uvicorn workers)
      ↓
[PostgreSQL Primary]
  └── [Read Replicas]

[Redis] (caching, session store)
[AWS Bedrock / Anthropic API] (LLM)
[Langfuse] (observability)
```
Technology Choices¶
Why FastAPI?¶
- Async support (high concurrency)
- Automatic OpenAPI docs
- Type hints & validation
- High performance
- Modern Python
Why PostgreSQL?¶
- Robust relational database
- JSONB for flexible schema
- Full-text search
- Mature ecosystem
- ACID guarantees
Why SQLAlchemy?¶
- ORM abstraction
- Async support (2.0+)
- Migration tools (Alembic)
- Type safety
- Flexible queries
Why Async?¶
- Handle many concurrent connections
- Non-blocking I/O (DB, LLM, etc.)
- Better resource utilization
- Required for SSE streaming
Trade-offs & Decisions¶
Stateless JWT vs Sessions¶
Chose: Stateless JWT
Pros:

- Scalable (no session store)
- Simple deployment

Cons:

- Can't revoke tokens (except via expiration)
- Larger payload
Mitigation: Short-lived access tokens
JSONB vs Separate Tables¶
Chose: JSONB for ContentItem.data_json
Pros:

- Flexible schema per content type
- No migrations for new fields
- Easy to add content types

Cons:

- Less type safety
- Harder to query
Use case: Polymorphic content with varying fields
Monolith vs Microservices¶
Chose: Monolithic FastAPI app
Reasoning:

- Simpler development
- Easier transactions
- Single deployment
- Can split later if needed
Future Enhancements¶
- Caching layer (Redis)
- Full-text search (PostgreSQL FTS or Elasticsearch)
- WebSocket support (real-time collaboration)
- File upload service (S3)
- Background jobs (Celery)
- GraphQL API (alongside REST)
- Rate limiting
- Audit logging