Skip to content

Database Schema Reference

Complete reference for all database models and relationships.

Overview

The 2Sigma Backend uses PostgreSQL with SQLAlchemy 2.0 (async). The schema supports multi-tenant learning management with AI tutoring.

Core Entity Hierarchy

Country
└── University (tenant)
    └── Department
        └── Course (definition)
            ├── CourseOffering (term instance)
            │   └── CourseEnrollment (user enrollment)
            └── Module (hierarchical tree)
                └── ContentItem (learning materials)

Models

User Management

User

Core user accounts.

Table: users

Column Type Constraints Description
id Integer PK User ID
email String(255) Unique, Not Null User email (login)
password_hash String(255) Not Null Bcrypt hashed password
full_name String(255) Not Null Display name
locale String(10) Default 'en' User language
time_zone String(50) Default 'UTC' User timezone
is_active Boolean Default True Account status
current_streak_days Integer Default 0 Learning streak
last_activity_date Date Nullable Last activity
created_at DateTime(TZ) Default now() Creation time
updated_at DateTime(TZ) Default now() Last update

Relationships: - profile (One-to-One) → UserProfile - university_roles (One-to-Many) → UserUniversityRole - enrollments (One-to-Many) → CourseEnrollment - chat_sessions (One-to-Many) → ChatSession

Indexes: - ix_users_email (Unique)


UserProfile

Extended user profile from onboarding.

Table: user_profiles

Column Type Description
id Integer PK
user_id Integer FK → users.id (Unique)
date_of_birth Date Nullable
learning_goal Enum CAREER_ADVANCEMENT, SKILL_DEVELOPMENT, etc.
study_level Enum HIGH_SCHOOL, UNDERGRADUATE, etc.
grade Enum FRESHMAN, SOPHOMORE, etc.

Relationships: - user → User - subject_preferences → Subject (Many-to-Many via UserSubjectPreference)


UserUniversityRole

Multi-tenant RBAC (role per university).

Table: user_university_roles

Column Type Description
id Integer PK
user_id Integer FK → users.id
university_id Integer FK → universities.id
role Enum STUDENT, INSTRUCTOR, TA, ADMIN, etc.
department_id Integer FK → departments.id (Optional)

Unique: (user_id, university_id, role)


Geography/Tenancy

Country

Table: countries

Column Type Description
id Integer PK
code String(2) ISO 3166-1 alpha-2 (Unique)
name String(100) Country name

University

Tenant organizations.

Table: universities

Column Type Description
id Integer PK
name String(255) University name
short_code String(20) Unique code (e.g., "MIT")
country_id Integer FK → countries.id
time_zone String(50) Default timezone
website_url String(255) Nullable
is_active Boolean Default True

Unique: (country_id, short_code)


Department

Table: departments

Column Type Description
id Integer PK
university_id Integer FK → universities.id
name String(255) Department name
code String(20) Nullable

Learning Taxonomy

Domain

High-level subject areas.

Table: domains

Column Type Description
id Integer PK
name String(100) e.g., "Mathematics", "Science"

Subject

Recursive subject hierarchy.

Table: subjects

Column Type Description
id Integer PK
domain_id Integer FK → domains.id
parent_subject_id Integer FK → subjects.id (Self-referencing)
name String(100) Subject name

Relationships: - domain → Domain - parent_subject → Subject (Self) - child_subjects → List[Subject]


Course Structure

Course

Course definitions (reusable across terms).

Table: courses

Column Type Description
id Integer PK
university_id Integer FK → universities.id
department_id Integer FK → departments.id (Optional)
primary_subject_id Integer FK → subjects.id
title String(255) Course title
code String(20) Course code (e.g., "CS101")
description Text Nullable
level Enum INTRODUCTORY, INTERMEDIATE, ADVANCED, EXPERT
credit_hours Float Default 0
is_published Boolean Default False
is_coming_soon Boolean Default False

Unique: (university_id, code)

Relationships: - offerings (One-to-Many) → CourseOffering - modules (One-to-Many) → Module - subjects (Many-to-Many) → Subject via CourseSubject - prerequisites (Many-to-Many) → Course via CoursePrerequisite


CourseOffering

Course instances (terms/semesters).

Table: course_offerings

Column Type Description
id Integer PK
course_id Integer FK → courses.id
university_id Integer FK → universities.id
term_name String(100) e.g., "Spring 2026"
starts_at DateTime(TZ) Nullable
ends_at DateTime(TZ) Nullable
enrollment_start DateTime(TZ) Nullable
enrollment_end DateTime(TZ) Nullable
visibility Enum PUBLIC, UNLISTED, INVITE_ONLY

CourseEnrollment

User course registrations.

Table: course_enrollments

Column Type Description
id Integer PK
user_id Integer FK → users.id
course_offering_id Integer FK → course_offerings.id
status Enum ACTIVE, COMPLETED, DROPPED, PENDING, WAITLISTED
enrolled_at DateTime(TZ) Default now()
completed_at DateTime(TZ) Nullable

Unique: (user_id, course_offering_id)


Content Structure

Module

Hierarchical content organization (tree structure).

Table: modules

Column Type Description
id Integer PK
course_id Integer FK → courses.id
parent_module_id Integer FK → modules.id (Self-referencing)
title String(255) Module title
description Text Nullable
module_type Enum COURSE_ROOT, UNIT, CHAPTER, LESSON, TOPIC, LAB, QUIZ
sequence_index Integer Order within parent
depth Integer Tree depth
is_published Boolean Default False

Relationships: - parent_module → Module (Self) - child_modules → List[Module] - content_items (One-to-Many) → ContentItem


ContentType

Admin-managed content types.

Table: content_types

Column Type Description
id Integer PK
name String(50) Internal name (e.g., "video")
label String(100) Display name
description Text Nullable
is_active Boolean Default True

ContentItem

Polymorphic learning materials.

Table: content_items

Column Type Description
id Integer PK
module_id Integer FK → modules.id
course_id Integer FK → courses.id
content_type_id Integer FK → content_types.id
title String(255) Content title
description Text Nullable
data_json JSONB Flexible content data
learning_objectives JSONB Array of objectives
sequence_index Integer Order within module
estimated_duration_seconds Integer Nullable
prompt_name String(255) Name of prompt to use (Optional)
prompt_label String(36) Label of prompt version to resolve (Default "production")

Indexes: - GIN index on data_json


Progress Tracking

UserModuleProgress

Table: user_module_progress

Column Type Description
id Integer PK
user_id Integer FK → users.id
course_offering_id Integer FK → course_offerings.id
module_id Integer FK → modules.id
progress_percent Integer 0-100
started_at DateTime(TZ) Default now()
completed_at DateTime(TZ) Nullable

Unique: (user_id, course_offering_id, module_id)


UserContentProgress

Table: user_content_progress

Column Type Description
id Integer PK
user_id Integer FK → users.id
course_offering_id Integer FK → course_offerings.id
content_item_id Integer FK → content_items.id
progress_percent Integer 0-100
attempts_count Integer Default 0
first_started_at DateTime(TZ) Default now()
last_interacted_at DateTime(TZ) Default now()
completed_at DateTime(TZ) Nullable

Unique: (user_id, course_offering_id, content_item_id)


AI Chat System

ChatSession

Table: chat_sessions

Column Type Description
id Integer PK
user_id Integer FK → users.id
content_item_id Integer FK → content_items.id (Optional)
content_context JSONB Snapshot of content
share_enabled Boolean Default False
share_token String(255) Unique, Nullable
created_at DateTime(TZ) Default now()
updated_at DateTime(TZ) Default now()

Relationships: - messages (One-to-Many) → ChatMessage - notes (One-to-Many) → ChatNote


ChatMessage

Table: chat_messages

Column Type Description
id Integer PK
session_id Integer FK → chat_sessions.id
role Enum USER, ASSISTANT, SYSTEM
content Text Message text
meta JSONB Model, prompt ID, etc.
created_at DateTime(TZ) Default now()

ChatNote

Table: chat_notes

Column Type Description
id Integer PK
session_id Integer FK → chat_sessions.id
message_id Integer FK → chat_messages.id (Optional)
content Text Note content
selected_text Text Highlighted text
created_at DateTime(TZ) Default now()

Prompt Management

Prompt

Langfuse-style versioned LLM prompts with label-based version management.

Table: prompts

Column Type Description
id BigInteger PK
name String(255) Prompt name (e.g., "General Protocol")
version Integer Auto-incrementing version per name
type String(20) Prompt type (default "text")
prompt Text Prompt template content (nullable)
config JSON Version-specific configuration (default {})
labels ARRAY(Text) Version labels (e.g., ["production", "latest"])
tags ARRAY(Text) Tags synchronized across all versions of a name
commit_message String(500) Version commit message (nullable)
created_by String(255) User ID or "API"
created_at DateTime(TZ) Default now()
updated_at DateTime(TZ) Default now()

Constraints: - Unique: (name, version) — no duplicate versions per prompt name - Labels are unique per prompt name (a label can only exist on one version at a time) - latest label is auto-assigned to the newest version - production is the default label for resolution

Indexes: - ix_prompts_name on (name) - ix_prompts_labels on (labels) using GIN - ix_prompts_tags on (tags) using GIN - ix_prompts_created_at on (created_at)


PromptDependency

Tracks inline prompt composition references (@@@langfusePrompt:name=X|label=Y@@@).

Table: prompt_dependencies

Column Type Description
id BigInteger PK
parent_id BigInteger FK → prompts.id (CASCADE)
child_name String(255) Referenced prompt name
child_version Integer Referenced version (nullable)
child_label String(36) Referenced label (nullable)
created_at DateTime(TZ) Default now()
updated_at DateTime(TZ) Default now()

Constraints: - Check: exactly one of child_version or child_label must be non-null - Max nesting depth: 5 levels - Circular dependencies are detected and rejected

Indexes: - ix_prompt_deps_parent on (parent_id) - ix_prompt_deps_child on (child_name)


System Configuration


SystemSetting

Key-value configuration store.

Table: system_settings

Column Type Description
id Integer PK
setting_key String(100) Unique key
setting_value Text Value (encrypted for secrets)
is_encrypted Boolean Default False
description Text Nullable
updated_at DateTime(TZ) Default now()

Database Enums

UserRole

STUDENT, INSTRUCTOR, TA, DEPARTMENT_ADMIN, UNIVERSITY_ADMIN, PLATFORM_ADMIN

LearningGoal

CAREER_ADVANCEMENT, SKILL_DEVELOPMENT, ACADEMIC_REQUIREMENT, PERSONAL_INTEREST, CERTIFICATION

StudyLevel

HIGH_SCHOOL, UNDERGRADUATE, GRADUATE, DOCTORATE, PROFESSIONAL

CourseLevel

INTRODUCTORY, INTERMEDIATE, ADVANCED, EXPERT

EnrollmentStatus

ACTIVE, COMPLETED, DROPPED, PENDING, WAITLISTED

ModuleType

COURSE_ROOT, UNIT, CHAPTER, LESSON, TOPIC, SECTION, LAB, QUIZ

MessageRole

USER, ASSISTANT, SYSTEM

Indexes and Performance

Key Indexes: - All foreign keys have indexes - Unique constraints on: user email, university short_code, course code - Composite indexes: (user_id, course_offering_id), (module_id, sequence_index) - GIN indexes on JSONB columns: content_items.data_json, chat_sessions.content_context - GIN indexes on ARRAY columns: prompts.labels, prompts.tags

Cascade Deletes: - Most relationships use CASCADE delete - Example: Deleting a course deletes its offerings, enrollments, and modules

Migration History

16 migrations from 2025-11-27 to 2026-01-15. See Database Migrations Guide.

Next Steps