Prompt Metadata Tracking

Overview

As of this update, chat messages track which prompt was used when generating AI responses. This metadata is stored in the meta JSONB column of the chat_messages table.

What Gets Tracked

When an AI assistant message is created, the following metadata is automatically saved:

{
  "prompt_id": 5,
  "prompt_version": 3,
  "prompt_name": "Video Tutor",
  "prompt_label": "production",
  "content_type": "video",
  "model": "claude-3-5-sonnet-20241022",
  "provider": "anthropic"
}

Metadata Fields

  • prompt_id: Database ID of the prompt version used
  • prompt_version: Version number of that prompt
  • prompt_name: Name of the prompt
  • prompt_label: Label that was used to resolve the prompt version (e.g., "production")
  • content_type: Content type of the content item being tutored (text, video, quiz, etc.)
  • model: The AI model used (e.g., "claude-3-5-sonnet-20241022")
  • provider: The LLM provider ("anthropic" or "aws_bedrock")
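Taken together, the fields above amount to a single dictionary built at generation time. The sketch below shows one way that assembly might look; the helper name and the attribute names on the prompt and content objects are illustrative assumptions, not the actual model fields:

```python
def build_prompt_metadata(prompt, content_item, model, provider):
    """Assemble the metadata dict saved with an assistant message.

    `prompt` and `content_item` attribute names here are assumptions
    for illustration, not the real app.models definitions.
    """
    return {
        "prompt_id": prompt.id,
        "prompt_version": prompt.version,
        "prompt_name": prompt.name,
        "prompt_label": prompt.label,
        "content_type": content_item.content_type,
        "model": model,
        "provider": provider,
    }
```

Because the result is a plain dict, it can be passed straight through to the `meta` JSONB column without any serialization step.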

Implementation Details

Code Changes

  1. app/services/llm_service.py:
     • _build_prompt() now returns a tuple: (prompt_text, metadata_dict)
     • stream_response() yields tuples of (content_chunk, metadata)
     • Metadata is only included in the first chunk to avoid duplication
  2. app/api/v1/chat.py:
     • The send_message() endpoint captures metadata from the first chunk
     • Passes metadata to create_chat_message() when saving the assistant's response
  3. app/crud/chat.py:
     • create_chat_message() already accepted a meta parameter; no changes were needed
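The consumer side of this flow can be sketched as follows. This is a simplified stand-in for the real endpoint in app/api/v1/chat.py (the function and parameter names here are assumptions): since metadata only rides along with the first chunk, the consumer captures it once and passes it on when persisting the message.

```python
async def consume_stream(stream, save_message):
    """Illustrative consumer for (content_chunk, metadata) tuples.

    Assumes the stream yields metadata only with the first chunk,
    and None thereafter, matching the behavior described above.
    """
    chunks = []
    prompt_meta = None
    async for chunk, metadata in stream:
        if metadata is not None and prompt_meta is None:
            # First chunk carries the prompt metadata; capture it once.
            prompt_meta = metadata
        chunks.append(chunk)
    # Persist the full response with whatever metadata was captured.
    await save_message(content="".join(chunks), meta=prompt_meta or {})
```

The `prompt_meta or {}` fallback means a stream that never produced metadata still saves a valid (empty) meta object rather than NULL.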

Database Schema

No schema changes required! The chat_messages.meta JSONB column already existed and supports arbitrary JSON data.

Use Cases

Audit Trail

Query which prompt version was used for specific conversations:

SELECT id, created_at,
  meta->>'prompt_name' AS prompt_name,
  meta->>'prompt_version' AS prompt_version
FROM chat_messages
WHERE role = 'assistant'
  AND session_id = 123;

A/B Testing

Compare responses from different prompt versions:

SELECT 
  meta->>'prompt_version' as version,
  COUNT(*) as message_count,
  AVG(LENGTH(content)) as avg_response_length
FROM chat_messages
WHERE role = 'assistant'
  AND meta->>'prompt_id' IS NOT NULL
GROUP BY meta->>'prompt_version';

Debugging

Find all messages that used a specific prompt:

SELECT session_id, content, created_at
FROM chat_messages
WHERE role = 'assistant'
  AND meta->>'prompt_id' = '5';

Important Notes

Backward Compatibility

  • ✅ Existing messages without metadata continue to work normally
  • ✅ Empty meta objects ({}) are valid
  • ⚠️ Messages created before this update will have empty meta fields
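Because pre-update messages carry empty (or missing) metadata, any code reading these fields should not assume they exist. A minimal defensive accessor, assuming `meta` may be None or an empty dict:

```python
def get_prompt_version(message_meta):
    """Safely read the prompt version from a message's meta field.

    Returns None for messages created before metadata tracking,
    where meta may be None or {}.
    """
    if not message_meta:
        return None
    return message_meta.get("prompt_version")
```

The same pattern applies to any of the other metadata fields.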

Future Improvements

Consider tracking additional metadata:

  • Response latency/timing
  • Token counts (input/output)
  • Temperature and other model parameters
  • User feedback/ratings on responses

Testing

To verify metadata is being saved:

from sqlalchemy import select
from app.models.chat import ChatMessage, MessageRole

# Get recent assistant messages
result = await db.execute(
    select(ChatMessage)
    .where(ChatMessage.role == MessageRole.ASSISTANT)
    .order_by(ChatMessage.created_at.desc())
    .limit(10)
)

for msg in result.scalars():
    print(f"Message {msg.id}: {msg.meta}")

Related Files
  • app/services/llm_service.py - LLM service with metadata generation
  • app/api/v1/chat.py - Chat API endpoint that captures metadata
  • app/crud/prompt.py - Prompt CRUD operations
  • app/models/prompt.py - Prompt and PromptDependency model definitions
  • app/models/chat.py - ChatMessage model definition