Design: OAuth Authentication#
Overview#
Purpose: Implement secure OAuth 2.0 authentication for MCP servers so they can access Microsoft services (Graph API, SharePoint, Exchange) on behalf of authenticated users, using the Azure AD On-Behalf-Of (OBO) flow.
Scope:
- OAuth 2.0 authentication flow with Azure AD
- Token exchange using On-Behalf-Of (OBO) pattern
- Secure token storage with multiple backend options
- Token lifecycle management (refresh, expiry)
- Integration with Microsoft identity platform
Out of Scope:
- Client-side authentication UI (handled by Azure AD)
- Multi-tenant authentication (single tenant only)
- Custom identity providers (Azure AD only)
Architecture#
┌──────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Client     │────▶│   MCP Server    │────▶│   Microsoft     │
│   (Claude)   │     │   (FastMCP)     │     │   Graph API     │
└──────────────┘     └─────────────────┘     └─────────────────┘
                              │                        ▲
                              │                        │
                              ▼                        │
                     ┌─────────────────┐               │
                     │   Azure AD      │───────────────┘
                     │ (OAuth Provider)│  (OBO Token Exchange)
                     └─────────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │  Token Storage  │
                     │  (Memory/Disk/  │
                     │   DynamoDB)     │
                     └─────────────────┘
Authentication Flow:
- User initiates connection to MCP server endpoint
- Server redirects to Azure AD login page (login.microsoftonline.com)
- User authenticates with Novo Nordisk credentials + MFA
- Azure AD returns authorization code to MCP server
- Server exchanges code for user access token (audience: MCP server)
- Token stored in configured backend (memory/disk/DynamoDB)
- On API call, server performs OBO token exchange: user token → Graph API token
- Graph token used to call Microsoft API with user's delegated permissions
- Tokens auto-refresh when expired using refresh token
Components:
- FastMCP Server - Python async HTTP server handling MCP protocol
- AzureProvider - FastMCP's built-in Azure AD OAuth integration
- OBOAuthenticator - Custom module for On-Behalf-Of token exchange (auth.py)
- Token Storage Backend - Pluggable storage (memory/disk/DynamoDB)
- GraphClient - HTTP client wrapper with automatic token exchange
- Azure AD - Identity provider, OAuth authorization server
Tech Stack#
Backend: Python 3.11, FastMCP 3.0+, Starlette (ASGI)
HTTP Client: httpx (async)
OAuth Library: MSAL (Microsoft Authentication Library) 1.28+
Token Storage: py-key-value-aio (supports memory/disk/DynamoDB)
Infrastructure: AWS ECS Fargate, Application Load Balancer
Secret Management: AWS SSM Parameter Store (encrypted)
Deployment: Docker containers, GitHub Actions CI/CD
OAuth 2.0 Flow Implementation#
Authorization Code Flow#
Protocol: OAuth 2.0 Authorization Code Grant (RFC 6749)
Note on PKCE: Public clients connecting to this MCP server (browsers, Claude Desktop, VS Code Copilot) use PKCE (RFC 7636) to secure their authorization code exchange. This is a client-side concern handled automatically by FastMCP's AzureProvider. This spec covers the server-side authorization code flow only: the MCP server is a confidential client with a client_secret, so PKCE is not required for the server → Azure AD leg.
Endpoints:
- Authorization: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize
- Token: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token
Required Parameters:
client_id: Azure AD application ID for MCP server
client_secret: Securely stored in AWS SSM Parameter Store
tenant_id: Novo Nordisk Azure AD tenant
redirect_uri: {mcp_base_url}/oauth/callback
scope: api://{client_id}/access offline_access
response_type: code
Implementation: Handled automatically by FastMCP's AzureProvider class in server.py.
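For illustration only, the authorization request implied by the parameters above can be assembled as follows. This is a minimal sketch, not AzureProvider's actual code: the function name `build_authorize_url` and the `state` parameter are assumptions added here (a `state` value is common practice for CSRF protection but is not listed in this spec).

```python
from urllib.parse import urlencode


def build_authorize_url(tenant_id: str, client_id: str, mcp_base_url: str, state: str) -> str:
    """Assemble the Azure AD authorization URL from the required parameters."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": f"{mcp_base_url}/oauth/callback",
        "scope": f"api://{client_id}/access offline_access",
        "state": state,  # assumption: opaque anti-CSRF value, not part of the spec
    }
    base = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize"
    # urlencode percent-encodes the redirect URI and the space-separated scope
    return f"{base}?{urlencode(params)}"
```

The client_secret never appears in this URL; it is only sent on the back-channel request to the token endpoint.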
On-Behalf-Of (OBO) Token Exchange#
Purpose: Exchange user's MCP access token for Microsoft Graph API token while preserving user identity and permissions.
Protocol: OAuth 2.0 On-Behalf-Of flow (Microsoft documentation)
Token Endpoint: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token
Request Parameters:
grant_type: urn:ietf:params:oauth:grant-type:jwt-bearer
client_id: {mcp_server_client_id}
client_secret: {mcp_server_client_secret}
assertion: {user_access_token}
scope: https://graph.microsoft.com/Mail.Read
https://graph.microsoft.com/Calendars.Read
https://graph.microsoft.com/MailboxSettings.Read
offline_access
requested_token_use: on_behalf_of
Implementation: Custom OBOAuthenticator class in auth.py:
async def exchange_token_via_obo(
    user_token: str,
    tenant_id: str,
    client_id: str,
    client_secret: str,
    target_scope: str,
) -> str:
    """Exchange user token for Graph API token using OBO flow."""
    # See: connectors/mcps/outlook-mcp/src/outlook_mcp/auth.py
Key Design Decisions:
- Stateless OBO - No server-side token caching for Graph tokens (always exchange on-demand)
- User permission preservation - Graph token inherits user's Azure AD RBAC permissions
- Error handling - HTTP 4xx/5xx from Azure AD are logged and re-raised
- Timeout - 30 second timeout for token exchange requests
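As a rough sketch of the request described above, the OBO form body can be built from the listed parameters. The helper name `build_obo_request` is hypothetical and is not taken from auth.py; it only prepares the URL and form fields and performs no network call.

```python
from typing import Dict, Sequence, Tuple


def build_obo_request(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    user_token: str,
    scopes: Sequence[str],
) -> Tuple[str, Dict[str, str]]:
    """Return the token endpoint URL and form fields for an OBO exchange."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,  # the user's MCP access token
        "scope": " ".join(scopes),  # scopes are space-separated
        "requested_token_use": "on_behalf_of",
    }
    return url, form
```

With httpx, the pair could then be sent as `await client.post(url, data=form)` on an `httpx.AsyncClient(timeout=30.0)`, followed by `response.raise_for_status()` and reading `access_token` from the JSON body. That wiring is an assumption about how the 30 second timeout and re-raise behavior might be realized.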
Token Storage#
Storage Backends#
Three pluggable options configured via STORAGE_TYPE environment variable:
1. Memory (Default for Local Dev)#
- Use case: Local development only
- Persistence: None; tokens are lost on restart
- Performance: Fastest (in-process)
- Security: Limited to process memory
2. Disk#
- Use case: Single-instance deployments, development
- Persistence: Survives restarts
- Performance: Fast (local filesystem I/O)
- Security: File-level encryption at rest (OS-dependent)
- Location: Configurable via STORAGE_DISK_DIRECTORY
3. DynamoDB (Production)#
storage_backend = DynamoDBStore(
    table_name="mcp-oauth-storage-outlook-mcp",
    region_name="eu-central-1",
)
Configured via STORAGE_DYNAMODB_TABLE_NAME and STORAGE_DYNAMODB_REGION.
Implementation Reference: connectors/mcps/outlook-mcp/src/outlook_mcp/server.py:create_storage_backend()
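The backend selection could look roughly like the sketch below. It does not reproduce create_storage_backend(): `StorageConfig` and `resolve_storage_config` are hypothetical names, and treating an unset STORAGE_TYPE as "memory" is an assumption based on memory being described as the local-dev default. The sketch only resolves which backend and settings to use, leaving construction of the actual store object aside.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageConfig:
    kind: str
    directory: Optional[str] = None
    table_name: Optional[str] = None
    region_name: Optional[str] = None


def resolve_storage_config(env: Mapping[str, str]) -> StorageConfig:
    """Map STORAGE_* environment variables to a backend description."""
    kind = env.get("STORAGE_TYPE", "memory").lower()  # assumption: memory when unset
    if kind == "memory":
        return StorageConfig(kind="memory")
    if kind == "disk":
        return StorageConfig(kind="disk", directory=env["STORAGE_DISK_DIRECTORY"])
    if kind == "dynamodb":
        return StorageConfig(
            kind="dynamodb",
            table_name=env["STORAGE_DYNAMODB_TABLE_NAME"],
            region_name=env["STORAGE_DYNAMODB_REGION"],
        )
    # Fail fast on typos rather than silently falling back to memory
    raise ValueError(f"Unsupported STORAGE_TYPE: {kind!r}")
```

In a real process the function would be called with `os.environ`; passing a plain mapping keeps it easy to test.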
Token Lifecycle#
Access Token:
- Lifetime: 1 hour (Azure AD default)
- Storage: Stored in configured backend after authorization code exchange
- Refresh: Automatically refreshed when expired using refresh token
Refresh Token:
- Lifetime: 90 days (Azure AD default, configurable via conditional access policies)
- Storage: Stored alongside access token
- Usage: Used to obtain new access token without user re-authentication
- Revocation: User logout, password change, or Azure AD policy triggers revocation
Graph Token (OBO):
- Lifetime: 1 hour (Azure AD default)
- Storage: NOT stored; always exchanged on-demand from user token
- Refresh: New OBO exchange performed when needed (no caching)
Session Expiry Handling:
- When refresh token expires → User redirected to Azure AD login
- When access token expires → Automatic refresh using refresh token (transparent to user)
- When OBO exchange fails → HTTP 401 returned to client with error details
Security Design#
Authentication Security#
Multi-Factor Authentication (MFA):
- Enforced by Azure AD conditional access policies
- MCP server respects Azure AD authentication decisions
- No server-side MFA logic required
Conditional Access:
- IP restrictions, device compliance, trusted networks enforced by Azure AD
- MCP server cannot bypass Azure AD policies
- Failed conditional access checks prevent token issuance
Scope Validation:
- Required scopes: api://{client_id}/access, offline_access
- Graph scopes: Mail.Read, Calendars.Read, MailboxSettings.Read
- User must consent to all required scopes during login
- Azure AD RBAC honored - token scoped to user's permissions only
Token Security#
Secret Management:
- Client secrets stored in AWS SSM Parameter Store
- Secrets encrypted at rest with AWS KMS
- Retrieved at runtime via IAM role (no hardcoded secrets)
- Never logged or exposed in HTTP responses
- Implementation: settings.mcp_server_azure_client_secret (SecretStr type)
Token Protection:
- Access tokens never logged (Pydantic SecretStr type)
- Refresh tokens stored encrypted in DynamoDB at rest
- Tokens transmitted only over TLS 1.3
- No token caching in GraphClient (stateless OBO)
- Token expiry honored - no use of expired tokens
Transport Security:
- TLS 1.3 enforced on Application Load Balancer
- HTTPS only - HTTP requests redirected to HTTPS
- Certificate: AWS Certificate Manager (ACM) wildcard cert for *.connectors.novo-genai.com
Audit Logging#
Authentication Events Logged:
- User identity (email/UPN)
- Timestamp (ISO 8601 UTC)
- Result (success/failure)
- IP address (from X-Forwarded-For header)
- MCP endpoint accessed
Logged via Python logging module:
logger.info("User %s authenticated successfully from %s", user_email, ip_address)
logger.error("OBO token exchange failed: %s", error_message)
Log Destination: AWS CloudWatch Logs (centralized)
NOT Logged:
- Access tokens, refresh tokens, client secrets
- Authorization codes
- Sensitive request/response bodies
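A possible shape for an authentication audit record is sketched below. The names `client_ip` and `build_auth_event` are hypothetical. Taking the right-most X-Forwarded-For entry is an assumption that exactly one proxy (the ALB) sits in front of the server: the ALB appends the address it saw, while entries further left are supplied by the caller and can be spoofed.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def client_ip(forwarded_for: Optional[str]) -> Optional[str]:
    """Return the address appended by the last proxy, or None if absent."""
    if not forwarded_for:
        return None
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    return hops[-1] if hops else None


def build_auth_event(
    user: str,
    result: str,
    forwarded_for: Optional[str],
    endpoint: str,
    now: Optional[datetime] = None,
) -> str:
    """Serialize the logged fields as one JSON line; no tokens or codes included."""
    timestamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    event = {
        "timestamp": timestamp.isoformat(),  # ISO 8601, UTC
        "user": user,
        "result": result,
        "ip_address": client_ip(forwarded_for),
        "endpoint": endpoint,
    }
    return json.dumps(event)
```

Building the record from an explicit allow-list of fields makes it harder for a token or authorization code to leak into the log by accident.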
API Design#
Authentication Middleware#
FastMCP's AzureProvider automatically injects authentication middleware that:
1. Intercepts all MCP tool requests
2. Validates bearer token in Authorization header
3. Extracts token and makes it available via get_access_token()
4. Returns HTTP 401 if token missing or invalid
Unauthenticated Endpoints:
- /health - Health check (bypasses auth)
- /oauth/authorize - OAuth initiation
- /oauth/callback - OAuth callback
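The middleware behavior described above can be approximated as follows. This is a simplified sketch, not FastMCP's implementation: `PUBLIC_PATHS`, `requires_auth`, and `extract_bearer_token` are hypothetical names, and real validation would also verify the token's signature, audience, and expiry.

```python
from typing import Optional

# Endpoints listed above as bypassing authentication
PUBLIC_PATHS = frozenset({"/health", "/oauth/authorize", "/oauth/callback"})


def requires_auth(path: str) -> bool:
    """True for every path that is not explicitly public."""
    return path not in PUBLIC_PATHS


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value.

    Returns None when the header is missing or malformed, in which case
    the caller should answer with HTTP 401.
    """
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```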
GraphClient Pattern#
All Microsoft Graph API calls use GraphClient which encapsulates OBO token exchange:
client = _require_graph_client()  # Gets user token, returns GraphClient
response = await client.make_authenticated_request(
    url="https://graph.microsoft.com/v1.0/me/messages",
    method="GET",
)
GraphClient automatically:
1. Extracts user token from request context
2. Performs OBO exchange to get Graph token
3. Adds Authorization: Bearer {graph_token} header
4. Makes HTTP request to Graph API
5. Returns JSON response
Error Handling:
- HTTP 401 from Graph API → OBO exchange may have failed, re-raise
- HTTP 403 → User lacks permission, re-raise
- HTTP 429 → Rate limited, re-raise (client should retry with backoff)
- Non-2xx responses → httpx.HTTPStatusError, logged and re-raised
- Network errors (connection failures, timeouts) → httpx.RequestError, logged and re-raised
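Since HTTP 429 is re-raised and retrying is left to the caller, a caller-side delay calculation might look like this sketch. The function name, the 1 second base, and the 60 second cap are assumptions. Only the delta-seconds form of Retry-After is parsed here; the HTTP-date form falls back to exponential backoff.

```python
from typing import Optional


def retry_delay_seconds(
    attempt: int,
    retry_after: Optional[str],
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None and retry_after.strip().isdigit():
        # The server told us how long to wait; honor it, up to the cap
        return min(float(retry_after.strip()), cap)
    # No usable header: exponential backoff 1s, 2s, 4s, ... up to the cap
    return min(base * (2 ** attempt), cap)
```

Adding random jitter to the backoff branch is a common refinement when many clients may retry at once.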
Configuration#
Environment Variables#
Required (Production):
ENABLE_OAUTH=true
MCP_SERVER_AZURE_CLIENT_ID=<azure-app-id>
MCP_SERVER_AZURE_CLIENT_SECRET=<secret-from-ssm>
MCP_SERVER_AZURE_TENANT_ID=<tenant-id>
MCP_BASE_URL=https://outlook.connectors.novo-genai.com
STORAGE_TYPE=dynamodb
STORAGE_DYNAMODB_TABLE_NAME=mcp-oauth-storage-outlook-mcp
STORAGE_DYNAMODB_REGION=eu-central-1
CACHE_TYPE=dynamodb
CACHE_DYNAMODB_TABLE_NAME=cache-mcp-outlook-prod
MCP_SERVER_RATE_LIMIT_RPM=60 # Max tool invocations per user per minute (0 = disabled)
Development Only:
ENABLE_OAUTH=false # Skips OAuth, uses debug token
GRAPH_DEBUG_TOKEN=$(az account get-access-token --resource https://graph.microsoft.com --query accessToken -o tsv)
Graph Scopes (Space-Separated):
GRAPH_SCOPES="https://graph.microsoft.com/Mail.Read https://graph.microsoft.com/Calendars.Read https://graph.microsoft.com/MailboxSettings.Read offline_access"
FastMCP Server Scopes (Comma-Separated):
Implementation: connectors/mcps/outlook-mcp/src/outlook_mcp/config.py
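For the space-separated GRAPH_SCOPES value, parsing might be sketched as below. This is not taken from config.py: `parse_graph_scopes` is a hypothetical helper, and qualifying short names such as Mail.Read with the Graph resource prefix is an assumption made so that both notations used in this document resolve to the same scope.

```python
from typing import List

GRAPH_RESOURCE = "https://graph.microsoft.com/"
# OpenID Connect scopes are passed through without a resource prefix
OIDC_SCOPES = {"openid", "profile", "email", "offline_access"}


def parse_graph_scopes(raw: str) -> List[str]:
    """Split a space-separated scope string into unique, fully qualified scopes."""
    scopes: List[str] = []
    for item in raw.split():
        if item in OIDC_SCOPES or item.startswith(GRAPH_RESOURCE):
            qualified = item
        else:
            qualified = GRAPH_RESOURCE + item
        if qualified not in scopes:  # keep first occurrence, preserve order
            scopes.append(qualified)
    return scopes
```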
Deployment Architecture#
AWS ECS Fargate#
Service Configuration:
- Container: Python 3.11 slim image with FastMCP app
- Task Definition: 0.5 vCPU, 1 GB memory
- Auto-scaling: 2-10 tasks based on CPU/memory
- Health Check: /health endpoint (15s interval)
Network:
- VPC: Private subnets (no direct internet access)
- Load Balancer: Public Application Load Balancer (ALB)
- Target Group: Port 8000 (FastMCP HTTP server)
- ALB Rule: Host: outlook.connectors.novo-genai.com → target group
Secrets Injection:
- AWS SSM Parameter Store parameters injected as environment variables at container startup
- IAM task role grants ssm:GetParameter permission
- No secrets in task definition or ECR image
CI/CD Pipeline#
Trigger: Push to main branch or manual workflow dispatch
Steps:
1. Checkout code
2. Build Docker image (multi-stage build for size optimization)
3. Push to AWS ECR
4. Update ECS service with new image
5. Wait for deployment to stabilize (health checks pass)
Implementation: .github/workflows/deploy-outlook-mcp.yml
Performance Considerations#
Token Exchange Latency#
OBO Exchange Time: ~200-500ms (network call to Azure AD)
Optimization Strategy:
- No caching of Graph tokens (security > latency)
- Future optimization: Short-lived in-memory cache (5-10 min TTL) if latency becomes an issue
Storage Backend Performance#
| Backend | Read Latency | Write Latency | Durability | Multi-Instance |
|---|---|---|---|---|
| Memory | <1ms | <1ms | None | No |
| Disk | 1-5ms | 1-5ms | Yes | No |
| DynamoDB | 10-50ms | 10-50ms | Yes | Yes |
Recommendation: DynamoDB for production despite higher latency (required for multi-instance)
API Rate Limits#
Microsoft Graph API:
- Mail: 10,000 requests per 10 minutes per user
- Calendar: 10,000 requests per 10 minutes per user
Mitigation:
- Respect Retry-After header on HTTP 429 responses from Graph API
- Per-user rate limiting enforced server-side via RateLimitMiddleware (sliding window, keyed on Azure AD oid claim). Configured via MCP_SERVER_RATE_LIMIT_RPM (default: 60 req/min). Returns HTTP 429 with Retry-After on breach. See server.py in each connector.
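A sliding-window limiter keyed on the oid claim can be sketched as follows. This is an illustration of the technique, not the RateLimitMiddleware source: the class name, the injectable clock, and the in-process dictionary are assumptions. State held this way is per task, so with several ECS tasks the effective limit would be per instance unless the counters are shared.

```python
import math
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Optional


class SlidingWindowLimiter:
    def __init__(
        self,
        limit_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        window_seconds: float = 60.0,
    ) -> None:
        self.limit = limit_per_minute
        self.window = window_seconds
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, oid: str) -> Optional[int]:
        """Record a request for `oid`.

        Returns None if allowed, otherwise the Retry-After value in seconds.
        """
        if self.limit <= 0:  # 0 disables rate limiting
            return None
        now = self.clock()
        hits = self._hits[oid]
        # Drop timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return None
        # Blocked until the oldest recorded request leaves the window
        return max(1, math.ceil(self.window - (now - hits[0])))
```

A middleware would call `check(oid)` per tool invocation and, on a non-None result, respond with HTTP 429 and that value in the Retry-After header.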
Monitoring & Observability#
Metrics#
Logged to CloudWatch:
- Authentication success/failure rate
- OBO token exchange latency
- Graph API request latency
- HTTP error rates (4xx, 5xx)
Alerts (Configured in Terraform):
- Error rate > 5% for 5 minutes
- OBO exchange failures > 10 in 5 minutes
- Health check failures (ECS target unhealthy)
Logs#
Log Format: JSON structured logs
Key Events:
- OBO token exchange successful
- OBO token exchange failed: {status_code}
- User authenticated: {user_email}
- GraphClient request: {method} {url} -> {status}
Log Level:
- DEBUG: Token exchange details (no sensitive data)
- INFO: Successful operations
- ERROR: Failed operations with context
Testing Strategy#
Unit Tests#
test_auth.py:
- OBO token exchange success
- OBO token exchange failure (401, 403, 500)
- Timeout handling
- SecretStr handling
test_server.py:
- GraphClient initialization (OAuth enabled vs disabled)
- Debug token fallback
- Error handling when no token available
Integration Tests#
Manual Testing (Local Dev):
1. Run with ENABLE_OAUTH=false and GRAPH_DEBUG_TOKEN
2. Call MCP tools (e.g., list_emails)
3. Verify Graph API calls succeed
Production Testing:
1. Deploy to dev environment
2. Authenticate via Claude Desktop
3. Call MCP tools, verify OAuth flow end-to-end
4. Check CloudWatch logs for OBO exchanges
Key Design Decisions#
Why On-Behalf-Of (OBO) Instead of Service Principal?#
Decision: Use OBO flow, not service principal with application permissions.
Rationale:
- User permission preservation - Graph API calls inherit user's Azure AD RBAC, ensuring data access is scoped to what the user can see
- Audit trail - All Graph API calls attributed to user, not service account
- Zero trust - No elevated service account with broad permissions
- Compliance - Aligns with least privilege principle required for GxP systems
Trade-off: Additional latency (~200-500ms per OBO exchange) vs security and compliance
Why No Graph Token Caching?#
Decision: Always exchange user token for Graph token on-demand, no caching.
Rationale:
- Security - Reduces token lifetime exposure (Graph token only exists during API call)
- Simplicity - No cache invalidation logic, no stale token issues
- Stateless - Easier to scale horizontally (no shared cache needed)
Trade-off: Higher latency vs security and simplicity
Future Optimization: If latency becomes an issue, implement a short-lived (5-10 min TTL) in-memory cache with proper invalidation
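If that optimization is ever pursued, the cache could be as small as the sketch below. It is purely illustrative of the deferred option and is not part of the current design: `TTLTokenCache`, the 300 second default, and keying on an (oid, scope) tuple are assumptions. The TTL would need to stay below the Graph token's remaining lifetime.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class TTLTokenCache:
    """In-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float = 300.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[str, float]] = {}

    def put(self, key: Hashable, token: str) -> None:
        self._entries[key] = (token, self._clock() + self._ttl)

    def get(self, key: Hashable) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return token

    def invalidate(self, key: Hashable) -> None:
        """Drop an entry early, e.g. after a 401 from Graph or a user logout."""
        self._entries.pop(key, None)
```

Because the cache is per process, each ECS task would hold its own copy, which keeps the design free of shared state at the cost of some repeated exchanges.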
Why DynamoDB for Token Storage in Production?#
Decision: Use DynamoDB for OAuth token storage (not Redis/ElastiCache).
Rationale:
- Serverless - No cluster management, auto-scaling
- Durability - Replicated across AZs, no data loss on instance failure
- Cost - Pay-per-request pricing, cheaper than Redis for low-volume workloads
- Security - Encryption at rest with KMS, IAM-based access control
Trade-off: Higher latency (10-50ms vs <1ms for Redis) vs operational simplicity
Related Requirements#
URS: @URS:OAuthAuthentication (features/oauth-authentication.feature)
Risk Assessment: RISK:OAuthAuthentication (docs/risk-oauth-authentication.md)
Scenarios Addressed:
- OAuth2 authentication flow with Azure AD
- OBO token exchange for Microsoft Graph
- Secure token storage (memory/disk/DynamoDB)
- Token refresh and session expiry handling
- Conditional access policy enforcement
- Client secret management via AWS SSM
Version: 1.0
Date: 2024-04-16
Author: AILab
Approved by: Pending PR review