
Design: OAuth Authentication#

Overview#

Purpose: Implement secure OAuth 2.0 authentication for MCP servers to access Microsoft services (Graph API, SharePoint, Exchange) on behalf of authenticated users using Azure AD On-Behalf-Of (OBO) flow.

Scope:

- OAuth 2.0 authentication flow with Azure AD
- Token exchange using On-Behalf-Of (OBO) pattern
- Secure token storage with multiple backend options
- Token lifecycle management (refresh, expiry)
- Integration with Microsoft identity platform

Out of Scope:

- Client-side authentication UI (handled by Azure AD)
- Multi-tenant authentication (single tenant only)
- Custom identity providers (Azure AD only)


Architecture#

┌──────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Client     │────▶│  MCP Server     │────▶│  Microsoft      │
│   (Claude)   │     │  (FastMCP)      │     │  Graph API      │
└──────────────┘     └─────────────────┘     └─────────────────┘
                             │                         ▲
                             │                         │
                             ▼                         │
                     ┌─────────────────┐               │
                     │  Azure AD       │───────────────┘
                     │ (OAuth Provider)│  (OBO Token Exchange)
                     └─────────────────┘
                     ┌─────────────────┐
                     │  Token Storage  │
                     │  (Memory/Disk/  │
                     │   DynamoDB)     │
                     └─────────────────┘

Authentication Flow:

  1. User initiates connection to MCP server endpoint
  2. Server redirects to Azure AD login page (login.microsoftonline.com)
  3. User authenticates with Novo Nordisk credentials + MFA
  4. Azure AD returns authorization code to MCP server
  5. Server exchanges code for user access token (audience: MCP server)
  6. Token stored in configured backend (memory/disk/DynamoDB)
  7. On API call, server performs OBO token exchange: user token → Graph API token
  8. Graph token used to call Microsoft API with user's delegated permissions
  9. Tokens auto-refresh when expired using refresh token
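
Step 5 above can be sketched as building the authorization-code token request. This is illustrative only: FastMCP's AzureProvider performs the exchange internally, and the helper name `build_code_exchange_request` is a hypothetical.

```python
def build_code_exchange_request(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    code: str,
    redirect_uri: str,
) -> tuple[str, dict]:
    """Return the token endpoint URL and form body for the code exchange."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = {
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
        "redirect_uri": redirect_uri,
        # Audience is the MCP server itself; Graph access comes later via OBO
        "scope": f"api://{client_id}/access offline_access",
    }
    return url, body
```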

Components:

- FastMCP Server - Python async HTTP server handling MCP protocol
- AzureProvider - FastMCP's built-in Azure AD OAuth integration
- OBOAuthenticator - Custom module for On-Behalf-Of token exchange (auth.py)
- Token Storage Backend - Pluggable storage (memory/disk/DynamoDB)
- GraphClient - HTTP client wrapper with automatic token exchange
- Azure AD - Identity provider, OAuth authorization server


Tech Stack#

Backend: Python 3.11, FastMCP 3.0+, Starlette (ASGI)
HTTP Client: httpx (async)
OAuth Library: MSAL (Microsoft Authentication Library) 1.28+
Token Storage: py-key-value-aio (supports memory/disk/DynamoDB)
Infrastructure: AWS ECS Fargate, Application Load Balancer
Secret Management: AWS SSM Parameter Store (encrypted)
Deployment: Docker containers, GitHub Actions CI/CD


OAuth 2.0 Flow Implementation#

Authorization Code Flow#

Protocol: OAuth 2.0 Authorization Code Grant (RFC 6749)

Note on PKCE: Public clients connecting to this MCP server (browsers, Claude Desktop, VS Code Copilot) use PKCE (RFC 7636) to secure their authorization code exchange — this is a client-side concern handled automatically by FastMCP's AzureProvider. This spec covers the server-side authorization code flow only: the MCP server is a confidential client with a client_secret, so PKCE is not required for the server → Azure AD leg.

Endpoints:

- Authorization: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize
- Token: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token

Required Parameters:

client_id:       Azure AD application ID for MCP server
client_secret:   Securely stored in AWS SSM Parameter Store
tenant_id:       Novo Nordisk Azure AD tenant
redirect_uri:    {mcp_base_url}/oauth/callback
scope:           api://{client_id}/access offline_access
response_type:   code

Implementation: Handled automatically by FastMCP's AzureProvider class in server.py.
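
A minimal wiring sketch of that setup, assuming FastMCP's `AzureProvider` constructor (consult the FastMCP documentation for the exact signature; the placeholder values mirror the environment variables listed under Configuration):

```python
from fastmcp import FastMCP
from fastmcp.server.auth.providers.azure import AzureProvider

auth = AzureProvider(
    client_id="<azure-app-id>",         # MCP_SERVER_AZURE_CLIENT_ID
    client_secret="<secret-from-ssm>",  # MCP_SERVER_AZURE_CLIENT_SECRET
    tenant_id="<tenant-id>",            # MCP_SERVER_AZURE_TENANT_ID
    base_url="https://outlook.connectors.novo-genai.com",
    required_scopes=["access"],         # FastMCP adds the api://{client_id}/ prefix
)

mcp = FastMCP("outlook-mcp", auth=auth)
```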

On-Behalf-Of (OBO) Token Exchange#

Purpose: Exchange user's MCP access token for Microsoft Graph API token while preserving user identity and permissions.

Protocol: OAuth 2.0 On-Behalf-Of flow (Microsoft documentation)

Token Endpoint: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token

Request Parameters:

grant_type:          urn:ietf:params:oauth:grant-type:jwt-bearer
client_id:           {mcp_server_client_id}
client_secret:       {mcp_server_client_secret}
assertion:           {user_access_token}
scope:               https://graph.microsoft.com/Mail.Read 
                     https://graph.microsoft.com/Calendars.Read
                     https://graph.microsoft.com/MailboxSettings.Read
                     offline_access
requested_token_use: on_behalf_of

Implementation: Custom OBOAuthenticator class in auth.py:

import httpx

async def exchange_token_via_obo(
    user_token: str,
    tenant_id: str,
    client_id: str,
    client_secret: str,
    target_scope: str,
) -> str:
    """Exchange user token for Graph API token using OBO flow (sketch)."""
    # Full implementation: connectors/mcps/outlook-mcp/src/outlook_mcp/auth.py
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    data = {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,
        "scope": target_scope,
        "requested_token_use": "on_behalf_of",
    }
    async with httpx.AsyncClient(timeout=30.0) as http:
        resp = await http.post(url, data=data)
        resp.raise_for_status()  # 4xx/5xx from Azure AD are logged and re-raised
        return resp.json()["access_token"]

Key Design Decisions:

- Stateless OBO - No server-side caching of Graph tokens (always exchange on-demand)
- User permission preservation - Graph token inherits user's Azure AD RBAC permissions
- Error handling - HTTP 4xx/5xx responses from Azure AD are logged and re-raised
- Timeout - 30-second timeout for token exchange requests


Token Storage#

Storage Backends#

Three pluggable options configured via STORAGE_TYPE environment variable:

1. Memory (Default for Local Dev)#

storage_backend = MemoryStore()
- Use case: Local development only
- Persistence: None - tokens lost on restart
- Performance: Fastest (in-process)
- Security: Limited to process memory

2. Disk#

storage_backend = DiskStore(directory="./.oauth_storage")
- Use case: Single-instance deployments, development
- Persistence: Survives restarts
- Performance: Fast (local filesystem I/O)
- Security: File-level encryption at rest (OS-dependent)
- Location: Configurable via STORAGE_DISK_DIRECTORY

3. DynamoDB (Production)#

storage_backend = DynamoDBStore(
    table_name="mcp-oauth-storage-outlook-mcp",
    region_name="eu-central-1"
)
- Use case: Multi-instance production deployments
- Persistence: Durable, replicated
- Performance: Network latency (~10-50ms)
- Security: Encryption at rest (AWS KMS), IAM-based access control
- Scaling: Auto-scales with AWS managed service
- Configuration: STORAGE_DYNAMODB_TABLE_NAME, STORAGE_DYNAMODB_REGION

Implementation Reference: connectors/mcps/outlook-mcp/src/outlook_mcp/server.py:create_storage_backend()
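
The selection logic can be sketched as a pure dispatch on STORAGE_TYPE. This hypothetical `resolve_storage_config` helper only mirrors the branching; the real factory in `server.py:create_storage_backend()` returns py-key-value-aio store instances rather than plain dicts.

```python
def resolve_storage_config(env: dict) -> dict:
    """Map STORAGE_TYPE (memory/disk/dynamodb) to a backend description."""
    storage_type = env.get("STORAGE_TYPE", "memory").lower()
    if storage_type == "memory":
        return {"backend": "MemoryStore"}
    if storage_type == "disk":
        return {
            "backend": "DiskStore",
            "directory": env.get("STORAGE_DISK_DIRECTORY", "./.oauth_storage"),
        }
    if storage_type == "dynamodb":
        return {
            "backend": "DynamoDBStore",
            "table_name": env["STORAGE_DYNAMODB_TABLE_NAME"],
            "region_name": env.get("STORAGE_DYNAMODB_REGION", "eu-central-1"),
        }
    raise ValueError(f"Unknown STORAGE_TYPE: {storage_type}")
```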

Token Lifecycle#

Access Token:

- Lifetime: 1 hour (Azure AD default)
- Storage: Stored in configured backend after authorization code exchange
- Refresh: Automatically refreshed when expired using the refresh token

Refresh Token:

- Lifetime: 90 days (Azure AD default, configurable via conditional access policies)
- Storage: Stored alongside the access token
- Usage: Used to obtain a new access token without user re-authentication
- Revocation: User logout, password change, or Azure AD policy triggers revocation

Graph Token (OBO):

- Lifetime: 1 hour (Azure AD default)
- Storage: NOT stored - always exchanged on-demand from the user token
- Refresh: New OBO exchange performed when needed (no caching)

Session Expiry Handling:

- When the refresh token expires → user redirected to Azure AD login
- When the access token expires → automatic refresh using the refresh token (transparent to user)
- When OBO exchange fails → HTTP 401 returned to client with error details


Security Design#

Authentication Security#

Multi-Factor Authentication (MFA):

- Enforced by Azure AD conditional access policies
- MCP server respects Azure AD authentication decisions
- No server-side MFA logic required

Conditional Access:

- IP restrictions, device compliance, trusted networks enforced by Azure AD
- MCP server cannot bypass Azure AD policies
- Failed conditional access checks prevent token issuance

Scope Validation:

- Required scopes: api://{client_id}/access, offline_access
- Graph scopes: Mail.Read, Calendars.Read, MailboxSettings.Read
- User must consent to all required scopes during login
- Azure AD RBAC honored - token scoped to user's permissions only

Token Security#

Secret Management:

- Client secrets stored in AWS SSM Parameter Store
- Secrets encrypted at rest with AWS KMS
- Retrieved at runtime via IAM role (no hardcoded secrets)
- Never logged or exposed in HTTP responses
- Implementation: settings.mcp_server_azure_client_secret (SecretStr type)

Token Protection:

- Access tokens never logged (Pydantic SecretStr type)
- Refresh tokens stored encrypted in DynamoDB at rest
- Tokens transmitted only over TLS 1.3
- No token caching in GraphClient (stateless OBO)
- Token expiry honored - no use of expired tokens

Transport Security:

- TLS 1.3 enforced on Application Load Balancer
- HTTPS only - HTTP requests redirected to HTTPS
- Certificate: AWS Certificate Manager (ACM) wildcard cert for *.connectors.novo-genai.com

Audit Logging#

Authentication Events Logged:

- User identity (email/UPN)
- Timestamp (ISO 8601 UTC)
- Result (success/failure)
- IP address (from X-Forwarded-For header)
- MCP endpoint accessed

Logged via Python logging module:

logger.info("User %s authenticated successfully from %s", user_email, ip_address)
logger.error("OBO token exchange failed: %s", error_message)

Log Destination: AWS CloudWatch Logs (centralized)

NOT Logged:

- Access tokens, refresh tokens, client secrets
- Authorization codes
- Sensitive request/response bodies
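
One way to enforce that rule mechanically is a redaction helper applied to any dict before it reaches the logger. The key list and function name here are assumptions for the sketch:

```python
SENSITIVE_KEYS = {"access_token", "refresh_token", "client_secret", "assertion", "code"}

def redact_for_logging(params: dict) -> dict:
    """Return a copy of a request/response dict safe to emit to CloudWatch."""
    return {k: ("***REDACTED***" if k in SENSITIVE_KEYS else v) for k, v in params.items()}
```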


API Design#

Authentication Middleware#

FastMCP's AzureProvider automatically injects authentication middleware that:

1. Intercepts all MCP tool requests
2. Validates the bearer token in the Authorization header
3. Extracts the token and makes it available via get_access_token()
4. Returns HTTP 401 if the token is missing or invalid

Unauthenticated Endpoints:

- /health - Health check (bypasses auth)
- /oauth/authorize - OAuth initiation
- /oauth/callback - OAuth callback

GraphClient Pattern#

All Microsoft Graph API calls use GraphClient which encapsulates OBO token exchange:

client = _require_graph_client()  # Gets user token, returns GraphClient
response = await client.make_authenticated_request(
    url="https://graph.microsoft.com/v1.0/me/messages",
    method="GET"
)

GraphClient automatically:

1. Extracts the user token from the request context
2. Performs an OBO exchange to get a Graph token
3. Adds the Authorization: Bearer {graph_token} header
4. Makes the HTTP request to the Graph API
5. Returns the JSON response

Error Handling:

- HTTP 401 from Graph API → OBO exchange may have failed, re-raise
- HTTP 403 → User lacks permission, re-raise
- HTTP 429 → Rate limited, re-raise (client should retry with backoff)
- Network errors → httpx.RequestError, logged and re-raised
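
The error-handling rules can be summarized as a status classification. This mapping is a hypothetical sketch; GraphClient itself re-raises httpx exceptions rather than returning strings:

```python
def classify_graph_error(status_code: int) -> str:
    """Classify a Graph API error status per the handling rules above."""
    if status_code == 401:
        return "obo_exchange_failed"  # re-raise; client must re-authenticate
    if status_code == 403:
        return "permission_denied"    # re-raise; user lacks delegated permission
    if status_code == 429:
        return "rate_limited"         # re-raise; client retries with backoff
    return "server_error" if status_code >= 500 else "client_error"
```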


Configuration#

Environment Variables#

Required (Production):

ENABLE_OAUTH=true
MCP_SERVER_AZURE_CLIENT_ID=<azure-app-id>
MCP_SERVER_AZURE_CLIENT_SECRET=<secret-from-ssm>
MCP_SERVER_AZURE_TENANT_ID=<tenant-id>
MCP_BASE_URL=https://outlook.connectors.novo-genai.com
STORAGE_TYPE=dynamodb
STORAGE_DYNAMODB_TABLE_NAME=mcp-oauth-storage-outlook-mcp
STORAGE_DYNAMODB_REGION=eu-central-1
CACHE_TYPE=dynamodb
CACHE_DYNAMODB_TABLE_NAME=cache-mcp-outlook-prod
MCP_SERVER_RATE_LIMIT_RPM=60  # Max tool invocations per user per minute (0 = disabled)

Development Only:

ENABLE_OAUTH=false  # Skips OAuth, uses debug token
GRAPH_DEBUG_TOKEN=$(az account get-access-token --resource https://graph.microsoft.com --query accessToken -o tsv)

Graph Scopes (Space-Separated):

GRAPH_SCOPES="https://graph.microsoft.com/Mail.Read https://graph.microsoft.com/Calendars.Read https://graph.microsoft.com/MailboxSettings.Read offline_access"

FastMCP Server Scopes (Comma-Separated):

FASTMCP_SERVER_AUTH_AZURE_REQUIRED_SCOPES="access"  # Unprefixed, FastMCP adds api:// prefix

Implementation: connectors/mcps/outlook-mcp/src/outlook_mcp/config.py
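
Note the different separators for the two scope variables above. The parsing can be sketched as follows; the helper names are illustrative, not the config.py API:

```python
def parse_graph_scopes(raw: str) -> list:
    """GRAPH_SCOPES is space-separated."""
    return raw.split()

def parse_required_scopes(raw: str, client_id: str) -> list:
    """FASTMCP_SERVER_AUTH_AZURE_REQUIRED_SCOPES is comma-separated and
    unprefixed; FastMCP expands each entry to api://{client_id}/{scope}."""
    return [f"api://{client_id}/{s.strip()}" for s in raw.split(",") if s.strip()]
```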


Deployment Architecture#

AWS ECS Fargate#

Service Configuration:

- Container: Python 3.11 slim image with FastMCP app
- Task Definition: 0.5 vCPU, 1 GB memory
- Auto-scaling: 2-10 tasks based on CPU/memory
- Health Check: /health endpoint (15s interval)

Network:

- VPC: Private subnets (no direct internet access)
- Load Balancer: Public Application Load Balancer (ALB)
- Target Group: Port 8000 (FastMCP HTTP server)
- ALB Rule: Host: outlook.connectors.novo-genai.com → target group

Secrets Injection:

- AWS SSM Parameter Store parameters injected as environment variables at container startup
- IAM task role grants ssm:GetParameter permission
- No secrets in task definition or ECR image

CI/CD Pipeline#

Trigger: Push to main branch or manual workflow dispatch

Steps:

1. Checkout code
2. Build Docker image (multi-stage build for size optimization)
3. Push to AWS ECR
4. Update ECS service with new image
5. Wait for deployment to stabilize (health checks pass)

Implementation: .github/workflows/deploy-outlook-mcp.yml


Performance Considerations#

Token Exchange Latency#

OBO Exchange Time: ~200-500ms (network call to Azure AD)

Optimization Strategy:

- No caching of Graph tokens (security > latency)
- Future optimization: short-lived in-memory cache (5-10 min TTL) if latency becomes an issue

Storage Backend Performance#

| Backend  | Read Latency | Write Latency | Durability | Multi-Instance |
|----------|--------------|---------------|------------|----------------|
| Memory   | <1ms         | <1ms          | None       | No             |
| Disk     | 1-5ms        | 1-5ms         | Yes        | No             |
| DynamoDB | 10-50ms      | 10-50ms       | Yes        | Yes            |

Recommendation: DynamoDB for production despite higher latency (required for multi-instance)

API Rate Limits#

Microsoft Graph API:

- Mail: 10,000 requests per 10 minutes per user
- Calendar: 10,000 requests per 10 minutes per user

Mitigation:

- Respect the Retry-After header on HTTP 429 responses from the Graph API
- Per-user rate limiting enforced server-side via RateLimitMiddleware (sliding window, keyed on the Azure AD oid claim). Configured via MCP_SERVER_RATE_LIMIT_RPM (default: 60 req/min). Returns HTTP 429 with Retry-After on breach. See server.py in each connector.


Monitoring & Observability#

Metrics#

Logged to CloudWatch:

- Authentication success/failure rate
- OBO token exchange latency
- Graph API request latency
- HTTP error rates (4xx, 5xx)

Alerts (Configured in Terraform):

- Error rate > 5% for 5 minutes
- OBO exchange failures > 10 in 5 minutes
- Health check failures (ECS target unhealthy)

Logs#

Log Format: JSON structured logs

Key Events:

- OBO token exchange successful
- OBO token exchange failed: {status_code}
- User authenticated: {user_email}
- GraphClient request: {method} {url} -> {status}

Log Level:

- DEBUG: Token exchange details (no sensitive data)
- INFO: Successful operations
- ERROR: Failed operations with context


Testing Strategy#

Unit Tests#

test_auth.py:

- OBO token exchange success
- OBO token exchange failure (401, 403, 500)
- Timeout handling
- SecretStr handling

test_server.py:

- GraphClient initialization (OAuth enabled vs disabled)
- Debug token fallback
- Error handling when no token available

Integration Tests#

Manual Testing (Local Dev):

1. Run with ENABLE_OAUTH=false and GRAPH_DEBUG_TOKEN
2. Call MCP tools (e.g., list_emails)
3. Verify Graph API calls succeed

Production Testing:

1. Deploy to dev environment
2. Authenticate via Claude Desktop
3. Call MCP tools, verify the OAuth flow end-to-end
4. Check CloudWatch logs for OBO exchanges


Key Design Decisions#

Why On-Behalf-Of (OBO) Instead of Service Principal?#

Decision: Use OBO flow, not service principal with application permissions.

Rationale:

- User permission preservation - Graph API calls inherit the user's Azure AD RBAC, ensuring data access is scoped to what the user can see
- Audit trail - All Graph API calls attributed to the user, not a service account
- Zero trust - No elevated service account with broad permissions
- Compliance - Aligns with the least-privilege principle required for GxP systems

Trade-off: Additional latency (~200-500ms per OBO exchange) vs security and compliance

Why No Graph Token Caching?#

Decision: Always exchange user token for Graph token on-demand, no caching.

Rationale:

- Security - Reduces token lifetime exposure (the Graph token only exists during the API call)
- Simplicity - No cache invalidation logic, no stale-token issues
- Stateless - Easier to scale horizontally (no shared cache needed)

Trade-off: Higher latency vs security and simplicity

Future Optimization: If latency becomes an issue, implement a short-lived (5-10 min TTL) in-memory cache with proper invalidation

Why DynamoDB for Token Storage in Production?#

Decision: Use DynamoDB for OAuth token storage (not Redis/ElastiCache).

Rationale:

- Serverless - No cluster management, auto-scaling
- Durability - Replicated across AZs, no data loss on instance failure
- Cost - Pay-per-request pricing, cheaper than Redis for low-volume workloads
- Security - Encryption at rest with KMS, IAM-based access control

Trade-off: Higher latency (10-50ms vs <1ms for Redis) vs operational simplicity


URS: @URS:OAuthAuthentication (features/oauth-authentication.feature)

Risk Assessment: RISK:OAuthAuthentication (docs/risk-oauth-authentication.md)

Scenarios Addressed:

- OAuth2 authentication flow with Azure AD
- OBO token exchange for Microsoft Graph
- Secure token storage (memory/disk/DynamoDB)
- Token refresh and session expiry handling
- Conditional access policy enforcement
- Client secret management via AWS SSM


Version: 1.0
Date: 2024-04-16
Author: AILab
Approved by: Pending PR review