Architecture and Data Flow
Relevant controls: SC.02.01, SC.02.04, SC.09.01, SC.09.02, SC.09.04, DI.04.01
Compliance reference
This document supports ITRA controls SC-02 Network Security and SC-09 Asset Management.
1. System Overview
AI connectors is a platform of Model Context Protocol (MCP) servers that expose Novo Nordisk data sources as tools for AI assistants (e.g. Claude). Each MCP server is a Python/FastMCP application running on AWS ECS Fargate, accessed over HTTPS and protected by Azure AD OAuth 2.0. Users authenticate with their corporate identity, and each MCP server calls downstream APIs (Microsoft Graph, Databricks) strictly on behalf of that user via the OAuth 2.0 On-Behalf-Of (OBO) flow.
All MCP servers are read-only. No MCP has write, send, or delete permissions on any downstream system.
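The OBO exchange described above can be sketched as the form body an MCP server would POST to the Azure AD v2.0 token endpoint. This is a minimal illustration, not the platform's actual code: the function names, tenant ID, client ID, secret, and scope are placeholders, and only the form field names follow the Microsoft identity platform's documented OBO request.

```python
from urllib.parse import urlencode

# Grant type used by the Microsoft identity platform for the OBO flow.
OBO_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def token_endpoint(tenant_id: str) -> str:
    """Return the Azure AD v2.0 token endpoint for a tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_obo_form(client_id: str, client_secret: str,
                   user_token: str, scope: str) -> dict:
    """Build the form fields that exchange the user's token A for token B."""
    return {
        "grant_type": OBO_GRANT_TYPE,
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,  # token A, as presented by the AI client
        "scope": scope,           # downstream API scope, e.g. Microsoft Graph
        "requested_token_use": "on_behalf_of",
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    form = build_obo_form("hypothetical-client-id", "hypothetical-secret",
                          "<token A>", "https://graph.microsoft.com/.default")
    print(token_endpoint("hypothetical-tenant-id"))
    print(urlencode(form))
```

Because token B is issued for the calling user, the downstream API applies that user's own permissions, which is what keeps each MCP server from seeing more than the user could.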
Current MCP servers:
| MCP | Data source | URL (prod) |
|---|---|---|
| SharePoint | Microsoft SharePoint / OneDrive | sharepoint.connectors.novo-genai.com |
| Outlook | Microsoft Exchange (mail + calendar) | outlook.connectors.novo-genai.com |
| Teams | Microsoft Teams (channels, messages, chats) | teams.connectors.novo-genai.com |
| Databricks | Azure Databricks workspaces | databricks.connectors.novo-genai.com |
2. Component Inventory
Update these tables whenever components are added, changed, or retired.
Shared Infrastructure (one per environment)
| Component | Type | Region | Dev | Prod | Notes |
|---|---|---|---|---|---|
| AWS Account | AWS | n/a | 094069622854 | 673034950531 | Separate accounts per env |
| VPC | AWS VPC | eu-central-1 | 10.0.0.0/16 | 10.0.0.0/16 | 3 public + 3 private subnets across 3 AZs |
| Public ALB | Application Load Balancer | eu-central-1 | aiconnectors-public | aiconnectors-public | Internet-facing; TLS termination |
| ACM Certificate | AWS ACM | eu-central-1 | *.dev.connectors.novo-genai.com | *.connectors.novo-genai.com | Wildcard; DNS-validated |
| ECS Cluster | AWS ECS Fargate | eu-central-1 | aiconnectors-dev-aiconnectors | aiconnectors-prod-aiconnectors | Fargate only |
| Route53 Zone | AWS Route53 | Global | dev.connectors.novo-genai.com (Z0335301LNP1JONT36JP) | connectors.novo-genai.com (Z0176827309TAYY1VSM5G) | |
| Audit S3 Bucket | AWS S3 | eu-central-1 | nn-aiconnectors-audit-dev | nn-aiconnectors-audit-prod | AES256; versioned; 90d→Glacier; 365d→expire |
| Audit Firehose | AWS Kinesis Firehose | eu-central-1 | aiconnectors-audit-dev | aiconnectors-audit-prod | NDJSON; 5 MB / 60s buffer → S3 |
| Alarms SNS Topic | AWS SNS | eu-central-1 | aiconnectors-alarms-dev | aiconnectors-alarms-prod | Email subscribers: team |
| SSM Parameter Store | AWS SSM | eu-central-1 | /aiconnectors/dev/… | /aiconnectors/prod/… | SecureString (KMS-encrypted) |
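The audit bucket's retention policy (transition to Glacier after 90 days, expiry after 365 days) could be expressed as an S3 lifecycle configuration. The sketch below builds a dictionary in the shape accepted by S3's `PutBucketLifecycleConfiguration` API; the rule ID and the function name are hypothetical, and the platform's real rules are defined in its infrastructure code, which is not shown here.

```python
def audit_lifecycle_rules(transition_days: int = 90, expire_days: int = 365) -> dict:
    """Build a lifecycle configuration: move to Glacier, then expire."""
    if transition_days >= expire_days:
        raise ValueError("objects must transition before they expire")
    return {
        "Rules": [
            {
                "ID": "audit-retention",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_days},
            }
        ]
    }


if __name__ == "__main__":
    print(audit_lifecycle_rules())
```

On a versioned bucket, an `Expiration` action only places a delete marker on the current version, so a reviewer may also want to check whether noncurrent versions have their own expiry rule.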
Per-MCP Resources (one set per MCP per environment)
| Component | Type | Dev example | Prod example |
|---|---|---|---|
| ECR Repository | AWS ECR | 094069622854.dkr.ecr.eu-central-1.amazonaws.com/mcp-sharepoint | 673034950531.dkr.ecr.eu-central-1.amazonaws.com/mcp-sharepoint |
| ECS Service | AWS ECS | mcp-sharepoint-main-svc | mcp-sharepoint-main-svc |
| DynamoDB Table | AWS DynamoDB | mcp-oauth-storage-mcp-sharepoint-dev | mcp-oauth-storage-mcp-sharepoint-prod |
| ECS Execution Role | AWS IAM | mcp-sharepoint-execution-role | mcp-sharepoint-execution-role |
| ECS Task Role | AWS IAM | mcp-sharepoint-task-role | mcp-sharepoint-task-role |
| Route53 Record | A (alias) | sharepoint.dev.connectors.novo-genai.com | sharepoint.connectors.novo-genai.com |
| GitHub Actions Role | AWS IAM | mcp-sharepoint-github-actions | mcp-sharepoint-github-actions |
External Dependencies
| Component | Type | Owner | Purpose |
|---|---|---|---|
| Azure AD Tenant | Identity provider | Microsoft / Novo Nordisk IT | OAuth 2.0 authentication and OBO token exchange |
| Microsoft Graph API | REST API | Microsoft | SharePoint, Outlook, Teams data |
| Azure Databricks | Data platform | Novo Nordisk Data & AI | Databricks workspace access |
| GitHub Actions | CI/CD | GitHub | Build, push, deploy pipeline |
3. Network Architecture
VPC Layout
AWS VPC — 10.0.0.0/16 (eu-central-1)
│
├── Public Subnets (3 AZs)
│   ├── 10.0.64.0/20 (AZ-a)
│   ├── 10.0.80.0/20 (AZ-b)
│   ├── 10.0.96.0/20 (AZ-c)
│   ├── Public ALB ← Internet traffic (HTTPS/443)
│   └── NAT Gateway ← Outbound from private subnets
│
└── Private Subnets (3 AZs)
    ├── 10.0.0.0/20 (AZ-a)
    ├── 10.0.16.0/20 (AZ-b)
    ├── 10.0.32.0/20 (AZ-c)
    └── ECS Fargate Tasks ← MCP containers (port 80, internal only)
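When the layout above changes, a quick consistency check helps confirm that every subnet still falls inside the VPC CIDR and that no two subnets overlap. The sketch below uses only Python's standard `ipaddress` module; the CIDRs are copied from the diagram, while the function name is an illustrative choice.

```python
import ipaddress
from itertools import combinations

VPC = ipaddress.ip_network("10.0.0.0/16")
PUBLIC = [ipaddress.ip_network(c) for c in
          ("10.0.64.0/20", "10.0.80.0/20", "10.0.96.0/20")]
PRIVATE = [ipaddress.ip_network(c) for c in
           ("10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20")]


def find_layout_problems(vpc, subnets):
    """Return human-readable problems; an empty list means the layout is sane."""
    problems = []
    for net in subnets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside {vpc}")
    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            problems.append(f"{a} overlaps {b}")
    return problems


if __name__ == "__main__":
    print(find_layout_problems(VPC, PUBLIC + PRIVATE))
```

Each /20 holds 4,096 addresses, so the six subnets use 24,576 of the 65,536 addresses in the /16 and leave room for further subnets.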
Traffic Routing
| Traffic path | Protocol | Notes |
|---|---|---|
| Internet → ALB | HTTPS / 443 | TLS terminated at ALB; ACM wildcard cert |
| ALB → ECS task | HTTP / 80 | Internal to VPC; ECS tasks in private subnets |
| ECS → Azure AD / Microsoft Graph | HTTPS / 443 | Via NAT Gateway → Internet |
| ECS → Databricks | HTTPS / 443 | Via NAT Gateway → Internet |
| ECS → DynamoDB | HTTPS / 443 | Via VPC endpoint (no NAT) |
| ECS → Kinesis Firehose | HTTPS / 443 | Via VPC |
| ECS → SSM | HTTPS / 443 | Via VPC |
| ECS → ECR | HTTPS / 443 | Via S3 Gateway VPC Endpoint (no NAT cost) |
ALB Host-Based Routing
The single shared ALB routes requests to the correct MCP based on the Host header:
| Host header | Target ECS service | ALB rule priority |
|---|---|---|
| sharepoint.{env}connectors.novo-genai.com | mcp-sharepoint-main-svc | 100 |
| outlook.{env}connectors.novo-genai.com | mcp-outlook-main-svc | 110 |
| teams.{env}connectors.novo-genai.com | mcp-teams-main-svc | 120 |
| databricks.{env}connectors.novo-genai.com | mcp-databricks-main-svc | 130 |
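The routing table above can be modelled as a lookup from the Host header to a target service. This is a sketch for reasoning about the rules, not how the ALB is configured: the function name is hypothetical, and it assumes the dev hostnames carry a `dev.` label before the base domain, as the Route53 records in the component inventory suggest.

```python
from typing import Optional, Tuple

BASE_DOMAIN = "connectors.novo-genai.com"

# MCP name -> (target ECS service, ALB rule priority), from the table above.
SERVICES = {
    "sharepoint": ("mcp-sharepoint-main-svc", 100),
    "outlook": ("mcp-outlook-main-svc", 110),
    "teams": ("mcp-teams-main-svc", 120),
    "databricks": ("mcp-databricks-main-svc", 130),
}


def resolve(host: str) -> Optional[Tuple[str, str, int]]:
    """Map a Host header to (environment, service, priority), or None."""
    host = host.lower().split(":")[0]  # drop an optional port
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    labels = host[: -len(suffix)].split(".")
    if len(labels) == 1:
        env = "prod"
    elif len(labels) == 2 and labels[1] == "dev":
        env = "dev"
    else:
        return None
    target = SERVICES.get(labels[0])
    if target is None:
        return None
    return (env, target[0], target[1])


if __name__ == "__main__":
    print(resolve("teams.dev.connectors.novo-genai.com"))
```

In the real deployment dev and prod each have their own ALB in separate accounts, so the environment is fixed by which ALB receives the request. The sketch infers it from the hostname only to keep the example in one function.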
4. Data Flow
Common Flow (all MCPs)
1. AI Client (e.g. Claude)
   │ HTTPS POST /mcp with header Authorization: Bearer <Azure AD token A>
   ▼
2. Route53 DNS → Public ALB
   │ TLS termination; host-based routing
   ▼
3. ECS Fargate Task (FastMCP app, port 80)
   │ Validates token A (Azure AD JWKS)
   │ Exchanges token A → token B via OBO (Azure AD)
   │ Caches token B in DynamoDB (TTL ~1h)
   ├──► Azure AD token endpoint [HTTPS/443]
   ├──► Microsoft Graph API / Databricks [HTTPS/443, token B]
   └──► Kinesis Firehose [audit record per tool call]
        │
        ▼
        S3 (NDJSON audit log, partitioned by date)
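The last step of the flow writes one audit record per tool call as NDJSON, meaning one JSON object per line. The sketch below shows how such a record might be serialized before it is handed to Firehose. The field names (`timestamp`, `user`, `mcp`, `tool`) and both function names are assumptions made for illustration; the real audit schema is not described in this document.

```python
import json
from datetime import datetime, timezone


def audit_record(user: str, mcp: str, tool: str, when: datetime) -> dict:
    """Build a hypothetical audit record for a single tool call."""
    return {
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "user": user,
        "mcp": mcp,
        "tool": tool,
    }


def to_ndjson_line(record: dict) -> bytes:
    """Serialize one record as a single newline-terminated JSON line."""
    # json.dumps escapes embedded newlines, so the only raw newline
    # in the output is the record terminator.
    text = json.dumps(record, separators=(",", ":"), sort_keys=True)
    return (text + "\n").encode("utf-8")


if __name__ == "__main__":
    rec = audit_record("user@example.com", "sharepoint", "search",
                       datetime.now(timezone.utc))
    print(to_ndjson_line(rec))
```

Terminating every record with a newline matters because Firehose concatenates buffered records as-is; without the terminator, records delivered to S3 would run together on one line.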
5. Environment Separation
Dev and prod are fully isolated at every layer. There is no shared infrastructure, shared credentials, or shared network path between environments.
| Aspect | Dev | Prod |
|---|---|---|
| AWS Account | 094069622854 (AWS-NN-AIconnectors-DEV) | 673034950531 (AWS-NN-AIconnectors-PRD) |
| Region | eu-central-1 | eu-central-1 |
| DNS zone | dev.connectors.novo-genai.com | connectors.novo-genai.com |
| ACM cert | *.dev.connectors.novo-genai.com | *.connectors.novo-genai.com |
| ECS cluster | aiconnectors-dev-aiconnectors | aiconnectors-prod-aiconnectors |
| ECR registries | 094069622854.dkr.ecr.eu-central-1.amazonaws.com | 673034950531.dkr.ecr.eu-central-1.amazonaws.com |
| Azure AD app regs | Separate app registrations (RITM provisioned) | Separate app registrations (RITM provisioned) |
| DynamoDB tables | mcp-oauth-storage-mcp-{name}-dev | mcp-oauth-storage-mcp-{name}-prod |
| SSM secrets | /aiconnectors/dev/{mcp}/… | /aiconnectors/prod/{mcp}/… |
| Audit bucket | nn-aiconnectors-audit-dev | nn-aiconnectors-audit-prod |
| Prod deploy gate | n/a | environment: Production gate in GitHub Actions (requires approval) |
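The naming patterns in the table above are regular enough to derive programmatically, which is handy when checking that a dev resource never points at a prod identifier. The following sketch encodes those patterns; the `McpEnv` dataclass and the `describe` function are hypothetical helpers, not part of the platform.

```python
from dataclasses import dataclass, astuple

ACCOUNTS = {"dev": "094069622854", "prod": "673034950531"}


@dataclass(frozen=True)
class McpEnv:
    account_id: str
    hostname: str
    oauth_table: str
    ssm_prefix: str
    audit_bucket: str


def describe(env: str, mcp: str) -> McpEnv:
    """Derive per-environment resource names for one MCP."""
    if env not in ACCOUNTS:
        raise ValueError(f"unknown environment: {env}")
    # Dev hostnames sit under the dev zone; prod hostnames under the apex zone.
    zone = "dev.connectors.novo-genai.com" if env == "dev" else "connectors.novo-genai.com"
    return McpEnv(
        account_id=ACCOUNTS[env],
        hostname=f"{mcp}.{zone}",
        oauth_table=f"mcp-oauth-storage-mcp-{mcp}-{env}",
        ssm_prefix=f"/aiconnectors/{env}/{mcp}/",
        audit_bucket=f"nn-aiconnectors-audit-{env}",
    )


if __name__ == "__main__":
    print(describe("dev", "sharepoint"))
    print(describe("prod", "sharepoint"))
```

Comparing the two results field by field gives a cheap regression check: no derived identifier should be equal across environments.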
6. ECS Task Configuration
Canonical task configuration (CPU, memory, capacity provider, desired count, environment variables) lives in the source files and is merged at deploy time by ecs/render.py:
connectors/mcps/<name>/ecs/task-definition.base.yml
connectors/mcps/<name>/ecs/env/dev.yml
connectors/mcps/<name>/ecs/env/prod.yml
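The merge of a base task definition with an environment overlay can be pictured as a recursive dictionary merge in which overlay values win. The sketch below illustrates that idea with plain dictionaries; it is not the contents of ecs/render.py, whose exact merge semantics (for example, how lists are combined) are not described here, and the keys and values in the usage example are made up.

```python
import copy


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a new dict: overlay values replace base values, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the overlay replace the base value outright.
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical values standing in for task-definition.base.yml and env/prod.yml.
    base = {"cpu": 256, "environment": {"LOG_LEVEL": "info", "REGION": "eu-central-1"}}
    prod = {"cpu": 512, "environment": {"LOG_LEVEL": "warning"}}
    print(deep_merge(base, prod))
```

Returning a fresh dictionary rather than mutating the base keeps one loaded base definition reusable for rendering both the dev and prod variants.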
The following security-relevant settings apply to all MCP containers and should be verified during compliance review:
| Setting | Value |
|---|---|
| Base image | python:3.13-slim |
| Health check endpoint | GET /health |
| Read-only root filesystem | Yes (SharePoint, Outlook); No (Teams, Databricks) |
| /tmp tmpfs | 64 MB, rw,noexec,nosuid (SharePoint, Outlook) |
| Secrets injection | SSM SecureString via ECS secrets (never plaintext env vars) |
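For the secrets-injection row, ECS container definitions accept a `secrets` list whose entries pair an environment variable name with a `valueFrom` reference to an SSM parameter. The sketch below builds such entries. The parameter names and environment variable names are hypothetical, because the actual paths under /aiconnectors/{env}/{mcp}/ are not listed in this document.

```python
def ssm_parameter_arn(region: str, account_id: str, name: str) -> str:
    """Build the ARN for an SSM parameter whose name starts with '/'."""
    if not name.startswith("/"):
        raise ValueError("expected a fully qualified parameter name")
    # The leading slash of the name doubles as the separator after 'parameter'.
    return f"arn:aws:ssm:{region}:{account_id}:parameter{name}"


def secrets_block(region: str, account_id: str, mapping: dict) -> list:
    """Turn {ENV_VAR: parameter_name} into ECS 'secrets' entries."""
    return [
        {"name": env_var, "valueFrom": ssm_parameter_arn(region, account_id, param)}
        for env_var, param in sorted(mapping.items())
    ]


if __name__ == "__main__":
    # Hypothetical variable and parameter names, for illustration only.
    print(secrets_block("eu-central-1", "094069622854", {
        "AZURE_CLIENT_SECRET": "/aiconnectors/dev/sharepoint/client-secret",
    }))
```

During a compliance review, the corresponding check is that sensitive values appear only under `secrets` and never under the container's plain `environment` list.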
7. CI/CD Pipeline
Each MCP is built and deployed by a shared reusable workflow (.github/workflows/tmpl-deploy-mcp.yml). The pipeline is triggered by changes to the MCP's source directory.
push to main
└── lint (ruff + black) ──► [blocks if failed]
    └── build
        └── docker build → push to dev ECR ({github.sha} tag)
            └── apply-infra-dev (terragrunt apply)
                └── deploy-dev (render task def → ECS deploy)
                    └── test-dev (BDD smoke tests)
                        └── [Production gate: manual approval]
                            └── deploy-prod (promote image dev ECR → prod ECR → ECS deploy)
Image tagging: {account}.dkr.ecr.eu-central-1.amazonaws.com/{image-name}:{github.sha}
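The tagging scheme makes every deployed image traceable to a single commit, and promotion to prod keeps the same tag while changing only the registry account. A small sketch of both steps follows; the function names are illustrative, and the check for a 40-character hexadecimal SHA is an assumption about what {github.sha} contains.

```python
import re

REGION = "eu-central-1"


def image_uri(account_id: str, image_name: str, sha: str, region: str = REGION) -> str:
    """Build the ECR image URI for a commit SHA."""
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError("expected a full 40-character commit SHA")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:{sha}"


def promote(dev_uri: str, prod_account_id: str) -> str:
    """Rewrite a dev image URI to the prod registry, keeping repository and tag."""
    _, marker, rest = dev_uri.partition(".dkr.ecr.")
    if not marker:
        raise ValueError("not an ECR image URI")
    return f"{prod_account_id}.dkr.ecr.{rest}"


if __name__ == "__main__":
    dev = image_uri("094069622854", "mcp-sharepoint", "a" * 40)
    print(dev)
    print(promote(dev, "673034950531"))
```

Because the tag is unchanged across environments, the image running in prod is the same build that passed the dev smoke tests.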
8. Integration Points
| Integration | Direction | Protocol / Port | Frequency | Data exchanged |
|---|---|---|---|---|
| Microsoft Graph API (graph.microsoft.com) | Outbound | HTTPS / 443 | Per tool call (on-demand) | SharePoint files/metadata, email, calendar, Teams messages |
| Azure AD token endpoint (login.microsoftonline.com) | Outbound | HTTPS / 443 | Per user session (then cached ~1h in DynamoDB) | OAuth2 access tokens |
| Azure Databricks workspaces | Outbound | HTTPS / 443 | Per tool call (on-demand) | SQL query results, workspace metadata |
| AWS Kinesis Firehose | Outbound | HTTPS / 443 | Per tool call | Audit NDJSON records |
| AWS DynamoDB | Outbound | HTTPS / 443 | Per tool call (read) + per token refresh (write) | OAuth token cache |
| AWS SSM Parameter Store | Outbound | HTTPS / 443 | At container startup | Azure AD client ID + client secret |
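The DynamoDB row reflects a read-through token cache: each tool call looks up a cached downstream token, and a new one is written only after a refresh. The sketch below shows one way such a cache item might be shaped. The attribute names, the SHA-256 key derivation, and the 60-second safety margin are all assumptions; the one fixed point is that DynamoDB's TTL feature expects a numeric attribute holding an epoch timestamp in seconds.

```python
import hashlib
import time


def cache_key(user_token: str) -> str:
    """Derive a fixed-length key so the raw user token is not used as the key."""
    return hashlib.sha256(user_token.encode("utf-8")).hexdigest()


def cache_item(user_token: str, downstream_token: str,
               expires_in: int, now: float) -> dict:
    """Build a cache item; 'expires_at' is epoch seconds, suitable for DynamoDB TTL."""
    return {
        "pk": cache_key(user_token),           # hypothetical partition key name
        "token": downstream_token,             # token B from the OBO exchange
        "expires_at": int(now) + expires_in,
    }


def is_fresh(item: dict, now: float, skew: int = 60) -> bool:
    """Treat a token as stale slightly before its real expiry."""
    return now < item["expires_at"] - skew


if __name__ == "__main__":
    item = cache_item("<token A>", "<token B>", expires_in=3600, now=time.time())
    print(is_fresh(item, now=time.time()))
```

An application-side freshness check is still needed because DynamoDB deletes expired items asynchronously, so an item can remain readable for some time after its TTL has passed.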
