Architecture and Data Flow

Relevant controls: SC.02.01, SC.02.04, SC.09.01, SC.09.02, SC.09.04, DI.04.01

Compliance reference

This document supports ITRA controls SC-02 Network Security and SC-09 Asset Management.


1. System Overview

AI connectors is a platform of Model Context Protocol (MCP) servers that expose Novo Nordisk data sources as tools for AI assistants (e.g. Claude). Each MCP server is a Python/FastMCP application running on AWS ECS Fargate, reached over HTTPS and protected by Azure AD OAuth 2.0. Users authenticate with their corporate identity, and each MCP server calls downstream APIs (Microsoft Graph, Databricks) strictly on behalf of that user via the OAuth 2.0 On-Behalf-Of (OBO) flow.
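
To make the OBO step concrete, here is a minimal sketch of the token-exchange request an MCP server could send to Azure AD. The form fields follow the Microsoft identity platform's documented OBO request; the function names, and how the servers in this platform actually issue the request, are assumptions.

```python
# Grant type the Microsoft identity platform defines for the OBO flow.
OBO_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def token_endpoint(tenant_id: str) -> str:
    """Return the v2.0 token endpoint for an Azure AD tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_obo_form(
    client_id: str, client_secret: str, user_token: str, scope: str
) -> dict[str, str]:
    """Build the form fields that exchange the user's incoming token
    for a downstream token issued on behalf of that same user.

    Hypothetical helper: the names here are illustrative only.
    """
    return {
        "grant_type": OBO_GRANT_TYPE,
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,  # the bearer token presented by the AI client
        "scope": scope,  # e.g. a Microsoft Graph scope
        "requested_token_use": "on_behalf_of",
    }
```

The resulting dictionary would be sent as an `application/x-www-form-urlencoded` POST body to `token_endpoint(tenant_id)`. Because the assertion is the user's own token, the downstream token carries that user's permissions and not the server's.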

All MCP servers are read-only. No MCP has write, send, or delete permissions on any downstream system.

[Figure: Architecture diagram]

Current MCP servers:

| MCP | Data source | URL (prod) |
|---|---|---|
| SharePoint | Microsoft SharePoint / OneDrive | sharepoint.connectors.novo-genai.com |
| Outlook | Microsoft Exchange (mail + calendar) | outlook.connectors.novo-genai.com |
| Teams | Microsoft Teams (channels, messages, chats) | teams.connectors.novo-genai.com |
| Databricks | Azure Databricks workspaces | databricks.connectors.novo-genai.com |

2. Component Inventory

Update this table whenever components are added, changed, or retired.

Shared Infrastructure (one per environment)

| Component | Type | Region | Dev | Prod | Notes |
|---|---|---|---|---|---|
| AWS Account | AWS | | 094069622854 | 673034950531 | Separate accounts per env |
| VPC | AWS VPC | eu-central-1 | 10.0.0.0/16 | 10.0.0.0/16 | 3 public + 3 private subnets across 3 AZs |
| Public ALB | Application Load Balancer | eu-central-1 | aiconnectors-public | aiconnectors-public | Internet-facing; TLS termination |
| ACM Certificate | AWS ACM | eu-central-1 | *.dev.connectors.novo-genai.com | *.connectors.novo-genai.com | Wildcard; DNS-validated |
| ECS Cluster | AWS ECS Fargate | eu-central-1 | aiconnectors-dev-aiconnectors | aiconnectors-prod-aiconnectors | Fargate only |
| Route53 Zone | AWS Route53 | Global | dev.connectors.novo-genai.com (Z0335301LNP1JONT36JP) | connectors.novo-genai.com (Z0176827309TAYY1VSM5G) | |
| Audit S3 Bucket | AWS S3 | eu-central-1 | nn-aiconnectors-audit-dev | nn-aiconnectors-audit-prod | AES256; versioned; 90d→Glacier; 365d→expire |
| Audit Firehose | AWS Kinesis Firehose | eu-central-1 | aiconnectors-audit-dev | aiconnectors-audit-prod | NDJSON; 5 MB / 60s buffer → S3 |
| Alarms SNS Topic | AWS SNS | eu-central-1 | aiconnectors-alarms-dev | aiconnectors-alarms-prod | Email subscribers: team |
| SSM Parameter Store | AWS SSM | eu-central-1 | /aiconnectors/dev/… | /aiconnectors/prod/… | SecureString (KMS-encrypted) |

Per-MCP Resources (one set per MCP per environment)

| Component | Type | Dev example | Prod example |
|---|---|---|---|
| ECR Repository | AWS ECR | 094069622854.dkr.ecr.eu-central-1.amazonaws.com/mcp-sharepoint | 673034950531.dkr.ecr.eu-central-1.amazonaws.com/mcp-sharepoint |
| ECS Service | AWS ECS | mcp-sharepoint-main-svc | mcp-sharepoint-main-svc |
| DynamoDB Table | AWS DynamoDB | mcp-oauth-storage-mcp-sharepoint-dev | mcp-oauth-storage-mcp-sharepoint-prod |
| ECS Execution Role | AWS IAM | mcp-sharepoint-execution-role | mcp-sharepoint-execution-role |
| ECS Task Role | AWS IAM | mcp-sharepoint-task-role | mcp-sharepoint-task-role |
| Route53 Record | A (alias) | sharepoint.dev.connectors.novo-genai.com | sharepoint.connectors.novo-genai.com |
| GitHub Actions Role | AWS IAM | mcp-sharepoint-github-actions | mcp-sharepoint-github-actions |
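
The per-MCP names above appear to follow a regular pattern. The sketch below derives them from an MCP name and an environment, which can help when auditing the inventory. It assumes the SharePoint examples generalize to the other MCPs; `resources_for` and `McpResources` are hypothetical names, not part of the platform's code.

```python
from dataclasses import dataclass

# Values taken from the inventory tables in this document.
ACCOUNT_IDS = {"dev": "094069622854", "prod": "673034950531"}
REGION = "eu-central-1"
BASE_DOMAIN = "connectors.novo-genai.com"


@dataclass(frozen=True)
class McpResources:
    ecr_repository: str
    ecs_service: str
    dynamodb_table: str
    task_role: str
    hostname: str


def resources_for(name: str, env: str) -> McpResources:
    """Derive the expected resource names for one MCP in one environment.

    Assumption: every MCP follows the same convention as the SharePoint rows.
    """
    account = ACCOUNT_IDS[env]  # raises KeyError for an unknown environment
    host_prefix = f"{name}.dev" if env == "dev" else name
    return McpResources(
        ecr_repository=f"{account}.dkr.ecr.{REGION}.amazonaws.com/mcp-{name}",
        ecs_service=f"mcp-{name}-main-svc",
        dynamodb_table=f"mcp-oauth-storage-mcp-{name}-{env}",
        task_role=f"mcp-{name}-task-role",
        hostname=f"{host_prefix}.{BASE_DOMAIN}",
    )
```

For example, `resources_for("sharepoint", "dev").dynamodb_table` yields `mcp-oauth-storage-mcp-sharepoint-dev`, matching the table row.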

External Dependencies

| Component | Type | Owner | Purpose |
|---|---|---|---|
| Azure AD Tenant | Identity provider | Microsoft / Novo Nordisk IT | OAuth 2.0 authentication and OBO token exchange |
| Microsoft Graph API | REST API | Microsoft | SharePoint, Outlook, Teams data |
| Azure Databricks | Data platform | Novo Nordisk Data & AI | Databricks workspace access |
| GitHub Actions | CI/CD | GitHub | Build, push, deploy pipeline |

3. Network Architecture

VPC Layout

AWS VPC — 10.0.0.0/16 (eu-central-1)
├── Public Subnets (3 AZs)
│   ├── 10.0.64.0/20  (AZ-a)
│   ├── 10.0.80.0/20  (AZ-b)
│   ├── 10.0.96.0/20  (AZ-c)
│   ├── Public ALB  ← Internet traffic (HTTPS/443)
│   └── NAT Gateway  ← Outbound from private subnets
└── Private Subnets (3 AZs)
    ├── 10.0.0.0/20   (AZ-a)
    ├── 10.0.16.0/20  (AZ-b)
    ├── 10.0.32.0/20  (AZ-c)
    └── ECS Fargate Tasks  ← MCP containers (port 80, internal only)
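
During a review, the subnet layout above can be checked mechanically. This sketch uses Python's standard `ipaddress` module to confirm that every subnet lies inside the VPC CIDR and that no two subnets overlap. `validate_layout` is a hypothetical helper written for illustration.

```python
from ipaddress import IPv4Network, ip_network
from itertools import combinations

VPC = ip_network("10.0.0.0/16")
PUBLIC = [ip_network(c) for c in ("10.0.64.0/20", "10.0.80.0/20", "10.0.96.0/20")]
PRIVATE = [ip_network(c) for c in ("10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20")]


def validate_layout(vpc: IPv4Network, subnets: list[IPv4Network]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the layout is consistent."""
    problems = []
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            problems.append(f"{a} overlaps {b}")
    return problems


if __name__ == "__main__":
    issues = validate_layout(VPC, PUBLIC + PRIVATE)
    print("layout OK" if not issues else "\n".join(issues))
```

Each /20 holds 4,096 addresses, so the six subnets use 24,576 of the VPC's 65,536 addresses and leave room for further subnets.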

Traffic Routing

| Traffic path | Protocol | Notes |
|---|---|---|
| Internet → ALB | HTTPS / 443 | TLS terminated at ALB; ACM wildcard cert |
| ALB → ECS task | HTTP / 80 | Internal to VPC; ECS tasks in private subnets |
| ECS → Azure AD / Microsoft Graph | HTTPS / 443 | Via NAT Gateway → Internet |
| ECS → Databricks | HTTPS / 443 | Via NAT Gateway → Internet |
| ECS → DynamoDB | HTTPS / 443 | Via VPC endpoint (no NAT) |
| ECS → Kinesis Firehose | HTTPS / 443 | Via VPC |
| ECS → SSM | HTTPS / 443 | Via VPC |
| ECS → ECR | HTTPS / 443 | Via S3 Gateway VPC Endpoint (no NAT cost) |

ALB Host-Based Routing

The single shared ALB routes requests to the correct MCP based on the Host header:

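| Host header | Target ECS service | ALB rule priority |
|---|---|---|
| sharepoint.{env}connectors.novo-genai.com | mcp-sharepoint-main-svc | 100 |
| outlook.{env}connectors.novo-genai.com | mcp-outlook-main-svc | 110 |
| teams.{env}connectors.novo-genai.com | mcp-teams-main-svc | 120 |
| databricks.{env}connectors.novo-genai.com | mcp-databricks-main-svc | 130 |

Judging by the Route53 records in the component inventory, {env} expands to "dev." in dev and to an empty string in prod.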
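
The routing table can be modelled in a few lines. This is only an illustrative sketch of host-based matching, not the ALB configuration itself. It assumes the environment prefix is "dev." in dev and empty in prod, and `resolve_target` is a hypothetical name.

```python
from typing import Optional

BASE_DOMAIN = "connectors.novo-genai.com"

# (ALB rule priority, MCP name), as listed in the routing table.
RULES = [(100, "sharepoint"), (110, "outlook"), (120, "teams"), (130, "databricks")]


def resolve_target(host_header: str, env_prefix: str = "") -> Optional[str]:
    """Return the ECS service a Host header would route to, or None if no rule matches.

    env_prefix is "dev." for dev and "" for prod (an assumption based on the DNS names).
    """
    host = host_header.split(":")[0].lower()  # drop any port; hostnames are case-insensitive
    for _priority, name in sorted(RULES):  # lowest priority number is evaluated first
        if host == f"{name}.{env_prefix}{BASE_DOMAIN}":
            return f"mcp-{name}-main-svc"
    return None
```

Because each rule matches one exact hostname, the rules never compete; the priorities only fix the evaluation order. A dev hostname sent to the prod listener matches nothing and returns `None`.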

4. Data Flow

Common Flow (all MCPs)

1. AI Client (e.g. Claude)
        │  HTTPS POST /mcp  Authorization: Bearer <Azure AD token A>
2. Route53 DNS → Public ALB
        │  TLS termination; host-based routing
3. ECS Fargate Task (FastMCP app, port 80)
        │  Validates token A (Azure AD JWKS)
        │  Exchanges token A → token B via OBO (Azure AD)
        │  Caches token B in DynamoDB (TTL ~1h)
        ├──► Azure AD token endpoint  [HTTPS/443]
        ├──► Microsoft Graph API / Databricks  [HTTPS/443, token B]
        └──► Kinesis Firehose  [audit record per tool call]
                  └──► S3 (NDJSON audit log, partitioned by date)
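
The audit leg of this flow can be illustrated with a small sketch that serializes one tool call as a single NDJSON line and derives a date-based S3 prefix. The field names and the `year=/month=/day=` prefix layout are assumptions made for illustration; the document does not specify the actual record schema or partition scheme.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id: str, mcp: str, tool: str, status: str, when: datetime) -> bytes:
    """Serialize one tool call as a single NDJSON line (one JSON object plus a newline).

    Field names are hypothetical; `when` must be timezone-aware.
    """
    record = {
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "user_id": user_id,
        "mcp": mcp,
        "tool": tool,
        "status": status,
    }
    # Compact separators keep each record on exactly one line.
    line = json.dumps(record, separators=(",", ":"), sort_keys=True)
    return (line + "\n").encode("utf-8")


def date_partition_prefix(when: datetime) -> str:
    """Return a date-partitioned key prefix (hypothetical layout) in UTC."""
    utc = when.astimezone(timezone.utc)
    return f"year={utc:%Y}/month={utc:%m}/day={utc:%d}/"
```

Newline-delimited records can be appended and concatenated safely, which suits a buffered delivery stream: each buffer flush yields a file that is still valid NDJSON.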

5. Environment Separation

Dev and prod are fully isolated at every layer. There is no shared infrastructure, shared credentials, or shared network path between environments.

| Aspect | Dev | Prod |
|---|---|---|
| AWS Account | 094069622854 (AWS-NN-AIconnectors-DEV) | 673034950531 (AWS-NN-AIconnectors-PRD) |
| Region | eu-central-1 | eu-central-1 |
| DNS zone | dev.connectors.novo-genai.com | connectors.novo-genai.com |
| ACM cert | *.dev.connectors.novo-genai.com | *.connectors.novo-genai.com |
| ECS cluster | aiconnectors-dev-aiconnectors | aiconnectors-prod-aiconnectors |
| ECR registries | 094069622854.dkr.ecr.eu-central-1.amazonaws.com | 673034950531.dkr.ecr.eu-central-1.amazonaws.com |
| Azure AD app regs | Separate app registrations (RITM provisioned) | Separate app registrations (RITM provisioned) |
| DynamoDB tables | mcp-oauth-storage-mcp-{name}-dev | mcp-oauth-storage-mcp-{name}-prod |
| SSM secrets | /aiconnectors/dev/{mcp}/… | /aiconnectors/prod/{mcp}/… |
| Audit bucket | nn-aiconnectors-audit-dev | nn-aiconnectors-audit-prod |
| Prod deploy gate | | environment: Production gate in GitHub Actions (requires approval) |

6. ECS Task Configuration

Canonical task configuration (CPU, memory, capacity provider, desired count, environment variables) lives in the source files and is merged at deploy time by ecs/render.py:

connectors/mcps/<name>/ecs/task-definition.base.yml
connectors/mcps/<name>/ecs/env/dev.yml
connectors/mcps/<name>/ecs/env/prod.yml
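
A base file combined with per-environment overrides is commonly implemented as a recursive dictionary merge. The sketch below shows that technique on plain dictionaries. The actual merge semantics of ecs/render.py are not described here, so treat this as an assumption, and note that the keys in the example are hypothetical.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which `override` values win; nested dicts merge key by key.

    Lists and scalars are replaced wholesale. Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical contents standing in for task-definition.base.yml and env/dev.yml.
    base = {"cpu": 256, "environment": {"LOG_LEVEL": "INFO", "PORT": "80"}}
    dev = {"desired_count": 1, "environment": {"LOG_LEVEL": "DEBUG"}}
    print(deep_merge(base, dev))
```

With this approach an environment file only needs to list the values that differ from the base, which keeps dev and prod differences easy to review.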

The following security-relevant settings apply to all MCP containers and should be verified during compliance review:

| Setting | Value |
|---|---|
| Base image | python:3.13-slim |
| Health check endpoint | GET /health |
| Read-only root filesystem | Yes (SharePoint, Outlook); No (Teams, Databricks) |
| /tmp tmpfs | 64 MB; rw,noexec,nosuid (SharePoint, Outlook) |
| Secrets injection | SSM SecureString via ECS secrets (never plaintext env vars) |
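
To illustrate the secrets-injection row: in an ECS container definition, each entry under `secrets` pairs an environment variable name with a `valueFrom` reference, here a Parameter Store ARN. The sketch below builds such entries. The helper names are hypothetical, and the parameter leaf names are not given in this document, so callers must supply them.

```python
def ssm_parameter_arn(region: str, account_id: str, parameter_name: str) -> str:
    """Build the ARN of a Parameter Store parameter whose name starts with "/"."""
    if not parameter_name.startswith("/"):
        raise ValueError("expected a fully qualified parameter name starting with '/'")
    # The parameter name already starts with "/", so it follows "parameter" directly.
    return f"arn:aws:ssm:{region}:{account_id}:parameter{parameter_name}"


def secrets_block(region: str, account_id: str, mapping: dict[str, str]) -> list[dict[str, str]]:
    """Turn {env var name: parameter name} into ECS `secrets` entries.

    Example parameter name shape: /aiconnectors/dev/<mcp>/<leaf>, where the
    leaf is a placeholder because this document does not list leaf names.
    """
    return [
        {"name": env_name, "valueFrom": ssm_parameter_arn(region, account_id, param)}
        for env_name, param in mapping.items()
    ]
```

With `secrets`, the value is resolved when the task starts and is injected into the container environment, so the task definition itself contains only a reference and never the plaintext.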

7. CI/CD Pipeline

Each MCP is built and deployed by a shared reusable workflow (.github/workflows/tmpl-deploy-mcp.yml). The pipeline is triggered by changes to the MCP's source directory.

push to main
    └── lint (ruff + black) ──► [blocks if failed]
            └── build
                    └── docker build → push to dev ECR ({github.sha} tag)
                            └── apply-infra-dev (terragrunt apply)
                                    └── deploy-dev (render task def → ECS deploy)
                                            └── test-dev (BDD smoke tests)
                                                    └── [Production gate: manual approval]
                                                            └── deploy-prod (promote image dev ECR → prod ECR → ECS deploy)

Image tagging: {account}.dkr.ecr.eu-central-1.amazonaws.com/{image-name}:{github.sha}
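
The tagging scheme means the image promoted to prod carries the same commit SHA that was built and tested in dev. A minimal sketch of deriving the source and destination image URIs for that promotion is shown below. The function names are hypothetical, and the check assumes `{github.sha}` is the full 40-character commit SHA that GitHub Actions provides.

```python
import re

# Account IDs as listed in this document.
ACCOUNT_IDS = {"dev": "094069622854", "prod": "673034950531"}
REGION = "eu-central-1"


def image_uri(env: str, image_name: str, git_sha: str) -> str:
    """Return the ECR image URI for one environment, tagged with the commit SHA."""
    if not re.fullmatch(r"[0-9a-f]{40}", git_sha):
        raise ValueError("expected a full 40-character lowercase hex commit SHA")
    return f"{ACCOUNT_IDS[env]}.dkr.ecr.{REGION}.amazonaws.com/{image_name}:{git_sha}"


def promotion_pair(image_name: str, git_sha: str) -> tuple[str, str]:
    """Return (source in dev ECR, destination in prod ECR) for a promotion."""
    return image_uri("dev", image_name, git_sha), image_uri("prod", image_name, git_sha)
```

Tagging by commit SHA avoids mutable tags such as `latest`, so an audit can trace a running task back to an exact commit.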


8. Integration Points

| Integration | Direction | Protocol / Port | Frequency | Data exchanged |
|---|---|---|---|---|
| Microsoft Graph API (graph.microsoft.com) | Outbound | HTTPS / 443 | Per tool call (on-demand) | SharePoint files/metadata, email, calendar, Teams messages |
| Azure AD token endpoint (login.microsoftonline.com) | Outbound | HTTPS / 443 | Per user session (then cached ~1h in DynamoDB) | OAuth2 access tokens |
| Azure Databricks workspaces | Outbound | HTTPS / 443 | Per tool call (on-demand) | SQL query results, workspace metadata |
| AWS Kinesis Firehose | Outbound | HTTPS / 443 | Per tool call | Audit NDJSON records |
| AWS DynamoDB | Outbound | HTTPS / 443 | Per tool call (read) + per token refresh (write) | OAuth token cache |
| AWS SSM Parameter Store | Outbound | HTTPS / 443 | At container startup | Azure AD client ID + client secret |
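
The DynamoDB read-per-call and write-per-refresh pattern in the table can be sketched as a cache item carrying an expiry timestamp. DynamoDB's TTL feature expects a numeric attribute holding epoch seconds. The attribute names, the key format, and the 60-second safety margin below are assumptions made for illustration.

```python
import time
from typing import Optional


def cache_item(
    user_id: str,
    scope: str,
    access_token: str,
    expires_in: int,
    now: Optional[int] = None,
    safety_margin: int = 60,
) -> dict:
    """Build a cache item whose `ttl` attribute is an epoch-seconds expiry.

    The margin expires the entry slightly before the token itself does.
    """
    now = int(time.time()) if now is None else now
    return {
        "cache_key": f"{user_id}#{scope}",  # hypothetical key format
        "access_token": access_token,
        "ttl": now + max(expires_in - safety_margin, 0),
    }


def is_fresh(item: dict, now: Optional[int] = None) -> bool:
    """Check expiry on read; do not rely on TTL deletion alone."""
    now = int(time.time()) if now is None else now
    return item["ttl"] > now
```

The explicit check on read matters because DynamoDB removes expired items in the background, not at the exact expiry time, so a stale item can still be returned by a read shortly after its TTL has passed.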