Deploying Patient Journey Intelligence on AWS

Patient Journey Intelligence can be deployed entirely within your AWS infrastructure, ensuring that your healthcare data never leaves your environment and remains under your complete control.

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                           AWS Account                            │
│                                                                   │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                        VPC                                  │ │
│  │                                                             │ │
│  │  ┌──────────────┐  ┌───────────────┐  ┌────────────────┐ │ │
│  │  │   Internet   │  │  Application  │  │    Private     │ │ │
│  │  │    Gateway   │  │  Load Balancer│  │    Subnets     │ │ │
│  │  └──────┬───────┘  └───────┬───────┘  └────────┬───────┘ │ │
│  │         │                   │                    │         │ │
│  │         ▼                   ▼                    ▼         │ │
│  │  ┌──────────────────────────────────────────────────────┐ │ │
│  │  │              EKS Cluster (Kubernetes)                │ │ │
│  │  │                                                      │ │ │
│  │  │  ┌─────────────┐  ┌──────────────┐ ┌────────────┐ │ │ │
│  │  │  │   Web UI    │  │  API Server  │ │NLP Pipeline│ │ │ │
│  │  │  │   (React)   │  │   (REST)     │ │  (Spark)   │ │ │ │
│  │  │  └─────────────┘  └──────────────┘ └────────────┘ │ │ │
│  │  │                                                      │ │ │
│  │  │  ┌─────────────┐  ┌──────────────┐ ┌────────────┐ │ │ │
│  │  │  │ Ingestion   │  │ De-ID Engine │ │ Terminology│ │ │ │
│  │  │  │   Workers   │  │              │ │   Server   │ │ │ │
│  │  │  └─────────────┘  └──────────────┘ └────────────┘ │ │ │
│  │  └──────────────────────────────────────────────────────┘ │ │
│  │                                                             │ │
│  └─────────────────────────────────────────────────────────────┘ │
│                                                                   │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐ │
│  │   RDS Aurora    │  │      S3         │  │  ElastiCache    │ │
│  │   (PostgreSQL)  │  │  (Data Lake)    │  │    (Redis)      │ │
│  │                 │  │                 │  │                 │ │
│  │  - OMOP CDM     │  │  - Raw Files    │  │  - Session Mgmt │ │
│  │  - Metadata     │  │  - Documents    │  │  - Query Cache  │ │
│  │  - Audit Logs   │  │  - DICOM Images │  │                 │ │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘ │
│                                                                   │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐ │
│  │   Secrets Mgr   │  │   CloudWatch    │  │   AWS Backup    │ │
│  │                 │  │                 │  │                 │ │
│  │  - DB Creds     │  │  - Logs         │  │  - RDS Backups  │ │
│  │  - API Keys     │  │  - Metrics      │  │  - S3 Snapshots │ │
│  │  - Certificates │  │  - Alarms       │  │                 │ │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘ │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

Key Components

Compute Layer

Amazon EKS (Elastic Kubernetes Service)

Orchestrates all application containers
Auto-scaling for compute workloads
High availability across multiple AZs
Managed Kubernetes control plane

EC2 Instance Types

Web/API Services: t3.xlarge or m5.xlarge
NLP Processing: c5.4xlarge or c5.9xlarge (compute-optimized)
Spark Jobs: r5.2xlarge or r5.4xlarge (memory-optimized)

Data Storage

Amazon RDS Aurora PostgreSQL

OMOP CDM database (primary data store)
Multi-AZ deployment for high availability
Automated backups and point-in-time recovery
Instance size: db.r5.2xlarge or larger

Amazon S3

Raw clinical data files
Unstructured documents and PDFs
DICOM imaging metadata
Archive and backup storage
Lifecycle policies for cost optimization

Amazon ElastiCache (Redis)

Session management
Query result caching
Real-time processing queues

Networking & Security

VPC Configuration

Private subnets for application and data layers
Public subnets for load balancers only
NAT Gateway for outbound internet access
VPC Endpoints for AWS services (no internet routing)

Security Groups

Strict ingress/egress rules
Least privilege access
Application layer segmentation

AWS Secrets Manager

Database credentials rotation
API keys and certificates
Encryption at rest

Monitoring & Operations

Amazon CloudWatch

Application and infrastructure logs
Custom metrics and dashboards
Automated alarms and notifications
Log retention policies

AWS Backup

Automated RDS snapshots
S3 versioning and replication
Cross-region backup replication (optional)

Deployment Options

Option 1: Fully Managed by John Snow Labs

John Snow Labs operates the infrastructure within your AWS account
John Snow Labs handles updates, monitoring, and maintenance
You retain full data ownership and control
Access via AWS PrivateLink or VPN

Option 2: Self-Managed

You operate the infrastructure
John Snow Labs provides deployment automation (Terraform/CloudFormation)
Technical support and runbooks included
Update packages delivered as container images

Option 3: Hybrid

Core services managed by John Snow Labs
You manage networking and security controls
Shared responsibility model

Prerequisites

AWS Account Requirements

AWS account (standalone or an AWS Organizations member account) with adequate service quotas
IAM permissions to create:
- EKS clusters
- RDS instances
- S3 buckets
- VPC and networking resources
- IAM roles and policies
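
As a rough illustration of the permission families listed above, the following sketch builds an IAM policy document as a Python dictionary and prints it as JSON. The action names are real IAM actions, but the selection is an assumption made for illustration: it is not the policy shipped by John Snow Labs, it covers far fewer actions than a Terraform run needs, and the wildcard resource should be narrowed before any real use.

```python
import json

# Illustrative only: a few "create" actions per resource family listed above.
# A real provisioning role needs many more (describe, tag, delete, ...).
CREATE_ACTIONS = {
    "EKS clusters": ["eks:CreateCluster"],
    "RDS instances": ["rds:CreateDBCluster", "rds:CreateDBInstance"],
    "S3 buckets": ["s3:CreateBucket"],
    "VPC and networking resources": ["ec2:CreateVpc", "ec2:CreateSubnet"],
    "IAM roles and policies": ["iam:CreateRole", "iam:CreatePolicy"],
}


def build_policy(actions_by_family):
    """Return an IAM policy document with one statement per resource family."""
    statements = []
    for family, actions in actions_by_family.items():
        statements.append({
            # Sid values must be alphanumeric, so drop spaces and punctuation.
            "Sid": "".join(ch for ch in family.title() if ch.isalnum()),
            "Effect": "Allow",
            "Action": actions,
            "Resource": "*",  # assumption: restrict to specific ARNs in practice
        })
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_policy(CREATE_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed by a security team before any role is created, which keeps the permission discussion separate from the provisioning run itself.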

Networking Requirements

Dedicated VPC (recommended /16 CIDR block)
Minimum 3 Availability Zones
Private subnets for workloads
Public subnets for load balancers
Internet Gateway for the public subnets and NAT Gateway for outbound access from the private subnets
Optional: Direct Connect or VPN to on-premises systems
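
One way to plan the address space described above is sketched below, using only Python's standard `ipaddress` module. It carves a /16 VPC block into one private and one public subnet per Availability Zone. The CIDR `10.0.0.0/16`, the AZ names, and the /20 subnet size are placeholder assumptions, not values prescribed by the product.

```python
import ipaddress


def plan_subnets(vpc_cidr, az_names, new_prefix=20):
    """Assign one private and one public subnet per AZ from the VPC block."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-size, non-overlapping blocks in address order.
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = {}
    for tier in ("private", "public"):
        for az in az_names:
            plan[(tier, az)] = next(blocks)
    return plan


# Hypothetical values: replace with your own CIDR and Availability Zones.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])
for (tier, az), block in plan.items():
    print(f"{tier:8} {az}  {block}  ({block.num_addresses} addresses)")
```

A /16 split into /20 blocks yields 16 subnets, so the six used here leave room for additional tiers such as a dedicated data subnet per AZ.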

Compliance & Security

AWS Config for compliance monitoring
AWS CloudTrail for audit logging
Encryption at rest (KMS)
Encryption in transit (TLS 1.2+)
HIPAA-eligible AWS services only

Installation Process

1. Infrastructure Provisioning (Week 1-2)

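# Using Terraform (provided by John Snow Labs)
terraform init
# Review the planned changes before applying them
terraform plan -var-file="production.tfvars"
# Apply with the same variable file that was used for the plan
terraform apply -var-file="production.tfvars"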

Provisions:

VPC and networking
EKS cluster
RDS Aurora cluster
S3 buckets
Security groups and IAM roles

2. Application Deployment (Week 3-4)

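# Deploy using Helm charts
helm repo add jsl https://charts.johnsnowlabs.com
helm repo update
# Namespace names must be lowercase DNS labels; the name below is a placeholder
helm install patient-journey jsl/patient-journey-intelligence \
  --namespace patient-journey-intelligence \
  --create-namespace \
  --values custom-values.yaml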

Deploys:

Application containers
Ingestion workers
NLP pipeline services
Web UI and API server

3. Data Source Integration (Week 5-6)

Configure source connections
Set up ingestion schedules
Test data extraction
Validate de-identification

4. Validation & Go-Live (Week 7-8)

User acceptance testing
Performance validation
Security audit
Production cutover

Resource Sizing Guidelines

Small Deployment (< 100K patients)

Component	Specification
EKS Nodes	3 x m5.xlarge
RDS Aurora	db.r5.large
S3 Storage	500 GB
ElastiCache	cache.t3.medium
Estimated Cost	$2,500 - $3,500/month

Medium Deployment (100K - 1M patients)

Component	Specification
EKS Nodes	6 x m5.2xlarge
RDS Aurora	db.r5.2xlarge
S3 Storage	5 TB
ElastiCache	cache.r5.large
Estimated Cost	$8,000 - $12,000/month

Large Deployment (> 1M patients)

Component	Specification
EKS Nodes	12 x m5.4xlarge
RDS Aurora	db.r5.4xlarge (with read replicas)
S3 Storage	50+ TB
ElastiCache	cache.r5.xlarge
Estimated Cost	$20,000 - $35,000/month

Estimated costs exclude data transfer and backup storage charges.
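
The sizing tables above can be read as a simple lookup on patient count. The sketch below encodes them in Python; the `Tier` class and `pick_tier` function are hypothetical names introduced for illustration, and the boundary handling (exactly 100K and exactly 1M patients both map to Medium) is an interpretation of the ranges as written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    eks_nodes: str
    rds_aurora: str
    s3_storage: str
    elasticache: str
    monthly_cost_usd: tuple  # (low, high) estimate from the tables above


SMALL = Tier("Small", "3 x m5.xlarge", "db.r5.large",
             "500 GB", "cache.t3.medium", (2500, 3500))
MEDIUM = Tier("Medium", "6 x m5.2xlarge", "db.r5.2xlarge",
              "5 TB", "cache.r5.large", (8000, 12000))
LARGE = Tier("Large", "12 x m5.4xlarge", "db.r5.4xlarge (with read replicas)",
             "50+ TB", "cache.r5.xlarge", (20000, 35000))


def pick_tier(patients):
    """Map a patient count to the sizing tier given in the tables."""
    if patients < 0:
        raise ValueError("patient count cannot be negative")
    if patients < 100_000:
        return SMALL
    if patients <= 1_000_000:
        return MEDIUM
    return LARGE


tier = pick_tier(250_000)
low, high = tier.monthly_cost_usd
print(f"{tier.name}: {tier.eks_nodes}, {tier.rds_aurora}, ${low:,} - ${high:,}/month")
```

Patient count is only a starting point: document volume per patient and NLP workload can push a deployment into the next tier.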

High Availability Configuration

Multi-AZ Deployment

Application pods distributed across 3 AZs
RDS Aurora with Multi-AZ failover
S3 cross-region replication (optional)
Load balancer health checks and auto-recovery

Disaster Recovery

RTO (Recovery Time Objective): < 1 hour
RPO (Recovery Point Objective): < 15 minutes
Automated backup to a separate AWS Region
Documented runbooks for failover scenarios

Security Best Practices

Network Security

Private subnets for all data processing
Network ACLs and security groups
VPC Flow Logs enabled
No direct internet access for compute nodes

Data Encryption

At Rest: AWS KMS encryption for RDS, S3, EBS volumes
In Transit: TLS 1.2+ for all communications
Application Level: Additional de-identification layer
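
For client code that talks to the platform, the TLS 1.2+ requirement above can be enforced on the caller's side as well. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_context` is hypothetical, and how the resulting context is wired into a particular HTTP client is left to the integrator.

```python
import ssl


def make_client_context():
    """Build a client TLS context that rejects anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name, ctx.verify_mode.name, ctx.check_hostname)
```

Setting the floor in the client means a misconfigured endpoint fails the handshake instead of silently negotiating an older protocol.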

Access Control

IAM roles with least privilege
MFA required for administrative access
AWS IAM Identity Center (formerly AWS SSO) integration
Audit logging via CloudTrail

Compliance

HIPAA-eligible services only
Business Associate Agreement (BAA) with AWS
Regular security assessments
Compliance reporting via AWS Config

Monitoring & Alerting

Key Metrics

EKS cluster health and resource utilization
RDS performance metrics (CPU, memory, IOPS)
Application response times
Data ingestion throughput
NLP processing latency
Failed job counts
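
As an example of turning one of the metrics above into an alarm, the sketch below assembles the parameters for a CloudWatch alarm on Aurora CPU utilization. The parameter names match the CloudWatch `PutMetricAlarm` API, so the dictionary could be passed to a boto3 CloudWatch client as `put_metric_alarm(**alarm)`. The cluster identifier, the 80% threshold, and the evaluation window are assumptions chosen for illustration, not recommended values from John Snow Labs.

```python
def rds_cpu_alarm(cluster_id, threshold_percent=80.0, sns_topic_arn=None):
    """Return PutMetricAlarm parameters for sustained high Aurora CPU usage."""
    alarm = {
        "AlarmName": f"{cluster_id}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # 3 x 5 minutes above threshold before alarming
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires (optional).
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm


# Hypothetical cluster identifier.
alarm = rds_cpu_alarm("pji-aurora-cluster")
print(alarm["AlarmName"], alarm["Threshold"])
```

Building the parameters separately from the API call makes the alarm definitions easy to review and to keep under version control.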

Automated Alerts

Service degradation
Resource exhaustion
Security events
Failed backups
Unusual data access patterns

Maintenance & Updates

Regular Maintenance Windows

Monthly: OS and security patches
Quarterly: Application updates
Annually: Major version upgrades

Zero-Downtime Updates

Rolling updates for application containers
Blue-green deployment for major changes
Database schema migrations with minimal downtime

Cost Optimization

Reserved Instances

1-year or 3-year RDS Reserved Instances (30-60% savings)
EC2 Savings Plans for EKS nodes

S3 Lifecycle Policies

Move infrequently accessed data to S3 Glacier after 90 days
Archive completed jobs to S3 Glacier Deep Archive
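
The two lifecycle rules above could be expressed as an S3 lifecycle configuration along the following lines. The structure follows the S3 lifecycle configuration format, and the 90-day Glacier transition comes from the text. The key prefixes `raw/` and `completed-jobs/` and the 30-day delay before Deep Archive are assumptions for illustration, since the guide does not specify them.

```python
import json


def lifecycle_configuration(raw_prefix="raw/",
                            jobs_prefix="completed-jobs/",
                            deep_archive_after_days=30):
    """Build an S3 lifecycle configuration with the two transitions above."""
    return {
        "Rules": [
            {
                "ID": "infrequent-data-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": raw_prefix},  # assumed prefix
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
            {
                "ID": "completed-jobs-to-deep-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": jobs_prefix},  # assumed prefix
                "Transitions": [
                    # Assumed delay: the guide does not state one.
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"}
                ],
            },
        ]
    }


config = lifecycle_configuration()
print(json.dumps(config, indent=2))
```

Saved to a file, the JSON can be applied with `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`, or the same rules can be declared in the Terraform configuration.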

Auto-Scaling

Scale EKS nodes based on workload demand
Schedule scale-down during off-peak hours

Support & Operations

Included Support

24/7 infrastructure monitoring
Automated alerting and incident response
Quarterly health checks and optimization reviews
Access to John Snow Labs engineering team

Optional Services

Managed operations (fully managed option)
Custom integration development
Training and onboarding
Dedicated technical account manager

Next Steps

Architecture Review: Schedule a call with John Snow Labs solutions architects
AWS Account Preparation: Set up VPC, networking, and IAM permissions
Deployment Planning: Define timeline and resource requirements
Proof of Concept: Optional pilot deployment with sample data
