Deploying Patient Journey Intelligence on AWS

Patient Journey Intelligence can be deployed entirely within your AWS infrastructure, ensuring that your healthcare data never leaves your environment and remains under your complete control.

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                           AWS Account                           │
│                                                                 │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                             VPC                             │ │
│ │                                                             │ │
│ │ ┌───────────────┐    ┌───────────────┐    ┌───────────────┐ │ │
│ │ │   Internet    │    │  Application  │    │    Private    │ │ │
│ │ │    Gateway    │    │ Load Balancer │    │    Subnets    │ │ │
│ │ └───────┬───────┘    └───────┬───────┘    └───────┬───────┘ │ │
│ │         │                    │                    │         │ │
│ │         ▼                    ▼                    ▼         │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │                EKS Cluster (Kubernetes)                 │ │ │
│ │ │                                                         │ │ │
│ │ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │ │
│ │ │   │   Web UI    │   │ API Server  │   │NLP Pipeline │   │ │ │
│ │ │   │   (React)   │   │   (REST)    │   │   (Spark)   │   │ │ │
│ │ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │ │
│ │ │                                                         │ │ │
│ │ │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │ │ │
│ │ │   │  Ingestion  │   │De-ID Engine │   │ Terminology │   │ │ │
│ │ │   │   Workers   │   │             │   │   Server    │   │ │ │
│ │ │   └─────────────┘   └─────────────┘   └─────────────┘   │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │                                                             │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                                                 │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ RDS Aurora      │  │ S3              │  │ ElastiCache     │  │
│  │ (PostgreSQL)    │  │ (Data Lake)     │  │ (Redis)         │  │
│  │                 │  │                 │  │                 │  │
│  │ - OMOP CDM      │  │ - Raw Files     │  │ - Session Mgmt  │  │
│  │ - Metadata      │  │ - Documents     │  │ - Query Cache   │  │
│  │ - Audit Logs    │  │ - DICOM Images  │  │                 │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
│                                                                 │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ Secrets Mgr     │  │ CloudWatch      │  │ AWS Backup      │  │
│  │                 │  │                 │  │                 │  │
│  │ - DB Creds      │  │ - Logs          │  │ - RDS Backups   │  │
│  │ - API Keys      │  │ - Metrics       │  │ - S3 Snapshots  │  │
│  │ - Certificates  │  │ - Alarms        │  │                 │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key Components

Compute Layer

Amazon EKS (Elastic Kubernetes Service)

  • Orchestrates all application containers
  • Auto-scaling for compute workloads
  • High availability across multiple AZs
  • Managed Kubernetes control plane

EC2 Instance Types

  • Web/API Services: t3.xlarge or m5.xlarge
  • NLP Processing: c5.4xlarge or c5.9xlarge (compute-optimized)
  • Spark Jobs: r5.2xlarge or r5.4xlarge (memory-optimized)

Data Storage

Amazon RDS Aurora PostgreSQL

  • OMOP CDM database (primary data store)
  • Multi-AZ deployment for high availability
  • Automated backups and point-in-time recovery
  • Instance size: db.r5.large to db.r5.4xlarge, depending on deployment size (see Resource Sizing Guidelines)

Amazon S3

  • Raw clinical data files
  • Unstructured documents and PDFs
  • DICOM imaging metadata
  • Archive and backup storage
  • Lifecycle policies for cost optimization

Amazon ElastiCache (Redis)

  • Session management
  • Query result caching
  • Real-time processing queues

Networking & Security

VPC Configuration

  • Private subnets for application and data layers
  • Public subnets for load balancers only
  • NAT Gateway for outbound internet access
  • VPC Endpoints for AWS services (no internet routing)

Security Groups

  • Strict ingress/egress rules
  • Least privilege access
  • Application layer segmentation

AWS Secrets Manager

  • Database credentials rotation
  • API keys and certificates
  • Encryption at rest

Monitoring & Operations

Amazon CloudWatch

  • Application and infrastructure logs
  • Custom metrics and dashboards
  • Automated alarms and notifications
  • Log retention policies

AWS Backup

  • Automated RDS snapshots
  • S3 versioning and replication
  • Cross-region backup replication (optional)

Deployment Options

Option 1: Fully Managed by John Snow Labs

  • John Snow Labs operates the infrastructure within your AWS account
  • We handle updates, monitoring, and maintenance
  • You retain full data ownership and control
  • Access via AWS PrivateLink or VPN

Option 2: Self-Managed

  • You operate the infrastructure
  • John Snow Labs provides deployment automation (Terraform/CloudFormation)
  • Technical support and runbooks included
  • Update packages delivered as container images

Option 3: Hybrid

  • Core services managed by John Snow Labs
  • You manage networking and security controls
  • Shared responsibility model

Prerequisites

AWS Account Requirements

  • AWS account (optionally part of an AWS Organization) with sufficient service quotas
  • IAM permissions to create:
    • EKS clusters
    • RDS instances
    • S3 buckets
    • VPC and networking resources
    • IAM roles and policies

Networking Requirements

  • Dedicated VPC (recommended /16 CIDR block)
  • Minimum 3 Availability Zones
  • Private subnets for workloads
  • Public subnets for load balancers
  • Internet Gateway for the public subnets and NAT Gateway for outbound access from the private subnets
  • Optional: Direct Connect or VPN to on-premises systems
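
As a rough illustration of the networking requirements above, the sketch below carves a /16 VPC into one private and one public subnet per Availability Zone. The 10.0.0.0/16 base range, the /20 private and /24 public subnet sizes, and the function names are assumptions made for this example; they are not taken from the John Snow Labs deployment automation.

```shell
#!/bin/sh
# Hypothetical subnet plan for a /16 VPC across 3 Availability Zones.
# Assumptions (not from the product documentation): base range 10.0.0.0/16,
# /20 private subnets (4,096 addresses each) and /24 public subnets.

# private_subnet <first_two_octets> <az_index>  -> e.g. 10.0.16.0/20
private_subnet() {
  # Each /20 spans 16 consecutive values of the third octet.
  printf '%s.%d.0/20\n' "$1" $(( $2 * 16 ))
}

# public_subnet <first_two_octets> <az_index>  -> e.g. 10.0.241.0/24
public_subnet() {
  # Public /24s sit at the top of the range, well clear of the private /20s.
  printf '%s.%d.0/24\n' "$1" $(( 240 + $2 ))
}

# plan_subnets <first_two_octets>  -> one line per Availability Zone
plan_subnets() {
  az=0
  while [ "$az" -lt 3 ]; do
    printf 'az%d private=%s public=%s\n' "$az" \
      "$(private_subnet "$1" "$az")" "$(public_subnet "$1" "$az")"
    az=$(( az + 1 ))
  done
}
```

Calling `plan_subnets 10.0` prints one line per AZ. Keeping the public subnets small reflects the guidance that they host load balancers only.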

Compliance & Security

  • AWS Config for compliance monitoring
  • AWS CloudTrail for audit logging
  • Encryption at rest (KMS)
  • Encryption in transit (TLS 1.2+)
  • HIPAA-eligible AWS services only

Installation Process

1. Infrastructure Provisioning (Weeks 1-2)

# Using Terraform (provided by John Snow Labs)
terraform init
# Save the reviewed plan so that apply uses the same variables and changes
terraform plan -var-file="production.tfvars" -out=tfplan
terraform apply tfplan

Provisions:

  • VPC and networking
  • EKS cluster
  • RDS Aurora cluster
  • S3 buckets
  • Security groups and IAM roles
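
Before running the provisioning and deployment commands, it can help to confirm that the required command-line tools are installed. The following is a minimal sketch; the tool list (terraform, aws, kubectl, helm) is inferred from the commands in this guide, and the function names are hypothetical.

```shell
#!/bin/sh
# Pre-flight check: report which required CLI tools are not on PATH.
# The tool list is an assumption based on the commands used in this guide.

# missing_tools <tool>...  -> prints the name of each tool that is not installed
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# preflight  -> returns 0 only if every required tool is available
preflight() {
  missing=$(missing_tools terraform aws kubectl helm)
  if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
    return 1
  fi
  printf 'All required tools found\n'
}
```

Running `preflight` on the workstation or CI runner that will perform the deployment gives an early failure instead of a half-applied plan.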

2. Application Deployment (Weeks 3-4)

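# Deploy using Helm charts
helm repo add jsl https://charts.johnsnowlabs.com
helm repo update
# Example namespace; Kubernetes namespace names must be lowercase with no spaces
helm install patient-journey jsl/patient-journey-intelligence \
  --namespace patient-journey-intelligence \
  --create-namespace \
  --values custom-values.yaml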

Deploys:

  • Application containers
  • Ingestion workers
  • NLP pipeline services
  • Web UI and API server
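
After the Helm release is installed, one way to confirm that the workloads came up is to wait for each Deployment to finish rolling out. This is a sketch only: the default namespace value and the 300-second timeout are assumptions, and the actual resource names depend on the chart.

```shell
#!/bin/sh
# Wait for every Deployment in the release namespace to finish rolling out.
# NAMESPACE is an assumption; use the namespace you passed to helm install.
NAMESPACE="${NAMESPACE:-patient-journey-intelligence}"

verify_rollout() {
  # List Deployments as "deployment.apps/<name>", one per line.
  deployments=$(kubectl get deployments -n "$NAMESPACE" -o name) || return 1
  for d in $deployments; do
    # rollout status blocks until the rollout completes or the timeout expires.
    kubectl rollout status "$d" -n "$NAMESPACE" --timeout=300s || return 1
  done
  # Show the final pod states for a quick visual check.
  kubectl get pods -n "$NAMESPACE"
}
```

Call `verify_rollout` once `helm install` returns. A non-zero exit status means at least one Deployment did not become ready within the timeout.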

3. Data Source Integration (Weeks 5-6)

  • Configure source connections
  • Set up ingestion schedules
  • Test data extraction
  • Validate de-identification

4. Validation & Go-Live (Weeks 7-8)

  • User acceptance testing
  • Performance validation
  • Security audit
  • Production cutover

Resource Sizing Guidelines

Small Deployment (< 100K patients)

Component        Specification
EKS Nodes        3 x m5.xlarge
RDS Aurora       db.r5.large
S3 Storage       500 GB
ElastiCache      cache.t3.medium
Estimated Cost   $2,500 - $3,500/month

Medium Deployment (100K - 1M patients)

Component        Specification
EKS Nodes        6 x m5.2xlarge
RDS Aurora       db.r5.2xlarge
S3 Storage       5 TB
ElastiCache      cache.r5.large
Estimated Cost   $8,000 - $12,000/month

Large Deployment (> 1M patients)

Component        Specification
EKS Nodes        12 x m5.4xlarge
RDS Aurora       db.r5.4xlarge (with read replicas)
S3 Storage       50+ TB
ElastiCache      cache.r5.xlarge
Estimated Cost   $20,000 - $35,000/month

Note: Estimated costs exclude data transfer and backup storage.
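
The sizing tiers above can be expressed as a small lookup, which is handy when scripting capacity checks. The sketch below is illustrative: the function names are hypothetical, and the handling of the exact boundaries (100,000 and 1,000,000 patients) is an assumption, since the guide only gives the ranges.

```shell
#!/bin/sh
# Map a patient count to the sizing tier from the guidelines above.
# Boundary handling is an assumption: exactly 100,000 or 1,000,000 patients
# is treated as "medium".

# sizing_tier <patient_count>  -> small | medium | large
sizing_tier() {
  if [ "$1" -lt 100000 ]; then
    echo small
  elif [ "$1" -le 1000000 ]; then
    echo medium
  else
    echo large
  fi
}

# eks_nodes <tier>  -> node count and instance type from the sizing guidelines
eks_nodes() {
  case "$1" in
    small)  echo "3 x m5.xlarge" ;;
    medium) echo "6 x m5.2xlarge" ;;
    large)  echo "12 x m5.4xlarge" ;;
    *)      echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```

For example, `eks_nodes "$(sizing_tier 250000)"` returns the node specification for a medium deployment.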

High Availability Configuration

Multi-AZ Deployment

  • Application pods distributed across 3 AZs
  • RDS Aurora with Multi-AZ failover
  • S3 cross-region replication (optional)
  • Load balancer health checks and auto-recovery

Disaster Recovery

  • RTO (Recovery Time Objective): < 1 hour
  • RPO (Recovery Point Objective): < 15 minutes
  • Automated backup to separate AWS region
  • Documented runbooks for failover scenarios
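
The RPO target can be monitored by comparing the most recent restorable point with the current time. The following sketch assumes both timestamps are supplied as Unix epoch seconds; obtaining them (for example from the LatestRestorableTime field returned by `aws rds describe-db-clusters`) is left to the caller, and the function name is hypothetical.

```shell
#!/bin/sh
# Check whether the most recent restorable point satisfies the 15-minute RPO.
# Inputs are Unix epoch seconds supplied by the caller.

RPO_SECONDS=$(( 15 * 60 ))

# rpo_met <latest_restorable_epoch> <now_epoch>  -> exit status 0 if within RPO
rpo_met() {
  lag=$(( $2 - $1 ))
  # The target is "< 15 minutes", so the comparison is strict.
  [ "$lag" -lt "$RPO_SECONDS" ]
}
```

A scheduled job could call `rpo_met` and raise an alert whenever it returns a non-zero status.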

Security Best Practices

Network Security

  • Private subnets for all data processing
  • Network ACLs and security groups
  • VPC Flow Logs enabled
  • No direct internet access for compute nodes

Data Encryption

  • At Rest: AWS KMS encryption for RDS, S3, EBS volumes
  • In Transit: TLS 1.2+ for all communications
  • Application Level: Additional de-identification layer

Access Control

  • IAM roles with least privilege
  • MFA required for administrative access
  • AWS IAM Identity Center (formerly AWS SSO) integration
  • Audit logging via CloudTrail

Compliance

  • HIPAA-eligible services only
  • Business Associate Agreement (BAA) with AWS
  • Regular security assessments
  • Compliance reporting via AWS Config

Monitoring & Alerting

Key Metrics

  • EKS cluster health and resource utilization
  • RDS performance metrics (CPU, memory, IOPS)
  • Application response times
  • Data ingestion throughput
  • NLP processing latency
  • Failed job counts

Automated Alerts

  • Service degradation
  • Resource exhaustion
  • Security events
  • Failed backups
  • Unusual data access patterns
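
As one example of a resource-exhaustion alert, the sketch below creates a CloudWatch alarm on Aurora cluster CPU utilization with the AWS CLI. The 80% threshold, the three 5-minute evaluation periods, the alarm name prefix, and the function name are illustrative assumptions; the cluster identifier and SNS topic ARN must be supplied by the caller.

```shell
#!/bin/sh
# Create a CloudWatch alarm on Aurora cluster CPU utilization.
# Threshold, evaluation window, and naming are illustrative assumptions.

# create_rds_cpu_alarm <cluster_id> <sns_topic_arn>
create_rds_cpu_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "pji-$1-high-cpu" \
    --namespace AWS/RDS \
    --metric-name CPUUtilization \
    --dimensions "Name=DBClusterIdentifier,Value=$1" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}
```

The alarm fires when average CPU stays above the threshold for three consecutive periods and notifies the given SNS topic. Similar alarms can be defined for the other key metrics.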

Maintenance & Updates

Regular Maintenance Windows

  • Monthly: OS and security patches
  • Quarterly: Application updates
  • Annually: Major version upgrades

Zero-Downtime Updates

  • Rolling updates for application containers
  • Blue-green deployment for major changes
  • Database schema migrations with minimal downtime

Cost Optimization

Reserved Instances

  • 1-year or 3-year RDS Reserved Instances (30-60% savings)
  • EC2 Savings Plans for EKS nodes

S3 Lifecycle Policies

  • Move infrequently accessed data to S3 Glacier after 90 days
  • Archive completed jobs to S3 Glacier Deep Archive
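
The two lifecycle policies above could be written as an S3 lifecycle configuration along these lines. This is a sketch under stated assumptions: the `raw/` and `completed-jobs/` prefixes, the 30-day delay before Deep Archive, the bucket name, and the function name are placeholders that the guide does not specify.

```shell
#!/bin/sh
# Write an S3 lifecycle configuration for the two policies described above.
# Assumptions: the "raw/" and "completed-jobs/" prefixes and the 30-day
# Deep Archive delay are placeholders, not values from the guide.

# write_lifecycle_policy <output_file>
write_lifecycle_policy() {
  cat > "$1" <<'EOF'
{
  "Rules": [
    {
      "ID": "infrequent-to-glacier",
      "Status": "Enabled",
      "Filter": {"Prefix": "raw/"},
      "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}]
    },
    {
      "ID": "completed-jobs-to-deep-archive",
      "Status": "Enabled",
      "Filter": {"Prefix": "completed-jobs/"},
      "Transitions": [{"Days": 30, "StorageClass": "DEEP_ARCHIVE"}]
    }
  ]
}
EOF
}

# Apply with the AWS CLI (the bucket name is a placeholder):
#   write_lifecycle_policy lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration \
#     --bucket my-data-lake-bucket \
#     --lifecycle-configuration file://lifecycle.json
```

Using separate prefixes keeps the two rules from overlapping, so each object is governed by exactly one transition.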

Auto-Scaling

  • Scale EKS nodes based on workload demand
  • Schedule scale-down during off-peak hours

Support & Operations

Included Support

  • 24/7 infrastructure monitoring
  • Automated alerting and incident response
  • Quarterly health checks and optimization reviews
  • Access to John Snow Labs engineering team

Optional Services

  • Managed operations (fully managed option)
  • Custom integration development
  • Training and onboarding
  • Dedicated technical account manager

Next Steps

  1. Architecture Review: Schedule a call with John Snow Labs solutions architects
  2. AWS Account Preparation: Set up VPC, networking, and IAM permissions
  3. Deployment Planning: Define timeline and resource requirements
  4. Proof of Concept: Optional pilot deployment with sample data

Additional Resources