Deploying Patient Journey Intelligence on AWS
Patient Journey Intelligence can be deployed entirely within your AWS infrastructure, ensuring that your healthcare data never leaves your environment and remains under your complete control.
Architecture Overview
┌────────────────────────────────────────────────────────────────────┐
│  AWS Account                                                       │
│                                                                    │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │  VPC                                                         │  │
│  │                                                              │  │
│  │  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │  │
│  │  │ Internet       │  │ Application    │  │ Private        │  │  │
│  │  │ Gateway        │  │ Load Balancer  │  │ Subnets        │  │  │
│  │  └────────┬───────┘  └────────┬───────┘  └────────┬───────┘  │  │
│  │           │                   │                   │          │  │
│  │           ▼                   ▼                   ▼          │  │
│  │  ┌────────────────────────────────────────────────────────┐  │  │
│  │  │  EKS Cluster (Kubernetes)                              │  │  │
│  │  │                                                        │  │  │
│  │  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │  │  │
│  │  │  │ Web UI       │  │ API Server   │  │ NLP Pipeline │  │  │  │
│  │  │  │ (React)      │  │ (REST)       │  │ (Spark)      │  │  │  │
│  │  │  └──────────────┘  └──────────────┘  └──────────────┘  │  │  │
│  │  │                                                        │  │  │
│  │  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │  │  │
│  │  │  │ Ingestion    │  │ De-ID Engine │  │ Terminology  │  │  │  │
│  │  │  │ Workers      │  │              │  │ Server       │  │  │  │
│  │  │  └──────────────┘  └──────────────┘  └──────────────┘  │  │  │
│  │  └────────────────────────────────────────────────────────┘  │  │
│  │                                                              │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                                                    │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│  │ RDS Aurora       │  │ S3               │  │ ElastiCache      │  │
│  │ (PostgreSQL)     │  │ (Data Lake)      │  │ (Redis)          │  │
│  │                  │  │                  │  │                  │  │
│  │ - OMOP CDM       │  │ - Raw Files      │  │ - Session Mgmt   │  │
│  │ - Metadata       │  │ - Documents      │  │ - Query Cache    │  │
│  │ - Audit Logs     │  │ - DICOM Images   │  │                  │  │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘  │
│                                                                    │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│  │ Secrets Mgr      │  │ CloudWatch       │  │ AWS Backup       │  │
│  │                  │  │                  │  │                  │  │
│  │ - DB Creds       │  │ - Logs           │  │ - RDS Backups    │  │
│  │ - API Keys       │  │ - Metrics        │  │ - S3 Snapshots   │  │
│  │ - Certificates   │  │ - Alarms         │  │                  │  │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘  │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
Key Components
Compute Layer
Amazon EKS (Elastic Kubernetes Service)
- Orchestrates all application containers
- Auto-scaling for compute workloads
- High availability across multiple AZs
- Managed Kubernetes control plane
EC2 Instance Types
- Web/API Services: t3.xlarge or m5.xlarge
- NLP Processing: c5.4xlarge or c5.9xlarge (compute-optimized)
- Spark Jobs: r5.2xlarge or r5.4xlarge (memory-optimized)
Data Storage
Amazon RDS Aurora PostgreSQL
- OMOP CDM database (primary data store)
- Multi-AZ deployment for high availability
- Automated backups and point-in-time recovery
- Instance size: db.r5.2xlarge or larger
Amazon S3
- Raw clinical data files
- Unstructured documents and PDFs
- DICOM imaging metadata
- Archive and backup storage
- Lifecycle policies for cost optimization
Amazon ElastiCache (Redis)
- Session management
- Query result caching
- Real-time processing queues
Networking & Security
VPC Configuration
- Private subnets for application and data layers
- Public subnets for load balancers only
- NAT Gateway for outbound internet access
- VPC Endpoints for AWS services (no internet routing)
Security Groups
- Strict ingress/egress rules
- Least privilege access
- Application layer segmentation
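One way to spot-check the "strict ingress" and "least privilege" goals is to scan ingress rules for anything open to the whole internet. The sketch below assumes rules in the shape returned by the EC2 `DescribeSecurityGroups` API (`IpPermissions` entries). The `PUBLIC_PORTS` allowlist and the `overly_open_rules` helper are illustrative names, not part of the product, and IPv6 ranges are ignored for brevity.

```python
from ipaddress import ip_network

# Assumption for this sketch: only HTTPS may be reachable from anywhere.
PUBLIC_PORTS = {443}


def overly_open_rules(ip_permissions):
    """Return ingress rules that admit 0.0.0.0/0 on ports outside PUBLIC_PORTS."""
    findings = []
    for rule in ip_permissions:
        open_to_world = any(
            ip_network(r["CidrIp"], strict=False).prefixlen == 0
            for r in rule.get("IpRanges", [])
        )
        if not open_to_world:
            continue
        from_port, to_port = rule.get("FromPort"), rule.get("ToPort")
        # Rules without a port range (protocol "-1") cover every port.
        if from_port is None or to_port is None:
            findings.append(rule)
        elif not set(range(from_port, to_port + 1)) <= PUBLIC_PORTS:
            findings.append(rule)
    return findings


# Hypothetical sample rules: public HTTPS, public PostgreSQL, VPC-only PostgreSQL.
rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
     "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
]
flagged = overly_open_rules(rules)
print([rule["FromPort"] for rule in flagged])
```

In this sample only the publicly reachable database port is flagged; the HTTPS rule and the VPC-internal rule pass.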
AWS Secrets Manager
- Database credentials rotation
- API keys and certificates
- Encryption at rest
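Application code would typically read these credentials at runtime rather than from configuration files. The following is a minimal sketch, assuming the secret is stored as a JSON string and that the client passed in behaves like `boto3.client("secretsmanager")`. The secret name `example/db-credentials` and the `FakeSecretsClient` stand-in are hypothetical and exist only so the example runs without AWS access.

```python
import json


def load_db_credentials(secrets_client, secret_id):
    """Fetch a JSON secret and return it as a dict."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Stand-in for the real Secrets Manager client (illustration only)."""

    def get_secret_value(self, SecretId):
        payload = {"username": "app_user", "password": "example-only"}
        return {"SecretString": json.dumps(payload)}


creds = load_db_credentials(FakeSecretsClient(), "example/db-credentials")
print(creds["username"])
```

Passing the client in as an argument keeps the function testable and lets the caller decide which region and IAM role the real client uses.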
Monitoring & Operations
Amazon CloudWatch
- Application and infrastructure logs
- Custom metrics and dashboards
- Automated alarms and notifications
- Log retention policies
AWS Backup
- Automated RDS snapshots
- S3 versioning and replication
- Cross-region backup replication (optional)
Deployment Options
Option 1: Fully Managed by John Snow Labs
- John Snow Labs operates the infrastructure within your AWS account
- We handle updates, monitoring, and maintenance
- You retain full data ownership and control
- Access via AWS PrivateLink or VPN
Option 2: Self-Managed
- You operate the infrastructure
- John Snow Labs provides deployment automation (Terraform/CloudFormation)
- Technical support and runbooks included
- Update packages delivered as container images
Option 3: Hybrid
- Core services managed by John Snow Labs
- You manage networking and security controls
- Shared responsibility model
Prerequisites
AWS Account Requirements
- AWS Organization with appropriate service limits
- IAM permissions to create:
  - EKS clusters
  - RDS instances
  - S3 buckets
  - VPC and networking resources
  - IAM roles and policies
Networking Requirements
- Dedicated VPC (recommended /16 CIDR block)
- Minimum 3 Availability Zones
- Private subnets for workloads
- Public subnets for load balancers
- Internet Gateway for the public subnets and NAT Gateway for outbound traffic from the private subnets
- Optional: Direct Connect or VPN to on-premises systems
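To make the CIDR guidance concrete, the sketch below carves a /16 VPC block into one private and one public subnet per Availability Zone using Python's `ipaddress` module. The `10.0.0.0/16` range, the `/20` subnet size, and the AZ names are assumptions for illustration. The Terraform variables supplied by John Snow Labs may lay subnets out differently.

```python
from ipaddress import ip_network


def plan_subnets(vpc_cidr, az_names, new_prefix=20):
    """Assign one private and one public subnet per AZ from the VPC block."""
    blocks = ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for az in az_names:
        plan[az] = {"private": str(next(blocks)), "public": str(next(blocks))}
    return plan


# Hypothetical VPC range and AZ names.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])
for az, subnets in plan.items():
    print(az, subnets)
```

A /16 split into /20 blocks yields 16 subnets, so three AZs with two subnets each leave ten blocks spare for later growth.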
Compliance & Security
- AWS Config for compliance monitoring
- AWS CloudTrail for audit logging
- Encryption at rest (KMS)
- Encryption in transit (TLS 1.2+)
- HIPAA-eligible AWS services only
Installation Process
1. Infrastructure Provisioning (Weeks 1-2)
# Using Terraform (provided by John Snow Labs)
terraform init
terraform plan -var-file="production.tfvars"
terraform apply -var-file="production.tfvars"
Provisions:
- VPC and networking
- EKS cluster
- RDS Aurora cluster
- S3 buckets
- Security groups and IAM roles
2. Application Deployment (Weeks 3-4)
# Deploy using Helm charts
helm repo add jsl https://charts.johnsnowlabs.com
helm install patient-journey jsl/patient-journey-intelligence \
--namespace patient-journey-intelligence --create-namespace \
--values custom-values.yaml
Deploys:
- Application containers
- Ingestion workers
- NLP pipeline services
- Web UI and API server
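Kubernetes namespace names must be valid RFC 1123 DNS labels: lowercase letters, digits, and hyphens, at most 63 characters, starting and ending with a letter or digit. A display name such as "Patient Journey Intelligence" therefore cannot be passed to `--namespace` as written. The helpers below are an illustrative sketch (the function names are not part of any John Snow Labs tooling) for checking or deriving a usable name.

```python
import re

# RFC 1123 DNS label: lowercase alphanumerics and '-', alphanumeric at both ends.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_namespace(name):
    """Check a candidate namespace name against the DNS label rules."""
    return len(name) <= 63 and DNS_LABEL.fullmatch(name) is not None


def to_namespace(display_name):
    """Derive a candidate namespace name from a human-readable name."""
    slug = re.sub(r"[^a-z0-9]+", "-", display_name.lower()).strip("-")
    return slug[:63].rstrip("-")


candidate = to_namespace("Patient Journey Intelligence")
print(candidate, is_valid_namespace(candidate))
```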
3. Data Source Integration (Weeks 5-6)
- Configure source connections
- Set up ingestion schedules
- Test data extraction
- Validate de-identification
4. Validation & Go-Live (Weeks 7-8)
- User acceptance testing
- Performance validation
- Security audit
- Production cutover
Resource Sizing Guidelines
Small Deployment (< 100K patients)
| Component | Specification |
|---|---|
| EKS Nodes | 3 x m5.xlarge |
| RDS Aurora | db.r5.large |
| S3 Storage | 500 GB |
| ElastiCache | cache.t3.medium |
| Estimated Cost | $2,500 - $3,500/month |
Medium Deployment (100K - 1M patients)
| Component | Specification |
|---|---|
| EKS Nodes | 6 x m5.2xlarge |
| RDS Aurora | db.r5.2xlarge |
| S3 Storage | 5 TB |
| ElastiCache | cache.r5.large |
| Estimated Cost | $8,000 - $12,000/month |
Large Deployment (> 1M patients)
| Component | Specification |
|---|---|
| EKS Nodes | 12 x m5.4xlarge |
| RDS Aurora | db.r5.4xlarge (with read replicas) |
| S3 Storage | 50+ TB |
| ElastiCache | cache.r5.xlarge |
| Estimated Cost | $20,000 - $35,000/month |
Estimated costs exclude data transfer and backup storage charges.
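The three tiers can be expressed as a simple lookup, which is handy when scripting capacity planning. This is a sketch only: the specifications are copied from the tables above, while the `sizing_tier` function and the choice to treat exactly 100K and exactly 1M patients as "medium" are assumptions about how the boundaries should be read.

```python
SIZING = {
    "small": {"eks_nodes": "3 x m5.xlarge", "rds": "db.r5.large",
              "s3": "500 GB", "elasticache": "cache.t3.medium"},
    "medium": {"eks_nodes": "6 x m5.2xlarge", "rds": "db.r5.2xlarge",
               "s3": "5 TB", "elasticache": "cache.r5.large"},
    "large": {"eks_nodes": "12 x m5.4xlarge", "rds": "db.r5.4xlarge",
              "s3": "50+ TB", "elasticache": "cache.r5.xlarge"},
}


def sizing_tier(patient_count):
    """Map a patient count to the deployment tier from the sizing tables."""
    if patient_count < 100_000:
        return "small"
    if patient_count <= 1_000_000:
        return "medium"
    return "large"


tier = sizing_tier(250_000)
print(tier, SIZING[tier]["eks_nodes"])
```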
High Availability Configuration
Multi-AZ Deployment
- Application pods distributed across 3 AZs
- RDS Aurora with Multi-AZ failover
- S3 cross-region replication (optional)
- Load balancer health checks and auto-recovery
Disaster Recovery
- RTO (Recovery Time Objective): < 1 hour
- RPO (Recovery Point Objective): < 15 minutes
- Automated backup to separate AWS region
- Documented runbooks for failover scenarios
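The RTO and RPO targets can be verified after a failover drill by comparing timestamps. The sketch below assumes you record three moments during the drill: the latest recovery point, the time of the simulated failure, and the time service was restored. The function names and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(minutes=15)  # maximum tolerated data loss window
RTO = timedelta(hours=1)     # maximum tolerated time to restore service


def meets_rpo(latest_recovery_point, failure_time, rpo=RPO):
    """True if the data written since the last recovery point is within the RPO."""
    return failure_time - latest_recovery_point < rpo


def meets_rto(failure_time, service_restored, rto=RTO):
    """True if service came back within the RTO."""
    return service_restored - failure_time < rto


# Hypothetical drill: last recovery point 10 minutes before failure,
# service restored 45 minutes after it.
failure = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
rpo_ok = meets_rpo(failure - timedelta(minutes=10), failure)
rto_ok = meets_rto(failure, failure + timedelta(minutes=45))
print(rpo_ok, rto_ok)
```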
Security Best Practices
Network Security
- Private subnets for all data processing
- Network ACLs and security groups
- VPC Flow Logs enabled
- No direct internet access for compute nodes
Data Encryption
- At Rest: AWS KMS encryption for RDS, S3, EBS volumes
- In Transit: TLS 1.2+ for all communications
- Application Level: Additional de-identification layer
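For client code that talks to the platform's API, the TLS 1.2+ requirement can be enforced on the caller's side as well. The following is a minimal sketch using Python's standard `ssl` module. The `make_client_context` name is illustrative, and how the resulting context is wired into an HTTP client depends on the library you use.

```python
import ssl


def make_client_context():
    """Build a client-side TLS context that refuses anything below TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name)
```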
Access Control
- IAM roles with least privilege
- MFA required for administrative access
- AWS IAM Identity Center (formerly AWS SSO) integration
- Audit logging via CloudTrail
Compliance
- HIPAA-eligible services only
- Business Associate Agreement (BAA) with AWS
- Regular security assessments
- Compliance reporting via AWS Config
Monitoring & Alerting
Key Metrics
- EKS cluster health and resource utilization
- RDS performance metrics (CPU, memory, IOPS)
- Application response times
- Data ingestion throughput
- NLP processing latency
- Failed job counts
Automated Alerts
- Service degradation
- Resource exhaustion
- Security events
- Failed backups
- Unusual data access patterns
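As an illustration of a "resource exhaustion" alert, the sketch below builds the parameters for a CloudWatch alarm on Aurora CPU utilization. The keys match the `PutMetricAlarm` API. The 80% threshold, the three 5-minute evaluation periods, the cluster identifier, and the SNS topic ARN are placeholder assumptions, not recommended values from John Snow Labs.

```python
def rds_cpu_alarm(cluster_id, sns_topic_arn, threshold_percent=80.0):
    """Build PutMetricAlarm parameters for sustained high Aurora CPU."""
    return {
        "AlarmName": f"{cluster_id}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per datapoint
        "EvaluationPeriods": 3,  # alarm after 15 minutes above the threshold
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical identifiers for illustration only.
params = rds_cpu_alarm(
    "example-aurora-cluster",
    "arn:aws:sns:us-east-1:123456789012:example-alerts",
)
print(params["AlarmName"])
# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```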
Maintenance & Updates
Regular Maintenance Windows
- Monthly: OS and security patches
- Quarterly: Application updates
- Annually: Major version upgrades
Zero-Downtime Updates
- Rolling updates for application containers
- Blue-green deployment for major changes
- Database schema migrations with minimal downtime
Cost Optimization
Reserved Instances
- 1-year or 3-year RDS Reserved Instances (30-60% savings)
- EC2 Savings Plans for EKS nodes
S3 Lifecycle Policies
- Move infrequently accessed data to S3 Glacier after 90 days
- Archive completed jobs to S3 Glacier Deep Archive
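These two policies can be written as an S3 lifecycle configuration. The sketch below builds the document in the shape accepted by the `PutBucketLifecycleConfiguration` API. The `raw/` and `completed-jobs/` prefixes, the rule IDs, and the 30-day delay before Deep Archive are assumptions, since the guidance above specifies only the 90-day Glacier transition.

```python
import json


def lifecycle_configuration(raw_prefix="raw/", archive_prefix="completed-jobs/",
                            archive_after_days=30):
    """Build an S3 lifecycle configuration with two transition rules."""
    return {
        "Rules": [
            {
                "ID": "raw-data-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": raw_prefix},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
            {
                "ID": "completed-jobs-to-deep-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": archive_prefix},
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"}
                ],
            },
        ]
    }


config = lifecycle_configuration()
print(json.dumps(config, indent=2))
```

Keeping the two rules on separate prefixes avoids overlapping transitions on the same objects.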
Auto-Scaling
- Scale EKS nodes based on workload demand
- Schedule scale-down during off-peak hours
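A scheduled scale-down amounts to choosing a node count from the time of day. The sketch below shows that decision in isolation. The 07:00 to 19:00 peak window and the off-peak count of 3 are hypothetical, and in practice the result would feed whatever mechanism sets the node group's desired capacity.

```python
from datetime import time


def desired_node_count(now, peak_nodes, off_peak_nodes,
                       peak_start=time(7, 0), peak_end=time(19, 0)):
    """Return the node count for the given local time of day."""
    in_peak_window = peak_start <= now < peak_end
    return peak_nodes if in_peak_window else off_peak_nodes


# Medium deployment (6 nodes at peak), with an assumed off-peak floor of 3.
midday = desired_node_count(time(12, 0), peak_nodes=6, off_peak_nodes=3)
night = desired_node_count(time(23, 0), peak_nodes=6, off_peak_nodes=3)
print(midday, night)
```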
Support & Operations
Included Support
- 24/7 infrastructure monitoring
- Automated alerting and incident response
- Quarterly health checks and optimization reviews
- Access to John Snow Labs engineering team
Optional Services
- Managed operations (fully managed option)
- Custom integration development
- Training and onboarding
- Dedicated technical account manager
Next Steps
- Architecture Review: Schedule a call with John Snow Labs solutions architects
- AWS Account Preparation: Set up VPC, networking, and IAM permissions
- Deployment Planning: Define timeline and resource requirements
- Proof of Concept: Optional pilot deployment with sample data