On-Premise Deployment

Patient Journey Intelligence can be deployed entirely on your on-premise infrastructure for organizations with strict data residency requirements or existing datacenter investments.

Architecture Overview

┌──────────────────────────────────────────────────────────────────┐
│ On-Premise Datacenter │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Kubernetes Cluster (On-Prem) │ │
│ │ │ │
│ │ ┌─────────┐ ┌──────────┐ ┌──────────┐ ┌────────────────┐ │ │
│ │ │ Master │ │ Master │ │ Master │ │ Worker Nodes │ │ │
│ │ │ Node 1 │ │ Node 2 │ │ Node 3 │ │ (10-50+) │ │ │
│ │ └─────────┘ └──────────┘ └──────────┘ └────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ Application Workloads │ │ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌───────────────────┐ │ │ │
│ │ │ │ Web UI │ │API Server│ │ NLP Pipeline │ │ │ │
│ │ │ └──────────┘ └──────────┘ └───────────────────┘ │ │ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌───────────────────┐ │ │ │
│ │ │ │Ingestion │ │ De-ID │ │ Terminology Svc │ │ │ │
│ │ │ └──────────┘ └──────────┘ └───────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ PostgreSQL │ │ NFS/SAN │ │ Redis Cluster │ │
│ │ HA Cluster │ │ Storage │ │ │ │
│ │ │ │ │ │ - Caching │ │
│ │ - OMOP CDM │ │ - Documents │ │ - Session Management │ │
│ │ - Metadata │ │ - Files │ │ │ │
│ │ - 3 nodes │ │ - Backups │ │ (3 node cluster) │ │
│ └──────────────┘ └──────────────┘ └────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Load │ │ Monitoring │ │ Backup & Recovery │ │
│ │ Balancer │ │ │ │ │ │
│ │ (HAProxy/ │ │ - Prometheus │ │ - Velero (K8s) │ │
│ │ Nginx) │ │ - Grafana │ │ - pgBackRest (DB) │ │
│ │ │ │ - ELK Stack │ │ - File-level backups │ │
│ └──────────────┘ └──────────────┘ └────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────┘

Infrastructure Requirements

Compute

Kubernetes Master Nodes (3 required for HA):

  • CPU: 8 cores per node
  • RAM: 32 GB per node
  • Storage: 200 GB SSD per node
  • OS: Ubuntu 22.04 LTS, RHEL 8/9, or Rocky Linux 9

Kubernetes Worker Nodes (minimum 3, recommended 10+):

  • CPU: 16-32 cores per node
  • RAM: 64-128 GB per node
  • Storage: 500 GB SSD per node (for container images and local volumes)

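Example Total (medium deployment, 3 masters + 15 workers):

  • 18 servers
  • 384 CPU cores (8 per master, 24 per worker, the midpoint of the 16-32 range)
  • 1.5 TB RAM (32 GB per master, 96 GB per worker, the midpoint of the 64-128 GB range)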
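
The totals above can be reproduced with simple arithmetic. The sketch below assumes worker nodes at the midpoint of the stated ranges (24 cores, 96 GB), which is the only combination that yields 384 cores and 1.5 TB for 3 masters and 15 workers; the function name `cluster_total` is illustrative.

```shell
#!/bin/sh
# Sum one resource across master and worker nodes.
# Usage: cluster_total <masters> <per_master> <workers> <per_worker>
cluster_total() {
  echo $(( $1 * $2 + $3 * $4 ))
}

# Assumption: 15 workers at the midpoint of the 16-32 core / 64-128 GB ranges.
cores=$(cluster_total 3 8 15 24)     # CPU cores
ram_gb=$(cluster_total 3 32 15 96)   # RAM in GB (1536 GB = 1.5 TB)
echo "cores=${cores} ram_gb=${ram_gb}"
```

Substituting the per-node figures from the sizing tables gives the totals for other deployment sizes.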

Storage

Shared Storage (NFS, CephFS, GlusterFS, or enterprise SAN):

  • Capacity: 10 TB - 100 TB (depends on data volume)
  • IOPS: 10,000+ for database workloads
  • Latency: < 5ms
  • Redundancy: RAID 10 or equivalent

Database Storage:

  • Dedicated SSD/NVMe storage for PostgreSQL
  • 5 TB - 50 TB depending on patient volume
  • 20,000+ IOPS

Networking

  • Internal Network: 10 Gbps between nodes
  • Load Balancer: HAProxy or hardware load balancer
  • Firewall: Access control for internal/external traffic
  • VPN/Bastion: Secure remote access

Kubernetes Distribution

Supported distributions:

  • Vanilla Kubernetes (kubeadm)
  • Red Hat OpenShift
  • Rancher Kubernetes Engine (RKE)
  • VMware Tanzu
  • SUSE Rancher

Software Stack

Core Components

Component | Technology | Purpose
Container Orchestration | Kubernetes 1.26+ | Manage application containers
Database | PostgreSQL 14+ (HA cluster) | OMOP CDM storage
Caching | Redis 7+ (cluster mode) | Session and query caching
Storage | NFS/CephFS/GlusterFS | Shared file storage
Load Balancer | HAProxy/Nginx | Traffic distribution
Monitoring | Prometheus + Grafana | Metrics and dashboards
Logging | ELK Stack or Loki | Centralized logging
Backup | Velero + pgBackRest | Disaster recovery

Container Registry

Options:

  • Harbor (recommended for on-prem)
  • JFrog Artifactory
  • Nexus Repository
  • Docker Trusted Registry

Installation Process

1. Infrastructure Provisioning (Week 1-2)

Physical/Virtual Server Setup:

  • Provision servers according to specifications
  • Configure networking and storage
  • Install base OS

Kubernetes Cluster Deployment:

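# Using kubeadm (example)
# On the first control-plane node:
kubeadm init --control-plane-endpoint="lb.example.com:6443" --upload-certs
# On each worker node, using the token and CA hash printed by kubeadm init:
kubeadm join <master-endpoint> --token <token> --discovery-token-ca-cert-hash sha256:<hash>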

Persistent Storage:

# Install NFS provisioner or Rook-Ceph
kubectl apply -f nfs-provisioner.yaml

2. Database Cluster Setup (Week 2)

PostgreSQL HA Cluster:

  • Deploy Patroni or Stolon for PostgreSQL HA
  • Configure streaming replication
  • Set up automated backups

# Example: Deploy PostgreSQL with Helm
helm install postgresql bitnami/postgresql-ha \
  --set postgresql.replicaCount=3 \
  --set persistence.size=1Ti

3. Application Deployment (Week 3-4)

Load Patient Journey Intelligence container images from John Snow Labs:

# Pull images to private registry
docker pull jsl.ocir.io/patient-journey:5.2.0
docker tag jsl.ocir.io/patient-journey:5.2.0 registry.local/patient-journey:5.2.0
docker push registry.local/patient-journey:5.2.0
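
When several images have to be mirrored, the pull/tag/push sequence above can be wrapped in a small helper. This is a minimal sketch: the function names `run` and `mirror_image` and the `DRY_RUN` switch are illustrative, and the registry hosts default to the ones used in the example above.

```shell
#!/bin/sh
# Mirror one image from the source registry to the private registry.
# Set DRY_RUN=1 to print the docker commands instead of running them.
SRC_REGISTRY="${SRC_REGISTRY:-jsl.ocir.io}"
DST_REGISTRY="${DST_REGISTRY:-registry.local}"

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: mirror_image <name:tag>, e.g. mirror_image patient-journey:5.2.0
mirror_image() {
  image="$1"
  run docker pull "${SRC_REGISTRY}/${image}" &&
  run docker tag "${SRC_REGISTRY}/${image}" "${DST_REGISTRY}/${image}" &&
  run docker push "${DST_REGISTRY}/${image}"
}
```

Running with `DRY_RUN=1` first makes it easy to review the exact commands before anything is pushed to the private registry.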

Deploy with Helm:

# Namespace names must be lowercase with no spaces; the name below is illustrative
helm install patient-journey jsl/patient-journey-intelligence \
  --namespace patient-journey-intelligence \
  --values on-prem-values.yaml

4. Data Integration (Week 5-6)

Configure connections to on-premise EHR and clinical systems.

5. Testing & Go-Live (Week 7-8)

User acceptance testing and production cutover.

High Availability Configuration

Database HA

  • Primary-Replica Setup: 1 primary + 2 replicas
  • Automatic Failover: Patroni or Stolon
  • Backup Strategy: Daily full + continuous WAL archiving

Application HA

  • Multiple replicas for each service
  • Pod anti-affinity to spread across nodes
  • Readiness and liveness probes
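
The three application HA measures above map onto standard Kubernetes Deployment fields. The sketch below prints an example manifest; the workload name `api-server`, port 8080, and the `/ready` and `/healthz` paths are hypothetical placeholders, not values taken from the Patient Journey Intelligence chart.

```shell
#!/bin/sh
# Print a Deployment with three replicas, pod anti-affinity, and probes.
# Assumptions: name, port, and probe paths are hypothetical; substitute the
# values used by the actual chart.
render_ha_deployment() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      affinity:
        podAntiAffinity:
          # Never schedule two replicas on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: api-server
              topologyKey: kubernetes.io/hostname
      containers:
        - name: api-server
          image: registry.local/patient-journey:5.2.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
EOF
}

# Usage: render_ha_deployment | kubectl apply -n "$NAMESPACE" -f -
```

A required anti-affinity rule leaves replicas unscheduled when there are fewer eligible nodes than replicas; `preferredDuringSchedulingIgnoredDuringExecution` is the softer alternative for small clusters.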

Storage HA

  • RAID 10 or distributed storage (Ceph)
  • Snapshots and replication

Resource Sizing Examples

Small Deployment (< 100K patients)

Resource | Specification
Masters | 3 x (8 cores, 32 GB RAM)
Workers | 6 x (16 cores, 64 GB RAM)
PostgreSQL | 3 x (16 cores, 128 GB RAM, 2 TB SSD)
Shared Storage | 10 TB NFS
Network | 10 Gbps

Medium Deployment (100K - 1M patients)

Resource | Specification
Masters | 3 x (16 cores, 64 GB RAM)
Workers | 15 x (32 cores, 128 GB RAM)
PostgreSQL | 3 x (32 cores, 256 GB RAM, 10 TB SSD)
Shared Storage | 50 TB NFS/SAN
Network | 25 Gbps

Large Deployment (> 1M patients)

Resource | Specification
Masters | 3 x (16 cores, 64 GB RAM)
Workers | 30+ x (32 cores, 128 GB RAM)
PostgreSQL | 3 x (64 cores, 512 GB RAM, 50 TB SSD)
Shared Storage | 200 TB SAN
Network | 40 Gbps
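
The patient-count thresholds in the three sizing examples can be expressed as a small lookup, which is handy in provisioning scripts. In this sketch the function name `sizing_tier` is illustrative, and the boundary handling (exactly 100K and exactly 1M both count as medium) is an assumption read from the "<" and ">" signs in the headings.

```shell
#!/bin/sh
# Map a patient count to the sizing tier described above.
# Usage: sizing_tier <patient_count>
sizing_tier() {
  patients="$1"
  if [ "$patients" -lt 100000 ]; then
    echo small      # < 100K patients
  elif [ "$patients" -le 1000000 ]; then
    echo medium     # 100K - 1M patients
  else
    echo large      # > 1M patients
  fi
}

sizing_tier 250000
```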

Security

Network Security

  • Firewall rules (iptables/firewalld)
  • Network segmentation (VLANs)
  • TLS 1.2+ for all communications

Access Control

  • LDAP/Active Directory integration
  • RBAC for Kubernetes
  • Database role-based access
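
As one illustration of Kubernetes RBAC tied to directory groups, the sketch below prints a RoleBinding that grants a group the built-in read-only `view` ClusterRole within a single namespace. It assumes the cluster's authentication layer already maps LDAP/Active Directory groups to Kubernetes group names; the group `pji-analysts` and the namespace are hypothetical.

```shell
#!/bin/sh
# Print a RoleBinding granting a group read-only access to one namespace.
# Usage: render_view_binding <group> <namespace>
# Assumption: the group name is supplied by the cluster's authenticator.
render_view_binding() {
  group="$1"
  namespace="$2"
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ${group}-view
  namespace: ${namespace}
subjects:
  - kind: Group
    name: ${group}
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
EOF
}

# Usage: render_view_binding pji-analysts patient-journey-intelligence | kubectl apply -f -
```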

Data Protection

  • Encryption at rest (LUKS, dm-crypt)
  • Encrypted backups
  • De-identification for secondary use

Compliance

  • Audit logging
  • Access tracking
  • Regular security assessments

Monitoring & Operations

Metrics

  • Kubernetes cluster health (Prometheus)
  • Node resource utilization
  • Application performance (APM)
  • Database metrics

Dashboards

  • Grafana for visualization
  • Pre-built Patient Journey Intelligence dashboards

Alerting

  • PagerDuty/Opsgenie integration
  • Email/SMS notifications
  • Escalation policies
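
Alerts of this kind are typically defined as Prometheus alerting rules and routed to PagerDuty, Opsgenie, or email by Alertmanager. The sketch below prints one example rule; it assumes kube-state-metrics is installed (it provides the `kube_node_status_condition` metric), and the rule name, 5-minute window, and `severity` label are illustrative choices.

```shell
#!/bin/sh
# Print a Prometheus alerting rule that fires when a node stays NotReady.
# Assumption: kube-state-metrics is deployed and scraped by Prometheus.
render_alert_rule() {
  cat <<'EOF'
groups:
  - name: cluster-health
    rules:
      - alert: KubeNodeNotReady
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} has been NotReady for 5 minutes"
EOF
}

# Usage: render_alert_rule > node-alerts.yaml
```

Escalation policies themselves live in the paging tool; the `severity` label is what Alertmanager routing would typically match on.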

Backup & Disaster Recovery

Backup Strategy

  • Database: Daily full + hourly incrementals
  • File Storage: Daily snapshots
  • Kubernetes State: Velero backups
  • Retention: 30 days online, 1 year archive
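
The database schedule above could be implemented with cron and pgBackRest roughly as follows. This is a sketch under assumptions: the stanza name `main`, the 02:00 full-backup time, and the function name `backup_crontab` are illustrative, and retention would still need to be configured separately in pgBackRest.

```shell
#!/bin/sh
# Print crontab entries for a daily full backup plus hourly incrementals.
# Usage: backup_crontab [stanza]
backup_crontab() {
  stanza="${1:-main}"
  cat <<EOF
# Daily full backup at 02:00
0 2 * * * pgbackrest --stanza=${stanza} --type=full backup
# Incremental backup at the top of every other hour
0 0-1,3-23 * * * pgbackrest --stanza=${stanza} --type=incr backup
EOF
}

# Kubernetes state with 30-day retention (720 hours), for comparison:
#   velero schedule create daily-cluster --schedule="0 3 * * *" --ttl 720h0m0s
```

Review the output before installing it in the `postgres` user's crontab, since piping it to `crontab -` replaces any existing entries.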

Recovery Procedures

  • RTO: < 4 hours
  • RPO: < 1 hour
  • DR testing: quarterly
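
An RPO target is only meaningful if backup freshness is checked continuously. The sketch below tests whether the newest backup falls inside the RPO window; the function name `within_rpo` is illustrative, and the 3600-second default reflects the "< 1 hour" target above.

```shell
#!/bin/sh
# Succeed if the newest backup is younger than the RPO window.
# Usage: within_rpo <last_backup_epoch> <now_epoch> [rpo_seconds]
within_rpo() {
  age=$(( $2 - $1 ))
  [ "$age" -lt "${3:-3600}" ]
}

# Example with fixed timestamps: a backup taken 30 minutes ago meets the RPO.
if within_rpo 1700000000 1700001800; then
  echo "RPO met"
else
  echo "RPO violated"
fi
```

In a monitoring job, the first argument would come from the timestamp of the latest completed backup and the second from `date +%s`.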

Advantages of On-Premise

  • Data Sovereignty: Complete control over data location
  • Network Performance: Low latency to on-prem EHR systems
  • Compliance: Meet strict regulatory requirements
  • Integration: Direct access to internal systems
  • Cost Predictability: No cloud usage spikes

Challenges & Considerations

  • Capital Investment: Upfront hardware costs
  • Operational Overhead: Requires dedicated IT staff
  • Scalability: Limited by physical infrastructure
  • Disaster Recovery: Requires second datacenter

Support Model

John Snow Labs provides:

  • Deployment automation scripts
  • Container images and updates
  • Technical support (available 24/7)
  • Runbooks and operational guides
  • Quarterly health checks

Next Steps

  1. Infrastructure Assessment: Review current datacenter capacity
  2. Architecture Planning: Design deployment with John Snow Labs
  3. Procurement: Order necessary hardware
  4. Pilot Deployment: Test with sample data

Additional Resources