Database Configuration

Database Configuration is where you define and manage the data storage systems that power PJI. This module controls how the platform connects to databases, stores clinical and operational data, and maintains performance under load.

Who is this for? Database administrators, platform administrators, and technical teams responsible for data infrastructure, storage configuration, and database performance management.


Configure OMOP and Metadata Databases

PJI stores different types of data in database systems that you configure and manage. The Database Configuration module handles:

  • OMOP CDM Database: Stores clinical data in the Observational Medical Outcomes Partnership (OMOP) Common Data Model format, the standardized schema for healthcare analytics
  • Metadata Database: Contains system metadata, user information, cohort definitions, and operational configurations
  • Connection Pooling: Reuses a bounded set of database connections so that many concurrent users can be served without overwhelming the database
  • Performance Tuning: Optimizes database settings to ensure fast query responses and efficient data operations

Understanding these components helps you maintain a stable, performant data infrastructure for your PJI deployment.


Supported Database Types

PJI supports multiple database platforms, giving you flexibility to choose what works best for your infrastructure and cloud environment. Each option offers different advantages for performance, scalability, and integration.

PostgreSQL

What it is: The most widely recommended open-source relational database for PJI deployments.

Why choose it:

  • Industry-standard database with excellent support for healthcare analytics
  • Strong compatibility with OMOP CDM standards and tooling
  • Mature ecosystem with extensive documentation and community support
  • Can be deployed on-premises or in any cloud environment
  • Cost-effective for organizations of all sizes

Best for: Organizations seeking maximum flexibility, strong community support, and proven reliability for clinical data workloads.


Amazon RDS Aurora

What it is: Amazon Web Services' cloud-native, PostgreSQL-compatible database service.

Why choose it:

  • Built specifically for cloud scalability and high availability
  • Automatic failover and backup capabilities built in
  • Excellent performance for read-heavy analytics workloads
  • Seamless integration with other AWS services
  • Managed service reduces operational overhead

Best for: Organizations already using AWS infrastructure who want a fully managed database with automatic scaling and high availability.


Azure Database for PostgreSQL

What it is: Microsoft Azure's managed PostgreSQL database service.

Why choose it:

  • Native integration with Azure cloud services and security features
  • Managed service with automatic patching and backups
  • Built-in high availability and disaster recovery options
  • Simplified compliance with healthcare regulations in an Azure environment
  • Flexible compute and storage scaling

Best for: Organizations with existing Azure investments who want tight integration with Microsoft's healthcare cloud ecosystem.


Snowflake

What it is: A cloud-based data warehousing platform optimized for analytics and large-scale data operations.

Why choose it:

  • Exceptional performance for complex analytical queries across massive datasets
  • Separation of compute and storage allows independent scaling
  • Advanced data sharing capabilities for multi-institutional research
  • Near-zero maintenance with fully managed architecture
  • Built-in support for semi-structured data

Best for: Large healthcare systems or research organizations with significant data volumes and complex analytical requirements.


Database Configuration Settings

Configure how PJI connects to and interacts with your database systems. These settings ensure secure access, proper data organization, and reliable disaster recovery.

Connection Strings

What they are: Connection strings define how PJI locates and connects to your database servers.

What you configure:

  • Hostname or endpoint: The server address where your database is running
  • Port number: The network port for database connections (typically 5432 for PostgreSQL)
  • Database name: The specific database instance to use
  • SSL/TLS settings: Encryption requirements for data in transit
  • Timeout values: How long to wait before declaring a connection failed

Example structure: postgresql://hostname:5432/database_name?sslmode=require

Important: Always use encrypted connections (SSL/TLS) when connecting to databases containing protected health information (PHI) to maintain HIPAA compliance.
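
As an illustration of the settings above, the following Python sketch assembles a connection string of that shape and splits it back into its parts. The host and database names are placeholders, and the function names are hypothetical rather than part of PJI. The sketch defaults to sslmode=verify-full, which verifies the server certificate and hostname in addition to encrypting traffic, whereas sslmode=require only encrypts; connect_timeout is the standard PostgreSQL client option, in seconds.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Modes that refuse to fall back to an unencrypted connection.
ENCRYPTED_SSL_MODES = {"require", "verify-ca", "verify-full"}


def build_postgres_url(host, database, port=5432, sslmode="verify-full",
                       connect_timeout=10):
    """Assemble a PostgreSQL connection URL without embedding credentials."""
    if sslmode not in ENCRYPTED_SSL_MODES:
        raise ValueError(f"sslmode {sslmode!r} does not enforce encryption")
    query = urlencode({"sslmode": sslmode, "connect_timeout": connect_timeout})
    return f"postgresql://{host}:{port}/{quote(database, safe='')}?{query}"


def describe(url):
    """Split a connection URL back into the settings listed above."""
    parts = urlsplit(url)
    options = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "sslmode": options.get("sslmode", ["(not set)"])[0],
        "connect_timeout": options.get("connect_timeout", ["(not set)"])[0],
    }


if __name__ == "__main__":
    # Placeholder host and database names, for illustration only.
    url = build_postgres_url("db.example.internal", "pji_omop")
    print(url)
    print(describe(url))
```

Credentials are deliberately left out of the URL so that the string can be logged or documented without exposing secrets.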


Credentials Management

What it controls: Secure storage and handling of database authentication credentials.

Configuration options:

Username and Password Authentication:

  • Store database credentials securely within PJI's encrypted configuration
  • Use read-only credentials for analytics queries when possible
  • Separate credentials for different access levels (read vs. write)

Advanced Authentication Methods:

  • IAM-based authentication: Use cloud provider identity services (AWS IAM, or Microsoft Entra ID, formerly Azure AD) instead of passwords
  • Certificate-based authentication: Authenticate using SSL certificates for enhanced security
  • Secret management integration: Connect to external secret stores (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault)

Best practice: Never store database credentials in plain text. Use your organization's secret management system or PJI's encrypted credential storage. Rotate credentials regularly according to your security policies.
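
One way to keep credentials out of plain-text configuration files is to have a secret manager inject them into the process environment at startup. The Python sketch below assumes that approach. The environment variable names (PJI_DB_READ_USER and so on) and the function name are hypothetical, not settings that PJI defines.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbCredentials:
    username: str
    password: str

    def __repr__(self):
        # Keep the password out of logs and tracebacks.
        return f"DbCredentials(username={self.username!r}, password='***')"


def load_credentials(role, env=os.environ):
    """Read credentials for one access level ('read' or 'write').

    The variable names are hypothetical; use whatever names your
    secret manager injects into the environment.
    """
    if role not in {"read", "write"}:
        raise ValueError(f"unknown role: {role!r}")
    prefix = f"PJI_DB_{role.upper()}"
    try:
        return DbCredentials(env[f"{prefix}_USER"], env[f"{prefix}_PASSWORD"])
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable {missing}") from None


if __name__ == "__main__":
    demo_env = {"PJI_DB_READ_USER": "analyst_ro", "PJI_DB_READ_PASSWORD": "example"}
    print(load_credentials("read", demo_env))
```

Loading read and write credentials separately mirrors the recommendation to use read-only accounts for analytics queries.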


Schema Configuration

What it defines: The organization of data within your database, including which schemas and tables PJI should use.

OMOP CDM Schema Settings:

  • Schema name: The database schema containing your OMOP CDM tables (e.g., "cdm_data")
  • Table prefix: Any prefix applied to OMOP table names (usually none for standard implementations)
  • CDM version: Which version of the OMOP CDM standard your data follows (e.g., 5.3, 5.4)

Metadata Schema Settings:

  • Schema name: Where PJI stores its operational data (user accounts, cohort definitions, etc.)
  • Initialization options: Whether PJI should automatically create schema structures or expect them to exist

Why this matters: Correct schema configuration ensures PJI can find your clinical data and operate properly. A misconfigured schema name or prefix is a common cause of errors such as "Schema not found", which are easily mistaken for connection issues.
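
The schema settings above can be checked before they are ever sent to the database. The following Python sketch is a minimal illustration: the class and field names are hypothetical, and the list of accepted CDM versions simply repeats the two versions mentioned above rather than an authoritative list of what PJI supports.

```python
import re
from dataclasses import dataclass

# Plain unquoted SQL identifiers only: letters, digits, underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Assumption: only the versions named in this section are accepted.
ACCEPTED_CDM_VERSIONS = {"5.3", "5.4"}


@dataclass(frozen=True)
class OmopSchemaConfig:
    schema_name: str
    table_prefix: str = ""
    cdm_version: str = "5.4"

    def __post_init__(self):
        if not _IDENTIFIER.match(self.schema_name):
            raise ValueError(f"invalid schema name: {self.schema_name!r}")
        if self.table_prefix and not _IDENTIFIER.match(self.table_prefix):
            raise ValueError(f"invalid table prefix: {self.table_prefix!r}")
        if self.cdm_version not in ACCEPTED_CDM_VERSIONS:
            raise ValueError(f"unexpected CDM version: {self.cdm_version!r}")

    def qualified(self, table):
        """Return the schema-qualified name for an OMOP table."""
        if not _IDENTIFIER.match(table):
            raise ValueError(f"invalid table name: {table!r}")
        return f"{self.schema_name}.{self.table_prefix}{table}"


if __name__ == "__main__":
    config = OmopSchemaConfig(schema_name="cdm_data", cdm_version="5.4")
    print(config.qualified("person"))
    print(config.qualified("observation_period"))
```

Rejecting anything that is not a plain identifier catches typos early and avoids building SQL from unvalidated names.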


Backup Settings

What they control: How your database data is protected against loss or corruption.

Configuration areas:

Backup Schedule:

  • Frequency: How often backups run (daily, hourly, continuous)
  • Retention period: How long to keep backup copies before deletion
  • Backup windows: When backups run to minimize impact on system performance

Backup Types:

  • Full backups: Complete copy of all database data (slower, comprehensive)
  • Incremental backups: Only the changes since the last backup (faster, requires a full backup as baseline)
  • Transaction log backups: Continuous backup of database changes for point-in-time recovery

Backup Storage:

  • Storage location: Where backup files are stored (separate from primary database)
  • Encryption: Whether backups are encrypted at rest
  • Geographic redundancy: Storing backups in multiple physical locations for disaster recovery

Recovery Testing:

  • Regularly test that backups can be restored successfully
  • Document recovery procedures for different failure scenarios
  • Establish Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets

Critical: For HIPAA compliance, ensure backups are encrypted, stored securely, and tested regularly. Never store backups on the same infrastructure as your primary database.
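
To make the relationship between backup frequency, retention period, and RPO concrete, here is a small Python sketch. It assumes scheduled backups without transaction log backups, in which case the worst-case data loss is roughly one backup interval. The function names are illustrative only.

```python
from datetime import datetime, timedelta


def expired_backups(backup_times, retention, now):
    """Return the backups older than the retention period, oldest first."""
    cutoff = now - retention
    return sorted(t for t in backup_times if t < cutoff)


def meets_rpo(backup_interval, rpo):
    """Check a schedule against a Recovery Point Objective.

    Assumes no transaction log backups, so up to one full interval
    of changes can be lost if the database fails just before a backup.
    """
    return backup_interval <= rpo


if __name__ == "__main__":
    now = datetime(2024, 1, 10)
    backups = [datetime(2024, 1, day) for day in (1, 2, 5, 9)]
    for stale in expired_backups(backups, timedelta(days=7), now):
        print("eligible for deletion:", stale.date())
    print("daily backups meet a 4-hour RPO:",
          meets_rpo(timedelta(days=1), timedelta(hours=4)))
```

If the schedule does not meet the RPO, the options are to back up more often or to add transaction log backups for point-in-time recovery.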


Best Practices for Database Configuration

Use Managed Database Services When Possible: Cloud-managed databases (such as Amazon RDS and Azure Database for PostgreSQL) handle patching, backups, and high availability automatically, reducing operational burden.

Implement Connection Pooling: Configure appropriate pool sizes to handle concurrent users efficiently without exhausting database connections or creating unnecessary overhead.

Monitor Database Performance Continuously: Track query performance, connection counts, and resource utilization to identify issues before they impact users.

Separate Read and Write Operations: Use read replicas for analytics queries to avoid impacting operational database performance.

Test Configuration Changes in Non-Production: Always validate database configuration changes in a test environment before applying them to production systems.

Document Your Configuration: Maintain clear documentation of connection strings, schema names, and backup procedures for disaster recovery scenarios.

Conduct Regular Security Audits: Review database access credentials, connection encryption, and audit logs quarterly to maintain your security posture.
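
To show what the connection pooling practice above means in code, the following Python sketch implements a minimal bounded pool. It is a teaching example, not PJI's pooling implementation: the class name is hypothetical, and sqlite3 stands in for the real database driver so that the example runs without a server.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A bounded pool: exactly `size` connections are opened and reused."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Wait for a free connection; raises queue.Empty if none
        # becomes available within `timeout` seconds.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even after an error.
            self._idle.put(conn)

    def close(self):
        """Close every idle connection."""
        while not self._idle.empty():
            self._idle.get_nowait().close()


def make_demo_pool(size=2):
    # Assumption: sqlite3 is only a stand-in for the real driver.
    return ConnectionPool(
        lambda: sqlite3.connect(":memory:", check_same_thread=False), size)


if __name__ == "__main__":
    pool = make_demo_pool(size=2)
    with pool.connection() as conn:
        print(conn.execute("SELECT 1").fetchone())
    pool.close()
```

The pool size caps how many connections the database ever sees. Callers beyond that limit wait for a free connection instead of opening new ones, which is the trade-off between exhausting the database and keeping idle connections open.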


Troubleshooting Database Issues

Common database configuration problems and their solutions:

Problem: PJI cannot connect to database
How to Fix It: Verify connection string is correct, database server is running, and firewall rules allow connections from PJI servers. Check credentials are valid and not expired.

Problem: Slow query performance
How to Fix It: Review database indexes on OMOP tables, check if connection pool is sized appropriately, consider scaling database resources or implementing read replicas.

Problem: "Schema not found" errors
How to Fix It: Verify schema configuration matches actual database schema names. Check that the database user has permissions to access specified schemas.

Problem: Backup failures
How to Fix It: Ensure sufficient storage space for backups, verify backup credentials have necessary permissions, check network connectivity to backup storage location.
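
When diagnosing a connection problem, a quick first step is to confirm that the database port is reachable from the PJI server at all, which separates firewall and network issues from credential or schema issues. The Python sketch below performs only a TCP-level check. The function name and the host in the usage example are placeholders.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Placeholder host; 5432 is the usual PostgreSQL port.
    host, port = "db.example.internal", 5432
    status = "reachable" if can_reach(host, port) else "not reachable"
    print(f"{host}:{port} is {status}")
```

A successful check shows only that the network path is open. It does not validate TLS settings, credentials, or schema permissions.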

Getting Help

If you need assistance with database configuration:

  • Connection Issues: Contact your database administrator or cloud provider support to verify connectivity and credentials
  • Performance Problems: Reach out to PJI support with query performance metrics and database statistics for optimization recommendations
  • Schema Configuration: Consult PJI implementation documentation or contact your implementation partner for guidance on OMOP CDM setup
  • Backup and Recovery: Work with your IT infrastructure team to design backup strategies that meet your organization's compliance and recovery requirements