Data Sources
The Data Sources module is the central interface for managing every external data source connected to the Patient Journey Intelligence (PJI) platform. It governs the creation, configuration, credentialing, and monitoring of ingestion pipelines so that data from diverse sources is reliably retrieved, normalized, and processed, either on a predefined schedule or on demand.
By supporting seamless integration with systems such as electronic health records (EHRs), object storage platforms, imaging repositories, and RESTful APIs, the Data Sources module lays the foundation for robust, compliant, and repeatable clinical data onboarding.
Core Capabilities
The Data Sources module provides a comprehensive suite of integration capabilities:
- Secure connections to external systems: EHR platforms, cloud storage (e.g., Amazon S3), FHIR-compliant services, DICOM imaging systems, APIs, and SFTP endpoints
- Credential lifecycle management: encryption, masking, and secure storage of credentials
- Configurable ingestion workflows: automated scheduling or manual triggering
- Connection health monitoring: historical execution status and latency diagnostics
- Editable source configurations: non-disruptive updates that do not affect legacy data
- Unified integration inventory: all active and retired connectors in one view, for governance and traceability
Functionally, this module anchors PJI's Data Integration architecture, enabling each ingestion pipeline to operate with a validated, secure, and persistent data source.
Interface Overview
The main interface presents a searchable and sortable registry of all active and historical data connectors. Key metadata fields include:
- Source Identifier: A descriptive label denoting the connection
- Connector Type: e.g., HTTP Plugin, SFTP, AWS HealthLake, AWS HealthImaging, S3
- Created By / Created Timestamp: Audit trail for origin of configuration
- Last Modified: Most recent update to parameters or schedule
- Last Successful Connection (Visit): Verifies ingestion recency
- Operational Status: Active, Completed, Failed
- Ingestion Schedule: None, Daily, Weekly, Monthly
- Administrative Actions: Edit, Delete
This centralized dashboard supports rapid assessment of integration health and operational consistency across the ingest ecosystem.
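As a rough illustration, one registry row could be modeled as a record like the one below. This is a minimal sketch: the class, enum, and field names (`DataSourceRecord`, `last_visit`, `failed_sources`, and so on) are hypothetical and only mirror the metadata fields listed above; they are not PJI's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Status(Enum):
    ACTIVE = "Active"
    COMPLETED = "Completed"
    FAILED = "Failed"


class Schedule(Enum):
    NONE = "None"
    DAILY = "Daily"
    WEEKLY = "Weekly"
    MONTHLY = "Monthly"


@dataclass
class DataSourceRecord:
    """Hypothetical shape of one registry row; not PJI's real schema."""
    name: str                       # Source Identifier
    connector_type: str             # e.g. "SFTP", "S3", "HTTP Plugin"
    created_by: str
    created_at: datetime
    last_modified: datetime
    last_visit: Optional[datetime]  # None until a connection has succeeded
    status: Status
    schedule: Schedule


def failed_sources(records):
    """Return the failed connectors sorted by name, as a dashboard filter might."""
    failed = [r for r in records if r.status is Status.FAILED]
    return sorted(failed, key=lambda r: r.name)
```

Filtering on `status` and sorting on `name` corresponds to the search and sort operations the registry offers; any other column could be used as the sort key in the same way.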
Creating a New Data Source
Users initiate source creation by selecting Add Source, launching a guided three-phase configuration wizard.
1. Source Metadata & Type Selection
Define a unique name and choose the appropriate integration type. PJI supports a broad range of data ingestion connectors, including:
- AWS HealthLake: FHIR-native structured data
- AWS HealthImaging: DICOM image object retrieval
- Amazon S3: File-based document ingestion
- HTTP Plugin: Integration with RESTful endpoints
- SFTP: Secure file transfer protocol
- EHR Connectors (Epic, Cerner, Snowflake): Coming soon
Each connector is accompanied by contextual descriptions to guide optimal selection.
2. Parameter Configuration
Connector-specific fields are rendered dynamically based on the selected source type. Typical configuration parameters include:
- Bucket or datastore identifiers
- Endpoint URLs
- Geographic region (for cloud-based systems)
- Access credentials and authentication tokens
- HTTP methods, headers, and payload schemas
- SFTP host details and secure paths
Advanced network options (e.g., private VPC routing, static IP allowlisting) are also supported.
All credential inputs are fully encrypted, masked, and stored in compliance with enterprise security and healthcare regulatory frameworks.
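The idea of connector-specific fields can be sketched as a lookup from connector type to the parameters it requires. The field names below (`bucket`, `datastore_id`, `endpoint_url`, and so on) are assumptions chosen to match the parameter categories listed above; the actual fields rendered by the wizard are not documented here.

```python
# Hypothetical required fields per connector type. The real field names
# used by the PJI wizard may differ.
REQUIRED_FIELDS = {
    "Amazon S3": {"bucket", "region"},
    "AWS HealthLake": {"datastore_id", "region"},
    "AWS HealthImaging": {"datastore_id", "region"},
    "HTTP Plugin": {"endpoint_url", "method"},
    "SFTP": {"host", "path"},
}


def missing_fields(connector_type, params):
    """Return the required fields that are absent or empty, sorted by name."""
    try:
        required = REQUIRED_FIELDS[connector_type]
    except KeyError:
        raise ValueError(f"unsupported connector type: {connector_type!r}") from None
    return sorted(field for field in required if not params.get(field))
```

A form could call `missing_fields` before allowing the user to continue to the review step, so that an incomplete configuration is caught before any connection attempt.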
3. Review and Confirmation
The final screen displays a comprehensive summary of the proposed configuration, including:
- Source metadata
- Connection parameters
- Selected ingestion schedule
Selecting Create Source finalizes the integration, making the connector immediately available to the Data Ingestion module.
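A review summary of this kind has to show the configuration without exposing secrets. The sketch below is one way to do that, assuming a hypothetical set of sensitive parameter names and a fixed-width mask; how PJI actually renders masked credentials on the review screen is not specified here.

```python
MASK = "********"

# Assumed names for credential-like parameters; illustrative only.
SENSITIVE_KEYS = {"access_key", "secret_key", "password", "token"}


def review_summary(name, connector_type, params, schedule):
    """Build a display-safe summary: credential values are replaced by a mask."""
    shown = {
        key: (MASK if key in SENSITIVE_KEYS and value else value)
        for key, value in params.items()
    }
    return {
        "name": name,
        "type": connector_type,
        "parameters": shown,
        "schedule": schedule,
    }
```

Using a fixed-width mask rather than one character per secret character avoids revealing the length of the stored credential.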
Modifying an Existing Source
Clicking the Edit icon reopens the configuration interface for the specified source.
- Source Name can be updated
- Source Type is locked to preserve schema compatibility
- All other configurable fields can be modified
A security notice reminds users that existing credentials are hidden by default. Options include:
- Leaving credential fields blank to preserve existing keys
- Inputting new credentials to facilitate secure key rotation
This model supports secure, auditable configuration updates while maintaining operational continuity.
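The edit rules above (locked source type, blank credential fields preserve the stored keys, non-blank values rotate them) can be sketched as a merge function. The dictionary layout and the credential field names are assumptions for illustration, not PJI's actual data model.

```python
# Assumed names for credential fields; illustrative only.
CREDENTIAL_FIELDS = {"access_key", "secret_key", "password", "token"}


def apply_edit(existing, submitted):
    """Merge an edit form into a stored configuration and return the result.

    - A changed "type" is rejected, because the source type is locked.
    - Blank credential fields keep the stored secret.
    - Non-blank credential fields replace it (key rotation).
    """
    if "type" in submitted and submitted["type"] != existing["type"]:
        raise ValueError("source type is locked after creation")
    updated = dict(existing)  # copy, so the stored record is not mutated
    for key, value in submitted.items():
        if key in CREDENTIAL_FIELDS and value in ("", None):
            continue  # blank credential input: preserve the existing key
        updated[key] = value
    return updated
```

Returning a new dictionary instead of mutating `existing` keeps the previous configuration available, which is convenient if each change also needs to be written to an audit log.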
Administrative Source Controls
Each connector entry supports the following administrative actions:
- Edit: Modify integration parameters, schedules, or credentials
- Delete: Remove the connector from active use (ingested data is retained)
All actions are role-restricted, governed by platform-level authorization and access control protocols.
Scheduling and Automation Options
Users may assign automated ingestion schedules during source setup:
- None: Ingestion is triggered manually
- Daily: Executes ingestion once per 24-hour cycle
- Weekly: Executes on specific days of the week
- Monthly: Executes on designated calendar dates
Scheduled ingestion jobs are automatically queued in the Data Ingestion pipeline, ensuring consistent, unattended data refresh cycles.
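To make the four schedule options concrete, the sketch below computes when a source would next run. It rests on several assumptions that the text does not state: runs start at midnight, weekdays are numbered as in Python's `datetime.weekday()` (Monday is 0), and a monthly date that does not exist in a short month falls back to that month's last day. The function name and parameters are hypothetical.

```python
import calendar
from datetime import datetime, timedelta

SCHEDULES = {"None", "Daily", "Weekly", "Monthly"}


def next_run(schedule, after, weekdays=(0,), day_of_month=1):
    """Return the next midnight run strictly after `after`, or None if manual."""
    if schedule not in SCHEDULES:
        raise ValueError(f"unknown schedule: {schedule!r}")
    if schedule == "None":
        return None  # manual trigger only, nothing to queue

    midnight = after.replace(hour=0, minute=0, second=0, microsecond=0)
    candidate = midnight + timedelta(days=1)
    # Scan forward one day at a time; a year is enough to hit any schedule.
    for _ in range(366):
        if schedule == "Daily":
            return candidate
        if schedule == "Weekly" and candidate.weekday() in weekdays:
            return candidate
        if schedule == "Monthly":
            last_day = calendar.monthrange(candidate.year, candidate.month)[1]
            # Assumption: clamp e.g. day 31 to the last day of shorter months.
            if candidate.day == min(day_of_month, last_day):
                return candidate
        candidate += timedelta(days=1)
    raise ValueError("no matching run date found (empty weekdays?)")
```

The day-by-day scan is deliberately simple rather than efficient; it keeps the weekly and monthly rules readable and makes the month-end clamping explicit.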
Connection Monitoring and Diagnostics
To ensure ingestion reliability, the Source Registry includes built-in observability features:
- Last Visit Timestamp: Confirms most recent successful connection
- Status Indicators: Visual feedback for Active, Completed, or Failed states
- Error Logs: Diagnostic feedback to facilitate rapid troubleshooting
This real-time visibility lets administrators detect and resolve data pipeline disruptions early, before they affect downstream processing.
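One way to act on the Last Visit timestamp is to flag sources whose most recent successful connection is older than their schedule allows. The thresholds, the one-hour grace period, and the function name below are assumptions for illustration; they are not values or behavior documented for PJI.

```python
from datetime import datetime, timedelta

# Assumed upper bound on the gap between successful connections per schedule.
MAX_AGE = {
    "Daily": timedelta(days=1),
    "Weekly": timedelta(days=7),
    "Monthly": timedelta(days=31),
}


def is_overdue(schedule, last_visit, now, grace=timedelta(hours=1)):
    """Return True if a scheduled source has gone too long without a visit."""
    limit = MAX_AGE.get(schedule)
    if limit is None:
        return False  # manual ("None") sources have no expected cadence
    if last_visit is None:
        return True   # scheduled, but no successful connection recorded yet
    return now - last_visit > limit + grace
```

A check like this complements the status indicators: a source can still show a non-failed status while having silently stopped connecting, and the age of the last visit is what exposes that case.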
The Data Sources module is integral to the integrity, reliability, and scalability of PJI's data architecture. It supports:
- Robust ingestion pipelines for structured (FHIR), unstructured (notes), and imaging (DICOM) data
- Enterprise-grade security, featuring encrypted credentials, access controls, and auditability
- Operational intelligence, enabling continuous monitoring and actionable diagnostics
- Adaptability to a range of clinical and technical environments, from small data marts to enterprise-wide EHR systems
Once registered, data sources feed into PJI's downstream processing framework, which handles normalization, NLP extraction, OMOP conversion, deduplication, enrichment, and analytics across the platform.
By providing centralized, compliant, and resilient integration management, the Data Sources module ensures a continuously high-quality, governed data supply chain for all modules within the Patient Journey Intelligence platform.