Patient Journey Intelligence Platform

A single, reusable data foundation platform that transforms fragmented clinical data into AI-ready patient journeys for research, quality improvement, and regulatory reporting

Healthcare organizations collect vast amounts of clinical data during routine patient care, but most of it remains locked in formats unsuitable for secondary use. Patient Journey Intelligence Platform solves this by continuously transforming raw, multimodal clinical data into standardized, longitudinal patient journeys—enabling research, AI development, quality measurement, and regulatory compliance from a single governed data layer.


The Secondary Use Data Challenge

Secondary use—reusing clinical data collected during patient care for research, quality improvement, population health, AI development, and regulatory reporting—remains one of healthcare's most persistent bottlenecks.

The Gap: Clinical Data Wasn't Built for Analytics

Clinical data is captured primarily for billing and documentation, not analytics or AI. This creates fundamental challenges:

  • Fragmentation Across Modalities: Critical patient information is scattered across structured EHR fields, free-text clinical notes, scanned PDFs, imaging reports, lab systems, and external registries. No single source tells the complete story.

  • Unstructured Data Contains the Missing Context: Up to 40% of critical diagnoses exist only in unstructured clinical notes and are never coded into structured fields. Treatment rationale, disease progression, and clinical reasoning are documented as free text, invisible to traditional analytics.

  • Duplicated Pipeline Development: Each new use case, whether a cancer registry, clinical trial cohort, quality measure, or AI model, requires rebuilding similar data pipelines from scratch. Organizations spend 10+ FTE-years annually on redundant data engineering.

  • Research-to-Production Disconnect: Models trained on research datasets often fail in production because operational data is preprocessed differently, uses inconsistent terminologies, or lacks the same feature definitions.

  • Compliance and Governance Overhead: Maintaining separate identified datasets for operations and de-identified datasets for research doubles infrastructure costs and creates version drift. Audit trails, data lineage, and PHI management require custom tooling.

Real-World Impact

Consider a health system developing a sepsis prediction model:

  1. Months of data engineering to extract vital signs from EHR tables, parse infection mentions from clinical notes, link lab results across systems
  2. Critical context missed because antibiotic administration notes were in free text, not discrete orders
  3. Model fails at deployment because research data used LOINC codes while production EHR uses proprietary lab codes
  4. New registry project starts from zero six months later, rebuilding similar pipelines for a different condition

This cycle repeats across every analytics initiative, preventing healthcare organizations from compounding their AI investments.


How Patient Journey Intelligence Solves the Secondary Use Bottleneck

Patient Journey Intelligence Platform provides a single, reusable foundation that eliminates redundant pipeline development and ensures every downstream use case operates on the same curated, standardized patient data.

Unified Data Pipeline: Build Once, Use Everywhere

📥

1. Ingest Any Data Source

Connect EHR systems, clinical notes, imaging (DICOM), PDFs, lab feeds, and external registries without extensive preprocessing. The platform handles data as-is, regardless of format or structure.

🔬

2. AI-Powered Extraction

John Snow Labs Medical Language Models extract structured facts from unstructured content, detect negations and temporal relationships, and resolve entities across documents. This automatically captures the clinical context trapped in free text, including diagnoses that are never coded into structured fields.

🧬

3. Standardize to OMOP CDM

All data is mapped to OMOP Common Data Model v5.4 with standard terminologies (SNOMED CT, RxNorm, LOINC, ICD-10). This creates a unified schema compatible with OHDSI analytics, BI tools, and AI frameworks.
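As a rough illustration of what this standardization step involves, the sketch below maps a site-specific lab code to a standard concept and emits a row shaped like the OMOP MEASUREMENT table. The lookup table, the local codes, and the concept IDs are all hypothetical placeholders; only the column names and the use of concept ID 0 for unmapped values follow the OMOP CDM. It is not the platform's mapping logic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical local-code -> standard-concept lookup. The IDs are placeholders,
# not real OHDSI vocabulary concept IDs.
LOCAL_LAB_TO_CONCEPT = {
    "LAB_GLU_01": 2000000001,  # stands in for a LOINC glucose concept
    "LAB_HGB_07": 2000000002,  # stands in for a LOINC hemoglobin concept
}

NO_MATCHING_CONCEPT = 0  # OMOP convention for source values with no mapping


@dataclass(frozen=True)
class MeasurementRow:
    """A small subset of the OMOP CDM MEASUREMENT table columns."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: Optional[float]
    measurement_source_value: str  # original code is kept for lineage


def to_measurement(person_id: int, local_code: str,
                   taken_on: date, value: Optional[float]) -> MeasurementRow:
    """Map a local lab code to a standard concept, falling back to 0."""
    concept_id = LOCAL_LAB_TO_CONCEPT.get(local_code, NO_MATCHING_CONCEPT)
    return MeasurementRow(person_id, concept_id, taken_on, value, local_code)


mapped = to_measurement(1, "LAB_GLU_01", date(2024, 3, 5), 5.4)
unmapped = to_measurement(1, "LAB_XYZ_99", date(2024, 3, 5), 1.0)
```

Keeping the original code in `measurement_source_value` means an unmapped row can still be traced and remapped later rather than being dropped.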

🤖

4. Continuously Update Patient Journeys

Longitudinal timelines combine all data about each patient into coherent narratives—visits, conditions, medications, procedures, labs—automatically updated as new clinical documents arrive.
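A minimal sketch of the timeline idea, assuming each domain arrives as its own list of dated events. The `Event` type, the `build_timelines` helper, and the sample records are invented for illustration and do not reflect the platform's internal data structures.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    person_id: int
    on: date
    domain: str       # e.g. "visit", "condition", "drug", "measurement"
    description: str


def build_timelines(*sources):
    """Merge per-domain event lists into one date-ordered timeline per patient."""
    timelines = {}
    for source in sources:
        for event in source:
            timelines.setdefault(event.person_id, []).append(event)
    for events in timelines.values():
        # Stable sort by date; same-day events keep the order the sources were passed in.
        events.sort(key=lambda e: e.on)
    return timelines


visits = [Event(1, date(2024, 1, 3), "visit", "outpatient visit")]
conditions = [Event(1, date(2024, 1, 3), "condition", "hypertension noted"),
              Event(2, date(2023, 11, 20), "condition", "asthma noted")]
drugs = [Event(1, date(2024, 1, 10), "drug", "lisinopril started")]
labs = [Event(1, date(2023, 12, 15), "measurement", "blood pressure recorded")]

timelines = build_timelines(visits, conditions, drugs, labs)
```

Re-running `build_timelines` with a newly arrived source list is the simplest way to picture the "continuously updated" behavior; an incremental implementation would insert new events rather than rebuild.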

Living, Governed Data Assets

Unlike point-in-time data exports, the platform maintains living datasets that stay synchronized with your clinical systems. Every extracted fact includes:

  • Provenance: Source documents, extraction timestamps, transformation lineage
  • Confidence Scores: ML model certainty for quality control and expert review workflows
  • Versioning: Time-stamped updates preserving historical states for reproducibility
  • Clinical Context: Complete temporal sequences showing disease progression, treatment response, and care patterns
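One way to picture such a record is sketched below: a fact carrying its provenance, confidence, and version, plus a helper that routes low-confidence facts to expert review. The field names, the `needs_review` helper, and the 0.85 threshold are assumptions made for this example, not the platform's schema or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExtractedFact:
    fact_id: str
    statement: str          # the extracted clinical fact
    source_document: str    # provenance: where the fact came from
    extracted_at: datetime  # provenance: when it was extracted
    confidence: float       # model certainty in [0, 1]
    version: int            # incremented when the fact is revised


def needs_review(facts, threshold=0.85):
    """Return the facts whose confidence falls below the review threshold."""
    return [f for f in facts if f.confidence < threshold]


now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
facts = [
    ExtractedFact("f1", "diagnosis: type 2 diabetes", "note-001.txt", now, 0.97, 1),
    ExtractedFact("f2", "medication: metformin", "scan-014.pdf", now, 0.62, 1),
]
review_queue = needs_review(facts)
```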

Parallel Identified and De-Identified Datasets

The platform automatically maintains two synchronized OMOP datasets from the same source data:

  1. Identified OMOP (Operational Dataset): Full PHI for clinical operations, care coordination, point-of-care AI, and internal quality improvement
  2. De-Identified OMOP (Research Dataset): HIPAA Safe Harbor compliant with consistent pseudonyms, date-shifting, and PHI removal for research, external collaborations, and AI model training

This eliminates the research-to-production gap. Train models on de-identified research data, then deploy on identified operational data with identical feature definitions, terminologies, and data lineage.
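The two mechanisms named above, consistent pseudonyms and date-shifting, can be sketched with keyed hashing. This is an illustrative approach under stated assumptions (a secret key held outside the research dataset, a per-patient shift of at most 365 days); it is not the platform's implementation and is not by itself a compliance recipe.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Assumption: a secret managed outside the de-identified dataset.
SECRET_KEY = b"replace-with-a-managed-secret"
MAX_SHIFT_DAYS = 365  # assumed bound on the per-patient date shift


def pseudonym(patient_id: str) -> str:
    """Derive a stable pseudonym: the same patient always gets the same value."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def shift_days(patient_id: str) -> int:
    """Derive a per-patient offset in [-MAX_SHIFT_DAYS, MAX_SHIFT_DAYS]."""
    digest = hmac.new(SECRET_KEY, b"shift:" + patient_id.encode("utf-8"),
                      hashlib.sha256).digest()
    span = 2 * MAX_SHIFT_DAYS + 1
    return int.from_bytes(digest[:4], "big") % span - MAX_SHIFT_DAYS


def shift_date(patient_id: str, d: date) -> date:
    """Shift every date for a patient by the same offset, preserving intervals."""
    return d + timedelta(days=shift_days(patient_id))


admit, discharge = date(2024, 2, 1), date(2024, 2, 9)
shifted_admit = shift_date("MRN-0001", admit)
shifted_discharge = shift_date("MRN-0001", discharge)
```

Because the offset is constant per patient, the length of stay and the ordering of events survive de-identification, which is what lets features computed on the research dataset line up with those computed on the identified one.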


The Impact: Measurable Improvements Across Your Organization

Use Cases Enabled

Clinical Research

  • Identify trial-eligible patients across all data sources in minutes, not weeks
  • Analyze real-world treatment effectiveness with complete medication and outcome timelines
  • Federate multi-site studies using standardized OMOP cohorts

AI and Predictive Analytics

  • Train models on de-identified research data, deploy on identified operational data with zero feature drift
  • Build clinical decision support tools that compound across use cases rather than fragment
  • Access pre-extracted features (diagnosis timelines, medication adherence, lab trends) without custom NLP pipelines

Quality Improvement and Population Health

  • Measure outcomes against clinical guidelines using standardized terminologies
  • Track chronic disease management across all touchpoints
  • Detect care gaps and intervention opportunities at scale

Regulatory Reporting and Registries

  • Automate cancer registry abstraction with NAACCR compliance
  • Generate quality measure reports (HEDIS, CMS) without manual chart review
  • Maintain public health surveillance feeds with full data lineage

Built on Open Standards

Patient Journey Intelligence is architected around open standards that ensure your data remains yours: portable, interoperable, and future-proof. This standards-first approach prevents vendor lock-in and enables seamless integration with the broader healthcare AI ecosystem.

OMOP Common Data Model v5.4

All patient data is standardized to OMOP CDM v5.4, the leading observational research standard maintained by the OHDSI community. By adopting OMOP, your data becomes immediately compatible with:

  • OHDSI Ecosystem Tools: ATLAS for cohort definitions, ACHILLES for data characterization, CohortMethod for causal inference, and dozens of validated analytics packages
  • Multi-Institutional Collaboration: Share study protocols and federated analytics without exchanging raw data. The results remain comparable because everyone speaks the same schema
  • Reproducible Research: Published studies using OMOP cohorts can be replicated across institutions, accelerating evidence generation
  • AI Model Portability: Train models on standardized features that work across any OMOP dataset, eliminating custom preprocessing for each deployment

Your data stays in your control. OMOP is an open specification with no licensing fees, proprietary formats, or cloud dependencies. If you ever choose to move away from Patient Journey Intelligence, your OMOP data remains fully accessible and usable with any OMOP-compatible tool.

Supported domains: Condition, Drug Exposure, Procedure, Measurement, Observation, Visit, Person, Provider, Device Exposure, Note, Specimen
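Because these domains are ordinary relational tables, a cohort can be expressed in plain SQL. The sketch below uses an in-memory SQLite database with heavily trimmed versions of the PERSON, CONDITION_OCCURRENCE, and DRUG_EXPOSURE tables. The rows and concept IDs are made up for illustration, and a real deployment would query the warehouse holding the OMOP dataset instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    person_id INTEGER, condition_concept_id INTEGER, condition_start_date TEXT);
CREATE TABLE drug_exposure (
    person_id INTEGER, drug_concept_id INTEGER, drug_exposure_start_date TEXT);

-- Placeholder concept IDs, not real vocabulary entries.
INSERT INTO person VALUES (1, 1960), (2, 1975), (3, 1982);
INSERT INTO condition_occurrence VALUES
    (1, 2000000010, '2023-01-10'),
    (2, 2000000010, '2023-06-01'),
    (3, 2000000099, '2023-02-02');
INSERT INTO drug_exposure VALUES
    (1, 2000000020, '2023-02-01'),
    (2, 2000000020, '2023-05-01');
""")

# Patients with the condition who started the drug on or after diagnosis.
COHORT_SQL = """
SELECT DISTINCT p.person_id
FROM person p
JOIN condition_occurrence c ON c.person_id = p.person_id
JOIN drug_exposure d ON d.person_id = p.person_id
WHERE c.condition_concept_id = ?
  AND d.drug_concept_id = ?
  AND d.drug_exposure_start_date >= c.condition_start_date
ORDER BY p.person_id
"""

cohort = [row[0] for row in conn.execute(COHORT_SQL, (2000000010, 2000000020))]
```

Dates are stored as ISO 8601 strings here so that string comparison matches chronological order in SQLite.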

Standard Medical Terminologies

Clinical concepts are mapped to open, standardized medical vocabularies that enable semantic interoperability across systems. This means a diagnosis coded in your EHR can be automatically aligned with research cohorts, clinical guidelines, and AI models, without manual mapping.

Core Vocabularies:

  • SNOMED CT: Comprehensive clinical terminology covering diagnoses, findings, procedures, and anatomical structures
  • RxNorm: Standardized drug nomenclature linking brand names, generics, and ingredients
  • LOINC: Universal codes for lab tests, clinical observations, and diagnostic studies
  • ICD-10-CM: Diagnosis codes for billing and epidemiology, automatically mapped to SNOMED concepts
  • HPO (Human Phenotype Ontology): Phenotypic abnormalities for rare disease and genetics research
  • UMLS Metathesaurus: Cross-terminology mappings enabling translation between coding systems
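In OMOP, source codes reach their standard equivalents through "Maps to" rows in the CONCEPT_RELATIONSHIP table. The sketch below shows that lookup on a two-row in-memory vocabulary. The table and column names follow the OMOP CDM, but the concept IDs are placeholders and the single mapping row is illustrative rather than copied from the OHDSI vocabularies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY, concept_name TEXT,
    vocabulary_id TEXT, concept_code TEXT, standard_concept TEXT);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);

-- Placeholder concept IDs; the mapping row is for illustration only.
INSERT INTO concept VALUES
    (2000000101, 'Type 2 diabetes mellitus without complications',
     'ICD10CM', 'E11.9', NULL),
    (2000000201, 'Type 2 diabetes mellitus', 'SNOMED', '44054006', 'S');
INSERT INTO concept_relationship VALUES (2000000101, 2000000201, 'Maps to');
""")

MAPS_TO_SQL = """
SELECT target.vocabulary_id, target.concept_code, target.concept_name
FROM concept source
JOIN concept_relationship r
  ON r.concept_id_1 = source.concept_id AND r.relationship_id = 'Maps to'
JOIN concept target
  ON target.concept_id = r.concept_id_2 AND target.standard_concept = 'S'
WHERE source.vocabulary_id = ? AND source.concept_code = ?
"""


def to_standard(vocabulary_id: str, code: str):
    """Return the standard concepts a source code maps to (possibly none)."""
    return conn.execute(MAPS_TO_SQL, (vocabulary_id, code)).fetchall()


standard = to_standard("ICD10CM", "E11.9")
missing = to_standard("ICD10CM", "Z99.99")
```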

Why This Matters: When your data uses standard terminologies, insights from one tool immediately transfer to another. A cohort defined in ATLAS can be directly queried in your BI tool. An AI model trained on SNOMED-coded features will work on any OMOP dataset. Clinical decision support rules written once apply everywhere.

This eliminates the "translation tax" where each new application requires custom data dictionaries, and ensures your AI investments compound rather than fragment.

Model Context Protocol (MCP)

All platform capabilities (such as data extraction, cohort queries, patient timelines, and registry abstraction) are exposed via Model Context Protocol (MCP), an open standard for AI agent interoperability developed by Anthropic.

What MCP Enables:

  • Composable Workflows: AI agents can invoke platform tools (e.g., "find patients with diabetic retinopathy") and combine them with external capabilities (e.g., scheduling, EHR writes) in multi-step workflows.
  • Tool Discovery: Agents automatically discover available functions, parameters, and data schemas. No hardcoded integrations are required.
  • Ecosystem Integration: Any MCP-compatible agent can access your curated patient data, while platform agents can leverage external MCP tools for scheduling, notifications, or real-time data feeds.
  • Custom Extensions: Build internal MCP tools that expose proprietary logic or institutional data, making them instantly available to all agents.

Example Use Case: A clinical research coordinator asks an AI agent to "identify eligible patients for the diabetes trial and draft recruitment letters." The agent uses MCP to query your OMOP cohort, retrieve patient summaries, and compose personalized outreach, all without custom API development.
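At the wire level, MCP messages are JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The sketch below exercises both against a toy in-process handler. The `find_cohort` tool, its schema, and the canned reply are hypothetical, and the `initialize` handshake and transport layer that a real MCP session requires are omitted.

```python
import json
from itertools import count

_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request message as used by MCP."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Hypothetical tool description; a real server would advertise its own tools.
TOOLS = [{
    "name": "find_cohort",
    "description": "Find patients matching a condition (illustrative only).",
    "inputSchema": {
        "type": "object",
        "properties": {"condition": {"type": "string"}},
        "required": ["condition"],
    },
}]


def handle(message):
    """Toy stand-in for an MCP server: answers tools/list and tools/call."""
    if message["method"] == "tools/list":
        result = {"tools": TOOLS}
    elif message["method"] == "tools/call":
        args = message["params"]["arguments"]
        text = f"cohort query accepted for condition: {args['condition']}"
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {"jsonrpc": "2.0", "id": message["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": message["id"], "result": result}


listing = handle(jsonrpc_request("tools/list"))
call = handle(jsonrpc_request(
    "tools/call",
    {"name": "find_cohort", "arguments": {"condition": "diabetic retinopathy"}},
))
print(json.dumps(call, indent=2))
```

An agent reads the `inputSchema` from the listing to learn which arguments a tool accepts, which is why no hardcoded integration is needed on the client side.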

By standardizing on MCP, Patient Journey Intelligence becomes a composable platform rather than a closed system, enabling you to build sophisticated agentic workflows that span clinical operations, research, and analytics.


Get Started

Patient Journey Intelligence is deployed within your infrastructure; it is not offered as SaaS. Your clinical data never leaves your environment, ensuring complete control over security, compliance, and data governance. The platform runs on your chosen cloud provider (AWS, Azure, Google Cloud), data warehouse (Snowflake, Databricks), or on-premise Kubernetes cluster.

The John Snow Labs team deploys and configures the platform for you. We handle infrastructure setup, data source integration, and clinical workflow configuration, and we provide comprehensive team training. Initial deployment typically takes 12 weeks from kickoff to production-ready OMOP datasets.

Three steps to AI-ready clinical data

1. Assess Your Data Landscape

Evaluate your data sources, infrastructure, and governance needs.

Data Readiness Assessment →

2. Integrate & Curate

Integrate patient data from your private sources, extract relevant medical information, reason over and normalize data points, and translate them to OMOP.

Source Integration → | Configure Curation →

3. Create Applications and Agents

Create cohorts, compute measures, visualize patient journeys, and extract features for AI.

Build Cohorts → | View Timelines →