Medical Reasoning

Healthcare data arrives fragmented across systems: a medication order in the pharmacy system, a duplicate entry in the EHR, a conflicting dosage in discharge instructions. Lab results recorded at 10:15 AM might appear in the data feed with a timestamp of 10:00 AM. A diagnosis marked "rule out pneumonia" in one note becomes "pneumonia" in a problem list. A single ED visit generates three separate visit records as the patient moves from triage to treatment to admission.

Without intelligent reasoning, these inconsistencies propagate into analytics and research datasets, creating duplicate entities, conflicting facts, and temporal impossibilities. Medical Reasoning applies AI-powered logic to automatically detect and resolve these issues, ensuring every patient timeline is consistent, deduplicated, and trustworthy.


Why Medical Reasoning Matters

Consider a patient on metformin for diabetes. Over the course of a year, this single medication might appear in your data as:

  • Metformin 1000mg BID (pharmacy order, Jan 15)
  • Metformin 1000mg twice daily (EHR medication list, Jan 15)
  • Metformin 1g BID (discharge summary, Jan 20)
  • Metformin 500mg BID (progress note, Feb 10; prescriber reduced dose)
  • Metformin 500mg twice daily (pharmacy order, Feb 10)
  • Metformin 1000mg BID (data entry error, Feb 15)

Without reasoning: Six separate medication records, making it impossible to determine actual exposure history or detect the erroneous 1000mg entry after dose reduction.

With reasoning: Automated deduplication merges the equivalent Jan 15 entries, recognizes the Jan 20 "1g BID" entry as the same dose, establishes temporal sequence, detects the Feb 10 dose change, flags the conflicting Feb 15 entry, and produces an accurate medication timeline with provenance for every decision.

This same principle applies to diagnoses, procedures, lab results, and visit records: every clinical entity that exists in multiple forms across your data ecosystem.
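
As a rough illustration of the metformin example, the sketch below normalizes dose strings, folds equivalent entries into one dose period, and flags an entry that reverts to a superseded dose. The record layout, the `normalize_dose` and `build_timeline` helpers, the year, and the revert rule are all assumptions made for illustration; they are not the platform's actual logic.

```python
import re
from datetime import date

# Hypothetical records mirroring the metformin example (the year is arbitrary).
RECORDS = [
    ("pharmacy order", date(2024, 1, 15), "1000mg BID"),
    ("EHR medication list", date(2024, 1, 15), "1000mg twice daily"),
    ("discharge summary", date(2024, 1, 20), "1g BID"),
    ("progress note", date(2024, 2, 10), "500mg BID"),
    ("pharmacy order", date(2024, 2, 10), "500mg twice daily"),
    ("unspecified source", date(2024, 2, 15), "1000mg BID"),
]

FREQUENCIES = {"bid": 2, "twice daily": 2, "daily": 1}


def normalize_dose(text):
    """Return (milligrams, doses_per_day) for strings such as '1g BID'."""
    m = re.match(r"\s*(\d+(?:\.\d+)?)\s*(mg|g)\b\s*(.*)", text, re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognized dose: {text!r}")
    factor = 1000 if m.group(2).lower() == "g" else 1
    return int(float(m.group(1)) * factor), FREQUENCIES[m.group(3).strip().lower()]


def build_timeline(records):
    """Collapse records into dose periods; flag reverts to a superseded dose."""
    periods, flagged, superseded = [], [], set()
    for source, day, text in sorted(records, key=lambda r: r[1]):
        dose = normalize_dose(text)
        if periods and periods[-1]["dose"] == dose:
            # Same dose as the current period: a duplicate or corroborating entry.
            periods[-1]["sources"].append((source, day))
        elif dose in superseded:
            # Assumed rule: a return to an earlier dose is flagged for review.
            flagged.append((source, day, text))
        else:
            if periods:
                superseded.add(periods[-1]["dose"])
            periods.append({"dose": dose, "start": day, "sources": [(source, day)]})
    return periods, flagged


periods, flagged = build_timeline(RECORDS)
```

With these inputs the six records reduce to two dose periods (1000mg twice daily from Jan 15, 500mg twice daily from Feb 10), and the Feb 15 entry is flagged rather than merged. A real dose increase would also be flagged under this simple rule, so a production system would need additional evidence before discarding anything.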


Core Reasoning Capabilities

Medical Reasoning operates automatically during data integration, applying specialized logic to resolve common data quality issues:

Entity Deduplication

Identify and merge duplicate clinical entities across data sources, preserving provenance and resolving minor variations in representation.

Conflict Resolution

Detect and resolve contradictions when multiple sources provide incompatible information for the same clinical fact.

Temporal Reasoning

Establish correct chronological order of events, detect temporal inconsistencies, and reconstruct accurate timelines.

Confidence Scoring

Assign reliability scores to extracted facts based on source quality, extraction confidence, and consistency with other evidence.

Provenance Tracking

Maintain complete lineage from original source documents to final standardized representation, enabling audit and validation.
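
As one concrete case of temporal reasoning, the sketch below checks a record for end events that precede their start events, such as a discharge date before the admission date. The field names and the `temporal_violations` helper are hypothetical and chosen only for this illustration.

```python
from datetime import date

# Assumed (start, end) field pairs; a real schema would define its own.
ORDERED_PAIRS = [
    ("admission", "discharge"),
    ("medication_start", "medication_stop"),
]


def temporal_violations(record):
    """Return a description of every pair whose end precedes its start."""
    problems = []
    for start_key, end_key in ORDERED_PAIRS:
        start, end = record.get(start_key), record.get(end_key)
        # Missing values are skipped, not treated as violations.
        if start is not None and end is not None and end < start:
            problems.append(f"{end_key} {end} precedes {start_key} {start}")
    return problems


suspect = {"admission": date(2024, 2, 20), "discharge": date(2024, 2, 19)}
issues = temporal_violations(suspect)
```

A check like this only detects the inconsistency. Deciding which of the two dates is wrong still requires the source priority and provenance information described elsewhere on this page.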


Specialized Reasoning Profiles

Different clinical entities require tailored reasoning logic. The platform provides pre-configured reasoning profiles optimized for common data quality challenges:

Medication Reconciliation

The challenge: Medications appear in pharmacy orders, EHR medication lists, discharge summaries, and clinical notes, often with minor variations in representation, dosage, or timing.

Example scenario:

Source 1 (Pharmacy): Lisinopril 10mg, ordered 2024-01-15 08:30

Source 2 (EHR Med List): Lisinopril 10 MG Oral Tablet, recorded 2024-01-15 09:00

Source 3 (Discharge Summary): Lisinopril 10mg PO daily, documented 2024-01-20

Reasoning actions:

  • Merge duplicate entries: All three refer to the same medication start event on Jan 15
  • Resolve minor variations: "10mg" vs "10 MG" vs "10mg PO daily" are equivalent representations
  • Establish temporal sequence: Pharmacy order (08:30) preceded EHR entry (09:00); the discharge summary was documented later but refers to the original start date
  • Produce unified record: Single medication exposure with start date Jan 15, provenance linking to all three sources
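
The reasoning actions above can be sketched as a small merge over a canonical key. This is a minimal sketch under stated assumptions: `canonical_key` and `merge_sources` are hypothetical helpers, the key is just drug name plus strength in mg, and the earliest recorded timestamp is taken as the start.

```python
import re
from datetime import datetime

# The three source entries from the scenario above.
SOURCES = [
    ("Pharmacy", "Lisinopril 10mg", datetime(2024, 1, 15, 8, 30)),
    ("EHR Med List", "Lisinopril 10 MG Oral Tablet", datetime(2024, 1, 15, 9, 0)),
    ("Discharge Summary", "Lisinopril 10mg PO daily", datetime(2024, 1, 20)),
]


def canonical_key(text):
    """Reduce a medication string to (drug name, strength in mg)."""
    m = re.match(r"(?P<drug>[A-Za-z]+)\s+(?P<amount>\d+)\s*mg\b", text, re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognized medication string: {text!r}")
    return m.group("drug").lower(), int(m.group("amount"))


def merge_sources(sources):
    """Group entries by canonical key, keeping every source as provenance."""
    unified = {}
    for system, text, recorded in sources:
        rec = unified.setdefault(
            canonical_key(text), {"start": recorded, "provenance": []}
        )
        # Assumption: the earliest timestamp across sources marks the start.
        rec["start"] = min(rec["start"], recorded)
        rec["provenance"].append((system, text, recorded))
    return unified


unified = merge_sources(SOURCES)
```

All three strings reduce to the same key, so the result is a single record starting Jan 15 at 08:30 with three provenance entries. Route and frequency ("PO daily", "Oral Tablet") are ignored here; a fuller key would likely include them.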

Diagnosis Consolidation

The challenge: Diagnoses appear in problem lists, encounter diagnoses, billing codes, and clinical notes, with varying levels of certainty, specificity, and assertion status.

Example scenario:

Source 1 (ED Note, Jan 10): "Rule out pneumonia" → Assertion: Hypothetical

Source 2 (Chest X-ray, Jan 10): "Right lower lobe infiltrate consistent with pneumonia"

Source 3 (Problem List, Jan 11): "Pneumonia" → Assertion: Present

Source 4 (Discharge Summary, Jan 15): "Community-acquired pneumonia, resolved"

Reasoning actions:

  • Reconcile assertion status: "Rule out" (hypothetical) on Jan 10 became confirmed (present) on Jan 11 after imaging
  • Deduplicate condition: All four references describe the same pneumonia episode
  • Establish diagnosis dates: Suspected Jan 10, confirmed Jan 11, resolved Jan 15
  • Set primary diagnosis: ED encounter and inpatient stay both list pneumonia as primary
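
A minimal sketch of how the assertion statuses above could be folded into one episode timeline follows. It assumes the four mentions have already been linked to the same pneumonia episode; the status labels, the `consolidate` helper, and the year are illustrative assumptions.

```python
from datetime import date

# Mentions from the scenario above; the year is arbitrary and the
# "supporting" label for imaging evidence is an assumed convention.
MENTIONS = [
    ("ED Note", date(2024, 1, 10), "hypothetical"),
    ("Chest X-ray", date(2024, 1, 10), "supporting"),
    ("Problem List", date(2024, 1, 11), "present"),
    ("Discharge Summary", date(2024, 1, 15), "resolved"),
]


def consolidate(mentions):
    """Record the first date each assertion status appears for one episode."""
    first_seen = {}
    for _source, day, status in sorted(mentions, key=lambda m: m[1]):
        first_seen.setdefault(status, day)
    return {
        "suspected": first_seen.get("hypothetical"),
        "confirmed": first_seen.get("present"),
        "resolved": first_seen.get("resolved"),
        "sources": [source for source, _day, _status in mentions],
    }


episode = consolidate(MENTIONS)
```

The output carries the three milestone dates (suspected Jan 10, confirmed Jan 11, resolved Jan 15) and keeps all four sources as provenance for the single condition record.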

Visit Consolidation

The challenge: A single care episode generates multiple visit records as patients move between care settings (ED → inpatient → observation) or as systems record encounters separately.

Example scenario:

Source 1 (ED System): Visit ID 12345, arrived 2024-02-20 14:30, discharged 2024-02-20 18:00

Source 2 (ADT Feed): Admission 2024-02-20 17:45, admitted to floor 5 West

Source 3 (Billing): Inpatient stay 2024-02-20 to 2024-02-23

Reasoning actions:

  • Detect overlapping visits: ED visit (14:30-18:00) overlaps with inpatient admission (17:45)
  • Link ED to inpatient: Same patient, continuous care, ED visit led directly to admission
  • Merge into care episode: Single hospitalization from ED arrival (14:30 Feb 20) to discharge (Feb 23)
  • Preserve visit types: Episode contains ED visit component + inpatient stay component
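
The overlap detection and merge described above resemble a standard interval merge. The sketch below assumes all visits belong to the same patient, uses a hypothetical `merge_episodes` helper with an arbitrary one-hour gap tolerance, and fills in a placeholder time for the billing discharge date because the source gives only a date.

```python
from datetime import datetime, timedelta

VISITS = [
    {"type": "ED", "start": datetime(2024, 2, 20, 14, 30),
     "end": datetime(2024, 2, 20, 18, 0)},
    # Billing gives only the discharge date, so midnight is a placeholder.
    {"type": "inpatient", "start": datetime(2024, 2, 20, 17, 45),
     "end": datetime(2024, 2, 23)},
]


def merge_episodes(visits, max_gap=timedelta(hours=1)):
    """Merge one patient's visits that overlap or start within max_gap."""
    episodes = []
    for visit in sorted(visits, key=lambda v: v["start"]):
        if episodes and visit["start"] <= episodes[-1]["end"] + max_gap:
            # Continuous care: extend the episode and keep the visit type.
            episodes[-1]["end"] = max(episodes[-1]["end"], visit["end"])
            episodes[-1]["components"].append(visit["type"])
        else:
            episodes.append({"start": visit["start"], "end": visit["end"],
                             "components": [visit["type"]]})
    return episodes


episodes = merge_episodes(VISITS)
```

Because the inpatient admission (17:45) begins before the ED discharge (18:00), the two visits collapse into one episode that starts at ED arrival and retains both visit types as components.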

Conflict Resolution Strategies

When multiple data sources provide contradictory information, the reasoning engine applies a hierarchy of strategies to determine the most accurate representation:

  1. Source Priority: Trust designated authoritative sources, such as the pharmacy system for medications, the lab system for test results, and radiology PACS for imaging findings.
  2. Recency: When sources have equal authority, prefer the most recent information. Updated diagnoses supersede previous entries, and recent lab results reflect current status.
  3. Confidence Score: When values were extracted by NLP, choose the higher-confidence extraction. Structured data typically carries higher confidence than free-text mentions.
  4. Clinical Logic: Apply medical reasoning rules. Drug dosages must be within safe ranges, temporal sequences must be logically consistent, and diagnoses must align with documented findings.

Resolution Example

Conflict: Three sources report different blood pressures for the same patient visit:

  • Nursing flowsheet (14:23): 145/92 mmHg (structured data, high confidence)
  • Progress note (14:30): "BP approximately 140/90" (NLP extraction, medium confidence)
  • Discharge summary (next day): "BP 130/85" (NLP extraction, high confidence, but from the following day)

Resolution logic:

  1. Source priority: Nursing flowsheet is authoritative for vital signs
  2. Recency: All entries from same day except discharge summary
  3. Confidence: Flowsheet = structured (highest), note = estimated (medium)
  4. Clinical logic: Values are consistent (all elevated), discharge summary is from different timepoint

Decision: Use nursing flowsheet value (145/92 at 14:23) as primary reading for this visit. Retain progress note mention as corroborating evidence. Link discharge summary BP to following day's assessment.
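
The resolution logic for this example can be approximated with a lexicographic ranking. This is a sketch, not the engine's implementation: the `AUTHORITATIVE` table, the `resolve` helper, the calendar dates, and the discharge summary time are assumptions, and the "different timepoint" check is applied as a filter before ranking, which is a simplification.

```python
from datetime import date, datetime

CONFIDENCE = {"high": 3, "medium": 2, "low": 1}
AUTHORITATIVE = {"vital_signs": "nursing flowsheet"}  # assumed priority table

# Calendar dates are arbitrary; the source gives only times of day.
READINGS = [
    {"source": "nursing flowsheet", "time": datetime(2024, 3, 1, 14, 23),
     "value": (145, 92), "confidence": "high"},
    {"source": "progress note", "time": datetime(2024, 3, 1, 14, 30),
     "value": (140, 90), "confidence": "medium"},
    {"source": "discharge summary", "time": datetime(2024, 3, 2, 9, 0),
     "value": (130, 85), "confidence": "high"},
]


def resolve(readings, visit_day, fact_type="vital_signs"):
    """Return (primary, corroborating, deferred) readings for one visit day."""
    # Readings from another day describe a different timepoint and do not compete.
    candidates = [r for r in readings if r["time"].date() == visit_day]
    deferred = [r for r in readings if r["time"].date() != visit_day]

    def rank(r):
        return (
            r["source"] == AUTHORITATIVE[fact_type],  # 1. source priority
            r["time"],                                # 2. recency
            CONFIDENCE[r["confidence"]],              # 3. confidence
        )

    ordered = sorted(candidates, key=rank, reverse=True)
    return ordered[0], ordered[1:], deferred


primary, corroborating, deferred = resolve(READINGS, visit_day=date(2024, 3, 1))
```

The flowsheet reading wins on source priority even though the progress note is seven minutes more recent, the note is retained as corroborating evidence, and the discharge summary value is set aside for the following day's assessment.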


Provenance and Auditability

Every reasoning decision is fully traceable. When the engine merges duplicates, resolves conflicts, or adjusts temporal sequences, it preserves:

Source System Identifiers

Original record IDs from each contributing system: enables tracing back to source documents for validation

Extraction Timestamps

When each fact was extracted or recorded: distinguishes data capture time from clinical event time

Confidence Scores

Reliability assessment for each extracted fact: enables filtering by confidence threshold for sensitive analyses

Resolution Method Applied

Justification for the retained value: transparent decision-making for audit and regulatory compliance

This complete audit trail ensures that every clinical fact in the OMOP CDM can be traced to its original source, with full transparency into any reasoning decisions applied during integration.
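
One way to picture the four preserved elements is as a small record structure. The class names, field names, record IDs, and numeric confidence values below are hypothetical and exist only to illustrate the shape of such a record.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class SourceRef:
    system: str             # source system identifier
    record_id: str          # original record ID in that system
    extracted_at: datetime  # capture time, distinct from clinical event time
    confidence: float       # assumed 0-1 reliability score


@dataclass
class ResolvedFact:
    concept: str
    value: str
    resolution_method: str  # justification for the retained value
    sources: List[SourceRef] = field(default_factory=list)

    def above(self, threshold):
        """Return the contributing sources at or above a confidence threshold."""
        return [s for s in self.sources if s.confidence >= threshold]


fact = ResolvedFact(
    concept="blood pressure",
    value="145/92 mmHg",
    resolution_method="source_priority",
    sources=[
        SourceRef("nursing flowsheet", "FS-001", datetime(2024, 3, 1, 14, 23), 0.95),
        SourceRef("progress note", "PN-002", datetime(2024, 3, 1, 14, 30), 0.60),
    ],
)
audit_row = asdict(fact)  # plain dict suitable for logging or export
```

Keeping the source references alongside the resolved value lets an analyst filter by confidence (`fact.above(0.8)`) or trace any value back to the contributing record IDs.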


Impact on Data Quality and Trust

Without medical reasoning:

  • Duplicate entities: Same medication appears 5 times, inflating exposure counts
  • Conflicting facts: Patient has "pneumonia" and "no pneumonia" simultaneously
  • Temporal impossibilities: Discharge date before admission date, medication start after stop
  • Lost trust: Researchers encounter obvious errors and lose confidence in the entire dataset

With medical reasoning:

  • Clean timelines: Each patient has a single, deduplicated, temporally consistent history
  • Resolved conflicts: Contradictions automatically detected and resolved using transparent logic
  • Audit trail: Every decision traceable to source documents and reasoning rules
  • Regulatory compliance: Full provenance supports the audit and traceability requirements of HIPAA, 21 CFR Part 11, and IRB review