HL7-to-FHIR Clinical Data Lakehouse

End-to-end clinical data integration platform ingesting HL7 v2 messages (ADT, ORM, ORU, reconciliation) via Mirth Connect into a Databricks Delta Lake on ADLS Gen 2, with full FHIR R4 conversion for downstream interoperability.

June 1, 2024
Mirth Connect, HL7 v2, FHIR R4, Databricks, Delta Lake, ADLS Gen 2, Azure Data Factory, Python, PostgreSQL

Overview

Built a production clinical data integration platform for a hybrid on-prem/Azure environment, handling all inbound and outbound HL7 v2 message traffic from hospital systems, reference labs, and internal instruments.

Architecture

  • Ingestion: Mirth Connect channels parse and route ADT (A01–A08), ORM O01, ORU R01, and reconciliation messages received on TCP/MLLP listeners
  • Landing Zone: Raw HL7 payloads written to the ADLS Gen 2 bronze layer as JSON-serialized segments
  • Processing: Databricks PySpark jobs normalize segments into silver-layer Delta tables (patient, encounter, order, result)
  • FHIR Conversion: Custom Python transformers map silver tables to FHIR R4 resources (Patient, Encounter, DiagnosticReport, Observation) served via a REST API
  • Orchestration: Azure Data Factory pipelines schedule bronze-to-silver and silver-to-gold promotion jobs
  • Audit & Reconciliation: PostgreSQL tracks message acknowledgment state, reprocessing queues, and SLA breach alerts
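
To make the landing-zone and FHIR-conversion steps above more concrete, here is a minimal Python sketch, not the platform's actual code. It assumes a simple bronze record layout (one object per segment with a list of pipe-delimited fields) and a deliberately small PID-to-Patient mapping. The sample message, the helper names (parse_segments, pid_field, pid_to_fhir_patient), and the record layout are all hypothetical and chosen for illustration only.

```python
import json

# Hypothetical ADT^A01 sample; HL7 v2 segments are separated by carriage returns.
SAMPLE_ADT = "\r".join([
    "MSH|^~\\&|HIS|HOSP|LAKE|AZURE|20240601120000||ADT^A01|MSG0001|P|2.5",
    "PID|1||12345^^^HOSP^MR||DOE^JANE||19800215|F",
])

# HL7 v2 administrative sex codes mapped to FHIR R4 Patient.gender values.
GENDER_MAP = {"F": "female", "M": "male", "O": "other", "U": "unknown"}


def parse_segments(raw: str) -> list[dict]:
    """Split a raw HL7 v2 message into JSON-serializable segment records.

    Assumed bronze layout: {"segment": <ID>, "fields": [<field strings>]}.
    """
    segments = []
    for line in filter(None, raw.replace("\n", "\r").split("\r")):
        parts = line.split("|")
        segments.append({"segment": parts[0], "fields": parts[1:]})
    return segments


def pid_field(pid: dict, n: int) -> str:
    """Return PID-n (1-based), or an empty string if the field is absent."""
    fields = pid["fields"]
    return fields[n - 1] if n <= len(fields) else ""


def pid_to_fhir_patient(pid: dict) -> dict:
    """Map a parsed PID segment to a minimal FHIR R4 Patient resource."""
    mrn = pid_field(pid, 3).split("^")[0]        # PID-3: patient identifier
    name_parts = pid_field(pid, 5).split("^")    # PID-5: family^given
    dob = pid_field(pid, 7)                      # PID-7: YYYYMMDD[...]

    name = {"family": name_parts[0]}
    if len(name_parts) > 1 and name_parts[1]:
        name["given"] = [name_parts[1]]

    patient = {
        "resourceType": "Patient",
        "identifier": [{"value": mrn}],
        "name": [name],
        "gender": GENDER_MAP.get(pid_field(pid, 8), "unknown"),  # PID-8
    }
    if len(dob) >= 8:
        patient["birthDate"] = f"{dob[:4]}-{dob[4:6]}-{dob[6:8]}"
    return patient


segments = parse_segments(SAMPLE_ADT)
bronze_record = json.dumps(segments)  # what a bronze-layer file might hold
pid = next(s for s in segments if s["segment"] == "PID")
patient = pid_to_fhir_patient(pid)
print(json.dumps(patient, indent=2))
```

A production mapping would also need to handle repetitions, escape sequences, and identifier systems, which this sketch leaves out on purpose.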

Key Outcomes

  • Reduced manual reconciliation effort by ~70% through automated ACK tracking and exception routing
  • Achieved sub-5-minute latency from HL7 receipt to queryable Delta table row
  • FHIR R4 output consumed by two downstream EHR systems with zero breaking schema changes across 18 months
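
As a rough illustration of the ACK tracking and exception routing behind these outcomes, the sketch below flags messages that were negatively acknowledged or that have not landed within a latency threshold. It is a simplified in-memory version written under assumptions: the record fields (control_id, received_at, ack_code, landed_at), the function name find_exceptions, and the use of five minutes as the breach threshold are hypothetical, and the real platform keeps this state in PostgreSQL rather than in Python lists.

```python
from datetime import datetime, timedelta

# Assumed threshold, borrowed from the sub-5-minute latency figure above.
SLA = timedelta(minutes=5)


def find_exceptions(messages: list[dict], now: datetime) -> list[dict]:
    """Return messages that need attention, each tagged with a reason.

    A message is an exception if its acknowledgment code is not AA
    (application accept), or if the time from receipt to landing
    (or to `now`, when it has not landed yet) exceeds the SLA.
    """
    exceptions = []
    for msg in messages:
        elapsed = (msg["landed_at"] or now) - msg["received_at"]
        if msg["ack_code"] != "AA":       # AE = error, AR = reject
            reason = "nack"
        elif elapsed > SLA:
            reason = "sla_breach"
        else:
            continue
        exceptions.append({"control_id": msg["control_id"], "reason": reason})
    return exceptions


# Hypothetical tracking rows for three messages received at the same time.
t0 = datetime(2024, 6, 1, 12, 0, 0)
messages = [
    {"control_id": "MSG0001", "received_at": t0, "ack_code": "AA",
     "landed_at": t0 + timedelta(minutes=2)},
    {"control_id": "MSG0002", "received_at": t0, "ack_code": "AE",
     "landed_at": None},
    {"control_id": "MSG0003", "received_at": t0, "ack_code": "AA",
     "landed_at": None},
]

exceptions = find_exceptions(messages, now=t0 + timedelta(minutes=10))
for item in exceptions:
    print(item["control_id"], item["reason"])
```

In this sample, MSG0001 lands within the threshold and is left alone, while the other two would be routed to a reprocessing queue or an alert.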