My Projects
Here's a collection of projects I've worked on. Each project represents a learning journey and showcases different aspects of my development skills.
Real-Time Clinical Event Streaming with Kafka & Flink
Event-driven pipeline using Apache Kafka and Apache Flink to stream ADT (admit, discharge, transfer) and lab result events in real time into a Databricks Delta Lake, enabling live patient census dashboards and alerting.
Genomics Variant Ingestion & QC Pipeline
Scalable PySpark pipeline on Databricks for ingesting, validating, and annotating VCF files from whole-genome and targeted sequencing panels, stored in a Delta Lake variant catalog on Azure.
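To give a flavor of the validation step, here is a minimal sketch in plain Python rather than PySpark. It is not the pipeline's actual code: the names `Variant`, `parse_vcf_line`, and `qc_failures`, the failure labels, and the QUAL threshold of 30 are all assumptions made for illustration. It also only handles simple base alleles, so symbolic alleles such as `<DEL>` would be flagged as invalid.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_BASES = set("ACGTN")


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alts: List[str]
    qual: Optional[float]  # None when the VCF QUAL column is "."
    filter: str


def parse_vcf_line(line: str) -> Variant:
    """Parse the fixed columns of one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 columns, got {len(fields)}")
    chrom, pos, _id, ref, alt, qual, flt = fields[:7]
    return Variant(
        chrom=chrom,
        pos=int(pos),
        ref=ref.upper(),
        alts=alt.upper().split(","),
        qual=None if qual == "." else float(qual),
        filter=flt,
    )


def qc_failures(variant: Variant, min_qual: float = 30.0) -> List[str]:
    """Return the list of QC checks this variant fails (empty means it passes)."""
    failures = []
    if variant.pos < 1:
        failures.append("pos_out_of_range")
    if not variant.ref or not set(variant.ref) <= VALID_BASES:
        failures.append("invalid_ref")
    if any(not alt or not set(alt) <= VALID_BASES for alt in variant.alts):
        failures.append("invalid_alt")
    # min_qual is an illustrative threshold, not a recommended value.
    if variant.qual is None or variant.qual < min_qual:
        failures.append("low_qual")
    if variant.filter not in ("PASS", "."):
        failures.append("filter_not_pass")
    return failures
```

In a Spark setting, checks like these would more likely be expressed as column expressions so that failing records can be routed to a quarantine table instead of raising errors.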
HL7-to-FHIR Clinical Data Lakehouse
End-to-end clinical data integration platform ingesting HL7 v2 messages (ADT, ORM, ORU, reconciliation) via Mirth Connect into a Databricks Delta Lake on ADLS Gen2, with full FHIR R4 conversion for downstream interoperability.
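As a rough illustration of the HL7-to-FHIR idea, the sketch below splits a pipe-delimited HL7 v2 message into segments and maps two PID fields onto a very small FHIR R4 `Patient` resource. It is a simplified, assumption-laden example and not the platform's conversion logic: the helper names and the sample message are hypothetical, only the first occurrence of each segment is kept, and escape sequences, repetitions, and most fields are ignored.

```python
from typing import Dict, List

# Hypothetical sample message; segments are separated by carriage returns.
SAMPLE = (
    "MSH|^~\\&|SENDER|FAC|RCV|FAC2|202401011200||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP^MR||DOE^JANE\r"
)


def parse_segments(message: str) -> Dict[str, List[str]]:
    """Map each segment ID to its pipe-split fields (first occurrence only)."""
    segments: Dict[str, List[str]] = {}
    for raw in message.replace("\n", "\r").split("\r"):
        if not raw.strip():
            continue
        fields = raw.strip().split("|")
        segments.setdefault(fields[0], fields)
    return segments


def field(fields: List[str], index: int) -> str:
    """Return a field by list index, or an empty string if it is absent."""
    return fields[index] if index < len(fields) else ""


def message_type(message: str) -> str:
    """Return MSH-9. MSH-1 is the separator itself, so MSH-9 sits at index 8."""
    return field(parse_segments(message).get("MSH", []), 8)


def to_fhir_patient(message: str) -> dict:
    """Build a minimal FHIR R4 Patient dict from PID-3 and PID-5."""
    pid = parse_segments(message).get("PID")
    if pid is None:
        raise ValueError("message has no PID segment")
    identifier = field(pid, 3).split("^")[0]   # PID-3, first component
    name_parts = field(pid, 5).split("^")      # PID-5: family^given
    family = name_parts[0]
    given = name_parts[1] if len(name_parts) > 1 else ""

    patient: dict = {"resourceType": "Patient"}
    if identifier:
        patient["identifier"] = [{"value": identifier}]
    if family or given:
        name: dict = {}
        if family:
            name["family"] = family
        if given:
            name["given"] = [given]
        patient["name"] = [name]
    return patient
```

Calling `message_type(SAMPLE)` returns the raw MSH-9 value, and `to_fhir_patient(SAMPLE)` returns a dictionary that can be serialized to JSON. A production mapping would also need identifier systems, dates, and the other segments that a real conversion covers.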
Clinical Data Warehouse — OMOP CDM with dbt
dbt project on Databricks that transforms a multi-source clinical Delta Lake into the OMOP Common Data Model, enabling standardized cohort analysis and federated research queries across patient populations.
Multi-Source EHR Pipeline Orchestration with Airflow
Apache Airflow DAGs orchestrating a multi-source EHR ingestion platform that coordinates Mirth Connect polling, ADF triggers, Databricks job runs, and data quality checks across a clinical data lakehouse.
Interested in Working Together?
I'm always open to discussing new opportunities and interesting projects.
Get In Touch