FHIR to OMOP FHIR IG
1.0.0-ballot - INFORMATIVE 1 - Ballot

This page is part of the Vulcan FHIR to OMOP FHIR Implementation Guide (v1.0.0-ballot: INFORMATIVE 1 Ballot 1) based on FHIR (HL7® FHIR® Standard) v5.0.0. No current official version has been published yet. For a full list of available versions, see the Directory of published versions.

Transformation Strategies & Best Practices

Page standards status: Informative

The successful implementation of FHIR-to-OMOP transformations requires a careful balance between OMOP's clinical data model philosophy and the comprehensive business and data provenance needs of modern healthcare organizations. By understanding where these approaches differ, implementers can design robust, scalable solutions that serve both clinical research and operational stakeholders effectively.

ETL Documentation

FHIR-to-OMOP implementations benefit from comprehensive ETL documentation that specifies mapping choices and clearly articulates the assumptions made during the transformation process. This documentation is a critical feature of implementations that scale over time, particularly when dealing with multiple source systems that may handle temporal data differently or have varying levels of data completeness. In any OMOP instance populated through a FHIR-to-OMOP transformation, differences in purpose and structure between the underlying FHIR sources and the OMOP CDM drive transformation choices, and those choices should be made to best serve the purpose of that specific OMOP instance. Aspects of ETL design rationale that should be documented include:

 * The limitations of certain mappings (e.g., MedicationRequest to drug_exposure).
 * Assumptions made during mapping (e.g., inferred exposure based on prescription data).
 * Guidance on when to filter data (e.g., removing planned procedures).
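
As an illustration of the filtering guidance in the last item above, the following minimal sketch separates Procedure resources by their status so that filtered records are counted and explained rather than silently dropped. It assumes resources are already parsed into Python dictionaries. The PERFORMED_STATUSES policy and the split_performed_procedures helper are hypothetical names introduced here, and which status values count as "performed" is an implementation decision that belongs in the ETL documentation.

```python
# Hypothetical policy: Procedure.status values treated as "actually performed".
# Whether values such as "in-progress" or "stopped" belong here is a local choice.
PERFORMED_STATUSES = {"completed"}


def split_performed_procedures(procedures):
    """Split Procedure resources into (kept, dropped).

    Dropped records carry a reason so the filter can be reported
    in the ETL documentation instead of disappearing silently.
    """
    kept, dropped = [], []
    for resource in procedures:
        status = resource.get("status")
        if status in PERFORMED_STATUSES:
            kept.append(resource)
        else:
            dropped.append({"id": resource.get("id"), "reason": f"status={status}"})
    return kept, dropped


procedures = [
    {"resourceType": "Procedure", "id": "p1", "status": "completed"},
    {"resourceType": "Procedure", "id": "p2", "status": "preparation"},
    {"resourceType": "Procedure", "id": "p3", "status": "not-done"},
]
kept, dropped = split_performed_procedures(procedures)
```

Here only p1 would be passed on to the OMOP load, while p2 and p3 are retained in the dropped list with the status that caused them to be filtered.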

This guide leverages common EHR transformation scenarios and includes detailed examples as a foundation to help users navigate edge cases and develop implementation-specific strategies for effective and consistent ETL from FHIR to OMOP, especially where FHIR resources may vary by source. For organizations where audit requirements mandate robust provenance tracking, designing custom OMOP extensions for needs such as identifier management or recorded date preservation represents a strategic investment in long-term data usability. These extensions should be carefully planned to avoid conflicts with standard OMOP conventions while providing the necessary metadata for compliance and quality assurance processes.

Source Value Preservation

Consistently preserving source values is a critical component of FHIR-to-OMOP transformation: it ensures data lineage, supports the maintenance of incremental data stores and future remapping efforts, and enables quality assurance validation procedures. This strategy accommodates vocabulary evolution and improved mapping algorithms, either of which may require reprocessing of source data, making original value retention essential for long-term data management.

Source value fields must always preserve original codes exactly as provided in FHIR resources, maintaining character-for-character accuracy to ensure complete traceability back to the source system. This includes maintaining any formatting, spacing, or special characters present in the original codes, as these may carry semantic meaning or system-specific significance that could be relevant for future processing or validation efforts. Source concept fields (the *_source_concept_id columns) are populated with OMOP concept_id values when source codes exist within the OHDSI Standardized Vocabularies, while unmapped codes are populated with "0" to indicate that they have no corresponding OMOP concept. The use of "0" for unmapped codes follows OMOP conventions and enables analytical queries to distinguish between successfully mapped and unmapped source data. This convention provides a clear indication of OMOP CDM mapping conformance for each mapping implemented, while maintaining a record of source codes and their OMOP representations when available.
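
The convention described above can be sketched as follows. This is a minimal illustration, assuming a FHIR Coding has been parsed into a Python dictionary. The SOURCE_CONCEPT_LOOKUP dictionary is a hypothetical stand-in for a query against the OMOP CONCEPT table, the concept_id in it is a placeholder rather than a real vocabulary value, and to_source_fields is a helper name introduced here.

```python
# Hypothetical stand-in for a lookup against the OMOP CONCEPT table,
# keyed by (FHIR system URI, code). The concept_id is a placeholder value.
SOURCE_CONCEPT_LOOKUP = {
    ("http://hl7.org/fhir/sid/icd-10-cm", "E11.9"): 1000001,
}


def to_source_fields(coding):
    """Derive OMOP source fields from a FHIR Coding.

    The code is stored exactly as received (no trimming or case folding).
    Codes without a match get source_concept_id 0, per OMOP convention.
    """
    code = coding.get("code")
    system = coding.get("system")
    concept_id = SOURCE_CONCEPT_LOOKUP.get((system, code), 0)
    return {"source_value": code, "source_concept_id": concept_id}


mapped = to_source_fields(
    {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9"}
)
unmapped = to_source_fields(
    {"system": "http://example.org/local-codes", "code": "LOCAL-42 "}
)
```

In this sketch the unmapped local code keeps its trailing space in source_value and receives a source_concept_id of 0, so later remapping can start from the untouched original.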

Future-proofing an OMOP database includes designing storage and documentation strategies that accommodate vocabulary evolution, improved mapping methodologies, and changing clinical terminology standards. A corollary best practice to source data preservation is the completion of transformation lineage documentation, including mapping decisions, prioritization choices, and any pre-processing or manual interventions performed during the transformation process. Together, these two steps enable future data validation efforts, support quality improvement initiatives, and provide the foundation for remapping activities when vocabulary updates or improved algorithms become available.

Granularity of FHIR Data vs. OMOP Standardization

FHIR resources can contain detailed data, such as drug dosage adjustments or specific intervals for medication administration, which might not have direct counterparts in OMOP’s more generalized tables. As a result, some FHIR data may be lost or generalized during transformation to OMOP, and this loss could impact certain use cases. When developing a data transformation from FHIR to OMOP, implementers should identify and document the potential data losses that result from a mismatch between source and target data granularity, so that data users understand the impact on, and potential limitations of, their analyses.

Differentiating Between Patient-Reported and Clinician-Verified Data

FHIR resources such as MedicationStatement often contain patient-reported information, which may be less reliable than data verified or documented by clinicians. In contrast, OMOP’s data model does not consistently distinguish between data sources in a way that clearly conveys differences in reliability or verification status. Treating all records as equivalent can introduce interpretive challenges and potential bias, especially when patient-reported and clinician-verified records are analyzed together.

For example, a medication history reported directly by a patient may not carry the same evidentiary weight as a medication order formally documented by a prescribing clinician. To address this limitation, the Implementation Guide recommends using OMOP’s observation_type_concept_id or drug_type_concept_id fields to indicate the provenance of each record. (See Type Concepts in the OMOP Common Data Model.) By explicitly tagging records with their source, implementers can support analyses that require higher confidence in data accuracy or that need to filter data based on verification status. This practice improves transparency and helps maintain analytic rigor in research contexts where the reliability of the underlying data is critical.
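
One possible way to apply this tagging is sketched below, assuming resources are parsed into Python dictionaries. The type concept IDs are assumed values from the OHDSI Type Concept vocabulary and should be verified against the installed vocabulary before use. The drug_type_concept_id function and the rule of treating a Patient or RelatedPerson informationSource as self-report are illustrative assumptions, not a mapping defined by this guide.

```python
# Assumed Type Concept IDs; verify against the installed OHDSI vocabulary.
PATIENT_SELF_REPORT = 32865
EHR_PRESCRIPTION = 32838
EHR = 32817


def drug_type_concept_id(resource):
    """Pick a drug_type_concept_id reflecting the provenance of a record."""
    resource_type = resource.get("resourceType")
    if resource_type == "MedicationRequest":
        return EHR_PRESCRIPTION
    if resource_type == "MedicationStatement":
        sources = resource.get("informationSource") or []
        if isinstance(sources, dict):
            # Some FHIR versions carry a single Reference instead of a list.
            sources = [sources]
        for source in sources:
            reference = source.get("reference", "")
            if reference.startswith(("Patient/", "RelatedPerson/")):
                return PATIENT_SELF_REPORT
    # Fallback when no more specific provenance can be determined.
    return EHR


order = {"resourceType": "MedicationRequest", "id": "mr1"}
self_reported = {
    "resourceType": "MedicationStatement",
    "id": "ms1",
    "informationSource": [{"reference": "Patient/123"}],
}
clinician_recorded = {
    "resourceType": "MedicationStatement",
    "id": "ms2",
    "informationSource": {"reference": "Practitioner/456"},
}
```

With tags assigned this way, an analysis that needs clinician-verified exposures can exclude rows whose drug_type_concept_id marks them as patient self-report.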