Commit

version 0.3
dkapitan committed Feb 16, 2024
1 parent 2d970d6 commit c1c8a26
Showing 3 changed files with 24 additions and 22 deletions.
Binary file modified paper-data-platforms.pdf
Binary file not shown.
10 changes: 5 additions & 5 deletions paper-data-platforms.qmd
@@ -130,13 +130,13 @@ Besides the two case studies, we will discuss how the design relates to other re

### Open standards: using FHIR as the common data model

-The recent convergence to FHIR as the de facto standard for information exchange has fuelled the development of OpenHIE. FHIR is currently used both for routine healthcare settings40 and clinical research settings [@duda2022hl7;@vorisek2022fast] and is increasingly being used in LMICs as well. The FHIR-native OpenSRP platform [@mehl2020open] has been deployed in 14 countries targeting various patient populations, amongst which a reference implementation of the WHO antenatal and neonatal care guidelines for midwives in Lombok, Indonesia [@summitinstitutefordevelopment2023bunda;@kurniawan2019midwife]. In India, FHIR is used as the underlying technology for the open Health Claims Exchange protocol specification, which has been adopted by the Indian government as the standard for e-claims handling [@hcx]. Additionally, the guidelines and standards of the African Union explicitly state FHIR is to be used as the messaging standard [@2023african]. This range of utilizations showcase the standards’ widespread applicability. The proceedings of the OpenHIE conference 2023 attest to the fact that FHIR and open source technologies are embraced as critical enablers in implementing health information exchanges in LMICs [@ohie23].
+The recent convergence to FHIR as the de facto standard for information exchange has fuelled the development of OpenHIE. FHIR is currently used both for routine healthcare settings[ @ayaz2021fast] and clinical research settings [@duda2022hl7;@vorisek2022fast] and is increasingly being used in LMICs as well. The FHIR-native OpenSRP platform [@mehl2020open] has been deployed in 14 countries targeting various patient populations, amongst which a reference implementation of the WHO antenatal and neonatal care guidelines for midwives in Lombok, Indonesia [@summitinstitutefordevelopment2023bunda;@kurniawan2019midwife]. In India, FHIR is used as the underlying technology for the open Health Claims Exchange protocol specification, which has been adopted by the Indian government as the standard for e-claims handling [@hcx]. Additionally, the guidelines and standards of the African Union explicitly state FHIR is to be used as the messaging standard [@2023african]. This range of utilizations showcase the standards’ widespread applicability. The proceedings of the OpenHIE conference 2023 attest to the fact that FHIR and open source technologies are embraced as critical enablers in implementing health information exchanges in LMICs [@ohie23].

However, despite the increased use of FHIR as a common data model, various studies have investigated its merits and performance vis-a-vis other healthcare standards. Comparisons between OpenEHR, ISO 13606, OMOP and FHIR have been made [@ayaz2023transforming;@mullie2023coda;@rinaldi2021openehr;@cremonesi2023need;@sinaci2023data]. A study involving 10 experts comparing OpenEHR, ISO 13606 and FHIR concluded that i) these three standards are functionally and technically compatible, and therefore can be used side by side; and that ii) each of these standards have their strengths and limitations that correlate with their intended use as summarized in the @tbl-comparison.

![Comparison of OpenEHR, ISO 13606 and FHIR standards](images/comparison-ehr-standards.png){width=auto #tbl-comparison}

-For an infectious diseases dataset with a limited scope, OpenEHR, OMOP and FHIR have been compared and found all to be equally suitable [@rinaldi2021openehr]. Comparing OMOP and FHIR, the latter has been found to support more granular mappings required for analytics and was therefore chosen as the standard for the CODA project[@mullie2023coda].
+For an infectious diseases dataset with a limited scope, OpenEHR, OMOP and FHIR have been compared and found all to be equally suitable [@rinaldi2021openehr]. Comparing OMOP and FHIR, the latter has been found to support more granular mappings required for analytics and was therefore chosen as the standard for the CODA project [@mullie2023coda].

Although FHIR was originally designed only for exchange between systems, we propose to use it as the common data model for the design presented here for the following reasons:

@@ -149,9 +149,9 @@ Although FHIR was originally designed only for exchange between systems, we prop

Data management and analytics platforms have undergone significant changes since the first generation of data warehouses were introduced. Recent studies have shown that the current practice has converged towards the lakehouse as one of the most commonly used solution designs [@armbrust2021lakehouse;@hai2023data;@harby2022data]. Lakehouses typically have a zonal architecture [@hai2023data] where data is ingested from the source systems in bulk (E), delivered to storage with aligned schemas (L) and transformed into a format ready for analysis (T). The discerning characteristic of the lakehouse architecture is its foundation on low-cost and directly-accessible storage that also provides traditional analytical DBMS management and performance features such as ACID transactions, data versioning, auditing, indexing, caching, and query optimization [@armbrust2021lakehouse]. Lakehouses thus combine the key benefits of data lakes and data warehouses: low-cost storage in an open format accessible by a variety of systems from the former, and powerful management and optimization features from the latter.

-With respect to current implementations of lakehouse data platform, we observe a proliferation of tools with as yet limited standards to improve technical interoperability. In the analysis of Pedreira et al. [@pedreira2023composable] the requirement for specialization in data management systems has evolved faster than our software development practices. This situation has created a siloed landscape composed of hundreds of products developed and maintained as monoliths, with limited reuse between systems. It has also affected the end users, who are often required to learn the idiosyncrasies of dozens of incompatible SQL and non-SQL API dialects, and settle for systems with incomplete functionality and inconsistent semantics. To remedy this, Pedreira et al. call to (re-)design and implement modern data platforms in terms of a 'composable data stack’ as a means to decrease development and maintenance cost and pick-up the speed of innovation.
+With respect to current implementations of lakehouse data platforms, we observe a proliferation of tools with as yet limited standards to improve technical interoperability. In the analysis of Pedreira et al. [@pedreira2023composable] the requirement for specialization in data management systems has evolved faster than our software development practices. This situation has created a siloed landscape composed of hundreds of products developed and maintained as monoliths, with limited reuse between systems. It has also affected the end users, who are often required to learn the idiosyncrasies of dozens of incompatible SQL and non-SQL API dialects, and settle for systems with incomplete functionality and inconsistent semantics. To remedy this, Pedreira et al. call to (re-)design and implement modern data platforms in terms of a 'composable data stack’ as a means to decrease development and maintenance cost and pick-up the speed of innovation.

-While the lakehouse architecture separating the concerns of compute and storage, the composable data stack takes the separation of concerns is taken one step further. A composable data system (@fig-composable-data-stack), not only separates the storage (layer 3) and execution (layer 2), but also separates the user interface (layer 1) from the execution engine by introducing standards for Intermediate Representation (standard A) and Connectivity (standard B). The composable data stack can be implemented with current open source technologies (@fig-cds-examples). As an example, the Ibis user interface is currently sufficiently mature to offer a standardized dataframe interface to 19 different execution engines.
+While the lakehouse architecture separates the concerns of compute and storage, the composable data stack takes the separation of concerns is taken one step further. A composable data system (@fig-composable-data-stack), not only separates the storage (layer 3) and execution (layer 2), but also separates the user interface (layer 1) from the execution engine by introducing standards for Intermediate Representation (standard A) and Connectivity (standard B). The composable data stack can be implemented with current open source technologies (@fig-cds-examples). As an example, the Ibis user interface is currently sufficiently mature to offer a standardized dataframe interface to 19 different execution engines.

![Composable data stack](images/composable-data-stack.png){#fig-composable-data-stack}

@@ -170,7 +170,7 @@ Following Hai we take a subset of the core functionalities of a data lake. Becau

### Open technologies: available digital public goods

-Many components of the OpenHIE specification are now available as a digital public good. @tbl-digital-public-goods lists components that are currently available for implementing the OpenHIE framework using open source, digital public goods that are compliant with the FHIR standard, illustrating the maturity of this ecosystem and development community. With the launch of the Instant OpenHIE configuration toolkit54, it has become easier to set up, explore and develop HIEs thereby reducing costs and skills required for software developers to deploy an OpenHIE architecture for quicker solution testing and as a starting point for faster production implementation and customisation. Several frameworks are available that offer a set of preconfigured components out of the box, such as for example:
+Many components of the OpenHIE specification are now available as a digital public goods. @tbl-digital-public-goods lists components that are currently available for implementing the OpenHIE framework using open source, digital public goods that are compliant with the FHIR standard, illustrating the maturity of this ecosystem and development community. With the launch of the Instant OpenHIE configuration toolkit[@InstantOpenHIEv2], it has become easier to set up, explore and develop HIEs thereby reducing costs and skills required for software developers to deploy an OpenHIE architecture for quicker solution testing and as a starting point for faster production implementation and customisation. Several frameworks are available that offer a set of preconfigured components out of the box, such as for example:

- the “Open Smart Register Platform” (OpenSRP), that focuses on providing a mobile-first platform, including a FHIR native app designed to support the WHO Smart Guidelines
- the OpenHIM Platform, a reference implementation of the Instant OpenHIE framework, providing an easy way to set up, manage and operate various HIE configurations
36 changes: 19 additions & 17 deletions paper-data-platforms.tex
@@ -474,12 +474,13 @@ \subsection{Open standards: using FHIR as the common data

The recent convergence to FHIR as the de facto standard for information
exchange has fuelled the development of OpenHIE. FHIR is currently used
-both for routine healthcare settings40 and clinical research settings
-\citep{duda2022hl7, vorisek2022fast} and is increasingly being used in
-LMICs as well. The FHIR-native OpenSRP platform \citep{mehl2020open} has
-been deployed in 14 countries targeting various patient populations,
-amongst which a reference implementation of the WHO antenatal and
-neonatal care guidelines for midwives in Lombok, Indonesia
+both for routine healthcare settings\citep{ayaz2021fast} and clinical
+research settings \citep{duda2022hl7, vorisek2022fast} and is
+increasingly being used in LMICs as well. The FHIR-native OpenSRP
+platform \citep{mehl2020open} has been deployed in 14 countries
+targeting various patient populations, amongst which a reference
+implementation of the WHO antenatal and neonatal care guidelines for
+midwives in Lombok, Indonesia
\citep{summitinstitutefordevelopment2023bunda, kurniawan2019midwife}. In
India, FHIR is used as the underlying technology for the open Health
Claims Exchange protocol specification, which has been adopted by the
@@ -520,8 +521,8 @@ \subsection{Open standards: using FHIR as the common data
and FHIR have been compared and found all to be equally suitable
\citep{rinaldi2021openehr}. Comparing OMOP and FHIR, the latter has been
found to support more granular mappings required for analytics and was
-therefore chosen as the standard for the CODA
-project\citep{mullie2023coda}.
+therefore chosen as the standard for the CODA project
+\citep{mullie2023coda}.

Although FHIR was originally designed only for exchange between systems,
we propose to use it as the common data model for the design presented
@@ -577,7 +578,7 @@ \subsection{Open architecture: extending OpenHIE framework with a
former, and powerful management and optimization features from the
latter.

-With respect to current implementations of lakehouse data platform, we
+With respect to current implementations of lakehouse data platforms, we
observe a proliferation of tools with as yet limited standards to
improve technical interoperability. In the analysis of Pedreira et al.
\citep{pedreira2023composable} the requirement for specialization in
@@ -592,7 +593,7 @@ \subsection{Open architecture: extending OpenHIE framework with a
a `composable data stack' as a means to decrease development and
maintenance cost and pick-up the speed of innovation.

-While the lakehouse architecture separating the concerns of compute and
+While the lakehouse architecture separates the concerns of compute and
storage, the composable data stack takes the separation of concerns is
taken one step further. A composable data system
(Figure~\ref{fig-composable-data-stack}), not only separates the storage
@@ -658,17 +659,18 @@ \subsection{Open technologies: available digital public
goods}\label{open-technologies-available-digital-public-goods}

Many components of the OpenHIE specification are now available as a
-digital public good. Table~\ref{tbl-digital-public-goods} lists
+digital public goods. Table~\ref{tbl-digital-public-goods} lists
components that are currently available for implementing the OpenHIE
framework using open source, digital public goods that are compliant
with the FHIR standard, illustrating the maturity of this ecosystem and
development community. With the launch of the Instant OpenHIE
-configuration toolkit54, it has become easier to set up, explore and
-develop HIEs thereby reducing costs and skills required for software
-developers to deploy an OpenHIE architecture for quicker solution
-testing and as a starting point for faster production implementation and
-customisation. Several frameworks are available that offer a set of
-preconfigured components out of the box, such as for example:
+configuration toolkit\citep{InstantOpenHIEv2}, it has become easier to
+set up, explore and develop HIEs thereby reducing costs and skills
+required for software developers to deploy an OpenHIE architecture for
+quicker solution testing and as a starting point for faster production
+implementation and customisation. Several frameworks are available that
+offer a set of preconfigured components out of the box, such as for
+example:

\begin{itemize}
\tightlist