
---
title: "Readme for COST and PRICE tables in the common data model"
output:
  pdf_document:
    toc: yes
  html_document:
    toc: yes
    toc_float: yes
---

# COST and PRICE tables in the OMOP common data model

The Observational Medical Outcomes Partnership common data model (OMOP CDM) is used to represent electronic health records for over 10% of the world's population.

This branch of the common data model, called payless_health, is used to develop two new tables:

- COST, representing a hospital's internal representations of costs (from billing systems, HL7 FHIR messages, etc.)
- PRICE, representing publicly available price information from a care site such as a hospital, pharmacy, or clinic.

Example prototype dashboard using public data, used to inform the development of the COST and PRICE tables. (The open-source Jupyter notebook, in Python, for this dashboard is here.)

## Use cases for the COST and PRICE tables

We have identified several use cases for the COST and PRICE tables with the collaborators listed below:

- [Operations] Populating the Payer Plan Period for patients, with payors mapped to the Source of Payment Typology.
- [Cost Effectiveness] Connecting care to cost. Understanding value-based care contracts with the OMOP data model (e.g. using FHIR messages to populate the COST and PRICE tables in OMOP) supports repeatable analyses that demonstrate return on investment and clarify the impact of value-based care on health outcomes and costs.
- [Health Economics & Outcomes] Linking to the N3C data enclave (https://ncats.nih.gov/n3c).
- [Health Economics & Outcomes; Policy Research] Understanding how insurer market concentration affects for-profit and non-profit hospitals (e.g. replicating this type of study: https://www.healthaffairs.org/doi/10.1377/hlthaff.2022.01184). Similarly, in health deserts (areas with a lack of hospitals), some hospitals have been found to charge insurance companies many multiples of the Medicare cost. The COST and PRICE tables can help characterize these marketplace dynamics and how they affect health system access, usage, costs, and outcomes.
- [Health Equity; Health Economics & Outcomes] Understanding patterns in the 4000+ hospital price sheets that have been aggregated, and linking these to social and environmental determinants of health such as the area deprivation index.
- [Observational Health Studies] Using the phenotype workflow developed in collaboration with the Phenotype Development & Evaluation Workgroup to assess unobserved confounding due to COST and PRICE information being unavailable or unreliable in electronic health record databases, and often unreliable in claims databases as well.
- [Care Navigation] Public price information collected from 4000+ hospitals can be used to help patients and employers make decisions about health care benefits. Examples like the demo above (code) show how prices vary by insurance product. The common data model can help people and organizations link information contained in explanations of benefits or claim feeds to public price information; this can inform how benefits are structured to optimize both health care costs and health outcomes.
- [Health Economics & Outcomes Research; Policy Research] Countries like the United States and Estonia both have public information about the price of health care, and both have several academic medical centers that use the OMOP common data model. Cross-country comparisons can help highlight the cost effectiveness of care delivery decisions in single-payer systems, decisions like global budgeting (as in Maryland), and their impacts on access to care across states and countries.
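
As a concrete illustration of the care-navigation use case, the sketch below shows the kind of cross-payer price comparison these tables are meant to enable. It uses Python with an in-memory SQLite database and a hypothetical, radically simplified price table — the column names and values here are illustrative only, not the actual PRICE schema, which is still being designed:

```python
import sqlite3

# Illustrative sketch only: the real PRICE table schema is still under
# development, so these column names are hypothetical simplifications.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price (
        care_site_name   TEXT,
        procedure_code   TEXT,  -- e.g. a CPT code
        payer_name       TEXT,
        negotiated_price REAL
    )
""")
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?, ?)",
    [
        ("Hospital A", "70450", "Insurer X", 325.00),
        ("Hospital A", "70450", "Insurer Y", 510.00),
        ("Hospital B", "70450", "Insurer X", 275.00),
    ],
)

# How much does the price of the same procedure vary across payers and sites?
rows = conn.execute("""
    SELECT procedure_code,
           MIN(negotiated_price) AS lowest,
           MAX(negotiated_price) AS highest
    FROM price
    GROUP BY procedure_code
""").fetchall()
print(rows)  # [('70450', 275.0, 510.0)]
```

The same query against a populated PRICE table, joined to an explanation-of-benefits feed, is what would let a patient or employer see the spread of negotiated prices for a given service.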

## Who is involved

The need for this branch was determined in collaboration with the Observational Health Data Sciences and Informatics (OHDSI) collaborative, specifically through a series of presentations at several workgroups (https://www.ohdsi.org/workgroups/):

Folks who are involved or have expressed interest in use cases for the COST and PRICE tables of the common data model across these workgroups:

## Schema and vocabulary information for PRICE and COST tables

We base the design of the PRICE and COST tables on several sources:

## How to Use this Repository

If you are looking for the SQL DDLs and don't wish to generate them through R, they can be accessed here.

If you are looking for information on how to submit a bug fix, skip to the Bug Fixes/Model Updates section below.

### Generating the DDLs

This section demonstrates two ways the CDM R package can be used to create the CDM tables in your environment. First, the buildRelease function creates the DDL files on your machine; this is intended for end users who wish to generate the scripts from R without cloning or downloading the source code from GitHub. The SQL scripts created through this process are available as zip files as part of the latest release. They are also available on the master branch here.

Second, the script shows the executeDdl function, which connects directly to your database (assuming your dbms is one of the supported dialects) and instantiates the tables through R.

#### Dependencies and prerequisites

This process requires R (RStudio is recommended) as well as the DatabaseConnector and SqlRender packages.

### Create DDL, Foreign Keys, Primary Keys, and Indexes from R

First, install the package from GitHub:

```r
install.packages("devtools")
devtools::install_github("OHDSI/CommonDataModel")
```

List the currently supported SQL dialects:

```r
CommonDataModel::listSupportedDialects()
```

List the currently supported CDM versions:

```r
CommonDataModel::listSupportedVersions()
```

#### 1. Use the buildRelease function

This function will generate the text files in the dialect you choose, putting the output files in the folder you specify.

```r
CommonDataModel::buildRelease(cdmVersions = "5.4",
                              targetDialects = "postgresql",
                              outputfolder = "/pathToOutput")
```

#### 2. Use the executeDdl function

If you have an empty schema ready to go, the package will connect and instantiate the tables for you. To start, you need to install DatabaseConnector in order to connect to your database.

```r
devtools::install_github("OHDSI/DatabaseConnector")

cd <- DatabaseConnector::createConnectionDetails(dbms = "postgresql",
                                                 server = "localhost/ohdsi",
                                                 user = "postgres",
                                                 password = "postgres",
                                                 pathToDriver = "/pathToDriver")

CommonDataModel::executeDdl(connectionDetails = cd,
                            cdmVersion = "5.4",
                            cdmDatabaseSchema = "ohdsi_demo")
```

## Bug Fixes/Model Updates

NOTE: This information is for the maintainers of the CDM as well as anyone looking to submit a pull request. If you want to suggest an update or addition to the OMOP Common Data Model itself, please open an issue using the proposal template. The instructions below describe the process by which bugs in the DDL code should be addressed and/or new versions of the CDM produced.

Just looking for the latest version of the CDM and don't care about the R package? Please visit the releases tab and download the latest release. It includes the DDLs for all currently supported versions of the CDM in all supported SQL dialects.

Typically, new CDM versions and updates are decided by the CDM working group (details on joining the meetings are on the homepage). These changes are tracked as issues in the GitHub repo. Once the working group decides which changes make up a version, all the corresponding issues should be tagged with a version number, e.g. v5.4, and added to a project board.

### Step 0

Changes to the model structure should be made in the representative csv files by adding, removing, or renaming fields or tables. ETL conventions are not currently tracked by CDM version unless they are conventions specific to new fields (for example, CONDITION_STATUS was added in v5.3, which specifies the way in which primary condition designations should be captured).

Bug fixes are made much the same way using the csv files, but they should be limited to typos, primary/foreign key relationships, and formatting (like datetime vs datetime2).
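
As an illustration, a datatype fix of the kind described above might look like the following diff against a Field_Level.csv. The row shown is abbreviated (trailing columns elided) and purely illustrative, not an actual outstanding bug:

```diff
-PERSON,birth_datetime,No,datetime2,...
+PERSON,birth_datetime,No,datetime,...
```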

### Step 1

If you are making changes to the model structure, request a new branch in the CommonDataModel repository for the new version of the CDM you are creating. Then fork the repository and clone the newly made branch. If you are squashing bugs, fork the repository and clone the master branch.

#### Step 1.1

For changes to the model structure, rename the table level and field level inst/csv files from the current released version to the new version. For example, if the new version you are creating is v5.4 and the most recent released version is v5.3, rename "OMOP_CDMv5.3_Field_Level.csv" and "OMOP_CDMv5.3_Table_Level.csv" to "OMOP_CDMv5.4_Field_Level.csv" and "OMOP_CDMv5.4_Table_Level.csv". These files serve multiple functions: they are the basis for the CDM DDL, the CDM documentation, and the Data Quality Dashboard (DQD). You can read more about the DQD here.

For squashing bugs, make the necessary changes in the csv file corresponding to the major.minor version you are fixing. For example, if you are working on fixes to v5.3.3, you would make changes in the v5.3 files. (Skip to Step 2.)

#### Step 1.2

The csv files can now be updated with the changes and additions for the new CDM version. If a new table should be added, add a line to the Table_Level.csv with the table name and description and list it as part of the CDM schema; the remaining columns are quality checks that can be run (details on what those are can be found here). After adding any tables, make any changes or additions to CDM fields in the Field_Level.csv. Its columns are meant to mimic how a DDL is structured, which is how it will eventually be generated: a yes in the isRequired field indicates a NOT NULL constraint, and the datatype field should be filled in exactly as it would look in the DDL. Any additions or changes should also be reflected in the userGuidance and etlConventions fields, which are the basis for the documentation. DO NOT MAKE ANY CHANGES IN THE DDL ITSELF. The structure is set up so that the csv files are the ground truth; if changes are made in the DDL instead of the csv files, the DDL will be out of sync with the documentation and the DQD.
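
To make the csv-driven workflow concrete, here is a sketch of what Field_Level.csv entries might look like. The real file has many more columns (including the quality-check columns); the column subset and example values below are illustrative only:

```csv
cdmTableName,cdmFieldName,isRequired,cdmDatatype,userGuidance,etlConventions
COST,cost_id,Yes,integer,"A unique identifier for each cost record.","Assign a sequential integer."
COST,cost_event_id,Yes,integer,"The identifier of the event (e.g. visit, procedure) the cost applies to.","Map from the source billing record."
```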

### Step 2

Once all changes are made to the csvs, rebuild the package and then open extras/codeToRun.R. To make sure your new version is recognized by the package, run the function listSupportedVersions(). If you do not see it, check that your new csv files are in inst/csv and that you have rebuilt the package. Once you have confirmed that the package recognizes your new version, run the function buildRelease(). You should now see a file in inst/ddl for your new version.

## NOTE ABOUT CDM v6.0

Please be aware that v6.0 of the OMOP CDM is not fully supported by the OHDSI suite of tools and methods. The major difference between CDM v5.3 and CDM v6.0 is that the `*_datetime` fields become mandatory rather than optional. This switch radically changes the assumptions related to exposure and outcome timing. Rather than move forward with v6.0, please transform your data to CDM v5.4 until the community has fully defined the role of dates vs datetimes, both in the model and in the evidence we generate.