OMOP Common Data Model Documentation

CDM v5.3.1

PERSON

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
person_id It is assumed that every person with a different unique identifier is in fact a different person and should be treated independently. Any person linkage that needs to occur to identify unique persons should be done prior to ETL. integer Yes Yes No
gender_concept_id This field is meant to capture the biological sex at birth of the Person. This field should not be used to study gender identity issues. Use the gender or sex value present in the data under the assumption that it is the biological sex at birth. If the source data captures gender identity it should be stored in the OBSERVATION table. integer Yes No Yes CONCEPT Gender
year_of_birth For data sources with date of birth, the year is extracted. For data sources where the year of birth is not available, the approximate year of birth is derived based on any age group categorization available. integer Yes No No
month_of_birth For data sources that provide the precise date of birth, the month is extracted and stored in this field. integer No No No
day_of_birth For data sources that provide the precise date of birth, the day is extracted and stored in this field. integer No No No
birth_datetime Compute age using birth_datetime. For data sources that provide the precise datetime of birth, store that value in this field. If birth_datetime is not provided in the source, use the following logic to infer the date: if day_of_birth is null and month_of_birth is not null, use the first day of that month (month/1/year); if month_of_birth is null and day_of_birth is not null, use that day in January (1/day/year); if both day_of_birth and month_of_birth are null, use January 1 (1/1/year). If time of birth is not given, use midnight (00:00:00). datetime No No No
race_concept_id integer Yes No Yes CONCEPT Race
ethnicity_concept_id Ethnic backgrounds as subsets of race. The OMOP CDM adheres to the OMB standards so only Concepts that represent “Hispanic” and “Not Hispanic” are stored here. If a source has more granular ethnicity information it can be found in the field ethnicity_source_value. Ethnicity in the OMOP CDM follows the OMB Standards for Data on Race and Ethnicity: Only distinctions between Hispanics and Non-Hispanics are made. If a source provides more granular ethnicity information it should be stored in the field ethnicity_source_value. integer Yes No Yes CONCEPT Ethnicity
location_id The location refers to the physical address of the person. Put the location_id from the LOCATION table here that represents the most granular location information for the person. This could be zip code, state, or county for example. integer No No Yes LOCATION
provider_id The Provider refers to the last known primary care provider (General Practitioner). Put the provider_id from the PROVIDER table of the last known general practitioner of the person. integer No No Yes PROVIDER
care_site_id The Care Site refers to where the Provider typically provides the primary care. integer No No Yes CARE_SITE
person_source_value Use this field to link back to persons in the source data. This is typically used for error checking of ETL logic. Some use cases require the ability to link back to persons in the source data. This field allows for the storing of the person value as it appears in the source. varchar(50) No No No
gender_source_value This field is used to store the biological sex of the person from the source data. It is not intended for use in standard analytics but for reference only. Put the biological sex of the person as it appears in the source data. varchar(50) No No No
gender_source_concept_id If the source data codes biological sex in a non-standard vocabulary, store the concept_id here. integer No No Yes CONCEPT
race_source_value This field is used to store the race of the person from the source data. It is not intended for use in standard analytics but for reference only. Put the race of the person as it appears in the source data. varchar(50) No No No
race_source_concept_id If the source data codes race in an OMOP supported vocabulary store the concept_id here. integer No No Yes CONCEPT
ethnicity_source_value This field is used to store the ethnicity of the person from the source data. It is not intended for use in standard analytics but for reference only. If the person has an ethnicity other than the OMB standard of “Hispanic” or “Not Hispanic” store that value from the source data here. varchar(50) No No No
ethnicity_source_concept_id If the source data codes ethnicity in an OMOP supported vocabulary, store the concept_id here. integer No No Yes CONCEPT
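
The birth_datetime fallback rules in the PERSON table can be sketched in Python as follows. This is a minimal illustration, not part of the CDM specification: the function name `infer_birth_datetime` and its parameters are hypothetical, and it assumes the source supplies no time of birth, so midnight is always used.

```python
from datetime import datetime
from typing import Optional


def infer_birth_datetime(
    year_of_birth: int,
    month_of_birth: Optional[int] = None,
    day_of_birth: Optional[int] = None,
) -> datetime:
    """Infer birth_datetime when the source has no precise datetime of birth.

    Hypothetical helper: a missing month falls back to January and a
    missing day falls back to the 1st, as the birth_datetime row describes.
    """
    month = month_of_birth if month_of_birth is not None else 1
    day = day_of_birth if day_of_birth is not None else 1
    # Assumption: no time of birth is available, so the time is midnight.
    return datetime(year_of_birth, month, day, 0, 0, 0)
```

For example, a person with year_of_birth 1980, month_of_birth 5, and no day_of_birth would get May 1, 1980 at midnight.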

OBSERVATION_PERIOD

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
observation_period_id A Person can have multiple discrete observation periods which are identified by the Observation_Period_Id. It is assumed that the observation period covers the period of time for which we know events occurred for the Person. In the context of the Common Data Model the absence of events during an observation period implies that the event did not occur. Assign a unique observation_period_id to each discrete observation period for a Person. An observation period should be the length of time for which we know events occurred for the Person. It may take some logic to define an observation period, especially when working with EHR or registry data. Often if no enrollment or coverage information is given an observation period is defined as the time between the earliest record and the latest record available for a person. integer Yes Yes No
person_id integer Yes No Yes PERSON
observation_period_start_date Use this date to determine the start date of the period for which we can assume that all events for a Person are recorded and any absence of records indicates an absence of events. It is often the case that the idea of observation periods does not exist in source data. In those cases the observation_period_start_date can be inferred as the earliest event date available for the Person. In US claims, the observation period can be considered as the time period the person is enrolled with an insurer. If a Person switches plans but stays with the same insurer, that change would be captured in payer_plan_period. date Yes No No
observation_period_end_date Use this date to determine the end date of the period for which we can assume that all events for a Person are recorded and any absence of records indicates an absence of events. It is often the case that the idea of observation periods does not exist in source data. In those cases the observation_period_end_date can be inferred as the latest event date available for the Person. The event dates include insurance enrollment dates. date Yes No No
period_type_concept_id This field can be used to determine the provenance of the observation period as in whether the period was determined from an insurance enrollment file or if it was determined from EHR healthcare encounters. Choose the period_type_concept_id that best represents how the period was determined. integer Yes No Yes CONCEPT Type Concept
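
When the source has no enrollment or coverage information, the convention above of taking the earliest and latest event dates per person could look roughly like this in Python. It is a sketch under stated assumptions: the function name `derive_observation_periods` and the `(person_id, event_date)` input shape are hypothetical, and it produces a single period per person without splitting on gaps.

```python
from datetime import date
from typing import Dict, Iterable, Tuple


def derive_observation_periods(
    events: Iterable[Tuple[int, date]],
) -> Dict[int, Tuple[date, date]]:
    """Map each person_id to (observation_period_start_date, observation_period_end_date).

    Hypothetical helper: the start is the earliest event date and the end
    is the latest event date seen for that person.
    """
    periods: Dict[int, Tuple[date, date]] = {}
    for person_id, event_date in events:
        if person_id not in periods:
            # First event for this person opens a one-day period.
            periods[person_id] = (event_date, event_date)
        else:
            start, end = periods[person_id]
            periods[person_id] = (min(start, event_date), max(end, event_date))
    return periods
```

Each resulting pair would still need its own observation_period_id and a period_type_concept_id that reflects how the period was determined.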

VISIT_OCCURRENCE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
visit_occurrence_id Use this to identify unique interactions between a person and the health care system. This identifier links across the other CDM event tables to associate events with a visit. This should be populated by creating a unique identifier for each unique interaction between a person and the healthcare system where the person receives a medical good or service over a span of time. integer Yes Yes No
person_id integer Yes No Yes PERSON
visit_concept_id This field contains a concept id representing the kind of visit, like inpatient or outpatient. Populate this field based on the kind of visit that took place for the person. For example this could be “Inpatient Visit”, “Outpatient Visit”, “Ambulatory Visit”, etc. It is often the case that some logic should be written for how to define visits and how to assign Visit_Concept_Id. In US claims outpatient visits that appear to occur within the time period of an inpatient visit can be rolled into one with the same Visit_Occurrence_Id. In EHR data inpatient visits that are within one day of each other may be strung together to create one visit. It will all depend on the source data and how encounter records should be translated to visit occurrences. integer Yes No Yes CONCEPT Visit
visit_start_date For inpatient visits, the start date is typically the admission date. For outpatient visits the start date and end date will be the same. When populating visit_start_date, you will first have to make decisions on how to define visits. In some cases visits in the source data can be strung together if there are one or fewer days between them. date Yes No No
visit_start_datetime If no time is given for the start date of a visit, set it to midnight (00:00:00). datetime No No No
visit_end_date For inpatient visits the end date is typically the discharge date. Visit end dates are mandatory. If end dates are not provided in the source, they can be derived as follows: Outpatient Visit: visit_end_datetime = visit_start_datetime. Emergency Room Visit: visit_end_datetime = visit_start_datetime. Inpatient Visit: Usually there is information about discharge. If not, you should be able to derive the end date from the sudden decline of activity or from the absence of inpatient procedures/drugs. Non-hospital institution Visits: Particularly for claims data, if end dates are not provided assume the visit is for the duration of the month in which it occurs. For Inpatient Visits ongoing at the date of ETL, put the date of processing the data into visit_end_datetime and set visit_type_concept_id to 32220 “Still patient” to identify the visit as incomplete. All other Visits: visit_end_datetime = visit_start_datetime. If this is a one-day visit the end date should match the start date. date Yes No No
visit_end_datetime If no time is given for the end date of a visit, set it to midnight (00:00:00). datetime No No No
visit_type_concept_id Use this field to understand the provenance of the visit record, or where the record comes from. Populate this field based on the provenance of the visit record, as in whether it came from an EHR record or billing claim. integer Yes No Yes CONCEPT Type Concept
provider_id There will only be one provider per visit. If multiple providers are associated with a visit that information can be found in the VISIT_DETAIL table. If there are multiple providers associated with a visit, you will need to choose which one to put here. The additional providers can be stored in the visit_detail table. integer No No No PROVIDER
care_site_id This field provides information about the care site where the visit took place. There should only be one care site associated with a visit. integer No No No CARE_SITE
visit_source_value This field houses the verbatim value from the source data representing the kind of visit that took place (inpatient, outpatient, emergency, etc.). If there is information about the kind of visit in the source data that value should be stored here. If a visit is an amalgamation of visits from the source then use a hierarchy to choose the visit source value, such as IP -> ER -> OP. This should line up with the logic chosen to determine how visits are created. varchar(50) No No No
visit_source_concept_id If the visit source value is coded in the source data using an OMOP supported vocabulary put the concept id representing the source value here. integer No No Yes CONCEPT
admitting_source_concept_id Use this field to determine where the patient was admitted from. This concept is part of the visit domain and can indicate if a patient was admitted to the hospital from a long-term care facility, for example. If available, map the admitting_source_value to a standard concept in the visit domain. integer No No Yes CONCEPT Visit
admitting_source_value This information may be called something different in the source data but the field is meant to contain a value indicating where a person was admitted from. Typically this applies only to visits that have a length of stay, like inpatient visits or long-term care visits. varchar(50) No No No
discharge_to_concept_id Use this field to determine where the patient was discharged to after a visit. This concept is part of the visit domain and can indicate if a patient was discharged to home or sent to a long-term care facility, for example. If available, map the discharge_to_source_value to a standard concept in the visit domain. integer No No Yes CONCEPT Visit
discharge_to_source_value This information may be called something different in the source data but the field is meant to contain a value indicating where a person was discharged to after a visit, as in they went home or were moved to long-term care. Typically this applies only to visits that have a length of stay of a day or more. varchar(50) No No No
preceding_visit_occurrence_id Use this field to find the visit that occurred for the person prior to the given visit. There could be a few days or a few years in between. The preceding_visit_occurrence_id can be used to link a visit immediately preceding the current visit. Note this is not symmetrical, and there is no such thing as a “following_visit_id”. integer No No Yes VISIT_OCCURRENCE
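
The visit_end_date conventions for sources without end dates might be sketched in Python as below. Everything here is illustrative: the function name `derive_visit_end`, the `visit_kind` labels, and the `last_activity` and `etl_run` parameters are hypothetical. In particular, `last_activity` stands in for whatever decline-of-activity logic an ETL uses to estimate an inpatient discharge, which this sketch does not implement.

```python
import calendar
from datetime import datetime
from typing import Optional, Tuple

# Concept id cited in the visit_end_date row for visits still ongoing at ETL time.
STILL_PATIENT_TYPE_CONCEPT_ID = 32220


def derive_visit_end(
    visit_kind: str,
    visit_start: datetime,
    source_end: Optional[datetime] = None,
    last_activity: Optional[datetime] = None,
    etl_run: Optional[datetime] = None,
) -> Tuple[datetime, Optional[int]]:
    """Return (visit_end_datetime, visit_type_concept_id override or None)."""
    if source_end is not None:
        # The source provides an end date, so nothing needs to be derived.
        return source_end, None
    if visit_kind == "inpatient":
        if last_activity is not None:
            # Assumption: the caller already estimated discharge from activity.
            return last_activity, None
        if etl_run is None:
            raise ValueError("etl_run is required for an ongoing inpatient visit")
        # Ongoing at ETL: use the processing date and flag the visit as incomplete.
        return etl_run, STILL_PATIENT_TYPE_CONCEPT_ID
    if visit_kind == "non_hospital_institution":
        # Assume the visit lasts through the end of the month in which it starts.
        last_day = calendar.monthrange(visit_start.year, visit_start.month)[1]
        return visit_start.replace(day=last_day), None
    # Outpatient, emergency room, and all other visits end when they start.
    return visit_start, None
```

Returning the type concept override alongside the end datetime keeps the "Still patient" flag tied to the same decision that produced the end date, though an ETL could just as well set the two fields separately.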