OMOP Common Data Model Documentation

CDM v5.3.1

Documentation still in development

PERSON

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
person_id It is assumed that every person with a different unique identifier is in fact a different person and should be treated independently. Any person linkage that needs to occur to identify unique persons should be done prior to ETL. integer Yes Yes No
gender_concept_id This field is meant to capture the biological sex at birth of the Person. This field should not be used to study gender identity issues. Use the gender or sex value present in the data under the assumption that it is the biological sex at birth. If the source data captures gender identity it should be stored in the OBSERVATION table. integer Yes No Yes CONCEPT Gender
year_of_birth For data sources with date of birth, the year is extracted. For data sources where the year of birth is not available, the approximate year of birth is derived based on any age group categorization available. integer Yes No No
month_of_birth For data sources that provide the precise date of birth, the month is extracted and stored in this field. integer No No No
day_of_birth For data sources that provide the precise date of birth, the day is extracted and stored in this field. integer No No No
birth_datetime Compute age using birth_datetime. For data sources that provide the precise datetime of birth, store that value in this field. If birth_datetime is not provided in the source, use the following logic to infer the date: if day_of_birth is null and month_of_birth is not null, use month/1/year; if month_of_birth is null and day_of_birth is not null, use 1/day/year; if both day_of_birth and month_of_birth are null, use 1/1/year. If the time of birth is not given, use midnight (00:00:00). datetime No No No
race_concept_id integer Yes No Yes CONCEPT Race
ethnicity_concept_id Ethnic backgrounds as subsets of race. The OMOP CDM adheres to the OMB standards so only Concepts that represent “Hispanic” and “Not Hispanic” are stored here. If a source has more granular ethnicity information it can be found in the field ethnicity_source_value. Ethnicity in the OMOP CDM follows the OMB Standards for Data on Race and Ethnicity: Only distinctions between Hispanics and Non-Hispanics are made. If a source provides more granular ethnicity information it should be stored in the field ethnicity_source_value. integer Yes No Yes CONCEPT Ethnicity
location_id The location refers to the physical address of the person. Put the location_id from the LOCATION table here that represents the most granular location information for the person. This could be zip code, state, or county for example. integer No No Yes LOCATION
provider_id The Provider refers to the last known primary care provider (General Practitioner). Put the provider_id from the PROVIDER table of the last known general practitioner of the person. integer No No Yes PROVIDER
care_site_id The Care Site refers to where the Provider typically provides the primary care. integer No No Yes CARE_SITE
person_source_value Use this field to link back to persons in the source data. This is typically used for error checking of ETL logic. Some use cases require the ability to link back to persons in the source data. This field allows for the storing of the person value as it appears in the source. varchar(50) No No No
gender_source_value This field is used to store the biological sex of the person from the source data. It is not intended for use in standard analytics but for reference only. Put the biological sex of the person as it appears in the source data. varchar(50) No No No
gender_source_concept_id If the source data codes biological sex in a non-standard vocabulary, store the concept_id here. Integer No No Yes CONCEPT
race_source_value This field is used to store the race of the person from the source data. It is not intended for use in standard analytics but for reference only. Put the race of the person as it appears in the source data. varchar(50) No No No
race_source_concept_id If the source data codes race in an OMOP supported vocabulary store the concept_id here. Integer No No Yes CONCEPT
ethnicity_source_value This field is used to store the ethnicity of the person from the source data. It is not intended for use in standard analytics but for reference only. If the person has an ethnicity other than the OMB standard of “Hispanic” or “Not Hispanic” store that value from the source data here. varchar(50) No No No
ethnicity_source_concept_id If the source data codes ethnicity in an OMOP supported vocabulary, store the concept_id here. Integer No No Yes CONCEPT
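
The birth_datetime fallback rules in the PERSON table can be sketched in Python as follows. This is a minimal illustration, not part of the CDM specification; the function name `infer_birth_datetime` is hypothetical, and the sketch assumes year_of_birth is always present (it is a required field).

```python
from datetime import datetime
from typing import Optional


def infer_birth_datetime(
    year_of_birth: int,
    month_of_birth: Optional[int] = None,
    day_of_birth: Optional[int] = None,
) -> datetime:
    """Infer birth_datetime when the source has no precise datetime of birth.

    A missing month defaults to January, a missing day defaults to the 1st,
    and the time of day is set to midnight.
    """
    month = month_of_birth if month_of_birth is not None else 1
    day = day_of_birth if day_of_birth is not None else 1
    return datetime(year_of_birth, month, day, 0, 0, 0)
```

For example, a person with only a year and month of birth (1980, May) would receive a birth_datetime of midnight on 1 May 1980.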

OBSERVATION_PERIOD

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
observation_period_id A Person can have multiple discrete observation periods which are identified by the Observation_Period_Id. It is assumed that the observation period covers the period of time for which we know events occurred for the Person. In the context of the Common Data Model the absence of events during an observation period implies that the event did not occur. Assign a unique observation_period_id to each discrete observation period for a Person. An observation period should be the length of time for which we know events occurred for the Person. It may take some logic to define an observation period, especially when working with EHR or registry data. Often if no enrollment or coverage information is given an observation period is defined as the time between the earliest record and the latest record available for a person. integer Yes Yes No
person_id integer Yes No Yes PERSON
observation_period_start_date Use this date to determine the start date of the period for which we can assume that all events for a Person are recorded and any absence of records indicates an absence of events. It is often the case that the idea of observation periods does not exist in source data. In those cases the observation_period_start_date can be inferred as the earliest event date available for the Person. In US claims, the observation period can be considered as the time period the person is enrolled with an insurer. If a Person switches plans but stays with the same insurer, that change would be captured in payer_plan_period. date Yes No No
observation_period_end_date Use this date to determine the end date of the period for which we can assume that all events for a Person are recorded and any absence of records indicates an absence of events. It is often the case that the idea of observation periods does not exist in source data. In those cases the observation_period_end_date can be inferred as the latest event date available for the Person. The event dates include insurance enrollment dates. date Yes No No
period_type_concept_id This field can be used to determine the provenance of the observation period, as in whether the period was determined from an insurance enrollment file or from EHR healthcare encounters. Choose the period_type_concept_id that best represents how the period was determined. Integer Yes No Yes CONCEPT Type Concept
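
When the source has no enrollment or coverage information, the fallback described above (earliest to latest event date per person) could look roughly like the sketch below. The function name and the `(person_id, event_date)` input shape are assumptions made for illustration; a real ETL would typically do this in SQL across all event tables.

```python
from datetime import date
from typing import Dict, Iterable, Tuple


def infer_observation_periods(
    events: Iterable[Tuple[int, date]],
) -> Dict[int, Tuple[date, date]]:
    """Return {person_id: (start_date, end_date)} from raw event dates.

    The start is the earliest event date and the end is the latest event
    date seen for each person. This yields one period per person.
    """
    periods: Dict[int, Tuple[date, date]] = {}
    for person_id, event_date in events:
        if person_id in periods:
            start, end = periods[person_id]
            periods[person_id] = (min(start, event_date), max(end, event_date))
        else:
            periods[person_id] = (event_date, event_date)
    return periods
```

This simple version produces a single observation period per person; sources with long gaps in activity may call for splitting into several discrete periods.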

VISIT_OCCURRENCE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
visit_occurrence_id Use this to identify unique interactions between a person and the health care system. This identifier links across the other CDM event tables to associate events with a visit. This should be populated by creating a unique identifier for each unique interaction between a person and the healthcare system where the person receives a medical good or service over a span of time. integer Yes Yes No
person_id integer Yes No Yes PERSON
visit_concept_id This field contains a concept id representing the kind of visit, like inpatient or outpatient. Populate this field based on the kind of visit that took place for the person. For example this could be “Inpatient Visit”, “Outpatient Visit”, “Ambulatory Visit”, etc. It is often the case that some logic should be written for how to define visits and how to assign Visit_Concept_Id. In US claims outpatient visits that appear to occur within the time period of an inpatient visit can be rolled into one with the same Visit_Occurrence_Id. In EHR data inpatient visits that are within one day of each other may be strung together to create one visit. It will all depend on the source data and how encounter records should be translated to visit occurrences. integer Yes No Yes CONCEPT Visit
visit_start_date For inpatient visits, the start date is typically the admission date. For outpatient visits the start date and end date will be the same. When populating visit_start_date, you will first have to make decisions on how to define visits. In some cases visits in the source data can be strung together if there are one or fewer days between them. date Yes No No
visit_start_datetime If no time is given for the start date of a visit, set it to midnight (00:00:00). datetime No No No
visit_end_date For inpatient visits the end date is typically the discharge date. Visit end dates are mandatory. If end dates are not provided in the source, derive them by visit type: Outpatient Visit: visit_end_datetime = visit_start_datetime. Emergency Room Visit: visit_end_datetime = visit_start_datetime. Inpatient Visit: Usually there is information about discharge. If not, you should be able to derive the end date from the sudden decline of activity or from the absence of inpatient procedures/drugs. Non-hospital institution Visits: Particularly for claims data, if end dates are not provided assume the visit is for the duration of the month in which it occurs. For Inpatient Visits ongoing at the date of ETL, put the date of processing the data into visit_end_datetime and set visit_type_concept_id to 32220 “Still patient” to identify the visit as incomplete. All other Visits: visit_end_datetime = visit_start_datetime. If this is a one-day visit the end date should match the start date. date Yes No No
visit_end_datetime If no time is given for the end date of a visit, set it to midnight (00:00:00). datetime No No No
visit_type_concept_id Use this field to understand the provenance of the visit record, or where the record comes from. Populate this field based on the provenance of the visit record, as in whether it came from an EHR record or billing claim. Integer Yes No Yes CONCEPT Type Concept
provider_id There will only be one provider per visit. If multiple providers are associated with a visit that information can be found in the VISIT_DETAIL table. If there are multiple providers associated with a visit, you will need to choose which one to put here. The additional providers can be stored in the visit_detail table. integer No No Yes PROVIDER
care_site_id This field provides information about the care site where the visit took place. There should only be one care site associated with a visit. integer No No Yes CARE_SITE
visit_source_value This field houses the verbatim value from the source data representing the kind of visit that took place (inpatient, outpatient, emergency, etc.) If there is information about the kind of visit in the source data that value should be stored here. If a visit is an amalgamation of visits from the source then use a hierarchy to choose the visit source value, such as IP -> ER-> OP. This should line up with the logic chosen to determine how visits are created. varchar(50) No No No
visit_source_concept_id If the visit source value is coded in the source data using an OMOP supported vocabulary put the concept id representing the source value here. integer No No Yes CONCEPT
admitting_source_concept_id Use this field to determine where the patient was admitted from. This concept is part of the visit domain and can indicate if a patient was admitted to the hospital from a long-term care facility, for example. If available, map the admitted_from_source_value to a standard concept in the visit domain. integer No No Yes CONCEPT Visit
admitting_source_value This information may be called something different in the source data but the field is meant to contain a value indicating where a person was admitted from. Typically this applies only to visits that have a length of stay, like inpatient visits or long-term care visits. varchar(50) No No No
discharge_to_concept_id Use this field to determine where the patient was discharged to after a visit. This concept is part of the visit domain and can indicate if a patient was discharged to home or sent to a long-term care facility, for example. If available, map the discharge_to_source_value to a standard concept in the visit domain. integer No No Yes CONCEPT Visit
discharge_to_source_value This information may be called something different in the source data but the field is meant to contain a value indicating where a person was discharged to after a visit, as in they went home or were moved to long-term care. Typically this applies only to visits that have a length of stay of a day or more. varchar(50) No No No
preceding_visit_occurrence_id Use this field to find the visit that occurred for the person prior to the given visit. There could be a few days or a few years in between. The preceding_visit_occurrence_id can be used to link a visit immediately preceding the current visit. Note this is not symmetrical, and there is no such thing as a “following_visit_id”. integer No No Yes VISIT_OCCURRENCE
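
The idea of stringing together source visits that lie within one day of each other can be illustrated with a small interval-merging sketch. The function name and the `max_gap_days` parameter are hypothetical, and the sketch assumes "within one day" means that the next start date is at most one calendar day after the previous end date; the right rule depends on the source data.

```python
from datetime import date
from typing import List, Tuple

Visit = Tuple[date, date]  # (visit_start_date, visit_end_date)


def merge_visits(visits: List[Visit], max_gap_days: int = 1) -> List[Visit]:
    """Merge visits of one person whose gap is at most max_gap_days.

    Visits are processed in order of start date; overlapping visits are
    merged as well, since their gap is zero or negative.
    """
    merged: List[Visit] = []
    for start, end in sorted(visits):
        if merged and (start - merged[-1][1]).days <= max_gap_days:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

Each merged interval would then receive a single visit_occurrence_id, and the visit_source_value would be chosen with the hierarchy that the ETL has settled on.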

CONDITION_OCCURRENCE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
condition_occurrence_id bigint Yes Yes No
person_id bigint Yes No Yes PERSON
condition_concept_id The CONDITION_CONCEPT_ID field is recommended for primary use in analyses, and must be used for network studies. integer Yes No Yes CONCEPT Condition
condition_start_date date Yes No No
condition_start_datetime datetime No No No
condition_end_date date No No No
condition_end_datetime should not be inferred datetime No No No
condition_type_concept_id integer Yes No Yes CONCEPT Type Concept
condition_status_concept_id Presently, there is no designated vocabulary, domain, or class that represents condition status. The following concepts from SNOMED are recommended: Admitting diagnosis: 4203942; Final diagnosis: 4230359 (should also be used for discharge diagnosis); Preliminary diagnosis: 4033240. integer No No Yes CONCEPT
stop_reason The Stop Reason indicates why a Condition is no longer valid with respect to the purpose within the source data. Note that a Stop Reason does not necessarily imply that the condition is no longer occurring. varchar(20) No No No
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
condition_source_value This field is discouraged from use in analysis because it is not required to contain Standard Concepts that are used across the OHDSI community, and should only be used when Standard Concepts do not adequately represent the source detail for the Condition necessary for a given analytic use case. Consider using CONDITION_CONCEPT_ID instead to enable standardized analytics that can be consistent across the network. This code is mapped to a Standard Condition Concept in the Standardized Vocabularies and the original code is stored here for reference. varchar(50) No No No
condition_source_concept_id integer No No Yes CONCEPT
condition_status_source_value This code is mapped to a Standard Concept in the Standardized Vocabularies and the original code is stored here for reference. varchar(50) No No No

DRUG_EXPOSURE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
drug_exposure_id bigint Yes Yes No
person_id bigint Yes No Yes PERSON
drug_concept_id integer Yes No Yes CONCEPT Drug
drug_exposure_start_date Valid entries include a start date of a prescription, the date a prescription was filled, or the date on which a Drug administration procedure was recorded. date Yes No No
drug_exposure_start_datetime datetime No No No
drug_exposure_end_date The DRUG_EXPOSURE_END_DATE denotes the day the drug exposure ended for the patient. This could be because the duration of DAYS_SUPPLY was reached (in which case DRUG_EXPOSURE_END_DATETIME = DRUG_EXPOSURE_START_DATETIME + DAYS_SUPPLY - 1 day), or because the exposure was stopped (medication changed, medication discontinued, etc.). When the native data suggests a drug exposure has a days supply less than 0, drop the record, as it is unknown whether the person received the drug or not (THEMIS issue #24). If a patient has multiple records on the same day for the same drug or procedures, the ETL should not de-dupe them unless there is probable reason to believe the item is a true data duplicate (THEMIS issue #14). Depending on the source, it could be a known or an inferred date and denotes the last day on which the patient was still exposed to the Drug. date Yes No No
drug_exposure_end_datetime datetime No No No
verbatim_end_date You can use the TYPE_CONCEPT_ID to delineate between prescriptions written vs. prescriptions dispensed vs. medication history vs. patient-reported exposure date No No No
drug_type_concept_id integer Yes No Yes CONCEPT Type Concept
stop_reason Reasons include regimen completed, changed, removed, etc. varchar(20) No No No
refills The content of the refills field determines the current number of refills, not the number of remaining refills. For example, for a drug prescription with 2 refills, the content of this field for the 3 Drug Exposure events are null, 1 and 2. integer No No No
quantity float No No No
days_supply integer No No No
sig (and printed on the container) varchar(MAX) No No No
route_concept_id Route information can also be inferred from the Drug product itself by determining the Drug Form of the Concept, creating some partial overlap of the same type of information. Therefore, route information should be stored in DRUG_CONCEPT_ID (as a drug with corresponding Dose Form). The ROUTE_CONCEPT_ID could be used for storing more granular forms e.g. ‘Intraventricular cardiac’. integer No No Yes CONCEPT Route
lot_number varchar(50) No No No
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
drug_source_value This code is mapped to a Standard Drug concept in the Standardized Vocabularies and the original code is stored here for reference. varchar(50) No No No
drug_source_concept_id integer No No Yes CONCEPT
route_source_value varchar(50) No No No
dose_unit_source_value varchar(50) No No No
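
The end-date rule for drug_exposure_end_date (start date plus days supply minus one day, with negative days supply leading to a dropped record) can be sketched as below. The function name is hypothetical. Treating a days supply of 0 as a one-day exposure is an assumption of this sketch and is not stated in the conventions above.

```python
from datetime import date, timedelta
from typing import Optional


def derive_drug_exposure_end_date(
    start_date: date, days_supply: int
) -> Optional[date]:
    """Derive drug_exposure_end_date from the start date and days supply.

    Returns None when days_supply is negative, signalling that the caller
    should drop the record (THEMIS issue #24).
    """
    if days_supply < 0:
        return None
    # Assumption: a days supply of 0 is treated like 1, so the end date
    # never falls before the start date.
    effective_days = max(days_supply, 1)
    return start_date + timedelta(days=effective_days - 1)
```

A 30-day supply starting on 1 January therefore ends on 30 January, because the start day itself counts as the first day of exposure.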

PROCEDURE_OCCURRENCE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
procedure_occurrence_id integer Yes Yes No
person_id integer Yes No Yes PERSON
procedure_concept_id integer Yes No Yes CONCEPT Procedure
procedure_date date Yes No No
procedure_datetime datetime No No No
procedure_type_concept_id integer Yes No Yes CONCEPT Type Concept
modifier_concept_id These concepts are typically distinguished by ‘Modifier’ concept classes (e.g., ‘CPT4 Modifier’ as part of the ‘CPT4’ vocabulary). integer No No Yes CONCEPT
quantity If the quantity value is omitted, a single procedure is assumed. If a Procedure has a quantity of ‘0’ in the source, this should default to ‘1’ in the ETL. If there is a record in the source it can be assumed the exposure occurred at least once (THEMIS issue #26). integer No No No
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
procedure_source_value This code is mapped to a standard procedure Concept in the Standardized Vocabularies and the original code is stored here for reference. Procedure source codes are typically ICD-9-Proc, CPT-4, HCPCS or OPCS-4 codes. varchar(50) No No No
procedure_source_concept_id integer No No Yes CONCEPT
modifier_source_value varchar(50) No No No
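
The quantity convention for procedures (an omitted value implies a single procedure, and a source value of 0 defaults to 1) is small enough to express as a one-function sketch. The function name is hypothetical and only illustrates the rule.

```python
from typing import Optional


def effective_procedure_quantity(source_quantity: Optional[int]) -> int:
    """Return the quantity to assume for a procedure record.

    A missing quantity or a quantity of 0 is treated as 1, because the
    existence of a record implies the procedure occurred at least once
    (THEMIS issue #26).
    """
    if source_quantity is None or source_quantity == 0:
        return 1
    return source_quantity
```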

DEVICE_EXPOSURE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
device_exposure_id bigint Yes Yes No
person_id bigint Yes No Yes PERSON
device_concept_id integer Yes No Yes CONCEPT Device
device_exposure_start_date date Yes No No
device_exposure_start_datetime datetime No No No
device_exposure_end_date date No No No
device_exposure_end_datetime datetime No No No
device_type_concept_id integer Yes No Yes CONCEPT Type Concept
unique_device_id For medical devices that are regulated by the FDA, a Unique Device Identification (UDI) is provided if available in the data source and is recorded in the UNIQUE_DEVICE_ID field. varchar(50) No No No
quantity integer No No No
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
device_source_value varchar(50) No No No
device_source_concept_id integer No No Yes CONCEPT

MEASUREMENT

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
measurement_id integer Yes Yes No
person_id integer Yes No Yes PERSON
measurement_concept_id integer Yes No Yes CONCEPT Measurement
measurement_date date Yes No No
measurement_datetime datetime No No No
measurement_time This is present for backwards compatibility and will be deprecated in an upcoming version varchar(10) No No No
measurement_type_concept_id integer Yes No Yes CONCEPT Type Concept
operator_concept_id The meaning of Concept 4172703 for ‘=’ is identical to omission of an OPERATOR_CONCEPT_ID value. Since the use of this field is rare, it’s important when devising analyses not to forget to test the content of this field for values different from =. Operators are <, <=, =, >=, > and these concepts belong to the ‘Meas Value Operator’ domain. If there is a negative value coming from the source, set the VALUE_AS_NUMBER to NULL, with the exception of the following Measurements (listed as LOINC codes): 1925-7 Base excess in Arterial blood by calculation; 1927-3 Base excess in Venous blood by calculation; 8632-2 QRS-Axis; 11555-0 Base excess in Blood by calculation; 1926-5 Base excess in Capillary blood by calculation; 28638-5 Base excess in Arterial cord blood by calculation; 28639-3 Base excess in Venous cord blood by calculation (THEMIS issue #16). integer No No Yes CONCEPT
value_as_number float No No No
value_as_concept_id integer No No Yes CONCEPT
unit_concept_id integer No No Yes CONCEPT Unit
range_low Ranges have the same unit as the VALUE_AS_NUMBER. If reference ranges for upper and lower limit of normal are provided (typically by a laboratory), these are stored in the RANGE_HIGH and RANGE_LOW fields. Ranges have the same unit as the VALUE_AS_NUMBER. float No No No
range_high Ranges have the same unit as the VALUE_AS_NUMBER. float No No No
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
measurement_source_value varchar(50) No No No
measurement_source_concept_id integer No No Yes CONCEPT
unit_source_value varchar(50) No No No
value_source_value varchar(50) No No No
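
The negative-value rule for VALUE_AS_NUMBER (set to NULL unless the measurement is one of the listed LOINC codes) might be applied in an ETL along the lines of this sketch. The function and constant names are hypothetical, and the sketch assumes the LOINC code of the measurement is available as a string at the point where the value is cleaned.

```python
from typing import Optional

# LOINC codes for which negative results are legitimate (THEMIS issue #16).
NEGATIVE_ALLOWED_LOINC = frozenset(
    {
        "1925-7",   # Base excess in Arterial blood by calculation
        "1927-3",   # Base excess in Venous blood by calculation
        "8632-2",   # QRS-Axis
        "11555-0",  # Base excess in Blood by calculation
        "1926-5",   # Base excess in Capillary blood by calculation
        "28638-5",  # Base excess in Arterial cord blood by calculation
        "28639-3",  # Base excess in Venous cord blood by calculation
    }
)


def clean_value_as_number(
    loinc_code: str, value: Optional[float]
) -> Optional[float]:
    """Return the value to store in VALUE_AS_NUMBER.

    Negative values become None (NULL) unless the measurement is on the
    exception list above.
    """
    if value is not None and value < 0 and loinc_code not in NEGATIVE_ALLOWED_LOINC:
        return None
    return value
```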

VISIT_DETAIL

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
visit_detail_id integer Yes Yes No
person_id integer Yes No Yes PERSON
visit_detail_concept_id integer Yes No Yes CONCEPT Visit
visit_detail_start_date date Yes No No
visit_detail_start_datetime datetime No No No
visit_detail_end_date date Yes No No
visit_detail_end_datetime datetime No No No
visit_detail_type_concept_id Integer Yes No Yes CONCEPT Type Concept
provider_id integer No No Yes PROVIDER
care_site_id integer No No Yes CARE_SITE
visit_detail_source_value varchar(50) No No No
visit_detail_source_concept_id Integer No No Yes CONCEPT
admitting_source_value Varchar(50) No No No
admitting_source_concept_id Integer No No Yes CONCEPT
discharge_to_source_value Varchar(50) No No No
discharge_to_concept_id Integer No No Yes CONCEPT
preceding_visit_detail_id Integer No No Yes VISIT_DETAIL
visit_detail_parent_id Integer No No Yes VISIT_DETAIL
visit_occurrence_id Integer Yes No Yes VISIT_OCCURRENCE

NOTE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
note_id integer Yes Yes No
person_id integer Yes No Yes PERSON
note_date date Yes No No
note_datetime datetime No No No
note_type_concept_id integer Yes No Yes CONCEPT Type Concept
note_class_concept_id integer Yes No Yes CONCEPT
note_title varchar(250) No No No
note_text varchar(MAX) Yes No No
encoding_concept_id integer Yes No Yes CONCEPT
language_concept_id integer Yes No Yes CONCEPT
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
note_source_value varchar(50) No No No

NOTE_NLP

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
note_nlp_id integer Yes Yes No
note_id integer Yes No No
section_concept_id integer No No Yes CONCEPT
snippet varchar(250) No No No
offset varchar(50) No No No
lexical_variant varchar(250) Yes No No
note_nlp_concept_id integer No No Yes CONCEPT
note_nlp_source_concept_id integer No No Yes CONCEPT
nlp_system varchar(250) No No No
nlp_date date Yes No No
nlp_datetime datetime No No No
term_exists Term_exists is defined as a flag that indicates if the patient actually has or had the condition. Any of the following modifiers would make Term_exists false: Negation = true Subject = [anything other than the patient] Conditional = true Rule_out = true Uncertain = very low certainty or any lower certainties A complete lack of modifiers would make Term_exists true. varchar(1) No No No
term_temporal Term_temporal is to indicate if a condition is “present” or just in the “past”. The following would be past: History = true Concept_date = anything before the time of the report varchar(50) No No No
term_modifiers For the modifiers that are there, they would have to have these values: Negation = false Subject = patient Conditional = false Rule_out = false Uncertain = true or high or moderate or even low (could argue about low). Term_modifiers will concatenate all modifiers for different types of entities (conditions, drugs, labs etc) into one string. Lab values will be saved as one of the modifiers. A list of allowable modifiers (e.g., signature for medications) and their possible values will be standardized later. varchar(2000) No No No
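
The Term_exists rules above can be read as a single boolean test over the modifiers attached to a term. The sketch below is one possible reading, not a reference implementation: the function name, the dictionary input, the lower-case modifier keys, and the 'Y'/'N' encoding of the varchar(1) flag are all assumptions made for illustration.

```python
from typing import Mapping

# Certainty levels that still allow Term_exists to be true, per the text.
ACCEPTED_UNCERTAIN = frozenset({"true", "high", "moderate", "low"})


def term_exists(modifiers: Mapping[str, str]) -> str:
    """Return 'Y' if the modifiers indicate the patient has or had the term.

    A modifier that is absent never makes the flag false, so an empty
    mapping yields 'Y'.
    """
    m = {key.lower(): str(value).lower() for key, value in modifiers.items()}
    exists = (
        m.get("negation", "false") == "false"
        and m.get("subject", "patient") == "patient"
        and m.get("conditional", "false") == "false"
        and m.get("rule_out", "false") == "false"
        and ("uncertain" not in m or m["uncertain"] in ACCEPTED_UNCERTAIN)
    )
    return "Y" if exists else "N"
```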

OBSERVATION

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
observation_id integer Yes Yes No
person_id integer Yes No Yes PERSON
observation_concept_id integer Yes No Yes CONCEPT
observation_date date Yes No No
observation_datetime datetime No No No
observation_type_concept_id integer Yes No Yes CONCEPT Type Concept
value_as_number float No No No
value_as_string varchar(60) No No No
value_as_concept_id Note that the value of VALUE_AS_CONCEPT_ID may be provided through mapping from a source Concept which contains the content of the Observation. In those situations, the CONCEPT_RELATIONSHIP table in addition to the ‘Maps to’ record contains a second record with the relationship_id set to ‘Maps to value’. For example, ICD9CM V17.5 concept_id 44828510 ‘Family history of asthma’ has a ‘Maps to’ relationship to 4167217 ‘Family history of clinical finding’ as well as a ‘Maps to value’ record to 317009 ‘Asthma’. Integer No No Yes CONCEPT
qualifier_concept_id integer No No Yes CONCEPT
unit_concept_id integer No No Yes CONCEPT Unit
provider_id integer No No Yes PROVIDER
visit_occurrence_id integer No No Yes VISIT_OCCURRENCE
visit_detail_id integer No No Yes VISIT_DETAIL
observation_source_value varchar(50) No No No
observation_source_concept_id integer No No Yes CONCEPT
unit_source_value varchar(50) No No No
qualifier_source_value varchar(50) No No No
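
The ‘Maps to’ and ‘Maps to value’ example for VALUE_AS_CONCEPT_ID can be made concrete with a small lookup. The sketch below uses an in-memory list of tuples as a stand-in for the CONCEPT_RELATIONSHIP table; the function name and the tuple layout are assumptions, while the concept ids are the ones quoted in the example above.

```python
from typing import Iterable, Optional, Tuple

# (concept_id_1, relationship_id, concept_id_2) rows, mimicking
# CONCEPT_RELATIONSHIP for ICD9CM V17.5 'Family history of asthma'.
RELATIONSHIPS = [
    (44828510, "Maps to", 4167217),        # Family history of clinical finding
    (44828510, "Maps to value", 317009),   # Asthma
]


def map_observation(
    source_concept_id: int,
    relationships: Iterable[Tuple[int, str, int]],
) -> Tuple[Optional[int], Optional[int]]:
    """Return (observation_concept_id, value_as_concept_id) for a source concept."""
    observation_concept_id: Optional[int] = None
    value_as_concept_id: Optional[int] = None
    for concept_id_1, relationship_id, concept_id_2 in relationships:
        if concept_id_1 != source_concept_id:
            continue
        if relationship_id == "Maps to":
            observation_concept_id = concept_id_2
        elif relationship_id == "Maps to value":
            value_as_concept_id = concept_id_2
    return observation_concept_id, value_as_concept_id
```

Source concepts that have only a ‘Maps to’ record simply leave value_as_concept_id empty.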

SPECIMEN

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
specimen_id integer Yes Yes No
person_id integer Yes No Yes PERSON
specimen_concept_id integer Yes No Yes CONCEPT
specimen_type_concept_id integer Yes No Yes CONCEPT Type Concept
specimen_date date Yes No No
specimen_datetime datetime No No No
quantity float No No No
unit_concept_id integer No No Yes CONCEPT
anatomic_site_concept_id integer No No Yes CONCEPT
disease_status_concept_id integer No No Yes CONCEPT
specimen_source_id varchar(50) No No No
specimen_source_value varchar(50) No No No
unit_source_value varchar(50) No No No
anatomic_site_source_value varchar(50) No No No
disease_status_source_value varchar(50) No No No

FACT_RELATIONSHIP

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
domain_concept_id_1 integer Yes No Yes CONCEPT
fact_id_1 integer Yes No No
domain_concept_id_2 integer Yes No Yes CONCEPT
fact_id_2 integer Yes No No
relationship_concept_id integer Yes No Yes CONCEPT

LOCATION

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
location_id integer Yes Yes No
address_1 varchar(50) No No No
address_2 varchar(50) No No No
city varchar(50) No No No
state varchar(2) No No No
zip Zip codes are handled as strings of up to 9 characters in length. For US addresses, these represent either a 3-digit abbreviated Zip code as provided by many sources for patient protection reasons, the full 5-digit Zip or the 9-digit (ZIP + 4) codes. Unless there are specific reasons to do otherwise, analytical methods should expect and use only the first 3 digits. For international addresses, different rules apply. varchar(9) No No No
county varchar(20) No No No
location_source_value varchar(50) No No No
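
Since US Zip codes may arrive as 3, 5, or 9 digits, analyses that follow the advice to use only the first 3 digits could normalize the value as in this sketch. The function name is hypothetical, and the sketch applies to US addresses only.

```python
from typing import Optional


def zip3(zip_code: Optional[str]) -> Optional[str]:
    """Return the first 3 characters of a US Zip code, or None if unavailable."""
    if zip_code is None:
        return None
    cleaned = zip_code.strip()
    if len(cleaned) < 3:
        return None
    return cleaned[:3]
```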

CARE_SITE

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
care_site_id integer Yes Yes No
care_site_name varchar(255) No No No
place_of_service_concept_id integer No No Yes CONCEPT
location_id integer No No No
care_site_source_value varchar(50) No No No
place_of_service_source_value varchar(50) No No No

PROVIDER

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
provider_id integer Yes Yes No
provider_name varchar(255) No No No
npi varchar(20) No No No
dea varchar(20) No No No
specialty_concept_id If a Provider has more than one Specialty, the main or most often exerted specialty should be recorded. integer No No Yes CONCEPT
care_site_id integer No No Yes CARE_SITE
year_of_birth integer No No No
gender_concept_id integer No No Yes CONCEPT Gender
provider_source_value varchar(50) No No No
specialty_source_value varchar(50) No No No
specialty_source_concept_id integer No No Yes CONCEPT
gender_source_value varchar(50) No No No
gender_source_concept_id integer No No Yes CONCEPT

PAYER_PLAN_PERIOD

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
payer_plan_period_id integer Yes Yes No
person_id integer Yes No Yes PERSON
payer_plan_period_start_date date Yes No No
payer_plan_period_end_date date Yes No No
payer_concept_id integer No No Yes CONCEPT
payer_source_value varchar(50) No No No
payer_source_concept_id integer No No Yes CONCEPT
plan_concept_id integer No No Yes CONCEPT
plan_source_value varchar(50) No No No
plan_source_concept_id integer No No Yes CONCEPT
sponsor_concept_id integer No No Yes CONCEPT
sponsor_source_value varchar(50) No No No
sponsor_source_concept_id integer No No Yes CONCEPT
family_source_value varchar(50) No No No
stop_reason_concept_id integer No No Yes CONCEPT
stop_reason_source_value varchar(50) No No No
stop_reason_source_concept_id integer No No Yes CONCEPT

COST

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
cost_id INTEGER Yes Yes No
cost_event_id INTEGER Yes No No
cost_domain_id VARCHAR(20) Yes No Yes DOMAIN
cost_type_concept_id integer Yes No Yes CONCEPT
currency_concept_id integer No No Yes CONCEPT
total_charge FLOAT No No No
total_cost FLOAT No No No
total_paid FLOAT No No No
paid_by_payer FLOAT No No No
paid_by_patient FLOAT No No No
paid_patient_copay FLOAT No No No
paid_patient_coinsurance FLOAT No No No
paid_patient_deductible FLOAT No No No
paid_by_primary FLOAT No No No
paid_ingredient_cost FLOAT No No No
paid_dispensing_fee FLOAT No No No
payer_plan_period_id INTEGER No No No
amount_allowed FLOAT No No No
revenue_code_concept_id integer No No Yes CONCEPT
revenue_code_source_value Revenue codes are a method to charge for a class of procedures and conditions in the U.S. hospital system. VARCHAR(50) No No No
drg_concept_id integer No No Yes CONCEPT
drg_source_value Diagnosis Related Groups are US codes used to classify hospital cases into one of approximately 500 groups. VARCHAR(3) No No No

DRUG_ERA

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
drug_era_id integer Yes Yes No
person_id integer Yes No Yes PERSON
drug_concept_id integer Yes No Yes CONCEPT Drug Ingredient
drug_era_start_date The Drug Era Start Date is the start date of the first Drug Exposure for a given ingredient. (NOT RIGHT) datetime Yes No No
drug_era_end_date The Drug Era End Date is the end date of the last Drug Exposure. The End Date of each Drug Exposure is either taken from the field drug_exposure_end_date or, as it is typically not available, inferred using the following rules: For pharmacy prescription data, the date when the drug was dispensed plus the number of days of supply are used to extrapolate the End Date for the Drug Exposure. Depending on the country-specific healthcare system, this supply information is either explicitly provided in the days_supply field or inferred from package size or similar information. For Procedure Drugs, usually the drug is administered on a single date (i.e., the administration date). A standard Persistence Window of 30 days (gap, slack) is permitted between two subsequent such extrapolated DRUG_EXPOSURE records for them to be merged into a single Drug Era. (Aren't we requiring the use of DRUG_EXPOSURE_END_DATE now?) datetime Yes No No
drug_exposure_count integer No No No
gap_days The Gap Days determine how many total drug-free days are observed between all Drug Exposure events that contribute to a DRUG_ERA record. It is assumed that the drugs are “not stockpiled” by the patient, i.e. that if a new drug prescription or refill is observed (a new DRUG_EXPOSURE record is written), the remaining supply from the previous events is abandoned. The difference between Persistence Window and Gap Days is that the former is the maximum drug-free time allowed between two subsequent DRUG_EXPOSURE records, while the latter is the sum of actual drug-free days for the given Drug Era under the above assumption of non-stockpiling. integer No No No
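
The relationship between the Persistence Window and Gap Days can be illustrated with a short era-building sketch for one person and one ingredient. This is an illustration under stated assumptions, not the official era-building script: the names `DrugEra` and `build_drug_eras` are hypothetical, drug-free days are counted as the calendar days strictly between one exposure's end date and the next exposure's start date, and the era end is taken as the latest end date among the merged exposures.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class DrugEra(NamedTuple):
    start_date: date
    end_date: date
    drug_exposure_count: int
    gap_days: int


def build_drug_eras(
    exposures: List[Tuple[date, date]], persistence_window_days: int = 30
) -> List[DrugEra]:
    """Merge (start_date, end_date) exposures into drug eras.

    Two subsequent exposures join the same era when the number of
    drug-free days between them does not exceed the persistence window.
    gap_days accumulates the drug-free days inside each era.
    """
    eras: List[DrugEra] = []
    for start, end in sorted(exposures):
        if eras:
            last = eras[-1]
            # Days strictly between the era's current end and the new start.
            drug_free = max(0, (start - last.end_date).days - 1)
            if drug_free <= persistence_window_days:
                eras[-1] = DrugEra(
                    last.start_date,
                    max(last.end_date, end),
                    last.drug_exposure_count + 1,
                    last.gap_days + drug_free,
                )
                continue
        eras.append(DrugEra(start, end, 1, 0))
    return eras
```

With exposures ending on 30 January and restarting on 10 February, the 10 drug-free days fall inside the window, so both exposures form one era with gap_days of 10; an exposure that starts several months later opens a new era.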

DOSE_ERA

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
dose_era_id integer Yes Yes No
person_id integer Yes No Yes PERSON
drug_concept_id integer Yes No Yes CONCEPT Drug Ingredient
unit_concept_id integer Yes No Yes CONCEPT Unit
dose_value float Yes No No
dose_era_start_date datetime Yes No No
dose_era_end_date datetime Yes No No

CONDITION_ERA

CDM Field User Guide ETL Conventions Datatype Required Primary Key Foreign Key FK Table FK Domain FK Class
condition_era_id integer Yes Yes No
person_id integer Yes No Yes PERSON
condition_concept_id integer Yes No Yes CONCEPT Condition
condition_era_start_date datetime Yes No No
condition_era_end_date datetime Yes No No
condition_occurrence_count integer No No No