diff --git a/rmd/faq.Rmd b/rmd/faq.Rmd
index fcddd8d..49f96c7 100644
--- a/rmd/faq.Rmd
+++ b/rmd/faq.Rmd
@@ -1,11 +1,12 @@
---
title: "OMOP CDM Frequently Asked Questions"
output:
-  html_document
-
+  html_document:
+    toc: yes
+    toc_depth: 5
+    toc_float: yes
---
-
**1. I understand that the common data model (CDM) is a way of organizing disparate data sources into the same relational database design, but how can it be effective since many databases use different coding schemes?**
During the extract, transform, load (ETL) process of converting a data source into the OMOP common data model, we standardize the structure (e.g. tables, fields, data types), conventions (e.g. rules that govern how source data should be represented), and content (e.g. what common vocabularies are used to speak the same language across clinical domains). The common data model preserves all source data, including the original source vocabulary codes, but adds the standardized vocabularies to allow for network research across the entire OHDSI research community.
@@ -171,3 +172,24 @@ Queries are written in R and SQL. The [SqlRender](https://github.com/OHDSI/sqlre
OHDSI runs as a distributed data network. All analyses are publicly available and can be downloaded to run at each site. The packages can be run locally and, at the data partner’s discretion, aggregate results can be shared with the study coordinator.
Data partners can also make use of one of OHDSI's open-source tools called [ARACHNE](https://github.com/OHDSI/arachne), a tool to facilitate distributed network analytics against the OMOP CDM.
+
+## Recommended System Requirements
+
+System requirements for an ETL are difficult to recommend in general because they depend heavily on how much data a site has and how the site plans to use it. The following configurations have worked well for small-to-medium and large organizations:
+
+**Small-to-Medium Organization**
+
+- CDM size: 100 MB to several GB
+- Vocabulary: ~20 GB
+- Results: < 500 MB
+- Recommended:
+  - Server-class machine: disk >= 250 GB (SSD preferred), >= 4 cores, >= 32 GB RAM
+
+**Large Organization**
+
+- CDM size: 12 GB to several TB
+- Vocabulary: ~20 GB
+- Results: < 500 MB
+- Recommended:
+  - Cloud-based infrastructure such as multiple AWS Redshift clusters, for example:
+    - ![](images/AWS_clusters.png)
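
As a rough illustration of the small-to-medium recommendation added in this hunk, the sketch below compares a machine's specs against the stated minimums (disk >= 250 GB, >= 4 cores, >= 32 GB RAM). It is not part of the patch or of any OHDSI tool: `MachineSpec` and `unmet_requirements` are hypothetical names introduced here, and the specs are passed in by hand rather than detected from the host.

```python
from dataclasses import dataclass
from typing import List

# Minimums copied from the small-to-medium recommendation in the FAQ text.
MIN_DISK_GB = 250
MIN_CORES = 4
MIN_RAM_GB = 32


@dataclass
class MachineSpec:
    """Hypothetical container for a server's specs (not an OHDSI type)."""
    disk_gb: float
    cores: int
    ram_gb: float


def unmet_requirements(spec: MachineSpec) -> List[str]:
    """Return one message per recommended minimum the machine falls short of."""
    problems = []
    if spec.disk_gb < MIN_DISK_GB:
        problems.append(f"disk {spec.disk_gb} GB < {MIN_DISK_GB} GB")
    if spec.cores < MIN_CORES:
        problems.append(f"cores {spec.cores} < {MIN_CORES}")
    if spec.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {spec.ram_gb} GB < {MIN_RAM_GB} GB")
    return problems


if __name__ == "__main__":
    # Example values only; substitute the specs of the server being evaluated.
    candidate = MachineSpec(disk_gb=500, cores=8, ram_gb=16)
    for problem in unmet_requirements(candidate):
        print("Below recommendation:", problem)
```

An empty list means the machine meets all three minimums. The FAQ gives no comparable numeric minimums for the large-organization case, so the sketch does not attempt to cover it.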
diff --git a/rmd/images/AWS_clusters.png b/rmd/images/AWS_clusters.png
new file mode 100644
index 0000000..17add03
Binary files /dev/null and b/rmd/images/AWS_clusters.png differ
diff --git a/rmd/index.Rmd b/rmd/index.Rmd
index c2d1059..e093c9e 100644
--- a/rmd/index.Rmd
+++ b/rmd/index.Rmd
@@ -51,8 +51,7 @@ The CDM working group meets the first and third Tuesday of the month. See below
**Every third Tuesday of the month at 1pm est** [Teams Meeting](https://teams.microsoft.com/l/meetup-join/19%3a133f2b94b86a41a884d4a4d160610148%40thread.tacv2/1611000164347?context=%7b%22Tid%22%3a%22a30f0094-9120-4aab-ba4c-e5509023b2d5%22%2c%22Oid%22%3a%223c193b7f-c2ab-4bcf-b88c-f89a6b1fba38%22%7d)
-**Note** If you do you have access to the OHDSI Teams Tenet, either contact Clair Blacketer at mblacke@its.jnj.com or fill out [this form]
-(https://forms.office.com/Pages/ResponsePage.aspx?id=lAAPoyCRq0q6TOVQkCOy1ZyG6Ud_r2tKuS0HcGnqiQZUOVJFUzBFWE1aSVlLN0ozR01MUVQ4T0RGNyQlQCN0PWcu) and check "Common Data Model"
+**Note** If you do not have access to the OHDSI Teams Tenant, either contact Clair Blacketer at mblacke@its.jnj.com or fill out [this form](https://forms.office.com/Pages/ResponsePage.aspx?id=lAAPoyCRq0q6TOVQkCOy1ZyG6Ud_r2tKuS0HcGnqiQZUOVJFUzBFWE1aSVlLN0ozR01MUVQ4T0RGNyQlQCN0PWcu) and check "Common Data Model"
### CDM WG Important Links