... | ... | @@ -6,7 +6,9 @@ |
- Install Java
- You will need a database to upload the data to. For testing purposes, you can use Docker:
```
docker run --name demo_omop -e POSTGRES_PASSWORD=lollypop -e POSTGRES_USER=postgres -p 5432:5432 -v ${PWD}/postgres:/var/lib/postgresql/data -v ${PWD}/backup:/backup -d postgres
```
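After the container starts, PostgreSQL may need a few seconds before it accepts connections. The following is a minimal sketch of a wait loop, assuming the container name `demo_omop` from the command above and the `pg_isready` utility that ships with the official `postgres` image. The function name `wait_for_postgres`, the retry count, and the one-second interval are illustrative choices, not part of the original setup.

```shell
# Poll until PostgreSQL inside the given container accepts connections.
# $1: container name, $2: maximum number of attempts (default 30, an arbitrary choice)
wait_for_postgres() {
  container="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # pg_isready exits with status 0 once the server accepts connections
    if docker exec "$container" pg_isready -U postgres >/dev/null 2>&1; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timeout"
  return 1
}
```

Calling `wait_for_postgres demo_omop` before starting the import avoids connection errors caused by a database that is still initialising.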
## Generating Synthetic data
... | ... | @@ -16,8 +18,8 @@ Synthea<sup>TM</sup> generates synthetic data from the medical history of patien |
One of the greatest strengths of Synthea<sup>TM</sup> is its more than 90 modules, each containing models for different diseases or medical observations. However, most of these modules depend on one another, so restricting generation to a subset of them is not recommended.

Follow the instructions to compile Synthea, or download the prebuilt [synthea-with-dependencies.jar](https://github.com/synthetichealth/synthea/releases/download/master-branch-latest/synthea-with-dependencies.jar).

The basic command line to generate data in Synthea<sup>TM</sup> v3.2.0 is the following:
```
|
... | ... | @@ -29,6 +31,8 @@ To export the data in CSV format, you need to set the parameter `exporter.csv.ex |
To generate different types of data with modules, use the `-m` option followed by the names of the modules. See the [example on the Synthea wiki](https://github.com/synthetichealth/synthea/wiki/The--M-Feature).
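
As a rough illustration of the `-m` option, the sketch below prints (without running) a Synthea command line for a given module filter and population size. The helper name `synthea_cmd`, the module pattern `metabolic*`, the population of 50, and the space-separated `--exporter.csv.export true` override are assumptions made for the example; check the Synthea wiki for the exact syntax supported by your version.

```shell
# Print a Synthea command line for a module filter and population size.
# $1: module filter passed to -m (illustrative pattern, not a recommendation)
# $2: population size passed to -p (defaults to 100, an arbitrary choice)
synthea_cmd() {
  modules="$1"
  population="${2:-100}"
  printf 'java -jar synthea-with-dependencies.jar -p %s -m "%s" --exporter.csv.export true\n' \
    "$population" "$modules"
}

# Dry run: show the command that would be executed
synthea_cmd 'metabolic*' 50
```

Printing the command first makes it easy to review the module filter before generating data. Keep in mind that most modules depend on one another, so a narrow filter may produce incomplete records.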
You can also download the provided [data.zip](https://github.com/alabarga/pybcn22-modern-data-stack/blob/main/synthea/data.zip).
## Import synthetic data to a relational database
To import the data into a database, we use the [ETL-Synthea repository](https://github.com/OHDSI/ETL-Synthea).
... | ... | @@ -42,23 +46,28 @@ First, install the library from github |
Then run the following code:
```r
# Load library
library(ETLSyntheaBuilder)

# Download the PostgreSQL JDBC driver into the current directory
DatabaseConnector::downloadJdbcDrivers('postgresql', '.')

# Connection details; the password must match POSTGRES_PASSWORD from the docker command
cd <- DatabaseConnector::createConnectionDetails(
  dbms         = "postgresql",
  server       = "localhost/demo_omop",
  user         = "postgres",
  password     = "lollypop",
  port         = 5432,
  pathToDriver = "./"
)

# Target schema, versions, and input locations
cdmSchema      <- "cdm"
cdmVersion     <- "5.4"
syntheaVersion <- "2.7.0"
syntheaSchema  <- "native"
syntheaFileLoc <- "./csv"
vocabFileLoc   <- "./vocabulary_download_v5_minimal"

ETLSyntheaBuilder::CreateCDMTables(connectionDetails = cd, cdmSchema = cdmSchema, cdmVersion = cdmVersion)
... | ... | |