---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(dplyr)
library(testthat)
```
# TestGenerator
<!-- badges: start -->
[](https://github.com/darwin-eu/TestGenerator/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/github/darwin-eu/TestGenerator?branch=main)
[](https://CRAN.R-project.org/package=TestGenerator)
<!-- badges: end -->
Did my cohort pick the correct number of patients? Am I calculating an intersection in the right way? Is that the expected value for treatment duration? It takes just one incorrect parameter to get incoherent results in a pharmacoepidemiological study, and it is very challenging to test calculations on huge and complex databases.
That is why TestGenerator is useful: it pushes a small sample of patients into a CDM so that a study on the OMOP-CDM can be unit tested. It includes tools to create a blank CDM with a complete vocabulary and to check whether the code does what we expect in very specific cases.
This package is based on the unit testing written for the [Erasmus MC Ranitidine Study](https://github.com/mi-erasmusmc/RanitidineStudy/blob/master/unitTesting_README.md).
## Installation
To install TestGenerator:
```{r, eval=FALSE}
# CRAN version
install.packages("TestGenerator")
```
## Example
The user can provide an Excel file [(link to sample)](https://github.com/darwin-eu/TestGenerator/raw/main/inst/extdata/icu_sample_population.xlsx) or a set of CSV files that represent tables of the OMOP-CDM, with a micro population of just 8 patients for testing purposes.
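As a rough sketch of the CSV route, the snippet below writes a tiny `person` table to a temporary folder using base R only. The folder name `testPatientsCsv`, the file name `person.csv` (one CSV per CDM table, named after the table) and the two made-up patients are assumptions for illustration; a real test population would also need the other CDM tables the study relies on, such as `observation_period`.

```r
# Hypothetical two-patient `person` table; column names follow OMOP CDM v5.3.
person <- data.frame(
  person_id            = c(1L, 2L),
  gender_concept_id    = c(8507L, 8532L),  # 8507 = male, 8532 = female
  year_of_birth        = c(1980L, 1975L),
  race_concept_id      = c(0L, 0L),
  ethnicity_concept_id = c(0L, 0L)
)

# Assumed layout: one CSV per CDM table, all in the same folder.
csvDir <- file.path(tempdir(), "testPatientsCsv")
dir.create(csvDir, showWarnings = FALSE)
write.csv(person, file.path(csvDir, "person.csv"), row.names = FALSE)

# List what was written so the folder can be checked by eye.
list.files(csvDir)
```

A folder prepared this way is the kind of input that would be passed as `filePath` when reading CSVs.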
`readPatients()` reads either the Excel file or the CSVs and saves the data in a JSON file. This is useful if the user wants to create more than one Unit Test Definition. If the parameter `outputPath` is `NULL`, the files are saved in the `testthat/testCases` folder of the package.
```{r, eval=FALSE}
TestGenerator::readPatients(filePath = "~/pathto/testPatients.xlsx",
                            testName = "test",
                            outputPath = NULL,
                            cdmVersion = "5.3")
```
Alternatively, the user can use the functions `readPatients.xl` or `readPatients.csv` directly.
```{r, eval=FALSE}
TestGenerator::readPatients.xl(filePath = "~/pathto/testPatients.xlsx",
                               testName = "test",
                               outputPath = NULL,
                               cdmVersion = "5.3")

TestGenerator::readPatients.csv(filePath = "~/pathto/csv/files",
                                testName = "test",
                                outputPath = NULL,
                                cdmVersion = "5.3",
                                reduceLargeIds = FALSE)
```
`patientsCDM()` pushes one of those Unit Test Definitions into a blank CDM reference with a complete version of the vocabulary. If the `pathJson` parameter is `NULL`, `TestGenerator` will look for the JSON test files in the `testthat/testCases` folder.
```{r, eval=FALSE}
cdm <- TestGenerator::patientsCDM(pathJson = NULL,
                                  testName = "test",
                                  cdmVersion = "5.3")
```
Now the user has a CDM reference with a complete vocabulary and just 8 patients.
```{r}
filePath <- system.file("extdata/icu_sample_population.xlsx",
                        package = "TestGenerator")
outputPath <- file.path(tempdir(), "test")
dir.create(outputPath)

TestGenerator::readPatients(filePath = filePath,
                            testName = "test",
                            outputPath = outputPath,
                            cdmVersion = "5.3")

cdm <- TestGenerator::patientsCDM(pathJson = outputPath,
                                  testName = "test",
                                  cdmVersion = "5.3")

cdm[["person"]] %>% glimpse()
```
The CDM reference can then be used to generate cohorts and write unit tests against them.
```{r}
test_cohorts <- system.file("extdata",
                            "test_cohorts",
                            package = "TestGenerator")

cohort_set <- CDMConnector::readCohortSet(test_cohorts)

cdm <- CDMConnector::generate_cohort_set(cdm,
                                         cohort_set,
                                         name = "test_cohorts")

cohortAttrition <- CDMConnector::attrition(cdm[["test_cohorts"]])

excluded_records <- cohortAttrition %>%
  pull(excluded_records) %>%
  sum()

expect_equal(excluded_records, 0)
```
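Checks like the one above can be wrapped in `test_that()` blocks so that each expectation about the micro population is named and reported separately. The sketch below uses a small hand-written tibble as a stand-in for a collected cohort table; the subject IDs and dates are invented for illustration and are not taken from the sample population.

```r
library(dplyr)
library(testthat)

# Stand-in for a collected cohort table; values are made up for illustration.
cohort <- tibble(
  cohort_definition_id = c(1L, 1L, 2L),
  subject_id           = c(1L, 4L, 4L),
  cohort_start_date    = as.Date(c("2020-01-01", "2020-03-10", "2020-03-08")),
  cohort_end_date      = as.Date(c("2020-01-15", "2020-03-20", "2020-03-25"))
)

test_that("cohort 1 contains the expected patients", {
  subjects <- cohort %>%
    filter(cohort_definition_id == 1) %>%
    pull(subject_id)
  expect_equal(sort(subjects), c(1L, 4L))
})

test_that("treatment duration matches the hand-calculated value", {
  # Duration counts both the start and the end day, hence the + 1.
  duration <- cohort %>%
    filter(cohort_definition_id == 1, subject_id == 4) %>%
    mutate(days = as.integer(cohort_end_date - cohort_start_date) + 1L) %>%
    pull(days)
  expect_equal(duration, 11L)
})
```

Because the test population is small enough to reason about by hand, the expected values (here, two patients and an 11-day exposure) can be written down before the study code is run.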
With `graphCohort()` it is possible to visualise the timeline for a particular patient.
```{r}
diazepam <- cdm[["test_cohorts"]] %>%
  filter(cohort_definition_id == 1) %>%
  collect()

hospitalisation <- cdm[["test_cohorts"]] %>%
  filter(cohort_definition_id == 2) %>%
  collect()

icu_visit <- cdm[["test_cohorts"]] %>%
  filter(cohort_definition_id == 3) %>%
  collect()

TestGenerator::graphCohort(subject_id = 4, list("diazepam" = diazepam,
                                                "hospitalisation" = hospitalisation,
                                                "icu_visit" = icu_visit))
```
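What the timeline shows visually can also be asserted numerically. The sketch below computes the overlap between a drug exposure and a hospitalisation for one patient and checks it against a hand-calculated value. The two tibbles are stand-ins with invented dates, not the rows produced by the sample cohorts, and the inclusive day count is one possible convention rather than the package's own definition.

```r
library(dplyr)
library(testthat)

# Stand-in rows for one patient; dates are made up for illustration.
diazepam <- tibble(subject_id        = 4L,
                   cohort_start_date = as.Date("2020-03-10"),
                   cohort_end_date   = as.Date("2020-03-20"))

hospitalisation <- tibble(subject_id        = 4L,
                          cohort_start_date = as.Date("2020-03-08"),
                          cohort_end_date   = as.Date("2020-03-15"))

# The intersection runs from the later start date to the earlier end date.
overlap <- diazepam %>%
  inner_join(hospitalisation, by = "subject_id",
             suffix = c("_drug", "_hosp")) %>%
  mutate(
    overlap_start = pmax(cohort_start_date_drug, cohort_start_date_hosp),
    overlap_end   = pmin(cohort_end_date_drug, cohort_end_date_hosp),
    overlap_days  = as.integer(overlap_end - overlap_start) + 1L
  ) %>%
  filter(overlap_days > 0)  # drop pairs of periods that do not intersect

# 10 March to 15 March, both days included.
expect_equal(overlap$overlap_days, 6L)
```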
```{r, echo=FALSE}
unlink(outputPath, recursive = TRUE)
duckdb::duckdb_shutdown(duckdb::duckdb())
```