Test using the sample data from the Leicester OMOP extract #42
Download the files to the DSD: This works fine. I needed to log in and find the data on OneDrive.
Download the concept files from the UCLH SAFEHR repo to the DSD: This is a little more complicated. I had to download the individual files, as mentioned in #41.
Open them in a version of R: Parquet files can be parsed using the Apache arrow package. You will need to install it and load it, and then run
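For anyone repeating this step, here is a minimal sketch of the install, load and read sequence. The exact command I ran is not reproduced here; the toy table and the temporary file are stand-ins so the snippet runs without the real extract, and the whole thing is skipped if arrow is not installed.

```r
# One-off install, commented out so the snippet can be re-run safely:
# install.packages("arrow")

if (requireNamespace("arrow", quietly = TRUE)) {
  # Toy stand-in for one of the downloaded files (hypothetical columns).
  toy <- data.frame(
    drug_exposure_id = 1:3,
    drug_concept_id  = c(101L, 102L, 101L)
  )
  path <- tempfile(fileext = ".parquet")
  arrow::write_parquet(toy, path)

  # read_parquet() reads the file back into a data frame.
  drug_exposure <- arrow::read_parquet(path)
  parquet_rows <- nrow(drug_exposure)
} else {
  parquet_rows <- NA_integer_  # arrow not installed; nothing was read
}
```

With the real data, `path` would simply point at the downloaded parquet file on the DSD.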
I read the
and then merge the two tables on the drug id to find which exposures from the
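The merge on the drug id can be sketched as below. This is an illustration on toy tables, not the code I ran: the table contents are invented, and the column names follow the OMOP convention (`drug_concept_id` on the exposure side, `concept_id` on the concept side).

```r
# Toy stand-ins; the real tables come from the parquet files.
drug_exposure <- data.frame(
  drug_exposure_id = 1:4,
  drug_concept_id  = c(101L, 102L, 101L, 999L)
)
concept <- data.frame(
  concept_id   = c(101L, 102L, 103L),
  concept_name = c("drug A", "drug B", "drug C")
)

# Inner join: keeps only exposures whose drug id exists in the concept table.
merged <- merge(drug_exposure, concept,
                by.x = "drug_concept_id", by.y = "concept_id")

# Exposures with no matching concept (here the one with id 999).
unmatched <- drug_exposure[
  !drug_exposure$drug_concept_id %in% concept$concept_id, ]
```

Checking `unmatched` alongside `merged` makes it obvious how many exposures silently drop out of an inner join.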
I wanted to count the number of matches per drug, so first I count the hits per concept name in the new merged table
and then sort and display the top three
I got
so we have Now I want to filter out the
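The count-then-sort step can be sketched in base R as follows. The merged table here is a toy with invented drug names; only the pattern (count hits per concept name, sort descending, show the top three) matches what is described above.

```r
# Toy merged table with one row per exposure (invented names).
merged <- data.frame(
  concept_name = c(rep("drug A", 4), rep("drug B", 3),
                   rep("drug C", 2), "drug D")
)

# Count the hits per concept name.
hits <- table(merged$concept_name)

# Sort from most to least frequent and keep the top three.
top_three <- head(sort(hits, decreasing = TRUE), 3)
print(top_three)
```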
Now let's take a look at the match between the First we read in the
Then we merge the two tables together
Checking the resulting table, I can see that the drug strengths are represented in two ways
Some of the rows have none of the fields and will be dropped for now.
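A sketch of how the rows could be classified and the empty ones dropped. I am assuming the two representations are the ones in the OMOP CDM `drug_strength` table, namely an absolute amount (`amount_value`) or a concentration (`numerator_value` over a denominator); the values below are invented.

```r
# Toy drug_strength rows: one per representation, plus one with neither.
strength <- data.frame(
  drug_concept_id = c(1L, 2L, 3L),
  amount_value    = c(500, NA, NA),
  numerator_value = c(NA, 9, NA)
)

# Label each row by which representation it carries.
strength$representation <- ifelse(
  !is.na(strength$amount_value), "amount",
  ifelse(!is.na(strength$numerator_value), "concentration", NA_character_)
)

# Drop the rows that have none of the strength fields.
usable <- strength[!is.na(strength$representation), ]
```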
Remember that our main target, compute_DDDs, requires:
We will need to convert the strength on each exposure to the dose in units accepted by as_units(). According to the validate article
Dose and Unit seem to be possible to extract from either
Now testing the value of
Next, I would like to try an example of compute_DDDs with a random row from the dataset. However, extracting the ATC code is a bit complicated.
The first step in finding the ATC codes is to examine the drug standard used in the dataset.
Matching RxNorm to ATC is then the next task.
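One way to examine which drug standards are in use is to tabulate `vocabulary_id` for the concepts that actually appear as exposures. This is a sketch on a toy concept table; the ids and the vector `exposed_ids` are invented for illustration.

```r
# Toy concept table (invented ids).
concept <- data.frame(
  concept_id    = c(101L, 102L, 103L, 104L),
  domain_id     = c("Drug", "Drug", "Drug", "Route"),
  vocabulary_id = c("RxNorm", "RxNorm Extension", "RxNorm", "SNOMED")
)

# Hypothetical: the unique drug ids seen in the exposure table.
exposed_ids <- c(101L, 102L, 103L)

# Count the exposed concepts per vocabulary.
vocab_counts <- table(
  concept$vocabulary_id[concept$concept_id %in% exposed_ids]
)
print(vocab_counts)
```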
I am back to getting the ATC code out of the OMOP data that we have. The standards that we have are
Let's take a random drug concept
If we filter the
Now let's get the info for each of these ancestor concepts from the concept table. We can see that
We could get the
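The ancestor lookup for a single drug concept can be sketched like this. The tables are toys with invented ids, names and codes; the column names (`ancestor_concept_id`, `descendant_concept_id`, `concept_class_id`) are the ones I am assuming from the OMOP CDM.

```r
# Toy concept table: one drug, its ingredient, and two ATC levels.
concept <- data.frame(
  concept_id       = c(1L, 2L, 3L, 4L),
  concept_name     = c("toy drug 10 MG oral tablet", "toy ingredient",
                       "toy ATC 5th entry", "toy ATC 4th entry"),
  concept_class_id = c("Clinical Drug", "Ingredient", "ATC 5th", "ATC 4th"),
  concept_code     = c("r1", "r2", "X00XX00", "X00XX")
)

# Toy concept_ancestor table (a concept is also its own ancestor).
concept_ancestor <- data.frame(
  ancestor_concept_id   = c(1L, 2L, 3L, 4L, 2L),
  descendant_concept_id = c(1L, 1L, 1L, 1L, 2L)
)

drug_id <- 1L

# All ancestors of the chosen drug concept.
ancestor_ids <- concept_ancestor$ancestor_concept_id[
  concept_ancestor$descendant_concept_id == drug_id
]

# Look the ancestors up in the concept table, then keep the ATC 5th level.
ancestors <- concept[concept$concept_id %in% ancestor_ids, ]
atc5 <- ancestors[ancestors$concept_class_id == "ATC 5th", ]
```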
Now this means we could do:
Each solution has advantages and disadvantages. For now I will try to build the mapping.
First I will collect all the unique
This results in a one-column data frame with 1,742 rows. Next I will focus on the ATC 5th concepts and extract these from the concept table.
This gives us 5,452 concepts. Now, I will use these ATC 5th concepts to filter the
This results in 2,101,373 relationships. Clearly there is a lot of branching. Now let's match the
This gives us 2,033 rows, up from 1,742. Strange! Furthermore, if we check the most common merges, we find that NA is the most common, with 90 hits.
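A left join can grow like this when some drugs match more than one ATC 5th concept, and it yields NA when a drug has no ATC ancestor at all. The sketch below reproduces both effects on toy data (invented ids) and shows one way to count them; it is a diagnostic idea, not something I have run on the real tables.

```r
# Three unique drugs.
drugs <- data.frame(drug_concept_id = c(1L, 2L, 3L))

# Drug 1 has two ATC 5th ancestors, drug 2 has one, drug 3 has none.
drug_to_atc <- data.frame(
  drug_concept_id = c(1L, 1L, 2L),
  ATC_concept_id  = c(901L, 902L, 903L)
)

# Left join: every drug is kept, matched or not.
lookup <- merge(drugs, drug_to_atc, by = "drug_concept_id", all.x = TRUE)

# Drugs with no ATC match show up as NA.
n_unmapped <- sum(is.na(lookup$ATC_concept_id))

# Drugs with more than one ATC match explain the extra rows.
per_drug <- table(lookup$drug_concept_id[!is.na(lookup$ATC_concept_id)])
multi_mapped <- names(per_drug)[per_drug > 1]
```

Here three drugs become four rows: one extra row from the double match, and one NA row from the unmatched drug.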
Let's keep going to the last step and check these results afterwards. I will now replace the concept id of the ATC 5th with the concept code from the ATC standard, which could be used in the DDD function in Ramses.
We finally arrive at the table we wanted
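Putting the steps together, the mapping can be sketched end to end as below. Everything here is a toy: the ids, names and ATC-like codes are invented, and the code is a base R illustration of the approach rather than the exact calls I used.

```r
# Toy inputs.
drug_exposure <- data.frame(drug_concept_id = c(1L, 1L, 2L, 5L))
concept <- data.frame(
  concept_id       = c(1L, 2L, 5L, 10L, 11L, 20L),
  concept_name     = c("toy drug A", "toy drug B", "toy drug C",
                       "toy ATC entry one", "toy ATC entry two",
                       "toy ATC 4th group"),
  concept_class_id = c("Clinical Drug", "Clinical Drug", "Clinical Drug",
                       "ATC 5th", "ATC 5th", "ATC 4th"),
  concept_code     = c("a", "b", "c", "X00XX01", "X00XX02", "X00XX")
)
concept_ancestor <- data.frame(
  ancestor_concept_id   = c(10L, 11L, 20L),
  descendant_concept_id = c(1L, 2L, 1L)
)

# 1. Collect the unique drug concept ids.
drug_ids <- data.frame(
  drug_concept_id = unique(drug_exposure$drug_concept_id)
)

# 2. Extract the ATC 5th concepts from the concept table.
atc5 <- concept[concept$concept_class_id == "ATC 5th",
                c("concept_id", "concept_code", "concept_name")]
names(atc5) <- c("ATC_concept_id", "ATC_code", "ATC_concept_name")

# 3. Keep only relationships whose ancestor is an ATC 5th concept.
rel <- concept_ancestor[
  concept_ancestor$ancestor_concept_id %in% atc5$ATC_concept_id, ]
names(rel) <- c("ATC_concept_id", "drug_concept_id")

# 4. Left join the drugs onto the relationships, then swap the ATC
#    concept id for the ATC code.
lookup <- merge(drug_ids, rel, by = "drug_concept_id", all.x = TRUE)
lookup <- merge(lookup, atc5, by = "ATC_concept_id", all.x = TRUE)
lookup <- lookup[order(lookup$drug_concept_id),
                 c("drug_concept_id", "ATC_code", "ATC_concept_name")]
```

Drug 5 has no ATC 5th ancestor in the toy data, so it keeps an NA code instead of disappearing from the lookup.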
One note for the future is that the DDD of a drug can change based on the administration route. For example, morphine
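In practice this means any DDD lookup has to be keyed on the ATC code and the route together, not on the code alone. A sketch of that idea follows; the codes and DDD values are made up for illustration (they are not WHO values), and the column name `ATC_administration` with the route letters O (oral) and P (parenteral) is my assumption about how the route would be encoded.

```r
# Made-up DDD reference table: the same code has a different DDD per route.
ddd_table <- data.frame(
  ATC_code           = c("X00XX01", "X00XX01", "X00XX02"),
  ATC_administration = c("O", "P", "O"),
  ddd_value          = c(100, 30, 5),        # invented numbers
  ddd_unit           = c("mg", "mg", "mg")
)

# Toy exposures of the same drug by two routes.
exposures <- data.frame(
  ATC_code           = c("X00XX01", "X00XX01"),
  ATC_administration = c("O", "P"),
  dose_mg            = c(200, 60)
)

# Join on both keys so each exposure picks up the DDD for its own route.
joined <- merge(exposures, ddd_table,
                by = c("ATC_code", "ATC_administration"))
joined$n_ddd <- joined$dose_mg / joined$ddd_value
```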
The relationship between RxNorm and ATC goes through the following steps. These steps are reflected in the
graph BT;
    A("RxNorm: Clinical Drug")-->B("RxNorm: Ingredient");
    B-->C(ATC 5th);
    C-->D(ATC 4th);
    D-->E(ATC 3rd);
    E-->F(ATC 2nd);
    F-->G(ATC 1st);
However the connection between
1- Missing links: We can see this clearly in the previous step where
2- Multiple ancestors, multiple descendants: The relationship between
graph BT;
A("OMOP809928")-->B("L01BA01")
A-->C("L04AX03")
D("OMOP2603855")-->B
D-->C
This is partially a side effect of the third point.
3- Repeated concepts: ATC 5th concepts are not unique. Some medications are listed multiple times under different classifications with different codes. For example, methotrexate is mentioned under
Here is a more serious example of the problem
> filter(filtered_drug_lookup, drug_concept_id == "46276153")
# A tibble: 3 × 7
  drug_concept_name       drug_concept_id drug_concept_class_id ATC_level ATC_concept_name                                      ATC_code ATC_concept_id
  <chr>                             <int> <chr>                 <chr>     <chr>                                                 <chr>             <int>
1 sodium chloride 9 MG/ML        46276153 Clinical Drug Comp    5         cefoperazone and beta-lactamase inhibitor; parenteral J01DD62        21602913
2 sodium chloride 9 MG/ML        46276153 Clinical Drug Comp    5         ceftriaxone, combinations; systemic                   J01DD54        21602912
3 sodium chloride 9 MG/ML        46276153 Clinical Drug Comp    5         imipenem and cilastatin; parenteral                   J01DH51        21602925
the code
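To see how widespread this is, the lookup can be scanned for drugs that map to more than one ATC code. The sketch below uses the three sodium chloride rows from the output above plus one invented single-code drug (id 1, code X00XX01) for contrast; it is an idea for a check, not something already run on the full table.

```r
# The three rows shown above, plus one invented unambiguous drug.
filtered_drug_lookup <- data.frame(
  drug_concept_id = c(46276153L, 46276153L, 46276153L, 1L),
  ATC_code        = c("J01DD62", "J01DD54", "J01DH51", "X00XX01")
)

# Number of distinct ATC codes per drug.
n_codes <- aggregate(ATC_code ~ drug_concept_id,
                     data = filtered_drug_lookup,
                     FUN = function(x) length(unique(x)))

# All codes per drug, collapsed into one string for eyeballing.
codes_per_drug <- aggregate(ATC_code ~ drug_concept_id,
                            data = filtered_drug_lookup,
                            FUN = function(x) paste(sort(unique(x)),
                                                    collapse = ", "))

# Drugs that cannot be given a single ATC code without further rules.
ambiguous <- n_codes$drug_concept_id[n_codes$ATC_code > 1]
```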
One more interesting problem is mapping the route of administration from the standard used in OMOP to the ATC one. I found a route called
> filter(collected_concept, collected_concept$concept_id %in% filter(omde, omde$route_concept_id == 4222254)$drug_concept_id)
# A tibble: 4 × 10
  concept_id concept_name                  domain_id vocabulary_id concept_class_id standard_concept concept_code valid_start_date valid_end_date invalid_reason
       <int> <chr>                         <chr>     <chr>         <chr>            <chr>            <chr>        <date>           <date>         <chr>
1          0 No matching concept           Metadata  None          Undefined        NA               No matching… 1970-01-01       2099-12-31     NA
2   21078067 5 ML Tranexamic Acid 100 MG/… Drug      RxNorm Exten… Quant Clinical … S                OMOP307613   2017-08-24       2099-12-31     NA
3   42479436 1000 ML Sodium Chloride 9 MG… Drug      RxNorm Exten… Quant Clinical … S                OMOP416795   2017-08-24       2099-12-31     NA
4   42939099 Sodium Chloride 9 MG/ML Irri… Drug      RxNorm Exten… Clinical Drug    S                OMOP4665764  2018-08-21       2099-12-31     NA
> filter(collected_concept, collected_concept$concept_id %in% filter(omde, omde$route_concept_id == 4222254)$drug_concept_id) |> select(concept_class_id, concept_name)
# A tibble: 4 × 2
  concept_class_id    concept_name
  <chr>               <chr>
1 Undefined           No matching concept
2 Quant Clinical Drug 5 ML Tranexamic Acid 100 MG/ML Prefilled Syringe
3 Quant Clinical Drug 1000 ML Sodium Chloride 9 MG/ML Injectable Solution
4 Clinical Drug       Sodium Chloride 9 MG/ML Irrigation Solution
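One possible way to handle the route mapping is an explicit lookup table from `route_concept_id` to an ATC administration code, with unmapped routes surfaced rather than dropped. This is only a sketch: the two mapped ids and their meanings (oral, intravenous) are my assumptions and should be checked against the concept table, the O and P letters are assumed route codes, and the `omde` rows are invented.

```r
# Hypothetical route lookup; ids and meanings need verifying.
route_map <- data.frame(
  route_concept_id   = c(4132161L, 4171047L),  # assumed: oral, intravenous
  ATC_administration = c("O", "P")
)

# Toy exposures, including the route that has no obvious ATC equivalent.
omde <- data.frame(
  drug_concept_id  = c(1L, 2L, 3L),
  route_concept_id = c(4132161L, 4171047L, 4222254L)
)

# Left join so exposures with an unmapped route are kept with NA.
mapped <- merge(omde, route_map, by = "route_concept_id", all.x = TRUE)

# Routes that still need a decision.
unmapped_routes <- unique(
  mapped$route_concept_id[is.na(mapped$ATC_administration)]
)
```

Keeping the unmapped routes visible gives a short list to review by hand instead of a silent loss of exposures.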
I want to:
drug_exposure table to the concepts table