Cohort-middleware provides a set of web-services (endpoints) for:
- providing information about cohorts to which a user has authorized access (Atlas DB cohorts as defined in Fence/Arborist?)
- getting clinical attribute values for a given cohort (aka CONCEPT values in Atlas/OMOP jargon)
- providing patient-level clinical attribute values matrix for use in backend workflows, like GWAS workflows (e.g. https://github.com/uc-cdis/vadc-genesis-cwl)
The cohorts and their clinical attribute values are retrieved from connected OHDSI/CDM/Atlas databases via SQL queries.
OpenAPI documentation is available here.
The YAML file for the OpenAPI documentation is found in the openapis folder.
Overview of cohort-middleware and its connected systems:
Execute the following command to get help:
go run main.go -h
To just start with the default "development" settings:
go run main.go
See the example config file in the ./config/ folder.
The code currently assumes that the data it queries lives in 2 separate databases: the "atlas" schema on one database, and the "results" and "cdm" schemas together on another DB. In practice, the databases could even be a mix from different vendors/engines (e.g. one "sql server" and one "postgres"). Therefore, the code has no queries that directly join tables in "atlas" with tables in "results" or "cdm".
Below is an overview of the schemas and respective tables.
DB Instance1:
===============================
SCHEMA atlas
===============================
TABLE atlas.source
TABLE atlas.source_daimon
TABLE atlas.cohort_definition
DB Instance2:
===============================
SCHEMA results
===============================
TABLE results.COHORT
===============================
SCHEMA omop
===============================
TABLE omop.person
TABLE omop.observation
TABLE omop.concept
VIEW omop.observation_continuous
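Because the "atlas" tables and the "results"/"cdm" tables sit on separate database instances, any combination of their rows has to happen in application code rather than in SQL. The following is a minimal sketch of that idea, not the project's actual implementation: the types CohortDefinition, CohortSize and CohortStats and the function joinInMemory are hypothetical names chosen for illustration, and the two input slices stand in for the results of two independent queries.

```go
package main

import "fmt"

// CohortDefinition stands in for a row read from atlas.cohort_definition
// (DB Instance1). The field names are illustrative assumptions.
type CohortDefinition struct {
	ID   int
	Name string
}

// CohortSize stands in for an aggregate computed from results.COHORT
// (DB Instance2): the number of subjects per cohort definition id.
type CohortSize struct {
	CohortDefinitionID int
	Size               int
}

// CohortStats is the combined record built in application code.
type CohortStats struct {
	ID   int
	Name string
	Size int
}

// joinInMemory combines the two result sets by cohort definition id.
// Definitions without a matching size entry get Size 0.
func joinInMemory(defs []CohortDefinition, sizes []CohortSize) []CohortStats {
	sizeByID := make(map[int]int, len(sizes))
	for _, s := range sizes {
		sizeByID[s.CohortDefinitionID] = s.Size
	}
	stats := make([]CohortStats, 0, len(defs))
	for _, d := range defs {
		stats = append(stats, CohortStats{ID: d.ID, Name: d.Name, Size: sizeByID[d.ID]})
	}
	return stats
}

func main() {
	// In a real service these slices would come from two separate DB connections.
	defs := []CohortDefinition{{ID: 1, Name: "cohort A"}, {ID: 2, Name: "cohort B"}}
	sizes := []CohortSize{{CohortDefinitionID: 1, Size: 42}}
	for _, s := range joinInMemory(defs, sizes) {
		fmt.Printf("%d %s %d\n", s.ID, s.Name, s.Size)
	}
}
```

Keying the second result set in a map keeps the combination step linear in the number of rows and independent of which database engine produced either side.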
Set up the local Atlas DB by running the init_db.sh script in the ./tests folder:
cd tests/setup_local_db/
./init_db.sh
Test this setup by trying the following curl commands.
JSON summary data endpoints:
curl http://localhost:8080/sources | python -m json.tool
curl http://localhost:8080/cohortdefinition-stats/by-source-id/1 | python -m json.tool
curl http://localhost:8080/concept/by-source-id/1 | python -m json.tool
curl -d '{"ConceptIds":[2000000324,2000006885]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept/by-source-id/1 | python -m json.tool
curl -d '{"ConceptTypes":["Measurement","Person"]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept/by-source-id/1/by-type | python -m json.tool
curl http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027 | python3 -m json.tool
curl -d '{"variables": [{"variable_type": "concept", "concept_id": 2000006885}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027 | python3 -m json.tool
CSV data endpoints:
curl -d '{"variables":[{"variable_type": "concept", "concept_id": 2000000324},{"variable_type": "concept", "concept_id": 2000006885},{"variable_type": "concept", "concept_id": 2000007027},{"variable_type": "custom_dichotomous", "cohort_ids": [1, 2]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/cohort-data/by-source-id/1/by-cohort-definition-id/3
curl -d '{"variables":[{"variable_type": "concept", "concept_id": 2000000324},{"variable_type": "concept", "concept_id": 2000006885},{"variable_type": "concept", "concept_id": 2000007027},{"variable_type": "custom_dichotomous", "provided_name": "test123", "cohort_ids": [1, 99]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027/csv
Histogram endpoint:
curl -d '{"variables":[{"variable_type": "custom_dichotomous", "cohort_ids": [1, 4]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/histogram/by-source-id/1/by-cohort-definition-id/4/by-histogram-concept-id/2000006885
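The POST bodies in the curl examples above share one shape: a "variables" array whose entries are either a concept (with a concept_id) or a custom_dichotomous variable (with cohort_ids and an optional provided_name). The sketch below shows how such a body could be built from Go. The field names are taken from the curl examples, but the Go types Variable and CohortDataRequest and the helpers buildBody and cohortDataPath are hypothetical and are not claimed to match the service's own code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Variable mirrors one entry of the "variables" array seen in the curl
// examples. Fields that do not apply to a given variable type are omitted.
type Variable struct {
	VariableType string `json:"variable_type"`
	ConceptID    int64  `json:"concept_id,omitempty"`
	ProvidedName string `json:"provided_name,omitempty"`
	CohortIDs    []int  `json:"cohort_ids,omitempty"`
}

// CohortDataRequest is the (assumed) top-level request body.
type CohortDataRequest struct {
	Variables []Variable `json:"variables"`
}

// buildBody serializes the variables into the JSON body used with -d.
func buildBody(vars []Variable) (string, error) {
	b, err := json.Marshal(CohortDataRequest{Variables: vars})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// cohortDataPath builds the path of the CSV cohort-data endpoint.
func cohortDataPath(sourceID, cohortDefinitionID int) string {
	return fmt.Sprintf("/cohort-data/by-source-id/%d/by-cohort-definition-id/%d",
		sourceID, cohortDefinitionID)
}

func main() {
	body, err := buildBody([]Variable{
		{VariableType: "concept", ConceptID: 2000000324},
		{VariableType: "custom_dichotomous", CohortIDs: []int{1, 2}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("POST", "http://localhost:8080"+cohortDataPath(1, 3))
	fmt.Println(body)
}
```

The omitempty tags keep concept entries free of empty cohort_ids and vice versa, which matches the shape of the bodies in the curl commands.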
- Add config .yaml as a secret:
  - If the config secret does not yet exist, create it with the name expected in this deployment .yaml file:
    kubectl create secret generic <secret_name_here> --from-file=./test.yaml
    where ./test.yaml follows the general structure of ./config/development.yaml.
  - Check if it worked with:
    kubectl get secrets/<secret_name_here> -o yaml
- PRs to master get the docker image built on quay (via github action). See https://quay.io/repository/cdis/cohort-middleware?tab=tags
  - The following config file determines which branch or tag is used on QA: https://github.com/uc-cdis/gitops-qa/blob/master/qa-mickey.planx-pla.net/manifest.json
- If testing on QA:
  - ssh to QA machine
  - run the steps below:
    echo "====== Pull manifest without going into directory ====== "
    git -C ~/cdis-manifest pull
    echo "====== Update the manifest configmaps ======"
    gen3 kube-setup-secrets
    echo "====== Deploy ======"
    gen3 roll cohort-middleware
Example:
curl -H "Content-Type: application/json" -H "$(cat auth.txt)" https://<qa-url-here>/sources | python -m json.tool
Note that the <qa-url-here> in the examples above needs to be replaced, and the ids used (by-source-id/2, by-cohort-definition-id/3) need to be replaced with real values from the QA environment. The main addition in these curl commands is the presence of https and the extra -H "$(cat auth.txt)". More is explained in the subsections below.
- check /home/<qa-machine-name>/cdis-manifest/<qa-machine-name>/manifest.json to make sure the desired image name and tag for cohort-middleware are present. Do not edit this file directly on the server, but make a PR with changes if needed.
- regarding gen3 roll, see also https://github.com/uc-cdis/cloud-automation/blob/master/kube/services/cohort-middleware/cohort-middleware-deploy.yaml, which is used directly by the gen3 roll command (see https://github.com/uc-cdis/cloud-automation/blob/master/gen3/bin/roll.sh).
Go to https:// and then to "Login"->"Profile"->"Create API key". Download the JSON to your local computer.
Run (e.g. if the downloaded JSON file is called credentials.json):
export SERVER_NAME=<your-server-name-here>
curl -d "$(cat credentials.json)" -X POST -H "Content-Type: application/json" https://${SERVER_NAME}/user/credentials/api/access_token
Save the contents of the returned token in a file, e.g. auth.txt. Then try for example:
curl -H "Content-Type: application/json" -H "Authorization: bearer $(cat auth.txt)" https://${SERVER_NAME}/cohort-middleware/sources | python -m json.tool
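The same authenticated call can be prepared from Go. This is a hedged sketch that only constructs the request and does not send it: the helper name newAuthorizedRequest is hypothetical, and it assumes, as in the curl command above, that the token is passed in an "Authorization: bearer ..." header and that the service is reachable under /cohort-middleware on the server.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newAuthorizedRequest builds a GET request for the /sources endpoint with
// the same headers as the curl example. The token is trimmed because a file
// such as auth.txt usually ends with a newline.
func newAuthorizedRequest(serverName, token string) (*http.Request, error) {
	url := fmt.Sprintf("https://%s/cohort-middleware/sources", serverName)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "bearer "+strings.TrimSpace(token))
	return req, nil
}

func main() {
	// "example.org" and "my-token" are placeholders, not real values.
	req, err := newAuthorizedRequest("example.org", "my-token\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```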
Find the pod(s):
kubectl get pods --all-namespaces | grep cohort-middleware
or:
kubectl get pods -l app=cohort-middleware
Then run:
kubectl logs <pod-name-here>
or
kubectl logs -f -l app=cohort-middleware
See also https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods
Get help from "PE team":
- PE team = Platform Engineering team = GPE project Jira ticket = #gen3-devops-oncall (slack channel)
If networking changes are necessary:
If proxy changes are necessary:
Other config related to network policies:
To push a new generic dockerhub image to Quay (like a specific version of Golang), use something like the following in Slack:
@qa-bot run-jenkins-job gen3-self-service-push-dockerhub-img-to-quay jenkins {"SOURCE":"python:3.10-alpine","TARGET":"quay.io/cdis/python:3.10-alpine-master"}
Or use the self-service page:
The result will be a new image pushed to quay.io that we can start using in our Dockerfile, like:
FROM quay.io/cdis/golang:1.18-bullseye