Releases: cumc-dbmi/cehrbert
v1.4.1
v1.4.0: Update runner util (#98)
* upgraded cehrbert_data dependency
* removed max_position_embeddings from computing generate_prepared_ds_path
v1.3.9: Enhance meds to OMOP logic for handling the problem list records (#95)
* updated the logic for constructing the visit type for meds input
* changed a few attributes to properties in patient_block
* added logic for disassociating the problem list records from the visit because they could occur years before
* updated the logic to infer whether the visit should have a discharge facility code associated with it
* provide a list of features to the data generator to convert meds to cehrbert patients
* use smaller integers to represent the meds generated data
* removed unused import
* updated hf_dataset_mapping.py to work with event generator
* updated hf_dataset_mapping for backward compatibility
* updated the type of visit_end_datetime
* set all timestamps to timestamp[us] when converting meds to cehrbert patient
* fixed a bug where inferred_visit_type() was called in the wrong place
* added a function in patient_block.py to merge the patient block that is subsumed by another
* fixed the merging visit logging information
* disassociate records from the visits if they occur outside the visit
* added newline to the debug_log in merge_patient_blocks
* added a default None value to visit_complete_event_time
* added E-I visit type update in patient block merging
* removed a newline char
* swap the patient blocks if the longer visit is an outpatient visit and the shorter visit (being subsumed) is an inpatient or emergency visit
* added one more check on the end attribute in the event
* only extract the end datetime of the record corresponding to the visit table
* remove the visit_type and discharge_facility events in patient block merging
* updated patient_block_merge test
* manually clean up the cached files generated from Dataset.from_generator when meds format is used
* collect additional cached files from the downstream datasets derived from the generator-based Dataset
* use CacheFileCollector to remove all the cached files generated from the dataset mapping functions
* moved CacheFileCollector to a separate python module
* updated the logic for identifying the visit start and visit end when using the meds data
* updated cache_util to handle Dataset and DatasetDict accordingly
* removed the logic for bounding the event timestamps
* updated cache file collection logic for fine-tuning
* fixed type checking in add_cache_files
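The patient block merging described in the v1.3.9 notes can be illustrated with a rough sketch. This is not the code in patient_block.py: `VisitBlock`, `subsumes`, and `merge_subsumed` are hypothetical names, and the "swap" is interpreted here, as an assumption, to mean that the merged block keeps the inpatient or emergency visit type when the longer visit is an outpatient one.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Hypothetical stand-in for a patient block; not the cehrbert class.
@dataclass
class VisitBlock:
    visit_type: str  # e.g. "inpatient", "outpatient", "emergency"
    start: datetime
    end: datetime
    events: List[str]

# Assumption: these are the visit types that take precedence over outpatient.
ACUTE_TYPES = {"inpatient", "emergency"}

def subsumes(outer: VisitBlock, inner: VisitBlock) -> bool:
    """True when inner lies entirely within the time span of outer."""
    return outer.start <= inner.start and inner.end <= outer.end

def merge_subsumed(outer: VisitBlock, inner: VisitBlock) -> VisitBlock:
    """Fold inner into outer, keeping the time span of outer."""
    if not subsumes(outer, inner):
        raise ValueError("inner block is not subsumed by outer block")
    visit_type = outer.visit_type
    # One reading of the "swap": an acute visit nested in an outpatient
    # visit decides the visit type of the merged block.
    if outer.visit_type == "outpatient" and inner.visit_type in ACUTE_TYPES:
        visit_type = inner.visit_type
    return VisitBlock(visit_type, outer.start, outer.end, outer.events + inner.events)
```

The sketch only concatenates events; the actual change also removes the visit_type and discharge_facility events during merging, which is not modeled here.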
v1.3.8
only generate patients who have visits in _meds_to_cehrbert_generator…
v1.3.7: Cehrbert streaming for meds (#90)
* fixed the streaming option for cehrbert pretraining when streaming is enabled for the meds data
* fixed a potential bug where the CehrbertConfig was misconfigured
* loosened up dependencies
* disabled parallelism when streaming is enabled
* added the missing meds_reader sample data
* removed pydant
v1.3.6
loosened up the huggingface dependencies (#89)
v1.3.5
changed to the partial match to the meds_birth code instead of the ex…
v1.3.4: New cehrbert data backward compatible (#86)
* added patient_splits_folder to evaluation
* support the new cehrbert_data
* added backward compatibility support for the old spark cehrbert_data application
* fixed any None values in concept_values
* removed the renaming logic
* added a try/except in case compute_metrics fails during fine-tuning
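The last item in the v1.3.4 notes, guarding the metric computation, follows a common pattern that might look roughly like the sketch below. `safe_compute_metrics` is a hypothetical helper, not a function from cehrbert, and returning an empty dict on failure is an assumption about the desired fallback.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)

def safe_compute_metrics(
    compute_metrics: Callable[..., Dict[str, Any]], *args: Any, **kwargs: Any
) -> Dict[str, Any]:
    """Call compute_metrics and fall back to an empty dict if it raises."""
    try:
        return compute_metrics(*args, **kwargs)
    except Exception:
        # Log the full traceback so the failure is visible, but let the
        # surrounding fine-tuning run continue without metrics.
        logger.exception("compute_metrics failed; continuing without metrics")
        return {}
```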
v1.3.3
upgraded cehrbert_data to 0.0.5 (#84)
v1.3.2
added patient_splits_folder to evaluation (#83)