This pull request includes several changes aimed at improving the data normalization process and code organization. The most important changes include updating import statements to use utility functions from a new module, modifying date parsing to handle day-first format, and refactoring the main function in the gdacs data normalization script.

Improvements to data normalization:
src/cerf/data_normalisation_cerf.py: Updated import statements to use read_blob_to_dataframe and utility functions from src.utils.util.
src/disaster_charter/data_normalisation_dc.py: Updated import statements to use read_blob_to_dataframe and utility functions from src.utils.util.
src/emdat/data_normalisation_emdat.py: Updated import statements to use read_blob_to_dataframe and utility functions from src.utils.util.
src/gdacs/data_normalisation_gdacs.py: Updated import statements to use combine_csvs_from_blob_dir and utility functions from src.utils.util.
src/glide/data_normalisation_glide.py: Updated import statements to use read_blob_to_dataframe and utility functions from src.utils.util. Removed redundant function definitions that are now part of the utility module. [1] [2]
src/idmc/data_normalisation_idmc.py: Updated import statements to use read_blob_to_json and utility functions from src.utils.util.
src/ifrc_eme/data_normalisation_ifrc_eme.py: Updated import statements to use read_blob_to_dataframe and utility functions from src.utils.util.

Code refactoring:
src/gdacs/data_normalisation_gdacs.py: Refactored the main function to improve readability and added a docstring. [1] [2]
src/utils/util.py: Added a new utility module that includes functions for mapping and dropping columns, changing data types, and normalizing event types.

Date parsing improvement:
src/cerf/data_normalisation_cerf.py: Modified date parsing to handle day-first format.

Minor text correction:
Makefile: Corrected the echo statement from "Running all cleaner scripts.." to "Running all cleaning scripts..".
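
For reviewers unfamiliar with why the day-first change in src/cerf/data_normalisation_cerf.py matters, here is a minimal sketch of the ambiguity it addresses. It does not reproduce the code in this PR: the helper name parse_date and the slash-separated DD/MM/YYYY input format are assumptions made for illustration, and the sketch uses only the standard library. If the script parses dates with pandas, the corresponding option would presumably be the dayfirst=True argument of pd.to_datetime.

```python
from datetime import date, datetime


def parse_date(text: str, day_first: bool = True) -> date:
    """Parse a slash-separated date string (hypothetical helper).

    With day_first=True, "03/04/2024" is read as 3 April 2024.
    With day_first=False, the same string is read as 4 March 2024.
    """
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date()


# The same input yields two different dates depending on the convention.
ambiguous = "03/04/2024"
as_day_first = parse_date(ambiguous)
as_month_first = parse_date(ambiguous, day_first=False)
```

A value such as "25/12/2023" only parses under the day-first convention, which is one way to spot that a source uses it; values where both day and month are 12 or less parse under either convention without raising an error, so they are silently misread unless the convention is set explicitly.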