SamHollings/nhs_data_cleansing

nhs_data_cleansing

Description

This repo builds the nhs_data_cleansing Python package, which contains generic data cleansing functions built on the PySpark library and its data structures.

The functions can be found in src.

ToDo: Add sphinx documentation (or something similar, automatically built)

Installation

pip install nhs_data_cleansing
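
To confirm the install worked, one option is to ask Python which version it can see. This is a minimal sketch using only the standard library. It assumes the distribution name matches the pip name above, and the helper installed_version is a hypothetical name that is not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Assumption: the distribution name is the same as the name used with pip.
    found = installed_version("nhs_data_cleansing")
    print(found if found else "nhs_data_cleansing is not installed")
```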

Usage

In most cases, add nhs_data_cleansing to your project's list of dependencies/requirements, then install the package.

Note

It's best practice to specify a version of the library in your list of dependencies - then when the package is updated, your existing work will not be affected. The version number may need to be updated in the future, particularly if you want to use newer functionality.
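
As a rough illustration of the note above, the sketch below scans requirements-style lines and reports any that are not pinned to an exact version with ==. The function name unpinned and the version numbers in the example are hypothetical placeholders, not real releases of this package. The parsing is deliberately simplified: it ignores environment markers and treats any # as the start of a comment.

```python
import re
from typing import Iterable, List

# Matches an exact pin such as "package==1.2.3" (optionally with extras).
_PIN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*\S+")


def unpinned(lines: Iterable[str]) -> List[str]:
    """Return the requirement lines that lack an exact '==' version pin."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PIN.match(line):
            result.append(line)
    return result


if __name__ == "__main__":
    # Hypothetical requirements; the version numbers are placeholders only.
    example = [
        "nhs_data_cleansing",
        "pyspark==3.5.0  # pinned",
    ]
    print(unpinned(example))
```

Running it on a project's requirements.txt (for example, unpinned(open("requirements.txt"))) would list the entries that could change underneath existing work when a new release is published.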

pip

Add nhs_data_cleansing to a requirements.txt file within the project, then run pip install -r requirements.txt

Foundry

Add nhs_data_cleansing to the conda_recipe/meta.yml file following the Foundry "python libraries" guidance

Contact

Licence

Unless stated otherwise (and in keeping with the NHS Open Source Policy), the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation. The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.

Contribution

If you want to help build and improve this package, see the contributing guidelines


This readme has been built in line with guidance from the NHS Open Source Policy and govtcookiecutter