The write-up includes an outline of the steps taken in the project. The purpose of the final data model is made explicit.
The write-up describes a logical approach to this project under the following scenarios:
- The data was increased by 100x.
- The pipelines were run daily by 7 am.
- The database needed to be accessed by 100+ people.
The choice of tools, technologies, and data model is well justified.
All scripts have an intuitive, easy-to-follow structure, with code separated into logical functions. Variable and function names follow the PEP 8 style guidelines. The code runs without errors.
The project includes at least two data quality checks.
- The ETL processes result in the data model outlined in the write-up.
- A data dictionary for the final data model is included.
- The data model is appropriate for the identified purpose.
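
As an illustration of the data quality criterion above, the following is a minimal sketch of two checks, assuming each table has been loaded as a list of dictionaries. The function names, the `dim_users` table, and the `user_id` key are hypothetical and chosen only for the example; a real project would run equivalent checks against its own tables after the ETL step.

```python
def check_not_empty(rows, table_name):
    """Check 1: the table must contain at least one row."""
    if len(rows) == 0:
        raise ValueError(f"Data quality check failed: {table_name} is empty")
    return len(rows)


def check_unique_key(rows, key, table_name):
    """Check 2: the key column must have no missing or duplicate values."""
    values = [row.get(key) for row in rows]
    if any(value is None for value in values):
        raise ValueError(
            f"Data quality check failed: {table_name}.{key} has missing values"
        )
    if len(set(values)) != len(values):
        raise ValueError(
            f"Data quality check failed: {table_name}.{key} has duplicates"
        )
    return len(values)


if __name__ == "__main__":
    # Hypothetical sample rows standing in for a loaded dimension table.
    dim_users = [
        {"user_id": 1, "name": "Ada"},
        {"user_id": 2, "name": "Grace"},
    ]
    row_count = check_not_empty(dim_users, "dim_users")
    check_unique_key(dim_users, "user_id", "dim_users")
    print(f"dim_users passed both checks ({row_count} rows)")
```

Raising an exception on failure, rather than only logging, is one possible design choice: it stops a scheduled pipeline run before bad data reaches downstream users.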
The project includes:
- At least two data sources
- More than 1 million lines of data
- At least two data source formats (CSV, API, JSON)