Data Engineering Capstone

Write Up

Scoping the Project

The write-up outlines the steps taken in the project and makes the purpose of the final data model explicit.

Addressing Other Scenarios

The write-up describes a logical approach to this project under the following scenarios:

  • The data was increased by 100x.
  • The pipelines would be run daily by 7 am.
  • The database needed to be accessed by 100+ people.

Defending Decisions

The choices of tools, technologies, and data model are well justified.

Execution

Project code is clean and modular.

All scripts have an intuitive, easy-to-follow structure, with code separated into logical functions. Variable and function names follow the PEP 8 style guidelines. The code should run without errors.
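
As a rough illustration of what "separated into logical functions" with PEP 8 naming could look like, here is a minimal sketch of a pipeline split into extract, transform, and load steps. The function names, the column names (`name`, `amount`), and the in-memory list used as a load target are all assumptions made for this example; they are not part of the rubric.

```python
import csv
import io


def extract_rows(source):
    """Read CSV text from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def transform_rows(rows):
    """Strip whitespace from names and cast amounts to float."""
    return [
        {"name": row["name"].strip(), "amount": float(row["amount"])}
        for row in rows
    ]


def load_rows(rows, target):
    """Append cleaned rows to an in-memory target and return the row count.

    A real project would write to a database or file store instead;
    the list stands in for that destination here.
    """
    target.extend(rows)
    return len(rows)


def run_pipeline(source, target):
    """Run the three steps in order and return the number of rows loaded."""
    return load_rows(transform_rows(extract_rows(source)), target)


if __name__ == "__main__":
    sample = io.StringIO("name,amount\n Ann ,3.5\nBob,2\n")
    loaded = []
    print(run_pipeline(sample, loaded), "rows loaded")
```

Keeping each step in its own small function makes the steps testable in isolation, which is one plausible reading of "clean and modular".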

Quality Checks

The project includes at least two data quality checks.
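
The rubric does not prescribe which checks to use. As one possible sketch, the two functions below test that a table is not empty and that a key column contains no NULL values. The table and column names (`trips`, `trip_id`, `city`) and the SQLite in-memory database are assumptions chosen so the example runs on its own.

```python
import sqlite3


def check_row_count(conn, table, min_rows=1):
    """Return True if `table` holds at least `min_rows` rows."""
    # Table names cannot be bound as SQL parameters, so only pass
    # trusted, hard-coded names to this function.
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return count >= min_rows


def check_no_nulls(conn, table, column):
    """Return True if `column` in `table` contains no NULL values."""
    nulls = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    ).fetchone()[0]
    return nulls == 0


def demo_connection():
    """Build a small in-memory table (hypothetical data) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trips (trip_id INTEGER, city TEXT)")
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?)", [(1, "Oslo"), (2, None)]
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print("row count ok:", check_row_count(conn, "trips"))
    print("trip_id has no nulls:", check_no_nulls(conn, "trips", "trip_id"))
    print("city has no nulls:", check_no_nulls(conn, "trips", "city"))
```

In a pipeline, checks like these would typically run after the load step and stop the run when one of them returns False.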

Data Model

  • The ETL processes result in the data model outlined in the write-up.
  • A data dictionary for the final data model is included.
  • The data model is appropriate for the identified purpose.
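
A data dictionary can take many forms (a Markdown table, a spreadsheet, comments in the schema). As a minimal sketch, assuming it is kept as a Python mapping, the helper below reports columns in the final model that have no entry yet. The table name, column names, and descriptions are hypothetical.

```python
# Hypothetical data dictionary: table name -> column name -> description.
DATA_DICTIONARY = {
    "trips": {
        "trip_id": "Unique identifier for a trip (integer).",
        "city": "City where the trip started (text).",
    },
}


def undocumented_columns(table, columns, dictionary):
    """Return the columns of `table` that have no dictionary entry."""
    documented = dictionary.get(table, {})
    return [column for column in columns if column not in documented]


if __name__ == "__main__":
    missing = undocumented_columns(
        "trips", ["trip_id", "city", "fare"], DATA_DICTIONARY
    )
    print("undocumented columns:", missing)
```

Running a check like this against the actual table schema is one way to keep the dictionary and the ETL output from drifting apart.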

Datasets

The project includes:

  • At least two data sources.
  • More than 1 million lines of data.
  • At least two data formats (for example CSV, API, JSON).