Nasdaq-100 Index ELT

About

This project is designed to extract, load, transform (ELT), and visualize data from the Nasdaq-100 index. It serves as an end-to-end data pipeline demonstration, leveraging Python, Dagster, dbt, and Quarto. The primary goal is to gain practical experience with Modern Data Stack technologies while analyzing financial data.

The project does the following things:

  1. Data Scraping: Scrapes Wikipedia for a list of companies in the Nasdaq-100.
  2. Data Extraction: Uses the yfinance Python package to fetch information on the companies in the Nasdaq-100, along with the daily open, high, low, and close (OHLC) prices for the Nasdaq-100 E-mini futures (NQ). This data is then loaded into a DuckDB database for further processing.
  3. Data Transformation: Uses dbt to transform the daily OHLC data, calculating weekly, monthly, and yearly returns.
  4. Data Visualization: Uses Quarto to create this dashboard.
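To make step 3 concrete, here is a minimal sketch of one way to derive period returns from daily closes. It is plain Python rather than the project's dbt SQL, and it rests on assumptions that are not taken from this repository: a period's return is defined as its last close divided by the previous period's last close, minus one; weeks follow the ISO calendar; and the names period_key and period_returns, as well as the sample prices, are hypothetical.

```python
from datetime import date


def period_key(day: date, period: str) -> tuple:
    """Map a trading day to the calendar period it belongs to."""
    if period == "weekly":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # (ISO year, ISO week number)
    if period == "monthly":
        return (day.year, day.month)
    if period == "yearly":
        return (day.year,)
    raise ValueError(f"unknown period: {period}")


def period_returns(daily_closes, period):
    """Return {period_key: simple return} from (date, close) pairs.

    Assumed definition: last close of the period divided by the last
    close of the previous period, minus one. The first period has no
    predecessor, so it is omitted from the result.
    """
    last_close = {}
    for day, close in sorted(daily_closes):
        # Days are visited in date order, so later days overwrite earlier ones.
        last_close[period_key(day, period)] = close

    keys = sorted(last_close)
    return {
        cur: last_close[cur] / last_close[prev] - 1
        for prev, cur in zip(keys, keys[1:])
    }


# Made-up closes for illustration only, not real NQ prices.
sample = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 5), 102.0),
    (date(2024, 1, 8), 103.0),
    (date(2024, 1, 12), 105.06),
    (date(2024, 2, 1), 110.0),
]

weekly = period_returns(sample, "weekly")
monthly = period_returns(sample, "monthly")
```

With the sample data, weekly contains entries for ISO weeks 2 and 5 of 2024, and monthly contains a single entry for February. The actual dbt models may define returns differently (for example, open-to-close within a period), so treat this as an illustration of the idea rather than a description of the project's transformations.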

Architecture

Pipeline DAG

Prerequisites

This project uses uv as its Python package and dependency manager. Before starting, ensure that uv is installed on your system. Installation instructions can be found here.

Setup

Install Dependencies

Run uv sync to install the necessary dependencies into the project's virtual environment.

Note: VS Code users should also install the VS Code extension for Quarto to render and preview the Quarto dashboard.

Using Dagster

To launch the Dagster UI web server, run uv run dagster dev from the root directory, then open the URL shown in your console to view and interact with the pipeline.

Running in Docker

Ensure that Docker is installed on your system. To run the entire pipeline and create the dashboard with Docker, run these commands from the root directory:

docker build -t nasdaq100_elt .

docker run -it -p 8080:8080 -v nasdaq100_elt_vol:/app/dashboard nasdaq100_elt

The Dagster interface will then be available at http://localhost:8080. Trigger the ingest_and_transform_job from the Dagster jobs pane. Once the job completes, dashboard.html will be available in the nasdaq100_elt_vol volume, accessible via Docker Desktop.