This project was developed as part of the mentorship I am receiving. The goal is to load the data into the cloud, once it has been properly processed, so that it is accessible to BI tools.
- Data processing from a dataset in .csv format.
- Data transformation with cleaning, standardization, and enrichment in SQL, initially in a local database (PostgreSQL).
- Creation of a DataLake in the cloud, with layers: raw, silver, gold, and diamond.
- Data consumption via BI tool.
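To make the three transformation steps concrete, here is a minimal sketch in plain Python. The project itself does this work in SQL on PostgreSQL, so this is only an illustration of the idea, not the project's code. The column names (Index, Date, Open, High, Low, Close, Adj Close, Volume) are assumed from the Kaggle dataset, the `daily_range` column is a hypothetical enrichment, and the sample rows are made up.

```python
import csv
import io


def clean_rows(rows):
    """Cleaning: keep only rows where every value is a non-empty, non-'null' string."""
    return [
        row for row in rows
        if all(
            isinstance(v, str) and v.strip() and v.strip().lower() != "null"
            for v in row.values()
        )
    ]


def standardize_row(row):
    """Standardization: snake_case column names, numeric strings converted to float."""
    out = {}
    for key, value in row.items():
        name = key.strip().lower().replace(" ", "_")
        try:
            out[name] = float(value)
        except ValueError:
            out[name] = value.strip()  # non-numeric values (index name, date) stay text
    return out


def enrich_row(row):
    """Enrichment: add a derived column (hypothetical example: daily price range)."""
    enriched = dict(row)
    enriched["daily_range"] = round(row["high"] - row["low"], 4)
    return enriched


def process(csv_text):
    """Apply cleaning, then standardization, then enrichment to CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = clean_rows(list(reader))
    return [enrich_row(standardize_row(r)) for r in cleaned]


# Made-up rows for illustration only; they are not taken from indexData.csv.
SAMPLE = """Index,Date,Open,High,Low,Close,Adj Close,Volume
DEMO,2020-01-02,100.0,110.5,99.5,105.0,105.0,1000
DEMO,2020-01-03,null,null,null,null,null,null
"""
```

Calling `process(SAMPLE)` keeps the first row, drops the row full of `null` values, and adds `daily_range` to what remains.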
- Clone this repository:
git clone https://github.com/DaviMacielCavalcante/desafio2-prof-artemisia
cd desafio2-prof-artemisia
- Download the indexData.csv file from this link: https://www.kaggle.com/datasets/mattiuzc/stock-exchange-data
- In the root of the project, create a directory called "datasets" and place the indexData.csv file inside it.
- It is recommended to clear out the .csv files already present in the DataLake layers so you can watch every step happen, or to modify the scripts as you prefer.
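If you prefer to script the setup steps above, the following is a minimal sketch. It assumes each DataLake layer is a folder named after the layer under a single root directory, which may not match this repository's layout, so adjust the paths before using it. By default it only clears the generated layers (silver, gold, diamond) and leaves raw alone; that default is also an assumption.

```python
from pathlib import Path


def ensure_datasets_dir(project_root):
    """Create the 'datasets' directory in the project root if it is missing."""
    datasets = Path(project_root) / "datasets"
    datasets.mkdir(parents=True, exist_ok=True)
    return datasets


def clear_layer_csvs(datalake_root, layers=("silver", "gold", "diamond")):
    """Delete the .csv files directly inside each layer folder.

    The folder layout (<datalake_root>/<layer>/*.csv) is hypothetical.
    Returns the list of files that were removed.
    """
    removed = []
    for layer in layers:
        layer_dir = Path(datalake_root) / layer
        if not layer_dir.is_dir():
            continue  # skip layers that do not exist yet
        for csv_file in sorted(layer_dir.glob("*.csv")):
            csv_file.unlink()
            removed.append(csv_file)
    return removed
```

For example, `ensure_datasets_dir(".")` followed by `clear_layer_csvs("path/to/your/datalake")` prepares a fresh run; the second path is a placeholder.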
- Run the script responsible for creating the silver layer:
python preparando_camada_silver.py
- Next, run the gold layer script:
python preparando_camada_gold.py
- Finally, run the diamond layer script:
python preparando_camada_diamond.py
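The three scripts have to run in this order, since each layer builds on the previous one. As a convenience, a small runner like the sketch below can chain them and stop at the first failure. The `run_pipeline` helper and its `runner` parameter are hypothetical and not part of this repository; only the script names come from the steps above.

```python
import subprocess
import sys

# Script names as listed in the steps above, in execution order.
LAYER_SCRIPTS = [
    "preparando_camada_silver.py",
    "preparando_camada_gold.py",
    "preparando_camada_diamond.py",
]


def run_pipeline(scripts=LAYER_SCRIPTS, runner=subprocess.run):
    """Run each script with the current Python interpreter, in order.

    Stops at the first script that exits with a non-zero code.
    `runner` is injectable so the function can be exercised without
    actually launching the scripts.
    """
    completed = []
    for script in scripts:
        result = runner([sys.executable, script], check=False)
        if result.returncode != 0:
            raise RuntimeError(f"{script} exited with code {result.returncode}")
        completed.append(script)
    return completed
```

Calling `run_pipeline()` from the project root runs silver, gold, and diamond in sequence.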
- Uploading to the cloud:
- Create an AWS account;
- Follow this AWS LATAM tutorial to upload the DataLake:
https://youtube.com/playlist?list=PLQHh55hXC4yrBZ4yookmQPlX2zM9dZ-MH&si=lpGE6Hz2F6t37THw
- If you want to connect to Power BI, follow this tutorial:
https://youtu.be/WS3LUbK0ung?si=YXc_Wy5j53Ct34z3
- Stay on the right side of the Force:
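The tutorial linked above remains the reference for the upload. If you would rather script it, the sketch below shows one possible approach, assuming the DataLake is stored in an Amazon S3 bucket, that each local layer is a folder under one root directory, and that the third-party boto3 package and AWS credentials are available. None of these assumptions is confirmed by this repository, and the function names are hypothetical.

```python
from pathlib import Path


def s3_key_for(local_file, datalake_root):
    """Map <datalake_root>/<layer>/<file> to the object key '<layer>/<file>'."""
    relative = Path(local_file).relative_to(datalake_root)
    return relative.as_posix()


def plan_upload(datalake_root):
    """List (local path, object key) pairs for every .csv under the root."""
    root = Path(datalake_root)
    return [(path, s3_key_for(path, root)) for path in sorted(root.rglob("*.csv"))]


def upload_datalake(datalake_root, bucket):
    """Upload every planned file to the given bucket (assumed to exist already)."""
    import boto3  # imported lazily so planning works without boto3 installed

    s3 = boto3.client("s3")
    for path, key in plan_upload(datalake_root):
        s3.upload_file(str(path), bucket, key)
```

Running `plan_upload` first lets you check the layer prefixes (raw/, silver/, gold/, diamond/) before anything is sent to AWS.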
Contributions are welcome! Please follow these guidelines:
- Fork the project.
- Create a branch for the feature you want to implement (git checkout -b my-new-feature).
- Commit your changes with meaningful descriptions (git commit -m 'Add new feature').
- Push to the created branch (git push origin my-new-feature).
- Open a pull request for review.
This project is licensed under the MIT License. See the LICENSE.md file for more details.
If you have any questions or issues, feel free to contact:
📧 Email: davicc@outlook.com.br
- Darth Davi ⚔️😡
👩‍💻 Mentor’s GitHub: https://github.com/arteweyl
Through victory, my chains are broken.
The Force shall free me.