This project succeeds a CLI app built for a small, independent cafe to track its stock, couriers and customers. Following unprecedented growth, the cafe has expanded to hundreds of outlets across the country. This growth brings the need to use sales data to target new and returning customers and to understand which products are selling well. The cafes are experiencing issues with collating and analysing the data produced at each branch, as their technical setup is limited.
This project addresses that problem by advising on what is needed to grow the company's technical offering, so that it can continue to accelerate its growth.
After a thorough analysis of the company's needs, it was decided that the best course of action was to create an ETL (extract, transform, load) pipeline to handle the large volumes of transactional data from the business. The data needs to be stored centrally in a cloud environment so that all stakeholders can access it quickly.
Being able to query the company's data as a whole will greatly improve the client's ability to identify company-wide trends and insights.
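To make the extract, transform and load stages concrete, here is a minimal sketch of how they might fit together. It is not the project's actual implementation: the column names, the sample rows, the `sales` table and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for the cloud database so the example runs without AWS credentials.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical column layout; the real branch export format is not described here.
COLUMNS = ["timestamp", "branch", "customer_name", "items", "total"]


def extract(csv_source):
    """Read one branch's raw sales export into a DataFrame."""
    return pd.read_csv(csv_source, names=COLUMNS)


def transform(raw):
    """Drop personal data and order totals, then emit one row per product sold."""
    df = raw.drop(columns=["customer_name", "total"])
    df["items"] = df["items"].str.split(", ")
    df = df.explode("items").rename(columns={"items": "product"})
    return df.reset_index(drop=True)


def load(df, connection):
    """Append the cleaned rows to a central 'sales' table."""
    df.to_sql("sales", connection, if_exists="append", index=False)


# Made-up sample data standing in for a branch CSV file.
sample = io.StringIO(
    '2022-06-01 09:00,Leeds,Ann Smith,"Latte, Flat white",5.10\n'
    "2022-06-01 09:05,Leeds,Bob Jones,Latte,2.50\n"
)

# SQLite is a stand-in for the real cloud database (an assumption).
conn = sqlite3.connect(":memory:")
load(transform(extract(sample)), conn)

# A company-wide question: which products sell best?
top = conn.execute(
    "SELECT product, COUNT(*) FROM sales GROUP BY product ORDER BY COUNT(*) DESC"
).fetchall()
print(top)
```

Keeping each stage as a separate function means each one can be tested on its own and replaced independently, for example by swapping the SQLite connection for a connection to the production database. A fuller pipeline would probably keep order totals in a separate table rather than discard them.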
- Any IDE for Python development,
- An AWS account,
- A GitHub account with a repository.
boto3==1.24.13
pandas==1.4.2
psycopg2==2.9.3
- Clone the repo
git clone https://github.com/YuliaTom/Cafe-network-ETL-pipeline.git
- Create a virtual environment called env
Unit testing is to be implemented with pytest.
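As an illustration of the kind of test this could involve, the sketch below checks a small transformation helper. The helper `split_basket`, the file name and the input format are hypothetical and are not taken from the repository. pytest collects functions whose names start with `test_` and reports any failing `assert`.

```python
# test_transform.py (hypothetical file name)


def split_basket(items):
    """Hypothetical helper: turn 'Latte, Flat white' into ['Latte', 'Flat white']."""
    return [item.strip() for item in items.split(",") if item.strip()]


def test_split_basket_separates_products():
    assert split_basket("Latte, Flat white") == ["Latte", "Flat white"]


def test_split_basket_ignores_empty_entries():
    # Stray commas and whitespace should not produce empty products.
    assert split_basket("Latte, , ") == ["Latte"]
```

With pytest installed, running `pytest` from the project root would discover and run both tests.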
A schematic representation of the pipeline can be found at the link below.