This project analyzes the Superstore dataset to uncover key business insights. The workflow covers data cleaning and preprocessing in Python, exploratory data analysis, and an interactive dashboard built in Power BI.
- Overview
- Data Cleaning
- Visualization & Insights
- Key Findings
- How to Use
- Repository Structure
- Credits
Performed in Python (see `data_cleaning.py`):
- Checked for and handled missing values (none found).
- Removed duplicate rows.
- Renamed columns to lowercase with underscores.
- Converted date columns (`order_date`, `ship_date`) to datetime format.
- Ensured correct data types for numeric columns (`sales`, `profit`, `quantity`, `discount`).
- Saved the cleaned data as `cleaned_superstore.csv`.
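
The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the contents of `data_cleaning.py`: the function name `clean_superstore`, the sample rows, and the ISO date format are assumptions made for the example, and the real raw file may need different parsing options.

```python
import io

import pandas as pd

DATE_COLS = ("order_date", "ship_date")
NUMERIC_COLS = ("sales", "profit", "quantity", "discount")


def clean_superstore(source) -> pd.DataFrame:
    """Hypothetical helper mirroring the cleaning steps listed in this README."""
    # Read every column as text so whitespace can be trimmed uniformly first.
    df = pd.read_csv(source, dtype=str)

    # Rename columns to lowercase with underscores ("Order Date" -> "order_date").
    df.columns = (
        df.columns.str.strip()
        .str.lower()
        .str.replace(r"[^a-z0-9]+", "_", regex=True)
        .str.strip("_")
    )

    # Standardize text: trim stray whitespace in every column.
    for col in df.columns:
        df[col] = df[col].str.strip()

    # Remove exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)

    # Convert date columns to datetime; unparseable values become NaT.
    for col in DATE_COLS:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col], errors="coerce")

    # Ensure numeric columns are numeric; unparseable values become NaN.
    for col in NUMERIC_COLS:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors="coerce")

    return df


# Made-up sample rows (not real Superstore data), including one duplicate.
raw = io.StringIO(
    "Order ID,Order Date,Ship Date,Sales,Quantity,Discount,Profit\n"
    "A-1,2016-11-08,2016-11-11,261.96,2,0,41.91\n"
    "A-1,2016-11-08,2016-11-11,261.96,2,0,41.91\n"
    "A-2, 2016-06-12,2016-06-16,14.62,2,0.2,6.87\n"
)
cleaned = clean_superstore(raw)
missing_total = int(cleaned.isna().sum().sum())  # count of missing cells
# cleaned.to_csv("cleaned_superstore.csv", index=False)  # save step
```

Reading everything as text first keeps the whitespace trimming and duplicate check independent of how pandas infers types; the type conversions then run on already-standardized values.
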
Summary Table:
Step | Action Taken |
---|---|
Missing Values | None found |
Duplicates | Removed |
Text Standardization | All object columns standardized |
Column Renaming | Lowercase, underscores |
Date Conversion | order_date, ship_date to datetime |
Numeric Types | Ensured for sales, profit, quantity, discount |
Created in Power BI (`Superstore_Dashboard.pbix`):
- Sales Trends Over Time: Line chart (by month & category)
- Profit by Segment: Pie chart
- Top Customers by Sales: Horizontal bar chart
- Sales by Category & Segment: Stacked bar chart
- Sales vs. Profit: Scatter plot
- Discount Impact on Profit: Waterfall chart (steps: sales, discount, profit)
- Regional Performance: Interactive slicer/filter
- Sales are rising, led by the Technology category.
- Profitability is highest in Technology; Furniture lags behind.
- Top 10 customers drive a significant portion of sales.
- Discounts reduce profit margins, especially in Furniture.
- West region outperforms others in both sales and profit.
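
Findings like these can be cross-checked outside Power BI with a few pandas aggregations. The sketch below is illustrative only: the column names `category`, `region`, and `customer_name` are assumed to follow from the lowercase renaming step, the helper names are hypothetical, and the sample rows are made up rather than taken from the dataset.

```python
import pandas as pd

# Made-up sample rows, purely to exercise the helpers below.
df = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2016-01-05", "2016-01-20", "2016-02-03", "2016-02-17"]
        ),
        "category": ["Technology", "Furniture", "Technology", "Furniture"],
        "region": ["West", "East", "West", "South"],
        "customer_name": ["Ann", "Bob", "Ann", "Cy"],
        "sales": [100.0, 200.0, 300.0, 400.0],
        "profit": [20.0, -10.0, 60.0, 30.0],
    }
)


def summarize_by(data: pd.DataFrame, key: str) -> pd.DataFrame:
    """Total sales and profit per group, plus profit margin (profit / sales)."""
    out = data.groupby(key)[["sales", "profit"]].sum()
    out["profit_margin"] = out["profit"] / out["sales"]
    return out.sort_values("sales", ascending=False)


def top_customer_share(data: pd.DataFrame, n: int = 10) -> float:
    """Fraction of total sales contributed by the n largest customers."""
    by_customer = data.groupby("customer_name")["sales"].sum()
    return float(by_customer.nlargest(n).sum() / by_customer.sum())


def monthly_sales(data: pd.DataFrame) -> pd.DataFrame:
    """Sales per calendar month (rows) and category (columns)."""
    month = data["order_date"].dt.to_period("M")
    return data.groupby([month, "category"])["sales"].sum().unstack(fill_value=0)


by_category = summarize_by(df, "category")
by_region = summarize_by(df, "region")
share_top1 = top_customer_share(df, n=1)
monthly = monthly_sales(df)
```

On the real cleaned data, `summarize_by(df, "category")` and `summarize_by(df, "region")` would give the totals behind the category and region comparisons, and `top_customer_share(df, 10)` would put a number on how much of sales the top 10 customers account for.
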
- Clone this repository: `git clone https://github.com/yourusername/superstore-analysis.git`
- Data Cleaning
  - Run `data_cleaning.py` to clean the raw Superstore data.
- Visualization
  - Open `Superstore_Dashboard.pbix` in Power BI Desktop to explore the interactive dashboard.
├── Superstore Analysis Dashboard.pdf
├── Superstore Sales & Profit Analysis.pptx
├── TASK 2 DATA ANALYST.pdf
├── cleaned_superstore.csv
├── data_cleaning.py
├── README.md
- Dataset: Kaggle Superstore CSV
- Data Cleaning & Analysis: Suraj Rajeshkumar Rajvanshi
- Dashboard: Suraj Rajeshkumar Rajvanshi