Analyzing customer behavior using Unsupervised Learning to segment shoppers based on Annual Income, Spending Score & Age. Insights help businesses optimize marketing & retention strategies.

ParthDS02/Mall-Customer-Segmentation-using-K-Means-Clustering

Mall Customer Segmentation using K-Means Clustering

📌 Project Overview

Understanding customer behavior is essential for targeted marketing and business growth. This project applies K-Means Clustering to segment customers based on annual income, spending score, and age. By identifying distinct customer groups, businesses can design personalized marketing strategies, optimize customer engagement, and improve revenue generation.


🎯 Why This Project?

  • Customer segmentation is a critical component of data-driven decision-making for businesses.
  • Helps businesses identify high-value customers and tailor marketing strategies accordingly.
  • Enables companies to increase customer retention, optimize promotions, and drive sales growth.

🔥 Industry Applications

  1. Targeted Marketing & Personalized Promotions

    • Identifies high-value customers for exclusive discounts, rewards, and offers.
    • Engages price-sensitive customers with budget-friendly promotions.
  2. Optimized Store Layout & Inventory Management

    • Improves product placement based on customer segmentation.
    • Helps predict product demand to prevent overstock and shortages.
  3. Enhancing Customer Experience & Retention

    • Retains premium customers through VIP programs & loyalty incentives.
    • Re-engages low-spending customers with personalized marketing strategies.

🚀 Technologies Used

  • Google Colab – For coding, data analysis, and visualization.
  • Overleaf – For generating professional PDF reports using LaTeX.

📚 Libraries Used

  • pandas – For data manipulation.
  • numpy – For numerical computations.
  • matplotlib & seaborn – For data visualization.
  • scikit-learn – For implementing the K-Means clustering algorithm.

📊 Methodology

  1. Data Preprocessing

    • Load and clean customer data.
    • Handle missing values and perform exploratory data analysis (EDA).
  2. Feature Selection

    • Consider Annual Income, Spending Score, and Age for clustering.
  3. Clustering Using K-Means

    • Determine the optimal number of clusters using the Elbow Method.
    • Apply K-Means clustering to segment customers.
  4. Result Visualization

    • 2D Scatter Plot: Customer segmentation based on Annual Income & Spending Score.
    • 3D Visualization: Customer segmentation based on Age, Income & Spending Score.
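
The preprocessing and feature-selection steps above might look like the sketch below. The inline DataFrame is a hypothetical stand-in for the customer file, and the column names (`Age`, `Annual Income (k$)`, `Spending Score (1-100)`) and the helper `prepare_features` are assumptions, not taken from the project notebook. Dropping incomplete rows is only one simple way to handle missing values; the project may use a different strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the customer file; column names are assumed.
raw = pd.DataFrame({
    "Age": [19, 21, 20, 23, np.nan, 35],
    "Annual Income (k$)": [15, 15, 16, 16, 17, 120],
    "Spending Score (1-100)": [39, 81, 6, 77, 40, 79],
})

FEATURES = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]


def prepare_features(df, columns=FEATURES):
    """Drop rows with missing values in the chosen columns and return a float array."""
    clean = df.dropna(subset=columns)
    return clean[columns].to_numpy(dtype=float)


X = prepare_features(raw)
print(raw.isna().sum())  # missing values per column before cleaning
print(X.shape)           # rows kept after cleaning, number of features
```

Passing `columns=FEATURES[1:]` would give the two-feature input (income and spending score) used for the 2D clustering.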

📈 Data Visualizations & Observations

1. Distribution of Annual Income

📌 Observation:

  • The distribution of annual income is right-skewed: only a few customers have very high incomes.
  • Most customers earn between $40k and $80k, with a peak around $60k, indicating a middle-income customer base.
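
A claim like "right-skewed" can be checked numerically as well as visually. The sketch below uses a small, made-up income sample (not the project's data) and pandas' `Series.skew()`; a positive value indicates a longer right tail.

```python
import pandas as pd

# Made-up incomes in k$, for illustration only.
incomes = pd.Series(
    [15, 20, 40, 50, 60, 60, 70, 78, 120, 137],
    name="Annual Income (k$)",
)

skewness = incomes.skew()                 # > 0 means right-skewed
share_mid = incomes.between(40, 80).mean()  # fraction earning 40k-80k (inclusive)

print(f"skewness: {skewness:.2f}")
print(f"share between 40k and 80k: {share_mid:.0%}")
```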

📷 Screenshot Placeholder:
Annual Income Distribution


2. Distribution of Age

📌 Observation:

  • The majority of customers are between 25 and 40 years old, with a peak around 30, suggesting a younger customer base.
  • Very few customers are older than 60, so businesses targeting senior citizens may need a different approach.

📷 Screenshot Placeholder:
Age Distribution


3. Distribution of Spending Score

📌 Observation:

  • Spending scores range from 0 to 100, with most customers scoring between 40 and 60, indicating predominantly moderate spenders.
  • The distribution is slightly bimodal, with two dominant spending groups: moderate spenders and high spenders.

📷 Screenshot Placeholder:
Spending Score Distribution


4. Elbow Method (WCSS vs K for 2D Clustering)

📌 Observation:

  • The elbow point appears at K=5, suggesting that five clusters are a reasonable choice for segmenting customers.
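
For reference, the Elbow Method plots the within-cluster sum of squares (WCSS) against K and looks for the point where adding clusters stops paying off. A minimal sketch with scikit-learn follows, where WCSS is read from the fitted model's `inertia_` attribute. The data is synthetic (`make_blobs`), and the helper name `compute_wcss` and the settings `n_init=10`, `random_state=42` are assumptions rather than the project's actual configuration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2D stand-in for (annual income, spending score); not the project's data.
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=1.0, random_state=42)


def compute_wcss(X, k_values):
    """Return the within-cluster sum of squares (KMeans inertia) for each K."""
    wcss = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=42)
        model.fit(X)
        wcss.append(model.inertia_)
    return wcss


k_values = list(range(1, 11))
wcss = compute_wcss(X, k_values)
for k, value in zip(k_values, wcss):
    print(f"K={k}: WCSS={value:.1f}")
```

Plotting `wcss` against `k_values` gives the elbow curve; the "elbow" is read off by eye, so it is a heuristic rather than a formal test.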

📷 Screenshot Placeholder:
Elbow Method 2D


5. 2D Customer Segmentation (Annual Income vs Spending Score)

📌 Observation:

  • The scatter plot reveals five well-defined clusters, separating customers based on income and spending patterns.
  • High-income, high-spending customers (black cluster) are premium buyers who may respond well to luxury and exclusive offers.
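
A scatter plot of this kind could be produced roughly as follows. This is a sketch on synthetic points drawn around five hypothetical group centres, not the project's data, and the colours it produces will not match the cluster colours named above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical group centres: (annual income in k$, spending score).
centres = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]])
X = np.vstack([c + rng.normal(scale=5.0, size=(40, 2)) for c in centres])

# Fit K-Means with K=5 and assign each point to a cluster.
model = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = model.fit_predict(X)

# Colour points by cluster and mark the fitted centroids.
fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=20)
ax.scatter(model.cluster_centers_[:, 0], model.cluster_centers_[:, 1],
           c="red", marker="x", s=80, label="Centroids")
ax.set_xlabel("Annual Income (k$)")
ax.set_ylabel("Spending Score (1-100)")
ax.set_title("Customer segments (synthetic example)")
ax.legend()
```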

📷 Screenshot Placeholder:
2D Segmentation


6. Elbow Method (WCSS vs K for 3D Clustering)

📌 Observation:

  • The optimal number of clusters remains at K=5, even when age is introduced as an additional dimension.

📷 Screenshot Placeholder:
Elbow Method 3D


7. 3D Customer Segmentation (Age, Income, Spending Score)

📌 Observation:

  • The yellow and red clusters (high spenders) tend to be younger, suggesting that youth-oriented marketing strategies may be effective.
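
One way to back up an age-related reading of the 3D plot is to compare per-cluster averages. The sketch below clusters synthetic (age, income, spending score) points and summarises each cluster with a pandas `groupby`. The data, the short column names, and the group centres are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical group centres: (age, annual income in k$, spending score).
centres = np.array([
    [45, 25, 20],
    [25, 25, 80],
    [43, 55, 50],
    [41, 90, 15],
    [32, 90, 85],
])
data = np.vstack([c + rng.normal(scale=4.0, size=(40, 3)) for c in centres])
df = pd.DataFrame(data, columns=["Age", "Income", "Score"])

# Cluster on all three features.
model = KMeans(n_clusters=5, n_init=10, random_state=42)
df["Cluster"] = model.fit_predict(df[["Age", "Income", "Score"]])

# Mean age, income, and score per cluster, plus cluster sizes.
profile = df.groupby("Cluster")[["Age", "Income", "Score"]].mean().round(1)
profile["Customers"] = df.groupby("Cluster").size()
print(profile.sort_values("Score", ascending=False))
```

Sorting by mean spending score puts the high-spending clusters first, so their mean ages can be compared directly with the rest.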

📷 Screenshot Placeholder:
3D Segmentation


✅ Final Results & Insights

  • Five customer segments identified:

    • Low-income, low-spenders – Price-sensitive, require discounts.
    • Low-income, high-spenders – Impulsive buyers, engaged customers.
    • High-income, low-spenders – Untapped potential, need personalized promotions.
    • High-income, high-spenders – Premium buyers, ideal for VIP programs.
    • Moderate-income, balanced-spenders – Frequent shoppers, best for loyalty programs.
  • The Elbow Method supports K=5 as the number of clusters for segmentation.

  • Businesses can use these insights to tailor marketing strategies and optimize customer experience.


🎯 Future Scope

  • Implement Hierarchical Clustering to compare segmentation effectiveness.
  • Integrate RFM (Recency, Frequency, Monetary) Analysis for better customer profiling.
  • Develop a real-time recommendation system based on customer segments.

💡 How to Use This Repository

  1. Clone this repository:
    git clone https://github.com/ParthDS02/Mall-Customer-Segmentation-using-K-Means-Clustering.git
