Understanding customer behavior is essential for targeted marketing and business growth. This project applies K-Means Clustering to segment customers based on annual income, spending score, and age. By identifying distinct customer groups, businesses can design personalized marketing strategies, optimize customer engagement, and improve revenue generation.
- Customer segmentation is a critical component of data-driven decision-making for businesses.
- Helps businesses identify high-value customers and tailor marketing strategies accordingly.
- Enables companies to increase customer retention, optimize promotions, and drive sales growth.
- Targeted Marketing & Personalized Promotions
  - Identifies high-value customers for exclusive discounts, rewards, and offers.
  - Engages price-sensitive customers with budget-friendly promotions.
- Optimized Store Layout & Inventory Management
  - Improves product placement based on customer segmentation.
  - Helps predict product demand to prevent overstock and shortages.
- Enhancing Customer Experience & Retention
  - Retains premium customers through VIP programs & loyalty incentives.
  - Re-engages low-spending customers with personalized marketing strategies.
- Google Colab: for coding, data analysis, and visualization.
- Overleaf: for generating professional PDF reports using LaTeX.
- pandas: for data manipulation.
- numpy: for numerical computations.
- matplotlib & seaborn: for data visualization.
- scikit-learn: for implementing the K-Means clustering algorithm.
- Data Preprocessing
  - Load and clean customer data.
  - Handle missing values and perform exploratory data analysis (EDA).
- Feature Selection
  - Consider Annual Income, Spending Score, and Age for clustering.
- Clustering Using K-Means
  - Determine the optimal number of clusters using the Elbow Method.
  - Apply K-Means clustering to segment customers.
- Result Visualization
  - 2D Scatter Plot: Customer segmentation based on Annual Income & Spending Score.
  - 3D Visualization: Customer segmentation based on Age, Income & Spending Score.
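The preprocessing and feature-selection steps above can be sketched as follows. This is a minimal illustration, not the project's notebook: the data is synthetic, the column names are assumptions, and dropping incomplete rows is only one of several ways to handle missing values. No feature scaling is applied because the steps above do not mention it.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the customer file; column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 71, size=n).astype(float),
    "Annual Income (k$)": rng.gamma(shape=4.0, scale=15.0, size=n).round(),
    "Spending Score": rng.integers(1, 101, size=n).astype(float),
})

# Inject a few missing values so the cleaning step has something to do.
df.loc[[3, 57, 120], "Annual Income (k$)"] = np.nan
df.loc[[10], "Age"] = np.nan

features = ["Age", "Annual Income (k$)", "Spending Score"]

# Data preprocessing: drop exact duplicate rows, then rows missing any feature.
clean = df.drop_duplicates().dropna(subset=features).reset_index(drop=True)

# Quick EDA: count, mean, spread, and quartiles per feature.
print(clean[features].describe().round(1))

# Feature selection: the numeric matrix that would be handed to K-Means.
X = clean[features].to_numpy()
print("rows kept:", X.shape[0], "features:", X.shape[1])
```

With real data, the `DataFrame` would come from something like `pd.read_csv(...)` instead of the random generator.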
Observation:
- The distribution of annual income is right-skewed: only a few customers have very high incomes.
- Most customers earn between $40k and $80k, with a peak around $60k, indicating a middle-income customer base.
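Right skew can also be checked numerically: the sample skewness (third standardized moment) is positive when the long tail is on the right. The sketch below uses synthetic incomes drawn from a gamma distribution as a stand-in for the real column; the helper name `sample_skewness` is made up for this example.

```python
import numpy as np

# Synthetic incomes in k$ (assumption: gamma-distributed, mean around 60).
rng = np.random.default_rng(42)
income_k = rng.gamma(shape=6.0, scale=10.0, size=1000)


def sample_skewness(values):
    """Third standardized moment; a value above 0 means a longer right tail."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return float((centered ** 3).mean() / values.std() ** 3)


skew = sample_skewness(income_k)

# Locate the histogram bin that holds the most customers (the peak).
counts, edges = np.histogram(income_k, bins=10)
peak_bin = int(np.argmax(counts))

print(f"skewness = {skew:.2f}")
print(f"modal bin: {edges[peak_bin]:.0f}k to {edges[peak_bin + 1]:.0f}k")
```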
Observation:
- The majority of customers are between 25 and 40 years old, with a peak around 30, suggesting a younger customer base.
- Very few customers are older than 60, so businesses targeting senior citizens might need a different approach.
Observation:
- Spending scores range from 0 to 100, with most customers scoring between 40 and 60, indicating that moderate spenders dominate.
- The distribution is slightly bimodal, meaning there are two dominant spending groups: moderate spenders and high spenders.
Observation:
- The elbow point appears at K=5: beyond five clusters, adding more yields only small reductions in within-cluster variance, so five clusters are a reasonable choice for segmenting these customers.
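For reference, an elbow curve is typically built by fitting K-Means for a range of K and recording the within-cluster sum of squares (WCSS, exposed by scikit-learn as `inertia_`). The sketch below assumes synthetic data with five blobs in (income, spending score) space, so the elbow at K=5 is there by construction and is not a reproduction of the project's result.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for (annual income in k$, spending score); centers are assumptions.
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]], dtype=float)
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=6.0, random_state=42)

k_values = list(range(1, 11))
wcss = []
for k in k_values:
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)  # sum of squared distances to the nearest centroid

# Relative WCSS reduction gained by each additional cluster.
for k, prev, cur in zip(k_values[1:], wcss[:-1], wcss[1:]):
    print(f"K={k}: WCSS={cur:,.0f} ({(prev - cur) / prev:.0%} lower than K={k - 1})")
```

The elbow is the K after which the printed reductions become small. Reading it off is a judgment call, so it is worth checking the choice against a second criterion such as the silhouette score.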
Observation:
- The scatter plot reveals five well-defined clusters, separating customers based on income and spending patterns.
- High-income, high-spending customers (black cluster) are premium buyers who may respond well to luxury and exclusive offers.
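A plot of this kind can be produced roughly as follows. This is a sketch on synthetic data; the colormap, the output filename `segments_2d.png`, and the axis labels are assumptions, so cluster colors will not match the ones named above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for (annual income in k$, spending score).
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]], dtype=float)
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=6.0, random_state=42)

# Fit K-Means with K=5 and assign each customer to a cluster.
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
centroids = kmeans.cluster_centers_

# Color each point by its cluster and mark the centroids.
fig, ax = plt.subplots(figsize=(7, 5))
ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="tab10", s=30)
ax.scatter(centroids[:, 0], centroids[:, 1], c="black", marker="X", s=150, label="Centroids")
ax.set_xlabel("Annual income (k$)")
ax.set_ylabel("Spending score")
ax.set_title("Customer segments (K=5)")
ax.legend()
fig.savefig("segments_2d.png", dpi=120)
plt.close(fig)
```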
Observation:
- The optimal number of clusters remains at K=5, even when age is introduced as an additional dimension.
Observation:
- The yellow and red clusters (high spenders) tend to be younger, suggesting that youth-oriented marketing strategies may be effective.
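A three-feature version can be sketched like this. The synthetic centers are deliberately chosen so that the high-spending groups are younger, mimicking the pattern described above. They are assumptions for illustration, not the project's data, and the filename `segments_3d.png` is also hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic (age, annual income in k$, spending score) centers; assumptions only.
centers = np.array(
    [[45, 25, 20], [25, 25, 80], [43, 55, 50], [41, 88, 17], [32, 88, 82]],
    dtype=float,
)
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=6.0, random_state=42)

kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
centroids = kmeans.cluster_centers_

# 3D scatter plot, one color per cluster.
fig = plt.figure(figsize=(7, 6))
ax = fig.add_subplot(projection="3d")
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=labels, cmap="tab10", s=25)
ax.set_xlabel("Age")
ax.set_ylabel("Annual income (k$)")
ax.set_zlabel("Spending score")
fig.savefig("segments_3d.png", dpi=120)
plt.close(fig)

# K-Means centroids are per-cluster feature means, so they double as a profile.
# Clusters are listed from highest to lowest mean spending score.
for cluster_id in np.argsort(centroids[:, 2])[::-1]:
    age, income, score = centroids[cluster_id]
    print(f"cluster {cluster_id}: mean age {age:.0f}, income {income:.0f}k, score {score:.0f}")
```

Comparing the mean age of the top-scoring clusters with the rest is a quick numeric check of what the 3D plot shows visually.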
- Five customer segments identified:
  - Low-income, low-spenders: price-sensitive, require discounts.
  - Low-income, high-spenders: impulsive buyers, engaged customers.
  - High-income, low-spenders: untapped potential, need personalized promotions.
  - High-income, high-spenders: premium buyers, ideal for VIP programs.
  - Moderate-income, balanced-spenders: frequent shoppers, best for loyalty programs.
- The Elbow Method supports 5 clusters as a suitable choice for segmentation.
- Businesses can use these insights to tailor marketing strategies and optimize customer experience.
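Segment descriptions like the ones above are usually read off a per-cluster summary table. The sketch below builds such a table with pandas on synthetic data. The column names, the `level` helper, and the cut-offs (40 and 70 for income, 40 and 60 for spending score) are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for (annual income in k$, spending score).
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]], dtype=float)
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=6.0, random_state=42)

df = pd.DataFrame(X, columns=["income_k", "spending_score"])
df["cluster"] = KMeans(n_clusters=5, n_init=10, random_state=42).fit_predict(X)

# One row per cluster: size plus mean income and mean spending score.
profile = df.groupby("cluster").agg(
    customers=("income_k", "size"),
    mean_income_k=("income_k", "mean"),
    mean_score=("spending_score", "mean"),
).round(1)


def level(value, low_cut, high_cut):
    """Bucket a cluster mean into low / moderate / high (cut-offs are assumptions)."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "moderate"


# Attach a readable label to each cluster based on its means.
profile["segment"] = [
    f"{level(row.mean_income_k, 40, 70)}-income, {level(row.mean_score, 40, 60)}-spending"
    for row in profile.itertuples()
]
print(profile)
```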
- Implement Hierarchical Clustering to compare segmentation effectiveness.
- Integrate RFM (Recency, Frequency, Monetary) Analysis for better customer profiling.
- Develop a real-time recommendation system based on customer segments.
- Clone this repository:
  `git clone https://github.com/yourusername/CustomerSegmentation-KMeans.git`