Mall Customers K-Means Clustering Model

This project implements K-Means clustering on the "Mall Customers" dataset from Kaggle. The goal is to segment customers based on their annual income and spending score.

Project Structure

model.ipynb: Jupyter Notebook containing the implementation of the K-Means clustering model.
mall.csv: Dataset used for clustering.
clustered_data.csv: Clustered data saved by the notebook.
kmeans_model.pkl: Saved K-Means model.
readme.md: Project documentation.
requirements.txt: List of dependencies required to run the project.

Dataset

The project uses the "Mall Customers" dataset from Kaggle, stored here as mall.csv. It records per-customer attributes, including annual income and spending score, which are the two features used for clustering.

Steps

Load the Dataset: Load the dataset using pandas and display the first few rows.
Data Preprocessing: Check for missing values and select relevant features for clustering.
Standardize the Data: Standardize the features to ensure equal contribution to distance calculations.
Implement K-Means Clustering: Initialize and fit the K-Means model, then predict the cluster for each data point.
Visualize the Clusters: Create a scatter plot to visualize the clusters.
Evaluate Clustering: Use the Elbow Method and Silhouette Method to determine the optimal number of clusters.
Save the Model and Clustered Data: Save the K-Means model and the clustered data to files.
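The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names are assumed to match the public Kaggle CSV, the choice of 5 clusters is an assumption, the helper `cluster_customers` is hypothetical, and a synthetic DataFrame stands in for mall.csv so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Assumed column names (as in the public Kaggle CSV); adjust to match mall.csv.
FEATURES = ["Annual Income (k$)", "Spending Score (1-100)"]


def cluster_customers(df, n_clusters=5, random_state=42):
    """Hypothetical helper: standardize the features, fit K-Means, label each row."""
    # Preprocessing: refuse to cluster if the selected features have gaps.
    if df[FEATURES].isnull().any().any():
        raise ValueError("missing values in clustering features")

    # Standardize so income and spending score contribute equally to distances.
    scaler = StandardScaler()
    X = scaler.fit_transform(df[FEATURES])

    # Fit K-Means and predict a cluster for each customer.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    out = df.copy()
    out["Cluster"] = model.fit_predict(X)
    return out, model, scaler


# Synthetic stand-in data; with the real file use: df = pd.read_csv("mall.csv")
rng = np.random.default_rng(0)
df = pd.DataFrame({
    FEATURES[0]: rng.uniform(15, 140, size=200),
    FEATURES[1]: rng.uniform(1, 100, size=200),
})

clustered, model, scaler = cluster_customers(df, n_clusters=5)
print(clustered.head())
```

For the final step, the labelled frame could be written with `clustered.to_csv("clustered_data.csv", index=False)` and the model persisted with `pickle` or `joblib`; which of the two the notebook uses is not stated here. A scatter plot of the two features colored by the `Cluster` column covers the visualization step.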

Evaluation

The K-Means model is evaluated with the silhouette score, which ranges from -1 to 1. Values closer to 1 indicate compact, well-separated clusters, while values near 0 indicate overlapping clusters.
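As a rough sketch of how the number of clusters might be chosen, the loop below records the inertia (for the Elbow Method) and the silhouette score for a range of k values. The data is synthetic (`make_blobs`) and stands in for the standardized income and spending-score features; the range of k from 2 to 8 is an arbitrary assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized customer features.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

inertias = {}      # within-cluster sum of squares, plotted for the elbow curve
silhouettes = {}   # mean silhouette score per k

for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(X, km.labels_)

# Pick the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)

for k in inertias:
    print(f"k={k}  inertia={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")
print("best k by silhouette:", best_k)
```

Plotting `inertias` against k and looking for the bend gives the elbow estimate; when the two methods disagree, the silhouette score is usually the easier one to read off numerically.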

Improving the Model

To improve the performance of the K-Means clustering model, consider the following strategies:

Feature scaling and normalization
Dimensionality reduction (e.g., PCA)
Optimal number of clusters (Elbow Method, Silhouette Method)
Initialization (k-means++)
Multiple runs (n_init parameter)
Alternative clustering algorithms (e.g., GMM, DBSCAN)
Incorporate domain knowledge
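Two of the strategies above can be tried with a few lines of scikit-learn. The sketch below fits K-Means with k-means++ initialization and several restarts (`n_init`), then fits a Gaussian mixture model as an alternative and compares silhouette scores. The data is synthetic and the choice of 4 clusters is an assumption made only for this illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized customer features.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# k-means++ seeding with 10 restarts; the run with the lowest inertia is kept.
kmeans = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0)
km_labels = kmeans.fit_predict(X)

# Alternative algorithm: a Gaussian mixture with the same number of components.
gmm = GaussianMixture(n_components=4, random_state=0)
gmm_labels = gmm.fit_predict(X)

scores = {
    "kmeans": silhouette_score(X, km_labels),
    "gmm": silhouette_score(X, gmm_labels),
}
for name, score in scores.items():
    print(f"{name}: silhouette={score:.3f}")
```

The silhouette score is only one lens for this comparison; it tends to favor compact, roughly spherical clusters, so domain knowledge about the customer segments should also inform the choice.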

Installation

This project requires Python. Install the dependencies with the following command:

pip install -r requirements.txt

Usage

Open the model.ipynb file in Jupyter Notebook or JupyterLab to see the implementation and results of the K-Means clustering model.

License

This project is licensed under the MIT License.
