Music genre classification comparing RandomForest, KNN, and SVM models with optimised feature selection, achieving 91% accuracy using KNN on the GTZAN dataset

lukasz-iskierka/ml-music-classification

Music Genre Classification

Overview

A machine learning project comparing the effectiveness of RandomForest, K-Nearest Neighbours (KNN), and Support Vector Machine (SVM) models for music genre classification. Through systematic feature selection and model optimisation, the KNN classifier achieved 91% accuracy on the GTZAN dataset.

Model Development

Feature Selection and Optimisation

  1. Initial model training with the full feature set
  2. Feature importance analysis using RandomForestClassifier
  3. Iterative feature selection based on importance rankings:
    • Top 3 features: chroma_stft_mean, spec_bandwidth_mean, rolloff_mean
    • Additional significant features: mfcc1_mean, mfcc2_mean, mfcc3_mean
  4. Model retraining with optimised feature subset
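
The ranking and selection steps above can be sketched as follows. This is a minimal illustration and not the notebook's code: it runs on synthetic data from `make_classification`, the column names `feature_0` to `feature_7` are placeholders, and the helper `rank_features` and the cut-off of three features are assumptions made for the example.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rank_features(X: pd.DataFrame, y, random_state: int = 0) -> pd.Series:
    """Fit a random forest and return its feature importances, highest first."""
    forest = RandomForestClassifier(n_estimators=100, random_state=random_state)
    forest.fit(X, y)
    importances = pd.Series(forest.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False)


# Synthetic stand-in for the feature table (hypothetical column names).
X_arr, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X = pd.DataFrame(X_arr, columns=[f"feature_{i}" for i in range(8)])

ranking = rank_features(X, y)

# Keep only the highest-ranked columns for retraining (cut-off chosen for illustration).
top_features = ranking.head(3).index.tolist()
X_reduced = X[top_features]
```

With the real data, `X` would hold the extracted audio features and `top_features` would be compared against the feature names listed above.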

Models Evaluated

  • K-Nearest Neighbours (KNN)

    • Best performing model: 91% accuracy
    • Hyperparameters tuned with RandomizedSearchCV
    • Robust performance across genres
  • Random Forest

    • Used for initial feature importance analysis
    • Secondary classification model
  • Support Vector Machine (SVM)

    • Comparative baseline model
    • Performance evaluation with different kernels
    • Computational efficiency considerations
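
As a rough sketch of how a randomised hyperparameter search for KNN might look in scikit-learn: the search space, the `StandardScaler` step, the train/test split, and the synthetic data below are all assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the selected feature subset.
X, y = make_classification(
    n_samples=400,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# KNN is distance-based, so features are standardised inside the pipeline
# (an assumption; the notebook may preprocess differently).
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical search space.
param_distributions = {
    "knn__n_neighbors": list(range(1, 21)),
    "knn__weights": ["uniform", "distance"],
    "knn__p": [1, 2],  # 1 = Manhattan distance, 2 = Euclidean distance
}

search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# Accuracy of the refitted best pipeline on the held-out split.
test_accuracy = search.score(X_test, y_test)
```

Placing the scaler inside the pipeline means it is refitted on each training fold, which avoids leaking validation data into the scaling statistics.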

Results Summary

  • KNN achieved the highest accuracy (91%) with the optimised feature set
  • Reducing the original feature set to the top-ranked features maintained accuracy
  • Cross-validation scores demonstrate model stability
  • A detailed confusion matrix highlights per-genre performance
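
One way to produce this kind of summary is sketched below: cross-validated accuracy for each of the three model families, followed by a confusion matrix for KNN. The data is synthetic, and the model settings (five neighbours, an RBF kernel, 100 trees, 5-fold CV) are placeholders rather than the notebook's configuration, so the numbers it yields say nothing about GTZAN.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class stand-in for the genre data.
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Placeholder configurations for the three model families.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean and standard deviation of 5-fold accuracy; a small standard
# deviation is one indication of stability across folds.
cv_summary = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_summary[name] = (scores.mean(), scores.std())

# Out-of-fold predictions give one prediction per sample, so the
# confusion matrix covers the whole dataset.
predictions = cross_val_predict(models["knn"], X, y, cv=5)
matrix = confusion_matrix(y, predictions)

# Rows are true classes, so diagonal / row sum is per-class recall.
per_class_recall = matrix.diagonal() / matrix.sum(axis=1)
```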

Project Structure

.
├── features_3_sec.csv     # Feature set with 3-second windows
├── features_30_sec.csv    # Feature set with 30-second windows
├── music_genre_classification.ipynb    # Implementation and analysis
├── LICENCE
└── README.md
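
A minimal sketch of loading one of the CSV files into a feature matrix and label vector. The column names `filename`, `length`, and `label` are assumptions about the file layout and should be checked against the actual header; the helper `load_features` is hypothetical, and the two rows in the in-memory sample are made-up placeholder values, not data from the repository.

```python
import io

import pandas as pd


def load_features(source, label_column="label", drop_columns=("filename", "length")):
    """Read a feature CSV and split it into features X and labels y.

    `source` may be a file path or a file-like object. The column names
    are assumptions about the CSV layout; missing ones are ignored.
    """
    frame = pd.read_csv(source)
    y = frame[label_column]
    X = frame.drop(columns=[label_column, *drop_columns], errors="ignore")
    return X, y


# Made-up placeholder rows, used only to show the expected shape of the output.
sample_csv = io.StringIO(
    "filename,length,chroma_stft_mean,rolloff_mean,label\n"
    "a.wav,66149,0.35,3800.1,blues\n"
    "b.wav,66149,0.41,2900.7,jazz\n"
)
X, y = load_features(sample_csv)
```

Against the repository files, the call would be `load_features("features_3_sec.csv")` or `load_features("features_30_sec.csv")`.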

Technical Requirements

  • Python 3.x
  • scikit-learn
  • pandas
  • numpy
  • seaborn (visualisation)
  • matplotlib (visualisation)

Usage

  1. Clone the repository and change into its directory:
    git clone https://github.com/lukasz-iskierka/ml-music-classification.git
    cd ml-music-classification
  2. Install dependencies:
    pip install scikit-learn pandas numpy seaborn matplotlib notebook
  3. Run the Jupyter notebook for detailed analysis and results:
    jupyter notebook music_genre_classification.ipynb

Future Development

  • Ensemble method exploration
  • Additional feature engineering
  • Model optimisation for specific genre pairs
  • Performance optimisation for larger datasets

Licence

See LICENCE file for details.

Contact

For questions or suggestions, please open an issue in the GitHub repository.
