Explorytics is a Python library designed to simplify the process of exploratory data analysis (EDA). With an intuitive interface and powerful visualization tools, it provides quick insights into datasets, helping you understand distributions, correlations, and outliers with ease.
- Comprehensive Data Analysis: Perform statistical and visual analysis in a few lines of code.
- Interactive Visualizations: Generate dynamic plots for distributions, correlations, and relationships.
- Outlier Detection: Identify and explore outliers across various features.
- User-Friendly API: Designed for simplicity and ease of use, even for beginners.
Install Explorytics using pip:
pip install explorytics
Here's a quick example of how to use Explorytics with the Wine dataset from scikit-learn:
# Import required libraries
import pandas as pd
from sklearn.datasets import load_wine
from explorytics import DataAnalyzer
# Load the wine dataset
wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)
df['wine_class'] = wine.target
# Initialize the analyzer
analyzer = DataAnalyzer(df)
# Perform analysis
results = analyzer.analyze()
# Generate a distribution plot
analyzer.visualizer.plot_distribution('alcohol', kde=True).show()
# Generate a correlation heatmap
analyzer.visualizer.plot_correlation_matrix().show()
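As a rough illustration of what a correlation heatmap summarizes, the sketch below computes a pairwise correlation matrix using only the standard library. It assumes the Pearson coefficient; which coefficient Explorytics uses internally is not stated here. The helper names `pearson` and `correlation_matrix` are hypothetical and not part of the Explorytics API.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each pair of column names to their Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Toy data for illustration only (not taken from the Wine dataset)
    toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [8, 6, 4, 2]}
    for name, row in correlation_matrix(toy).items():
        print(name, {k: round(v, 3) for k, v in row.items()})
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.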
The complete documentation is available here. It includes details on:
- Installation and setup
- Usage examples
- API references for key classes and methods
- Advanced configuration options
See it in action on Google Colab.
Explore the examples folder for Jupyter notebooks showcasing various use cases, including:
- Basic data exploration
- Advanced feature relationships
- Outlier detection and analysis
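For readers new to outlier detection, here is a minimal sketch of one common rule: flag values more than 1.5 times the interquartile range (IQR) beyond the quartiles. This is a generic technique shown for orientation, not necessarily the method Explorytics implements. The function name `iqr_outliers` and the multiplier `k` are assumptions made for this example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points: Q1, median, Q3
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Toy measurements for illustration only
    readings = [10, 12, 11, 13, 12, 11, 10, 95]
    print(iqr_outliers(readings))
```

A larger `k` makes the rule more tolerant, and a smaller one flags more points.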
We welcome contributions! If you'd like to contribute:
- Fork the repository.
- Create a new branch: git checkout -b feature-name
- Make your changes and commit: git commit -m 'Add feature name'
- Push to the branch: git push origin feature-name
- Open a pull request.
Please ensure your code adheres to the existing style and includes tests for any new functionality.
If you're using version 0.1.0, 0.1.1, or 0.1.2, please upgrade to version 0.1.4 or later. These earlier versions had critical issues that resulted in incomplete installations and missing files. The package structure was not correctly bundled, leading to missing dependencies and modules that prevented users from fully utilizing the library.
In versions 0.1.0, 0.1.1, and 0.1.2, the packaging process had several problems, resulting in missing files (such as the visualization and helper modules) and an incorrect package structure. The issues, by version, were:
- Missing Visualizations and Helper Files (Version 0.1.0): The visualizations and helpers modules were not included in the package, causing errors like: ModuleNotFoundError: No module named 'explorytics.visualizations'
- Incorrect Package Structure (Version 0.1.1): Despite efforts to fix the structure, important directories such as core, visualizations, and utils were still missing in the final distribution.
- Dependency Issues (Version 0.1.2): There were problems with dependencies not being correctly listed in setup.py, causing compatibility issues with different Python environments.
If you are facing any of these issues, upgrade to version 0.1.4 or later to resolve them.
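To check whether an environment is affected, a small script along these lines may help. It is a sketch using only the standard library: it reads the installed version from package metadata and compares it with 0.1.4. The helper names (`parse_version`, `needs_upgrade`, `check_explorytics`) are hypothetical, and the parser handles only simple dotted numeric versions.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First version the notes above recommend
MIN_VERSION = (0, 1, 4)


def parse_version(text):
    """Turn a string like '0.1.2' into a tuple like (0, 1, 2)."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)  # pad short versions such as '0.1'
    return tuple(numbers)


def needs_upgrade(installed):
    """True if the given version string is older than MIN_VERSION."""
    return parse_version(installed) < MIN_VERSION


def check_explorytics():
    """Describe the state of the installed explorytics package."""
    try:
        installed = version("explorytics")
    except PackageNotFoundError:
        return "explorytics is not installed"
    if needs_upgrade(installed):
        return f"explorytics {installed} is affected; run: pip install --upgrade explorytics"
    return f"explorytics {installed} is 0.1.4 or later"


if __name__ == "__main__":
    print(check_explorytics())
```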
Explorytics is licensed under the MIT License. See the LICENSE file for more details.
This library was inspired by a course I was pursuing on Coursera: Exploratory Data Analysis for Machine Learning. Special thanks to the open-source community for providing inspiration and support.
Start exploring your data today with Explorytics!