This repository provides a framework for exploring label efficiency with Semi-Supervised and Self-Supervised Learning, two families of techniques that exploit unlabeled data during training. Semi-supervised methods improve performance by generating pseudo-labels for unlabeled samples, while self-supervised methods create artificial labels from the data itself so the model can learn useful representations. The framework supports a wide range of applications and tasks and ships with detailed documentation for projects that want to adopt these techniques.
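As a concrete illustration of the pseudo-labeling idea mentioned above, the snippet below is a minimal, generic sketch in PyTorch (an assumed dependency here for illustration) and is not this repository's actual implementation: the model's confident predictions on unlabeled data are reused as training targets.

```python
# Minimal, generic sketch of pseudo-labeling with PyTorch.
# This is an illustration only, NOT this repository's implementation.
import torch
import torch.nn.functional as F

def pseudo_label(model, unlabeled_inputs, threshold=0.95):
    """Keep only samples the model predicts with high confidence and
    return them together with their predicted classes as pseudo-labels."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_inputs), dim=1)
        confidence, pseudo_targets = probs.max(dim=1)
    mask = confidence >= threshold  # discard low-confidence predictions
    return unlabeled_inputs[mask], pseudo_targets[mask]
```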
## Table of Contents

- License
- Installation
- Getting Started
- Usage
- Project Structure
- Documentation
- Contributing
- Acknowledgments
## License

This project is licensed under the MIT License - see the LICENSE file for details.

Please **acknowledge the author** if you use this repository in your project by including a link back to it.
## Installation

The prerequisites for this project are:

- Python 3.11+
- pip

To install this project, first clone the repository:

```bash
git clone https://github.com/xico2001pt/exploring-label-efficiency
```

Then, install the dependencies:

```bash
pip install -r requirements.txt
```
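Optionally, you can sanity-check the environment after installation. The snippet below assumes a PyTorch-based setup (an assumption; consult requirements.txt for the actual dependency list):

```python
# Optional post-install sanity check. The torch import is an assumption
# based on a typical deep-learning setup; see requirements.txt for the
# real dependency list.
import sys

assert sys.version_info >= (3, 11), "Python 3.11+ is required"

import torch  # replace with any package from requirements.txt
print("PyTorch", torch.__version__, "- CUDA available:", torch.cuda.is_available())
```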
## Getting Started

To adapt this project to your needs, it is recommended to start with the documentation file, which gives a brief overview of the available documentation and links to its different sections.
## Usage

There are four tool scripts in this repository, corresponding to the three learning paradigms plus a test script:

- `sl_train`
- `semisl_train`
- `selfsl_train`
- `test`
To train or test a model, run the following command, where `{TOOL_SCRIPT}` is one of the tools above and `{CONFIG_PATH}` is the path to the configuration file:

```bash
python -m src.tools.{TOOL_SCRIPT} --config experiments/{CONFIG_PATH}
```
For example, to train a model on the CIFAR-10 dataset with Supervised Learning, the following command can be used:

```bash
python -m src.tools.sl_train --config experiments/sl/cifar10/wideresnet/sl_cifar10_wideresnet.yaml
```
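Each experiment is described by a YAML configuration file. The snippet below is only a generic sketch of how such a file can be inspected with PyYAML; the actual keys and their meaning are defined by this repository's code and documentation, not by this example:

```python
# Generic sketch for inspecting an experiment configuration with PyYAML.
# The keys and their meaning depend on this repository's schema; nothing
# here is specific to its actual loading code.
import yaml

def inspect_config(config_path: str) -> None:
    """Print the top-level keys of an experiment YAML file."""
    with open(config_path) as f:
        config = yaml.safe_load(f)
    print(sorted(config))

# Example (path taken from the command above; adjust it to where the
# experiment files live in your checkout):
# inspect_config("experiments/sl/cifar10/wideresnet/sl_cifar10_wideresnet.yaml")
```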
## Project Structure

```
exploring-label-efficiency/
├── configs/                 # holds the configuration files
│   ├── configs/             # configuration files for the experiments
│   ├── datasets.yaml
│   ├── losses.yaml
│   ├── metrics.yaml
│   ├── models.yaml
│   ├── optimizers.yaml
│   ├── schedulers.yaml
│   ├── selfsl_methods.yaml
│   ├── semisl_methods.yaml
│   └── stop_conditions.yaml
├── data/                    # default directory for storing input data
├── docs/                    # documentation files
├── logs/                    # default directory for storing logs
├── src/                     # source code
│   ├── core/                # contains the core functionalities
│   ├── datasets/            # contains the datasets
│   ├── methods/             # contains the SemiSL and SelfSL methods
│   ├── models/              # contains the models
│   ├── tools/               # scripts for training, testing, etc.
│   │   ├── selfsl_train.py
│   │   ├── semisl_train.py
│   │   ├── sl_train.py
│   │   └── test.py
│   ├── trainers/            # contains the trainer classes
│   └── utils/               # utility functions
├── weights/                 # default directory for storing model weights
└── requirements.txt         # project dependencies
```
## Documentation

Read the documentation for more details about the project and all the sections mentioned above.
## Contributing

If you want to contribute to this project, please contact the author or open a pull request with a description of the feature or bug fix.
## Acknowledgments

This repository was developed by Francisco Cerqueira and is based on this template.