SpuGraph is a Python project that uses PyTorch, PyTorch Geometric, and PyTorch Lightning to implement graph neural network models for benchmarking spurious-correlation learning on graphs and related tasks. The project is structured to provide a modular, flexible way to experiment with different datasets, models, and backbones.
SpuGraph/
│
├── data/ # Dataset directory
├── logs/ # Logs generated by the training process
├── notebooks/ # Jupyter notebooks for experimentation and analysis
├── scripts/ # Additional scripts for utilities and setup
├── src/ # Source code for the project
│ ├── backbones/ # Backbone network implementations
│ ├── configs/ # Configuration files for models and experiments
│ ├── datasets/ # Dataset loading and processing modules
│ ├── models/ # Model implementations (e.g., DIR, CIGA, ERM, GSAT)
│ │ ├── ciga.py
│ │ ├── dir.py
│ │ ├── erm.py
│ │ ├── gsat.py
│ │ └── template.py
│ ├── main.py # Main script to run experiments
│ ├── pretrain.py # Pretraining routines
│ └── utils.py # Utility functions and classes
├── .gitignore # Git ignore file
└── README.md # This README file
To set up the SpuGraph project, follow these steps:
- Clone the repository:
    git clone git@github.com:TimeLovercc/SpuGraph.git
    cd SpuGraph
- Install the required dependencies (you may want to read setup.sh first to make any customizations):
    bash scripts/setup.sh
To run experiments with SpuGraph, use the main.py script in the src directory. You can specify various parameters and configurations through command-line arguments or configuration files.
Example usage:
python src/main.py --dataset_name <dataset> --backbone_name <backbone> --model_name <model> --seed <seed>
- For datasets, we provide spmotif; you need to specify the bias level in the config file in src/configs/.
- For backbones, we currently provide gcn, gin, spnet, and pna.
- For models, we currently provide dir, ciga, erm, and gsat.
- For model-specific settings, check the config file in src/configs/.
For example, to run the dir model on the spmotif dataset with the gin backbone, you can use the following command:
CUDA_VISIBLE_DEVICES=1 python src/main.py --dataset_name spmotif --backbone_name gin --model_name dir --seed 0
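The four flags in the command above could be parsed along the following lines. This is a minimal sketch, not the project's actual main.py: the `build_parser` helper, the `choices` restrictions, and the default seed are assumptions. Only the flag names and the dataset, backbone, and model names come from this README.

```python
import argparse

# Names taken from the lists in this README.
DATASETS = ["spmotif"]
BACKBONES = ["gcn", "gin", "spnet", "pna"]
MODELS = ["dir", "ciga", "erm", "gsat"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage example."""
    parser = argparse.ArgumentParser(description="Run a SpuGraph experiment")
    parser.add_argument("--dataset_name", choices=DATASETS, required=True)
    parser.add_argument("--backbone_name", choices=BACKBONES, required=True)
    parser.add_argument("--model_name", choices=MODELS, required=True)
    # Assumption: the seed defaults to 0 when it is omitted.
    parser.add_argument("--seed", type=int, default=0)
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--dataset_name", "spmotif", "--backbone_name", "gin",
     "--model_name", "dir", "--seed", "0"]
)
print(args.dataset_name, args.backbone_name, args.model_name, args.seed)
```

Restricting each flag with `choices` makes a misspelled name fail at startup with a clear message instead of partway through an experiment.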
The configs directory contains YAML files for different experimental setups. These files define parameters for datasets, models, backbones, and training procedures. You can create custom configuration files or modify existing ones to suit your experimental needs.
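One common way to combine a configuration file with per-experiment changes is to merge an override dictionary recursively into the defaults. The sketch below shows that idea with plain dictionaries standing in for a parsed YAML file. The keys (`dataset`, `bias`, `training`, `lr`, `epochs`) and the `merge_config` helper are hypothetical and are not taken from the files in src/configs/.

```python
import copy


def merge_config(base: dict, override: dict) -> dict:
    """Return a new dict where values in `override` replace those in `base`.

    Nested dictionaries are merged key by key; everything else is replaced.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, standing in for the contents of a parsed YAML file.
defaults = {
    "dataset": {"name": "spmotif", "bias": 0.5},
    "training": {"lr": 0.001, "epochs": 100},
}

# Change only the bias level; all other settings keep their defaults.
custom = merge_config(defaults, {"dataset": {"bias": 0.9}})
print(custom)
```

Because the merge returns a new dictionary, the defaults stay untouched and can be reused for the next experiment.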
The models directory contains implementations of different graph neural network models. You can extend these models or use them as templates to implement your own models.
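A registry that maps a model name to a class is one way to make a new model selectable by name while keeping shared logic in a base class. The following is a torch-free sketch of that pattern and not the code in src/models/: `MODEL_REGISTRY`, `register_model`, `BaseModel`, `loss_terms`, and `MyModel` are hypothetical names, the loss terms are illustrative only, and the real models presumably build on PyTorch and PyTorch Lightning rather than plain classes.

```python
from typing import Callable, Dict, List, Type

MODEL_REGISTRY: Dict[str, Type["BaseModel"]] = {}


def register_model(name: str) -> Callable[[type], type]:
    """Class decorator that records a model class under `name`."""
    def decorator(cls: type) -> type:
        if name in MODEL_REGISTRY:
            raise ValueError(f"model '{name}' is already registered")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


class BaseModel:
    """Hypothetical base class holding logic shared by all models."""

    def __init__(self, backbone_name: str) -> None:
        self.backbone_name = backbone_name

    def loss_terms(self) -> List[str]:
        # Subclasses list the loss terms they optimize.
        raise NotImplementedError

    def describe(self) -> str:
        # Shared code lives here once instead of in every model file.
        terms = " + ".join(self.loss_terms())
        return f"{type(self).__name__}({self.backbone_name}): {terms}"


@register_model("erm")
class ERM(BaseModel):
    def loss_terms(self) -> List[str]:
        return ["classification"]


@register_model("my_model")
class MyModel(BaseModel):
    def loss_terms(self) -> List[str]:
        return ["classification", "regularizer"]


# Look the class up by name, as a command-line value would be.
model = MODEL_REGISTRY["my_model"](backbone_name="gin")
print(model.describe())
```

With this layout, adding a model means writing one subclass and one decorator line; the lookup code does not change.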
- CIGA has many different backbones. Where should we put them? Maybe in backbones.
- CIGA and DIR have some methods with the same name. Should we put them in the same file?
- Should we add a function to select representation or raw? Maybe not.
- What if we have too much redundant code? Use a base class to reduce it.
- For the node encoder, there are three cases: 1, -1, and >1.
- For a new backbone, first define a new layer type (add edge_att in conv), then use the base class to implement the new backbone.
- How to solve the explain message problem?
Contributions to the SpuGraph project are welcome. Please follow the standard procedures for contributing to a GitHub project:
- Fork the repository.
- Create a new branch for your feature or fix.
- Submit a pull request with a clear description of your changes.
[Specify the license under which your project is released]