
SpuGraph is a framework for benchmarking graph spurious correlation learning and other related tasks.


TimeLovercc/SpuGraph


Spurious Graph Benchmark

Overview

SpuGraph is a Python project using PyTorch, PyTorch Geometric, and PyTorch Lightning to implement various graph neural network models for benchmarking graph spurious correlation learning and other related tasks. This project is structured to provide a modular and flexible way to experiment with different datasets, models, and backbones.

Project Structure

SpuGraph/
│
├── data/              # Dataset directory
├── logs/              # Logs generated by the training process
├── notebooks/         # Jupyter notebooks for experimentation and analysis
├── scripts/           # Additional scripts for utilities and setup
├── src/               # Source code for the project
│   ├── backbones/     # Backbone network implementations
│   ├── configs/       # Configuration files for models and experiments
│   ├── datasets/      # Dataset loading and processing modules
│   ├── models/        # Model implementations (e.g., DIR, CIGA, ERM, GSAT)
│   │   ├── ciga.py
│   │   ├── dir.py
│   │   ├── erm.py
│   │   ├── gsat.py
│   │   └── template.py
│   ├── main.py        # Main script to run experiments
│   ├── pretrain.py    # Pretraining routines
│   └── utils.py       # Utility functions and classes
├── .gitignore         # Git ignore file
└── README.md          # This README file

Installation

To set up the SpuGraph project, follow these steps:

  1. Clone the repository:

    git clone git@github.com:TimeLovercc/SpuGraph.git
    cd SpuGraph
    
  2. Install the required dependencies. Review scripts/setup.sh first in case you need to customize it for your environment:

    bash scripts/setup.sh
    

Usage

To run experiments with SpuGraph, use the main.py script in the src directory. You can specify various parameters and configurations through command-line arguments or configuration files.

Example usage:

python src/main.py --dataset_name <dataset> --backbone_name <backbone> --model_name <model> --seed <seed>
  • Datasets: spmotif is provided. Set its bias level in the config file in src/configs/.
  • Backbones: gcn, gin, spnet, and pna are currently provided.
  • Models: dir, ciga, erm, and gsat are currently provided.
  • Model-specific settings are in the config files in src/configs/.

For example, to run the dir model on the spmotif dataset with the gin backbone, use the following command:

CUDA_VISIBLE_DEVICES=1 python src/main.py --dataset_name spmotif --backbone_name gin --model_name dir --seed 0
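
To make the command-line interface concrete, here is a minimal sketch of how the four flags shown above could be parsed with argparse. The flag names and the option lists come from this README; the parser itself, the required/default settings, and the helper name build_parser are assumptions for illustration and may not match what src/main.py actually does.

```python
import argparse

# Option lists copied from the README; the real main.py may accept more.
DATASETS = ["spmotif"]
BACKBONES = ["gcn", "gin", "spnet", "pna"]
MODELS = ["dir", "ciga", "erm", "gsat"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run a SpuGraph experiment (sketch).")
    parser.add_argument("--dataset_name", choices=DATASETS, required=True)
    parser.add_argument("--backbone_name", choices=BACKBONES, required=True)
    parser.add_argument("--model_name", choices=MODELS, required=True)
    # Assumed default; the README does not say what happens without --seed.
    parser.add_argument("--seed", type=int, default=0)
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--dataset_name", "spmotif", "--backbone_name", "gin",
     "--model_name", "dir", "--seed", "0"]
)
print(args.dataset_name, args.backbone_name, args.model_name, args.seed)
```

Restricting each flag with choices makes a misspelled backbone or model name fail at startup rather than partway through a run.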

Configurations

The configs directory contains YAML files for different experimental setups. These files define parameters for datasets, models, backbones, and training procedures. You can create custom configuration files or modify existing ones to suit your experimental needs.
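
As an illustration of customizing a setup without editing the defaults in place, the sketch below applies a small set of overrides to a base configuration. The base dictionary stands in for a parsed YAML file; every key and value in it (dataset, bias, train, lr, epochs) is hypothetical, since this README does not show the actual schema used in src/configs/.

```python
import copy


def merge_config(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, or new keys: the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults standing in for a parsed YAML file.
base_config = {
    "dataset": {"name": "spmotif", "bias": 0.5},
    "train": {"lr": 0.001, "epochs": 100},
}

# Change only the bias level and the number of epochs.
custom = merge_config(base_config, {"dataset": {"bias": 0.9}, "train": {"epochs": 50}})
print(custom)
```

Because merge_config works on a deep copy, the base configuration is left untouched and can be reused for several experimental variants.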

Models

The models directory contains implementations of different graph neural network models. You can extend these models or use them as templates to implement your own models.
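
One way to extend a shared template is sketched below in plain Python: a base class holds the common training logic, and each model only defines its own loss. This is not the repository's code. The names BaseModel, register_model, MODEL_REGISTRY, and compute_loss are hypothetical, and the mean-squared-error loss is a stand-in so the example runs without PyTorch; the actual models build on PyTorch Lightning.

```python
from abc import ABC, abstractmethod

# Maps a model name (as passed on the command line) to its class.
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records a model under `name` (hypothetical helper)."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


class BaseModel(ABC):
    """Shared logic lives here; subclasses define only what differs."""

    def __init__(self, backbone_name: str):
        self.backbone_name = backbone_name

    @abstractmethod
    def compute_loss(self, predictions, targets):
        """Return a scalar loss for one batch."""

    def training_step(self, predictions, targets):
        # Common bookkeeping that every model would otherwise duplicate.
        loss = self.compute_loss(predictions, targets)
        return {"loss": loss, "backbone": self.backbone_name}


@register_model("erm")
class ERM(BaseModel):
    def compute_loss(self, predictions, targets):
        # Mean squared error as a stand-in for the real objective.
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


model = MODEL_REGISTRY["erm"](backbone_name="gin")
result = model.training_step([1.0, 2.0], [1.0, 4.0])
print(result)
```

With this layout, adding a model means writing one subclass and registering it under a new name, while the shared step logic stays in a single place.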

Problems

  1. CIGA has many different backbones. Where should they go? Possibly in backbones.
  2. CIGA and DIR share some method names. Should they live in the same file?
  3. Should we add a function to select between the representation and the raw input? Probably not.
  4. What if there is too much redundant code? Use a base class to reduce it.
  5. The node encoder has three cases to handle: 1, -1, and >1.
  6. To add a new backbone, first define the new layer type (add edge_att in conv), then implement the backbone on top of the base class.
  7. How should the explain message problem be solved?

Contributing

Contributions to the SpuGraph project are welcome. Please follow the standard procedures for contributing to a GitHub project:

  1. Fork the repository.
  2. Create a new branch for your feature or fix.
  3. Submit a pull request with a clear description of your changes.

License

[Specify the license under which your project is released]
