RVT: Recurrent Vision Transformers for Object Detection with Event Cameras

Image from official RVT implementation

This is the modified RVT version for the CVPR 2024 paper eTraM: Event-based Traffic Monitoring Dataset.

Conda Installation

We highly recommend using Mambaforge to reduce the installation time.

conda env create -f environment.yaml

If this fails, you can follow the original installation steps:

conda create -y -n rvt python=3.9 pip
conda activate rvt
conda config --set channel_priority flexible

CUDA_VERSION=11.8

conda install -y h5py=3.8.0 blosc-hdf5-plugin=1.0.0 \
hydra-core=1.3.2 einops=0.6.0 torchdata=0.6.0 tqdm numba \
pytorch=2.0.0 torchvision=0.15.0 pytorch-cuda=$CUDA_VERSION \
-c pytorch -c nvidia -c conda-forge

python -m pip install pytorch-lightning==1.8.6 wandb==0.14.0 \
pandas==1.5.3 plotly==5.13.1 opencv-python==4.6.0.66 tabulate==0.9.0 \
pycocotools==2.0.6 bbox-visualizer==0.1.0 StrEnum==0.4.10
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Detectron2 is not strictly required but speeds up the evaluation.
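
After installation, it can help to confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of this repository: the `check_versions` helper is hypothetical, and the `EXPECTED` table copies only a subset of the pins from the commands above.

```python
from importlib import metadata

# A subset of the pins from the install commands above (illustrative only).
EXPECTED = {
    "pytorch-lightning": "1.8.6",
    "wandb": "0.14.0",
    "pandas": "1.5.3",
    "hydra-core": "1.3.2",
    "einops": "0.6.0",
}


def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

Run it inside the activated `rvt` environment. If every listed package matches its pin, it prints nothing.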

Evaluation

Required Data

To evaluate or train RVT on eTraM you will need to download the eTraM dataset from Link to Dataset.

The preprocessed format is similar to the gen4 (1 Mpx) format, with changes to the class labels.

Run the following command to preprocess the dataset into the required format:

python preprocess_dataset.py <DATA_IN_PATH> \
<DATA_OUT_PATH> \
conf_preprocess/representation/stacked_hist.yaml \
conf_preprocess/extraction/const_duration.yaml \
conf_preprocess/filter_gen4.yaml -ds gen4 -np <N_PROCESSES> # we use the same preprocessing as gen4 with modified class labels

Pre-trained Checkpoint

The pre-trained checkpoint of RVT-base on eTraM is available here for your reference.

Set DATA_DIR to the path of the eTraM dataset directory.
Set CKPT_PATH to the path of the checkpoint that matches your choice of model and dataset.
Set
- MDL_CFG=base, or
- MDL_CFG=small, or
- MDL_CFG=tiny
to load either the base, small, or tiny model configuration.
Set
- USE_TEST=1 to evaluate on the test set, or
- USE_TEST=0 to evaluate on the validation set.
Set GPU_ID to the PCI bus ID of the GPU that you want to use, e.g. GPU_ID=0. Only a single GPU is supported for evaluation.

python validation.py dataset=gen4 dataset.path=${DATA_DIR} checkpoint=${CKPT_PATH} \
use_test_set=${USE_TEST} hardware.gpus=${GPU_ID} +experiment/gen4="${MDL_CFG}.yaml" \
batch_size.eval=8 model.postprocess.confidence_threshold=0.001
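
The variables described above map one-to-one onto the overrides in the command. As an illustration, the following sketch assembles the same command line in Python. The `build_validation_cmd` helper and the example paths are hypothetical and not part of this repository. The override names are copied from the command above.

```python
import shlex

VALID_MODEL_CONFIGS = {"base", "small", "tiny"}


def build_validation_cmd(data_dir, ckpt_path, mdl_cfg="base", use_test=True, gpu_id=0):
    """Assemble the validation.py command shown above as an argument list."""
    if mdl_cfg not in VALID_MODEL_CONFIGS:
        raise ValueError(f"MDL_CFG must be one of {sorted(VALID_MODEL_CONFIGS)}")
    return [
        "python", "validation.py",
        "dataset=gen4",
        f"dataset.path={data_dir}",
        f"checkpoint={ckpt_path}",
        f"use_test_set={int(use_test)}",   # 1 = test set, 0 = validation set
        f"hardware.gpus={gpu_id}",         # a single GPU ID for evaluation
        f"+experiment/gen4={mdl_cfg}.yaml",
        "batch_size.eval=8",
        "model.postprocess.confidence_threshold=0.001",
    ]


# Placeholder paths; replace them with your own dataset and checkpoint locations.
cmd = build_validation_cmd("/path/to/etram", "/path/to/checkpoint.ckpt", mdl_cfg="base")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root if you prefer to launch evaluation from a script.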

Training

Set DATA_DIR to the path of the eTraM dataset directory.
Set
- MDL_CFG=base, or
- MDL_CFG=small, or
- MDL_CFG=tiny
to load either the base, small, or tiny model configuration
Set GPU_IDS to the PCI bus IDs of the GPUs that you want to use, e.g. GPU_IDS=[0,1] to use GPUs 0 and 1. Passing a list of IDs enables single-node multi-GPU training. Pay attention to the batch size, which is defined per GPU.
Set BATCH_SIZE_PER_GPU such that the effective batch size matches the parameters below. The effective batch size is (batch size per GPU) * (number of GPUs).
If you would like to change the effective batch size, we found the following learning rate scaling to work well for all models on both datasets:

lr = 2e-4 * sqrt(effective_batch_size/8)

The training code uses W&B for logging during training, so we assume that you have a W&B account.
- The training script below will create a new project called RVT. Adapt the project name and group name if necessary.

python train.py model=rnndet dataset=gen4 dataset.path=<DATA_DIR> \
	wandb.project_name=<WANDB_NAME> wandb.group_name=<WANDB_GRP> \
	+experiment/gen4="default.yaml" hardware.gpus=0 batch_size.train=6 \
	batch_size.eval=2 hardware.num_workers.train=4 hardware.num_workers.eval=3 \
	training.max_epochs=20 dataset.train.sampling=stream +model.head.num_classes=3
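
The learning rate scaling rule given above can be written as a short helper. This is a minimal sketch: the `scaled_lr` function is hypothetical and not part of the training code, and it only evaluates the formula lr = 2e-4 * sqrt(effective_batch_size/8).

```python
import math

BASE_LR = 2e-4        # learning rate at the reference effective batch size
BASE_BATCH_SIZE = 8   # reference effective batch size from the formula above


def scaled_lr(batch_size_per_gpu, num_gpus):
    """Return the learning rate for the given per-GPU batch size and GPU count."""
    if batch_size_per_gpu <= 0 or num_gpus <= 0:
        raise ValueError("batch size and GPU count must be positive")
    # Effective batch size = (batch size per GPU) * (number of GPUs).
    effective_batch_size = batch_size_per_gpu * num_gpus
    return BASE_LR * math.sqrt(effective_batch_size / BASE_BATCH_SIZE)


if __name__ == "__main__":
    for per_gpu, gpus in [(8, 1), (4, 2), (6, 1), (8, 4)]:
        print(f"{per_gpu} per GPU x {gpus} GPU(s): lr = {scaled_lr(per_gpu, gpus):.3e}")
```

For example, an effective batch size of 8 gives back the base rate of 2e-4, and a batch size of 6 on a single GPU gives roughly 1.73e-4 under this rule.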

Code Acknowledgments

This project has used code from the following projects:

- RVT for the official RVT implementation in PyTorch
- timm for the MaxViT layer implementation in PyTorch
- YOLOX for the detection PAFPN/head

References

@InProceedings{Gehrig_2023_CVPR,
  author  = {Mathias Gehrig and Davide Scaramuzza},
  title   = {Recurrent Vision Transformers for Object Detection with Event Cameras},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year    = {2023},
}
