Allows processing of images with MMSegmentation.
Uses PyTorch 1.9.0 and CUDA 11.1.
MMSegmentation github repo tag/hash:
v1.1.0
00790766aff22bd6470dbbd9e89ea40685008395
and timestamp:
July 4th, 2023
- Log into registry using public credentials:

  docker login -u public -p public public.aml-repo.cms.waikato.ac.nz:443
- Pull and run image (adjust volume mappings `-v`):

  docker run --gpus=all --shm-size 8G \
    -v /local/dir:/container/dir \
    -it public.aml-repo.cms.waikato.ac.nz:443/open-mmlab/mmsegmentation:1.1.0_cuda11.1
- Pull and run image from Docker Hub (adjust volume mappings `-v`):

  docker run --gpus=all --shm-size 8G \
    -v /local/dir:/container/dir \
    -it waikatodatamining/mmsegmentation:1.1.0_cuda11.1
- Build the image from the Dockerfile (from within /path_to/mmsegmentation/1.1.0_cuda11.1):

  docker build -t mmseg .

- Run the container:

  docker run --gpus=all --shm-size 8G -v /local/dir:/container/dir -it mmseg

  `/local/dir:/container/dir` maps a local disk directory into a directory inside the container.
- Build the image for publishing:

  docker build -t mmsegmentation:1.1.0_cuda11.1 .
- Tag:

  docker tag \
    mmsegmentation:1.1.0_cuda11.1 \
    public-push.aml-repo.cms.waikato.ac.nz:443/open-mmlab/mmsegmentation:1.1.0_cuda11.1

- Push:

  docker push public-push.aml-repo.cms.waikato.ac.nz:443/open-mmlab/mmsegmentation:1.1.0_cuda11.1

  If the error "no basic auth credentials" occurs, run the following and enter the username/password when prompted:

  docker login public-push.aml-repo.cms.waikato.ac.nz:443
- Tag for Docker Hub:

  docker tag \
    mmsegmentation:1.1.0_cuda11.1 \
    waikatodatamining/mmsegmentation:1.1.0_cuda11.1

- Push to Docker Hub:

  docker push waikatodatamining/mmsegmentation:1.1.0_cuda11.1

  If the error "no basic auth credentials" occurs, run the following and enter the username/password when prompted:

  docker login
The following scripts are available:

- `mmseg_config` - for expanding/exporting default configurations (calls `print_config2.py`)
- `mmseg_train` - for training a model (calls `/mmsegmentation/tools/train.py`)
- `mmseg_predict_poll` - for applying a model to images using file-polling (calls `/mmsegmentation/tools/predict_poll.py`)
- `mmseg_predict_redis` - for applying a model to images via a Redis backend; add `--net=host` to the Docker options (calls `/mmsegmentation/tools/predict_redis.py`)
- `mmseg_onnx` - for exporting PyTorch models to ONNX (calls `/mmsegmentation/tools/pytorch2onnx.py`)
- `indexed-png-stats` - can output statistics for datasets, i.e., listing the pixel counts per PNG index (for quality checks)
- The annotations must be in indexed PNG format. You can use wai.annotations to convert your data from other formats.
- Store class names or label strings in an environment variable called `MMSEG_CLASSES` (inside the container):

  export MMSEG_CLASSES=\'class1\',\'class2\',...
- Alternatively, store the labels in a text file, with the labels separated by commas, and have the `MMSEG_CLASSES` environment variable point at the file:

  - The labels are stored in `/data/labels.txt`, either as a comma-separated list (`class1,class2,...`) or one label per line.
  - Export `MMSEG_CLASSES` as follows:

    export MMSEG_CLASSES=/data/labels.txt
- Use `mmseg_config` to export the config file (of the model you want to train) from `/mmsegmentation/configs` (inside the container), then follow the instructions below.

- Train:

  mmseg_train /path_to/your_data_config.py \
    --work-dir /where/to/save/everything
- Predict and produce PNG files:

  mmseg_predict_poll \
    --model /path_to/epoch_n.pth \
    --config /path_to/your_data_config.py \
    --prediction_in /path_to/test_imgs \
    --prediction_out /path_to/test_results

  Run with `-h` for all available options.
- Predict via Redis backend:

  You need to start the docker container with the `--net=host` option if you are using the host's Redis server. The following command listens for images coming through on channel `images` and broadcasts predicted images on channel `predictions`:

  mmseg_predict_redis \
    --model /path_to/epoch_n.pth \
    --config /path_to/your_data_config.py \
    --redis_in images \
    --redis_out predictions

  Run with `-h` for all available options.
You can output example config files (stored under `/mmsegmentation/configs` for the various network types) using:

  mmseg_config \
    --config /mmsegmentation/configs/some/config.py \
    --output_config /output/dir/config.py

You can browse the available config files in the MMSegmentation repository under `configs/`.
- If necessary, change `num_classes` to the number of labels (background not counted).
- Change `dataset_type` to `ExternalDataset`, as well as any occurrences of `type` in the `train`, `test` and `val` sections of the `data` dictionary.
- Change `data_root` to the root path of your dataset (the directory containing the `train` and `val` directories).
- In `train_pipeline`, `val_pipeline` and `test_pipeline`: change `img_scale` to your preferred values. The image will be scaled by the smaller of (larger_scale/larger_image_side) and (smaller_scale/smaller_image_side); e.g., with `img_scale=(2048, 512)` a 1024x768 image gets scaled by min(2048/1024, 512/768) ≈ 0.67.
- Adapt `img_path` and `seg_map_path` (as part of `data_prefix`) to suit your dataset, and remove redundant, nested `data_root` properties.
- The interval in the `checkpoint` default hook determines how often models are saved during training (e.g., 4000 saves a model after every 4000 iterations).
- In the `train_cfg` property, change `max_iters` to the number of iterations you want to train the model for.
- Change `load_from` to the file name of the pre-trained network that you downloaded from the model zoo, instead of downloading it automatically (see the sketch after this list).
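As an illustration, the relevant lines of an edited config could look roughly like the sketch below. All paths and values are placeholders and the exact nesting depends on the config you exported with `mmseg_config`, so treat this as a starting point rather than a working configuration.

```python
# Sketch only: placeholder paths/values; the exact structure depends on
# the config that was exported with mmseg_config.
dataset_type = 'ExternalDataset'
data_root = '/data'   # directory containing the train/ and val/ sub-directories

# inside the train/val/test dataset definitions, adjust the prefixes, e.g.:
#   data_prefix=dict(img_path='train', seg_map_path='train')

# save a model every 4000 iterations
default_hooks = dict(
    checkpoint=dict(type='CheckpointHook', by_epoch=False, interval=4000))

# total number of training iterations
train_cfg = dict(type='IterBasedTrainLoop', max_iters=40000, val_interval=4000)

# pre-trained checkpoint downloaded from the model zoo (placeholder file name)
load_from = '/data/pretrained/pspnet_r50-d8_512x1024_40k_cityscapes.pth'
```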
You don't have to copy the config file back, just point at it when training.
NB: A fully expanded config file will get placed in the output directory with the same name as the config plus the extension .full.
When running the docker container as a regular user, you will want to set the correct user and group on the files generated by the container (i.e., the user:group launching the container):

  docker run -u $(id -u):$(id -g) -e USER=$USER ...
PyTorch downloads base models, if necessary. However, when using Docker, this means that models would get downloaded with each container run, using up unnecessary bandwidth and slowing down the startup. To avoid this, you can map a directory on the host machine to cache the base models for all processes (usually, there would be only one model being trained concurrently):

  -v /somewhere/local/cache:/.cache

Or specifically for PyTorch:

  -v /somewhere/local/cache/torch:/.cache/torch
NB: When running the container as root rather than a specific user, the internal directory will have to be prefixed with `/root` (i.e., `/root/.cache`).
You can use simple-redis-helper to broadcast images and listen for image segmentation results when testing.
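If you prefer a small script over simple-redis-helper, the following sketch uses the `redis` Python package to publish a test image on the `images` channel and save the first result received on the `predictions` channel, matching the channel names from the `mmseg_predict_redis` example above. The file names are placeholders and the assumption that both channels carry raw image bytes is mine.

```python
# Minimal test client for the Redis-based prediction (assumes a Redis server
# on localhost:6379 and a container started with --net=host).
import redis

r = redis.Redis(host="localhost", port=6379)

# subscribe to the prediction channel before sending the image
pubsub = r.pubsub()
pubsub.subscribe("predictions")

# broadcast a test image on the "images" channel (placeholder file name)
with open("test.png", "rb") as f:
    r.publish("images", f.read())

# wait for the predicted segmentation and write it to disk
for message in pubsub.listen():
    if message["type"] == "message":
        with open("prediction.png", "wb") as f:
            f.write(message["data"])
        break
```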
You can test the inference of your container with the image_demo2.py script as follows:
- Create a test directory and change into it:

  mkdir test_inference
  cd test_inference

- Create the cache directory:

  mkdir -p cache/torch

- Start the container in interactive mode:

  docker run --gpus=all --shm-size 8G -u $(id -u):$(id -g) -e USER=$USER \
    -v `pwd`:/workspace \
    -v `pwd`/cache:/.cache \
    -v `pwd`/cache/torch:/.cache/torch \
    -it public.aml-repo.cms.waikato.ac.nz:443/open-mmlab/mmsegmentation:1.1.0_cuda11.1

- Download a pretrained model:

  cd /workspace
  mim download mmsegmentation --config pspnet_r50-d8_512x1024_40k_cityscapes --dest .

- Perform inference:

  python /mmsegmentation/demo/image_demo2.py \
    --img /mmsegmentation/demo/demo.png \
    --config /mmsegmentation/configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py \
    --checkpoint pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
    --device cuda:0 \
    --output_file /workspace/demo_out.png

- The result of the segmentation is saved as `test_inference/demo_out.png` (in grayscale).
- Training results in a core dump with the following error message:

  File "/mmsegmentation/mmseg/models/losses/accuracy.py", line 49, in accuracy
      correct = correct[:, target != ignore_index]

  Check that your annotation PNG files all have the correct indices in their palette (the `indexed-png-stats` script or the Pillow sketch below can help).

- Training with only a single class: set `num_classes=2` and add the parameter `use_sigmoid=False` to the loss function (see the config sketch below).
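For the palette issue above, besides the `indexed-png-stats` script you can inspect an annotation with a quick Pillow/NumPy check; the file path below is just a placeholder:

```python
# Show which palette indices an annotation PNG actually uses
# (the file path is a placeholder).
from PIL import Image
import numpy as np

ann = Image.open("/data/train/some_annotation.png")
print(ann.mode)                   # should be "P" for an indexed PNG
print(np.unique(np.array(ann)))   # palette indices present in the image
```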
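For the single-class case, the relevant part of the model config could look roughly like this sketch; the exact head and loss structure depends on the config you exported, so adapt accordingly:

```python
# Sketch only: the exact head/loss structure depends on the exported config.
model = dict(
    decode_head=dict(
        num_classes=2,   # as described above
        loss_decode=dict(type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))
```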