
Commit e8b90e7

Adjust variable names and readme
1 parent 9db7653 commit e8b90e7

2 files changed (+36 -23 lines)

WSIGraph.py (+6 -6)

```diff
@@ -547,8 +547,8 @@ def constructGraphFromDict(
 
     # print(f"{'Graph features cost':#^40s}, {time.time() - t3:*^10.2f}")
 
-    # Stroma barrier
-    # For each inflam node, add it to neoplaAddConnecGraph, compute barrier and delete
+    # Stroma blocker
+    # For each inflam node, add it to neoplaAddConnecGraph, compute blocker and delete
     # ! Why select the maximum subgraph, if distanceThreshold wasn't set appropriately, shortestPathsLymCancer would be empty
     t4 = time.time()
     centroid_T = globalGraph.induced_subgraph(neoplaIDs).vs['Centroid']
@@ -558,13 +558,13 @@ def constructGraphFromDict(
     STree = cKDTree(centroid_S)
     dis, pairindex_T = Ttree.query(centroid_I, k=1)
     paircentroid_T = np.array(centroid_T)[pairindex_T]
-    barrier = []
+    blocker = []
     for Tcoor, Icoor, r in tqdm(zip(centroid_I, paircentroid_T, dis), total=len(centroid_I)):
         set1 = set(STree.query_ball_point(Tcoor, r))
         set2 = set(STree.query_ball_point(Icoor, r))
-        barrier.append(len(set1 & set2))
-    globalGraph.vs[inflamIDs]['stromaBarrier'] = barrier
-    print(f"{'stroma barrier cost':#^40s}, {time.time() - t4:*^10.2f}")
+        blocker.append(len(set1 & set2))
+    globalGraph.vs[inflamIDs]['stromaBlocker'] = blocker
+    print(f"{'stroma blocker cost':#^40s}, {time.time() - t4:*^10.2f}")
 
     return globalGraph, edge_info
```
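For context, the renamed quantity counts, for each inflammatory nucleus, how many stroma nuclei lie within distance `r` of both that nucleus and its nearest tumor nucleus, where `r` is the distance between the two. Below is a minimal, self-contained sketch of the same computation on made-up coordinates; the three arrays stand in for the centroids pulled out of `globalGraph` in the real code:

```python
# Toy re-implementation of the stroma-blocker count from the diff above.
# The coordinate arrays are invented for illustration; in WSIGraph.py they
# come from the tumor (T), inflammatory (I) and stroma (S) graph vertices.
import numpy as np
from scipy.spatial import cKDTree

centroid_T = np.array([[0.0, 0.0], [50.0, 50.0]])               # tumor
centroid_I = np.array([[10.0, 0.0], [60.0, 50.0]])              # inflammatory
centroid_S = np.array([[5.0, 1.0], [8.0, -1.0], [55.0, 49.0]])  # stroma

Ttree = cKDTree(centroid_T)
STree = cKDTree(centroid_S)

# Nearest tumor nucleus for each inflammatory nucleus, and the distance r to it.
dis, pairindex_T = Ttree.query(centroid_I, k=1)
paircentroid_T = centroid_T[pairindex_T]

# A stroma nucleus "blocks" a pair if it sits within r of both endpoints,
# i.e. inside the lens-shaped intersection of the two radius-r balls.
blocker = []
for Tcoor, Icoor, r in zip(paircentroid_T, centroid_I, dis):
    set1 = set(STree.query_ball_point(Tcoor, r))
    set2 = set(STree.query_ball_point(Icoor, r))
    blocker.append(len(set1 & set2))

print(blocker)  # [2, 1] for the toy coordinates above
```

Since the count is the size of the intersection of two radius-`r` balls, it is symmetric in the two endpoints, which is why the argument order inside the `zip` in the diff does not affect the result.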

readme.md (+30 -17)
````diff
@@ -2,41 +2,54 @@
 
 ## Description
 
-sc-MTOP is an analysis framework based on deep learning and computational pathology. It consists of three steps: 1) Hover-Net based nuclear segmentation and classification; 2) Nuclear morphological and texture feature extraction; 3) Multi-level pairwise nuclear graph construction and spatial topological feature extraction. This framework aims to characterize the tumor ecosystem diversity at the single-cell level. We have a [demo](http://101.132.124.14/#/dashboard) website to show this work.
+sc-MTOP is an analysis framework based on deep learning and computational pathology. It consists of two steps: 1) nuclear segmentation and classification; 2) feature extraction. The framework aims to characterize the tumor ecosystem diversity at the single-cell level. We have established a [demo](http://101.132.124.14/#/dashboard) website to demonstrate these functions.
 
-This is the offical pytorch implementation of sc_MTOP. According to the above description, we use three functions to finish three steps: segment, feature and visual. Note that only segment step support batch processing.
+This is the official PyTorch implementation of sc-MTOP. Note that only the nuclear segmentation and classification step supports batch processing.
 
 <a id="hovernet"></a>
-In the segmentation steps, it uses the [HoVer-Net](https://github.com/vqdang/hover_net) model. We doesn't provide the model parameter in the source code because it is large. We use the pretrain model based on PanNuke dataset, you can download it from the this [url](https://drive.google.com/file/d/1SbSArI3KOOWHxRlxnjchO7_MbWzB4lNR/view) which provided from the repositories of HoVer-Net. It needs a folder path including WSI files of all samples as input. We use `.ndpi` file in our work, but we have not tried other formats of wsi files. In theory it supports all file formats allowed by HoVer-Net. The step gives a folder with json file which including all information of cell segmentation and classification.
+1. `F1_CellSegment.py` for nuclear segmentation and classification:
 
-In the feature step, you have to provide the path of `.json` file, WSI file, and output folder. You can also provide an `.xml` annotation file to only compute features for cells within the annotation range. The annotation is with square shape and white color. The code supports multiple annotation to calculate several area in the same sample. The `.xml` annotation file follow the format of [ImageScope](https://www.leicabiosystems.com/zh/digital-pathology/manage/aperio-imagescope/) software's `.xml` annotation. The annotation file from other software may not be support. In the output folder there will be a secondary folder named after the input WSI file. In this folder, it will include three `.csv` files of cell features and one `.csv` file of graph edge information.
+This step employs [HoVer-Net](https://github.com/vqdang/hover_net) for simultaneous nuclear segmentation and classification. The model is pre-trained on the PanNuke dataset and can be downloaded from this [url](https://drive.google.com/file/d/1SbSArI3KOOWHxRlxnjchO7_MbWzB4lNR/view).
 
-In the visual step, we make a visualization of the graph and segmentation. You have to provide the path of feature, WSI file and `.xml` file. The path of feature is the output of feature step, and this step needs the cell ID and the edge information in it. `.xml` file is the annotation file same as the feature step. We only plot the range in the annotation. The visual result is written in the annotation file and can be viewed in the ImageScope software. Note that if the annotation is too large, the ImageScope will failed to open the annotation file.
+Provide your WSI files as input. We use `.ndpi` WSI files in our work; in theory, the step supports all WSI file formats allowed by HoVer-Net. For each sample, the step outputs a `.json` file containing all nuclear segmentation and classification information.
+
+
+2. `F3_FeatureExtract.py` for feature extraction:
+
+This step extracts morphological, texture and topological features for individual tumor, inflammatory and stroma cells, which are the main cellular components of the breast cancer ecosystem.
+
+Provide your WSI files and the corresponding `.json` files output by the segmentation step as input. You may also define a region of interest (ROI) using an `.xml` annotation file generated by the [ImageScope](https://www.leicabiosystems.com/zh/digital-pathology/manage/aperio-imagescope/) software. For each sample, the feature extraction step outputs a folder containing four `.csv` data files. For each of the tumor, inflammatory and stroma cell types, one `.csv` file stores the features of all cells of that type, where each cell is identified by a unique cell ID together with its centroid's spatial coordinates. The fourth `.csv` file stores the edge information for the sample and characterizes each edge by the IDs of the cells it connects.
+
+3. `F4_Visualization.py` for visualization:
+
+We provide an additional function for visualizing the nuclear segmentation results and the nuclear graph.
+
+Provide the WSI files, the corresponding feature files output by the feature extraction step, and an `.xml` annotation file defining the ROI. The visualization results are written into the annotation file and can be viewed using the ImageScope software. Note that ImageScope may fail to open the annotation file if your ROI is too large.
 
 ## Requirements
 ### Packages and version
-The packages required have provided in the file `requirements.txt`
+The required packages are provided in the file `requirements.txt`.
 ### Operating systems
-The code have test in the Windows and Ubuntu 16.04.7 LTS. But the installation in the different operation systems can be different because of some packages.
+The code has been tested on Windows and Ubuntu 16.04.7 LTS. The installation may differ between operating systems because of some packages.
 ### Hardware
-The code include the deep learning network calculation, so it needs GPU with more than 8GB video memory. And the HoVer-Net needs SSD at least 100GB for cache. The requirement of RAM depends on the data size, but we suggest that it should be more than 128GB. The code have test on the GPU NVIDIA 2080TI, RAM 128GB.
+The code involves deep learning-based neural network inference, so it needs a GPU with more than 8 GB of video memory. HoVer-Net also needs an SSD with at least 100 GB for cache. The RAM requirement depends on the data size; we suggest more than 128 GB. The code has been tested on an NVIDIA RTX 2080 Ti GPU with 128 GB RAM.
 
 ## Installation
 To install the environment, you can run the command in the terminal:
 ```
 pip install -r requirements.txt
 ```
 The code requires the package `openslide-python`, but its installation differs between Linux and Windows. Please follow the [official documentation](https://openslide.org/api/python/) to install it, and import it in Python to make sure it works correctly.
-The HoVer-Net pre-train parameter is not provided in the source code. The file size is 144 MB. You can download it follow the [Description](#hovernet). Instead, you can download it in our [release](https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/hovernet_fast_pannuke_type_tf2pytorch.tar).
+The pre-trained HoVer-Net model is not provided in the source code due to its file size. You can download it by following the [Description](#hovernet), or from our [release](https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/hovernet_fast_pannuke_type_tf2pytorch.tar).
 
 ## Repository Structure
-`Hover`: the implementation of HoVer-Net, which clone from the offical [implementation](https://github.com/vqdang/hover_net)
+`Hover`: the implementation of HoVer-Net, cloned from the official [implementation](https://github.com/vqdang/hover_net)
 `main.py`: main function
-`F1_CellSegment.py`: segment step by calling `Hover`.
-`F3_FeatureExtract.py`: feature step by calling `WSIGraph.py`.
-`F4_Visualization.py`: visual step by calling `utils_xml.py`.
+`F1_CellSegment.py`: nuclear segmentation and classification by calling `Hover`.
+`F3_FeatureExtract.py`: feature extraction by calling `WSIGraph.py`.
+`F4_Visualization.py`: visualization by calling `utils_xml.py`.
 `utils_xml.py`: defines helper tools for the visualization.
-`WSIGraph.py`: define the process of feature extract.
+`WSIGraph.py`: defines the process of feature extraction.
 
 ## Usage Demo
 Here is a demo of using it in the bash terminal of Ubuntu. Some commands may not work in other terminals.
````
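As a quick orientation to the feature-extraction output described in the readme above, the four `.csv` files for a sample can be inspected with pandas. A minimal sketch, assuming a hypothetical per-sample folder name (the actual file names depend on your sample and are not specified by this commit):

```python
# Minimal sketch: inspect the four .csv files produced by the feature
# extraction step for one sample. The folder and file names here are
# hypothetical; substitute whatever your output folder actually contains.
from pathlib import Path
import pandas as pd

feature_dir = Path('./feature/Example001')     # per-sample output folder
csv_files = sorted(feature_dir.glob('*.csv'))  # expect four files
for f in csv_files:
    df = pd.read_csv(f)
    print(f.name, df.shape)                    # rows = cells (or edges)

# Per the readme: three tables carry per-cell features keyed by a unique
# cell ID plus centroid coordinates (one table per cell type), and the
# fourth table characterizes each graph edge by the IDs of the cells it
# connects.
```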
````diff
@@ -50,15 +63,15 @@ Download the demo data
 mkdir ./wsi
 wget --no-check-certificate --content-disposition -P ./wsi https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/Example001.ndpi
 ```
-Segment step -- This step take almost 2 hours with 2080Ti GPU and SSD.
+Nuclear segmentation and classification -- This step takes almost 2 hours with a 2080 Ti GPU and an SSD.
 ```
 python main.py segment --input_dir='./wsi' --output_dir='./output'
 ```
-Feature step -- This step take almost 40 minutes with 128GB RAM and 8 process.
+Feature extraction -- This step takes almost 40 minutes with 128 GB RAM and 8 processes.
 ```
 python main.py feature --json_path='./output/json/Example001.json' --wsi_path='./wsi/Example001.ndpi' --output_path='./feature'
 ```
-Visual step
+Visualization
 ```
 python main.py visual --feature_path='./feature/sample' --wsi_path='./wsi/sample.ndpi' --xml_path='./xml/sample.xml'
 ```
````
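Between the segment and feature steps of the demo, the `.json` output can be sanity-checked. A minimal sketch, assuming HoVer-Net's usual WSI output layout (a top-level `"nuc"` dictionary keyed by instance ID, each entry carrying at least a `"centroid"` and an integer `"type"` label); this layout is an assumption, not something this commit specifies:

```python
# Minimal sketch: tally the predicted nuclear types in the segmentation
# output. Assumes HoVer-Net's usual WSI .json layout with a top-level
# "nuc" dict whose entries carry "centroid" and integer "type" fields.
import json
from collections import Counter

with open('./output/json/Example001.json') as fp:
    data = json.load(fp)

type_counts = Counter(inst['type'] for inst in data['nuc'].values())
print(type_counts)  # e.g. Counter({1: ..., 2: ...}); the integer-to-class
                    # mapping follows the PanNuke label convention
```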
