
Commit e8b90e7

Adjust variable names and readme
1 parent 9db7653 commit e8b90e7

2 files changed (+36 -23 lines)

WSIGraph.py (+6 -6)

```diff
@@ -547,8 +547,8 @@ def constructGraphFromDict(
 
     # print(f"{'Graph features cost':#^40s}, {time.time() - t3:*^10.2f}")
 
-    # Stroma barrier
-    # For each inflam node, add it to neoplaAddConnecGraph, compute barrier and delete
+    # Stroma blocker
+    # For each inflam node, add it to neoplaAddConnecGraph, compute blocker and delete
     # ! Why select the maximum subgraph, if distanceThreshold wasn't set appropriately, shortestPathsLymCancer would be empty
     t4 = time.time()
     centroid_T = globalGraph.induced_subgraph(neoplaIDs).vs['Centroid']
@@ -558,13 +558,13 @@ def constructGraphFromDict(
     STree = cKDTree(centroid_S)
     dis, pairindex_T = Ttree.query(centroid_I, k=1)
     paircentroid_T = np.array(centroid_T)[pairindex_T]
-    barrier = []
+    blocker = []
     for Tcoor, Icoor, r in tqdm(zip(centroid_I, paircentroid_T, dis), total=len(centroid_I)):
         set1 = set(STree.query_ball_point(Tcoor, r))
         set2 = set(STree.query_ball_point(Icoor, r))
-        barrier.append(len(set1 & set2))
-    globalGraph.vs[inflamIDs]['stromaBarrier'] = barrier
-    print(f"{'stroma barrier cost':#^40s}, {time.time() - t4:*^10.2f}")
+        blocker.append(len(set1 & set2))
+    globalGraph.vs[inflamIDs]['stromaBlocker'] = blocker
+    print(f"{'stroma blocker cost':#^40s}, {time.time() - t4:*^10.2f}")
 
     return globalGraph, edge_info
```
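For context, the renamed quantity counts, for each inflammatory nucleus, how many stroma nuclei lie within distance `r` of both that nucleus and its nearest tumor nucleus, where `r` is the distance between the two. Below is a minimal, self-contained sketch of the same computation on made-up coordinates; the three arrays stand in for the centroids pulled out of `globalGraph` in the real code:

```python
# Toy re-implementation of the stroma-blocker count from the diff above.
# The coordinate arrays are invented for illustration; in WSIGraph.py they
# come from the tumor (T), inflammatory (I) and stroma (S) graph vertices.
import numpy as np
from scipy.spatial import cKDTree

centroid_T = np.array([[0.0, 0.0], [50.0, 50.0]])               # tumor
centroid_I = np.array([[10.0, 0.0], [60.0, 50.0]])              # inflammatory
centroid_S = np.array([[5.0, 1.0], [8.0, -1.0], [55.0, 49.0]])  # stroma

Ttree = cKDTree(centroid_T)
STree = cKDTree(centroid_S)

# Nearest tumor nucleus for each inflammatory nucleus, and the distance r to it.
dis, pairindex_T = Ttree.query(centroid_I, k=1)
paircentroid_T = centroid_T[pairindex_T]

# A stroma nucleus "blocks" a pair if it sits within r of both endpoints,
# i.e. inside the lens-shaped intersection of the two radius-r balls.
blocker = []
for Tcoor, Icoor, r in zip(paircentroid_T, centroid_I, dis):
    set1 = set(STree.query_ball_point(Tcoor, r))
    set2 = set(STree.query_ball_point(Icoor, r))
    blocker.append(len(set1 & set2))

print(blocker)  # [2, 1] for the toy coordinates above
```

Since the count is the size of the intersection of two radius-`r` balls, it is symmetric in the two endpoints, which is why the argument order inside the `zip` in the diff does not affect the result.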

readme.md (+30 -17)
````diff
@@ -2,41 +2,54 @@
 
 ## Description
 
-sc-MTOP is an analysis framework based on deep learning and computational pathology. It consists of three steps: 1) Hover-Net based nuclear segmentation and classification; 2) Nuclear morphological and texture feature extraction; 3) Multi-level pairwise nuclear graph construction and spatial topological feature extraction. This framework aims to characterize the tumor ecosystem diversity at the single-cell level. We have a [demo](http://101.132.124.14/#/dashboard) website to show this work.
+sc-MTOP is an analysis framework based on deep learning and computational pathology. It consists of two steps: 1) nuclear segmentation and classification; 2) feature extraction. The framework aims to characterize the tumor ecosystem diversity at the single-cell level. We have established a [demo](http://101.132.124.14/#/dashboard) website to demonstrate these functions.
 
-This is the offical pytorch implementation of sc_MTOP. According to the above description, we use three functions to finish three steps: segment, feature and visual. Note that only segment step support batch processing.
+This is the official PyTorch implementation of sc-MTOP. Note that only the nuclear segmentation and classification step supports batch processing.
 
 <a id="hovernet"></a>
-In the segmentation steps, it uses the [HoVer-Net](https://github.com/vqdang/hover_net) model. We doesn't provide the model parameter in the source code because it is large. We use the pretrain model based on PanNuke dataset, you can download it from the this [url](https://drive.google.com/file/d/1SbSArI3KOOWHxRlxnjchO7_MbWzB4lNR/view) which provided from the repositories of HoVer-Net. It needs a folder path including WSI files of all samples as input. We use `.ndpi` file in our work, but we have not tried other formats of wsi files. In theory it supports all file formats allowed by HoVer-Net. The step gives a folder with json file which including all information of cell segmentation and classification.
+1. `F1_CellSegment.py` for nuclear segmentation and classification:
 
-In the feature step, you have to provide the path of `.json` file, WSI file, and output folder. You can also provide an `.xml` annotation file to only compute features for cells within the annotation range. The annotation is with square shape and white color. The code supports multiple annotation to calculate several area in the same sample. The `.xml` annotation file follow the format of [ImageScope](https://www.leicabiosystems.com/zh/digital-pathology/manage/aperio-imagescope/) software's `.xml` annotation. The annotation file from other software may not be support. In the output folder there will be a secondary folder named after the input WSI file. In this folder, it will include three `.csv` files of cell features and one `.csv` file of graph edge information.
+This step employs [HoVer-Net](https://github.com/vqdang/hover_net) for simultaneous nuclear segmentation and classification. The model is pre-trained on the PanNuke dataset and can be downloaded from this [url](https://drive.google.com/file/d/1SbSArI3KOOWHxRlxnjchO7_MbWzB4lNR/view).
 
-In the visual step, we make a visualization of the graph and segmentation. You have to provide the path of feature, WSI file and `.xml` file. The path of feature is the output of feature step, and this step needs the cell ID and the edge information in it. `.xml` file is the annotation file same as the feature step. We only plot the range in the annotation. The visual result is written in the annotation file and can be viewed in the ImageScope software. Note that if the annotation is too large, the ImageScope will failed to open the annotation file.
+Provide your WSI files as input. We use `.ndpi` WSI files in our work; in theory, the step supports all WSI file formats allowed by HoVer-Net. For each sample, the step outputs a `.json` file containing all nuclear segmentation and classification information.
+
+
+2. `F3_FeatureExtract.py` for feature extraction:
+
+This step extracts morphological, texture and topological features for individual tumor, inflammatory and stroma cells, which are the main cellular components of the breast cancer ecosystem.
+
+Provide your WSI files and the corresponding `.json` files output by the segmentation step as input. You may also define a region of interest (ROI) using an `.xml` annotation file generated by the [ImageScope](https://www.leicabiosystems.com/zh/digital-pathology/manage/aperio-imagescope/) software. For each sample, the feature extraction step outputs a folder containing four `.csv` data files. For each of the tumor, inflammatory and stroma cell types, one `.csv` file stores the features of all cells of that type, where each cell is identified by a unique cell ID together with its centroid's spatial coordinates. The fourth `.csv` file stores the edge information for the sample and characterizes each edge by the IDs of the cells it connects.
+
+3. `F4_Visualization.py` for visualization:
+
+We provide an additional function for visualizing the nuclear segmentation results and the nuclear graph.
+
+Provide the WSI files, the corresponding feature files output by the feature extraction step, and an `.xml` annotation file defining the ROI. The visualization results are written into the annotation file and can be viewed using the ImageScope software. Note that ImageScope may fail to open the annotation file if your ROI is too large.
 
 ## Requirements
 ### Packages and version
-The packages required have provided in the file `requirements.txt`
+The required packages are provided in the file `requirements.txt`.
 ### Operating systems
-The code have test in the Windows and Ubuntu 16.04.7 LTS. But the installation in the different operation systems can be different because of some packages.
+The code has been tested on Windows and Ubuntu 16.04.7 LTS. The installation may differ between operating systems because of some packages.
 ### Hardware
-The code include the deep learning network calculation, so it needs GPU with more than 8GB video memory. And the HoVer-Net needs SSD at least 100GB for cache. The requirement of RAM depends on the data size, but we suggest that it should be more than 128GB. The code have test on the GPU NVIDIA 2080TI, RAM 128GB.
+The code involves deep learning-based neural network inference, so it needs a GPU with more than 8 GB of video memory. HoVer-Net also needs an SSD with at least 100 GB for cache. The RAM requirement depends on the data size; we suggest more than 128 GB. The code has been tested on an NVIDIA RTX 2080 Ti GPU with 128 GB RAM.
 
 ## Installation
 To install the environment, you can run the command in the terminal:
 ```
 pip install -r requirements.txt
 ```
 The code requires the package `openslide-python`, but its installation differs between Linux and Windows. Please follow the [official documentation](https://openslide.org/api/python/) to install it, and import it in Python to make sure it works correctly.
-The HoVer-Net pre-train parameter is not provided in the source code. The file size is 144 MB. You can download it follow the [Description](#hovernet). Instead, you can download it in our [release](https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/hovernet_fast_pannuke_type_tf2pytorch.tar).
+The pre-trained HoVer-Net model is not provided in the source code due to its file size. You can download it by following the [Description](#hovernet), or from our [release](https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/hovernet_fast_pannuke_type_tf2pytorch.tar).
 
 ## Repository Structure
-`Hover`: the implementation of HoVer-Net, which clone from the offical [implementation](https://github.com/vqdang/hover_net)
+`Hover`: the implementation of HoVer-Net, cloned from the official [implementation](https://github.com/vqdang/hover_net)
 `main.py`: main function
-`F1_CellSegment.py`: segment step by calling `Hover`.
-`F3_FeatureExtract.py`: feature step by calling `WSIGraph.py`.
-`F4_Visualization.py`: visual step by calling `utils_xml.py`.
+`F1_CellSegment.py`: nuclear segmentation and classification by calling `Hover`.
+`F3_FeatureExtract.py`: feature extraction by calling `WSIGraph.py`.
+`F4_Visualization.py`: visualization by calling `utils_xml.py`.
 `utils_xml.py`: defines helper tools for the visualization.
-`WSIGraph.py`: define the process of feature extract.
+`WSIGraph.py`: defines the process of feature extraction.
 
 ## Usage Demo
 Here is a demo of using it in the bash terminal of Ubuntu. Some commands may not work in other terminals.
````
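As a quick orientation to the feature-extraction output described in the readme above, the four `.csv` files for a sample can be inspected with pandas. A minimal sketch, assuming a hypothetical per-sample folder name (the actual file names depend on your sample and are not specified by this commit):

```python
# Minimal sketch: inspect the four .csv files produced by the feature
# extraction step for one sample. The folder and file names here are
# hypothetical; substitute whatever your output folder actually contains.
from pathlib import Path
import pandas as pd

feature_dir = Path('./feature/Example001')     # per-sample output folder
csv_files = sorted(feature_dir.glob('*.csv'))  # expect four files
for f in csv_files:
    df = pd.read_csv(f)
    print(f.name, df.shape)                    # rows = cells (or edges)

# Per the readme: three tables carry per-cell features keyed by a unique
# cell ID plus centroid coordinates (one table per cell type), and the
# fourth table characterizes each graph edge by the IDs of the cells it
# connects.
```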
````diff
@@ -50,15 +63,15 @@ Download the demo data
 mkdir ./wsi
 wget --no-check-certificate --content-disposition -P ./wsi https://github.com/fuscc-deep-path/sc_MTOP/releases/download/Demo/Example001.ndpi
 ```
-Segment step -- This step take almost 2 hours with 2080Ti GPU and SSD.
+Nuclear segmentation and classification -- This step takes almost 2 hours with a 2080 Ti GPU and an SSD.
 ```
 python main.py segment --input_dir='./wsi' --output_dir='./output'
 ```
-Feature step -- This step take almost 40 minutes with 128GB RAM and 8 process.
+Feature extraction -- This step takes almost 40 minutes with 128 GB RAM and 8 processes.
 ```
 python main.py feature --json_path='./output/json/Example001.json' --wsi_path='./wsi/Example001.ndpi' --output_path='./feature'
 ```
-Visual step
+Visualization
 ```
 python main.py visual --feature_path='./feature/sample' --wsi_path='./wsi/sample.ndpi' --xml_path='./xml/sample.xml'
 ```
````
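Between the segment and feature steps of the demo, the `.json` output can be sanity-checked. A minimal sketch, assuming HoVer-Net's usual WSI output layout (a top-level `"nuc"` dictionary keyed by instance ID, each entry carrying at least a `"centroid"` and an integer `"type"` label); this layout is an assumption, not something this commit specifies:

```python
# Minimal sketch: tally the predicted nuclear types in the segmentation
# output. Assumes HoVer-Net's usual WSI .json layout with a top-level
# "nuc" dict whose entries carry "centroid" and integer "type" fields.
import json
from collections import Counter

with open('./output/json/Example001.json') as fp:
    data = json.load(fp)

type_counts = Counter(inst['type'] for inst in data['nuc'].values())
print(type_counts)  # e.g. Counter({1: ..., 2: ...}); the integer-to-class
                    # mapping follows the PanNuke label convention
```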
