[Fix] Update MMScan's README #99

rbler1234 · 2025-01-24T08:02:55Z

README.md

models/README.md

README.md

Tai-Wang · 2025-01-26T06:55:28Z

README.md

- **--------------For Visual Grounding Task**
- **"target_id"** (list\[int\]): IDs of target objects.
- **"text"** (str): Grounding text.
+- **"sub_class"**: The sample category of the sample.


Suggested wording: "sample's category" or "category of the sample".

Tai-Wang · 2025-01-26T06:56:00Z

README.md

+- **"pcds"** (np.ndarray): Point cloud data with dimensions [n_points, 6(xyz+rgb)], representing the coordinates and color of each point.
+- **"instance_labels"** (np.ndarray): Instance ID assigned to each point in the point cloud.
+- **"class_labels"** (np.ndarray): Class IDs assigned to each point in the point cloud.
+- **"bboxes"** (dict): Information about bounding boxes within the scan.


Specify what information about the bounding boxes is included.
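
For readers following the thread, the array layout described in the diff above can be sketched with synthetic data. Only the key names and the `[n_points, 6]` (xyz+rgb) shape come from the README text under review; the values, the label IDs, and the column order within each half are assumptions for illustration, and nothing here calls the mmscan-devkit itself.

```python
import numpy as np

# Synthetic stand-in for one scan. Only the key names and the
# [n_points, 6] layout come from the README; all values are made up.
n_points = 5
rng = np.random.default_rng(0)
sample = {
    "pcds": rng.random((n_points, 6)),             # xyz + rgb per point
    "instance_labels": np.array([0, 0, 1, 1, 2]),  # hypothetical instance IDs
    "class_labels": np.array([3, 3, 7, 7, 7]),     # hypothetical class IDs
}

# Split coordinates from colors (assumes xyz occupies the first three columns).
xyz = sample["pcds"][:, :3]
rgb = sample["pcds"][:, 3:]

# Select the points that belong to instance 1 via a boolean mask.
mask = sample["instance_labels"] == 1
instance_xyz = xyz[mask]
```

Because the labels are per-point arrays, they index the rows of `pcds` directly, which is why a boolean mask is enough to pull out one object.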

Tai-Wang · 2025-01-26T06:57:31Z

README.md

- **"text"** (str): Grounding text.
+- **"sub_class"**: The sample category of the sample.
+- **"ID"**: A unique identifier for the sample.
+- **"scan_id"**:Identifier corresponding to the related scan.


The scan's ID.

Tai-Wang · 2025-01-26T06:57:40Z

README.md

- **"target_id"** (list\[int\]): IDs of target objects.
- **"text"** (str): Grounding text.
+- **"sub_class"**: The sample category of the sample.
+- **"ID"**: A unique identifier for the sample.


The sample's ID.

Recommend using `id` directly in the next release.

Tai-Wang · 2025-01-26T06:58:30Z

README.md

+- **"scan_id"**:Identifier corresponding to the related scan.
+-  *For Visual Grounding task*
+- **"target_id"** (list\[int\]): IDs of target objects. 
+- **"text"** (str): Text used for grounding.


Text prompt to specify the target grounding object.

Tai-Wang · 2025-01-26T06:59:40Z

README.md

 - **"answers"** (list\[str\]): List of possible answers.
 - **"object_ids"** (list\[int\]): Object IDs referenced in the question.
 - **"object_names"** (list\[str\]): Types of referenced objects.
 - **"input_bboxes_id"** (list\[int\]): IDs of input bounding boxes.
- **"input_bboxes"** (list\[np.ndarray\]): Input bounding boxes, 9 DoF.
+- **"input_bboxes"** (list\[np.ndarray\]): Input bounding box data, with 9 degrees of freedom.


Input 9-DoF bounding boxes.

Tai-Wang · 2025-01-26T07:00:00Z

README.md

+- **'img_path'** (str): File path to the RGB image.
+- **'depth_img_path'** (str): File path to the depth image.
+- **'intrinsic'** (np.ndarray):  Intrinsic parameters of the camera for RGB images.
+- **'depth_intrinsic'** (np.ndarray):  Intrinsic parameters of the camera for Depth images.


for depth images

Tai-Wang · 2025-01-26T07:00:47Z

README.md

@@ -182,7 +186,9 @@ For the visual grounding task, our evaluator computes multiple metrics including

 - **AP and AR**: These metrics calculate the precision and recall by considering each sample as an individual category.
 - **AP_C and AR_C**: These versions categorize samples belonging to the same subclass and calculate them together.
- **gtop-k**: An expanded metric that generalizes the traditional top-k metric, offering insights into broader performance aspects.
+- **gTop-k**: An expanded metric that generalizes the traditional Top-k metric, offering insights into broader performance aspects.


Specify what "insights" into broader performance aspects it offers.
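
As background for this comment, the traditional Top-k metric that the README says gTop-k generalizes can be sketched as below. This is a generic, classification-style illustration under assumed inputs (ranked candidate IDs per sample and one target ID each); it is not the MMScan evaluator's definition of gTop-k, and a grounding evaluator would typically match boxes by overlap rather than by ID equality. The function name `top_k_accuracy` is hypothetical.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of samples whose target appears among the first k predictions."""
    if not targets:
        return 0.0
    hits = sum(
        target in preds[:k]
        for preds, target in zip(ranked_predictions, targets)
    )
    return hits / len(targets)


# Hypothetical ranked candidate IDs for three samples, best guess first.
ranked = [[4, 2, 9], [1, 5, 3], [7, 8, 6]]
targets = [2, 3, 0]

top1 = top_k_accuracy(ranked, targets, k=1)
top3 = top_k_accuracy(ranked, targets, k=3)
```

Here no target is ranked first, while two of the three targets appear within the first three candidates, so the score rises as k grows.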

Tai-Wang · 2025-01-26T07:02:16Z

models/README.md

@@ -47,6 +51,11 @@ These are 3D visual grounding models adapted for the mmscan-devkit. Currently, t
   # Multiple GPU testing
   python tools/test.py configs/grounding/pcd_4xb24_mmscan_vg_num256.py path/to/load_pth --launcher="pytorch"
   ```
+#### ckpts & Logs


ckpts & logs -> Results and Models
Please also fix this in other places.

Tai-Wang · 2025-01-26T07:02:56Z

models/README.md

Scanrefer -> ScanRefer

setup the Env -> setup the environment

multiple GPU -> multiple GPUs

Consider polishing the README carefully using ChatGPT.

edit readme

0569df4