Hi all, I tried the demo under "tutorial/full_gpu_inference_pipeline" without luck.

1. Errors occurred

Following the official steps, I successfully started the Triton server and the Triton client, but during the benchmark, when I ran

perf_analyzer -m spleen_seg -u localhost:18100 --input-data zero --shape "INPUT0":512,512,114 --shared-memory system

the following errors occurred.

# server side
I0106 14:36:26.742519 1279 grpc_server.cc:4190] Started GRPCInferenceService at 0.0.0.0:8001
I0106 14:36:26.743364 1279 http_server.cc:2857] Started HTTPService at 0.0.0.0:8000
I0106 14:36:26.785051 1279 http_server.cc:167] Started Metrics Service at 0.0.0.0:8002
2023-01-06 14:36:43,063 - the shape of the input tensor is: torch.Size([1, 512, 512, 114])
2023-01-06 14:36:44,825 - the shape of the transformed tensor is: torch.Size([1, 224, 224, 224])
2023-01-06 14:36:44,826 - the shape of the unsqueezed transformed tensor is: torch.Size([1, 1, 224, 224, 224])
E0106 14:36:46.006760 1279 python.cc:1970] Stub process is unhealthy and it will be restarted.
# client side
*** Measurement Settings ***
Batch size: 1
Using "time_windows" mode for stabilization
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency
Request concurrency: 1
Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
Thread [0] had error: Failed to process the request(s) for model instance 'spleen_seg_0', message: Stub process is not healthy.

2. Tried methods

Later I also tried changing some parameters, e.g. running without shared memory, increasing shm-size from 1g to 16g, installing the MONAI environment inside the server instead of using conda-pack, changing the Docker image version, etc., but all of these attempts failed. When I tried tritonserver 22.12, the following errors occurred.

# server side
I0106 14:47:40.332535 94 grpc_server.cc:4819] Started GRPCInferenceService at 0.0.0.0:8001
I0106 14:47:40.332862 94 http_server.cc:3477] Started HTTPService at 0.0.0.0:8000
I0106 14:47:40.373862 94 http_server.cc:184] Started Metrics Service at 0.0.0.0:8002
2023-01-06 14:49:07,524 - the shape of the input tensor is: torch.Size([1, 512, 512, 114])
2023-01-06 14:49:09,501 - the shape of the transformed tensor is: torch.Size([1, 224, 224, 224])
2023-01-06 14:49:09,501 - the shape of the unsqueezed transformed tensor is: torch.Size([1, 1, 224, 224, 224])
# client side
*** Measurement Settings ***
Batch size: 1
Service Kind: Triton
Using "time_windows" mode for stabilization
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency
Request concurrency: 1
Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
Thread [0] had error: Failed to process the request(s) for model instance 'spleen_seg_0', message: TritonModelException: DLPack tensor is not contiguous. Only contiguous DLPack tensors that are stored in C-Order are supported.
At:
/triton_monai/spleen_seg/1/model.py(131): execute

3. Some assumptions

From the above errors, it can be inferred that the "DLPack tensor is not contiguous" message and the traceback line pointing at model.py hit the core issue.
It is related to the following code in the model repository mentioned in the tutorial (which needs to be downloaded first), at "/triton_monai/spleen_seg/1/model.py(131): execute":

# get the input by name (as configured in config.pbtxt)
input_triton_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
input_torch_tensor = from_dlpack(input_triton_tensor.to_dlpack())
logger.info(f"the shape of the input tensor is: {input_torch_tensor.shape}")
transform_output = self.pre_transforms(input_torch_tensor[0])
logger.info(f"the shape of the transformed tensor is: {transform_output.shape}")
transform_output_batched = transform_output.unsqueeze(0)
logger.info(f"the shape of the unsqueezed transformed tensor is: {transform_output_batched.shape}")
# if(transform_output_batched.is_cuda):
# print("the transformed pytorch tensor is on GPU")
# print(transform_output.shape)
transform_tensor = pb_utils.Tensor.from_dlpack("INPUT__0", to_dlpack(transform_output_batched))

Apparently, the last line of code fails to run, but I don't know how to modify it. Any suggestions or updates to the "model.py" script?
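For reference, here is a minimal sketch of the kind of change I have been considering, assuming the root cause is simply that transform_output_batched ends up as a non-contiguous view after the MONAI pre-transforms; the .contiguous() call is my own guess and is untested:

```python
# Assumed imports, matching what model.py already uses
from torch.utils.dlpack import to_dlpack
import triton_python_backend_utils as pb_utils

# ... inside execute(), right before the failing line ...
transform_output_batched = transform_output.unsqueeze(0)

# Guess: the unsqueezed tensor is a non-contiguous view, and Triton's
# from_dlpack only accepts C-contiguous memory, so copy it into C-order first.
if not transform_output_batched.is_contiguous():
    transform_output_batched = transform_output_batched.contiguous()

transform_tensor = pb_utils.Tensor.from_dlpack("INPUT__0", to_dlpack(transform_output_batched))
```

Would something like this be the right direction, or is there a better way to hand the tensor back to Triton?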
Please refer to #1150.