Not getting all the expected outputs from Faster R-CNN with Resnet V2 #266
AntoneRoundy asked this question in Q&A (Unanswered)
I've downloaded the Saved Model for Faster R-CNN with Resnet V2 from https://www.kaggle.com/models/tensorflow/faster-rcnn-inception-resnet-v2/tensorFlow2/640x640/1 and executed it using CPPFlow, but the output only contains a single tensor of 100 floats. The documentation at the above link lists a completely different collection of outputs: the number of objects detected, the bounding boxes for each object, the class of each object, detection scores, etc. Could anyone help me understand what the outputs I'm getting are, and, more importantly, how to access all the outputs listed at the above link?
Here's some of the relevant code from my project:
cppflow::model *pModel;
cppflow::tensor *pInputImage;
std::vector<std::tuple<std::string, cppflow::tensor>> opInputs;
std::vector<std::string> opOutputs;
std::vector<cppflow::tensor> opResults;
// load model, load image, etc.
opInputs = {{"serving_default_input_tensor", *pInputImage}};
opOutputs = {"StatefulPartitionedCall"};
opResults = (*pModel)(opInputs, opOutputs);
After that last call, opResults contains a single tensor of shape {1,100}, holding 100 float values.
When I called pModel->get_operations(), the only two operations that appeared to be intended as the input and output operations were serving_default_input_tensor and StatefulPartitionedCall, respectively, so the answer doesn't seem to be to specify different operation names. When I call pModel->get_operation_shape(), the result confirms that the expected output shape is {1,100}, which is what I'm seeing. It's just not what the model card indicates.
The Faster R-CNN with Resnet V2 model card seems to indicate that the model outputs a collection of named fields rather than just a list of values. Is there a way to get access to the "output dictionary" that the above link mentions?
Update: I appear to have found a partial answer to my own question. Instead of StatefulPartitionedCall, I need to use the names "StatefulPartitionedCall:0" through "StatefulPartitionedCall:7" to access the lists of values for each of the 8 fields in the model's output dictionary. I'm not sure which of these corresponds to each of the fields, but could probably figure it out with a bit more work. So my next question is, is there a way to look up which of these corresponds to which field?
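For anyone hitting the same issue: a SavedModel whose signature returns a dictionary exposes one flattened tensor per key, named `StatefulPartitionedCall:0` through `StatefulPartitionedCall:N-1`. The key-to-index mapping can be inspected with TensorFlow's `saved_model_cli show --dir <model_dir> --all`, which prints each dictionary key next to its flattened tensor name. Building the output-name list for cppflow might look like this (a sketch; the helper name `flattenedOutputNames` is my own, not part of cppflow):

```cpp
#include <string>
#include <vector>

// Build the flattened output-tensor names "StatefulPartitionedCall:0"
// through "StatefulPartitionedCall:N-1" for a SavedModel whose
// serving signature returns a dictionary of N tensors.
std::vector<std::string> flattenedOutputNames(int numOutputs) {
    std::vector<std::string> names;
    names.reserve(numOutputs);
    for (int i = 0; i < numOutputs; ++i)
        names.push_back("StatefulPartitionedCall:" + std::to_string(i));
    return names;
}
```

With that, `opOutputs = flattenedOutputNames(8);` followed by `opResults = (*pModel)(opInputs, opOutputs);` should return all eight output tensors in index order.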
Looking at the data, one thing that seems a bit odd is that there are clearly no numbers corresponding to bounding boxes in pixel units. It appears that the model is returning 100 detected objects. One (or more? I'd have to double check) of the results lists contains 400 floating point numbers between 0 and 1. I'm guessing that those are the bounding box coordinates, scaled from integer pixel values down to the [0, 1] range.
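That guess matches the TensorFlow Object Detection API convention: detection boxes come as 100 sets of [ymin, xmin, ymax, xmax], each normalized to [0, 1] (hence 400 floats). Scaling them back to pixel coordinates for the model's 640x640 input could look like this (a sketch; the struct and function names are my own):

```cpp
#include <cstddef>
#include <vector>

struct PixelBox { float ymin, xmin, ymax, xmax; };

// Convert normalized [ymin, xmin, ymax, xmax] boxes (values in [0, 1],
// the Object Detection API layout) to pixel coordinates for an image
// of the given width and height.
std::vector<PixelBox> toPixelBoxes(const std::vector<float> &normalized,
                                   float width, float height) {
    std::vector<PixelBox> boxes;
    for (std::size_t i = 0; i + 3 < normalized.size(); i += 4) {
        boxes.push_back({normalized[i] * height,       // ymin
                         normalized[i + 1] * width,    // xmin
                         normalized[i + 2] * height,   // ymax
                         normalized[i + 3] * width});  // xmax
    }
    return boxes;
}
```

Note that y comes before x in each box, so the vertical coordinates are scaled by the image height and the horizontal ones by the width.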