Sorry for duplicating issue #1, but could you please explain how I can extract text for a single image given as input? It is not clear to me what steps I need to take to get a text description of a single image.
Also, I was wondering whether I can extract text for an external image, i.e. an image that was not included in the train and val image sets?
I would really appreciate any help.
Hi @yurii-piets
The code maps image and text inputs into one shared space. Therefore, given one image, we can extract its image embedding (feature), and given one sentence, we can extract the corresponding text embedding (feature).
More precisely, we extract a shared-space feature from an image input, rather than a text feature from an image input.
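In practice, "getting a text description for one image" therefore means extracting the image's shared-space feature and then retrieving the nearest text embeddings from a gallery of candidate captions. A minimal sketch of that retrieval step (the function and variable names here are hypothetical, not this repo's API; the toy embeddings stand in for real model outputs):

```python
import numpy as np

def retrieve_captions(image_emb, text_embs, captions, k=2):
    """Rank gallery captions by cosine similarity to one image embedding."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    scores = txt @ img                  # cosine similarity per caption
    order = np.argsort(-scores)[:k]    # indices of best matches first
    return [captions[i] for i in order]

# Toy 64-d embeddings standing in for real encoder outputs.
rng = np.random.default_rng(0)
image_emb = rng.standard_normal(64)
text_embs = np.stack([
    image_emb + 0.1 * rng.standard_normal(64),  # near match to the image
    rng.standard_normal(64),                    # unrelated caption
    rng.standard_normal(64),                    # unrelated caption
])
captions = ["a close match", "unrelated A", "unrelated B"]
print(retrieve_captions(image_emb, text_embs, captions, k=1))
```

Because both branches land in the same space, the same cosine-similarity ranking works in either direction (image-to-text or text-to-image).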
Yes, you can extract features for external images as well. But one thing to keep in mind is the data distribution of the external images.
If the image is collected from Flickr, you should choose the model pretrained either on Flickr30k or MSCOCO.
If the image content is a pedestrian, you should choose the model pretrained on CUHK-PEDES.
The model works well when the testing distribution is close to the training distribution.