Extracting text for the single image #16

yurii-piets opened this issue Mar 22, 2020 · 1 comment

Comments

@yurii-piets

Sorry for duplicating issue #1, but could you please explain how I can extract text for a single image given as input? It is not clear to me what steps I need to take to get a text description of a single image.

Also, I was wondering whether I can extract text for an external image, that is, an image that was not included in the train and val image sets.

I would really appreciate any help.

@layumi
Owner

layumi commented Mar 23, 2020

Hi @yurii-piets,
The code maps image and text inputs into one shared space. Therefore, given one image, we can extract its image embedding (feature); given one sentence, we can extract the corresponding text embedding (feature).
More precisely, we extract a shared-space feature from image inputs, rather than a text feature from image inputs.

Yes, you can extract the feature for external images as well. But one thing you should keep in mind is the data distribution of the external images.

If the image is collected from Flickr, you should choose the model pretrained on either Flickr30k or MSCOCO.
If the image content is a pedestrian, you should choose the model pretrained on CUHK-PEDES.

The model works well when the testing distribution is close to the training distribution.
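Since the model embeds both modalities into one shared space rather than generating sentences, obtaining a "text description" for a single image amounts to nearest-neighbour retrieval: encode the image, encode a pool of candidate captions, and return the caption whose embedding is closest. A minimal sketch with toy vectors follows; the embeddings here are illustrative placeholders, and in practice they would come from the repository's trained image and text branches (the names below are not the project's actual API).

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    return v / np.linalg.norm(v)

def nearest_text(image_feat, text_feats, captions):
    """Return the caption whose shared-space embedding is closest to the image's."""
    img = l2_normalize(image_feat)
    sims = [float(img @ l2_normalize(t)) for t in text_feats]
    return captions[int(np.argmax(sims))], sims

# Toy shared-space embeddings (stand-ins for features produced by the trained
# image and text encoders; real features would be e.g. 512-dimensional).
image_feat = np.array([0.9, 0.1, 0.0])
text_feats = [np.array([1.0, 0.0, 0.0]),   # embedding of "a person walking"
              np.array([0.0, 1.0, 0.0])]   # embedding of "a red car"
captions = ["a person walking", "a red car"]

best, sims = nearest_text(image_feat, text_feats, captions)
print(best)  # the image embedding lies closest to the first caption
```

Normalizing both sides makes the comparison a cosine similarity, which is the usual choice when ranking matches in a shared image-text space.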
