attention warning and incomplete text description #28

Holmes-GU · 2024-12-01T16:05:20Z

Hi,
Thanks for your interesting work and for sharing the code. When running the quick inference code, I ran into two issues.

The terminal prints the warning "The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results." Is it OK to ignore it?
The description generated for the video "./examples/video1.mp4" is
"'The video begins with a scene featuring two characters in a fantastical setting, with a large, green, leaf-like structure and a mountainous backdrop. The first character, dressed in a yellow and red outfit with a mask, is seen in a dynamic pose, suggesting movement or action, while the second character, wearing a blue robe with a white beard, stands with a contemplative expression. The scene then transitions to a more serene setting, with the same two characters now standing still, facing each other against a misty, mountainous background. The character in the yellow and red outfit appears to be speaking or reacting, while the character in'"
which seems incomplete: the last sentence is cut off. Is there a solution for this?

Thanks.


xiaoqian-shen · 2024-12-05T10:52:49Z

The warning can be ignored.
You can increase max_new_tokens so that the model generates a complete description, or prompt it to answer more concisely.
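
For anyone who wants to silence the warning and check for cut-off output, here is a minimal sketch in plain Python. It assumes a Hugging Face style `generate` call that accepts `attention_mask` and `max_new_tokens`. The helper names (`build_generate_kwargs`, `hit_token_budget`), the token ids, and the budget of 512 are hypothetical and not taken from this repo. The all-ones mask is only valid for a single prompt with no padding.

```python
from typing import Dict, List


def build_generate_kwargs(input_ids: List[List[int]],
                          max_new_tokens: int = 512) -> Dict[str, object]:
    """Build extra keyword arguments for a generate() call (hypothetical helper).

    Assumes unpadded prompts: every position holds a real token, so the
    attention mask is all ones. With torch tensors the equivalent would be
    torch.ones_like(input_ids).
    """
    attention_mask = [[1] * len(row) for row in input_ids]
    return {"attention_mask": attention_mask, "max_new_tokens": max_new_tokens}


def hit_token_budget(new_token_ids: List[int],
                     max_new_tokens: int,
                     eos_token_id: int) -> bool:
    """Return True if generation used the whole budget without emitting EOS.

    That pattern usually means the text was cut off mid-sentence and
    max_new_tokens should be raised (or the prompt should ask for brevity).
    """
    if not new_token_ids:
        return False
    used_full_budget = len(new_token_ids) >= max_new_tokens
    ended_with_eos = new_token_ids[-1] == eos_token_id
    return used_full_budget and not ended_with_eos


if __name__ == "__main__":
    # Made-up token ids, for illustration only.
    kwargs = build_generate_kwargs([[101, 7, 8, 9]], max_new_tokens=512)
    print(kwargs["attention_mask"])                                       # [[1, 1, 1, 1]]
    print(hit_token_budget([5, 6, 7], max_new_tokens=3, eos_token_id=2))  # True
    print(hit_token_budget([5, 6, 2], max_new_tokens=3, eos_token_id=2))  # False
```

In real use, the returned dictionary would be unpacked into the existing call, e.g. `model.generate(input_ids, **kwargs)`, provided that call does not already set `max_new_tokens`. If `hit_token_budget` returns True for the newly generated tokens, the description was most likely truncated by the token limit rather than finished by the model.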
