Multimodal ToolCallSummaryMessage
& FunctionExecutionResult
Support
#6381
Closed
fang-d
started this conversation in
Feature suggestions
Replies: 2 comments 1 reply
-
You may want to take a look at the new |
Beta Was this translation helpful? Give feedback.
1 reply
-
It seems that function calling results with images are not supported by the current OpenAI models. I will close this discussion. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
It seems that the return value of function calling only supports the
str
type currently. But for VLMs, supporting multimodal function calling is necessary and important. A simple example is that it allows the model to automatically select images from the file system and analyze them. Thecontent
of theToolCallSummaryMessage
andFunctionExecutionResult
should bestr | agent_core.Image | list[str | agent_core.Image]
.Beta Was this translation helpful? Give feedback.
All reactions