
Error in Stream in Runner.run_streamed() with LitellmModel(Model) class #601


Closed
sumit-lightringai opened this issue Apr 25, 2025 · 2 comments
Labels
bug Something isn't working needs-more-info Waiting for a reply/more info from the author

Comments

@sumit-lightringai

Please read this first

  • Have you read the docs? Agents SDK docs Yes
  • Have you searched for related issues? Yes, others may have faced similar issues.

Describe the bug

Runner.run_streamed() does not produce a proper event stream when agents use LitellmModel (tested with "openai/gpt-4o"). When running:

async def main():
    result = Runner.run_streamed(triage_agent, message)
    async for event in result.stream_events():
        print(event)

asyncio.run(main())

Only a single AgentUpdatedStreamEvent object is printed, rather than a continuous stream of events.

Reproduction Code

Agent definitions:

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
    model=LitellmModel(model=model, api_key=api_key),
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model=LitellmModel(model=model, api_key=api_key),
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    model=LitellmModel(model=model, api_key=api_key),
    handoffs=[history_tutor_agent, math_tutor_agent]
)

Expected Behavior

A continuous stream of events should be produced as the agent processes the request, similar to how it works with default OpenAI models.

Current Behavior

Only a single event is output in the stream:

AgentUpdatedStreamEvent(new_agent=Agent(name='Triage Agent', instructions="You determine which agent to use based on the user's homework question", handoff_description=None, handoffs=[Agent(name='History Tutor', instructions='You provide assistance with historical queries. Explain important events and context clearly.', handoff_description='Specialist agent for historical questions', handoffs=[], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB77C67B0>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True), Agent(name='Math Tutor', instructions='You provide help with math problems. 
Explain your reasoning at each step and include examples', handoff_description='Specialist agent for math questions', handoffs=[], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB7828A50>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True)], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB7828E10>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True), type='agent_updated_stream_event')

Additional Information

  1. Runner.run() (non-streaming version) works correctly with the same LitellmModel agents.
  2. When using default OpenAI models (without specifying model=LitellmModel), streaming works correctly:
history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[history_tutor_agent, math_tutor_agent]
)

This indicates the issue is specific to the LitellmModel integration in the Agents SDK's streaming functionality.

Environment Information

  • Tested with model: "openai/gpt-4o" via LitellmModel
  • Issue appears to be related to the interaction between LitellmModel and the streaming functionality
@sumit-lightringai sumit-lightringai added the bug Something isn't working label Apr 25, 2025
@DanieleMorotti
Contributor

Hi, I'm not able to reproduce your error; try the following script:

import asyncio

from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
from openai.types.responses import ResponseTextDeltaEvent


# Read the API key from a local file (strip a trailing newline, if any)
API_KEY = open("../../OPENAI_API_KEY.txt", "r").read().strip()


model = "openai/gpt-4o"

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
    model=LitellmModel(model=model, api_key=API_KEY),
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model=LitellmModel(model=model, api_key=API_KEY),
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    model=LitellmModel(model=model, api_key=API_KEY),
    handoffs=[history_tutor_agent, math_tutor_agent]
)


async def main():
    result = Runner.run_streamed(triage_agent, "I want to solve the following equation: '2x^2 -32 = 16'")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)



if __name__ == "__main__":
    asyncio.run(main())

It seems to work properly for me.

@rm-openai rm-openai added the needs-more-info Waiting for a reply/more info from the author label Apr 25, 2025
@sumit-lightringai
Author

Thanks, I ran pip freeze and found I was using openai-agents v0.12. When running with the newer openai-agents v0.13 it works; v0.12 still has that bug 🙂.
