MCP SDK hangs #813

Open
NielsRogge opened this issue May 27, 2025 · 3 comments

NielsRogge commented May 27, 2025

Describe the bug

I have developed a local MCP server that performs retrieval against an Elasticsearch database using hybrid search. The MCP server works fine when added to Claude Desktop, and when run in the MCP Inspector using mcp dev mcp_server.py. Here's how I've added it to Claude Desktop (I have replaced my uv directory, GCP credentials and project name):

{
    "mcpServers": {
      "filesystem": {
        "command": "uv",
        "args": [
          "--directory",
          "my-directory",
          "run",
          "mcp_server.py"
        ],
        "env": {
          "GOOGLE_APPLICATION_CREDENTIALS": "(...)/application_default_credentials.json",
          "GOOGLE_CLOUD_PROJECT": "my-gcp-project"
        }
      }
    }
}
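
To rule out differences between the two launch paths, the same config entry could be reused when starting the server from a Python client. The following is a minimal sketch using only the standard library; `load_server_entry` is a hypothetical helper name, and the JSON is the redacted entry shown above.

```python
import json

# The Claude Desktop entry from the post, with its redacted values kept as-is.
CLAUDE_DESKTOP_CONFIG = """
{
    "mcpServers": {
      "filesystem": {
        "command": "uv",
        "args": ["--directory", "my-directory", "run", "mcp_server.py"],
        "env": {
          "GOOGLE_APPLICATION_CREDENTIALS": "(...)/application_default_credentials.json",
          "GOOGLE_CLOUD_PROJECT": "my-gcp-project"
        }
      }
    }
}
"""


def load_server_entry(config_text: str, name: str) -> dict:
    """Return the command, args and env of one server entry (hypothetical helper)."""
    entry = json.loads(config_text)["mcpServers"][name]
    return {
        "command": entry["command"],
        "args": list(entry.get("args", [])),
        "env": dict(entry.get("env", {})),
    }


entry = load_server_entry(CLAUDE_DESKTOP_CONFIG, "filesystem")
print(entry["command"], entry["args"])
```

The resulting dictionary has the same three fields (command, args, env) that the client script passes to StdioServerParameters, so the two launch configurations can be compared side by side. For instance, the Claude Desktop entry passes --directory to uv, whereas the script's args do not.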

However, when launching the MCP server locally from a Python client over the "stdio" transport, the connection hangs. The MCP server performs its task, but for some reason the results never reach the LLM API.

To Reproduce

I run the following script based on the docs:

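import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from google import genai

# Assumption: the API key is exported as the GEMINI_API_KEY environment variable
GEMINI_API_KEY = os.environ["GEMINI_API_KEY"]

client = genai.Client(api_key=GEMINI_API_KEY)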

# Create server parameters for stdio connection
server_params = StdioServerParameters(
    command="/opt/homebrew/bin/uv",  # Executable
    args=["run", "mcp_server.py"],  # MCP Server
    env={
        "GOOGLE_APPLICATION_CREDENTIALS": "(...)/application_default_credentials.json",
        "GOOGLE_CLOUD_PROJECT": "my-gcp-project",
    },
)

async def run():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Prompt
            prompt = "This is a test"
            # Initialize the connection between client and server
            await session.initialize()
            # Send request to the model with MCP function declarations
            response = await client.aio.models.generate_content(
                model="gemini-2.0-flash",
                contents=prompt,
                config=genai.types.GenerateContentConfig(
                    temperature=0,
                    tools=[session],  # uses the session, will automatically call the tool
                    # Uncomment if you **don't** want the sdk to automatically call the tool
                    # automatic_function_calling=genai.types.AutomaticFunctionCallingConfig(
                    #     disable=True
                    # ),
                ),
            )
            print(response.text)

# Start the asyncio event loop and run the main function
asyncio.run(run())

This is the output:

(env) nielsrogge@Nielss-MacBook-Air env % uv run test.py
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
INFO:app.implementations.retrievers.python_retriever:Search method: hybrid
INFO:app.implementations.retrievers.python_retriever:Initial query given to the retriever: test
INFO:elastic_transport.transport:POST <my-elastic-search-endpoint> [status:200 duration:0.316s]
INFO:app.implementations.vector_databases.elastic_search_vector_database:ES response time via 'took' (ms): 125
INFO:app.implementations.vector_databases.elastic_search_vector_database:ES response time (s): 0.4781339168548584
INFO:app.implementations.retrievers.python_retriever:Number of results: 30

After that, it hangs: the LLM never returns a result.

Expected behavior

I would expect a smooth response from the LLM, based on the tool call to the MCP server.

The sample code snippet from the docs works fine, but the same code hangs with my MCP server (even though the server works fine in Claude Desktop as well as in the MCP Inspector).

Device:

  • OS: MacBook Air, Apple M3 chip, macOS 14.7.2
  • IDE: Cursor
  • Python 3.12.9

NielsRogge commented May 27, 2025

Update: a colleague of mine had the same issue on their laptop, so this seems to be an issue related to the MCP Python SDK (or the Gemini integration).

philschmid commented

Does calling session.call_tool directly work, without any LLM interaction?

NielsRogge commented

No, I debugged this with Claude-4-Sonnet:

2. Testing direct tool call...
INFO:__main__:Testing direct tool call...
INFO:__main__:Starting MCP server using stdio transport
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
INFO:__main__:Available tools: ['retrieve_articles']
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
INFO:__main__:Starting retrieval for query: test
INFO:__main__:Calling retriever.retrieve_articles...
INFO:app.implementations.retrievers.python_retriever:Search method: hybrid
INFO:app.implementations.retrievers.python_retriever:Initial query given to the retriever: test
INFO:elastic_transport.transport:POST search [status:200 duration:1.430s]
INFO:app.implementations.vector_databases.elastic_search_vector_database:ES response time via 'took' (ms): 1302
INFO:app.implementations.vector_databases.elastic_search_vector_database:ES response time (s): 1.5721080303192139
INFO:app.implementations.retrievers.python_retriever:Number of results: 30
INFO:__main__:Retrieved 5 articles
INFO:__main__:Returning result of length: 718 characters

It retrieves 5 articles as there's also reranking involved after hybrid search, but then hangs at this line:

result = await asyncio.wait_for(
    session.call_tool("retrieve_articles", {"query": "test"}),
    timeout=60.0,
)
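
For context on what the wrapper above does when the awaited call never completes: asyncio.wait_for cancels the inner awaitable and raises a timeout error once the deadline passes. A minimal sketch with a stand-in coroutine; `never_replies` and `call_with_timeout` are hypothetical names, and the short timeout is only for illustration.

```python
import asyncio


async def never_replies() -> str:
    """Stand-in for a request whose response never arrives (hypothetical)."""
    await asyncio.Event().wait()  # nothing ever sets the event, so this waits forever
    return "unreachable"


async def call_with_timeout(timeout: float) -> str:
    """Await the stand-in request, converting a hang into a reportable outcome."""
    try:
        return await asyncio.wait_for(never_replies(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"


outcome = asyncio.run(call_with_timeout(0.1))
print(outcome)
```

With the 60-second timeout used above, a call that never receives its response would be expected to surface as a timeout error after a minute rather than block indefinitely.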

The MCP server itself is proprietary; it just returns articles from an Elasticsearch database as one long string. It works in Claude Desktop and with the MCP Inspector, so I assume there is something wrong with the "stdio" implementation.
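
One thing that may be worth ruling out, though nothing in this thread confirms it as the cause: with the stdio transport, the server's stdout carries the protocol messages, so a stray write to stdout from the server process (a print call, or a logger configured to write there) can interfere with the stream. Below is a minimal sketch, using only the standard library, of keeping logs on stderr and checking whether a piece of code writes to stdout. `configure_stderr_logging` and `stdout_noise` are hypothetical helper names, not part of the MCP SDK.

```python
import contextlib
import io
import logging
import sys


def configure_stderr_logging(level: int = logging.INFO) -> None:
    """Send all log records to stderr so stdout stays free for protocol messages."""
    logging.basicConfig(stream=sys.stderr, level=level, force=True)


def stdout_noise(func) -> str:
    """Run func and return whatever it wrote to stdout (ideally an empty string)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


configure_stderr_logging()

# A log call goes to stderr, so nothing is captured from stdout.
log_noise = stdout_noise(lambda: logging.getLogger("demo").info("retrieval done"))

# A bare print writes to stdout, which would share the channel with protocol messages.
print_noise = stdout_noise(lambda: print("retrieval done"))
```

Wrapping the retrieval function of a server in a check like `stdout_noise` is one way to see whether any dependency prints to stdout during a tool call.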
