feat: Allow OpenAI client config in OpenAIChatGenerator and AzureOpenAIChatGenerator #9215


Open · wants to merge 9 commits into main
Conversation

@Amnah199 (Contributor) commented Apr 10, 2025

Related Issues

Proposed Changes:

Currently, users cannot configure the HTTP client used in these chat generators. To solve this:

  • Add an `http_client_kwargs` parameter to store HTTP client configuration options. These options are used in the `__init__` method to construct the client and can be safely serialized and deserialized.
  • Add a util function `init_http_client` that can be reused to configure the OpenAI client across different components.
  • Unrelated, but I took this chance to update the Azure embedders to use the new util function.

How did you test it?

  • Added new tests
  • Updated the existing tests

Notes for the reviewer

Similar to this PR

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

github-actions bot added the topic:tests and type:documentation labels on Apr 10, 2025

@coveralls (Collaborator) commented Apr 11, 2025

Pull Request Test Coverage Report for Build 14402617714

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 46 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.05%) to 90.327%

Files with Coverage Reduction:

| File | New Missed Lines | % |
| --- | --- | --- |
| components/generators/chat/types/protocol.py | 1 | 83.33% |
| components/generators/chat/azure.py | 4 | 92.73% |
| components/agents/agent.py | 5 | 95.28% |
| components/evaluators/llm_evaluator.py | 5 | 95.35% |
| components/extractors/named_entity_extractor.py | 7 | 64.52% |
| components/generators/chat/openai.py | 7 | 96.07% |
| components/extractors/llm_metadata_extractor.py | 17 | 71.43% |

Totals:
Change from base Build 14378403767: -0.05%
Covered Lines: 10674
Relevant Lines: 11817

💛 - Coveralls

@Amnah199 Amnah199 marked this pull request as ready for review April 11, 2025 11:59
@Amnah199 Amnah199 requested review from a team as code owners April 11, 2025 11:59
@Amnah199 Amnah199 requested review from dfokina and vblagoje and removed request for a team April 11, 2025 11:59
@anakin87 (Member) left a comment:

I like the approach...

I suggest creating a test module in test/test_utils/test_http_client.py with a few tests: test_init_http_client, test_http_client_kwargs_type_validation, and test_http_client_kwargs_with_invalid_params.

After doing that, we might also remove the specific tests from the components - I'll leave that up to you.

```
@@ -128,6 +131,8 @@ def __init__(  # pylint: disable=too-many-positional-arguments
        the schema provided in the `parameters` field of the tool definition, but this may increase latency.
    :param azure_ad_token_provider: A function that returns an Azure Active Directory token, will be invoked on
        every request.
    :param http_client_kwargs:
        A dictionary of keyword arguments to configure a custom httpx.Client.
```


Hi @Amnah199!
I want to add similar support to other components, and I think they all should have the same description. I suggest adding documentation similar to the following for :param http_client::



A dictionary of keyword arguments to configure a custom `httpx.Client` for your use case: [proxies](https://www.python-httpx.org/advanced/proxies), [authentication](https://www.python-httpx.org/advanced/authentication) and other [advanced functionalities](https://www.python-httpx.org/advanced/clients) of HTTPX.
You can set a proxy with basic authorization using [the environment variables](https://www.python-httpx.org/environment_variables): `HTTP_PROXY` and `HTTPS_PROXY`, `ALL_PROXY` and `NO_PROXY`, for example `HTTP_PROXY=http://user:password@your-proxy.net:8080`.

```python
    if not isinstance(http_client_kwargs, dict):
        raise TypeError("The parameter 'http_client_kwargs' must be a dictionary.")
    if async_client:
        return httpx.AsyncClient(**http_client_kwargs)
```


I suggest using openai.DefaultAsyncHttpxClient because it provides specific configurations from OpenAI (see here).

@Amnah199 (Contributor, Author) replied:

Makes sense! Thanks @alexengrig for the suggestion.

```python
        raise TypeError("The parameter 'http_client_kwargs' must be a dictionary.")
    if async_client:
        return httpx.AsyncClient(**http_client_kwargs)
    return httpx.Client(**http_client_kwargs)
```


I suggest using openai.DefaultHttpxClient because it provides specific configurations from OpenAI (see here).

A member replied:

My 2 cents.

In case the user wants to provide their own client, I would leave them free to choose the configurations.

@Amnah199 (Contributor, Author) replied:

@anakin87 As per my understanding, DefaultAsyncHttpxClient does not limit the configuration options; it only pre-configures the default parameters to be optimal. The user retains full configurability, as with a regular httpx.AsyncClient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand this point, as a user I would find it unexpected to pass some client kwargs and then get a client that automatically sets others as well.

If we decide to go down this path, I would clearly indicate this in all docstrings.

@anakin87 (Member) left a comment:

I left 2 minor comments.

(There are also merge conflicts.)

```python
    assert http_client is not None
    assert http_client.base_url == "https://example.com"
```


@anakin87 (Member) commented Apr 15, 2025:

Could you please also test the other cases?

  • `init_http_client()` (no parameters)
  • `init_http_client(async_client=True)`

You can find similar tests here

```python
embedder = AzureOpenAITextEmbedder()
client = embedder._init_http_client()
assert client is None

embedder.http_client_kwargs = {"proxy": "http://example.com:3128"}
client = embedder._init_http_client(async_client=False)
assert isinstance(client, httpx.Client)

client = embedder._init_http_client(async_client=True)
assert isinstance(client, httpx.AsyncClient)
```


Labels
topic:tests type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add passing http_client to OpenAI clients
4 participants