feat: Add "think" parameter for Ollama #1948
base: main
Conversation
@anakin87 Hi! What's your opinion about storing the thinking output in `ChatMessage._meta.thinking`?
Thanks for working on this...
I agree with your approach. For the moment, putting the thinking output in `ChatMessage._meta.thinking` is reasonable.
Please rebase your branch, fix conflicts, and run tests.
I left some other comments.
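For context, a minimal sketch of what that storage scheme looks like from the caller's side. The `thinking` meta key follows the suggestion above; the message content is illustrative, not the PR's actual implementation:

```python
from haystack.dataclasses import ChatMessage

# Illustrative: the generator would attach the intermediate "thinking" output
# to the reply's meta, keeping it separate from the final answer text.
reply = ChatMessage.from_assistant(
    "The sky appears blue because of Rayleigh scattering.",
    meta={"thinking": "The user asks about sky color; recall Rayleigh scattering..."},
)

# Downstream code can read the reasoning trace without it polluting the text:
print(reply.meta["thinking"])
print(reply.text)
```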
@@ -156,6 +161,7 @@ def __init__(
        url: str = "http://localhost:11434",
        generation_kwargs: Optional[Dict[str, Any]] = None,
        timeout: int = 120,
        think=False,
I would put this new parameter at the end, to make this change non-breaking.
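A sketch of the suggested ordering, with parameter names taken from the diff above (any parameters not shown there are elided, not invented):

```python
from typing import Any, Dict, Optional


class OllamaChatGenerator:
    def __init__(
        self,
        model: str,
        url: str = "http://localhost:11434",
        generation_kwargs: Optional[Dict[str, Any]] = None,
        timeout: int = 120,
        # ... other existing parameters stay in place ...
        think: bool = False,  # appended last, so existing positional calls keep working
    ):
        self.model = model
        self.url = url
        self.generation_kwargs = generation_kwargs or {}
        self.timeout = timeout
        self.think = think
```

Inserting `think` in the middle of the signature would silently shift the meaning of positional arguments for existing callers; appending it avoids that.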
@@ -172,6 +178,8 @@ def __init__(
            [Ollama docs](https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values).
        :param timeout:
            The number of seconds before throwing a timeout error from the Ollama API.
        :param think:
            Enables the model's "thinking" process.
I would expand this explanation to something like
If True, the model will "think" before producing a response.
Only [thinking models](https://ollama.com/search?c=thinking) support this feature.
The intermediate "thinking" output can be found in the `meta` property of the returned `ChatMessage`.
@@ -36,6 +36,7 @@ def __init__(
        template: Optional[str] = None,
        raw: bool = False,
        timeout: int = 120,
        think: bool = False,
We are trying to introduce new features in the Chat Generators only. In the long run, we may deprecate Generators and keep only Chat Generators.
For this reason, I would not introduce support for thinking in Generators.
@@ -508,6 +508,17 @@ def test_run_with_chat_history(self):
            city.lower() in response["replies"][-1].text.lower() for city in ["Manchester", "Birmingham", "Glasgow"]
        )

    @pytest.mark.integration
    def test_live_run_with_thinking(self):
        chat_generator = OllamaChatGenerator(model="qwen3:1.7b", think=True)
To use this model in an integration test, you should also change the following line:
LLM_FOR_TESTS: "llama3.2:3b"
However, I would recommend using qwen3:0.6b if possible: in my experiments, it works quite well with our tests and, being very small, it speeds up download and inference times.
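Combining both suggestions, the integration test might look like the sketch below. It assumes a running Ollama server with `qwen3:0.6b` pulled, and that the thinking output lands in `meta["thinking"]` as discussed earlier in this thread:

```python
import pytest

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.ollama import OllamaChatGenerator


@pytest.mark.integration
def test_live_run_with_thinking():
    # Small thinking-capable model recommended in the review.
    chat_generator = OllamaChatGenerator(model="qwen3:0.6b", think=True)

    response = chat_generator.run([ChatMessage.from_user("Why is the sky blue? Answer briefly.")])

    reply = response["replies"][0]
    assert reply.text  # the final answer
    assert reply.meta.get("thinking")  # the intermediate reasoning trace
```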
Related Issues
Proposed Changes:
Added an additional `think` parameter in accordance with Ollama's current interface, and included a test to cover this feature.

How did you test it?
I added unit tests.
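For reference, recent releases of the `ollama` Python client accept the flag directly, which is presumably what the new parameter forwards to. A sketch of that underlying call (field names follow the client's response objects; the PR's exact wiring may differ):

```python
import ollama

client = ollama.Client(host="http://localhost:11434")

response = client.chat(
    model="qwen3:0.6b",  # assumes a thinking-capable model is pulled locally
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    think=True,
)

print(response.message.thinking)  # intermediate reasoning, only populated when think=True
print(response.message.content)   # final answer: 408
```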
Checklist
The PR title uses one of the conventional commit prefixes: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.