Commit 3d132af

Merge pull request #10 from mutablelogic/dev
Dev
2 parents 4911586 + ff3de03 commit 3d132af

File tree

4 files changed: +313 -167 lines changed

README.md (+1 -167)
@@ -323,7 +323,7 @@ import (
 func add_two_numbers(ctx context.Context, agent llm.Agent) (string, error) {
 	context := agent.Model(ctx, "claude-3-5-haiku-20241022").Context()
 	toolkit := tool.NewToolKit()
-	toolkit.Register(Adder{})
+	toolkit.Register(&Adder{})

 	// Get the tool call
 	if err := context.FromUser(ctx, "What is five plus seven?", llm.WithToolKit(toolkit)); err != nil {
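The only change in this hunk is registering a pointer to the tool rather than a value. One common reason for this in Go (not confirmed by the hunk itself) is that the tool's methods are declared on a pointer receiver, in which case only `*Adder` satisfies the toolkit's tool interface. A minimal, self-contained sketch of that behaviour, using a hypothetical `Tool` interface and `Adder` definition rather than the repository's actual ones:

```go
package main

import "fmt"

// Tool is a stand-in for whatever interface the toolkit expects;
// the repository's real interface may differ.
type Tool interface {
	Name() string
}

// Adder is a hypothetical tool whose method is declared on a pointer receiver.
type Adder struct{}

func (*Adder) Name() string { return "add_two_numbers" }

func main() {
	var t Tool

	// t = Adder{} // compile error: Adder does not implement Tool
	//             // (Name method has a pointer receiver)

	t = &Adder{} // *Adder does implement Tool
	fmt.Println(t.Name())
}
```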
@@ -373,172 +373,6 @@ The translation of field types is as follows:
 * `uint`, `int` - Translates to JSON `integer`
 * `float32`, `float64` - Translates to JSON `number`

- [167 lines removed: the "Complete and Chat Options", "Embedding Options" and "Older Content" sections, moved verbatim into the new doc/options.md file below]
 ## The Command Line Tool

 You can use the command-line tool to interact with the API. To build the tool, you can use the following command:

doc/options.md (new file, +169 lines)
# Options

This content needs to be reviewed and updated.
## Complete and Chat Options

These are the options you can use with the `Completion` and `Chat` methods.

<table>
<tr>
<th>Ollama</th>
<th>Anthropic</th>
<th>Mistral</th>
<th>OpenAI</th>
<th>Gemini</th>
</tr>

<tr><td colspan="5">
<code>llm.WithTemperature(float64)</code>
What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.7 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>

</table>
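For illustration, here is a sketch of passing completion options to a model call. The `Model` and `Completion` signatures follow the interfaces reproduced under "Older Content" below; the module import path, the package name and the helper function are assumptions rather than something this diff shows:

```go
// Sketch only: the import path and helper function are assumptions;
// the Completion signature matches the Model interface shown below.
package example

import (
	"context"
	"fmt"

	llm "github.com/mutablelogic/go-llm" // assumed module path
)

func completeWithOptions(ctx context.Context, agent llm.Agent) error {
	// Options are applied per call; here a low temperature and a token limit.
	completion, err := agent.Model(ctx, "claude-3-5-haiku-20241022").Completion(
		ctx,
		"Summarise the plot of Hamlet in one sentence.",
		llm.WithTemperature(0.2),
		llm.WithMaxTokens(200),
	)
	if err != nil {
		return err
	}
	fmt.Println(completion)
	return nil
}
```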
## Embedding Options

These are the options you can use with the `Embedding` method.
<table>
<tr>
<th>Ollama</th>
<th>Anthropic</th>
<th>Mistral</th>
<th>OpenAI</th>
<th>Gemini</th>
</tr>

<tr><td colspan="5">
<code>ollama.WithKeepAlive(time.Duration)</code>
Controls how long the model will stay loaded into memory following the request.
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>

<tr><td colspan="5">
<code>ollama.WithTruncate()</code>
Do not truncate the end of each input to fit within the context length; return an error if the context length is exceeded.
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>

<tr><td colspan="5">
<code>ollama.WithOption(string, any)</code>
Set a model-specific option value.
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>

<tr><td colspan="5">
<code>openai.WithDimensions(uint64)</code>
The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>No</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
<td>No</td>
</tr>

<tr><td colspan="5">
<code>llm.WithFormat(string)</code>
The format to return the embeddings in. Can be either <code>float</code> or <code>base64</code>, depending on the provider (see below).
</td></tr>
<tr style="border-bottom: 2px solid black;">
<td>No</td>
<td>No</td>
<td>'float'</td>
<td>'float' or 'base64'</td>
<td>No</td>
</tr>

</table>
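As an illustration, here is a sketch of an embedding call that combines the `Embedding` method with an Ollama-specific option. The import paths and the embedding model name are assumptions, not taken from this diff; the `Embedding` signature matches the `Model` interface under "Older Content" below:

```go
// Sketch only: import paths and the model name are assumptions.
package example

import (
	"context"
	"time"

	llm "github.com/mutablelogic/go-llm"                // assumed module path
	ollama "github.com/mutablelogic/go-llm/pkg/ollama"  // assumed package path
)

func embed(ctx context.Context, agent llm.Agent) ([]float64, error) {
	// Generate an embedding vector, keeping the model loaded for five
	// minutes after the request (an Ollama-specific option).
	return agent.Model(ctx, "nomic-embed-text").Embedding(
		ctx,
		"The quick brown fox jumps over the lazy dog",
		ollama.WithKeepAlive(5*time.Minute),
	)
}
```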
## Older Content

You can add options to sessions, or to prompts. Different providers and models support different options.
```go
package llm

type Model interface {
	// Set session-wide options
	Context(...Opt) Context

	// Create a completion from a text prompt
	Completion(context.Context, string, ...Opt) (Completion, error)

	// Embedding vector generation
	Embedding(context.Context, string, ...Opt) ([]float64, error)
}

type Context interface {
	// Generate a response from a user prompt (with attachments and
	// other options)
	FromUser(context.Context, string, ...Opt) error
}
```
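A brief sketch of the difference between session-wide and per-prompt options, using the interfaces above. The import path is an assumption, the model name follows the earlier README example, and the prompts are illustrative:

```go
// Sketch only: shows session-wide options (passed to Context) versus
// per-prompt options (passed to FromUser).
package example

import (
	"context"

	llm "github.com/mutablelogic/go-llm" // assumed module path
)

func chat(ctx context.Context, agent llm.Agent) error {
	// Session-wide options apply to every prompt sent through the session.
	session := agent.Model(ctx, "claude-3-5-haiku-20241022").Context(
		llm.WithSystemPrompt("You are a concise assistant."),
		llm.WithTemperature(0.2),
	)

	// Per-prompt options apply only to this call.
	return session.FromUser(ctx,
		"List three uses of embeddings.",
		llm.WithMaxTokens(150),
	)
}
```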
The options are as follows:

| Option | Ollama | Anthropic | Mistral | OpenAI | Description |
|--------|--------|-----------|---------|--------|-------------|
| `llm.WithTemperature(float64)` | Yes | Yes | Yes | Yes | What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.7 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. |
| `llm.WithTopP(float64)` | Yes | Yes | Yes | Yes | Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. |
| `llm.WithTopK(uint64)` | Yes | Yes | No | No | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. |
| `llm.WithMaxTokens(uint64)` | No | Yes | Yes | Yes | The maximum number of tokens to generate in the response. |
| `llm.WithStream(func(llm.Completion))` | Can be enabled when tools are not used | Yes | Yes | Yes | Stream the response to a function. |
| `llm.WithToolChoice(string, string, ...)` | No | Use `auto`, `any` or a function name. Only the first argument is used. | Use `auto`, `any`, `none`, `required` or a function name. Only the first argument is used. | Use `auto`, `none`, `required` or a function name. Only the first argument is used. | The tool to use for the model. |
| `llm.WithToolKit(llm.ToolKit)` | Cannot be combined with streaming | Yes | Yes | Yes | The set of tools to use. |
| `llm.WithStopSequence(string, string, ...)` | Yes | Yes | Yes | Yes | Stop generation if one of these tokens is detected. |
| `llm.WithSystemPrompt(string)` | No | Yes | Yes | Yes | Set the system prompt for the model. |
| `llm.WithSeed(uint64)` | Yes | No | Yes | Yes | The seed to use for random sampling. If set, different calls will generate deterministic results. |
| `llm.WithFormat(string)` | Use `json` | No | Use `json_format` or `text` | Use `json_format` or `text` | The format of the response. For Mistral, you must also instruct the model to produce JSON yourself with a system or a user message. |
| `llm.WithPresencePenalty(float64)` | Yes | No | Yes | Yes | Determines how much the model penalizes the repetition of words or phrases. A higher presence penalty encourages the model to use a wider variety of words and phrases, making the output more diverse and creative. |
| `llm.WithFequencyPenalty(float64)` | Yes | No | Yes | Yes | Penalizes the repetition of words based on their frequency in the generated text. A higher frequency penalty discourages the model from repeating words that have already appeared frequently in the output, promoting diversity and reducing repetition. |
| `llm.WithPrediction(string)` | No | No | Yes | Yes | Enables users to specify expected results, optimizing response times by leveraging known or predictable content. This approach is especially effective for updating text documents or code files with minimal changes, reducing latency while maintaining high-quality results. |
| `llm.WithSafePrompt()` | No | No | Yes | No | Whether to inject a safety prompt before all conversations. |
| `llm.WithNumCompletions(uint64)` | No | No | Yes | Yes | Number of completions to return for each request. |
| `llm.WithAttachment(io.Reader)` | Yes | Yes | Yes | - | Attach a file to a user prompt. It is the responsibility of the caller to close the reader. |
| `llm.WithUser(string)` | No | Yes | No | Yes | A unique identifier representing your end-user. |
| `anthropic.WithEphemeral()` | No | Yes | No | - | Attachments should be cached server-side. |
| `anthropic.WithCitations()` | No | Yes | No | - | Attachments should be used in citations. |
| `openai.WithStore(bool)` | No | No | No | Yes | Whether or not to store the output of this chat completion request. |
| `openai.WithDimensions(uint64)` | No | No | No | Yes | The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. |
| `openai.WithReasoningEffort(string)` | No | No | No | Yes | The level of effort the model should put into reasoning. |
| `openai.WithMetadata(string, string)` | No | No | No | Yes | Metadata to be logged with the completion. |
| `openai.WithLogitBias(uint64, int64)` | No | No | No | Yes | A token and its logit bias value. Call multiple times to add additional tokens. |
| `openai.WithLogProbs()` | No | No | No | Yes | Include the log probabilities on the completion. |
| `openai.WithTopLogProbs(uint64)` | No | No | No | Yes | An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. |
| `openai.WithAudio(string, string)` | No | No | No | Yes | Output audio (voice, format) for the completion. Can be used with certain models. |
| `openai.WithServiceTier(string)` | No | No | No | Yes | Specifies the latency tier to use for processing the request. |
| `openai.WithStreamOptions(func(llm.Completion), bool)` | No | No | No | Yes | Include usage information in the stream response. |
| `openai.WithDisableParallelToolCalls()` | No | No | No | Yes | Call tools in serial, rather than in parallel. |
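Finally, a sketch of streaming a response with `llm.WithStream`, which takes a callback that receives partial completions. The import path is an assumption and the prompt is illustrative:

```go
// Sketch only: streams partial completions to a callback.
package example

import (
	"context"
	"fmt"

	llm "github.com/mutablelogic/go-llm" // assumed module path
)

func stream(ctx context.Context, agent llm.Agent) error {
	session := agent.Model(ctx, "claude-3-5-haiku-20241022").Context()

	// Print each partial completion as it arrives. Note that, per the table
	// above, Ollama cannot combine streaming with a toolkit.
	return session.FromUser(ctx,
		"Write a limerick about Go interfaces.",
		llm.WithStream(func(c llm.Completion) {
			fmt.Println(c)
		}),
	)
}
```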
