Add metadata to tracing #375
Conversation
@devjpt23 we need to address two process-related items for contributing to Kai.
Also, when you are ready for this PR to be tested and reviewed, please convert it out of 'Draft'.
Confirmed with IBM BAM that I'm seeing token usage data:
$ cat token_usage.json
{
"prompt_tokens": 6427,
"completion_tokens": 1244,
"total_tokens": 7671,
"input_token_count": 6427,
"generated_token_count": 1244
}
For Bedrock using Claude 3.5, there is no token data; we catch the exception and the logs show a warning:
WARNING - 2024-09-20 15:05:13,937 - kai.service.kai_application.kai_application - [kai_application.py:189 - get_incident_solutions_for_file()] - Key does not exist in the dictionary: 'token_usage'
I think we may need to experiment with a few other providers and see if/how they report back token usage.
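For context, a minimal sketch of the kind of defensive lookup the warning above implies. It assumes the provider response exposes its metadata as a plain dict containing a "token_usage" key (as in LangChain-style responses); the function name and signature are illustrative, not the actual Kai code.
import logging

log = logging.getLogger(__name__)

def extract_token_usage(response_metadata: dict) -> dict | None:
    try:
        return response_metadata["token_usage"]
    except KeyError as exc:
        # Some providers (e.g. Bedrock with Claude 3.5) do not populate this key,
        # so we log a warning instead of failing the request.
        log.warning("Key does not exist in the dictionary: %s", exc)
        return None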
@jwmatthews During testing, I found that capturing the entire response metadata via demo mode generated a log file of over 4,000 lines. To keep logging manageable, I considered targeting specific keys within the response.
devjpt23@5A-E1-06-17-C3-51:~/grok-integrate/kai/logs/trace/meta-llama/llama-3-70b-instruct/coolstore/src/main/java/com/redhat/coolstore/model/InventoryEntity.java/single_group/1727334757.1589327/1/0$ wc -l token_usage.json
4025 token_usage.json
If we are comfortable with logging such long responses we can keep this as is; otherwise I can easily limit the output to specific keys. Please let me know your preference for the most efficient logging approach.
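As a hypothetical illustration of the "target specific keys" idea: instead of dumping the full response metadata (4,000+ lines for some providers), keep only a small allow-list of fields before writing the trace file. The key names below are taken from the BAM example earlier in this thread and are not guaranteed to exist for every provider.
import json

WANTED_KEYS = (
    "prompt_tokens",
    "completion_tokens",
    "total_tokens",
    "input_token_count",
    "generated_token_count",
)

def write_token_usage(metadata: dict, path: str) -> None:
    # Keep only the allow-listed keys that the provider actually returned.
    filtered = {key: metadata[key] for key in WANTED_KEYS if key in metadata}
    with open(path, "w") as fp:
        json.dump(filtered, fp, indent=2)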
Capture Number of Tokens in Request and Response
When this PR is merged, you will be able to capture metadata from the responses into the trace directory when you run run_demo.py. The structure will be as follows:
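(The exact tree is not reproduced here; the layout below is inferred from the example path earlier in this thread, so the placeholder names, including the meaning of the trailing numeric directories, are assumptions.)
logs/trace/<model>/<application>/<path/to/source/file>/<batch mode>/<timestamp>/<retry>/<request>/token_usage.json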
This will aid you in debugging by providing detailed metadata and trace information.
Note: The metadata captured varies across different models. This variation reflects the unique characteristics and capabilities of each model, such as token usage, latency, or other performance metrics. For specific details on the metadata differences across various models, please refer to the following link: LLM Metadata Variations.