Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move command requests to card message matrix events and add support for multiple commands per message #2145

Merged
merged 25 commits into from
Mar 10, 2025

Conversation

lukemelia
Copy link
Contributor

@lukemelia lukemelia commented Feb 13, 2025

This PR is breaking in a sense -- previous commands added to matrix rooms will no longer be visible since this PR changes the way we are encoding command requests (tool calls) into matrix events.

The motivation for this change is that tool-call-enabled LLMs have the ability make multiple tool calls in a single response, and our modeling made that quite awkward. This PR make that case normal with full support.

This PR also changes the way that debouncing/throttling works in the aibot. It handles response streaming in better, ensuring an event is sent to matrix at most every 250ms and the response is encoded and delivered in full.

Finally, type-checking and linting in the aibot package is improved in this PR too. It was a little loose before.

UPDATE: This PR has some issues consuming the stream from OpenRouter. I'm investigating. Resolved!

  • add host tests for multiple tool calls
  • add aibot tests for multiple tool calls with results
  • test coverage / implementation for command failures
  • Ensure code TODOs are addressed

Copy link

github-actions bot commented Feb 13, 2025

Host Test Results

    1 files  ±0      1 suites  ±0   25m 25s ⏱️ + 1m 21s
795 tests +2  793 ✔️ +2  2 💤 ±0  0 ±0 
800 runs  +2  798 ✔️ +2  2 💤 ±0  0 ±0 

Results for commit 8b4e305. ± Comparison against base commit 0e42fb1.

♻️ This comment has been updated with latest results.

@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch 2 times, most recently from b762a09 to 8d52ce0 Compare February 14, 2025 04:07
@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch from 8d52ce0 to c7a2bd6 Compare February 14, 2025 04:31
@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch from ddeea6e to 1f5b5c6 Compare March 3, 2025 16:21
@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch from 1f5b5c6 to 9979d51 Compare March 3, 2025 16:56
@lukemelia lukemelia changed the title WIP move tool calling to boxel message events and add support for multiple tool calls/commands Move command requests to card message matrix events and add support for multiple commands per message Mar 4, 2025
@lukemelia lukemelia requested review from IanCal and a team March 4, 2025 17:46
@lukemelia lukemelia marked this pull request as ready for review March 4, 2025 17:46
Copy link
Contributor

@backspace backspace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m approving after having read this over with the caveat that I couldn’t get it running locally, the bot was crashing like this:

/Users/b/Documents/Cardstack/Code/boxel-motion/packages/ai-bot/helpers.ts:196
      let { attachedCardsEventIds } = event.content.data;
            ^
TypeError: Cannot destructure property 'attachedCardsEventIds' of 'event.content.data' as it is undefined.
    at constructHistory (/Users/b/Documents/Cardstack/Code/boxel-motion/packages/ai-bot/helpers.ts:196:13)

Is this the breaking change of the PR description, do I need to clear my Matrix history?

@lukemelia
Copy link
Contributor Author

Is this the breaking change of the PR description, do I need to clear my Matrix history?

It's not supposed to crash. Thanks for the error message. I'll see if I can fix it.

@lukemelia lukemelia requested a review from backspace March 5, 2025 18:37
@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch from 2fda108 to c5ec57a Compare March 5, 2025 18:42
@IanCal
Copy link
Contributor

IanCal commented Mar 5, 2025

I ran this and managed to get 3.5 sonnet to return two calls for a command on a skill - I noticed there wasn't a description with the options, it replied when the first one was applied (unsure if those two were intentional here, I know we spoke about different options) and it didn't like sending the reaction event for the second command

image

Looks like we are using m.annotation, and while that doesn't seem to be against the spec it's flagged by synapse as two reactions -
image

@IanCal
Copy link
Contributor

IanCal commented Mar 6, 2025

replacing m.annotation with a custom thing seems to work for that issue.

@jurgenwerk
Copy link
Contributor

image image image

@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch 2 times, most recently from 8795782 to 03c51ea Compare March 6, 2025 17:00
@lukemelia lukemelia force-pushed the cs-7993-support-multiple-tool-calls-per-ai-response branch from 03c51ea to 53a898d Compare March 6, 2025 17:36
@backspace
Copy link
Contributor

I copied Matic’s example and it seems to have worked:

California Beach Volleyball Sunset, Beach Decor, Beach Wall Art 2025-03-06 13-14-56

For that screenshot I pressed “Apply” on the first. The blankness in the response seems confusing but I did inspect the Matrix message and saw the command was properly in an array.

Is there a more thorough way to exercise this, to get multiple commands in a response?

@lukemelia
Copy link
Contributor Author

Is there a more thorough way to exercise this, to get multiple commands in a response?

Interesting -- that prompt was enough to get multiple commands in one message for me

@jurgenwerk
Copy link
Contributor

jurgenwerk commented Mar 7, 2025

I tested this again and I'm a little confused about how multiple commands per message should work. I would expect there would be multiple commands within the same message, but when I test it, the commands will be in multiple messages. Is this expected? Or is there an issue with the test prompt?

Here is my prompt:
image

@jurgenwerk
Copy link
Contributor

Should there be a description of the action to be applied?

image

@lukemelia lukemelia merged commit 160dab0 into main Mar 10, 2025
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants