[Obs AI Assistant] Improve error handling in the evaluation framework #212991

Open · wants to merge 6 commits into main
@@ -11,13 +11,35 @@ import {
MessageRole as InferenceMessageRole,
} from '@kbn/inference-common';
import { generateFakeToolCallId } from '@kbn/inference-plugin/common';
import type { Logger } from '@kbn/logging';
import { Message, MessageRole } from '.';

export function convertMessagesForInference(messages: Message[]): InferenceMessage[] {
export function convertMessagesForInference(
messages: Message[],
logger: Pick<Logger, 'error'>
): InferenceMessage[] {
const inferenceMessages: InferenceMessage[] = [];

messages.forEach((message) => {
if (message.message.role === MessageRole.Assistant) {
let parsedArguments;
if (message.message.function_call?.name) {
Contributor Author
The idea of this change is to not validate the function-call arguments JSON here, because any errors in the function call will be caught in the function-validation step.

For example:
When the LLM hallucinates the function name/arguments, such as:

{
  "content": "",
  "function_call": {
    "name": "elasticsearchelasticsearch",
    "arguments": "{\"method\": \"POST\", \"path\": \"/testing_ai_assistant/_doc\", \"body\": {\"type\": \"alert\", \"message\": \"This test is for alerts\"}}{\"method\": \"POST\", \"path\": \"/testing_ai_assistant/_doc\", \"body\": {\"type\": \"esql\", \"message\": \"This test is for esql\"}}",
    "trigger": "assistant"
  },
  "role": "assistant"
}

At present, because the arguments string contains two concatenated JSON objects and is therefore invalid JSON, this is the error thrown:

ERROR ChatCompletionError: SyntaxError: Unexpected non-whitespace character after JSON at position 56
          at Object.next (throw_serialized_chat_completion_errors.ts:29:17)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/tap.ts:189:31
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:85:24
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at Observable._subscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/observable/innerFrom.ts:78:18)
          at Observable._trySubscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:244:19)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:234:18
          at Object.errorContext (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/util/errorContext.ts:29:5)
          at Observable.subscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:220:5)
          at doInnerSub (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:71:40)
          at outerNext (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:53:58)
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/switchMap.ts:115:42
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at processStream (stream_into_observable.ts:17:20)
          at processTicksAndRejections (node:internal/process/task_queues:95:5) {
        code: 'internalError',
        meta: {}
      }

With this change (by not propagating the invalid JSON error), we can see the actual error in the terminal:

ERROR ChatCompletionError: Could not find tool call request for elasticsearchelasticsearch
          at Object.next (throw_serialized_chat_completion_errors.ts:29:17)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/tap.ts:189:31
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:85:24
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at Observable._subscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/observable/innerFrom.ts:78:18)
          at Observable._trySubscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:244:19)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:234:18
          at Object.errorContext (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/util/errorContext.ts:29:5)
          at Observable.subscribe (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Observable.ts:220:5)
          at doInnerSub (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:71:40)
          at outerNext (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/mergeInternals.ts:53:58)
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.Subscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at /Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/switchMap.ts:115:42
          at OperatorSubscriber._this._next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts:70:13)
          at OperatorSubscriber.next (/Users/viduni/Workspace/Elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:75:12)
          at processStream (stream_into_observable.ts:17:20)
          at processTicksAndRejections (node:internal/process/task_queues:95:5) {
        code: 'internalError',
        meta: {}
      }
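
As a rough sketch of the new behavior (a hypothetical Jest test, not part of this PR — the import paths, the Message literal, and the logger mock are approximations of the real types), malformed arguments should now be logged and swallowed instead of thrown:

// Hypothetical unit test — adjust import paths to the actual module locations.
import { convertMessagesForInference, Message, MessageRole } from '.';

it('does not throw when function_call.arguments is invalid JSON', () => {
  const logger = { error: jest.fn() };

  const assistantMessage = {
    '@timestamp': new Date().toISOString(),
    message: {
      role: MessageRole.Assistant,
      content: '',
      function_call: {
        name: 'elasticsearchelasticsearch',
        // two concatenated JSON objects, so JSON.parse throws
        arguments: '{"method": "POST"}{"method": "POST"}',
        trigger: MessageRole.Assistant,
      },
    },
  } as Message;

  // The parse error is logged and swallowed; the tool call falls back to empty arguments.
  expect(() => convertMessagesForInference([assistantMessage], logger)).not.toThrow();
  expect(logger.error).toHaveBeenCalledTimes(1);
});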

try {
parsedArguments = message.message.function_call?.arguments
? JSON.parse(message.message.function_call.arguments)
: {};
} catch (error) {
logger.error(
`Failed to parse function call arguments when converting messages for inference: ${error}`
);
// If the LLM returns invalid JSON, it is likely because it is hallucinating
// the function. We don't want to propagate the error about invalid JSON here.
// Any errors related to the function call will be caught when the function and
// its arguments are validated.
return {};
}
}

inferenceMessages.push({
role: InferenceMessageRole.Assistant,
content: message.message.content ?? null,
@@ -27,7 +49,7 @@ export function convertMessagesForInference(messages: Message[]): InferenceMessage[] {
{
function: {
name: message.message.function_call.name,
arguments: JSON.parse(message.message.function_call.arguments || '{}'),
arguments: parsedArguments || {},
},
toolCallId: generateFakeToolCallId(),
},
@@ -486,15 +486,14 @@ export class ObservabilityAIAssistantClient {
const options = {
connectorId,
system: systemMessage,
messages: convertMessagesForInference(messages),
messages: convertMessagesForInference(messages, this.dependencies.logger),
toolChoice,
tools,
functionCalling: (simulateFunctionCalling ? 'simulated' : 'auto') as FunctionCallingMode,
};

this.dependencies.logger.debug(
() =>
`Calling inference client with for name: "${name}" with options: ${JSON.stringify(options)}`
() => `Calling inference client for name: "${name}" with options: ${JSON.stringify(options)}`
);

if (stream) {
@@ -328,10 +328,10 @@ export class KibanaClient {
}

if (error.message.includes('Status code: 429')) {
that.log.info(`429, backing off 20s`);

return timer(20000);
that.log.info(`429, backing off 30s`);
return timer(30000);
}

that.log.info(`Retrying in 5s`);
return timer(5000);
},
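
For reference, this retry logic follows the standard rxjs retry-with-delay pattern; below is a standalone sketch of the same backoff behavior (callKibana and log are placeholders, not the actual KibanaClient internals):

import { defer, of, retry, timer } from 'rxjs';

// Placeholders standing in for the real KibanaClient internals.
const log = { info: (msg: string) => console.log(msg) };
const callKibana = () => of({ status: 200 }); // stub; the real client issues an HTTP request

const response$ = defer(() => callKibana()).pipe(
  retry({
    delay: (error: Error) => {
      if (error.message.includes('Status code: 429')) {
        // Rate limited: back off for 30 seconds before the next attempt.
        log.info('429, backing off 30s');
        return timer(30000);
      }
      // Any other error: retry after 5 seconds.
      log.info('Retrying in 5s');
      return timer(5000);
    },
  })
);

response$.subscribe();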
@@ -96,7 +96,7 @@ describe('Knowledge base', () => {
const conversation = await chatClient.complete({ messages: prompt });

const result = await chatClient.evaluate(conversation, [
'Uses KB retrieval function to find information about the Quantum Revectorization Engine',
'Uses context function response to find information about the Quantum Revectorization Engine',
'Correctly identifies Dr. Eliana Stone at Acme Labs in 2023 as the inventor',
'Accurately describes that it reorders the subatomic structure of materials and can transform silicon wafers into superconductive materials',
'Does not invent unrelated or hallucinated details not present in the KB',
Expand All @@ -111,7 +111,7 @@ describe('Knowledge base', () => {
const conversation = await chatClient.complete({ messages: prompt });

const result = await chatClient.evaluate(conversation, [
'Uses KB retrieval function to find the correct document about QRE constraints',
'Uses context function response to find the correct document about QRE constraints',
'Mentions the 2 nanometer limit on the revectorization radius',
'Mentions that specialized fusion reactors are needed',
'Does not mention information unrelated to constraints or energy (i.e., does not mention the inventor or silicon wafer transformation from doc-invention-1)',
@@ -103,6 +103,7 @@ export function registerQueryFunction({
};
}
);

functions.registerFunction(
{
name: QUERY_FUNCTION_NAME,
@@ -129,7 +130,8 @@
connectorId,
messages: convertMessagesForInference(
// remove system message and query function request
messages.filter((message) => message.message.role !== MessageRole.System).slice(0, -1)
messages.filter((message) => message.message.role !== MessageRole.System).slice(0, -1),
resources.logger
),
logger: resources.logger,
tools: Object.fromEntries(