Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Obs AI Assistant] Add test for get_dataset_info #213231

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

sorenlouv
Copy link
Member

@sorenlouv sorenlouv commented Mar 5, 2025

  • Add API test for get_dataset_info
  • Add apache synthtrace scenario

@sorenlouv sorenlouv changed the title Add test for get_dataset_info and synthtrace scenario for generating apache logs [Obs AI Assistant] Add test for get_dataset_info and synthtrace scenario for generating apache logs Mar 5, 2025
@sorenlouv sorenlouv marked this pull request as ready for review March 5, 2025 13:34
@sorenlouv sorenlouv requested review from a team as code owners March 5, 2025 13:34
@sorenlouv sorenlouv changed the title [Obs AI Assistant] Add test for get_dataset_info and synthtrace scenario for generating apache logs [Obs AI Assistant] Add test for get_dataset_info Mar 5, 2025
@botelastic botelastic bot added ci:project-deploy-observability Create an Observability project Team:Obs AI Assistant Observability AI Assistant Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team labels Mar 5, 2025
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)

@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

Copy link
Contributor

github-actions bot commented Mar 5, 2025

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@sorenlouv sorenlouv added release_note:skip Skip the PR/issue when compiling release notes v9.0.0 backport:version Backport to applied version labels v9.1.0 v8.19.0 labels Mar 5, 2025
};
}
try {
const name = indexPattern === '' ? ['*', '*:*'] : `${indexPattern.split(',')}*`;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgieselaar As discussed I've added a trailing wildcard. I think this is better although this still doesn't find zookeeper logs if the user asks "please show me my zookeeper logs" and they use datastreams with default naming, eg. indices will be something like .ds-logs-zookeeper.1-default-2025.03.04-000001 and the data stream will be logs-zookeeper.1-default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To handle that we'd need both a leading and trailing wildcard (*${indexPattern.split(',')}*). I wouldn't mind that but you had some concerns?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can:

  • first execute with the raw value for index (let's say logs)
  • if no indices are resolved, execute it with logs*
  • if still no indices are resolved, execute it with *logs*

we can also always add a *:... additionally.

_resolve/index should be relatively fast so I'm not worried about performance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_resolve/index should be relatively fast so I'm not worried about performance.

That's also my thinking so in that case, why not just do *logs* up front to avoid the complexity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from code complexity it could also be confusing to users how it sometimes includes some indices depending on what other indices they have in their system, eg. if they have a datastream called zookeeper-legacy-metrics and another logs-zookeeper.1-default it will only include their logs datastream if they delete the legacy datastream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also my thinking so in that case, why not just do logs up front to avoid the complexity?

Because you get stuff like backing indices. But if we can exclude those (I think we can) it's probably fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also my thinking so in that case, why not just do logs up front to avoid the complexity?

starting with a more-specific index pattern is not about performance, it's about making sure we don't send a lot of noise to the LLM. We could also filter progressively client side if that makes things easier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you get stuff like backing indices. But if we can exclude those (I think we can) it's probably fine.

When would that be a problem? Say, in the example with "zookeeper" can you imagine an index that includes "zookeeper" but should not be included?

Copy link
Member Author

@sorenlouv sorenlouv Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT of this?

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That way we do not match backing indices.

@elasticmachine
Copy link
Contributor

elasticmachine commented Mar 5, 2025

💚 Build Succeeded

  • Buildkite Build
  • Commit: 61e5197
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-213231-61e5197da7a2

Metrics [docs]

✅ unchanged

History

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:version Backport to applied version labels ci:project-deploy-observability Create an Observability project release_note:skip Skip the PR/issue when compiling release notes Team:Obs AI Assistant Observability AI Assistant Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team v8.19.0 v9.0.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants