Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use MatchPhraseQuery for bleve code search #33628

Merged
merged 7 commits into from
Feb 25, 2025

Conversation

jpraet
Copy link
Member

@jpraet jpraet commented Feb 17, 2025

Fix regression from #32210 which unintentionally changed the search mode for bleve from MatchPhraseQuery to MatchQuery.

On the main branch, meanwhile with #33590 a "literal code search" mode (by using quotes) was implemented as workaround for this unexpected code search behavior. Maybe that feature needs some redesign as it turns out to have been caused by a regression.

But this PR at least already fixes the regression for 1.23.x

@GiteaBot GiteaBot added this to the 1.23.4 milestone Feb 17, 2025
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Feb 17, 2025
@pull-request-size pull-request-size bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 17, 2025
@github-actions github-actions bot added the modifies/go Pull requests that update Go code label Feb 17, 2025
@jpraet jpraet added issue/regression Indicates a previously functioning feature or behavior that has broken or regressed after a change and removed modifies/go Pull requests that update Go code labels Feb 17, 2025
@github-actions github-actions bot added the modifies/go Pull requests that update Go code label Feb 17, 2025
@jpraet
Copy link
Member Author

jpraet commented Feb 17, 2025

The test fails because the default behavior for bleve now differs from elasticsearch.

@darren
Copy link
Contributor

darren commented Feb 17, 2025

use MatchPhraseQuery would be too strict for cases when search for some words that I don't remember exactly, for the case I raised in #33588, if switch to MatchPhraseQuery searching for swagger template returns no result while the second hit is valid in my opinion:

413142282-528b3659-ea21-470f-bd9e-2b8bf6e9b869

@jpraet
Copy link
Member Author

jpraet commented Feb 17, 2025

My instance was recently upgraded from 1.22 and 1.23 and many users complain about the code search returning way too many irrelevant results in the new version. In 1.22 MatchPhraseQuery was used for bleve. In 1.23 it now uses MatchQuery without any option for the user to switch it back to MatchPhraseQuery.

I don't know what the best course of action is. Maybe your PR #33590 can be backported to 1.23.

@lunny
Copy link
Member

lunny commented Feb 17, 2025

#33590

Maybe we can backport it after

My instance was recently upgraded from 1.22 and 1.23 and many users complain about the code search returning way too many irrelevant results in the new version. In 1.22 MatchPhraseQuery was used for bleve. In 1.23 it now uses MatchQuery without any option for the user to switch it back to MatchPhraseQuery.

I don't know what the best course of action is. Maybe your PR #33590 can be backported to 1.23.

Maybe it's better to backport #33590 ?

@jpraet
Copy link
Member Author

jpraet commented Feb 18, 2025

I feel like returning ~200 matches when searching for swaggerParameterBodies is probably worse than not returning a match when searching for swagger template. But I agree both should work ideally.

GitHub search returns a single match for swaggerParameterBodies, without requiring quotes around it.
If you search for swagger parameter bodies you get 2 matches and not 200 😄.

So as I said: I don't know what the right approach is, but bleve code search feels broken in 1.23.

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Feb 18, 2025
@lunny lunny modified the milestones: 1.23.4, 1.23.5 Feb 19, 2025
@pull-request-size pull-request-size bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 22, 2025
@pull-request-size pull-request-size bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 22, 2025
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Feb 25, 2025
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Feb 25, 2025
@lunny lunny enabled auto-merge (squash) February 25, 2025 19:55
@lunny lunny merged commit e3021fa into go-gitea:release/v1.23 Feb 25, 2025
26 checks passed
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
issue/regression Indicates a previously functioning feature or behavior that has broken or regressed after a change lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. modifies/go Pull requests that update Go code size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants