Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rsdev 148 api performance #55

Merged

Conversation

rs-fraser
Copy link
Contributor

This PR improves the performance of:

  • api search
  • workspace 'View all' (equivalent of an empty - i.e. returning all results - search)

Previously in both of the above features, we would first limit the list of records which could be searched in various ways, including by permissions and filtering.

This caused slow performance in cases such as on pangolin85 as sysadmin or for large customer installs, where there are lots of documents.

On the UI search which performs well, we first search all of the indexed documents, then apply any filters or permission-based view rights to the smaller list of search results.

In addition, in the case of an empty search or 'View All' on the workspace where we list all documents, we would first of all loop through each of the users which the subject user can view (e.g. members of a PI subjects groups, or everyone for a sysadmin) and perform a db call for each to get their documents. Instead, the db call now takes a list of user ids and performs an IN query so that there only needs to be 1 db call.

The above changes have improved performance on pangolin85, operating as sysadmin, as follows:

  • search via api for specific term (chemical):
    • from ~30s to under 2s (i.e. not logged as slow call in the logs, where the limit is set to 2s)
  • 'View All' in the workspace or empty search:
    • from ~30s to ~6s

There is still probably further we could go with this as in the case of an empty (i.e. all records) on an instance with 100k records in scope this will still be slow, but hopefully won't time out. I think another avenue to address would be the way we handle permissions checking.

Additionally, there are a bunch of code cleanup changes included in the files which I touched.

…ce hit, only filter in the case where the viewable user of the user performing a search is a PI. In this case, the searching user should only see docs shared with the group, as oppose to all of the group member PIs work, which is the case for regular users. Also some code cleanup.
… user and performing a sql query for each when getting all of a users records. apply search terms to records list before determining which records can be viewed by the subject user to increase performance. code cleanup.
…is a pi with single check after getting all records using new query which takes a list of user ids. only checks permissions if the owning user is a pi which is the only case where the subject user may not be allowed to view.
@rs-fraser
Copy link
Contributor Author

@mKowalski256 @rs-nicof I am off this coming week. I've tested out the changes I made in response to the inefficiency @rs-nicof pointed out yesterday on pangolin and all seems to be working fine. Please feel free to make any changes and merge this PR when you guys are happy with it.

Copy link
Contributor

@rs-nicof rs-nicof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

il looks ok to me!

Comment on lines +194 to +206
if (srchConfig.getFilters().isSharedFilter()) {
List<BaseRecord> sharedRecords =
recordGroupSharingDao.getSharedRecordsWithUser(srchConfig.getAuthenticatedUser());
resultList.retainAll(sharedRecords);
}

if (srchConfig.getFilters().isFavoritesFilter()) {
List<BaseRecord> userFavourites =
recordUserFavoritesDao.getFavoriteRecordsByUser(
srchConfig.getAuthenticatedUser().getId());
resultList.retainAll(userFavourites);
}

Copy link
Contributor

@mKowalski256 mKowalski256 Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to re-apply filtering that was previously handled by SearchManager.searchWorkspaceRecords(), right? Looking through the filters that used to be handled there, one missing here is ontology filter. You can actually see that filtering by ontology files no longer works after UI search on the branch (i.e. after applying 'ontology files' filter the result list shows all documents, not just ontologies).

Copy link
Contributor

@mKowalski256 mKowalski256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mKowalski256 mKowalski256 merged commit aa4b282 into rspace-os:main Jul 15, 2024
2 checks passed
@mKowalski256 mKowalski256 deleted the rsdev-148-api-performance branch July 15, 2024 20:29
@github-actions github-actions bot locked and limited conversation to collaborators Jul 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants