feat: avoid slow reindex of studio content during init #1174

Merged (1 commit) on Dec 9, 2024

bradenmacdonald

Now that openedx/edx-platform#35981 has merged to Sumac, solving openedx/modular-learning#235, we can update the command that Tutor uses to populate the initial Studio search index.

What this means:

  • Brand new instances will get a working index and can start using the features right away. The init during instance setup will be nearly instant.
  • Upgraded instances with a small amount of content will see a notice during init that they need to run reindex_studio manually. The index will be created automatically during init, but it will be incomplete (existing content will be absent, but new content will show up in it). The init during instance setup will be nearly instant, and the reindex command will take a few minutes when run manually.
  • Upgraded instances with a huge amount of content will see a notice during init that they need to run reindex_studio manually. The index will be created automatically during init, but it will be incomplete (existing content will be absent, but new content will show up in it). The init during instance setup will be nearly instant, and the reindex command could take up to several days when run manually. It can be interrupted and resumed as needed.
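
The three cases above reduce to one rule: init always creates the index, and whether that index is complete depends only on whether the instance already has content. Below is a minimal sketch of that rule. It is an illustration, not the edx-platform implementation, and the names `InitOutcome` and `studio_index_init_outcome` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitOutcome:
    """Hypothetical summary of what init leaves behind (illustration only)."""
    index_created: bool          # init creates the index in every case
    index_complete: bool         # True only when there was nothing to backfill
    manual_reindex_needed: bool  # True when init prints the reindex_studio notice


def studio_index_init_outcome(existing_content_items: int) -> InitOutcome:
    """Model the outcomes described in the list above.

    Assumption: the only input that matters is whether the instance already
    has content; the amount of content affects how long the manual reindex
    takes, not what init itself does.
    """
    if existing_content_items < 0:
        raise ValueError("existing_content_items must be non-negative")
    has_existing_content = existing_content_items > 0
    return InitOutcome(
        index_created=True,
        index_complete=not has_existing_content,
        manual_reindex_needed=has_existing_content,
    )


if __name__ == "__main__":
    # Brand new instance, small upgraded instance, huge upgraded instance.
    for items in (0, 200, 5_000_000):
        print(items, studio_index_init_outcome(items))
```

In this model, small and huge upgraded instances get the same result from init. They differ only in how long the follow-up manual reindex takes.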

Compare this to the situation before this change:

  • Brand new instances will get a working index and can start using the features right away. The init during instance setup will be nearly instant.
  • Upgraded instances with a small amount of content will be able to start using the features right away. The init during instance setup will take a few minutes.
  • Upgraded instances with a huge amount of content will hit a wall when the init process tries to run reindex_studio: it takes several days and will likely be terminated, for one reason or another, before it completes.

FYI @regisb @DanielVZ96 @ChrisChV @ormsbee

# index (resulting in broken features until it is complete). If either of those
# are necessary, it will print instructions on what command to run to do so.
./manage.py cms reindex_studio --experimental --init
# Create the courseware content index
./manage.py cms reindex_course --active

What about course content? Will it also not replace existing indices?


I'm not sure about the reindex_course command - I haven't been making any changes to the learner-facing courseware search. So it should work the same way it always has, but presumably using Meilisearch instead of Elasticsearch. I think this is something we can improve in future releases.

@regisb regisb left a comment


Thanks for the PR!

@regisb regisb merged commit 308e453 into overhangio:sumac Dec 9, 2024
@bradenmacdonald bradenmacdonald deleted the sumac-reindex-init branch December 9, 2024 22:44