Deadlock patch 001: Addressing accumulating locks on contributors table on very large instances #3003

Open
wants to merge 5 commits into
base: main


sgoggins
Member

@sgoggins sgoggins commented Feb 17, 2025

Description

  • Deadlock management in Augur generally relies on a policy of progressively shrinking transactions and waiting between retries, in order to:
    • Maximize data gathering throughput
    • Minimize contention and deadlocking
  • Experiments show that this strategy works well, with the occasional exception of deadlocks around the contributors table. Because contributors are logically coupled with every other data gathering process, deadlocks there can quickly cascade and accumulate.
    • This PR applies the general strategy used throughout Augur, tailored to the unique properties of the contributors problem. The snowball of deadlocks has been observed to accelerate quickly on instances of Augur with more than 10k repositories and more than 10 simultaneous data collection processes.
    • The "storm" does not last forever, and so far there is no indication that the in-place retries ultimately fail.
    • Rather, this change is about lowering the load on the database engine so that fewer deadlocks are generated once the "large instance" thresholds are met.
    • Like all programming and data problems, this one is idiosyncratic to the particular repositories and contributors being collected at a given point in time. Early indications are that deadlocks occur more frequently when a repository is both new to Augur and contains more than 1,000 pull requests.
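The "progressive transaction shrinking and waiting" policy described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `insert_with_shrinking`, `DeadlockDetected`, `insert_fn`, and all parameter defaults are hypothetical names chosen for the example. With PostgreSQL and psycopg2, the real error would be something like `psycopg2.errors.DeadlockDetected`, possibly wrapped by the ORM.

```python
import random
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the database driver's deadlock error."""


def insert_with_shrinking(rows, insert_fn, max_retries=5, base_delay=0.1,
                          sleep=time.sleep):
    """Insert rows via insert_fn, halving the batch and backing off on deadlock.

    Returns the number of deadlocks encountered. Re-raises the deadlock error
    once a batch has already been retried max_retries times.
    """
    # Each pending entry is (batch, number of deadlocks already hit on its lineage).
    pending = [(list(rows), 0)]
    deadlocks = 0
    while pending:
        batch, attempt = pending.pop()
        if not batch:
            continue
        try:
            insert_fn(batch)
        except DeadlockDetected:
            deadlocks += 1
            if attempt >= max_retries:
                raise
            # Wait: exponential backoff plus jitter so workers do not retry in lockstep.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
            if len(batch) > 1:
                # Shrink: retry as two smaller transactions, first half first.
                mid = len(batch) // 2
                pending.append((batch[mid:], attempt + 1))
                pending.append((batch[:mid], attempt + 1))
            else:
                pending.append((batch, attempt + 1))
    return deadlocks
```

Smaller batches hold row locks for less time, and the jittered wait spreads retries out, which is the combination the description credits with reducing contention. The sketch assumes `insert_fn` commits or rolls back its own transaction per call.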

Signed-off-by: Sean P. Goggins <s@goggins.com>
@sgoggins sgoggins added labels Feb 17, 2025: database (Related to Augur's unified data model), bug-fix (Fixes a bug), python (Pull requests that update Python code)
@sgoggins sgoggins requested a review from ABrain7710 February 17, 2025 23:19
retries = 0
while retries < max_retries:
    try:
        bulk_insert_dicts(contributors_batch, Contributor, ["cntrb_id"])

[pylint] reported by reviewdog 🐶
E1120: No value for argument 'natural_keys' in function call (no-value-for-parameter)
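For readers unfamiliar with E1120, the sketch below shows one common way it arises: a function takes a leading parameter that the call site omits, so the remaining positional arguments each shift one slot and the last parameter is left unfilled. The signature of `bulk_insert_dicts` is not shown in this thread, so `bulk_insert_sketch` and its leading `logger` parameter are assumptions made purely for illustration.

```python
import logging


def bulk_insert_sketch(logger, data, table, natural_keys):
    """Hypothetical stand-in; the leading logger parameter is an assumption."""
    logger.info("inserting %d rows into %s keyed on %s",
                len(data), table, natural_keys)
    return len(data)


logger = logging.getLogger("sketch")
batch = [{"cntrb_id": 1}, {"cntrb_id": 2}]

# Three positional arguments bind to logger, data, and table, leaving
# natural_keys unfilled. pylint reports this statically as E1120, and
# Python raises TypeError when the call executes.
try:
    bulk_insert_sketch(batch, "contributors", ["cntrb_id"])
except TypeError as exc:
    error_message = str(exc)

# Supplying the leading argument lets every parameter bind as intended.
inserted = bulk_insert_sketch(logger, batch, "contributors", ["cntrb_id"])
```

If the real function has a similar shape, the flagged call would fail on its first execution inside the `try`, so the retry loop would never reach the insert itself.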

@sgoggins
Member Author

This does not seem to fix all of the deadlocks. Let's review together.
