
allow reuse of cached provisional memos within the same cycle iteration #786


Merged
merged 15 commits into salsa-rs:master from carljm:cacheprovisional
Apr 3, 2025

Conversation

@carljm carljm (Contributor) commented Apr 2, 2025

Currently, we don't allow provisional memos created during cycle iteration to be reused as memoized results. This is important in case another thread "peeks" in while we are still iterating a cycle (it shouldn't see a provisional result), and also so that iteration N+1 of a cycle doesn't reuse results from iteration N; it needs to recompute them instead.

But this also means that if the same query is called many times within a single iteration of a cycle, we will re-execute it many times. For example, if query A calls query B, and query B calls query C 1000 times, and query C calls query A, each time we iterate this cycle we will execute query C 1000 times. This can lead to a massive blow-up in execution times.

To fix this, we need to record, in the cycle_heads of a provisional memo, the iteration count of each cycle in which the memo was recorded. When validating whether a provisional memo can be reused by a caller, we must allow reuse only if we are still within the same iteration of those cycles, and not otherwise.
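As a rough sketch of the bookkeeping this implies (the names are simplified from the PR, and DatabaseKeyIndex is reduced to a stand-in newtype so the sketch is self-contained; the real definitions live in src/cycle.rs):

// Stand-in for Salsa's real key type, simplified for illustration only.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct DatabaseKeyIndex(u32);

// A cycle head now carries the iteration of the cycle during which the
// provisional memo was recorded, not just the head's key.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct CycleHead {
    pub database_key_index: DatabaseKeyIndex,
    pub iteration_count: u32,
}

// Each provisional memo records the cycle heads (and their iteration counts)
// it was computed under; reuse is allowed only while all of them still match.
#[derive(Clone, Debug, Default)]
pub struct CycleHeads(Option<Box<Vec<CycleHead>>>);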

This PR fixes an execution-time blow-up on red-knot that was bad enough that it looked like a hang.


netlify bot commented Apr 2, 2025

Deploy Preview for salsa-rs canceled.

Latest commit: 1d0f7e7
Latest deploy log: https://app.netlify.com/sites/salsa-rs/deploys/67eea2e96a44e80008e73193

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances memoization by allowing the reuse of cached provisional memos within the same cycle iteration, preventing unnecessary re-execution of queries during cycle iterations. The key changes include:

  • Updating the cycle head data structure to include an iteration count.
  • Modifying functions such as push_query and reset_for to accept and propagate the iteration count.
  • Adding tests to verify the reuse of provisional memos and adjusting cycle log expectations.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Summary per file:

  • tests/cycle_output.rs: Removed a duplicate log entry to align test outputs.
  • tests/cycle_accumulate.rs: Removed a duplicate test check to avoid redundant assertions.
  • tests/cycle.rs: Adjusted log length expectations and added a new test for provisional memo reuse.
  • src/zalsa_local.rs: Updated the push_query signature to include the iteration count.
  • src/function/memo.rs: Revised cycle head filtering to compare the inner database_key_index.
  • src/function/maybe_changed_after.rs: Updated calls to CycleHeads and added validate_same_iteration.
  • src/function/fetch.rs: Integrated the new memo validation and updated push_query calls.
  • src/function/execute.rs: Updated cycle head iteration updates and push_query calls.
  • src/function.rs: Changed the cycle head membership test to iterate over CycleHead.
  • src/cycle.rs: Redesigned CycleHeads to hold CycleHead (with iteration_count) and added update/extend logic.
  • src/active_query.rs: Added iteration_count to ActiveQuery and updated related functions.
Comments suppressed due to low confidence (1)

src/cycle.rs:174

  • Review the assert comparing iteration counts in insert_into_impl; if differing iteration counts are ever expected when merging cycle heads, consider implementing a merging strategy or a more informative error instead of a hard assertion.
assert!(existing.iteration_count == head.iteration_count);
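(Not part of this PR, but if a more informative failure were wanted here, one small option is assert_eq!, which prints both iteration counts when the check trips, rather than a bare boolean assertion:)

// Hypothetical variant of the same check, reporting both values on panic.
assert_eq!(existing.iteration_count, head.iteration_count);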


codspeed-hq bot commented Apr 2, 2025

CodSpeed Performance Report

Merging #786 will not alter performance

Comparing carljm:cacheprovisional (1d0f7e7) with master (395b29d)

Summary

✅ 12 untouched benchmarks

@carljm carljm (Contributor, Author) commented Apr 2, 2025

The last 3 commits on this PR represent 3 different viable working fixes, which should be evaluated against each other on maintainability and performance.

  • The most recent commit (072ada7) walks the active query stack once per cycle head in a provisional memo, to check whether each cycle head is present in the active query stack (see the sketch after this list).
  • The second-to-last commit (484d6dd) only walks the active query stack once, but iterates the provisional memo's cycle heads once per entry in the active query stack.
  • The 3rd-from-last commit (4ea3d85) instead tracks the currently-iterating cycles in a new active_cycles field on ZalsaLocal, and just checks whether that field contains all cycle heads in the memo.
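
For illustration, a minimal sketch of the shape of the 072ada7 stack-walking check, reusing the stand-in CycleHead and DatabaseKeyIndex types from the sketch in the PR description above (in the real code this is a method that consults ZalsaLocal's active query stack, whose frames carry much more state):

// Stand-in for one frame on the active query stack.
pub struct ActiveQueryFrame {
    pub database_key_index: DatabaseKeyIndex,
    pub iteration_count: u32,
}

// A provisional memo is reusable only if every cycle head it was recorded
// under is currently on the stack *and* still on the same iteration.
// This walks the stack once per cycle head in the memo.
pub fn validate_same_iteration(
    memo_cycle_heads: &[CycleHead],
    active_query_stack: &[ActiveQueryFrame],
) -> bool {
    memo_cycle_heads.iter().all(|head| {
        active_query_stack.iter().any(|frame| {
            frame.database_key_index == head.database_key_index
                && frame.iteration_count == head.iteration_count
        })
    })
}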

All 3 versions show a small regression on a couple of Salsa benchmarks. Oddly, the benchmarks showing a regression are not the one with cycles, yet this fix should have little or no impact when there are no cycles. This suggests the regression might come from compilation, e.g. an inlining change? 072ada7 reported the smallest regression.

I've tried all 3 of these commits in release builds of red-knot, on a module with a cycle that previously appeared to hang (due to runaway re-execution) before this fix. The performance differences between the commits are small (3-4%) and not consistent: hyperfine tends to report whichever version it happens to run first or second as slightly faster.

Regarding maintainability, I think the options that walk the active query stack are better; correctly maintaining ZalsaLocal::active_cycles is a bit finicky and took me a while to get right (and I think it's still not correct in the face of cancellation). Walking the active query stack is a more reliable and easy-to-maintain way to know which cycle heads are currently executing.

Based on this, I'm currently proposing 072ada7, which is the simplest version and shows the least regression on Salsa benchmarks, and no worse than any other version on the red-knot test.

I welcome reviewer feedback here (and/or other suggestions for how to reclaim some perf).

@carljm carljm requested a review from nikomatsakis April 2, 2025 23:56
@Veykril Veykril (Member) commented Apr 3, 2025

Oddly, the benchmarks showing a regression are not the one with cycles, yet this fix should have little or no impact when there are no cycles. This suggests the regression might come from compilation, e.g. an inlining change?

I believe we are actually seeing allocator noise. You are changing the size of ActiveQuery (I think), which is heap allocated, so this hits the allocator differently (and the regressions in those benches are also mostly in allocating paths). I've noticed this in a couple of PRs that affect allocation size and/or order.
Either way, I think we can ignore those regressions; they look very dependent on heap allocation here.

@Veykril Veykril (Member) left a comment

I agree with your choice of which strategy to prefer here 👍

#[derive(Clone, Debug, Default)]
#[allow(clippy::box_collection)]
-pub struct CycleHeads(Option<Box<Vec<DatabaseKeyIndex>>>);
+pub struct CycleHeads(Option<Box<Vec<CycleHead>>>);

Unrelated to this PR (but something I just noticed we could investigate in a separate PR), we could check if using ThinVec here might be better than a Box<Vec<_>> (I think we have that dep nowadays)
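
For reference, a hedged sketch of what that alternative might look like (this assumes the thin_vec crate is available as a dependency, as the comment suggests, and reuses the CycleHead stand-in from the earlier sketch; it is not part of this PR):

use thin_vec::ThinVec;

// ThinVec<T> is a single pointer wide and stores length/capacity on the heap,
// so an empty CycleHeads would not need the Option<Box<..>> wrapping above.
#[derive(Clone, Debug, Default)]
pub struct CycleHeads(ThinVec<CycleHead>);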

carljm added a commit to astral-sh/ruff that referenced this pull request Apr 3, 2025
Update to latest Salsa main branch, so as to get a baseline for
measuring the perf effect of salsa-rs/salsa#786
on red-knot in isolation from other recent changes in Salsa main branch.
@carljm carljm (Contributor, Author) commented Apr 3, 2025

The Salsa benchmarks seem very noisy here and not useful. I just pushed two small changes (iterating the active query stack in reverse, and making ActiveQuery::iteration_count private with an accessor) which should not impact the performance of any benchmark other than converge_diverge (the one with cycles, which already showed no impact), and now the perf regression on the other two input benchmarks is gone. I think this supports @Veykril's theory that it's allocator noise and not really meaningful.

@Veykril Veykril (Member) commented Apr 3, 2025

Also note that Rust 1.86 was released earlier today, so until you rebase onto latest main (#781) you will likely be benchmarking across different Rust versions.

@carljm carljm force-pushed the cacheprovisional branch from 13d8d89 to 05bbbd3 on April 3, 2025 14:41
@carljm carljm (Contributor, Author) commented Apr 3, 2025

Rebased on latest main.

@carljm carljm (Contributor, Author) commented Apr 3, 2025

Currently both Salsa and red-knot benchmarks are suggesting around a 1% regression here. I'm going to go ahead and merge; we can explore optimizations separately.

@carljm carljm enabled auto-merge April 3, 2025 15:10
@carljm carljm added this pull request to the merge queue Apr 3, 2025
Merged via the queue into salsa-rs:master with commit 296a8c7 Apr 3, 2025
11 checks passed
@carljm carljm deleted the cacheprovisional branch April 3, 2025 15:20
maxmynter pushed a commit to maxmynter/ruff that referenced this pull request Apr 3, 2025
Update to latest Salsa main branch, so as to get a baseline for
measuring the perf effect of salsa-rs/salsa#786
on red-knot in isolation from other recent changes in Salsa main branch.