Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributed error handling #40

Merged
merged 1 commit into from
Apr 4, 2025
Merged

Distributed error handling #40

merged 1 commit into from
Apr 4, 2025

Conversation

viccon
Copy link
Owner

@viccon viccon commented Mar 23, 2025

Overview

This PR should resolve #38 by ensuring that either the error from the fetch function is returned when no distributed records are found, or a combination of ErrOnlyCachedRecords and the fetch function error when part of a batch was retrievable from the distributed storage.

Note: In this PR, I've branched from #36 as they are both related to error handling, and I'll most likely include them in the same release.

// Before we call the fetchFn, we'll do an unblocking read to see if the
// context has been cancelled. If it has, we'll return a stale value if we
// have one available.
select {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could argue that it’s simpler to just perform an error check here, e.g. if ctx.Err() != nil, but I just find this approach more explicit. To me it's very clear that this code is trying to determine whether the work should proceed or not.

Copy link

@ernstwi ernstwi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good to me! 👍

@@ -7,7 +7,7 @@ import (

func (c *Client[T]) refresh(key string, fetchFn FetchFn[T]) {
response, err := fetchFn(context.Background())
if err != nil {
if err != nil && !errors.Is(err, errOnlyDistributedRecords) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change: Don't update the memory cache if the underlying fetch failed.

So this makes refresh match the behaviour of getFetch/getFetchBatch – only update the cache if the fetchFn succeeds.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're getting an errOnlyDistributedRecords error here, it means that we're using distributed storage with early refreshes. When we attempt to retrieve the record from the underlying data source and that call fails, we write the "old" value from the distributed cache to the in-memory cache. This allows us to serve stale data if an upstream system goes down.

If we don't write it to the in-memory cache, we will keep hammering both the distributed storage and the underlying data source. Note that while I use the term stale, the data will never be older than the TTL of the distributed cache. For example, if we're using a distributed storage with a 10-minute TTL and 1-minute refresh times, we would write values that are between 1 and 10 minutes old

fetchObserver.Err(errors.New("error"))
res, err := sturdyc.GetOrFetch(ctx, c, key, fetchObserver.Fetch)
if err != nil {
t.Fatalf("expected no error, got %v", err)
if !errors.Is(err, sturdyc.ErrOnlyCachedRecords) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also add an errors.Is check for the wrapped fetchFn error here, right?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean like this?

	if !errors.Is(err, fetchObserver.err) {
		t.Fatal("expected the original error to have been joined with ErrOnlyCachedRecords")
	}

E.g assert that you're getting a combination of the two errors? I think that is a good call to make both assertions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

}

fetchObserver := NewFetchObserver(11)
fetchObserver.err = context.Canceled
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use fetchObserver.Err here for consistency :)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what do you mean?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just that you have a helper to set err on FetchObserver that you use in some of the tests but not all, e.g.: fetchObserver.Err(sturdyc.ErrNotFound)

No big deal.

@viccon
Copy link
Owner Author

viccon commented Mar 28, 2025

Nice, looks good to me! 👍

Great! My plan is to merge this along with the changes in #36 and then create a new release with this improved error handling

@ernstwi
Copy link

ernstwi commented Mar 28, 2025

Nice, looks good to me! 👍

Great! My plan is to merge this along with the changes in #36 and then create a new release with this improved error handling

Sounds good. I think the team is planning to deploy this PR to prod today. Check with @perhells for details (I'm away next week)

@davidulander
Copy link

Nice, looks good to me! 👍

Great! My plan is to merge this along with the changes in #36 and then create a new release with this improved error handling

Sounds good. I think the team is planning to deploy this PR to prod today. Check with @perhells for details (I'm away next week)

I'll test this PR today :)

@viccon
Copy link
Owner Author

viccon commented Mar 31, 2025

davidulander

Great, let me know how it went!

@davidulander
Copy link

davidulander

Great, let me know how it went!

I feel I lack the context to be of much help here 😅 Per is sick and Ernst is on vacation, think the best is to wait until one of them is back :) Is that ok by you @viccon ?

@viccon viccon force-pushed the additional-error-handling branch from 43a5026 to 34b5ab5 Compare April 4, 2025 19:24
@viccon viccon changed the base branch from error-handling to main April 4, 2025 19:24
@viccon viccon merged commit 97fc006 into main Apr 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Canceled Context with go-redis results in ErrOnlyCachedRecords
3 participants