Add retry to bulk indexer #39

navarone-feekery · 2024-05-30T11:07:37Z

Related to #37

We currently have no retry logic for bulk indexing. This is a problem when running the Crawler on Mac because of an intermittent bad_record_mac error. Retrying often resolves the error (so this is a band-aid, not a fix).
Retries are healthy in general, so this was a good opportunity to add them.

Add retry with max of 3 for bulk indexer
Add timeout
Add stats for docs that failed indexing
Improve logging in general
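
For readers skimming the list above, here is a minimal sketch of what a capped retry with failure stats could look like. It is not the code from this PR: `index_with_retry`, the `stats` hash, and its `:indexed`/`:failed` keys are hypothetical names, and the block stands in for the real bulk request.

```ruby
# Hypothetical helper, not the PR's implementation.
# Runs the block, retrying up to max_retries times on StandardError.
# Returns true on success; after the final failure it records the
# documents as failed in the stats hash and returns false.
def index_with_retry(doc_count, stats, max_retries: 3)
  retries = 0
  begin
    yield
    stats[:indexed] += doc_count
    true
  rescue StandardError => e
    retries += 1
    if retries <= max_retries
      warn("Bulk index attempt #{retries} failed: #{e.message}. Retrying...")
      retry # re-runs the begin block
    end
    stats[:failed] += doc_count
    false
  end
end
```

With `max_retries: 3` the block runs at most four times: the initial attempt plus three retries. A caller would pass the number of documents in the batch so the stats reflect documents rather than requests.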

artem-shelkovnikov · 2024-05-30T13:31:02Z

lib/utility/es_client.rb

+        retries += 1
+        if retries <= MAX_RETRIES
+          @system_logger.info("Bulk index attempt #{retries} failed: #{e.message}. Retrying...")
+          sleep(1.second)


IMO fallback should be added here too; exponential is a good candidate for 3 retries.

@artem-shelkovnikov I've added a temporary fallback: if a bulk index fails, it saves the failed payload to a file and outputs that file name to the log. Users can cross-reference the file content with whatever error response they received.
In the future I'd like to implement something else, but for now I think this should suffice.
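
As an illustration of the fallback described above, a sketch along these lines would dump a failed payload to disk and return the file name for logging. The helper name `save_failed_payload`, the default directory, and the file naming scheme are all assumptions, not what the PR actually does.

```ruby
require 'json'
require 'securerandom'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: writes a failed bulk payload to a uniquely named
# JSON file and returns the path so it can be written to the log.
# The directory and file name format are illustrative assumptions.
def save_failed_payload(payload, dir: File.join(Dir.tmpdir, 'failed_payloads'))
  FileUtils.mkdir_p(dir)
  stamp = Time.now.utc.strftime('%Y%m%d%H%M%S')
  path = File.join(dir, "#{stamp}-#{SecureRandom.hex(4)}.json")
  File.write(path, JSON.pretty_generate(payload))
  path
end
```

A caller could then log something like `"Bulk index failed; payload saved to #{path}"` once retries are exhausted, which gives users a file to compare against the error response.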

I also added exponentially increasing wait times for the retries.
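
A rough sketch of exponentially increasing waits, for reference. The names `backoff_seconds` and `with_backoff`, the base of 1 second, and the 30 second cap are assumptions for illustration; the PR's actual values may differ.

```ruby
# Hypothetical helper: wait before retry number `attempt` (1-based),
# computed as base * 2^(attempt - 1) and capped at `cap` seconds.
def backoff_seconds(attempt, base: 1, cap: 30)
  [base * (2**(attempt - 1)), cap].min
end

# Retries the block with exponentially increasing waits between attempts.
# `sleeper` is injectable so the waits can be observed without sleeping.
def with_backoff(max_retries: 3, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue StandardError
    attempt += 1
    raise if attempt > max_retries # out of retries: re-raise the error
    sleeper.call(backoff_seconds(attempt))
    retry
  end
end
```

With these defaults, three retries wait 1, 2, and 4 seconds. The cap keeps a larger retry count from producing very long pauses, which matters if work keeps arriving while the indexer is waiting.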

Now, in testing this, I realised the bulk queue can get overloaded if it encounters a lot of errors and exponentially backs off for many seconds. The crawler keeps trying to send crawl results while the bulk indexer is still waiting, so the indexer trips over itself and chaos ensues.
I have an idea to fix this but I think it's out of scope for this PR, so I'll do it in a follow-up PR.

lib/crawler/output_sink/elasticsearch.rb

Add retry to bulk indexer

2bf0954

navarone-feekery added the v0.1.0 label May 30, 2024

navarone-feekery requested a review from a team May 30, 2024 11:07

artem-shelkovnikov reviewed May 30, 2024

navarone-feekery added 4 commits May 31, 2024 14:53

Merge branch 'main' into navarone/improve-bulk-indexer

524b0aa

Add fallback and tests

4d70896

Clean up ingestion stats

5049818

Bribe rubocop

afd66d9

navarone-feekery requested review from artem-shelkovnikov and a team May 31, 2024 14:33

artem-shelkovnikov approved these changes May 31, 2024

navarone-feekery merged commit cc7ae78 into main Jun 4, 2024
1 check passed

navarone-feekery deleted the navarone/improve-bulk-indexer branch June 4, 2024 08:03

This was referenced Jun 6, 2024

Inconsistent number of items / docs indexed in logs #40

Closed

Lock bulk queue while processing indexing request #45

Merged
