add mechanisms for rate-limiting and re-prompting in case 429 errors happen #3
Conversation
.github/workflows/run-experiment.yml
Outdated
@@ -29,6 +29,10 @@ on:
      description: "Skip slow benchmarks"
      type: boolean
      default: false
    benchmarkMode:
Not sure the name `benchmarkMode` is very intuitive to me for its purpose. So if I'm running in `benchmarkMode`, it means I use custom rate limiting, but if I'm not running in `benchmarkMode`, I don't use any rate limiting? Wouldn't naming this `rateLimiting` make more sense? Especially since the command-line argument below is also named `benchmark`, which at first glance I thought meant which benchmark (i.e., which projects) you are running on.
See comment below.
benchmark/run.ts
Outdated
      default: false,
      demandOption: false,
      description:
        "use custom rate-limiting for benchmarking (if specified, this supersedes the rateLimit option)",
Why is there a need for superseding here? Isn't it the case that if you specify benchmark as true, then you must have a non-zero value for rateLimit? Otherwise, it's as if you don't have rate-limiting at all. Do you really need two options here?
The confusion is caused by the fact that I implemented two ways of doing rate-limiting. First, it's possible to enforce that a specified number of milliseconds elapses between requests. Second, in "benchmark" rate-limiting mode, the interval is gradually reduced. This is because when we run an experiment, initially there are 25 concurrent workers that are each sending requests to the same API endpoint. As time goes on, some of these workers complete, so that there is less concurrency, and it's possible to speed up the remaining workers without exceeding the overall rate limit. These thresholds were determined experimentally based on the particular LLM service I'm using.
I'll simplify things by turning this into a single command-line parameter called "rateLimit" which can either be a numeric value or the string "benchmark".
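For concreteness, here is a minimal sketch of what a rate limiter with a gradually decreasing interval could look like. This is not the PR's actual implementation: the thresholds, field names, and the `next` method signature are all assumptions made for illustration.

```typescript
// Sketch of a benchmark-style rate limiter: requests start widely
// spaced and the interval shrinks over time, since fewer concurrent
// workers remain as the experiment progresses. All constants here are
// hypothetical, not the experimentally determined values from the PR.
class BenchmarkRateLimiter {
  private static readonly INITIAL_INTERVAL_MS = 30_000;
  private static readonly MIN_INTERVAL_MS = 5_000;
  private static readonly STEP_MS = 5_000;
  private static readonly REQUESTS_PER_STEP = 100;

  private intervalMs = BenchmarkRateLimiter.INITIAL_INTERVAL_MS;
  private requestCount = 0;
  private nextSlot = 0; // earliest time the next request may start

  /** Wait for the next available slot, then issue the request. */
  async next<T>(request: () => Promise<T>): Promise<T> {
    const now = Date.now();
    const start = Math.max(now, this.nextSlot);
    this.nextSlot = start + this.intervalMs;
    await new Promise((resolve) => setTimeout(resolve, start - now));

    // Periodically shorten the interval, down to a floor.
    this.requestCount++;
    if (
      this.requestCount % BenchmarkRateLimiter.REQUESTS_PER_STEP === 0 &&
      this.intervalMs > BenchmarkRateLimiter.MIN_INTERVAL_MS
    ) {
      this.intervalMs -= BenchmarkRateLimiter.STEP_MS;
    }
    return request();
  }
}
```

Reserving `nextSlot` up front keeps concurrent workers from starting requests closer together than the current interval allows.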
src/chatmodel.ts
Outdated
    if (this.benchmark) {
      this.rateLimiter = new BenchmarkRateLimiter();
      console.log(
        `Using ${this.model} at ${this.apiEndpoint} with ${this.nrAttempts} attempts and benchmark rate limit.`
Ok, I was really confused at this point; then I scrolled down to the BenchmarkRateLimiter and realized that you basically have two strategies:
- specify a rate limit during the initial configuration, or
- let the benchmark rate limiter do its thing, where it progressively tries different rate limits.
I now get it, but I don't think the naming/initial descriptions in the configuration are very clear, tbh.
Yes, but the second option involves gradually decreasing the interval between successive requests.
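To make the distinction concrete, here is one way the selection could look once the options are unified into a single `rateLimit` parameter as proposed above, reusing the `BenchmarkRateLimiter` sketch from earlier; `RateLimiter` and `FixedRateLimiter` are hypothetical names, not the PR's actual code.

```typescript
interface RateLimiter {
  next<T>(request: () => Promise<T>): Promise<T>;
}

// Fixed strategy: a constant number of milliseconds between requests.
class FixedRateLimiter implements RateLimiter {
  private nextSlot = 0;
  constructor(private readonly intervalMs: number) {}
  async next<T>(request: () => Promise<T>): Promise<T> {
    const now = Date.now();
    const start = Math.max(now, this.nextSlot);
    this.nextSlot = start + this.intervalMs;
    await new Promise((resolve) => setTimeout(resolve, start - now));
    return request();
  }
}

// A numeric rateLimit selects the fixed strategy; the string
// "benchmark" selects the decreasing-interval strategy.
function makeRateLimiter(rateLimit: number | "benchmark"): RateLimiter {
  return rateLimit === "benchmark"
    ? new BenchmarkRateLimiter()
    : new FixedRateLimiter(rateLimit);
}
```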
This PR contains mechanisms for dealing with API endpoints that are rate-limited, causing requests to fail with 429 errors. To work around this, we implemented:
- an option to enforce that a specified number of milliseconds elapses between successive requests;
- a "benchmark" rate-limiting mode in which this interval is gradually decreased as concurrent workers complete;
- re-prompting, i.e., retrying requests that failed with a 429 error, up to a configurable number of attempts.
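The re-prompting side could look roughly like the sketch below. The error shape, the backoff schedule, and the helper name are assumptions made for illustration; only the idea of retrying on 429 up to a bounded number of attempts comes from the PR.

```typescript
// Illustrative retry wrapper: re-issue a request that failed with a
// 429 status, backing off between attempts. The error shape and delay
// schedule are assumptions, not the PR's actual implementation.
async function withRetry<T>(
  request: () => Promise<T>,
  nrAttempts: number,
  baseDelayMs = 1_000
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await request();
    } catch (err: any) {
      const status = err?.status ?? err?.response?.status;
      if (status !== 429 || attempt >= nrAttempts) {
        throw err; // not rate-limited, or out of attempts
      }
      const delayMs = baseDelayMs * 2 ** (attempt - 1); // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```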