Handle rate limiting with leaky bucket instead of only backoff #320

oyarsa · 2025-01-15T12:27:59Z

If I understand the code correctly, rate limits are currently handled with exponential backoff alone: a request is retried until it either succeeds or runs out of time.

This works for a small number of requests, but it does not hold up well when running lots of them at once (I'm talking about thousands in a batch). What I found useful is a leaky-bucket rate limiter that tracks both requests per minute and tokens per minute, which handles this case better. The Python package openlimit implements this.
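To make the idea concrete, here is a minimal sketch of what I mean, not openlimit's actual code or API. Each limit is a bucket whose level drains at a constant rate, and a request waits until it fits under both the requests-per-minute and tokens-per-minute limits. The names `LeakyBucket`, `RequestTokenLimiter`, and `acquire` are made up for illustration, and the sketch assumes a single-threaded, blocking caller; an async or multi-threaded version would need a lock around the bucket state.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter: the level drains at a constant rate."""

    def __init__(self, per_minute, clock=time.monotonic):
        self.capacity = float(per_minute)        # most that fits at once
        self.leak_per_sec = per_minute / 60.0    # constant drain rate
        self.level = 0.0
        self.clock = clock
        self.last = clock()

    def _drain(self):
        # Lower the level by whatever leaked out since the last update.
        now = self.clock()
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_per_sec)
        self.last = now

    def wait_time(self, amount):
        """Seconds until `amount` fits in the bucket; 0.0 if it fits now."""
        if amount > self.capacity:
            raise ValueError("amount exceeds bucket capacity")
        self._drain()
        overflow = self.level + amount - self.capacity
        return max(0.0, overflow / self.leak_per_sec)

    def add(self, amount):
        self._drain()
        self.level += amount


class RequestTokenLimiter:
    """Hypothetical wrapper: one bucket for requests, one for tokens."""

    def __init__(self, requests_per_minute, tokens_per_minute,
                 clock=time.monotonic, sleep=time.sleep):
        self.requests = LeakyBucket(requests_per_minute, clock)
        self.tokens = LeakyBucket(tokens_per_minute, clock)
        self.sleep = sleep

    def acquire(self, token_count):
        """Wait until one request of `token_count` tokens fits both limits.

        Returns the number of seconds spent waiting.
        """
        # The stricter of the two limits decides how long to wait.
        delay = max(self.requests.wait_time(1),
                    self.tokens.wait_time(token_count))
        if delay > 0.0:
            self.sleep(delay)
        self.requests.add(1)
        self.tokens.add(token_count)
        return delay
```

The caller would invoke `limiter.acquire(estimated_tokens)` before sending each request. The difference from backoff alone is that requests are paced up front instead of being sent, rejected, and retried, so backoff only has to cover the cases the limiter misjudges.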

64bit added the enhancement (New feature or request) label on Jan 18, 2025
