If I understand the code correctly, rate limits are currently handled by retrying with exponential backoff until the request succeeds or time runs out.
This works for a small number of requests, but it does not hold up when many are sent at once (I'm talking about thousands in a batch). What I found useful instead is a leaky-bucket rate limiter that tracks both requests per minute and tokens per minute, so requests are paced before they are sent rather than retried after they fail. The Python package openlimit implements this.
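To make the idea concrete, here is a minimal single-threaded sketch of what I mean. It is not openlimit's actual API; the class names, parameters, and the choice to let each bucket hold one minute's worth of quota are all assumptions made for illustration.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter: the level drains at a constant
    rate, and every request adds to it. (Hypothetical helper.)"""

    def __init__(self, rate_per_minute, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0   # units drained per second
        self.capacity = float(rate_per_minute)  # assumption: one minute of quota
        self.level = 0.0
        self.clock = clock
        self.last = clock()

    def _drain(self):
        # Lower the level by whatever leaked out since the last update.
        now = self.clock()
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now

    def wait_time(self, amount):
        # Seconds until `amount` fits without overflowing the bucket.
        self._drain()
        overflow = self.level + amount - self.capacity
        return max(0.0, overflow / self.rate)

    def add(self, amount):
        self._drain()
        self.level += amount


class RateLimiter:
    """Paces calls against a requests-per-minute and a tokens-per-minute
    limit at the same time. (Hypothetical, not thread-safe.)"""

    def __init__(self, requests_per_minute, tokens_per_minute,
                 clock=time.monotonic, sleep=time.sleep):
        self.requests = LeakyBucket(requests_per_minute, clock)
        self.tokens = LeakyBucket(tokens_per_minute, clock)
        self.sleep = sleep

    def acquire(self, num_tokens):
        # Block until one request of `num_tokens` tokens fits both limits.
        if num_tokens > self.tokens.capacity:
            raise ValueError("request exceeds the tokens-per-minute limit")
        delay = max(self.requests.wait_time(1),
                    self.tokens.wait_time(num_tokens))
        if delay > 0:
            self.sleep(delay)
        self.requests.add(1)
        self.tokens.add(num_tokens)
```

The caller would run `limiter.acquire(estimated_tokens)` right before each API request, so a batch of thousands is spread out up front instead of hitting the limit and backing off. Exponential backoff can still stay in place as a fallback for the occasional rejected request.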