Rate limits control the flow of requests made to a system, keeping it stable and ensuring fair usage. They cap the number of requests a user or application can make within a specific time frame, and are commonly expressed in RPM (requests per minute) or RPS (requests per second).
Below are the rate limits for all of our models. If you require a higher limit for your particular use case, please don't hesitate to get in touch with us at [email protected].
In the case of our foundation models, there are limits per second (RPS) and per minute (RPM):
In the case of custom models, there are limits per second (RPS) and per minute (RPM) (depending on the base model):
| Custom Model (Based on) | RPS | RPM |
|---|---|---|
For the task-specific models, there are only per-minute (RPM) limits:
| Task-Specific Model | RPM |
|---|---|
| Grammatical Error Correction (GEC) | 100 |
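To stay under limits like these, a client can throttle its own requests before sending them. The following is a minimal sketch of a sliding-window throttle in Python, not an official client: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and it assumes the server counts requests over a rolling window, which this page does not confirm. Each limit is given as a `(max_requests, window_seconds)` pair, so an RPS and an RPM limit can be enforced together.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle enforcing one or more (max_requests, window_seconds) limits.

    Hypothetical helper for illustration; it is not part of any official SDK.
    """

    def __init__(self, limits, clock=time.monotonic, sleep=time.sleep):
        # Example: [(5, 1.0), (100, 60.0)] means at most 5 requests per second
        # and at most 100 requests per minute (illustrative values).
        self._limits = [(int(n), float(w)) for n, w in limits]
        self._max_window = max(w for _, w in self._limits)
        self._sent = deque()  # timestamps of requests inside the longest window
        self._clock = clock
        self._sleep = sleep

    def _wait_needed(self, now):
        """Return the number of seconds to wait before another request is allowed."""
        wait = 0.0
        for max_requests, window in self._limits:
            recent = [t for t in self._sent if now - t < window]
            if len(recent) >= max_requests:
                # This timestamp must leave the window before a slot frees up.
                blocking = recent[-max_requests]
                wait = max(wait, blocking + window - now)
        return wait

    def acquire(self):
        """Block until a request may be sent, then record its timestamp."""
        while True:
            now = self._clock()
            # Forget timestamps that are older than the longest window.
            while self._sent and now - self._sent[0] >= self._max_window:
                self._sent.popleft()
            wait = self._wait_needed(now)
            if wait <= 0:
                self._sent.append(now)
                return
            self._sleep(wait)
```

For the GEC model listed above, one would create `limiter = SlidingWindowLimiter([(100, 60.0)])` and call `limiter.acquire()` immediately before each request. For foundation or custom models, add a second pair for the applicable RPS value. The `clock` and `sleep` parameters exist only so the throttle can be tested without real delays.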