Rate Limits and Limitations

Our current Rate Limits and Serverless usage limitations

Rate Limits:

Every customer is rate limited according to their product tier. The current limits for existing and upcoming tiers are:

| Product Class | RPM (requests per minute) | Token Limit |
| --- | --- | --- |
| Serverless - Free | 5 | - |
| Serverless - User | 500 | - |
| Dedicated Serverless | 1000 | - |
| Dedicated Serverless Pro | 4000 | - |
| Enterprise | Unlimited | - |

New users will soon be able to use our system without a credit card, but will be heavily rate limited.

Once you sign up, your account has a rate limit of 500 RPM, matching the Serverless - User tier in the table above. The Token Limit has not yet been decided on or implemented.

You can contact us to increase your rate limits or move into one of the newer services.
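To stay under a tier's RPM limit, a client can throttle itself before sending requests. The following is a minimal sketch, not part of our SDK: the `RpmLimiter` class and the `TIER_RPM` mapping are hypothetical names introduced here for illustration, and the 60-second sliding window is an assumption about how requests per minute are counted on the server side.

```python
import time
from collections import deque

# RPM values copied from the table above; None stands for "Unlimited".
# TIER_RPM is a hypothetical helper mapping, not an official constant.
TIER_RPM = {
    "Serverless - Free": 5,
    "Serverless - User": 500,
    "Dedicated Serverless": 1000,
    "Dedicated Serverless Pro": 4000,
    "Enterprise": None,
}


class RpmLimiter:
    """Client-side sliding-window limiter: at most `rpm` calls per window."""

    def __init__(self, rpm, window=60.0, clock=time.monotonic, sleep=time.sleep):
        if rpm is not None and rpm <= 0:
            raise ValueError("rpm must be positive or None")
        self.rpm = rpm
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests still inside the window

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        if self.rpm is None:  # unlimited tier: nothing to throttle
            return
        while True:
            now = self._clock()
            # Forget requests that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.rpm:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest request expires.
            self._sleep(self.window - (now - self._sent[0]))
```

A caller would create one limiter per account, for example `limiter = RpmLimiter(TIER_RPM["Serverless - User"])`, and call `limiter.acquire()` immediately before each request. Throttling on the client only reduces the chance of hitting the limit; the server-side limit remains authoritative.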

GPU Quota:

New customers on our platform have a quota of 2 GPUs per user organization. This limit exists for several reasons. The main one is that new users sometimes don't realize their model could run on 1 GPU, choose a larger option before testing, and accumulate a large bill.

If you see the message "Insufficient quota" on your deployment page, you have reached your user quota.

To increase your user quota, please fill out this form: Quota Increase
