Parasail

Rate Limits and Limitations

Our current Rate Limits and Serverless usage limitations


Last updated 1 month ago

GPU Quota:

New customers of our platform have a quota of 2 GPUs per user organization. This limit exists for several reasons. The biggest one is that new users sometimes don't realize their model could run on 1 GPU, choose a larger option before testing, and accumulate a large bill.

If you see the message "Insufficient quota" on your deployment page, you have reached your user quota.
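The quota arithmetic can be sketched as a small local check to run before choosing a GPU count. This is only an illustration, not part of the Parasail API: the `fits_quota` helper and the `GPU_QUOTA` constant are hypothetical names, and the sketch assumes the quota counts GPUs summed across all of an organization's active deployments.

```python
# Hypothetical local pre-check; it does not call any Parasail endpoint.
GPU_QUOTA = 2  # quota for new organizations, as stated above


def fits_quota(gpus_in_use: int, gpus_requested: int, quota: int = GPU_QUOTA) -> bool:
    """Return True if a new deployment would stay within the GPU quota.

    Assumes the quota is the total number of GPUs across all active
    deployments in the organization.
    """
    if gpus_in_use < 0:
        raise ValueError("gpus_in_use must be zero or greater")
    if gpus_requested < 1:
        raise ValueError("gpus_requested must be at least 1")
    return gpus_in_use + gpus_requested <= quota


if __name__ == "__main__":
    print(fits_quota(gpus_in_use=1, gpus_requested=1))  # True
    print(fits_quota(gpus_in_use=1, gpus_requested=2))  # False
```

Under that assumption, a request that makes the check return `False` would likely produce the "Insufficient quota" message, so it may be worth testing the model on 1 GPU first.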

To increase your user quota, please fill out this form:

Quota Increase