Quota Management

GPU Quota Limits

By default, each user has a quota of 4 GPUs shared across the Batch and Dedicated services. The quota counts all GPUs together, regardless of GPU type. To request a higher limit, use the Quota Increase Form.

Examples:

  • If you are currently using 2 H100 GPUs and submit a Batch job that requires 8 H100 GPUs (for example, DeepSeek V3), the submission is rejected because it would exceed your 4-GPU quota.

  • If you configure autoscaling with a range of 1–6 GPUs and your model is currently using 4 GPUs, the quota prevents the model from scaling up to a fifth GPU.
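
The two examples above can be expressed as a small arithmetic check. The following is a minimal sketch only, not the platform's actual enforcement logic: the function names `fits_quota` and `max_scale_target` and the constant `DEFAULT_GPU_QUOTA` are hypothetical, introduced here for illustration, and the only value taken from this page is the default limit of 4.

```python
# Illustrative sketch; names are hypothetical and not part of any platform API.
DEFAULT_GPU_QUOTA = 4  # default per-user limit across Batch and Dedicated


def fits_quota(current_gpus: int, requested_gpus: int,
               quota: int = DEFAULT_GPU_QUOTA) -> bool:
    """Return True if adding requested_gpus keeps total usage within quota."""
    return current_gpus + requested_gpus <= quota


def max_scale_target(autoscale_max: int, other_usage: int = 0,
                     quota: int = DEFAULT_GPU_QUOTA) -> int:
    """Largest GPU count autoscaling can reach once the quota is applied.

    other_usage is the number of GPUs already consumed by other workloads.
    """
    return max(0, min(autoscale_max, quota - other_usage))


# First example: 2 GPUs in use, Batch job asks for 8 more.
print(fits_quota(current_gpus=2, requested_gpus=8))  # False

# Second example: autoscaling range tops out at 6, quota caps it at 4.
print(max_scale_target(autoscale_max=6))  # 4
```

Running a check like this before submitting a job or setting an autoscaling range shows whether the request fits the current quota or whether a quota increase is needed first.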

Requesting Increased Quotas

If you need more GPU capacity, request a quota increase using the Quota Request Form. Increases up to 8 GPUs are available without additional justification. Requests for more than 8 GPUs require a detailed explanation of your usage needs.
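
As a rough illustration of the threshold described above, the sketch below classifies a requested quota by whether a justification is needed. The function name `needs_justification` and the constant `NO_JUSTIFICATION_LIMIT` are hypothetical; the form itself is not scripted here, and only the limit of 8 comes from this page.

```python
# Illustrative sketch; names are hypothetical and not part of any platform API.
NO_JUSTIFICATION_LIMIT = 8  # increases up to this size need no extra justification


def needs_justification(requested_quota: int) -> bool:
    """Return True if the requested quota exceeds the no-justification limit."""
    return requested_quota > NO_JUSTIFICATION_LIMIT


for quota in (6, 8, 16):
    label = "detailed explanation required" if needs_justification(quota) else "no justification needed"
    print(f"{quota} GPUs: {label}")
```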
