Quota Management
GPU Quota Limits
By default, each user has a quota of 4 GPUs, shared across the Batch and Dedicated services. The quota applies to your combined usage, regardless of GPU type. To request a higher limit, use the Quota Increase Form.
Examples:
If you are currently using 2 H100 GPUs and submit a Batch job that requires 8 H100 GPUs (for example, DeepSeek V3), the submission is rejected because it would exceed your 4-GPU quota.
If you configure autoscaling for a range of 1–6 GPUs and your model is currently using 4 GPUs, the quota prevents the model from scaling up to a fifth GPU.
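The two examples above can be expressed as simple arithmetic. The following is a minimal sketch, not the platform's actual enforcement logic: the function names `fits_quota` and `autoscale_ceiling` are hypothetical, and it assumes the check compares current usage plus the new request against the quota.

```python
DEFAULT_GPU_QUOTA = 4  # default limit described in this section


def fits_quota(current_usage: int, requested: int,
               quota: int = DEFAULT_GPU_QUOTA) -> bool:
    """Return True if current usage plus the new request stays within quota.

    Assumption: usage is counted across Batch and Dedicated combined,
    regardless of GPU type.
    """
    return current_usage + requested <= quota


def autoscale_ceiling(autoscale_max: int, other_usage: int = 0,
                      quota: int = DEFAULT_GPU_QUOTA) -> int:
    """Return the largest GPU count a model can actually scale to.

    The configured maximum is capped by whatever quota is left after
    GPUs used by other workloads (hypothetical parameter `other_usage`).
    """
    return max(0, min(autoscale_max, quota - other_usage))
```

With the default quota, `fits_quota(2, 8)` is `False` (the rejected Batch job), and `autoscale_ceiling(6)` is `4` (a 1–6 autoscaling range that cannot reach a fifth GPU).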
Requesting Increased Quotas
If you need more GPU capacity, you can request a higher quota using the Quota Request Form. Increases up to a quota of 8 GPUs are available without additional justification. Requests for more than 8 GPUs require a detailed explanation of your usage needs.
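As a quick way to tell which kind of request you are making, the threshold above can be sketched as follows. The function name `needs_justification` is hypothetical and is not part of any platform API; it only encodes the 8-GPU rule stated in this section.

```python
NO_JUSTIFICATION_MAX = 8  # largest quota granted without a detailed explanation


def needs_justification(requested_quota: int) -> bool:
    """Return True if the requested quota exceeds the 8-GPU threshold."""
    return requested_quota > NO_JUSTIFICATION_MAX
```

For example, `needs_justification(8)` is `False`, while `needs_justification(16)` is `True`.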