
Pricing


Serverless Pricing:

Serverless pricing is token-based (tokens are counted separately for input and output), and the amount you owe depends on which model(s) you use. The input and output rates for each model are listed directly on the Serverless page. Prices are per million tokens, so if a model costs $1 per million tokens for both input and output and you use 250,000 input tokens and 250,000 output tokens, you will be charged $0.50 ($0.25 for input and $0.25 for output).
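As a quick sanity check, here is a minimal sketch of the serverless cost arithmetic. The $1.00-per-million rates are the illustrative values from the example above, not a real model's price; look up your model's actual input/output rates on the Serverless page.

```python
# Sketch of the serverless billing math; all prices are USD per million tokens.
INPUT_PRICE_PER_M = 1.00   # example rate from the text above, not a real model price
OUTPUT_PRICE_PER_M = 1.00  # example rate from the text above, not a real model price

def serverless_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for the given input/output token counts."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# 250,000 input tokens + 250,000 output tokens at $1.00/M each -> $0.50
print(serverless_cost(250_000, 250_000))  # 0.5
```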

Dedicated Pricing:

Our dedicated instances are priced per GPU per hour. We offer various configurations from our hardware fleet to hit your cost, performance, and latency targets. Dedicated instances can automatically scale the number of GPUs as your workload fluctuates, and you can configure a scale-down policy that controls when idle capacity is automatically shut off. When you deploy, you choose the number of replicas for your chosen model, and the hourly price is displayed on each option. A rough cost estimate is sketched below.
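The sketch below shows how a GPU-per-hour bill adds up across replicas. The hourly rate here is a hypothetical placeholder; the real rate for your hardware configuration is the one displayed on the deployment option you select.

```python
# Back-of-the-envelope estimate for a dedicated endpoint.
# GPU_HOURLY_RATE is a placeholder, NOT a real Parasail price; use the rate
# shown on your deployment option.
GPU_HOURLY_RATE = 2.50  # USD per GPU per hour (hypothetical)

def dedicated_cost(gpus_per_replica: int, replica_hours: float) -> float:
    """Cost = GPUs per replica x total replica-hours x hourly GPU rate."""
    return gpus_per_replica * replica_hours * GPU_HOURLY_RATE

# One replica on 2 GPUs running for 8 hours, plus a second replica that
# autoscaled up for 3 hours before the scale-down policy shut it off:
# 2 GPUs x (8 + 3) replica-hours x $2.50 = $55.00
print(dedicated_cost(gpus_per_replica=2, replica_hours=8 + 3))  # 55.0
```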

Batch Pricing:

Batch pricing is token-based (total tokens, discounted to reflect that your queries are not processed in real time), and the amount you owe depends on which model(s) you use.

The default pricing is based on parameter count unless the model is a named model.

Batch is billed at a 50% discount off the serverless price. Cached tokens are billed at half the corresponding batch price. FP16 quantized models cost 30% more than the FP8 batch price; FP8 models incur no additional cost.

Current prices (USD per million tokens):

| Parameter Count | Size | Serverless Price | Batch Price FP8 | Batch Price FP16 | Cache Price FP8 | Cache Price FP16 |
| --- | --- | --- | --- | --- | --- | --- |
| 0-4B | 0-4B | $0.05 | $0.025 | $0.033 | $0.013 | $0.016 |
| 4.1-8B | 4.1-8B | $0.08 | $0.040 | $0.052 | $0.020 | $0.026 |
| LLM_Model_8.1-16B | 8.1-16B | $0.11 | $0.055 | $0.072 | $0.028 | $0.036 |
| LLM_Model_16.1B-21B | 16.1B-21B | $0.45 | $0.225 | $0.293 | $0.113 | $0.146 |
| LLM_Model_21.1B-41B | 21.1B-41B | $0.50 | $0.250 | $0.325 | $0.125 | $0.163 |
| LLM_Model_41.1B-80B | 41.1B-80B | $0.70 | $0.350 | $0.455 | $0.175 | $0.228 |
| LLM_Model_80.1B-404B | 80.1B-404B | $0.80 | $0.400 | $0.520 | $0.200 | $0.260 |
| LLM_Model_405B | 405B | $1.75 | $0.875 | $1.138 | $0.438 | $0.569 |
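The batch and cache columns follow directly from the serverless price using the rules above (batch FP8 = 50% of serverless, FP16 = FP8 batch + 30%, cache = half the batch rate). The sketch below shows that derivation; table values are rounded to three decimals, and all prices are USD per million tokens.

```python
# Derive a tier's batch and cache rates from its serverless price, per the
# rules stated above. Prices are USD per million tokens; the table above
# rounds the results to three decimal places.
def batch_rates(serverless_price: float) -> dict[str, float]:
    batch_fp8 = serverless_price * 0.50   # batch is 50% of serverless
    batch_fp16 = batch_fp8 * 1.30         # FP16 costs 30% more than FP8 batch
    return {
        "batch_fp8": batch_fp8,
        "batch_fp16": batch_fp16,
        "cache_fp8": batch_fp8 * 0.50,    # cached tokens are half the batch rate
        "cache_fp16": batch_fp16 * 0.50,
    }

# Example: the 80.1B-404B tier ($0.80 serverless) works out to
# batch FP8 $0.40, batch FP16 $0.52, cache FP8 $0.20, cache FP16 $0.26,
# matching that row of the table.
print(batch_rates(0.80))
```

To estimate a batch job's cost, multiply your total (non-cached) tokens by the batch rate for your model's tier and any cached tokens by the cache rate, then sum the two.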