# Run and Evaluate Any Model

At Parasail, we aim to make it as easy as possible to find the right model for your job. Our [Serverless](https://www.saas.parasail.io/serverless) tier is the quickest way to try out a wide range of popular models covering chat, instruct, and multimodal with model sizes ranging from 7 B to the incredibly capable DeepSeek V3 at 671 B parameters. There are many times though when Serverless is not sufficient, and we built the Dedicated and Batch tiers to help in those cases:

**Trying out new models:** If a model doesn't exist in serverless—maybe its an uncommon model on HuggingFace, or maybe you trained and fine tuned it yourself—you can easily spin it up in a [Dedicated Endpoint](/parasail-docs/dedicated-instance/dedicated-endpoint.md). We support most transformers on HuggingFace, both public [and private](/parasail-docs/dedicated-instance/private-hf-models.md), and we have the lowest cost on-demand GPUs on the market from 4090 s up to H200s.

**Large-scale Evaluations:** Effective and automatic LLM evaluation is critical to building a quality product. Our [batch processing service](/parasail-docs/batch/batch-quickstart.md#batch-embedding-models-1) is ideal for evals that require a large amount of prompts, images, or text. The price is 50% off our serverless endpoints and prompt-caching provides another 50% discount. We also support most transformers on HuggingFace with a single-line change in the code, making the evaluation of many models easy.

**Embeddings:** We support a range of embedding models that rival proprietary models such as OpenAI and Voyage, including:

* [parasail-ai/GritLM-7 B-vllm](https://huggingface.co/parasail-ai/GritLM-7B-vllm)
* [Alibaba-NLP/gte-Qwen2-7 B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct)

These embeddings - and any others on HuggingFace - can be easily run using our [batch processing service](/parasail-docs/batch/batch-quickstart.md#batch-embedding-models-1).

### **Checking Model Compatibility**

The Dedicated UI can be used to verify whether a model is supported by our inference engines. Simply paste the URL into the model entry page and a message will appear indicating that it is supported.

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-cover data-type="files"></th></tr></thead><tbody><tr><td><img src="/files/uZkGtiBJjEUsGAcmIPoZ" alt="" data-size="original"></td><td>Model is supported</td><td></td></tr><tr><td><img src="/files/mzHpxqOjrxD8JFIP5GHB" alt=""></td><td>Model cannot be found indicates the model is not supported</td><td></td></tr></tbody></table>

### **Private or Custom Models and LORAs**

Fine-tuned models and LORA adapters can be easily hosted on Parasail's dedicated by hosting them on in a public or private repo on HuggingFace. For public models, simply paste the URL into the dedicated page. For guidelines on hosting private models, including generating the access token, please see this section:

{% content-ref url="/pages/0tnt4xWOMBgre4UPkN0j" %}
[Deploying private models through HuggingFace Repos](/parasail-docs/dedicated-instance/private-hf-models.md)
{% endcontent-ref %}

### **Batch Processing of Private Models**

Private models can also be processed in batch mode at a 50% discount to the equivalent serverless pricing of the model. For information on how to set this up, please see this section:

{% content-ref url="/pages/g08bVP5XvUsIWEMEu62Z" %}
[Batch Processing with Private Models](/parasail-docs/batch/batch-processing-with-private-models.md)
{% endcontent-ref %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.parasail.io/parasail-docs/cookbooks/run-and-evaluate-any-model.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
