Batch Processing with Private Models
In addition to supporting the batch API for models that are publicly available on Hugging Face, Parasail also supports the batch API for models stored in private Hugging Face repos. This page explains how to run such private models via the batch API.
Please follow these steps:
Create a dedicated deployment with the private model. See the documentation on dedicated deployments to learn how. Make sure the dedicated deployment is up and running by sending a few requests to it via the UI or the OpenAI APIs.
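As a quick sanity check, a request like the following can be sent to the deployment. This is a minimal sketch using only the Python standard library; the base URL, API key, and endpoint name are placeholders, not real values, and the same request can be made through the OpenAI SDK by passing the endpoint name as the `model` parameter.

```python
import json
import urllib.request

# Placeholder values -- substitute your own key and endpoint name.
API_KEY = "YOUR_PARASAIL_API_KEY"
BASE_URL = "https://api.example.com/v1"  # illustrative OpenAI-compatible base URL
ENDPOINT_NAME = "myaccount/my-private-model"  # hypothetical endpoint name

def build_chat_request(endpoint_name: str, prompt: str) -> dict:
    # The endpoint name goes where a public model id would normally go.
    return {
        "model": endpoint_name,
        "messages": [{"role": "user", "content": prompt}],
    }

def send_test_request(prompt: str) -> dict:
    # POST a single chat completion to the dedicated deployment.
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(ENDPOINT_NAME, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(ENDPOINT_NAME, "Hello!")
```

If the deployment is healthy, a call to `send_test_request` returns a standard chat-completion response.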
Find the endpoint name of the dedicated deployment. You can find it at the top of the dedicated deployment's UI page, in the description section, or in the examples provided for the deployment at the bottom of the page. Endpoint names are typically prefixed with your account name, followed by a name you choose when creating the dedicated deployment.
Now follow the standard workflow as explained in the batch processing documentation, except that you replace the model name with the endpoint name of the dedicated deployment. Everything else remains the same.
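For example, when preparing the JSONL batch input file, each request's `model` field carries the endpoint name instead of a public model id. The sketch below builds such a file, assuming a hypothetical endpoint name; everything else matches the OpenAI batch input format used in the standard workflow.

```python
import json
import os
import tempfile

ENDPOINT_NAME = "myaccount/my-private-model"  # hypothetical endpoint name

prompts = ["What is 2+2?", "Name a prime number."]

# Each line follows the OpenAI batch input format; the only change from the
# public-model workflow is that "model" is the dedicated endpoint name.
lines = [
    json.dumps({
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": ENDPOINT_NAME,
            "messages": [{"role": "user", "content": prompt}],
        },
    })
    for i, prompt in enumerate(prompts)
]

input_path = os.path.join(tempfile.gettempdir(), "batch_input.jsonl")
with open(input_path, "w") as f:
    f.write("\n".join(lines) + "\n")
```

The resulting file is then uploaded and submitted exactly as in the standard batch workflow.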