Batch Processing with Private Models
In addition to supporting batch API for models that are publicly available on Huggingface, Parasail also supports the batch API on models that are stored in private Huggingface repos. This page explains how to run such private models via the batch API.
Please follow these steps:
Create a dedicated deployment with the private model. See Deploying private models through HuggingFace Repos to learn how to achieve this. The dedicated deployment does not need to be running in order for the rest of the steps to work, so you can pause it immediately. This deployment is essentially used by the backend to determine what models, HF token, and certain configurations to use for processing a batch.
Find the endpoint name of the dedicated deployment. You can find this at the top of the UI page of the dedicated deployment in the description section or from the examples provided for the deployment at the bottom of the page. Endpoint names are typically prefixed with your account name followed by a name that you choose when creating the dedicated deployment.
Now follow the standard workflow as explained in the Batch API documentation, except that you should use the endpoint name of the dedicated deployment as the model name. Everything else remains the same.
Last updated