Batch Processing with Private Models

In addition to supporting the batch API for models that are publicly available on Hugging Face, Parasail also supports the batch API for models stored in private Hugging Face repos. This page explains how to run such private models via the batch API.

Please follow these steps:

  • Create a dedicated deployment with the private model. See Deploying private models through HuggingFace Repos to learn how to achieve this. Make sure the dedicated deployment is up and running by sending a few requests to it via the UI or the OpenAI APIs, as in the first sketch after this list.

  • Find the endpoint name of the dedicated deployment. You can find it in the description section at the top of the dedicated deployment's UI page, or in the examples provided for the deployment at the bottom of the page. Endpoint names are typically prefixed with your account name, followed by a name that you choose when creating the dedicated deployment.

  • Now follow the standard workflow as explained in the Batch API documentation, except that you should replace the model name with the endpoint name of the dedicated deployment. Everything else remains the same; see the second sketch below.
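
A quick way to confirm the deployment is live is to send a single chat completion to it through the OpenAI-compatible API. This is a minimal sketch: the API key, base URL, and endpoint name below are placeholders, so substitute the values shown on your dedicated deployment's page.

```python
from openai import OpenAI

# Placeholder values: use the API key and base URL from your Parasail dashboard,
# and the endpoint name shown on the dedicated deployment's page.
client = OpenAI(
    api_key="YOUR_PARASAIL_API_KEY",
    base_url="https://api.parasail.io/v1",  # assumed OpenAI-compatible base URL
)

response = client.chat.completions.create(
    model="my-account/my-private-model",  # hypothetical dedicated endpoint name
    messages=[{"role": "user", "content": "Hello! Reply with one short sentence."}],
)
print(response.choices[0].message.content)
```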
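
The batch workflow itself is unchanged from the public-model case; the only difference is that the "model" field in each request line of the batch file is set to the dedicated endpoint name. The sketch below assumes the OpenAI-compatible file and batch endpoints described in the Batch API documentation, and the endpoint name is again a hypothetical placeholder.

```python
import json
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PARASAIL_API_KEY",
    base_url="https://api.parasail.io/v1",  # assumed OpenAI-compatible base URL
)

ENDPOINT_NAME = "my-account/my-private-model"  # hypothetical dedicated endpoint name

# Build a small batch input file. Each line is one request in the standard batch
# file format, with "model" set to the dedicated endpoint name instead of a
# public model ID.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": ENDPOINT_NAME,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["What is 2 + 2?", "Name one prime number."])
]
with open("batch_input.jsonl", "w") as f:
    for line in requests:
        f.write(json.dumps(line) + "\n")

# Upload the file and create the batch, exactly as for a public model.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```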
