Deploying private models through HuggingFace Repos

How to create a dedicated deployment for a private HuggingFace model

Last updated 4 months ago

To deploy a Parasail dedicated endpoint for a private HuggingFace model, follow these steps:

  • Make your model private in HuggingFace.

    • Go to the settings page of your model.

    • At the top of the page, click the Make private button.

    • After the model becomes private, the model page should indicate that the repository is private.

  • Create a HuggingFace access token with fine-grained permissions that allow access only to your private model.

    • On the Access Tokens page, click the Create new token button.

    • Leave the token type as Fine-grained and enter a name for your access token.

    • Scroll down to the Repositories permissions section.

    • In the Search for repos textbox, type the name of your model (e.g. meta-llama/Llama-3.2-1B).

    • Leave the permission set to Read access to contents of selected repos.

    • Click the Create token button at the bottom of the page.

    • Copy the generated access token.

  • Create a dedicated deployment in Parasail.

    • On the Parasail dashboard, click the Create Dedicated Model button.

    • Copy your model name (e.g. meta-llama/Llama-3.2-1B) into the HuggingFace ID / URL textbox.

    • Copy your access token into the HuggingFace Token textbox.

    • Click the Deploy button.
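
Before pasting the token into Parasail, it can help to confirm that the repository is no longer publicly readable and that the new token can read it. The following is a minimal sketch, not part of the Parasail workflow itself. It assumes the public HuggingFace Hub model-info endpoint (`https://huggingface.co/api/models/<repo_id>`) returns HTTP 200 when the caller may read the repository and an error status otherwise, and that a fine-grained token with read access to the repo is accepted there. The helper names (`build_model_info_request`, `http_status`, `check_private_access`) are hypothetical and chosen only for illustration.

```python
import urllib.error
import urllib.request

# Assumption: the Hub's public model-info endpoint; not a Parasail API.
HF_API_BASE = "https://huggingface.co/api/models"


def build_model_info_request(repo_id, token=None):
    """Build a GET request for the model-info endpoint, optionally authenticated."""
    request = urllib.request.Request(f"{HF_API_BASE}/{repo_id}")
    if token:
        # HuggingFace access tokens are sent as a bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def http_status(request, timeout=10.0):
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def check_private_access(repo_id, token):
    """Compare anonymous and authenticated access to the same repository."""
    anonymous = http_status(build_model_info_request(repo_id))
    authorized = http_status(build_model_info_request(repo_id, token))
    return {
        # Anonymous callers should be rejected once the model is private.
        "publicly_readable": anonymous == 200,
        # The fine-grained token should still be able to read the repo.
        "token_can_read": authorized == 200,
    }
```

For example, `check_private_access("your-org/your-model", token)` with your own repository name and the token copied earlier should report `publicly_readable` as `False` and `token_can_read` as `True` if both steps succeeded. If `token_can_read` is `False`, recheck the repository selected under Repositories permissions before deploying.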