How to create a dedicated deployment for a private HuggingFace model
To deploy a Parasail dedicated endpoint for a private HuggingFace model, follow these steps:
Make your model private in HuggingFace.
Go to the settings page of your model.
At the top of the page, click on the button Make private.
After the model becomes private, you should see the following:
Create a HuggingFace access token with fine-grained permissions that allows access only to your private model.
On the Access Tokens page, click on the button Create new token.
Leave token type as Fine-grained and write a name for your access token.
Scroll down to the section Repositories permissions.
In the textbox Search for repos, type the name of your model (for example meta-llama/Llama-3.2-1B).
Leave the permission set to Read access to contents of selected repos.
Click on the button Create token at the bottom of the page.
Copy the access token that is generated.
Create a dedicated deployment in Parasail.
Click on the button Create Dedicated Model on the Parasail dashboard.
Copy your model name (for example meta-llama/Llama-3.2-1B) into the textbox HuggingFace ID / URL.
Paste your access token into the textbox HuggingFace Token.
Click on the button Deploy.
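If the deployment cannot load the model, it may help to confirm that the access token can actually read the private repository. The sketch below is not part of the Parasail workflow. It assumes the public HuggingFace Hub endpoint `https://huggingface.co/api/models/<repo_id>` accepts the token as a Bearer header, and the helper names `build_model_info_request` and `token_can_read_model` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: the Hub serves model metadata at this base URL.
HF_API_BASE = "https://huggingface.co/api/models"


def build_model_info_request(repo_id: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the model's metadata."""
    return urllib.request.Request(
        f"{HF_API_BASE}/{repo_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_can_read_model(repo_id: str, token: str, timeout: float = 10.0) -> bool:
    """Return True if the Hub answers 200 for this repo with this token."""
    request = build_model_info_request(repo_id, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # The Hub rejected the request (for example, the token has no
        # access to this repo or the repo name is wrong).
        return False


# Example call (performs a network request; both values are placeholders):
# token_can_read_model("your-org/your-private-model", "hf_your_token_here")
```

A `False` result suggests re-checking the repository selected under Repositories permissions and the model name entered in Parasail. A network failure raises `urllib.error.URLError` instead of returning `False`, so connectivity problems are not mistaken for a permissions problem.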