
Tool/Function Calling

A number of Parasail models now support tool calling.

Tool calling is closely related to Structured Output; see the Structured Output documentation for details.

Tool/function calling is a capability of large language models (LLMs) that lets them interact with external systems or APIs by explicitly invoking tools or functions. Instead of generating only text, the model identifies when a task requires external assistance and emits a structured call that your application can route to external software or APIs.

| Model | Tools | Tool Choice |
| --- | --- | --- |
| parasail-llama-33-70b-fp8 | Y | Y |
| parasail-llama-4-scout-instruct | Y | Y |
| parasail-llama-4-maverick-instruct-fp8 | Y | Y |
| parasail-qwen3-30b-a3b | Y | Y |
| parasail-qwen3-235b-a22b | Y | Y |
| parasail-qwen3-32b | Y | Y |
| parasail-mistral-devstral-small | Y | Y |

Function / Tool Calling Two Ways

Parasail supports two ways to call tools:

  • A. OpenAI-compatible REST endpoint – the gateway does the schema validation for you.

  • B. Prompt-based (vLLM-style) call – the model returns raw text that you parse into JSON yourself.

0 | Shared tool signature

```python
tool_schema = {
  "name": "get_weather",
  "description": "Retrieve weather information for a given location and date.",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {"type": "string"},
      "date": {"type": "string", "pattern": "^\\d{4}-\\d{2}-\\d{2}$"}
    },
    "required": ["location", "date"]
  }
}

example_sentence = (
  "What's the weather like in Manhattan Beach on 2025-06-03?"
)
```
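When you parse tool calls yourself (as in approach B below), it helps to check the arguments against the schema client-side. A minimal sketch using only the standard library; `validate_weather_args` is a hypothetical helper written for this one schema, not part of the Parasail SDK:

```python
import re

def validate_weather_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means args match the get_weather schema."""
    errors = []
    for key in ("location", "date"):  # the schema's "required" fields
        if key not in args:
            errors.append(f"missing required field: {key}")
    if "location" in args and not isinstance(args["location"], str):
        errors.append("location must be a string")
    if "date" in args:
        # mirrors the schema's "pattern" for date
        if not isinstance(args["date"], str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", args["date"]):
            errors.append("date must match YYYY-MM-DD")
    return errors

print(validate_weather_args({"location": "Manhattan Beach", "date": "2025-06-03"}))  # []
print(validate_weather_args({"date": "June 3rd"}))
```

For arbitrary schemas, a general-purpose validator such as the `jsonschema` package is a better fit; this hand-rolled version just shows the idea without extra dependencies.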

A | OpenAI-style REST call (schema enforced server-side)

```python
import os, json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.parasail.io/v1",
    api_key=os.getenv("PARASAIL_API_KEY")
)

resp = client.chat.completions.create(
    model="parasail-llama-4-scout-instruct",
    messages=[{"role": "user", "content": example_sentence}],
    tools=[{"type": "function", "function": tool_schema}],  # OpenAI tool format wraps the schema
    tool_choice="auto"  # let the model decide whether to call get_weather
)

# .arguments is a JSON string; parse it before use
args = json.loads(resp.choices[0].message.tool_calls[0].function.arguments)
print(json.dumps(args, indent=2))
```

Typical output

```json
{
  "location": "Manhattan Beach",
  "date": "2025-06-03"
}
```

Because the gateway validates against tool_schema, the payload is guaranteed to match.
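After executing `get_weather` locally, you hand the result back to the model as a `tool` message so it can compose a natural-language answer. A sketch of that second request, following the OpenAI chat message format; `build_followup_messages` is a hypothetical helper, and the weather result is made up:

```python
import json

def build_followup_messages(user_msg, assistant_msg, tool_result):
    """Append the assistant's tool call and the tool's JSON result to the chat history."""
    return [
        {"role": "user", "content": user_msg},
        assistant_msg,  # the message containing tool_calls from the first response
        {
            "role": "tool",
            "tool_call_id": assistant_msg["tool_calls"][0]["id"],
            "content": json.dumps(tool_result),
        },
    ]

# Example wiring (tool_result would come from your weather service):
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": '{"location": "Manhattan Beach", "date": "2025-06-03"}'},
    }],
}
messages = build_followup_messages(
    "What's the weather like in Manhattan Beach on 2025-06-03?",
    assistant_msg,
    {"temp_f": 68, "conditions": "sunny"},
)
# A second client.chat.completions.create(model=..., messages=messages) call
# then returns an answer grounded in the tool result.
```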


B | Prompt-based call (text → JSON)

```python
# vLLM's LLM class runs models locally and does not accept api_key/base_url;
# a Parasail-hosted model is reached through the same OpenAI-compatible endpoint,
# here used without the tools parameter so the model returns raw text.
import os, json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.parasail.io/v1",
    api_key=os.getenv("PARASAIL_API_KEY")
)

prompt = f"""
You have access to this tool:
{json.dumps(tool_schema, indent=2)}

User Query: "{example_sentence}"

Respond **only** in JSON:
{{
  "tool_name": "get_weather",
  "parameters": {{
    "location": "",
    "date": ""
  }}
}}
"""

resp = client.chat.completions.create(
    model="parasail-llama-4-scout-instruct",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    max_tokens=100,
)
raw_out = resp.choices[0].message.content.strip()
tool_call = json.loads(raw_out)

print(json.dumps(tool_call, indent=2))
```

Typical output

```json
{
  "tool_name": "get_weather",
  "parameters": {
    "location": "Manhattan Beach",
    "date": "2025-06-03"
  }
}
```

Now you can hand tool_call["parameters"] to your weather micro-service. Switch between the gateway-validated and prompt-based approaches as needed: your model, schema, and business logic stay the same.
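One caveat with the prompt-based path: models sometimes wrap their JSON in markdown fences or surround it with prose, which makes a bare json.loads fail. A small extraction helper hardens the parse; `extract_json` is a sketch written for this guide, not a library function, and its brace counting ignores braces inside strings, which is fine for simple payloads like this one:

```python
import json

def extract_json(text: str) -> dict:
    """Parse the first top-level JSON object found in a model's raw text output."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matching close brace for the first open brace
                return json.loads(text[start:i + 1])
    raise ValueError("unbalanced JSON object in model output")

raw = ('Sure! ```json\n'
       '{"tool_name": "get_weather", "parameters": '
       '{"location": "Manhattan Beach", "date": "2025-06-03"}}\n'
       '``` hope that helps')
print(extract_json(raw)["tool_name"])  # get_weather
```

Replacing `json.loads(raw_out)` in the example above with `extract_json(raw_out)` keeps the pipeline working even when the model adds decoration around its answer.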
