Batch file format
Parasail supports the OpenAI batch API, including the format of the input and output files.
A batch input file is a .jsonl file where each line is a batch request. Parasail processes these requests and returns the results as an output .jsonl file.
The batch input wraps the same data structures used by interactive requests into an offline format. Each line in the input file is one request.
The request is a JSON dictionary with the following keys:

- custom_id: A unique value that the user creates so they can later match outputs to inputs.
- method: HTTP method, currently POST.
- url: One of /v1/chat/completions or /v1/embeddings.
- body: The same as the body of an interactive request. Note: stream must be omitted or false. For embeddings, we strongly recommend using "encoding_format": "base64". See below.
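For illustration, a single input line for a chat request might be constructed like this (the model name and message content are placeholders, not values from this page):

```python
import json

# One batch request per line; custom_id must be unique within the file.
request = {
    "custom_id": "request-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "example-model",  # placeholder: use a model available to your account
        "messages": [{"role": "user", "content": "Hello!"}],
        # "stream" must be omitted or set to false in batch requests.
    },
}
line = json.dumps(request)
print(line)
```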
Both Parasail and OpenAI limit the maximum size of batch input files:
Up to 50,000 requests (lines)
Up to 100MB total input file size
Workloads exceeding these limits must be split into multiple batches.
The batch output file wraps the interactive responses into an offline format. Each line in the output file is the response to one request.
Important: The order of responses may differ from the order of requests in the input file.
The response is a JSON dictionary. The most important keys:

- custom_id: The same as the value in the request. Use this to match responses to requests.
- response: The HTTP response, as a dictionary:
  - status_code: HTTP status code. 200 if the request succeeded; otherwise the same HTTP error code as if the request were interactive.
  - body: The same as the body of an interactive response.
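Because output order is not guaranteed, a common pattern is to index responses by custom_id. A minimal sketch (the inline sample line stands in for reading a real output file):

```python
import json

# In practice, iterate over the lines of your downloaded output .jsonl file,
# e.g. with open("batch_output.jsonl") as f. A sample line is used here.
output_lines = [
    '{"custom_id": "request-1", "response": {"status_code": 200, "body": {}}}',
]

# Index each response by its custom_id so it can be matched to its request.
results = {}
for line in output_lines:
    record = json.loads(line)
    results[record["custom_id"]] = record["response"]

# Check the status code before using the body.
response = results["request-1"]
if response["status_code"] == 200:
    body = response["body"]
```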
It is easy to create a batch input file; exactly how depends on your workflow. The following code snippets get you started.
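As one sketch, plain Python with the standard library is enough to write one request per line (the prompts, model name, and file name are placeholders):

```python
import json

prompts = ["First prompt", "Second prompt"]  # placeholder data

with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",  # unique per line
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "example-model",  # placeholder model name
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(request) + "\n")
```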
For embeddings, we strongly recommend using "encoding_format": "base64". This setting asks the server to return the result as a Base64-encoded binary array instead of a plain-text float array. This typically reduces file size by nearly 4x with no loss of precision, and it is the default used by the interactive OpenAI client.
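With this setting, the embedding field in each response body arrives as a Base64 string of packed float32 values. A decoding sketch using only the standard library (it assumes little-endian float32, which is how the OpenAI Python client decodes base64 embeddings):

```python
import base64
import struct

def decode_embedding(b64_string: str) -> list[float]:
    """Decode a Base64-encoded embedding into a list of floats.

    Assumes little-endian float32 values (4 bytes per element).
    """
    raw = base64.b64decode(b64_string)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip demonstration with made-up values that are exact in float32:
encoded = base64.b64encode(struct.pack("<3f", 0.25, -1.5, 3.0)).decode()
print(decode_embedding(encoded))  # → [0.25, -1.5, 3.0]
```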