# Quick start

***This page gives an introduction to Parasail's Batch Processing Library. For more detailed information, jump to:***

* [Using the Batch UI](#parasail-batch-ui)
* [Parasail's Python Batch Helper Library Reference](https://docs.parasail.io/parasail-docs/batch/python-batch-helper-library)
* [Batch file format (100% compatible with OpenAI)](https://docs.parasail.io/parasail-docs/batch/batch-file-format)
* [OpenAI Batch API reference (Parasail is 100% compatible)](https://docs.parasail.io/parasail-docs/batch/api-reference)

Parasail's batch processing engine is a straightforward and inexpensive way to process thousands or millions of LLM inferences. Batch inferencing is easy: create an input file, start a batch job, wait for it to finish, then download the output.


### Getting Started with the Parasail Batch Helper Library

{% hint style="info" %}
You can find the Parasail Batch Helper Library on Github. This library is 100% compatible with both Parasail and OpenAI\
<https://github.com/parasail-ai/openai-batch>
{% endhint %}

The first step is to create a Parasail API key: <https://www.saas.parasail.io/keys>. Store the key in your environment (for example, in a `.bashrc` or `.env` file) or pass it on the command line. *Pasting it directly into code isn't recommended, as keys can get leaked.*

Next, install ***openai-batch***, the batch helper library that's compatible both with Parasail and OpenAI:

```bash
pip install openai-batch
```

Now you're ready to run a batch job in as little as five lines of code. The batch endpoint supports most transformer models on Hugging Face: just put the Hugging Face model ID in the request. There's no need to deploy the model as a dedicated endpoint first. In this example, Parasail uses **NousResearch/DeepHermes-3-Mistral-24B-Preview** (<https://huggingface.co/NousResearch/DeepHermes-3-Mistral-24B-Preview>).

{% code overflow="wrap" %}

```python
#test_batch.py
import random
from openai_batch import Batch

# Create a batch with random prompts
with Batch() as batch:
    objects = ["cat", "robot", "coffee mug", "spaceship", "banana"]
    for i in range(100):
        batch.add_to_batch(
            model="NousResearch/DeepHermes-3-Mistral-24B-Preview",
            messages=[{"role": "user", "content": f"Tell me a joke about a {random.choice(objects)}"}]
        )
    # Submit, wait for completion, and download results
    result, output_path, error_path = batch.submit_wait_download()
    print(f"Batch completed with status {result.status} and stored in {output_path}")
```

{% endcode %}

This code looks for `PARASAIL_API_KEY` in the environment; the simplest way to supply it is on the command line. Running the script produces the following output:

```bash
PARASAIL_API_KEY=<INSERT API KEY> python3 test_batch.py
validating
in_progress
...
in_progress
in_progress
completed
Batch completed with status completed and stored in batch-itvt3wmjs7-output.jsonl
```

The first two lines of `batch-itvt3wmjs7-output.jsonl` are the prompt responses:

```
{"id":"vllm-51653b28a97e4e67a0d3587f959ffe3a","custom_id":"line-1","response":{"status_code":200,"request_id":"vllm-batch-1c7186c472cd49b1aa12a81760094540","body":{"id":"chatcmpl-01be171caf3b4dc69cfa9ef360d26a4e","object":"chat.completion","created":1743050291,"model":"NousResearch/DeepHermes-3-Mistral-24B-Preview","choices":[{"index":0,"message":{"role":"assistant","reasoning_content":null,"content":"Why did the coffee mug get arrested? It had too much to behave!","tool_calls":[]},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":41,"total_tokens":60,"completion_tokens":19,"prompt_tokens_details":null},"prompt_logprobs":null}},"error":null}
{"id":"vllm-58a8ac3f589d4380ad94e00b191da24e","custom_id":"line-3","response":{"status_code":200,"request_id":"vllm-batch-2dd59bd942fa4a25a8792f9b8ee28da1","body":{"id":"chatcmpl-a59bfccd7d134981bf5e1004e77c6854","object":"chat.completion","created":1743050291,"model":"NousResearch/DeepHermes-3-Mistral-24B-Preview","choices":[{"index":0,"message":{"role":"assistant","reasoning_content":null,"content":"Why can't you trust an astronaut? Because they're always out of this world!","tool_calls":[]},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":41,"total_tokens":59,"completion_tokens":18,"prompt_tokens_details":null},"prompt_logprobs":null}},"error":null}
```
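Each line of the output file is a standalone JSON object, so results can be parsed with the standard `json` module. A minimal sketch (`extract_responses` is a hypothetical helper; the field layout matches the output lines above):

```python
import json

def extract_responses(path):
    """Yield (custom_id, content) pairs from a batch output JSONL file."""
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            # Successful requests have error == null;
            # the completion text sits under response.body
            if record.get("error") is None:
                body = record["response"]["body"]
                yield record["custom_id"], body["choices"][0]["message"]["content"]
```

Note that output lines are not guaranteed to be in input order, which is why matching on `custom_id` matters.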

### Batch Submission Limitations

Both Parasail and OpenAI limit the maximum size of the batch input files:

* Up to 50,000 requests (lines)
* Up to **500 MB** total input file size for Parasail
* Up to **250 MB** total input file size for OpenAI
* Max completion tokens for Parasail defaults to 8,192, but can be overridden to 16,384 through the `max_completion_tokens` parameter.

Workloads that exceed these limits must be split into multiple batches. The `add_to_batch` function in the [Parasail Batch Helper Library](https://github.com/parasail-ai/openai-batch) will raise a `ValueError` exception when the file size or request count is exceeded for the provider.
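One straightforward way to stay under these limits is to slice the workload into chunks and open a new `Batch` per chunk. A hedged sketch, where `chunk_indices` and `submit_in_chunks` are hypothetical helpers and `add_to_batch`/`submit` are the library calls used above (this splits by request count only; huge requests may still need splitting by file size):

```python
MAX_REQUESTS = 50_000  # per-batch request limit from the list above

def chunk_indices(n_requests, max_requests=MAX_REQUESTS):
    """Yield (start, end) slices that keep each chunk under the request limit."""
    for start in range(0, n_requests, max_requests):
        yield start, min(start + max_requests, n_requests)

def submit_in_chunks(prompts, model):
    """Open one Batch per chunk of prompts and return the submitted batch IDs."""
    from openai_batch import Batch
    batch_ids = []
    for start, end in chunk_indices(len(prompts)):
        with Batch() as batch:
            for prompt in prompts[start:end]:
                batch.add_to_batch(
                    model=model,
                    messages=[{"role": "user", "content": prompt}],
                )
            batch_ids.append(batch.submit())
    return batch_ids
```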

### Resuming Batch Jobs

A major convenience of batch processing is the ability to submit a job and resume the monitoring in a different process or flow. A developer can upload hundreds of batch jobs and millions of prompts without worrying about program crashes, errors, or resets—those prompts are processed on Parasail's servers until successful completion.

Changing the last two lines of the previous example to the following submits the job, prints the batch ID, and exits:

```python
    # Submit, wait for completion, and download results
    batch_id = batch.submit()
    print(f"Batch ID: {batch_id}")
```

With a batch ID of `batch-tclfzwczcd`, you can now wait for the batch to finish and download it in a separate script:

{% code overflow="wrap" %}

```python
from openai_batch import Batch
import time

with Batch(batch_id="batch-tclfzwczcd",
           output_file="mybatch.jsonl") as batch:
    # Check status periodically
    while True:
        status = batch.status()
        print(f"Batch status: {status.status}")

        if status.status in ["completed", "failed", "expired", "cancelled"]:
            break

        time.sleep(60)  # Check every minute

    # Download results once completed
    output_path, error_path = batch.download()
    print(f"Output saved to: {output_path}")
```

{% endcode %}

This outputs:

```
Batch status: in_progress
...
Batch status: in_progress
Output saved to: mybatch.jsonl
```

### Broad Model and Parameter Support

{% hint style="info" %}
Chat completions, embeddings, multimodal inputs, the full range of prompt parameters, and **even OpenAI models** are supported.
{% endhint %}

**OpenAI Models**

{% code overflow="wrap" %}

```python
with Batch() as batch:
    for i in range(100):
        batch.add_to_batch(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Give me 10 dad jokes"}]
        )
```

{% endcode %}


#### **Embedding Models** <a href="#batch-embedding-models" id="batch-embedding-models"></a>

GritLM (developed by Contextual and hosted on Hugging Face by Parasail) and GTE-Qwen2-7B-Instruct from Alibaba are two excellent open-source embedding models that rival proprietary ones. Embedding models like these can be run simply by changing `messages` to `input`.

[parasail-ai/GritLM-7B-vllm](https://huggingface.co/parasail-ai/GritLM-7B-vllm)

[Alibaba-NLP/gte-Qwen2-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct)

Parasail strongly recommends using [base64 encoding](https://platform.openai.com/docs/api-reference/embeddings/create#embeddings-create-encoding_format) (`encoding_format="base64"`) to reduce the size of the output files for both Parasail and OpenAI.

```python
with Batch() as batch:
    for i in range(100):
        batch.add_to_batch(
            model="parasail-ai/GritLM-7B-vllm",
            encoding_format="base64",
            input=f"This is input #{i}"
        )
```
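With `encoding_format="base64"`, each embedding is returned as a base64 string of packed little-endian float32 values rather than a JSON array of numbers. A minimal decoding sketch (`decode_embedding` is a hypothetical helper; verify the byte layout against your own output):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded embedding into a list of floats."""
    raw = base64.b64decode(b64)
    # Each float32 occupies 4 bytes, little-endian
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```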

**Parameters**

`add_to_batch` supports all of the parameters that `client.chat.completions.create` or `client.embeddings.create` supports, though note that open-source models on Parasail may not support every parameter.

```python
with Batch() as batch:
    for i in range(100):
        batch.add_to_batch(
            model="NousResearch/DeepHermes-3-Mistral-24B-Preview",
            max_completion_tokens=1000,
            temperature=0.7,
            top_p=0.1,
            messages=[{"role": "user", "content": "Give me 10 dad jokes"}]
        )
```

**Metadata**

You can add metadata to the batch. This is useful for passing information between separate submit and download processes, as well as tracking the results on the status UI page. Any metadata you add to the submission is visible in the Batch UI progress section. For detailed information about the metadata field, see the [OpenAI Batch API Submit Specification](https://platform.openai.com/docs/api-reference/batch/create#batch-create-metadata).

```python
with Batch() as batch:
    for i in range(100):
        batch.add_to_batch(
            model="NousResearch/DeepHermes-3-Mistral-24B-Preview",
            messages=[{"role": "user", "content": "Give me 10 dad jokes"}]
        )
    batch.submit(metadata={"Job Name": "Dad jokes"})
```

### **Parasail Batch UI**

The Parasail Batch UI can be used to upload new batches, track the status of batches, and download results when finished.

While a batch is queued or running, you can view its status, download the input, or cancel it:

<figure><img src="https://3807676826-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLSXNQZeD4w30hUaiugTx%2Fuploads%2Fgit-blob-59b05d9e6a50a36c9781480c91f77c5f62e75574%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

When a batch job is finished, you can download the input and the output file as well as view the total token usage:

<figure><img src="https://3807676826-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLSXNQZeD4w30hUaiugTx%2Fuploads%2Fgit-blob-3252f56ac67f15c88753f83340e7aaf2f559de85%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

**Batch Submission**

Batches can be submitted directly from the Batch section of the platform by clicking the **Create Batch** <img src="https://3807676826-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLSXNQZeD4w30hUaiugTx%2Fuploads%2Fgit-blob-bf543d07e3a3382b69b3c57b3e42c35f40ade099%2Fimage.png?alt=media" alt="" data-size="line"> button. This brings up a dialog to upload a JSONL file.

<figure><img src="https://3807676826-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLSXNQZeD4w30hUaiugTx%2Fuploads%2Fgit-blob-c5de29e1c0fbd34ff7e9ebd92ee7caa12d5efc04%2FCreate%20Batch%20screenshot.png?alt=media" alt="" width="375"><figcaption></figcaption></figure>

Below is an example JSONL file, which follows the OpenAI format for batch submissions. Unlike the Parasail Batch Helper Library, this upload path doesn't support OpenAI models; it only supports open-source Hugging Face transformer and embedding models.

{% file src="<https://3807676826-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLSXNQZeD4w30hUaiugTx%2Fuploads%2Fgit-blob-75d4d6ae82da3ba27b9b20ecf6562f3ed2061493%2Fbatch-input-example.jsonl?alt=media>" %}
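Each line of such a file is one request in the OpenAI batch format: a JSON object with `custom_id`, `method`, `url`, and a `body` holding the usual chat-completion payload. A minimal sketch that generates a compatible input file (`write_batch_input` is a hypothetical helper):

```python
import json

def write_batch_input(path, model, prompts):
    """Write one OpenAI-format batch request per line to a JSONL file."""
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts, start=1):
            request = {
                "custom_id": f"line-{i}",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")
```

The `custom_id` of each request is echoed back in the corresponding output line, which is how responses are matched to inputs.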

**Further Reading**

* [Parasail's Python Batch Helper Library Reference](https://docs.parasail.io/parasail-docs/batch/python-batch-helper-library)
* [Batch file format (100% compatible with OpenAI)](https://docs.parasail.io/parasail-docs/batch/batch-file-format)
* [OpenAI Batch API reference (Parasail is 100% compatible)](https://docs.parasail.io/parasail-docs/batch/api-reference)
