# OpenAI-Batch Library
Batch inference is an easy and inexpensive way to process thousands or millions of LLM requests. The process is:

1. Write the requests to an input file (see the example after this list)
2. Start a batch job
3. Wait for it to finish
4. Download the output
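Each line of the input file is one request in the OpenAI batch format. A minimal example line (the `custom_id`, model, and message are illustrative):

```json
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}}
```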
This library aims to make these steps easier. The OpenAI batch protocol is relatively simple, but it involves a lot of boilerplate, and this library automates it.
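For a sense of what is being automated, here is a rough sketch of the same four steps written against the raw OpenAI Python SDK (file names and the polling interval are illustrative, and error handling is omitted):

```python
import time

from openai import OpenAI

client = OpenAI()

# 1. Upload the input file
input_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")

# 2. Start the batch job
batch = client.batches.create(
    input_file_id=input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll until the batch reaches a terminal state
while batch.status not in ("completed", "failed", "expired", "cancelled"):
    time.sleep(60)
    batch = client.batches.retrieve(batch.id)

# 4. Download the output
if batch.output_file_id:
    client.files.content(batch.output_file_id).write_to_file("output.jsonl")
```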
## Supported Providers

- OpenAI - ChatGPT, GPT-4o, etc.
- Parasail - Most transformer models on Hugging Face, such as Llama, Qwen, LLaVA, etc.

Note that Anthropic's Message Batches use a different API.
Use `openai_batch.run` to run a batch from an input file on disk, e.g. (assuming the file is passed as a positional argument; see `--help` below for exact usage):
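```sh
python -m openai_batch.run requests.jsonl
```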
This will start the batch, wait for it to complete, then download the results.
Useful switches:

- `-c` - Only create the batch; do not wait for it to finish.
- `--resume` - Attach to an existing batch job, wait for it to finish, then download the results.
- `--dry-run` - Confirm your configuration without making an actual request.
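For example, a create-now, collect-later workflow might look like this (whether `--resume` takes a batch id as shown is an assumption; check `--help`):

```sh
# Create the batch and return immediately
python -m openai_batch.run -c requests.jsonl

# Later: reattach, wait for completion, and download the results
# (the batch id below is hypothetical)
python -m openai_batch.run --resume batch_abc123
```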
Full list: `python -m openai_batch.run --help`