Metadata-Version: 2.1
Name: batch-inference
Version: 1.0rc1
Summary: Batch Inference
Author-email: Xi Chen <xichen5@microsoft.com>, Lu Ye <luye@microsoft.com>, Yong Huang <yohuan@microsoft.com>
Project-URL: Homepage, https://msasg.visualstudio.com/Bing_and_IPG/_git/batch-inference
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: filelock
Requires-Dist: grpcio
Requires-Dist: msgpack
Requires-Dist: msgpack-numpy
Requires-Dist: numpy
Provides-Extra: docs
Requires-Dist: flask (>=2.0.2) ; extra == 'docs'
Requires-Dist: furo (>=2022.12.7) ; extra == 'docs'
Requires-Dist: sphinx (>=6.1.3) ; extra == 'docs'
Requires-Dist: sphinx-autodoc-typehints (!=1.23.4,>=1.22) ; extra == 'docs'
Requires-Dist: transformers (>=4.27.4) ; extra == 'docs'
Provides-Extra: testing
Requires-Dist: onnxruntime ; extra == 'testing'
Requires-Dist: pytest (>=7.2.2) ; extra == 'testing'
Requires-Dist: torch ; extra == 'testing'

# Batch Inference Toolkit

Batch Inference Toolkit (batch-inference) is a Python package that dynamically batches model input tensors coming from multiple users, executes the model once on the combined batch, un-batches the output tensors, and returns each result to the user that sent it. This improves system throughput because of better cache locality. The entire process is transparent to developers.
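The batch, execute, un-batch flow described above can be illustrated with a minimal, synchronous sketch. This is not the package's implementation: `run_batched` and `double_rows` are hypothetical names, plain Python lists stand in for tensors, and the real toolkit also handles queuing and threading, which are omitted here.

```python
from typing import Callable, List

Row = List[float]
Request = List[Row]  # one user's input: a list of rows


def run_batched(
    requests: List[Request],
    predict_batch: Callable[[List[Row]], List[Row]],
) -> List[List[Row]]:
    """Concatenate requests, call the model once, split the output per request."""
    # Remember how many rows each user sent so the output can be split later.
    sizes = [len(request) for request in requests]
    # Batch: stack all rows from all users into one input.
    batch = [row for request in requests for row in request]
    # Execute: a single model call for the whole batch.
    outputs = predict_batch(batch)
    # Un-batch: slice the output back into per-user results, in order.
    results, start = [], 0
    for size in sizes:
        results.append(outputs[start:start + size])
        start += size
    return results


def double_rows(batch: List[Row]) -> List[Row]:
    # Hypothetical stand-in model: doubles every value.
    return [[2 * value for value in row] for row in batch]


# Two users: the first sends one row, the second sends two.
requests = [[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]]
results = run_batched(requests, double_rows)
print(results)  # [[[2.0, 4.0]], [[6.0, 8.0], [10.0, 12.0]]]
```

Each user gets back exactly the rows that correspond to their own input, even though the stand-in model was called only once.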

## Installation

**Install from Pip** _(Coming Soon)_

```bash
python -m pip install batch-inference --upgrade
```

**Build and Install from Source** _(for developers)_

```bash
git clone https://msasg.visualstudio.com/DefaultCollection/Bing_and_IPG/_git/batch-inference
cd batch-inference
# quote the extras so shells such as zsh do not expand the brackets
python -m pip install -e ".[docs,testing]"

# if you want to format the code before each commit
pip install pre-commit
pre-commit install

# run unit tests
python -m unittest discover tests
```

## Example

```python
import threading
import numpy as np
from batch_inference import batching


@batching(max_batch_size=32)
class MyModel:
    def __init__(self, k, n):
        # randn takes the dimensions as separate arguments, not as a tuple
        self.weights = np.random.randn(k, n).astype("f")

    # x: [batch_size, m, k], self.weights: [k, n]
    def predict_batch(self, x):
        y = np.matmul(x, self.weights)
        return y


with MyModel.host(3, 3) as host:
    def send_requests():
        for _ in range(0, 10):
            x = np.random.randn(1, 3, 3).astype("f")
            y = host.predict(x)

    # 32 concurrent clients; their requests are batched together by the host
    threads = [threading.Thread(target=send_requests) for _ in range(32)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

```

## Build the Docs

Run the following commands, then open `docs/_build/html/index.html` in a browser.

```bash
pip install sphinx myst-parser sphinx-rtd-theme sphinxemoji
cd docs/

make html         # on Linux
.\make.bat html   # on Windows
```
