Metadata-Version: 2.4
Name: pendra
Version: 0.2.0
Summary: Python SDK for Pendra — UK-based, privacy-first LLM inference
Author-email: Pendra <hello@pendra.ai>
License: Apache-2.0
Project-URL: Homepage, https://pendra.ai
Project-URL: Documentation, https://pendra.ai/docs/python
Keywords: llm,ai,inference,uk,openai-compatible
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.27
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.24; extra == "dev"
Requires-Dist: respx>=0.22; extra == "dev"

# pendra-python

Official Python SDK for [Pendra](https://pendra.ai) — UK-based, privacy-first LLM inference.

Your data is processed in the UK, never stored, and never shared with US cloud providers.

## Installation

```bash
pip install pendra
```

## Quick Start

```python
import pendra

client = pendra.Pendra(
    api_key="pdr_sk_...",  # or set PENDRA_API_KEY env var
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of the UK?"},
    ],
)

print(response.choices[0].message.content)
# → London is the capital of the United Kingdom.
```

## Streaming

```python
with client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write me a short poem about London."}],
    stream=True,
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
```
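
If you need the complete text as well as live output, the streamed deltas can be accumulated as they arrive. The following is a minimal sketch rather than part of the SDK: `collect_stream` and `fake_chunk` are hypothetical helpers, and the `SimpleNamespace` objects only stand in for real chunks, assuming they expose `choices[0].delta.content` as in the example above.

```python
from types import SimpleNamespace


def collect_stream(chunks, echo=False):
    """Join the text deltas from an iterable of chat completion chunks.

    Hypothetical helper, not part of the pendra package.
    """
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        text = chunk.choices[0].delta.content or ""
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute shape (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Lon"), fake_chunk("don"), fake_chunk(None)]
full_text = collect_stream(demo)
print(full_text)
```

With a real stream, you would pass the `stream` object from the `with` block in place of `demo`.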

## Async

```python
import asyncio
import pendra

async def main():
    async with pendra.AsyncPendra(api_key="pdr_sk_...") as client:
        # Non-streaming
        response = await client.chat.completions.create(
            model="llama3.2",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

        # Streaming
        stream = await client.chat.completions.create(
            model="llama3.2",
            messages=[{"role": "user", "content": "Count to 5"}],
            stream=True,
        )
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
```
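
A common reason to reach for the async client is to send several requests concurrently. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps the number of requests in flight. `gather_completions` and `fake_complete` are hypothetical names, and `fake_complete` is only a stand-in for a real call. The concurrency limit is an illustrative value, not a documented Pendra limit.

```python
import asyncio


async def gather_completions(complete, prompts, max_concurrency=4):
    """Run `complete(prompt)` for each prompt, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt):
        async with semaphore:
            return await complete(prompt)

    # gather returns results in the same order as `prompts`
    return await asyncio.gather(*(run_one(p) for p in prompts))


async def fake_complete(prompt):
    """Stand-in for a real request; swap in a call to the async client."""
    await asyncio.sleep(0)
    return prompt.upper()


results = asyncio.run(gather_completions(fake_complete, ["hello", "count to 5"]))
print(results)
```

In real use, `complete` would be a small coroutine that awaits `client.chat.completions.create(...)` and returns `response.choices[0].message.content`.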

## List Models

```python
models = client.models.list()
for model in models:
    print(model.id)
```

## Image Generation

Generate images from a text prompt. By default, the response contains base64-encoded PNGs.

```python
import base64

response = client.images.generations.create(
    model="x/z-image-turbo",
    prompt="A red London double-decker bus at sunset",
    size="1024x1024",
)

with open("bus.png", "wb") as f:
    f.write(base64.b64decode(response.data[0].b64_json))
```

Async usage mirrors the sync API:

```python
import asyncio
import pendra

async def main():
    async with pendra.AsyncPendra(api_key="pdr_sk_...") as client:
        response = await client.images.generations.create(
            model="x/z-image-turbo",
            prompt="A red London double-decker bus at sunset",
        )

asyncio.run(main())
```

Image generation is non-streaming — the response is returned as a single JSON payload once the worker finishes.
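
Because the payload arrives as base64 text, it can be worth checking that the decoded bytes really are a PNG before writing them to disk. The sketch below does this by comparing against the standard 8-byte PNG signature. `save_png` is a hypothetical helper, not part of the SDK, and the demo payload is a stand-in rather than a real image.

```python
import base64
import tempfile
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes that open every PNG file


def save_png(b64_data, path):
    """Decode a base64 string, check it is a PNG, and write it to `path`.

    Returns the number of bytes written.
    """
    raw = base64.b64decode(b64_data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    Path(path).write_bytes(raw)
    return len(raw)


# Stand-in payload for illustration: a PNG signature plus filler, not a real image.
fake_b64 = base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
with tempfile.TemporaryDirectory() as tmp:
    written = save_png(fake_b64, Path(tmp) / "bus.png")
print(written)
```

With a real response, you would call `save_png(response.data[0].b64_json, "bus.png")`.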

## Environment Variables

| Variable | Description |
|----------|-------------|
| `PENDRA_API_KEY` | Your Pendra API key (`pdr_sk_...`) |
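
If you want to fail early with a clear message when the key is missing, you can resolve it yourself before constructing a client. This is a sketch under stated assumptions: `resolve_api_key` is a hypothetical helper, the precedence (explicit argument first, then the environment) is assumed rather than taken from the SDK, and the prefix check relies only on the `pdr_sk_` format shown in this README.

```python
import os


def resolve_api_key(explicit=None, env=os.environ):
    """Return the explicit key if given, else PENDRA_API_KEY from `env`."""
    key = explicit or env.get("PENDRA_API_KEY")
    if not key:
        raise RuntimeError("Set PENDRA_API_KEY or pass api_key explicitly")
    if not key.startswith("pdr_sk_"):
        # Assumption: keys follow the pdr_sk_ format shown in this README.
        raise ValueError("Pendra API keys are expected to start with 'pdr_sk_'")
    return key


# A plain dict stands in for the process environment in this demo.
print(resolve_api_key(env={"PENDRA_API_KEY": "pdr_sk_example"}))
```

The result can then be passed as `api_key=` when creating the client.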

## OpenAI Compatibility

The Pendra SDK is fully compatible with the OpenAI Python SDK interface. To migrate:

```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After
from pendra import Pendra
client = Pendra(api_key="pdr_sk_...")
```

The `client.chat.completions.create()` interface is identical.

## Self-Hosted Workers

Run inference on your own GPUs with a single command. Your prompts and completions never leave your infrastructure.

```bash
curl -fsSL https://get.pendra.ai/worker | bash
```

See the [Workers documentation](https://pendra.ai/docs/workers) for full setup instructions.

## Licence

Apache-2.0
