Metadata-Version: 2.4
Name: brain-proxy
Version: 0.0.1
Summary: OpenAI-compatible FastAPI router with Chroma + LangMem memory.
Home-page: https://github.com/puntorigen/brain-proxy
Author: Pablo Schaffner
Author-email: Pablo Schaffner <pablo@puntorigen.com>
Project-URL: Homepage, https://github.com/puntorigen/brain-proxy
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: fastapi
Requires-Dist: openai
Requires-Dist: langchain-chroma
Requires-Dist: langmem
Requires-Dist: tiktoken
Requires-Dist: pydantic
Requires-Dist: langchain-openai
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# 🧠 brain-proxy

**Turn any FastAPI backend into a fully featured OpenAI-compatible LLM proxy — with memory, RAG, streaming, and file uploads.**

> Like the OpenAI `/chat/completions` endpoint — but with context, memory, and smart file ingestion.

---

## ✨ Features

- ✅ OpenAI-compatible `/chat/completions` (drop-in SDK support)
- ✅ Multi-tenant routing (`/v1/<tenant>/chat/completions`)
- ✅ File ingestion via `file_data` messages
- ✅ RAG with Chroma + LangChain
- ✅ LangMem-powered long- and short-term memory
- ✅ Streaming via Server-Sent Events
- ✅ Custom text extractor support for PDFs, CSVs, etc.
- ✅ No frontend changes required

---

## 🚀 Installation

```bash
pip install brain-proxy
```

---

## ⚡ Quickstart

```python
from fastapi import FastAPI
from brain_proxy import BrainProxy

proxy = BrainProxy(
    openai_api_key="sk-...",  # used for both LLM + embeddings
)

app = FastAPI()
app.include_router(proxy.router, prefix="/v1")
```

Now any OpenAI SDK can point to:

```
http://localhost:8000/v1/<tenant>/chat/completions
```

---

## 🧠 Multi-tenancy explained

Every tenant (`/v1/acme`, `/v1/alpha`, etc.):

- Gets its own vector store (for RAG)
- Has isolated LangMem memory (short- and long-term)
- Can upload files (auto-indexed + persisted)

Because tenants share neither a vector store nor memory, a single backend can serve multiple brands or users without their data mixing.
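
Since the tenant is just a path segment, a client can derive one base URL per tenant. The helper below is a minimal sketch of that idea; `tenant_base_url` is a hypothetical name (not part of brain-proxy), and it assumes the router is mounted under `/v1` as in the Quickstart.

```python
from urllib.parse import quote


def tenant_base_url(server: str, tenant: str, prefix: str = "/v1") -> str:
    """Return the base URL an OpenAI SDK client would use for one tenant.

    Hypothetical helper: assumes the proxy router is mounted at `prefix`.
    """
    if not tenant or "/" in tenant:
        raise ValueError("tenant must be a single non-empty path segment")
    # Percent-encode the tenant so unusual names stay a single path segment.
    return f"{server.rstrip('/')}{prefix}/{quote(tenant, safe='')}"


for tenant in ("acme", "alpha"):
    print(tenant_base_url("http://localhost:8000", tenant))
```

Each URL can then be passed as `base_url` to a separate SDK client, so requests for different tenants never share a client.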

---

## 💬 OpenAI SDK Example

```python
# Requires openai>=1.0 (client-object API).
from openai import OpenAI

# The tenant ("acme") is part of the base URL.
client = OpenAI(
    api_key="sk-fake",
    base_url="http://localhost:8000/v1/acme",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's 3 + 2?"}],
)

print(response.choices[0].message.content)
```

### Streaming:

```python
# Reuses the `client` created in the previous example.
stream = client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    messages=[{"role": "user", "content": "Tell me a short story about an AI fox."}],
)

for chunk in stream:
    # delta.content is None on chunks that carry no text.
    print(chunk.choices[0].delta.content or "", end="")
```

---

## 📎 File Uploads

Send `file_data` parts inside messages to upload PDFs, CSVs, images, etc.:

```json
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Here's a report:" },
    { "type": "file_data", "file_data": {
        "name": "report.pdf",
        "mime": "application/pdf",
        "data": "...base64..."
    }}
  ]
}
```

Files are saved, parsed, embedded, and used in RAG on the fly.
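
A message like the one above can be assembled in Python. This is a minimal sketch: `file_data_part` and `user_message` are hypothetical helpers (not part of brain-proxy), and it assumes `data` carries plain base64 with no data-URL prefix, as the JSON example suggests.

```python
import base64
import mimetypes
from typing import Optional


def file_data_part(name: str, raw: bytes, mime: Optional[str] = None) -> dict:
    """Build one `file_data` content part from raw file bytes."""
    # Fall back to a guess from the file name, then to a generic binary type.
    resolved = mime or mimetypes.guess_type(name)[0] or "application/octet-stream"
    return {
        "type": "file_data",
        "file_data": {
            "name": name,
            "mime": resolved,
            # Assumption: plain base64 text, matching the "...base64..." placeholder.
            "data": base64.b64encode(raw).decode("ascii"),
        },
    }


def user_message(text: str, *parts: dict) -> dict:
    """Combine a text part and any number of file parts into one user message."""
    return {"role": "user", "content": [{"type": "text", "text": text}, *parts]}


message = user_message("Here's a report:", file_data_part("report.pdf", b"%PDF-1.4"))
```

The resulting `message` dict can be placed in the `messages` list of a chat completion request. For a file on disk, pass `Path(...).read_bytes()` as `raw`.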

---

## 🧾 Custom PDF extractor example

```python
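from pathlib import Path

# extract_text comes from the pdfminer.six package (pip install pdfminer.six)
from pdfminer.high_level import extract_text

def parse_pdf(path: Path, mime: str) -> str:
    # Only PDFs are handled here; every other MIME type gets a placeholder.
    if mime == "application/pdf":
        return extract_text(path)
    return "(unsupported format)"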
```

```python
proxy = BrainProxy(
    openai_api_key="sk-...",
    extract_text=parse_pdf
)
```
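
The same hook can cover several formats by dispatching on the MIME type. The sketch below uses only the standard library and assumes the extractor is called with `(path, mime)` and returns a string, as in the PDF example; `parse_file`, `extract_csv`, `extract_plain`, and `EXTRACTORS` are illustrative names, not part of brain-proxy.

```python
import csv
from pathlib import Path


def extract_csv(path: Path) -> str:
    """Flatten a CSV file into one comma-separated line of text per row."""
    with path.open(newline="", encoding="utf-8") as fh:
        return "\n".join(", ".join(row) for row in csv.reader(fh))


def extract_plain(path: Path) -> str:
    """Read a text file, replacing bytes that are not valid UTF-8."""
    return path.read_text(encoding="utf-8", errors="replace")


# Map each supported MIME type to the function that handles it.
EXTRACTORS = {
    "text/csv": extract_csv,
    "text/plain": extract_plain,
}


def parse_file(path: Path, mime: str) -> str:
    """Pick an extractor by MIME type; unknown types get a placeholder."""
    handler = EXTRACTORS.get(mime)
    if handler is None:
        return "(unsupported format)"
    return handler(path)
```

Passing `extract_text=parse_file` to `BrainProxy` would wire it in the same way as `parse_pdf`. Supporting another format then only means adding one entry to `EXTRACTORS`.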

---

## 📦 Roadmap

- [x] Multi-agent manager hook
- [x] Usage hooks + token metering
- [ ] Switch to LiteLLM to support more models
- [ ] MCP support
- [ ] LangGraph integration

---

## ⚖️ License

MIT — free to use, fork, and build on.  
Made for backend devs who want to move fast ⚡

---

## ❤️ Contributing

Issues and PRs welcome!

Let’s build smarter backends — together.
