Metadata-Version: 2.4
Name: coffloader
Version: 0.1.0
Summary: External memory for AI agents — offload context to a VFS, index summaries, retrieve on demand.
Project-URL: Homepage, https://github.com/mingyk/coffloader
Project-URL: Repository, https://github.com/mingyk/coffloader
Author: coffloader contributors
License-Expression: MIT
License-File: LICENSE
Keywords: agent,context,llm,memory,rag,vfs
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: embed
Requires-Dist: numpy>=1.21; extra == 'embed'
Requires-Dist: sentence-transformers>=2.2; extra == 'embed'
Description-Content-Type: text/markdown

# coffloader

**External memory for AI agents** — offload context to a VFS, index caller-provided summaries, retrieve on demand.

[![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Status](https://img.shields.io/badge/status-pre--alpha-orange.svg)](#status)

```bash
pip install coffloader              # core (BM25 search)
pip install coffloader[embed]       # + semantic search (sentence-transformers)
```

---

## What it does

Agents accumulate context faster than any context window can hold it. coffloader offloads content to storage, keeps a searchable index of summaries, and retrieves full content on demand.

```
write(content, summary) → store blob + index summary
search(query)           → top-k summaries + addresses
read(address)           → full content
```

**Key constraints:**
- `summary` is **required** on write — your agent/LLM provides it, not coffloader
- No LLM calls inside the library — pure storage and retrieval
- Caller handles contradiction detection, dedup, and reasoning
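
The contract above can be sketched with a toy in-memory model. This is not coffloader's implementation: `ToyStore`, its address format, and the word-overlap scoring are all hypothetical, chosen only to show that search runs over summaries while reads return the full blob.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for the write/search/read contract (not coffloader)."""

    blobs: dict = field(default_factory=dict)      # address -> full content
    summaries: dict = field(default_factory=dict)  # address -> caller-provided summary

    def write(self, content: str, summary: str) -> str:
        # The summary is mandatory: the store never generates one itself.
        if not summary:
            raise ValueError("summary is required")
        address = f"/toy/{len(self.blobs):04d}"  # made-up address scheme
        self.blobs[address] = content
        self.summaries[address] = summary
        return address

    def search(self, query: str, k: int = 5) -> list:
        # Naive word overlap against summaries only; full content is never scanned.
        terms = set(re.findall(r"\w+", query.lower()))
        scored = []
        for address, summary in self.summaries.items():
            overlap = len(terms & set(re.findall(r"\w+", summary.lower())))
            if overlap:
                scored.append((overlap, address, summary))
        scored.sort(key=lambda item: -item[0])
        return [(address, summary) for _, address, summary in scored[:k]]

    def read(self, address: str) -> str:
        return self.blobs[address]
```

Searching returns addresses and summaries; only `read` touches the stored content, which is what keeps the agent's prompt small until a hit is actually needed.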

---

## Quick start

```python
from coffloader import Coffloader

store = Coffloader()

# 1. Offload a conversation segment (summary comes from your agent)
store.write(
    content="[Turn 1] User: I was charged twice for order #9910...",
    summary="Customer reports duplicate charge on order #9910",
    metadata={"session_id": "ticket_8842", "segment": 1},
    path="/sessions/ticket_8842/seg_001.txt",
)

# 2. Later: search when user asks about earlier context
hits = store.search("order number", namespace="/sessions/ticket_8842/")

# 3. Load full content and inject into your LLM
text = store.read_text(hits[0].address)
```

**The loop:** offload cold context → search when needed → read and inject.

---

## API

```python
store = Coffloader(
    backend=None,           # default: in-memory VFS
    max_bytes=512_000,      # default: 512 KB — reject oversized payloads
    on_oversize="reject",   # "reject" or "metadata_only"
    hybrid=True,            # default: True — use BM25 + embeddings if available
    min_similarity=0.3,     # default: 0.3 — filter out weak embedding matches
                            # lower = more results, less relevant
                            # higher = fewer results, more relevant
                            # set to 0.0 to disable filtering
)

# Store content with a caller-provided summary
result = store.write(content, summary, metadata={}, path=None)

# Search indexed summaries (returns a list of TocEntry objects, not full content)
hits = store.search(query, k=5, filters={}, namespace=None)
#                         ^^^ number of results to return

# Load full content
data = store.read(address)          # bytes
text = store.read_text(address)     # str

# Check size before writing
check = store.inspect(content)      # .acceptable, .byte_count

# Delete
store.delete(address)
```

**Defaults are exposed as class attributes:**
```python
Coffloader.DEFAULT_MAX_BYTES       # 512_000
Coffloader.DEFAULT_MIN_SIMILARITY  # 0.3
```
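
To build intuition for `min_similarity`, here is a minimal sketch of threshold filtering over embedding scores. It assumes cosine similarity as the metric, which this README does not state; `cosine_similarity` and `filter_by_similarity` are hypothetical helpers, not part of coffloader.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_by_similarity(query_vec, candidates, min_similarity=0.3):
    """Keep (address, score) pairs scoring at or above the threshold, best first.

    `candidates` is a list of (address, vector) pairs. The cosine metric is an
    assumption for illustration only.
    """
    kept = []
    for address, vec in candidates:
        score = cosine_similarity(query_vec, vec)
        if score >= min_similarity:
            kept.append((address, score))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising the threshold trims weak matches at the cost of recall; lowering it does the opposite, which matches the trade-off described in the constructor comments.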

---

## Composite backends

Route paths to different storage:

```python
from coffloader import Coffloader, CompositeBackend, LocalBackend, MemoryBackend

store = Coffloader(
    backend=CompositeBackend(
        default=MemoryBackend(),
        routes={"/archive/": LocalBackend(root="./data")},
    )
)
```
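
Prefix routing of this kind can be sketched as follows. The longest-prefix-wins rule is an assumption, since the README does not say how overlapping routes are resolved, and `pick_backend` is a hypothetical helper rather than a coffloader function.

```python
def pick_backend(path, routes, default):
    """Return the backend whose route prefix matches `path`, else `default`.

    Assumption for illustration: when several prefixes match, the longest wins.
    """
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default
    return routes[max(matches, key=len)]


# Plain strings stand in for backend objects here.
routes = {"/archive/": "local-disk", "/archive/hot/": "memory-cache"}
chosen = pick_backend("/archive/2024/log.txt", routes, default="memory")
```

With the routes from the example above, anything under `/archive/` would go to local storage and every other path would fall through to the default backend.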

---

## Patterns

**Long session (segmented):** Offload every ~15 turns. Search returns precise segments, not the whole transcript.

```python
store.write(content=turns_1_15, summary="...", path="/sessions/abc/seg_001.txt")
store.write(content=turns_16_30, summary="...", path="/sessions/abc/seg_002.txt")
```
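
A small helper can produce those segments and paths. This is a sketch: `segment_turns` and `segment_path` are hypothetical names, and the zero-padded numbering simply mirrors the paths shown above.

```python
def segment_turns(turns, size=15):
    """Yield consecutive slices of at most `size` turns."""
    for start in range(0, len(turns), size):
        yield turns[start:start + size]


def segment_path(session_id, index):
    """Build a path like /sessions/abc/seg_001.txt (1-based, zero-padded)."""
    return f"/sessions/{session_id}/seg_{index:03d}.txt"


turns = [f"[Turn {i}] ..." for i in range(1, 33)]  # 32 placeholder turns
segments = list(segment_turns(turns))
paths = [segment_path("abc", i) for i, _ in enumerate(segments, start=1)]
```

Each segment would then be joined into a string and passed to `store.write` together with a summary produced by your agent.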

**Tool output:** Offload large grep/API results with a structural summary (no LLM needed).

```python
store.write(
    content=grep_output,
    summary=f"grep error src/ → {n} matches",
    path=f"/active/{session}/tool_001.txt",
)
```
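
A structural summary can be computed directly from the output. The sketch below assumes grep-style `file:line:text` lines; `summarize_grep` is a hypothetical helper, not part of coffloader.

```python
def summarize_grep(pattern, root, output):
    """Summarize grep-style output (`file:line:text`) without calling an LLM."""
    lines = [line for line in output.splitlines() if line.strip()]
    # Everything before the first colon is treated as the file name.
    files = {line.split(":", 1)[0] for line in lines if ":" in line}
    return f"grep {pattern} {root} → {len(lines)} matches in {len(files)} files"


grep_output = "src/a.py:3:error here\nsrc/a.py:9:error again\nsrc/b.py:1:error\n"
summary = summarize_grep("error", "src/", grep_output)
```

Counts and file names are usually enough for a later search to find the blob, and they cost nothing to compute.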

**Multi-agent:** Use namespaces for isolation (`/agent/{id}/`) or sharing (`/shared/`).

---

## Limits

- Max payload: 512 KB by default (configurable)
- Oversized content is rejected or recorded as metadata-only, depending on `on_oversize`
- No silent truncation
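
The size policy can be sketched in plain Python. This is illustrative only: it assumes size is measured in UTF-8 bytes and that the limit is inclusive, neither of which the README specifies, and `check_size` and `plan_write` are hypothetical names.

```python
from dataclasses import dataclass

DEFAULT_MAX_BYTES = 512_000  # mirrors the documented default


@dataclass
class SizeCheck:
    byte_count: int
    acceptable: bool


def check_size(content, max_bytes=DEFAULT_MAX_BYTES):
    """Measure content in bytes (UTF-8 for str, an assumption) against the limit."""
    data = content.encode("utf-8") if isinstance(content, str) else content
    return SizeCheck(byte_count=len(data), acceptable=len(data) <= max_bytes)


def plan_write(content, on_oversize="reject", max_bytes=DEFAULT_MAX_BYTES):
    """Return "store" or "metadata_only", or raise; content is never truncated."""
    check = check_size(content, max_bytes)
    if check.acceptable:
        return "store"
    if on_oversize == "metadata_only":
        return "metadata_only"
    raise ValueError(f"payload is {check.byte_count} bytes, limit is {max_bytes}")
```

Note that byte count and character count differ for non-ASCII text, so a string shorter than 512,000 characters can still exceed the limit.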

---

## Status

Pre-alpha. Core API is stable: `write`, `search`, `read`, `inspect`, `delete`.

**Working:**
- BM25 (keyword) search via SQLite FTS5
- Semantic search via `[embed]` optional extra
- Hybrid search (BM25 + embeddings) with Reciprocal Rank Fusion
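
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the technique. It is not coffloader's code: the function name is hypothetical, and the constant 60 is the value commonly used in the literature, not a documented coffloader setting.

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse several ranked lists into one.

    Each item scores sum(1 / (rank_constant + rank)) over the lists it appears
    in, with ranks starting at 1. Items are returned best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


bm25_ranking = ["a", "b", "c"]       # hypothetical keyword ranking
embedding_ranking = ["b", "c", "a"]  # hypothetical semantic ranking
fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
```

Because only ranks are used, the fusion does not need BM25 scores and embedding similarities to be on a comparable scale.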

**Not yet implemented:**
- Persistent index to disk
- Sharded TOC for large corpora

---

## Non-goals

- LLM calls from the library
- Automatic dedup, contradiction detection, or memory merge
- Knowledge graphs or hierarchical rollups

---

## License

MIT
