Metadata-Version: 2.4
Name: graphstore
Version: 0.6.0
Summary: Agentic brain DB - the cognitive layer for AI agents
Author-email: Kailash Mahavarkar <kailashmahavarkar5@gmail.com>
License-Expression: AGPL-3.0
Project-URL: Homepage, https://github.com/orkait/graphstore
Project-URL: Repository, https://github.com/orkait/graphstore
Project-URL: Issues, https://github.com/orkait/graphstore/issues
Keywords: graph,database,in-memory,dsl,sparse-matrix
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: usearch>=2.0
Requires-Dist: lark>=1.1
Requires-Dist: msgspec>=0.18
Requires-Dist: psutil>=5.9
Requires-Dist: threadpoolctl>=3.1
Requires-Dist: model2vec>=0.4
Requires-Dist: croniter>=6.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-benchmark>=4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.8; extra == "dev"
Requires-Dist: pytest-timeout>=2.4; extra == "dev"
Requires-Dist: httpx>=0.24; extra == "dev"
Requires-Dist: onnxruntime>=1.17; extra == "dev"
Requires-Dist: tokenizers>=0.15; extra == "dev"
Provides-Extra: embedders-extra
Requires-Dist: fastembed>=0.8; extra == "embedders-extra"
Requires-Dist: llama-cpp-python>=0.3; extra == "embedders-extra"
Provides-Extra: audio
Requires-Dist: faster-whisper>=1.0; extra == "audio"
Provides-Extra: ingest
Requires-Dist: markitdown>=0.1; extra == "ingest"
Requires-Dist: pymupdf4llm>=0.1; extra == "ingest"
Requires-Dist: pymupdf>=1.20; extra == "ingest"
Provides-Extra: ingest-pro
Requires-Dist: docling>=2.0; extra == "ingest-pro"
Provides-Extra: playground
Requires-Dist: fastapi>=0.100; extra == "playground"
Requires-Dist: uvicorn>=0.20; extra == "playground"
Requires-Dist: pydantic>=2.0; extra == "playground"
Provides-Extra: gpu
Requires-Dist: onnxruntime-gpu>=1.23; (sys_platform == "linux" and platform_machine == "x86_64") and extra == "gpu"
Provides-Extra: vision
Requires-Dist: llama-cpp-python[server]>=0.3; extra == "vision"
Requires-Dist: huggingface-hub>=0.24; extra == "vision"
Provides-Extra: pro
Requires-Dist: graphstore[ingest]; extra == "pro"
Requires-Dist: graphstore[vision]; extra == "pro"
Requires-Dist: graphstore[audio]; extra == "pro"
Requires-Dist: graphstore[embedders-extra]; extra == "pro"
Requires-Dist: graphstore[gpu]; extra == "pro"
Requires-Dist: tokenizers>=0.15; extra == "pro"
Requires-Dist: onnxruntime>=1.17; extra == "pro"
Requires-Dist: huggingface-hub>=0.24; extra == "pro"
Dynamic: license-file

<div align="center">

# graphstore

**A memory database for AI agents**

[![CI](https://github.com/orkait/graphstore/actions/workflows/ci.yml/badge.svg)](https://github.com/orkait/graphstore/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/graphstore?color=f59e0b&logo=pypi&logoColor=white)](https://pypi.org/project/graphstore/)
[![Python](https://img.shields.io/badge/python-%3E%3D3.10-3776AB?logo=python&logoColor=white)](https://python.org)
[![License: AGPL-3.0](https://img.shields.io/badge/license-AGPL--3.0-ea580c?logo=gnu&logoColor=white)](LICENSE)
[![Docs](https://img.shields.io/badge/docs-website%2Fdocs-f59e0b?logo=readthedocs&logoColor=white)](website/docs/intro.md)

</div>

---

An embedded memory database for AI agents. Facts get written with confidence scores, expire, get contradicted, decay by recency. Retrieval fuses vector similarity, BM25, graph structure, and recency in one call. Everything goes through a typed DSL. Runs in-process, persists to SQLite.

Status: v0.6.0, alpha.

## Install

```bash
pip install graphstore
```

Core ships with [model2vec](https://github.com/MinishLab/model2vec) as the default embedder. Swap it for Jina v5, bge-*, EmbeddingGemma, or any ONNX / GGUF model via `graphstore install-embedder`. Support for PDFs, images, audio, GPU, and the web UI comes as opt-in extras.

```bash
pip install 'graphstore[ingest]'       # PDF / DOCX / HTML
pip install 'graphstore[vision]'       # local VLM for images + scanned PDFs
pip install 'graphstore[audio]'        # faster-whisper speech-to-text
pip install 'graphstore[playground]'   # FastAPI web UI
pip install 'graphstore[gpu]'          # onnxruntime-gpu, Linux x86_64, CUDA 12
pip install 'graphstore[pro]'          # one-shot agentic memory bundle (see Pro mode below)
```

Full extras matrix: [Installation](website/docs/installation.md).

## Quickstart

```python
from graphstore import GraphStore

g = GraphStore(path="./brain")

g.execute('CREATE NODE "mem:paris" kind = "memory" '
          'DOCUMENT "Paris is the capital of France, famous for the Eiffel Tower."')
g.execute('CREATE NODE "mem:rome" kind = "memory" '
          'DOCUMENT "Rome is the capital of Italy, home to the Colosseum."')
g.execute('CREATE EDGE "mem:paris" -> "mem:rome" kind = "both_european_capitals"')

g.execute('REMEMBER "European history" LIMIT 5')          # hybrid fusion
g.execute('RECALL FROM "mem:paris" DEPTH 2 LIMIT 10')     # graph walk
g.execute('LEXICAL SEARCH "Eiffel Tower" LIMIT 5')        # BM25
g.execute('SIMILAR TO "capital city" LIMIT 5')            # vector only
```

`DOCUMENT "text"` populates the vector index, FTS5 index, and blob storage in one shot. Without it, a node is structured data only.
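
To make the lexical leg concrete, here is a minimal sketch of a BM25 phrase query against an SQLite FTS5 table using only the standard library. It is not graphstore's internal code: the table name `doc_fts` is taken from the REMEMBER section, while the column names `node_id` and `body` are assumptions made for illustration. It also assumes your Python's SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: node_id is stored but not tokenized, body is searchable.
conn.execute("CREATE VIRTUAL TABLE doc_fts USING fts5(node_id UNINDEXED, body)")
conn.executemany(
    "INSERT INTO doc_fts (node_id, body) VALUES (?, ?)",
    [
        ("mem:paris", "Paris is the capital of France, famous for the Eiffel Tower."),
        ("mem:rome", "Rome is the capital of Italy, home to the Colosseum."),
    ],
)

# bm25() returns lower (more negative) values for better matches,
# so ascending order puts the best hit first.
rows = conn.execute(
    "SELECT node_id, bm25(doc_fts) AS score "
    "FROM doc_fts WHERE doc_fts MATCH ? ORDER BY score LIMIT 5",
    ('"Eiffel Tower"',),  # double quotes make this an FTS5 phrase query
).fetchall()

hits = [node_id for node_id, _ in rows]
```

A node created without `DOCUMENT` would have no row in such a table, which is one way to picture why it stays invisible to lexical search.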

## Natural-language ingest (Bonsai)

For agent-conversation memory, writing DSL by hand is the wrong abstraction. graphstore ships `BonsaiIngestor`, an LLM-driven NL→DSL converter built on a 4B Ternary-Bonsai GGUF (1.1 GB, runs on CPU at ~20 tok/s, ~150 tok/s on a CUDA 12 GPU). It reads natural-language turns and emits the DSL statements that mirror them.

```python
from graphstore import GraphStore
from graphstore.bonsai_ingestor import BonsaiIngestor, _DEFAULT_LITE_PROMPT_PATH

g = GraphStore(path="./brain")
ing = BonsaiIngestor(
    model_path="./models/Ternary-Bonsai-4B-TQ1_0.gguf",
    gs=g,
    skill_path=str(_DEFAULT_LITE_PROMPT_PATH),  # or omit for the full prompt
    n_gpu_layers=-1,                             # 0 for CPU; -1 to offload all
)

ing.ingest("Kailash joined OpenAI.",     msg_id="m1")  # @UPSERT/@UPSERT/@EDGE
ing.ingest("I prefer tea to coffee.",    msg_id="m2")  # @BELIEF
ing.ingest("Maria moved to Berlin.",     msg_id="m3")
ing.ingest("Actually I drink coffee now.", msg_id="m4")  # @RETRACT + @BELIEF

# Retrieval is the same NL surface
ing.ingest("Where does Maria work?", msg_id="q1", dry_run=True)  # -> @ANSWER
```

Prompt variants:
- `bonsai_dsl_prompt_lite.txt` (~600 system tokens, 16 verbs, ingest+retrieval): production sweet spot.
- `bonsai_dsl_prompt.txt` (~1700 system tokens, ~50 verbs, all admin DSL): full control surface.

Persistent KV cache (`kv_cache_path=...`) cuts cold start from ~10 s to ~1 s across process restarts.
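
For multi-turn conversations, a small helper that assigns sequential message IDs keeps the call sites tidy. This is a sketch, not part of graphstore: `ingest_turns` and `RecordingIngestor` are hypothetical names, and the only thing assumed about the real ingestor is the `ingest(text, msg_id=...)` call shown above.

```python
def ingest_turns(ingestor, turns, prefix="m"):
    """Feed each turn to the ingestor with IDs m1, m2, ... and return the IDs."""
    ids = []
    for i, text in enumerate(turns, start=1):
        msg_id = f"{prefix}{i}"
        ingestor.ingest(text, msg_id=msg_id)
        ids.append(msg_id)
    return ids


class RecordingIngestor:
    """Stand-in for BonsaiIngestor that only records calls (no model loaded)."""

    def __init__(self):
        self.calls = []

    def ingest(self, text, msg_id, dry_run=False):
        self.calls.append((msg_id, text))


fake = RecordingIngestor()
ids = ingest_turns(fake, ["Kailash joined OpenAI.", "I prefer tea to coffee."])
```

In real use you would pass the `ing` object from the example above in place of `fake`.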

## Architecture

<p align="center">
  <img src="website/static/img/architecture.svg" alt="graphstore architecture: DSL + three storage engines + ingest pipeline + retrieval" width="760">
</p>

Three engines behind one DSL.

- **Graph**: columnar numpy arrays + scipy CSR edge matrices. Reserved columns `__event_at__`, `__confidence__`, `__retracted__`, `__source__` are first-class.
- **Vector**: usearch HNSW, cosine. Auto-embedding on `DOCUMENT` or `EMBED content` schemas.
- **Document**: SQLite + FTS5 for BM25 and blobs. Single-owner advisory lock on the path.

The **DSL** is Lark LALR(1). Every write, read, `INGEST`, and `SYS *` goes through it.
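
The CSR layout behind the graph engine can be pictured with two flat arrays. The sketch below uses plain Python lists in place of numpy and scipy so it runs anywhere; `build_csr` and `walk` are illustrative names, not graphstore functions, and the depth-limited breadth-first walk is only one plausible reading of what a `RECALL ... DEPTH n` traversal does.

```python
from collections import deque


def build_csr(num_nodes, edges):
    """Build CSR arrays (indptr, indices) from (src, dst) integer pairs."""
    counts = [0] * num_nodes
    for src, _ in edges:
        counts[src] += 1
    indptr = [0] * (num_nodes + 1)
    for i, c in enumerate(counts):
        indptr[i + 1] = indptr[i] + c
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()  # next free slot in each row
    for src, dst in edges:
        indices[cursor[src]] = dst
        cursor[src] += 1
    return indptr, indices


def walk(indptr, indices, start, depth):
    """Breadth-first walk; returns {node: hops} for nodes within `depth` hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand past the depth limit
        # Row `node` of the matrix is the slice indptr[node]:indptr[node + 1].
        for nbr in indices[indptr[node]:indptr[node + 1]]:
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                queue.append(nbr)
    return seen


edges = [(0, 1), (1, 2), (2, 3)]
indptr, indices = build_csr(4, edges)
reachable = walk(indptr, indices, start=0, depth=2)
```

The appeal of this layout is that a node's out-neighbors are one contiguous slice, so each hop of a walk is a cheap array read rather than a dictionary lookup per edge.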

Deep dive: [Architecture](website/docs/concepts/architecture.md) · [Edge matrix](website/docs/concepts/edge-matrix.md).

## REMEMBER

`REMEMBER` fuses four signals at retrieval time. `SIMILAR`, `LEXICAL`, and `RECALL` each expose a single leg.

<p align="center">
  <img src="website/static/img/remember.svg" alt="REMEMBER 5-stage retrieval pipeline" width="620">
</p>

| Signal | Default weight | Source |
|---|---|---|
| `vec_signal` | 0.52 | max sentence cosine over usearch ANN |
| `bm25_signal` | 0.25 | SQLite FTS5 over `doc_fts` |
| `recency` | 0.15 | `exp(-age / half_life)` from `__event_at__` |
| `graph_signal` | 0.08 | sum of entity degrees |

Weights are configurable via `graphstore.json`, `GRAPHSTORE_DSL_*` env vars, or constructor kwargs.
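
As a rough illustration of how the default weights combine, the sketch below treats fusion as a plain weighted sum. That is an assumption: the actual pipeline has more stages (see the deep dive), and the sketch further assumes each input signal has already been normalized to the range 0 to 1. The recency term follows the formula in the table as written, so a memory whose age equals `half_life` scores `1/e`. `fuse` and `recency_score` are hypothetical names.

```python
import math

# Default weights from the table above; they sum to 1.0.
DEFAULT_WEIGHTS = {
    "vec_signal": 0.52,
    "bm25_signal": 0.25,
    "recency": 0.15,
    "graph_signal": 0.08,
}


def recency_score(age, half_life):
    """exp(-age / half_life); age and half_life must share the same unit."""
    return math.exp(-age / half_life)


def fuse(signals, weights=DEFAULT_WEIGHTS):
    """Weighted sum of pre-normalized signals (assumed fusion rule)."""
    return sum(weights[name] * signals[name] for name in weights)


signals = {
    "vec_signal": 0.8,
    "bm25_signal": 0.4,
    "recency": recency_score(age=0.0, half_life=30.0),  # brand-new memory
    "graph_signal": 0.5,
}
score = fuse(signals)
```

With these inputs the vector leg contributes more than the other three combined, which matches the intent of a 0.52 weight.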

Every result carries per-signal scores on each node, plus a `meta["signals"]` block with the full pipeline state (fusion weights, per-stage candidate counts, reranker status):

```python
r = g.execute('REMEMBER "Caroline counseling" LIMIT 1 WHERE kind = "message"')
n = r.data[0]
print(n["_remember_score"], n["_vector_sim"], n["_bm25_score"],
      n["_recency_score"], n["_graph_score"], n["_co_bonus"],
      n["_recall_boost"], n["_rank_stage"])
r.meta["signals"]  # {fusion, recency, stages, reranker, nucleus, ...}
```

Dry-run the pipeline without mutating recall counts:

```python
g.execute('SYS EXPLAIN REMEMBER "Caroline counseling" LIMIT 3')
# kind="plan", candidates with per-signal scores, full meta["signals"]
```

Deep dive: [REMEMBER pipeline](website/docs/concepts/remember-pipeline.md).

## ANSWER (retrieval + reader LLM)

For a full retrieve + synthesize loop, wire a reader callable and use `ANSWER`:

```python
def my_reader(prompt: str, max_tokens: int = 1000) -> str:
    ...  # call any LLM (openai, litellm, local, ...)

g = GraphStore(path="./brain", reader=my_reader)

r = g.execute('ANSWER "What is the capital of France?" LIMIT 3')
r.data["answer"]         # "Paris"
r.data["cited_slots"]    # ["mem:paris", ...]
r.meta["signals"]        # same telemetry as REMEMBER
```

graphstore ships no LLM dependency. The reader is a plain callable; bring your own. Named readers (`GraphStore(readers={"fast": a, "careful": b})`) enable A/B comparisons via `ANSWER "q" USING "careful"`.
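
Because the reader is just a callable, a deterministic stub is handy for tests and offline runs. The sketch below matches the `(prompt, max_tokens)` signature shown above; `echo_reader` and `upper_reader` are hypothetical, and the format of the prompt graphstore actually sends is not assumed here.

```python
def echo_reader(prompt: str, max_tokens: int = 1000) -> str:
    """Stand-in for an LLM: returns the last non-empty line of the prompt."""
    lines = [ln.strip() for ln in prompt.splitlines() if ln.strip()]
    answer = lines[-1] if lines else ""
    return answer[:max_tokens]  # crude cap in characters, not real tokens


def upper_reader(prompt: str, max_tokens: int = 1000) -> str:
    """A second stub, distinguishable from the first by its output."""
    return echo_reader(prompt, max_tokens).upper()


# Same shape as the named-readers mapping described above.
readers = {"fast": echo_reader, "careful": upper_reader}

sample = echo_reader("Context: Paris is in France.\n\nWhat is the capital of France?")
```

Either stub can be passed as `reader=...`, or the mapping as `readers=...`, in place of a real LLM call.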

## Typed query builder

Every DSL verb has a typed function. Same grammar, IDE autocomplete, injection-safe.

```python
from graphstore import q, F, Time

q.create_node("mem:paris", kind="memory",
              document="Paris is the capital of France.").execute(g)

recent = F.gte("__event_at__", Time.now_minus(7, "d"))
q.nodes(where=F.eq("kind", "memory") & recent & ~F.eq("__retracted__", True))

q.batch(
    q.var("x", q.create_node("n1", kind="memory", document="a")),
    q.var("y", q.create_node("n2", kind="memory", document="b")),
    q.create_edge("$x", "$y", kind="next"),
).execute(g)
```

Full reference: [Query builder](website/docs/query-builder.md).

## Benchmarks

**LongMemEval-S**, 500 records, Jina v5 Small 1024d, Kaggle T4 GPU, 2026-04-19. Public kernel: [kaggle.com/code/superkaiii/graphstore-jina-v5-small](https://www.kaggle.com/code/superkaiii/graphstore-jina-v5-small).

| Overall | knowledge-update | single-session-assistant | single-session-user | multi-session | temporal | preference |
|---|---|---|---|---|---|---|
| **97.0%** | 100.0% | 100.0% | 98.6% | 98.5% | 94.7% | 83.3% |

Query p50 46 ms / p95 76 ms. Retrieval-only, no LLM judge.

**LoCoMo**, conv-26 199Q full, jina-v5-small 1024d, MiniMax M2.7 reader. The best adapter (Bonsai NL→DSL ingest) reaches an overall token-F1 of **0.476**, versus 0.464 for deterministic NER and 0.392 for remote-LLM ingest, all on the same retrieval stack. Scoring matched byte-for-byte against snap-research/locomo `task_eval/evaluation.py` (parity test in `tests/test_locomo_scoring_parity.py`).
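
For readers unfamiliar with the metric, token-F1 is the harmonic mean of token-level precision and recall between a predicted and a reference answer. The sketch below is a simplified illustration only: it lowercases and splits on whitespace, whereas the reference script applies its own answer normalization, which is not reproduced here. `token_f1` is a hypothetical name.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 with naive lowercase + whitespace tokenization."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)  # both empty counts as a match
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


partial = token_f1("Paris", "Paris France")  # precision 1.0, recall 0.5
```

A short but correct answer is penalized only through recall, which is why partial matches still earn credit.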

Full methodology: [Benchmarks](website/docs/benchmarks/overview.md).

## GPU offload (opt-in, off by default)

graphstore never grabs a GPU implicitly. Every `*_gpu_layers` default is 0 (CPU). To opt in, install `[gpu]` (and a CUDA-built `llama-cpp-python` wheel for Bonsai/embedder/reranker) and call `gpu.setup()`:

```python
from graphstore import gpu
status = gpu.setup()
print(status.ready, status.provider, status.device_name, status.error)
```

`gpu.setup()` is idempotent and does the dirty work: it discovers any `nvidia-*-cu12` pip wheels under `site-packages`, ctypes-preloads their `.so` files in dependency order so `LD_LIBRARY_PATH` doesn't have to be set externally, then probes onnxruntime and llama-cpp-python for CUDA support. On success it surfaces `GRAPHSTORE_GPU=1` so the existing `compute_profile` gate flips automatically. On failure it reports a structured error (`status.error`) and falls back to CPU silently.

For per-component control, pass the explicit kwargs (CPU stays the default):

```python
GraphStore(path="./brain", gpu_layers=-1, reranker_gpu_layers=-1)
BonsaiIngestor(model_path=..., n_gpu_layers=-1)
```
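
One way to combine the two approaches is to derive the layer count from the setup status, so the explicit kwargs only request offload when the probe succeeded. This is a sketch: `layers_for` is a hypothetical helper, and the `SimpleNamespace` objects merely mimic the four status fields printed above with made-up values.

```python
from types import SimpleNamespace


def layers_for(status, offload_all=-1, cpu_only=0):
    """Return -1 (offload all layers) only when the GPU probe reported ready."""
    return offload_all if getattr(status, "ready", False) else cpu_only


# Stand-ins for the object returned by gpu.setup(); field values are invented.
ok = SimpleNamespace(ready=True, provider="example-provider",
                     device_name="example-gpu", error=None)
failed = SimpleNamespace(ready=False, provider=None,
                         device_name=None, error="example: no CUDA runtime found")

gpu_layers = layers_for(ok)
cpu_layers = layers_for(failed)
```

In real code the result would feed the kwargs shown above, for example `gpu_layers=layers_for(gpu.setup())`.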

## Pro mode

`pip install 'graphstore[pro]'` bundles ingest + vision + audio + embedders-extra + gpu plus huggingface-hub / tokenizers / onnxruntime. Pair it with a one-time calibration to get spec-driven validation and a calibrated Bonsai ingestor without writing the device-detection / sizing / fallback glue yourself.

```bash
pip install 'graphstore[pro]'
graphstore pro setup        # download every component, probe each on this host
graphstore pro status       # inspect host + spec + resolved knobs
```

```python
from graphstore import GraphStore

gs = GraphStore(path="./brain", profile="pro")

# Resolver caught every shortfall up-front (extras missing, calibration
# stale, RAM/VRAM short). If we got here, the spec runs.
print(gs.pro_resolved.n_ctx, gs.pro_resolved.bonsai_n_gpu_layers)

ing = gs.create_bonsai()                       # n_ctx / n_batch / n_gpu_layers
ing.ingest("Maria joined OpenAI.", msg_id="m1")  # all wired from calibration
```

Defaults match the measured-best LoCoMo configuration as of this release: `jina-v5-small` embedder, `jina-v3` reranker, `bonsai-tq1_0-lite` ingest, `tinybert` NER. Customize via a `ProSpec(...)` instance. Strict by default: missing extras / missing calibration / unfit host raise `ProExtraNotInstalled` / `ProCalibrationMissing` / `ProUnsupportedHostError`. Pass `pro_strict=False` to log + continue. Linux x86_64 + NVIDIA CUDA 12 only in v1.

Full guide: [Pro mode](website/docs/guides/pro-mode.md).

## Scope

- Embedded, one writer per path. For multi-tenant, wrap in your own service.
- No SQL, no Cypher, no distributed cluster. Graph ops exist because agent memory is a graph.
- Fusion weights are hand-tuned. Reranking is opt-in, off by default.
- Bonsai NL→DSL ingest is opt-in via `BonsaiIngestor(...)`. Core install never auto-loads an LLM.
- GPU offload is opt-in via `gpu.setup()` or explicit `n_gpu_layers=...`. No silent device acquisition.
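
The one-writer-per-path rule can be pictured as an advisory file lock. The sketch below shows one common way to get that behavior on POSIX systems with `fcntl.flock`; it is an assumption about the general technique, not a description of graphstore's actual locking code, and the `.writer.lock` filename and `acquire_writer_lock` helper are hypothetical.

```python
import fcntl
import os
import tempfile


def acquire_writer_lock(dir_path):
    """Try to take an exclusive, non-blocking lock; return the fd or None."""
    lock_path = os.path.join(dir_path, ".writer.lock")
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None  # someone else already owns this path
    return fd


with tempfile.TemporaryDirectory() as store_dir:
    first = acquire_writer_lock(store_dir)
    second = acquire_writer_lock(store_dir)  # separate open, so it conflicts
    first_ok = first is not None
    second_ok = second is not None
    if second is not None:
        os.close(second)
    if first is not None:
        os.close(first)  # closing the fd releases the lock
```

Because the lock is advisory, it only protects against other processes that also ask for it, which is why multi-tenant setups need a service in front.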

## Development

```bash
git clone https://github.com/orkait/graphstore.git
cd graphstore
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,ingest,vision,embedders-extra,playground]"
pytest
```

Docs site under `website/` (Docusaurus). Run locally:

```bash
cd website && bun install && bun run start
```

## License

AGPL-3.0. See [LICENSE](LICENSE).
