Metadata-Version: 2.4
Name: nervecode
Version: 0.1.1
Summary: Intrinsic surprise scoring for PyTorch via statistical coding.
Project-URL: Homepage, https://gitlab.com/domezsolt/nervecode
Project-URL: Repository, https://gitlab.com/domezsolt/nervecode
Author: Zsolt Döme
License: MIT License
        
        Copyright (c) 2026 Nervecode Maintainers
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
License-File: LICENSE
Keywords: coding,ml,pytorch,research,surprise
Requires-Python: >=3.10
Requires-Dist: torch>=2.0
Provides-Extra: benchmark
Requires-Dist: matplotlib>=3.7; extra == 'benchmark'
Requires-Dist: numpy>=1.24; extra == 'benchmark'
Requires-Dist: torchvision>=0.15; extra == 'benchmark'
Provides-Extra: dev
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pre-commit>=3.6; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Requires-Dist: ruff>=0.3; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
Requires-Dist: mkdocs>=1.5; extra == 'docs'
Provides-Extra: logging
Requires-Dist: loguru>=0.7; extra == 'logging'
Requires-Dist: rich>=13; extra == 'logging'
Provides-Extra: viz
Requires-Dist: matplotlib>=3.7; extra == 'viz'
Requires-Dist: numpy>=1.24; extra == 'viz'
Description-Content-Type: text/markdown

# Nervecode

Nervecode is a PyTorch library that adds an intrinsic uncertainty signal to neural networks by scoring how compressible internal activations are under learned codebooks. The goal is a practical, observe-only wrapper that preserves model outputs while exposing a calibrated surprise score for out-of-distribution (OOD) detection, guardrails, and monitoring.

The signal is intrinsic and layerwise: it is most useful when you need an
inspectable internal surprise trace, not just a single black-box confidence
number. On some CIFAR shifts, standalone Nervecode OOD scores may be weaker
than strong logit baselines such as Energy; the research value to validate is
complementarity with those baselines and layerwise interpretability.

## Installation
- Prerequisites: Python 3.10+, PyTorch 2.0+ (install a build matching your platform from pytorch.org).
- From a checkout for local use: `pip install -e .`
- For development with tooling: `pip install -e '.[dev]'`, then `pre-commit install`. Quoting the extras keeps shells such as zsh from expanding the brackets.
- Optional extras: `.[benchmark]` for source-checkout benchmark scripts, `.[viz]` for plotting, `.[logging]` for richer logs.

Note: The top-level convenience function `nervecode.wrap(...)` instruments your model in place and returns a `WrappedModel` container.

## Quickstart
Minimal end-to-end flow using the current public surface:

```python
import torch
from torch import nn
import nervecode as nvc
from nervecode.scoring import EmpiricalPercentileCalibrator, mean_surprise

# 1) Build and instrument a tiny model
model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
wrapped = nvc.wrap(model, layers="all_linear")  # in-place wrappers + container

# 2) Compute one training step's loss: task loss + weighted coding loss
#    (optimizer step omitted for brevity)
x = torch.randn(64, 2)
y = torch.randint(0, 2, (64,))
logits = wrapped(x)
loss = nn.CrossEntropyLoss()(logits, y) + 0.1 * wrapped.coding_loss()
loss.backward()

# 3) Calibrate empirical percentiles on in-distribution scores
with torch.no_grad():
    _ = wrapped(torch.randn(64, 2))
    agg = wrapped.surprise() or mean_surprise(getattr(wrapped, "_last_layer_traces", {}))
scores = agg.surprise if agg is not None else torch.empty(0)
calib = EmpiricalPercentileCalibrator(threshold_quantiles=(0.95,))
state = calib.fit(scores, aggregation="mean")
thr = calib.threshold_for()  # threshold at 95th percentile
```
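
To make step 3 more concrete, here is a minimal, framework-free sketch of what empirical percentile calibration amounts to: keep the sorted in-distribution scores, read a threshold at a chosen quantile, and map a new score to the fraction of calibration scores at or below it. The helper names (`fit_sorted_scores`, `threshold_at`, `percentile_of`) are hypothetical and the nearest-rank quantile rule is an assumption; this is not a description of `EmpiricalPercentileCalibrator` internals.

```python
import bisect
import math


def fit_sorted_scores(scores):
    """Store calibration scores in ascending order (hypothetical helper)."""
    if not scores:
        raise ValueError("need at least one calibration score")
    return sorted(float(s) for s in scores)


def threshold_at(sorted_scores, q):
    """Nearest-rank quantile: smallest score with at least a fraction q at or below it."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    rank = math.ceil(q * len(sorted_scores))  # 1-based rank
    return sorted_scores[rank - 1]


def percentile_of(sorted_scores, score):
    """Fraction of calibration scores that are <= score."""
    return bisect.bisect_right(sorted_scores, score) / len(sorted_scores)


# Toy in-distribution scores 1.0 .. 20.0 (made-up values for illustration).
toy_id_scores = [float(i) for i in range(1, 21)]
sorted_id = fit_sorted_scores(toy_id_scores)
toy_thr = threshold_at(sorted_id, 0.95)
is_flagged = 25.0 > toy_thr  # a score above the threshold counts as surprising
```

With a 0.95 quantile, roughly 5% of in-distribution inputs are expected to exceed the threshold, which is the false-alarm budget the quantile controls.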

## Product Boundary
- Is: a lightweight PyTorch library that wraps selected layers (start with Linear), learns codebooks over reduced activations, and emits layer-wise and aggregated surprise scores.
- Is not: a hardware project, a full observability platform, or a framework-agnostic toolkit; MVP targets PyTorch only and focuses on observe-only wrappers with modest overhead.

## First Public API Shape (MVP)
The initial public surface is intentionally small and convenient:

```python
import nervecode

model = MyModel()
wrapped = nervecode.wrap(model, layers="all_linear")

for x, y in train_loader:
    logits = wrapped(x)
    loss = task_loss_fn(logits, y) + wrapped.coding_loss()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

wrapped.calibrate(calib_loader)

logits = wrapped(x_test)
surprise = wrapped.surprise()  # includes score and percentile

# Optional explicit trace path for robust integrations
logits, trace = wrapped.forward_with_trace(x_test)
```

Provisional API entries:
- `wrap(...)`
- `WrappedModel.coding_loss()`
- `WrappedModel.calibrate(...)`
- `WrappedModel.surprise()`
- `WrappedModel.forward_with_trace(...)`

## MVP Scope
The MVP is a narrow, end-to-end vertical slice:
- Gradient-updated codebooks and differentiable soft assignment.
- `SoftCode` and `CodingTrace` data structures.
- `CodingLinear` wrapper and `wrap(..., layers="all_linear")` convenience. The wrapper supports an optional `coding_dim` to project wide layer outputs down to a coding space via a learned linear reducer while preserving the layer's visible output.
- Mean and max aggregation for a per-input surprise score.
- Empirical percentile calibration on in-distribution data.
- Lightweight coding loss and basic diagnostics (CSV/JSONL).
- One small end-to-end example (MLP or simple CNN).
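
The mean and max aggregation mentioned in the list can be pictured with a small sketch. It assumes one scalar surprise value per wrapped layer for a single input; the function name `aggregate_surprise` and the layer names are hypothetical and do not mirror the library's data structures.

```python
def aggregate_surprise(layer_scores, how="mean"):
    """Combine per-layer surprise values for one input into a single score."""
    values = list(layer_scores.values())
    if not values:
        raise ValueError("no layer scores to aggregate")
    if how == "mean":
        return sum(values) / len(values)
    if how == "max":
        return max(values)
    raise ValueError(f"unknown aggregation: {how!r}")


# Hypothetical layer names and made-up surprise values.
per_layer = {"fc1": 0.5, "fc2": 1.5, "fc3": 4.0}
mean_score = aggregate_surprise(per_layer, "mean")
max_score = aggregate_surprise(per_layer, "max")
```

Max aggregation reacts to a single highly surprised layer, while mean aggregation smooths across layers.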

Distance-augmented surprise:
- The combined per-position surprise can include a distance component that
  raises OOD scores above the bulk of ID scores, which makes percentile
  thresholds more effective. Set `assignment.beta_distance > 0` (e.g., 0.2–1.0)
  to enable `S = βL·L + βH·H + βD·D`, where
  `D ≈ log1p(nearest-center squared distance)`.
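
The formula can be illustrated with a toy, framework-free sketch for a single reduced activation vector. Only the `D` term follows the description above. The soft assignment (softmax over negative squared distances), `L` as the code length of the most likely code, and `H` as the assignment entropy are assumptions made for illustration, and `combined_surprise` is a hypothetical name.

```python
import math


def combined_surprise(z, centers, temperature=1.0,
                      beta_length=1.0, beta_entropy=1.0, beta_distance=0.0):
    """Toy S = bL*L + bH*H + bD*D for one reduced activation vector z."""
    # Squared Euclidean distance from z to every codebook center.
    sq_dists = [sum((zi - ci) ** 2 for zi, ci in zip(z, c)) for c in centers]
    # Soft assignment: softmax over negative squared distances (assumed form).
    logits = [-d / temperature for d in sq_dists]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # L: code length of the most likely code, in nats (assumed definition).
    length = -math.log(max(probs))
    # H: entropy of the soft assignment (assumed definition).
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # D: log1p of the nearest-center squared distance.
    distance = math.log1p(min(sq_dists))
    return beta_length * length + beta_entropy * entropy + beta_distance * distance


centers = [[0.0, 0.0], [1.0, 1.0]]  # toy codebook with K=2 centers
near = combined_surprise([0.0, 0.1], centers, beta_distance=0.5)
far = combined_surprise([10.0, -10.0], centers, beta_distance=0.5)
near_no_d = combined_surprise([0.0, 0.1], centers)
far_no_d = combined_surprise([10.0, -10.0], centers)
```

In this toy setup the far point scores slightly lower than the near one when `beta_distance=0`, because the soft assignment only sees relative distances to the centers. Enabling the distance term restores the expected ordering.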

Quickstart: see `examples/quickstart_mlp.py` for a tiny end-to-end MLP training + calibration + inference script. For pooled Conv2d coding contributing to aggregated surprise, see `examples/quickstart_cnn.py`. For a plain-language walkthrough of the expected user flow, read `docs/quickstart.md`.

For a fast, dataset-agnostic smoke run suitable for CI or local validation, use `scripts/train_minimal.py`, which trains a tiny model on a synthetic dataset and calibrates an empirical percentile threshold.

For a minimal OOD benchmark harness, see `benchmarks/ood/simple.py`, which trains an MLP, calibrates percentiles on in-distribution data, and reports AUROC versus a synthetic OOD split.
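
As a reminder of what the reported AUROC measures, the sketch below computes it directly from two lists of surprise scores: it is the probability that a randomly chosen OOD sample scores higher than a randomly chosen ID sample, with ties counted as half. The function name and the toy scores are made up for illustration, and the quadratic loop is chosen for readability rather than speed.

```python
def auroc(id_scores, ood_scores):
    """Probability that a random OOD score exceeds a random ID score (ties count half)."""
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


toy_id = [0.2, 0.4, 0.5, 0.9]   # made-up in-distribution surprise scores
toy_ood = [0.6, 0.8, 1.5]       # made-up OOD surprise scores
value = auroc(toy_id, toy_ood)  # 10 of 12 (OOD, ID) pairs are ordered correctly
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation.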

Benchmark scripts are source-checkout research tools, not part of the installed wheel API. Run them from a checkout with benchmark dependencies installed: `pip install -e '.[benchmark,viz]'`.

Reusable v0.1.1 benchmark configs live under `configs/v0_1_1/`; the CIFAR
runner accepts them with `--config`.

The CIFAR ResNet-18 runner has explicit presets:

```bash
python -m benchmarks.ood.cifar_resnet18 --preset demo --limit-eval 256
python -m benchmarks.ood.cifar_resnet18 --preset full_cifar10_resnet18 --device cuda
```

`demo` is a short smoke path and writes a warning in the run directory; do not
use its tables as benchmark claims. `full_cifar10_resnet18` uses from-scratch
CIFAR-10 defaults, saves per-seed checkpoints by default, records checkpoint
SHA-256 values in `meta.json`, and blocks OOD reporting unless the CIFAR-10
test accuracy gate passes. The runner does not enable ImageNet-pretrained
transfer; add and label any future transfer path separately as
`transfer_resnet18`.
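
To check a saved checkpoint against a digest recorded in `meta.json`, something like the following stdlib sketch could be used. The helper names are hypothetical, and because the layout of `meta.json` is not described here, the recorded digest is passed in as a plain string rather than parsed from the file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_matches(path, recorded_hex):
    """Compare a checkpoint's digest with a value copied from the run metadata."""
    return sha256_of_file(path) == recorded_hex.strip().lower()


# Usage (placeholder path and digest, not real run artifacts):
# ok = checkpoint_matches("path/to/checkpoint.pt", "<hex digest from meta.json>")
```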

For CIFAR OOD combo ablations, the headline-safe method is
`combo_id_zscore_equal`: Energy, Mahalanobis, and Nervecode scores are
normalized on ID calibration data only and averaged with fixed equal weights.
See `docs/combo.md`. Learned OOD-calibrated combo rows are oracle appendix
results, not headline detector claims.
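
The idea behind `combo_id_zscore_equal` can be sketched in a few lines: fit a mean and standard deviation per detector on ID calibration scores only, z-score each detector's output with those statistics, and average with equal weights. The function names, the toy numbers, and the convention that every score is oriented so that higher means more OOD-like are assumptions; `docs/combo.md` remains the reference for the actual method.

```python
import statistics


def fit_id_zscore(id_scores):
    """Return (mean, std) estimated on ID calibration scores only."""
    mean = statistics.fmean(id_scores)
    std = statistics.pstdev(id_scores)
    return mean, (std if std > 0.0 else 1.0)  # guard against constant scores


def combo_equal(sample_scores, id_stats):
    """Average the ID-normalized z-scores of all detectors with equal weights."""
    zs = [(sample_scores[name] - mean) / std for name, (mean, std) in id_stats.items()]
    return sum(zs) / len(zs)


# Toy ID calibration scores per detector (made-up numbers, higher = more OOD-like).
id_calibration = {
    "energy": [1.0, 2.0, 3.0],
    "mahalanobis": [10.0, 20.0, 30.0],
    "nervecode": [0.1, 0.2, 0.3],
}
id_stats = {name: fit_id_zscore(vals) for name, vals in id_calibration.items()}
combo_score = combo_equal(
    {"energy": 3.0, "mahalanobis": 30.0, "nervecode": 0.3}, id_stats
)
```

Because no OOD data enters the normalization or the weights, the combined score does not depend on which OOD set is evaluated later.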

The CIFAR OOD benchmark also writes real per-sample, per-layer Nervecode
surprise artifacts under `runs/<run_id>/layerwise/seed_<seed>/*.npz`. Plot them
without rerunning the model:

```bash
python -m nervecode.viz.layerwise runs/<run_id> \
  --seed 123 \
  --ood dtd,cifar100,svhn \
  --output runs/<run_id>/plots/layerwise.png
```

For quick ablations over codebook/coding hyperparameters and layer selection, use `scripts/ablate_grid.py`, which sweeps small grids of K (codebook size), D (coding dimension), T (temperature), and selection strategies, then logs a minimal quality metric and overhead proxies to CSV.
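
The shape of such a sweep can be sketched with the standard library alone. Everything below is illustrative: the grid values, the `sweep` and `to_csv` helpers, and the placeholder evaluator are hypothetical and are not taken from `scripts/ablate_grid.py`. The only quantity computed is a simple overhead proxy, the number of codebook parameters `K * D`.

```python
import csv
import io
import itertools


def sweep(grid, evaluate):
    """Run evaluate(**config) for every combination in grid and collect rows."""
    names = list(grid)
    rows = []
    for values in itertools.product(*(grid[n] for n in names)):
        config = dict(zip(names, values))
        rows.append({**config, **evaluate(**config)})
    return rows


def to_csv(rows):
    """Serialize result rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical grid values; the script's real defaults may differ.
grid = {
    "K": [8, 16],
    "D": [4, 8],
    "T": [0.5, 1.0],
    "layers": ["first_linear", "all_linear"],
}


def placeholder_evaluate(K, D, T, layers):
    """Stand-in for a real train-and-score run: reports only an overhead proxy."""
    return {"codebook_params": K * D}


rows = sweep(grid, placeholder_evaluate)
csv_text = to_csv(rows)
```

In a real run, the placeholder evaluator would be replaced by a function that trains, calibrates, and returns the quality metric alongside the overhead proxies.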

For a minimal OOD comparison using synthetic scores and the empirical percentile calibrator, see `examples/ood_smoke_test.py`.

Performance notes: see `docs/overhead.md` for pooled Conv2d coding overhead estimates, timing harness, and operating guidance.

## Recommended OOD Settings (quick start)
- Selection: `layers=first_linear`
- Aggregation: `agg=max`
- Coding: `coding_dim D=8`
- Codebook: `K=16`
- Weights: `βL=1.0` (length), `βH=1.0` (entropy), `βD=1.0` (distance-augmented surprise)
- Calibration: `quantile q=0.90` (use `0.95` for stricter ID control)

Run the bundled OOD benchmark with these settings:

```bash
python -m benchmarks.ood.simple --epochs 20 --device cpu \
  --agg max --layers first_linear --K 16 --coding-dim 8 \
  --beta-length 1.0 --beta-entropy 1.0 --beta-distance 1.0 \
  --quantile 0.90 --json
```

Or sweep a narrow fast grid:

```bash
FAST=1 bash scripts/run_ood_matrix.sh
```

## Contributing
Contributions are welcome. Please see `CONTRIBUTING.md` for a quick start, coding guidelines, and how to run tests locally.

## Changelog
User-facing changes are tracked in `CHANGELOG.md` under the Unreleased section and versioned entries.
