Metadata-Version: 2.4
Name: calibrated_explanations
Version: 0.10.4
Summary: Extract calibrated explanations from machine learning models.
Author-email: Helena Löfström <helena.lofstrom@ju.se>, Tuwe Löfström <tuwe.lofstrom@ju.se>
License-Expression: BSD-3-Clause
Project-URL: Documentation, https://calibrated-explanations.readthedocs.io/en/latest/?badge=latest
Project-URL: Changelog, https://github.com/Moffran/calibrated_explanations/blob/main/CHANGELOG.md
Project-URL: Bug Tracker, https://github.com/Moffran/calibrated_explanations/issues
Project-URL: Repository, https://github.com/Moffran/calibrated_explanations
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: cachetools
Requires-Dist: pympler
Requires-Dist: crepes
Requires-Dist: venn-abers
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scikit-learn
Requires-Dist: pyyaml
Provides-Extra: viz
Requires-Dist: matplotlib; extra == "viz"
Provides-Extra: notebooks
Requires-Dist: ipython; extra == "notebooks"
Requires-Dist: jupyter; extra == "notebooks"
Requires-Dist: nbconvert; extra == "notebooks"
Requires-Dist: matplotlib; extra == "notebooks"
Provides-Extra: dev
Requires-Dist: matplotlib; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: pydocstyle; extra == "dev"
Requires-Dist: nbqa; extra == "dev"
Requires-Dist: sphinx; extra == "dev"
Requires-Dist: myst-parser; extra == "dev"
Requires-Dist: nbsphinx; extra == "dev"
Requires-Dist: numpydoc; extra == "dev"
Requires-Dist: pydata-sphinx-theme; extra == "dev"
Requires-Dist: furo; extra == "dev"
Requires-Dist: sphinx-rtd-theme; extra == "dev"
Requires-Dist: sphinxcontrib-mermaid; extra == "dev"
Requires-Dist: docutils; extra == "dev"
Requires-Dist: roman; extra == "dev"
Requires-Dist: linkify-it-py; extra == "dev"
Requires-Dist: jsonschema; extra == "dev"
Requires-Dist: tomli-w; extra == "dev"
Requires-Dist: tomli; python_version < "3.11" and extra == "dev"
Requires-Dist: prometheus-client; extra == "dev"
Requires-Dist: pyarrow; extra == "dev"
Requires-Dist: joblib; extra == "dev"
Requires-Dist: ucimlrepo; extra == "dev"
Provides-Extra: eval
Requires-Dist: lime; extra == "eval"
Requires-Dist: shap; extra == "eval"
Requires-Dist: seaborn; extra == "eval"
Requires-Dist: xgboost; extra == "eval"
Requires-Dist: scipy; extra == "eval"
Requires-Dist: joblib; extra == "eval"
Requires-Dist: ucimlrepo; extra == "eval"
Requires-Dist: matplotlib; extra == "eval"
Provides-Extra: external-plugins
Requires-Dist: numpy>=1.24; extra == "external-plugins"
Requires-Dist: pandas>=2.0; extra == "external-plugins"
Requires-Dist: scikit-learn>=1.3; extra == "external-plugins"

# Calibrated Explanations ([Documentation](https://calibrated-explanations.readthedocs.io/en/latest/))

[![Calibrated Explanations PyPI version][pypi-version]][calibrated-explanations-on-pypi]
<!-- [![Conda Version](https://img.shields.io/conda/vn/conda-forge/calibrated-explanations.svg)](https://anaconda.org/conda-forge/calibrated-explanations) -->
[![GitHub (Pre-)Release Date](https://img.shields.io/github/release-date-pre/Moffran/calibrated_explanations)](https://github.com/Moffran/calibrated_explanations/blob/main/CHANGELOG.md)
[![Docstring coverage](https://img.shields.io/badge/docstring%20coverage-94%25-brightgreen)](https://github.com/Moffran/calibrated_explanations/blob/main/reports/docstring_coverage_20251025.txt)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://github.com/Moffran/calibrated_explanations/blob/main/LICENSE)
[![Downloads](https://static.pepy.tech/badge/calibrated-explanations)](https://pepy.tech/project/calibrated-explanations)

## Quick Reference

**Purpose**: Uncertainty-aware feature-importance explanations for scikit-learn compatible models.

**Install**:
```bash
pip install calibrated-explanations
```

**Primary Use Cases**: binary classification, multiclass classification, interval regression, probabilistic regression

**Key Class (public API)**: `WrapCalibratedExplainer`

**Required calibration**: `true` (calibration set is mandatory).

**All examples in this repo use `WrapCalibratedExplainer`.**

**Typical Workflow (3 lines)**:

```python
from calibrated_explanations import WrapCalibratedExplainer
explainer = WrapCalibratedExplainer(model)           # wrap your sklearn-like model
explainer.fit(x_proper, y_proper); explainer.calibrate(x_cal, y_cal)
explanation = explainer.explain_factual(x_test)      # returns calibrated rules + uncertainty
```

**Core Methods**:

* `fit(x_proper, y_proper)` — train/prepare internal state (model fitting or wrapper).
* `calibrate(x_cal, y_cal, feature_names=None)` — required: align uncertainty estimates.
* `explain_factual(X)` — factual rules + feature importance with [low, high] bounds.
* `explore_alternatives(X)` — counterfactual / alternative rules.
* `predict_proba(X[, uq_interval=True])` — calibrated probability (with uncertainty interval).
* `predict(X[, uq_interval=True])` — point prediction (with uncertainty interval).

**Outputs**: calibrated prediction intervals, per-feature importance with uncertainty bounds, factual/alternative rule tables.

### Task map (critical: regression meanings differ)

**Classification (binary/multiclass)**:
Classification in this library is calibrated using Venn-Abers predictors (see the sketch after this list).
- Calibrated probability: `predict_proba(x[, ...])`
- Calibrated probability with uncertainty bounds using Venn-Abers: `predict_proba(x, uq_interval=True[, ...])`
- Calibrated prediction: `predict(x[, ...])`
- Explanations: `explain_factual(x[, ...])` and `explore_alternatives(x[, ...])`
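
A minimal sketch of these classification calls, assuming `explainer` has already been fit and calibrated as in the quickstart below (variable names are illustrative):

```python
# Hedged sketch: `explainer` and `x_test` follow the quickstart further down.
proba, (low, high) = explainer.predict_proba(x_test, uq_interval=True)  # Venn-Abers calibrated probabilities + bounds
labels = explainer.predict(x_test)                                      # calibrated class predictions
factual = explainer.explain_factual(x_test)                             # factual rules with uncertainty
alternatives = explainer.explore_alternatives(x_test)                   # alternative / counterfactual rules
```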

**Conformal interval regression (CPS) ← CE "regression"**:
Regression in this library is **conformal interval regression** via **Conformal Predictive Systems (CPS)**:
- CPS calibrated point regression: `predict(x[, ...])`
- Point regression + calibrated uncertainty intervals = (conformal) interval regression: `predict(x, uq_interval=True, low_high_percentiles=(a, b)[, ...])`. Note that one-sided intervals can be obtained by setting `a=-np.inf` or `b=np.inf`.
- You can also request CPS-controlled intervals from explanations: `explain_factual(x, low_high_percentiles=(a, b)[, ...])` and `explore_alternatives(x, low_high_percentiles=(a, b)[, ...])`
- Default: `low_high_percentiles=(5, 95)` for 90% intervals (see the sketch below).
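
A hedged sketch of interval regression with these parameters, assuming a regression explainer (`reg_explainer` is an illustrative name) that has been fit and calibrated on a real-valued target:

```python
import numpy as np

# Hedged sketch: `reg_explainer` is assumed to wrap a fitted, calibrated regressor.
prediction, (low, high) = reg_explainer.predict(
    x_test, uq_interval=True, low_high_percentiles=(5, 95)  # 90% two-sided interval
)
upper_only = reg_explainer.predict(
    x_test, uq_interval=True, low_high_percentiles=(-np.inf, 95)  # one-sided interval
)
factual = reg_explainer.explain_factual(x_test, low_high_percentiles=(5, 95))
```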

**Probabilistic regression (thresholded probability queries for y)**:
Probabilistic regression requires specifying a `threshold` (see the sketch after this list):
- Threshold probability for real-valued target: `predict_proba(x, threshold=t[, ...])` gives **P(y <= t)**
- Within-spec probability for real-valued target: `predict_proba(x, threshold=(low, high)[, ...])` gives **P(low < y <= high)**
- Add uncertainty bounds with `uq_interval=True`
- Exceedance explanations: `explain_factual(x, threshold=t[, ...])` and `explore_alternatives(x, threshold=t[, ...])`
- Within-spec explanations: `explain_factual(x, threshold=(low, high)[, ...])` and `explore_alternatives(x, threshold=(low, high)[, ...])`
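
A hedged sketch of thresholded probability queries, again assuming a fitted and calibrated regression explainer; the threshold values are illustrative:

```python
# Hedged sketch: `reg_explainer` and the thresholds (200, 150, 250) are illustrative.
p_below, (low, high) = reg_explainer.predict_proba(x_test, threshold=200, uq_interval=True)  # P(y <= 200) with bounds
p_within = reg_explainer.predict_proba(x_test, threshold=(150, 250))                         # P(150 < y <= 250)
exceedance = reg_explainer.explain_factual(x_test, threshold=200)                            # exceedance explanations
within_spec = reg_explainer.explore_alternatives(x_test, threshold=(150, 250))               # within-spec alternatives
```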

**All tasks also support (core capability)**:
- `predict(x[, ...])` and `predict(x, uq_interval=True[, ...])`
- `explain_factual(x[, ...])` and `explore_alternatives(x[, ...])`

**Common optional parameters (`[, ...]`)**:
- `bins=...` for conditional calibration; you can also pass a Mondrian categorizer (see [crepes.extras.MondrianCategorizer](https://crepes.readthedocs.io/en/latest/crepes.extras.html#crepes.extras.MondrianCategorizer)). A sketch follows this list.
- `low_high_percentiles=(a, b)` for CPS conformal interval regression intervals
- `threshold=t` or `threshold=(low, high)` for probabilistic regression
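
A hedged sketch of conditional (Mondrian) calibration via `bins`, assuming a categorical column (here column 0) supplies the bin labels; passing `bins` to `calibrate` is an assumption to verify against the linked crepes documentation:

```python
# Hedged sketch: using column 0 as the bin label is an illustrative assumption.
cal_bins = x_cal[:, 0]    # one bin label per calibration instance
test_bins = x_test[:, 0]  # matching bin labels for the instances being explained
explainer.calibrate(x_cal, y_cal, bins=cal_bins)             # assumption: calibrate forwards `bins`
factual = explainer.explain_factual(x_test, bins=test_bins)  # condition explanations on the same bins
proba, (low, high) = explainer.predict_proba(x_test, uq_interval=True, bins=test_bins)
```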

**Local dev**: run `pip install -e .` before running examples/tests locally.

**When not to use**: raw deep nets without an sklearn wrapper; real-time streaming without a calibration set; extremely high-dimensional (>10k) feature vectors.

Calibrated Explanations turns any scikit-learn-compatible estimator into a
calibrated explainer that returns:

- **Factual rules** – the calibrated reasons behind your model's prediction.
- **Alternative rules** – what needs to change to flip or reinforce that
  decision, complete with uncertainty bounds.
- **Prediction intervals** – uncertainty-aware probabilities or regression
  ranges that quantify both aleatoric and epistemic risk.

Every quickstart, notebook, and benchmark follows the same recipe: fit your
estimator, calibrate on held-out data, then interpret the returned rule table
before acting.

> **Guarantees & Assumptions**
>
> * **Calibration set required**: A held-out calibration set (typically 20-25% of training data) is mandatory for all workflows.
> * **Interval invariant**: All intervals satisfy `low <= predict <= high`; violations trigger errors.
> * **Uncertainty decomposition**: Intervals capture both aleatoric (data) and epistemic (model) uncertainty.
> * **Calibration validity**: Guarantees hold when calibration and test distributions match (exchangeability assumption).
>
> See [ADR-021](docs/improvement/adrs/ADR-021-calibrated-interval-semantics.md) for formal semantics.
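
A minimal sketch of checking the interval invariant on calibrated probabilities, reusing the tuple unpacking from the quickstart (the positive-class indexing assumes a binary task):

```python
import numpy as np

# Hedged sketch: assumes binary classification and the quickstart's return structure.
proba, (low, high) = explainer.predict_proba(x_test, uq_interval=True)
positive = proba[:, 1]  # calibrated probability of the positive class
assert np.all(low <= positive) and np.all(positive <= high)  # low <= predict <= high
```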

---

## Your first calibrated explanation (≈5 minutes)

1. **Install the essentials**
   ```bash
   python -m pip install calibrated-explanations
   ```

   **Optional extras:**
   | Extra | Purpose | Key Packages |
   | :--- | :--- | :--- |
   | `[viz]` | Plotting and visualizations | `matplotlib` |
   | `[notebooks]` | Jupyter notebook support | `ipython`, `jupyter`, `nbconvert` |
   | `[eval]` | Reproducing benchmarks | `lime`, `shap`, `xgboost`, `scipy` |
   | `[external-plugins]` | High-performance plugins | `numpy>=1.24`, `pandas>=2.0`, `scikit-learn>=1.3` |

   Install with: `pip install "calibrated-explanations[viz,notebooks]"`

2. **Run the quickstart** – this mirrors the smoke-tested docs example.
   ```python
   from sklearn.datasets import load_breast_cancer
   from sklearn.model_selection import train_test_split
   from sklearn.ensemble import RandomForestClassifier
   from calibrated_explanations import WrapCalibratedExplainer

   dataset = load_breast_cancer()
   x_train, x_test, y_train, y_test = train_test_split(
       dataset.data,
       dataset.target,
       test_size=0.2,
       stratify=dataset.target,
       random_state=0,
   )
   x_proper, x_cal, y_proper, y_cal = train_test_split(
       x_train,
       y_train,
       test_size=0.25,
       stratify=y_train,
       random_state=0,
   )

   explainer = WrapCalibratedExplainer(RandomForestClassifier(random_state=0))
   explainer.fit(x_proper, y_proper)
   explainer.calibrate(x_cal, y_cal, feature_names=dataset.feature_names)

   factual = explainer.explain_factual(x_test[:1])
   alternatives = explainer.explore_alternatives(x_test[:1])
   probabilities, probability_interval = explainer.predict_proba(x_test[:1], uq_interval=True)
   low, high = probability_interval
   print(f"Calibrated probability: {probabilities[0, 1]:.3f}")
   print(factual[0])
   ```
3. **Check the output** – the first factual explanation prints a calibrated rule
   table. A real run looks like:
   ```text
   Prediction [ Low ,  High]
   0.077 [0.000, 0.083]
   Value : Feature                                  Weight [ Low  ,  High ]
   0.07  : mean concave points > 0.05               -0.418 [-0.576, -0.256]
   0.15  : worst concave points > 0.12              -0.308 [-0.548,  0.077]
   0.34  : worst concavity > 0.22                   -0.090 [-0.123,  0.077]
   ```
   - The header row shows the calibrated prediction and its low/high uncertainty
     interval.
   - Each subsequent line is a factual rule: the observed value, the matching
     feature, and its signed contribution with uncertainty bounds.
4. **Interpret what you see** – follow the
   [Interpret Calibrated Explanations guide](https://calibrated-explanations.readthedocs.io/en/latest/foundations/how-to/interpret_explanations.html)
   to learn how calibrated intervals, rule weights, and the triangular plot work
   together. The [triangular alternatives tutorial](https://calibrated-explanations.readthedocs.io/en/latest/foundations/concepts/alternatives.html)
   then shows how to narrate trade-offs across alternative rules.

---

## Mental model: fit → calibrate → explain → interpret

1. **Fit** your preferred estimator.
2. **Calibrate** with held-out data to align predicted and observed outcomes.
3. **Explain** with `explain_factual` for calibrated rules and
   `explore_alternatives` for semi-, super-, and counterfactuals.
4. **Interpret** using the how-to guides so decisions account for both aleatoric
   and epistemic uncertainty.

This workflow is identical across binary and multiclass classification as well as probabilistic and
interval regression tasks; the difference lies in how you configure the underlying estimator and read the returned intervals.
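
For example, the same recipe applied to interval regression looks like this (a hedged sketch; the data splits mirror the classification quickstart but use a real-valued target):

```python
from sklearn.ensemble import RandomForestRegressor
from calibrated_explanations import WrapCalibratedExplainer

# Hedged sketch: x_proper/x_cal/x_test mirror the quickstart splits; y_* are real-valued.
reg_explainer = WrapCalibratedExplainer(RandomForestRegressor(random_state=0))
reg_explainer.fit(x_proper, y_proper)
reg_explainer.calibrate(x_cal, y_cal)
prediction, (low, high) = reg_explainer.predict(x_test, uq_interval=True, low_high_percentiles=(5, 95))
alternatives = reg_explainer.explore_alternatives(x_test, low_high_percentiles=(5, 95))
```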

---

## Choose your path

### New practitioners (first run)
- Stay on this README quickstart, then open the
  [classification quickstart](https://calibrated-explanations.readthedocs.io/en/latest/get-started/quickstart_classification.html)
  for a notebook-friendly walk-through with the breast cancer dataset.
- Compare factual vs. alternative explanations using the
  [triangular plot tutorial](https://calibrated-explanations.readthedocs.io/en/latest/foundations/concepts/alternatives.html).

### Practitioners (day-to-day usage)
- Follow the
  [practitioner hub](https://calibrated-explanations.readthedocs.io/en/latest/practitioner/index.html)
  for production checklists, integration how-tos, and interpretation playbooks.
- Explore the
  [probabilistic regression quickstart](https://calibrated-explanations.readthedocs.io/en/latest/get-started/quickstart_regression.html)
  when you need calibrated thresholds.
- Opt into plugins only when needed via
  `pip install "calibrated-explanations[external-plugins]"`; they remain
  optional extensions. Note: `fast` explanations are experimental and
  provided via opt-in plugins; they are allowed in the schema for
  interoperability but are not promoted for primary practitioner
  workflows. Prefer `factual`/`alternative` explanations for production use.

### Agents (CE-first by default)
- Read `AGENTS.md` for the minimal entrypoint.
- Follow the CE-first guide in `docs/get-started/ce_first_agent_guide.md`.
- Use the helper module in `src/calibrated_explanations/ce_agent_utils.py`.

### Researchers
- Reproduce published studies through the
  [researcher hub](https://calibrated-explanations.readthedocs.io/en/latest/researcher/index.html),
  which links directly to benchmark manifests, dataset splits, and evaluation
  notebooks.
- Fetch replication artefacts from the
  [evaluation README](https://github.com/Moffran/calibrated_explanations/blob/main/evaluation/README.md)
  and align with the release plan checkpoints.
- Cite the work using the ready-made entries in
  [docs/citing.md](https://calibrated-explanations.readthedocs.io/en/latest/citing.html).

### Contributors
- Start with the
  [contributor hub](https://calibrated-explanations.readthedocs.io/en/latest/contributor/index.html)
  for development environment setup, plugin guardrails, and quality gates,
  and review it before submitting pull requests.

### Maintainers
- Track release readiness through the root-level
  [`ROADMAP.md`](https://github.com/Moffran/calibrated_explanations/blob/main/ROADMAP.md),
  [docs/foundations/governance/release_checklist.md](https://calibrated-explanations.readthedocs.io/en/latest/governance/release_checklist.html),
  and the implementation plan in
  [`docs/improvement/RELEASE_PLAN_v1.md`](https://github.com/Moffran/calibrated_explanations/blob/main/docs/improvement/RELEASE_PLAN_v1.md).
- Confirm Standards and ADR alignment via
  [docs/improvement/standards/](https://github.com/Moffran/calibrated_explanations/tree/main/docs/improvement/standards) and
  [docs/improvement/adrs/](https://github.com/Moffran/calibrated_explanations/tree/main/docs/improvement/adrs)
  and keep docs navigation synced with the
  [IA crosswalk](https://calibrated-explanations.readthedocs.io/en/latest/foundations/governance/nav_crosswalk.html).

---

## Documentation map

- **API reference** – start with the
  [API index](https://calibrated-explanations.readthedocs.io/en/latest/api/index.html),
  then browse CLI, plugin, serialization, and visualization references.
- **Architecture overview** – the
  [architecture notes](https://calibrated-explanations.readthedocs.io/en/latest/foundations/concepts/architecture.html)
  connect runtime components, telemetry, and plugin boundaries.
- **Contributor guidance** – see the
  [contributor hub](https://calibrated-explanations.readthedocs.io/en/latest/contributor/index.html)
  for setup, quality gates, and process notes.
- **Release notes & changelog** – check
  [release notes](https://calibrated-explanations.readthedocs.io/en/latest/foundations/governance/release_notes.html)
  and the project
  [CHANGELOG](https://github.com/Moffran/calibrated_explanations/blob/main/CHANGELOG.md).
- **Plugin CLI** – inspect registered plugins and trust state with
  `ce.plugins list all` (see the
  [CLI reference](https://calibrated-explanations.readthedocs.io/en/latest/api/cli.html)).
- **Project governance** – review
  [GOVERNANCE.md](https://github.com/Moffran/calibrated_explanations/blob/main/GOVERNANCE.md),
  [SECURITY.md](https://github.com/Moffran/calibrated_explanations/blob/main/SECURITY.md),
  and the
  [Code of Conduct](https://github.com/Moffran/calibrated_explanations/blob/main/CODE_OF_CONDUCT.md).
- **Support** – see
  [SUPPORT.md](https://github.com/Moffran/calibrated_explanations/blob/main/SUPPORT.md)
  for the fastest way to get help.

---

## Licensing & Contributions

Contributions to this project are licensed under the same terms as the project
itself (BSD 3-Clause). By contributing, you agree to the
[Developer Certificate of Origin (DCO)](https://developercertificate.org/)
and that your contributions will be available under the project's license.
See [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md) for details on how to
sign off your commits.

---

## Feature highlights

- **Calibrated prediction confidence** for binary and multiclass classification.
- **Uncertainty-aware feature importance** with aleatoric and epistemic bounds.
- **Probabilistic and interval regression** that mirrors the classification API.
- **Alternative explanations with triangular plots** for visualising trade-offs.
- **Conjunctional and conditional rules** for interaction and fairness analysis.
- **Experimental plugin lane** for `fast` explanations (opt-in only, not
  promoted for production—see practitioner notes above).

---

## Installation options

```bash
python -m pip install calibrated-explanations           # PyPI
conda install -c conda-forge calibrated-explanations    # conda-forge, currently only v0.9.0
python -m pip install "calibrated-explanations[dev]"    # local development tooling
python -m pip install "calibrated-explanations[viz]"    # plotting extras
```

Python ≥3.8 is supported. Optional extras remain additive so the core package
stays lightweight.

---

## Research and reproducibility

1. **Set up the evaluation environment**
   ```bash
   python -m venv .venv
   source .venv/bin/activate
   python -m pip install --upgrade pip
   python -m pip install -e .[dev,eval]
   ```
   The optional `[eval]` extras pull in `lime`, `shap`, `xgboost`, and the plotting
   dependencies used across the published studies.
2. **Load the benchmark assets** – datasets live in the
   [`data/`](https://github.com/Moffran/calibrated_explanations/tree/main/data)
   directory (CSV files and zipped archives) and are referenced directly by the
   evaluation scripts.
3. **Re-run the flagship experiments** – each paper has a matching notebook or
   script under [`evaluation/`](https://github.com/Moffran/calibrated_explanations/tree/main/evaluation):
   - `Classification_Experiment_sota.py` and the accompanying notebooks cover
     the 25-dataset binary classification suite.
   - `multiclass/` and `regression/` host the multiclass and interval
     regression pipelines, respectively.
   - `ensure/` and `fastCE/` contain the ensured-explanations and accelerated
     plugin studies.
   Result archives (`*.pkl`, `.zip`) sit beside each run for quick comparison.
4. **Keep results traceable** – preserve the random seeds baked into the scripts
   (typically `42` or `0`) and record any deviations alongside the active ADRs
   noted in [`docs/improvement/adrs/`](https://github.com/Moffran/calibrated_explanations/tree/main/docs/improvement/adrs).
5. **Cite the sources** – the
   [theory & literature overview](https://calibrated-explanations.readthedocs.io/en/latest/researcher/advanced/theory_and_literature.html)
   lists DOIs, arXiv IDs, and funding acknowledgements to include in your work.

---

## Contributing and maintenance workflow

1. **Create a virtual environment**
   ```bash
   python -m venv .venv
   source .venv/bin/activate
   python -m pip install --upgrade pip
   python -m pip install -e .[dev] -c constraints.txt
   python -m pip install -r docs/requirements-doc.txt -c constraints.txt
   ```
2. **Run the quality gates locally**
   ```bash
   pytest
   ruff check .
   mypy src tests
   ```
3. **Build the documentation (optional but encouraged)**
   ```bash
   make -C docs html
   ```
4. **Open a pull request** referencing the active milestone and relevant ADRs.
   The [PR guide](https://calibrated-explanations.readthedocs.io/en/latest/foundations/governance/pr_guide.html)
   lists the checklist used during reviews.
5. **Review community health docs** – contributions are expected to follow the
   [Code of Conduct](https://github.com/Moffran/calibrated_explanations/blob/main/CODE_OF_CONDUCT.md),
   the contribution licensing guidance in
   [CONTRIBUTING](https://github.com/Moffran/calibrated_explanations/blob/main/.github/CONTRIBUTING.md),
   and the support/security policies in
   [SUPPORT.md](https://github.com/Moffran/calibrated_explanations/blob/main/SUPPORT.md)
   and [SECURITY.md](https://github.com/Moffran/calibrated_explanations/blob/main/SECURITY.md).

---

## License and citation

- Licensed under the [BSD 3-Clause License](https://github.com/Moffran/calibrated_explanations/blob/main/LICENSE).
- Cite Calibrated Explanations using the entries in
  [`CITATION.cff`](https://github.com/Moffran/calibrated_explanations/blob/main/CITATION.cff)
  or [docs/citing.md](https://calibrated-explanations.readthedocs.io/en/latest/citing.html).

---

## Acknowledgements & support

Funded by the [Swedish Knowledge Foundation](https://www.kks.se/) through the
Knowledge Intensive Product Realization SPARK environment at Jönköping
University. For questions or support, open an issue on
[GitHub](https://github.com/Moffran/calibrated_explanations/issues) or review
the guidance in
[SUPPORT.md](https://github.com/Moffran/calibrated_explanations/blob/main/SUPPORT.md).

[pypi-version]: https://img.shields.io/pypi/v/calibrated-explanations.svg
[calibrated-explanations-on-pypi]: https://pypi.org/project/calibrated-explanations/
