Metadata-Version: 2.3
Name: HiDimStat
Version: 0.2.0b0
Summary: High-dimensional statistical inference tools for Python
Project-URL: Development, https://github.com/nilearn/nilearn
Project-URL: Homepage, https://mind-inria.github.io/hidimstat
Project-URL: Repository, https://github.com/mind-inria/hidimstat
Author: HiDimStat developers
Maintainer-email: Bertrand Thirion <bertrand.thirion@inria.fr>
License: BSD 2-Clause License
        
        Copyright (c) 2024, Mind-Inria
        All rights reserved.
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
License-File: LICENSE
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development
Requires-Python: >=3.12
Requires-Dist: joblib>=1.4.2
Requires-Dist: numpy>=2.0.0
Requires-Dist: pandas>=2.2.2
Requires-Dist: scikit-learn>=1.5.1
Requires-Dist: scipy>=1.14.0
Requires-Dist: torch>=2.3.1
Requires-Dist: torchmetrics>=1.4.0.post0
Provides-Extra: doc
Requires-Dist: memory-profiler>=0.61.0; extra == 'doc'
Requires-Dist: mne>=1.7.1; extra == 'doc'
Requires-Dist: nilearn>=0.10.4; extra == 'doc'
Requires-Dist: numpydoc>=1.7.0; extra == 'doc'
Requires-Dist: pillow>=10.4.0; extra == 'doc'
Requires-Dist: pyqt5>=5.15.10; extra == 'doc'
Requires-Dist: pyvista>=0.44.0; extra == 'doc'
Requires-Dist: pyvistaqt>=0.11.1; extra == 'doc'
Requires-Dist: sphinx-bootstrap-theme>=0.8.1; extra == 'doc'
Requires-Dist: sphinx-gallery>=0.16.0; extra == 'doc'
Requires-Dist: sphinxcontrib-bibtex>=2.6.2; extra == 'doc'
Provides-Extra: plotting
Requires-Dist: matplotlib>=3.9.0; extra == 'plotting'
Provides-Extra: style
Requires-Dist: black>=24.4.2; extra == 'style'
Requires-Dist: isort>=5.13.2; extra == 'style'
Provides-Extra: test
Requires-Dist: coverage>=7.6.0; extra == 'test'
Requires-Dist: pytest-cov>=5.0.0; extra == 'test'
Requires-Dist: pytest>=8.2.2; extra == 'test'
Description-Content-Type: text/markdown

# HiDimStat: High-dimensional statistical inference tools for Python
[![Build](https://github.com/mind-inria/hidimstat/actions/workflows/build_package.yml/badge.svg?branch=main)](https://github.com/mind-inria/hidimstat/actions/workflows/build_package.yml)  [![codecov](https://codecov.io/github/mind-inria/hidimstat/branch/main/graph/badge.svg?token=O1YZDTFTNS)](https://codecov.io/github/mind-inria/hidimstat) [![CodeStyle](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

The HiDimStat package provides statistical inference methods to solve the
problem of support recovery in the context of high-dimensional and
spatially structured data.

## Installation

HiDimStat requires Python 3.12 or later. To install it, run the following
in a terminal:

```bash
pip install hidimstat
```

Or if you want the latest version available (for example to contribute to
the development of this project):

```bash
pip install -U git+https://github.com/mind-inria/hidimstat.git
```

or

```bash
git clone https://github.com/mind-inria/hidimstat.git
cd hidimstat
pip install -e .
```
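
To check that the installation succeeded, one option is to query the installed version through the standard library. This is a minimal sketch, not part of HiDimStat itself: the helper name `installed_version` is hypothetical, and the distribution name `hidimstat` is assumed to match the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumption: the distribution is registered under the name "hidimstat".
found = installed_version("hidimstat")
print(found if found is not None else "hidimstat is not installed")
```

If the script reports that the package is not installed, the `pip` command most likely ran in a different Python environment than the one running the script.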

## Dependencies

```
joblib
numpy
pandas
scikit-learn
scipy
torch
torchmetrics
```

To run the examples, `matplotlib` is also necessary; to run the tests,
`pytest` is required.
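
Before running the examples or the tests, it can help to confirm that the optional tools are importable in the current environment. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical and is not provided by HiDimStat.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# matplotlib is needed for the examples, pytest for the tests.
missing = missing_modules(["matplotlib", "pytest"])
if missing:
    print("Missing optional modules:", ", ".join(missing))
else:
    print("All optional modules are available.")
```

Note that `find_spec` takes import names, which can differ from the names used with `pip` (for example, `scikit-learn` is imported as `sklearn`).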

## Documentation & Examples

All the documentation of HiDimStat is available at https://mind-inria.github.io/hidimstat/.

The `examples` folder currently contains three Python scripts that
illustrate how to use the main HiDimStat functions.
Each script handles a different kind of dataset:

* `plot_2D_simulation_example.py` handles a simulated dataset with a 2D
  spatial structure,
* `plot_fmri_data_example.py` solves the decoding problem on the Haxby fMRI
  dataset,
* `plot_meg_data_example.py` tackles the source localization problem on
  several MEG/EEG datasets.


```bash
# For example, run the following command in a terminal
python plot_2D_simulation_example.py
```

## References

The algorithms developed in this package have been detailed in several
conference/journal articles that can be downloaded at
https://mind-inria.github.io/research.html.

#### Main references:

Ensemble of Clustered desparsified Lasso (ECDL):

* Chevalier, J. A., Salmon, J., & Thirion, B. (2018). __Statistical inference
  with ensemble of clustered desparsified lasso__. In _International Conference
  on Medical Image Computing and Computer-Assisted Intervention_
  (pp. 638-646). Springer, Cham.

* Chevalier, J. A., Nguyen, T. B., Thirion, B., & Salmon, J. (2021). __Spatially relaxed inference on high-dimensional linear models__. arXiv preprint arXiv:2106.02590.

Aggregation of multiple Knockoffs (AKO):

* Nguyen T.-B., Chevalier J.-A., Thirion B., & Arlot S. (2020). __Aggregation
  of Multiple Knockoffs__. In _Proceedings of the 37th International Conference on
  Machine Learning_, Vienna, Austria, PMLR 119.

Application to decoding (fMRI data):

* Chevalier, J. A., Nguyen T.-B., Salmon, J., Varoquaux, G. & Thirion, B. (2021). __Decoding with confidence: Statistical control on decoder maps__. In _NeuroImage_, 234, 117921.

Application to source localization (MEG/EEG data):

* Chevalier, J. A., Gramfort, A., Salmon, J., & Thirion, B. (2020). __Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso__. In _Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)_, Vancouver, Canada.

Single/Group statistically validated importance using conditional permutations:

* Chamma, A., Thirion, B., & Engemann, D. (2024). __Variable importance in
  high-dimensional settings requires grouping__. In _Proceedings of
  the 38th Conference of the Association for the Advancement of Artificial
  Intelligence (AAAI 2024)_, Vancouver, Canada.

* Chamma, A., Engemann, D., & Thirion, B. (2023). __Statistically Valid Variable
  Importance Assessment through Conditional Permutations__. In _Proceedings of
  the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)_,
  New Orleans, USA.

If you use this package, we would appreciate citations to the relevant papers listed above.

#### Other useful references:

For de-sparsified (or de-biased) Lasso:

* Javanmard, A., & Montanari, A. (2014). __Confidence intervals and hypothesis
  testing for high-dimensional regression__. _The Journal of Machine Learning
  Research_, 15(1), 2869-2909.

* Zhang, C. H., & Zhang, S. S. (2014). __Confidence intervals for low dimensional
  parameters in high dimensional linear models__. _Journal of the Royal
  Statistical Society: Series B: Statistical Methodology_, 217-242.

* Van de Geer, S., Bühlmann, P., Ritov, Y. A., & Dezeure, R. (2014). __On
  asymptotically optimal confidence regions and tests for high-dimensional
  models__. _The Annals of Statistics_, 42(3), 1166-1202.

For Knockoffs Inference:

* Barber, R. F., & Candès, E. J. (2015). __Controlling the false discovery rate
  via knockoffs__. _The Annals of Statistics_, 43(5), 2055-2085.
  doi:10.1214/15-AOS1337. https://projecteuclid.org/euclid.aos/1438606853

* Candès, E., Fan, Y., Janson, L., & Lv, J. (2018). __Panning for gold: Model-X
  knockoffs for high dimensional controlled variable selection__. _Journal of the
  Royal Statistical Society Series B_, 80(3), 551-577.

## License

This project is licensed under the BSD 2-Clause License.

## Acknowledgments

This project has been funded by Labex DigiCosme (ANR-11-LABEX-0045-DIGICOSME)
as part of the program "Investissement d’Avenir" (ANR-11-IDEX-0003-02), by the
Fast Big project (ANR-17-CE23-0011) and the KARAIB AI Chair
(ANR-20-CHIA-0025-01). This study has also been supported by the European
Union’s Horizon 2020 research and innovation program
(Grant Agreement No. 945539, Human Brain Project SGA3).
