Metadata-Version: 2.1
Name: auroris
Version: 0.1.3
Summary: Data Curation in Polaris
Author-email: Lu Zhu <lu@valencediscovery.com>, Julien St-Laurent <julien.stl@valencediscovery.com>, Cas Wognum <cas@valencediscovery.com>
Project-URL: Website, https://polarishub.io/
Project-URL: Source Code, https://github.com/polaris-hub/auroris
Project-URL: Bug Tracker, https://github.com/polaris-hub/auroris/issues
Project-URL: Documentation, https://polaris-hub.github.io/auroris/
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Healthcare Industry
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: tqdm
Requires-Dist: loguru
Requires-Dist: typer
Requires-Dist: pydantic >=2
Requires-Dist: numpy
Requires-Dist: pandas <2.2.0
Requires-Dist: scipy
Requires-Dist: scikit-learn
Requires-Dist: seaborn
Requires-Dist: datamol >=0.12.1
Requires-Dist: pillow
Requires-Dist: fsspec
Requires-Dist: pyarrow
Provides-Extra: dev
Requires-Dist: pytest ; extra == 'dev'
Requires-Dist: pytest-xdist ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: ruff ; extra == 'dev'
Requires-Dist: jupyterlab ; extra == 'dev'
Requires-Dist: ipywidgets ; extra == 'dev'
Provides-Extra: doc
Requires-Dist: mkdocs ; extra == 'doc'
Requires-Dist: mkdocs-material >=9.4.7 ; extra == 'doc'
Requires-Dist: mkdocstrings ; extra == 'doc'
Requires-Dist: mkdocstrings-python ; extra == 'doc'
Requires-Dist: mkdocs-jupyter ; extra == 'doc'
Requires-Dist: markdown-include ; extra == 'doc'
Requires-Dist: mdx-truly-sane-lists ; extra == 'doc'
Requires-Dist: nbconvert ; extra == 'doc'
Requires-Dist: mike >=1.0.0 ; extra == 'doc'

# Auroris

[![PyPI](https://img.shields.io/pypi/v/auroris)](https://pypi.org/project/auroris/)
[![Conda](https://img.shields.io/conda/v/conda-forge/auroris?label=conda&color=success)](https://anaconda.org/conda-forge/auroris)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/auroris)](https://pypi.org/project/auroris/)
[![Conda](https://img.shields.io/conda/dn/conda-forge/auroris)](https://anaconda.org/conda-forge/auroris)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/auroris)](https://pypi.org/project/auroris/)

[![test](https://github.com/polaris-hub/auroris/actions/workflows/test.yml/badge.svg)](https://github.com/polaris-hub/auroris/actions/workflows/test.yml)
[![release](https://github.com/polaris-hub/auroris/actions/workflows/release.yml/badge.svg)](https://github.com/polaris-hub/auroris/actions/workflows/release.yml)
[![code-check](https://github.com/polaris-hub/auroris/actions/workflows/code-check.yml/badge.svg)](https://github.com/polaris-hub/auroris/actions/workflows/code-check.yml)
[![doc](https://github.com/polaris-hub/auroris/actions/workflows/doc.yml/badge.svg)](https://github.com/polaris-hub/auroris/actions/workflows/doc.yml)

Tools for data curation in the Polaris ecosystem. 


### Getting started

```python
from auroris.curation import Curator
from auroris.curation.actions import MoleculeCuration, OutlierDetection, Discretization

# Define the curation workflow
curator = Curator(
    steps=[
        MoleculeCuration(input_column="smiles"),
        OutlierDetection(method="zscore", columns=["SOL"]),
        Discretization(input_column="SOL", thresholds=[-3]),
    ],
    parallelized_kwargs={"n_jobs": -1},
)

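# Run the curation; `dataset` is assumed to be a pandas DataFrame
# containing the "smiles" and "SOL" columns referenced above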
dataset, report = curator(dataset)
```
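
The `OutlierDetection` and `Discretization` steps above can be pictured with a small plain-Python sketch. This is not how `auroris` implements them. The helper names `zscore_outliers` and `discretize`, the sample values, the z-score cutoff of 2.0, and the choice of `>=` at the threshold boundary are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_outliers(values, cutoff=2.0):
    """Flag values whose absolute z-score exceeds `cutoff` (hypothetical helper)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier
        return [False] * len(values)
    return [abs((v - mu) / sigma) > cutoff for v in values]


def discretize(values, thresholds):
    """Label each value with the number of thresholds it meets or exceeds (hypothetical helper)."""
    return [sum(v >= t for t in thresholds) for v in values]


# Made-up solubility-like values, for illustration only
sol = [-2.5, -3.1, -2.9, -3.4, -2.7, -12.0]

outlier_flags = zscore_outliers(sol)       # only the last value is flagged
labels = discretize(sol, thresholds=[-3])  # 1 if the value is >= -3, else 0
```

A single threshold yields a binary label, which is one way a continuous readout such as `SOL` could be turned into a classification target.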

### Run curation from the command line

A `Curator` object is serializable, so you can save it to a JSON file and load it again later to reproduce the curation.

```shell
auroris [config_file] [destination] --dataset-path [data_path]
```
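
To illustrate why a serializable workflow helps reproducibility, here is a minimal sketch of a JSON round trip in plain Python. It deliberately does not use the `auroris` API, and the dictionary layout is hypothetical rather than the actual `Curator` JSON schema. Refer to the documentation for the real serialization methods.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical workflow description; NOT the real auroris JSON schema
config = {
    "steps": [
        {"name": "outlier_detection", "method": "zscore", "columns": ["SOL"]},
        {"name": "discretization", "input_column": "SOL", "thresholds": [-3]},
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "curation.json"
    # Save the workflow definition to disk
    path.write_text(json.dumps(config, indent=2))
    # Load it back, as one would on another machine or at a later date
    restored = json.loads(path.read_text())

# The restored definition is identical, so the same steps can be re-applied
round_trip_ok = restored == config
```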

## Documentation

Please refer to the [documentation](https://polaris-hub.github.io/auroris/), which contains tutorials for getting started with `auroris` and detailed descriptions of the functions provided.

## Installation

You can install `auroris` using conda/mamba/micromamba:

```bash
conda install -c conda-forge auroris
```

You can also use pip:

```bash
pip install auroris
```

## Development lifecycle

### Setup dev environment

```shell
conda env create -n auroris -f env.yml
conda activate auroris

pip install --no-deps -e .
```

<details>
  <summary>Other installation options</summary>

Alternatively, using [uv](https://github.com/astral-sh/uv):

```shell
# Create the virtual environment at the same path that is activated below
uv venv -p 3.12 .venv/auroris
source .venv/auroris/bin/activate
uv pip compile pyproject.toml -o requirements.txt --all-extras
uv pip install -r requirements.txt
```

</details>


### Tests

You can run tests locally with:

```shell
pytest
```

## License

`auroris` is released under the Apache-2.0 license. See [LICENSE](LICENSE).
