Metadata-Version: 2.1
Name: autoPyTorch
Version: 0.1.0
Summary: Auto-PyTorch searches neural architectures using smac
Home-page: https://github.com/automl/Auto-PyTorch
Author: AutoML Freiburg
Author-email: eddiebergmanhs@gmail.com
License: 3-clause BSD
Keywords: machine learning algorithm configuration hyperparameteroptimization tuning neural architecture deep learning
Platform: Linux
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Utilities
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: License :: OSI Approved :: BSD License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: torch
Requires-Dist: torchvision
Requires-Dist: tensorboard
Requires-Dist: scikit-learn (<0.25.0,>=0.24.0)
Requires-Dist: numpy
Requires-Dist: scipy (>=1.7)
Requires-Dist: lockfile
Requires-Dist: imgaug (>=0.4.0)
Requires-Dist: ConfigSpace (<0.5,>=0.4.14)
Requires-Dist: pynisher (>=0.6.3)
Requires-Dist: pyrfr (<0.9,>=0.7)
Requires-Dist: smac (==0.14.0)
Requires-Dist: dask
Requires-Dist: distributed (>=2.2.0)
Requires-Dist: catboost
Requires-Dist: lightgbm
Requires-Dist: flaky
Requires-Dist: tabulate
Provides-Extra: docs
Requires-Dist: sphinx ; extra == 'docs'
Requires-Dist: sphinx-gallery ; extra == 'docs'
Requires-Dist: sphinx-bootstrap-theme ; extra == 'docs'
Requires-Dist: numpydoc ; extra == 'docs'
Provides-Extra: examples
Requires-Dist: matplotlib ; extra == 'examples'
Requires-Dist: jupyter ; extra == 'examples'
Requires-Dist: notebook ; extra == 'examples'
Requires-Dist: seaborn ; extra == 'examples'
Provides-Extra: test
Requires-Dist: matplotlib ; extra == 'test'
Requires-Dist: pytest ; extra == 'test'
Requires-Dist: pytest-xdist ; extra == 'test'
Requires-Dist: pytest-timeout ; extra == 'test'
Requires-Dist: flaky ; extra == 'test'
Requires-Dist: pyarrow ; extra == 'test'
Requires-Dist: pre-commit ; extra == 'test'
Requires-Dist: pytest-cov ; extra == 'test'
Requires-Dist: pytest-forked ; extra == 'test'
Requires-Dist: codecov ; extra == 'test'
Requires-Dist: pep8 ; extra == 'test'
Requires-Dist: mypy ; extra == 'test'
Requires-Dist: openml ; extra == 'test'
Requires-Dist: emcee ; extra == 'test'
Requires-Dist: scikit-optimize ; extra == 'test'
Requires-Dist: pyDOE ; extra == 'test'

# Auto-PyTorch

Copyright (C) 2021  [AutoML Groups Freiburg and Hannover](http://www.automl.org/)

While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed **Auto-PyTorch**, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).

Auto-PyTorch is mainly developed to support tabular data (classification, regression).
The newest features in Auto-PyTorch for tabular data are described in the paper ["Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"](https://arxiv.org/abs/2006.13799) (see below for bibtex ref).
Also, find the documentation [here](https://automl.github.io/Auto-PyTorch/master).

***From v0.1.0, Auto-PyTorch has been updated to further improve usability, robustness and efficiency by using SMAC as the underlying optimization package and by changing the code structure. As a result, moving from v0.0.2 to v0.1.0 will break compatibility.
If you would like to use the old API, you can find it at [`master-old`](https://github.com/automl/Auto-PyTorch/tree/master-old).***

## Workflow

The following figure gives a rough overview of the Auto-PyTorch workflow.

<img src="figs/apt_workflow.png" width="500">

In the figure, **Data** is provided by the user and
**Portfolio** is a set of neural network configurations that work well on diverse datasets.
The current version only supports the *greedy portfolio* as described in the paper *Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL*.
This portfolio is used to warm-start the optimization of SMAC.
In other words, we evaluate the portfolio configurations on the provided data as the initial configurations.
The API then runs the following steps:
1. **Validate input data**: Process each data type, e.g. encode categorical data, so that Auto-PyTorch can handle it.
2. **Create dataset**: Create a dataset that this API can handle, with a choice of cross-validation or holdout splits.
3. **Evaluate baselines** *1: Train each algorithm in the predefined pool with a fixed hyperparameter configuration, as well as a dummy model from `sklearn.dummy` that represents the worst possible performance.
4. **Search by [SMAC](https://github.com/automl/SMAC3)**:\
    a. Determine budget and cut-off rules by [Hyperband](https://jmlr.org/papers/volume18/16-558/16-558.pdf)\
    b. Sample a pipeline hyperparameter configuration *2 by SMAC\
    c. Update the observations with the obtained results\
    d. Repeat a. -- c. until the budget runs out
5. Build the best ensemble for the provided dataset from the observations using [ensemble selection](https://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm06long.pdf).
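
Step 5 can be illustrated with a minimal sketch of greedy ensemble selection in the spirit of the linked paper. This is not Auto-PyTorch's implementation: the function names, the toy validation predictions, and the use of accuracy as the metric are assumptions made for illustration.

```py
from typing import Dict, List, Sequence


def accuracy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of rows whose highest-scoring class equals the label."""
    hits = sum(
        max(range(len(row)), key=row.__getitem__) == label
        for row, label in zip(probs, labels)
    )
    return hits / len(labels)


def greedy_ensemble_selection(
    val_probs: Dict[str, List[List[float]]],
    labels: Sequence[int],
    ensemble_size: int,
) -> List[str]:
    """Pick models one at a time, with replacement, by validation accuracy."""
    names = sorted(val_probs)
    n_classes = len(val_probs[names[0]][0])
    running_sum = [[0.0] * n_classes for _ in labels]
    chosen: List[str] = []
    for _ in range(ensemble_size):
        best_name, best_score = names[0], -1.0
        for name in names:
            # Summed probabilities have the same argmax as their average.
            candidate = [
                [s + p for s, p in zip(sum_row, prob_row)]
                for sum_row, prob_row in zip(running_sum, val_probs[name])
            ]
            score = accuracy(candidate, labels)
            if score > best_score:  # ties keep the earlier model
                best_name, best_score = name, score
        chosen.append(best_name)
        running_sum = [
            [s + p for s, p in zip(sum_row, prob_row)]
            for sum_row, prob_row in zip(running_sum, val_probs[best_name])
        ]
    return chosen


# Toy validation predictions (hypothetical): 3 models, 4 samples, 2 classes
labels = [0, 1, 1, 0]
val_probs = {
    "model_a": [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]],
    "model_b": [[0.6, 0.4], [0.6, 0.4], [0.1, 0.9], [0.4, 0.6]],
    "model_c": [[0.4, 0.6], [0.3, 0.7], [0.4, 0.6], [0.45, 0.55]],
}
selected = greedy_ensemble_selection(val_probs, labels, ensemble_size=3)
print(selected)
```

A model may be selected more than once, which gives it a larger weight in the averaged prediction. Here `model_b` is weak on its own but is added second because it corrects the one mistake `model_a` makes.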

*1: Baselines are a predefined pool of machine learning algorithms, e.g. LightGBM and support vector machines, that solve either a regression or a classification task on the provided dataset.

*2: A pipeline hyperparameter configuration specifies the choice of components in each step, e.g. the target algorithm and the shape of the neural networks, and their corresponding hyperparameters.
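
The budget and cut-off rules in step 4a can be sketched with the bracket schedule from the Hyperband paper. This is only an illustration: the function name `hyperband_brackets`, the maximum budget of 81 and `eta=3` are assumptions, and the implementation used through SMAC may differ in its details and in what a unit of budget means.

```py
from typing import List, Tuple


def hyperband_brackets(max_budget: int, eta: int = 3) -> List[List[Tuple[int, float]]]:
    """Return, per bracket, the (n_configs, budget_per_config) of each round."""
    # Largest s with eta**s <= max_budget, computed with integers only.
    s_max = 0
    while eta ** (s_max + 1) <= max_budget:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta**s / (s + 1)).
        numerator = (s_max + 1) * eta ** s
        n = -(-numerator // (s + 1))
        rounds = []
        for i in range(s + 1):
            n_i = n // eta ** i                     # configurations evaluated in round i
            r_i = max_budget * eta ** i / eta ** s  # budget given to each of them
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets


brackets = hyperband_brackets(max_budget=81, eta=3)
for rounds in brackets:
    print(rounds)
```

Each bracket starts many configurations on a small budget and keeps roughly the best `1/eta` of them for the next round, so later rounds spend more budget on fewer configurations.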

## Installation

### Manual Installation

We recommend using Anaconda for developing as follows:

```sh
# The following commands assume the user is in a cloned directory of Auto-PyTorch

# We also need to initialize the automl_common repository as follows
# You can find more information about this here:
# https://github.com/automl/automl_common/
git submodule update --init --recursive

# Create the environment
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
python setup.py install

```

## Examples

In a nutshell:

```py
from autoPyTorch.api.tabular_classification import TabularClassificationTask

# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
        sklearn.model_selection.train_test_split(X, y, random_state=1)

# initialise Auto-PyTorch api
api = TabularClassificationTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Calculate test accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)
```

For more examples, including customising the search space and parallelising the code, check out the `examples` folder:

```sh
$ cd examples/
```


Code for the [paper](https://arxiv.org/abs/2006.13799) is available under `examples/ensemble` in the [TPAMI.2021.3067763](https://github.com/automl/Auto-PyTorch/tree/TPAMI.2021.3067763) branch.

## Contributing

If you want to contribute to Auto-PyTorch, clone the repository and check out our current development branch:

```sh
$ git checkout development
```

## License

This program is free software: you can redistribute it and/or modify
it under the terms of the Apache license 2.0 (please see the LICENSE file).

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the Apache license 2.0
along with this program (see LICENSE file).

## Reference

Please refer to the branch `TPAMI.2021.3067763` to reproduce the paper *Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL*.

```bibtex
@article{zimmer-tpami21a,
  author = {Lucas Zimmer and Marius Lindauer and Frank Hutter},
  title = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year = {2021},
  note = {also available under https://arxiv.org/abs/2006.13799},
  pages = {3079--3090}
}
```

```bibtex
@incollection{mendoza-automlbook18a,
  author    = {Hector Mendoza and Aaron Klein and Matthias Feurer and Jost Tobias Springenberg and Matthias Urban and Michael Burkart and Max Dippel and Marius Lindauer and Frank Hutter},
  title     = {Towards Automatically-Tuned Deep Neural Networks},
  year      = {2018},
  month     = dec,
  editor    = {Hutter, Frank and Kotthoff, Lars and Vanschoren, Joaquin},
  booktitle = {AutoML: Methods, Systems, Challenges},
  publisher = {Springer},
  chapter   = {7},
  pages     = {141--156}
}
```

## Contact

Auto-PyTorch is developed by the [AutoML Group of the University of Freiburg](http://www.automl.org/).


