Metadata-Version: 2.1
Name: baytune
Version: 0.3.6
Summary: Bayesian Tuning and Bandits
Home-page: https://github.com/HDI-Project/BTB
Author: MIT Data To AI Lab
Author-email: dailabmit@gmail.com
License: MIT license
Keywords: machine learning hyperparameters tuning classification
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.5
Description-Content-Type: text/markdown
Requires-Dist: numpy (<1.18.0,>=1.14.0)
Requires-Dist: scikit-learn (<0.22.0,>=0.20.0)
Requires-Dist: scipy (<1.4.0,>=1.0.1)
Requires-Dist: pandas (<0.26.0,>=0.21.0)
Requires-Dist: tqdm (<4.50.0,>=4.36.1)
Provides-Extra: dev
Requires-Dist: bumpversion (>=0.5.3) ; extra == 'dev'
Requires-Dist: pip (>=9.0.1) ; extra == 'dev'
Requires-Dist: watchdog (>=0.8.3) ; extra == 'dev'
Requires-Dist: autodocsumm (>=0.1.10) ; extra == 'dev'
Requires-Dist: ipython (>=6.5.0) ; extra == 'dev'
Requires-Dist: m2r (>=0.2.0) ; extra == 'dev'
Requires-Dist: Sphinx (<2.4,>=1.7.1) ; extra == 'dev'
Requires-Dist: sphinx-rtd-theme (>=0.2.4) ; extra == 'dev'
Requires-Dist: flake8 (>=3.7.7) ; extra == 'dev'
Requires-Dist: isort (>=4.3.4) ; extra == 'dev'
Requires-Dist: autoflake (>=1.2) ; extra == 'dev'
Requires-Dist: autopep8 (>=1.4.3) ; extra == 'dev'
Requires-Dist: twine (>=1.10.0) ; extra == 'dev'
Requires-Dist: wheel (>=0.30.0) ; extra == 'dev'
Requires-Dist: tox (>=2.9.1) ; extra == 'dev'
Requires-Dist: coverage (>=4.5.1) ; extra == 'dev'
Requires-Dist: pytest (>=3.4.2) ; extra == 'dev'
Requires-Dist: pytest-cov (>=2.6.0) ; extra == 'dev'
Provides-Extra: examples
Requires-Dist: jupyter (>=1.0.0) ; extra == 'examples'
Requires-Dist: matplotlib (>=3.1.1) ; extra == 'examples'
Provides-Extra: test
Requires-Dist: pytest (>=3.4.2) ; extra == 'test'
Requires-Dist: pytest-cov (>=2.6.0) ; extra == 'test'

<p align="left">
<img width="15%" src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="BTB" />
<i>An open source project from Data to AI Lab at MIT.</i>
</p>

![](https://raw.githubusercontent.com/HDI-Project/BTB/master/docs/images/BTB-Icon-small.png)

A simple, extensible backend for developing auto-tuning systems.

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![PyPi Shield](https://img.shields.io/pypi/v/baytune.svg)](https://pypi.python.org/pypi/baytune)
[![Travis CI Shield](https://travis-ci.org/HDI-Project/BTB.svg?branch=master)](https://travis-ci.org/HDI-Project/BTB)
[![Coverage Status](https://codecov.io/gh/HDI-Project/BTB/branch/master/graph/badge.svg)](https://codecov.io/gh/HDI-Project/BTB)
[![Downloads](https://pepy.tech/badge/baytune)](https://pepy.tech/project/baytune)

* Free software: MIT license
* Development Status: [Pre-Alpha](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
* Documentation: https://HDI-Project.github.io/BTB
* Homepage: https://github.com/HDI-Project/BTB

# Overview

BTB ("Bayesian Tuning and Bandits") is a simple, extensible backend for developing auto-tuning
systems such as AutoML systems. It provides an easy-to-use interface for *tuning* and *selection*.

It is currently being used in several AutoML systems:
- [ATM](https://github.com/HDI-Project/ATM), distributed, multi-tenant AutoML system for
classifier tuning
- MIT TA2, MIT's system for the DARPA [Data-driven discovery of models](
https://www.darpa.mil/program/data-driven-discovery-of-models) (D3M) program
- [AutoBazaar](https://github.com/HDI-Project/AutoBazaar), a flexible, general-purpose
AutoML system
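
The *tuning* and *selection* roles mentioned above can be pictured with a toy loop. The sketch
below is not BTB's implementation: greedy selection and uniform random search stand in for the
bandit selectors and Bayesian tuners, and `toy_session`, its integer `(low, high)` ranges and the
keys of the returned dictionary are all hypothetical names chosen for illustration. It only
assumes a scorer that takes a name and a configuration, as in the Quickstart further down.

```python
import random


def toy_session(tunables, scorer, iterations, seed=0):
    """Toy select-and-tune loop (illustrative only, not BTB's algorithm).

    tunables maps a name to {hyperparameter: (low, high)} integer ranges.
    """
    rng = random.Random(seed)
    best_by_name = {}   # best score seen so far for each candidate
    best = None         # best (score, name, config) seen overall

    for _ in range(iterations):
        untried = [name for name in tunables if name not in best_by_name]
        if untried:
            # Selection, step 1: try every candidate at least once.
            name = untried[0]
        else:
            # Selection, step 2: greedily favour the best-scoring candidate.
            name = max(best_by_name, key=best_by_name.get)

        # Tuning: propose a configuration (uniform random search here).
        config = {
            hp: rng.randint(low, high)
            for hp, (low, high) in tunables[name].items()
        }
        score = scorer(name, config)

        best_by_name[name] = max(score, best_by_name.get(name, score))
        if best is None or score > best[0]:
            best = (score, name, config)

    return {'name': best[1], 'config': best[2], 'score': best[0]}


def scorer(name, config):
    # Hypothetical objective: 'b' peaks at x == 7, 'a' is always far worse.
    penalty = 0 if name == 'b' else 100
    return -abs(config['x'] - 7) - penalty


result = toy_session({'a': {'x': (0, 10)}, 'b': {'x': (0, 10)}}, scorer, 50)
print(result['name'], result['config'], result['score'])
```

Because every score of `'a'` is lower than any score of `'b'`, the loop settles on `'b'` after
trying each candidate once and spends the remaining iterations tuning it.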

# Install

## Requirements

**BTB** has been developed and tested on [Python 3.5, 3.6 and 3.7](https://www.python.org/downloads/).

Although it is not strictly required, using a
[virtualenv](https://virtualenv.pypa.io/en/latest/) is highly recommended to avoid
interfering with other software installed on the system where **BTB** is run.

## Install with pip

The easiest and recommended way to install **BTB** is using [pip](
https://pip.pypa.io/en/stable/):

```bash
pip install baytune
```

This will pull and install the latest stable release from [PyPI](https://pypi.org/).

If you want to install from source or contribute to the project, please read the
[Contributing Guide](https://hdi-project.github.io/BTB/contributing.html#get-started).

# Quickstart

Below is a short example that uses ``BTBSession`` to tune the ``ExtraTreesRegressor`` and
``RandomForestRegressor`` ensemble models from [scikit-learn](https://scikit-learn.org/).
Both are evaluated on the [Boston dataset](http://lib.stat.cmu.edu/datasets/boston)
regression problem.

```python3
from sklearn.datasets import load_boston as load_dataset
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

from btb.session import BTBSession

models = {
    'random_forest': RandomForestRegressor,
    'extra_trees': ExtraTreesRegressor,
}

def build_model(name, hyperparameters):
    model_class = models[name]
    return model_class(random_state=0, **hyperparameters)

def score_model(name, hyperparameters):
    model = build_model(name, hyperparameters)
    r2_scorer = make_scorer(r2_score)
    scores = cross_val_score(model, X_train, y_train, scoring=r2_scorer, cv=5)
    return scores.mean()

dataset = load_dataset()

X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)

tunables = {
    'random_forest': {
        'n_estimators': {
            'type': 'int',
            'default': 2,
            'range': [1, 1000]
        },
        'max_features': {
            'type': 'str',
            'default': 'log2',
            'range': [None, 'auto', 'log2', 'sqrt']
        },
        'min_samples_split': {
            'type': 'int',
            'default': 2,
            'range': [2, 20]
        },
        'min_samples_leaf': {
            'type': 'int',
            'default': 2,
            'range': [1, 20]
        },
    },
    'extra_trees': {
        'n_estimators': {
            'type': 'int',
            'default': 2,
            'range': [1, 1000]
        },
        'max_features': {
            'type': 'str',
            'default': 'log2',
            'range': [None, 'auto', 'log2', 'sqrt']
        },
        'min_samples_split': {
            'type': 'int',
            'default': 2,
            'range': [2, 20]
        },
        'min_samples_leaf': {
            'type': 'int',
            'default': 2,
            'range': [1, 20]
        },
    }
}

session = BTBSession(tunables, score_model)
best_proposal = session.run(20)
```
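
Before starting a session it can help to check that each `default` in a `tunables` dictionary
like the one above falls inside its `range`. The following is a minimal sketch in plain Python
that does not call BTB. `check_tunables` is a hypothetical helper, and it assumes that numeric
ranges are inclusive `[min, max]` pairs while every other type lists its allowed choices, which
is how the example above appears to use them.

```python
def check_tunables(tunables):
    """Collect human-readable problems found in a tunables specification."""
    problems = []
    for model_name, hyperparams in tunables.items():
        for hp_name, spec in hyperparams.items():
            default, values = spec['default'], spec['range']
            if spec['type'] in ('int', 'float'):
                # Assumption: numeric ranges are inclusive [min, max] pairs.
                low, high = values
                valid = low <= default <= high
            else:
                # Assumption: any other type lists its allowed choices.
                valid = default in values
            if not valid:
                problems.append(
                    '{}.{}: default {!r} is outside {!r}'.format(
                        model_name, hp_name, default, values))
    return problems


example = {
    'random_forest': {
        'n_estimators': {'type': 'int', 'default': 2, 'range': [1, 1000]},
        'max_features': {'type': 'str', 'default': 'log2',
                         'range': [None, 'auto', 'log2', 'sqrt']},
        # Deliberately wrong default, to show what a reported problem looks like.
        'min_samples_split': {'type': 'int', 'default': 1, 'range': [2, 20]},
    },
}

for problem in check_tunables(example):
    print(problem)
# prints: random_forest.min_samples_split: default 1 is outside [2, 20]
```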

# What's next?

For more details about **BTB** and all its possibilities and features, please check the
[project documentation site](https://HDI-Project.github.io/BTB/)!

Also do not forget to have a look at the [notebook tutorials](
https://github.com/HDI-Project/BTB/tree/master/examples/tutorials)!

# Citing BTB

If you use BTB, please consider citing our related papers.

- For the initial design and implementation of BTB (v0.1):

  Laura Gustafson. Bayesian Tuning and Bandits: An Extensible, Open Source Library for AutoML. Masters thesis, MIT EECS, June 2018. [(pdf)](https://dai.lids.mit.edu/wp-content/uploads/2018/05/Laura_MEng_Final.pdf)

  ``` bibtex
  @MastersThesis{Laura:2018,
    title = {Bayesian Tuning and Bandits: An Extensible, Open Source Library for AutoML},
    author = {Laura Gustafson},
    month = {May},
    year = {2018},
    url = {https://dai.lids.mit.edu/wp-content/uploads/2018/05/Laura_MEng_Final.pdf},
    type = {M. Eng Thesis},
    address = {Cambridge, MA},
    school = {Massachusetts Institute of Technology},
  }
  ```

- For recent designs of BTB and its usage as part of the larger *ML Bazaar* project at the MIT Data to AI Lab:

  Micah J. Smith, Carles Sala, James Max Kanter, and Kalyan Veeramachaneni. ["The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development."](https://arxiv.org/abs/1905.08942) arXiv Preprint 1905.08942. 2019.

  ``` bibtex
  @article{smith2019mlbazaar,
    author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
    title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
    journal = {arXiv e-prints},
    year = {2019},
    eid = {arXiv:1905.08942},
    pages = {arXiv:1905.08942},
    archivePrefix = {arXiv},
    eprint = {1905.08942},
  }
  ```


# History

## 0.3.6 - 2020-03-04

This release improves `BTBSession` error handling and allows `Tunables` with cardinality
equal to 1 to be scored with `BTBSession`. It also provides new documentation for
this version of `BTB`.

### Internal Improvements

Improved documentation, unittests and integration tests.

### Resolved Issues

* Issue #164: Improve documentation for `v0.3.5+`.
* Issue #166: Wrong error raised by BTBSession on too many errors.
* Issue #170: Tuner has no scores attribute until record is run once.
* Issue #175: BTBSession crashes when record is not performed.
* Issue #176: BTBSession fails to select a proper Tunable when normalized_scores becomes None.

## 0.3.5 - 2020-01-21

This release improves `BTBSession` by marking as private the attributes that are not intended
to be public or modified by the user, and by improving its documentation.

### Internal Improvements

Improved docstrings, unittests and public interface of `BTBSession`.

### Resolved Issues

* Issue #162: Fix session with the given comments on PR 156.

## 0.3.4 - 2019-12-24

With this release we introduce a `BTBSession` class. This class represents the process of selecting
and tuning several tunables until the best possible configuration for a specific `scorer` is found.
We have also fixed some minor bugs and made improvements around the code (described in the issues
below).

### New Features

* `BTBSession` that makes `BTB` more user friendly.

### Internal Improvements

Improved unittests, removed old dependencies, added more `MLChallenges` and fixed an issue with
the bound methods.

### Resolved Issues

* Issue #145: Implement `BTBSession`.
* Issue #155: Setting the default to `None` for `CategoricalHyperParam` is not possible.
* Issue #157: Metamodel `_MODEL_KWARGS_DEFAULT` becomes mutable.
* Issue #158: Remove `mock` dependency from the package.
* Issue #160: Add more Machine Learning Challenges and more estimators.


## 0.3.3 - 2019-12-11

Fix a bug where creating an instance of `Tuner` ends in an error.

### Internal Improvements

Improve unittests to use `spec_set` in order to detect errors while mocking an object.

### Resolved Issues

* Issue #153: Bug with tuner logger message that prevents creating the Tuner.

## 0.3.2 - 2019-12-10

With this release we add the new `benchmark` challenge `MLChallenge` which allows users to
perform benchmarking over datasets with machine learning estimators, and also some new
features to make the workflow easier.

### New Features

* New `MLChallenge` challenge that allows performing cross-validation over datasets and machine
learning estimators.
* New `from_dict` function for the `Tunable` class, to instantiate it from a dictionary that
contains information about hyperparameters.
* New `default` value for each hyperparameter type.

### Resolved Issues

* Issue #68: Remove `btb.tuning.constants` module.
* Issue #120: Tuner repr not helpful.
* Issue #121: HyperParameter repr not helpful.
* Issue #141: Implement proper logging in the tuning section.
* Issue #150: Implement Tunable `from_dict`.
* Issue #151: Add default value for hyperparameters.
* Issue #152: Support `None` as a choice in `CategoricalHyperParam`.

## 0.3.1 - 2019-11-25

With this release we introduce a `benchmark` module for `BTB` which allows the users to perform
a benchmark over a series of `challenges`.

### New Features

* New `benchmark` module.
* New submodule named `challenges` to work together with the `benchmark` module.

### Resolved Issues

* Issue #139: Implement a Benchmark for BTB

## 0.3.0 - 2019-11-11

With this release we introduce an improved `BTB` that has a major reorganization of the project
with emphasis on an easier way of interacting with `BTB` and an easy way of developing, testing and
contributing new acquisition functions, metamodels, tuners and hyperparameters.

### New project structure

The major reorganization comes with the `btb.tuning` module. This module provides everything
needed for the `tuning` process and comes with three new additions: `Acquisition`, `Metamodel` and
`Tunable`. There is also an update to the `Hyperparameters` and `Tuners`. These changes are meant
to help developers and contributors to easily develop, test and contribute new `Tuners`.

### New API

`BTB` is now used in a slightly different way: the new `Tunable` class is meant to be the only
object required to instantiate a `Tuner`. This `Tunable` class represents a
collection of `HyperParams` that need to be tuned as a whole, at once. To create a
`Tuner`, a `Tunable` instance must first be created with the `hyperparameters` of the
`objective function`.

### New Features

* New `Hyperparameters` that allow an easier interaction for the final user.
* New `Tunable` class that manages a collection of `Hyperparameters`.
* New `Tuner` class that is a Python mixin requiring `Acquisition` and `Metamodel` as
parents. It also now works with a single `Tunable` object.
* New `Acquisition` class, meant to implement an acquisition function to be inherited by a `Tuner`.
* New `Metamodel` class, meant to implement everything that a certain `model` needs and to be
inherited by the `Tuner`.
* Reorganization of the `selection` module to follow a similar `API` to `tuning`.
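
The mixin arrangement described in the list above can be illustrated with a small stand-alone
sketch. None of the classes below belong to BTB: `NearestScoreMetamodel`, `GreedyAcquisition`
and `ToyTuner`, as well as the `_predict` and `_acquire` method names, are hypothetical, and the
"tunable" is reduced to a single inclusive integer range. The point is only to show how a tuner
can inherit its model and its acquisition rule from two separate parents while adding just the
bookkeeping.

```python
import random


class NearestScoreMetamodel:
    """Hypothetical metamodel: predict the score of the closest recorded trial."""

    def _predict(self, candidates):
        def nearest_score(candidate):
            distances = [abs(candidate - trial) for trial in self.trials]
            return self.scores[distances.index(min(distances))]
        return [nearest_score(candidate) for candidate in candidates]


class GreedyAcquisition:
    """Hypothetical acquisition: take the candidate with the best prediction."""

    def _acquire(self, candidates, predictions):
        best_index = max(range(len(candidates)), key=predictions.__getitem__)
        return candidates[best_index]


class ToyTuner(NearestScoreMetamodel, GreedyAcquisition):
    """Tuner assembled from the two mixins; it only adds the bookkeeping."""

    def __init__(self, tunable, seed=0):
        self.tunable = tunable          # here simply an inclusive (low, high) pair
        self.trials, self.scores = [], []
        self._rng = random.Random(seed)

    def propose(self, n_candidates=20):
        low, high = self.tunable
        candidates = [self._rng.randint(low, high) for _ in range(n_candidates)]
        if not self.trials:
            return candidates[0]        # nothing recorded yet: random proposal
        return self._acquire(candidates, self._predict(candidates))

    def record(self, trial, score):
        self.trials.append(trial)
        self.scores.append(score)


tuner = ToyTuner((0, 100))
for _ in range(30):
    x = tuner.propose()
    tuner.record(x, -abs(x - 42))       # hypothetical objective, best at x == 42

print(max(tuner.scores))
```

Swapping either parent for another class with the same method changes the tuner's behaviour
without touching `ToyTuner` itself, which is the kind of extensibility the reorganization aims
for.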

### Resolved Issues

* Issue #131: Reorganize the project structure.
* Issue #133: Implement Tunable class to control a list of hyperparameters.
* Issue #134: Implementation of Tuners for the new structure.
* Issue #140: Reorganize selectors.

## 0.2.5

### Bug Fixes

* Issue #115: HyperParameter subclass instantiation not working properly

## 0.2.4

### Internal Improvements

* Issue #62: Test for `None` in `HyperParameter.cast` instead of `HyperParameter.__init__`

### Bug Fixes

* Issue #98: Categorical hyperparameters do not support `None` as input
* Issue #89: Fix the computation of `avg_rewards` in `BestKReward`

## 0.2.3

### Bug Fixes

* Issue #84: Error in GP tuning when only one parameter is present
* Issue #96: Fix pickling of HyperParameters
* Issue #98: Fix implementation of the GPEi tuner

## 0.2.2

### Internal Improvements

* Updated documentation

### Bug Fixes

* Issue #94: Fix error caused by unicode `param_type` on Python 2.

## 0.2.1

### Bug Fixes

* Issue #74: `ParamTypes.STRING` tunables do not work

## 0.2.0

### New Features

* New Recommendation module
* New HyperParameter types
* Improved documentation and examples
* Fully tested Python 2.7, 3.4, 3.5 and 3.6 compatibility
* HyperParameter copy and deepcopy support
* Replace print statements with logging

### Internal Improvements

* Integrated with Travis-CI
* Exhaustive unit testing
* New implementation of HyperParameter
* Tuner builds a grid of real values instead of indices
* Resolve Issue #29: Make args explicit in `__init__` methods
* Resolve Issue #34: make all imports explicit

### Bug Fixes

* Fix error from mixing string/numerical hyperparameters
* Inverse transform for categorical hyperparameter returns single item

## 0.1.2

* Issue #47: Add missing requirements in v0.1.1 setup.py
* Issue #46: Error on v0.1.1: 'GP' object has no attribute 'X'

## 0.1.1

* First release.


