Metadata-Version: 2.4
Name: Mighty-RL
Version: 1.0.0
Summary: A modular, meta-learning-ready RL library.
Author-email: "AutoRL@LUHAI" <a.mohan@ai.uni-hannover.de>
License: 
        
        BSD License
        
        Copyright (c) 2023, AutoML Hannover
        All rights reserved.
        
        Redistribution and use in source and binary forms, with or without modification,
        are permitted provided that the following conditions are met:
        
        * Redistributions of source code must retain the above copyright notice, this
          list of conditions and the following disclaimer.
        
        * Redistributions in binary form must reproduce the above copyright notice, this
          list of conditions and the following disclaimer in the documentation and/or
          other materials provided with the distribution.
        
        * Neither the name of the copyright holder nor the names of its
          contributors may be used to endorse or promote products derived from this
          software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
        ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
        WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
        IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
        INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
        BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
        DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
        OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
        OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
        OF THE POSSIBILITY OF SUCH DAMAGE.
        
        
Keywords: Reinforcement Learning,MetaRL,Generalization in RL
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Utilities
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: BSD License
Requires-Python: <3.12,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<2
Requires-Dist: gymnasium
Requires-Dist: matplotlib~=3.4
Requires-Dist: seaborn~=0.11
Requires-Dist: tensorboard
Requires-Dist: hydra-core~=1.2
Requires-Dist: hydra-colorlog~=1.2
Requires-Dist: hydra-submitit-launcher~=1.2
Requires-Dist: pandas
Requires-Dist: scipy
Requires-Dist: rich~=12.4
Requires-Dist: wandb~=0.12
Requires-Dist: torch~=2.7
Requires-Dist: dill
Requires-Dist: imageio
Requires-Dist: evosax==0.1.6
Requires-Dist: rliable
Requires-Dist: seaborn
Requires-Dist: uniplot
Requires-Dist: jax~=0.7
Requires-Dist: flax~=0.11
Provides-Extra: dev
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Provides-Extra: carl
Requires-Dist: carl_bench[brax]==1.1.1; extra == "carl"
Provides-Extra: dacbench
Requires-Dist: dacbench==0.4.0; extra == "dacbench"
Requires-Dist: torchvision; extra == "dacbench"
Provides-Extra: pufferlib
Requires-Dist: pufferlib==2.0.6; extra == "pufferlib"
Provides-Extra: docs
Requires-Dist: mkdocs; extra == "docs"
Requires-Dist: mkdocs-material; extra == "docs"
Requires-Dist: mkdocs-autorefs; extra == "docs"
Requires-Dist: mkdocs-gen-files; extra == "docs"
Requires-Dist: mkdocs-literate-nav; extra == "docs"
Requires-Dist: mkdocs-glightbox; extra == "docs"
Requires-Dist: mkdocs-glossary-plugin; extra == "docs"
Requires-Dist: mkdocstrings[python]; extra == "docs"
Requires-Dist: markdown-exec[ansi]; extra == "docs"
Requires-Dist: mike; extra == "docs"
Requires-Dist: mkdocs-mermaid2-plugin; extra == "docs"
Provides-Extra: examples
Dynamic: license-file

<p align="center">
    <a href="./docs/img/logo.png">
        <img src="./docs/img/logo_with_font.svg" alt="Mighty Logo" width="80%"/>
    </a>
</p>

<div align="center">
    
[![PyPI Version](https://img.shields.io/pypi/v/mighty-rl.svg)](https://pypi.org/project/Mighty-RL/)
![Python](https://img.shields.io/badge/Python-3.11-3776AB)
![License](https://img.shields.io/badge/License-BSD3-orange)
[![Test](https://github.com/automl/Mighty/actions/workflows/test.yaml/badge.svg)](https://github.com/automl/Mighty/actions/workflows/test.yaml)
[![Doc Status](https://github.com/automl/Mighty/actions/workflows/docs_test.yaml/badge.svg)](https://github.com/automl/Mighty/actions/workflows/docs_test.yaml)
    
</div>

<div align="center">
    <h3>
      <a href="#installation">Installation</a> |
      <a href="https://automl.github.io/Mighty/">Documentation</a> |
      <a href="#run-a-mighty-agent">Run a Mighty Agent</a> |
      <a href="#cite-us">Cite Us</a>
    </h3>
</div>

---

# Mighty

Welcome to Mighty, hopefully your future one-stop shop for everything cRL.
Currently Mighty is still in its early stages with support for normal gym envs, DACBench and CARL.
The interface is controlled through hydra and we provide DQN, PPO and SAC algorithms.
We log training and regular evaluations to file and optionally also to wandb.
If you have any questions or feedback, please tell us, ideally via the GitHub issues!
If you want to get started immediately, use our [Template repository](https://github.com/automl/mighty_project_template).

Mighty features:
- Modular structure for easy (Meta-)RL tinkering
- PPO, SAC and DQN as base algorithms
- Environment integrations via Gymnasium, Pufferlib, CARL & DACBench
- Implementations of some important baselines: RND, PLR, Cosine LR Schedule and more!

## Installation
We recommend to using uv to install and run Mighty in a virtual environment.
The code has been tested with python 3.11 on Unix systems.

First create a clean python environment:

```bash
uv venv --python=3.11
source .venv/bin/activate
```

Then  install Mighty:

```bash
make install
```

Optionally you can install the dev requirements directly:
```bash
make install-dev
```

Alternatively, you can install Mighty from PyPI:
```bash
pip install mighty-rl
```

## Run a Mighty Agent
In order to run a Mighty Agent, use the run_mighty.py script and provide any training options as keywords.
If you want to know more about the configuration options, call:
```bash
python mighty/run_mighty.py --help
```

An example for running the PPO agent on the Pendulum gym environment looks like this:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'environment=gymnasium/pendulum'
```

### Train your Agent on a CARL Environment
Mighty is designed with contextual RL in mind and therefore fully compatible with CARL.
Before you start training, however, please follow the installation instructions in the [CARL repo](https://github.com/automl/CARL).

Then use the same command as before, but provide the CARL environment, in this example CARLCartPoleEnv,
and information about the context distribution as keywords:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'env=CARLCartPole' '+env_kwargs.num_contexts=10' '+env_kwargs.context_feature_args.gravity=[normal, 9.8, 1.0, -100.0, 100.0]' 'env_wrappers=[mighty.mighty_utils.wrappers.FlattenVecObs]' 'algorithm_kwargs.rollout_buffer_kwargs.buffer_size=2048'
```

For more complex configurations like this, we recommend making an environment configuration file. Check out our [CARL Ant](mighty/configs/environment/carl_walkers/ant_goals.yaml) file to see how this simplifies the process of working with configurable environments.

### Learning a Configuration Policy via DAC

In order to use Mighty with DACBench, you need to install DACBench first.
We recommend following the instructions in the [DACBench repo](https://github.com/automl/DACBench).

Afterwards, configure the benchmark you want to run. Since most DACBench benchmarks have Dict action and observation spaces, some fairly complex,  you might need to wrap DACBenchmarks in order to translate the observations and actions to an easy-to-handle format. We have a version of the FunctionApproximationBenchmark configured for you so you can get started like this:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'environment=dacbench/function_approximation'
```
The matching [configuration file](mighty/configs/environment/dacbench/function_approximation.yaml) shows you how to set the search spaces and benchmark type. Refer to DACBench itself to learn how to configure other elements like observations spaces or instance sets.

### Optimize Hyperparameters
You can optimize the hyperparameters of your algorithm with the [Hypersweeper](https://github.com/automl/hypersweeper) package, e.g. using [SMAC3](https://github.com/automl/SMAC3). Mighty is directly compatible with Hypersweeper and thus smart and distributed HPO! There are also other HPO options, check out our [examples](examples/README.md) for more information.

## Build Your Own Mighty Project
If you want to implement your own method in Mighty, we recommend using the [Mighty template repository](https://github.com/automl/mighty_project_template) as a base. It contains a runscript, the most relevant config files and basic scripts for plotting. Our [domain randomization example](https://github.com/automl/mighty_dr_example) shows that you can get started right away. Since Mighty has many options of how to implement your idea, here's a rough guide which Mighty class you want to look at:

```mermaid
stateDiagram
  direction TB
  classDef Neutral stroke-width:1px,stroke-dasharray:none,stroke:#000000,fill:#FFFFFF,color:#000000;
  classDef Peach stroke-width:1px,stroke-dasharray:none,stroke:#FBB35A,fill:#FFEFDB,color:#8F632D;
  classDef Aqua stroke-width:1px,stroke-dasharray:none,stroke:#46EDC8,fill:#DEFFF8,color:#378E7A;
  classDef Sky stroke-width:1px,stroke-dasharray:none,stroke:#374D7C,fill:#E2EBFF,color:#374D7C;
  classDef Pine stroke-width:1px,stroke-dasharray:none,stroke:#254336,fill:#8faea5,color:#FFFFFF;
  classDef Rose stroke-width:1px,stroke-dasharray:none,stroke:#FF5978,fill:#FFDFE5,color:#8E2236;
  classDef Ash stroke-width:1px,stroke-dasharray:none,stroke:#999999,fill:#EEEEEE,color:#000000;
  classDef Seven fill:#E1BEE7,color:#D50000,stroke:#AA00FF;
  Still --> root_end:Yes
  Still --> Moving:No
  Moving --> Crash:Yes
  Moving --> s2:No, only current transitions, env and network
  s2 --> s6:Action Sampling
  s2 --> s10:Policy Update
  s2 --> s8:Training Batch Sampling
  s2 --> Crash:More than one/not listed
  s2 --> s12:Direct algorithm change
  s12 --> s13:Yes
  s12 --> s14:No
  Still:Modify training settings and then repeated runs?
  root_end:Runner
  Moving:Access to update infos (gradients, batches, etc.)?
  Crash:Meta Component
  s2:Which interaction point with the algorithm?
  s6:Exploration Policy
  s10:Update
  s8:Buffer
  s12:Change only the model architecture?
  s13:Network and/or Model
  s14:Agent
  class root_end Peach
  class Crash Aqua
  class s6 Sky
  class s8 Pine
  class s10 Rose
  class s13 Ash
  class s14 Seven
  class Still Neutral
  class Moving Neutral
  class s2 Neutral
  class s12 Neutral
  style root_end color:none
  style s8 color:#FFFFFF
```

## Pre-Implemented Methods
Mighty is meant to be a platform to build upon and not a large collection of methods in itself. We have a few relevant methods pre-implemented, however, and this collection will likely grow over time:

- **Agents**: SAC, PPO, DQN
- **Updates**: SAC, PPO, Q-learning, double Q-learning, clipped double Q-learning
- **Buffer**s: Rollout Buffer, Replay Buffer, Prioritized Replay Buffer
- **Exploration Policies**: e-greedy (with and without decay), ez-greedy, standard stochastic
- **Models** (with MLP, CNN or ResNet backbone): SAC, PPO, DQN (with soft and hard reset options)
- **Meta Components**: RND, NovelD, SPaCE, PLR
- **Runners**: online RL runner, ES runner

## Cite Us

If you use Mighty in your work, please cite us:

```bibtex
@misc{mohaneimer24,
  author    = {A. Mohan and T. Eimer and C. Benjamins and M. Lindauer and A. Biedenkapp},
  title     = {Mighty},
  year      = {2024},
  url = {https://github.com/automl/mighty}
}
```
