Metadata-Version: 2.1
Name: af-analysis
Version: 0.1.1
Summary: `AF analysis` is a Python library for analyzing AlphaFold results.
Author-email: Samuel Murail <samuel.murail@u-paris.fr>
License: GPL-2.0
Project-URL: Homepage, https://github.com/samuelmurail/af_analysis
Keywords: AlphaFold2,ColabFold,Python,af_analysis
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v2 (GPLv2)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pdb_numpy>=0.0.11
Requires-Dist: pandas>=1.3.4
Requires-Dist: numpy>=1.21
Requires-Dist: tqdm>=4.0
Requires-Dist: seaborn>=0.11
Requires-Dist: cmcrameri>=1.7
Requires-Dist: nglview>=3.0
Requires-Dist: ipywidgets>=7.6
Requires-Dist: mdanalysis>=2.4
Requires-Dist: scikit-learn

[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)

# About AlphaFold Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="200" style="display: block; margin: auto;"/>


`af-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
It is designed to simplify and streamline work with protein structures
generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3], and their derivatives such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M],
and [AlphaPulldown][AlphaPulldown].

* Source code repository:
   [https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
Analyzing the abundance of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features

* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
    * pDockQ
    * pDockQ2
    * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.

## Installation

- `af-analysis` is available on PyPI and can be installed using `pip`:

```bash
pip install af_analysis
```

- You can install the latest version directly from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```

- `af-analysis` can also be installed from a local clone of the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```

## Documentation

The full documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).


## Usage


### Importing data

Create a `Data` object by giving it the path of the directory that contains the results of the AlphaFold 2 or ColabFold run.

```python
import af_analysis
my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```

Extracted data are available in the `df` attribute of the `Data` object. 

```python
my_data.df
```
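
Because `df` is a regular pandas DataFrame, custom metrics and model selection can be expressed with ordinary pandas operations. The sketch below uses a small stand-in DataFrame rather than real `af-analysis` output: the values, the `pdockq` column name, the `combined_score` metric and the `0.5` threshold are all illustrative assumptions.

```python
import pandas as pd

# Stand-in for `my_data.df`. The values and the `pdockq` column name are
# illustrative assumptions, not real af-analysis output.
df = pd.DataFrame(
    {
        "ranking_confidence": [0.62, 0.81, 0.47],
        "pdockq": [0.35, 0.58, 0.12],
    }
)

# Hypothetical custom metric: the mean of the two score columns.
df["combined_score"] = df[["ranking_confidence", "pdockq"]].mean(axis=1)

# Keep models above a hypothetical confidence threshold, best model first.
selected = df[df["ranking_confidence"] > 0.5].sort_values(
    "combined_score", ascending=False
)
best_index = selected.index[0]
```

With a real `Data` object, the same operations would presumably be applied to `my_data.df` directly.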

### Analysis

- The `analysis` module contains several functions that add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:

```python
from af_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
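
For intuition, pDockQ as described by [Bryant et al.][pdockq] is a sigmoid fitted to the product of the mean interface pLDDT and the logarithm of the number of interface contacts. The following standalone sketch illustrates that formula only; it is not the `af-analysis` implementation. The constants are the fitted values as recalled from the paper, and the base-10 logarithm and the zero score for an empty interface are assumptions to check against the reference.

```python
import math

# Sigmoid parameters as recalled from Bryant et al. (2022); treat them as
# assumptions and verify them against the paper before relying on them.
PDOCKQ_L = 0.724
PDOCKQ_X0 = 152.611
PDOCKQ_K = 0.052
PDOCKQ_B = 0.018


def pdockq_sketch(mean_interface_plddt: float, n_contacts: int) -> float:
    """Return a pDockQ-style score from two interface summary statistics.

    `mean_interface_plddt` is the average pLDDT (0-100) of interface residues
    and `n_contacts` is the number of inter-chain contacts.
    """
    if n_contacts <= 0:
        # Assumed convention: no interface contacts gives a score of 0.
        return 0.0
    # Assumption: the logarithm is base 10.
    x = mean_interface_plddt * math.log10(n_contacts)
    return PDOCKQ_L / (1.0 + math.exp(-PDOCKQ_K * (x - PDOCKQ_X0))) + PDOCKQ_B


low = pdockq_sketch(mean_interface_plddt=50.0, n_contacts=10)
high = pdockq_sketch(mean_interface_plddt=90.0, n_contacts=100)
```

Under these assumptions the score increases with both inputs and stays between `PDOCKQ_B` and `PDOCKQ_L + PDOCKQ_B` whenever at least one contact is present.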

### Docking Analysis

- The `docking` module contains several functions that add metrics such as the [LIS score][LIS]:

```python
from af_analysis import docking
docking.LIS_pep(my_data)
```
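
As a rough illustration of the idea behind the LIS score, the sketch below rescales the PAE values of an inter-chain block that fall under a cutoff to the range 0 to 1 and averages them. It assumes the definition given by [Kim et al.][LIS] with a 12 Å cutoff; the function name and the toy PAE block are hypothetical, and this is not the code used by `docking.LIS_pep`.

```python
def local_interaction_score_sketch(pae_block, cutoff=12.0):
    """Average the rescaled PAE values of an inter-chain block.

    `pae_block` is a list of rows holding the PAE values (in angstroms)
    between residues of one chain and residues of another chain.
    Values at or above `cutoff` are ignored; the rest are mapped so that
    a PAE of 0 gives 1.0 and a PAE approaching `cutoff` gives 0.0.
    """
    rescaled = [
        1.0 - pae / cutoff
        for row in pae_block
        for pae in row
        if pae < cutoff
    ]
    # Assumed convention: no value under the cutoff gives a score of 0.
    return sum(rescaled) / len(rescaled) if rescaled else 0.0


# Toy 2x2 inter-chain PAE block (illustrative values only).
toy_block = [[0.0, 6.0], [12.0, 30.0]]
lis = local_interaction_score_sketch(toy_block)
```

In the toy block only the values `0.0` and `6.0` are under the cutoff, so they are the only ones that contribute to the average.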

### Plots


- As a first step, you can visualize the pLDDT, the PAE matrix, and the model scores. The `show_info()` function displays the model scores together with the pLDDT plot and the PAE matrix in an interactive way.

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>


- Plot the MSA, pLDDT, and PAE:

```python
my_data.plot_msa()
my_data.plot_plddt([0,1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```

- Show the 3D structure (requires the `nglview` package):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```

## Dependencies

`af_analysis` requires the following dependencies:

- `pdb_numpy`
- `pandas`
- `numpy`
- `tqdm`
- `seaborn`
- `cmcrameri`
- `nglview`
- `ipywidgets`
- `mdanalysis`
- `scikit-learn`


## Contributing

`af-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.


## Authors

* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/).

See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License version 2 - see the `LICENSE` file for details.


# References

- Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
- Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
- Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
- Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
- Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
- Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
- Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
- Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]


[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"

