Metadata-Version: 2.1
Name: biolearns
Version: 0.0.18
Summary: BioLearns: Computational Biology and Bioinformatics Toolbox in Python
Home-page: http://biolearns.com
Author: Zhi Huang
Author-email: huang898@purdue.edu
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# biolearns
BioLearns: Computational Biology and Bioinformatics Toolbox in Python http://biolearns.com

<div style="text-align:center"><img src="http://biolearns.com/img/logo.png" width=300/></div>


## Installation

* From PyPI

```bash
pip install biolearns
```

## Documentation and Tutorials

* Three examples are shown below. For the full list of tutorials, see our GitHub wiki page:

    [Wiki](https://github.com/huangzhii/biolearns/wiki)




### 1. Read TCGA Data

#### Example: Read TCGA Breast invasive carcinoma (BRCA) data

Data is downloaded directly from https://gdac.broadinstitute.org/.
The results here are in whole or part based upon data generated by 
the TCGA Research Network: https://www.cancer.gov/tcga.

```python
from biolearns.dataset.TCGA import TCGACancer
```

```python
brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical
```

#### TCGA cancer abbreviations (partial list):

| #            | Barcode            | Cancer full name         | Version            |
|---|---|---|---|
| 1      |  ACC          |  Adrenocortical carcinoma     | 2016_01_28 |
| 2      |  BLCA         |  Bladder urothelial carcinoma         | 2016_01_28 |
| 3      |  BRCA         |  Breast invasive carcinoma    | 2016_01_28 |
| 4      |  CESC         |  Cervical and endocervical cancers    | 2016_01_28 |
| 5      |  CHOL         |  Cholangiocarcinoma   | 2016_01_28 |
| 6      |  COAD         |  Colon adenocarcinoma         | 2016_01_28 |
| 7      |  COADREAD     |  Colorectal adenocarcinoma    | 2016_01_28 |
| 8      |  DLBC         |  Lymphoid Neoplasm Diffuse Large B-cell Lymphoma      | 2016_01_28 |
| 9      |  ESCA         |  Esophageal carcinoma         | 2016_01_28 |
| ...     |  ...         |  ...          | ... |


### 2. Gene Co-expression Analysis

First, download and load the mRNAseq data.
```python
from biolearns.dataset.TCGA import TCGACancer

brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq
```

mRNAseq data is noisy. We first filter out the 50% of genes with the lowest mean expression, and then filter out the 50% of the remaining genes with the lowest variance.

```python
from biolearns.preprocessing.filter import expression_filter
mRNAseq = expression_filter(mRNAseq, meanq = 0.5, varq = 0.5)
```
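
To make the two-step filter concrete, here is a minimal sketch on toy data. It is not the biolearns implementation: `quantile_filter` is a hypothetical stand-in, and it assumes genes are rows, samples are columns, and genes exactly at a cutoff are kept.

```python
import numpy as np
import pandas as pd


def quantile_filter(expr, meanq=0.5, varq=0.5):
    """Hypothetical stand-in for a mean/variance quantile filter.

    Assumes `expr` has genes as rows and samples as columns.
    """
    # Step 1: drop genes whose mean expression is below the meanq quantile.
    gene_means = expr.mean(axis=1)
    kept = expr.loc[gene_means >= gene_means.quantile(meanq)]
    # Step 2: among the remaining genes, drop those below the varq variance quantile.
    gene_vars = kept.var(axis=1)
    return kept.loc[gene_vars >= gene_vars.quantile(varq)]


# Toy expression matrix: 8 genes x 6 samples of random positive values.
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.gamma(2.0, 50.0, size=(8, 6)),
    index=[f"gene{i}" for i in range(8)],
    columns=[f"sample{j}" for j in range(6)],
)
filtered = quantile_filter(toy, meanq=0.5, varq=0.5)
print(filtered.shape)
```

With both quantiles at 0.5, roughly a quarter of the original genes survive the two steps.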

We then use the lmQCM class to create an lmQCM object ```lobj```.

The gene co-expression analysis is performed by simply calling the ```fit()``` function.

```python
from biolearns.coexpression.lmQCM import lmQCM

lobj = lmQCM(mRNAseq)
clusters, genes, eigengene_mat = lobj.fit()
```
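
The source does not describe how `eigengene_mat` is computed. A common definition of a module eigengene is the first principal component of a module's expression across samples, and the sketch below illustrates that definition on toy data. The helper `module_eigengene` is hypothetical and may differ from what `fit()` returns.

```python
import numpy as np


def module_eigengene(module_expr):
    """Hypothetical helper: first right singular vector of a genes x samples block."""
    # Center each gene across samples so the SVD captures co-variation, not level.
    centered = module_expr - module_expr.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # vt[0] has one value per sample and unit length.
    return vt[0]


# Toy module: three genes that follow one shared signal with small noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=10)
module = np.vstack(
    [w * signal + rng.normal(scale=0.1, size=10) for w in (1.0, 2.0, 0.5)]
)
eigengene = module_eigengene(module)
# The sign of a singular vector is arbitrary, so compare by absolute correlation.
corr = abs(np.corrcoef(eigengene, signal)[0, 1])
```

The eigengene gives one summary value per sample, which is what makes it convenient as a per-module feature in downstream analyses.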

### 3. Univariate survival analysis

First, download and load the mRNAseq data, using breast cancer as an example.
```python
from biolearns.dataset.TCGA import TCGACancer

brca = TCGACancer('BRCA')
mRNAseq = brca.mRNAseq
```

We import `logranktest` from the `survival` subpackage and choose the gene "ABLIM3" as the univariate input.
```python
from biolearns.survival import logranktest

r = mRNAseq.loc['ABLIM3',].values
```

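We find the samples shared by the univariate, time, and event data, and align them by patient barcode:
```python
import numpy as np

clinical = brca.clinical

# The first 12 characters of a TCGA barcode identify the patient.
bcd_m = np.array([b[:12] for b in mRNAseq.columns])
bcd_p = np.array([b[:12] for b in clinical.index])
# idx_m / idx_p hold the position (first occurrence) of each shared barcode.
bcd, idx_m, idx_p = np.intersect1d(bcd_m, bcd_p, return_indices=True)

r = r[idx_m]
t = brca.overall_survival_time[idx_p]
e = brca.overall_survival_event[idx_p]
```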
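
One way to align samples by patient barcode is `np.intersect1d` with `return_indices=True`, which returns the position of each shared barcode in both inputs. The sketch below shows this on toy data; the barcodes and values are made up for illustration.

```python
import numpy as np

# Made-up sample barcodes (expression columns) and patient barcodes (clinical rows).
sample_barcodes = ["TCGA-AA-0001-01A", "TCGA-AA-0003-01A", "TCGA-AA-0002-01A"]
patient_barcodes = ["TCGA-AA-0002", "TCGA-AA-0004", "TCGA-AA-0001"]
expression = np.array([5.0, 7.0, 9.0])         # one value per sample barcode
survival_time = np.array([200.0, 400.0, 100.0])  # one value per patient barcode

# Truncate to the 12-character patient-level barcode before matching.
bcd_m = np.array([b[:12] for b in sample_barcodes])
bcd_p = np.array([b[:12] for b in patient_barcodes])

# shared is sorted; idx_m and idx_p index into bcd_m and bcd_p respectively.
shared, idx_m, idx_p = np.intersect1d(bcd_m, bcd_p, return_indices=True)

r_toy = expression[idx_m]     # expression for the shared patients, in shared order
t_toy = survival_time[idx_p]  # survival time for the same patients, same order
```

Because both index arrays follow the order of `shared`, element `i` of `r_toy` and `t_toy` refers to the same patient.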

We perform the log-rank test:

```python
logrank_results, fig = logranktest(r[~np.isnan(t)], t[~np.isnan(t)], e[~np.isnan(t)])
test_statistic, p_value = logrank_results.test_statistic, logrank_results.p_value
```
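
For readers who want to see what a two-group log-rank test computes, here is a minimal NumPy sketch. It is not the code behind `logranktest`: the function `logrank_two_groups` is hypothetical, and splitting the expression values at their median into "high" and "low" groups is an assumption made for illustration.

```python
import math

import numpy as np


def logrank_two_groups(time, event, group):
    """Hypothetical two-group log-rank test; returns (chi-square statistic, p-value)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp, variance = 0.0, 0.0
    for tj in np.unique(time[event]):          # each distinct event time
        at_risk = time >= tj
        n, n1 = at_risk.sum(), (at_risk & group).sum()
        died = event & (time == tj)
        d, d1 = died.sum(), (died & group).sum()
        obs_minus_exp += d1 - d * n1 / n       # observed minus expected events in group 1
        if n > 1:                              # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2))


# Toy data: higher expression goes with shorter survival; all events observed.
expr = np.array([9.0, 8.5, 8.0, 7.5, 3.0, 2.5, 2.0, 1.5])
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
event = np.ones(8, dtype=bool)
high = expr > np.median(expr)                  # assumed median split
stat, p = logrank_two_groups(time, event, high)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

A small p-value indicates that the survival curves of the two groups differ more than would be expected by chance.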

The output figure looks like:

<div style="text-align:center"><img src="https://github.com/huangzhii/biolearns/blob/master/figures/survival_plot_BRCA_ABLIM3.png?raw=true" width=600/></div>


