Metadata-Version: 2.1
Name: awehflow
Version: 1.10.2.1
Summary: Configuration based Apache Airflow
Home-page: UNKNOWN
Author: Philip Perold
Author-email: philip@spatialedge.co.za
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: aiohttp (==3.6.2)
Requires-Dist: alembic (==0.8.10)
Requires-Dist: async-timeout (==3.0.1)
Requires-Dist: attrs (==19.3.0)
Requires-Dist: Babel (==2.6.0)
Requires-Dist: bleach (==2.1.4)
Requires-Dist: cachetools (==3.1.0)
Requires-Dist: certifi (==2019.3.9)
Requires-Dist: chardet (==3.0.4)
Requires-Dist: click (==6.7)
Requires-Dist: colorama (==0.4.1)
Requires-Dist: configparser (==3.5.3)
Requires-Dist: coverage (==5.0.4)
Requires-Dist: croniter (==0.3.30)
Requires-Dist: defusedxml (==0.6.0)
Requires-Dist: dill (==0.2.9)
Requires-Dist: docutils (==0.14)
Requires-Dist: dotty-dict (==1.2.1)
Requires-Dist: enum34 (==1.1.6)
Requires-Dist: Flask (==0.12.4)
Requires-Dist: Flask-Admin (==1.5.2)
Requires-Dist: Flask-AppBuilder (==1.12.1)
Requires-Dist: Flask-Babel (==0.12.2)
Requires-Dist: Flask-Caching (==1.3.3)
Requires-Dist: Flask-Login (==0.4.1)
Requires-Dist: Flask-OpenID (==1.2.5)
Requires-Dist: Flask-SQLAlchemy (==2.4.0)
Requires-Dist: flask-swagger (==0.2.13)
Requires-Dist: Flask-WTF (==0.14.2)
Requires-Dist: funcsigs (==1.0.0)
Requires-Dist: future (==0.16.0)
Requires-Dist: gitdb (==4.0.5)
Requires-Dist: GitPython (==2.1.11)
Requires-Dist: google-api-core (==1.8.1)
Requires-Dist: google-api-python-client (==1.7.8)
Requires-Dist: google-auth (==1.6.3)
Requires-Dist: google-auth-httplib2 (==0.0.3)
Requires-Dist: google-auth-oauthlib (==0.3.0)
Requires-Dist: google-cloud-bigquery (==1.11.2)
Requires-Dist: google-cloud-core (==0.28.1)
Requires-Dist: google-resumable-media (==0.3.2)
Requires-Dist: googleapis-common-protos (==1.5.8)
Requires-Dist: gunicorn (==19.9.0)
Requires-Dist: html5lib (==1.0.1)
Requires-Dist: httplib2 (==0.12.0)
Requires-Dist: idna (==2.8)
Requires-Dist: iso8601 (==0.1.12)
Requires-Dist: itsdangerous (==1.1.0)
Requires-Dist: Jinja2 (==2.10)
Requires-Dist: json-merge-patch (==0.2)
Requires-Dist: lockfile (==0.12.2)
Requires-Dist: lxml (==4.5.1)
Requires-Dist: Mako (==1.0.9)
Requires-Dist: Markdown (==2.6.11)
Requires-Dist: MarkupSafe (==1.1.1)
Requires-Dist: monotonic (==1.5)
Requires-Dist: multidict (==4.7.6)
Requires-Dist: natsort (==7.0.1)
Requires-Dist: numpy (==1.16.0)
Requires-Dist: oauthlib (==3.0.1)
Requires-Dist: ordereddict (==1.1)
Requires-Dist: pandas (==0.24.2)
Requires-Dist: pandas-gbq (==0.10.0)
Requires-Dist: pendulum (==1.4.4)
Requires-Dist: protobuf (==3.7.0)
Requires-Dist: psutil (==5.6.2)
Requires-Dist: psycopg2 (==2.8.2)
Requires-Dist: pyasn1 (==0.4.5)
Requires-Dist: pyasn1-modules (==0.2.4)
Requires-Dist: pydata-google-auth (==0.1.3)
Requires-Dist: Pygments (==2.4.0)
Requires-Dist: python-daemon (==2.1.2)
Requires-Dist: python-dateutil (==2.7.5)
Requires-Dist: python-editor (==1.0.4)
Requires-Dist: python-nvd3 (==0.15.0)
Requires-Dist: python-slugify (==3.0.2)
Requires-Dist: python3-openid (==3.1.0)
Requires-Dist: pytz (==2019.1)
Requires-Dist: pytzdata (==2019.1)
Requires-Dist: PyYAML (==3.13)
Requires-Dist: requests (==2.21.0)
Requires-Dist: requests-oauthlib (==1.2.0)
Requires-Dist: rsa (==4.0)
Requires-Dist: setproctitle (==1.1.10)
Requires-Dist: setuptools-scm (==3.3.3)
Requires-Dist: six (==1.12.0)
Requires-Dist: slackclient (==2.5.0)
Requires-Dist: smmap2 (==2.0.5)
Requires-Dist: SQLAlchemy (==1.2.19)
Requires-Dist: tabulate (==0.8.2)
Requires-Dist: tenacity (==4.8.0)
Requires-Dist: text-unidecode (==1.2)
Requires-Dist: thrift (==0.11.0)
Requires-Dist: tzlocal (==1.5.1)
Requires-Dist: unicodecsv (==0.14.1)
Requires-Dist: uritemplate (==3.0.0)
Requires-Dist: urllib3 (==1.24.1)
Requires-Dist: webencodings (==0.5.1)
Requires-Dist: Werkzeug (==0.14.1)
Requires-Dist: WTForms (==2.2.1)
Requires-Dist: yarl (==1.4.2)
Requires-Dist: zope.deprecation (==4.4.0)
Provides-Extra: composer
Requires-Dist: apache-airflow (==1.10.2-composer) ; extra == 'composer'
Provides-Extra: default
Requires-Dist: apache-airflow (==1.10.2) ; extra == 'default'

# Awehflow

![coverage report](https://gitlab.com/spatialedge/awehflow/badges/master/coverage.svg)
![pipeline status](https://gitlab.com/spatialedge/awehflow/badges/master/pipeline.svg)

Configuration-based Airflow pipelines with metric logging and alerting out of the box.

## Prerequisites

You will need the following to run this code:
  * Python 3.6 or later

## Installation

```
pip install awehflow[default]
```

If you are installing on Google Cloud Composer with Airflow 1.10.2:

```
pip install awehflow[composer]
```

## Usage

Usage of `awehflow` can be broken up into two parts: bootstrapping and configuration of pipelines.

### Bootstrap

In order to expose the generated pipelines (`airflow` _DAGs_) for `airflow` to pick up when scanning for _DAGs_, one has to create a `DagLoader` that points to a folder where the pipeline configuration files will be located:

```python
import os

from awehflow.alerts.slack import SlackAlerter
from awehflow.core import DagLoader
from awehflow.events.postgres import PostgresMetricsEventHandler

"""airflow doesn't pick up DAGs in a file unless
the words 'airflow' and 'DAG' appear in it"""

configs_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'configs')

metrics_handler = PostgresMetricsEventHandler(jobs_table='jobs', task_metrics_table='task_metrics')

slack_alerter = SlackAlerter(channel='#airflow')

loader = DagLoader(
    project="awehflow-demo",
    configs_path=configs_path,
    event_handlers=[metrics_handler],
    alerters=[slack_alerter]
)

dags = loader.load(global_symbol_table=globals())
```

As seen in the code snippet, one can also pass in _"event handlers"_ and _"alerters"_ to run code when pipeline events occur and, optionally, to notify the user about them on a given channel. See the sections below for more detail.
The global symbol table needs to be passed to the `loader` since `airflow` scans it for objects of type `DAG`, and then synchronises the state with its own internal state store.

_Caveat_: `airflow` ignores `python` files that don't contain both of the words _"airflow"_ and _"DAG"_. It is thus advised to put those words in a comment to ensure the generated _DAGs_ get picked up when the `DagBag` is filled.
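
To check whether a bootstrap file would pass this filter, the heuristic can be approximated as below. This is a minimal sketch, not `airflow`'s own code: the function name `might_contain_dag` and the two temporary files are made up for illustration, and the check is assumed to be a case-sensitive substring match on the raw file contents.

```python
import tempfile
from pathlib import Path


def might_contain_dag(path):
    """Return True if the file contains both marker strings (case-sensitive)."""
    content = Path(path).read_bytes()
    return all(token in content for token in (b"DAG", b"airflow"))


with tempfile.TemporaryDirectory() as tmp:
    # A bootstrap file without the marker words: 'awehflow' and 'DagLoader'
    # do not contain 'airflow' or 'DAG' as substrings.
    plain = Path(tmp) / "plain.py"
    plain.write_text("from awehflow.core import DagLoader\n")

    # The same file with the marker words added in a comment.
    hinted = Path(tmp) / "hinted.py"
    hinted.write_text("# airflow DAG\n" + plain.read_text())

    results = {p.name: might_contain_dag(p) for p in (plain, hinted)}

print(results)
```

Note that importing only from `awehflow` does not satisfy the check, which is why the comment in the bootstrap file matters.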

#### Event Handlers

As a pipeline generated using `awehflow` is running, certain events get emitted. An event handler gives the user the option of running code when these events occur.

The following events are (potentially) emitted as a pipeline runs:

* `start`
* `success`
* `failure`
* `task_metric`

Existing event handlers include:

* `PostgresMetricsEventHandler`: persists pipeline metrics to a Postgres database
* `PublishToGooglePubSubEventHandler`: events get passed straight to a Google Pub/Sub topic

An `AlertsEventHandler` gets automatically added to a pipeline. Events get passed along to registered alerters.
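
The idea of running code when an event occurs can be sketched as follows. This is an illustration only and does not use the `awehflow` API: the class `CountingEventHandler`, its `handle` method, and the dictionary shape of an event are all assumptions made for this example. Only the four event names come from the list above.

```python
from collections import Counter

# Event names as listed above.
EVENT_NAMES = ("start", "success", "failure", "task_metric")


class CountingEventHandler:
    """Hypothetical handler that tallies the events it receives by name."""

    def __init__(self):
        self.counts = Counter()

    def handle(self, event):
        # Assumption: an event is a dict carrying its name under the 'name' key.
        name = event["name"]
        if name not in EVENT_NAMES:
            raise ValueError("unknown event: {}".format(name))
        self.counts[name] += 1


handler = CountingEventHandler()
for event in ({"name": "start"}, {"name": "task_metric"},
              {"name": "task_metric"}, {"name": "success"}):
    handler.handle(event)

print(dict(handler.counts))
```

The base class and method signature that a real handler must implement should be taken from the `awehflow` source.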

#### Alerters

An `Alerter` is merely a class that implements an `alert` method. The following alerters are currently available:

* `SlackAlerter`
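
A custom alerter could look roughly like the sketch below. It is a self-contained illustration rather than `awehflow` code: the names `InMemoryAlerter` and `forward_to_alerters`, the single `event` argument of `alert`, and the dictionary shape of an event are assumptions. The source only states that an alerter implements an `alert` method and that events are passed along to registered alerters.

```python
class InMemoryAlerter:
    """Hypothetical alerter that keeps alert messages in a list."""

    def __init__(self):
        self.messages = []

    def alert(self, event):
        # Assumption: the alerter receives the event as a dict.
        self.messages.append("[{}] {}".format(event["name"], event["body"]))


def forward_to_alerters(event, alerters):
    """Stand-in for the forwarding step: pass one event to every alerter."""
    for alerter in alerters:
        alerter.alert(event)


alerter = InMemoryAlerter()
forward_to_alerters({"name": "failure", "body": "task load_data failed"}, [alerter])

print(alerter.messages)
```

Keeping the interface to a single method means an alerter for another channel only has to decide how to deliver the message.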

## Running the tests

Tests may be run with:

```bash
python -m unittest discover tests
```

or to run code coverage too:

```bash
coverage run -m unittest discover tests && coverage html
```



