Metadata-Version: 2.1
Name: awehflow
Version: 1.10.3.0
Summary: Configuration based Apache Airflow
Home-page: UNKNOWN
Author: Philip Perold
Author-email: philip@spatialedge.co.za
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: absl-py (==0.6.1)
Requires-Dist: adal (==1.2.0)
Requires-Dist: alembic (==0.9.10)
Requires-Dist: amqp (==2.4.2)
Requires-Dist: apache-airflow (==1.10.3)
Requires-Dist: apache-beam (==2.15.0)
Requires-Dist: asn1crypto (==0.24.0)
Requires-Dist: astor (==0.7.1)
Requires-Dist: attrs (==19.1.0)
Requires-Dist: avro-python3 (==1.9.1)
Requires-Dist: Babel (==2.6.0)
Requires-Dist: bcrypt (==3.1.7)
Requires-Dist: billiard (==3.6.0.0)
Requires-Dist: cachetools (==4.0.0)
Requires-Dist: celery (==4.3.0)
Requires-Dist: certifi (==2019.11.28)
Requires-Dist: cffi (==1.11.5)
Requires-Dist: chardet (==3.0.4)
Requires-Dist: click (==6.7)
Requires-Dist: colorama (==0.4.1)
Requires-Dist: configparser (==3.5.3)
Requires-Dist: crcmod (==1.7)
Requires-Dist: croniter (==0.3.30)
Requires-Dist: cryptography (==2.5)
Requires-Dist: defusedxml (==0.6.0)
Requires-Dist: dill (==0.2.9)
Requires-Dist: docopt (==0.6.2)
Requires-Dist: docutils (==0.15.2)
Requires-Dist: dotty-dict (==1.2.1)
Requires-Dist: fastavro (==0.21.24)
Requires-Dist: fasteners (==0.14.1)
Requires-Dist: Flask (==1.0.4)
Requires-Dist: Flask-Admin (==1.5.3)
Requires-Dist: Flask-AppBuilder (==1.12.3)
Requires-Dist: Flask-Babel (==0.12.2)
Requires-Dist: Flask-Bcrypt (==0.7.1)
Requires-Dist: Flask-Caching (==1.3.3)
Requires-Dist: Flask-Login (==0.4.1)
Requires-Dist: Flask-OpenID (==1.2.5)
Requires-Dist: Flask-SQLAlchemy (==2.4.0)
Requires-Dist: flask-swagger (==0.2.13)
Requires-Dist: Flask-WTF (==0.14.2)
Requires-Dist: flower (==0.9.2)
Requires-Dist: funcsigs (==1.0.0)
Requires-Dist: future (==0.16.0)
Requires-Dist: futures (==3.1.1)
Requires-Dist: gast (==0.2.2)
Requires-Dist: gitdb2 (==2.0.5)
Requires-Dist: GitPython (==2.1.11)
Requires-Dist: google-api-core (==1.16.0)
Requires-Dist: google-api-python-client (==1.7.8)
Requires-Dist: google-apitools (==0.5.30)
Requires-Dist: google-auth (==1.11.2)
Requires-Dist: google-auth-httplib2 (==0.0.3)
Requires-Dist: google-auth-oauthlib (==0.4.0)
Requires-Dist: google-cloud-bigquery (==1.10.1)
Requires-Dist: google-cloud-bigtable (==0.32.0)
Requires-Dist: google-cloud-container (==0.2.1)
Requires-Dist: google-cloud-core (==0.29.1)
Requires-Dist: google-cloud-language (==1.2.0)
Requires-Dist: google-cloud-logging (==1.9.1)
Requires-Dist: google-cloud-monitoring (==0.31.1)
Requires-Dist: google-cloud-spanner (==1.8.0)
Requires-Dist: google-cloud-storage (==1.13.2)
Requires-Dist: google-cloud-translate (==1.4.0)
Requires-Dist: google-cloud-vision (==0.38.0)
Requires-Dist: google-resumable-media (==0.4.1)
Requires-Dist: googleapis-common-protos (==1.51.0)
Requires-Dist: grpc-google-iam-v1 (==0.11.4)
Requires-Dist: grpcio (==1.23.0)
Requires-Dist: grpcio-gcp (==0.2.2)
Requires-Dist: gunicorn (==19.9.0)
Requires-Dist: h5py (==2.9.0)
Requires-Dist: hdfs (==2.5.8)
Requires-Dist: httplib2 (==0.9.2)
Requires-Dist: idna (==2.9)
Requires-Dist: iso8601 (==0.1.12)
Requires-Dist: itsdangerous (==1.1.0)
Requires-Dist: Jinja2 (==2.10)
Requires-Dist: json-merge-patch (==0.2)
Requires-Dist: jsonschema (==3.0.2)
Requires-Dist: Keras-Applications (==1.0.6)
Requires-Dist: Keras-Preprocessing (==1.0.5)
Requires-Dist: kombu (==4.5.0)
Requires-Dist: kubernetes (==8.0.1)
Requires-Dist: lockfile (==0.12.2)
Requires-Dist: Mako (==1.1.0)
Requires-Dist: Markdown (==2.6.11)
Requires-Dist: MarkupSafe (==1.1.1)
Requires-Dist: mock (==2.0.0)
Requires-Dist: monotonic (==1.5)
Requires-Dist: numpy (==1.17.2)
Requires-Dist: oauth2client (==3.0.0)
Requires-Dist: oauthlib (==3.0.1)
Requires-Dist: ordereddict (==1.1)
Requires-Dist: pandas (==0.25.1)
Requires-Dist: pandas-gbq (==0.9.0)
Requires-Dist: pbr (==5.4.3)
Requires-Dist: pendulum (==1.4.4)
Requires-Dist: pip (==19.0.2)
Requires-Dist: pipdeptree (==0.13.1)
Requires-Dist: protobuf (==3.11.3)
Requires-Dist: psutil (==5.6.3)
Requires-Dist: psycopg2 (==2.7.7)
Requires-Dist: pyarrow (==0.14.1)
Requires-Dist: pyasn1 (==0.4.8)
Requires-Dist: pyasn1-modules (==0.2.8)
Requires-Dist: pycparser (==2.19)
Requires-Dist: pydata-google-auth (==0.1.3)
Requires-Dist: pydot (==1.4.1)
Requires-Dist: Pygments (==2.4.2)
Requires-Dist: PyJWT (==1.7.1)
Requires-Dist: pymongo (==3.9.0)
Requires-Dist: pyOpenSSL (==19.0.0)
Requires-Dist: pyparsing (==2.4.2)
Requires-Dist: pyrsistent (==0.15.4)
Requires-Dist: python-daemon (==2.1.2)
Requires-Dist: python-dateutil (==2.7.5)
Requires-Dist: python-editor (==1.0.4)
Requires-Dist: python-http-client (==3.1.0)
Requires-Dist: python3-openid (==3.1.0)
Requires-Dist: pytz (==2019.3)
Requires-Dist: pytzdata (==2019.2)
Requires-Dist: PyYAML (==3.13)
Requires-Dist: redis (==3.2.1)
Requires-Dist: requests (==2.23.0)
Requires-Dist: requests-oauthlib (==1.2.0)
Requires-Dist: rsa (==4.0)
Requires-Dist: sendgrid (==5.6.0)
Requires-Dist: setproctitle (==1.1.10)
Requires-Dist: setuptools (==41.2.0)
Requires-Dist: six (==1.14.0)
Requires-Dist: slackclient (==2.5.0)
Requires-Dist: smmap2 (==2.0.5)
Requires-Dist: SQLAlchemy (==1.3.8)
Requires-Dist: statsd (==3.3.0)
Requires-Dist: tabulate (==0.8.3)
Requires-Dist: tenacity (==4.12.0)
Requires-Dist: tensorboard (==1.12.2)
Requires-Dist: tensorflow (==1.12.0)
Requires-Dist: termcolor (==1.1.0)
Requires-Dist: text-unidecode (==1.2)
Requires-Dist: thrift (==0.11.0)
Requires-Dist: tornado (==5.1.1)
Requires-Dist: tzlocal (==1.5.1)
Requires-Dist: unicodecsv (==0.14.1)
Requires-Dist: uritemplate (==3.0.0)
Requires-Dist: urllib3 (==1.25.8)
Requires-Dist: vine (==1.3.0)
Requires-Dist: virtualenv (==16.2.0)
Requires-Dist: websocket-client (==0.54.0)
Requires-Dist: Werkzeug (==0.14.1)
Requires-Dist: wheel (==0.31.1)
Requires-Dist: WTForms (==2.2.1)
Requires-Dist: zope.deprecation (==4.4.0)
Provides-Extra: composer
Requires-Dist: apache-airflow (==1.10.3-composer) ; extra == 'composer'
Provides-Extra: default
Requires-Dist: apache-airflow (==1.10.3) ; extra == 'default'

# Awehflow

![coverage report](https://gitlab.com/spatialedge/awehflow/badges/master/coverage.svg)
![pipeline status](https://gitlab.com/spatialedge/awehflow/badges/master/pipeline.svg)

Configuration-based Airflow pipelines with metric logging and alerting out of the box.

## Prerequisites

You will need the following to run this code:
  * Python 3.6 or later

## Installation

```
pip install awehflow[default]
```

If you are installing on Google Cloud Composer with Airflow 1.10.3:

```
pip install awehflow[composer]
```

## Usage

Usage of `awehflow` can be broken up into two parts: bootstrapping and the configuration of pipelines.

### Bootstrap

In order to expose the generated pipelines (`airflow` _DAGs_) for `airflow` to pick up when scanning for _DAGs_, one has to create a `DagLoader` that points to a folder where the pipeline configuration files will be located:

```python
import os

from awehflow.alerts.slack import SlackAlerter
from awehflow.core import DagLoader
from awehflow.events.postgres import PostgresMetricsEventHandler

"""airflow doesn't pick up DAGs in files unless
the words 'airflow' and 'DAG' appear in them"""

configs_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'configs')

metrics_handler = PostgresMetricsEventHandler(jobs_table='jobs', task_metrics_table='task_metrics')

slack_alerter = SlackAlerter(channel='#airflow')

loader = DagLoader(
    project="awehflow-demo",
    configs_path=configs_path,
    event_handlers=[metrics_handler],
    alerters=[slack_alerter]
)

dags = loader.load(global_symbol_table=globals())
```

As seen in the code snippet, one can also pass in _"event handlers"_ and _"alerters"_ to perform actions on certain pipeline events and potentially alert the user of certain events on a given channel. See the sections below for more detail.
The global symbol table needs to be passed to the `loader` since `airflow` scans it for objects of type `DAG`, and then synchronises the state with its own internal state store.
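
To illustrate why the symbol table matters, here is a minimal sketch of how generated objects could be bound to module-level names. `FakeDag` and `register_dags` are hypothetical stand-ins invented for this example; they are not part of `awehflow` or `airflow`, and the real `DagLoader.load` may work differently.

```python
class FakeDag:
    """Hypothetical stand-in for airflow's DAG class, used only for illustration."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


def register_dags(symbol_table, dags):
    """Bind each generated DAG to a name in the given symbol table."""
    for dag in dags:
        symbol_table[dag.dag_id] = dag
    return symbol_table


# In a real DAG file the symbol table would be globals(); a plain dict is used here.
symbols = register_dags({}, [FakeDag("daily_load"), FakeDag("hourly_sync")])
print(sorted(symbols))
```

Passing `globals()` instead of the empty dict would make each object visible as a module attribute, which is where a scanner looking for `DAG` instances would find it.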

\*_caveat_: `airflow` ignores `python` files that don't contain the words _"airflow"_ and _"DAG"_. It is thus advised to put those words in a comment to ensure the generated _DAGs_ get picked up when the `DagBag` is getting filled.
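
The caveat can be approximated with a small check. This sketch assumes the filter is a case-sensitive substring match for both words on the raw file contents; the function name `might_be_picked_up` is hypothetical and the exact behaviour may differ between `airflow` versions.

```python
def might_be_picked_up(source: bytes) -> bool:
    """Approximate the content check: both marker words must appear in the file."""
    return all(marker in source for marker in (b"DAG", b"airflow"))


# A file with the marker words in a comment passes the check.
with_markers = b"# airflow DAG\nfrom awehflow.core import DagLoader\n"
# 'awehflow' and 'DagLoader' alone do not contain the exact marker words.
without_markers = b"from awehflow.core import DagLoader\n"

print(might_be_picked_up(with_markers))
print(might_be_picked_up(without_markers))
```

Note that, under this assumption, importing `DagLoader` from `awehflow` is not enough on its own, which is why the bootstrap file carries the words in a string or comment.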

#### Event Handlers

As a pipeline generated using `awehflow` is running, certain events get emitted. An event handler gives the user the option of running code when these events occur.

The following events are potentially emitted as a pipeline runs:

* `start`
* `success`
* `failure`
* `task_metric`

Existing event handlers include:

* `PostgresMetricsEventHandler`: persists pipeline metrics to a Postgres database
* `PublishToGooglePubSubEventHandler`: events get passed straight to a Google Pub/Sub topic

An `AlertsEventHandler` gets automatically added to a pipeline. Events get passed along to registered alerters.
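
As a rough illustration of what a custom event handler might look like, the sketch below tallies events by name. The interface that `awehflow` expects from a handler is not described here, so the `handle` method name and the shape of the event dictionary are assumptions made purely for this example.

```python
from collections import Counter


class CountingEventHandler:
    """Hypothetical handler that tallies events by name.

    The `handle` method name and the event dict layout are assumptions,
    not the documented awehflow interface.
    """

    def __init__(self):
        self.counts = Counter()

    def handle(self, event):
        # Assumed event shape: {"name": "start" | "success" | "failure" | "task_metric"}
        self.counts[event["name"]] += 1


handler = CountingEventHandler()
for name in ("start", "task_metric", "task_metric", "success"):
    handler.handle({"name": name})

print(dict(handler.counts))
```

A handler like this would be passed in the `event_handlers` list of the `DagLoader`, provided it matches whatever interface the library actually requires.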

#### Alerters

An `Alerter` is merely a class that implements an `alert` method. The following alerters are currently available:

* `SlackAlerter`
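
A custom alerter could therefore be as small as the sketch below, which collects messages in memory instead of sending them anywhere. Only the existence of an `alert` method comes from the description above; the single `context` argument and its `name` and `message` keys are assumptions for illustration.

```python
class InMemoryAlerter:
    """Hypothetical alerter that stores messages in a list instead of sending them.

    The `context` argument and its keys are assumed, not taken from awehflow.
    """

    def __init__(self):
        self.messages = []

    def alert(self, context):
        # Format the assumed event fields into a single human-readable line.
        self.messages.append("{name}: {message}".format(**context))


alerter = InMemoryAlerter()
alerter.alert({"name": "failure", "message": "task load_users failed"})
print(alerter.messages)
```

An alerter of this kind can be handy in tests, where asserting on collected messages is easier than inspecting a Slack channel.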

## Running the tests

Tests may be run with:
```bash
python -m unittest discover tests
```

To collect code coverage as well:

```bash
coverage run -m unittest discover tests && coverage html
```



