Metadata-Version: 2.0
Name: cadasta-workertoolbox
Version: 0.1.6
Summary: Cadasta Worker Toolbox
Home-page: https://github.com/Cadasta/cadasta-workertoolbox
Author: Anthony Lukach
Author-email: alukach@cadasta.org
License: GNU Affero General Public License v3.0
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Requires-Dist: SQLAlchemy (>=1.1.11)
Requires-Dist: boto3 (>=1.4.4)
Requires-Dist: celery (>=4.0.2)
Requires-Dist: kombu (>=4.0.2)
Requires-Dist: mock (>=2.0.0)
Requires-Dist: psycopg2 (>=2.7.1)
Requires-Dist: pycurl (>=7.43.0)

Cadasta Worker Toolbox
======================

|PyPI version| |Build Status| |Requirements Status|

A collection of helpers to assist in quickly building asynchronous
workers for the Cadasta system.

Architecture
------------

The Cadasta asynchronous system is designed so that both the scheduled
tasks and the task results can be tracked by the central `Cadasta
Platform <https://github.com/Cadasta/cadasta-platform>`__. To ensure
that this takes place, all Celery workers must be correctly configured
to support these features.

Tracking Scheduled Tasks
~~~~~~~~~~~~~~~~~~~~~~~~

To keep our system aware of all tasks being scheduled, the Cadasta
Platform runs a process that consumes task messages from a
task-monitor queue and inserts those messages into our database. To
support this design, all task producers (including worker nodes) must
publish their task messages to both the normal destination queues and
the task-monitor queue. This is achieved by registering all queues with
a `Topic
Exchange <http://docs.celeryproject.org/en/latest/userguide/routing.html#topic-exchanges>`__,
setting the task-monitor queue to subscribe to all messages sent to the
exchange, and setting standard work queues to subscribe to messages with
a matching ``routing_key``. Because the Cadasta Platform is designed
to work with Amazon SQS and the `SQS backend only keeps exchange/queue
declarations in
memory <http://docs.celeryproject.org/projects/kombu/en/v4.0.2/introduction.html#f1>`__,
each message producer must have this set up within its configuration.
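
In kombu terms, the setup described above might be sketched as follows.
This is an illustrative fragment, not the library's actual code; the
``export`` queue name is hypothetical, and the real declarations are
assembled by the ``Config`` class documented below:

.. code:: python

    from kombu import Exchange, Queue

    # A single topic exchange shared by all task producers
    exchange = Exchange('task_exchange', type='topic')

    task_queues = [
        # The task-monitor queue binds with '#', so it receives a copy
        # of every message published to the exchange
        Queue('platform.fifo', exchange, routing_key='#'),
        # A standard work queue receives only messages whose
        # routing_key matches its own name
        Queue('export', exchange, routing_key='export'),
    ]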

Tracking Task Results
~~~~~~~~~~~~~~~~~~~~~

*TODO*

Library
-------

``cadasta.workertoolbox.conf.Config``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``Config`` class was built to simplify configuring Celery settings,
helping to ensure that all workers adhere to the architecture
requirements of the Cadasta asynchronous system. An instance of the
``Config`` should come configured with all Celery settings that are
required by our system. The class aims to require little customization
on the part of the developer, though some customization may be needed
when settings differ between environments (e.g. if dev settings vary
greatly from prod settings).

Required Arguments
^^^^^^^^^^^^^^^^^^

``queues``
''''''''''

The only required argument is the ``queues`` array. This should contain
an array of names for queues that are to be used by the given worker.
This includes queues from which the node processes tasks and queues into
which the node will schedule tasks. It is not necessary to include the
``'celery'`` or ``'platform.fifo'`` queues, as these will be added
automatically. The input of the ``queues`` variable will be stored as
``QUEUES`` on the ``Config`` instance.
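
A minimal sketch of wiring a worker up with the ``Config`` class (the
``export`` and ``import`` queue names are hypothetical examples):

.. code:: python

    from celery import Celery
    from cadasta.workertoolbox.conf import Config

    # 'celery' and 'platform.fifo' are added automatically and need
    # not be listed here
    conf = Config(queues=['export', 'import'])

    app = Celery()
    app.config_from_object(conf)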

Optional Arguments
^^^^^^^^^^^^^^^^^^

Any `Celery
setting <http://docs.celeryproject.org/en/v4.0.2/userguide/configuration.html#new-lowercase-settings>`__
may be submitted. It is internal convention that we use the lowercase
Celery settings rather than their older upper-case counterparts. This
will ensure that they are displayed when calling ``repr`` on the
``Config`` instance.

``result_backend``
''''''''''''''''''

Defaults to
``'db+postgresql://{0.RESULT_DB_USER}:{0.RESULT_DB_PASS}@{0.RESULT_DB_HOST}/{0.RESULT_DB_NAME}'``
rendered with ``self``.
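
The template is an ordinary ``str.format`` string rendered against the
``Config`` instance itself, so the ``RESULT_DB_*`` attributes described
under Internal Variables fill in the connection URL. A sketch with
hypothetical credentials:

.. code:: python

    class FakeConf:
        # Hypothetical values; the real attributes live on the
        # Config instance
        RESULT_DB_USER = 'worker'
        RESULT_DB_PASS = 'secret'
        RESULT_DB_HOST = 'localhost'
        RESULT_DB_NAME = 'results'

    template = ('db+postgresql://{0.RESULT_DB_USER}:{0.RESULT_DB_PASS}'
                '@{0.RESULT_DB_HOST}/{0.RESULT_DB_NAME}')
    print(template.format(FakeConf()))
    # db+postgresql://worker:secret@localhost/results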

``broker_transport``
''''''''''''''''''''

Defaults to ``'sqs'``.

``broker_transport_options``
''''''''''''''''''''''''''''

Defaults to:

.. code:: python

    {
        'region': 'us-west-2',
        'queue_name_prefix': '{}-'.format(QUEUE_NAME_PREFIX)
    }
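
A sketch of how these defaults might be assembled, assuming the
``QUEUE_NAME_PREFIX`` behavior documented under Internal Variables:

.. code:: python

    import os

    # Mirrors the documented default: use the QUEUE_PREFIX environment
    # variable when set, falling back to 'dev'
    QUEUE_NAME_PREFIX = os.environ.get('QUEUE_PREFIX', 'dev')

    broker_transport_options = {
        'region': 'us-west-2',
        'queue_name_prefix': '{}-'.format(QUEUE_NAME_PREFIX),
    }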

``task_queues``
'''''''''''''''

Defaults to the following ``set`` of ``kombu.Queue`` objects, where
``queues`` is the configuration's required ``queues`` argument and
``exchange`` is a ``kombu.Exchange`` object constructed from the
``task_default_exchange`` and ``task_default_exchange_type`` settings:

.. code:: python

    set([
        Queue('celery', exchange, routing_key='celery'),
        Queue(platform_queue, exchange, routing_key='#'),
    ] + [
        Queue(q_name, exchange, routing_key=q_name)
        for q_name in queues
    ])

*Note: It is recommended that developers not alter this setting.*

``task_routes``
'''''''''''''''

Defaults to the following ``dict``, where ``queues`` is the
configuration's required ``queues`` argument and ``exchange`` is a
``kombu.Exchange`` object constructed from the ``task_default_exchange``
and ``task_default_exchange_type`` settings:

.. code:: python

    routes = {
        'celery.*': {
            'exchange': exchange,
            'routing_key': 'celery',
        },
    }
    for q in queues:
        routes.setdefault('{}.*'.format(q), {
            'exchange': exchange,
            'routing_key': q,
        })

*Note: It is recommended that developers not alter this setting.*

``task_default_exchange``
'''''''''''''''''''''''''

Defaults to ``'task_exchange'``

``task_default_exchange_type``
''''''''''''''''''''''''''''''

Defaults to ``'topic'``

``task_track_started``
''''''''''''''''''''''

Defaults to ``True``.

Internal Variables
^^^^^^^^^^^^^^^^^^

By convention, all variables pertinent to only the ``Config`` class
(i.e. not used by Celery) should be written entirely uppercase.

``PLATFORM_QUEUE_NAME``
'''''''''''''''''''''''

Defaults to ``'platform.fifo'``.

*Note: It is recommended that developers not alter this setting.*

``QUEUE_NAME_PREFIX``
'''''''''''''''''''''

Used to populate the ``queue_name_prefix`` value of the connection's
``broker_transport_options``. Defaults to the value of the
``QUEUE_PREFIX`` environment variable if it is set, or ``'dev'`` if not.

``RESULT_DB_USER``
''''''''''''''''''

Used to populate the default ``result_backend`` template.

``RESULT_DB_PASS``
''''''''''''''''''

Used to populate the default ``result_backend`` template.

``RESULT_DB_HOST``
''''''''''''''''''

Used to populate the default ``result_backend`` template.

``RESULT_DB_NAME``
''''''''''''''''''

Used to populate the default ``result_backend`` template.

``cadasta.workertoolbox.tests.build_functional_tests``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When provided with a Celery app instance, this function generates a
suite of functional tests to ensure that the provided application's
configuration and functionality conforms with the architecture of the
Cadasta asynchronous system.

An example, where an instantiated and configured ``Celery()`` app
instance exists in a parallel ``celery`` module:

.. code:: python

    from cadasta.workertoolbox.tests import build_functional_tests

    from .celery import app

    FunctionalTests = build_functional_tests(app)

To run these tests, use your standard test runner (e.g. ``pytest``) or
call manually from the command-line:

.. code:: bash

    python -m unittest path/to/tests.py

Development
-----------

Testing
~~~~~~~

.. code:: bash

    pip install -r requirements-test.txt
    ./runtests

Deploying
~~~~~~~~~

.. code:: bash

    pip install -r requirements-deploy.txt
    python setup.py test clean build publish tag

.. |PyPI version| image:: https://badge.fury.io/py/cadasta-workertoolbox.svg
   :target: https://badge.fury.io/py/cadasta-workertoolbox
.. |Build Status| image:: https://travis-ci.org/Cadasta/cadasta-workertoolbox.svg?branch=master
   :target: https://travis-ci.org/Cadasta/cadasta-workertoolbox
.. |Requirements Status| image:: https://requires.io/github/Cadasta/cadasta-workertoolbox/requirements.svg?branch=master
   :target: https://requires.io/github/Cadasta/cadasta-workertoolbox/requirements/?branch=master


