Metadata-Version: 1.1
Name: Sheepdog
Version: 0.1.10
Summary: Shepherd GridEngine
Home-page: http://sheepdog.readthedocs.org
Author: Adam Greig
Author-email: adam@adamgreig.com
License: MIT
Description: Sheepdog
        ========
        
        |Build Status| |Coverage Status| |PyPi Version| |License|
        
        Make Grid Engine a bit more useful from Python.
        
        Requirements
        ------------
        
        On the host:
        
        -  Python 2.7 or Python 3.3
        -  `Paramiko <https://github.com/paramiko/paramiko>`__
        -  `Flask <http://flask.pocoo.org/>`__
        -  Optional: `Tornado <http://www.tornadoweb.org/>`__ to speed up HTTP
           bits
        
        On the worker nodes:
        
        -  No requirements beyond the standard library
        -  Tested on Python 2.7, Python 3.3
        -  Should also work on Python 2.6 and 3.2, 3.4
        
        License
        -------
        
        MIT, see LICENSE file.
        
        Overview
        --------
        
        Running large map-style operations on a Grid Engine cluster can be
        frustrating. Array jobs can only hand each script an integer index, as
        if from some range() call, but this is rarely sufficient. Collecting
        results is also a huge pain. Suddenly there are shell scripts and result
        files everywhere and you feel an overwhelming sense of mediocrity.
        
        Sheepdog aims to make life better for a somewhat specific use case:
        
        1. You're using Python. Hopefully even Python 3.3.
        
        2. You've got access to a Grid Engine cluster on some remote machines.
           They can also run Python, somehow or other. The cluster computers and
           your client computer can all communicate over a network.
        
        3. You have a function of several parameters and you want to run it many
           times with different arguments. Results should come back nicely
           collated, and are reasonably small (you're not too worried if
           argument or result objects get copied in memory).
        
        4. You're a PhD student in Div F at CUED desperately trying to use
           *fear* effectively.
        
        To accomplish these aims, Sheepdog:
        
        1. Takes your function and N tuples of arguments, and marshals both
        2. Creates a mapping from range(N) to the argument tuples
        3. Starts a server with a network interface (over HTTP)
        4. Starts a size-N array job on the Grid Engine cluster, running the
           client
        5. Each client talks to the server to map its array job ID into an
           actual set of arguments, and fetches the Python function to execute
           as well
        6. The function is executed with the arguments
        7. The result is sent back over the network
        8. Results are collated against arguments
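        
        The marshalling, mapping, execution and collation steps can be tried
        locally with a rough sketch like the one below. It is not Sheepdog's
        actual implementation: it assumes the standard library ``marshal`` and
        ``types`` modules are a reasonable stand-in, skips the HTTP server and
        the Grid Engine submission entirely, and the helper names
        (``pack_function``, ``unpack_function``, ``run_task``) are hypothetical.
        
        .. code:: python
        
                import marshal
                import types
        
        
                def add(a, b):
                    return a + b
        
        
                def pack_function(func):
                    # Step 1: serialise the function's code object to bytes.
                    return marshal.dumps(func.__code__)
        
        
                def unpack_function(blob):
                    # Step 5: rebuild a callable from the bytes in a fresh namespace
                    # (assumption: only built-ins are needed by the function).
                    code = marshal.loads(blob)
                    return types.FunctionType(code, {"__builtins__": __builtins__})
        
        
                def run_task(task_id, func_blob, arg_table):
                    # Steps 5-6: look up this task's arguments and call the function.
                    func = unpack_function(func_blob)
                    args = marshal.loads(arg_table[task_id])
                    return func(*args)
        
        
                args = [(1, 1), (1, 2), (2, 2)]
                func_blob = pack_function(add)
        
                # Step 2: map each index in range(N) to one marshalled argument tuple.
                arg_table = {i: marshal.dumps(a) for i, a in enumerate(args)}
        
                # Step 8: collate results in the same order as the arguments.
                results = [run_task(i, func_blob, arg_table) for i in range(len(args))]
                print(results)
        
        Marshalled code objects are only portable between identical Python
        versions, which is one reason a sketch like this should not be read as a
        general-purpose serialisation scheme.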
        
        This is very similar to:
        
        -  `pythongrid <https://code.google.com/p/pythongrid>`__. Almost
           identical. Sheepdog doesn't have to be run on the cluster head,
           though. And can't resubmit jobs or anything fancy like that. And
           isn't dead.
        -  `gridmap <http://gridmap.readthedocs.org/>`__. A fork of pythongrid
           that is actually active and looks quite nice! Maybe look at gridmap.
        -  `Celery <http://celeryproject.org/>`__. Yes. Pretty similar.
        -  `rq <http://python-rq.org/>`__. Quite similar.
        -  `Resque <http://resquework.org/>`__. But Resque is written in Ruby,
           boo.
        -  Every other distributed map compute queue thing ever written.
        
        Usage
        -----
        
        Ensure the GridEngine workers have Python available.
        
        Then,
        
        .. code:: python
        
                import sheepdog
        
                def f(a, b):
                    return a + b
        
                args = [(1, 1), (1, 2), (2, 2)]
                config = {"host": "fear"}
        
                results = sheepdog.map_sync(f, args, config)
                print("Received results:", results)
                # Received results: [2, 3, 4]
        
        There is also support for transferring other functions and variables
        (using the namespace parameter ``ns`` of ``map_sync``) and imports can be
        handled using ``global``, for example:
        
        .. code:: python
        
                def f(a, b):
                    # Declare np global before importing, so g() can see it too
                    global np
                    import numpy as np
                    return g(a, b)
        
                def g(a, b):
                    return np.array((a, b)) ** 2
        
                args = [(1, 2), (3, 4)]
                namespace = {"g": g}
                config = {"host": "fear"}
        
                results = sheepdog.map_sync(f, args, config, namespace)
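        
        The ``global`` approach relies on ordinary Python scoping, so it can be
        checked locally without Sheepdog or a cluster. The sketch below is only
        an illustration under that assumption: it swaps NumPy for the standard
        library ``math`` module so that it runs anywhere, and it calls ``f``
        directly instead of going through ``map_sync``. Note that the ``global``
        declaration has to appear before the import that binds the name.
        
        .. code:: python
        
                def f(a, b):
                    # Declaring the name global first makes the import bind
                    # ``math`` at module level rather than locally inside f.
                    global math
                    import math
                    return g(a, b)
        
        
                def g(a, b):
                    # g never imports math itself; it finds the name f made global.
                    return math.hypot(a, b)
        
        
                print(f(3, 4))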
        
        See the documentation for full details.
        
        Documentation
        -------------
        
        View Sheepdog on `ReadTheDocs <http://sheepdog.readthedocs.org/>`__.
        
        .. |Build Status| image:: https://travis-ci.org/adamgreig/sheepdog.png?branch=master
           :target: https://travis-ci.org/adamgreig/sheepdog
        .. |Coverage Status| image:: https://coveralls.io/repos/adamgreig/sheepdog/badge.png?branch=master
           :target: https://coveralls.io/r/adamgreig/sheepdog?branch=master
        .. |PyPi Version| image:: https://pypip.in/v/Sheepdog/badge.png
           :target: https://pypi.python.org/pypi/Sheepdog/
        .. |License| image:: https://pypip.in/license/Sheepdog/badge.png
           :target: https://pypi.python.org/pypi/Sheepdog/
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Topic :: Scientific/Engineering
