Metadata-Version: 2.1
Name: batcat
Version: 0.2.1
Summary: BatCat, A Cat Looks Like A Bat.
Home-page: https://github.com/Ewen2015/BatCat
Author: Ewen Wang
Author-email: wolfgangwong2012@gmail.com
License: Apache License 2.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.7
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: boto3 (==1.18.57)
Requires-Dist: botocore (==1.21.57)
Requires-Dist: protobuf (==3.19.4)
Requires-Dist: awscli (==1.20.57)
Requires-Dist: requests (==2.26.0)
Requires-Dist: sagemaker (==2.59.8)
Requires-Dist: stepfunctions (==2.2.0)
Requires-Dist: sagemaker-experiments (==0.1.35)
Requires-Dist: pyathena (==2.3.0)
Requires-Dist: redshift-connector (==2.0.888)
Requires-Dist: sqlalchemy (==1.3.23)
Requires-Dist: psycopg2-binary (==2.9.1)


##############################
BatCat, A Cat with A Bat Face!
##############################

😸😹😺😻😼😽😾😿🙀🐱

BatCat is designed to help data scientists practice machine learning operations (MLOps) on Amazon Web Services (AWS).

AWS services covered:

- AWS Lambda: a serverless, event-driven compute service
- Amazon S3 (Simple Storage Service): an object storage service
- Amazon Athena: a serverless, interactive query service on S3
- Amazon Redshift: a data warehouse product

Philosophy of BatCat's MLOps
============================

BatCat practices MLOps in **3 layers (ASA)**: algorithm, system, and application. Each layer has its own section below. The AWS services involved are:

- **AWS Lambda**: a serverless, event-driven compute service
- **Amazon S3 (Simple Storage Service)**: an object storage service
- **Amazon Athena**: a serverless, interactive query service on S3
- **Amazon Redshift**: a data warehouse product
- **AWS Step Functions**: a low-code, visual workflow service for building distributed applications, automating IT and business processes, and building data and machine learning pipelines with AWS services

1. Algorithm level
------------------

Tools: `GossipCat <https://github.com/Ewen2015/GossipCat>`_, `TensorBoard <https://www.tensorflow.org/tensorboard>`_

.. image:: https://raw.githubusercontent.com/Ewen2015/BatCat/master/gc_learning_curve.png
    :width: 600
    :align: center

.. image:: https://www.tensorflow.org/tensorboard/images/tensorboard.gif
    :width: 600
    :align: center

2. System level
---------------

Tool: Amazon CloudWatch

Amazon CloudWatch provides standard monitoring and operational data through dashboards, which covers the system-level requirements of MLOps. The dashboard generally presents the following operational data:

- SageMaker CPU utilization
- S3 bucket size
- Lambda

  - invocations
  - errors

- Step Functions

  - execution time
  - failed executions

- Cost
- Log groups

.. image:: https://raw.githubusercontent.com/Ewen2015/BatCat/master/aws_cloudwatch.png
    :width: 600
    :align: center
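
As a rough sketch of how some of the dashboard figures above could also be pulled programmatically, the snippet below builds queries for the CloudWatch ``get_metric_statistics`` call through ``boto3``. It is not part of BatCat's API. The helper names ``build_metric_query`` and ``fetch_datapoints``, the function name ``my-inference-function``, and the state machine ARN are all hypothetical and only serve as illustration.

.. code-block:: python

    from datetime import datetime, timedelta, timezone


    def build_metric_query(namespace, metric_name, dimension_name, dimension_value,
                           hours=24, period_seconds=3600, statistic="Sum"):
        """Build keyword arguments for CloudWatch ``get_metric_statistics``."""
        end = datetime.now(timezone.utc)
        start = end - timedelta(hours=hours)
        return {
            "Namespace": namespace,
            "MetricName": metric_name,
            "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
            "StartTime": start,
            "EndTime": end,
            "Period": period_seconds,
            "Statistics": [statistic],
        }


    # Hypothetical resource names, for illustration only.
    lambda_errors = build_metric_query(
        "AWS/Lambda", "Errors", "FunctionName", "my-inference-function")
    sfn_failures = build_metric_query(
        "AWS/States", "ExecutionsFailed", "StateMachineArn",
        "arn:aws:states:us-east-1:123456789012:stateMachine:my-pipeline")


    def fetch_datapoints(query):
        """Run the query; requires boto3 and configured AWS credentials."""
        import boto3  # imported lazily so the query builders work offline

        cloudwatch = boto3.client("cloudwatch")
        response = cloudwatch.get_metric_statistics(**query)
        # Datapoints are not guaranteed to arrive in time order.
        return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])

Calling ``fetch_datapoints(lambda_errors)`` would return hourly error counts for the last 24 hours, assuming the named function exists in the configured account and region.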

3. Application level
--------------------

Tool: DataOps

BatCat realizes application-level MLOps by monitoring the distributions of data inputs (data sources) and data outputs (predictions). Because application-level MLOps is one part of DataOps as a whole, it should align with the DataOps practice of each organization or company.
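
One possible way to monitor such distributions is the population stability index (PSI), which compares a current sample against a reference sample bin by bin. The sketch below is an illustrative choice and not necessarily the method BatCat uses. The function name ``population_stability_index`` and the sample data are assumptions made for this example.

.. code-block:: python

    import math


    def population_stability_index(reference, current, bins=10, eps=1e-6):
        """Compare two numeric samples; larger values mean a larger shift."""
        lo, hi = min(reference), max(reference)
        # Fall back to a width of 1.0 when the reference sample is constant.
        width = (hi - lo) / bins or 1.0

        def proportions(values):
            counts = [0] * bins
            for v in values:
                # Clamp so values outside the reference range land in the edge bins.
                index = int((v - lo) / width)
                counts[min(max(index, 0), bins - 1)] += 1
            # Floor at eps so empty bins do not cause a division by zero.
            return [max(c / len(values), eps) for c in counts]

        ref_p, cur_p = proportions(reference), proportions(current)
        return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


    baseline = [i / 100 for i in range(100)]   # e.g. last month's predictions
    shifted = [x + 0.5 for x in baseline]      # e.g. this week's predictions

    psi_same = population_stability_index(baseline, baseline)
    psi_shifted = population_stability_index(baseline, shifted)

Identical samples give a PSI of zero, and the value grows as the current sample drifts away from the reference. The alerting threshold is a judgment call that should follow each organization's DataOps practice.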

Story of the BatCat
===================

The package is named after a cat of my friend, Clara.

.. image:: https://raw.githubusercontent.com/Ewen2015/BatCat/master/BatCat.jpeg
    :width: 400
    :align: center

License
=======

BatCat is licensed under the Apache License 2.0. © Contributors, 2022.
