Metadata-Version: 2.1
Name: JLpyUtils
Version: 0.2.7
Summary: Custom methods for various data science, computer vision, and machine learning operations in Python
Home-page: https://github.com/jlnerd/JLpyUtils.git
Author: John T. Leonard
Author-email: jtleona01@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Description-Content-Type: text/markdown
Requires-Dist: numpy (~=1.17.1)
Requires-Dist: pandas
Requires-Dist: sklearn
Requires-Dist: scipy
Requires-Dist: matplotlib
Requires-Dist: kaggle
Requires-Dist: scikit-image
Requires-Dist: opencv-python
Requires-Dist: nose
Requires-Dist: xgboost
Requires-Dist: dill
Requires-Dist: h5py
Requires-Dist: bokeh
Requires-Dist: click
Requires-Dist: cloudpickle
Requires-Dist: cytoolz
Requires-Dist: dask
Requires-Dist: dask-core
Requires-Dist: distributed
Requires-Dist: fsspec
Requires-Dist: heapdict
Requires-Dist: locket
Requires-Dist: mkl-fft
Requires-Dist: mkl-random
Requires-Dist: msgpack-python
Requires-Dist: numpy-base
Requires-Dist: packaging
Requires-Dist: partd
Requires-Dist: psutil
Requires-Dist: sortedcontainers
Requires-Dist: tblib
Requires-Dist: toolz
Requires-Dist: zict
Requires-Dist: zstd
Requires-Dist: dask-ml
Requires-Dist: dask-xgboost
Requires-Dist: tables

# JLpyUtils
Custom modules, classes, and methods for various data science, computer vision, and machine learning operations in Python.

## Installing & Importing
In your command line interface (CLI):
```
$ pip install --upgrade JLpyUtils
```
After this, the package can be imported into a Jupyter notebook or any other Python session via the command:
```import JLpyUtils```


# Modules:
```
JLpyUtils.ML
JLpyUtils.plot
JLpyUtils.img
JLpyUtils.video
JLpyUtils.file_utils
JLpyUtils.summary_tables
JLpyUtils.kaggle
```

## Modules Overview

Below, we highlight several of the most interesting modules in more detail.

### JLpyUtils.ML
Machine learning module for Python that streamlines and wraps sklearn, xgboost, dask_ml, and tensorflow/keras functions.

The sub-modules within JLpyUtils.ML are:
```JLpyUtils.ML.preprocessing```: Functions related to preprocessing/feature engineering for machine learning
    * The main class of interest is the ```JLpyUtils.ML.preprocessing.feat_eng_pipe``` class, which iterates through a standard feature engineering sequence and saves the resulting engineered data. The standard sequence is:
        * LabelEncode.categorical_features ->  
        * Scale.continuous_features -> 
            - for Scaler_ID in Scalers_dict.keys():
        * Impute.categorical_features ->
            - for Imputer_cat_ID in Imputer_categorical_dict.keys():
                - for Imputer_iter_class_ID in Imputer_categorical_dict[Imputer_cat_ID].keys():
        * Impute.continuous_features ->
            - for Imputer_cont_ID in Imputer_continuous_dict.keys():
                - for Imputer_iter_reg_ID in Imputer_continuous_dict[Imputer_cont_ID].keys():
        * OneHotEncode ->
        * CorrCoeffThreshold ->
        * Finished!
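
The sequence above can be approximated with plain pandas and scikit-learn. The sketch below is not the ```feat_eng_pipe``` implementation: the function name ```engineer_features```, its parameters, and the toy data are hypothetical, and it fixes a single scaler and a single imputer per feature type where the class iterates over several. The last step assumes that CorrCoeffThreshold drops one feature from each highly correlated pair.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler


def engineer_features(df, categorical, continuous, corr_threshold=0.95):
    """Hypothetical helper: one pass through the sequence for one scaler/imputer choice."""
    out = df.copy()

    # 1. Label-encode categorical features as numeric codes, keeping NaN as NaN.
    for col in categorical:
        codes = out[col].astype("category").cat.codes.astype(float)
        out[col] = codes.where(codes >= 0)  # pandas marks missing values with -1

    # 2. Scale continuous features (StandardScaler ignores NaN when fitting).
    out[continuous] = StandardScaler().fit_transform(out[continuous])

    # 3. Impute categorical features with the most frequent code.
    out[categorical] = SimpleImputer(strategy="most_frequent").fit_transform(out[categorical])

    # 4. Impute continuous features with the column mean.
    out[continuous] = SimpleImputer(strategy="mean").fit_transform(out[continuous])

    # 5. One-hot encode the now complete categorical codes.
    out = out.astype({col: int for col in categorical})
    out = pd.get_dummies(out, columns=categorical, dtype=float)

    # 6. Drop the later feature of every pair whose |correlation| exceeds the threshold.
    corr = out.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > corr_threshold).any()]
    return out.drop(columns=to_drop)


# Toy data: "y" is an exact multiple of "x", so step 6 should remove it.
raw = pd.DataFrame({
    "color": ["red", "blue", np.nan, "green", "red", "red", "blue"],
    "x": [1.0, 2.0, 3.0, np.nan, 5.0, 6.0, 7.0],
    "y": [2.0, 4.0, 6.0, np.nan, 10.0, 12.0, 14.0],
})
features = engineer_features(raw, categorical=["color"], continuous=["x", "y"])
print(features.columns.tolist())
```

Scaling before imputing mirrors the order listed above; with a mean imputer this means missing continuous values end up at roughly zero after standardization.
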
```JLpyUtils.ML.model_selection```: functions/classes for running hyperparameter searches across multiple types of models & comparing those models
    * The main class of interest is the ```JLpyUtils.ML.model_selection.GridSearchCV``` class, which runs a hyperparameter grid search with cross-validation across different types of models and compares the results so that one can find the best-of-best (BoB) model. The class is compatible with sklearn models, tensorflow/keras models, and xgboost models.
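
The best-of-best idea can be illustrated with scikit-learn's own ```GridSearchCV```. This is a minimal sketch, not the JLpyUtils class: the ```models``` dictionary layout, the ```best_of_best``` function, and the synthetic data are assumptions made for illustration, and only sklearn estimators are covered.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Hypothetical layout: model name -> (estimator, hyperparameter grid).
models = {
    "logistic": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [10, 50], "max_depth": [3, None]},
    ),
}


def best_of_best(models, X, y, cv=3):
    """Run one grid search per model type and return the name of the top scorer."""
    searches = {}
    for name, (estimator, grid) in models.items():
        search = GridSearchCV(estimator, grid, cv=cv)
        search.fit(X, y)
        searches[name] = search
    # Compare the best cross-validated score of each model type.
    best_name = max(searches, key=lambda name: searches[name].best_score_)
    return best_name, searches


X, y = make_classification(n_samples=200, n_features=8, random_state=0)
best_name, searches = best_of_best(models, X, y)
print(best_name, searches[best_name].best_params_)
```

Comparing ```best_score_``` values is only meaningful when every search uses the same cross-validation splits and scoring metric, which is why ```cv``` is shared across models here.
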
```JLpyUtils.ML.NeuralNet```: sub-modules/functions/classes for streamlining common neural-net architectures implemented in tensorflow/keras.
    * The most notable sub-modules are the ```DenseNet``` and ```Conv2D``` modules, which provide keras implementations of a general dense neural network and a 2D convolutional neural network. The depth and general architecture of the networks are defined by generic hyperparameters, so that one can easily perform a grid search across multiple neural network architectures.
```JLpyUtils.ML.inspection```: Functions to inspect features and/or models after training
```JLpyUtils.ML.postprocessing```: ML model outputs postprocessing helper functions


### JLpyUtils.plot
This module contains helper functions related to common plotting operations via matplotlib.

The most notable functions are:
```JLpyUtils.plot.corr_matrix()```: Plot a correlation matrix chart
```JLpyUtils.plot.ccorr_pareto()```: Plot a pareto bar-chart for 1 label of interest within a correlation dataframe
```JLpyUtils.plot.hist_or_bar()```: Iterate through each column in a dataframe and plot the histogram or bar chart for the data.
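
As an illustration of what a correlation Pareto chart shows, here is a minimal matplotlib sketch. It does not reproduce the JLpyUtils function: the name ```plot_corr_pareto```, its signature, and the synthetic columns are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_corr_pareto(df_corr, label):
    """Bar chart of correlations with `label`, sorted by absolute strength."""
    corr = df_corr[label].drop(label)  # a feature's correlation with itself is not informative
    order = corr.abs().sort_values(ascending=False).index
    corr = corr.reindex(order)
    fig, ax = plt.subplots()
    ax.bar(corr.index, corr.values)
    ax.set_ylabel(f"correlation with {label}")
    ax.tick_params(axis="x", rotation=90)
    fig.tight_layout()
    return fig, corr


# Synthetic data: "strong" tracks the target closely, "weak" loosely, "noise" not at all.
rng = np.random.default_rng(0)
target = rng.normal(size=200)
df = pd.DataFrame({
    "target": target,
    "strong": target + 0.1 * rng.normal(size=200),
    "weak": target + 2.0 * rng.normal(size=200),
    "noise": rng.normal(size=200),
})
fig, ranked = plot_corr_pareto(df.corr(), "target")
print(ranked)
```

Sorting by absolute value keeps strongly negative correlations near the front of the chart, where they would otherwise be pushed to the end.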

### JLpyUtils.img
This module contains functions/classes related to image analysis, most of which wrap scikit-image functions in some way.

The most notable functions are:
```JLpyUtils.img.auto_crop.use_edges()```: Use skimage.feature.canny method to find edges in the image passed and autocrop on the outermost edges
```JLpyUtils.img.decompose_video_to_img()```: Use cv2 to pull out image frames from a video and save them as png files
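
The edge-based auto-crop idea can be sketched directly with ```skimage.feature.canny```. The function name ```crop_to_edges```, its ```sigma``` parameter, and the synthetic image are assumptions for illustration and may differ from how ```use_edges()``` behaves.

```python
import numpy as np
from skimage import feature


def crop_to_edges(img, sigma=1.0):
    """Crop a 2D grayscale image to the bounding box of its Canny edges."""
    edges = feature.canny(img, sigma=sigma)  # boolean edge mask
    rows, cols = np.nonzero(edges)
    if rows.size == 0:
        return img  # no edges found, so there is nothing to crop to
    return img[rows.min(): rows.max() + 1, cols.min(): cols.max() + 1]


# Synthetic image: a bright 40x40 square on a dark 100x100 background.
img = np.zeros((100, 100))
img[30:70, 40:80] = 1.0
cropped = crop_to_edges(img)
print(img.shape, "->", cropped.shape)
```

A larger ```sigma``` smooths the image more before edge detection, which suppresses noise but can also blur away faint outer edges and tighten the crop.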


### JLpyUtils.kaggle
This module contains functions for interacting with kaggle. The simplest and most useful function is:
```
JLpyUtils.kaggle.competition_download_files(competition)
```
where ```competition``` is the competition name, such as "home-credit-default-risk".

### JLpyUtils.file_utils
This module contains simple helper functions to save and load standard file types, including 'hdf', 'csv', 'json', and 'dill'. Essentially, the ```save``` and ```load``` functions take care of the boilerplate operations involved in saving or loading the file types listed above.
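
One common way to hide this kind of boilerplate is to dispatch on the file extension. The sketch below assumes that approach; ```save_obj``` and ```load_obj``` are hypothetical names, and the real ```save``` and ```load``` signatures may differ. The 'hdf' branch needs the ```tables``` package and the 'dill' branch needs ```dill```, so both are only touched when those extensions are used.

```python
import json
import os
import tempfile

import pandas as pd


def _extension(path):
    return os.path.splitext(path)[1].lstrip(".").lower()


def save_obj(obj, path):
    """Hypothetical saver that picks a writer based on the file extension."""
    ext = _extension(path)
    if ext == "json":
        with open(path, "w") as f:
            json.dump(obj, f)
    elif ext == "csv":
        obj.to_csv(path, index=False)  # expects a pandas DataFrame
    elif ext in ("h5", "hdf"):
        obj.to_hdf(path, key="data")  # expects a DataFrame; requires `tables`
    elif ext == "dill":
        import dill  # imported lazily so the other formats work without it
        with open(path, "wb") as f:
            dill.dump(obj, f)
    else:
        raise ValueError(f"Unsupported file type: {ext!r}")


def load_obj(path):
    """Hypothetical loader that mirrors save_obj."""
    ext = _extension(path)
    if ext == "json":
        with open(path) as f:
            return json.load(f)
    if ext == "csv":
        return pd.read_csv(path)
    if ext in ("h5", "hdf"):
        return pd.read_hdf(path, key="data")
    if ext == "dill":
        import dill
        with open(path, "rb") as f:
            return dill.load(f)
    raise ValueError(f"Unsupported file type: {ext!r}")


# Round-trip a dict through JSON and a DataFrame through CSV in a temporary folder.
settings = {"n_estimators": 50, "features": ["x", "y"]}
table = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
with tempfile.TemporaryDirectory() as folder:
    save_obj(settings, os.path.join(folder, "settings.json"))
    save_obj(table, os.path.join(folder, "table.csv"))
    settings_loaded = load_obj(os.path.join(folder, "settings.json"))
    table_loaded = load_obj(os.path.join(folder, "table.csv"))
print(settings_loaded)
```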



