Metadata-Version: 2.1
Name: QuNet
Version: 0.0.170
Summary: Working with deep learning models
Home-page: https://github.com/step137/qunet
Author: synset
Author-email: steps137ai@gmail.com
License: UNKNOWN
Description: # QuNet
        
        [![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-370/)
        [![PyPI version](https://badge.fury.io/py/qunet.svg)](https://badge.fury.io/py/qunet)
        
        
        QuNet makes it easier to work with deep learning models:
        * Trainer class for training the model.
        * Various tools for visualizing the training process and the state of the model.
        * Training large models: float16, mini-batch splitting, etc.
        * Large set of custom modules for neural networks (MLP, CNN, Transformer, etc.)
        
        <hr>
        
        ## Install
        
        ```
        pip install qunet
        ```
        <hr>
        
        ## Usage
        
        To use the library, add a `training_step(batch, batch_id)` method to the model. The method computes the loss and, if needed, additional quality metrics.
        For example, for 1D linear regression $y=f(x)$ with MSE loss and the mean of `|y_pred - y_true|` as the metric, the model looks like this:
        ```python
        import torch
        import torch.nn as nn
        
        class Model(nn.Module):
            def __init__(self):
                super().__init__()
                self.fc = nn.Linear(1, 1)
        
            def forward(self, x):                                 # (B,1)
                return self.fc(x)                                 # (B,1)
        
            def training_step(self, batch, batch_id):
                x, y_true = batch                                 # the model knows the mini-batch format
                y_pred = self(x)                                  # (B,1)  forward function call
        
                loss  = (y_pred - y_true).pow(2).mean()           # ()     loss for optimization (scalar)!
                error = torch.abs(y_pred.detach()-y_true).mean()  # ()     mean absolute error over the batch
        
                return {'loss':loss, 'score': error}              # if no score, you can return loss
        
        model = Model()        
        ```
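        
        Before handing the model to a trainer, it can help to call `training_step` by hand on a random batch and confirm the contract described above: a scalar `loss` that carries gradients and a `score` that does not. The sketch below uses only PyTorch; the batch size of 8 is arbitrary, and the class is repeated so that the snippet runs on its own.
        
        ```python
        import torch
        import torch.nn as nn
        
        class Model(nn.Module):
            def __init__(self):
                super().__init__()
                self.fc = nn.Linear(1, 1)
        
            def forward(self, x):
                return self.fc(x)
        
            def training_step(self, batch, batch_id):
                x, y_true = batch
                y_pred = self(x)
                loss = (y_pred - y_true).pow(2).mean()
                error = torch.abs(y_pred.detach() - y_true).mean()
                return {'loss': loss, 'score': error}
        
        model = Model()
        batch = (torch.rand(8, 1), torch.rand(8, 1))   # (x, y_true), each of shape (B,1)
        out = model.training_step(batch, 0)
        
        # loss must be a scalar attached to the graph; score is detached
        print(out['loss'].shape, out['loss'].requires_grad)
        print(out['score'].shape, out['score'].requires_grad)
        ```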
        
        Training and validation data can be provided as standard `DataLoader` objects.
        For small datasets, you can also use the library's faster `Data` loader:
        
        ```python
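        import torch
        from qunet import Data, Trainer
        
        num, val = 1000, 900                 # total samples; the first `val` of them are used for training
        X = torch.rand(num, 1)               # (N,1) to match nn.Linear(1, 1)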
        Y = 2*X + torch.randn(X.shape)
        
        data_trn = Data( (X[:val], Y[:val]) )
        data_val = Data( (X[val:], Y[val:]) )
        ```
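        
        For comparison, the same split can be wrapped in standard PyTorch loaders. This is a minimal sketch using `TensorDataset` and `DataLoader` from `torch.utils.data`; the batch size of 100 and the names `loader_trn` and `loader_val` are choices made for this illustration only.
        
        ```python
        import torch
        from torch.utils.data import TensorDataset, DataLoader
        
        num, val = 1000, 900
        X = torch.rand(num, 1)                       # (N,1) inputs
        Y = 2*X + torch.randn(X.shape)               # (N,1) noisy targets
        
        loader_trn = DataLoader(TensorDataset(X[:val], Y[:val]), batch_size=100, shuffle=True)
        loader_val = DataLoader(TensorDataset(X[val:], Y[val:]), batch_size=100)
        
        x, y_true = next(iter(loader_trn))           # each batch unpacks as (x, y_true)
        print(x.shape, y_true.shape, len(loader_trn), len(loader_val))
        ```
        
        Each batch unpacks into `(x, y_true)`, which is the format the `training_step` shown earlier expects.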
        
        Next, create a trainer instance and pass it the model and the data.
        Then set the optimizer on the trainer and start training:
        
        ```python                                             
        trainer = Trainer(model, data_trn, data_val)
        trainer.set_optimizer( torch.optim.SGD(model.parameters(), lr=1e-2) )
        trainer.fit(epochs=10, period_plot=5, monitor=['loss'])
        ```
        
        That's all!
        
        Below is a brief overview of the library.
        A more detailed introduction is available in [Quick start](doc/intro.md), in the documents describing the individual modules of the library, and in the notebooks dedicated to various deep learning tasks.
        <hr>
        
        ## Trainer
        
        The trainer is a key object of the QuNet library. It solves the following tasks:
        * Model training and validation.
        * Visualization of the learning process, with extensive customization options.
        * Calculation of optimal breakpoints based on the best local and smoothed metrics.
        * Saving the best models by loss or score, as well as saving checkpoints.
        * Combining different training schedulers.
        * Switching to half precision and using a gradient accumulation buffer for large models.
        * Use of multiple callback objects that can be embedded in different parts of the pipeline.
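        
        The idea behind gradient accumulation from the list above can be illustrated without QuNet. The sketch below is plain PyTorch and not the trainer's implementation: it splits one batch into equal chunks, scales each chunk loss by the number of chunks, and checks that the accumulated gradient matches the full-batch gradient. The model, the sizes, and `num_chunks` are arbitrary choices for the illustration.
        
        ```python
        import torch
        import torch.nn as nn
        
        torch.manual_seed(0)
        model = nn.Linear(4, 1)
        x, y = torch.randn(32, 4), torch.randn(32, 1)
        
        def mse(pred, target):
            return (pred - target).pow(2).mean()
        
        # Reference: gradient of the loss computed on the full batch
        model.zero_grad()
        mse(model(x), y).backward()
        full_grad = model.weight.grad.clone()
        
        # Accumulation: process equal-sized chunks one by one;
        # backward() adds each chunk's gradient to .grad
        num_chunks = 4
        model.zero_grad()
        for xc, yc in zip(x.chunk(num_chunks), y.chunk(num_chunks)):
            (mse(model(xc), yc) / num_chunks).backward()
        accum_grad = model.weight.grad.clone()
        
        max_diff = (full_grad - accum_grad).abs().max().item()
        print(f"max difference between gradients: {max_diff:.2e}")
        ```
        
        Only one chunk has to fit in memory at a time, and the optimizer step is taken once after the loop.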
        
        Below is an example of visualization:
        
        <center>
        <img src="doc/img/loss.png" style="width:800px;">
        </center>
        
        ```
        val_loss:  best = 0.190465[293], smooth21 = 0.199713[296], last21 = 0.210965 ± 0.019436
        trn_loss:  best = 0.209042[234], smooth21 = 0.244457[299], last21 = 0.293281 ± 0.043728
        
        val_score: best = 0.942300[291], smooth21 = 0.938188[295], last21 = 0.934581 ± 0.000000
        trn_score: best = 0.929560[234], smooth21 = 0.916017[299], last21 = 0.898531 ± 0.005823
        
        epochs=300, samples=15000000, steps=30000
        times=(trn:214.34, val:11.69)m,  42.87 s/epoch, 428.68 s/10^3 steps,  857.35 s/10^6 samples
        ```
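        
        The timing line of this log can be checked by hand. The short calculation below assumes that the per-epoch, per-step and per-sample rates are derived from the training time alone (`trn:214.34` minutes); under that assumption it reproduces the printed figures up to rounding of the displayed time.
        
        ```python
        epochs, steps, samples = 300, 30_000, 15_000_000
        trn_minutes = 214.34                    # training time taken from the log, in minutes
        
        trn_seconds    = trn_minutes * 60
        per_epoch      = trn_seconds / epochs
        per_1k_steps   = trn_seconds / steps * 1e3
        per_1m_samples = trn_seconds / samples * 1e6
        
        print(f"{per_epoch:.2f} s/epoch, {per_1k_steps:.2f} s/10^3 steps, {per_1m_samples:.2f} s/10^6 samples")
        print(f"{steps // epochs} steps per epoch, {samples // steps} samples per step")
        ```
        
        The same counters imply 100 steps per epoch and 500 samples per step.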
        
        Example of learning curves of various [schedulers](doc/schedules.md):
        
        <center>
        <img src="doc/img/schedulers.png" style="width:800px;">
        </center>
        
        
        <hr>
        
        ## ModelState
        
        The standalone `ModelState` class is a powerful replacement for libraries such as torchinfo.
        It allows you to display information about submodules and their parameters.
        ```
        Transformer                            params           data
        ├─ ModuleList                                                           ->                 <  blocks
        │  └─ TransformerBlock                                   (1, 10, 64)    -> (1, 10, 64)     <  blocks[0]
        │     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft
        │        └─ FFT                                          (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.module
        │           └─ Dropout(0)                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.module.drop
        │        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.norm
        │     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att
        │        └─ Attention                                    (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module
        │           └─ Linear(64->192)         12,480  ~  25% |  (1, 10, 64)    -> (1, 10, 192)    <  blocks[0].att.module.c_attn
        │           └─ Linear(64->64)           4,160  ~   8% |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module.c_proj
        │           └─ Dropout(0)                                (1, 4, 10, 10) -> (1, 4, 10, 10)  <  blocks[0].att.module.att_dropout
        │           └─ Dropout(0)                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module.res_dropout
        │        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.norm
        │     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp
        │        └─ MLP                                          (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.module
        │           └─ Sequential                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.module.layers
        │              └─ Linear(64->256)      16,640  ~  33% |  (1, 10, 64)    -> (1, 10, 256)    <  blocks[0].mlp.module.layers[0]
        │              └─ GELU                                   (1, 10, 256)   -> (1, 10, 256)    <  blocks[0].mlp.module.layers[1]
        │              └─ Dropout(0)                             (1, 10, 256)   -> (1, 10, 256)    <  blocks[0].mlp.module.layers[2]
        │              └─ Linear(256->64)      16,448  ~  33% |  (1, 10, 256)   -> (1, 10, 64)     <  blocks[0].mlp.module.layers[3]
        │        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.norm
        =============================================
        trainable:                             50,115
        ```
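        
        The parameter counts in this table follow from the layer shapes and can be re-derived in a few lines. The calculation below is a worked check, not part of `ModelState`. One assumption is labeled in the code: the 3 parameters not attributed to `Linear` or `LayerNorm` layers are taken to be one scalar `gamma` per `Residual` wrapper.
        
        ```python
        def linear_params(n_in, n_out):
            """Parameters of a fully connected layer: weight matrix plus bias."""
            return n_in * n_out + n_out
        
        linears = {
            "att.c_attn Linear(64->192)": linear_params(64, 192),
            "att.c_proj Linear(64->64)":  linear_params(64, 64),
            "mlp[0]     Linear(64->256)": linear_params(64, 256),
            "mlp[3]     Linear(256->64)": linear_params(256, 64),
        }
        layer_norms = 3 * (64 + 64)   # three LayerNorm(64): weight and bias each
        gammas = 3                    # assumption: one scalar gamma per Residual
        
        total = sum(linears.values()) + layer_norms + gammas
        for name, n in linears.items():
            print(f"{name}: {n:>6,}  ~ {100 * n / total:.0f}%")
        print(f"trainable: {total:,}")
        ```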
        
        During training, `ModelState` tracks gradients and smooths the values:
        ```
         #                                           params          |mean|  [     min,      max ]  |grad|   shape
        -------------------------------------------------------------------------------------
          0: blocks.0.fft.gamma                            1           0.200  [   0.200,    0.200]   1.3e+02  ()
          1: blocks.0.fft.norm.weight                     64           1.000  [   1.000,    1.000]   4.7e-01  (64,)
          2: blocks.0.fft.norm.bias                       64           0.000  [   0.000,    0.000]   2.2e-01  (64,)
          ...
        ```
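        
        A table of this kind can be approximated with plain PyTorch by iterating over `named_parameters()` after a backward pass. The sketch below is not `ModelState` itself and does no smoothing; the toy model is arbitrary, and using the gradient norm for the `|grad|` column is an assumption.
        
        ```python
        import torch
        import torch.nn as nn
        
        torch.manual_seed(0)
        model = nn.Sequential(nn.Linear(4, 8), nn.GELU(), nn.Linear(8, 1))
        loss = model(torch.randn(16, 4)).pow(2).mean()
        loss.backward()                                  # fills p.grad for every parameter
        
        rows = []
        for i, (name, p) in enumerate(model.named_parameters()):
            data = p.detach()
            row = (name, p.numel(), data.abs().mean().item(),
                   data.min().item(), data.max().item(),
                   p.grad.norm().item(),                 # assumption: |grad| is the gradient norm
                   tuple(p.shape))
            rows.append(row)
            name, n, mean, mn, mx, g, shape = row
            print(f"{i:3d}: {name:<12} {n:>6d}  {mean:9.3f}  [{mn:8.3f}, {mx:8.3f}]  {g:9.1e}  {shape}")
        ```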
        
        
        <hr>
        
        ## Modules
        
        The library has many ready-made modules for building various architectures of neural networks:
        
        * MLP
        * Transformer
        * CNN
        * ResCNN
        * ProjViT
        * ResCNN3D
        * GNN
        
        Most modules have debugging and visualization tools.
        For example, this is what the visualization of the training process of a 10-block transformer looks like:
        
        <center>
        <img src="doc/img/transf.png" style="width:800px;">
        </center>
        
        
        Such diagrams help you analyze problem areas of the network and adjust them during training.
        <hr>
        
        
        ## Docs
        
        <hr>
        
        ## Examples
        
        * <a href="https://colab.research.google.com/drive/179sHb3WyHNrSJKGLfKrXaAzvShmS1SSf?usp=sharing">Interpolation_F(x)</a> - interpolation of a function of one variable (example of setting up a training plot; working with the list of schedulers; adding a custom plot)
        * <a href="https://colab.research.google.com/drive/1N4b6mwUvH-o-t6VIiuhq7FMuGRabdOm0?usp=sharing">MNIST</a> - recognition of handwritten digits 0-9 (example using pytorch DataLoader, model predict, show errors, confusion matrix)
        * <a href="https://colab.research.google.com/drive/1ThxnMrAjuFTGKXLI-93oRa9doNpP32y4?usp=sharing">CIFAR10</a> - image classification (truncated EfficientNet, pre-trained parameters, backbone freezing, augmentation)
        * Vanishing gradient
        * Regression_1D - visualization of changes in model parameters
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
