Metadata-Version: 2.1
Name: allennlp-shiba
Version: 0.0.1
Summary: AllenNLP integration for Shiba: Japanese CANINE model
Home-page: https://github.com/shunk031/allennlp-shiba-model
License: Apache-2.0
Keywords: natural language processing,deep learning,transformers,allennlp
Author: Shunsuke KITADA
Author-email: shunsuke.kitada.0831@gmail.com
Requires-Python: >=3.7,<4.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: allennlp (>=2.5.0,<3.0.0)
Requires-Dist: shiba-model (>=0.1.0,<0.2.0)
Project-URL: Repository, https://github.com/shunk031/allennlp-shiba-model
Description-Content-Type: text/markdown

# AllenNLP Integration for [Shiba](https://github.com/octanove/shiba)

[![CI](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/ci.yml/badge.svg)](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/ci.yml)

`allennlp-shiba-model` is a Python library that provides AllenNLP integration for [shiba-model](https://pypi.org/project/shiba-model/).

> SHIBA is an approximate reimplementation of CANINE [[1]](https://github.com/octanove/shiba#1) in raw Pytorch, pretrained on the Japanese wikipedia corpus using random span masking. If you are unfamiliar with CANINE, you can think of it as a very efficient (approximately 4x as efficient) character-level BERT model. Of course, the name SHIBA comes from the identically named Japanese canine.

## Example

This library lets users specify the Shiba tokenizer, token indexer, and token embedder in a Jsonnet config file. Here is an example of the relevant parts of a config file:

```jsonnet
{
    "dataset_reader": {
        "tokenizer": {
            "type": "shiba",
        },
        "token_indexers": {
            "tokens": {
                "type": "shiba",
            }
        },
    },
    "model": {
        "shiba_embedder": {
            "type": "basic",
            "token_embedders": {
                "tokens": {
                    "type": "shiba",
                    "eval_model": true,
                }
            }
        }
    }
}
```
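
The same fragment can also be generated from Python, which may be convenient when sweeping over settings. The following is a minimal sketch using only the standard library: the helper name `build_shiba_config` and its `indexer_name` parameter are hypothetical and not part of this library, and the `"shiba"` type names and the `eval_model` option are copied from the config above. Because JSON is a subset of Jsonnet, the resulting text can be saved as a `.jsonnet` file.

```python
import json


def build_shiba_config(indexer_name: str = "tokens", eval_model: bool = True) -> dict:
    """Build the Shiba-related config fragment as a plain dict (hypothetical helper)."""
    return {
        "dataset_reader": {
            "tokenizer": {"type": "shiba"},
            # The indexer is registered under `indexer_name`.
            "token_indexers": {indexer_name: {"type": "shiba"}},
        },
        "model": {
            "shiba_embedder": {
                "type": "basic",
                # AllenNLP's basic text field embedder expects the embedder
                # keys to match the token indexer keys, so the same name is
                # reused here.
                "token_embedders": {
                    indexer_name: {"type": "shiba", "eval_model": eval_model},
                },
            },
        },
    }


config = build_shiba_config()
# Plain JSON output is also valid Jsonnet.
config_text = json.dumps(config, indent=4)
```

Note that this fragment is not a complete training config: a full AllenNLP config would also need a `type` for the dataset reader and the model, data paths, and a trainer section, none of which are specified here.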


## Reference

- Joshua Tanner and Masato Hagiwara (2021). [SHIBA: Japanese CANINE model](https://github.com/octanove/shiba). GitHub repository, GitHub.


