Metadata-Version: 2.4
Name: btok
Version: 0.2
Summary: A Python tokenizer trained on a modern web corpus
Author: Hansimov
Project-URL: Homepage, https://github.com/Hansimov/btok
Project-URL: Issues, https://github.com/Hansimov/btok/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: tclogger
Requires-Dist: sentencepiece
Dynamic: license-file

# BTok

A multilingual tokenizer for Python, trained on a modern web corpus with [SentencePiece](https://github.com/google/sentencepiece).

![PyPI version](https://img.shields.io/pypi/v/btok?label=btok&color=blue&cacheSeconds=60)

## Install

```sh
pip install btok --upgrade
```
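
To confirm that the install succeeded, the installed version can be read back from the package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if btok is installed, otherwise None.
print(installed_version("btok"))
```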

## Usage

Run the tests from the repository root:

```sh
python tests.py
```

For usage examples, see [tests.py](https://github.com/Hansimov/btok/blob/main/tests.py).
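
Since the tokenizer is trained with SentencePiece, the sketch below shows how a SentencePiece model file is typically loaded and applied with the `sentencepiece` package directly. It does not show BTok's own API, which may wrap this differently. The function name `tokenize_with_sentencepiece` and the model path are hypothetical and used only for illustration.

```python
from pathlib import Path


def tokenize_with_sentencepiece(model_path, text):
    """Split `text` into subword pieces using a SentencePiece model file.

    `model_path` is a hypothetical location; this README does not say
    where, or whether, BTok exposes its trained model as a file.
    """
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"SentencePiece model not found: {model_path}")

    # Imported here so the path check above runs before the dependency is loaded.
    import sentencepiece as spm

    processor = spm.SentencePieceProcessor(model_file=str(model_path))
    # out_type=str returns subword strings instead of integer ids.
    return processor.encode(text, out_type=str)
```

Calling `tokenize_with_sentencepiece("tokenizer.model", "Hello, world!")` with a real model file in place of the placeholder path returns a list of subword strings. The exact pieces depend on the trained model.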
