Metadata-Version: 2.1
Name: Anuvaad-Tokenizer
Version: 0.0.2
Summary: Tokenizer by Anuvaad 
Home-page: UNKNOWN
Author: Anuvaad
Author-email: nlp-nmt@tarento.com
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
Requires-Dist: nltk

# Anuvaad Tokenizer

Anuvaad Tokenizer is a Python package that splits paragraphs into sentences. It supports English and most Indian languages. The tokenizer is built on regular expressions.
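
To give a feel for what regex-based sentence splitting looks like, here is a minimal sketch. It is not the package's actual implementation: the pattern and the `split_sentences` helper are assumptions made for illustration, and they only treat `.`, `!`, `?`, and the Devanagari danda (`।`) as sentence endings.

```python
import re

# Hypothetical pattern, not the package's real rules: a sentence ends at
# '.', '!', '?' or the Devanagari danda when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")


def split_sentences(paragraph):
    """Split a paragraph into sentences using a naive regular expression."""
    parts = _SENTENCE_END.split(paragraph.strip())
    # Drop empty strings, which appear when the input is blank.
    return [part for part in parts if part]


print(split_sentences("Hello world. How are you? Fine!"))
```

A pattern this simple also splits after abbreviations such as "Dr.", which is one reason a practical tokenizer needs more rules than this sketch shows.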

## Prerequisites

- Python >= 3.6

## Installation
```
pip install Anuvaad_Tokenizer==0.0.2
```

## Author

Anuvaad (nlp-nmt@tarento.com)

# Usage Example

## For English
```
from Anuvaad_Tokenizer.AnuvaadEnTokenizer import AnuvaadEnTokenizer

para = " "  # placeholder; replace with the English paragraph to tokenize
tokenized_text = AnuvaadEnTokenizer().tokenize(para)
```
## For Hindi
```
from Anuvaad_Tokenizer.AnuvaadHiTokenizer import AnuvaadHiTokenizer

para = " "  # placeholder; replace with the Hindi paragraph to tokenize
tokenized_text = AnuvaadHiTokenizer().tokenize(para)
```
## For Kannada
```
from Anuvaad_Tokenizer.AnuvaadKnTokenizer import AnuvaadKnTokenizer

para = " "  # placeholder; replace with the Kannada paragraph to tokenize
tokenized_text = AnuvaadKnTokenizer().tokenize(para)
```
## For Telugu
```
from Anuvaad_Tokenizer.AnuvaadTeTokenizer import AnuvaadTeTokenizer

para = " "  # placeholder; replace with the Telugu paragraph to tokenize
tokenized_text = AnuvaadTeTokenizer().tokenize(para)
```
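## Choosing a Tokenizer at Runtime

The four examples above follow the same pattern: the module and the class share a name of the form `Anuvaad<Lang>Tokenizer`. When the language is only known at runtime, that pattern can be wrapped in a small helper. The sketch below is an illustration, not part of the package: the language codes (`"en"`, `"hi"`, `"kn"`, `"te"`) and the functions `tokenizer_class_name` and `tokenize_paragraph` are hypothetical names chosen here.

```python
import importlib

# Language code -> class name, following the naming used in the examples.
# The codes themselves are this sketch's own convention, not a package API.
_TOKENIZER_CLASSES = {
    "en": "AnuvaadEnTokenizer",
    "hi": "AnuvaadHiTokenizer",
    "kn": "AnuvaadKnTokenizer",
    "te": "AnuvaadTeTokenizer",
}


def tokenizer_class_name(lang):
    """Return the tokenizer class name for a language code."""
    try:
        return _TOKENIZER_CLASSES[lang]
    except KeyError:
        raise ValueError(f"unsupported language code: {lang!r}") from None


def tokenize_paragraph(para, lang):
    """Import the matching tokenizer on demand and tokenize the paragraph."""
    name = tokenizer_class_name(lang)
    # Each module is named after the class it contains.
    module = importlib.import_module(f"Anuvaad_Tokenizer.{name}")
    return getattr(module, name)().tokenize(para)
```

With the package installed, `tokenize_paragraph(para, "hi")` would behave like the Hindi example above. Importing on demand means only the tokenizer that is actually requested gets loaded.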
## LICENSE

MIT License 2021

Developer - Anuvaad

