Metadata-Version: 2.1
Name: Wolaita_POST
Version: 1.1.3
Summary: A POS tagger for the Wolaita language using deep learning
Home-page: https://github.com/Sisagegn/Wolaita_POST
Author: Sisagegn Samuel
Author-email: samuelsisagegn@gmail.com
Project-URL: Documentation, https://github.com/Sisagegn/Wolaita_POST/wiki
Project-URL: Source, https://github.com/Sisagegn/Wolaita_POST
Project-URL: Tracker, https://github.com/Sisagegn/Wolaita_POST/issues
Keywords: Wolaita POS tagging NLP deep learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: tensorflow>=2.0.0
Requires-Dist: numpy>=1.18.0
Requires-Dist: nltk>=3.5
Requires-Dist: fasttext>=0.9.2
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: sphinx>=4.0; extra == "dev"
Provides-Extra: gpu
Requires-Dist: tensorflow-gpu>=2.0.0; extra == "gpu"

# Wolaita_POST

## Overview

Wolaita_POST is a Python framework for Part-of-Speech (POS) tagging of Wolaita text. It combines deep learning sequence models, such as Bi-GRU and Bi-LSTM, with FastText embeddings, and it loads pretrained models so that no training is needed before tagging. It is intended for researchers and developers working on Natural Language Processing (NLP) for lesser-resourced languages.

## Features
- Accurate POS Tagging: Utilizes deep learning models (Bi-GRU, Bi-LSTM, etc.) to achieve precise Part-of-Speech tagging for Wolaita language text.
- Pretrained Models: Ready-to-use pretrained models for quick deployment and high accuracy.
- FastText Embeddings: Incorporates FastText word embeddings to capture subword information and improve performance on low-resource languages.
- Easy Integration: Simple API that allows researchers and developers to integrate POS tagging into their NLP pipelines.
- Supports Wolaita Language: Specifically designed for the Wolaita language, addressing the challenges of processing lesser-resourced languages.
- Customizable: Flexible configuration to accommodate different models, tokenizers, and word vectors based on project requirements.
- Downstream Use: The tagger's output can feed other NLP applications, such as machine translation and named entity recognition (NER).
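
To illustrate the subword idea behind the FastText feature above: FastText represents a word as a bag of character n-grams taken from the word wrapped in `<` and `>` boundary markers, with n from 3 to 6 by default. The sketch below only enumerates those n-grams. `char_ngrams` is a hypothetical helper written for this example, not part of Wolaita_POST or the fasttext library, and it leaves out the hashing that FastText applies to n-grams internally.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    ngrams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


# Only 3-grams, to keep the output short.
print(char_ngrams("wolaita", min_n=3, max_n=3))
# ['<wo', 'wol', 'ola', 'lai', 'ait', 'ita', 'ta>']
```

Because unseen words still share n-grams with known words, they can receive a meaningful vector, which is useful when training data is limited.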

## Installation
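To install Wolaita_POST with pip:

    pip install Wolaita_POST

In a Jupyter or Colab notebook, prefix the command with `!`. The package also declares two optional extras, `dev` (pytest, sphinx) and `gpu` (tensorflow-gpu), which can be requested as `Wolaita_POST[dev]` or `Wolaita_POST[gpu]`.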
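
After installing, one way to confirm what pip actually installed is to query the installed distributions with the standard-library `importlib.metadata` module (Python 3.8 or newer). This is a minimal sketch: `installed_version` is a hypothetical helper, and the names checked are the package itself plus the requirements it declares.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# The package and its declared runtime requirements.
for name in ("Wolaita_POST", "tensorflow", "numpy", "nltk", "fasttext"):
    print(name, installed_version(name) or "not installed")
```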

## Usage

After installation, use Wolaita_POST as follows:

1. Import the package:

       from Wolaita_POST import WolaitaPOSTagger

2. Set file paths for your pretrained model, word vectors, and tokenizers:

       import os

       base_dir = "/content/drive/MyDrive"  # Replace with the actual path

       # Paths to the pretrained files, relative to base_dir
       model_path = os.path.join(base_dir, "last_model/last_model/Bi_GRU_model.keras")
       fasttext_model_path = os.path.join(base_dir, "FastText_and_embedding_matrix/fasttext_model.bin")
       word_tokenizer_path = os.path.join(base_dir, "POS/word_tokenizer.pkl")
       tag_tokenizer_path = os.path.join(base_dir, "POS/tag_tokenizer.pkl")


3. Initialize the POS tagger:

       pos_tagger = WolaitaPOSTagger(
           model_path=model_path,
           word_vector_path=fasttext_model_path,
           word_tokenizer_path=word_tokenizer_path,
           tag_tokenizer_path=tag_tokenizer_path,
       )

4. Use the POS tagger to tag Wolaita text:

       text = ['Insert your sample text here']
       tagged_text = pos_tagger.tag(text)
       print(tagged_text)

`tagged_text` contains the part-of-speech tags for the given Wolaita text.
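
Because the tagger depends on four separate files, it can help to confirm that they all exist before constructing `WolaitaPOSTagger`. The following is a minimal sketch under stated assumptions: `find_missing_files` is a hypothetical helper that is not part of Wolaita_POST, and the paths are the example Google Drive paths from step 2, which you would replace with your own.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the entries of `paths` (name -> path) that are not existing files."""
    return {
        name: str(path)
        for name, path in paths.items()
        if not Path(path).is_file()
    }


# Example paths copied from step 2; replace base_dir with your own location.
base_dir = Path("/content/drive/MyDrive")
paths = {
    "model_path": base_dir / "last_model/last_model/Bi_GRU_model.keras",
    "word_vector_path": base_dir / "FastText_and_embedding_matrix/fasttext_model.bin",
    "word_tokenizer_path": base_dir / "POS/word_tokenizer.pkl",
    "tag_tokenizer_path": base_dir / "POS/tag_tokenizer.pkl",
}

missing = find_missing_files(paths)
if missing:
    for name, path in missing.items():
        print(f"Missing {name}: {path}")
else:
    print("All files found.")
```

Checking up front gives one clear message per missing file instead of a loader error raised partway through initialization.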

## Running Tests

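To verify functionality, run the test suite with pytest, which is included in the `dev` extra. Point pytest at the project's `tests` directory. The path below is an example from a Google Drive setup, so replace it with your own:

    pytest /content/drive/MyDrive/Wolaita_POST/tests > test_report.txt

The results are written to `test_report.txt`. In a Jupyter or Colab notebook, prefix the command with `!`.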

## License

This project is licensed under the MIT License. See the LICENSE file for more details.

## Contributing

Contributions are welcome! If you have suggestions for improving the package or find any issues, feel free to open a pull request or submit an issue on GitHub.

## Acknowledgements

Special thanks to the developers and researchers who contributed to this project, making it possible to expand NLP resources for the Wolaita language.

