Metadata-Version: 2.4
Name: bodyig-cpm
Version: 0.0.2
Summary: A character-pair merging module for Classical Tibetan.
Author-email: Elya Brown <elyavyale@gmail.com>
Project-URL: Homepage, https://github.com/merzsielen/Bodyig-CPM
Keywords: Classical Tibetan,character-pairs
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: bumpver; extra == "dev"
Requires-Dist: pip-tools; extra == "dev"
Dynamic: license-file

# Bodyig-CPM
 A module containing a character-pair merging algorithm tailored to the orthographic nuances of Classical Tibetan. Tibetan is written without spaces between words and is instead segmented into syllables using punctuation particular to the script. Bodyig-CPM makes it easier for those working with the language to perform character-pair merging that remains sensitive to sentence, word, and syllable boundaries.
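
 To make the segmentation concrete, here is a minimal sketch of splitting Tibetan text on its own punctuation. It is not Bodyig-CPM's API: the function name `segment` is hypothetical, and the choice of the tsheg (U+0F0B) as the syllable delimiter and of the shad (U+0F0D) or a newline as the sentence delimiter is an assumption made for illustration. Word boundaries are not marked by this punctuation, so the sketch leaves them out.

```python
import re

TSHEG = "\u0f0b"  # syllable delimiter (assumed)
SHAD = "\u0f0d"   # clause/sentence delimiter (assumed)


def segment(text):
    """Split text into sentences, each a list of syllables.

    Hypothetical helper for illustration; not part of Bodyig-CPM.
    """
    sentences = []
    # Treat runs of shad and/or newlines as sentence boundaries.
    for chunk in re.split(f"[{SHAD}\n]+", text):
        # Split on tsheg and drop empty or whitespace-only pieces.
        syllables = [s.strip() for s in chunk.split(TSHEG) if s.strip()]
        if syllables:
            sentences.append(syllables)
    return sentences


if __name__ == "__main__":
    for sentence in segment("བཀྲ་ཤིས་བདེ་ལེགས།"):
        print(sentence)
```

 Keeping each sentence as its own list of syllables means later steps can count character pairs inside a syllable without ever looking across a boundary.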

 Given a plaintext corpus (optionally with each sentence on its own line) and the desired number of merge iterations, Bodyig-CPM returns a set of merge rules and the corpus with these rules applied.
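
 The input/output contract described above can be sketched as a generic pair-merging loop in the style of byte-pair encoding. This is an illustration under stated assumptions, not the module's actual implementation or API: the names `learn_merges` and `apply_merge` are hypothetical, a "character" is taken to be a single Unicode code point, and merges are restricted to pairs inside one syllable.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace each adjacent occurrence of `pair` in `tokens` with one token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_merges(syllables, num_merges):
    """Learn up to `num_merges` rules from a list of syllable strings.

    Hypothetical sketch: returns (rules, units), where `rules` is a list of
    merged pairs in the order learned and `units` holds the tokens of each
    syllable after all rules are applied.
    """
    # Start from one token per code point; each syllable is its own unit,
    # so no pair is ever counted across a syllable boundary.
    units = [list(s) for s in syllables]
    rules = []
    for _ in range(num_merges):
        counts = Counter()
        for tokens in units:
            counts.update(zip(tokens, tokens[1:]))
        if not counts:
            break  # nothing left to merge
        # Most frequent pair; ties go to the pair seen first.
        best = max(counts, key=counts.get)
        rules.append(best)
        units = [apply_merge(tokens, best) for tokens in units]
    return rules, units


if __name__ == "__main__":
    rules, units = learn_merges(["བདེ", "བདེ", "བདག"], num_merges=2)
    print(rules)
    print(units)
```

 Because every syllable is processed as a separate unit, joining the tokens of a unit always reproduces the original syllable, whatever rules were learned.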

 (TBD: Insert example sentences.)
