Metadata-Version: 2.1
Name: cantoseg
Version: 0.0.1
Summary: Cantonese segmentation tool 粵語分詞工具
Home-page: https://github.com/ayaka14732/cantoseg
Author: ayaka14732
Author-email: ayaka@mail.shn.hk
License: MIT
Project-URL: Bug Reports, https://github.com/ayaka14732/cantoseg/issues
Project-URL: Source, https://github.com/ayaka14732/cantoseg
Keywords: cantonese chinese natural-language-processing
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Natural Language :: Cantonese
Classifier: Natural Language :: Chinese (Traditional)
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.5, <4
Description-Content-Type: text/markdown
Requires-Dist: jieba

# cantoseg ![Python package](https://github.com/ayaka14732/cantoseg/workflows/Python%20package/badge.svg)

A Cantonese word segmentation tool (粵語分詞工具).

## Install

```sh
$ pip install cantoseg
```

## Usage

```python
>>> import cantoseg
>>> cantoseg.cut('香港喺舊石器時代就有人住')
['香港', '喺', '舊石器時代', '就', '有人', '住']
```

A generator version is also available: `cantoseg.lcut`.
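A common follow-up step is writing segmented text out with spaces between tokens. The sketch below shows one way to do that. `segment_lines` is a hypothetical helper, not part of cantoseg; it takes the segmentation function as a parameter and only assumes that the function returns an iterable of strings, so it should work with either a list or a generator.

```python
from typing import Callable, Iterable, Iterator


def segment_lines(
    lines: Iterable[str],
    cut: Callable[[str], Iterable[str]],
) -> Iterator[str]:
    """Yield each non-blank line with its tokens joined by single spaces.

    `cut` is any function that maps a string to an iterable of tokens
    (assumed here to be something like `cantoseg.cut`).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank or whitespace-only lines
        yield ' '.join(cut(line))
```

With the package installed, `segment_lines(open('input.txt', encoding='utf-8'), cantoseg.cut)` would yield one space-separated line per non-blank input line; `input.txt` is a placeholder file name.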

## Design

See the article [_Cantonese Segmentation and Part-Of-Speech Tagging_](https://ayaka.shn.hk/yueseg/hant/) (in Chinese).


