Metadata-Version: 2.1
Name: asrp
Version: 0.0.9
Summary: UNKNOWN
Home-page: https://github.com/voidful/asrp
Author: Voidful
Author-email: voidful.stack@gmail.com
License: Apache
Keywords: asr
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: Unidecode
Requires-Dist: jiwer
Requires-Dist: transformers
Requires-Dist: soundfile
Requires-Dist: editdistance

# asrp

ASR text preprocessing utility

## install

`pip install asrp`

## usage - preprocess

Input: a dictionary with the key `sentence`.  
Output: the preprocessed result; the dictionary is modified in place.

```python
import asrp

batch_data = {
    'sentence': "I'm fine, thanks."
}
asrp.fun_en(batch_data)
```

Dynamic loading: look up the preprocessing function by name with `getattr`.

```python
import asrp

batch_data = {
    'sentence': "I'm fine, thanks."
}
preprocessor = getattr(asrp, 'fun_en')
preprocessor(batch_data)
```
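
The lookup above can be wrapped in a small helper that fails with a clear message when a preprocessor does not exist. This is only a sketch: `get_preprocessor`, `demo_module`, and `_demo_fun_en` are hypothetical names, and the `fun_<lang>` naming pattern is an assumption extrapolated from `fun_en`. The stand-in module exists only so the snippet runs without `asrp` installed; it does not reproduce what `asrp.fun_en` actually does.

```python
from types import SimpleNamespace
from typing import Callable


def get_preprocessor(module, lang: str) -> Callable[[dict], object]:
    """Return the attribute `fun_<lang>` of `module`, or raise ValueError.

    The `fun_<lang>` pattern is an assumption based on `fun_en`.
    """
    name = f"fun_{lang}"
    func = getattr(module, name, None)
    if not callable(func):
        raise ValueError(f"no preprocessor named {name!r} found")
    return func


def _demo_fun_en(batch: dict) -> dict:
    # Hypothetical stand-in, NOT asrp's real logic: it only strips whitespace.
    batch["sentence"] = batch["sentence"].strip()
    return batch


# Stand-in namespace used in place of the real `asrp` module.
demo_module = SimpleNamespace(fun_en=_demo_fun_en)

batch_data = {"sentence": "  I'm fine, thanks.  "}
get_preprocessor(demo_module, "en")(batch_data)
print(batch_data["sentence"])
```

With `asrp` installed, `get_preprocessor(asrp, 'en')` resolves to the same object as `getattr(asrp, 'fun_en')`.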

## usage - evaluation

```python
import asrp

targets = ["HuggingFace is great!", "Love Transformers!", "Let's wav2vec!"]
preds = ["HuggingFace is awesome!", "Transformers is powerful.", "Let's finetune wav2vec!"]
# The functions return a ratio; multiply by 100 to report a percentage.
print("chunk size WER: {:.2f}".format(100 * asrp.chunked_wer(targets, preds, chunk_size=None)))
print("chunk size CER: {:.2f}".format(100 * asrp.chunked_cer(targets, preds, chunk_size=None)))
```
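
For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of a corpus-level word error rate accumulated chunk by chunk. It is not the implementation used by `asrp` (which depends on `jiwer` and `editdistance`). The names `edit_distance` and `sketch_chunked_wer` are hypothetical, and treating `chunk_size=None` as "one chunk holding everything" is an assumption.

```python
from typing import List, Optional, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sketch_chunked_wer(targets: List[str], preds: List[str],
                       chunk_size: Optional[int] = None) -> float:
    """Total word edits divided by total reference words, summed per chunk."""
    if len(targets) != len(preds):
        raise ValueError("targets and preds must have the same length")
    if chunk_size is None:
        # Assumption: None means a single chunk covering all sentences.
        chunk_size = max(len(targets), 1)
    errors = 0
    ref_words = 0
    for start in range(0, len(targets), chunk_size):
        chunk_t = targets[start:start + chunk_size]
        chunk_p = preds[start:start + chunk_size]
        for target, pred in zip(chunk_t, chunk_p):
            ref, hyp = target.split(), pred.split()
            errors += edit_distance(ref, hyp)
            ref_words += len(ref)
    if ref_words == 0:
        raise ValueError("reference text contains no words")
    return errors / ref_words


print("{:.2f}".format(100 * sketch_chunked_wer(["a b c"], ["a x c"])))
```

Because edit counts and reference lengths are summed before dividing, the result does not depend on the chunk size; chunking only bounds how much data is processed at once. A character error rate follows the same pattern with `list(target)` in place of `target.split()`.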

## usage - hubertcode

```python
import asrp

hc = asrp.HubertCode("facebook/hubert-large-ll60k", './km_feat_100_layer_20', 20)
hc('voice file path')  # replace the placeholder with the path to an audio file
```

