Metadata-Version: 2.0
Name: PYEVALB
Version: 0.1.3
Summary: Scoring tools for bracket tree banks.
Home-page: https://github.com/flyaway1217/PYEVALB
Author: Flyaway
Author-email: flyaway1217@gmail.com
License: GNU
Keywords: score bracket tree banks
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Requires-Dist: pytablewriter (>=0.10.2)

PYEVALB
=========

PYEVALB is a Python version of Evalb_, a tool for scoring bracketed treebanks.

Installation
=============
::

    pip install PYEVALB

Examples
=========

Score two corpora
-----------------

.. code:: python

    from PYEVALB import scorer

    gold_path = 'gold_corpus.txt'
    test_path = 'test_corpus.txt'
    result_path = 'result.txt'

    scorer.evalb(gold_path, test_path, result_path)

The result written to ``result_path`` looks like this:

.. code::

     ID | length | state | recall | prec | matched_brackets | gold_brackets | test_brackets | cross_brackets | words | correct_tags | tag_accracy 
    ---:|-------:|------:|-------:|-----:|-----------------:|--------------:|--------------:|---------------:|------:|-------------:|------------:
       0|      44|      0|    0.57|  0.61|                31|             54|             51|              16|     44|            43|         0.98
       1|      13|      0|    0.64|  0.60|                 9|             14|             15|               3|     13|            12|         0.92
       2|      29|      0|    0.97|  0.97|                29|             30|             30|               0|     29|            29|         1.00
       3|      20|      0|    0.80|  0.80|                20|             25|             25|               4|     20|            20|         1.00
       4|      19|      0|    0.91|  1.00|                21|             23|             21|               0|     19|            19|         1.00
       5|      71|      0|    0.67|  0.68|                52|             78|             77|              15|     71|            65|         0.92
       6|      16|      0|    0.61|  0.69|                11|             18|             16|               0|     16|            14|         0.88
       7|      27|      0|    0.92|  0.96|                24|             26|             25|               0|     27|            26|         0.96
       8|      19|      0|    1.00|  1.00|                20|             20|             20|               0|     19|            19|         1.00
       9|      41|      0|    0.80|  0.78|                32|             40|             41|               5|     41|            39|         0.95

    =================================================================================================================================================
    Number of sentence:	10.00
    Number of Error sentence:	0.00
    Number of Skip  sentence:	0.00
    Number of Valid sentence:	10.00
    Bracketing Recall:	75.91
    Bracketing Precision:	77.57
    Bracketing FMeasure:	76.73
    Complete match:	10.00
    Average crossing:	4.30
    No crossing:	50.00
    Tagging accuracy:	95.65
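
The summary figures can be recomputed from the per-sentence columns. The following is a minimal sketch, not PYEVALB's own code: it assumes the corpus-level scores are micro-averaged (bracket counts summed over all sentences before dividing), which is consistent with the numbers shown above. The variable names are illustrative.

.. code:: python

    # Per-sentence counts copied from the table above:
    # (matched_brackets, gold_brackets, test_brackets)
    counts = [
        (31, 54, 51), (9, 14, 15), (29, 30, 30), (20, 25, 25), (21, 23, 21),
        (52, 78, 77), (11, 18, 16), (24, 26, 25), (20, 20, 20), (32, 40, 41),
    ]

    # Sum each column over the whole corpus (micro-averaging, an assumption).
    matched = sum(m for m, _, _ in counts)
    gold = sum(g for _, g, _ in counts)
    test = sum(t for _, _, t in counts)

    recall = 100.0 * matched / gold        # matched / gold brackets
    precision = 100.0 * matched / test     # matched / test brackets
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)

    print('Bracketing Recall:    %.2f' % recall)
    print('Bracketing Precision: %.2f' % precision)
    print('Bracketing FMeasure:  %.2f' % f_measure)

With the counts above, this gives 75.91, 77.57 and 76.73, matching the summary.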

Score two trees
---------------

.. code:: python

    from PYEVALB import scorer
    from PYEVALB import parser

    gold = '(IP (NP (PN 这里)) (VP (ADVP (AD 便)) (VP (VV 产生) (IP (NP (QP (CD 一) (CLP (M 个))) (DNP (NP (JJ 结构性)) (DEG 的)) (NP (NN 盲点))) (PU ：) (IP (VP (VV 臭味相投) (PU ，) (VV 物以类聚)))))) (PU 。))'

    test = '(IP (IP (NP (PN 这里)) (VP (ADVP (AD 便)) (VP (VV 产生) (NP (QP (CD 一) (CLP (M 个))) (DNP (ADJP (JJ 结构性)) (DEG 的)) (NP (NN 盲点)))))) (PU ：) (IP (NP (NN 臭味相投)) (PU ，) (VP (VV 物以类聚))) (PU 。))'

    gold_tree = parser.create_from_bracket_string(gold)
    test_tree = parser.create_from_bracket_string(test)

    result = scorer.score_trees(gold_tree, test_tree)

    print('Recall = ' + str(result.recall))
    print('Precision = ' + str(result.prec))

The output is:

.. code::

    Recall = 64.29
    Precision = 56.25
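
To illustrate what bracket recall and precision measure, here is a simplified, self-contained sketch that does not use PYEVALB. It assumes a bracket is a ``(label, start, end)`` span over word positions, ignores part-of-speech (preterminal) nodes, and applies none of the extra rules a real scorer may use (punctuation handling, label equivalences, length limits), so its numbers need not agree with PYEVALB's. The function name ``labeled_brackets`` and the toy trees are hypothetical.

.. code:: python

    from collections import Counter

    def labeled_brackets(bracket_string):
        """Count (label, start, end) spans of all non-preterminal nodes."""
        tokens = bracket_string.replace('(', ' ( ').replace(')', ' ) ').split()
        spans = Counter()
        stack = []          # open nodes: [label, start index, has child node]
        n_words = 0
        i = 0
        while i < len(tokens):
            if tokens[i] == '(':
                # The token after '(' is the node label.
                stack.append([tokens[i + 1], n_words, False])
                i += 2
            elif tokens[i] == ')':
                label, start, has_child = stack.pop()
                if has_child:               # skip preterminals such as (NN cat)
                    spans[(label, start, n_words)] += 1
                if stack:
                    stack[-1][2] = True     # parent now has a child node
                i += 1
            else:
                n_words += 1                # a word (leaf)
                i += 1
        return spans

    gold = '(S (NP (DT the) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))'
    test = '(S (NP (DT the) (NN cat)) (VP (VBD sat) (IN on) (NP (DT the) (NN mat))))'

    gold_spans = labeled_brackets(gold)
    test_spans = labeled_brackets(test)
    # Multiset intersection: a bracket matches if label and span both agree.
    matched = sum((gold_spans & test_spans).values())

    recall = matched / sum(gold_spans.values())
    precision = matched / sum(test_spans.values())
    print('Recall = %.2f, Precision = %.2f' % (recall, precision))

In this toy pair, the test tree is missing the ``PP`` bracket, so 4 of the 5 gold brackets are matched (recall 0.80), while all 4 test brackets are correct (precision 1.00).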


TODO
====

1. Remove the dependency on pytablewriter_.
2. Add more configuration options, such as a limit on sentence length.

.. _Evalb: http://nlp.cs.nyu.edu/evalb/
.. _pytablewriter: https://github.com/thombashi/pytablewriter/blob/master/README.rst



