Metadata-Version: 2.1
Name: NERDA
Version: 0.0.29
Summary: A Framework for Finetuning Transformers for Named Entity Recognition
Home-page: https://github.com/ebanalyse/NERDA
Author: PIN
Author-email: lars.kjeldgaard@eb.dk
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: torch
Requires-Dist: transformers (==3.5.1)
Requires-Dist: sklearn
Requires-Dist: nltk
Requires-Dist: pandas
Requires-Dist: pyconll

# NERDA - UNDER CONSTRUCTION <img src="https://raw.githubusercontent.com/ebanalyse/NERDA/main/logo.png" align="right" height=250/>

![Build status](https://github.com/ebanalyse/NERDA/workflows/build/badge.svg)
[![codecov](https://codecov.io/gh/ebanalyse/NERDA/branch/main/graph/badge.svg?token=OB6LGFQZYX)](https://codecov.io/gh/ebanalyse/NERDA)
![PyPI](https://img.shields.io/pypi/v/NERDA.svg)
![PyPI - Downloads](https://img.shields.io/pypi/dm/NERDA?color=green)
![License](https://img.shields.io/badge/license-MIT-blue.svg)

## !!! UNDER CONSTRUCTION!!!!
`NERDA` is not only a mesmerizing muppet-like character. `NERDA` is also
a Python package that offers a complete framework for fine-tuning
pretrained [`huggingface` transformers](https://huggingface.co/)
for Named Entity Recognition (NER) tasks.

## Installation guide
```
pip install NERDA
```

## NER tasks
Named Entity Recognition (NER) tasks are all about identifying and
extracting named entities (such as persons, organizations and locations)
from natural-language texts.

Read more about NER on [Wikipedia](https://en.wikipedia.org/wiki/Named-entity_recognition).
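
To make the task concrete, here is a minimal sketch of what NER output looks like when tokens are labelled with the `B-`/`I-` tag scheme used in the performance table (`B-` opens an entity, `I-` continues it, `O` marks tokens outside any entity). The helper `extract_entities` and the example sentence are hypothetical illustrations, not part of the `NERDA` API.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration only; not a NERDA function.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity; close any open one first.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag) closes the open entity, if any.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Made-up example sentence and tags.
tokens = ["Jim", "Halpert", "visited", "Acme", "Corp", "in", "London"]
tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O", "B-LOC"]
print(extract_entities(tokens, tags))
# [('Jim Halpert', 'PER'), ('Acme Corp', 'ORG'), ('London', 'LOC')]
```

A fine-tuned model predicts the tag sequence; grouping the tags into entities, as sketched here, is a simple post-processing step.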

## Performance

The table below summarizes the performance (**F1-scores**) of the model
configurations that `NERDA` ships with.

| **Level**     | **MBERT** | **DABERT** | **ELECTRA** | **XLMROBERTA** | **DISTILMBERT** |
|---------------|-----------|------------|-------------|----------------|-----------------|
| B-PER         | 0.92      | 0.93       | 0.92        | 0.94           | 0.89            |      
| I-PER         | 0.97      | 0.99       | 0.97        | 0.99           | 0.96            |   
| B-ORG         | 0.68      | 0.79       | 0.65        | 0.78           | 0.66            |     
| I-ORG         | 0.67      | 0.79       | 0.72        | 0.77           | 0.61            |   
| B-LOC         | 0.86      | 0.85       | 0.79        | 0.87           | 0.80            |     
| I-LOC         | 0.33      | 0.32       | 0.44        | 0.24           | 0.29            |     
| B-MISC        | 0.73      | 0.74       | 0.61        | 0.77           | 0.70            |     
| I-MISC        | 0.70      | 0.86       | 0.65        | 0.91           | 0.61            |   
| **AVG_MICRO** | 0.81      | 0.85       | 0.79        | 0.86           | 0.78            |      
| **AVG_MACRO** | 0.73      | 0.78       | 0.72        | 0.78           | 0.69            |

The **AVG_MICRO** and **AVG_MACRO** rows report the micro- and macro-averaged F1-scores, respectively.
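
The difference between the two averages can be sketched as follows. This is a minimal illustration using the standard definitions (micro: pool the counts over all tags, then compute one F1; macro: compute F1 per tag, then take the unweighted mean). Whether `NERDA` computes its scores exactly this way is an assumption, and the counts below are made up, not `NERDA` results.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """counts maps each tag to (true positives, false positives, false negatives).

    Assumes at least one tag is present.
    """
    # Micro: sum the counts over all tags, then compute a single F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    # Macro: compute F1 per tag, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Made-up counts: one frequent, well-predicted tag and one rare, poorly predicted tag.
counts = {"B-PER": (90, 10, 10), "I-LOC": (1, 2, 2)}
micro, macro = micro_macro_f1(counts)
print(f"micro={micro:.3f} macro={macro:.3f}")
# micro=0.883 macro=0.617
```

Because the macro average weights every tag equally, a rare tag with a low score pulls it down more than it pulls down the micro average. This is consistent with the table above, where the macro averages are lower than the micro averages and `I-LOC` scores are low.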

## '`NERDA`'?
'`NERDA`' originally stood for *'Named Entity Recognition for DAnish'*. However, this
is somewhat misleading, since the functionality is no longer limited to Danish.
On the contrary, it generalizes to other languages: `NERDA` supports
fine-tuning of transformer-based models for NER tasks in any
language.

## Read more
The documentation for `NERDA`, including code references and
examples, can be accessed [here](https://ebanalyse.github.io/NERDA/).

## Contact
We hope that you will find `NERDA` useful.

Please direct any questions and feedback to
[us](mailto:lars.kjeldgaard@eb.dk)!

If you want to contribute (which we encourage you to do), open a
[PR](https://github.com/ebanalyse/NERDA/pulls).

If you encounter a bug or want to suggest an enhancement, please 
[open an issue](https://github.com/ebanalyse/NERDA/issues).



