Metadata-Version: 2.1
Name: bertmoticon
Version: 1.0.1
Summary: multilingual emoji prediction
Home-page: https://github.com/Stefanos-stk/Bertmoticon
Author: Stefanos Stoikos
Author-email: st.stoikos@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
Requires-Dist: torch
Requires-Dist: transformers
Requires-Dist: requests

# Bertmoticon 

The Bertmoticon package provides a [BERT](https://github.com/google-research/bert) model fine-tuned for the emoji prediction task. It can predict emojis in 102 languages. The package exposes two functions: [bertmoticon.infer](#bertmoticon.infer) and [bertmoticon.infer_mappings](#bertmoticon.infer_mappings). The model can predict 80 emojis, which are listed in [bertmoticon.emojis](#bertmoticon.emojis).

## Installation

Install the Bertmoticon package from [PyPI](https://pypi.org/) using:

```
pip3 install bertmoticon
```
## Importing in Python
Import the package with:
```
import bertmoticon
```
If the model has not been downloaded yet, the first run downloads and extracts it automatically, printing progress like this:
```
Downloading bermoticon model
[=                                                          ]
...
[==================                                         ]
...
[===========================================================]
Extracting the model
```
The model is not included in the PyPI installation. It requires 1.34 GB. Once downloaded, the model is loaded onto CUDA if CUDA is available, and onto the CPU otherwise.
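
The device choice described above can be sketched as follows. This is a minimal illustration, not the package's actual code: `pick_device` is a hypothetical helper, and the fallback to CPU when PyTorch cannot be found is an assumption made so the snippet runs anywhere.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    # Hypothetical helper: not part of bertmoticon's API.
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to query
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```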
## Usage


## bertmoticon.emojis
The model can predict up to 80 emojis. They are stored in the global variable ```emojis```, accessed as ```bertmoticon.emojis```.
```
>>> print(bertmoticon.emojis)
['😂', '😭', '😍', '😊', '🙏', '😅', '😁', '🙄', '😘', '😔', '😩', '😉', '😎', '😢', '😆', '😋', '😌', '😳', '😏', '🙂', '😃', '🙃', '😒', '😜', '😀', '😱', '🙈', '😄', '😡', '😬', '🙌', '😴', '😫', '😪', '😤', '😇', '😈', '😞', '😷', '😣', '😥', '😝', '😑', '😓', '😕', '😹', '😐', '😻', '😖', '😛', '😠', '🙊', '😰', '😚', '😲', '😶', '😮', '🙁', '😵', '😗', '😟', '😨', '🙇', '🙋', '😙', '😯', '🙆', '🙉', '😧', '😿', '😸', '🙀', '😦', '😽', '😺', '😼', '🙅', '😾', '🙍', '🙎']
```
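
Because the model only predicts the emojis in this list, it can help to check your own emoji lists against it before use. The sketch below is one way to do that; `unsupported` is a hypothetical helper, and `supported` is a short excerpt of the list above standing in for ```bertmoticon.emojis``` so the snippet runs without the model.

```python
# A few of the emojis listed above; in practice, pass bertmoticon.emojis.
supported = ['😂', '😭', '😍', '😊', '🙏', '😡', '😴']


def unsupported(candidates, supported_emojis):
    """Return the candidates that are not in the supported list, in input order."""
    allowed = set(supported_emojis)
    return [emoji for emoji in candidates if emoji not in allowed]


missing = unsupported(['😂', '🎉', '😡'], supported)
print(missing)
```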

## bertmoticon.infer

Takes a ```list``` of ```strings``` and an ```int``` number of guesses. It returns a list of dictionaries, one per input string, where each dictionary maps a predicted emoji to its corresponding score, given as a decimal string such as ```'0.1938'```.

```
>>> ls_of_strings =  ["Vote #TRUMP2020ToSaveAmerica from corrupt Joe Biden and the radical left.","Je veux aller dormir. #fatigué"]
>>> print(bertmoticon.infer(ls_of_strings,3))
[{'😂': '0.1938', '😡': '0.1866', '🙄': '0.0847'}, {'😴': '0.1547', '😭': '0.1507', '😩': '0.0892'}]
```
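
The scores in the output above are strings, so they need converting before any numeric comparison. A minimal sketch of picking the top emoji per input follows; `top_prediction` is a hypothetical helper, and `predictions` is the sample output copied from the example rather than a live call to the model.

```python
# Sample output copied from the example above (not a live model call).
predictions = [
    {'😂': '0.1938', '😡': '0.1866', '🙄': '0.0847'},
    {'😴': '0.1547', '😭': '0.1507', '😩': '0.0892'},
]


def top_prediction(scores):
    """Return (emoji, score) for the highest score, parsing string scores as floats."""
    emoji = max(scores, key=lambda e: float(scores[e]))
    return emoji, float(scores[emoji])


tops = [top_prediction(scores) for scores in predictions]
print(tops)
```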
## bertmoticon.infer_mappings
Takes a ```list``` of ```strings```, a ```dict``` of emoji mappings, and an ```int``` number of guesses. It returns the number of occurrences of each key. We define the dictionary and the list as follows:

```
>>> mappings = {"Anger":['😡'], "Other":['😂','😭']}
>>> ls_of_strings =  ["Vote #TRUMP2020ToSaveAmerica from corrupt Joe Biden and the radical left.","Je veux aller dormir. #fatigué"]
```
The keys are the category names and the values are lists of the emojis contained in each category. Passing these to ```bertmoticon.infer_mappings``` returns:
```
>>> print(bertmoticon.infer_mappings(ls_of_strings,mappings,3))
{'Anger': 1, 'Other': 2}
```
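
One way to read this result is as a tally over the per-string predictions: every predicted emoji that belongs to a category adds one to that category's count. The sketch below works under that assumption and is not the package's actual implementation. `count_categories` is a hypothetical helper, and `predictions` is the sample ```bertmoticon.infer``` output from the previous section.

```python
# Sample bertmoticon.infer output for the two strings, with 3 guesses each.
predictions = [
    {'😂': '0.1938', '😡': '0.1866', '🙄': '0.0847'},
    {'😴': '0.1547', '😭': '0.1507', '😩': '0.0892'},
]
mappings = {"Anger": ['😡'], "Other": ['😂', '😭']}


def count_categories(predictions, mappings):
    """Count, per category, how many predicted emojis fall into that category."""
    counts = {category: 0 for category in mappings}
    for scores in predictions:
        for emoji in scores:
            for category, members in mappings.items():
                if emoji in members:
                    counts[category] += 1
    return counts


result = count_categories(predictions, mappings)
print(result)
```

Under this reading, '😡' contributes to "Anger" once, while '😂' and '😭' each contribute to "Other", which matches the output shown above.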



