Metadata-Version: 2.1
Name: wrapperWSD
Version: 0.0.2
Summary: Word Sense Disambiguation wrapper
Home-page: UNKNOWN
Author: Henry Rosales
Author-email: hrosmendez@gmail.com
License: UNKNOWN
Description: # Word Sense Disambiguation wrapper
        
        In natural language processing, **word sense disambiguation** (WSD) is the problem of determining which "sense" (meaning) of a word is activated by its use in a particular context, a process which appears to be largely unconscious in people.
        
        This is a simple library that wraps two WSD methods: NLTK and Babelfy.
        
        ## Requirements
        Install the dependencies:
        ```bash
        pip3 install xmltodict
        pip3 install nltk
        pip3 install pywsd
        ```
        The NLTK library requires additional configuration; see this [link](https://pythonprogramming.net/installing-nltk-nlp-python/) for more details.
        
        ## Methods
        The ```wsdNLTK``` method calls the function ```pywsd.disambiguate```, which returns a mapping between words of the input text and their WordNet Synsets. Each result also carries the start and end character positions of the word in the input text.
        ```python
        wsd = WrapperWSD()
        wsd.wsdNLTK(u'My sister has a dog. She loves him.')
        #output: [('sister', Synset('sister.n.02'), 3, 9), ('dog', Synset('pawl.n.01'), 16, 19), ('loves', Synset('sleep_together.v.01'), 25, 30)]
        ```
        
        Instead of returning the WordNet Synsets, the method ```wsdNLTK_offset``` returns a mapping between words of the input text and their WordNet offsets.
        
        ```python
        wsd.wsdNLTK_offset(u'My sister has a dog. She loves him.')
        #sample output (taken from a different input sentence): [('president', 597265, 21, 30), ('USA', 8394922, 38, 41), ('best', 67379, 54, 58)]
        ```
        
        A mapping between WordNet and Wikipedia was proposed in **[Miller et al]** and is available for download [here](https://www.informatik.tu-darmstadt.de/media/ukp/data/fileupload_2/lexical_resources/MillerGurevych2014_alignment.tar_1.zip). The next example shows some of its key-value pairs, where each key is a WordNet offset and each value is a Wikipedia URL.
        
        ```python
        wd2wiki = {
         1740: 'https://en.wikipedia.org/wiki/Madison_Square_Garden,_L.P.',
         2137: 'https://en.wikipedia.org/wiki/Abstraction',
         2452: 'https://en.wikipedia.org/wiki/Object_(philosophy)',
         2684: 'https://en.wikipedia.org/wiki/Computer_file',
         3553: 'https://en.wikipedia.org/wiki/Unit_of_alcohol',
         ...
         }
        ```
        
        We use this mapping to link words to Wikipedia entities in those cases where a correspondence exists.
        
        ```python
        wsd.wsdNLTK_links(u'My sister has a dog. She loves him.')
        #sample output (taken from a different input sentence): [{'start': 38, 'end': 41, 'label': 'USA', 'link': 'United_States_Army'}]
        ```
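        
        The linking step can be pictured as a dictionary lookup over the offset tuples. The following is only a minimal sketch, not the library's actual code: the helper name ```offsets_to_links``` and the ```wd2wiki``` entry for offset 8394922 are assumptions made for illustration, chosen so that the result has the same shape as the sample output shown in this README.
        
        ```python
        WIKI_PREFIX = 'https://en.wikipedia.org/wiki/'
        
        def offsets_to_links(annotations, wd2wiki):
            """Hypothetical helper: keep only the words whose WordNet offset has a Wikipedia page."""
            links = []
            for label, offset, start, end in annotations:
                url = wd2wiki.get(offset)
                if url is None:
                    # No WordNet-Wikipedia correspondence for this sense: skip the word.
                    continue
                # Keep only the page title, dropping the URL prefix when it is present.
                title = url[len(WIKI_PREFIX):] if url.startswith(WIKI_PREFIX) else url
                links.append({'start': start, 'end': end, 'label': label, 'link': title})
            return links
        
        # Tuples in the (word, offset, start, end) shape returned by wsdNLTK_offset.
        annotations = [('president', 597265, 21, 30), ('USA', 8394922, 38, 41), ('best', 67379, 54, 58)]
        # Assumed mapping entry, for illustration only.
        wd2wiki = {8394922: 'https://en.wikipedia.org/wiki/United_States_Army'}
        
        print(offsets_to_links(annotations, wd2wiki))
        #output: [{'start': 38, 'end': 41, 'label': 'USA', 'link': 'United_States_Army'}]
        ```
        
        Words without an entry in the mapping are simply dropped, which is why the result can be shorter than the list of disambiguated words.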
        
        In addition, we include Babelfy, which targets BabelSynsets:
        ```python
        wsd.wsdBabelfy(u'My sister has a dog. She loves him.')
        #output: [('sister', 'bn:00071838n', 3, 9), ('dog', 'bn:00015267n', 16, 19), ('loves', 'bn:00090504v', 25, 30)]
        ```
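        
        In the sample outputs above, the last two numbers of each tuple appear to be 0-based character positions in the input text, with the end position exclusive. The short check below is a sketch under that assumption; it uses plain Python slicing on the sample output and does not call the library (```spans_match``` is a hypothetical helper name).
        
        ```python
        text = u'My sister has a dog. She loves him.'
        # Sample result in the (word, sense id, start, end) shape returned by wsdBabelfy.
        annotations = [('sister', 'bn:00071838n', 3, 9), ('dog', 'bn:00015267n', 16, 19), ('loves', 'bn:00090504v', 25, 30)]
        
        def spans_match(text, annotations):
            """Return True if every (start, end) pair slices out exactly the annotated word."""
            return all(text[start:end] == word for word, _sense, start, end in annotations)
        
        print(spans_match(text, annotations))
        #output: True
        ```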
        ## Reference
        
        **[Miller et al]**  *WordNet–Wikipedia–Wiktionary: Construction of a Three-way Alignment*. Tristan Miller and Iryna Gurevych. 2014 [https://pdfs.semanticscholar.org/90cd/22a9cd59dc1fc21f4ec36e9c7d95085f7fb6.pdf](https://pdfs.semanticscholar.org/90cd/22a9cd59dc1fc21f4ec36e9c7d95085f7fb6.pdf)
        
Keywords: Word Sense Disambiguation,NLP
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Education
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4
Description-Content-Type: text/markdown
Provides-Extra: dev
Provides-Extra: test
