Metadata-Version: 2.1
Name: Orange3-Text
Version: 1.16.1
Summary: Orange3 TextMining add-on.
Home-page: https://github.com/biolab/orange3-text
Download-URL: https://github.com/biolab/orange3-text/tarball/1.16.1
Author: Bioinformatics Laboratory, FRI UL
Author-email: info@biolab.si
Keywords: orange3-text,data mining,orange3 add-on
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: Orange3>=3.35.0
Requires-Dist: anyqt
Requires-Dist: beautifulsoup4
Requires-Dist: biopython
Requires-Dist: conllu
Requires-Dist: docx2txt>=0.6
Requires-Dist: gensim>=4.3.3
Requires-Dist: httpx!=0.23.1
Requires-Dist: langdetect
Requires-Dist: lemmagen3
Requires-Dist: nltk>=3.9.1
Requires-Dist: numpy
Requires-Dist: odfpy>=1.3.5
Requires-Dist: orange-canvas-core
Requires-Dist: orange-widget-base>=4.20.0
Requires-Dist: owlready2
Requires-Dist: pandas
Requires-Dist: pypdf
Requires-Dist: pyqtgraph
Requires-Dist: pyyaml
Requires-Dist: requests
Requires-Dist: scikit-learn
Requires-Dist: scipy
Requires-Dist: serverfiles
Requires-Dist: shapely>=2.0
Requires-Dist: simhash>=1.11
Requires-Dist: six
Requires-Dist: trimesh>=3.9.8
Requires-Dist: tweepy>=4.0.0
Requires-Dist: ufal.udpipe>=1.2.0.3
Requires-Dist: wikipedia
Requires-Dist: yake
Provides-Extra: test
Requires-Dist: coverage; extra == "test"
Provides-Extra: doc
Requires-Dist: sphinx; extra == "doc"
Requires-Dist: recommonmark; extra == "doc"
Requires-Dist: sphinx_rtd_theme; extra == "doc"
Requires-Dist: docutils; extra == "doc"

Orange3 Text
============

Orange add-on for text mining. It provides access to publicly available data
sources such as NY Times, Twitter and PubMed. It also provides tools for
preprocessing, for constructing vector spaces (such as bag-of-words, topic
modeling and word2vec), and for visualizations such as word clouds and geo maps.
All features can be combined with the data mining techniques of the Orange data
mining framework.

See the [documentation](http://orange3-text.readthedocs.org/).

Features
--------
#### Access to data
* Load a corpus of text documents
* Access publicly available data (The Guardian, NY Times, Twitter, Wikipedia, PubMed)

#### Text analysis
* Preprocess corpus
* Generate bag of words
* Embed documents into vector space
* Perform sentiment analysis
* Detect emotions in tweets
* Discover topics in the text
* Compute document statistics
* Visualize frequent words in a word cloud
* Find words that enrich selected documents
