Metadata-Version: 2.1
Name: TExtractor
Version: 0.1
Summary: Extract text content from many file types.
Home-page: http://bitbucket.org/whitie/textractor2/
Author: Thorsten Weimann
Author-email: weimann.th@yahoo.com
License: MIT
Keywords: text extract pdf docx
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/x-rst
Requires-Dist: pdfminer.six
Requires-Dist: pluginbase
Requires-Dist: chardet

TExtractor
==========

Usage::

    >>> from textractor import TExtractor
    >>> extractor = TExtractor()
    >>> extractor.index('test.docx')
    ['workflow_history', 'portal_workflow', 'review_history',
     'implementation', 'organizations', 'Illustrations', ...]
    >>> extractor.index('test.pdf')
    ['workflow_history', 'portal_workflow', 'review_history',
     'implementation', 'organizations', 'Illustrations', ...]
    >>>
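
The session above calls ``index`` once per file. A minimal sketch of doing the same for several files is shown here. ``index_files`` is a hypothetical helper and not part of this package; it only assumes that the object passed in has an ``index(path)`` method returning a list of terms, as in the usage example.

```python
from typing import Dict, Iterable, List


def index_files(extractor, paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each path to the list returned by ``extractor.index(path)``.

    Assumption: ``extractor`` exposes ``index(path)`` and returns an
    iterable of terms, like the ``TExtractor`` instance in the usage
    example. This helper is illustrative only.
    """
    results: Dict[str, List[str]] = {}
    for path in paths:
        # Copy into a list so callers can slice or sort the result safely.
        results[path] = list(extractor.index(path))
    return results
```

With the package installed, ``index_files(TExtractor(), ['test.docx', 'test.pdf'])`` would be expected to return a dict keyed by file name, assuming ``index`` behaves as in the session above.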



