Metadata-Version: 1.1
Name: PyStanfordDependencies
Version: 0.3.1
Summary: Python interface for converting Penn Treebank trees to Universal Dependencies and Stanford Dependencies
Home-page: http://github.com/dmcc/PyStanfordDependencies
Author: David McClosky
Author-email: notsoweird+pystanforddependencies@gmail.com
License: Apache 2.0
Description: PyStanfordDependencies
        ======================
        
        .. image:: https://travis-ci.org/dmcc/PyStanfordDependencies.svg?branch=master
            :target: https://travis-ci.org/dmcc/PyStanfordDependencies
        
        .. image:: https://badge.fury.io/py/PyStanfordDependencies.png
           :target: https://badge.fury.io/py/PyStanfordDependencies
        
        .. image:: https://coveralls.io/repos/dmcc/PyStanfordDependencies/badge.png?branch=master
           :target: https://coveralls.io/r/dmcc/PyStanfordDependencies?branch=master
        
        Python interface for converting `Penn Treebank
        <http://www.cis.upenn.edu/~treebank/>`_ trees to `Universal
        Dependencies <http://universaldependencies.github.io/docs/>`_
        and `Stanford Dependencies
        <http://nlp.stanford.edu/software/stanford-dependencies.shtml>`_.
        
        Example usage
        -------------
        Start by getting a ``StanfordDependencies`` instance with
        ``StanfordDependencies.get_instance()``::
        
            >>> import StanfordDependencies
            >>> sd = StanfordDependencies.get_instance(backend='subprocess')
        
        ``get_instance()`` takes several options. ``backend`` can currently
        be ``subprocess`` or ``jpype`` (see below). If you have an existing
        `Stanford CoreNLP <http://nlp.stanford.edu/software/corenlp.shtml>`_ or
        `Stanford Parser <http://nlp.stanford.edu/software/lex-parser.shtml>`_
        jar file, use the ``jar_filename`` parameter to point to the full path of
        the jar file. Otherwise, PyStanfordDependencies will download a jar file
        for you and store it locally (``~/.local/share/pystanforddeps``). You
        can request a specific version with the ``version`` flag, e.g.,
        ``version='3.4.1'``. To convert trees, use the ``convert_tree()`` or
        ``convert_trees()`` method (note that by default, ``convert_trees()`` can
        be considerably faster if you're doing batch conversion). These return
        a sentence (list of ``Token`` objects) or a list of sentences (list of
        lists of ``Token`` objects), respectively::
        
            >>> sent = sd.convert_tree('(S1 (NP (DT some) (JJ blue) (NN moose)))')
            >>> for token in sent:
            ...     print token
            ...
            Token(index=1, form='some', cpos='DT', pos='DT', head=3, deprel='det')
            Token(index=2, form='blue', cpos='JJ', pos='JJ', head=3, deprel='amod')
            Token(index=3, form='moose', cpos='NN', pos='NN', head=0, deprel='root')
        
        This tells you that ``moose`` is the head of the sentence and is
        modified by ``some`` (with a ``det`` = determiner relation) and ``blue``
        (with an ``amod`` = adjective modifier relation). Fields on ``Token``
        objects are readable as attributes. See the docs for additional options in
        ``convert_tree()`` and ``convert_trees()``.
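        
        As a minimal sketch of working with these fields, the following groups
        tokens by their head. The ``Token`` namedtuple here is a stand-in that
        only mirrors the fields printed above, and ``dependents_by_head()`` is a
        hypothetical helper, not part of PyStanfordDependencies. With the real
        library you would pass the sentence returned by ``convert_tree()``
        instead::
        
            from collections import defaultdict, namedtuple
        
            # Stand-in for the library's Token, limited to the fields shown above.
            Token = namedtuple('Token', 'index form cpos pos head deprel')
        
            sent = [
                Token(1, 'some', 'DT', 'DT', 3, 'det'),
                Token(2, 'blue', 'JJ', 'JJ', 3, 'amod'),
                Token(3, 'moose', 'NN', 'NN', 0, 'root'),
            ]
        
            def dependents_by_head(sentence):
                """Map each head index to its dependents (hypothetical helper)."""
                children = defaultdict(list)
                for token in sentence:
                    children[token.head].append(token)
                return dict(children)
        
            children = dependents_by_head(sent)
            root = children[0][0]  # in the output above, head=0 marks the root
            modifiers = [(t.form, t.deprel) for t in children[root.index]]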
        
        Visualization
        -------------
        
        If you have the `asciitree <https://pypi.python.org/pypi/asciitree>`_
        package, you can use a prettier ASCII formatter::
        
            >>> print(sent.as_asciitree())
             moose [root]
              +-- some [det]
              +-- blue [amod]
        
        If you have Python 2.7 or later, you can use `Graphviz
        <http://graphviz.org/>`_ to render your graphs. You'll need the `Python
        graphviz <https://pypi.python.org/pypi/graphviz>`_ package to call
        ``as_dotgraph()``::
        
            >>> dotgraph = sent.as_dotgraph()
            >>> print(dotgraph)
            digraph {
                    0 [label=root]
                    1 [label=some]
                            3 -> 1 [label=det]
                    2 [label=blue]
                            3 -> 2 [label=amod]
                    3 [label=moose]
                            0 -> 3 [label=root]
            }
            >>> dotgraph.render('moose') # renders a PDF by default
            'moose.pdf'
            >>> dotgraph.format = 'svg'
            >>> dotgraph.render('moose')
            'moose.svg'
        
        The Python `xdot <https://pypi.python.org/pypi/xdot>`_
        package provides an interactive visualization::
        
            >>> import xdot
            >>> window = xdot.DotWindow()
            >>> window.set_dotcode(dotgraph.source)
        
        Both ``as_asciitree()`` and ``as_dotgraph()`` allow customization.
        See the docs for additional options.
        
        Backends
        --------
        Currently PyStanfordDependencies includes two backends:
        
        - ``subprocess`` (works anywhere with a ``java`` binary, but has more
          overhead, so batched conversions with ``convert_trees()`` are
          recommended)
        - ``jpype`` (requires `jpype1 <https://pypi.python.org/pypi/JPype1>`_,
          faster than the subprocess backend, also includes access to the Stanford
          CoreNLP lemmatizer)
        
        By default, PyStanfordDependencies will attempt to use the ``jpype``
        backend. If ``jpype`` isn't available or crashes on startup,
        PyStanfordDependencies will fall back to ``subprocess`` with a warning.
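        
        If you would rather pick the backend explicitly, a check along the
        following lines may help. This is only a sketch under stated
        assumptions: ``choose_backend()`` is a hypothetical helper, it requires
        Python 3.4 or later for ``importlib.util.find_spec()``, it only tests
        whether the ``jpype`` module can be found (it cannot detect a crash on
        startup), and it is not how the library itself implements the
        fallback::
        
            import importlib.util
        
            def choose_backend():
                """Return 'jpype' if the jpype module is installed, else 'subprocess'."""
                if importlib.util.find_spec('jpype') is not None:
                    return 'jpype'
                return 'subprocess'
        
            backend = choose_backend()
            # The result could then be passed on, e.g.
            # StanfordDependencies.get_instance(backend=backend)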
        
        Universal Dependencies status
        -----------------------------
        PyStanfordDependencies supports most features in `Universal Dependencies
        <http://universaldependencies.github.io/docs/>`_ (see `issue #10
        <https://github.com/dmcc/PyStanfordDependencies/issues/10>`_ for the
        most up-to-date status). PyStanfordDependencies output matches Universal
        Dependencies in terms of structure and dependency labels, but Universal
        POS tags and features are missing. Currently, PyStanfordDependencies will
        output Universal Dependencies by default (unless you're using Stanford
        CoreNLP 3.5.1 or earlier).
        
        Related projects
        ----------------
        - `clearnlp-converter <https://pypi.python.org/pypi/clearnlp-converter/>`_
          (uses `clearnlp <http://www.clearnlp.com/>`_ instead of `Stanford
          CoreNLP <http://nlp.stanford.edu/software/corenlp.shtml>`_ for
          dependency conversion)
        
        More information
        ----------------
        Licensed under `Apache 2.0 <http://www.apache.org/licenses/LICENSE-2.0>`_.
        
        Written by David McClosky (`homepage
        <http://nlp.stanford.edu/~mcclosky/>`_, `code <http://github.com/dmcc>`_)
        
        Bug reports and feature requests: `GitHub issue tracker
        <http://github.com/dmcc/PyStanfordDependencies/issues>`_
        
        Release summaries
        -----------------
        - 0.3.1 (2015.11.02): Better collapsed universal handling, bugfixes
        - 0.3.0 (2015.10.09): Support copy nodes, more input checking/debugging
          help, example ``convert.py`` program
        - 0.2.0 (2015.08.02): Universal Dependencies support (mostly),
          Python 3 support (fully), minor API updates
        - 0.1.7 (2015.06.13): Bugfixes for ``JPype``, handle version mismatches
          in IBM Java
        - 0.1.6 (2015.02.12): Support for ``graphviz`` formatting, CoreNLP 3.5.1,
          better Windows portability
        - 0.1.5 (2015.01.10): Support for ASCII tree formatting
        - 0.1.4 (2015.01.07): Fix ``CCprocessed`` support
        - 0.1.3 (2015.01.03): Bugfixes, coveralls integration, refactoring
        - 0.1.2 (2015.01.02): Better CoNLL structures, test suite and Travis CI
          support, bugfixes
        - 0.1.1 (2014.12.15): More docs, fewer bugs
        - 0.1 (2014.12.14): Initial release
        
Keywords: parsing,syntax,linguistics,language,artificial intelligence,machine learning,NLP,natural language,natural language processing,Stanford CoreNLP,Stanford Dependencies,Universal Dependencies
Platform: POSIX
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Natural Language :: English
Classifier: Operating System :: POSIX
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
