Metadata-Version: 2.1
Name: annonex2embl
Version: 0.9.5
Summary: Converts an annotated DNA multi-sequence alignment (in NEXUS format) to an EMBL flatfile for submission to ENA via the Webin-CLI submission tool
Home-page: https://github.com/michaelgruenstaeudl/annonex2embl
Author: Michael Gruenstaeudl, PhD
Author-email: m.gruenstaeudl@fu-berlin.de
License: BSD
Description: *annonex2embl*
        ==============
        
        [![Build Status](https://travis-ci.com/michaelgruenstaeudl/annonex2embl.svg?branch=master)](https://travis-ci.com/michaelgruenstaeudl/annonex2embl)
        [![PyPI status](https://img.shields.io/pypi/status/annonex2embl.svg)](https://pypi.python.org/pypi/annonex2embl/)
        [![PyPI pyversions](https://img.shields.io/pypi/pyversions/annonex2embl.svg)](https://pypi.python.org/pypi/annonex2embl/)
        [![PyPI version shields.io](https://img.shields.io/pypi/v/annonex2embl.svg)](https://pypi.python.org/pypi/annonex2embl/)
        [![PyPI license](https://img.shields.io/pypi/l/annonex2embl.svg)](https://pypi.python.org/pypi/annonex2embl/)
        
        Converts an annotated DNA multi-sequence alignment (in [NEXUS](http://wiki.christophchamp.com/index.php?title=NEXUS_file_format) format) to an EMBL flatfile for submission to [ENA](http://www.ebi.ac.uk/ena) via the [Webin-CLI submission tool](https://ena-docs.readthedocs.io/en/latest/cli_05.html).
        
        
        ## INSTALLATION
        To get the most recent stable version of *annonex2embl*, run:
        
            pip install annonex2embl
        
        Alternatively, to get the latest development version of *annonex2embl*, run:
        
            pip install git+https://github.com/michaelgruenstaeudl/annonex2embl.git
        
        
        ## INPUT, OUTPUT AND PREREQUISITES
        * **Input**: an annotated DNA multiple sequence alignment in NEXUS format; and a comma-delimited (CSV) metadata table
        * **Output**: a submission-ready, multi-record EMBL flatfile
        
        #### Requirements / Input preparation
        The annotations of a NEXUS file are specified via a [SETS-block](http://hydrodictyon.eeb.uconn.edu/eebedia/index.php/Phylogenetics:_NEXUS_Format), which is located beneath the DATA-block and defines sets of characters in the DNA alignment. In such a SETS-block, every gene charset and every exon charset must be accompanied by one CDS charset. Other charsets may be defined on their own.
        
        #### Example of a complete SETS-block
        ```
        BEGIN SETS;
        CHARSET matK_gene_forward = 929-2530;
        CHARSET matK_CDS_forward = 929-2530;
        CHARSET trnK_intron_forward = 1-928 2531-2813;
        END;
        ```
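        
        The pairing rule described above can be checked before running the conversion. The following is a minimal sketch and not part of *annonex2embl*: the function names `parse_charsets` and `charsets_missing_cds` are hypothetical, and the check assumes that charset names follow the `<name>_<type>_<orientation>` pattern seen in the example (e.g. `matK_gene_forward`).
        
        ```python
        import re
        
        # Example SETS-block from this README, one entry per line.
        SETS_BLOCK = "\n".join([
            "BEGIN SETS;",
            "CHARSET matK_gene_forward = 929-2530;",
            "CHARSET matK_CDS_forward = 929-2530;",
            "CHARSET trnK_intron_forward = 1-928 2531-2813;",
            "END;",
        ])
        
        # Matches "CHARSET <name> = <ranges>;" at the start of a line.
        CHARSET_RE = re.compile(
            r"^\s*CHARSET\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE | re.MULTILINE
        )
        
        
        def parse_charsets(sets_block):
            """Return {charset name: [(start, end), ...]} for every CHARSET line."""
            charsets = {}
            for name, ranges in CHARSET_RE.findall(sets_block):
                spans = []
                for token in ranges.split():
                    bounds = token.split("-")
                    # A single position such as "5" becomes the span (5, 5).
                    spans.append((int(bounds[0]), int(bounds[-1])))
                charsets[name] = spans
            return charsets
        
        
        def charsets_missing_cds(charsets):
            """List names that have a gene or exon charset but no CDS charset."""
            names_by_type = {}
            for charset_name in charsets:
                parts = charset_name.split("_")
                if len(parts) < 2:
                    continue  # name does not follow the assumed pattern
                names_by_type.setdefault(parts[1].lower(), set()).add(parts[0])
            needs_cds = names_by_type.get("gene", set()) | names_by_type.get("exon", set())
            return sorted(needs_cds - names_by_type.get("cds", set()))
        
        
        print(charsets_missing_cds(parse_charsets(SETS_BLOCK)))
        ```
        
        For the example block, `charsets_missing_cds` returns an empty list because `matK_gene_forward` is paired with `matK_CDS_forward`.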
        
        #### Example of a corresponding DESCR variable
        ```
        DESCR="tRNA-Lys (trnK) intron, partial sequence; maturase K (matK) gene, complete sequence"
        ```
        
        ## EXAMPLE USAGE
        #### On Linux / MacOS
        ```
        SCRPT=$PWD/scripts/annonex2embl_launcher_CLI.py
        INPUT=examples/input/TestData1.nex
        METAD=examples/input/Metadata.csv
        OTPUT=examples/temp/TestData1.embl
        DESCR='description of alignment here'  # Do not use double-quotes
        EMAIL=your_email_here@yourmailserver.com
        AUTHR='your name here'  # Do not use double-quotes
        MNFTS=PRJEB00000
        MNFTD=${DESCR//[^[:alnum:]]/_}
        
        python3 $SCRPT -n $INPUT -c $METAD -d "$DESCR" -e $EMAIL -a "$AUTHR" -o $OTPUT --productlookup --manifeststudy $MNFTS --manifestdescr $MNFTD --compress
        ```
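        
        The line `MNFTD=${DESCR//[^[:alnum:]]/_}` is a Bash parameter expansion that replaces every non-alphanumeric character of `DESCR` with an underscore. For shells without this feature, the same string can be derived in Python. This is a minimal sketch: the function name `manifest_description` is hypothetical, and the pattern covers ASCII letters and digits only, whereas `[:alnum:]` in Bash depends on the locale.
        
        ```python
        import re
        
        
        def manifest_description(descr):
            """Replace every character outside A-Z, a-z and 0-9 with an underscore."""
            return re.sub(r"[^A-Za-z0-9]", "_", descr)
        
        
        print(manifest_description("description of alignment here"))
        ```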
        
        #### On Windows
        ```
        SET SCRPT=%CD%\scripts\annonex2embl_launcher_CLI.py
        SET INPUT=examples\input\TestData1.nex
        SET METAD=examples\input\Metadata.csv
        SET OTPUT=examples\temp\TestData1.embl
        SET DESCR="description of alignment here"
        SET EMAIL=your_email_here@yourmailserver.com
        SET AUTHR="your name here"
        SET MNFTS=PRJEB00000
        SET MNFTD=a_unique_description_here
        
        python %SCRPT% -n %INPUT% -c %METAD% -d %DESCR% -e %EMAIL% -a %AUTHR% -o %OTPUT% --productlookup --manifeststudy %MNFTS% --manifestdescr %MNFTD% --compress
        ```
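        
        Because quoting rules differ between Bash and the Windows command prompt, the same call can also be assembled as an argument list in Python, where values containing spaces need no quoting. This is a minimal sketch: the helper name `build_command` is hypothetical, the options are those used in the examples above, and the values are the placeholders from this README.
        
        ```python
        import sys
        
        
        def build_command(script, nexus, metadata, descr, email, author,
                          output, study, manifest_descr):
            """Return the launcher call as a list, one element per argument."""
            return [
                sys.executable, script,
                "-n", nexus,
                "-c", metadata,
                "-d", descr,
                "-e", email,
                "-a", author,
                "-o", output,
                "--productlookup",
                "--manifeststudy", study,
                "--manifestdescr", manifest_descr,
                "--compress",
            ]
        
        
        cmd = build_command(
            script="scripts/annonex2embl_launcher_CLI.py",
            nexus="examples/input/TestData1.nex",
            metadata="examples/input/Metadata.csv",
            descr="description of alignment here",
            email="your_email_here@yourmailserver.com",
            author="your name here",
            output="examples/temp/TestData1.embl",
            study="PRJEB00000",
            manifest_descr="a_unique_description_here",
        )
        print(cmd)
        ```
        
        The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the conversion.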
        
        ## TO DO
        * ~~Would it be possible to have the CDS definition of Taxon 4 in TestData2.embl changed from "complement(join(19..27,<11..16))" to "complement(join(<11..16,19..27))", while the automatic translation remains "MLSLL"? (I know that we chatted about it, but I don't remember the precise discussion anymore. Something about the meaning of "complement".) If it is possible, then please adjust the code accordingly. If not, then please write down an explanation sentence for me as to why that would mess up the translation.~~
        Currently, the output is `complement(join(<11..16,19..27))`.
        
        <!--
        ## TESTING
            python3 -m unittest discover -s tests -p "*_test.py"
            python3 -m unittest discover -s tests -p "*_test.py" -v  # verbose version
            pytest  # on Linux only, if python-pytest installed via pip
        -->
        
        ## CHANGELOG
        See [`CHANGELOG.md`](CHANGELOG.md) for a list of recent changes to the software.
        
Keywords: novel DNA sequences,public sequence databases,European Nucleotide Archive,file conversion,flatfile
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.7
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Description-Content-Type: text/markdown
