Metadata-Version: 1.2
Name: NanoStat
Version: 1.0.0
Summary: Calculate statistics for Oxford Nanopore sequencing data and alignments
Home-page: https://github.com/wdecoster/nanostat
Author: Wouter De Coster
Author-email: decosterwouter@gmail.com
License: MIT
Description-Content-Type: UNKNOWN
Description: NanoStat
        ========
        
        Calculate various statistics from a long read sequencing dataset in
        fastq, fasta, bam or albacore sequencing summary format.
        
        |Twitter URL| |install with conda| |Build Status|
        
        INSTALLATION
        ~~~~~~~~~~~~
        
        | ``pip install nanostat``
        | or
        | ``conda install -c bioconda nanostat``
        
        USAGE
        ~~~~~
        
        ::
        
            NanoStat [-h] [-v] [-o OUTDIR] [-p PREFIX] [-n NAME] [-t N]
                            [--barcoded] [--readtype {1D,2D,1D2}]
                            (--fastq file [file ...] | --fasta file [file ...] | --summary file [file ...] | --bam file [file ...])
        
            Calculate statistics of long read sequencing dataset.
        
            General options:
              -h, --help            show the help and exit
              -v, --version         Print version and exit.
              -o, --outdir OUTDIR   Specify directory in which output has to be created.
              -p, --prefix PREFIX   Specify an optional prefix to be used for the output file.
              -n, --name NAME       Specify a filename/path for the output, stdout is the default.
              -t, --threads N       Set the allowed number of threads to be used by the script.
        
            Input options:
              --barcoded            Use if you want to split the summary file by barcode
              --readtype {1D,2D,1D2}
                                    Which read type to extract information about from summary. Options are 1D, 2D,
                                    1D2
        
            Input data sources, one of these is required:
              --fastq file [file ...]
                                    Data is in one or more (compressed) fastq file(s).
              --fasta file [file ...]
                                    Data is in one or more (compressed) fasta file(s).
              --summary file [file ...]
                                    Data is in one or more (compressed) summary file(s) generated by albacore.
              --bam file [file ...]
                                    Data is in one or more sorted bam file(s).
        
            EXAMPLES:
              NanoStat --fastq reads.fastq.gz --outdir statreports
              NanoStat --summary sequencing_summary1.txt sequencing_summary2.txt sequencing_summary3.txt --readtype 1D2
              NanoStat --bam alignment.bam alignment2.bam
        
        EXAMPLES
        ^^^^^^^^
        
        ::
        
            NanoStat --fastq reads.fastq.gz --outdir statreports
            NanoStat --summary sequencing_summary1.txt sequencing_summary2.txt sequencing_summary3.txt --readtype 1D2
            NanoStat --bam alignment.bam alignment2.bam
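        
        When NanoStat is called from a Python pipeline, the commands above can
        be assembled as an argument list. The sketch below is only an
        illustration: ``build_nanostat_command`` is a hypothetical helper, not
        part of NanoStat, and it uses only flags listed under USAGE.
        
        ```python
        def build_nanostat_command(source, files, outdir=None, name=None, threads=None):
            """Assemble a NanoStat argument list (hypothetical helper, not part of NanoStat)."""
            # The usage text lists exactly these mutually exclusive input sources.
            if source not in ("fastq", "fasta", "summary", "bam"):
                raise ValueError("unknown input source: {}".format(source))
            if not files:
                raise ValueError("at least one input file is required")
            cmd = ["NanoStat", "--" + source] + list(files)
            if outdir is not None:
                cmd += ["--outdir", outdir]
            if name is not None:
                cmd += ["--name", name]
            if threads is not None:
                cmd += ["--threads", str(threads)]
            return cmd
        
        
        # Rebuild the first example command from the list above.
        cmd = build_nanostat_command("fastq", ["reads.fastq.gz"], outdir="statreports")
        print(" ".join(cmd))
        ```
        
        The resulting list can be handed to ``subprocess.run(cmd, check=True)``
        on a machine where NanoStat is installed and the input files exist.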
        
        Example output
        ~~~~~~~~~~~~~~
        
        ::
        
            General summary:
            Number of reads:        3995
            Total bases:            11418359
            Median read length:     1221.0
            Mean read length:       2858.2
            Read length N50:        8676
            Active channels:        933
            Mean read quality:      10.2
            Median read quality:    10.6
            Top 5 longest reads and their mean basecall quality score
            1:  36928 (10.8, [a9dbd2b5-718c-4d0c-afa8-a12a54a5a12a])
            2:  32830 (10.2, [b87fc717-1cf8-4526-9f96-3042fda5b769])
            3:  30474 (12.4, [ea3e43d8-6cbf-4687-95bd-66e6123512d4])
            4:  27531 (12.5, [74c0e08c-eb94-4825-b93b-21d63e05cf14])
            5:  26535 (10.4, [8e6ed505-8477-4462-9f0a-3a72783cbf60])
            Top 5 highest mean basecall quality scores and their read lengths
            1:  14.8 (1040, [acf6f90b-ea22-4960-8049-6e6e694a3f9a])
            2:  14.7 (9603, [ec796da1-5c4a-4350-974b-6dabb8deb546])
            3:  14.6 (680, [792c485a-81cb-4ef7-8f23-01f10f9c7c23])
            4:  14.5 (2664, [d8092ffb-9919-42fb-ad41-34b1658f1bd5])
            5:  14.5 (909, [d55d3bf6-0729-4b46-82cd-0cef00bcf849])
            Number and percentage of reads above quality cutoffs
            >Q5:    3559 (89.1%)
            >Q7:    3429 (85.8%)
            >Q10:   2705 (67.7%)
            >Q12:   1072 (26.8%)
            >Q15:   0 (0.0%)
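        
        To make the "Read length N50" and quality cutoff lines concrete, the
        sketch below uses the common definition of N50: the length L such that
        reads of length L or longer contain at least half of all bases. It is
        an illustration with made-up numbers, not NanoStat's own code.
        ``read_n50`` and ``count_above`` are hypothetical names, and treating
        the cutoff as strictly greater than is an assumption based on the
        ``>Q`` labels.
        
        ```python
        def read_n50(lengths):
            """Return the N50 of a collection of read lengths (common definition)."""
            if not lengths:
                raise ValueError("no read lengths given")
            half = sum(lengths) / 2.0
            running = 0
            # Walk from the longest read down until half of all bases are covered.
            for length in sorted(lengths, reverse=True):
                running += length
                if running >= half:
                    return length
        
        
        def count_above(qualities, cutoff):
            """Return (count, percentage) of reads whose mean quality exceeds cutoff."""
            count = sum(1 for q in qualities if q > cutoff)
            return count, 100.0 * count / len(qualities)
        
        
        lengths = [100, 200, 300, 400, 500]  # made-up read lengths
        qualities = [4.0, 8.0, 11.0, 13.0]   # made-up mean read qualities
        
        print("Read length N50:", read_n50(lengths))
        for cutoff in (5, 7, 10, 12, 15):
            count, pct = count_above(qualities, cutoff)
            print(">Q{}:\t{} ({:.1f}%)".format(cutoff, count, pct))
        ```
        
        The same arithmetic matches the example output: 3559 of 3995 reads
        above Q5 is 89.1%.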
        
        I welcome all suggestions, bug reports, feature requests and
        contributions. Please open an
        `issue <https://github.com/wdecoster/nanostat/issues>`__ or a pull
        request. I usually respond within a day, and rarely take more than a
        few days.
        
        .. |Twitter URL| image:: https://img.shields.io/twitter/url/https/twitter.com/wouter_decoster.svg?style=social&label=Follow%20%40wouter_decoster
           :target: https://twitter.com/wouter_decoster
        .. |install with conda| image:: https://anaconda.org/bioconda/nanostat/badges/installer/conda.svg
           :target: https://anaconda.org/bioconda/nanostat
        .. |Build Status| image:: https://travis-ci.org/wdecoster/nanostat.svg?branch=master
           :target: https://travis-ci.org/wdecoster/nanostat
        
Keywords: nanopore sequencing statistics
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Requires-Python: >=3
