Metadata-Version: 2.1
Name: casanova
Version: 0.1.0
Summary: Specialized & performant CSV readers, writers and enrichers for python.
Home-page: http://github.com/medialab/casanova
Author: Guillaume Plique
Author-email: kropotkinepiotr@gmail.com
License: MIT
Description: [![Build Status](https://travis-ci.org/medialab/casanova.svg)](https://travis-ci.org/medialab/casanova)
        
        # Casanova
        
        If you often find yourself reading CSV files using python, you will quickly notice that, while being more comfortable, `csv.DictReader` remains way slower than `csv.reader`:
        
        ```
        # To read a 1.5G CSV file:
        csv.reader: 24s
        csv.DictReader: 84s
        ```
        
        Casanova is therefore an attempt to stick to `csv.reader` performance while still keeping a comfortable interface, still able to consider headers etc.
        
        Casanova is thus a good fit for you if you need to:
        
        * Stream large CSV files without running out of memory
        * Enrich the same CSV files by outputing a similar file, all while adding, filtering and editing cells.
        * Have the possibility to resume said enrichment if your process exited
        * Do so in a threadsafe fashion, and be able to resume even if your output does not have the same order as the input
        
        ## Installation
        
        You can install `casanova` with pip with the following command:
        
        ```
        pip install casanova
        ```
        
        ## Usage
        
        * [reader](#reader)
        
        ## reader
        
        ```python
        import casanova
        
        with open('./file.csv') as f:
        
          # Interested in a single column?
          for url in casanova.reader(f, column='url'):
            print(url)
        
          # No headers?
          for url in casanova.reader(f, column=0, no_headers=True):
            print(url)
        
          # Interested in several columns
          for title, url in casanova.reader(f, columns=('title', 'url')):
            print(title, url)
        
          # Working on records
          for record in casanova.reader(f, columns=('title', 'url')):
            # record is a namedtuple based on your columns
            print(record[0], record.url)
        
          # Records slow you down? Need to go faster?
          # You can iterate directly on rows and use the reader's recorded positions
          reader = casanova.reader(f, columns=('title', 'url'))
          url_pos = reader.pos.url
        
          for row in reader.rows():
            print(row[url_pos])
        ```
        
        ### Arguments
        
        * **file** *file*: file object to read.
        * **column** *?str|int*: name or index of target column.
        * **colums** *?iterable<str|int>*: iterable of name or index of target columns.
        
        ### Attributes
        
        * *pos* *int|namedtuple<int>*: index of target column or named tuple of target columns.
        
        ### Methods
        
        #### __iter__
        
        Lets you iterate on a single value or on a namedtuple record.
        
        #### rows
        
        Lets you iterate over the original `csv.reader`.
        
Keywords: url
Platform: UNKNOWN
Requires-Python: >=2.7
Description-Content-Type: text/markdown
