Metadata-Version: 2.4
Name: csv_fix
Version: 0.2.0
Summary: A command line tool for CSV file processing
Author-email: Your Name <your.email@example.com>
License: MIT
Project-URL: Homepage, https://github.com/yourusername/csv_fix
Project-URL: Repository, https://github.com/yourusername/csv_fix
Keywords: csv,converter,cli
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# csv_conv

[![PyPI version](https://img.shields.io/pypi/v/csv_conv.svg)](https://pypi.org/project/csv_conv/)
[![Python versions](https://img.shields.io/pypi/pyversions/csv_conv.svg)](https://pypi.org/project/csv_conv/)

## Instruction

This script uses state machine to fix maleformed csv file.

```txt
Usage: csv_conv.py [Options] <filename>
Options and arguments:
  [-h/--help]: Show this message.
  [-s]: Define sperator. Defaults to comma.
  [-q]: Define text qualifier. Defaults to auto detect.
  [-t]: Trim white space at the beginning and end of each field. Defaults to double quote.
  [-z]: (Being constructed) Specify timezone for time fields. Defaults to server timezone. Can also be Asia/Chongqing etc.
        For standard timezone names, refer to: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
  [-k]: Skip errors and continue parsing following lines.
  <filename>: csv file name.
```

## Examples
```bash
cat testcase/test_case_1.csv | csv_conv
csv_conv -t testcase/test_case_1.csv
```

## Features

### Quote within Quote

Quotes should be immediated followed by seperator or line end or EOF, otherwise it's not a valid qualifier, and should be escaped. The following format should be fixed (_See `../testcase/test_case_1.csv`_):

```csv
"What""s up!","I"m good",I"m also good,"I'm still good"","two quotes" shouldn't make any difference""
```

```bash
./csv_conv.py ../testcase/test_case_1.csv
"What""s up!","I""m good","I""m also good","""I'm still good""","two quotes"" shouldn't make any difference"""
```

### Begin/End with White Space

For fields that begin/end with white space are stripped by default. Otherwise `mongoimport` type conversion wouldn't work properly (_See `../testcase/test_case_2.csv`_):

```csv
 red,	yellow,green ,"red	"
 ```

```bash
./csv_conv.py ../testcase/test_case_2.csv
"red","yellow","green","red"
```

If it's not expected behavior, use `-t false` to cancel it.

### Customize Seperator

Seperator can be customized, not necessarily to be ",". Specify seperator by `-s`. Note that seperators like `|` needs to be escaped in bash. E.g. (_See `../testcase/test_case_3.csv`_):

```csv
"hello"||"This is a test"||"This should be good.||"||""
```

```bash
./csv_conv.py -s '||' ../testcase/test_case_3.csv
"hello","This is a test","This should be good.||""",
```
