Metadata-Version: 2.4
Name: autocsv-profiler
Version: 1.1.0
Summary: Comprehensive automated CSV data analysis with statistical insights and visualizations
Author-email: dhaneshbb <dhaneshbb5@gmail.com>
Maintainer-email: dhaneshbb <dhaneshbb5@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/dhaneshbb/AutoCSV-Profiler-Suite
Project-URL: Repository, https://github.com/dhaneshbb/AutoCSV-Profiler-Suite
Project-URL: Issues, https://github.com/dhaneshbb/AutoCSV-Profiler-Suite/issues
Project-URL: Changelog, https://github.com/dhaneshbb/AutoCSV-Profiler-Suite/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/dhaneshbb/AutoCSV-Profiler-Suite/tree/main/docs
Keywords: csv,data-analysis,statistics,eda,data-science,profiling
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.5.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: scipy>=1.10.0
Requires-Dist: matplotlib>=3.6.0
Requires-Dist: seaborn>=0.12.0
Requires-Dist: scikit-learn>=1.2.0
Requires-Dist: statsmodels>=0.13.0
Requires-Dist: tqdm>=4.64.0
Requires-Dist: tableone>=0.7.12
Requires-Dist: missingno>=0.5.2
Requires-Dist: tabulate>=0.9.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=0.991; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-cov>=4.0.0; extra == "test"
Provides-Extra: docs
Requires-Dist: sphinx>=5.0.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0.0; extra == "docs"
Dynamic: license-file

# AutoCSV Profiler

A comprehensive toolkit for automated CSV data analysis providing statistical insights, data quality assessment, and interactive visualizations.

[![PyPI version](https://badge.fury.io/py/autocsv-profiler.svg)](https://badge.fury.io/py/autocsv-profiler)
[![Python Support](https://img.shields.io/pypi/pyversions/autocsv-profiler.svg)](https://pypi.org/project/autocsv-profiler/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- **Comprehensive Statistical Analysis**: Descriptive statistics, distributions, and data summaries
- **Data Quality Assessment**: Missing value analysis, outlier detection, and duplicate identification
- **Advanced Visualizations**: Box plots, histograms, correlation matrices, and KDE plots
- **Interactive Reports**: HTML reports with detailed insights and recommendations
- **Command-Line Interface**: Easy-to-use CLI for immediate analysis
- **Python API**: Programmatic access for integration into data pipelines

## Installation

```bash
pip install autocsv-profiler
```

## Quick Start

### Command Line Usage

```bash
# Basic analysis
autocsv-profiler data.csv

# Specify output directory
autocsv-profiler data.csv --output ./my_analysis

# Custom delimiter
autocsv-profiler data.csv --delimiter ";"
```

### Python API Usage

```python
from autocsv_profiler import auto_csv_profiler

# Run comprehensive analysis
auto_csv_profiler.main("data.csv", "output_directory")

# Or import specific functions
from autocsv_profiler.recognize_delimiter import detect_delimiter

delimiter = detect_delimiter("data.csv")
print(f"Detected delimiter: {delimiter}")
```

## Analysis Workflow

Analysis workflow (flowchart):

1. **Input CSV File**
2. **Delimiter Detection**
3. **Data Loading & Validation**
4. **Exploratory Data Analysis**, which branches into:
   - **Statistical Analysis**: Descriptive Statistics, Distribution Analysis, Correlation Analysis
   - **Data Quality Assessment**: Missing Value Analysis, Outlier Detection, Duplicate Analysis
   - **Visualization Generation**: Box Plots, Histograms, Correlation Matrices
5. **Output Reports**

## Generated Outputs

### Statistical Reports
- **Dataset Overview**: Shape, data types, memory usage
- **Descriptive Statistics**: Mean, median, mode, standard deviation
- **Distribution Analysis**: Skewness, kurtosis, normality tests
- **Categorical Analysis**: Frequency tables and unique value counts
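
The quantities listed above can be reproduced by hand with pandas. The sketch below is illustrative only: the column names and values are made up, and it does not show how the package computes or formats its own reports.

```python
import pandas as pd

# Small illustrative frame; real use would load a CSV with pd.read_csv
df = pd.DataFrame({
    "price": [10.0, 12.5, 11.0, 250.0, 9.5],
    "region": ["north", "south", "north", "east", "south"],
})

# Dataset overview: shape, data types, memory usage in bytes
overview = {
    "shape": df.shape,
    "dtypes": df.dtypes.astype(str).to_dict(),
    "memory_bytes": int(df.memory_usage(deep=True).sum()),
}

# Descriptive statistics and distribution shape for numeric columns
numeric = df.select_dtypes(include="number")
stats = numeric.agg(["mean", "median", "std", "skew", "kurt"])

# Frequency table for a categorical column
region_counts = df["region"].value_counts()

print(overview)
print(stats)
print(region_counts)
```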

### Data Quality Assessment
- **Missing Values**: Patterns, counts, and visualizations
- **Outliers**: IQR-based detection with statistical summaries
- **Duplicates**: Identification and detailed reporting
- **Data Consistency**: Type validation and integrity checks

### Visualizations
- **Distribution Plots**: Histograms with KDE overlays
- **Box Plots**: Outlier visualization and quartile analysis
- **Correlation Analysis**: Heatmaps and relationship matrices
- **Missing Data Patterns**: Matrix plots and summary charts

### Interactive Reports
- **HTML Dashboard**: Comprehensive overview with navigation
- **Data Dictionary**: Detailed variable descriptions
- **Quality Summary**: Actionable insights and recommendations

## Output Structure

```
your_file_analysis/
├── your_file.csv                     # Copy of original data
├── dataset_info.txt                  # Basic dataset information
├── summary_statistics_all.txt        # Comprehensive statistics
├── categorical_summary.txt           # Categorical variable analysis
├── missing_values_report.txt         # Missing data analysis
├── outliers_summary.txt              # Outlier detection results
├── distinct_values_count_by_dtype.html # Interactive value explorer
└── visualization/                    # Generated plots and charts
    ├── box_plots/
    ├── histograms/
    └── correlation_matrices/
```
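
After a run, a quick way to confirm that the reports were written is to check the output directory against the file names in the tree above. The helper `missing_reports` below is a hypothetical convenience function, not part of the package.

```python
from pathlib import Path

# File names taken from the output tree above
EXPECTED_REPORTS = [
    "dataset_info.txt",
    "summary_statistics_all.txt",
    "categorical_summary.txt",
    "missing_values_report.txt",
    "outliers_summary.txt",
    "distinct_values_count_by_dtype.html",
]


def missing_reports(output_dir):
    """Return the expected report files that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_REPORTS if not (root / name).is_file()]
```

Calling `missing_reports("your_file_analysis")` returns an empty list when every report is present.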

## Advanced Features

### Missing Value Analysis
- Automatic detection of missing value patterns
- Visualization of missing data distribution
- Imputation suggestions and options
- Missing value correlation analysis
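
As a rough illustration of what missing value counts and missing value correlation mean, the pandas sketch below computes both on a made-up frame. It is an assumption about the general technique, not a view into the package's internals.

```python
import numpy as np
import pandas as pd

# Illustrative data with gaps in every column
df = pd.DataFrame({
    "age": [34, np.nan, 29, np.nan, 41],
    "income": [52000, np.nan, 48000, 61000, np.nan],
    "city": ["Pune", "Delhi", None, "Mumbai", "Delhi"],
})

# Count and percentage of missing values per column
missing = pd.DataFrame({
    "count": df.isna().sum(),
    "percent": (df.isna().mean() * 100).round(1),
})

# Correlation between missingness indicators (1 = missing, 0 = present):
# a high value means two columns tend to be missing in the same rows
missing_corr = df.isna().astype(int).corr()

print(missing)
print(missing_corr)
```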

### Outlier Detection
- IQR-based outlier identification
- Statistical summaries for outliers
- Visual outlier highlighting in plots
- Outlier impact assessment
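
For reference, IQR-based detection conventionally flags values that fall more than 1.5 times the interquartile range outside the first or third quartile. The sketch below assumes that conventional multiplier; the function name `iqr_outliers` and the parameter `k` are illustrative, and the package's own threshold may differ.

```python
import pandas as pd


def iqr_outliers(series, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]


values = pd.Series([10, 12, 11, 13, 12, 95, 11, 10])
outliers = iqr_outliers(values)
print(outliers.tolist())  # [95]
```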

### Statistical Testing
- Normality tests (Shapiro-Wilk)
- Correlation analysis (Pearson, Spearman)
- Chi-square tests for categorical variables
- Variance inflation factor (VIF) analysis
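
The first three tests in this list are available directly in SciPy. The sketch below runs them on synthetic data to show what each one returns; the variable names and data are assumptions for illustration, and it does not reproduce the package's report format.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data: y depends linearly on x plus noise
rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=5, size=200)
y = 2 * x + rng.normal(scale=3, size=200)

# Normality: Shapiro-Wilk (null hypothesis: the sample is normally distributed)
shapiro_stat, shapiro_p = stats.shapiro(x)

# Correlation: Pearson (linear) and Spearman (rank-based)
pearson_r, pearson_p = stats.pearsonr(x, y)
spearman_rho, spearman_p = stats.spearmanr(x, y)

# Chi-square test of independence on a contingency table
table = pd.crosstab(
    pd.Series(["a", "a", "b", "b", "a", "b", "a", "b"], name="group"),
    pd.Series(["yes", "no", "yes", "no", "yes", "no", "yes", "no"], name="answer"),
)
chi2, chi2_p, dof, expected = stats.chi2_contingency(table)

print(f"Shapiro-Wilk p={shapiro_p:.3f}")
print(f"Pearson r={pearson_r:.3f}, Spearman rho={spearman_rho:.3f}")
print(f"Chi-square={chi2:.3f}, dof={dof}, p={chi2_p:.3f}")
```

A small p-value from Shapiro-Wilk is evidence against normality; a small p-value from the chi-square test is evidence that the two categorical variables are not independent.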

### Relationship Analysis
- Variable correlation matrices
- Target variable analysis (if specified)
- Feature importance insights
- Interaction effect detection

## Examples

### Basic CSV Analysis
```python
import autocsv_profiler

# Analyze sales data
autocsv_profiler.main("sales_data.csv", "sales_analysis")
```

### Custom Analysis Pipeline
```python
import pandas as pd
from autocsv_profiler import auto_csv_profiler
from autocsv_profiler.recognize_delimiter import detect_delimiter

# Detect the delimiter, then load the data for a quick sanity check
delimiter = detect_delimiter("customer_data.csv")
df = pd.read_csv("customer_data.csv", delimiter=delimiter)
print(df.shape)

# Run comprehensive analysis
auto_csv_profiler.main("customer_data.csv", "customer_analysis")
```

### Batch Processing
```python
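import os
from autocsv_profiler import auto_csv_profiler

# Analyze all CSV files in a directory
data_dir = "data"
for filename in sorted(os.listdir(data_dir)):
    # Match the extension case-insensitively (e.g. "REPORT.CSV")
    if filename.lower().endswith(".csv"):
        input_file = os.path.join(data_dir, filename)
        stem = os.path.splitext(filename)[0]
        output_dir = os.path.join("analysis", f"{stem}_results")
        auto_csv_profiler.main(input_file, output_dir)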
```

## Requirements

- Python 3.9 or higher
- pandas >= 1.5.0
- numpy >= 1.24.0
- matplotlib >= 3.6.0
- seaborn >= 0.12.0
- scipy >= 1.10.0
- scikit-learn >= 1.2.0
- statsmodels >= 0.13.0
- tqdm >= 4.64.0
- tableone >= 0.7.12
- missingno >= 0.5.2
- tabulate >= 0.9.0

pip installs all of these dependencies automatically.

## Performance Tips

- **Large Files**: For files larger than 100 MB, consider profiling a random sample first
- **Memory Usage**: Monitor memory for datasets with many categorical variables
- **Output Management**: Delete old analysis directories to free disk space
- **Parallel Processing**: Use batch scripts to process multiple files
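
One way to sample a large file without loading it all at once is to read it in chunks and keep a random fraction of each chunk. The helper `load_sample` below and its defaults (a 100 MB threshold, a 10% sample) are assumptions for illustration and are not part of the package.

```python
import os

import pandas as pd


def load_sample(path, max_bytes=100 * 1024 * 1024, frac=0.1,
                chunksize=100_000, seed=0):
    """Load the whole CSV if it is small, else a random fraction of each chunk."""
    if os.path.getsize(path) <= max_bytes:
        return pd.read_csv(path)
    # Stream the file so only one chunk is held in memory at a time
    chunks = pd.read_csv(path, chunksize=chunksize)
    sampled = [chunk.sample(frac=frac, random_state=seed) for chunk in chunks]
    return pd.concat(sampled, ignore_index=True)
```

The returned frame can be written back out with `to_csv(..., index=False)` and the resulting file passed to the profiler as usual.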

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Support

- **Issues**: [GitHub Issues](https://github.com/dhaneshbb/AutoCSV-Profiler-Suite/issues)
- **Documentation**: [GitHub Docs](https://github.com/dhaneshbb/AutoCSV-Profiler-Suite/tree/main/docs)
- **Changelog**: [CHANGELOG.md](CHANGELOG.md)

## Version

Current version: 1.1.0

See [CHANGELOG.md](CHANGELOG.md) for version history and updates.
