Metadata-Version: 2.1
Name: DataCleanerAI
Version: 0.0.1
Summary: An interactive, intelligent data-cleaning library with ML-based user adaptation
Home-page: https://github.com/harshv-v/DataCleanerAI
Author: Harsha Vardhan
Author-email: harshav.vanukuri@gmail.com
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: numpy
Requires-Dist: scikit-learn
Requires-Dist: dask
Requires-Dist: matplotlib
Requires-Dist: spacy

# DataCleanerAI

**DataCleanerAI** is an interactive Python library for data cleaning. It analyzes a dataset, prompts the user about the issues it finds, and adapts to the user's cleaning preferences over time, with particular support for large financial or textual datasets.

## Key Features

- **Automatic Data Analysis**: Detects missing values, outliers, duplicates, and type inconsistencies.
- **Interactive Prompting**: Prompts users for specific cleaning actions based on the issues identified.
- **ML-Based Preference Prediction**: Learns from user responses over time and predicts future cleaning actions from that history.
- **Memory Optimization for Large Datasets**: Supports efficient processing of large datasets with Dask for out-of-core data handling.
- **Reusable Cleaning Pipelines**: Enables creating automated, reusable workflows for repetitive data cleaning tasks.
- **Enhanced NLP for Text Data**: Utilizes NLP techniques to clean and standardize textual data fields.
- **Preference Export/Import**: Allows exporting and importing user-defined preferences for collaborative or repeated projects.
- **Detailed Reporting**: Generates visual insights and summary reports on data quality.
- **ETL Integration**: Seamlessly integrates with ETL workflows for scheduled data cleaning tasks.
- **Multi-User Support**: Supports multi-user configurations, with individualized or shared preferences.
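
To make the first feature concrete, here is a minimal sketch of the kind of checks that "Automatic Data Analysis" describes, written with plain pandas. It is not DataCleanerAI's own API: the function name `summarize_issues`, the `iqr_factor` parameter, the report keys, and the sample data are all assumptions made for illustration, and the 1.5 × IQR rule is just one common way to flag outliers.

```python
import pandas as pd


def summarize_issues(df: pd.DataFrame, iqr_factor: float = 1.5) -> dict:
    """Return a small data-quality report (hypothetical helper, not the library API)."""
    report = {
        # Number of missing cells in each column.
        "missing": df.isna().sum().to_dict(),
        # Number of rows that exactly repeat an earlier row.
        "duplicates": int(df.duplicated().sum()),
        "outliers": {},
        "mixed_types": [],
    }

    # Flag numeric values that fall outside the IQR fences.
    for col in df.select_dtypes(include="number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        spread = q3 - q1
        low, high = q1 - iqr_factor * spread, q3 + iqr_factor * spread
        report["outliers"][col] = int(((df[col] < low) | (df[col] > high)).sum())

    # An object column holding more than one Python type suggests inconsistent typing.
    for col in df.select_dtypes(include="object").columns:
        if df[col].dropna().map(type).nunique() > 1:
            report["mixed_types"].append(col)

    return report


# Illustrative data only: one extreme amount, one repeated row,
# one missing note, and one column mixing strings with an integer.
df = pd.DataFrame(
    {
        "amount": [10.0, 12.0, 11.0, 13.0, 500.0, 12.0],
        "code": ["a", "b", 3, "d", "e", "b"],
        "note": ["x", "y", None, "z", "w", "y"],
    }
)

report = summarize_issues(df)
print(report)
```

A report like this is the sort of input an interactive prompt could work from: each non-empty entry corresponds to a question the user might be asked, such as whether to drop duplicates or how to fill missing values.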

## Installation

You can install DataCleanerAI via pip:

```bash
pip install DataCleanerAI
```