Metadata-Version: 2.2
Name: Team_1_library
Version: 1.1
Summary: Data preprocessing library with missing value imputation and outlier correction functions.
Home-page: https://github.com/Manex14/Team_1_library
Author: Jokin Agirre, Irene Alvarez, Uxue Auzmendi, Jon Lorenzo, Manex Ugarte 
Author-email: manex.ugarte@alumni.mondragon.edu
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: pandas
Requires-Dist: scipy
Requires-Dist: numpy
Requires-Dist: scikit-learn
Requires-Dist: fuzzywuzzy
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

Team_1_library

Python library with two classes:
- Automatic preprocessing, with advanced functions for missing value imputation using clustering and statistical techniques, as well as outlier correction.
- Manual preprocessing, with manual per-column statistical imputation, numerical outlier correction, and advanced string-focused functions such as normalization and correction.

Characteristics

- Missing value imputation with clustering (K-Means).
- Missing value imputation with statistical values (mean, median, mode).
- Outlier detection with the Z-score technique, with detected outliers replaced by the mean.
- Empty column elimination with customizable threshold.
- Low variance column elimination with customizable threshold.
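
The statistical techniques listed above can be sketched directly with pandas and NumPy. This is a minimal illustration, not the library's own API: the function names, parameter names, and default thresholds below are hypothetical, and the library's actual classes and methods are the ones documented in 'USER_GUIDE.md'.

```python
import numpy as np
import pandas as pd


def impute_with_statistic(series: pd.Series, strategy: str = "mean") -> pd.Series:
    """Fill missing values with the column mean, median, or mode.

    Hypothetical helper; assumes the column has at least one non-missing value.
    """
    if strategy == "mean":
        fill_value = series.mean()
    elif strategy == "median":
        fill_value = series.median()
    elif strategy == "mode":
        # mode() can return several values; take the first one.
        fill_value = series.mode().iloc[0]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return series.fillna(fill_value)


def correct_outliers_to_mean(series: pd.Series, z_threshold: float = 3.0) -> pd.Series:
    """Replace values whose absolute Z-score exceeds z_threshold with the column mean.

    The mean and standard deviation are computed on the original column,
    outliers included (an assumption of this sketch).
    """
    mean = series.mean()
    std = series.std(ddof=0)
    if std == 0 or np.isnan(std):
        # Constant or empty column: no Z-scores to compute.
        return series.copy()
    z_scores = (series - mean) / std
    return series.mask(z_scores.abs() > z_threshold, mean)


def drop_sparse_and_low_variance_columns(
    df: pd.DataFrame,
    max_missing_ratio: float = 0.5,
    min_variance: float = 0.0,
) -> pd.DataFrame:
    """Drop columns that are mostly empty or whose variance is at or below min_variance."""
    missing_ratio = df.isna().mean()
    kept = df.loc[:, missing_ratio <= max_missing_ratio]
    # Only numeric columns have a variance; other columns are left in place.
    variances = kept.var(numeric_only=True)
    low_variance = variances[variances <= min_variance].index
    return kept.drop(columns=low_variance)
```

One point to keep in mind when choosing a Z-score threshold: with the population standard deviation (`ddof=0`), no value in a sample of n points can have an absolute Z-score above sqrt(n - 1), so a threshold of 3 never flags anything in a column with 10 or fewer rows.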

- String normalization (lowercasing and space removal).
- String correction based on similarity to known values.
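
The two string features can be illustrated with a short sketch. It assumes that "gap elimination" means removing whitespace, and it uses the standard-library `difflib` as a stand-in for similarity scoring; the library itself depends on `fuzzywuzzy`, whose scores will differ. The function names and the `min_similarity` parameter are hypothetical and are not the library's API.

```python
import difflib
from typing import Iterable


def normalize_string(value: str) -> str:
    """Lowercase a string and remove all whitespace."""
    return "".join(value.lower().split())


def correct_string(value: str, valid_values: Iterable[str], min_similarity: float = 0.8) -> str:
    """Return the most similar entry of valid_values, or value unchanged if none is close enough.

    Similarity is difflib's ratio in [0, 1], computed on normalized strings.
    """
    normalized = normalize_string(value)
    # Map each normalized candidate back to its original spelling.
    candidates = {normalize_string(v): v for v in valid_values}
    matches = difflib.get_close_matches(
        normalized, list(candidates), n=1, cutoff=min_similarity
    )
    return candidates[matches[0]] if matches else value
```

For example, `correct_string("Barcelna", ["Barcelona", "Madrid"])` maps the misspelled value to "Barcelona", while a value with no sufficiently similar candidate is returned as-is.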

Usage:

The functions of this library are explained in the file 'USER_GUIDE.md', and a usage example is included in the file 'example_usage.py'. We highly recommend reading both files to understand how the library works.

Installation:

To install the library, navigate to the folder containing 'Team_1_library' in your preferred environment and install it using 'pip':
```bash
pip install ./path/to/the/library
```

Make sure you have the required dependencies installed:
```bash
pip install pandas scipy numpy scikit-learn fuzzywuzzy
```

 
