Metadata-Version: 2.1
Name: MissingValues-101703292
Version: 1.0.2
Summary: A Python package to handle missing values in the dataset
Home-page: UNKNOWN
Author: Kriti Pandey
Author-email: kritip105@gmail.com
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas

# Project MISSING VALUES

Name **Kriti Pandey** 

Roll no **101703292**

Group **3COE13**

**DESCRIPTION**

Data can have missing values for a number of reasons, such as observations that were not recorded or data corruption. Handling missing data is important because many machine learning algorithms do not support data with missing values.

Some typical reasons why data is missing:

1) User forgot to fill in a field.

2) Data was lost while transferring manually from a legacy database.

3) There was a programming error.

4) Users chose not to fill out a field tied to their beliefs about how the results would be used or interpreted.

Specifically, there are two steps to handle missing data:

1) Mark invalid or corrupt values as missing in your dataset.

2) Impute missing values with mean values in your dataset.
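
The two steps above can be sketched with pandas as follows. This is a minimal illustration, not the package's actual implementation: the function name `mark_and_impute` and its parameters `invalid_markers` and `columns` are hypothetical. Note that pandas' `mean()` ignores NaN entries, whereas the fill values in the Result section below are consistent with dividing each column sum by the total row count, so the two approaches may give different numbers.

```python
import numpy as np
import pandas as pd


def mark_and_impute(df, invalid_markers=(), columns=None):
    """Hypothetical helper: mark chosen values as NaN, then mean-impute."""
    result = df.copy()
    target = list(columns) if columns is not None else list(result.columns)
    # Step 1: mark invalid or corrupt values as missing.
    if invalid_markers:
        result[target] = result[target].replace(list(invalid_markers), np.nan)
    # Step 2: fill each NaN with its column mean (mean() skips NaN by default).
    result[target] = result[target].fillna(result[target].mean())
    return result


# Assumption for this example: 0.0 stands for an invalid reading.
frame = pd.DataFrame({"a": [1.0, 0.0, 3.0], "b": [10.0, 20.0, np.nan]})
cleaned = mark_and_impute(frame, invalid_markers=[0.0], columns=["a", "b"])
```

Here the `0.0` in column `a` is first marked as missing and then replaced by the mean of the remaining values, while the existing NaN in column `b` is filled directly.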

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install MissingValues_101703292.

```bash
pip install MissingValues_101703292
```

## Usage
Pass the CSV filename, including the .csv extension:

```bash
MissingValues_101703292 data.csv
```
## Sample dataset

| 0  | 1  | 2     | 3    | 4    | 5     | 6    | 7     | 8  | 9 |
|----|----|-------|------|------|-------|------|-------|----|---|
| 0  | 6  | 148.0 | 72.0 | 35.0 | NaN   | 33.6 | 0.627 | 50 | 1 |
| 1  | 1  | 85.0  | 66.0 | 29.0 | NaN   | 26.6 | 0.351 | 31 | 0 |
| 2  | 8  | 183.0 | 64.0 | NaN  | NaN   | 23.3 | 0.672 | 32 | 1 |
| 3  | 1  | 89.0  | 66.0 | 23.0 | 94.0  | 28.1 | 0.167 | 21 | 0 |
| 4  | 0  | 137.0 | 40.0 | 35.0 | 168.0 | 43.1 | 2.288 | 33 | 1 |
| 5  | 5  | 116.0 | 74.0 | NaN  | NaN   | 25.6 | 0.201 | 30 | 0 |
| 6  | 3  | 78.0  | 50.0 | 32.0 | 88.0  | 31.0 | 0.248 | 26 | 1 |
| 7  | 10 | 115.0 | NaN  | NaN  | NaN   | 35.3 | 0.134 | 29 | 0 |
| 8  | 2  | 197.0 | 70.0 | 45.0 | 543.0 | 30.5 | 0.158 | 53 | 1 |
| 9  | 8  | 125.0 | 96.0 | NaN  | NaN   | NaN  | 0.232 | 54 | 1 |
| 10 | 4  | 110.0 | 92.0 | NaN  | NaN   | 37.6 | 0.191 | 30 | 0 |
| 11 | 10 | 168.0 | 74.0 | NaN  | NaN   | 38.0 | 0.537 | 34 | 1 |
| 12 | 10 | 139.0 | 80.0 | NaN  | NaN   | 27.1 | 1.441 | 57 | 0 |
| 13 | 1  | 189.0 | 60.0 | 23.0 | 846.0 | 30.1 | 0.398 | 59 | 1 |
| 14 | 5  | 166.0 | 72.0 | 19.0 | 175.0 | 25.8 | 0.587 | 51 | 1 |
| 15 | 7  | 100.0 | NaN  | NaN  | NaN   | 30.0 | 0.484 | 32 | 1 |
| 16 | 0  | 118.0 | 84.0 | 47.0 | 230.0 | 45.8 | 0.551 | 31 | 1 |
| 17 | 7  | 107.0 | 74.0 | NaN  | NaN   | 29.6 | 0.254 | 31 | 1 |
| 18 | 1  | 103.0 | 30.0 | 38.0 | 83.0  | 43.3 | 0.183 | 33 | 0 |
| 19 | 1  | 115.0 | 70.0 | 30.0 | 96.0  | 34.6 | 0.529 | 32 | 1 |

## Input

```bash
MissingValues_101703292 Sampledata.csv
```
## Result

```bash
S No.   1    2     3     4       5      6      7     8   9
0       0   6  148  72.0  35.0  116.15  33.60  0.627  50  1
1       1   1   85  66.0  29.0  116.15  26.60  0.351  31  0
2       2   8  183  64.0  17.8  116.15  23.30  0.672  32  1
3       3   1   89  66.0  23.0   94.00  28.10  0.167  21  0
4       4   0  137  40.0  35.0  168.00  43.10  2.288  33  1
5       5   5  116  74.0  17.8  116.15  25.60  0.201  30  0
6       6   3   78  50.0  32.0   88.00  31.00  0.248  26  1
7       7  10  115  61.7  17.8  116.15  35.30  0.134  29  0
8       8   2  197  70.0  45.0  543.00  30.50  0.158  53  1
9       9   8  125  96.0  17.8  116.15  30.95  0.232  54  1
10     10   4  110  92.0  17.8  116.15  37.60  0.191  30  0
11     11  10  168  74.0  17.8  116.15  38.00  0.537  34  1
12     12  10  139  80.0  17.8  116.15  27.10  1.441  57  0
13     13   1  189  60.0  23.0  846.00  30.10  0.398  59  1
14     14   5  166  72.0  19.0  175.00  25.80  0.587  51  1
15     15   7  100  61.7  17.8  116.15  30.00  0.484  32  1
16     16   0  118  84.0  47.0  230.00  45.80  0.551  31  1
17     17   7  107  74.0  17.8  116.15  29.60  0.254  31  1
18     18   1  103  30.0  38.0   83.00  43.30  0.183  33  0
19     19   1  115  70.0  30.0   96.00  34.60  0.529  32  1
```

## Constraint
*Your CSV file should not contain categorical data.*
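
One way to check this constraint before running the tool is to look for non-numeric columns with pandas. This is a sketch only; the helper name `non_numeric_columns` and the example column names are hypothetical and not part of this package.

```python
import pandas as pd


def non_numeric_columns(df):
    """Hypothetical helper: return labels of columns that are not numeric."""
    return list(df.select_dtypes(exclude="number").columns)


# Example frame with one numeric and one categorical (text) column.
frame = pd.DataFrame({"age": [50, 31], "city": ["Pune", "Delhi"]})
offending = non_numeric_columns(frame)
```

If the returned list is not empty, those columns would need to be dropped or encoded as numbers before the file is passed to the tool.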



## License
[MIT](https://choosealicense.com/licenses/mit/)

