**`KNNImputer`** is a class in the `sklearn.impute` module of the `scikit-learn` library for Python. 
> `Scikit-learn` is generally used for machine learning. 

The `KNNImputer` is used to fill in missing values in a dataset using the [k-Nearest Neighbors](https://www.educative.io/answers/what-is-the-k-nearest-neighbor-algorithm) method. 

> The k-Nearest Neighbors algorithm is used for classification and regression problems. 

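The `KNNImputer` estimates each missing value from the `n_neighbors` most similar rows (samples) that have a value for that feature. Similarity is measured with a distance metric that skips missing coordinates (`nan_euclidean` by default), and the imputed value is the mean of the neighbors' values for that feature, either unweighted or weighted by distance. 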
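To make the neighbor-averaging idea concrete, here is a minimal sketch that imputes a small array by hand and compares the result with `KNNImputer`. The helper `impute_by_hand` and the sample data are hypothetical and only for illustration; the sketch assumes uniform weights and that every pair of rows shares at least one observed feature. It is not how the library is implemented internally.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.metrics.pairwise import nan_euclidean_distances

# Hypothetical sample data: two cells are missing
X = np.array([
    [1.0, 2.0, np.nan],
    [2.0, 3.0, 5.0],
    [8.0, 9.0, 20.0],
    [3.0, 4.0, 7.0],
    [np.nan, 8.0, 18.0],
])

def impute_by_hand(X, k):
    """Illustrative helper: fill each missing cell with the mean of its k nearest donors."""
    X = np.asarray(X, dtype=float)
    # Pairwise distances computed only over coordinates present in both rows
    dist = nan_euclidean_distances(X)
    filled = X.copy()
    for row, col in zip(*np.where(np.isnan(X))):
        # Donors are the rows that have an observed value in this column
        donors = np.where(~np.isnan(X[:, col]))[0]
        # Keep the k donors closest to the row being imputed
        nearest = donors[np.argsort(dist[row, donors])[:k]]
        filled[row, col] = X[nearest, col].mean()
    return filled

manual = impute_by_hand(X, k=2)
library = KNNImputer(n_neighbors=2).fit_transform(X)
print(manual)
print(np.allclose(manual, library))
```

On this data the two results agree, which suggests the sketch captures the core idea for the default settings.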

The illustration below shows how `KNNImputer` works in `scikit-learn`:

# Definition

The `KNNImputer` class is defined as follows:

``` Python
class sklearn.impute.KNNImputer(*, missing_values=nan, n_neighbors=5, weights='uniform', metric='nan_euclidean', copy=True, add_indicator=False)
```
# Parameters 

The `KNNImputer` class takes in the following parameters:

| Parameters  |  Purpose |
| - | - |
| `missing_values`  | All instances of `missing_values` will be imputed. Values include `int`, `float`, `str`, `np.nan` or `None`. By default: `np.nan`  |
| `n_neighbors`  |  Number of neighbors used for prediction. By default: 5 |
|`weights`| Weight function used for prediction. Values include `'uniform'`, `'distance'`, or a callable. By default: `'uniform'`. |
| `metric` | Distance metric for searching neighbors. Used in the k-nearest neighbors algorithm. Values include `'nan_euclidean'` or a callable. By default: `'nan_euclidean'`. |
| `copy`  | Takes in a `bool` value. If True, a copy of the data will be created. If False, imputation will be done in-place. By default: True  |
| `add_indicator`  | Takes in a `bool` value. If True, a MissingIndicator transform will stack onto the output of the imputer’s transform. By default: False  |
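As a rough illustration of how two of these parameters change the output, the sketch below runs the imputer with `weights='distance'` and with `add_indicator=True`. The sample array and variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Sample data (assumed for illustration): columns 0 and 2 each contain one missing value
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 6.0, 12.0],
    [np.nan, 12.0, 24.0],
    [2.0, 4.0, 16.0],
])

# Uniform weights: each of the 2 neighbors contributes equally
uniform = KNNImputer(n_neighbors=2, weights="uniform").fit_transform(X)

# Distance weights: closer neighbors contribute more to the imputed value
weighted = KNNImputer(n_neighbors=2, weights="distance").fit_transform(X)

# add_indicator=True appends one 0/1 column per feature that had missing values
flagged = KNNImputer(n_neighbors=2, add_indicator=True).fit_transform(X)

print(uniform[0, 2], weighted[0, 2])
print(flagged.shape)  # (4, 5): 3 original features + 2 indicator columns
```

The indicator columns can be useful when a downstream model should know which values were imputed rather than observed.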

# Methods

The `KNNImputer` class has several methods:

| Method  | Purpose  |
| - | - |
| `fit(X)`  | Fit the imputer on X. |
| `fit_transform(X)`  | Fit to data, then transform it.  |
| `get_params()`  | Get parameters for this estimator.  |
| `set_params(**params)`  | Set parameters for the estimator.  |
|`transform(X)` | Impute all missing values of X. |

For simple cases, calling `fit_transform` alone is enough: it fits the imputer and returns the imputed data in one step.
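When the data to impute differs from the data used to find neighbors, `fit` and `transform` can be called separately. The sketch below, using made-up `train` and `new_rows` arrays, fits on complete data and then imputes a new row; it also shows `get_params` and `set_params`.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Made-up training data with no missing values
train = np.array([
    [1.0, 2.0, 4.0],
    [3.0, 6.0, 12.0],
    [5.0, 10.0, 20.0],
    [2.0, 4.0, 16.0],
])

# A new row with one missing value
new_rows = np.array([[3.0, 6.0, np.nan]])

imputer = KNNImputer(n_neighbors=2)
imputer.fit(train)                      # store the training rows as candidate neighbors
imputed = imputer.transform(new_rows)   # neighbors are looked up in the training rows
print(imputed)

# Inspect and change a parameter, then refit before transforming again
print(imputer.get_params()["n_neighbors"])
imputer.set_params(n_neighbors=1)
imputed_single = imputer.fit(train).transform(new_rows)
print(imputed_single)
```

Fitting once and reusing `transform` keeps the neighbor pool fixed, which is usually what you want when imputing test data from training data.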

# Example

The following example shows how we can use the `KNNImputer` in scikit-learn:

``` Python
import numpy as np  # NumPy provides np.nan for marking missing values
from sklearn.impute import KNNImputer

# Create an array with missing values
X = [[1, 2, np.nan], [3, 6, 12], [np.nan, 12, 24], [2, 4, 16]]
print("Original array: ", X)

imputer = KNNImputer(n_neighbors=2)  # Create a KNNImputer that uses 2 neighbors
array = imputer.fit_transform(X)  # Fit the imputer and impute the missing values
print("Updated array: ", array)
```


