A nonparametric model is a statistical model used in machine learning. Unlike a parametric model, which has a fixed number of parameters, a nonparametric model is used when the researcher knows very little about the parameters governing the variable's distribution in the population.
Nonparametric models don't summarize the population with a few fixed parameters such as the mean (average) or standard deviation; instead, they describe the variable's distribution directly. The data determines the structure of a nonparametric model. Unlike parametric models, they do not assume that the data follows a predefined probability distribution.
One of the most popular nonparametric machine learning algorithms is the k-nearest neighbors (KNN) algorithm, which is used for both classification and regression. The model compares a new data point directly against the training data and locates its k nearest neighbors, typically using the Euclidean distance. The prediction is then the majority class among those neighbors (for classification) or the average of their values (for regression).
Suppose we have two categories, represented by blue and orange. We add a new data point, represented by green. We will now use KNN to assign it to one of the two categories.
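The blue, orange, and green example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_classify`, the coordinates, and the choice of k = 3 are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k nearest neighbors and pick the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two clusters labelled "blue" and "orange".
train = [
    ((1.0, 1.0), "blue"),
    ((1.5, 2.0), "blue"),
    ((2.0, 1.5), "blue"),
    ((6.0, 6.0), "orange"),
    ((6.5, 7.0), "orange"),
    ((7.0, 6.5), "orange"),
]

# The new "green" point sits close to the blue cluster.
green = (2.5, 2.0)
print(knn_classify(train, green, k=3))  # prints "blue"
```

Note that there is no training step that estimates parameters: the stored training points themselves are the model, which is what makes KNN nonparametric.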
Another popular nonparametric machine learning algorithm is the decision tree algorithm, which is also used for classification and regression. Decision trees are often used for data mining and pattern analysis. They split the data on predictor values in a hierarchy of decisions, and each path through that hierarchy leads to a prediction for the outcome variable. The benefit of using decision trees is that they are easy to understand and can capture non-linear relationships in the data.
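To make the idea of a data-driven decision concrete, the sketch below finds a single split on one numeric predictor. It assumes Gini impurity as the splitting criterion, which is one common choice and not something the text above specifies; the names `gini` and `best_split` and the study-hours data are hypothetical. A full decision tree would repeat this search recursively on each side of the split.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, ys):
    """Find the threshold on one numeric feature with the lowest weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        # Weight each side's impurity by the share of samples it contains.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: hours studied (predictor) and exam result (outcome).
hours = [1, 2, 3, 6, 7, 8]
passed = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(hours, passed)
print(threshold, impurity)  # prints "4.5 0.0"
```

The threshold is not chosen in advance; it is read off the data, and more data can lead to more splits.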
Similar to the previous two models, support vector machine (SVM) algorithms can be used for both classification and regression, although they are mainly used for classification. SVM algorithms use a set of mathematical functions called kernels to take data as input and map it into a space where the classes are easier to separate. A common use of an SVM is to separate two classes.
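As a rough illustration of what a kernel is, the sketch below implements the Gaussian (RBF) kernel, one widely used choice. It shows only the kernel function, not a full SVM; the function name `rbf_kernel`, the value of `gamma`, and the sample points are assumptions made for the example.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

# The kernel acts as a similarity score: 1.0 for identical points,
# shrinking toward 0 as the points move apart.
same = rbf_kernel((1.0, 2.0), (1.0, 2.0))
near = rbf_kernel((1.0, 2.0), (1.5, 2.0))
far = rbf_kernel((1.0, 2.0), (6.0, 7.0))
print(same, near, far)
```

An SVM uses such pairwise similarities in place of raw coordinates, which lets it draw a non-linear boundary between two classes.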
Parametric | Nonparametric |
Fixed number of parameters | Number of parameters not fixed in advance |
Less data required | More data required |
Assumes a specific distribution (often normal) | Makes no assumption about the distribution |
Higher statistical power (when its assumptions hold) | Lower statistical power |