Maximum likelihood estimation (MLE) is a framework for determining the parameters of a machine learning model. The estimated parameters are those that maximize the likelihood of the observed data under the model assumed to have produced it.
MLE aims to fit a probability distribution to the provided data, which makes the data easier to work with and allows the resulting inferences to generalize.
This method underlies many machine learning models, such as logistic regression, where finding the best parameters is vital for making accurate future predictions.
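To make this concrete, here is a minimal sketch of logistic regression fit by MLE, assuming a hypothetical one-dimensional toy dataset: gradient ascent on the log-likelihood of the labels drives the parameters toward their maximum likelihood values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy dataset: negative inputs labeled 0, positive labeled 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Parameters of the logistic model P(y=1 | x) = sigmoid(w*x + b).
w, b = 0.0, 0.0
lr = 0.1

# Gradient ascent on the log-likelihood:
#   sum_i [ y_i * log p_i + (1 - y_i) * log(1 - p_i) ]
for _ in range(1000):
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        grad_w += (y - p) * x   # d/dw of the log-likelihood
        grad_b += (y - p)       # d/db of the log-likelihood
    w += lr * grad_w
    b += lr * grad_b

# The fitted parameters now separate the training points.
preds = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in xs]
print(preds)  # [0, 0, 1, 1]
```

Libraries such as scikit-learn do the same maximization with more robust optimizers; the sketch only shows that "training" here is literally maximizing a likelihood.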
Let's now look at the mathematical formulation behind this framework. MLE solves the problem by searching the parameter space for the optimal parameters that conform to the given dataset:

$$\hat{\theta} = \underset{\theta}{\operatorname{arg\,max}} \; L(\theta \mid X)$$

Here, $X$ denotes the observed dataset, $\theta$ denotes the model's parameters, $L(\theta \mid X)$ is the likelihood function, and $\hat{\theta}$ is the resulting maximum likelihood estimate.
Following this, we can state the formulation for MLE in the following way:

$$\hat{\theta} = \underset{\theta}{\operatorname{arg\,max}} \; P(X \mid \theta)$$

The notation $P(X \mid \theta)$ represents the conditional probability of the data given the parameters $\theta$.
For a single data point, this probability is written as $P(x_i \mid \theta)$. Here, $x_i$ denotes the $i^{th}$ observation in the dataset $X = \{x_1, x_2, \ldots, x_n\}$.
Since the data points in our dataset are independent of each other, we can extend this notation over the whole dataset using the multiplication rule for independent events:

$$P(X \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta)$$
We know that probability values always lie in the range $[0, 1]$, so multiplying many of them together quickly produces vanishingly small numbers. Because the logarithm is monotonically increasing, maximizing a function is equivalent to maximizing its logarithm, and the logarithm turns the product into a sum.
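A quick sketch of why this matters numerically, using a hypothetical dataset of 10,000 points that each have probability 0.5: the direct product underflows 64-bit floating point to exactly zero, while the sum of logs remains perfectly representable.

```python
import math

# Hypothetical per-point probabilities: 10,000 points, each with probability 0.5.
probs = [0.5] * 10_000

# The direct product underflows 64-bit floats to exactly 0.0.
product = 1.0
for p in probs:
    product *= p
print(product)  # 0.0

# The sum of logs stays well within floating-point range.
log_sum = sum(math.log(p) for p in probs)
print(log_sum)  # roughly -6931.47 (10,000 * ln 0.5)
```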
Now, using the property above, we can translate our problem into maximizing the log-likelihood:

$$\hat{\theta} = \underset{\theta}{\operatorname{arg\,max}} \; \sum_{i=1}^{n} \log P(x_i \mid \theta)$$
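The whole pipeline can be sketched end to end for the simplest possible model. Assuming a hypothetical dataset of coin flips modeled as Bernoulli($p$), we search a grid of candidate parameters for the one that maximizes the log-likelihood; it lands on the sample mean, which is the known closed-form MLE for this model.

```python
import math

# Hypothetical dataset of coin flips: 1 = heads, 0 = tails (6 heads in 8 flips).
data = [1, 0, 1, 1, 0, 1, 1, 1]

def log_likelihood(p, data):
    # Sum of log P(x_i | p) under a Bernoulli(p) model.
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Search the parameter space for the p that maximizes the log-likelihood.
candidates = [i / 1000 for i in range(1, 1000)]
best_p = max(candidates, key=lambda p: log_likelihood(p, data))
print(best_p)  # 0.75 -- the sample mean, 6 heads out of 8 flips
```

Grid search is used here only to make the "search the parameter space" step literal; real models use calculus or gradient-based optimizers to find the same maximum.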
The relation between the joint probability distribution and the likelihood function can be seen in the following equation:

$$L(\theta \mid X) = P(X \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta)$$

In the notation above, $L(\theta \mid X)$ is the likelihood of the parameters $\theta$ given the data $X$: numerically the same quantity as the joint probability of the data, but viewed as a function of the parameters rather than of the data.
To conclude, this framework is an efficient way to tune the parameters of a model. It is also essential to note that as the size of the dataset increases, the quality of the maximum likelihood estimator rises significantly.
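This last claim can be checked with a small simulation, assuming a hypothetical Bernoulli model with true parameter 0.3: the average absolute error of the MLE (the sample mean) shrinks markedly as the sample size grows.

```python
import random

random.seed(0)
TRUE_P = 0.3  # hypothetical true parameter of a Bernoulli distribution

def mean_abs_error(n, trials=500):
    # Average |estimate - truth| of the MLE (the sample mean) over many trials.
    total = 0.0
    for _ in range(trials):
        sample = [1 if random.random() < TRUE_P else 0 for _ in range(n)]
        estimate = sum(sample) / n  # MLE of a Bernoulli parameter
        total += abs(estimate - TRUE_P)
    return total / trials

small_n_error = mean_abs_error(10)
large_n_error = mean_abs_error(1000)
print(small_n_error, large_n_error)  # the error shrinks as n grows
```

This mirrors the consistency property of maximum likelihood estimators: under mild conditions, the estimate converges to the true parameter as the dataset grows.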