How to find the mean, median, and mode in Python

Key takeaways:
Python offers built-in functions and libraries like statistics and numpy to simplify statistical computations such as mean, median, and mode.
The mean is obtained by dividing the total sum of values by the count of values.
The median is the central value in an ordered dataset or the average of two middle values if the dataset size is even. Sorting is essential when manually computing the median.
The mode represents the value(s) that appear most frequently in a dataset.
Custom implementations require handling cases like empty datasets and multiple modes.
The statistics module includes mean(), median(), and mode() for efficient calculations.

In data analysis, understanding the central tendency of a dataset is crucial. The mean, median, and mode are three key metrics that provide insights into the dataset’s characteristics. Python, with its robust libraries, makes calculating these metrics straightforward. In this Answer, we’ll explore how to compute the mean, median, and mode in Python using built-in functions and libraries like statistics.

Understanding the metrics

Let’s first understand what the mean, median, and mode are. These metrics are usually computed on large datasets, but we’ll work with a small one for demonstration purposes. Here are the definitions of the metrics covered in this Answer:

Mean: A dataset’s arithmetic average. It is computed by dividing the sum of values by the total number of values.
Median: The middle value in a sorted dataset. If the dataset has an even number of values, the median is the mean of the two middle values.
Mode: The values that appear most frequently in the dataset are called the mode.

Mean, median, and mode in Python

Why use Python to compute these statistics? Python offers straightforward functions for statistical operations, and modules like statistics and numpy provide ready-made implementations. numpy in particular is designed for large numeric arrays.

Let’s examine how to use Python’s statistics module to compute these metrics and discuss the implementation from scratch.
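As a quick preview before the step-by-step sections, here is a minimal sketch of the three statistics module calls applied to one list. The dataset and variable names are illustrative assumptions, not taken from the original Answer.

```python
import statistics

# Hypothetical demo dataset (an assumption for illustration).
data = [3, 5, 2, 7, 3]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

For this list, the sum is 20 over 5 values, so the mean is 4. The sorted data is 2, 3, 3, 5, 7, so the median and the mode are both 3.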

1. Calculate the mean using `+=` and `/` operators

Here’s how we can compute the mean without using any modules:
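A minimal sketch of that approach is shown here. The function name calculate_mean and the sample list are assumptions made for illustration. Returning None for an empty dataset is one possible convention, chosen to match the custom implementations discussed in this Answer.

```python
def calculate_mean(data):
    # Return None for an empty dataset to avoid dividing by zero.
    if not data:
        return None

    total = 0
    count = 0
    for value in data:
        total += value  # accumulate the running sum with +=
        count += 1      # count the values as we go

    # The / operator performs true division, so integer inputs give a float.
    return total / count


numbers = [3, 5, 2, 7, 3]  # hypothetical demo dataset
print(calculate_mean(numbers))  # 4.0
```

The explicit loop makes the role of `+=` visible. In everyday code, sum(data) / len(data) expresses the same calculation more compactly.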

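2. Calculate the mean using the `statistics` module

The statistics module provides mean(), which takes a sequence or iterable of numbers and returns their arithmetic average. For example, statistics.mean([1, 2, 3, 4]) returns 2.5. mean() raises StatisticsError when the dataset is empty.

3. Calculate the median using `len` and `sorted`

A manual median implementation follows these steps:

Empty check: If the dataset is empty, the function returns None.
Sorting: The dataset is sorted with the built-in sorted() function.
Middle index: The length of the data is integer-divided by 2 to locate the middle element. For an odd count, that element is the median. For an even count, the median is the average of that element and the one before it.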
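The manual median steps described above can be sketched as follows. The function name calculate_median and the sample lists are illustrative assumptions, and the line layout is not guaranteed to match the original code.

```python
def calculate_median(data):
    # Return None for an empty dataset.
    if not data:
        return None

    sorted_data = sorted(data)  # sorted() returns a new list; data is unchanged
    n = len(sorted_data)
    mid = n // 2  # middle index (odd n) or upper-middle index (even n)

    if n % 2 == 1:
        return sorted_data[mid]
    # Even count: average the two middle values.
    return (sorted_data[mid - 1] + sorted_data[mid]) / 2


print(calculate_median([3, 5, 2, 7, 3]))     # 3
print(calculate_median([3, 5, 2, 7, 3, 9]))  # 4.0
```

Using sorted() rather than list.sort() leaves the caller's list untouched, which is usually the safer choice for a helper function.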

4. Calculate the median using the `statistics` module

Besides median(), the statistics module offers two variations that always return a value from the dataset:

median_low(): Returns the lower of the two middle numbers when the dataset size is even. If the dataset size is odd, it returns the middle number (same as median()).
median_high(): Returns the higher of the two middle numbers when the dataset size is even. If the dataset size is odd, it returns the middle number (same as median()).
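The difference between the three functions is easiest to see on a dataset with an even count. The following is a small sketch; the two sample lists are assumptions chosen for illustration.

```python
from statistics import median, median_low, median_high

even_data = [1, 3, 5, 7]  # hypothetical dataset with an even count
odd_data = [1, 3, 5]      # hypothetical dataset with an odd count

print(median(even_data))       # 4.0 (average of 3 and 5)
print(median_low(even_data))   # 3
print(median_high(even_data))  # 5

# With an odd count, all three functions agree on the middle value.
print(median(odd_data), median_low(odd_data), median_high(odd_data))  # 3 3 3
```

median_low() and median_high() are handy when the data is discrete and an interpolated value such as 4.0 would not be a valid data point.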

5. Calculate the mode with the `for` loop

A manual mode implementation follows these steps:

Empty check: If the dataset is empty, the function returns None.
Frequency dictionary: A dictionary named frequency keeps track of how many times each element appears in data.
Counting loop: A for loop iterates over data and increments the count in frequency for every value.
Highest frequency: The maximum count among the dictionary's values is extracted.
Collecting modes: Every element whose count equals that maximum is stored in a list. If multiple elements share the highest frequency, all of them are included in the list.
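The manual mode steps described above can be sketched as follows. The function name calculate_mode is an illustrative assumption, while the names data and frequency follow the description. The line layout is not guaranteed to match the original code.

```python
def calculate_mode(data):
    # Return None for an empty dataset.
    if not data:
        return None

    frequency = {}  # maps each value to the number of times it appears
    for value in data:
        frequency[value] = frequency.get(value, 0) + 1

    highest = max(frequency.values())  # the highest frequency of any value
    # Collect every value that reaches the highest frequency.
    return [value for value, count in frequency.items() if count == highest]


print(calculate_mode([1, 2, 2, 3, 3, 4]))  # [2, 3]
```

Returning a list keeps the result type consistent whether the dataset has one mode or several.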

6. Mode and multimode with the `statistics` module

The statistics module offers two functions for finding the mode:

mode(): Returns the single most frequently occurring value in a dataset.
multimode(): Returns a list of all values that share the highest frequency, which is useful when a dataset is multimodal.
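Here is a short sketch contrasting the two functions. The sample lists are assumptions made for illustration, and the example assumes Python 3.8 or later, where multimode() is available and mode() returns the first mode it encounters when there is a tie.

```python
from statistics import mode, multimode

single = [1, 2, 2, 3, 4]  # hypothetical dataset with one mode
tied = [1, 1, 2, 2, 3]    # hypothetical dataset with two modes

print(mode(single))       # 2
print(multimode(single))  # [2]

print(mode(tied))         # 1 (first mode encountered, Python 3.8+)
print(multimode(tied))    # [1, 2]
```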

As expected, finding the mode is as simple as calling the mode() function.

Conclusion

Understanding the mean, median, and mode is essential for analyzing data effectively. Python provides multiple ways to calculate these metrics, from manual implementations using loops and operators to built-in functions in the statistics module. The simplicity and efficiency of Python’s libraries make it an excellent choice for statistical computations. Whether working with small or large datasets, these methods help extract meaningful insights effortlessly.

Frequently asked questions

What if there is no mode?

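If no value repeats, the dataset has no meaningful mode. A custom implementation can return None or another indicator of no repetition. Note that in Python 3.8 and later, statistics.mode() returns the first value of such a dataset, and statistics.multimode() returns every value.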
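One way to signal that there is no mode is sketched here. The helper name mode_or_none is hypothetical, and treating "every value appears exactly once" as "no mode" is a convention assumed for this example, not a behavior of the statistics module.

```python
from collections import Counter


def mode_or_none(data):
    """Return a list of modes, or None when the data is empty or nothing repeats."""
    if not data:
        return None

    counts = Counter(data)          # value -> number of occurrences
    highest = max(counts.values())
    if highest == 1:
        return None                 # no value appears more than once
    return [value for value, count in counts.items() if count == highest]


print(mode_or_none([1, 2, 3, 4]))  # None
print(mode_or_none([1, 2, 2, 3]))  # [2]
```

Under this convention a single-element dataset also yields None, which may or may not suit a given use case.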

How do we find the median?

To find the median, sort the dataset and then apply one of two rules:

For an odd count, the median is the middle value.
For an even count, the median is the average of the two middle values.

What are some other statistics?

Other statistics include range, variance, standard deviation, quartiles, percentiles, and interquartile range.
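Several of these can be computed with the standard library, as in the following sketch. The dataset is an assumption for illustration, the population versions of variance and standard deviation are used (statistics.variance() and statistics.stdev() are the sample versions), and quantiles() requires Python 3.8 or later.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical demo dataset

value_range = max(data) - min(data)    # range: largest minus smallest value
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

# quantiles() with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # interquartile range

print(value_range, variance, std_dev)
print(q1, q2, q3, iqr)
```

For this dataset, the range is 7, the population variance is 4, and the population standard deviation is 2.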

What is the median of 21, 62, 66, 66, 79, 28, 63, 48, 59, 94, and 19?

To find the median, first, sort the numbers in ascending order:
19, 21, 28, 48, 59, 62, 63, 66, 66, 79, and 94

As there are 11 numbers (odd count), the median is the middle value, which is the 6th number:
Median = 62
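The same result can be checked with the statistics module, as a quick sketch:

```python
from statistics import median

numbers = [21, 62, 66, 66, 79, 28, 63, 48, 59, 94, 19]

print(sorted(numbers))  # [19, 21, 28, 48, 59, 62, 63, 66, 66, 79, 94]
print(median(numbers))  # 62
```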

What is mode() in Python?

The mode() function in Python finds the most frequently occurring value in a dataset. It is available in the statistics module. For example:

from statistics import mode

numbers = [1, 2, 2, 3, 4]
print(mode(numbers))  # Output: 2

If multiple values have the same highest frequency, mode() returns the first one encountered in the data (Python 3.8 and later). Earlier versions raised StatisticsError in that case.
