Pandas is an open-source Python library for data analysis. It provides tools for manipulating data in tabular structures called DataFrames. The describe() method returns a statistical summary of the data, including the count, mean, standard deviation, minimum, maximum, and selected percentiles of each numeric column.
DataFrame.describe(percentiles=None, include=None, exclude=None, datetime_is_numeric=False)
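percentiles
: List of values between 0 and 1. Specifies the percentiles to be returned in the result. Defaults to the 25th, 50th, and 75th percentiles. (Optional)
include
: List of data types to include in the result. Options are None | 'all' | a list of data types. With None, only numeric columns are summarized. (Optional)
exclude
: List of data types to exclude from the result. Options are None | a list of data types. (Optional)
datetime_is_numeric
: Whether to treat datetime data as numeric. Set to True or False, with default as False. This parameter exists in pandas 1.1 through 1.5 and was removed in pandas 2.0, where datetime columns are always treated as numeric. (Optional)

The method returns a DataFrame object in which each row holds one type of statistic and each column corresponds to a summarized column of the original data.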
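The parameters above can be sketched as follows. This is a minimal illustration rather than an exhaustive reference: the sample data and the variable names (custom, everything, non_numeric) are assumptions made for the example, and datetime_is_numeric is left out because it is not accepted by pandas 2.0 and later.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'Name': ['Kris', 'Kelly', 'Josh', 'Bob', 'Lisa'],
    'Age': [16, 21, 17, 19, 20],
    'Marks': [78, 56, 87, 89, 79],
})

# Ask for the 10th and 90th percentiles instead of the default 25th and 75th
custom = df.describe(percentiles=[0.1, 0.9])
print(custom)

# Summarize every column, including the non-numeric 'Name' column
everything = df.describe(include='all')
print(everything)

# Drop the numeric columns, leaving only the summary of 'Name'
non_numeric = df.describe(exclude=['number'])
print(non_numeric)
```

For non-numeric columns the summary uses different rows, such as unique, top, and freq, so with include='all' the statistics that do not apply to a column appear as NaN.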
# import library
import pandas as pd

# define data
data = {'Name': ['Kris', 'Kelly', 'Josh', 'Bob', 'Lisa'],
        'Age': [16, 21, 17, 19, 20],
        'Marks': [78, 56, 87, 89, 79]}

# create a DataFrame object
df = pd.DataFrame(data)

# describe the data
print(df.describe())
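Because the result of describe() is itself a DataFrame, individual statistics can be looked up by row and column label. The snippet below is a small sketch that reuses the same sample data; the names summary, mean_age, and max_marks are chosen only for this illustration.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Kris', 'Kelly', 'Josh', 'Bob', 'Lisa'],
    'Age': [16, 21, 17, 19, 20],
    'Marks': [78, 56, 87, 89, 79],
})

summary = df.describe()

# Row labels are the statistic names; columns are the numeric columns
print(summary.index.tolist())
print(summary.columns.tolist())

# Read a single statistic with .loc[row_label, column_label]
mean_age = summary.loc['mean', 'Age']     # average of the Age column
max_marks = summary.loc['max', 'Marks']   # largest value in the Marks column
print(mean_age, max_marks)
```

Only Age and Marks appear as columns here, since describe() with default arguments skips the non-numeric Name column.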