What is sklearn.datasets.load_wine in scikit-learn?

The load_wine function from the sklearn.datasets module loads the wine dataset, a classic multi-class dataset for machine learning classification problems.

Dataset overview

This dataset contains 178 samples, each described by 13 numeric features. The task is to predict which of the 3 classes a wine sample belongs to.

load_wine()

Name                        Facts
Classes                     3
Features total              13
Total samples               178
No. of samples per class    [59, 71, 48]
Features type               Positive and real

This function was added in scikit-learn version 0.18.
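The figures in the overview can be checked directly against the loaded data. The snippet below is a minimal sketch; the variable names (wine, class_counts, all_positive) are illustrative and not part of the scikit-learn API.

```python
import numpy as np
from sklearn.datasets import load_wine

# Load the dataset as a Bunch object (the default return type)
wine = load_wine()

# Number of samples and number of features
n_samples, n_features = wine.data.shape
print(n_samples, n_features)        # 178 13

# Count how many samples belong to each of the 3 classes
class_counts = np.bincount(wine.target)
print(class_counts.tolist())        # [59, 71, 48]

# Check that every feature value is strictly positive
all_positive = bool((wine.data > 0).all())
print(all_positive)
```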

Syntax


sklearn.datasets.load_wine(*, return_X_y=False, as_frame=False)

Parameters

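  • return_X_y: type=bool, default=False. If True, the function returns a (data, target) tuple instead of a dictionary-like object.

  • as_frame: type=bool, default=False. If True, the data is returned as a pandas DataFrame and the target as a pandas Series.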
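As a short illustration of the return_X_y parameter, the sketch below unpacks the features and labels in one call. The names X and y are a common convention, not something the function requires.

```python
from sklearn.datasets import load_wine

# With return_X_y=True the function returns a (data, target) tuple
X, y = load_wine(return_X_y=True)

print(X.shape)  # (178, 13)
print(y.shape)  # (178,)
```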

Return value

  • data: A dictionary-like Bunch object with the following attributes:
    • data: The feature matrix of shape (178, 13). If as_frame is set to True, it is a pandas DataFrame. Otherwise, it is an ndarray.
    • target: The class labels of shape (178,). If as_frame is set to True, it is a pandas Series. Otherwise, it is an ndarray.
    • feature_names: A list of the names of the dataset columns (the features).
    • target_names: The names of the target classes.
    • frame: A DataFrame of shape (178, 14) that combines data and target. It is populated only if as_frame is set to True. Otherwise, it is None.
    • DESCR: A string that contains the full description of the dataset.
  • (data, target): A tuple returned instead of the Bunch object if return_X_y is set to True.
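To see these attributes in practice, the following sketch loads the dataset with as_frame=True and inspects a few fields. It assumes pandas is installed and a scikit-learn version that supports as_frame (0.23 or later); the variable name wine is illustrative.

```python
from sklearn.datasets import load_wine

# as_frame=True returns pandas objects instead of NumPy arrays
wine = load_wine(as_frame=True)

print(type(wine.data).__name__)    # DataFrame
print(type(wine.target).__name__)  # Series

# frame holds the 13 feature columns plus the target column
print(wine.frame.shape)            # (178, 14)
print(wine.frame.columns[-1])      # target

# First few feature names and the start of the description text
print(wine.feature_names[:3])
print(wine.DESCR[:60])
```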

Example

The code snippet below loads the wine dataset and prints its first five observations.

# Program to load Wine Dataset
# Load useful libraries
import pandas as pd
from sklearn.datasets import load_wine
# Loading dataset
data = load_wine()
# Configuring pandas to show all features
pd.set_option("display.max_rows", None, "display.max_columns", None)
# Converting data to a dataframe to view properly
data = pd.DataFrame(data=data['data'], columns=data['feature_names'])
# Printing first 5 observations
print(data.head())

Explanation

  • Line 6: We load the wine dataset into a variable named data.
  • Line 8: We configure pandas to display all rows and columns so that no feature is truncated in the output.
  • Line 10: We convert the feature array of the wine dataset to a data frame, using the feature names as column labels.
  • Line 12: We print the first five observations (each with 13 features) of the wine dataset by using the head() method.
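Since the dataset is intended for three-class classification, a minimal sketch of that use is shown below. The choice of model (scaled logistic regression), the 25% test split, and the random_state value are assumptions made for illustration, not a recommended setup.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels as NumPy arrays
X, y = load_wine(return_X_y=True)

# Hold out 25% of the samples, keeping class proportions similar
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a multi-class logistic regression
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

The features are scaled first because they span very different ranges, which can slow or destabilize the fit of a linear model.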
