What is polars.from_numpy() in Polars?

Polars is a fast data manipulation library written in Rust. It's designed for high-performance operations on large tabular datasets and is often faster than pandas on such workloads.

The polars.from_numpy() function lets users who already hold their data in NumPy arrays bring that data into Polars, which makes Polars easier to integrate into existing Python data processing workflows.

The polars.from_numpy() function

The polars.from_numpy() function builds a DataFrame from a NumPy ndarray by copying its data into the newly created DataFrame. Because the DataFrame holds a separate copy of the data, later modifications to the array do not affect the DataFrame, and modifications to the DataFrame do not affect the array.

Syntax

polars.from_numpy(data, schema=None, *, schema_overrides=None, orient=None)

Parameters

  • data: The data to convert, given as a NumPy ndarray (typically two-dimensional).

  • schema: The structure of the DataFrame, that is, the column names and, optionally, the data type of each column. The schema can be declared in several ways:

      • Dictionary of {name: type} pairs. If a type is None, it is inferred automatically.

      • List of column names; types are inferred automatically.

      • List of (name, type) pairs, equivalent to the dictionary form.

  • schema_overrides: A dictionary to specify or override types for one or more columns. It overrides any types inferred from the columns.

  • orient: Specifies whether two-dimensional data is interpreted as columns or as rows.

    • None: The orientation is inferred by matching the dimensions of the data against the schema.

    • col: Each inner array (each element along the array's first axis) becomes a column of the DataFrame.

    • row: Each inner array becomes a row of the DataFrame.

Note: If orientation inference doesn't yield conclusive results, column orientation is used by default.

Return value

This function returns a Polars DataFrame built from the NumPy ndarray.

Code

import numpy as np
import polars as pl

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # 3x3 array
schema_declaration = ["row 1", "row 2", "row 3"]  # column names
df = pl.from_numpy(data, schema=schema_declaration, orient="col")  # each array row becomes a column
print(df)

Explanation

Lines 1–2: We import the numpy and polars libraries as np and pl, respectively.

Line 4: We create a 2D NumPy array, data, with three rows and three columns.

Line 6: The pl.from_numpy() function creates a DataFrame, df, out of the data array.

  • The schema parameter is set to ["row 1", "row 2", "row 3"], which specifies the column names for the DataFrame. In this case, the DataFrame will have columns named "row 1", "row 2", and "row 3".

  • The orient parameter is set to "col", which specifies that each inner array of data (each row of the NumPy array) should be treated as a column in the DataFrame. For example, the column "row 1" holds the values 1, 2, and 3.

Line 7: We print the Polars DataFrame.

Translating data from array to DataFrame

Try replacing line 5 as follows:

schema_declaration = {"row1": pl.Int64, "row2": pl.Int64, "row3": pl.Int64}

Here, {"row1": pl.Int64, "row2": pl.Int64, "row3": pl.Int64} are the key-value pairs within the dictionary. The keys ("row1", "row2", and "row3") are the names of the columns and pl.Int64 indicates that the data type for these columns is a 64-bit integer provided by the Polars library.

Now, try replacing line 5 as follows:

schema_declaration = [("row1", pl.Int64), ("row2", pl.Int64), ("row3", pl.Int64)]

Here, ("row1", pl.Int64), ("row2", pl.Int64), and ("row3", pl.Int64) are the tuples in the list. The first element of each tuple ("row1", "row2", and "row3") is the name of a column, and pl.Int64 indicates that the data type for that column is a 64-bit integer provided by the Polars library.

Copyright ©2025 Educative, Inc. All rights reserved