When we have a huge data set, pandas does not have the ability to handle this data. This is where the to_sql()
function comes in. The to_sql()
function also gives SQL access to write recorded data in the data frame.
df.to_sql(self, name, conn=engine, schema=“schema_name”, if_exists='Replace', index=True)
It takes the following argument values:
name
: This is the SQL table’s name.conn
: Using SQLAlchemy makes it accessible to use any database given by that library. Sqlite3.connection objects are used to give legacy support.schema
: If the database supports schema, we highlight the schema. If there is no DB, we use the default schema.index_label
: This is the column label for index column(s). The index is true if no label is given and index names are used. If DataFrames use a multi-index, then a sequence should be given.None or integer
: This is the number of rows affected by to_sql
. If the callable passed into the method does not return the number of rows, None
will be returned in the parameter.
from sqlalchemy import create_engineimport pandas as pdengine = create_engine('sqlite://', echo=False)df = pd.DataFrame({'Book' : ['Book 1', 'Book 2', 'Book 3']})dfdf.to_sql('Book', con=engine)engine.execute("SELECT * FROM Book").fetchall()[(0, 'Book 1'), (1, 'Book 2'), (2, 'Book 3')]with engine.begin() as connection:df1 = pd.DataFrame({'Book' : ['Book 4', 'Book 5']})df1.to_sql('Book', con=connection, if_exists='append')df2 = pd.DataFrame({'Book' : ['Book 6', 'Book 7']})df2.to_sql('Book', con=engine, if_exists='append')engine.execute("SELECT * FROM Book").fetchall()df = pd.DataFrame({"A": [1, None, 2]})dffrom sqlalchemy.types import Integerdf.to_sql('integers', con=engine, index=False,dtype={"A": Integer()})print(engine.execute("SELECT * FROM integers").fetchall())
sqlalchemy.engine.Connection(connection with SQLALchemy)
can also be passed to con.This is helpful to support operations that need the same DBAPI connection. Overwrite the table with a second data type. Highlight the used data type( used for missing values in integers).
The database provides Null integers while pandas store data in floating points. We get back integer value when we escort data from python.