Reading & Writing SQL (read_sql, to_sql)
pandas can talk directly to databases — read_sql pulls the result of a query into a DataFrame, and to_sql pushes a DataFrame back out as a database table.
Learn Reading & Writing SQL (read_sql, to_sql) in our free Pandas course — a beginner-friendly interactive lesson with worked examples, a practice exercise…
Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
You'll connect to a SQLite database, run queries with pd.read_sql , write with df.to_sql(if_exists=) , and stream large results with chunksize .
Every database operation needs a connection . Python ships with sqlite3 , and sqlite3.connect(":memory:") spins up a throwaway in-memory database — perfect for learning. You write a DataFrame with df.to_sql("table", conn) and read it back with pd.read_sql("SELECT ...", conn) .
A big advantage of read_sql is that you can do the filtering, joining, and aggregation in SQL and only bring back the rows you need — far cheaper than loading everything and filtering in pandas. On the write side, if_exists decides what happens when the table already exists.
When a result set is too large to hold in memory at once, pass chunksize=n to read_sql . Instead of one giant DataFrame you get an iterator that yields DataFrames of n rows, so you can process the data batch by batch and keep memory flat.
Writing twice without saying what to do on conflict:
Write two related tables, then let SQL do the heavy lifting.
Lesson complete — pandas and databases speak the same language!
You can open a connection, write with to_sql(if_exists=) , query with read_sql , push filters and joins into SQL, and stream big tables with chunksize .
🚀 Up next: Efficient Formats — Parquet, Feather, and Pickle for fast storage.
Practice quiz
Which function reads the result of a SQL query into a DataFrame?
- pd.read_sql('SELECT * FROM t', conn)
- pd.sql_read('SELECT * FROM t', conn)
- pd.from_sql('SELECT * FROM t', conn)
- pd.query_sql('SELECT * FROM t', conn)
Answer: pd.read_sql('SELECT * FROM t', conn). pd.read_sql runs the query against the connection and returns a DataFrame.
Which method writes a DataFrame out to a database table?
- df.write_sql('t', conn)
- df.to_sql('t', conn)
- df.save_sql('t', conn)
- df.sql_out('t', conn)
Answer: df.to_sql('t', conn). df.to_sql(name, conn) pushes the DataFrame to a table.
How do you stop to_sql from writing the index as an extra column?
- drop_index=True
- no_index=True
- index=False
- keep_index=False
Answer: index=False. Pass index=False so the RangeIndex is not written as its own column.
What does if_exists default to when the table already exists?
- 'append'
- 'replace'
- 'ignore'
- 'fail'
Answer: 'fail'. The default is 'fail', which raises an error if the table exists.
Which if_exists value drops and recreates the table?
- 'replace'
- 'append'
- 'fail'
- 'rebuild'
Answer: 'replace'. 'replace' drops the existing table and writes a fresh one.
Which if_exists value adds rows to an existing table?
- 'add'
- 'append'
- 'insert'
- 'extend'
Answer: 'append'. 'append' keeps the existing rows and adds the new ones.
What does passing chunksize=4 to read_sql return?
- One DataFrame of 4 rows
- A list of DataFrames
- An iterator that yields DataFrames of 4 rows
- Only the first 4 rows
Answer: An iterator that yields DataFrames of 4 rows. With chunksize, read_sql returns an iterator you loop over.
What happens if you call to_sql twice without if_exists?
- The second write is silently skipped
- Rows are appended automatically
- The table is replaced
- A ValueError is raised because the table exists
Answer: A ValueError is raised because the table exists. The default 'fail' raises a ValueError on the second write.
Why push filters into the SQL query instead of pandas?
- Only the rows you need are returned, saving memory
- SQL cannot filter rows
- pandas cannot filter strings
- It changes the table permanently
Answer: Only the rows you need are returned, saving memory. Filtering in SQL returns fewer rows than loading everything first.
What connection makes a throwaway in-memory SQLite database?
- sqlite3.connect('temp.db')
- sqlite3.connect(':memory:')
- sqlite3.open(':ram:')
- sqlite3.memory()
Answer: sqlite3.connect(':memory:'). sqlite3.connect(':memory:') creates a temporary in-memory database.