Reading & Writing SQL (read_sql, to_sql)

pandas can talk directly to databases — read_sql pulls the result of a query into a DataFrame, and to_sql pushes a DataFrame back out as a database table.

Learn Reading & Writing SQL (read_sql, to_sql) in our free Pandas course — a beginner-friendly interactive lesson with worked examples, a practice exercise…

Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

You'll connect to a SQLite database, run queries with pd.read_sql , write with df.to_sql(if_exists=) , and stream large results with chunksize .

Every database operation needs a connection . Python ships with sqlite3 , and sqlite3.connect(":memory:") spins up a throwaway in-memory database — perfect for learning. You write a DataFrame with df.to_sql("table", conn) and read it back with pd.read_sql("SELECT ...", conn) .

A big advantage of read_sql is that you can do the filtering, joining, and aggregation in SQL and only bring back the rows you need — far cheaper than loading everything and filtering in pandas. On the write side, if_exists decides what happens when the table already exists.

When a result set is too large to hold in memory at once, pass chunksize=n to read_sql . Instead of one giant DataFrame you get an iterator that yields DataFrames of n rows, so you can process the data batch by batch and keep memory flat.

Writing twice without saying what to do on conflict:

Write two related tables, then let SQL do the heavy lifting.

Lesson complete — pandas and databases speak the same language!

You can open a connection, write with to_sql(if_exists=) , query with read_sql , push filters and joins into SQL, and stream big tables with chunksize .

🚀 Up next: Efficient Formats — Parquet, Feather, and Pickle for fast storage.

Practice quiz

Which function reads the result of a SQL query into a DataFrame?

  • pd.read_sql('SELECT * FROM t', conn)
  • pd.sql_read('SELECT * FROM t', conn)
  • pd.from_sql('SELECT * FROM t', conn)
  • pd.query_sql('SELECT * FROM t', conn)

Answer: pd.read_sql('SELECT * FROM t', conn). pd.read_sql runs the query against the connection and returns a DataFrame.

Which method writes a DataFrame out to a database table?

  • df.write_sql('t', conn)
  • df.to_sql('t', conn)
  • df.save_sql('t', conn)
  • df.sql_out('t', conn)

Answer: df.to_sql('t', conn). df.to_sql(name, conn) pushes the DataFrame to a table.

How do you stop to_sql from writing the index as an extra column?

  • drop_index=True
  • no_index=True
  • index=False
  • keep_index=False

Answer: index=False. Pass index=False so the RangeIndex is not written as its own column.

What does if_exists default to when the table already exists?

  • 'append'
  • 'replace'
  • 'ignore'
  • 'fail'

Answer: 'fail'. The default is 'fail', which raises an error if the table exists.

Which if_exists value drops and recreates the table?

  • 'replace'
  • 'append'
  • 'fail'
  • 'rebuild'

Answer: 'replace'. 'replace' drops the existing table and writes a fresh one.

Which if_exists value adds rows to an existing table?

  • 'add'
  • 'append'
  • 'insert'
  • 'extend'

Answer: 'append'. 'append' keeps the existing rows and adds the new ones.

What does passing chunksize=4 to read_sql return?

  • One DataFrame of 4 rows
  • A list of DataFrames
  • An iterator that yields DataFrames of 4 rows
  • Only the first 4 rows

Answer: An iterator that yields DataFrames of 4 rows. With chunksize, read_sql returns an iterator you loop over.

What happens if you call to_sql twice without if_exists?

  • The second write is silently skipped
  • Rows are appended automatically
  • The table is replaced
  • A ValueError is raised because the table exists

Answer: A ValueError is raised because the table exists. The default 'fail' raises a ValueError on the second write.

Why push filters into the SQL query instead of pandas?

  • Only the rows you need are returned, saving memory
  • SQL cannot filter rows
  • pandas cannot filter strings
  • It changes the table permanently

Answer: Only the rows you need are returned, saving memory. Filtering in SQL returns fewer rows than loading everything first.

What connection makes a throwaway in-memory SQLite database?

  • sqlite3.connect('temp.db')
  • sqlite3.connect(':memory:')
  • sqlite3.open(':ram:')
  • sqlite3.memory()

Answer: sqlite3.connect(':memory:'). sqlite3.connect(':memory:') creates a temporary in-memory database.