Reading & Writing Data (CSV, Excel, JSON)

Reading and writing data in Pandas means loading external files — CSV, Excel, or JSON — into a DataFrame with the pd.read_* functions, and saving a DataFrame back out with the matching df.to_* methods.


Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

Real analysis starts with real data. This is the bridge between files on disk and the DataFrames you've been building by hand.

CSV (comma-separated values) is one of the most common formats for tabular data. Read one in a single line with pd.read_csv:
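A minimal sketch of that single-line read. The file name sales.csv and its columns are made-up sample data, and the file is written to a temporary folder first only so the example runs on its own:

```python
import os
import tempfile

import pandas as pd

# Create a small CSV file so the example is self-contained.
# "sales.csv" and its contents are hypothetical sample data.
csv_path = os.path.join(tempfile.mkdtemp(), "sales.csv")
with open(csv_path, "w", newline="") as f:
    f.write("product,units,price\nApple,10,0.5\nBanana,4,0.25\n")

# One call loads the whole file into a DataFrame.
df = pd.read_csv(csv_path)
print(df)
```

With your own data you would skip the file-creation step and pass the path of an existing file to pd.read_csv.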

Pandas uses the first row as the column headers by default. To try this without a real file, wrap a CSV string in io.StringIO; Pandas treats the in-memory text exactly like a file:
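Here is a short sketch of the io.StringIO approach. The names, ages, and cities are invented sample values:

```python
import io

import pandas as pd

# Sample CSV text; the first line becomes the column headers.
csv_text = """name,age,city
Ana,31,Lisbon
Ben,27,Oslo
"""

# StringIO makes the string behave like an open file.
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # ['name', 'age', 'city']
print(df.shape)             # (2, 3)
```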

Excel files use pd.read_excel. For modern .xlsx files you need the openpyxl engine installed:
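A hedged sketch of reading a workbook. It assumes openpyxl may or may not be installed, so it checks first. The file scores.xlsx, the sheet name Term1, and the data are hypothetical, and the workbook is created on the spot only to keep the example self-contained:

```python
import importlib.util
import os
import tempfile

import pandas as pd

# Only run the Excel steps when the openpyxl package is available.
HAVE_OPENPYXL = importlib.util.find_spec("openpyxl") is not None

if HAVE_OPENPYXL:
    # Write a tiny workbook first so there is something to read.
    xlsx_path = os.path.join(tempfile.mkdtemp(), "scores.xlsx")
    sample = pd.DataFrame({"student": ["Ana", "Ben"], "score": [88, 92]})
    sample.to_excel(xlsx_path, index=False, sheet_name="Term1", engine="openpyxl")

    # Read the named sheet back into a DataFrame.
    df = pd.read_excel(xlsx_path, sheet_name="Term1", engine="openpyxl")
    print(df)
else:
    print("openpyxl is not installed; try: pip install openpyxl")
```

If sheet_name is left out, pd.read_excel reads the first sheet in the workbook.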

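JSON data uses pd.read_json, which accepts a file path or a file-like object. To parse a JSON string, wrap it in io.StringIO, because passing the raw string directly is deprecated as of pandas 2.1: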
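A small sketch of reading JSON held in memory. The records are made-up sample data, and the text is wrapped in io.StringIO so the call works the same way on older and newer pandas versions:

```python
import io

import pandas as pd

# A JSON array of objects: each object becomes one row.
json_text = '[{"name": "Ana", "age": 31}, {"name": "Ben", "age": 27}]'

# Wrap the string so read_json sees a file-like object.
df = pd.read_json(io.StringIO(json_text))
print(df)
```

For a file on disk, pass its path instead, for example pd.read_json("people.json"), where people.json is a hypothetical file name.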

Each reader has a matching writer method you call on the DataFrame:
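The writers can be sketched as follows. The people.* file names and the temporary output folder are assumptions made for illustration, and the Excel step is skipped when openpyxl is not available:

```python
import importlib.util
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [31, 27]})
out_dir = tempfile.mkdtemp()  # throwaway folder for the output files

# CSV: index=False leaves the 0, 1, ... row labels out of the file.
csv_path = os.path.join(out_dir, "people.csv")
df.to_csv(csv_path, index=False)

# JSON: orient="records" writes a list of row objects, one per row.
json_path = os.path.join(out_dir, "people.json")
df.to_json(json_path, orient="records")

# Excel: needs an engine such as openpyxl for .xlsx files.
if importlib.util.find_spec("openpyxl") is not None:
    xlsx_path = os.path.join(out_dir, "people.xlsx")
    df.to_excel(xlsx_path, index=False, engine="openpyxl")

print(sorted(os.listdir(out_dir)))
```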

Calling df.to_csv() with no filename returns the CSV as a string — handy for previewing exactly what would be written.
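A quick sketch of that preview, using invented sample data. It also shows what index=False changes in the output:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [31, 27]})

# No filename: to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # ,name,age
print(without_index.splitlines()[0])  # name,age
```

The leading comma in the first header line is the unnamed index column, which is why index=False is usually what you want when saving.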

Like reading, writing .xlsx files with df.to_excel needs an Excel engine such as openpyxl installed.

Read CSV text in, then write it straight back out:
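One way to sketch this round trip, with made-up city temperatures as the sample data:

```python
import io

import pandas as pd

csv_text = "city,temp_c\nOslo,4\nLisbon,18\n"

# Read: CSV text in, DataFrame out.
df = pd.read_csv(io.StringIO(csv_text))

# Write: DataFrame in, CSV text out (no filename, no index column).
round_trip = df.to_csv(index=False)
print(round_trip)

# Reading the written text again gives an equal DataFrame.
df_again = pd.read_csv(io.StringIO(round_trip))
print(df.equals(df_again))  # True
```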

Lesson 5 complete — you can load and save real data!

You can read CSV, Excel, and JSON into DataFrames, write them back out with the to_* methods, pass index=False to keep the row index out of the output, and test reads with io.StringIO.

🚀 Up next: Inspecting Data — get an instant overview of any dataset with head, tail, info, and describe.

Practice quiz

Which function reads a CSV file into a DataFrame?

  • pd.load_csv()
  • pd.read_csv()
  • pd.open_csv()
  • pd.csv_read()

Answer: pd.read_csv(). pd.read_csv('file.csv') parses a CSV into a DataFrame.

What is the default separator assumed by read_csv?

  • A tab
  • A semicolon
  • A comma
  • A space

Answer: A comma. read_csv assumes comma-separated values by default (sep=',').

How do you write a DataFrame to CSV WITHOUT the index column?

  • df.to_csv('f.csv', index=False)
  • df.to_csv('f.csv', drop_index=True)
  • df.to_csv('f.csv', no_index=1)
  • df.to_csv('f.csv', skipindex=True)

Answer: df.to_csv('f.csv', index=False). Pass index=False so the row index is not written as a column.

Which method writes a DataFrame to a CSV file?

  • df.write_csv()
  • df.save_csv()
  • df.export_csv()
  • df.to_csv()

Answer: df.to_csv(). df.to_csv('file.csv') saves the frame to disk.

Which function reads JSON into a DataFrame?

  • pd.read_json()
  • pd.json_read()
  • pd.from_json()
  • pd.load_json()

Answer: pd.read_json(). pd.read_json() parses JSON from a file path or a file-like object into a DataFrame.

What does df.to_json(orient='records') produce?

  • A single object keyed by column
  • A list of row objects, one per row
  • Just the column names
  • A CSV string

Answer: A list of row objects, one per row. orient='records' emits a JSON array of per-row objects.

How can you test read_csv on an in-memory string?

  • pd.read_csv(open(text))
  • pd.read_csv(text.bytes)
  • pd.read_csv(io.StringIO(text))
  • pd.read_csv(str(text))

Answer: pd.read_csv(io.StringIO(text)). io.StringIO wraps a string so read_csv can parse it like a file.

Which method reads an Excel spreadsheet?

  • pd.read_xls()
  • pd.read_sheet()
  • pd.excel_read()
  • pd.read_excel()

Answer: pd.read_excel(). pd.read_excel('file.xlsx') loads a spreadsheet into a DataFrame.

What does the header=None argument tell read_csv?

  • The file has no header row, so don't use one for column names
  • Drop the first column
  • Skip every other row
  • Read only the header

Answer: The file has no header row, so don't use one for column names. With header=None the first line is treated as data and the columns get integer names (0, 1, 2, ...).

Which parameter picks specific columns to load in read_csv?

  • columns=
  • select=
  • usecols=
  • keep=

Answer: usecols=. usecols=['a','b'] reads only the listed columns.
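The last two answers can be checked with a short sketch. The CSV text here is made-up sample data:

```python
import io

import pandas as pd

# header=None: the first line is data, columns are named 0, 1, 2.
no_header = "1,2,3\n4,5,6\n"
df_raw = pd.read_csv(io.StringIO(no_header), header=None)
print(df_raw.columns.tolist())  # [0, 1, 2]

# usecols: load only the listed columns and ignore the rest.
with_header = "a,b,c\n1,2,3\n4,5,6\n"
df_ab = pd.read_csv(io.StringIO(with_header), usecols=["a", "b"])
print(df_ab.columns.tolist())  # ['a', 'b']
```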