Data Types & astype

A dtype is the internal storage type pandas assigns to each column — int64, float64, object, bool, or datetime64 — and it decides how fast and how correctly your operations run.


Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

Learn to read the .dtypes attribute, convert columns with astype(), handle messy values with pd.to_numeric(errors=), and understand why object columns are the slow ones.

Every column has exactly one dtype, and df.dtypes lists them all at once. The names you will meet most often are int64 (whole numbers), float64 (decimals, and the home of NaN), bool (True/False), datetime64[ns] (timestamps), and object: the catch-all that usually means text, but is also where pandas puts any column it could not interpret as something cleaner.
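As a minimal sketch of reading dtypes, the snippet below builds a small DataFrame with one column per common type and prints df.dtypes. The column names and values are hypothetical, chosen only for illustration. The text column reports as object on pandas 2.x; newer releases may report a dedicated string dtype instead.

```python
import pandas as pd

# Hypothetical sample data: one column for each commonly seen dtype.
df = pd.DataFrame({
    "qty": [1, 2, 3],                      # whole numbers
    "price": [1.5, 3.0, None],             # decimals with a missing value
    "in_stock": [True, False, True],       # booleans
    "added": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "name": ["apple", "pear", "fig"],      # text
})

# One line per column: the column name and its dtype.
print(df.dtypes)
```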

astype() is the direct way to change a column's type. Call it on a Series and pass the target type: "int64", "float64", "str", or "bool". It is strict: if any value cannot be converted, it raises an error rather than guessing. That strictness is a feature, because it stops bad data from silently slipping through.
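A short sketch of astype() on clean data, assuming a hypothetical Series of whole numbers that arrived as text:

```python
import pandas as pd

# Hypothetical column: whole numbers stored as text.
qty = pd.Series(["1", "2", "3"])

qty_int = qty.astype("int64")            # text -> integers
qty_float = qty_int.astype("float64")    # integers -> decimals
qty_text = qty_int.astype("str")         # integers -> text again

print(qty_int.dtype, qty_float.dtype)
```

Each call returns a new Series; the original qty is left unchanged.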

Real spreadsheets are dirty: a price column might hold "1.50", "3.00", and a rogue "N/A". Calling astype("float") on that column raises a ValueError. pd.to_numeric() is built for this case: its errors= argument lets you decide what happens to values that will not parse. errors="coerce" turns each bad value into NaN so the rest of the column converts successfully.
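The coerce behaviour can be sketched with the price values mentioned above (the variable names are illustrative):

```python
import pandas as pd

# The messy price column described in the text.
price = pd.Series(["1.50", "3.00", "N/A"])

# Values that cannot be parsed become NaN; the rest convert normally.
clean = pd.to_numeric(price, errors="coerce")

print(clean)
print("missing values:", clean.isna().sum())
```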

A stray non-numeric value blows up a strict astype:
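A minimal sketch of that failure, assuming a hypothetical price Series with one unparseable entry. The try/except is only there so the script keeps running and the error message can be printed.

```python
import pandas as pd

# Hypothetical data: two valid prices and one rogue value.
price = pd.Series(["1.50", "3.00", "N/A"])

failed = False
try:
    price.astype("float64")
except ValueError as exc:
    # astype refuses the whole column because of the single bad value.
    failed = True
    print("astype failed:", exc)
```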

astype("int") refuses a column that contains NaN:
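A sketch of this error, assuming a hypothetical scores Series with one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float column with a gap.
scores = pd.Series([88.0, np.nan, 75.0])

failed = False
try:
    scores.astype("int64")
except ValueError as exc:
    # Classic int64 has no way to represent a missing value.
    failed = True
    print("astype failed:", exc)
```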

✅ Fix: fill the gaps first, or use the nullable Int64:
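Both fixes can be sketched on the same kind of hypothetical scores Series. Filling with 0 is an assumption made for illustration; the right fill value depends on what a gap means in your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float column with a gap.
scores = pd.Series([88.0, np.nan, 75.0])

# Option 1: replace the gap, then convert to classic int64.
filled = scores.fillna(0).astype("int64")

# Option 2: nullable Int64 (capital I) keeps the gap as <NA>.
nullable = scores.astype("Int64")

print(filled.tolist())
print(nullable)
```

Option 1 changes the data, since the gap becomes a real 0; option 2 preserves the fact that the value is missing.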

A small import came in with everything as text. Repair the dtypes.
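One possible approach, sketched on a hypothetical stand-in for the import (the column names and values here are assumptions, not the exercise's actual data): use astype for a clean column, pd.to_numeric for a messy one, and pd.to_datetime for dates.

```python
import pandas as pd

# Hypothetical import: every column arrived as text.
raw = pd.DataFrame({
    "qty": ["3", "5", "2"],
    "price": ["1.50", "N/A", "0.75"],
    "sold_on": ["2024-01-05", "2024-01-06", "2024-01-07"],
})

fixed = raw.copy()
fixed["qty"] = fixed["qty"].astype("int64")                      # clean, so strict is fine
fixed["price"] = pd.to_numeric(fixed["price"], errors="coerce")  # "N/A" -> NaN
fixed["sold_on"] = pd.to_datetime(fixed["sold_on"])              # text -> datetimes

print(fixed.dtypes)
```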


You can read df.dtypes, convert cleanly with astype, rescue messy data with pd.to_numeric(errors="coerce") and pd.to_datetime, and you know why object columns drag performance down.

🚀 Up next: Categorical Data — the special dtype that shrinks repeated-text columns and unlocks ordered categories.

Practice quiz

What does df.dtypes show you?

The data type of every column
The number of rows
The column names only
The missing values

Answer: The data type of every column. df.dtypes lists each column's storage type at once.

Which dtype is the home of NaN and decimal numbers?

int64
bool
float64
datetime64

Answer: float64. float64 holds decimals and is where NaN lives.

What happens to an int column the moment a single NaN appears?

It stays int64
It becomes bool
It is deleted
It becomes float64

Answer: It becomes float64. Classic int64 cannot hold NaN, so the column is promoted to float64.
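The promotion can be seen in a small sketch (hypothetical values). Reindexing is used here as one common way a gap sneaks into an integer column.

```python
import numpy as np
import pandas as pd

whole = pd.Series([1, 2, 3])          # integers only
with_gap = pd.Series([1, np.nan, 3])  # same values, one missing

# Asking for a label that does not exist introduces a NaN.
reindexed = whole.reindex(range(4))

print(whole.dtype, with_gap.dtype, reindexed.dtype)
```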

What does pd.Series([88.7, 91.2, 75.9]).astype('int') produce?

Answer: 88, 91, 75. astype('int') truncates toward zero rather than rounding.
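A quick sketch of the difference, using the Series from the question. Calling round() first is one way to get rounding instead of truncation.

```python
import pandas as pd

s = pd.Series([88.7, 91.2, 75.9])

truncated = s.astype("int64")          # drops the fractional part
rounded = s.round().astype("int64")    # rounds first, then converts

print(truncated.tolist())  # [88, 91, 75]
print(rounded.tolist())    # [89, 91, 76]
```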

How is astype() described compared with to_numeric()?

Strict, raising an error on bad values
It silently drops bad rows
It only works on dates
It always returns strings

Answer: Strict, raising an error on bad values. astype is strict and raises if any value cannot be converted.

What does errors='coerce' do in pd.to_numeric()?

Raises a ValueError
Skips the whole column
Rounds the values
Turns unparseable values into NaN

Answer: Turns unparseable values into NaN. errors='coerce' replaces values that cannot be parsed with NaN.

After pd.to_numeric(pd.Series(['1.50','3.00','N/A','0.75']), errors='coerce'), what does .isna().sum() return?

Answer: 1. Only 'N/A' fails to parse, so exactly one NaN appears.

Which function parses a column of date strings into real datetimes?

pd.to_int
pd.parse
pd.to_datetime
pd.astype_date

Answer: pd.to_datetime. pd.to_datetime parses common formats into a datetime64 column.
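A minimal sketch, assuming a hypothetical column of ISO-formatted date strings. Once parsed, the .dt accessor exposes parts such as the month.

```python
import pandas as pd

# Hypothetical date strings in ISO format.
dates = pd.Series(["2024-01-05", "2024-02-10", "2024-03-15"])

parsed = pd.to_datetime(dates)

print(parsed.dtype)
print(parsed.dt.month.tolist())  # [1, 2, 3]
```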

Why are object columns slow?

They are always sorted
They store pointers to scattered Python objects instead of a contiguous block
They are stored on disk
They are encrypted

Answer: They store pointers to scattered Python objects instead of a contiguous block. Object columns hold pointers to separate Python objects, so operations loop in slow Python.
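One rough way to see the difference is to store the same numbers both ways and compare. This sketch uses memory footprint as a stand-in for the overhead, since timings vary by machine; the Series contents are hypothetical.

```python
import pandas as pd

nums = pd.Series(range(1000))      # int64: one contiguous block of numbers
as_obj = nums.astype("object")     # object: each element is a separate Python int

# Each element of the object column is a full Python object.
print(type(as_obj.iloc[0]))

# deep=True also counts the Python objects the column points to.
print(nums.memory_usage(deep=True), as_obj.memory_usage(deep=True))
```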

astype('int') fails on a column containing NaN. A safe fix is:

Delete the column
Convert to bool first
Use a for loop
fillna(0) first, or use the nullable 'Int64' dtype

Answer: fillna(0) first, or use the nullable 'Int64' dtype. Clean the NaNs with fillna, or use the nullable Int64 dtype that allows missing values.