Hashing with hashlib
Hashing turns any data into a short, fixed-size fingerprint called a digest, and hashlib is Python's standard module of one-way hash functions for producing it.
Part of the free Python course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
Verify a download, detect a changed file, deduplicate data, or build the foundation for password storage — all from a few lines using the standard library.
Feed bytes to a hash constructor and call .hexdigest() to get the fingerprint as a hex string. The same input always yields the same digest:
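A minimal sketch of that call, using sha256 on a bytes literal (the input b"hello" is only an illustrative choice):

```python
import hashlib

# Hash a bytes literal and read the digest back as a hex string.
digest = hashlib.sha256(b"hello").hexdigest()
print(digest)           # 64 hex characters, i.e. 256 bits

# Hashing the same input again yields the same digest.
again = hashlib.sha256(b"hello").hexdigest()
print(digest == again)  # True
```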
Hash functions work on bytes, not text. To hash a normal string you must encode it first, almost always with UTF-8:
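As a small illustration of the difference, the sketch below first passes a str (which is rejected) and then the encoded bytes. The variable names are illustrative only:

```python
import hashlib

text = "hello"

# Passing a str directly is rejected with a TypeError.
try:
    hashlib.sha256(text)
except TypeError as exc:
    print("TypeError:", exc)

# Encode to bytes first, then hash.
encoded_digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
print(encoded_digest)
```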
You do not have to hand the hasher everything at once. Create an empty hash object and feed it with .update() as many times as you like — the result is identical to hashing the concatenation in one go:
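One way to check this claim for yourself, with two arbitrary pieces of input:

```python
import hashlib

# Feed the data in two pieces.
h = hashlib.sha256()
h.update(b"hello")
h.update(b" world")
incremental = h.hexdigest()

# Hash the concatenation in a single call.
one_shot = hashlib.sha256(b"hello world").hexdigest()

print(incremental == one_shot)  # True
```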
This is exactly how you hash a large file without loading it all into memory — read it in fixed-size chunks and update as you go:
The iter(callable, sentinel) trick keeps calling f.read(chunk_size) until it returns the empty bytes b"" at end of file. Memory use stays flat no matter how big the file is.
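The chunked approach described above could be sketched as follows. The function name file_sha256 and the 64 KiB default chunk size are assumptions chosen for illustration, not values the lesson prescribes:

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:  # binary mode: the hasher needs bytes
        # iter(callable, sentinel) calls f.read(chunk_size) repeatedly
        # and stops when it returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Only one chunk is held in memory at a time, so the chunk size mostly trades the number of read calls against buffer size; the digest is the same whatever size you pick.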
A digest is a one-way fingerprint: there is no key and no way to recover the original input. That is the opposite of encryption, which is designed to be reversed. So hashing is great for verifying data but useless for storing data you need back:
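To make the "verify, not recover" point concrete, here is a hedged sketch of an integrity check. The helper names sha256_hex and matches and the sample data are hypothetical; hmac.compare_digest is used only as a careful way to compare two digest strings:

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def matches(data: bytes, expected_hex: str) -> bool:
    # We can only recompute and compare; the digest cannot be turned
    # back into the original data.
    return hmac.compare_digest(sha256_hex(data), expected_hex)

original = b"release-1.0 contents"
published = sha256_hex(original)  # the fingerprint you would publish

print(matches(original, published))                 # True
print(matches(b"release-1.0 c0ntents", published))  # False
```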
For passwords, a bare sha256 is dangerous: it is fast (so guessable in bulk) and unsalted (so identical passwords share a digest). The standard fix is a salt plus a slow key-derivation function. hashlib ships one, pbkdf2_hmac:
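A minimal sketch of salted password hashing with pbkdf2_hmac. The helper names, the 16-byte salt, and the iteration count of 600,000 are assumptions for illustration; pick an iteration count that suits your hardware and current guidance:

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: tune for your hardware and guidance

def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, key); store both alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes,
                    iterations: int = ITERATIONS) -> bool:
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # Constant-time comparison of the derived keys.
    return hmac.compare_digest(key, expected_key)

salt, key = hash_password("correct horse")
print(verify_password("correct horse", salt, key))  # True
print(verify_password("wrong guess", salt, key))    # False
```

Because each call draws a new salt, hashing the same password twice produces two different keys, which is exactly what stops identical passwords from sharing a stored digest.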
Complete the program so it hashes a string correctly. Replace each ___ and match the expected output.
First blank: encode — text.encode("utf-8") converts the string to bytes.
Second blank: hexdigest — returns the digest as a hex string.
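The exercise's starter code is not shown here. Assuming it hashes a string with sha256 and prints the result, a completed version consistent with the two hints could look like this (the input "hello" and the choice of sha256 are assumptions):

```python
import hashlib

text = "hello"  # assumed input; the exercise's real string is not shown

data = text.encode("utf-8")                # first blank: encode
result = hashlib.sha256(data).hexdigest()  # second blank: hexdigest
print(result)
```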
✅ Encode first: hashlib.sha256("hello".encode("utf-8")).
✅ Hashing is one-way. If you need the data back, use encryption (e.g. the cryptography library), not hashlib.
✅ Use hashlib.pbkdf2_hmac with a unique salt and many rounds, or bcrypt/argon2.
Build a fingerprint function and use it to detect when content changes — the same idea behind file integrity checks and caching.
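One possible starting point for this project, as a sketch only. The function names fingerprint and has_changed and the sample strings are hypothetical:

```python
import hashlib

def fingerprint(content: str) -> str:
    """Return a SHA-256 hex digest of the text, encoded as UTF-8."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def has_changed(content: str, known_fingerprint: str) -> bool:
    """True if the content no longer matches the saved fingerprint."""
    return fingerprint(content) != known_fingerprint

saved = fingerprint("version 1 of the document")

print(has_changed("version 1 of the document", saved))  # False
print(has_changed("version 2 of the document", saved))  # True
```

Storing only the fingerprint is enough to detect a change later; a cache can use the same digest as a key, so unchanged content maps to the same entry.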
Go deeper with the official Python documentation:
Lesson complete — you can fingerprint any data!
You can produce digests with sha256 and md5, feed data with .update(), hash files in chunks, explain why hashing is one-way (not encryption), and store passwords properly with a salt and pbkdf2_hmac.
🚀 Up next: Testing with unittest — prove your code works and keep it working.
Practice quiz
What is a hash digest?
- A reversible encryption of the input
- The original input compressed
- A short, fixed-size fingerprint of the input that cannot be reversed
- A random number
Answer: A short, fixed-size fingerprint of the input that cannot be reversed. Hashing turns any input into a short fixed-size digest; the same input always gives the same digest and you cannot reverse it.
Why must you .encode() a string before hashing it with hashlib?
- Because hash functions operate on bytes, not text
- To make it shorter
- To add a salt
- To make it case-insensitive
Answer: Because hash functions operate on bytes, not text. Hash functions work on bytes, so a str must be encoded (usually UTF-8) first or you get a TypeError.
Is hashing the same as encryption?
- Yes, they are interchangeable
- Yes, but hashing is faster
- No — encryption is one-way
- No — hashing is one-way and has no key; encryption is reversible
Answer: No — hashing is one-way and has no key; encryption is reversible. Hashing is one-way by design (no key, no way back); encryption is reversible with the right key.
What does hashlib.md5(b"hello").hexdigest() return?
- 2cf24dba5fb0a30e
- 5d41402abc4b2a76b9719d911017c592
- The text hello
- An error
Answer: 5d41402abc4b2a76b9719d911017c592. md5 of b"hello" is 5d41402abc4b2a76b9719d911017c592 — fine for checksums, never for security.
Calling h.update(b"hello") then h.update(b" world") on a sha256 object gives the same digest as which one-shot call?
- hashlib.sha256(b"hello world")
- hashlib.sha256(b"hello")
- hashlib.sha256(b" world")
- None — update changes the result
Answer: hashlib.sha256(b"hello world"). Feeding data with repeated .update() is identical to hashing the concatenation b"hello world" at once.
Why hash a large file in fixed-size chunks instead of all at once?
- It produces a different, safer digest
- Because update() only accepts small inputs
- To keep memory use flat regardless of file size
- To make the hash reversible
Answer: To keep memory use flat regardless of file size. Reading in chunks and calling update keeps memory use flat even for multi-gigabyte files.
Why is a bare sha256 unsafe for storing passwords?
- It is too slow
- It is fast and unsalted, so identical passwords share a digest and guessing is cheap
- It cannot hash text
- It produces a different digest each time
Answer: It is fast and unsalted, so identical passwords share a digest and guessing is cheap. sha256 is fast (easy to brute force) and unsalted (identical passwords match), so it is unsafe for passwords.
Which hashlib function is designed for password storage?
- hashlib.md5
- hashlib.sha256
- hashlib.new
- hashlib.pbkdf2_hmac
Answer: hashlib.pbkdf2_hmac. pbkdf2_hmac is a slow, salted key-derivation function suitable for passwords (bcrypt/argon2 are also good).
What role does a salt play in password hashing?
- It makes hashing faster
- A unique random value per user so identical passwords get different digests
- It encrypts the password
- It compresses the password
Answer: A unique random value per user so identical passwords get different digests. A unique random salt per user ensures identical passwords do not produce the same stored digest.
What does hashlib.sha256("hello") (a str, not bytes) raise?
- A ValueError
- Nothing — it works fine
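- A TypeError, because a str must be encoded to bytes first
- A KeyError
Answer: A TypeError, because a str must be encoded to bytes first. Passing a str raises TypeError; the message reads "Strings must be encoded before hashing" on recent Python 3 releases and "Unicode-objects must be encoded before hashing" on older ones. Encode the string first.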