Capstone: Image as an Array
An image is simply a 3D NumPy array of shape (height, width, 3) holding uint8 values from 0 to 255 — one number per color channel (red, green, blue) for every pixel — which means every editing trick is really just array indexing, slicing, broadcasting, and aggregation. In this final capstone you'll build a synthetic image, set its channels, convert it to grayscale, flip and crop with slicing, invert its colors, and brighten it with clip — tying together everything from the whole course.
Part of the free NumPy course at LearnCodingFast: hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
A color image is a grid of pixels, and each pixel is three numbers: red, green, and blue. NumPy stores that as a 3D array with shape (height, width, 3) and dtype uint8 (0–255). Create a blank black image with np.zeros.
The three axes are: axis 0 = rows (height), axis 1 = columns (width), axis 2 = color channels (R, G, B).
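A minimal sketch of this step, sized to match the expected output that follows. The variable name img is only an illustration.

```python
import numpy as np

# 4 rows (height), 6 columns (width), 3 channels (R, G, B).
# All zeros in every channel means every pixel is black.
img = np.zeros((4, 6, 3), dtype=np.uint8)

print(img.shape)   # (4, 6, 3)
print(img.dtype)   # uint8
print(img[0, 0])   # [0 0 0]
```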
Expected output: shape (4, 6, 3), dtype uint8, and the first pixel [0 0 0] (pure black).
The last axis selects a channel: index 0 is red, 1 is green, 2 is blue. Assigning to a channel slice uses broadcasting to fill many pixels at once. Color cheat-sheet: [255, 0, 0] is red, [0, 255, 0] is green, [0, 0, 255] is blue, and [255, 255, 255] is white.
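One way to try this out, as a sketch: the image size and the choice of a 2x2 green corner are assumptions picked to reproduce the kind of output described next.

```python
import numpy as np

img = np.zeros((4, 6, 3), dtype=np.uint8)

# Fill the red channel (index 0) of every pixel.
# The scalar 255 is broadcast across all 4 x 6 positions.
img[:, :, 0] = 255

# Recolor the top-left 2x2 corner green.
# The 3-element list is broadcast across the four pixels in the slice.
img[:2, :2] = [0, 255, 0]

print(img[3, 5])   # [255   0   0]  (red pixel)
print(img[0, 0])   # [  0 255   0]  (green corner pixel)
```

NumPy pads each printed value to the same width, so the spacing inside the brackets may differ from a hand-typed list.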
Expected output: a red pixel [255 0 0] and a green corner pixel [0 255 0].
Grayscale collapses the 3 channels into a single brightness value per pixel using a weighted sum across the channel axis. The standard luminosity weights are 0.2989·R + 0.5870·G + 0.1140·B; green carries the most weight and blue the least because the human eye is most sensitive to green and least sensitive to blue. np.dot applies them across the last axis.
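The weighted sum can be sketched on a hypothetical 2x2 test image holding one red, one green, one blue, and one white pixel. The names img, weights, and gray are illustrative.

```python
import numpy as np

# Top row: red, green. Bottom row: blue, white.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Luminosity weights for R, G, B.
weights = np.array([0.2989, 0.5870, 0.1140])

# For an N-D array and a 1-D vector, np.dot sums products over the
# array's last axis, so the 3 channels collapse to one float per pixel.
gray = np.dot(img, weights)

print(gray.shape)         # (2, 2)
print(np.round(gray, 1))  # roughly 76.2, 149.7, 29.1, 255.0
```

The result is a float64 array; cast it with astype(np.uint8) only after rounding or clipping if an 8-bit grayscale image is needed.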
Expected output: a 2D (2, 2) array of brightness values about 76.2 (red), 149.7 (green), 29.1 (blue), 255.0 (white).
Geometry operations are pure slicing. img[::-1] reverses the row axis to flip top-to-bottom; img[:, ::-1] flips left-to-right. Cropping is a slice of the grid: img[r0:r1, c0:c1] keeps a rectangle of pixels (with all channels).
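A short sketch of the flips and the crop. The row-by-row red gradient and the crop bounds are assumptions chosen so that the effect of each slice is easy to see.

```python
import numpy as np

img = np.zeros((4, 6, 3), dtype=np.uint8)

# Give each row its own red level (0, 10, 20, 30) so a vertical flip is visible.
# The (4, 1) column of values is broadcast across the 6 columns.
img[:, :, 0] = np.arange(0, 40, 10, dtype=np.uint8).reshape(4, 1)

flipped = img[::-1]       # reverse the row axis: top-to-bottom flip
mirrored = img[:, ::-1]   # reverse the column axis: left-to-right flip
crop = img[1:3, 2:4]      # rows 1-2, columns 2-3, all channels kept

print(img[:, 0, 0])       # [ 0 10 20 30]
print(flipped[:, 0, 0])   # [30 20 10  0]
print(crop.shape)         # (2, 2, 3)
```

These slices are views onto the same data rather than copies, so writing into crop would also change img; call .copy() when an independent array is wanted.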
Expected output: red values [0 10 20 30] top-to-bottom, reversed to [30 20 10 0] after the flip, and a crop of shape (2, 2, 3).
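Inverting colors is broadcasting a scalar across the whole array: 255 - img turns black to white and red to cyan. Brightening adds a value, but the result must stay in range. Because uint8 arithmetic wraps around, do the addition in a wider integer type first, then apply np.clip(..., 0, 255) and cast back to uint8 so that values above 255 saturate at 255.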
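A sketch of both operations on a hypothetical 1x2 image whose pixel values are chosen to line up with the expected output that follows. Casting to int16 before adding is one way (an assumption, not the only option) to keep the sum from wrapping before np.clip sees it.

```python
import numpy as np

# One row, two pixels.
img = np.array([[[10, 20, 30], [200, 250, 100]]], dtype=np.uint8)

# Invert: the scalar 255 is broadcast against every channel value.
inverted = 255 - img
print(inverted[0, 0])   # [245 235 225]
print(inverted[0, 1])   # [ 55   5 155]

# Brighten by 40: widen, add, clip to the valid range, narrow again.
brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)
print(brighter[0, 1])   # [240 255 140]
```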
Expected output: inverted pixels [245 235 225] and [55 5 155]; the brightened pixel [240 255 140]. Note that 250 + 40 clips to 255 instead of wrapping around.
Replace each ___ so the program builds a small blue image, then inverts it to yellow.
Expected output: [0 0 255] then [255 255 0]. (Answers: 3, 2, 255.)
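The exercise's starter code is not reproduced here, so the following is only a guess at what a completed version could look like, with the three answers filled in. The 2x2 image size and the variable names are assumptions.

```python
import numpy as np

# Answer 1: the last dimension is 3 (one slot each for R, G, B).
img = np.zeros((2, 2, 3), dtype=np.uint8)

# Answer 2: channel index 2 is blue.
img[:, :, 2] = 255
print(img[0, 0])        # [  0   0 255]

# Answer 3: subtracting from 255 inverts blue to yellow.
inverted = 255 - img
print(inverted[0, 0])   # [255 255   0]
```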
With uint8, 250 + 40 wraps around to 34 (290 − 256) instead of saturating at 255, so the brightest pixels suddenly turn dark.
✅ Fix: compute in a wider type, then np.clip(result, 0, 255).astype(np.uint8).
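To see the difference side by side, here is a small sketch on a hypothetical two-value array. int16 is used as the wider type, which is an assumption; any type that can hold 295 would do.

```python
import numpy as np

pixels = np.array([250, 100], dtype=np.uint8)

# Mistake: the addition happens in uint8, so 290 wraps to 290 - 256 = 34.
wrapped = pixels + np.uint8(40)
print(wrapped)   # [ 34 140]

# Fix: widen first, add, clip into 0..255, then narrow back to uint8.
safe = np.clip(pixels.astype(np.int16) + 40, 0, 255).astype(np.uint8)
print(safe)      # [255 140]
```

Clipping the already-wrapped array would not help, because 34 is a valid value and the information that it overflowed is gone.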
The shape is (height, width, 3): rows come first, columns second.
✅ Fix: index as img[row, col], and remember axis 2 is the color channel.
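A quick illustration of why the order matters, using a hypothetical 4-row by 6-column image. Swapping the two indices can run past the end of the row axis.

```python
import numpy as np

img = np.zeros((4, 6, 3), dtype=np.uint8)   # 4 rows, 6 columns

# Row 1, column 5 is inside the image.
img[1, 5] = [255, 255, 255]
print(img[1, 5])      # [255 255 255]
print(img[1, 5, 2])   # 255  (blue channel of that pixel)

# Treating the indices as (x, y) asks for row 5, which does not exist.
try:
    img[5, 1]
except IndexError as exc:
    print("IndexError:", exc)
```

On a square image the swapped indices would not raise an error at all; they would silently address the wrong pixel, which is harder to notice.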
Build a gradient image, brighten it with clip, flip it, then report its average brightness — tying together shape, dtype, slicing, broadcasting, clip, and aggregation.
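One possible shape for this project, offered as a sketch rather than the official solution. The image size, the gradient steps, and the brightening amount of 40 are all assumptions.

```python
import numpy as np

height, width = 4, 6

# Horizontal gray gradient: column values 0, 50, 100, 150, 200, 250.
ramp = np.arange(0, 300, 50, dtype=np.uint8)
img = np.zeros((height, width, 3), dtype=np.uint8)
# Reshape the ramp to (1, 6, 1) so it broadcasts over rows and channels.
img[:, :, :] = ramp[np.newaxis, :, np.newaxis]

# Brighten by 40 in a wider type, clip, and return to uint8.
brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)

# Flip left-to-right by reversing the column axis.
flipped = brighter[:, ::-1]

print(flipped.shape, flipped.dtype)   # (4, 6, 3) uint8
print(flipped[0, :, 0])               # [255 240 190 140  90  40]
print(f"{flipped.mean():.2f}")        # 159.17
```

The mean is taken over every value in the array; because all three channels are equal here, it is also the average gray level of the image.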
🎉 Capstone complete — you've finished the NumPy course!
You treated an image as a 3D array and used every core NumPy skill — shape and dtype, indexing and slicing, broadcasting, aggregation, and clip — to create, recolor, grayscale, flip, crop, invert, and brighten it. That is real image processing, powered entirely by arrays.
🚀 Congratulations! From your first np.array to a full image-editing pipeline, you now have a working command of NumPy. Go build something amazing with it.
Practice quiz
What shape does a color image have as a NumPy array?
- (width, height)
- (channels, height, width)
- (height, width, channels)
- (height, width)
Answer: (height, width, channels). Color images are 3D arrays of shape (height, width, channels).
Which dtype is used for standard 0-255 image pixels?
- uint8
- float64
- int32
- bool
Answer: uint8. uint8 holds exactly the 0-255 range in a single byte per value.
What does np.zeros((4, 6, 3), dtype=np.uint8) create?
- A white image
- A 4x6 black RGB image
- A 6x4 grayscale image
- A 1D array
Answer: A 4x6 black RGB image. All-zero RGB pixels are black, with shape (4, 6, 3).
In an RGB image, which index selects the blue channel?
- 0
- 3
- 1
- 2
Answer: 2. Channel order is R=0, G=1, B=2, so blue is index 2.
What does img[:, :, 0] = 255 do?
- Sets every pixel to white
- Deletes the red channel
- Sets the red channel of every pixel to full
- Sets only the first pixel
Answer: Sets the red channel of every pixel to full. It assigns 255 to the red channel (index 0) of all pixels.
How is an RGB image converted to grayscale?
- A weighted sum across the channel axis
- By dropping the red channel
- By summing all axes
- By transposing the array
Answer: A weighted sum across the channel axis. Grayscale is a luminosity-weighted sum: 0.2989R + 0.5870G + 0.1140B.
What does img[::-1] do to an image?
- Inverts the colors
- Flips it top-to-bottom (reverses rows)
- Crops the image
- Flips it left-to-right
Answer: Flips it top-to-bottom (reverses rows). Slicing the row axis with ::-1 reverses the rows, a vertical flip.
Which expression inverts an image's colors?
- img * -1
- img.T
- np.flip(img)
- 255 - img
Answer: 255 - img. Broadcasting 255 - img turns each value into its complement.
Why use np.clip when brightening uint8 pixels?
- To make the image larger
- To prevent overflow wrap-around past 255
- To convert to float
- To flip the image
Answer: To prevent overflow wrap-around past 255. Clipping to 0-255 stops values from wrapping around in uint8.
Cropping a rectangle of an image is done with...
Answer: slicing. Cropping is just slicing the row and column axes, as in img[r0:r1, c0:c1].