Runes, Bytes & UTF-8

A byte is a single uint8 of raw UTF-8 data, while a rune is an int32 holding one Unicode code point. Understanding the difference is the key to handling non-ASCII text correctly in Go.

Part of the free Go course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

What You'll Learn in This Lesson

1️⃣ Bytes, Runes & Why len() Counts Bytes

Go source code is UTF-8, and a string literal stores that UTF-8 directly as bytes. A byte is a uint8: one unit of the encoding. A rune is an int32: one whole Unicode code point. ASCII characters fit in a single byte, but accented or non-Latin characters take 2 to 4 bytes. That's why len("héllo") is 6 (bytes) while the character count is 5, which you get from utf8.RuneCountInString.
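
As a minimal sketch of the byte count versus character count distinction, the program below compares len with utf8.RuneCountInString on a few strings. The helper name counts is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper for this example: it returns the
// byte length and the rune (code point) count of s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "héllo", "世界"} {
		b, r := counts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
	// Prints:
	// "hello": 5 bytes, 5 runes
	// "héllo": 6 bytes, 5 runes
	// "世界": 6 bytes, 2 runes
}
```

The two numbers agree only for pure ASCII text; each CJK character here takes 3 bytes.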

2️⃣ Ranging Over a String Yields Runes

The for ... range loop over a string is UTF-8 aware. Each iteration decodes one rune and gives you two values: the byte index where that character starts and the rune itself. Because a multi-byte character advances the index by more than one, you'll see the index jump — proof that range is walking by character, not by byte. A rune literal like 'A' is just an int32 code point (65).
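
The index jump described above can be seen in a short sketch. The helper runeStarts is a hypothetical name introduced here; it simply collects the byte index that range reports for each rune.

```go
package main

import "fmt"

// runeStarts is a hypothetical helper: it returns the byte index at
// which each rune of s begins, as reported by range.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "Gé!"
	for i, r := range s {
		// i is a byte offset, r is the decoded rune.
		fmt.Printf("byte index %d: %c (U+%04X)\n", i, r, r)
	}
	// Prints:
	// byte index 0: G (U+0047)
	// byte index 1: é (U+00E9)
	// byte index 3: ! (U+0021)

	fmt.Println(runeStarts(s)) // [0 1 3]
}
```

Index 2 never appears because it is the second byte of é, not the start of a character.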

3️⃣ []rune vs []byte

Converting a string makes its structure explicit. []byte(s) gives the raw UTF-8 bytes, which suits I/O and byte-level work. []rune(s) gives one element per code point, so you can index the Nth character safely and operate per character. Reversing text is the classic case: doing it with bytes would split a multi-byte character into invalid UTF-8, but reversing a []rune and converting back keeps every character whole.
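
A rough sketch of that contrast follows. Both function names, reverseRunes and reverseBytes, are hypothetical and used only for illustration; the byte version is included purely to show what goes wrong.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseRunes (hypothetical name) reverses s one code point at a time.
func reverseRunes(s string) string {
	rs := []rune(s)
	for i, j := 0, len(rs)-1; i < j; i, j = i+1, j-1 {
		rs[i], rs[j] = rs[j], rs[i]
	}
	return string(rs)
}

// reverseBytes (hypothetical name) reverses the raw bytes of s.
// It is only correct for pure ASCII input.
func reverseBytes(s string) string {
	bs := []byte(s)
	for i, j := 0, len(bs)-1; i < j; i, j = i+1, j-1 {
		bs[i], bs[j] = bs[j], bs[i]
	}
	return string(bs)
}

func main() {
	s := "héllo"
	fmt.Println(len([]byte(s)), len([]rune(s))) // 6 5

	fmt.Println(reverseRunes(s)) // olléh

	// The two bytes of é end up in the wrong order, so the result
	// is no longer valid UTF-8.
	fmt.Println(utf8.ValidString(reverseBytes(s))) // false
}
```

Note that a []rune reversal works per code point, so text that uses combining marks (a base letter followed by a separate accent) can still come out wrong; handling that is outside the scope of this sketch.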

🎯 Your Turn

Fill in the blanks to print both the byte count and the character count of "café" — they differ because é is two bytes.

Reverse "go世界" the correct way. Convert to []rune, swap from both ends, then convert back with string(rs). If you tried this with bytes the CJK characters would shatter.

Practice quiz

A byte in Go is an alias for which type?

  • int32
  • rune
  • uint8
  • int8

Answer: uint8. byte is an alias for uint8 — one 8-bit value of the raw UTF-8 encoding.

A rune in Go is an alias for which type?

  • int32
  • uint8
  • int64
  • uint32

Answer: int32. rune is an alias for int32 and holds one full Unicode code point.

What does len("héllo") return, where é is 2 bytes in UTF-8?

  • 4
  • 5
  • 7
  • 6

Answer: 6. len counts bytes: h-é(2)-l-l-o is 6 bytes.

How do you count the number of characters (runes) in a string s?

  • len(s)
  • utf8.RuneCountInString(s)
  • cap(s)
  • s.Count()

Answer: utf8.RuneCountInString(s). utf8.RuneCountInString(s) (or len([]rune(s))) counts characters, not bytes.

What does utf8.RuneCountInString("café") return, where é is 2 bytes?

  • 4
  • 3
  • 5
  • 6

Answer: 4. There are 4 characters (c, a, f, é) even though it is 5 bytes.

Ranging over a string with for i, r := range s yields what for each iteration?

  • A byte index and a byte
  • A rune index and a rune
  • A byte index and a rune
  • A character index and a byte

Answer: A byte index and a rune. range is UTF-8 aware: it gives the starting byte index and the decoded rune.

Ranging over "Gé!", what byte index does the '!' character have?

  • 1
  • 3
  • 2
  • 4

Answer: 3. G is byte 0, é occupies bytes 1-2, so '!' starts at byte index 3.

Indexing a string with s[1] returns what kind of value?

  • A one-character string
  • A rune
  • A pointer
  • A single byte (uint8)

Answer: A single byte (uint8). Indexing a string always yields a single byte, not a character.
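
To illustrate, the sketch below indexes into "Gé!" and then fetches the second character by decoding instead. The helper nthRune is a hypothetical name, not part of the standard library.

```go
package main

import "fmt"

// nthRune is a hypothetical helper: it returns the n-th code point of s
// (0-based) and reports whether s has that many code points.
func nthRune(s string, n int) (rune, bool) {
	for _, r := range s {
		if n == 0 {
			return r, true
		}
		n--
	}
	return 0, false
}

func main() {
	s := "Gé!"

	b := s[1] // first byte of é's two-byte encoding, not the character
	fmt.Printf("%T %d\n", b, b) // uint8 195

	r, ok := nthRune(s, 1)
	fmt.Printf("%T %c %v\n", r, r, ok) // int32 é true
}
```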

Why reverse a string by converting to []rune rather than []byte?

Answer: Reversing bytes would split multi-byte UTF-8 sequences; []rune keeps characters whole.

What does the conversion string(rune(65)) produce?

  • "65"
  • 65
  • "A"
  • An error

Answer: "A". Converting a rune to a string yields the UTF-8 encoding of that code point, here "A". To get the decimal text "65", use strconv.Itoa instead.
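
A small sketch of the difference between converting a code point and formatting a number. The wrapper names codePointToString and digits are hypothetical and exist only so the two conversions can be compared side by side.

```go
package main

import (
	"fmt"
	"strconv"
)

// codePointToString (hypothetical name) treats n as a Unicode code point
// and returns its UTF-8 encoding as a string.
func codePointToString(n int) string {
	return string(rune(n))
}

// digits (hypothetical name) returns the decimal text of n.
func digits(n int) string {
	return strconv.Itoa(n)
}

func main() {
	fmt.Println(codePointToString(65))     // A
	fmt.Println(digits(65))                // 65
	fmt.Println(codePointToString(0x4E16)) // 世
}
```

Writing the rune conversion explicitly, as in string(rune(n)), makes the intent clear; go vet flags a bare string(n) on an integer because it is easy to mistake for number formatting.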