Runes, Bytes & UTF-8
A byte is a single uint8 of raw UTF-8 data while a rune is an int32 holding one full Unicode character, and understanding the difference is the key to handling non-ASCII text correctly in Go.
Part of the free Go course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
What You'll Learn in This Lesson
1️⃣ Bytes, Runes & Why len() Counts Bytes
Go source code is UTF-8, and a string stores that UTF-8 directly as bytes. A byte is a uint8: one unit of the encoding. A rune is an int32: one whole Unicode code point. ASCII characters fit in a single byte, but accented or non-Latin characters take 2 to 4 bytes. That's why len("héllo") is 6 (bytes) while the character count is 5, which you get from utf8.RuneCountInString.
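The difference between the two counts can be sketched as follows. This is a minimal illustration, not part of the lesson's own exercises: byteAndRuneCount is a hypothetical helper name, and the sample strings assume é is stored as the single code point U+00E9.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper for this sketch. It returns
// the number of bytes in s and the number of Unicode code points in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// ASCII only, one accented letter, and two CJK characters.
	for _, s := range []string{"hello", "héllo", "世界"} {
		b, r := byteAndRuneCount(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For "hello" both counts are 5; for "héllo" they are 6 and 5; for "世界" they are 6 and 2, since each of those characters takes three bytes in UTF-8.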
2️⃣ Ranging Over a String Yields Runes
The for ... range loop over a string is UTF-8 aware. Each iteration decodes one rune and gives you two values: the byte index where that character starts and the rune itself. Because a multi-byte character advances the index by more than one, you'll see the index jump — proof that range is walking by character, not by byte. A rune literal like 'A' is just an int32 code point (65).
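The index jump is easy to see on a short string. The sketch below ranges over "Gé!" and also collects the starting byte index of every rune; runeStarts is a hypothetical helper introduced only for this illustration, and é is again assumed to be the single code point U+00E9.

```go
package main

import "fmt"

// runeStarts is a hypothetical helper for this sketch. It returns the
// byte index at which each rune of s begins.
func runeStarts(s string) []int {
	var starts []int
	for i := range s { // range decodes one rune per iteration
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "Gé!"
	for i, r := range s {
		// i is a byte offset, r is the decoded rune (an int32).
		fmt.Printf("byte index %d: %c (U+%04X)\n", i, r, r)
	}
	fmt.Println(runeStarts(s)) // [0 1 3]
}
```

Index 2 never appears because it is the second byte of é, not the start of a character.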
3️⃣ []rune vs []byte
Converting a string makes its structure explicit. []byte(s) gives the raw UTF-8 bytes, which is what you want for I/O and byte-level work. []rune(s) gives one element per character (strictly, per Unicode code point), so you can index the Nth character safely and operate per character. Reversing text is the classic case: doing it with bytes would split a multi-byte character into garbage, but reversing a []rune and converting back keeps every character whole.
🎯 Your Turn
Fill in the blanks to print both the byte count and the character count of "café" — they differ because é is two bytes.
Reverse "go世界" the correct way. Convert to []rune, swap from both ends, then convert back with string(rs). If you tried this with bytes the CJK characters would shatter.
Practice quiz
A byte in Go is an alias for which type?
- int32
- rune
- uint8
- int8
Answer: uint8. byte is an alias for uint8 — one 8-bit value of the raw UTF-8 encoding.
A rune in Go is an alias for which type?
- int32
- uint8
- int64
- uint32
Answer: int32. rune is an alias for int32 and holds one full Unicode code point.
What does len("héllo") return, where é is 2 bytes in UTF-8?
- 4
- 5
- 7
- 6
Answer: 6. len counts bytes: h (1) + é (2) + l (1) + l (1) + o (1) = 6 bytes.
How do you count the number of characters (runes) in a string s?
- len(s)
- utf8.RuneCountInString(s)
- cap(s)
- s.Count()
Answer: utf8.RuneCountInString(s). utf8.RuneCountInString(s) (or len([]rune(s))) counts characters, not bytes.
What does utf8.RuneCountInString("café") return, where é is 2 bytes?
- 4
- 3
- 5
- 6
Answer: 4. There are 4 characters (c, a, f, é) even though it is 5 bytes.
Ranging over a string with for i, r := range s yields what for each iteration?
- A byte index and a byte
- A rune index and a rune
- A byte index and a rune
- A character index and a byte
Answer: A byte index and a rune. range is UTF-8 aware: it gives the starting byte index and the decoded rune.
Ranging over "Gé!", what byte index does the '!' character have?
- 1
- 3
- 2
- 4
Answer: 3. G is byte 0, é occupies bytes 1-2, so '!' starts at byte index 3.
Indexing a string with s[1] returns what kind of value?
- A one-character string
- A rune
- A pointer
- A single byte (uint8)
Answer: A single byte (uint8). Indexing a string always yields a single byte, not a character.
Why reverse a string by converting to []rune rather than []byte?
Answer: Reversing bytes would split multi-byte UTF-8 sequences; []rune keeps characters whole.
What does the conversion string(rune(65)) produce?
- "65"
- 65
- "A"
- An error
Answer: "A". Converting a rune (an integer code point) to a string yields the UTF-8 encoding of that code point, and code point 65 is the character "A".