Lesson 5

Base64 vs Hex

When to choose Base64 over hexadecimal encoding.

Hexadecimal (“hex”) maps each nibble (4 bits) to one of sixteen ASCII symbols, 0–9 and A–F (or a–f). Base64 packs six bits into each output character instead of four, so the encoded text is shorter in character count, though not necessarily smaller in bytes once a compression layer is applied.

Expansion factors (rough intuition)

Encoding	Bits per output symbol	Alphabet size
Hex	4	16
Base64	6	64

Ignoring framing, Base64 emits 4 characters per 3 raw bytes, an overhead of about 33%. Hex emits 2 characters per byte, an overhead of 100%. The Base64 overhead sounds bad until you realize hex always doubles the length.

Concrete feel: sixteen raw bytes ⇒ thirty-two hex digits vs twenty-four Base64 characters with padding (twenty-two if the two trailing "=" characters are stripped).
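The arithmetic above can be checked with a short Python sketch. The helper names `hex_len` and `base64_len` are hypothetical, introduced only for illustration; the padded Base64 length assumes the standard alphabet with "=" padding.

```python
import base64
import math

def hex_len(n_bytes: int) -> int:
    """Hex always uses two characters per byte."""
    return 2 * n_bytes

def base64_len(n_bytes: int) -> int:
    """Padded Base64 uses four characters per (possibly partial) 3-byte group."""
    return 4 * math.ceil(n_bytes / 3)

raw = bytes(range(16))  # sixteen arbitrary raw bytes

hex_text = raw.hex()
b64_padded = base64.b64encode(raw).decode("ascii")
b64_unpadded = b64_padded.rstrip("=")  # some protocols strip the padding

print(len(hex_text), len(b64_padded), len(b64_unpadded))  # 32 24 22
```

Whether the padding is kept depends on the protocol, so the unpadded figure only applies where the receiver is known to accept it.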

When hex shines

Prefer hex when:

Humans must eyeball-diff fingerprints (hashes are usually printed in hex)
You need a narrow, printer-safe character set and a trivial mapping (“each pair of hex chars = one byte”)
You are debugging quick scripts, where Base64 padding adds mental load
Comparisons must be case-insensitive (hex digits mean the same in either case, so normalizing to one case standardizes comparisons; Base64 is case-sensitive)

Git object hashes and the SHA-256 digests on download pages are printed in hex for ergonomic reasons, not out of cryptographic necessity.
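Two of the properties listed above, the pair-per-byte mapping and case-insensitivity, can be illustrated with a minimal Python sketch. The helper `same_fingerprint` is a hypothetical name used only for this example, and the sample strings are arbitrary.

```python
import base64

# Hex: letter case does not change the decoded bytes.
assert bytes.fromhex("DEADBEEF") == bytes.fromhex("deadbeef")

def same_fingerprint(a: str, b: str) -> bool:
    """Compare two hex fingerprints, ignoring case and surrounding whitespace."""
    return a.strip().lower() == b.strip().lower()

# Each pair of hex characters corresponds to exactly one byte.
fingerprint = "deadbeef"
pairs = [fingerprint[i:i + 2] for i in range(0, len(fingerprint), 2)]

# Base64: case is part of the alphabet, so changing it changes the bytes.
upper_case = base64.b64decode("QUJD")  # b"ABC"
lower_case = base64.b64decode("qujd")  # three different bytes
```

Lowercasing before comparison is therefore safe for hex and destructive for Base64.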

When Base64 shines

Prefer Base64 when:

You must embed binary data in text envelopes such as MIME or JSON
The channel passes through parsers that may drop or insert newlines, and the payload must still decode unambiguously (still watch wrapping)
You want fewer characters overall than long hex blobs produce in logs (a subjective readability trade-off)
The library ecosystem has already standardized on Base64 (JWT segments use the URL-safe variant without padding; PEM wraps Base64 payload blocks)
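When matching an ecosystem's existing Base64 convention, the exact variant matters. The sketch below contrasts Python's standard and URL-safe encoders on two arbitrary bytes chosen to hit the characters that differ; the helper `b64url_decode_unpadded` is a hypothetical name, and restoring padding before decoding is one common approach rather than the only one.

```python
import base64

raw = b"\xfb\xff"  # chosen so the output uses alphabet positions 62 and 63

standard = base64.b64encode(raw).decode("ascii")          # "+/8="
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # "-_8="
url_safe_unpadded = url_safe.rstrip("=")                  # "-_8"

def b64url_decode_unpadded(text: str) -> bytes:
    """Decode URL-safe Base64 whose '=' padding was stripped."""
    padding = "=" * (-len(text) % 4)  # restore to a multiple of four characters
    return base64.urlsafe_b64decode(text + padding)

assert b64url_decode_unpadded(url_safe_unpadded) == raw
```

A validator that expects one variant will generally reject or misread the other, so the variant belongs in the protocol documentation.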

Neither format compresses the data; the choice is about transport ergonomics.

Mixed pipelines

Danger pattern: a SHA-256 digest is shown as hex in the UI, a developer Base64-encodes the ASCII hex string rather than the raw digest bytes, and every checksum comparison then fails. Keep three layers distinct: the digest bytes, their printable representation, and the wire encoding.
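The danger pattern is easy to reproduce. The following Python sketch hashes an arbitrary sample input (`b"hello"`, chosen only for illustration) and compares the Base64 of the raw digest against the Base64 of its hex text.

```python
import base64
import hashlib

digest_bytes = hashlib.sha256(b"hello").digest()  # 32 raw bytes
hex_string = digest_bytes.hex()                   # 64 ASCII characters

# Correct: Base64 of the raw digest bytes.
correct = base64.b64encode(digest_bytes).decode("ascii")

# Wrong: Base64 of the ASCII hex text, which encodes 64 bytes instead of 32.
wrong = base64.b64encode(hex_string.encode("ascii")).decode("ascii")

print(len(correct), len(wrong))  # 44 88
print(correct == wrong)          # False
```

The doubled length is a quick diagnostic: a Base64 SHA-256 value of 88 characters instead of 44 suggests the hex text was encoded rather than the digest.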

Similarly, double application (hex then Base64) rarely helps: each layer only expands the data and adds no security, so it is justified only when a protocol explicitly requires it.

Example confusion (conceptual)

digest_bytes   = cryptographic output (opaque; always 32 bytes for SHA-256)
hex_string     = human view of digest_bytes ('a3f...')
base64_digest  = another view of SAME bytes (different alphabet)

Converting between formats must decode to identical octets.
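One way to honor that rule is to always pass through raw bytes when converting between string forms. A minimal Python sketch follows; `hex_to_base64` and `base64_to_hex` are hypothetical helper names, and the sample value is arbitrary.

```python
import base64

def hex_to_base64(hex_string: str) -> str:
    """Decode hex to raw bytes first, then re-encode those bytes as Base64."""
    return base64.b64encode(bytes.fromhex(hex_string)).decode("ascii")

def base64_to_hex(b64_string: str) -> str:
    """Decode Base64 to raw bytes first, then re-encode those bytes as hex."""
    return base64.b64decode(b64_string, validate=True).hex()

sample_hex = "00ff10"
sample_b64 = hex_to_base64(sample_hex)  # "AP8Q"

# Both string forms decode to the same octets.
assert bytes.fromhex(sample_hex) == base64.b64decode(sample_b64)
assert base64_to_hex(sample_b64) == sample_hex
```

Passing `validate=True` makes the decoder reject characters outside the Base64 alphabet instead of silently skipping them, which helps catch a hex string handed to the wrong function.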

Performance note

Computational cost is negligible for typical sizes (under a few MB) on modern CPUs. The choice of serialization should follow interoperability and clarity, not micro-benchmark folklore.

Key takeaway

Hex is wonderfully regular (2 chars / byte); Base64 is the denser option when the stack only carries printable ASCII. Encode the raw bytes your protocol intends, not an accidental textual intermediate, and document which canonical string form validators expect.
