Lesson 7

String Escaping and Unicode

Escape sequences, Unicode code points, and special characters in JSON strings.

JSON strings are always delimited by double quotes. Any character that would break the string—or be ambiguous—must be escaped with a backslash.

Common escape sequences

Sequence    Meaning
\"          Double quote inside a string
\\          Literal backslash
\n          Newline
\t          Tab
\r          Carriage return
\b          Backspace
\f          Form feed

Example:

{
  "message": "Line one\nLine two",
  "path": "C:\\Users\\dev\\config.json"
}
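
To see how a parser interprets these escapes, here is a minimal sketch using Python's standard json module. Python is an assumption made for illustration (the lesson itself is language-neutral), and the variable names are hypothetical.

```python
import json

# The JSON text from the example, kept in a raw string so that Python
# itself does not interpret the backslashes before the parser sees them.
raw = r'''
{
  "message": "Line one\nLine two",
  "path": "C:\\Users\\dev\\config.json"
}
'''

data = json.loads(raw)

# \n in the JSON text becomes one real newline character in memory.
lines = data["message"].split("\n")

# Each \\ in the JSON text becomes a single backslash in memory.
path = data["path"]

print(lines)
print(path)
```

The parsed path contains three backslashes, not six: the doubling exists only in the JSON text.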

Unicode escapes

Use \uXXXX (exactly four hexadecimal digits, upper or lower case) to write a character in the Basic Multilingual Plane by its code point:

{
  "greeting": "Hello, \u4e16\u754c"
}

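Characters outside the Basic Multilingual Plane (above U+FFFF), such as most emoji, do not fit in four hex digits. In escaped form they are written as a UTF-16 surrogate pair, that is, two consecutive \u escapes: U+1F600 becomes \ud83d\ude00. In a UTF-8 file you can also write the character literally, and parsers accept either form.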
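
As a rough illustration of surrogate pairs, the sketch below decodes and re-encodes one emoji with Python's standard json module. The choice of Python and of U+1F600 as the sample character are assumptions made for the example.

```python
import json

# U+1F600 (grinning face) lies outside the Basic Multilingual Plane,
# so its escaped form needs a high surrogate followed by a low surrogate.
escaped = r'"\ud83d\ude00"'

decoded = json.loads(escaped)

# The parser joins the two surrogates into a single code point.
code_point = ord(decoded)

# With the default ensure_ascii=True, json.dumps writes the pair back out.
reencoded = json.dumps(decoded)

print(hex(code_point))
print(reencoded)
```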

Characters you cannot put raw in strings

Control characters (U+0000 through U+001F) must be escaped. Unescaped line breaks inside strings are invalid JSON—use \n instead.
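
The rule can be checked with a short sketch. It assumes Python's standard json module, whose default strict mode rejects raw control characters inside strings; the variable names are illustrative.

```python
import json

# A JSON document with a raw line break (U+000A) inside the string value.
# In this normal (non-raw) Python literal, \n is a real newline character.
bad = '{"note": "first line\nsecond line"}'

try:
    json.loads(bad)
    rejected = False
except json.JSONDecodeError:
    rejected = True

# Serializing from Python produces the escaped form automatically.
good = json.dumps({"note": "first line\nsecond line"})

print(rejected)
print(good)
```

Generating JSON with a serializer instead of string concatenation avoids this class of error, because the serializer escapes control characters for you.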

UTF-8 files vs escaped Unicode

A .json file saved as UTF-8 may contain literal Chinese or emoji characters:

{ "label": "世界" }

That is valid JSON. The escaped form \u4e16\u754c parses to exactly the same string, so choose whichever keeps your pipeline and diff tools happiest.
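
A small sketch of the two forms, assuming Python's standard json module. Its ensure_ascii flag is one way to pick which form a serializer writes; other libraries expose similar options under different names.

```python
import json

literal = '{ "label": "世界" }'
escaped = r'{ "label": "\u4e16\u754c" }'

# Both documents parse to the same in-memory value.
same = json.loads(literal) == json.loads(escaped)

# ensure_ascii controls which form json.dumps writes.
as_escapes = json.dumps({"label": "世界"})                      # ASCII-only output
as_literal = json.dumps({"label": "世界"}, ensure_ascii=False)  # keeps the characters

print(same)
print(as_escapes)
print(as_literal)
```

ASCII-only output is safer for tools that mishandle encodings, while literal characters keep files and diffs readable.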

Practical tips

  • When copying strings from logs, watch for smart (curly) quotes such as “ ”. They are not valid JSON delimiters.
  • API docs often show \n in examples; your parser converts them to real newlines in memory.
  • If validation fails inside a long string, search for unescaped backslashes or broken \u sequences.
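
One way to act on these tips is to let the parser report where it stopped. The sketch below assumes Python's standard json module; locate_json_error is a hypothetical helper written for this example, not part of any library.

```python
import json

def locate_json_error(text):
    """Return (line, column, message) for the first syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None

# A single backslash before "U" is not a valid escape sequence.
broken_path = r'{"path": "C:\Users\dev"}'

# Curly quotes copied from a log are not string delimiters.
smart_quotes = '{“name”: “dev”}'

# The same path with every backslash doubled parses cleanly.
valid = r'{"path": "C:\\Users\\dev"}'

for sample in (broken_path, smart_quotes, valid):
    print(locate_json_error(sample))
```

The reported line and column point at or near the offending character, which is usually faster than scanning a long string by eye.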
