Question 1

What is a Unicode code point?

Accepted Answer

A Unicode code point is the unique number assigned to each character in the Unicode standard, written as U+ followed by four to six hexadecimal digits (e.g., U+0041 for 'A', U+1F389 for '🎉'). Unicode defines over 140,000 characters covering the world's writing systems, symbols, and emoji.
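
As a rough illustration of the U+XXXX notation, the sketch below converts between characters and code points using Python's built-in `ord` and `chr`. The helper name `code_point_label` is made up for this example and is not part of the tool.

```python
def code_point_label(ch: str) -> str:
    """Return the U+XXXX label for a single character (hypothetical helper)."""
    # ord() gives the code point as an integer; pad to at least 4 hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))   # U+0041
print(code_point_label("世"))  # U+4E16
print(code_point_label("🎉"))  # U+1F389 (five digits: outside the 16-bit range)
print(chr(0x41))               # A (chr() goes the other way)
```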

Question 2

What is the difference between UTF-8 and UTF-16?

Accepted Answer

UTF-8 is a variable-width encoding that uses 1–4 bytes per character. ASCII characters take just 1 byte, which makes UTF-8 efficient for English text and the dominant encoding on the web. UTF-16 uses 2 bytes for characters in the Basic Multilingual Plane (U+0000 to U+FFFF) and 4 bytes (a surrogate pair) for everything above it; it is common in Windows and Java environments. This tool shows both encodings side by side.
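
To make the size difference concrete, here is a minimal sketch that encodes a few characters both ways with Python's standard `str.encode`. The function name `encoded_bytes` is an assumption for illustration, and big-endian UTF-16 without a byte order mark is chosen only to keep the output easy to read; the tool may display bytes differently.

```python
from typing import Tuple


def encoded_bytes(ch: str) -> Tuple[str, str]:
    """Return (UTF-8 bytes, UTF-16BE bytes) as space-separated hex strings."""
    utf8 = ch.encode("utf-8").hex(" ").upper()
    # "utf-16-be" avoids the byte order mark that plain "utf-16" would prepend.
    utf16 = ch.encode("utf-16-be").hex(" ").upper()
    return utf8, utf16


for ch in ["A", "世", "🎉"]:
    utf8, utf16 = encoded_bytes(ch)
    print(f"{ch}  UTF-8: {utf8:<12} UTF-16: {utf16}")
# A   takes 1 byte in UTF-8 (41)          and 2 in UTF-16 (00 41)
# 世  takes 3 bytes in UTF-8 (E4 B8 96)   and 2 in UTF-16 (4E 16)
# 🎉  takes 4 bytes in UTF-8 (F0 9F 8E 89) and 4 in UTF-16 (D8 3C DF 89)
```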

Question 3

Why do some emoji show as multiple rows?

Accepted Answer

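Many emoji are sequences of several Unicode code points that render as a single glyph. Family groups and similar combinations join individual emoji with Zero Width Joiner (ZWJ, U+200D) characters; flags are pairs of regional indicator symbols; skin-tone variations add a modifier code point after the base emoji. The inspector breaks these into individual code points so you can see exactly what makes up the sequence.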
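
The following sketch shows, under the assumption that one row corresponds to one code point, why a single visible emoji can produce several rows. It iterates over a Python string (which yields code points) and looks up names with the standard `unicodedata` module. The helper `code_points` is hypothetical, and the names come from Python's bundled Unicode database rather than from the tool.

```python
import unicodedata
from typing import List, Tuple


def code_points(text: str) -> List[Tuple[str, str]]:
    """List (U+XXXX label, Unicode name) for every code point in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man, ZWJ, woman, ZWJ, girl
flag = "\U0001F1EF\U0001F1F5"                          # regional indicators J + P
thumbs = "\U0001F44D\U0001F3FD"                        # thumbs up + skin-tone modifier

for label, sample in [("family", family), ("flag", flag), ("thumbs", thumbs)]:
    print(label, sample)
    for cp, name in code_points(sample):
        print(f"  {cp}  {name}")
```

The family sequence yields five rows (three emoji and two joiners), while the flag and the skin-tone example yield two rows each and contain no joiner at all.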

Question 4

What are the escape output options?

Accepted Answer

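JavaScript escaping converts non-ASCII characters to \uXXXX sequences that are safe to use in JS string literals; a character above U+FFFF needs two such escapes (a surrogate pair) or the \u{XXXXX} form. URL encoding converts each byte of a character's UTF-8 encoding to %XX percent-encoded format for use in URLs. HTML entity encoding converts characters to &#NNN; numeric entities, where NNN is the decimal code point, for safe use in HTML documents.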
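
The three escape styles can be approximated as follows. This is a sketch under stated assumptions, not the tool's implementation: the function names are invented here, ASCII characters are passed through untouched (so quotes, backslashes, `<` and `&` are not handled), and characters above U+FFFF are written as surrogate pairs rather than `\u{...}`. URL encoding relies on the standard library's `urllib.parse.quote`.

```python
from urllib.parse import quote


def js_escape(text: str) -> str:
    """Replace non-ASCII characters with \\uXXXX escapes (ASCII left as-is)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            # Split into a UTF-16 surrogate pair: high half, then low half.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


def url_escape(text: str) -> str:
    """Percent-encode the UTF-8 bytes of text; safe='' also encodes '/'."""
    return quote(text, safe="")


def html_escape(text: str) -> str:
    """Replace non-ASCII characters with decimal numeric entities."""
    return "".join(ch if ord(ch) < 0x80 else f"&#{ord(ch)};" for ch in text)


sample = "世🎉"
print(js_escape(sample))    # \u4E16\uD83C\uDF89
print(url_escape(sample))   # %E4%B8%96%F0%9F%8E%89
print(html_escape(sample))  # &#19990;&#127881;
```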

Char	Category	Name
H	Letter	LATIN CAPITAL LETTER H
e	Letter	LATIN SMALL LETTER E
l	Letter	LATIN SMALL LETTER L
l	Letter	LATIN SMALL LETTER L
o	Letter	LATIN SMALL LETTER O
,	Punctuation	COMMA
	Space	SPACE
世	Other	U+4E16
界	Other	U+754C
!	Punctuation	EXCLAMATION MARK
	Space	SPACE
🎉	Emoji	U+1F389

Unicode / UTF-8 Inspector
