🔄 Unicode Converter
Convert between characters and Unicode code points, HTML entities, CSS/JS escapes, Python/Java literals, and UTF-8/16/32 byte sequences.
Type or paste any character, word, or phrase into the input field. The tool accepts raw text, Unicode escape sequences (U+0041), HTML entities (&amp;), or hexadecimal code points directly.
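As an illustration of how the accepted input forms might be reduced to plain characters, here is a minimal Python sketch. The tool's real parsing rules are not documented here, so the helper name `normalize_input` and the choice to recognize only U+ notation and HTML entities are assumptions made for this example.

```python
import html
import re

# Matches U+ notation with 4 to 6 hex digits, e.g. U+0041 or U+1F600.
_U_PLUS = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def normalize_input(text: str) -> str:
    """Hypothetical helper: turn U+XXXX notation and HTML entities into characters."""

    def from_code_point(match: re.Match) -> str:
        value = int(match.group(1), 16)
        # Leave out-of-range values and surrogate code points untouched.
        if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            return match.group(0)
        return chr(value)

    # Replace U+ notation first, then decode named and numeric HTML entities.
    text = _U_PLUS.sub(from_code_point, text)
    return html.unescape(text)
```

With this sketch, `normalize_input("U+0041")` and `normalize_input("&#x41;")` both yield `A`, while ordinary text passes through unchanged.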
Choose from 17 encoding formats including UTF-8 hex bytes, UTF-16 code units, UTF-32, decimal NCR, hexadecimal NCR, HTML named entity, CSS escape, JavaScript escape, Python literal, URL percent-encoding, Base64, binary, octal, and more.
Click any output field to copy the converted representation. Use the "Copy All" button to export every encoding at once in a structured format suitable for documentation or code.
Unicode conversion bridges the gap between the abstract identity of a character and its concrete binary representation in various computing contexts. The Unicode Standard assigns a unique code point to every character across the world's writing systems, symbols, and emoji: nearly 155,000 characters in Unicode 16.0, spanning 168 scripts. Understanding how these code points translate into bytes, escape sequences, and markup forms is fundamental to building software that handles multilingual text correctly.
Different encoding formats serve different purposes. UTF-8 is the dominant encoding for the web and file storage, using 1–4 bytes per code point while remaining backward-compatible with ASCII. UTF-16 is used internally by JavaScript, Java, and Windows; it represents characters in the Basic Multilingual Plane with a single 2-byte code unit and requires a surrogate pair (two code units) for supplementary characters. HTML numeric character references (&#decimal; or &#xhex;) and named entities (&amp;, &copy;) embed characters safely in markup. CSS escapes (\000041) and JavaScript \uXXXX and \u{XXXXX} sequences embed characters in stylesheets and scripts respectively.
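The formats described above can be derived from a single code point with a few lines of Python. This is a rough sketch rather than the tool's implementation; the function name `representations` and the dictionary keys are invented for the example, and it covers only a subset of the formats.

```python
import urllib.parse


def representations(ch: str) -> dict[str, str]:
    """Return several textual encodings of one character (a single code point)."""
    cp = ord(ch)  # raises TypeError if ch is not exactly one code point
    utf16 = ch.encode("utf-16-be")
    # Split the UTF-16 bytes into 2-byte code units, e.g. ["D83D", "DE00"].
    units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
    return {
        "code_point": f"U+{cp:04X}",
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        "utf16": " ".join(units),
        "utf32": f"{cp:08X}",
        "html_decimal": f"&#{cp};",
        "html_hex": f"&#x{cp:X};",
        "css": f"\\{cp:06X}",
        "js": "".join(f"\\u{unit}" for unit in units),
        "js_code_point": f"\\u{{{cp:X}}}",
        "url": urllib.parse.quote(ch),
    }
```

For U+1F600 (😀) this gives the UTF-8 bytes `F0 9F 98 80`, the UTF-16 surrogate pair `D83D DE00`, and the URL form `%F0%9F%98%80`, which shows how one supplementary character expands differently in each format.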
Practical knowledge of Unicode conversion prevents common bugs: mojibake occurs when a byte sequence encoded in one format is decoded as another; incorrect URL encoding causes broken links; truncating a UTF-8 string at an arbitrary byte offset, rather than at a code point boundary, can split a multi-byte sequence and corrupt the final character. Tools that expose all 17 encoding representations simultaneously allow developers to audit every layer of a character's representation at once, making it easier to trace encoding errors from HTML source through JavaScript runtime to database storage.
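Two of these bugs are easy to reproduce in Python. The sketch below is only illustrative: `truncate_utf8` is a hypothetical helper, and it assumes a non-negative byte limit and input that is a valid Python string.

```python
text = "café"
data = text.encode("utf-8")  # 5 bytes: the "é" takes two bytes (C3 A9)

# Mojibake: UTF-8 bytes decoded with the wrong codec (Latin-1 here).
mojibake = data.decode("latin-1")  # "cafÃ©"


def truncate_utf8(text: str, max_bytes: int) -> bytes:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    # Slicing may leave an incomplete sequence at the end; errors="ignore"
    # drops it, so the result is always valid UTF-8.
    return data[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
```

A naive `data[:4]` ends in the lone lead byte `C3` and fails to decode strictly, whereas `truncate_utf8("café", 4)` backs off to the three bytes of `caf`.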