Monday, August 2, 2010

Unicode escape formats

For my personal reference only. The original article can be found here.


The following are ASCII representations of Unicode characters known to be used in
various contexts. In a few cases we also include unusual representations of integers
since integers are sometimes converted to characters.
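
Most of the examples appear to use the character é (U+00E9, decimal 233). As a rough orientation, the sketch below shows how that one code point maps onto the hex, decimal, octal, and UTF-8 byte values that the various notations embed. The variable names are illustrative only, and the choice of é as the sample character is an assumption read off the example values.

```python
# Assumption: the sample character is U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
ch = "\u00e9"
cp = ord(ch)  # the code point as a plain integer

numeric_forms = {
    "hex": format(cp, "04X"),                      # zero-padded to 4 hex digits
    "decimal": str(cp),
    "octal": format(cp, "o"),
    "utf8_hex": ch.encode("utf-8").hex().upper(),  # the two UTF-8 bytes, in hex
}

# Integers are sometimes converted to characters: chr() is the inverse of ord().
round_trip = chr(cp)

print(numeric_forms)
```

The hex value is what most of the prefix/suffix notations wrap, the decimal value is what the decimal notations wrap, and the UTF-8 bytes are what the byte-oriented notations spell out one byte at a time.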



Description | Example | Usage
raw hex | 00E9 | Unicode Consortium file NamesList.txt
prefix 0x hex | 0x00E9 | Yudit keymap source files
prefix v decimal | v233 | Perl
prefix $ hex | $00E9 | Alex Eulenberg Mac OS X keyboard map source files
prefix # with suffix # hex | #00E9# | Mimer SQL
prefix #$ hex | #$00E9 | some Pascals including Delphi
prefix 16# hex | 16#00E9 | PostScript
prefix #x hex | #x00E9 | some Common Lisp implementations
prefix #16r hex | #16r00E9 | Common Lisp integer
prefix backslash hex [4] | \00E9 | Oracle in unistr function
prefix backslash-u decimal | \u0233 | Rich Text Format
prefix backslash-u hex [4] | \u00E9 | Java, Ruby
prefix backslash-u left brace hex [variable] right brace | \u{E9}, \u{000E9}, multiple codepoints: \u{E9 74 E9} | Ruby
prefix backslash-U hex [8] outside BMP, prefix backslash-u hex [4] within BMP | \u00E9 | C#, D, Python, Scheme, Tcl
prefix U hex | U00E9 |
prefix u hex | u00E9 |
prefix u HEX [5-6] | u100E9 | Fontlab Studio outside BMP
prefix %u hex | %u00E9 |
prefix U+ hex | U+00E9 | Unicode Consortium documents
prefix uni HEX [4] | uni00E9 | Fontlab Studio within BMP
prefix X with hex in single quotes | X'00E9' | some IBM documentation
prefix X with hex in double quotes with optional type postfix character c[har], d[char], w[char] | X"00E9"d | D
prefix 16# and suffix # hex | 16#00E9# | Ada
prefix U in angle brackets hex | <U00E9> | POSIX locale specifications
prefix backslash-x hex | \x00E9 | C wide string, Tcl integer
prefix backslash-x hex in braces | \x{00E9} | Perl
prefix backslash {U+ hex with suffix } | \{U+E9} | BitC
prefix &# with suffix ; decimal | &#0233; | HTML, XML, XHTML
prefix &#x hex with suffix ; | &#x00E9; | HTML, XML, XHTML
prefix &# decimal | &#0233 | SGML, HTML (deprecated)
prefix &#x hex | &#x00E9 | SGML, HTML (deprecated)
prefix backslash-# decimal | \#0233; | SGML
prefix backslash-#x hex | \#x00E9; | SGML
prefix _x and suffix _ hex | _x00E9_ | OOXML, SQL/XML
3 low bytes each with backslash prefix in big-endian order octal | \000\000\351 |
3 low bytes each with backslash-x prefix in big-endian order hex | \x00\x00\xE9 |
3 low bytes each with backslash-d prefix in big-endian order decimal | \d000\d000\d233 | POSIX locale specifications
prefix " hex [4] (UTF-16, use surrogate pairs beyond BMP) | "00E9 | XeTeX
hex UTF-8 with each byte's hex preceded by an =-sign | =C3=A9 | RFC 2045 Quoted Printable
hex UTF-8 with each byte's hex preceded by a %-sign | %C3%A9 | RFC 2396 URI escape format
hex UTF-8 with each byte's hex preceded by a backslash-x | \xC3\xA9 | Apache log format
hex UTF-8 with each byte's hex surrounded by angle brackets | <C3><A9> | print format for uninterpreted bytes used by various programs
octal UTF-8 with backslash prefixes | \303\251 | print format for uninterpreted bytes used by various programs
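
A handful of these notations can be generated mechanically from a single character. The following is a minimal Python sketch, not a reference implementation: the function name `escape_forms` and the dictionary keys are hypothetical, and the byte-oriented forms escape every UTF-8 byte, whereas real quoted-printable or URI encoders usually leave plain ASCII untouched.

```python
def escape_forms(ch: str) -> dict:
    """Return a few of the listed notations for one character.

    Hypothetical helper for illustration; it covers only a small subset
    of the formats and escapes every UTF-8 byte unconditionally.
    """
    cp = ord(ch)               # code point as an integer
    utf8 = ch.encode("utf-8")  # UTF-8 byte sequence

    return {
        # Code-point based notations
        "U+ hex": f"U+{cp:04X}",
        "backslash-u/U": f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}",
        "HTML decimal": f"&#{cp:04d};",
        "HTML hex": f"&#x{cp:04X};",
        "backslash-x braces": f"\\x{{{cp:04X}}}",
        # UTF-8 byte based notations
        "quoted-printable": "".join(f"={b:02X}" for b in utf8),
        "URI percent": "".join(f"%{b:02X}" for b in utf8),
        "backslash-x bytes": "".join(f"\\x{b:02X}" for b in utf8),
        "octal bytes": "".join(f"\\{b:03o}" for b in utf8),
    }


forms = escape_forms("\u00e9")
for name, text in forms.items():
    print(f"{name:20} {text}")
```

Splitting the work into "format the code point" and "format each UTF-8 byte" mirrors the two families of notation: those that wrap the code point number, and those that spell out the encoded bytes.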
