Unicode

Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. As of version 15.0, the standard, which is maintained by the Unicode Consortium, defines 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in color), and non-visual control and formatting codes.

Alias: Universal Coded Character Set
Caption: Logo of the Unicode Consortium; The Arabic; The Devanāgarī -ligature of JanaSanskritSans; ligature
Cs1Dates: y
Date: April 2010; May 2019
Has abstract: Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. As of version 15.0, the standard, which is maintained by the Unicode Consortium, defines 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in color), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code identical with the other. The Unicode Standard, however, includes more than just the base code. Alongside the character encodings, the Consortium's official publication includes a wide variety of details about the scripts and how to display them: normalization rules, decomposition, collation, rendering, and bidirectional text display order for multilingual texts. The Standard also includes reference data files and visual charts to help developers and designers correctly implement the repertoire. Unicode can be stored using several different encodings, which translate the character codes into sequences of bytes. The Unicode Standard defines three encodings (UTF-8, UTF-16, and UTF-32), and several other encodings exist. UTF-8 and UTF-16 are variable-length, while UTF-32 uses a fixed four bytes per code point. The most common encodings are the ASCII-compatible UTF-8, the ASCII-incompatible UTF-16 (compatible with the obsolete UCS-2), and GB 18030, the Chinese encoding standard, which is not an official Unicode standard but is used in China and implements Unicode fully.
Hypernym: Industry
Image: JanaSanskritSans ddhrya.svg; 23
Is primary topic of: Unicode
Label: Unicode
Lang: International
Link from a Wikipage to an external page: home.unicode.org/; doi.org/10.36824/2018-graf-hara1; www.alanwood.net/unicode/; www.unicode.org/reports/tr44/; www.unicode.org/versions/latest/; www.unicode.org/versions/Unicode6.0.0/; unicode.org/main.html; www.unicode.org/main.html; scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UnicodeBMPFallbackFont; www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt; www.worldswritingsystems.org
Link from a Wikipage to another Wikipage: .NET Framework; 16-bit computing; 32-bit computing; Abugida; Acute accent; Adobe Inc.; Allograph; Alphabet; Alphabetic Presentation Forms; Apple Advanced Typography; Apple Inc.; April Fools' Day RFC; Arabic script in Unicode; Arrows (Unicode block); ASCII; ATSUI; Base64; Basic Latin (Unicode block); Bidirectional text; Bidirectional Text; Binary Ordered Compression for Unicode; Block (Unicode); Block Elements; Box Drawing; Byte; Byte order mark; C0 and C1 control codes; Cangjie method; Canonical equivalence; Category:Character encoding; Category:Digital typography; Category:Unicode; Character (computing); Character encoding; Character property (Unicode); Character set; Charis SIL; Chinese characters; CJK characters; CJK Unified Ideographs; Cocoa text system; Code; Code page; Code point; Combining character; Combining diacritical mark; Comparison of Unicode encodings; ConScript Unicode Registry; Core Text; COVID-19 pandemic; Currency Symbols (Unicode block); Cyrillic (Unicode block); Dave Opstad; Devanagari; Diminishing returns; DIN 91379; DirectWrite; Domain Name System; Dot (diacritic); Dot above; Duplicate characters in Unicode; E; East Asian language; EBCDIC; Egyptian hieroglyph; Egyptian hieroglyphs; Email; Emoji; Endianness; EUC-JP; European Union; Extended ASCII; Facebook, Inc.; File:Cyrillic cursive.svg; File:Hiero O4.png; File:Unicode sample.png; Font; Font substitution; FreeBSD; FTP; GB 18030; General Punctuation; Geometric Shapes; Glyph; Gmail; GNOME; GNU Compiler Collection; Google; Grapheme; Graphite (SIL); Greek and Coptic; Greek Extended; GTK+; Halfwidth and Fullwidth Forms (Unicode block); Hangul; Hangul Jamo; Han unification; Hexadecimal; High-level programming language; Homoglyphs; HTML; HTTP; IBM; Ideographic Description Sequences; Ideographic Research Group; IDNA; IEC 14755; IEC 6429; IEC 8859; IEC 8859-1; IETF; Indic script; Indo-Aryan languages; Information technology; Injective; Input method; 
International Components for Unicode; Internationalization and localization; Internationalized Domain Names; International Organization for Standardization; Internet Explorer; IPA Extensions; ISCII; ISO-2022; ISO 8859-1; Java virtual machine; Joe Becker (Unicode); Jurchen script; Kanji; KDE; Ken Lunde; Khitan small script; Klingon scripts; Last Resort font; Latin-1 Supplement (Unicode block); Latin character; Latin Extended-A; Latin Extended Additional; Latin Extended-B; Leading zero; Lee Collins (software engineer); Letterlike Symbols; Ligature (typography); Linux distributions; List of binary codes; List of typefaces; List of Unicode characters; List of Unicode fonts; List of XML and HTML character entity references; Lithuanian language; Lotus Multi-Byte Character Set; MacOS; Macron (diacritic); Mark Davis (Unicode); Mathematical Operators; Maya script; Medieval Unicode Font Initiative; Michael Everson; Microsoft; Microsoft Layer for Unicode; Microsoft Windows; MIME; Ministry of Endowments and Religious Affairs (Oman); Miscellaneous Symbols; Miscellaneous Technical; Mojibake; Multilingualism; Musical notation; Netflix; Newline; NeXT; NLP (computer science); Number; Number Forms; Ogonek; Open-source Unicode typefaces; OpenType; Operating system; Outlook.com; Pango; Parody; People's Republic of China; Percent encoding; Plan 9 from Bell Labs; Precomposed character; Private Use Area (Unicode block); Private Use Areas; Programming language; Proof of concept; Punycode; Python (programming language); Quoted-printable; Radical (Chinese character); Radical (Chinese characters); Religious and political symbols in Unicode; Replacement character; Research Libraries Group; Romanization; Rongorongo; Roozbeh Pournader; Round-trip format conversion; SAP SE; SC 2; Script (Unicode); Seed7; Shape context; Shift-JIS; Sic; SIL International; Software; Spacing Modifier Letters; Specials (Unicode block); Standard Compression Scheme for Unicode; Standardization Administration of China; 
Standards related to Unicode; Sun Microsystems; Superscripts and Subscripts; Syllabary; Tamil script; Tatsuo Kobayashi; Technical standard; Tengwar; Thai alphabet; Tibetan script; TIS-620; TRON (encoding); TrueType; Typeface; Unicode alias names and abbreviations; Unicode collation algorithm; Unicode Consortium; Unicode equivalence; Unicode fallback font; Unicode symbols; Uniform Resource Identifier; Uniscribe; Universal Coded Character Set; University of California, Berkeley; University of Cambridge; University of Edinburgh; Unix-like; URL; UTF-1; UTF-16; UTF-18; UTF-32; UTF-5; UTF-6; UTF-7; UTF-8; UTF-9; UTF-EBCDIC; Variable-width encoding; Variation Selectors; W3C; Web browser; Web Open Font Format; WGL-4; Wide character; Windows 10; Windows 11; Windows-1252; Windows 2000; Windows 7; Windows 8; Windows 9x; Windows NT; Windows NT 4.0; Windows Vista; Windows XP; Wireless; WOFF2; Word processor; World Wide Web; Writing system; Wubi method; Xerox; Xerox Character Code Standard; XHTML; XML; Yahoo! Mail; ʼPhags-pa script
M: Unicode
Mw: no
N: no
Name: Unicode
Prev: ISO/IEC 8859, various others
Q: no
Reason: "and, contains" and meaning of statement
S: no
SameAs: 4343497-6; 52xEf; Junikod; m.07s w; m.07x89; Mx4rv6pxRpwpEbGdrcN5Y29ycA; Q8819; Unicode; Unicòde; Unicôde; Unicodex; Unikod; Unikodas; Unikodo; Unikods; Уникод; Унікод; Юникод; Юнікод; Յունիկոդ; יוניקאד; יוניקוד; الترميز الموحد; یونی‌کد; یونیکوڈ; یوونیکۆد; युनिकोड; यूनिकोड; ইউনিকোড; ਯੂਨੀਕੋਡ; યુનિકોડ; ஒருங்குறி; యూనికోడ్; ಯುನಿಕೋಡ್; യൂണികോഡ്; යුනිකෝඩ්; ยูนิโคด; ယူနီကုဒ်; უნიკოდი; ዩኒኮድ; 유니코드
SeeAlso: Unicode normalization; Universal Character Set characters; UTF-8
Species: no
Standard: Unicode Standard
Subject: Category:Character encoding; Category:Digital typography; Category:Unicode
TotalWidth: 300
V: no
Voy: no
WasDerivedFrom: Unicode?oldid=1124795833&ns=0
WikiPageLength: 77883
Wikipage page ID: 31742
Wikipage revision ID: 1124795833
WikiPageUsesTemplate: Template:Abbr.; Template:Anchor; Template:As of; Template:Authority control; Template:Better source needed; Template:Char; Template:Character encoding; Template:Citation needed; Template:Cite book; Template:Clarify; Template:Cn; Template:Contains special characters; Template:DMOZ; Template:Em; Template:General Category (Unicode); Template:IAST; Template:IETF RFC; Template:Infobox character encoding; Template:IPA-th; Template:ISBN; Template:Main; Template:Middot; Template:Mono; Template:Multiple image; Template:Notelist; Template:Official website; Template:Quote; Template:Refbegin; Template:Refend; Template:Reflist; Template:Refn; Template:Sc2; Template:See also; Template:Short description; Template:Sister project links; Template:Snd; Template:Tt; Template:Typo; Template:Ubl; Template:Unichar; Template:Unicode navigation; Template:Unicode version history; Template:Use dmy dates; Template:Use Oxford spelling; Template:Wiktth
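
The abstract above notes that Unicode can be stored using several encodings, which translate character codes into sequences of bytes. A minimal sketch of that idea in Python follows, using only the built-in `str.encode` method with the standard codec names `utf-8`, `utf-16-le`, and `utf-32-le`. The helper name `encoded_lengths` and the sample characters are illustrative assumptions, not part of the source.

```python
# Sketch: how many bytes one code point occupies in each encoding form.
# The "-le" codec variants are used so no byte order mark is prepended.

ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_lengths(text):
    """Return a dict mapping each encoding name to the encoded byte count."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


# Illustrative samples: Latin A, e with acute, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))

# UTF-8 is ASCII-compatible: an ASCII character encodes to the same byte.
print("A".encode("utf-8") == "A".encode("ascii"))

# UTF-16 is not: the same character takes two bytes.
print("A".encode("utf-16-le"))
```

In this sketch the UTF-8 length grows from one to four bytes across the samples, UTF-16 uses two bytes for the first three and four bytes (a surrogate pair) for the emoji, and UTF-32 always uses four bytes.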
