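034_Unicode Standard
1. The Unicode Standard
1.1. Character sets such as ASCII, the ISO character sets, and GBK are all limited in capacity and do not work across multilingual environments, so the Unicode Consortium developed the Unicode Standard.
1.2. The Unicode Standard aims to cover the characters, punctuation marks, and symbols of all the world's writing systems.
1.3. Regardless of platform, program, or language, Unicode allows text data to be processed, stored, and exchanged.
1.4. The Unicode Standard has been widely adopted: it is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and it is supported by many operating systems and by all modern browsers.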
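As a minimal sketch of what storing and exchanging text independently of platform looks like in practice, the Python snippet below lists the code points of a short string and round-trips it through UTF-8. Python and the sample string are assumptions chosen for illustration; the original names neither.

```python
# Illustrative sample text (an arbitrary choice, not taken from the original):
# one Latin letter, one CJK ideograph, one emoji.
text = "A中😀"

# Every character corresponds to one Unicode code point, written U+XXXX.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)  # ['U+0041', 'U+4E2D', 'U+1F600']

# Encoding converts code points to bytes for storage or exchange.
encoded = text.encode("utf-8")
print(len(encoded))  # 8 bytes: 1 + 3 + 4

# Decoding with the same encoding recovers the original text.
decoded = encoded.decode("utf-8")
print(decoded == text)  # True
```

The three characters take 1, 3, and 4 bytes in UTF-8, yet any system that decodes the bytes as UTF-8 gets back the same code points.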
2. Unicode characters
2.1. Website: https://home.unicode.org/
2.2. Character blocks charted in Unicode Standard version 13
| No. | Block name | Range |
| --- | --- | --- |
| 1 | C0 Controls and Basic Latin | 0000–007F |
| 2 | C1 Controls and Latin-1 Supplement | 0080–00FF |
| 3 | Latin Extended-A | 0100–017F |
| 4 | Latin Extended-B | 0180–024F |
| 5 | IPA Extensions | 0250–02AF |
| 6 | Spacing Modifier Letters | 02B0–02FF |
| 7 | Combining Diacritical Marks | 0300–036F |
| 8 | Greek and Coptic | 0370–03FF |
| 9 | Cyrillic | 0400–04FF |
| 10 | Cyrillic Supplement | 0500–052F |
| 11 | Armenian | 0530–058F |
| 12 | Hebrew | 0590–05FF |
| 13 | Arabic | 0600–06FF |
| 14 | Syriac | 0700–074F |
| 15 | Arabic Supplement | 0750–077F |
| 16 | Thaana | 0780–07BF |
| 17 | N'Ko | 07C0–07FF |
| 18 | Samaritan | 0800–083F |
| 19 | Mandaic | 0840–085F |
| 20 | Syriac Supplement | 0860–086F |
| 21 | Arabic Extended-A | 08A0–08FF |
| 22 | Devanagari | 0900–097F |
| 23 | Bengali | 0980–09FF |
| 24 | Gurmukhi | 0A00–0A7F |
| 25 | Gujarati | 0A80–0AFF |
| 26 | Oriya | 0B00–0B7F |
| 27 | Tamil | 0B80–0BFF |
| 28 | Telugu | 0C00–0C7F |
| 29 | Kannada | 0C80–0CFF |
| 30 | Malayalam | 0D00–0D7F |
| 31 | Sinhala | 0D80–0DFF |
| 32 | Thai | 0E00–0E7F |
| 33 | Lao | 0E80–0EFF |
| 34 | Tibetan | 0F00–0FFF |
| 35 | Myanmar | 1000–109F |
| 36 | Georgian | 10A0–10FF |
| 37 | Hangul Jamo | 1100–11FF |
| 38 | Ethiopic | 1200–137F |
| 39 | Ethiopic Supplement | 1380–139F |
| 40 | Cherokee | 13A0–13FF |
| 41 | Unified Canadian Aboriginal Syllabics | 1400–167F |
| 42 | Ogham | 1680–169F |
| 43 | Runic | 16A0–16FF |
| 44 | Tagalog | 1700–171F |
| 45 | Hanunoo | 1720–173F |
| 46 | Buhid | 1740–175F |
| 47 | Tagbanwa | 1760–177F |
| 48 | Khmer | 1780–17FF |
| 49 | Mongolian | 1800–18AF |
| 50 | Unified Canadian Aboriginal Syllabics Extended | 18B0–18FF |
| 51 | Limbu | 1900–194F |
| 52 | Tai Le | 1950–197F |
| 53 | New Tai Lue | 1980–19DF |
| 54 | Khmer Symbols | 19E0–19FF |
| 55 | Buginese | 1A00–1A1F |
| 56 | Tai Tham | 1A20–1AAF |
| 57 | Combining Diacritical Marks Extended | 1AB0–1AFF |
| 58 | Balinese | 1B00–1B7F |
| 59 | Sundanese | 1B80–1BBF |
| 60 | Batak | 1BC0–1BFF |
| 61 | Lepcha | 1C00–1C4F |
| 62 | Ol Chiki | 1C50–1C7F |
| 63 | Cyrillic Extended-C | 1C80–1C8F |
| 64 | Georgian Extended | 1C90–1CBF |
| 65 | Sundanese Supplement | 1CC0–1CCF |
| 66 | Vedic Extensions | 1CD0–1CFF |
| 67 | Phonetic Extensions | 1D00–1D7F |
| 68 | Phonetic Extensions Supplement | 1D80–1DBF |
| 69 | Combining Diacritical Marks Supplement | 1DC0–1DFF |
| 70 | Latin Extended Additional | 1E00–1EFF |
| 71 | Greek Extended | 1F00–1FFF |
| 72 | General Punctuation | 2000–206F |
| 73 | Superscripts and Subscripts | 2070–209F |
| 74 | Currency Symbols | 20A0–20CF |
| 75 | Combining Diacritical Marks for Symbols | 20D0–20FF |
| 76 | Letterlike Symbols | 2100–214F |
| 77 | Number Forms | 2150–218F |
| 78 | Arrows | 2190–21FF |
| 79 | Mathematical Operators | 2200–22FF |
| 80 | Miscellaneous Technical | 2300–23FF |
| 81 | Control Pictures | 2400–243F |
| 82 | Optical Character Recognition | 2440–245F |
| 83 | Enclosed Alphanumerics | 2460–24FF |
| 84 | Box Drawing | 2500–257F |
| 85 | Block Elements | 2580–259F |
| 86 | Geometric Shapes | 25A0–25FF |
| 87 | Miscellaneous Symbols | 2600–26FF |
| 88 | Dingbats | 2700–27BF |
| 89 | Miscellaneous Mathematical Symbols-A | 27C0–27EF |
| 90 | Supplemental Arrows-A | 27F0–27FF |
| 91 | Braille Patterns | 2800–28FF |
| 92 | Supplemental Arrows-B | 2900–297F |
| 93 | Miscellaneous Mathematical Symbols-B | 2980–29FF |
| 94 | Supplemental Mathematical Operators | 2A00–2AFF |
| 95 | Miscellaneous Symbols and Arrows | 2B00–2BFF |
| 96 | Glagolitic | 2C00–2C5F |
| 97 | Latin Extended-C | 2C60–2C7F |
| 98 | Coptic | 2C80–2CFF |
| 99 | Georgian Supplement | 2D00–2D2F |
| 100 | Tifinagh | 2D30–2D7F |
| 101 | Ethiopic Extended | 2D80–2DDF |
| 102 | Cyrillic Extended-A | 2DE0–2DFF |
| 103 | Supplemental Punctuation | 2E00–2E7F |
| 104 | CJK Radicals Supplement | 2E80–2EFF |
| 105 | Kangxi Radicals | 2F00–2FDF |
| 106 | Ideographic Description Characters | 2FF0–2FFF |
| 107 | CJK Symbols and Punctuation | 3000–303F |
| 108 | Hiragana | 3040–309F |
| 109 | Katakana | 30A0–30FF |
| 110 | Bopomofo | 3100–312F |
| 111 | Hangul Compatibility Jamo | 3130–318F |
| 112 | Kanbun | 3190–319F |
| 113 | Bopomofo Extended | 31A0–31BF |
| 114 | CJK Strokes | 31C0–31EF |
| 115 | Katakana Phonetic Extensions | 31F0–31FF |
| 116 | Enclosed CJK Letters and Months | 3200–32FF |
| 117 | CJK Compatibility | 3300–33FF |
| 118 | CJK Unified Ideographs Extension A | 3400–4DBF |
| 119 | Yijing Hexagram Symbols | 4DC0–4DFF |
| 120 | CJK Unified Ideographs (basic Han characters) | 4E00–9FFC |
| 121 | Yi Syllables | A000–A48F |
| 122 | Yi Radicals | A490–A4CF |
| 123 | Lisu | A4D0–A4FF |
| 124 | Vai | A500–A63F |
| 125 | Cyrillic Extended-B | A640–A69F |
| 126 | Bamum | A6A0–A6FF |
| 127 | Modifier Tone Letters | A700–A71F |
| 128 | Latin Extended-D | A720–A7FF |
| 129 | Syloti Nagri | A800–A82F |
| 130 | Common Indic Number Forms | A830–A83F |
| 131 | Phags-pa | A840–A87F |
| 132 | Saurashtra | A880–A8DF |
| 133 | Devanagari Extended | A8E0–A8FF |
| 134 | Kayah Li | A900–A92F |
| 135 | Rejang | A930–A95F |
| 136 | Hangul Jamo Extended-A | A960–A97F |
| 137 | Javanese | A980–A9DF |
| 138 | Myanmar Extended-B | A9E0–A9FF |
| 139 | Cham | AA00–AA5F |
| 140 | Myanmar Extended-A | AA60–AA7F |
| 141 | Tai Viet | AA80–AADF |
| 142 | Meetei Mayek Extensions | AAE0–AAFF |
| 143 | Ethiopic Extended-A | AB00–AB2F |
| 144 | Latin Extended-E | AB30–AB6F |
| 145 | Cherokee Supplement | AB70–ABBF |
| 146 | Meetei Mayek | ABC0–ABFF |
| 147 | Hangul Syllables | AC00–D7AF |
| 148 | Hangul Jamo Extended-B | D7B0–D7FF |
| 149 | High Surrogate Area (UTF-16 high surrogates) | D800–DBFF |
| 150 | Low Surrogate Area (UTF-16 low surrogates) | DC00–DFFF |
| 151 | Private Use Area | E000–F8FF |
| 152 | CJK Compatibility Ideographs | F900–FAD9 |
| 153 | Alphabetic Presentation Forms | FB00–FB4F |
| 154 | Arabic Presentation Forms-A | FB50–FDFF |
| 155 | Variation Selectors | FE00–FE0F |
| 156 | Vertical Forms | FE10–FE1F |
| 157 | Combining Half Marks | FE20–FE2F |
| 158 | CJK Compatibility Forms | FE30–FE4F |
| 159 | Small Form Variants | FE50–FE6F |
| 160 | Arabic Presentation Forms-B | FE70–FEFF |
| 161 | Halfwidth and Fullwidth Forms | FF00–FFEF |
| 162 | Specials | FFF0–FFFF |
| 163 | Linear B Syllabary | 10000–1007F |
| 164 | Linear B Ideograms | 10080–100FF |
| 165 | Aegean Numbers | 10100–1013F |
| 166 | Ancient Greek Numbers | 10140–1018F |
| 167 | Ancient Symbols | 10190–101CF |
| 168 | Phaistos Disc | 101D0–101FF |
| 169 | Lycian | 10280–1029F |
| 170 | Carian | 102A0–102DF |
| 171 | Coptic Epact Numbers | 102E0–102FF |
| 172 | Old Italic | 10300–1032F |
| 173 | Gothic | 10330–1034F |
| 174 | Old Permic | 10350–1037F |
| 175 | Ugaritic | 10380–1039F |
| 176 | Old Persian | 103A0–103DF |
| 177 | Deseret | 10400–1044F |
| 178 | Shavian | 10450–1047F |
| 179 | Osmanya | 10480–104AF |
| 180 | Osage | 104B0–104FF |
| 181 | Elbasan | 10500–1052F |
| 182 | Caucasian Albanian | 10530–1056F |
| 183 | Linear A | 10600–1077F |
| 184 | Cypriot Syllabary | 10800–1083F |
| 185 | Imperial Aramaic | 10840–1085F |
| 186 | Palmyrene | 10860–1087F |
| 187 | Nabataean | 10880–108AF |
| 188 | Hatran | 108E0–108FF |
| 189 | Phoenician | 10900–1091F |
| 190 | Lydian | 10920–1093F |
| 191 | Meroitic Hieroglyphs | 10980–1099F |
| 192 | Meroitic Cursive | 109A0–109FF |
| 193 | Kharoshthi | 10A00–10A5F |
| 194 | Old South Arabian | 10A60–10A7F |
| 195 | Old North Arabian | 10A80–10A9F |
| 196 | Manichaean | 10AC0–10AFF |
| 197 | Avestan | 10B00–10B3F |
| 198 | Inscriptional Parthian | 10B40–10B5F |
| 199 | Inscriptional Pahlavi | 10B60–10B7F |
| 200 | Psalter Pahlavi | 10B80–10BAF |
| 201 | Old Turkic | 10C00–10C4F |
| 202 | Old Hungarian | 10C80–10CFF |
| 203 | Hanifi Rohingya | 10D00–10D3F |
| 204 | Rumi Numeral Symbols | 10E60–10E7F |
| 205 | Yezidi | 10E80–10EBF |
| 206 | Old Sogdian | 10F00–10F2F |
| 207 | Sogdian | 10F30–10F6F |
| 208 | Chorasmian | 10FB0–10FDF |
| 209 | Elymaic | 10FE0–10FFF |
| 210 | Brahmi | 11000–1107F |
| 211 | Kaithi | 11080–110CF |
| 212 | Sora Sompeng | 110D0–110FF |
| 213 | Chakma | 11100–1114F |
| 214 | Mahajani | 11150–1117F |
| 215 | Sharada | 11180–111DF |
| 216 | Sinhala Archaic Numbers | 111E0–111FF |
| 217 | Khojki | 11200–1124F |
| 218 | Multani | 11280–112AF |
| 219 | Khudawadi | 112B0–112FF |
| 220 | Grantha | 11300–1137F |
| 221 | Newa | 11400–1147F |
| 222 | Tirhuta | 11480–114DF |
| 223 | Siddham | 11580–115FF |
| 224 | Modi | 11600–1165F |
| 225 | Mongolian Supplement | 11660–1167F |
| 226 | Takri | 11680–116CF |
| 227 | Ahom | 11700–1173F |
| 228 | Dogra | 11800–1184F |
| 229 | Warang Citi | 118A0–118FF |
| 230 | Dives Akuru | 11900–1195F |
| 231 | Nandinagari | 119A0–119FF |
| 232 | Zanabazar Square | 11A00–11A4F |
| 233 | Soyombo | 11A50–11AAF |
| 234 | Pau Cin Hau | 11AC0–11AFF |
| 235 | Bhaiksuki | 11C00–11C6F |
| 236 | Marchen | 11C70–11CBF |
| 237 | Masaram Gondi | 11D00–11D5F |
| 238 | Gunjala Gondi | 11D60–11DAF |
| 239 | Makasar | 11EE0–11EFF |
| 240 | Lisu Supplement | 11FB0–11FBF |
| 241 | Tamil Supplement | 11FC0–11FFF |
| 242 | Cuneiform | 12000–123FF |
| 243 | Cuneiform Numbers and Punctuation | 12400–1247F |
| 244 | Early Dynastic Cuneiform | 12480–1254F |
| 245 | Egyptian Hieroglyphs | 13000–1342F |
| 246 | Egyptian Hieroglyph Format Controls | 13430–1343F |
| 247 | Anatolian Hieroglyphs | 14400–1467F |
| 248 | Bamum Supplement | 16800–16A3F |
| 249 | Mro | 16A40–16A6F |
| 250 | Bassa Vah | 16AD0–16AFF |
| 251 | Pahawh Hmong | 16B00–16B8F |
| 252 | Medefaidrin | 16E40–16E9F |
| 253 | Miao | 16F00–16F9F |
| 254 | Ideographic Symbols and Punctuation | 16FE0–16FFF |
| 255 | Tangut | 17000–187F7 |
| 256 | Tangut Components | 18800–18AFF |
| 257 | Khitan Small Script | 18B00–18CFF |
| 258 | Tangut Supplement | 18D00–18D08 |
| 259 | Kana Supplement | 1B000–1B0FF |
| 260 | Kana Extended-A | 1B100–1B12F |
| 261 | Small Kana Extension | 1B130–1B16F |
| 262 | Nushu | 1B170–1B2FF |
| 263 | Duployan | 1BC00–1BC9F |
| 264 | Shorthand Format Controls | 1BCA0–1BCAF |
| 265 | Byzantine Musical Symbols | 1D000–1D0FF |
| 266 | Musical Symbols | 1D100–1D1FF |
| 267 | Ancient Greek Musical Notation | 1D200–1D24F |
| 268 | Mayan Numerals | 1D2E0–1D2FF |
| 269 | Tai Xuan Jing Symbols | 1D300–1D35F |
| 270 | Counting Rod Numerals | 1D360–1D37F |
| 271 | Mathematical Alphanumeric Symbols | 1D400–1D7FF |
| 272 | Sutton SignWriting | 1D800–1DAAF |
| 273 | Glagolitic Supplement | 1E000–1E02F |
| 274 | Nyiakeng Puachue Hmong | 1E100–1E14F |
| 275 | Wancho | 1E2C0–1E2FF |
| 276 | Mende Kikakui | 1E800–1E8DF |
| 277 | Adlam | 1E900–1E95F |
| 278 | Indic Siyaq Numbers | 1EC70–1ECBF |
| 279 | Ottoman Siyaq Numbers | 1ED00–1ED4F |
| 280 | Arabic Mathematical Alphabetic Symbols | 1EE00–1EEFF |
| 281 | Mahjong Tiles | 1F000–1F02F |
| 282 | Domino Tiles | 1F030–1F09F |
| 283 | Playing Cards | 1F0A0–1F0FF |
| 284 | Enclosed Alphanumeric Supplement | 1F100–1F1FF |
| 285 | Enclosed Ideographic Supplement | 1F200–1F2FF |
| 286 | Miscellaneous Symbols and Pictographs | 1F300–1F5FF |
| 287 | Emoticons | 1F600–1F64F |
| 288 | Ornamental Dingbats | 1F650–1F67F |
| 289 | Transport and Map Symbols | 1F680–1F6FF |
| 290 | Alchemical Symbols | 1F700–1F77F |
| 291 | Geometric Shapes Extended | 1F780–1F7FF |
| 292 | Supplemental Arrows-C | 1F800–1F8FF |
| 293 | Supplemental Symbols and Pictographs | 1F900–1F9FF |
| 294 | Chess Symbols | 1FA00–1FA6F |
| 295 | Symbols and Pictographs Extended-A | 1FA70–1FAFF |
| 296 | Symbols for Legacy Computing | 1FB00–1FBFF |
| 297 | Unassigned | 1FF80–1FFFF |
| 298 | CJK Unified Ideographs Extension B | 20000–2A6DD |
| 299 | CJK Unified Ideographs Extension C | 2A700–2B734 |
| 300 | CJK Unified Ideographs Extension D | 2B740–2B81D |
| 301 | CJK Unified Ideographs Extension E | 2B820–2CEA1 |
| 302 | CJK Unified Ideographs Extension F | 2CEB0–2EBE0 |
| 303 | CJK Compatibility Ideographs Supplement | 2F800–2FA1D |
| 304 | Unassigned | 2FF80–2FFFF |
| 305 | CJK Unified Ideographs Extension G | 30000–3134A |
| 306 | Unassigned | 3FF80–3FFFF |
| 307 | Unassigned | 4FF80–4FFFF |
| 308 | Unassigned | 5FF80–5FFFF |
| 309 | Unassigned | 6FF80–6FFFF |
| 310 | Unassigned | 7FF80–7FFFF |
| 311 | Unassigned | 8FF80–8FFFF |
| 312 | Unassigned | 9FF80–9FFFF |
| 313 | Unassigned | AFF80–AFFFF |
| 314 | Unassigned | BFF80–BFFFF |
| 315 | Unassigned | CFF80–CFFFF |
| 316 | Unassigned | DFF80–DFFFF |
| 317 | Tags | E0000–E007F |
| 318 | Variation Selectors Supplement | E0100–E01EF |
| 319 | Unassigned | EFF80–EFFFF |
| 320 | Supplementary Private Use Area-A | F0000–FFFFF |
| 321 | Supplementary Private Use Area-B | 100000–10FFFF |
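
One way to use the table is to look up which block a given character falls in. The following is a minimal sketch in Python, not a complete implementation: it copies only five rows from the table, and the names `BLOCKS`, `STARTS`, and `block_of` are hypothetical helpers introduced for illustration. A binary search over the sorted start values finds the candidate block, and the end value confirms membership.

```python
from bisect import bisect_right
from typing import Optional

# A handful of (start, end, name) rows copied from the table above,
# sorted by start. A real lookup would need every row.
BLOCKS = [
    (0x0000, 0x007F, "C0 Controls and Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFC, "CJK Unified Ideographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_of(ch: str) -> Optional[str]:
    """Return the block name for a single character, or None if it is
    not covered by the sample rows in BLOCKS."""
    cp = ord(ch)
    # Index of the last block whose start is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        _, end, name = BLOCKS[i]
        if cp <= end:
            return name
    return None


print(block_of("A"))   # C0 Controls and Basic Latin
print(block_of("あ"))  # Hiragana
print(block_of("中"))  # CJK Unified Ideographs
print(block_of("é"))   # None (Latin-1 Supplement is not in the sample rows)
```

Because the ranges are sorted and do not overlap, the lookup stays logarithmic in the number of rows even if the full table is loaded.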