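Latin-1 Supplement
Latin-1 Supplement, also called C1 Controls and Latin-1 Supplement, is the second Unicode block in the Unicode Standard. It encodes the upper half of ISO 8859-1 (bytes 80 to FF) at the code points U+0080..U+00FF; the C1 control characters in this range are not graphic (visible) characters. The block contains 128 characters: the C1 control characters, Latin-1 punctuation and symbols, 30 pairs of uppercase and lowercase Latin letters (most of them carrying diacritics), two additional lowercase letters, and 2 mathematical operators.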
C1 Controls and Latin-1 Supplement | |
---|---|
Range | U+0080..U+00FF (128 code points) |
Plane | Basic Multilingual Plane (BMP) |
Scripts | Latin (64 characters), Common (64 characters) |
Usage | |
Symbol sets | |
Assigned | 128 code points |
Unused | 0 reserved code points |
Source standards | ISO/IEC 8859-1 |
Unicode version history | |
1.0.0 | 128 (+128) |
Note: [1][2] |
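The characters in the C1 Controls and Latin-1 Supplement block have been in use unchanged since version 1.0 of the Unicode Standard,[3] in which the block was simply named "Latin1".[4]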
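The correspondence with the upper half of ISO 8859-1 can be checked with a short sketch. It assumes only Python's built-in `latin-1` codec; the variable names are illustrative and do not come from the Unicode Standard.

```python
# Every byte value in the upper half of ISO 8859-1 (0x80..0xFF).
upper_half = bytes(range(0x80, 0x100))

# Python's built-in "latin-1" codec maps byte 0xNN to code point U+00NN.
decoded = upper_half.decode("latin-1")

# Each decoded character has the same numeric value as its source byte.
assert [ord(ch) for ch in decoded] == list(range(0x80, 0x100))

# The mapping is reversible for the whole block.
assert decoded.encode("latin-1") == upper_half

print(len(decoded))           # 128
print(hex(ord(decoded[0])))   # 0x80
print(hex(ord(decoded[-1])))  # 0xff
```

Because the byte value and the code point coincide, converting ISO 8859-1 text to Unicode code points needs no lookup table.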
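Subheadings
The C1 Controls and Latin-1 Supplement block contains four subheadings: C1 controls, Latin-1 punctuation and symbols, Letters, and Mathematical operator.[5]
C1 controls
The C1 controls subheading contains 32 supplementary control codes inherited from ISO/IEC 8859-1 and other 8-bit character standards. The aliases for the C0 and C1 control characters are taken from ISO/IEC 6429:1992.[5]
Latin-1 punctuation and symbols
The Latin-1 punctuation and symbols subheading contains 32 characters: common international punctuation (such as the inverted exclamation mark, the inverted question mark and the middle dot) together with symbols such as currency signs, spacing diacritical marks, vulgar fractions and superscript digits.[5]
Letters
The Letters subheading contains 30 pairs of uppercase and lowercase Latin letters used in Western European languages, plus two extra lowercase letters (ß and ÿ) that rarely begin a word.[5]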
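The split into 30 case pairs and two unpaired lowercase letters can be reproduced with Python's built-in case mapping. This is a minimal sketch: the function name and constants are hypothetical, and "paired" is taken here to mean that the uppercase form is a single character inside the same U+00C0..U+00FF range.

```python
LETTER_FIRST, LETTER_LAST = 0xC0, 0xFF  # range that holds the Letters subheading


def classify_lowercase_letters():
    """Split the block's lowercase letters by whether their uppercase form is in the block."""
    paired, unpaired = [], []
    for cp in range(LETTER_FIRST, LETTER_LAST + 1):
        ch = chr(cp)
        if not ch.islower():
            continue  # skips uppercase letters and the two operators (× and ÷)
        upper = ch.upper()
        if len(upper) == 1 and LETTER_FIRST <= ord(upper) <= LETTER_LAST:
            paired.append(ch)
        else:
            unpaired.append(ch)
    return paired, unpaired


paired, unpaired = classify_lowercase_letters()
print(len(paired))  # 30
print(unpaired)     # ['ß', 'ÿ']
```

The two unpaired letters fall out for different reasons: `'ß'.upper()` yields the two-character string `'SS'`, while `'ÿ'.upper()` yields U+0178, which lies outside this block.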
Mathematical operator
The Mathematical operator subheading contains the multiplication sign and the division sign.[5]
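Number of symbols, letters and control codes
The table below lists how many letters, symbols and control codes each subheading of the C1 Controls and Latin-1 Supplement block contains.
Subheading | Number of characters | Code point range |
---|---|---|
C1 controls | 32 control codes | U+0080..U+009F |
Latin-1 punctuation and symbols | 32 punctuation marks and symbols | U+00A0..U+00BF |
Letters | 30 pairs of uppercase and lowercase Latin letters plus 2 extra lowercase letters (62 in total) | U+00C0..U+00D6, U+00D8..U+00F6 and U+00F8..U+00FF |
Mathematical operator | 2 symbols: U+00D7 × MULTIPLICATION SIGN and U+00F7 ÷ DIVISION SIGN | U+00D7 and U+00F7 |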
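The counts in the table can be recomputed from the code point ranges alone. The sketch below uses Python's standard `unicodedata` module to look at general categories; the `SUBHEADINGS` dictionary and the helper names are illustrative, not part of any Unicode API.

```python
import unicodedata

# Inclusive code point ranges for each subheading, copied from the table above.
SUBHEADINGS = {
    "C1 controls": [(0x80, 0x9F)],
    "Latin-1 punctuation and symbols": [(0xA0, 0xBF)],
    "Letters": [(0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0xFF)],
    "Mathematical operator": [(0xD7, 0xD7), (0xF7, 0xF7)],
}


def code_points(ranges):
    """Expand a list of inclusive (first, last) ranges into code points."""
    return [cp for first, last in ranges for cp in range(first, last + 1)]


def categories(name):
    """Return the set of general categories that occur in one subheading."""
    return {unicodedata.category(chr(cp)) for cp in code_points(SUBHEADINGS[name])}


counts = {name: len(code_points(ranges)) for name, ranges in SUBHEADINGS.items()}

for name, count in counts.items():
    print(f"{name}: {count}")
print("total:", sum(counts.values()))  # total: 128
```

The general category alone would not be enough to tell the subheadings apart: the punctuation and symbols range also contains characters of category `Sm` (for example ¬ and ±), so the grouping here follows the ranges rather than the categories.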
Block
C1 Controls and Latin-1 Supplement[1] Official Unicode Consortium code chart (PDF)
Code point | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+008x | XXX | XXX | BPH | NBH | IND | NEL | SSA | ESA | HTS | HTJ | VTS | PLD | PLU | RI | SS2 | SS3 |
U+009x | DCS | PU1 | PU2 | STS | CCH | MW | SPA | EPA | SOS | XXX | SCI | CSI | ST | OSC | PM | APC |
U+00Ax | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
U+00Bx | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
U+00Cx | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
U+00Dx | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
U+00Ex | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
U+00Fx | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
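Emoji
The Latin-1 Supplement block contains two emoji: U+00A9 © COPYRIGHT SIGN and U+00AE ® REGISTERED SIGN.[6][7]
The block defines four standardized variants for these two emoji, selecting either emoji presentation (by appending U+FE0F VS16) or text presentation (by appending U+FE0E VS15); text presentation is the default.[8]
U+ | 00A9 | 00AE |
---|---|---|
Base code point | © | ® |
Base + VS15 (text) | ©︎ | ®︎ |
Base + VS16 (emoji) | ©️ | ®️ |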
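The four variation sequences in the table are simply a base character followed by one variation selector. A minimal sketch of how they are formed, using hypothetical names (`presentation_sequences`, `EMOJI_BASES`); whether a sequence is actually drawn as text or as a colour emoji depends on the font and renderer, which this code does not control.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

# The two emoji in the Latin-1 Supplement block.
EMOJI_BASES = ["\u00A9", "\u00AE"]  # COPYRIGHT SIGN, REGISTERED SIGN


def presentation_sequences(base):
    """Return the text and emoji variation sequences for one base character."""
    return {"text": base + VS15, "emoji": base + VS16}


for base in EMOJI_BASES:
    for style, seq in presentation_sequences(base).items():
        cps = " ".join(f"U+{ord(c):04X}" for c in seq)
        print(style, cps)
# text U+00A9 U+FE0E
# emoji U+00A9 U+FE0F
# text U+00AE U+FE0E
# emoji U+00AE U+FE0F
```

Each sequence is two code points long, so code that counts characters by code point will report 2 for a single visible symbol.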
History
The following Unicode-related documents record the purpose and process of defining specific characters in the Latin-1 Supplement block:
Version | Final code points[a] | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
1.0.0 | U+0080..009F | 32 | X3L2/95-002 | | PDAM No. 3 to ISO/IEC 10646-1 on coding of C1 controls, 1994-11-01 |
1.0.0 | U+0080..009F | 32 | X3L2/95-028 | N1148 | Nine tables of replies to repeated/extended votes, 1995-02-22 |
1.0.0 | U+0080..009F | 32 | | N1203 | Umamaheswaran, V. S.; Ksar, Mike, Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva, 1995-05-03 |
1.0.0 | U+0080..009F | 32 | X3L2/95-061 | | DAM no.3 to ISO/IEC 10646-1 (Coding of C1 controls), 1995-06-01 |
1.0.0 | U+0080..009F | 32 | | N1307 | Table of replies to JTC1 letter ballot on 10646 DAM 3, Coding of C1 Controls, (SC2 N 2666), 1996-01-15 |
1.0.0 | U+0080..009F | 32 | | N1309 | Paterson, Bruce, Report and Disposition of Comments on DAM 1, UTF 16 and DAM 2, UTF-8, DAM 3, Coding of C1 Controls, and DAM 4, Removal of Annex G: UTF1, 1996-01-17 |
1.0.0 | U+0080..009F | 32 | | N1312 | Paterson, Bruce, Draft Final Text of 10646 AMD-3, Coding of C1 Controls, 1996-01-17 |
1.0.0 | U+0080..009F | 32 | L2/99-048 | | Umamaheswaran, V. S., C1 controls in the code charts, 1999-02-04 |
1.0.0 | U+0080..009F | 32 | L2/99-054R | | Aliprand, Joan, Approved Minutes from the UTC/L2 meeting in Palo Alto, February 3-5, 1999, 1999-06-21 |
1.0.0 | U+0080..009F | 32 | | N3046 | Suignard, Michel, Improving formal definition for control characters, 2006-02-22 |
1.0.0 | U+0080..009F | 32 | | N3103 (pdf, doc) | Umamaheswaran, V. S., Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27, 2006-08-25 |
1.0.0 | U+00A0..00FF | 96 | | | (to be determined) |
1.0.0 | U+00A0..00FF | 96 | X3L2/94-077 | N994 | Davis, Mark, ISO/IEC 10646-1 - Proposed Draft Corrigendum 1, 1994-03-03 |
1.0.0 | U+00A0..00FF | 96 | X3L2/94-098 | N1033 (pdf, doc) | Umamaheswaran, V. S.; Ksar, Mike, Unconfirmed Minutes of ISO/IEC JTC 1/SC 2/WG 2 Meeting 25, Falez Hotel, Antalya, Turkey, 1994-04-18--22, 1994-06-01 |
1.0.0 | U+00A0..00FF | 96 | L2/11-016 | | Moore, Lisa, UTC #126 / L2 #223 Minutes, 2011-02-15 |
1.0.0 | U+00A0..00FF | 96 | L2/11-116 | | Moore, Lisa, UTC #127 / L2 #224 Minutes, 2011-05-17, Change the general category of U+00AA FEMININE ORDINAL INDICATOR and U+00BA MASCULINE ORDINAL INDICATOR to "Lo" for Unicode 6.1. |
1.0.0 | U+00A0..00FF | 96 | L2/11-261R2 | | Moore, Lisa, UTC #128 / L2 #225 Minutes, 2011-08-16, Change the general category from "So" to "Po" ... [U+00A7 and U+00B6] |
1.0.0 | U+00A0..00FF | 96 | L2/15-050R[b][c] | | Davis, Mark; et al., Additional variation selectors for emoji, 2015-01-29 |
References
- [1] Unicode character database. The Unicode Standard. Retrieved 2016-07-09. Archived from the original on 2017-09-25.
- [2] Enumerated Versions of The Unicode Standard. The Unicode Standard. Retrieved 2016-07-09. Archived from the original on 2016-06-29.
- [3] The Unicode Standard Version 1.0, Volume 1. Addison-Wesley Publishing Company, Inc. 1991 [1990]. ISBN 0-201-56788-1.
- [4] 3.8: Block-by-Block Charts (PDF). The Unicode Standard. Unicode Consortium. Retrieved 2021-10-10. Archived (PDF) from the original on 2021-02-11.
- [5] Unicode 6.2 code charts (PDF). The Unicode Standard. Retrieved 2013-04-01. Archived (PDF) from the original on 2018-07-04.
- [6] UTR #51: Unicode Emoji. Unicode Consortium. 2020-02-11. Retrieved 2022-05-17. Archived from the original on 2020-06-30.
- [7] UCD: Emoji Data for UTR #51. Unicode Consortium. 2021-08-26. Retrieved 2022-05-17. Archived from the original on 2022-03-28.
- [8] UTS #51 Emoji Variation Sequences. The Unicode Consortium. Retrieved 2022-05-17. Archived from the original on 2022-03-31.