《Oracle MySQL编程自学与面试指南》05-02:查看字符集和排序规则

osc_6srwahuo 2020-11-10 12:48:11
人工智能 面试 Mysql oracle emoji


课程封面-MySQL-AT阿宝哥


内容导航

  • 前言
  • 1、查看字符集
  • 2、查看排序规则

前言

MySQL默认使用latin1(ISO-8859-1)字符集,它是单字节编码的字符集。但是,中文汉字是多字节编码的字符,一般对应的字符集为GBKGB18030UTF8


1、查看字符集

1.1、SQL语句

通过使用SHOW CHARSET;或者SHOW CHARACTER SET;可以查看可用字符集,如下所示:

mysql> show charset ;
+----------+---------------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| binary | Binary pseudo charset | binary | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |
| euckr | EUC-KR Korean | euckr_korean_ci | 2 |
| gb18030 | China National Standard GB18030 | gb18030_chinese_ci | 4 |
| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| tis620 | TIS620 Thai | tis620_thai_ci | 1 |
| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |
| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |
| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |
| utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.12 sec)
mysql> show character set;
+----------+---------------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| binary | Binary pseudo charset | binary | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |
| euckr | EUC-KR Korean | euckr_korean_ci | 2 |
| gb18030 | China National Standard GB18030 | gb18030_chinese_ci | 4 |
| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| tis620 | TIS620 Thai | tis620_thai_ci | 1 |
| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |
| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |
| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |
| utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.00 sec)

上述执行结果显示了字符集名称(Charset)、描述信息(Description)、默认校对集(Default collation)和单字符最大长度(Maxlen)。

常用字符集摘选简介如下:

Charset Description Default collation Maxlen
ascii US ASCII(美国信息交换标准代码) ascii_general_ci 1字节
latin1 cp1252 West European(西欧、希腊字符) latin1_swedish_ci 1字节
latin2 ISO 8859-2 Central European(中欧) latin2_general_ci 1字节
latin5 ISO 8859-9 Turkish(土耳其) latin5_turkish_ci 1字节
latin7 ISO 8859-13 Baltic(波罗的海) latin7_general_ci 1字节
big5 Big5 Traditional Chinese(繁体中文) big5_chinese_ci 2字节
gb2312 GB2312 Simplified Chinese(简体中文) gb2312_chinese_ci 2字节
gbk GBK Simplified Chinese(简体和繁体中文、日文、韩文等) gbk_chinese_ci 2字节
gb18030 China National Standard GB18030 gb18030_chinese_ci 4字节
utf8 UTF-8 Unicode(世界上大部分国家的文字) utf8_general_ci 3字节
utf8mb4 UTF-8 Unicode utf8mb4_0900_ai_ci 4字节

从上表可以看出,单字符所占用的存储空间越多,所支持的语言越多。而且,表中所列举的字符集都是向下兼容ASCII,就是说,如果我们输入的字符在ASCII范围内(0x00~0x7F),那么,它们的编码是完全相同的。

提示:
由于历史原因,MySQL中的utf8编码和标准的UTF-8(RFC3629)存在一些差异。标准的UTF-8(RFC3629)规定一个字符最多4字节,但是MySQL中的utf8规定一个字符最多3个字节,这导致UTF8中的一些特殊字符(比如emoji符号)无法在MySQL的utf8编码中使用。为了解决该问题,MySQL5.5.3版本开始新增了utf8mb4编码,将一个字符扩展到4字节。因此,如果要考虑支持标准的UTF-8(RFC3629),应该使用utf8mb4编码。

1.2、查询元数据字典表

mysql> select * from information_schema.character_sets;
+--------------------+----------------------+---------------------------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION | MAXLEN |
+--------------------+----------------------+---------------------------------+--------+
| big5 | big5_chinese_ci | Big5 Traditional Chinese | 2 |
...
+--------------------+----------------------+---------------------------------+--------+
41 rows in set (0.00 sec)

2、查看排序规则

2.1、SQL语句

通过使用SHOW COLLATION;可以查看MySQL可用的校对集,如下所示:


mysql> show collation;
+----------------------------+----------+-----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+----------+-----+---------+----------+---------+---------------+
| armscii8_bin | armscii8 | 64 | | Yes | 1 | PAD SPACE |
| armscii8_general_ci | armscii8 | 32 | Yes | Yes | 1 | PAD SPACE |
| ascii_bin | ascii | 65 | | Yes | 1 | PAD SPACE |
| ascii_general_ci | ascii | 11 | Yes | Yes | 1 | PAD SPACE |
| big5_bin | big5 | 84 | | Yes | 1 | PAD SPACE |
| big5_chinese_ci | big5 | 1 | Yes | Yes | 1 | PAD SPACE |
| binary | binary | 63 | Yes | Yes | 1 | NO PAD |
| cp1250_bin | cp1250 | 66 | | Yes | 1 | PAD SPACE |
| cp1250_croatian_ci | cp1250 | 44 | | Yes | 1 | PAD SPACE |
| cp1250_czech_cs | cp1250 | 34 | | Yes | 2 | PAD SPACE |
| cp1250_general_ci | cp1250 | 26 | Yes | Yes | 1 | PAD SPACE |
| cp1250_polish_ci | cp1250 | 99 | | Yes | 1 | PAD SPACE |
| cp1251_bin | cp1251 | 50 | | Yes | 1 | PAD SPACE |
| cp1251_bulgarian_ci | cp1251 | 14 | | Yes | 1 | PAD SPACE |
| cp1251_general_ci | cp1251 | 51 | Yes | Yes | 1 | PAD SPACE |
| cp1251_general_cs | cp1251 | 52 | | Yes | 1 | PAD SPACE |
| cp1251_ukrainian_ci | cp1251 | 23 | | Yes | 1 | PAD SPACE |
| cp1256_bin | cp1256 | 67 | | Yes | 1 | PAD SPACE |
| cp1256_general_ci | cp1256 | 57 | Yes | Yes | 1 | PAD SPACE |
| cp1257_bin | cp1257 | 58 | | Yes | 1 | PAD SPACE |
| cp1257_general_ci | cp1257 | 59 | Yes | Yes | 1 | PAD SPACE |
| cp1257_lithuanian_ci | cp1257 | 29 | | Yes | 1 | PAD SPACE |
| cp850_bin | cp850 | 80 | | Yes | 1 | PAD SPACE |
| cp850_general_ci | cp850 | 4 | Yes | Yes | 1 | PAD SPACE |
| cp852_bin | cp852 | 81 | | Yes | 1 | PAD SPACE |
| cp852_general_ci | cp852 | 40 | Yes | Yes | 1 | PAD SPACE |
| cp866_bin | cp866 | 68 | | Yes | 1 | PAD SPACE |
| cp866_general_ci | cp866 | 36 | Yes | Yes | 1 | PAD SPACE |
| cp932_bin | cp932 | 96 | | Yes | 1 | PAD SPACE |
| cp932_japanese_ci | cp932 | 95 | Yes | Yes | 1 | PAD SPACE |
| dec8_bin | dec8 | 69 | | Yes | 1 | PAD SPACE |
| dec8_swedish_ci | dec8 | 3 | Yes | Yes | 1 | PAD SPACE |
| eucjpms_bin | eucjpms | 98 | | Yes | 1 | PAD SPACE |
| eucjpms_japanese_ci | eucjpms | 97 | Yes | Yes | 1 | PAD SPACE |
| euckr_bin | euckr | 85 | | Yes | 1 | PAD SPACE |
| euckr_korean_ci | euckr | 19 | Yes | Yes | 1 | PAD SPACE |
| gb18030_bin | gb18030 | 249 | | Yes | 1 | PAD SPACE |
| gb18030_chinese_ci | gb18030 | 248 | Yes | Yes | 2 | PAD SPACE |
| gb18030_unicode_520_ci | gb18030 | 250 | | Yes | 8 | PAD SPACE |
| gb2312_bin | gb2312 | 86 | | Yes | 1 | PAD SPACE |
| gb2312_chinese_ci | gb2312 | 24 | Yes | Yes | 1 | PAD SPACE |
| gbk_bin | gbk | 87 | | Yes | 1 | PAD SPACE |
| gbk_chinese_ci | gbk | 28 | Yes | Yes | 1 | PAD SPACE |
| geostd8_bin | geostd8 | 93 | | Yes | 1 | PAD SPACE |
| geostd8_general_ci | geostd8 | 92 | Yes | Yes | 1 | PAD SPACE |
| greek_bin | greek | 70 | | Yes | 1 | PAD SPACE |
| greek_general_ci | greek | 25 | Yes | Yes | 1 | PAD SPACE |
| hebrew_bin | hebrew | 71 | | Yes | 1 | PAD SPACE |
| hebrew_general_ci | hebrew | 16 | Yes | Yes | 1 | PAD SPACE |
| hp8_bin | hp8 | 72 | | Yes | 1 | PAD SPACE |
| hp8_english_ci | hp8 | 6 | Yes | Yes | 1 | PAD SPACE |
| keybcs2_bin | keybcs2 | 73 | | Yes | 1 | PAD SPACE |
| keybcs2_general_ci | keybcs2 | 37 | Yes | Yes | 1 | PAD SPACE |
| koi8r_bin | koi8r | 74 | | Yes | 1 | PAD SPACE |
| koi8r_general_ci | koi8r | 7 | Yes | Yes | 1 | PAD SPACE |
| koi8u_bin | koi8u | 75 | | Yes | 1 | PAD SPACE |
| koi8u_general_ci | koi8u | 22 | Yes | Yes | 1 | PAD SPACE |
| latin1_bin | latin1 | 47 | | Yes | 1 | PAD SPACE |
| latin1_danish_ci | latin1 | 15 | | Yes | 1 | PAD SPACE |
| latin1_general_ci | latin1 | 48 | | Yes | 1 | PAD SPACE |
| latin1_general_cs | latin1 | 49 | | Yes | 1 | PAD SPACE |
| latin1_german1_ci | latin1 | 5 | | Yes | 1 | PAD SPACE |
| latin1_german2_ci | latin1 | 31 | | Yes | 2 | PAD SPACE |
| latin1_spanish_ci | latin1 | 94 | | Yes | 1 | PAD SPACE |
| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 | PAD SPACE |
| latin2_bin | latin2 | 77 | | Yes | 1 | PAD SPACE |
| latin2_croatian_ci | latin2 | 27 | | Yes | 1 | PAD SPACE |
| latin2_czech_cs | latin2 | 2 | | Yes | 4 | PAD SPACE |
| latin2_general_ci | latin2 | 9 | Yes | Yes | 1 | PAD SPACE |
| latin2_hungarian_ci | latin2 | 21 | | Yes | 1 | PAD SPACE |
| latin5_bin | latin5 | 78 | | Yes | 1 | PAD SPACE |
| latin5_turkish_ci | latin5 | 30 | Yes | Yes | 1 | PAD SPACE |
| latin7_bin | latin7 | 79 | | Yes | 1 | PAD SPACE |
| latin7_estonian_cs | latin7 | 20 | | Yes | 1 | PAD SPACE |
| latin7_general_ci | latin7 | 41 | Yes | Yes | 1 | PAD SPACE |
| latin7_general_cs | latin7 | 42 | | Yes | 1 | PAD SPACE |
| macce_bin | macce | 43 | | Yes | 1 | PAD SPACE |
| macce_general_ci | macce | 38 | Yes | Yes | 1 | PAD SPACE |
| macroman_bin | macroman | 53 | | Yes | 1 | PAD SPACE |
| macroman_general_ci | macroman | 39 | Yes | Yes | 1 | PAD SPACE |
| sjis_bin | sjis | 88 | | Yes | 1 | PAD SPACE |
| sjis_japanese_ci | sjis | 13 | Yes | Yes | 1 | PAD SPACE |
| swe7_bin | swe7 | 82 | | Yes | 1 | PAD SPACE |
| swe7_swedish_ci | swe7 | 10 | Yes | Yes | 1 | PAD SPACE |
| tis620_bin | tis620 | 89 | | Yes | 1 | PAD SPACE |
| tis620_thai_ci | tis620 | 18 | Yes | Yes | 4 | PAD SPACE |
| ucs2_bin | ucs2 | 90 | | Yes | 1 | PAD SPACE |
| ucs2_croatian_ci | ucs2 | 149 | | Yes | 8 | PAD SPACE |
| ucs2_czech_ci | ucs2 | 138 | | Yes | 8 | PAD SPACE |
| ucs2_danish_ci | ucs2 | 139 | | Yes | 8 | PAD SPACE |
| ucs2_esperanto_ci | ucs2 | 145 | | Yes | 8 | PAD SPACE |
| ucs2_estonian_ci | ucs2 | 134 | | Yes | 8 | PAD SPACE |
| ucs2_general_ci | ucs2 | 35 | Yes | Yes | 1 | PAD SPACE |
| ucs2_general_mysql500_ci | ucs2 | 159 | | Yes | 1 | PAD SPACE |
| ucs2_german2_ci | ucs2 | 148 | | Yes | 8 | PAD SPACE |
| ucs2_hungarian_ci | ucs2 | 146 | | Yes | 8 | PAD SPACE |
| ucs2_icelandic_ci | ucs2 | 129 | | Yes | 8 | PAD SPACE |
| ucs2_latvian_ci | ucs2 | 130 | | Yes | 8 | PAD SPACE |
| ucs2_lithuanian_ci | ucs2 | 140 | | Yes | 8 | PAD SPACE |
| ucs2_persian_ci | ucs2 | 144 | | Yes | 8 | PAD SPACE |
| ucs2_polish_ci | ucs2 | 133 | | Yes | 8 | PAD SPACE |
| ucs2_romanian_ci | ucs2 | 131 | | Yes | 8 | PAD SPACE |
| ucs2_roman_ci | ucs2 | 143 | | Yes | 8 | PAD SPACE |
| ucs2_sinhala_ci | ucs2 | 147 | | Yes | 8 | PAD SPACE |
| ucs2_slovak_ci | ucs2 | 141 | | Yes | 8 | PAD SPACE |
| ucs2_slovenian_ci | ucs2 | 132 | | Yes | 8 | PAD SPACE |
| ucs2_spanish2_ci | ucs2 | 142 | | Yes | 8 | PAD SPACE |
| ucs2_spanish_ci | ucs2 | 135 | | Yes | 8 | PAD SPACE |
| ucs2_swedish_ci | ucs2 | 136 | | Yes | 8 | PAD SPACE |
| ucs2_turkish_ci | ucs2 | 137 | | Yes | 8 | PAD SPACE |
| ucs2_unicode_520_ci | ucs2 | 150 | | Yes | 8 | PAD SPACE |
| ucs2_unicode_ci | ucs2 | 128 | | Yes | 8 | PAD SPACE |
| ucs2_vietnamese_ci | ucs2 | 151 | | Yes | 8 | PAD SPACE |
| ujis_bin | ujis | 91 | | Yes | 1 | PAD SPACE |
| ujis_japanese_ci | ujis | 12 | Yes | Yes | 1 | PAD SPACE |
| utf16le_bin | utf16le | 62 | | Yes | 1 | PAD SPACE |
| utf16le_general_ci | utf16le | 56 | Yes | Yes | 1 | PAD SPACE |
| utf16_bin | utf16 | 55 | | Yes | 1 | PAD SPACE |
| utf16_croatian_ci | utf16 | 122 | | Yes | 8 | PAD SPACE |
| utf16_czech_ci | utf16 | 111 | | Yes | 8 | PAD SPACE |
| utf16_danish_ci | utf16 | 112 | | Yes | 8 | PAD SPACE |
| utf16_esperanto_ci | utf16 | 118 | | Yes | 8 | PAD SPACE |
| utf16_estonian_ci | utf16 | 107 | | Yes | 8 | PAD SPACE |
| utf16_general_ci | utf16 | 54 | Yes | Yes | 1 | PAD SPACE |
| utf16_german2_ci | utf16 | 121 | | Yes | 8 | PAD SPACE |
| utf16_hungarian_ci | utf16 | 119 | | Yes | 8 | PAD SPACE |
| utf16_icelandic_ci | utf16 | 102 | | Yes | 8 | PAD SPACE |
| utf16_latvian_ci | utf16 | 103 | | Yes | 8 | PAD SPACE |
| utf16_lithuanian_ci | utf16 | 113 | | Yes | 8 | PAD SPACE |
| utf16_persian_ci | utf16 | 117 | | Yes | 8 | PAD SPACE |
| utf16_polish_ci | utf16 | 106 | | Yes | 8 | PAD SPACE |
| utf16_romanian_ci | utf16 | 104 | | Yes | 8 | PAD SPACE |
| utf16_roman_ci | utf16 | 116 | | Yes | 8 | PAD SPACE |
| utf16_sinhala_ci | utf16 | 120 | | Yes | 8 | PAD SPACE |
| utf16_slovak_ci | utf16 | 114 | | Yes | 8 | PAD SPACE |
| utf16_slovenian_ci | utf16 | 105 | | Yes | 8 | PAD SPACE |
| utf16_spanish2_ci | utf16 | 115 | | Yes | 8 | PAD SPACE |
| utf16_spanish_ci | utf16 | 108 | | Yes | 8 | PAD SPACE |
| utf16_swedish_ci | utf16 | 109 | | Yes | 8 | PAD SPACE |
| utf16_turkish_ci | utf16 | 110 | | Yes | 8 | PAD SPACE |
| utf16_unicode_520_ci | utf16 | 123 | | Yes | 8 | PAD SPACE |
| utf16_unicode_ci | utf16 | 101 | | Yes | 8 | PAD SPACE |
| utf16_vietnamese_ci | utf16 | 124 | | Yes | 8 | PAD SPACE |
| utf32_bin | utf32 | 61 | | Yes | 1 | PAD SPACE |
| utf32_croatian_ci | utf32 | 181 | | Yes | 8 | PAD SPACE |
| utf32_czech_ci | utf32 | 170 | | Yes | 8 | PAD SPACE |
| utf32_danish_ci | utf32 | 171 | | Yes | 8 | PAD SPACE |
| utf32_esperanto_ci | utf32 | 177 | | Yes | 8 | PAD SPACE |
| utf32_estonian_ci | utf32 | 166 | | Yes | 8 | PAD SPACE |
| utf32_general_ci | utf32 | 60 | Yes | Yes | 1 | PAD SPACE |
| utf32_german2_ci | utf32 | 180 | | Yes | 8 | PAD SPACE |
| utf32_hungarian_ci | utf32 | 178 | | Yes | 8 | PAD SPACE |
| utf32_icelandic_ci | utf32 | 161 | | Yes | 8 | PAD SPACE |
| utf32_latvian_ci | utf32 | 162 | | Yes | 8 | PAD SPACE |
| utf32_lithuanian_ci | utf32 | 172 | | Yes | 8 | PAD SPACE |
| utf32_persian_ci | utf32 | 176 | | Yes | 8 | PAD SPACE |
| utf32_polish_ci | utf32 | 165 | | Yes | 8 | PAD SPACE |
| utf32_romanian_ci | utf32 | 163 | | Yes | 8 | PAD SPACE |
| utf32_roman_ci | utf32 | 175 | | Yes | 8 | PAD SPACE |
| utf32_sinhala_ci | utf32 | 179 | | Yes | 8 | PAD SPACE |
| utf32_slovak_ci | utf32 | 173 | | Yes | 8 | PAD SPACE |
| utf32_slovenian_ci | utf32 | 164 | | Yes | 8 | PAD SPACE |
| utf32_spanish2_ci | utf32 | 174 | | Yes | 8 | PAD SPACE |
| utf32_spanish_ci | utf32 | 167 | | Yes | 8 | PAD SPACE |
| utf32_swedish_ci | utf32 | 168 | | Yes | 8 | PAD SPACE |
| utf32_turkish_ci | utf32 | 169 | | Yes | 8 | PAD SPACE |
| utf32_unicode_520_ci | utf32 | 182 | | Yes | 8 | PAD SPACE |
| utf32_unicode_ci | utf32 | 160 | | Yes | 8 | PAD SPACE |
| utf32_vietnamese_ci | utf32 | 183 | | Yes | 8 | PAD SPACE |
| utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 | NO PAD |
| utf8mb4_0900_as_ci | utf8mb4 | 305 | | Yes | 0 | NO PAD |
| utf8mb4_0900_as_cs | utf8mb4 | 278 | | Yes | 0 | NO PAD |
| utf8mb4_0900_bin | utf8mb4 | 309 | | Yes | 1 | NO PAD |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 | PAD SPACE |
| utf8mb4_croatian_ci | utf8mb4 | 245 | | Yes | 8 | PAD SPACE |
| utf8mb4_cs_0900_ai_ci | utf8mb4 | 266 | | Yes | 0 | NO PAD |
| utf8mb4_cs_0900_as_cs | utf8mb4 | 289 | | Yes | 0 | NO PAD |
| utf8mb4_czech_ci | utf8mb4 | 234 | | Yes | 8 | PAD SPACE |
| utf8mb4_danish_ci | utf8mb4 | 235 | | Yes | 8 | PAD SPACE |
| utf8mb4_da_0900_ai_ci | utf8mb4 | 267 | | Yes | 0 | NO PAD |
| utf8mb4_da_0900_as_cs | utf8mb4 | 290 | | Yes | 0 | NO PAD |
| utf8mb4_de_pb_0900_ai_ci | utf8mb4 | 256 | | Yes | 0 | NO PAD |
| utf8mb4_de_pb_0900_as_cs | utf8mb4 | 279 | | Yes | 0 | NO PAD |
| utf8mb4_eo_0900_ai_ci | utf8mb4 | 273 | | Yes | 0 | NO PAD |
| utf8mb4_eo_0900_as_cs | utf8mb4 | 296 | | Yes | 0 | NO PAD |
| utf8mb4_esperanto_ci | utf8mb4 | 241 | | Yes | 8 | PAD SPACE |
| utf8mb4_estonian_ci | utf8mb4 | 230 | | Yes | 8 | PAD SPACE |
| utf8mb4_es_0900_ai_ci | utf8mb4 | 263 | | Yes | 0 | NO PAD |
| utf8mb4_es_0900_as_cs | utf8mb4 | 286 | | Yes | 0 | NO PAD |
| utf8mb4_es_trad_0900_ai_ci | utf8mb4 | 270 | | Yes | 0 | NO PAD |
| utf8mb4_es_trad_0900_as_cs | utf8mb4 | 293 | | Yes | 0 | NO PAD |
| utf8mb4_et_0900_ai_ci | utf8mb4 | 262 | | Yes | 0 | NO PAD |
| utf8mb4_et_0900_as_cs | utf8mb4 | 285 | | Yes | 0 | NO PAD |
| utf8mb4_general_ci | utf8mb4 | 45 | | Yes | 1 | PAD SPACE |
| utf8mb4_german2_ci | utf8mb4 | 244 | | Yes | 8 | PAD SPACE |
| utf8mb4_hr_0900_ai_ci | utf8mb4 | 275 | | Yes | 0 | NO PAD |
| utf8mb4_hr_0900_as_cs | utf8mb4 | 298 | | Yes | 0 | NO PAD |
| utf8mb4_hungarian_ci | utf8mb4 | 242 | | Yes | 8 | PAD SPACE |
| utf8mb4_hu_0900_ai_ci | utf8mb4 | 274 | | Yes | 0 | NO PAD |
| utf8mb4_hu_0900_as_cs | utf8mb4 | 297 | | Yes | 0 | NO PAD |
| utf8mb4_icelandic_ci | utf8mb4 | 225 | | Yes | 8 | PAD SPACE |
| utf8mb4_is_0900_ai_ci | utf8mb4 | 257 | | Yes | 0 | NO PAD |
| utf8mb4_is_0900_as_cs | utf8mb4 | 280 | | Yes | 0 | NO PAD |
| utf8mb4_ja_0900_as_cs | utf8mb4 | 303 | | Yes | 0 | NO PAD |
| utf8mb4_ja_0900_as_cs_ks | utf8mb4 | 304 | | Yes | 24 | NO PAD |
| utf8mb4_latvian_ci | utf8mb4 | 226 | | Yes | 8 | PAD SPACE |
| utf8mb4_la_0900_ai_ci | utf8mb4 | 271 | | Yes | 0 | NO PAD |
| utf8mb4_la_0900_as_cs | utf8mb4 | 294 | | Yes | 0 | NO PAD |
| utf8mb4_lithuanian_ci | utf8mb4 | 236 | | Yes | 8 | PAD SPACE |
| utf8mb4_lt_0900_ai_ci | utf8mb4 | 268 | | Yes | 0 | NO PAD |
| utf8mb4_lt_0900_as_cs | utf8mb4 | 291 | | Yes | 0 | NO PAD |
| utf8mb4_lv_0900_ai_ci | utf8mb4 | 258 | | Yes | 0 | NO PAD |
| utf8mb4_lv_0900_as_cs | utf8mb4 | 281 | | Yes | 0 | NO PAD |
| utf8mb4_persian_ci | utf8mb4 | 240 | | Yes | 8 | PAD SPACE |
| utf8mb4_pl_0900_ai_ci | utf8mb4 | 261 | | Yes | 0 | NO PAD |
| utf8mb4_pl_0900_as_cs | utf8mb4 | 284 | | Yes | 0 | NO PAD |
| utf8mb4_polish_ci | utf8mb4 | 229 | | Yes | 8 | PAD SPACE |
| utf8mb4_romanian_ci | utf8mb4 | 227 | | Yes | 8 | PAD SPACE |
| utf8mb4_roman_ci | utf8mb4 | 239 | | Yes | 8 | PAD SPACE |
| utf8mb4_ro_0900_ai_ci | utf8mb4 | 259 | | Yes | 0 | NO PAD |
| utf8mb4_ro_0900_as_cs | utf8mb4 | 282 | | Yes | 0 | NO PAD |
| utf8mb4_ru_0900_ai_ci | utf8mb4 | 306 | | Yes | 0 | NO PAD |
| utf8mb4_ru_0900_as_cs | utf8mb4 | 307 | | Yes | 0 | NO PAD |
| utf8mb4_sinhala_ci | utf8mb4 | 243 | | Yes | 8 | PAD SPACE |
| utf8mb4_sk_0900_ai_ci | utf8mb4 | 269 | | Yes | 0 | NO PAD |
| utf8mb4_sk_0900_as_cs | utf8mb4 | 292 | | Yes | 0 | NO PAD |
| utf8mb4_slovak_ci | utf8mb4 | 237 | | Yes | 8 | PAD SPACE |
| utf8mb4_slovenian_ci | utf8mb4 | 228 | | Yes | 8 | PAD SPACE |
| utf8mb4_sl_0900_ai_ci | utf8mb4 | 260 | | Yes | 0 | NO PAD |
| utf8mb4_sl_0900_as_cs | utf8mb4 | 283 | | Yes | 0 | NO PAD |
| utf8mb4_spanish2_ci | utf8mb4 | 238 | | Yes | 8 | PAD SPACE |
| utf8mb4_spanish_ci | utf8mb4 | 231 | | Yes | 8 | PAD SPACE |
| utf8mb4_sv_0900_ai_ci | utf8mb4 | 264 | | Yes | 0 | NO PAD |
| utf8mb4_sv_0900_as_cs | utf8mb4 | 287 | | Yes | 0 | NO PAD |
| utf8mb4_swedish_ci | utf8mb4 | 232 | | Yes | 8 | PAD SPACE |
| utf8mb4_tr_0900_ai_ci | utf8mb4 | 265 | | Yes | 0 | NO PAD |
| utf8mb4_tr_0900_as_cs | utf8mb4 | 288 | | Yes | 0 | NO PAD |
| utf8mb4_turkish_ci | utf8mb4 | 233 | | Yes | 8 | PAD SPACE |
| utf8mb4_unicode_520_ci | utf8mb4 | 246 | | Yes | 8 | PAD SPACE |
| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 | PAD SPACE |
| utf8mb4_vietnamese_ci | utf8mb4 | 247 | | Yes | 8 | PAD SPACE |
| utf8mb4_vi_0900_ai_ci | utf8mb4 | 277 | | Yes | 0 | NO PAD |
| utf8mb4_vi_0900_as_cs | utf8mb4 | 300 | | Yes | 0 | NO PAD |
| utf8mb4_zh_0900_as_cs | utf8mb4 | 308 | | Yes | 0 | NO PAD |
| utf8_bin | utf8 | 83 | | Yes | 1 | PAD SPACE |
| utf8_croatian_ci | utf8 | 213 | | Yes | 8 | PAD SPACE |
| utf8_czech_ci | utf8 | 202 | | Yes | 8 | PAD SPACE |
| utf8_danish_ci | utf8 | 203 | | Yes | 8 | PAD SPACE |
| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 | PAD SPACE |
| utf8_estonian_ci | utf8 | 198 | | Yes | 8 | PAD SPACE |
| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 | PAD SPACE |
| utf8_general_mysql500_ci | utf8 | 223 | | Yes | 1 | PAD SPACE |
| utf8_german2_ci | utf8 | 212 | | Yes | 8 | PAD SPACE |
| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 | PAD SPACE |
| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 | PAD SPACE |
| utf8_latvian_ci | utf8 | 194 | | Yes | 8 | PAD SPACE |
| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 | PAD SPACE |
| utf8_persian_ci | utf8 | 208 | | Yes | 8 | PAD SPACE |
| utf8_polish_ci | utf8 | 197 | | Yes | 8 | PAD SPACE |
| utf8_romanian_ci | utf8 | 195 | | Yes | 8 | PAD SPACE |
| utf8_roman_ci | utf8 | 207 | | Yes | 8 | PAD SPACE |
| utf8_sinhala_ci | utf8 | 211 | | Yes | 8 | PAD SPACE |
| utf8_slovak_ci | utf8 | 205 | | Yes | 8 | PAD SPACE |
| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 | PAD SPACE |
| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 | PAD SPACE |
| utf8_spanish_ci | utf8 | 199 | | Yes | 8 | PAD SPACE |
| utf8_swedish_ci | utf8 | 200 | | Yes | 8 | PAD SPACE |
| utf8_tolower_ci | utf8 | 76 | | Yes | 1 | PAD SPACE |
| utf8_turkish_ci | utf8 | 201 | | Yes | 8 | PAD SPACE |
| utf8_unicode_520_ci | utf8 | 214 | | Yes | 8 | PAD SPACE |
| utf8_unicode_ci | utf8 | 192 | | Yes | 8 | PAD SPACE |
| utf8_vietnamese_ci | utf8 | 215 | | Yes | 8 | PAD SPACE |
+----------------------------+----------+-----+---------+----------+---------+---------------+
272 rows in set (0.02 sec)

上述执行结果显示了校对集名称(Collation)、对应字符集名称(Charset)、校对集ID(Id)、是否为对应字符集的默认校对集(Default)、是否已编译(Compiled)、排序的内存需求量(Sortlen)和排序时是否需要比较字符后面的空格(Pad_attribute)。

其中,校对集的名称由_分割的三部分组成:

  • 开头是对应的字符集;
  • 中间是国家名称或者general;
  • 结尾是cicsbinci表示不区分大小写,cs表示区分大小写,bin表示以二进制方式比较。

比如,gbk_chinese_ci代表gbk字符集、中国、不区分大小写。

2.2、查询元数据字典表

mysql> select * from information_schema.collations;
+----------------------------+--------------------+-----+------------+-------------+---------+---------------+
| COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN | PAD_ATTRIBUTE |
+----------------------------+--------------------+-----+------------+-------------+---------+---------------+
| armscii8_general_ci | armscii8 | 32 | Yes | Yes | 1 | PAD SPACE |
...
+----------------------------+--------------------+-----+------------+-------------+---------+---------------+
272 rows in set (0.00 sec)

参考:
https://dev.mysql.com/doc/refman/8.0/en/charset.html
https://zhuanlan.zhihu.com/p/147207165


Oracle MySQL最佳学习路线图(2020最新版)



GET!童鞋,你好棒呀,给我们一起点个赞。



我想了解职业晋升路线和课程学习指南

我想了解IT/互联网行业职业规划

我想了解世界编程语言排行榜


和AT阿宝哥一起加油!

版权声明
本文为[osc_6srwahuo]所创,转载请带上原文链接,感谢
https://my.oschina.net/u/4374580/blog/4711086

  1. 【计算机网络 12(1),尚学堂马士兵Java视频教程
  2. 【程序猿历程,史上最全的Java面试题集锦在这里
  3. 【程序猿历程(1),Javaweb视频教程百度云
  4. Notes on MySQL 45 lectures (1-7)
  5. [computer network 12 (1), Shang Xuetang Ma soldier java video tutorial
  6. The most complete collection of Java interview questions in history is here
  7. [process of program ape (1), JavaWeb video tutorial, baidu cloud
  8. Notes on MySQL 45 lectures (1-7)
  9. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  10. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  11. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  12. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  13. 【递归,Java传智播客笔记
  14. [recursion, Java intelligence podcast notes
  15. [adhere to painting for 386 days] the beginning of spring of 24 solar terms
  16. K8S系列第八篇(Service、EndPoints以及高可用kubeadm部署)
  17. K8s Series Part 8 (service, endpoints and high availability kubeadm deployment)
  18. 【重识 HTML (3),350道Java面试真题分享
  19. 【重识 HTML (2),Java并发编程必会的多线程你竟然还不会
  20. 【重识 HTML (1),二本Java小菜鸟4面字节跳动被秒成渣渣
  21. [re recognize HTML (3) and share 350 real Java interview questions
  22. [re recognize HTML (2). Multithreading is a must for Java Concurrent Programming. How dare you not
  23. [re recognize HTML (1), two Java rookies' 4-sided bytes beat and become slag in seconds
  24. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  25. RPC 1: how to develop RPC framework from scratch
  26. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  27. RPC 1: how to develop RPC framework from scratch
  28. 一次性捋清楚吧,对乱糟糟的,Spring事务扩展机制
  29. 一文彻底弄懂如何选择抽象类还是接口,连续四年百度Java岗必问面试题
  30. Redis常用命令
  31. 一双拖鞋引发的血案,狂神说Java系列笔记
  32. 一、mysql基础安装
  33. 一位程序员的独白:尽管我一生坎坷,Java框架面试基础
  34. Clear it all at once. For the messy, spring transaction extension mechanism
  35. A thorough understanding of how to choose abstract classes or interfaces, baidu Java post must ask interview questions for four consecutive years
  36. Redis common commands
  37. A pair of slippers triggered the murder, crazy God said java series notes
  38. 1、 MySQL basic installation
  39. Monologue of a programmer: despite my ups and downs in my life, Java framework is the foundation of interview
  40. 【大厂面试】三面三问Spring循环依赖,请一定要把这篇看完(建议收藏)
  41. 一线互联网企业中,springboot入门项目
  42. 一篇文带你入门SSM框架Spring开发,帮你快速拿Offer
  43. 【面试资料】Java全集、微服务、大数据、数据结构与算法、机器学习知识最全总结,283页pdf
  44. 【leetcode刷题】24.数组中重复的数字——Java版
  45. 【leetcode刷题】23.对称二叉树——Java版
  46. 【leetcode刷题】22.二叉树的中序遍历——Java版
  47. 【leetcode刷题】21.三数之和——Java版
  48. 【leetcode刷题】20.最长回文子串——Java版
  49. 【leetcode刷题】19.回文链表——Java版
  50. 【leetcode刷题】18.反转链表——Java版
  51. 【leetcode刷题】17.相交链表——Java&python版
  52. 【leetcode刷题】16.环形链表——Java版
  53. 【leetcode刷题】15.汉明距离——Java版
  54. 【leetcode刷题】14.找到所有数组中消失的数字——Java版
  55. 【leetcode刷题】13.比特位计数——Java版
  56. oracle控制用户权限命令
  57. 三年Java开发,继阿里,鲁班二期Java架构师
  58. Oracle必须要启动的服务
  59. 万字长文!深入剖析HashMap,Java基础笔试题大全带答案
  60. 一问Kafka就心慌?我却凭着这份,图灵学院vip课程百度云