Issue 16: index design (index structure of MySQL)

Aikesheng cloud database 2020-11-11 16:40:58
issue index design index structure


image

In the last chapter, the database is basically used B+ The reason why trees store indexes : Suitable for disk storage , It can make full use of the characteristics of multi fork balanced tree , Disk read ahead , And good support for equivalence , Range , Sequential scanning, etc . This article mainly introduces MySQL Two common engines ,MyISAM and InnoDB The index organization of , Learn about these ways of storage , Good for database optimization .

MySQL The index of is divided into two categories according to the storage mode :

Clustered index : Also known as Clustered Index. The physical order of relational table records is the same as the logical order of indexes . Because a table can only be stored in one physical order , A table can only have one clustered index at most . Compared to nonclustered indexes , Clustered index has faster retrieval speed .

MySQL Only in Li INNODB Tables support clustered indexes ,INNODB Table data itself is a clustered index , That is to say IOT, Index organization table . Non leaf nodes are stored in primary key order , The leaf node stores the primary key and the corresponding row records . So for INNODB It's very fast to scan the whole table sequentially .

Nonclustered indexes : Also called Secondary Index. It means that non leaf nodes are stored according to the key value order of the index , The leaf node stores the index key value and the corresponding primary key value .MySQL In addition to INNODB Outside the primary key of the table , The others are secondary indexes .MYISAM,memory The table indexes of the engine are all nonclustered indexes . To put it simply , The index is stored separately from the row data . A table can have multiple secondary indexes . 

MYISAM surface :

MYISAM A table is a typical separation of data and index storage , There is no essential difference between a primary key and a secondary index . For example MYISAM Table inside primary key 、 The only index is the same , There is no essential difference .

Hypothesis table t1 by MYISAM engine , As a ID, full name , Gender , Age , Phone number . among ID Primary key , Age is a secondary index . Record the following :

image

The corresponding two B+ The tree index is shown in the following figure ,

Primary key field index tree :

image

The image above is a 3 Step B+ Trees , Non leaf nodes are sorted according to the value of the primary key , Leaf nodes are also sorted according to the value of the primary key , And contains a pointer to the physical data row on the disk . 

Age field index tree :

image

The index tree in the age field above is also a 3 Step B+ Trees , Non leaf nodes are stored in the order of values in the age field , The leaf node holds the value of the age field and a pointer to the physical data row on the disk .

As can be seen from the two pictures above ,MYISAM The biggest disadvantage of the index storage method of the table is that it is not stored in the order of the physical data rows , In this way, both the primary key retrieval and the secondary index retrieval need to be sorted twice . 

Let me give you a simple example ,

following SQL 1 There is no sort by default , Out of order output ; Need to follow ID Sequential output , You have to use it. SQL 2, Explicitly add ORDER BY .

mysql
# SQL 1
mysql> select * from t1;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | floret | Woman | 18 | 18501877098 |
| 10005 | petty thief | Woman | 21 | 15827654555 |
| 10006 | The small white | male | 38 | 19929933000 |
| 10009 | Xiao He | male | 35 | 19012378676 |
| 10002 | Xiao Wang | male | 20 | 17760500293 |
| 10003 | Xiao zhao | Woman | 29 | 13581386000 |
| 10004 | indigo plant | Woman | 25 | 13456712000 |
| 10007 | millet | male | 23 | 19800092354 |
| 10008 | Xiao Xu | Woman | 22 | 18953209331 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)
# SQL 2
mysql> select * from t1 order by id;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | floret | Woman | 18 | 18501877098 |
| 10002 | Xiao Wang | male | 20 | 17760500293 |
| 10003 | Xiao zhao | Woman | 29 | 13581386000 |
| 10004 | indigo plant | Woman | 25 | 13456712000 |
| 10005 | petty thief | Woman | 21 | 15827654555 |
| 10006 | The small white | male | 38 | 19929933000 |
| 10007 | millet | male | 23 | 19800092354 |
| 10008 | Xiao Xu | Woman | 22 | 18953209331 |
| 10009 | Xiao He | male | 35 | 19012378676 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)

So let's see INNODB The composition of primary key index and secondary index .

INNODB surface :

INNODB The table itself is an index organized table , In other words, index is data . Here's the chart T1 The data rows are displayed in the form of clustered indexes , The non leaf node holds the value of the primary key , The leaf node holds the value of the primary key and the corresponding data row , And each page has a pointer to the front and back pages .

INNODB A watch is different from MYISAM,INNODB Tables have their own data page management , Default 16KB.MYISAM The management of table data depends on the file system , For example, the file system generally defaults to 4KB,MYISAM The size of the block is also 4KB,MYISAM Table does not have its own crash recovery mechanism , It all depends on the file system .

image

INNODB There are two advantages of this design :

  1. Data is stored in primary key order . The order of the primary keys is the physical order of the record rows , Compared to the storage of pointer to data row , Avoid reordering . We know , Sorting costs the most . Now watch t1 It's based on the primary key ID Sort .
mysql
mysql> select * from t1;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | floret | Woman | 18 | 18501877098 |
| 10002 | Xiao Wang | male | 20 | 17760500293 |
| 10003 | Xiao zhao | Woman | 29 | 13581386000 |
| 10004 | indigo plant | Woman | 25 | 13456712000 |
| 10005 | petty thief | Woman | 21 | 15827654555 |
| 10006 | The small white | male | 38 | 19929933000 |
| 10007 | millet | male | 23 | 19800092354 |
| 10008 | Xiao Xu | Woman | 22 | 18953209331 |
| 10009 | Xiao He | male | 35 | 19012378676 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)
  1. Two leaf nodes contain pointers to the first and second nodes respectively , In this way, when inserting new lines or splitting pages , Just move the corresponding pointer .

And then look at INNODB Secondary index of table , As shown in the figure below :

image

INNODB The non leaf nodes of the secondary index hold the field values of the index , The index in the figure above is the table t1 Field of age. The leaf node contains the index field value and the corresponding primary key value .

The advantage of this is that when there is data row movement or data page splitting , Unnecessary maintenance of the secondary index . When data needs to be updated , The secondary index does not need to be rebuilt , Just modify the cluster index .

But there are drawbacks :

  1. The secondary index keeps the primary key value at the same time , It's going to get bigger . Especially when the primary key design is unreasonable , For example, use UUID Do primary key . In the next article, I'll explain in detail how to design a reasonable primary key .
  2. The retrieval of the secondary index needs to retrieve the index tree twice . The first time by retrieving the secondary index leaf node , Find the primary key value corresponding to the filter row ; The second time through the value of the primary key to the cluster index to find the corresponding row .

for instance :

as follows SQL sentence , The retrieval age was 23 The line record of :

mysql
select * from t1 where age = 23;

It will be split into the following two SQL sentence :

First through the index field age Find the corresponding primary key of the value :10005.

mysql
select id from t1 where age=23;

Then go to the cluster index according to the primary key ID = 10005 Retrieve the required data row , If the table reads for the first time , You need to go back to the table .

mysql
select * from t1 where id = 10005;

however MySQL It's a good optimization of this , The data was warmed up in advance ( Data preheating , I won't talk about it here , You can refer to MySQL manual , It's very detailed in the manual ).

The content of this article is introduced here , A brief review of this article . This paper mainly introduces MySQL Two common engines MYISAM and INNODB And their advantages and disadvantages . If you have any questions, you are welcome to criticize , In the next article, I'll introduce MySQL How to design the primary key well .


About MySQL Technical content of , What else do you want to know ? Leave a message and tell Xiaobian !

image

版权声明
本文为[Aikesheng cloud database]所创,转载请带上原文链接,感谢

  1. 【计算机网络 12(1),尚学堂马士兵Java视频教程
  2. 【程序猿历程,史上最全的Java面试题集锦在这里
  3. 【程序猿历程(1),Javaweb视频教程百度云
  4. Notes on MySQL 45 lectures (1-7)
  5. [computer network 12 (1), Shang Xuetang Ma soldier java video tutorial
  6. The most complete collection of Java interview questions in history is here
  7. [process of program ape (1), JavaWeb video tutorial, baidu cloud
  8. Notes on MySQL 45 lectures (1-7)
  9. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  10. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  11. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  12. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  13. 【递归,Java传智播客笔记
  14. [recursion, Java intelligence podcast notes
  15. [adhere to painting for 386 days] the beginning of spring of 24 solar terms
  16. K8S系列第八篇(Service、EndPoints以及高可用kubeadm部署)
  17. K8s Series Part 8 (service, endpoints and high availability kubeadm deployment)
  18. 【重识 HTML (3),350道Java面试真题分享
  19. 【重识 HTML (2),Java并发编程必会的多线程你竟然还不会
  20. 【重识 HTML (1),二本Java小菜鸟4面字节跳动被秒成渣渣
  21. [re recognize HTML (3) and share 350 real Java interview questions
  22. [re recognize HTML (2). Multithreading is a must for Java Concurrent Programming. How dare you not
  23. [re recognize HTML (1), two Java rookies' 4-sided bytes beat and become slag in seconds
  24. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  25. RPC 1: how to develop RPC framework from scratch
  26. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  27. RPC 1: how to develop RPC framework from scratch
  28. 一次性捋清楚吧,对乱糟糟的,Spring事务扩展机制
  29. 一文彻底弄懂如何选择抽象类还是接口,连续四年百度Java岗必问面试题
  30. Redis常用命令
  31. 一双拖鞋引发的血案,狂神说Java系列笔记
  32. 一、mysql基础安装
  33. 一位程序员的独白:尽管我一生坎坷,Java框架面试基础
  34. Clear it all at once. For the messy, spring transaction extension mechanism
  35. A thorough understanding of how to choose abstract classes or interfaces, baidu Java post must ask interview questions for four consecutive years
  36. Redis common commands
  37. A pair of slippers triggered the murder, crazy God said java series notes
  38. 1、 MySQL basic installation
  39. Monologue of a programmer: despite my ups and downs in my life, Java framework is the foundation of interview
  40. 【大厂面试】三面三问Spring循环依赖,请一定要把这篇看完(建议收藏)
  41. 一线互联网企业中,springboot入门项目
  42. 一篇文带你入门SSM框架Spring开发,帮你快速拿Offer
  43. 【面试资料】Java全集、微服务、大数据、数据结构与算法、机器学习知识最全总结,283页pdf
  44. 【leetcode刷题】24.数组中重复的数字——Java版
  45. 【leetcode刷题】23.对称二叉树——Java版
  46. 【leetcode刷题】22.二叉树的中序遍历——Java版
  47. 【leetcode刷题】21.三数之和——Java版
  48. 【leetcode刷题】20.最长回文子串——Java版
  49. 【leetcode刷题】19.回文链表——Java版
  50. 【leetcode刷题】18.反转链表——Java版
  51. 【leetcode刷题】17.相交链表——Java&python版
  52. 【leetcode刷题】16.环形链表——Java版
  53. 【leetcode刷题】15.汉明距离——Java版
  54. 【leetcode刷题】14.找到所有数组中消失的数字——Java版
  55. 【leetcode刷题】13.比特位计数——Java版
  56. oracle控制用户权限命令
  57. 三年Java开发,继阿里,鲁班二期Java架构师
  58. Oracle必须要启动的服务
  59. 万字长文!深入剖析HashMap,Java基础笔试题大全带答案
  60. 一问Kafka就心慌?我却凭着这份,图灵学院vip课程百度云