How to deal with the uneven distribution of hash in Redis

itread01 2021-01-21 01:46:36


# Preface

`Redis` is a key-value database whose keys are stored by hashing. The whole of `Redis` can be thought of as an outer hash table. It is called the outer hash because `Redis` also provides a hash type internally, which can be called the inner hash. When we use a hash object to store data, the data therefore passes through two layers of hashing as far as `Redis` is concerned.

# The storage structure of hash objects

A hash object is itself a `key-value` storage structure, and its underlying storage can take one of two forms: `ziplist` (compressed list) and `hashtable` (hash table). The two storage structures are distinguished by their encoding:

| Encoding constant | Description | `object encoding` return value |
| -------------------- | ------------------------ | ------------------------- |
| OBJ_ENCODING_ZIPLIST | Hash object implemented with a compressed list | ziplist |
| OBJ_ENCODING_HT | Hash object implemented with a dictionary | hashtable |

## hashtable

In `Redis`, a `key-value` pair is wrapped in a `dictEntry` object, and the hash table wraps the `dictEntry` objects again; the result is the hash table object `dictht`:

```c
typedef struct dictht {
    dictEntry **table;      // hash table array
    unsigned long size;     // hash table size
    unsigned long sizemask; // mask used to compute index values, always equal to size - 1
    unsigned long used;     // number of nodes currently stored in the hash table
} dictht;
```

Note: `table` in the definition above is an array, and each of its elements is a `dictEntry` object.

### Dictionary

A dictionary, also known as a symbol table (symbol table), associative array (associative array), or map (map), nests the hash table `dictht` inside it. Here is the definition of the dictionary `dict`:

```c
typedef struct dict {
    dictType *type;          // type-specific functions of the dictionary
    void *privdata;          // private data that the type functions may need
    dictht ht[2];            // hash tables (note that there are 2 of them)
    long rehashidx;          // rehash index; -1 when no rehash is in progress
    unsigned long iterators; // number of iterators currently in use
} dict;
```

`dictType` defines several commonly used functions; its structure is defined as follows:

```c
typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);                             // compute the hash value of a key
    void *(*keyDup)(void *privdata, const void *key);                      // copy a key
    void *(*valDup)(void *privdata, const void *obj);                      // copy a value
    int (*keyCompare)(void *privdata, const void *key1, const void *key2); // compare two keys
    void (*keyDestructor)(void *privdata, void *key);                      // destroy a key
    void (*valDestructor)(void *privdata, void *obj);                      // destroy a value
} dictType;
```

When we create a hash object, we get the layout shown below (some attributes are omitted):

![](https://img2020.cnblogs.com/blog/2232223/202101/2232223-20210120213440752-2100115347.png)

### rehash operation

`dict` defines an array `ht[2]` containing two hash tables: `ht[0]` and `ht[1]`. By default `Redis` only uses `ht[0]`; it does not use `ht[1]` and does not allocate any space for it. When a hash object is set, which slot of the hash array (`dictEntry[3]` in the figure above) it lands in is determined by computing the hash value, as shown in the sketch below.
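To make that slot-selection step concrete, here is a minimal, self-contained C sketch. It is not the actual Redis source: `toy_hash` is a made-up stand-in for Redis's real hash function (Redis uses SipHash). It only illustrates why `sizemask` is kept equal to `size - 1`: when the table size is a power of two, `hash & sizemask` is equivalent to `hash % size` and directly yields the slot index.

```c
#include <stdio.h>

/* Toy stand-in for Redis's real hash function (Redis uses SipHash). */
static unsigned long toy_hash(const char *key) {
    unsigned long h = 5381;                          /* djb2-style string hash */
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h;
}

int main(void) {
    unsigned long size = 4;            /* hash table size: a power of two   */
    unsigned long sizemask = size - 1; /* mask used to compute index values */

    const char *fields[] = { "country", "city", "street" };
    for (int i = 0; i < 3; i++) {
        unsigned long h = toy_hash(fields[i]);
        /* h & sizemask == h % size because size is a power of two */
        printf("%-8s -> slot %lu\n", fields[i], h & sizemask);
    }
    return 0;
}
```

Keeping the table size a power of two is what turns the modulo into a cheap bit mask, and it is also why the rehash described next always picks a power of two for the new table size.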
If a hash collision occurs (two keys produce the same slot index), the same slot holds multiple `dictEntry` nodes, which form a linked list (the rightmost chain pointing to `NULL` in the figure above). Note that the most recently inserted element always sits at the head of the list (when there is a hash collision, the new node is always placed at the head of the linked list). When reading a slot that holds multiple elements, the linked list must be traversed, so the longer the list, the worse the performance.

To keep the hash table performant, a `rehash` (re-hashing) operation is performed on the hash table when either of the following two conditions is met:

- The load factor is greater than or equal to `1` and `dict_can_resize` is `1`.
- The load factor is greater than or equal to the forced-resize threshold (`dict_force_resize_ratio = 5`).

PS: load factor = number of nodes used in the hash table / hash table size (that is, `ht[0].used / ht[0].size`).

### rehash steps

Both expanding and shrinking the hash table are done by performing a `rehash`, which involves allocating and releasing space. There are five steps:

1. Allocate space for the dictionary's `ht[1]` hash table. Its size depends on the number of nodes currently stored in the hash table (that is, `ht[0].used`):
   - For an expansion, the size of `ht[1]` is the first power of `2` that is greater than or equal to `ht[0].used * 2`. For example, if `used = 3`, then `ht[0].used * 2 = 6`, and the first power of 2 not smaller than 6 is `2^3 = 8` (since `2^2 < 6` and `2^3 > 6`); see the sketch below.
   - For a shrink, the size of `ht[1]` is the first power of 2 that is greater than or equal to `ht[0].used`.
2. Set the dictionary attribute `rehashidx` to `0`, which indicates that a `rehash` is in progress.
3. Recompute the hash value of every key-value pair in `ht[0]` in turn and place it in the corresponding slot of `ht[1]`. Each time one key-value pair is rehashed, `rehashidx` is incremented by `1`.
4. Once all key-value pairs in `ht[0]` have been migrated to `ht[1]`, release `ht[0]`, make `ht[1]` the new `ht[0]`, and create a new empty `ht[1]` in preparation for the next `rehash`.
5. Set the dictionary attribute `rehashidx` to `-1`; this `rehash` is finished, and the dictionary waits for the next one.

### Progressive rehash

This rehashing in Redis **is not done all at once; instead, the key-value pairs in `ht[0]` are rehashed to `ht[1]` gradually, over many small steps, which is why the operation is called progressive `rehash`**. Progressive `rehash` avoids the huge amount of computation a one-shot `rehash` would require; it is the idea of divide and conquer. During a progressive `rehash`, new key-value pairs may still be inserted; in this case **Redis puts the newly added key-value pairs into `ht[1]`, which guarantees that the number of key-value pairs in `ht[0]` only ever decreases**.
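The following minimal C sketch (an illustration only, not the actual Redis source; the function names are made up for this example) puts the two ideas above together: the load-factor check that triggers a rehash, and the rule from step 1 that picks the new `ht[1]` size as the first power of two greater than or equal to `ht[0].used * 2`.

```c
#include <stdio.h>

/* First power of two greater than or equal to n (the expansion target size). */
static unsigned long next_power_of_two(unsigned long n) {
    unsigned long size = 1;
    while (size < n) size *= 2;
    return size;
}

/* Load factor = used / size; rehash when it reaches 1 (if resizing is allowed)
 * or when it reaches the forced threshold of 5. */
static int needs_rehash(unsigned long used, unsigned long size, int dict_can_resize) {
    double load_factor = (double)used / (double)size;
    return (dict_can_resize && load_factor >= 1.0) || load_factor >= 5.0;
}

int main(void) {
    unsigned long used = 3, size = 2;
    printf("load factor %.2f, rehash needed: %d\n",
           (double)used / (double)size, needs_rehash(used, size, 1));
    /* used = 3 -> used * 2 = 6 -> first power of two >= 6 is 8, as in the example above */
    printf("new ht[1] size: %lu\n", next_power_of_two(used * 2));
    return 0;
}
```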
While a `rehash` is in progress, if the server receives a lookup command from a client, it **checks `ht[0]` first; if nothing is found there, it then queries `ht[1]`**.

## ziplist

Previous articles have already analyzed the characteristics of `ziplist` separately; to learn more, [click here](https://www.cnblogs.com/lonely-wolf/p/14281136.html). Note, however, that a hash object's `ziplist` differs from a list object's `ziplist`: because a hash object stores `key-value` pairs, its `ziplist` also stores them as `key-value` pairs, with each `key` and its `value` placed next to each other:

![](https://img2020.cnblogs.com/blog/2232223/202101/2232223-20210120213514656-935498919.png)

## ziplist and hashtable

A hash object uses the `ziplist` encoding only while both of the following conditions hold (a small sketch of this rule follows the verification below):

- Every key and value stored in the hash object is shorter than `64` bytes (this threshold is controlled by the `hash-max-ziplist-value` parameter).
- The number of key-value pairs in the hash object is less than or equal to `512` (this threshold is controlled by the `hash-max-ziplist-entries` parameter).

If either of these conditions is violated, the hash object switches to the `hashtable` encoding.

## Common commands for hash objects

- hset key field value: set a single `field` (that is, a key of the hash object).
- hmset key field1 value1 field2 value2: set multiple `field`s.
- hsetnx key field value: set `field` in the hash stored at `key` to `value` only if `field` does not already exist; otherwise do nothing.
- hget key field: get the `value` of the field `field` in the hash stored at `key`.
- hmget key field1 field2: get the `value`s of multiple fields in the hash stored at `key`.
- hdel key field1 field2: delete one or more `field`s from the hash stored at `key`.
- hlen key: return the number of fields in the hash stored at `key`.
- hincrby key field increment: add the increment `increment` to the value of `field` in the hash stored at `key`; `increment` can be negative, and an error is returned if the value of `field` is not an integer.
- hincrbyfloat key field increment: add the increment `increment` to the value of `field` in the hash stored at `key`; `increment` can be negative, and an error is returned if the value of `field` is not a `float`.
- hkeys key: get all fields in the hash stored at `key`.
- hvals key: get the values of all fields in the hash stored at `key`.

Knowing the common commands for manipulating hash objects, we can verify the types and encodings of hash objects described above. Before testing, to avoid interference from other keys, first run the `flushall` command to clear the `Redis` database, then execute the following commands in turn:

```
hset address country china
type address
object encoding address
```

The results are as follows:

![](https://img2020.cnblogs.com/blog/2232223/202101/2232223-20210120213546323-1888508446.png)

You can see that when the hash object contains only one key-value pair, the underlying encoding is `ziplist`.
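As promised above, here is a minimal C sketch of the encoding rule we just verified. It is an illustration only, not the actual Redis source; the function and parameter names are made up for this example. A hash object keeps the `ziplist` encoding only while both thresholds hold, and falls back to `hashtable` as soon as either is exceeded.

```c
#include <stddef.h>
#include <stdio.h>

/* Decide the encoding of a hash object from the length of its longest key or
 * value and its number of fields, given the two configurable thresholds. */
static const char *hash_encoding(size_t longest_entry_len, size_t num_fields,
                                 size_t max_ziplist_value,   /* hash-max-ziplist-value,   default 64  */
                                 size_t max_ziplist_entries) /* hash-max-ziplist-entries, default 512 */
{
    if (longest_entry_len < max_ziplist_value && num_fields <= max_ziplist_entries)
        return "ziplist";
    return "hashtable";
}

int main(void) {
    /* One short field with the default thresholds: stays a ziplist. */
    printf("%s\n", hash_encoding(7, 1, 64, 512));
    /* Three fields once hash-max-ziplist-entries is lowered to 2: becomes a hashtable. */
    printf("%s\n", hash_encoding(6, 3, 64, 2));
    return 0;
}
```

The second call mirrors the experiment that follows: lowering `hash-max-ziplist-entries` below the number of fields forces the `hashtable` encoding.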
Now change the `hash-max-ziplist-entries` parameter to `2`, restart `Redis`, and finally enter the following commands to test:

```
hmset key field1 value1 field2 value2 field3 value3
object encoding key
```

The output gives the following result:

![](https://img2020.cnblogs.com/blog/2232223/202101/2232223-20210120213612035-857436495.png)

You can see that the encoding has become `hashtable`.

# Summary

This article mainly introduced the underlying storage structures used by the hash type, one of the `5` common data types in `Redis`, how its `hashtable` encoding works, and how `Redis` re-hashes when the `hash` distribution becomes uneven. Finally, we went through some common commands of hash objects and verified the conclusions of this article with a few examples.
Copyright notice
This article was created by [itread01]. Please include a link to the original when reposting. Thank you.
https://javamana.com/2021/01/20210121014355162S.html
