Redis Core Principles and Practice: the implementation of the hash type and the dictionary structure

binecy 2021-10-14 07:48:03
redis core principle practice implementation


The Redis hash type stores an unordered collection of key-value pairs, and is especially suitable for storing an object's data.

HSET fruit name apple price 7.6 origin china
(integer) 3
HGET fruit price
"7.6"

This article analyzes the Redis hash type and the implementation of its underlying data structure, the dictionary.

Dictionaries

Redis usually uses the dictionary structure to store user hash data.
The dictionary is one of Redis's most important data structures; besides the hash type, the Redis database itself also uses it.
Redis implements the dictionary with a hash table. When analyzing a hash table, we usually focus on the following questions:
(1) Which hash algorithm is used?
(2) How are hash collisions resolved?
(3) How is the hash table expanded?

Tip: unless otherwise specified, the code in this chapter is from dict.h and dict.c.

Definition

A key-value pair in the dictionary is defined as follows:

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;
  • key, v: the key and the value.

  • next: pointer to the next key-value pair. This shows that the Redis dictionary resolves hash collisions by chaining (linked lists).

Tip: the C union keyword declares a union; all members of a union share the same storage, so only one member's value can be stored at a time. In other words, dictEntry.v can hold only one of val, u64, s64, or d at any moment. Applying the sizeof operator to a union yields a size no smaller than that of its largest member.
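This sharing behavior can be demonstrated with a minimal C sketch; the type and function names below are illustrative, not from the Redis source:

```c
#include <stddef.h>
#include <stdint.h>

/* A value union like dictEntry.v: all members share the same storage,
 * so only one of them holds a meaningful value at a time. */
typedef union demoValue {
    void *val;
    uint64_t u64;
    int64_t s64;
    double d;
} demoValue;

/* sizeof a union is at least the size of its largest member. */
size_t demo_value_size(void) { return sizeof(demoValue); }

/* Writing one member overwrites the storage of the others. */
uint64_t demo_overwrite(void) {
    demoValue v;
    v.s64 = -1;   /* sets all 64 bits */
    v.u64 = 42;   /* reuses the same bytes; the s64 value is gone */
    return v.u64;
}
```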

The hash table in the dictionary is defined as follows:

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;
  • table: the hash table array, which stores the data.

  • size: the length of the hash table array.

  • sizemask: the index mask, always equal to size - 1; a key's array index is computed as hash & sizemask.

  • used: the number of key-value pairs stored.

The structure of dictht is shown in Figure 3-1.

[Figure 3-1]

A dictionary is defined as follows:

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; 
    unsigned long iterators;
} dict;
  • type: function pointers for operating on the stored data.

  • privdata: private data passed to the dictType functions.

  • ht[2]: two hash tables, used to implement dictionary expansion. Normally only ht[0] is used; during expansion, ht[1] is created, and data is gradually moved from ht[0] to ht[1] as the dictionary is operated on.

  • rehashidx: the ht[0] array index to be migrated by the next rehash step; -1 means no expansion is currently in progress.

  • iterators: the number of iterators currently running; iterators are used to traverse the dictionary's key-value pairs.

dictType defines the function pointers used to manipulate the data in the dictionary; these functions are responsible for copying, comparing, and otherwise operating on keys and values:

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

Because dictType supplies the function pointers that operate on the data, dictionaries can store different types of data. Within one dictionary, the keys and the values may be of different types from each other, but all keys must be of the same type, and all values must be of the same type.
Redis defines a different dictType for each kind of dictionary, such as server.c/dbDictType used by the database and server.c/setDictType used by the set type.
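The function-pointer dispatch can be sketched with a miniature dictType. Everything below is an illustrative stand-in: the struct is trimmed, the hash is a toy (djb2) rather than Redis's SipHash, and the names are invented:

```c
#include <stdint.h>
#include <string.h>

/* A miniature dictType-style vtable (illustrative, not Redis's). */
typedef struct miniDictType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(const void *key1, const void *key2);
} miniDictType;

/* Toy string hash (djb2) standing in for SipHash. */
static uint64_t str_hash(const void *key) {
    uint64_t h = 5381;
    for (const unsigned char *p = key; *p; p++) h = h * 33 + *p;
    return h;
}

/* Returns 1 when the keys match, mirroring dict's keyCompare contract. */
static int str_compare(const void *k1, const void *k2) {
    return strcmp(k1, k2) == 0;
}

const miniDictType stringDictType = { str_hash, str_compare };

/* Dispatch through the type, as the dictHashKey/dictCompareKeys macros do. */
uint64_t mini_hash(const miniDictType *t, const void *key) {
    return t->hashFunction(key);
}
int mini_equal(const miniDictType *t, const void *a, const void *b) {
    return t->keyCompare(a, b);
}
```

Swapping in a different miniDictType would let the same dictionary code handle a different key type, which is exactly the flexibility the real dictType provides.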

Operational analysis

The dictAddRaw function inserts a key into the dictionary, or finds it if it already exists:

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;
    // [1]
    if (dictIsRehashing(d)) _dictRehashStep(d);

    // [2]
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;
    // [3]
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    // [4]
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    // [5]
    dictSetKey(d, entry, key);
    return entry;
}

Parameter description:

  • existing: if the key already exists in the dictionary, the corresponding dictEntry pointer is assigned to *existing and the function returns NULL; otherwise the newly created dictEntry is returned.

【1】 If the dictionary is expanding, perform a single rehash step.
【2】 Compute the hash table array index for the key. A result of -1 means the key already exists, in which case dictAddRaw returns NULL.
【3】 If the dictionary is expanding, add the new dictEntry to ht[1]; otherwise add it to ht[0].
【4】 Create the dictEntry and insert it at the head of the linked list in the corresponding slot of the hash table array. The Redis dictionary resolves hash collisions by chaining, so each element of the array is a linked list.
【5】 Store the key in the dictEntry.

dictAddRaw only inserts the key; the corresponding value is not inserted. The value can be set through the returned dictEntry:

    entry = dictAddRaw(dict,mykey,NULL);
    if (entry != NULL) dictSetSignedIntegerVal(entry,1000);

Hash algorithm

The dictHashKey macro calls the dictType.hashFunction function to compute the hash value of a key:

#define dictHashKey(d, key) (d)->type->hashFunction(key)

Redis dictionaries mostly use the SipHash algorithm (the hashFunction members of server.c/dbDictType, server.c/setDictType, and the other dictType definitions all point to SipHash implementations). SipHash effectively defends against hash collision (hash-flooding) attacks while providing good performance.
Hash algorithms involve a fair amount of mathematics; this book does not discuss their principles and implementation, and interested readers can study the relevant code themselves.

Tip: before Redis 4.0, the hash algorithm used was MurmurHash. Even when the input keys follow a pattern, its results remain well distributed, and it is very fast. Redis 4.0 switched to SipHash, presumably for security reasons.

After computing a key's hash value, the key's hash table array index must also be computed:

static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
{
    unsigned long idx, table;
    dictEntry *he;
    if (existing) *existing = NULL;

    // [1]
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return -1;
    // [2]
    for (table = 0; table <= 1; table++) {
        idx = hash & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                if (existing) *existing = he;
                return -1;
            }
            he = he->next;
        }
        // [3]
        if (!dictIsRehashing(d)) break;
    }
    return idx;
}

【1】 Expand or initialize the hash table if needed.
【2】 Traverse ht[0] and ht[1], compute the key's hash table array index in each, and check whether the key already exists in the table. If it does, assign the corresponding dictEntry to *existing and return -1.
【3】 If no expansion is in progress, exit after computing the index in ht[0]; there is no need to compute an index in ht[1].

Expansion

Redis expands dictionaries incrementally. The reason for this design is that Redis is single-threaded: if all of ht[0]'s data were migrated to ht[1] in one operation, the thread could be blocked for a long time. Instead, every time a rehashing dictionary is operated on, Redis performs a single rehash step, which migrates the data in ht[0].table[rehashidx] to ht[1]. Once all of ht[0]'s data has been migrated, ht[1] is assigned to ht[0] and the expansion is complete.
The _dictExpandIfNeeded function decides whether the hash table needs to be expanded:

static int _dictExpandIfNeeded(dict *d)
{
    ...

    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}

Expansion requires two conditions:
(1) d->ht[0].used >= d->ht[0].size: the number of key-value pairs stored in the hash table is at least the length of the hash table array.
(2) dict_can_resize is enabled, or the load factor is greater than dict_force_resize_ratio.
The load factor is d->ht[0].used / d->ht[0].size, i.e. the number of stored key-value pairs divided by the array length. dict_can_resize is on by default, so the table normally expands once the load factor reaches 1. A load factor of 1 allows a relatively high collision rate, but it improves the hash table's memory efficiency. When dict_can_resize is off, expansion is forced only once the load factor exceeds dict_force_resize_ratio (5). dict_can_resize cannot be turned off through user configuration; Redis toggles it internally in connection with persistence, and we will return to it when analyzing Redis persistence.
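The two conditions can be condensed into a small predicate. This is a simplified sketch of the decision in _dictExpandIfNeeded, not the actual Redis code; the function name is invented:

```c
/* Simplified sketch of the expansion decision in _dictExpandIfNeeded. */
static const unsigned long dict_force_resize_ratio = 5;

int needs_expand(unsigned long used, unsigned long size, int dict_can_resize) {
    if (size == 0) return 1;       /* empty table: initialize rather than expand */
    return used >= size &&         /* load factor >= 1, and either ...          */
           (dict_can_resize ||     /* ... resizing is currently allowed, or     */
            used / size > dict_force_resize_ratio); /* factor > 5: forced       */
}
```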

The dictExpand function starts an expansion:

int dictExpand(dict *d, unsigned long size)
{
    ...
    // [1]
    dictht n; 
    unsigned long realsize = _dictNextPower(size);
    ...

    // [2]
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    // [3]
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    // [4]
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

Parameter description:

  • size: the requested length of the new hash table array.

【1】 The _dictNextPower function rounds size up to the next power of 2.
【2】 Build the new hash table dictht.
【3】 ht[0].table == NULL means the dictionary's hash table array has not been initialized yet, so the new dictht is assigned to ht[0], which can then store data. This is not an expansion but the one-time initialization performed before the dictionary is first used.
【4】 Otherwise, assign the new dictht to ht[1] and set rehashidx to 0. rehashidx is the index of the ht[0] array slot that the next rehash step will migrate.
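The rounding in step 【1】 can be sketched as follows; this is a simplified stand-in for _dictNextPower (the real function also guards against overflow):

```c
/* Sketch of _dictNextPower: round the requested size up to the next
 * power of two, starting from the initial table size of 4. */
unsigned long next_power(unsigned long size) {
    unsigned long i = 4;       /* DICT_HT_INITIAL_SIZE */
    while (i < size) i *= 2;
    return i;
}
```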

Why round size up to a power of 2? This makes ht[1]'s array length a multiple of ht[0]'s, which helps migrate ht[0]'s data evenly into ht[1].
Recall how a key's array index is computed: idx = hash & ht.sizemask. Because sizemask = size - 1 and size is a power of 2, this is equivalent to idx = hash % ht.size.
So if ht[0].size is n and ht[1].size is 2n, every entry in ht[0].table[k] moves either to ht[1].table[k] or to ht[1].table[k+n]. The data in one slot of ht[0].table is thus split across exactly two slots of ht[1].
Figure 3-2 shows a simple example.

[Figure 3-2]
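Both the mask/modulo equivalence and the k / k+n split can be checked directly. This is a standalone sketch; the function name and the table sizes are chosen for illustration:

```c
#include <stdint.h>

/* Index computation used by the dict: for a power-of-two size,
 * masking with size-1 equals taking the hash modulo size. */
unsigned long dict_index(uint64_t hash, unsigned long size) {
    return hash & (size - 1);
}
```

For any hash, dict_index(hash, 2*n) is either dict_index(hash, n) or dict_index(hash, n) + n, depending only on one extra bit of the hash, which is why each old slot splits cleanly into two new ones.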

The _dictRehashStep function performs a single rehash step, migrating the data of one ht[0] slot to ht[1]. It is called by dictAddRaw, dictGenericDelete, dictFind, dictGetRandomKey, dictGetSomeKeys, and other functions, so data migrates to the new hash table little by little as the dictionary is used.
_dictRehashStep calls the dictRehash function to do the actual work:

int dictRehash(dict *d, int n) {
    int empty_visits = n*10;
    // [1]
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        // [2]
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        // [3]
        de = d->ht[0].table[d->rehashidx];
        while(de) {
            uint64_t h;

            nextde = de->next;
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    // [4]
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

Parameter description:

  • n: the maximum number of non-empty array slots to migrate in this call.

【1】 If the dictionary is not currently expanding, return immediately.
【2】 Starting from rehashidx, find the first non-empty slot.
If more than n*10 empty slots are visited, return without migrating further.
【3】 Traverse the linked list in that slot.
Compute each entry's index in ht[1]'s array and move the entry to ht[1].
【4】 ht[0].used == 0 means all of ht[0]'s data has been migrated to ht[1].
Free ht[0].table, assign ht[1] to ht[0], reset ht[1], and set rehashidx to -1: the expansion is complete.

Shrinking

After a delete operation, Redis checks whether the dictionary needs to shrink: when the hash table array length is greater than 4 and the load factor is below 0.1, the table is shrunk to save memory. Shrinking is also performed by the dictExpand function, with its second argument size set to the reduced size.
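The shrink check can be sketched as a predicate; this is a simplified stand-in for the check Redis performs (htNeedsResize in server.c), with an invented function name:

```c
/* Sketch of the shrink check: shrink when the array exceeds the
 * initial size of 4 and usage falls below 10% (load factor < 0.1). */
int needs_shrink(unsigned long used, unsigned long size) {
    return size > 4 && used * 100 / size < 10;
}
```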

Commonly used dict functions are shown in Table 3-1.

Function             Effect
dictAdd              Insert a key-value pair
dictReplace          Replace a key-value pair, or insert it if absent
dictDelete           Delete a key-value pair
dictFind             Find a key-value pair
dictGetIterator      Create an unsafe iterator; the dictionary must not be modified during iteration
dictGetSafeIterator  Create a safe iterator; the dictionary may be modified during iteration
dictResize           Shrink the dictionary
dictExpand           Expand the dictionary

Encoding

The hash type has two encodings, OBJ_ENCODING_HT and OBJ_ENCODING_ZIPLIST, which store the data in a dict or a ziplist respectively (redisObject.ptr points to the dict or ziplist structure). Redis prefers the ziplist for hash elements: one ziplist node stores a key and the next node stores the corresponding value, so lookups must traverse the ziplist. When a dict is used, its keys and values are both of type sds. The OBJ_ENCODING_ZIPLIST encoding is used only while both of the following conditions hold:
(1) The length of every key and value in the hash is less than or equal to server.hash_max_ziplist_value, adjustable via the hash-max-ziplist-value configuration item.
(2) The number of key-value pairs in the hash is less than server.hash_max_ziplist_entries, adjustable via the hash-max-ziplist-entries configuration item.
The hash type is implemented in t_hash.c; readers can consult the source for more details.
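The two conditions can be combined into a single predicate. The thresholds below are the Redis defaults for hash-max-ziplist-value (64) and hash-max-ziplist-entries (128), and the function name is illustrative:

```c
#include <stddef.h>

/* Sketch of the encoding decision: keep OBJ_ENCODING_ZIPLIST only
 * while every key/value is short enough and the entry count is small.
 * 64 and 128 are the default configuration values. */
int hash_uses_ziplist(size_t longest_key_or_value, size_t entries) {
    return longest_key_or_value <= 64 && entries < 128;
}
```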

Database

Redis is an in-memory database. Internally, the database object responsible for storing data is defined by server.h/redisDb, and redisDb also manages its data with dictionary structures.

typedef struct redisDb {
    dict *dict;                 
    dict *expires;              
    dict *blocking_keys;        
    dict *ready_keys;           
    dict *watched_keys;         
    int id;                     
    ...
} redisDb;
  • dict: the database dictionary; all of this redisDb's data is stored here.

  • expires: the expiration dictionary, storing every Redis key with an expiration time set, together with that expiration time, a long long UNIX timestamp.

  • blocking_keys: keys that clients are blocked on, with the corresponding clients.

  • ready_keys: keys that have received data and can release blocked clients, with the corresponding clients.

  • watched_keys: keys monitored by the WATCH command, with the corresponding clients.

  • id: the database ID.

Redis is a key-value database whose full name, Remote Dictionary Server, reflects that it is itself a dictionary service. The keys in redisDb.dict are sds strings, and the values are redisObjects. This is one of redisObject's roles: every data structure is wrapped in a redisObject so that it can serve as a value in the redisDb dictionary.
A simple redisDb structure is shown in Figure 3-3.

[Figure 3-3]

When we need to operate on Redis data, the data is first looked up in redisDb.
db.c defines functions such as hashTypeLookupWriteOrCreate and lookupKeyReadOrReply, which find the redisObject corresponding to a key in redisDb.dict. They are implemented by calling the dict API and are not shown here; interested readers can read the code themselves.

Summary:

  • Redis dictionaries use the SipHash algorithm to compute hash values, and resolve hash collisions by chaining.

  • Redis dictionaries expand incrementally, performing one rehash step per data operation until the expansion completes.

  • The hash type can be encoded as OBJ_ENCODING_HT or OBJ_ENCODING_ZIPLIST.

This article is excerpted from the author's new book, Redis Core Principles and Practice, which analyzes in depth the internal mechanisms and implementation of Redis's common features, based mostly on Redis 6.0 source code, and summarizes the design ideas and implementation principles. Reading it should give readers a quick and easy understanding of how Redis works internally.
With the publisher's consent, I will continue to release selected chapters on my official account (binecy) as a preview of the book. You are welcome to follow it; thank you.

Copyright notice
This article was created by [binecy]. Please include a link to the original when reposting. Thanks.
https://javamana.com/2021/10/20211002145701981o.html
