Redis source code analysis of the compressed list (ziplist)

xindoo 2021-01-21 23:55:01
redis source code analysis compressed


I was going to use only one article to explain Redis Medium list, In the actual writing process, I found that Redis There are many kinds of list The implementation of the , So I'm going to split it into several articles , This article mainly talks about ziplist,ziplist It's also quicklist The basis of . And then there is skiplist,skiplist Although it is list, When the main and set Command related , So I'll put it in the back . This article mainly involves the source code in ziplist.c

What is the ziplist? We can do it in ziplist.c Find a section in the source code header Redis An introduction by the author .

The ziplist is a specially encoded dually linked list that is designed to be very memory efficient. It stores both strings and integer values, where integers are encoded as actual integers instead of a series of characters. It allows push and pop operations on either side of the list in O(1) time.However, because every operation requires a reallocation of the memory used by the ziplist, the actual complexity is related to the amount of memory used by the ziplist. ziplist It is a special coding bidirectional linked list designed to improve the storage efficiency . It can store strings or integers , When storing integers, they are stored in binary rather than string form . He can be in O(1) It's time complexity list At both ends push and pop operation . But because every operation needs to be reassigned ziplist Of memory , So the actual complexity and ziplist Is related to the amount of memory used .

The first half of the sentence is easy to understand , But every time Operations require reallocation of memory …… It's kind of intriguing . Don't worry. , You finish reading ziplist The concrete realization of .

ziplist Logically, it is a two-way linked list , But it's stored in a contiguous chunk of memory . Rather than ziplist It's a data structure , Rather, he is Redis The serialization storage mode of bidirectional linked list in .

ziplist structure

Whole ziplist The storage format in memory is as follows :

ziplist There are mainly several parts :

  • zlbytes: 32 Bit unsigned integer , Represents the whole ziplist The amount of space taken up , Contains zlbytes Occupied 4 Bytes .
  • This field can reset the entire ziplist You don't need to traverse the whole list To determine the size , Space for time .
  • zltail: 32 Bit unsigned integer , Represents the whole list The offset of the last item in , Easy to do at the tail pop operation .
  • zllen: 16 position , Express ziplist Stored in entry Number , But notice , At most, it means $2^{16} -2$ individual entry, If it is $2^{16}-1$ It has a special meaning ,$2^{16}-1$ Indicates that the amount of storage exceeds $2^{16}-2$ individual , But how many of them have to be traversed once to know .
  • zlend: 8 position ,ziplist The end of the letter means , The fixed value is 255.
  • entry: Indefinite length , There may be more than one ,list Specific data items in , I'll give you a detailed introduction . entry The core here is entry Data format ,entry It's a little complicated , As you can see from the picture above, it has three main parts .
    • prelen: Previous entry The storage size of , Mainly for the convenience of traversing from the back to the front .
    • encoding: The coding form of data ( String or number , What is the length )
    • data: Data actually stored

What's more complicated is Redis To save memory space , For the above three fields, a set of more complex coding methods is designed , It's essentially a variable length coding protocol , The specific rules are as follows :

prelen

If prelen The value is less than 254, So just one byte for the length , If the length is greater than or equal to 254 Just use 5 Bytes , The first byte is a fixed value 254(FE) To identify that this is a special piece of data , The rest 4 Bytes to represent the actual length .

encoding

encoding The specific value of depends on entry The specific content of the article , When entry It's a string when ,encoding The first two bytes of store the length of the string . When entry When it's an integer , By default, the first two bytes are 1, The last two bytes indicate which type of integer is stored , The first byte is enough to tell entry What kind of . Different encoding Examples of types are as follows :

  • |00pppppp| - 1 byte The length is less than or equal to 63 Of String type ,'pppppp' Unsigned 6 Number of digits string length .
  • |01pppppp|qqqqqqqq| - 2 byte The length is less than or equal to 16383 Of String type (14 position ), Be careful :14 position 'pppppp' Using big end storage
  • |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 byte The length is greater than or equal to 16384 Of String type , Starting with the second byte qqrrsstt Are binary bits used to store the length of a string , The length of the string that can be represented is the largest 2^32-1, The lower of the first byte 6 Bits are useless , So it's all 0. Be careful : 32 The number of bits is stored in the way of big end
  • |11000000| - 3 byte Storage int16_t (2 byte ).
  • |11010000| - 5 byte Storage int32_t (4 byte ).
  • |11100000| - 9 byte Storage int64_t (8 byte ).
  • |11110000| - 4 byte 24 Bit signed type integers (3 byte ).
  • |11111110| - 2 byte 8 Bit signed type integers (1 byte ).
    • |1111xxxx| - (xxxx stay 0001 and 1101 Between ) 4 Bit unsigned integer . 0 To 12 Of unsigned integers . The encoded value is actually derived from 1 To 13, because 0000 and 1111 Out of commission , Leave a space for 0, So it should be subtracted from the encoded value 1 It's the exact value

At some relatively small values , The specific value can be directly stored in encoding In the field .

ziplist Of API

ziplist.c There's a lot of code , Double list operation, a lot of code in ziplist There are a lot of , In fact, it's all caused by its complex storage format , Actually understood its encoding format , The specific code is not hard to understand . Here I just list a few that I think are more important API, Other can refer to the source code ziplist.c.

ziplist In fact, it's just a way of serializing two-way queues , Is the storage format in memory , In fact, it can't be used directly , What users see ziplist Just one. char * The pointer , Each of them entry In practical use, we also need to reverse sequence into zlentry Convenient to call .

typedef struct zlentry {
unsigned int prevrawlensize; /* Encoded in memory prevrawlen How many bytes are used */
unsigned int prevrawlen; /* Previous entry The length of occupation , Mainly for entry Jump between */
unsigned int lensize; /* Encoded in memory len How many bytes are used */
unsigned int len; /* At present entry The length of , If it is string said string The length of , If it's an integer , be len Depending on the size of the value .*/
unsigned int headersize; /* prevrawlensize + lensize. entry Of head How many bytes are used in the part */
unsigned char encoding; /* At present entry The encoding format of */
unsigned char *p; /* A pointer to a data field */
} zlentry;

There is another point ,ziplist In memory is highly compact continuous storage , That means it's not friendly to change at the beginning , If you want to ziplist Do the operation of modifying the class , Then you need to reallocate new memory to store new ziplist, The price is very big , The specific insertion and deletion codes are as follows .

/* stay p Position insert data *s. */
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
unsigned int prevlensize, prevlen = 0;
size_t offset;
int nextdiff = 0;
unsigned char encoding = 0;
long long value = 123456789; /* initialized to avoid warning. Using a value
that is easy to see if for some reason
we use it uninitialized. */
zlentry tail;
/* Find the previous node and calculate prevlensize and prevlen */
if (p[0] != ZIP_END) {
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
} else {
unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
if (ptail[0] != ZIP_END) {
prevlen = zipRawEntryLength(ptail);
}
}
/* See if the entry can be encoded */
if (zipTryEncoding(s,slen,&value,&encoding)) {
/* 'encoding' is set to the appropriate integer encoding */
reqlen = zipIntSize(encoding);
} else {
/* 'encoding' is untouched, however zipStoreEntryEncoding will use the
* string length to figure out how to encode it. */
reqlen = slen;
}
/* We need space for both the length of the previous entry and
* the length of the payload. */
reqlen += zipStorePrevEntryLength(NULL,prevlen);
reqlen += zipStoreEntryEncoding(NULL,encoding,slen);
/* When the insert position is not equal to the tail, we need to
* make sure that the next entry can hold this entry's length in
* its prevlen field. */
int forcelarge = 0;
nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
if (nextdiff == -4 && reqlen < 4) {
nextdiff = 0;
forcelarge = 1;
}
/* Store offset because a realloc may change the address of zl. */
offset = p-zl;
// Calculate the amount of memory needed , And then regenerate a new size zl Replace the original zl.
zl = ziplistResize(zl,curlen+reqlen+nextdiff);
p = zl+offset;
/* Migrating data , And then update tail Of offset */
if (p[0] != ZIP_END) {
/* Subtract one because of the ZIP_END bytes */
memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);
/* Encode this entry's raw length in the next entry. */
if (forcelarge)
zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
else
zipStorePrevEntryLength(p+reqlen,reqlen);
/* Update offset for tail */
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);
/* When the tail contains more than one entry, we need to take
* "nextdiff" in account as well. Otherwise, a change in the
* size of prevlen doesn't have an effect on the *tail* offset. */
zipEntry(p+reqlen, &tail);
if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
} else {
/* This element will be the new tail. */
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
}
/* When nextdiff != 0, the raw length of the next entry has changed, so
* we need to cascade the update throughout the ziplist */
if (nextdiff != 0) {
offset = p-zl;
zl = __ziplistCascadeUpdate(zl,p+reqlen);
p = zl+offset;
}
/* Write data */
p += zipStorePrevEntryLength(p,prevlen);
p += zipStoreEntryEncoding(p,encoding,slen);
if (ZIP_IS_STR(encoding)) {
memcpy(p,s,slen);
} else {
zipSaveInteger(p,value,encoding);
}
ZIPLIST_INCR_LENGTH(zl,1);
return zl;
}

ziplist The node to delete

unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
unsigned int i, totlen, deleted = 0;
size_t offset;
int nextdiff = 0;
zlentry first, tail;
zipEntry(p, &first);
for (i = 0; p[0] != ZIP_END && i < num; i++) {
p += zipRawEntryLength(p);
deleted++;
}
totlen = p-first.p; /* Reduced memory space after deleting elements ( byte ) */
if (totlen > 0) {
if (p[0] != ZIP_END) {
/* Storing `prevrawlen` in this entry may increase or decrease the
* number of bytes required compare to the current `prevrawlen`.
* There always is room to store this, because it was previously
* stored by an entry that is now being deleted. */
nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
/* Note that there is always space when p jumps backward: if
* the new previous entry is large, one of the deleted elements
* had a 5 bytes prevlen header, so there is for sure at least
* 5 bytes free and we need just 4. */
p -= nextdiff;
zipStorePrevEntryLength(p,first.prevrawlen);
/* Update offset for tail */
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);
/* When the tail contains more than one entry, we need to take
* "nextdiff" in account as well. Otherwise, a change in the
* size of prevlen doesn't have an effect on the *tail* offset. */
zipEntry(p, &tail);
if (p[tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
/* hold tail Move to ziplist In front of */
memmove(first.p,p,
intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
} else {
/* The entire tail was deleted. No need to move memory. */
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe((first.p-zl)-first.prevrawlen);
}
/* to update ziplist size */
offset = first.p-zl;
zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
ZIPLIST_INCR_LENGTH(zl,-deleted); // to update zllen
p = zl+offset;
/* When nextdiff != 0, the raw length of the next entry has changed, so
* we need to cascade the update throughout the ziplist */
if (nextdiff != 0)
zl = __ziplistCascadeUpdate(zl,p);
}
return zl;
}

The basic logic of insertion and deletion is similar , First position , And then I'm going to insert / The memory space required after deletion changes , According to the calculated new space size to zl do ziplistResize(), And then update zl Meta information of . In addition to insertion and deletion , image ziplistPush ziplistMerge, This kind of... With changes API, In the end, I call ziplistResize, ziplistResize The code is as follows :

unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
zl = zrealloc(zl,len);
ZIPLIST_BYTES(zl) = intrev32ifbe(len);
zl[len-1] = ZIP_END;
return zl;
}

It looks short , A lot of logic is there zrealloc in ,zrealloc Is a macro definition ( A sudden sensation c The macro definition of is very coquettish ), In fact, the main logic is to apply for a piece of len Space , And then release the original zl The space it points to . You can see here ziplist The cost of revision is high , If there are frequent updates in use list The operation of , It is suggested that list Optimize the relevant configuration .

other API

Specifically API See the source code for the definition list ziplist.h

unsigned char *ziplistNew(void); // newly build ziplist
unsigned char *ziplistMerge(unsigned char **first, unsigned char **second); // Merge two ziplist
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where); // stay ziplist Head or tail push A node
unsigned char *ziplistIndex(unsigned char *zl, int index); // Find a subscript node
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p); // find p The next node of the node
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p); // find p The previous node of the node
unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval); // obtain entry Specific content stored in
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen); // Insert
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p); // Delete
unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num); // Delete a node in a subscript interval
unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen); // Compare the size of two nodes
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip); // Find a node with a specific value
unsigned int ziplistLen(unsigned char *zl); // ziplist The length of
size_t ziplistBlobLen(unsigned char *zl); // ziplist The size of the storage space
void ziplistRepr(unsigned char *zl); // 

Conclusion

ziplist It's actually a logical two-way linked list , You can quickly find the head node and tail node , And then each node (entry) It also includes pointing to the front / Back node " The pointer ", But in order to save memory to the extreme , Abandon the traditional linked list design ( The front and back pointers need 16 Byte space , And it can lead to serious memory fragmentation ), Designed a very compact memory storage format . Memory is saved , But the operational complexity is also up to the new complexity , Of course Redis The author also considers this point , So I also designed ziplist And the traditional two-way list compromise ——quicklist, We'll talk about it in detail in the next blog post quicklist.

This article is about Redis Source analysis series blog , At the same time, there are also corresponding Redis Chinese annotation version , I want to learn more about it Redis Classmate , welcome star And attention . Redis Chinese annotation version warehouse :https://github.com/xindoo/Redis Redis Source analysis column :https://zxs.io/s/1h

Participation of this paper Tencent cloud media sharing plan , You are welcome to join us , share .

版权声明
本文为[xindoo]所创,转载请带上原文链接,感谢
https://javamana.com/2021/01/20210121234656662g.html

  1. 【计算机网络 12(1),尚学堂马士兵Java视频教程
  2. 【程序猿历程,史上最全的Java面试题集锦在这里
  3. 【程序猿历程(1),Javaweb视频教程百度云
  4. Notes on MySQL 45 lectures (1-7)
  5. [computer network 12 (1), Shang Xuetang Ma soldier java video tutorial
  6. The most complete collection of Java interview questions in history is here
  7. [process of program ape (1), JavaWeb video tutorial, baidu cloud
  8. Notes on MySQL 45 lectures (1-7)
  9. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  10. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  11. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  12. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  13. 【递归,Java传智播客笔记
  14. [recursion, Java intelligence podcast notes
  15. [adhere to painting for 386 days] the beginning of spring of 24 solar terms
  16. K8S系列第八篇(Service、EndPoints以及高可用kubeadm部署)
  17. K8s Series Part 8 (service, endpoints and high availability kubeadm deployment)
  18. 【重识 HTML (3),350道Java面试真题分享
  19. 【重识 HTML (2),Java并发编程必会的多线程你竟然还不会
  20. 【重识 HTML (1),二本Java小菜鸟4面字节跳动被秒成渣渣
  21. [re recognize HTML (3) and share 350 real Java interview questions
  22. [re recognize HTML (2). Multithreading is a must for Java Concurrent Programming. How dare you not
  23. [re recognize HTML (1), two Java rookies' 4-sided bytes beat and become slag in seconds
  24. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  25. RPC 1: how to develop RPC framework from scratch
  26. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  27. RPC 1: how to develop RPC framework from scratch
  28. 一次性捋清楚吧,对乱糟糟的,Spring事务扩展机制
  29. 一文彻底弄懂如何选择抽象类还是接口,连续四年百度Java岗必问面试题
  30. Redis常用命令
  31. 一双拖鞋引发的血案,狂神说Java系列笔记
  32. 一、mysql基础安装
  33. 一位程序员的独白:尽管我一生坎坷,Java框架面试基础
  34. Clear it all at once. For the messy, spring transaction extension mechanism
  35. A thorough understanding of how to choose abstract classes or interfaces, baidu Java post must ask interview questions for four consecutive years
  36. Redis common commands
  37. A pair of slippers triggered the murder, crazy God said java series notes
  38. 1、 MySQL basic installation
  39. Monologue of a programmer: despite my ups and downs in my life, Java framework is the foundation of interview
  40. 【大厂面试】三面三问Spring循环依赖,请一定要把这篇看完(建议收藏)
  41. 一线互联网企业中,springboot入门项目
  42. 一篇文带你入门SSM框架Spring开发,帮你快速拿Offer
  43. 【面试资料】Java全集、微服务、大数据、数据结构与算法、机器学习知识最全总结,283页pdf
  44. 【leetcode刷题】24.数组中重复的数字——Java版
  45. 【leetcode刷题】23.对称二叉树——Java版
  46. 【leetcode刷题】22.二叉树的中序遍历——Java版
  47. 【leetcode刷题】21.三数之和——Java版
  48. 【leetcode刷题】20.最长回文子串——Java版
  49. 【leetcode刷题】19.回文链表——Java版
  50. 【leetcode刷题】18.反转链表——Java版
  51. 【leetcode刷题】17.相交链表——Java&python版
  52. 【leetcode刷题】16.环形链表——Java版
  53. 【leetcode刷题】15.汉明距离——Java版
  54. 【leetcode刷题】14.找到所有数组中消失的数字——Java版
  55. 【leetcode刷题】13.比特位计数——Java版
  56. oracle控制用户权限命令
  57. 三年Java开发,继阿里,鲁班二期Java架构师
  58. Oracle必须要启动的服务
  59. 万字长文!深入剖析HashMap,Java基础笔试题大全带答案
  60. 一问Kafka就心慌?我却凭着这份,图灵学院vip课程百度云