Why does redis design simple strings as SDS?

Fan Li 2021-04-08 11:59:03
redis design simple strings sds


2021 The first day of construction , Some of my friends wrote to me , He also shared with me a piece of his face redis topic ( This guy is better than he's got his year-end bonus ), I think it's very interesting after reading it , The subject is very simple , It's the typical one who doesn't understand , Problems that are often overlooked by people . Let's sort it out and share it , By the way, consolidate your foundation , I hope it's helpful to the brothers who are interviewing and want to interview .

The topic is like this

interviewer : understand redis Of String The underlying implementation of data structure ?

Iron seed : Of course I do , Is based on SDS Realized

interviewer :redis Yes, it is C Language development , Then why not use it directly C String , It's also designed separately SDS Such a structure ?

Iron seed :·····

“ In fact, I can see that the interviewer wants to see , Tiezi only stays in redis The use level of , Or did you have a more in-depth study of the underlying data structure , Interviewers like to ask like this. We all know .

We know redis Yes, it is C Written , But it's not completely direct use C String , Instead, I rebuilt a new string called simple dynamic string SDS(simple dynamic string) Abstract type of .

redis Also supports the use of C The traditional string of language , It's just used in places where you don't need to modify strings , For example, static character output .

And we use it in development redis, The value of the string is often modified , I'll use it at this time SDS To represent the value of a string . It's worth it Be careful : stay redis In the database ,key-value Key value pairs contain string values , It's all by SDS To achieve .

such as : stay redis Execute one of the simplest set command , At this time redis A new key value pair will be created .

127.0.0.1:6379> set xiaofu " Something inside the programmer "

In this case, the key value is right key and value It's all a string object , The underlying implementations of objects are two that hold strings xiaofu and Something inside the programmer Of SDS structure .

Another example : I press data into a list ,redis A new key value pair will be created .

127.0.0.1:6379> lpush xiaofu " Something inside the programmer " " Programmer Xiaofu "

In this case, the key of the key value pair is the same as above , It's also a result of SDS Implemented string object , The value of a key value pair is a list object containing two string objects , And the bottom layer of these two objects is also made up of SDS Realization .

SDS structure

One SDS Data structure of values , Mainly by lenfreebuf[] These three attributes make up .

struct sdshdr{
int free; // buf[] The number of unused bytes in the array
int len; // buf[] The length of the string held by the array
char buf[]; // An array of strings
}

among buf[] To actually save the string char Type array ;free Express buf[] The number of unused bytes in the array ;len Express buf[] The length of the string held by the array .

For example, the image above shows buf[] The storage length is 6 A byte string , Number of unused bytes free by 0, But sharp eyed students will find that this is clearly 7 Characters , One more "\0" ah ?

It's mentioned above SDS Not completely direct use C String , Still use some C Characteristic , Like following C The rule that a string ends with a space character , You can also use some of them C String function . And for SDS Come on , A byte occupied by an empty string is not counted in the len In the attribute , He'll be given extra space .

Simple understanding SDS After structure , Let's take a look at SDS Compared with C What are the advantages of strings .

Efficient

for instance : Use in work redis, Often through STRLEN Command to get the length of a string , stay SDS In structure len Property records the length of the string , So we get the length of a string and take len Value , Complexity is O(1).

And if used C character string , When getting the length of a string , Need to traverse the entire string , Until the end of the traversal to the space character (C The space character encountered in represents a complete string ), The complexity at this point is O(N).

Traversing strings frequently in high concurrency scenarios , Getting the length of a string is likely to be redis Performance bottlenecks , therefore SDS Better performance .

Data overflow

Above mentioned C A string does not record its own length , Two adjacent strings may be stored as shown in the figure below , Appropriate memory space is allocated for the string .

If at this point I want to “ Something inside the programmer ” Change to “ Something inside the programmer 123”, The memory that can be allocated before is only 6 Bytes , The modified string needs 9 It's a byte to put it down , How make ?

There's no way but Embezzlement Space between adjacent strings , Self data overflow causes the contents of other strings to be modified .

and SDS It's a good way to avoid this , When we need to modify the data , First check the current SDS Space len Is it satisfactory? , If not, the space will be automatically expanded to the required size , And then make the changes , As shown in the figure below .

But there's a special thing , In the “ Something inside the programmer ” Of 6 Expand bytes to “ Something inside the programmer 123”9 After the bytes , Find out free The value of the property becomes the total length of the expanded string , This involves the following memory reallocation strategy .

Memory reallocation strategy

C The length of the string is fixed , So every time you grow or shorten a string , Do the reallocation of memory , The memory reallocation algorithm is usually a time-consuming operation , It's acceptable if the program doesn't modify the string very often .

But unfortunately ,redis As a database , Data is bound to be modified frequently , If you have to perform a memory reallocation every time you modify it , Then it will seriously affect the performance .

SDS Through two memory reallocation strategies , Good solution to the string in the growth and shortening of the memory allocation problem .

1. Space preallocation

Space pre allocation strategy is used to optimize SDS String growth operation , When you modify a string and you need to SDS When the space of is expanded , Not only for SDS Allocate the space necessary for modification , Also for the SDS Allocate additional unused space free, Check unused space next time free Is it satisfactory? , Satisfaction is not in the expansion space .

Through space pre allocation strategy ,redis Can effectively reduce the string continuous growth operation , The number of memory reallocations generated .

Allocate extra unused space free The rules of :

  • If the SDS After modifying the string ,len Less than 1M, Then allocate extra unused space at this time free The size and len equal .
  • If the SDS After modifying the string ,len A value greater than or equal to 1M, Then allocate extra unused space at this time free The size is 1M.

2. Inert space release

Inert space release strategy is used to optimize SDS String shortening operation , When shortening SDS After the string , No immediate memory reallocation is performed to reclaim the extra space , It's about using free Attribute to record these spaces , If there are subsequent growth operations , You can use .

Data format diversity

C Characters in a string must conform to certain encoding formats , And we also mentioned above ,C String to \0 The end of a null character marks the end of a string , So the string cannot contain \0 Of , Otherwise, it will be mistaken for multiple .

Because of this limitation , bring C Strings can only hold text data , Like audio and video 、 Data in binary format such as pictures cannot be stored .

redis Will operate in a binary way Buf Data in array , So there is no restriction on the data stored in it 、 Filter , Just save what you want , Take it out or something .

summary

It's just redis A little basic knowledge of data structure , It's easy , But with my interview experience , If asked such questions , Don't just say that the bottom is SDS, It's reasonable to explain why it is realized in this way .

On the one hand, I can show that I have solid basic skills , If the expression is clear , It's a great bonus ; In an interview to actively give up the idea of the interviewer to ask , Of course, I'm afraid of people who don't play according to the routine !

This article is from WeChat official account. - High performance server development (easyserverdev)

The source and reprint of the original text are detailed in the text , If there is any infringement , Please contact the yunjia_community@tencent.com Delete .

Original publication time : 2021-03-31

Participation of this paper Tencent cloud media sharing plan , You are welcome to join us , share .

版权声明
本文为[Fan Li]所创,转载请带上原文链接,感谢
https://javamana.com/2021/04/20210408111751212K.html

  1. A love diary about http
  2. navicat连接win10 mysql8.0 报错2059
  3. [rocketmq source code analysis] in depth message storage (3)
  4. Implementation of service configuration center with spring cloud + Nacos (Hoxton version)
  5. SCIP: constructing data abstraction -- Explanation of queue and tree in data structure
  6. SCIP: abstraction of construction process -- object oriented explanation
  7. Using docker to build elasticsearch + kibana cluster
  8. What are the spring IOC features? I can't understand the source code!
  9. Spring cloud upgrade road - 2020.0. X - 3. Accesslog configuration of undertow
  10. 导致Oracle性能抖动的参数提醒
  11. 风险提醒之Oracle RAC高可用失效
  12. 小机上运行Oracle需要注意的进程调度bug
  13. Oracle内存过度消耗风险提醒
  14. Oracle SQL monitor
  15. 使用Bifrost实现Mysql的数据同步
  16. 揭秘Oracle数据库truncate原理
  17. 看了此文,Oracle SQL优化文章不必再看!
  18. Mybatis (3) map and fuzzy query expansion
  19. Kafka性能篇:为何这么“快”?
  20. 两个高频设计类面试题:如何设计HashMap和线程池
  21. [TTS] AIX - & gt; Linux -- Based on RMAN (real environment)
  22. 为什么学编程大部分人选Java编程语言?
  23. Redis 高可用篇:你管这叫 Sentinel 哨兵集群原理
  24. redis 为什么把简单的字符串设计成 SDS?
  25. [TTS] transfer table space AIX - & gt; Linux based on RMAN
  26. Linux 网卡数据收发过程分析
  27. Redis 高可用篇:你管这叫 Sentinel 哨兵集群原
  28. Redis 6.X Cluster 集群搭建
  29. [TTS] transfer table space AIX ASM - & gt; Linux ASM
  30. [TTS] transfer table space Linux ASM - & gt; AIX ASM
  31. 高性能通讯框架——Netty
  32. Brief introduction and test of orchestrator, a high availability management tool for MySQL
  33. [TTS] transfer table space Linux - & gt; AIX based on RMAN
  34. A love diary about http
  35. [rocketmq source code analysis] in depth message storage (3)
  36. Implementation of service configuration center with spring cloud + Nacos (Hoxton version)
  37. SiCp: abstraction of construction process -- object oriented explanation
  38. springboot网上点餐系统
  39. 【SPM】oracle如何固定执行计划
  40. 用好HugePage,告别Linux性能故障
  41. 3 W word long text, java basic interview questions! It's amazing!!!
  42. Spring cloud upgrade road - 2020.0. X - 3. Accesslog configuration of undertow
  43. Win10 uninstall mysql5.7
  44. CentOS下dotnet Core使用HttpWebRequest进行HTTP通讯,系统存在大量CLOSE_WAIT连接问题的分析,已解决。
  45. MySQL batch insert, how not to insert duplicate data?
  46. K8s cronjob application example
  47. Unconventional method, easy to deal with Oracle database critical exception
  48. How to use sqlplus - prelim in Oracle hang
  49. How to search Oracle official documents in full text
  50. Install mysql8.0 on win10
  51. Oracle OCR的备份与恢复
  52. Oracle kill session相关问题
  53. 《Oracle DBA工作笔记》第二章 常用工具和问题分析
  54. Oracle回收站及flashback drop
  55. Hand in hand to teach you to write a spring IOC container
  56. Exception in Java (1) - basic concept
  57. 3w 字长文爆肝 Java 基础面试题!太顶了!!!
  58. Error 2059 when Navicat connects to win10 mysql8.0
  59. Parameter reminder causing Oracle Performance jitter
  60. 「技术分享」Java线程状态间的互相转换看这个就行了