When the interviewer asked me about the redis data types, I answered eight

Sun 2020-11-11 19:48:51
interviewer redis data types answered


interviewer : Xiao Ming ,redis There are several data structures ?

Xiao Ming :8 Kind of

interviewer : What are the differences ?

Xiao Ming :raw,int,ht,zipmap,linkedlist,ziplist,intset,skiplist,embstr

interviewer : forehead , What are you talking about ?

Xiao Ming : I'm answering your question , I have studied this problem before , No mistake

interviewer : ok , This is the first interview for today , You go back and wait for the notice

Xiao Ming :...


The conversation that happened above , It's the interviewer who has a problem , Or Xiao Ming has a problem ? In fact, they all have problems , The interviewer's questions are not accurate , Xiao Ming's answer is not accurate either .

But you can see , The level of the interviewer is average , Because I don't know what Xiaoming said when I heard these nouns redis The underlying encoding type , And then missed the opportunity to tap the potential of Xiaoming's technology . And Xiao Ming is also a bit clever , Ignore the knowledge that the interviewer wants to investigate , Take out some of the fur you've seen recently and show it , It turned out to be a misunderstanding .

With the introduction above , Let's talk about ,redis The data structure in the .

redis Source code selection version : 3.0.0

The goal of this article is : know redis The concept of encoding type of , And according to the depth of the source code level to understand why to set different encoding types , But it doesn't expand too much on the details of various underlying data structures

redis Object type and encoding type of

redis Object type of , That's what you often test in an interview redis What are the data types The exact answer to this question is , For those of us who can only interview but not develop , It's so familiar , It's just a string 、 Hash 、 list 、 aggregate 、 Ordered set , This is in redis The exact definition can be found in the source code :

redis.c

/* Object types */
#define REDIS_STRING 0
#define REDIS_LIST 1
#define REDIS_SET 2
#define REDIS_ZSET 3
#define REDIS_HASH 4

Many people are right. redis The understanding of data structure may stop here , But it's just redis The abstract structure of exposure , The underlying implementation depends on its encoding type to determine the data structure corresponding to the encoding type .

If an object type has only one implementation of the underlying data structure , Then the encoding type is completely redundant , In the early redis There is no such concept . But then in order to optimize performance , One object type may correspond to many different encoding implementations , So it's about redis Knowledge points of the underlying data structure , It's starting to get complicated . The encoding type is in redis There are also exact definitions in the source code :

redis.c

/* Objects encoding. Some kind of objects like Strings and Hashes can be
* internally represented in multiple ways. The 'encoding' field of the object
* is set to one of this fields for this object. */
#define REDIS_ENCODING_RAW 0 /* Raw representation */
#define REDIS_ENCODING_INT 1 /* Encoded as integer */
#define REDIS_ENCODING_HT 2 /* Encoded as hash table */
#define REDIS_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define REDIS_ENCODING_INTSET 6 /* Encoded as intset */
#define REDIS_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
#define REDIS_ENCODING_EMBSTR 8 /* Embedded sds string encoding */

We don't have to look for any additional secondary data to explain the role of coding types , Look at the source code in English comments can .

Object code ( The encoding type ): Some object types, such as strings 、 Hash , It can be implemented in many ways , One redis Object's encoding Field can set the following values to represent the underlying encoding type of the object

The same object type , There can be different encoding types as the underlying implementation . And the same encoding type , It can also support multiple object types at the upper level . Their relationship is as follows :

redis Object type and encoding type

Reading this, you must have at least three questions :

  • Why does an object type correspond to multiple encoding types , It's to solve the problem ?
  • redis How to know when to use this encoding type , When to use that encoding type , And the encoding type can be changed at any time ?
  • What are the implementation principles of various coding types ?( This chapter does not focus on , It will introduce some basic ideas throughout the paper , The specific implementation will be explained in other chapters )

Don't worry. , This part is just to let you know ,redis What is exposed to users is only an abstract data structure , It doesn't represent the concrete implementation of its underlying layer . And then I'll take you deeper .

Why does an object type correspond to multiple encoding types

Write redis Daniel is also a programmer , No, he added complexity to his code , It doesn't help performance again ? After all redis This kind of intermediate component must be superior to similar products in terms of performance . you 're right , Just for Performance improvement .

Intuitively feel the different types of coding

First of all, let's intuitively feel the scene that the same object corresponds to different coding types , It's used here object encoding xxx This redis Order to see a certain key Its value The type of encoding used by the object

127.0.0.1:6379> set number 100
OK
127.0.0.1:6379> object encoding number
"int"
127.0.0.1:6379> set number "100"
OK
127.0.0.1:6379> object encoding number
"int"
127.0.0.1:6379> set number abc
OK
127.0.0.1:6379> object encoding number
"embstr"
127.0.0.1:6379> set number aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
OK
127.0.0.1:6379> object encoding number
"raw"
127.0.0.1:6379> set number 9999999999999999999999999
OK
127.0.0.1:6379> object encoding number
"embstr"
127.0.0.1:6379> set number 99999999999999999999999999999999999999999999999999999999999999
OK
127.0.0.1:6379> object encoding number
"raw"

We tested it with the string we use most often , I observed that the encoding type was set by me value The value varies , I organized the following table to show the above test results

value The encoding type
100 int
"100" int
abc embstr
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa raw
9999999999999999999999999 embstr
99999999999999999999999999999999999999999 raw

Of course , I know the condition of encoding type of string , These representative values are specially selected for testing , We can sum up a rule

  • Whether it's 100 still "100", The encoding type is int, explain redis When judging whether an object can be represented by an integer encoding type , Just see if the value can be converted into an integer
  • Shorter strings abc Encoded as embstr, Longer strings aaaaaaa..a Encoded as raw, The encoding types of long and short strings are different , From this we can guess redis It may be the short string storage optimization strategy ( Now, of course, it's just a reasonable guess , There may also be some optimization of long strings )
  • Integers 999...9 And longer integers 9999999...9 They are also converted to the corresponding string type , Indicates a value that can be represented as an integer encoded type , There is a certain size limit

redis On the optimization of string encoding type

In the above experiment, we learned that , There are really three encoding types for string objects :int,raw,embstr.

int Type analysis doesn't mean much , If you think about it, you can use integer storage , Try to use integer storage , It must be more space efficient than string . Let's analyze , The encoding types of long and short strings are distinguished , Why is that ?

It's not just string types , Include hash 、 List these object types , They all use a unified structure redisObject To represent the . His structure is as follows :

redis.h

typedef struct redisObject {
unsigned type:4; // object type
unsigned encoding:4; // The encoding type
void *ptr; // Pointer to value
...( Omit some unimportant fields )
} robj;

Account for the 4 Bit type Express object type (5 Plant that ), It also takes up 4 Bit encoding Field representation The encoding type (8 Plant that ), Pointer field ptr Representing the actual value of Memory address .

If the encoding type of the object is Integers (encoding=REDIS_ENCODING_INT), So this ptr It's going to point to a long Variable of type .

util.c

if (!string2ll(s,slen,&llval))
return 0;
...
*lval = (long)llval;
return 1;

object.c

...
o->ptr = (void*) value;

If the encoding type of the object is raw perhaps embstr, So this ptr It's going to point to a sdshdr Structural variables

sds.h

struct sdshdr {
unsigned int len; // String length
unsigned int free; // buf Free number
char buf[]; // A character array
};

Since they all point to the same structure , How is that optimized ? Then we have to enter the following two methods to have a specific look

object.c

robj *createStringObject(char *ptr, size_t len) {
if (len <= 39)
return createEmbeddedStringObject(ptr,len);
else
return createRawStringObject(ptr,len);
}

You see , This code is very clear , String length <=39 when , create embstr Type string object , Otherwise create raw Type string object . So the difference between the two creation methods , It must be hidden in these two methods , Let's go in !

embstr type

robj *createEmbeddedStringObject(char *ptr, size_t len) {
robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr)+len+1);
struct sdshdr *sh = (void*)(o+1);
o->type = REDIS_STRING;
o->encoding = REDIS_ENCODING_EMBSTR;
o->ptr = sh+1;
... ( Some assignment operations )
return o;
}

raw type

robj *createRawStringObject(char *ptr, size_t len) {
return createObject(REDIS_STRING,sdsnewlen(ptr,len));
}
sds sdsnewlen(const void *init, size_t initlen) {
...
struct sdshdr *sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
...
}
robj *createObject(int type, void *ptr) {
robj *o = zmalloc(sizeof(*o));
o->type = type;
o->encoding = REDIS_ENCODING_RAW;
o->ptr = ptr;
...( Some assignment operations )
return o;
}

For students who read more source code , The difference may be immediately perceived . It's very simple , Namely raw Type this way , by redisObject and sdshdr Structure respectively applied for memory space , and embstr Only applied for memory space once , Then put the two structures next to each other . There's no other difference . The visual diagram is as follows :

BjBxXj.png

BjBwpF.png

See this , It all makes sense , It's simple , It's just the difference between applying for memory . But for those of us who want to package simple things into high-end grandiose scripts , We have to find a way to install it , We concluded that using embstr Coding is compared to raw The benefits of coding :

  • embstr Only applied for memory once , and raw You need to apply twice , Therefore, the memory consumption of one application is saved
  • Release embstr You only need to free memory once , and raw It takes two times , Therefore, it saves the consumption of releasing memory once
  • embstr Of redisObject and sdshdr Put it in a continuous piece of memory , So it's better to use cache Advantages

What about? , Source level understanding , Plus the summary of the interviewer , That's interesting .

Conditions for different coding types

In the last part, we used strings , Different coding types were observed , Also understand why there are different implementations of encoding types . Let's summarize the other objects and encoding types , The principle is not in-depth analysis of the source code , And the basic idea of strings is the same .

The encoding type of the string

  • int:8 A long integer of bytes
  • embstr: Less than or equal to 39 Byte string
  • raw: Greater than 39 Byte string

The encoding type of the hash

  • ziplist: The number of elements is less than 512, And all values are less than 64 byte
  • hashtable: In addition to the above conditions

The encoding type of the list

  • ziplist: The number of elements is less than 512, And all values are less than 64 byte
  • hashtable: In addition to the above conditions

The encoding type of the set

  • intset: The number of elements is less than 512, And all values are integers
  • hashtable: In addition to the above conditions

The encoding type of an ordered set

  • ziplist: The number of elements is less than 128, And all values are less than 64 byte
  • hashtable: In addition to the above conditions

Because there is no explanation , The pure memory thing I think uses the cleanest way to describe to everybody , There's no excess . Details of the specific data structure , I'll use other articles to explain .


here , Xiao Ming, who has undergone some practice , I met a professional interviewer again

Professional interviewers : Xiao Ming ,redis There are several data knots ...

Xiaoming of evolution : The interviewer the interviewer , There are two cases of your question ,redis Object type of , This is what we often call the type of data exposed to the public , Yes 5 Kind of , Namely .... The underlying coding type , stay 3.0.0 Source code has 8 Kind of , Namely ....

Professional interviewers : Who let you answer ?

Xiaoming of evolution :...

Professional interviewers : All right , This is the first interview for today , You go back and wait for the notice

Xiaoming of evolution :...


End

版权声明
本文为[Sun]所创,转载请带上原文链接,感谢

  1. 【计算机网络 12(1),尚学堂马士兵Java视频教程
  2. 【程序猿历程,史上最全的Java面试题集锦在这里
  3. 【程序猿历程(1),Javaweb视频教程百度云
  4. Notes on MySQL 45 lectures (1-7)
  5. [computer network 12 (1), Shang Xuetang Ma soldier java video tutorial
  6. The most complete collection of Java interview questions in history is here
  7. [process of program ape (1), JavaWeb video tutorial, baidu cloud
  8. Notes on MySQL 45 lectures (1-7)
  9. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  10. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  11. 精进 Spring Boot 03:Spring Boot 的配置文件和配置管理,以及用三种方式读取配置文件
  12. Refined spring boot 03: spring boot configuration files and configuration management, and reading configuration files in three ways
  13. 【递归,Java传智播客笔记
  14. [recursion, Java intelligence podcast notes
  15. [adhere to painting for 386 days] the beginning of spring of 24 solar terms
  16. K8S系列第八篇(Service、EndPoints以及高可用kubeadm部署)
  17. K8s Series Part 8 (service, endpoints and high availability kubeadm deployment)
  18. 【重识 HTML (3),350道Java面试真题分享
  19. 【重识 HTML (2),Java并发编程必会的多线程你竟然还不会
  20. 【重识 HTML (1),二本Java小菜鸟4面字节跳动被秒成渣渣
  21. [re recognize HTML (3) and share 350 real Java interview questions
  22. [re recognize HTML (2). Multithreading is a must for Java Concurrent Programming. How dare you not
  23. [re recognize HTML (1), two Java rookies' 4-sided bytes beat and become slag in seconds
  24. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  25. RPC 1: how to develop RPC framework from scratch
  26. 造轮子系列之RPC 1:如何从零开始开发RPC框架
  27. RPC 1: how to develop RPC framework from scratch
  28. 一次性捋清楚吧,对乱糟糟的,Spring事务扩展机制
  29. 一文彻底弄懂如何选择抽象类还是接口,连续四年百度Java岗必问面试题
  30. Redis常用命令
  31. 一双拖鞋引发的血案,狂神说Java系列笔记
  32. 一、mysql基础安装
  33. 一位程序员的独白:尽管我一生坎坷,Java框架面试基础
  34. Clear it all at once. For the messy, spring transaction extension mechanism
  35. A thorough understanding of how to choose abstract classes or interfaces, baidu Java post must ask interview questions for four consecutive years
  36. Redis common commands
  37. A pair of slippers triggered the murder, crazy God said java series notes
  38. 1、 MySQL basic installation
  39. Monologue of a programmer: despite my ups and downs in my life, Java framework is the foundation of interview
  40. 【大厂面试】三面三问Spring循环依赖,请一定要把这篇看完(建议收藏)
  41. 一线互联网企业中,springboot入门项目
  42. 一篇文带你入门SSM框架Spring开发,帮你快速拿Offer
  43. 【面试资料】Java全集、微服务、大数据、数据结构与算法、机器学习知识最全总结,283页pdf
  44. 【leetcode刷题】24.数组中重复的数字——Java版
  45. 【leetcode刷题】23.对称二叉树——Java版
  46. 【leetcode刷题】22.二叉树的中序遍历——Java版
  47. 【leetcode刷题】21.三数之和——Java版
  48. 【leetcode刷题】20.最长回文子串——Java版
  49. 【leetcode刷题】19.回文链表——Java版
  50. 【leetcode刷题】18.反转链表——Java版
  51. 【leetcode刷题】17.相交链表——Java&python版
  52. 【leetcode刷题】16.环形链表——Java版
  53. 【leetcode刷题】15.汉明距离——Java版
  54. 【leetcode刷题】14.找到所有数组中消失的数字——Java版
  55. 【leetcode刷题】13.比特位计数——Java版
  56. oracle控制用户权限命令
  57. 三年Java开发,继阿里,鲁班二期Java架构师
  58. Oracle必须要启动的服务
  59. 万字长文!深入剖析HashMap,Java基础笔试题大全带答案
  60. 一问Kafka就心慌?我却凭着这份,图灵学院vip课程百度云