Redis source code analysis: robj (redisObject)

xindoo 2021-01-21 23:55:50
redis source code analysis robj


In the previous post we looked at some of the data structures in Redis, in particular dict, where we said you can think of Redis as one big hashtable holding a pile of key-value pairs. Today let's look at redisObject (referred to below as robj), the main structure used to store the value in each key-value pair. For the detailed code, see object.c.

Field details

Compared with other data structures, robj is relatively simple: it contains only a few fields, and their meanings are clear.

typedef struct redisObject {
    unsigned type:4;        /* Data type: string, list, set, zset, ... */
    unsigned encoding:4;    /* How the value is encoded in memory */
    unsigned lru:LRU_BITS;  /* LRU time (relative to global lru_clock) or
                             * LFU data (least significant 8 bits frequency
                             * and most significant 16 bits access time).
                             * Redis uses these 24 bits for both LRU and LFU
                             * information: under LRU they hold the time of
                             * the last read or write (in seconds); under LFU
                             * they hold a 16-bit timestamp (in minutes) plus
                             * an 8-bit approximate access counter. */
    int refcount;           /* Reference count */
    void *ptr;              /* Pointer to the stored value, interpreted
                             * according to type and encoding */
} robj;

There are five fields in total. Let's go through them one by one.

type (4 bits)

type is, naturally, the type of the data stored in the robj. Redis currently defines the following types.

| Identifier | Value | Meaning |
| --- | --- | --- |
| OBJ_STRING | 0 | String |
| OBJ_LIST | 1 | List |
| OBJ_SET | 2 | Set |
| OBJ_ZSET | 3 | Sorted set (zset) |
| OBJ_HASH | 4 | Hash |
| OBJ_MODULE | 5 | Module |
| OBJ_STREAM | 6 | Stream |

encoding (4 bits)

This is the encoding. If every type had only one representation, type and encoding would be redundant and one of the two fields would be enough. But to save as much memory as possible, Redis uses different encodings for the same type depending on the situation, so an extra field is needed to record which one is in use. The encodings currently defined (Redis 6.2) are listed below.

| Identifier | Value | Meaning |
| --- | --- | --- |
| OBJ_ENCODING_RAW | 0 | Raw representation; only used by strings |
| OBJ_ENCODING_INT | 1 | Integer |
| OBJ_ENCODING_HT | 2 | Hash table (dict) |
| OBJ_ENCODING_ZIPMAP | 3 | zipmap; no longer used |
| OBJ_ENCODING_LINKEDLIST | 4 | Linked list; no longer used |
| OBJ_ENCODING_ZIPLIST | 5 | ziplist |
| OBJ_ENCODING_INTSET | 6 | intset |
| OBJ_ENCODING_SKIPLIST | 7 | Skip list (skiplist) |
| OBJ_ENCODING_EMBSTR | 8 | Embedded sds string |
| OBJ_ENCODING_QUICKLIST | 9 | quicklist |
| OBJ_ENCODING_STREAM | 10 | Stream |

OBJ_ENCODING_EMBSTR deserves a closer look. Here is the function that creates such an object.

robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    /* One allocation holds the robj, the sds header, the string and '\0'. */
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
    struct sdshdr8 *sh = (void*)(o+1); /* sds header sits right after the robj */

    o->type = OBJ_STRING;
    o->encoding = OBJ_ENCODING_EMBSTR;
    o->ptr = sh+1;                     /* points at the string buffer */
    o->refcount = 1;
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }

    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    if (ptr == SDS_NOINIT)
        sh->buf[len] = '\0';
    else if (ptr) {
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;
}

As the code above shows, an EMBSTR object is an robj and an sds combined: the sds is placed directly after the robj in the same allocation. The string is limited to at most 44 bytes. The robj takes 16 bytes, the sdshdr8 header takes 3 bytes, and the trailing '\0' takes 1 byte, so capping the string at 44 bytes guarantees that everything fits in 64 bytes (16+3+1+44 == 64).
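The 64-byte arithmetic can be checked with a small standalone sketch. The names `toy_robj`, `toy_sdshdr8` and `toy_embstr_alloc_size` are illustrative stand-ins that mirror the layout described in this article, not the actual Redis definitions. The sketch assumes a typical 64-bit platform (8-byte pointers) and a GCC or Clang compiler for the packed attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LRU_BITS 24   /* assumed to mirror LRU_BITS */

/* Stand-in for robj: 4 + 4 + 24 bits share one 32-bit word. */
typedef struct toy_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:TOY_LRU_BITS;
    int refcount;
    void *ptr;
} toy_robj;

/* Stand-in for struct sdshdr8: packed so the header is exactly 3 bytes. */
struct __attribute__((__packed__)) toy_sdshdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

static_assert(sizeof(struct toy_sdshdr8) == 3, "sds header must be 3 bytes");

/* Bytes needed for one embedded-string object holding `len` characters:
 * object header + sds header + characters + terminating '\0'. */
size_t toy_embstr_alloc_size(size_t len) {
    return sizeof(toy_robj) + sizeof(struct toy_sdshdr8) + len + 1;
}
```

On such a platform `toy_embstr_alloc_size(44)` comes to 64, and one more character pushes the object past the 64-byte boundary.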

lru (24 bits)

As is well known, Redis provides policies for automatically evicting data. How does it know which data is stale, and by what policy does it evict? The answers to both questions involve the lru field. Redis gives this field 24 bits. Don't assume that because the field is called lru it is only used by the LRU eviction policy; LFU uses it too. My guess is that the Redis author wrote the LRU policy first, hence the name, and later reused the field directly when implementing LFU.

The lru field means different things under different eviction policies. Under LRU it is a 24-bit Unix timestamp at second resolution, recording when the data was last accessed. Under LFU the 24 bits are split into two parts: a 16-bit timestamp at minute resolution and an 8-bit special counter. I won't go into details here; for more, see my follow-up posts.
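To make the two layouts of the 24-bit field concrete, here is a minimal sketch of the packing. The function names are hypothetical, and the timestamps are passed in as plain seconds rather than read from the server clock, so this only illustrates the bit layout described above, not how Redis maintains the values.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LRU_BITS 24
#define TOY_LRU_MAX ((1u << TOY_LRU_BITS) - 1)

/* LRU mode: keep the low 24 bits of a seconds timestamp. */
uint32_t toy_lru_clock(uint64_t unix_seconds) {
    return (uint32_t)(unix_seconds & TOY_LRU_MAX);
}

/* LFU mode: high 16 bits = minutes timestamp, low 8 bits = counter. */
uint32_t toy_lfu_pack(uint64_t unix_seconds, uint8_t counter) {
    uint32_t minutes = (uint32_t)((unix_seconds / 60) & 0xFFFF);
    return (minutes << 8) | counter;
}

/* Extract the 16-bit minutes timestamp from a packed LFU value. */
uint32_t toy_lfu_minutes(uint32_t lru) {
    assert((lru >> TOY_LRU_BITS) == 0);   /* must fit in 24 bits */
    return (lru >> 8) & 0xFFFF;
}

/* Extract the 8-bit counter from a packed LFU value. */
uint8_t toy_lfu_counter(uint32_t lru) {
    assert((lru >> TOY_LRU_BITS) == 0);   /* must fit in 24 bits */
    return (uint8_t)(lru & 0xFF);
}
```

Both timestamps wrap around: the 24-bit seconds value after 2^24 seconds, and the 16-bit minutes value after 65536 minutes, so any code comparing them has to account for wrap-around.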

refcount

The reference count records how many places currently reference this robj, and it is what makes object reuse possible. Anyone familiar with garbage collection knows that one collection strategy is based on counters: when refcount drops to 0, the object is no longer in use and can be reclaimed. Redis implements this reference-counting strategy too.

*ptr

This one is simple. The fields above all provide meta information about the robj; this field holds the address of the actual data.

robj encoding and decoding

Redis always pushes memory savings to the extreme. Here the Redis author applies special encoding to string-type robj objects in order to save memory. The encoding code, with comments, is as follows:

/* Try to encode a string-type robj more compactly to save memory. */
robj *tryObjectEncoding(robj *o) {
    long value;
    sds s = o->ptr;
    size_t len;

    /* Make sure this is a string object, the only type we encode
     * in this function. Other types use encoded memory efficient
     * representations but are handled by the commands implementing
     * the type. */
    serverAssertWithInfo(NULL,o,o->type == OBJ_STRING);

    /* We try some specialized encoding only for objects that are
     * RAW or EMBSTR encoded, in other words objects that are still
     * represented by an actual array of chars. Anything that is not
     * an sds string is returned unchanged. */
    if (!sdsEncodedObject(o)) return o;

    /* It's not safe to encode shared objects: shared objects can be shared
     * everywhere in the "object space" of Redis and may end in places where
     * they are not handled. We handle them only as values in the keyspace.
     * In short: a shared object is not re-encoded, because doing so could
     * affect its other users. */
    if (o->refcount > 1) return o;

    /* Check if we can represent this string as a long integer.
     * Note that we are sure that a string larger than 20 chars is not
     * representable as a 32 nor 64 bit integer. */
    len = sdslen(s);
    if (len <= 20 && string2l(s,len,&value)) {
        /* This object is encodable as a long. Try to use a shared object.
         * Note that we avoid using shared integers when maxmemory is used
         * because every object needs to have a private LRU field for the LRU
         * algorithm to work well.
         * In short: if the value is in [0, OBJ_SHARED_INTEGERS) (10000) and
         * no eviction policy that needs per-object LRU/LFU data is active,
         * return the shared object for that number, so every such value
         * maps to one shared robj. */
        if ((server.maxmemory == 0 ||
            !(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS)) &&
            value >= 0 &&
            value < OBJ_SHARED_INTEGERS)
        {
            decrRefCount(o);
            incrRefCount(shared.integers[value]);
            return shared.integers[value];
        } else {
            /* Otherwise, a RAW object is converted in place to
             * OBJ_ENCODING_INT: the sds is freed and the long value is
             * stored directly in ptr. */
            if (o->encoding == OBJ_ENCODING_RAW) {
                sdsfree(o->ptr);
                o->encoding = OBJ_ENCODING_INT;
                o->ptr = (void*) value;
                return o;
            /* An EMBSTR object is replaced by a newly created
             * OBJ_ENCODING_INT object holding the long value. */
            } else if (o->encoding == OBJ_ENCODING_EMBSTR) {
                decrRefCount(o);
                return createStringObjectFromLongLongForValue(value);
            }
        }
    }

    /* Strings that cannot be converted to a long continue below. */

    /* If the string is small and is still RAW encoded,
     * try the EMBSTR encoding which is more efficient.
     * In this representation the object and the SDS string are allocated
     * in the same chunk of memory to save space and cache misses.
     * That is, a string of at most 44 bytes becomes OBJ_ENCODING_EMBSTR. */
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) {
        robj *emb;

        if (o->encoding == OBJ_ENCODING_EMBSTR) return o;
        emb = createEmbeddedStringObject(s,sdslen(s));
        decrRefCount(o);
        return emb;
    }

    /* We can't encode the object...
     *
     * Do the last try, and at least optimize the SDS string inside
     * the string object to require little space, in case there
     * is more than 10% of free space at the end of the SDS string.
     *
     * We do that only for relatively large strings as this branch
     * is only entered if the length of the string is greater than
     * OBJ_ENCODING_EMBSTR_SIZE_LIMIT (44). */
    trimStringObjectIfNeeded(o);

    /* Return the original object. */
    return o;
}
  1. Check whether the object is an sds-encoded string (RAW or EMBSTR). If not, return it unchanged.
  2. Check whether it is a shared object (refcount > 1). Shared objects are not encoded.
  3. If the string is at most 20 characters long and parses as an integer, encode it as a long. Values from 0 to 9999 use the shared integer objects when the eviction policy allows it.
  4. If the string length is at most 44, store it as OBJ_ENCODING_EMBSTR.
  5. If none of the encodings applied, the string is longer than 44, and more than 10% of the sds is free space, release the free space to save memory.
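The decision order in steps 3 to 5 can be sketched as a standalone classifier. This is a simplified illustration, not the Redis implementation: `toy_choose_encoding` and `toy_string_to_ll` are hypothetical names, the sketch uses `long long` where Redis uses `long`, it checks for a canonical integer with a parse-and-format round trip rather than with `string2l`, and it ignores shared objects, refcounts and trimming entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_EMBSTR_LIMIT 44   /* assumed to mirror OBJ_ENCODING_EMBSTR_SIZE_LIMIT */

typedef enum { TOY_ENC_RAW, TOY_ENC_INT, TOY_ENC_EMBSTR } toy_encoding;

/* Returns 1 and stores the value if s[0..len) is the canonical decimal
 * form of a long long (no spaces, no '+', no leading zeros). */
int toy_string_to_ll(const char *s, size_t len, long long *out) {
    char tmp[21], back[32];

    assert(s != NULL && out != NULL);
    if (len == 0 || len > 20) return 0;
    memcpy(tmp, s, len);
    tmp[len] = '\0';
    long long v = strtoll(tmp, NULL, 10);
    /* Round trip: only canonical input formats back to itself. */
    int n = snprintf(back, sizeof(back), "%lld", v);
    if (n < 0 || (size_t)n != len || memcmp(back, s, len) != 0) return 0;
    *out = v;
    return 1;
}

/* Integer first, then embedded string, otherwise raw. */
toy_encoding toy_choose_encoding(const char *s, size_t len) {
    long long v;

    if (toy_string_to_ll(s, len, &v)) return TOY_ENC_INT;
    if (len <= TOY_EMBSTR_LIMIT) return TOY_ENC_EMBSTR;
    return TOY_ENC_RAW;
}
```

With this sketch, "12345" is classified as an integer, "007" and "hello" as embedded strings (a leading zero is not canonical), and any non-numeric string of 45 bytes or more as raw.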

Where there is encoding there is also decoding. The code is as follows and is relatively simple:

/* Get a decoded version of an encoded object (returned as a new object).
 * If the object is already raw-encoded just increment the ref count. */
robj *getDecodedObject(robj *o) {
    robj *dec;

    if (sdsEncodedObject(o)) {
        incrRefCount(o);
        return o;
    }
    if (o->type == OBJ_STRING && o->encoding == OBJ_ENCODING_INT) {
        char buf[32];

        ll2string(buf,32,(long)o->ptr);
        dec = createStringObject(buf,strlen(buf));
        return dec;
    } else {
        serverPanic("Unknown encoding type");
    }
}

Reference counting and automatic cleanup

As mentioned above, Redis reuses some objects to save space, and objects that are no longer referenced are cleaned up automatically. The author implements this GC with reference counting. The code is again relatively simple:

void incrRefCount(robj *o) {
    if (o->refcount < OBJ_FIRST_SPECIAL_REFCOUNT) {
        o->refcount++;
    } else {
        if (o->refcount == OBJ_SHARED_REFCOUNT) {
            /* Nothing to do: this refcount is immutable. */
        } else if (o->refcount == OBJ_STATIC_REFCOUNT) {
            serverPanic("You tried to retain an object allocated in the stack");
        }
    }
}

/* Decrement the reference count; free the memory when no reference is left. */
void decrRefCount(robj *o) {
    /* Last reference: release the value according to its type, then the robj. */
    if (o->refcount == 1) {
        switch(o->type) {
        case OBJ_STRING: freeStringObject(o); break;
        case OBJ_LIST: freeListObject(o); break;
        case OBJ_SET: freeSetObject(o); break;
        case OBJ_ZSET: freeZsetObject(o); break;
        case OBJ_HASH: freeHashObject(o); break;
        case OBJ_MODULE: freeModuleObject(o); break;
        case OBJ_STREAM: freeStreamObject(o); break;
        default: serverPanic("Unknown object type"); break;
        }
        zfree(o);
    } else {
        if (o->refcount <= 0) serverPanic("decrRefCount against refcount <= 0");
        if (o->refcount != OBJ_SHARED_REFCOUNT) o->refcount--;
    }
}
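The same pattern can be tried outside Redis with a toy object. The sketch below is a simplified illustration under stated assumptions: `toy_obj` and the `toy_*` functions are hypothetical, `INT_MAX` is used as the "shared, never freed" sentinel in the spirit of OBJ_SHARED_REFCOUNT, and a global counter exists only so the effect of freeing can be observed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TOY_SHARED_REFCOUNT INT_MAX   /* sentinel: never counted, never freed */

typedef struct toy_obj {
    int refcount;
    long value;
} toy_obj;

static int toy_live_objects = 0;      /* bookkeeping for the demo only */

/* Allocate an object owned by exactly one reference. */
toy_obj *toy_create(long value) {
    toy_obj *o = malloc(sizeof(*o));
    if (o == NULL) abort();
    o->refcount = 1;
    o->value = value;
    toy_live_objects++;
    return o;
}

/* Take another reference; shared objects are left untouched. */
void toy_incr(toy_obj *o) {
    if (o->refcount != TOY_SHARED_REFCOUNT) o->refcount++;
}

/* Drop a reference; the last one frees the object. */
void toy_decr(toy_obj *o) {
    assert(o->refcount > 0);
    if (o->refcount == TOY_SHARED_REFCOUNT) return;
    if (--o->refcount == 0) {
        free(o);
        toy_live_objects--;
    }
}

int toy_live_count(void) { return toy_live_objects; }
```

Every `toy_incr` must be matched by a `toy_decr`. An object whose refcount is the sentinel ignores both calls, which is what lets one shared instance be handed out any number of times without bookkeeping.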

Summary

To sum up, robj serves the following purposes.

  1. It provides a uniform wrapper for values of every type.
  2. It stores the information needed for data eviction.
  3. It enables data reuse and automatic GC.

This article is part of my Redis source analysis blog series, and there is also a corresponding Chinese-annotated version of the Redis source. If you want to learn more about Redis, stars and follows are welcome. Chinese-annotated Redis repository: https://github.com/xindoo/Redis Redis source analysis column: https://zxs.io/s/1h If you found this article useful, please like, save, and share it.

Copyright notice
This article was written by [xindoo]. Please include a link to the original when reposting. Thank you.
https://javamana.com/2021/01/20210121234656720F.html
