Redis源代码分析-内存数据结构intset
這次研究了一下intset。研究的過程中,一度看不下過去,可是還是咬牙挺過來了。看懂了也就是那么回事。靜下心來,切莫浮躁
Redis為了追求高效,在存儲下做了非常多的優化,像intset就是作者為了節約內存定制的數據結構,包含后面將要閱讀的壓縮列表。
intset是一個有序的整數集,提供了添加,刪除,查找的接口,針對uint16_t uint32_t uint64_t,提供了不同編碼的轉換(嚴格的說僅僅是類型的提升)
首先。看一下它的結構定義:
typedef struct intset { uint32_t encoding; uint32_t length; int8_t contents[]; } intset;encoding:有例如以下幾種編碼 #define INTSET_ENC_INT16 (sizeof(int16_t)) #define INTSET_ENC_INT32 (sizeof(int32_t)) #define INTSET_ENC_INT64 (sizeof(int64_t))實際上這里使用一個uint8_t存儲就夠了length:當前整數集有多少個整數
contents[]:詳細存儲的位置。這里以一個字節為存儲單元,方便對高類型進行尋址
看一下它對外提供的接口:
intset *intsetNew(void); intset *intsetAdd(intset *is, int64_t value, uint8_t *success); intset *intsetRemove(intset *is, int64_t value, int *success); uint8_t intsetFind(intset *is, int64_t value); int64_t intsetRandom(intset *is); uint8_t intsetGet(intset *is, uint32_t pos, int64_t *value); uint32_t intsetLen(intset *is); size_t intsetBlobLen(intset *is); 一種數據結構。必定要提供類似插入。查詢。刪除這種接口。另外不要暴露內部使用的接口,這里提供的接口,我們詳細分析幾個初始化接口:
/* Create an empty intset. */ intset *intsetNew(void) {intset *is = malloc(sizeof(intset));is->encoding = intrev32ifbe(INTSET_ENC_INT16);is->length = 0;return is; }沒什么難的,注意默認使用最低的2字節存儲 /* Insert an integer in the intset */ intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {uint8_t valenc = _intsetValueEncoding(value);uint32_t pos;if (success) *success = 1;/* Upgrade encoding if necessary. If we need to upgrade, we know that* this value should be either appended (if > 0) or prepended (if < 0),* because it lies outside the range of existing values. */if (valenc > intrev32ifbe(is->encoding)) {/* This always succeeds, so we don't need to curry *success. */return intsetUpgradeAndAdd(is,value);} else {/* Abort if the value is already present in the set.* This call will populate "pos" with the right position to insert* the value when it cannot be found. */if (intsetSearch(is,value,&pos)) {if (success) *success = 0;return is;}is = intsetResize(is,intrev32ifbe(is->length)+1);if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1);}_intsetSet(is,pos,value);is->length = intrev32ifbe(intrev32ifbe(is->length)+1);return is; }這個接口比較有難度。詳細分析:
1、首先推斷要添加的值的編碼是否大于當前編碼,大于則進行類型提升。并添加value
2、假設小于當前編碼,首先查詢數據是否存在,存在則返回,不存在則設置插入位置pos
3、又一次分配內存大小
4、移動數據。全部數據往后移動。復雜度有點高啊
5、插入數據,設置數據個數
當中。類型提升并插入value的接口例如以下:
/* Upgrades the intset to a larger encoding and inserts the given integer. */ static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {uint8_t curenc = intrev32ifbe(is->encoding);uint8_t newenc = _intsetValueEncoding(value);int length = intrev32ifbe(is->length);int prepend = value < 0 ? 1 : 0;/* First set new encoding and resize */is->encoding = intrev32ifbe(newenc);is = intsetResize(is,intrev32ifbe(is->length)+1);/* Upgrade back-to-front so we don't overwrite values.* Note that the "prepend" variable is used to make sure we have an empty* space at either the beginning or the end of the intset. */while(length--)_intsetSet(is,length+prepend,_intsetGetEncoded(is,length,curenc));/* Set the value at the beginning or the end. */if (prepend)_intsetSet(is,0,value);else_intsetSet(is,intrev32ifbe(is->length),value);is->length = intrev32ifbe(intrev32ifbe(is->length)+1);return is; }能夠看到,類型提升的步驟例如以下:1、由于整數集是有序的,所以首先推斷要加入的數是正數還是負數,正數就在尾部加入,負數則在頭部加入
2、添加內存大小
3、移動數據,這里和第一步掛鉤。并且移動的過程比較難以理解,首先依據原來編碼取出數據,然后依據新的編碼插入數據
4、插入數據。在頭部還是尾部插入
5、改動數據個數
另外移動數據的接口例如以下:
static void intsetMoveTail(intset *is, uint32_t from, uint32_t to) {void *src, *dst;uint32_t bytes = intrev32ifbe(is->length)-from;uint32_t encoding = intrev32ifbe(is->encoding);if (encoding == INTSET_ENC_INT64) {src = (int64_t*)is->contents+from;dst = (int64_t*)is->contents+to;bytes *= sizeof(int64_t);} else if (encoding == INTSET_ENC_INT32) {src = (int32_t*)is->contents+from;dst = (int32_t*)is->contents+to;bytes *= sizeof(int32_t);} else {src = (int16_t*)is->contents+from;dst = (int16_t*)is->contents+to;bytes *= sizeof(int16_t);}memmove(dst,src,bytes); }由于是連續的內存,找到移動的起始位置,然后memmove(),bingo!!!
查找數據的接口實現:
static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) {int min = 0, max = intrev32ifbe(is->length)-1, mid = -1;int64_t cur = -1;/* The value can never be found when the set is empty */if (intrev32ifbe(is->length) == 0) {if (pos) *pos = 0;return 0;} else {/* Check for the case where we know we cannot find the value,* but do know the insert position. */if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) {if (pos) *pos = intrev32ifbe(is->length);return 0;} else if (value < _intsetGet(is,0)) {if (pos) *pos = 0;return 0;}}while(max >= min) {mid = ((unsigned int)min + (unsigned int)max) >> 1;cur = _intsetGet(is,mid);if (value > cur) {min = mid+1;} else if (value < cur) {max = mid-1;} else {break;}}if (value == cur) {if (pos) *pos = mid;return 1;} else {if (pos) *pos = min;return 0;} }還是個二分查找,niubility!!
!個人感覺這樣的數據結構的高效就體如今這里。由于是有序。所以查找高速,由于是數組。所以插入。刪除。是連續內存拷貝,也非常快
有時間突然想去看一下STL Vector的實現了,它的insert是怎樣實現的?
轉載于:https://www.cnblogs.com/mfrbuaa/p/5265268.html
總結
以上是生活随笔為你收集整理的Redis源代码分析-内存数据结构intset的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 理解CSS3中的background-s
- 下一篇: 蒙特卡洛法—非均匀随机数的产生