Redis 数据类型介绍
Redis 數據類型介紹
你也許已經知道Redis并不是簡單的key-value存儲,實際上他是一個數據結構服務器,支持不同類型的值。也就是說,你不必僅僅把字符串當作鍵所指向的值。下列這些數據類型都可作為值類型:
- 二進制安全的字符串
- Lists: 按插入順序排序的字符串元素的集合。他們基本上就是鏈表(linked lists)。
- Sets: 不重復且無序的字符串元素的集合。
- Sorted sets,類似Sets,但是每個字符串元素都關聯到一個叫score浮動數值(floating number value)。里面的元素總是通過score進行著排序,所以不同的是,它是可以檢索的一系列元素。(例如你可能會問:給我前面10個或者后面10個元素)。
- Hashes,由field和關聯的value組成的map。field和value都是字符串的。這和Ruby、Python的hashes很像。
- Bit arrays (或者說 simply bitmaps): 通過特殊的命令,你可以將 String 值當作一系列 bits 處理:可以設置和清除單獨的 bits,數出所有設為 1 的 bits 的數量,找到最前的被設為 1 或 0 的 bit,等等。
- HyperLogLogs: 這是被用于估計一個 set 中元素數量的概率性的數據結構。別害怕,它比看起來的樣子要簡單…參見本教程的 HyperLogLog 部分。D
學習這些數據類型的原理,以及如何使用它們解決?command reference?中的特定問題,并不總是不關緊要的。所以,本文檔是一個關于 Redis 數據類型和它們最常見特性的導論。 在所有的例子中,我們將使用?redis-cli?工具。它是一個簡單而有用的命令行工具,用于向 Redis 服務器發出命令。
Redis keys
Redis key值是二進制安全的,這意味著可以用任何二進制序列作為key值,從形如”foo”的簡單字符串到一個JPEG文件的內容都可以。空字符串也是有效key值。
關于key的幾條規則:
- 太長的鍵值不是個好主意,例如1024字節的鍵值就不是個好主意,不僅因為消耗內存,而且在數據中查找這類鍵值的計算成本很高。
- 太短的鍵值通常也不是好主意,如果你要用”u:1000:pwd”來代替”user:1000:password”,這沒有什么問題,但后者更易閱讀,并且由此增加的空間消耗相對于key object和value object本身來說很小。當然,沒人阻止您一定要用更短的鍵值節省一丁點兒空間。
- 最好堅持一種模式。例如:”object-type:id:field”就是個不錯的注意,像這樣”user:1000:password”。我喜歡對多單詞的字段名中加上一個點,就像這樣:”comment:1234:reply.to”。
Redis Strings
這是最簡單Redis類型。如果你只用這種類型,Redis就像一個可以持久化的memcached服務器(注:memcache的數據僅保存在內存中,服務器重啟后,數據將丟失)。
我們用redis-cli來玩一下字符串類型:
> set mykey somevalue OK > get mykey "somevalue"正如你所見到的,通常用SET command 和 GET command來設置和獲取字符串值。
值可以是任何種類的字符串(包括二進制數據),例如你可以在一個鍵下保存一副jpeg圖片。值的長度不能超過512 MB。
SET?命令有些有趣的操作,例如,當key存在時SET會失敗,或相反的,當key不存在時它只會成功。
> set mykey newval nx (nil) > set mykey newval xx OK雖然字符串是Redis的基本值類型,但你仍然能通過它完成一些有趣的操作。例如:原子遞增:
> set counter 100 OK > incr counter (integer) 101 > incr counter (integer) 102 > incrby counter 50 (integer) 152INCR?命令將字符串值解析成整型,將其加一,最后將結果保存為新的字符串值,類似的命令有INCRBY,?DECR?和?DECRBY。實際上他們在內部就是同一個命令,只是看上去有點兒不同。
INCR是原子操作意味著什么呢?就是說即使多個客戶端對同一個key發出INCR命令,也決不會導致競爭的情況。例如如下情況永遠不可能發生:『客戶端1和客戶端2同時讀出“10”,他們倆都對其加到11,然后將新值設置為11』。最終的值一定是12,read-increment-set操作完成時,其他客戶端不會在同一時間執行任何命令。
對字符串,另一個的令人感興趣的操作是GETSET命令,行如其名:他為key設置新值并且返回原值。這有什么用處呢?例如:你的系統每當有新用戶訪問時就用INCR命令操作一個Redis key。你希望每小時對這個信息收集一次。你就可以GETSET這個key并給其賦值0并讀取原值。
為減少等待時間,也可以一次存儲或獲取多個key對應的值,使用MSET和MGET命令:
> mset a 10 b 20 c 30 OK > mget a b c 1) "10" 2) "20" 3) "30"MGET 命令返回由值組成的數組。
修改或查詢鍵空間
有些指令不是針對任何具體的類型定義的,而是用于和整個鍵空間交互的。因此,它們可被用于任何類型的鍵。
使用EXISTS命令返回1或0標識給定key的值是否存在,使用DEL命令可以刪除key對應的值,DEL命令返回1或0標識值是被刪除(值存在)或者沒被刪除(key對應的值不存在)。
> set mykey hello OK > exists mykey (integer) 1 > del mykey (integer) 1 > exists mykey (integer) 0TYPE命令可以返回key對應的值的存儲類型:
> set mykey x OK > type mykey string > del mykey (integer) 1 > type mykey noneRedis超時:數據在限定時間內存活
在介紹復雜類型前我們先介紹一個與值類型無關的Redis特性:超時。你可以對key設置一個超時時間,當這個時間到達后會被刪除。精度可以使用毫秒或秒。
> set key some-value OK > expire key 5 (integer) 1 > get key (immediately) "some-value" > get key (after some time) (nil)上面的例子使用了EXPIRE來設置超時時間(也可以再次調用這個命令來改變超時時間,使用PERSIST命令去除超時時間 )。我們也可以在創建值的時候設置超時時間:
> set key 100 ex 10 OK > ttl key (integer) 9TTL命令用來查看key對應的值剩余存活時間。
Redis Lists
要說清楚列表數據類型,最好先講一點兒理論背景,在信息技術界List這個詞常常被使用不當。例如”Python Lists”就名不副實(名為Linked Lists),但他們實際上是數組(同樣的數據類型在Ruby中叫數組)
一般意義上講,列表就是有序元素的序列:10,20,1,2,3就是一個列表。但用數組實現的List和用Linked List實現的List,在屬性方面大不相同。
Redis lists基于Linked Lists實現。這意味著即使在一個list中有數百萬個元素,在頭部或尾部添加一個元素的操作,其時間復雜度也是常數級別的。用LPUSH 命令在十個元素的list頭部添加新元素,和在千萬元素list頭部添加新元素的速度相同。
那么,壞消息是什么?在數組實現的list中利用索引訪問元素的速度極快,而同樣的操作在linked list實現的list上沒有那么快。
Redis Lists用linked list實現的原因是:對于數據庫系統來說,至關重要的特性是:能非??斓脑诤艽蟮牧斜砩咸砑釉?。另一個重要因素是,正如你將要看到的:Redis lists能在常數時間取得常數長度。
如果快速訪問集合元素很重要,建議使用可排序集合(sorted sets)??膳判蚣衔覀儠S后介紹。
Redis lists 入門
LPUSH?命令可向list的左邊(頭部)添加一個新元素,而RPUSH命令可向list的右邊(尾部)添加一個新元素。最后LRANGE?命令可從list中取出一定范圍的元素:
> rpush mylist A (integer) 1 > rpush mylist B (integer) 2 > lpush mylist first (integer) 3 > lrange mylist 0 -1 1) "first" 2) "A" 3) "B"注意:LRANGE?帶有兩個索引,一定范圍的第一個和最后一個元素。這兩個索引都可以為負來告知Redis從尾部開始計數,因此-1表示最后一個元素,-2表示list中的倒數第二個元素,以此類推。
上面的所有命令的參數都可變,方便你一次向list存入多個值。
> rpush mylist 1 2 3 4 5 "foo bar" (integer) 9 > lrange mylist 0 -1 1) "first" 2) "A" 3) "B" 4) "1" 5) "2" 6) "3" 7) "4" 8) "5" 9) "foo bar"還有一個重要的命令是pop,它從list中刪除元素并同時返回刪除的值。可以在左邊或右邊操作。
> rpush mylist a b c (integer) 3 > rpop mylist "c" > rpop mylist "b" > rpop mylist "a"我們增加了三個元素,并彈出了三個元素,因此,在這最后 列表中的命令序列是空的,沒有更多的元素可以被彈出。如果我們嘗試彈出另一個元素,這是我們得到的結果:
> rpop mylist (nil)當list沒有元素時,Redis 返回了一個NULL。
List的常用案例
正如你可以從上面的例子中猜到的,list可被用來實現聊天系統。還可以作為不同進程間傳遞消息的隊列。關鍵是,你可以每次都以原先添加的順序訪問數據。這不需要任何SQL ORDER BY 操作,將會非???#xff0c;也會很容易擴展到百萬級別元素的規模。
例如在評級系統中,比如社會化新聞網站 reddit.com,你可以把每個新提交的鏈接添加到一個list,用LRANGE可簡單的對結果分頁。
在博客引擎實現中,你可為每篇日志設置一個list,在該list中推入博客評論,等等。
Capped lists
可以使用LTRIM把list從左邊截取指定長度。
> rpush mylist 1 2 3 4 5 (integer) 5 > ltrim mylist 0 2 OK > lrange mylist 0 -1 1) "1" 2) "2" 3) "3"List上的阻塞操作
可以使用Redis來實現生產者和消費者模型,如使用LPUSH和RPOP來實現該功能。但會遇到這種情景:list是空,這時候消費者就需要輪詢來獲取數據,這樣就會增加redis的訪問壓力、增加消費端的cpu時間,而很多訪問都是無用的。為此redis提供了阻塞式訪問?BRPOP?和?BLPOP?命令。 消費者可以在獲取數據時指定如果數據不存在阻塞的時間,如果在時限內獲得數據則立即返回,如果超時還沒有數據則返回null, 0表示一直阻塞。
同時redis還會為所有阻塞的消費者以先后順序排隊。
如需了解詳細信息請查看?RPOPLPUSH?和?BRPOPLPUSH。
key 的自動創建和刪除
目前為止,在我們的例子中,我們沒有在推入元素之前創建空的 list,或者在 list 沒有元素時刪除它。在 list 為空時刪除 key,并在用戶試圖添加元素(比如通過?LPUSH)而鍵不存在時創建空 list,是 Redis 的職責。
這不光適用于 lists,還適用于所有包括多個元素的 Redis 數據類型 – Sets, Sorted Sets 和 Hashes。
基本上,我們可以用三條規則來概括它的行為:
規則 1 示例:
> del mylist (integer) 1 > lpush mylist 1 2 3 (integer) 3但是,我們不能對存在但類型錯誤的 key 做操作: ? > set foo bar OK > lpush foo 1 2 3 (error) WRONGTYPE Operation against a key holding the wrong kind of value > type foo string
規則 2 示例:
> lpush mylist 1 2 3 (integer) 3 > exists mylist (integer) 1 > lpop mylist "3" > lpop mylist "2" > lpop mylist "1" > exists mylist (integer) 0所有的元素被彈出之后, key 不復存在。
規則 3 示例:
> del mylist (integer) 0 > llen mylist (integer) 0 > lpop mylist (nil)Redis Hashes —
Redis hash 看起來就像一個 “hash” 的樣子,由鍵值對組成:
> hmset user:1000 username antirez birthyear 1977 verified 1 OK > hget user:1000 username "antirez" > hget user:1000 birthyear "1977" > hgetall user:1000 1) "username" 2) "antirez" 3) "birthyear" 4) "1977" 5) "verified" 6) "1"Hash 便于表示?objects,實際上,你可以放入一個 hash 的域數量實際上沒有限制(除了可用內存以外)。所以,你可以在你的應用中以不同的方式使用 hash。
HMSET?指令設置 hash 中的多個域,而?HGET?取回單個域。HMGET?和?HGET?類似,但返回一系列值:
> hmget user:1000 username birthyear no-such-field 1) "antirez" 2) "1977" 3) (nil)也有一些指令能夠對單獨的域執行操作,比如?HINCRBY:
> hincrby user:1000 birthyear 10 (integer) 1987 > hincrby user:1000 birthyear 10 (integer) 1997你可以在文檔中找到?hash 指令的完整列表。
值得注意的是,小的 hash 被用特殊方式編碼,非常節約內存。
Redis Sets —
Redis Set 是 String 的無序排列。SADD?指令把新的元素添加到 set 中。對 set 也可做一些其他的操作,比如測試一個給定的元素是否存在,對不同 set 取交集,并集或差,等等。
> sadd myset 1 2 3 (integer) 3 > smembers myset 1. 3 2. 1 3. 2現在我已經把三個元素加到我的 set 中,并告訴 Redis 返回所有的元素??梢钥吹?#xff0c;它們沒有被排序 —— Redis 在每次調用時可能按照任意順序返回元素,因為對于元素的順序并沒有規定。
Redis 有檢測成員的指令。一個特定的元素是否存在?
> sismember myset 3 (integer) 1 > sismember myset 30 (integer) 0“3” 是 set 的一個成員,而 “30” 不是。
Sets 適合用于表示對象間的關系。 例如,我們可以輕易使用 set 來表示標記。
一個簡單的建模方式是,對每一個希望標記的對象使用 set。這個 set 包含和對象相關聯的標簽的 ID。
假設我們想要給新聞打上標簽。 假設新聞 ID 1000 被打上了 1,2,5 和 77 四個標簽,我們可以使用一個 set 把 tag ID 和新聞條目關聯起來:
> sadd news:1000:tags 1 2 5 77 (integer) 4但是,有時候我可能也會需要相反的關系:所有被打上相同標簽的新聞列表:
> sadd tag:1:news 1000 (integer) 1 > sadd tag:2:news 1000 (integer) 1 > sadd tag:5:news 1000 (integer) 1 > sadd tag:77:news 1000 (integer) 1獲取一個對象的所有 tag 是很方便的:
> smembers news:1000:tags 1. 5 2. 1 3. 77 4. 2注意:在這個例子中,我們假設你有另一個數據結構,比如一個 Redis hash,把標簽 ID 對應到標簽名稱。
使用 Redis 命令行,我們可以輕易實現其它一些有用的操作。比如,我們可能需要一個含有 1, 2, 10, 和 27 標簽的對象的列表。我們可以用?SINTER?命令來完成這件事。它獲取不同 set 的交集。我們可以用:
> sinter tag:1:news tag:2:news tag:10:news tag:27:news ... results here ...不光可以取交集,還可以取并集,差集,獲取隨機元素,等等。
獲取一個元素的命令是?SPOP,它很適合對特定問題建模。比如,要實現一個基于 web 的撲克游戲,你可能需要用 set 來表示一副牌。假設我們用一個字符的前綴來表示不同花色:
> sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CKD1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6S7 S8 S9 S10 SJ SQ SK(integer) 52現在,我們想要給每個玩家 5 張牌。SPOP?命令刪除一個隨機元素,把它返回給客戶端,因此它是完全合適的操作。
但是,如果我們對我們的牌直接調用它,在下一盤我們就需要重新充滿這副牌。開始,我們可以復制?deck?鍵中的內容,并放入?game:1:deck?鍵中。
這是通過?SUNIONSTORE?實現的,它通常用于對多個集合取并集,并把結果存入另一個 set 中。但是,因為一個 set 的并集就是它本身,我可以這樣復制我的牌:
> sunionstore game:1:deck deck (integer) 52現在,我已經準備好給 1 號玩家發五張牌:
> spop game:1:deck "C6" > spop game:1:deck "CQ" > spop game:1:deck "D1" > spop game:1:deck "CJ" > spop game:1:deck "SJ"One pair of jacks, not great…
Now it’s a good time to introduce the set command that provides the number of elements inside a set. This is often called the?cardinality of a set?in the context of set theory, so the Redis command is called?SCARD.
> scard game:1:deck (integer) 47The math works: 52 - 5 = 47.
When you need to just get random elements without removing them from the set, there is the?SRANDMEMBER?command suitable for the task. It also features the ability to return both repeating and non-repeating elements.
Redis Sorted sets —
Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well.
However while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called?the score?(this is why the type is also similar to a hash, since every element is mapped to a value).
Moreover, elements in a sorted sets are?taken in order?(so they are not ordered on request, order is a peculiarity of the data structure used to represent sorted sets). They are ordered according to the following rule:
- If A and B are two elements with a different score, then A > B if A.score is > B.score.
- If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can’t be equal since sorted sets only have unique elements.
Let’s start with a simple example, adding a few selected hackers names as sorted set elements, with their year of birth as “score”.
> zadd hackers 1940 "Alan Kay" (integer) 1 > zadd hackers 1957 "Sophie Wilson" (integer 1) > zadd hackers 1953 "Richard Stallman" (integer) 1 > zadd hackers 1949 "Anita Borg" (integer) 1 > zadd hackers 1965 "Yukihiro Matsumoto" (integer) 1 > zadd hackers 1914 "Hedy Lamarr" (integer) 1 > zadd hackers 1916 "Claude Shannon" (integer) 1 > zadd hackers 1969 "Linus Torvalds" (integer) 1 > zadd hackers 1912 "Alan Turing" (integer) 1As you can see?ZADD?is similar to?SADD, but takes one additional argument (placed before the element to be added) which is the score.?ZADD?is also variadic, so you are free to specify multiple score-value pairs, even if this is not used in the example above.
With sorted sets it is trivial to return a list of hackers sorted by their birth year because actually?they are already sorted.
Implementation note: Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That’s good, but when we ask for sorted elements Redis does not have to do any work at all, it’s already all sorted:
> zrange hackers 0 -1 1) "Alan Turing" 2) "Hedy Lamarr" 3) "Claude Shannon" 4) "Alan Kay" 5) "Anita Borg" 6) "Richard Stallman" 7) "Sophie Wilson" 8) "Yukihiro Matsumoto" 9) "Linus Torvalds"Note: 0 and -1 means from element index 0 to the last element (-1 works here just as it does in the case of the?LRANGE?command).
What if I want to order them the opposite way, youngest to oldest? Use?ZREVRANGE?instead of?ZRANGE:
> zrevrange hackers 0 -1 1) "Linus Torvalds" 2) "Yukihiro Matsumoto" 3) "Sophie Wilson" 4) "Richard Stallman" 5) "Anita Borg" 6) "Alan Kay" 7) "Claude Shannon" 8) "Hedy Lamarr" 9) "Alan Turing"It is possible to return scores as well, using the?WITHSCORES?argument:
> zrange hackers 0 -1 withscores 1) "Alan Turing" 2) "1912" 3) "Hedy Lamarr" 4) "1914" 5) "Claude Shannon" 6) "1916" 7) "Alan Kay" 8) "1940" 9) "Anita Borg" 10) "1949" 11) "Richard Stallman" 12) "1953" 13) "Sophie Wilson" 14) "1957" 15) "Yukihiro Matsumoto" 16) "1965" 17) "Linus Torvalds" 18) "1969"Operating on ranges
Sorted sets are more powerful than this. They can operate on ranges. Let’s get all the individuals that were born up to 1950 inclusive. We use the?ZRANGEBYSCORE?command to do it:
> zrangebyscore hackers -inf 1950 1) "Alan Turing" 2) "Hedy Lamarr" 3) "Claude Shannon" 4) "Alan Kay" 5) "Anita Borg"We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).
It’s also possible to remove ranges of elements. Let’s remove all the hackers born between 1940 and 1960 from the sorted set:
> zremrangebyscore hackers 1940 1960 (integer) 4ZREMRANGEBYSCORE?is perhaps not the best command name, but it can be very useful, and returns the number of removed elements.
Another extremely useful operation defined for sorted set elements is the get-rank operation. It is possible to ask what is the position of an element in the set of the ordered elements.
> zrank hackers "Anita Borg" (integer) 4The?ZREVRANK?command is also available in order to get the rank, considering the elements sorted a descending way.
Lexicographical scores
With recent versions of Redis 2.8, a new feature was introduced that allows getting ranges lexicographically, assuming elements in a sorted set are all inserted with the same identical score (elements are compared with the C?memcmp?function, so it is guaranteed that there is no collation, and every Redis instance will reply with the same output).
The main commands to operate with lexicographical ranges are?ZRANGEBYLEX,?ZREVRANGEBYLEX,?ZREMRANGEBYLEX?and?ZLEXCOUNT.
For example, let’s add again our list of famous hackers, but this time use a score of zero for all the elements:
> zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0"Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon"0 "Linus Torvalds" 0 "Alan Turing"Because of the sorted sets ordering rules, they are already sorted lexicographically:
> zrange hackers 0 -1 1) "Alan Kay" 2) "Alan Turing" 3) "Anita Borg" 4) "Claude Shannon" 5) "Hedy Lamarr" 6) "Linus Torvalds" 7) "Richard Stallman" 8) "Sophie Wilson" 9) "Yukihiro Matsumoto"Using?ZRANGEBYLEX?we can ask for lexicographical ranges:
> zrangebylex hackers [B [P 1) "Claude Shannon" 2) "Hedy Lamarr" 3) "Linus Torvalds"Ranges can be inclusive or exclusive (depending on the first character), also string infinite and minus infinite are specified respectively with the?+?and?-?strings. See the documentation for more information.
This feature is important because it allows us to use sorted sets as a generic index. For example, if you want to index elements by a 128-bit unsigned integer argument, all you need to do is to add elements into a sorted set with the same score (for example 0) but with an 8 byte prefix consisting of?the 128 bit number in big endian. Since numbers in big endian, when ordered lexicographically (in raw bytes order) are actually ordered numerically as well, you can ask for ranges in the 128 bit space, and get the element’s value discarding the prefix.
If you want to see the feature in the context of a more serious demo, check the?Redis autocomplete demo.
Updating the score: leader boards
Just a final note about sorted sets before switching to the next topic. Sorted sets’ scores can be updated at any time. Just calling?ZADD?against an element already included in the sorted set will update its score (and position) with O(log(N)) time complexity. As such, sorted sets are suitable when there are tons of updates.
Because of this characteristic a common use case is leader boards. The typical application is a Facebook game where you combine the ability to take users sorted by their high score, plus the get-rank operation, in order to show the top-N users, and the user rank in the leader board (e.g., “you are the #4932 best score here”).
Bitmaps —
Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits.
Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting).
One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example in a system where different users are represented by incremental user IDs, it is possible to remember a single bit information (for example, knowing whether a user wants to receive a newsletter) of 4 billion of users using just 512 MB of memory.
Bits are set and retrieved using the?SETBIT?and?GETBIT?commands:
> setbit key 10 1 (integer) 1 > getbit key 10 (integer) 1 > getbit key 11 (integer) 0The?SETBIT?command takes as its first argument the bit number, and as its second argument the value to set the bit to, which is 1 or 0. The command automatically enlarges the string if the addressed bit is outside the current string length.
GETBIT?just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero.
There are three commands operating on group of bits:
Both?BITPOS?and?BITCOUNT?are able to operate with byte ranges of the string, instead of running for the whole length of the string. The following is a trivial example of?BITCOUNT?call:
> setbit key 0 1 (integer) 0 > setbit key 100 1 (integer) 0 > bitcount key (integer) 2Common user cases for bitmaps are:
- Real time analytics of all kinds.
- Storing space efficient but high performance boolean information associated with object IDs.
For example imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with?SETBIT?every time the user visits the web site. As a bit index you simply take the current unix time, subtract the initial offset, and divide by 3600*24.
This way for each user you have a small string containing the visit information for each day. With?BITCOUNT?it is possible to easily get the number of days a given user visited the web site, while with a few?BITPOS?calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak.
Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into a key, a trivial strategy is just to store M bits per key and obtain the key name with?bit-number/M?and the Nth bit to address inside the key with?bit-number MOD M.
HyperLogLogs —
A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However there is a set of algorithms that trade memory for precision: you end with an estimated measure with a standard error, in the case of the Redis implementation, which is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We’ll just call them HLL from now) has seen very few elements.
HLLs in Redis, while technically a different data structure, is encoded as a Redis string, so you can call?GET?to serialize a HLL, and?SET?to deserialize it back to the server.
Conceptually the HLL API is like using Sets to do the same task. You would?SADD?every observed element into a set, and would use?SCARD?to check the number of elements inside the set, which are unique since?SADD?will not re-add an existing element.
While you don’t really?add items?into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same:
- Every time you see a new element, you add it to the count with?PFADD.
-
Every time you want to retrieve the current approximation of the unique elements?added?with?PFADD?so far, you use the?PFCOUNT.
> pfadd hll a b c d(integer) 1> pfcount hll(integer) 4
An example of use case for this data structure is counting unique queries performed by users in a search form every day.
Redis is also able to perform the union of HLLs, please check the?full documentation?for more information.
Other notable features
There are other important things in the Redis API that can’t be explored in the context of this document, but are worth your attention:
- It is possible to?iterate the key space of a large collection incrementally.
- It is possible to run?Lua scripts server side?to win latency and bandwidth.
- Redis is also a?Pub-Sub server.
Learn more
This tutorial is in no way complete and has covered just the basics of the API. Read the?command reference?to discover a lot more.
Thanks for reading, and have fun hacking with Redis!
總結
以上是生活随笔為你收集整理的Redis 数据类型介绍的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: face recognition[翻译]
- 下一篇: Redis FAQ