Thursday, November 25, 2010

Redis patterns | search

Problem

You want to implement search against user objects stored in redis using Python. Something like querying for all user ids whose username begins with "an".

Solution

Here we have user objects stored in as hashes with "user:obj:" as prefix.

For example

user:obj:3955 {id: 3955, username: 'John', ..}

We need some extra data structures to support our search i.e. (search user objects where username begins with given phrase. So search for jo should match John, Joe and so on. We will use sorted sets of all usernames and will assign every element a score. This score is a float and helps us in finding the matching words.

Some scores for eg.

a -> 0.097
ab -> 0.097098
ac -> 0.097099
bc -> 0.098099

So for above four string if we find strings that has score that is => 0.097 and < 0.098, we find all strings that begins with 'a'


Code



Discussion

This to demonstrate simple redis pattern and using it in Python.

See Also

There are already some good writeups on related topics.

3 comments:

  1. I noticed you don't use whole numbers as the scores. Does this save memory, or what is the reason behind this? Good post, by the way.

    ReplyDelete
  2. Thanks Kevin. Honestly I don't remember why did I use float, memory saving wasn't surely the aim. But guess whole number should work and possibly should be preferred that way.

    ReplyDelete
  3. I think that it shouldn't be the preferred way since you don't handle padding.

    You used it (0. representation) most likely to avoid any padding since you are sorting according to the %03d representation of a string.

    And you don't use the whole number because if the string gets too long, redis allows you to store a float double precision only (ie: limited numbers of data after the 0.).


    Hope this helps,

    cheers,

    ReplyDelete