-
Given location of the user return the top 10 "things" aroung the user.
For simplicity, things = restaurants.
- The restaurants should be within 2 mile radius of the user.
- The users should be able to add/delete/update the restaurant information.
- The users should be able to add comments/feedback/review about the restaurant.
- The result should have minimul latency.
Too long didn't read:
Input: User location in latitude and longitude
Output: Top 10 restaurents around user within 2 mile radius.
For now let's design this for one user.
We are assuming that the restaurent data is already populated.
Search(user object, latitude, longitude)
Returns: Top 10 Restaurant objects.
The mobile client can then render these objects into list that the user can add comment/feedback to.
The main crux of the problem is, how do we store this data and how do we retrieve it.
You can store the world map in a tree like structure like R-Tree, Quad tree. I am gonna be discussing about quad trees here.
Quad Tree
You take the globe and divide it into 4 quadrants.
Then you further take each quadrant and divide it into 4 more quadrants.
So densely populated areas will have more rectangles on it, and sparsely populated areas will have less rectangels over it.
For simplicity let's assume the most desely populated area has grid size of 2 miles by 2 miles.
Further more, the leaf nodes can be connected so we can easily search over large areas.
So for below use case:
Since the leafes are connected all we'd need to do is fetch the appropriate result from each leaf node.
The leaf nodes have the Ids of the restaurants contained within that grid locaiton.
So when the user requests for the information, we can query the resulting leaf nodes, get the restaurents and perform the K nearest point algo on those restaurents to get our top 10 list.
Insert
So to insert a restaurant, we find the appropriate leaf node and add the restaurant ID in it.
Let's say the max capacity a leaf can hold is 200 restaurant Ids and we reach that, then we split the leaf node into 4 quadrants and put appropriate restaurents in each new leaf.
With this approch both insert and search would be log(n) time.
What if the entire quad tree cannot fit in memory?
Drawbacks:
What if one region becomes hot ?
What if one region contains more data than other ?
We can fit grids on the database, instead of specific location.
Both the sharding approches are similar, but with grid id we are fitting the quadrants on database instead of city or state.
We can have master slave architecture to ensure availability.
Reads: The reads can be done from the master.
Writes: The writes can be done to one of the slave, and after some time period we can make this slave the master, so other slaves can copy data off of this new master.
To deal with hot Places, we can introduce a cache in front of our database.
We can add Load Balencer in following 2 places:
1) Between Clients and Application servers
2) Between Application servers and Dababase server.
