Shortest Path in Binary Trees

‘Hello world’, how are you doing?

This post is about a problem that I came across while trying my hand at the April Long Challenge contest on CodeChef. After breaking my head over my laptop this weekend, I managed to code 4 problems.

The problem to discuss today is related to binary trees – BINTREE
Gist – Assume an infinitely large binary tree stored implicitly in an array. Given two nodes i and j in it, find the length of the shortest path between them.
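The implicit array layout here is the standard heap numbering (my reading of the problem, based on the parent/child arithmetic used in the code below): the root is node 1, the children of node n are 2n and 2n+1, and the parent of node n is n // 2. A minimal sketch:

```python
def parent(n):
    """Parent of node n in 1-indexed heap numbering."""
    return n // 2

def children(n):
    """Left and right children of node n."""
    return 2 * n, 2 * n + 1

# Node 5 sits below node 2 and above nodes 10 and 11.
assert parent(5) == 2
assert children(5) == (10, 11)
```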

The solution that immediately comes to mind is finding the LCA (lowest common ancestor) of the two given nodes and figuring out the shortest path using this LCA. Once we know the LCA, the shortest path can be found using either of these two approaches –

1. Depth(node1) – Depth(LCA) + Depth(node2) – Depth(LCA)

int(math.log(l, 2) + 1) + int(math.log(r, 2) + 1) - 2 * int(math.log(lca, 2) + 1)  # +1 converts floor(log2(n)) to a 1-based level
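As an aside (my own note, not part of the contest solution): for very large node numbers, floating-point math.log can be off by one; Python's int.bit_length() gives the exact 1-based level, and the same distance formula follows:

```python
def level(n):
    """1-based level of node n: floor(log2(n)) + 1, computed exactly."""
    return n.bit_length()

def distance(l, r, lca):
    """Edges on the shortest path between l and r, given their LCA."""
    return level(l) + level(r) - 2 * level(lca)

# Nodes 4 and 5 are siblings under node 2: two edges apart.
assert distance(4, 5, 2) == 2
```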

2. Determine shortest path while finding LCA itself.

def LCA(root, lchild, rchild, max_root):
    """ Lowest common ancestor, top-down """
    global path   # Shortest distance counter
    global stop
    if root > max_root:
        return None
    if root == lchild or root == rchild:
        return root
    lroot = LCA(2 * root, lchild, rchild, max_root)
    if lroot and not stop:
        path = path + 1
    rroot = LCA(2 * root + 1, lchild, rchild, max_root)
    if rroot and not stop:
        path = path + 1
    if lroot and rroot:
        stop = 1
        return root
    return lroot or rroot

But there is a problem – the above algorithm does not finish within the required time limit, which is indicative of an inefficient algorithm.

The reason is quite clear – in the top-down approach, we visit a lot of nodes that we could have just skipped. A tree of height h has 2**(h+1) - 1 nodes in total, and considering the fact that the tree is infinitely large, the time spent is huge.

Now, the other solution:

We can go bottom-up instead, i.e., build the chain of parent nodes for node i and another chain of parent nodes for node j.
Then we can intersect the two chains – the deepest common element is the LCA – and the shortest path can also be calculated very easily.
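A sketch of the chain building (build_chain is my own name for it, not from the original submission): since the parent of node n is n // 2, repeated halving walks up to the root.

```python
def build_chain(n):
    """Ancestor chain of node n, from n itself up to the root (node 1)."""
    chain = []
    while n >= 1:
        chain.append(n)
        n //= 2
    return chain

chain_1 = build_chain(10)   # [10, 5, 2, 1]
chain_2 = build_chain(13)   # [13, 6, 3, 1]
```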

lca = max(set(chain_1) & set(chain_2))   # the deepest common ancestor has the largest index
distance = chain_1.index(lca) + chain_2.index(lca)

So what is the interesting pythonic learning?

While trying to intersect the parent chains, I thought it would be better to write my own logic rather than use one of the functional programming constructs. This is to save operations, considering the fact that the parent chains will be very long for an infinitely large binary tree. So I wrote this –

if len(chain_1) > len(chain_2):
    shorter = chain_2
    larger = chain_1
else:
    shorter = chain_1
    larger = chain_2
for i in shorter:
    if i in larger:
        lca = i
        break

Note that the shorter chain is always the outer loop and the larger one is searched inside it: this reduces the number of iterations needed to find the common ancestor in the average case. In the worst case, it doesn't matter.

The other way – (using functional programming) –

ca_all = list(set(chain_1) & set(chain_2))
ca_all.sort()
lca = ca_all[-1]

With larger inputs, the second approach is 2x faster than the first! And this despite the second approach doing a lot of seemingly unnecessary work –
1. Getting the entire intersection,
2. Converting it to a list and sorting it.

And still it is faster!

So why are “sets” so fast in Python?

According to this thread –
Indeed, CPython’s sets are implemented as something like dictionaries with dummy values (the keys being the members of the set), with some optimizations that exploit this lack of values. Since lists are ordered, searching a list to see if something is in it can be slow. Sets give up order so that they can make membership testing blindingly fast, and they are also implemented in C.
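You can observe the gap directly with timeit (a quick sketch of my own; the exact numbers will vary by machine, but the ordering should not):

```python
import timeit

data_list = list(range(100000))
data_set = set(data_list)

# Worst case for the list: the element we probe sits at the very end,
# so the list scan is O(n) while the set lookup is O(1) on average.
t_list = timeit.timeit(lambda: 99999 in data_list, number=1000)
t_set = timeit.timeit(lambda: 99999 in data_set, number=1000)
print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```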

If interested, you can view the source code of sets here.

Code for the other three problems is on my Github here.
Hope you enjoyed reading it, please share your comments below. Thanks !

 

Little Elephant and Candies

Let's look at an easy problem – Little Elephant and Candies.

The solution to the problem is very simple, but the main point here is that your solution has to be fast. Go through the problem again, especially the constraints.

First work on the solution and then continue reading (You can also look here)

Thoughts –
The main bottleneck is summing up all the Ak's (candies needed to make them happy).
1. Use add = sum([int(x) for x in sys.stdin.readline().strip().split(' ')])
2. Use reduce(lambda x, y: int(x) + int(y), sys.stdin.readline().strip().split(' '), 0)  (Don't forget the third argument – the initializer!)
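Here is roughly how the two can be compared head-to-head on synthetic input rather than stdin (a benchmark sketch; the input size is made up for illustration):

```python
import timeit
from functools import reduce  # reduce moved to functools in Python 3

line = ' '.join(str(x) for x in range(100000))

def via_comprehension():
    return sum([int(x) for x in line.strip().split(' ')])

def via_reduce():
    return reduce(lambda x, y: int(x) + int(y), line.strip().split(' '), 0)

# Both must agree on the answer before we compare their speed.
assert via_comprehension() == via_reduce()
print('comprehension:', timeit.timeit(via_comprehension, number=10))
print('reduce:       ', timeit.timeit(via_reduce, number=10))
```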

Well, the timings of both of the above approaches are more or less the same with the constraints provided in the problem, so I extrapolated the constraints to ~209889 test cases (N), 10^21 as the max value for C and 10^11 as the max value for Ak.

Timings – 

Reduce (2nd approach)
real 0m14.901s
user 0m13.377s
sys 0m1.008s

List Comprehension (1st approach)
real 0m11.201s
user 0m9.529s
sys 0m1.152s

So basically list comprehension here is ~33% faster than reduce.
Map, reduce and filter are among the awesome features of Python, but nothing beats the ever-awesome list comprehension. Reduce loses the race because of the huge number of lambda function calls made during the run – function calls are always expensive due to the allocation/deallocation of stack frames. (Related question on SO)
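A middle ground worth knowing (my own addition, not from the original post): passing a C-level function like operator.add to reduce removes the per-element lambda call, which is exactly the overhead blamed above:

```python
from functools import reduce
import operator

values = '1 2 3 4 5'.split(' ')

# map(int, ...) converts each token once; operator.add is implemented in C,
# so reduce no longer creates a Python stack frame for every addition.
total = reduce(operator.add, map(int, values), 0)
assert total == 15
```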

List comprehensions are fast in Python because their looping machinery is implemented in C. So next time, when in doubt, use a list comprehension.

It's time for some candies!