In previous posts, we discussed how to calculate the Lowest Common Ancestor (LCA) for a binary tree and a binary search tree. Now let's look at a method that can calculate the LCA for any rooted tree, not only a binary tree. The method uses dynamic programming with a sparse matrix (sparse table) approach, often called binary lifting. It is handy and fast when you need to answer many LCA queries on the same tree.
Prerequisites:
1) DFS
2) Basic DP knowledge
3) Range Minimum Query (Square Root Decomposition and Sparse Table)
The naive approach for a general tree is the same as the naive approach for the LCA of a binary tree: find the path from the root to each of the two nodes and return the last node the two paths have in common.
The C++ implementation of the naive approach is given below:
/* Program to find LCA of a and b using DFS on the tree */
#include <iostream>
#include <vector>
using namespace std;

// Maximum number of nodes is 100000 and nodes are
// numbered from 1 to 100000
#define MAXN 100001

vector<int> tree[MAXN];
int path[3][MAXN]; // storing root to node path

// storing the path from root to node
void dfs(int cur, int prev, int pathNumber, int ptr, int node, bool &flag)
{
    for (size_t i = 0; i < tree[cur].size(); i++)
    {
        if (tree[cur][i] != prev && !flag)
        {
            // pushing current node into the path
            path[pathNumber][ptr] = tree[cur][i];
            if (tree[cur][i] == node)
            {
                // node found
                flag = true;
                // terminating the path
                path[pathNumber][ptr+1] = -1;
                return;
            }
            dfs(tree[cur][i], cur, pathNumber, ptr+1, node, flag);
        }
    }
}

// This function compares the path from root to 'a' & root
// to 'b' and returns LCA of a and b. Time Complexity : O(n)
int LCA(int a, int b)
{
    // trivial case
    if (a == b)
        return a;

    // dfs() only inspects children, so it never reports the
    // root (node 1) itself; handle that case here
    if (a == 1 || b == 1)
        return 1;

    // setting root to be first element in path
    path[1][0] = path[2][0] = 1;

    // calculating path from root to a
    bool flag = false;
    dfs(1, 0, 1, 1, a, flag);

    // calculating path from root to b
    flag = false;
    dfs(1, 0, 2, 1, b, flag);

    // runs while path 1 & path 2 match
    int i = 0;
    while (path[1][i] == path[2][i])
        i++;

    // returns the last matching node in the paths
    return path[1][i-1];
}

void addEdge(int a, int b)
{
    tree[a].push_back(b);
    tree[b].push_back(a);
}

// Driver code
int main()
{
    int n = 8; // Number of nodes
    addEdge(1,2);
    addEdge(1,3);
    addEdge(2,4);
    addEdge(2,5);
    addEdge(2,6);
    addEdge(3,7);
    addEdge(3,8);

    cout << "LCA(4, 7) = " << LCA(4,7) << endl;
    cout << "LCA(4, 6) = " << LCA(4,6) << endl;
    return 0;
}
Output:
LCA(4, 7) = 1
LCA(4, 6) = 2
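As a bridge to the faster method, here is a sketch of a different naive variant (not the path comparison used above): if each node's parent and depth are known, the LCA can be found by climbing one step at a time. The names naiveLca, parentOf and depthOf are hypothetical, and the sketch assumes parentOf[root] is 0 and that both arrays describe the same rooted tree.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Naive LCA by climbing parent pointers one step at a time.
// Assumption: nodes are numbered from 1, parentOf[root] == 0,
// and depthOf[] is consistent with parentOf[].
int naiveLca(const std::vector<int>& parentOf,
             const std::vector<int>& depthOf, int u, int v) {
    assert(u > 0 && v > 0);
    // Make v the deeper (or equally deep) node.
    if (depthOf[v] < depthOf[u]) std::swap(u, v);
    // Lift v until both nodes are on the same level.
    while (depthOf[v] > depthOf[u]) v = parentOf[v];
    // Climb both nodes together until they meet.
    while (u != v) {
        u = parentOf[u];
        v = parentOf[v];
    }
    return u;
}
```

Each query still costs O(n) in the worst case, because every loop moves only one edge per iteration. The sparse matrix approach keeps the same two phases but replaces the single steps with jumps of 2^i edges.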
Pre-computation: Here we store the 2^i-th parent (ancestor) of every node for 0 <= i < LEVEL, where LEVEL is a constant integer giving the number of 2^i-th ancestors that have to be stored per node.
To find the value of the constant LEVEL, we look at the worst case. In the worst case every node has at most 1 parent and 1 child, so the tree simply reduces to a linked list. The deepest node then has n - 1 ancestors, and any jump of at most n - 1 steps can be written as a sum of powers of two smaller than n.
So, in this case
LEVEL = ceil ( log2(number of nodes) ).
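The formula can be evaluated without floating point. The helper below is only a sketch; the name levelsFor is hypothetical, and returning at least 1 (so that the 2^0 column always exists) is an assumption of this example rather than something the article states.

```cpp
#include <cassert>

// Smallest LEVEL with 2^LEVEL >= n, i.e. ceil(log2(n)) for n >= 2.
// Assumption: at least 1 is returned so parent[node][0] always exists.
int levelsFor(int n) {
    assert(n >= 1);
    int level = 1;
    while ((1LL << level) < n) level++;
    return level;
}
```

For the 8-node example tree this gives 3, and for 100000 nodes it gives 17, so a fixed constant of 18 leaves one spare column.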
We also pre-compute the depth of each node, i.e. its distance from the root, using one DFS in O(n) time.
int n                      // number of nodes
int parent[MAXN][LEVEL]    // all initialized to -1

parent[node][0] : contains the 2^0th (first) parent of every node,
                  pre-computed using DFS

// Sparse matrix approach
// Level i reads level i-1 of other nodes, so the loop over i is the outer loop
for i -> 1 to LEVEL-1 :
    for node -> 1 to n :
        if ( parent[node][i-1] != -1 ) :
            parent[node][i] = parent[ parent[node][i-1] ][i-1]
The dynamic programming code above runs two nested loops, each over its complete range.
Hence its asymptotic time complexity is O(number of nodes * LEVEL) = O(n * LEVEL) = O(n log n).
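To see the recurrence on its own, the following sketch fills the table from an array of immediate parents instead of a DFS. The names buildTable, directParent and up are hypothetical, and using -1 as the parent of the root is an assumption of this example.

```cpp
#include <cassert>
#include <vector>

// Build up[node][i] = 2^i-th ancestor of node, or -1 if it does not exist.
// Assumption: directParent[node] is the immediate parent, -1 for the root,
// and index 0 is unused because nodes are numbered from 1.
std::vector<std::vector<int>> buildTable(const std::vector<int>& directParent,
                                         int levels) {
    assert(levels >= 1);
    int n = static_cast<int>(directParent.size()) - 1;
    std::vector<std::vector<int>> up(n + 1, std::vector<int>(levels, -1));
    for (int node = 1; node <= n; node++) up[node][0] = directParent[node];
    // Level i only reads level i-1, so the level loop is the outer one.
    for (int i = 1; i < levels; i++)
        for (int node = 1; node <= n; node++)
            if (up[node][i - 1] != -1)
                up[node][i] = up[up[node][i - 1]][i - 1];
    return up;
}
```

For the example tree with edges 1-2, 1-3, 2-4, 2-5, 2-6, 3-7, 3-8, node 4 gets up[4][0] = 2 and up[4][1] = 1, while up[4][2] stays -1 because node 4 has only two ancestors.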
Returning LCA(u, v):
1) The first step is to bring both nodes to the same depth. Since the depth of every node is already pre-computed, we first calculate the difference h between the depths of u and v (let's say depth[v] >= depth[u]). Node v then has to jump h nodes up. This can be done in O(log h) time because the 2^i-th parent of every node is already stored: for every bit i that is set in h, v jumps to its 2^i-th parent. This process is exactly the same as calculating x^y in O(log y) time. (See the code for better understanding.)
2) Now u and v are at the same depth. If they are the same node, that node is the LCA. Otherwise we once again use the 2^i jumping strategy, from the largest i down to 0, and move both nodes up whenever their 2^i-th parents differ. When the loop ends, u and v are two different children of the LCA, so the answer is parent[u][0].
Pseudo-code:
For i -> LEVEL-1 to 0 :
    If parent[u][i] != parent[v][i] :
        u = parent[u][i]
        v = parent[v][i]
Return parent[u][0]
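Step 1 can also be looked at in isolation: lifting a node by h edges is the same as reading the set bits of h. The function below is a minimal sketch; jumpUp and up are hypothetical names, and the table is assumed to hold -1 wherever an ancestor does not exist.

```cpp
#include <cassert>
#include <vector>

// Returns the k-th ancestor of node, or -1 if it has fewer than k ancestors.
// Assumption: up[x][i] is the 2^i-th ancestor of x, or -1 if there is none,
// and up[0] is an unused row because nodes are numbered from 1.
int jumpUp(const std::vector<std::vector<int>>& up, int node, int k) {
    assert(k >= 0);
    int levels = up.empty() ? 0 : static_cast<int>(up[0].size());
    // Each set bit i of k contributes one jump of 2^i edges.
    for (int i = 0; i < levels && node != -1; i++)
        if ((k >> i) & 1) node = up[node][i];
    // A bit at or above `levels` asks for an ancestor the table cannot hold.
    if (levels < 31 && (k >> levels) != 0) return -1;
    return node;
}
```

On a chain 1-2-3-4-5 rooted at 1, a jump of 3 from node 5 uses bits 0 and 1 (5 to 4, then 4 to 2), so two table lookups replace three single steps.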
C++ implementation of the above algorithm is given below:
// Sparse Matrix DP approach to find LCA of two nodes
#include <algorithm>
#include <cstring>
#include <iostream>
#include <vector>
using namespace std;

#define MAXN 100000
#define level 18

vector<int> tree[MAXN];
int depth[MAXN];
int parent[MAXN][level];

// pre-compute the depth for each node and their
// first parent (2^0th parent)
// time complexity : O(n)
void dfs(int cur, int prev)
{
    depth[cur] = depth[prev] + 1;
    parent[cur][0] = prev;
    for (size_t i = 0; i < tree[cur].size(); i++)
    {
        if (tree[cur][i] != prev)
            dfs(tree[cur][i], cur);
    }
}

// Dynamic Programming Sparse Matrix Approach
// populating 2^i parent for each node
// Time complexity : O(nlogn)
void precomputeSparseMatrix(int n)
{
    for (int i = 1; i < level; i++)
    {
        for (int node = 1; node <= n; node++)
        {
            if (parent[node][i-1] != -1)
                parent[node][i] = parent[parent[node][i-1]][i-1];
        }
    }
}

// Returning the LCA of u and v
// Time complexity : O(log n)
int lca(int u, int v)
{
    if (depth[v] < depth[u])
        swap(u, v);

    int diff = depth[v] - depth[u];

    // Step 1 of the pseudocode: lift v by diff levels,
    // one jump per set bit of diff
    for (int i = 0; i < level; i++)
        if ((diff >> i) & 1)
            v = parent[v][i];

    // now depth[u] == depth[v]
    if (u == v)
        return u;

    // Step 2 of the pseudocode
    for (int i = level-1; i >= 0; i--)
        if (parent[u][i] != parent[v][i])
        {
            u = parent[u][i];
            v = parent[v][i];
        }

    return parent[u][0];
}

void addEdge(int u, int v)
{
    tree[u].push_back(v);
    tree[v].push_back(u);
}

// driver function
int main()
{
    memset(parent, -1, sizeof(parent));
    int n = 8;
    addEdge(1,2);
    addEdge(1,3);
    addEdge(2,4);
    addEdge(2,5);
    addEdge(2,6);
    addEdge(3,7);
    addEdge(3,8);
    depth[0] = 0;

    // running dfs and precalculating depth
    // of each node.
    dfs(1,0);

    // Precomputing the 2^i th ancestor for every node
    precomputeSparseMatrix(n);

    // calling the LCA function
    cout << "LCA(4, 7) = " << lca(4,7) << endl;
    cout << "LCA(4, 6) = " << lca(4,6) << endl;
    return 0;
}
Output:
LCA(4, 7) = 1
LCA(4, 6) = 2
Time Complexity: Answering a single LCA query takes O(log n), but the overall time complexity is dominated by the pre-calculation of the 2^i-th (0 <= i < level) ancestors of each node. Hence the pre-computation takes O(n log n) time, answering q queries takes O(n log n + q log n) in total, and the space complexity is O(n log n) for storing the ancestors of each node.