LCA for general or n-ary trees (Sparse Matrix DP approach <O(n log n), O(log n)>)

Previous posts discussed how to calculate the Lowest Common Ancestor (LCA) for a binary tree and a binary search tree. Now let’s look at a method that can calculate the LCA for any rooted tree, not only a binary tree. The method uses dynamic programming over a sparse matrix of ancestors. It is fast and handy when you need to answer many LCA queries on the same tree.


Prerequisites:
1) DFS
2) Basic DP knowledge
3) Range Minimum Query (Square Root Decomposition and Sparse Table)

Naive Approach: O(n) per query

The naive approach for a general tree is the same as the naive approach for a binary tree, described in the earlier posts: find the path from the root to each of the two nodes, then return the last node that the two paths have in common.

The C++ implementation of the naive approach is given below:

/* Program to find LCA of n1 and n2 using one DFS per
   node to record its root-to-node path in the tree */
#include <iostream>
#include <vector>
using namespace std;

// Maximum number of nodes is 100000 and nodes are
// numbered from 1 to 100000
#define MAXN 100001
vector < int > tree[MAXN];
int path[3][MAXN]; // path[1] and path[2] store the two root-to-node paths

// Stores the path from the root to 'node' in path[pathNumber].
// 'ptr' is the index where the next node on the path is written.
void dfs(int cur, int prev, int pathNumber, int ptr,
                             int node, bool &flag)
{
    for (int i=0; i<(int)tree[cur].size(); i++)
    {
        if (tree[cur][i] != prev and !flag)
        {
            // pushing current node into the path
            path[pathNumber][ptr] = tree[cur][i];
            if (tree[cur][i] == node)
            {
                // node found
                flag = true;
                // terminating the path
                path[pathNumber][ptr+1] = -1;
            }
            dfs(tree[cur][i], cur, pathNumber, ptr+1,
                                        node, flag);
        }
    }
}

// This function compares the path from root to 'a' & root
// to 'b' and returns LCA of a and b. Time Complexity : O(n)
int LCA(int a, int b)
{
    // trivial cases: equal nodes, or one node is the root (node 1),
    // which the DFS below never records as a found target
    if (a == b)
        return a;
    if (a == 1 or b == 1)
        return 1;
    // setting root to be first element in path
    path[1][0] = path[2][0] = 1;
    // calculating path from root to a
    bool flag = false;
    dfs(1, 0, 1, 1, a, flag);
    // calculating path from root to b
    flag = false;
    dfs(1, 0, 2, 1, b, flag);
    // advances while path 1 & path 2 match
    int i = 0;
    while (path[1][i] == path[2][i])
        i++;
    // returns the last matching node in the paths
    return path[1][i-1];
}

// Adds an undirected edge between a and b
void addEdge(int a,int b)
{
    tree[a].push_back(b);
    tree[b].push_back(a);
}

// Driver code
int main()
{
    int n = 8; // Number of nodes
    // Example tree rooted at node 1 (assumed edges, chosen to be
    // consistent with the output shown below)
    addEdge(1, 2);
    addEdge(1, 3);
    addEdge(2, 4);
    addEdge(2, 5);
    addEdge(2, 6);
    addEdge(3, 7);
    addEdge(3, 8);
    cout << "LCA(4, 7) = " << LCA(4,7) << endl;
    cout << "LCA(4, 6) = " << LCA(4,6) << endl;
    return 0;
}


LCA(4, 7) = 1
LCA(4, 6) = 2


Sparse Matrix Approach (O(n log n) pre-processing, O(log n) per query)

Pre-computation: For every node we store its 2^i-th parent (ancestor) for 0 <= i < LEVEL, where LEVEL is a constant integer giving the number of powers of two that can ever be needed.
To find the value of LEVEL, we look at the worst case. In the worst case every node has at most 1 parent and 1 child, so the tree reduces to a linked list and the deepest node has n - 1 ancestors.
So, in this case LEVEL = ceil(log2(number of nodes)).
We also pre-compute the height of each node (its depth below the root, stored in depth[] in the code) using one DFS in O(n) time.

int n                   // number of nodes
int parent[MAXN][LEVEL] // all initialized to -1

parent[node][0] : contains the 2^0-th (first)
parent of each node, pre-computed using DFS

// Sparse matrix approach
// The loop over i is the outer loop: column i-1 must be
// complete for every node before column i is filled.
for i -> 1 to LEVEL-1 :
  for node -> 1 to n :
    if ( parent[node][i-1] != -1 ) :
      parent[node][i]  =
         parent[ parent[node][i-1] ][i-1]

The dynamic programming code above runs two nested loops, each over its complete range.
Hence its asymptotic time complexity is O(number of nodes * LEVEL) = O(n * LEVEL) = O(n log n).

Return LCA(u, v):
1) The first step is to bring both nodes to the same height. The heights of all nodes are already pre-computed, so we first calculate the difference h between the heights of u and v (let’s say v is the deeper node, i.e. height(v) >= height(u)). Now node v needs to jump h nodes up. This can be done in O(log h) time because the 2^i-th parent of every node is stored: v jumps by 2^i for every bit i that is set in h. This process is exactly the same as calculating x^y in O(log y) time (see the code for better understanding). If u and v are the same node after this step, that node is the LCA.

2) Now u and v are at the same height. We use the 2^i jumping strategy once more, from the largest i down to 0, moving both nodes up whenever their 2^i-th parents differ. When the loop ends, u and v are two different children of the LCA, so the answer is parent[u][0].

For i -> LEVEL-1 down to 0 :
      If parent[u][i] != parent[v][i] :
           u = parent[u][i]
           v = parent[v][i]
Return parent[u][0]

C++ implementation of the above algorithm is given below:

// Sparse Matrix DP approach to find LCA of two nodes
#include <bits/stdc++.h>
using namespace std;
#define MAXN 100000
#define level 18
vector <int> tree[MAXN];
int depth[MAXN];
int parent[MAXN][level];

// pre-compute the depth for each node and their
// first parent(2^0th parent)
// time complexity : O(n)
void dfs(int cur, int prev)
{
    depth[cur] = depth[prev] + 1;
    parent[cur][0] = prev;
    for (int i=0; i<(int)tree[cur].size(); i++)
    {
        if (tree[cur][i] != prev)
            dfs(tree[cur][i], cur);
    }
}

// Dynamic Programming Sparse Matrix Approach
// populating 2^i parent for each node
// Time complexity : O(nlogn)
void precomputeSparseMatrix(int n)
{
    for (int i=1; i<level; i++)
    {
        for (int node = 1; node <= n; node++)
        {
            if (parent[node][i-1] != -1)
                parent[node][i] =
                    parent[parent[node][i-1]][i-1];
        }
    }
}

// Returning the LCA of u and v
// Time complexity : O(log n)
int lca(int u, int v)
{
    // make v the deeper of the two nodes
    if (depth[v] < depth[u])
        swap(u, v);
    int diff = depth[v] - depth[u];
    // Step 1 of the pseudocode
    for (int i=0; i<level; i++)
        if ((diff>>i)&1)
            v = parent[v][i];
    // now depth[u] == depth[v]
    if (u == v)
        return u;
    // Step 2 of the pseudocode
    for (int i=level-1; i>=0; i--)
        if (parent[u][i] != parent[v][i])
        {
            u = parent[u][i];
            v = parent[v][i];
        }
    return parent[u][0];
}

// adds an undirected edge between u and v
void addEdge(int u,int v)
{
    tree[u].push_back(v);
    tree[v].push_back(u);
}

// driver function
int main()
{
    // every 2^i-th parent starts out as -1 (does not exist)
    memset(parent, -1, sizeof(parent));
    int n = 8;
    // Example tree rooted at node 1 (assumed edges, chosen to be
    // consistent with the output shown below)
    addEdge(1, 2);
    addEdge(1, 3);
    addEdge(2, 4);
    addEdge(2, 5);
    addEdge(2, 6);
    addEdge(3, 7);
    addEdge(3, 8);
    depth[0] = 0;
    // running dfs and precalculating depth
    // of each node.
    dfs(1, 0);
    // Precomputing the 2^i th ancestor for every node
    precomputeSparseMatrix(n);
    // calling the LCA function
    cout << "LCA(4, 7) = " << lca(4,7) << endl;
    cout << "LCA(4, 6) = " << lca(4,6) << endl;
    return 0;
}


LCA(4, 7) = 1
LCA(4, 6) = 2

Time Complexity: Answering a single LCA query takes O(log n), but the overall running time is dominated by the pre-computation of the 2^i-th ancestors (0 <= i < level) of each node. Hence, the overall asymptotic time complexity is O(n log n) for pre-processing plus O(log n) per query, and the space complexity is O(n log n) for storing the ancestors of each node.


This article is attributed to GeeksforGeeks.org
