- Search time is O(Log n) in worst case. Time taken by deletion and insertion is amortized O(Log n)
- The balancing idea is to make sure that nodes are α-size-balanced (also called α-weight-balanced). A node is α-size-balanced if the sizes of its left and right subtrees are each at most α * (size of node). The idea is based on the fact that if every node is α-weight-balanced, then the tree is also height balanced: height <= log_{1/α}(size) + 1
- Unlike other self-balancing BSTs, ScapeGoat tree doesn’t require extra space per node. For example, Red Black Tree nodes are required to have color. In the implementation of ScapeGoat Tree considered here, the Node class has only left, right and parent pointers. The parent pointer is used for simplicity of implementation and can be avoided.
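The balance condition and the pointer-only node layout described above can be sketched as follows. This is a minimal illustration, not the article's implementation: the names `Node`, `size`, and `isWeightBalanced` are assumptions, α is fixed at 2/3, and subtree sizes are recomputed recursively because nothing extra is stored in the node.

```cpp
#include <cstddef>

// Hypothetical node layout: only child and parent pointers, no balance metadata.
struct Node {
    double value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(double v) : value(v) {}
};

// Number of nodes in the subtree rooted at `node` (0 for an empty subtree).
// Computed on demand, which costs O(size) per call.
std::size_t size(const Node* node) {
    if (node == nullptr) return 0;
    return 1 + size(node->left) + size(node->right);
}

// True if both subtrees of `node` hold at most 2/3 of its nodes (alpha = 2/3).
// Comparing 3 * child <= 2 * total avoids floating-point rounding.
bool isWeightBalanced(const Node* node) {
    if (node == nullptr) return true;
    const std::size_t total = size(node);
    return 3 * size(node->left) <= 2 * total &&
           3 * size(node->right) <= 2 * total;
}
```

With α = 2/3, a left-leaning chain of three nodes still passes this check at its root (2 <= (2/3) * 3), while a chain of four nodes does not (3 > (2/3) * 4).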
Insertion (Assuming α = 2/3):
To insert value x in a Scapegoat Tree:
- Create a new node u and insert x using the BST insert algorithm.
- If the depth of u is greater than log_{3/2}(n), where n is the number of nodes in the tree, the tree must be rebalanced. To rebalance it, we use the steps below to find a scapegoat.
- Walk up from u until we reach a node w with size(w) > (2/3)*size(w.parent). The parent of this node, w.parent, is the scapegoat.
- Rebuild the subtree rooted at w.parent.
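The insertion steps above can be sketched as below. This is a hedged illustration under several assumptions: `Node`, `Tree`, `size`, and `insertAndFindRebuildRoot` are hypothetical names, α is 2/3, duplicate keys are ignored, and the function only returns w.parent (the root of the subtree to rebuild), leaving the rebuild itself to the caller.

```cpp
#include <cmath>
#include <cstddef>

struct Node {
    double value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(double v) : value(v) {}
};

struct Tree {
    Node* root = nullptr;
    std::size_t n = 0;  // number of nodes currently in the tree
};

std::size_t size(const Node* node) {
    return node ? 1 + size(node->left) + size(node->right) : 0;
}

// Inserts x with the plain BST algorithm. If the new node ends up deeper than
// log_{3/2}(n), walks up to the first node w with size(w) > (2/3) * size(w.parent)
// and returns w.parent, the root of the subtree to rebuild. Returns nullptr if
// no rebuild is needed or x is already present.
// Nodes are allocated with new and never freed, to keep the sketch short.
Node* insertAndFindRebuildRoot(Tree& tree, double x) {
    Node* u = new Node(x);
    int depth = 0;
    if (tree.root == nullptr) {
        tree.root = u;
    } else {
        Node* w = tree.root;
        while (true) {
            if (x == w->value) { delete u; return nullptr; }  // duplicate key
            Node*& next = (x < w->value) ? w->left : w->right;
            ++depth;
            if (next == nullptr) { next = u; u->parent = w; break; }
            w = next;
        }
    }
    ++tree.n;

    const double limit = std::log(static_cast<double>(tree.n)) / std::log(1.5);
    if (depth <= limit) return nullptr;  // depth is within log_{3/2}(n)

    // Walk up while the child holds at most 2/3 of its parent's nodes.
    Node* w = u;
    while (w->parent != nullptr && 3 * size(w) <= 2 * size(w->parent)) {
        w = w->parent;
    }
    return w->parent;
}
```

Because sizes are recomputed on the way up, the walk costs time proportional to the size of the subtree that ends up being rebuilt, which is what the amortized analysis relies on.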
What does rebuilding the subtree mean?
In rebuilding, we simply convert the subtree into the most balanced BST possible. We first store the inorder traversal of the subtree in an array, then we build a new BST from the array by recursively dividing it into two halves, using the middle element of each half as the root of the corresponding subtree.
        60                               50
       /                              /      \
      40                            42        58
        \          Rebuild         /  \      /  \
         42       --------->     40    47  55    60
           \
            55
           /  \
         50    58
        /
      47
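The rebuilding procedure described above can be sketched as follows. This is an illustrative version rather than the article's code: `Node`, `flatten`, `buildBalanced`, and `rebuild` are assumed names, and the caller is expected to reattach the returned root to the rebuilt subtree's former parent.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int v) : value(v) {}
};

// Appends the nodes of the subtree rooted at `node` to `out` in inorder
// (sorted) order.
void flatten(Node* node, std::vector<Node*>& out) {
    if (node == nullptr) return;
    flatten(node->left, out);
    out.push_back(node);
    flatten(node->right, out);
}

// Builds a balanced BST from nodes[lo, hi) by making the middle node the root
// and recursing on the two halves. Returns the root of the new subtree.
Node* buildBalanced(std::vector<Node*>& nodes, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    const std::size_t mid = lo + (hi - lo) / 2;
    Node* root = nodes[mid];
    root->left = buildBalanced(nodes, lo, mid);
    root->right = buildBalanced(nodes, mid + 1, hi);
    if (root->left) root->left->parent = root;
    if (root->right) root->right->parent = root;
    return root;
}

// Rebuilds the subtree rooted at `u` and returns its new root. The returned
// root's parent is cleared; the caller must relink it into the tree.
Node* rebuild(Node* u) {
    std::vector<Node*> nodes;
    flatten(u, nodes);
    Node* root = buildBalanced(nodes, 0, nodes.size());
    if (root) root->parent = nullptr;
    return root;
}
```

For a subtree holding the keys 40, 42, 47, 50, 55, 58, 60, this sketch makes 50 the root, 42 and 58 its children, and 40, 47, 55, 60 the leaves. The existing nodes are reused, so the rebuild takes time and extra space linear in the size of the subtree.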
Below is C++ implementation of insert operation on Scapegoat Tree.
Preorder traversal of the constructed ScapeGoat tree is 7 6 3 1 0 2 4 3.5 5 8 9
Consider a scapegoat tree with 10 nodes and height 5:

                7
              /   \
             6     8
            /       \
           5         9
          /
         2
        / \
       1   4
      /   /
     0   3

Let’s insert 3.5 into the scapegoat tree above.
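Initially, d = 5 < log_{3/2}(n), where n = 10 (log_{3/2}(10) ≈ 5.68).
After inserting 3.5, the new node is at depth d = 6 and n = 11. Since d > log_{3/2}(n), i.e., 6 > log_{3/2}(11) ≈ 5.91, we have to find the scapegoat in order to solve the problem of exceeding height.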
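The two depth comparisons above can be checked numerically. The helper names `depthLimit` and `tooDeep` below are assumptions for illustration; `depthLimit` evaluates log_{3/2}(n) with the change-of-base formula.

```cpp
#include <cmath>
#include <cstddef>

// log base 3/2 of n, computed as ln(n) / ln(3/2).
double depthLimit(std::size_t n) {
    return std::log(static_cast<double>(n)) / std::log(1.5);
}

// True if a node at depth d violates the height bound for a tree of n nodes.
bool tooDeep(int d, std::size_t n) {
    return d > depthLimit(n);
}
```

Here `depthLimit(10)` is about 5.68, so depth 5 is acceptable, while `depthLimit(11)` is about 5.91, so the new node at depth 6 triggers the scapegoat search.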
- Now we find the scapegoat. We start with the newly added node 3.5 and check whether size(3.5)/size(3) > 2/3.
- Since size(3.5) = 1 and size(3) = 2, size(3.5)/size(3) = 1/2, which is less than 2/3. So 3 is not the scapegoat and we move up.
- Since 3 is not the scapegoat, we move up and check the same condition for node 4. Since size(3) = 2 and size(4) = 3, size(3)/size(4) = 2/3, which is not greater than 2/3. So 4 is not the scapegoat and we move up.
- Now, size(4)/size(2) = 3/6, since size(4) = 3 and size(2) = 6. 3/6 is still less than 2/3, so 2 does not fulfill the condition of a scapegoat and we again move up.
- Now, size(2)/size(5) = 6/7, since size(2) = 6 and size(5) = 7. 6/7 > 2/3, which fulfills the condition of a scapegoat, so we stop here: node 5 is the scapegoat.
Finally, after finding the scapegoat, the subtree rooted at the scapegoat, i.e., at 5, is rebuilt.
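To see what the final rebuild produces, the worked example above can be replayed end to end. The sketch below is an assumption-laden illustration, not the article's listing: `bstInsert`, `flatten`, `buildBalanced`, `rebuildAt`, `preorder`, and `runExample` are hypothetical names, keys are assumed distinct, and the depth check is skipped because the example already establishes that a rebuild is needed.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    double value;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(double v) : value(v) {}
};

std::size_t size(const Node* t) {
    return t ? 1 + size(t->left) + size(t->right) : 0;
}

// Plain BST insertion (distinct keys assumed); returns the new node.
Node* bstInsert(Node*& root, double x) {
    Node* u = new Node(x);  // never freed, to keep the sketch short
    Node** link = &root;
    while (*link != nullptr) {
        u->parent = *link;
        link = (x < (*link)->value) ? &(*link)->left : &(*link)->right;
    }
    *link = u;
    return u;
}

// Collects the subtree's nodes in inorder (sorted) order.
void flatten(Node* t, std::vector<Node*>& out) {
    if (t == nullptr) return;
    flatten(t->left, out);
    out.push_back(t);
    flatten(t->right, out);
}

// Builds a balanced subtree from a[lo, hi) and sets parent pointers.
Node* buildBalanced(std::vector<Node*>& a, std::size_t lo, std::size_t hi, Node* parent) {
    if (lo >= hi) return nullptr;
    const std::size_t mid = lo + (hi - lo) / 2;
    Node* r = a[mid];
    r->parent = parent;
    r->left = buildBalanced(a, lo, mid, r);
    r->right = buildBalanced(a, mid + 1, hi, r);
    return r;
}

// Rebuilds the subtree rooted at s in place and reattaches it to s's parent.
void rebuildAt(Node*& root, Node* s) {
    Node* p = s->parent;
    std::vector<Node*> a;
    flatten(s, a);
    Node*& slot = (p == nullptr) ? root : (p->left == s ? p->left : p->right);
    slot = buildBalanced(a, 0, a.size(), p);
}

void preorder(const Node* t, std::vector<double>& out) {
    if (t == nullptr) return;
    out.push_back(t->value);
    preorder(t->left, out);
    preorder(t->right, out);
}

// Replays the example: builds the 10-node tree, inserts 3.5, walks up to the
// scapegoat, rebuilds there, and returns the preorder traversal.
std::vector<double> runExample() {
    Node* root = nullptr;
    for (double k : {7.0, 6.0, 8.0, 5.0, 9.0, 2.0, 1.0, 4.0, 0.0, 3.0}) bstInsert(root, k);
    Node* w = bstInsert(root, 3.5);
    // Move up while the child holds at most 2/3 of its parent's nodes.
    while (w->parent != nullptr && 3 * size(w) <= 2 * size(w->parent)) w = w->parent;
    rebuildAt(root, w->parent);  // w is node 2 here, so the rebuild happens at 5
    std::vector<double> out;
    preorder(root, out);
    return out;
}
```

Under these assumptions, `runExample()` returns 7 6 3 1 0 2 4 3.5 5 8 9: the seven-node subtree that was rooted at 5 is replaced by a balanced subtree rooted at 3, with children 1 and 4.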
Comparison with other self-balancing BSTs
Red-Black and AVL: Worst-case time complexity of search, insert and delete is O(Log n).
Splay Tree: Worst-case time complexities of search, insert and delete are O(n), but the amortized time complexity of these operations is O(Log n).
ScapeGoat Tree: Like a Splay Tree, it is easy to implement. Unlike a Splay Tree, its worst-case time complexity of search is O(Log n). The worst-case and amortized time complexities of insert and delete are the same as for a Splay Tree: O(n) in the worst case and O(Log n) amortized.