COMP1927 - Binary Search Trees

A binary search tree is defined as a tree where:

  • All left children are <= parent
  • All right children are >= parent
  • Each subtree is a binary search tree

BSTs?action=AttachFile&do=get&target=keyBST.png

Binary Searching with Binary Trees

Binary searching is about splitting the list in half and narrowing down the location that way. Hence (ordered) trees are perfect for that.
At every comparison:

  • If the item you're looking for is smaller, go left.
  • Else right.
  • Continue until found.
// searchTree returns 1 if found, 0 if not
 
int searchTree (tree t, int data) {
 
   int found = 0;
 
   if (t != NULL) {
      if (data < t->data) {
         found = searchTree(t->left, data);
      } else if (data > t->data) {
         found = searchtree(t->right, data);
      } else {
         found = 1;
      }
   }
 
   return found;
 
}

Creating Nodes in Binary Trees

Creating a node in a tree is essentially the same as in a list - we malloc it, give it data and then give it two children pointers (left and right).

Tree createTree (int data) {
 
   Tree t = malloc(sizeof(struct tree));
   t->left = NULL;
   t->right = NULL;
   t->data = data;
 
   return t;
 
}

Freeing Nodes in Binary Trees

Freeing nodes in binary trees is done by recursing down to the bottom leaf level, freeing then recursing upwards. If we don't work from bottom to top then we can lose track of the children nodes and having memory leaks.

void freeTree (Tree t) {
 
   if (t != NULL) {
 
      if ((t->left == NULL) && (t->right == NULL)) {
         free (t);
      } else if (t->left == NULL) {
         freeTree (t->right);
      } else if (t->right == NULL) {
         freeTree (t->right);
      } else {
         freeTree (t->left);
         freeTree (t->right);
      }
 
   }
 
}

Inserting a Node into a Binary Tree

Binary tree insertion always occurs at the leaf level. We never have relinking issues, we always just search down to the bottom level (if we're smaller, go left, if bigger go right) until we find ourselves at a null node. We then create that child as a node and set its value as our insertion value.

(We can of course do this iteratively, but trees lend themselves to recursion).

Inserting nodes in different orders creates very different trees - all sorted, but arrange differently. There are n! different binary trees possible with n inputs.
E.g.

1234:
     1 
   /  \
       2
      / \
         3
        / \   
           4
3241:
      3 
     / \
    2   4
   / \ / \
  1
// createTree rather than createNode
// as each new node is the start of a new subtree
 
Tree insertTree (Tree t, int data) {
 
   // If our node/Tree is null we're at a leaf and can stop
   // We hence set the node above us to point to this
   if (t == NULL) {
      t = createTree(data); 
   } else {
 
      // Otherwise recurse left/right until we find the leaf
      // Setting t->x to equal it so as to link parent and child
      if (data < t->data) {
         t->left = insertTree(t->left, data);
      } else {
         t-> right = insertTree(t->right, data);      
      }
   }
 
   return t;
 
}

Deleting a Node in a Binary Tree

To delete a node in a binary tree we need to reshuffle the entire subtree to fix it. This isn't as bad as it sounds:

We replace a node with the leftmost node in the right subtree (DLMD - deepest left most descendant), or the rightmost node in the left subtree (DRMD - deepest right most descendent).

This must be done recursively until we're replacing a node (i.e. putting x where y was involves deleting x too) with a leaf node which can simply be free'd.

E.g. in the following tree to delete '8' we can either replace it with the right branch leaf node 9 as shown (which is then called with the delete function and free'd), or with the left branch node 4, which would require itself being called and then replaced with the left leafnode 2.

BSTs?action=AttachFile&do=get&target=delete2BST.png

// Find the node with the given value and then delete it
 
Tree deleteTree (Tree t, int value) {
 
   // We need a pointer to enable us to relink later
   Tree replacement = t;
 
   if (t != NULL) {
      // If the value is less than current, go left and try again
      if (value < t->data) {
         // We need to relink the pointers
         // Hence left child is now replaced by what we pass back at the end
         t->left = deleteTree(t->left, value);
      // If the value is greater than current, go right and try again
      } else if (value > t->data) {
         t->right = deleteTree (t->right, value);
      // Else we've found it and can delete it
      } else {
          // If the node is a leaf node we can just delete it 
          // And have things point to NULL
          if ((t->left == NULL) && (t->right == NULL)) {
              replacement = NULL;
          // If the left node is null we can simply shift the right up 
          } else if (t->left == NULL) {
             replacement = t->right;
          // If the right node is null we can simply shift the left up
          } else if (t->right == NULL) {
             replacement = t->left;
          // If there are 2 children we recurse down to find the replacement
          // Once we find it we need to make its parent point to NULL instead of it
          // And also make it point to t's two children
          } else {
             // DLMD = Deepest Left Most Descendant 
             // (i.e. the smallest one bigger than t)
             replacement = t->right;
             Tree parent = t;
             while (replacement->left != NULL) {
                parent = replacement;
                replacement = replacement->left;
             }  
             // The parent is now equal to one above the smallest
             // But if the right node is the entire subtree then we should set right to null
             // In that instance we need to make sure we check and do the above
             if (parent == t) {
                parent->right = NULL;  
             } else {
                parent->left = NULL;
             }
             replacement->left = t->left;
             replacement->right = t->right;
          }
          free(t);
          t = replacement;
      }
   }
 
   return replacement; 
 
}

Printing a Binary Tree

To simply print all the elements of a binary tree in order we recurse down as left as we can at each point. If we can't go left, we either go right or go up a level and try again.

We can print trees with structure by printing them sideways. To do this we can start at the bottom and after every left print a newline.

The catch being that each row up has one less tab indent than the row before, so that when looked at on the side the newlines become the horizontal spacing, and the tabs the vertical.

void printTree (tree t, int depth) {
 
   if (t != NULL) {
      // We're now a step further down, one depth further
      depth++;
      // Now call the left side, as per usual
      printTree (t->left, depth);
      // When we get to a null, and hence reach this step
      // We need to print enough tabs to indicate levels down
      int i;
      for (i = 0; i < depth; i++) {
         printf ("\t");
      }
      // We then print the data on a newline, to show the horizontal spacing
      printf ("%d\n", t->data);
      // We now call the right branch of the tree, having exhausted the left
      printTree (t->right, depth);
   }
 
}

We can try printing trees in the normal tree structure by using a Breadth First Search, rather than the Depth First Search that is prefix, postfix or infix.

Counting Nodes in a Binary Tree

To count the number of nodes in a tree we essentially do the same thing as print (go down to the left as much as we can. If we can't go left, try right), except increment a counter rather than printing out the data.

int countTree (tree t) {
 
   int numNodes = 0;
 
   if (t != NULL) {
      // 1 for yourself, then the sum of left and right
      numNodes = 1 + countTree(t->left) + countTree(t->right);
   }
 
   return numNodes;
 
}

Finding Height of a Binary Tree

Finding the height of a binary tree is very simple; essentially we recurse down comparing the height of each left and right subtree, and setting our total to equal the max (+1 for the current level) before going back up.

int height (tree t) {
 
   // treeHeight is -1 because it serves as error checking
   // As well as allowing the bottom null nodes to correctly increment it to 0 
   // (they'll return -1, be added to 1 and get 0)
   int treeHeight = -1
 
   if (t != NULL) {
      treeHeight = 1 + max(height(t->left), height(t->right));
   }
 
   return treeHeight;
 
}

Checking Balance of a Binary Tree

To find the balance of a binary tree we compare the counts of the left subtree and the right subtree:

int balance (tree t) {
 
   int treeBalance = 0;
 
   if (t != NULL) {
      treeBalance = abs(count(t->left) - count(t->right));
   }
 
}

Complexity Analysis

Searching

Best case in a binary search is when the key is your root.
Hence: O(1)

Worst case in a binary search is when the key is not in the tree.
Hence O(lg(n)) for a balanced tree, O(n) for a degenerate tree.

Average case in a binary search is when the key is in the middle. It averages out to the same as worst case.

Insertion and Deletion

To insert you always traverse the full height of the tree.
Hence O(lg(n) if balanced, O(n) if degenerate.

Deletion averages out to the same as insert, as it must be traversed down to a leaf node as well, in order to replace with either DLMD or DRMD.