Sunday 8 July 2012

Binary Tree Operations


 Binary Tree Operations

Traversals of Binary Trees

Common parent-child traversals
Inorder: visit left subtree, visit root, visit right subtree 
Preorder: visit root, visit left, visit right 
Postorder: visit left, visit right, visit root 
private void Inorder(BSTNode root) 
{ 
  if(root != null) { 
    Inorder(root.left);
    Process(root.value);
    Inorder(root.right);
  } 
}
public void Inorder(){
    Inorder(root);
}

private void Preorder(BSTNode root) 
{ 
  if(root != null) {
    Process(root.value);
    Preorder(root.left);
    Preorder(root.right);
  } 
}
public void Preorder(){
    Preorder(root);
}

private void Postorder(BTSNode root)
{ 
  if(root != null) {
    Postorder(root.left);
    Postorder(root.right);
    Process(root.value); 
  } 
}
public void Postorder(){
    Postorder(root);
}

Level order traversal: starting with the root, process all nodes at each level left to right before proceeding to the next level. What data structure will you need to accomplish this?
Note the traversals as implemented in BinaryTreePT.java
 
 
 

Examples of Traversals

Deletion of an entire tree (post order traversal)
void DeleteAll(BTNode cur)
{
    if(cur != NULL) {
        DeleteAll(cur.left);
        DeleteAll(cur.right);
        cur = null;
    }
}






Implementation of isEmpty and isFull


  public boolean isEmpty()
  // Determines whether this BST is empty
  {
    return (root == null);
  }
 
  public boolean isFull()
  // Determines whether this BST is full
  {
    return false;
  }




Number of Nodes

Note difference in code complexity for recursive and nonrecursive solutions
  private int recNumberOfNodes(BSTNode tree)
  // Determines the number of nodes in tree
  {
    if (tree == null)    
      return 0;
    else
      return recNumberOfNodes(tree.left) + recNumberOfNodes(tree.right) + 1;
  }
 
  public int numberOfNodes()
  // Determines the number of nodes in this BST
  {
    return recNumberOfNodes(root);
  }
 
  public int numberOfNodes2()
  // Determines the number of nodes in this BST
  {
    int count = 0;
    if (root != null)
    {
      LinkedStack hold = new LinkedStack();
      BSTNode currNode;
      hold.push(root);
      while (!hold.isEmpty())
      {
        currNode = (BSTNode) hold.top();
        hold.pop();
        count++;
        if (currNode.left != null)
          hold.push(currNode.left);
        if (currNode.right != null)
          hold.push(currNode.right);
      }
    }
    return count;
  }



isThere and retrieve Implementation

  private boolean recIsThere(Comparable item, BSTNode tree)
  // returns true if item is in tree, false otherwise
  {
    if (tree == null)
      return false;                      // item is not found
    else if (item.compareTo(tree.info) < 0)
      return recIsThere(item, tree.left);  // Search left subtree
    else if (item.compareTo(tree.info) > 0)
      return recIsThere(item, tree.right); // Search right subtree
    else
      return true;                         // item is found
  }
 
  public boolean isThere (Comparable item)
  // Determines if element matching item is in this BST.
  {
    return recIsThere(item, root);
  }
  
  private Comparable recRetrieve(Comparable item, BSTNode tree)
  // Returns the element in tree with the same key as item.
  {
    if (item.compareTo(tree.info) < 0)
      return recRetrieve(item, tree.left);          // retrieve from left subtree
    else
    if (item.compareTo(tree.info) > 0)
      return recRetrieve(item, tree.right);         // retrieve from right subtree
    else
      return tree.info;
  }
 
  public Comparable retrieve(Comparable item)
  // Returns the BST element with the same key as item.
  {
    return recRetrieve(item, root);
  }



Insertion and Deletion

  private BSTNode recInsert(Comparable item, BSTNode tree)
  // Inserts item into the tree
  {
    if (tree == null)
    {// Insertion place found
      tree = new BSTNode();
      tree.right = null;
      tree.left = null;
      tree.info = item;
    }
    else if (item.compareTo(tree.info) < 0)
      tree.left = recInsert(item, tree.left);    // Insert in left subtree
    else
      tree.right = recInsert(item, tree.right);   // Insert in right subtree
    return tree;
  }
 
  public void insert (Comparable item)
  // Adds item to this BST.
  {
    root = recInsert(item, root);
  }
 
 
    private BSTNode deleteNode(BSTNode tree)
  // Deletes the node referenced by tree
  // Post: The user's data in the node referenced to by tree is no
  //       longer in the tree.  If tree is a leaf node or has only
  //       a non-null child pointer, the node pointed to by tree is
  //       deleted; otherwise, the user's data is replaced by its
  //       logical predecessor and the predecessor's node is deleted
  {
    Comparable data;
 
    if (tree.left == null)
      return tree.right;
    else if (tree.right == null) 
      return tree.left;
    else
    {
      data = getPredecessor(tree.left);
      tree.info = data;
      tree.left = recDelete(data, tree.left);  // Delete predecessor node
      return tree;
    }
  }
    private BSTNode recDelete(Comparable item, BSTNode tree)
  // Deletes item from the tree
  {
    if (item.compareTo(tree.info) < 0)
      tree.left = recDelete(item, tree.left);
    else if (item.compareTo(tree.info) > 0)
      tree.right = recDelete(item, tree.right);
    else 
      tree = deleteNode(tree);
    return tree;
  }
 
  public void delete (Comparable item)
  // Delete the element of this BST whose key matches item’s key.
  {
    root = recDelete(item, root);
  }



reset and getNext operations

 public int reset(int orderType)
  // Initializes current position for an iteration through this BST in orderType order
  {
    int numNodes = numberOfNodes();
    if (orderType == INORDER)
    {
      inOrderQueue = new ArrayQueue(numNodes);
      inOrder(root);
    }
    else
    if (orderType == PREORDER)
    {
      preOrderQueue = new ArrayQueue(numNodes);
      preOrder(root);
    }
    if (orderType == POSTORDER)
    {
      postOrderQueue = new ArrayQueue(numNodes);
      postOrder(root);
    }
    return numNodes;
  }
 
  public Comparable getNextItem (int orderType)
  // Returns a copy of the element at the current position in this BST and advances 
  // the value of the current position based on the orderType set through reset
  {
    if (orderType == INORDER)
      return (Comparable)inOrderQueue.dequeue();
    else
    if (orderType == PREORDER)
      return (Comparable)preOrderQueue.dequeue();
    else
    if (orderType == POSTORDER)
      return (Comparable)postOrderQueue.dequeue();
    else return null;
  }
 
    private void inOrder(BSTNode tree)
  // initializes inOrderQueue with tree elements in inOrder order
  {
    if (tree != null)
    {
      inOrder(tree.left);
      inOrderQueue.enqueue(tree.info);
      inOrder(tree.right);
    }
  }
 
  private void preOrder(BSTNode tree)
  // initializes preOrderQueue with tree elements in preOrder order
  {
    if (tree != null)
    {
      preOrderQueue.enqueue(tree.info);
      preOrder(tree.left);
      preOrder(tree.right);
    }
  }
 
  private void postOrder(BSTNode tree)
  // initializes postOrderQueue with tree elements in postOrder order
  {
    if (tree != null)
    {
      postOrder(tree.left);
      postOrder(tree.right);
      postOrderQueue.enqueue(tree.info);
    }
  }

Binary Trees


Trees are natural structures for representing certain kinds of hierarchical data. A (rooted) tree consists of a set of nodes (or vertices) and a set of arcs (or edges). Each arc links a parent node to one of the parent's children. A special root node has no parent. Every other node has exactly one parent. It is possible to reach any node by following a unique path of arcs from the root. If arcs are considered bidirectional, there is a unique path between any two nodes. The simplest kind of tree is a binary tree where each parent has at most two children.
(We consider unrooted trees elsewhere, e.g. in [minimum spanning trees].)

A way to think of a binary tree is that it is either empty (emptyTree) or it is a fork which contains an element and two subtrees which are themselves binary trees.
©
L
.
A
l
l
i
s
o
n
 
Types:
   type tree e = emptyTree | fork e × (tree e) × (tree e)
 
Operations:
   empty: tree e -> boolean
   leaf : tree e -> boolean
   fork : e × tree e × tree e -> tree e
   left : tree e -> tree e
   right: tree e -> tree e
   contents: tree e -> e
 
Rules:
   empty(emptyTree) = true
   empty(fork(x, T, T'))= false
 
   leaf(fork(x, emptyTree, emptyTree)) = true
   leaf(fork(x, T, T'))
      = false, if not empty(T) or not empty(T')
   leaf(emptyTree) =  error
 
   left(fork(x, T, T')) = T
   left(emptyTree) = error
 
   right(fork(x, T, T')) = T'
   right(emptyTree) = error
 
   contents(fork(x, T, T')) = x
   contents(emptyTree) = error
 
 
The Tree Abstract Data Type.
(It is also common to use curried versions of constructors and functions, such as fork, especially in functional languages.)
Many operations can be defined on trees. For example:
 
height(emptyTree) = 0 |
height(fork(e,T,T')) = 1+max(height(T), height(T'))
 
weight(emptyTree) = 0 |
weight(fork(e,T,T')) = 1+weight(T)+weight(T')
 
Note that these functions are binary-recursive as is the definition of the tree type. (Also note that a tree consisting of a single leaf is sometimes defined to be of height zero, rather than 1, as here.)
Trees have many uses in computing. For example a parse-tree can represent the structure of an expression:
 
input: a+b*c
 
----------> +
           . .
          .   .
         a     *
              . .
             .   .
            b     c
 
Multiplication has a higher priority then addition and binds more tightly. This tree shows that a+b*c is interpreted as a+(b*c) rather than as (a+b)*c. Such trees are used in compilers and other programs.

Implementation of Binary Trees by Pointers and Records

A tree data type can be implemented as a collection of records and pointers.
 
[C/Tree/Tree.h]
[C/Tree/TreeElement.h]
.so tree/tree.type.t
 
The basic operations can create new records and manipulate pointers.
 
[C/Tree/Ops.c]
.so tree/tree.cons.t
 
Not surprisingly this tree implementation is similar to the implementation of hierarchical lists.
An output routine for trees is a virtual necessity. A tree can be printed in a style like that used for lists but a graphical two-dimensional layout is more informative. It is difficult to print trees down the page, because they quickly grow too wide, but it is relatively easy to print them across the page so that the root is at the left and the tree grows to the right.
 
[C/Tree/Write.c]
.so tree/tree.put.t
 
For an example of tree output, see later sections.

Example: Parse Trees

Parsing an input language according to a grammar is frequently necessary in computing. Here we consider a language of simple arithmetic expressions. A grammar written in Backus Naur form (BNF) is given below.
 
<exp>     ::= <exp> + <term> | <term>
<term     ::= <term> * <operand> | <operand>
<operand> ::= <identifier> | ( <exp> )
<identifier> ::= <letter> <identifier> | <letter>
<letter>  ::= "a" | "b" | ... | "z"
 
Grammar for Simple Expression.
 
The grammar can be read as a definition of the syntactic structure of expressions, <exp>. The symbol `::=' can be read as `is' or `can be replaced by'. The symbol `|' can be read as `or'. The names in angle brackets, such as '<exp>', are variables for parts of the language. The string <exp>+1 is not a legal expression but it stands for many legal expressions such as x+1, y+1 and (x+y*z)+1. The first line of the grammar can be read as: an <exp> is an <exp> plus <term> or a <term>. An expression is either an expression plus a term or just a term. Note that the syntax rules are recursive. A little thought shows that an expression is a sequence of terms separated by plus signs. Similarly, a term is a sequence of operands separated by multiplication signs. An operand is either an identifier or a bracketed subexpression. An identifier is a sequence of letters.
A parser can be written using the recursive-descent technique. A procedure is written to recognise each class of object in the grammar - exp, term, operand. These routines are recursive, as is the grammar, and call each other to implement the grammar rules. For example, the routine for expression calls the routine for term. The repetitive nature of expressions and terms is coded by the use of a loops. A bracketed subexpression is inherently recursive and so is the parser at that point. The complete parser is given below. It is moderately long but not complex, especially if read with the grammar in mind. A lexical routine, insymbol, skips white space and packages input letters into identifiers and other kinds of symbol. It is followed by the parser proper consisting of the routines Operand, Term and Exp.
 
[C/Tree/Parser.h]
[C/Tree/Parser.c]
.so tree/tree.parse.t
 
Note the use of mutual recursion. Various errors may be detected during parsing. If the expression is syntactically correct, the tree representing its structure is finally printed.
 
input: a + b*(c+d)
 
Parse Tree:
 
          |          |          |d---------|
          |          |+---------|
          |          |          |c---------|
          |*---------|
          |          |b---------|
+---------|
          |a---------|
 

Recursive Tree Traversal

There are three classic ways of recursively traversing a tree or of visiting every one of its nodes once. In each of these, the left and right subtrees are visited recursively and the distinguishing feature is when the element in the root is visited or processed.
In a preorder or prefix traversal the root is visited first (pre) and then the left and right subtrees are traversed.
 
[C/Tree/Prefix.c]
.so tree/tree.prefix.t
 
In an infix traversal, the left subtree is traversed and then the root is visited and finally the right subtree is traversed.
 
[C/Tree/Infix.c]
.so tree/tree.infix.t
 
In a postorder or postfix traversal the left and right subtrees are traversed and then the root is visited afterwards (post).
 
[C/Tree/Postfix.c]
.so tree/tree.postfix.t
 
This method can be used to generate postfix or reverse Polish code for a stack machine.
Given the following tree,
 
          |type------|
the-------|
          |          |rover-----|
          |          |          |of--------|
          |          |          |          |motor-----|
          |land------|
          |          |is--------|
          |          |          |          |car-------|
          |          |          |a---------|
 
Example Tree.
 
the three traversals give the following results. The results of yet another method, breadth-first traversal, are included for comparison; see the next section.
 
infix:   a car is land motor of rover the type
prefix:  the land is a car rover of motor type
postfix: car a is motor of rover land type the
b-first: the land type is rover a of car motor
 
Traversals.
 
Note that the method given for printing a tree is a reversed infix traversal.

Breadth-First Traversal

A breadth-first traversal of a tree starts at the root of the tree. It next visits the children, then the grand-children and so on.
 
                     1
                   .   .
                 .       .
                2         3
               . .       . .
              .   .     .   .
             4     5   6     7
            . .       .     . .
           .   .     .     .   .
          8     9   10    11   12
               .         . .
              .         .   .
             13        14   15
 
         Example: Breadth-First Order.
 
The numbers indicate the order in which the nodes are visited, not the contents of the nodes. Because children are only accessible from a parent, they must be stored while the parent's siblings and cousins are visited. A [queue] is used to do this.
 
[C/Queue/Queue.h]
[C/Tree/QueueElement.h]
[C/Tree/BFirst.c]
.so tree/tree.bf.t
 
Note that the queue is a queue of pointer to node or a queue of tree type. Initially the queue contains just the root of the tree. At each iteration of the algorithm, the first element is removed from the queue. Its children, if any, are pushed onto the end of the queue and the element is processed. The algorithm terminates when the queue is empty. See also the chapter on queues.

Recursion

Most routines on trees are recursive. This is natural because the tree is a recursive data type. It is possible to write iterative versions of these operations but it is harder to do so than is the case for flat lists because the tree type is binary recursive. The flat list and hence most of its operations are linear recursive and a linear recursive routine usually has a simple iterative version. It is often necessary to introduce an explicit stack into a program when writing a non-recursive tree routine. This is often not worth the effort as the language implementors can usually do a better job with the system stack.
The object-oriented programming language Java has the notion of an Enumeration, i.e. an iterator, which steps through the elements of a set of values. An enumerator must implement the predicate hasMoreElements and the function nextElement. To program an enumerator which will emit the contents of a tree in one of the standard orders it is useful to a employ a Stack:
 
[Java/Tree/egPrefixEnum.txt]
 

Side-Effects

The tree operations described so far have no side-effects except for input, output and manipulation of dynamic storage; they are pure tree operations. As is the case with lists, it is often necessary or desirable to use operations having side-effects on efficiency grounds. This is particularly natural if a program uses a single tree as a dictionary or database structure. As before, should multiple trees share components, changing one tree may change another and if this is not anticipated it will cause program bugs.

Search Trees

Demonstration: There is a demonstration of Binary Search (& AVL) Trees [here].
A binary search tree can be created so that the elements in it satisfy an ordering property. This allows elements to be searched for quickly. All of the elements in the left subtree are less than the element at the root which is less than all of the elements in the right subtree and this property applies recursively to all the subtrees. The great advantage of this is that when searching for an element, a comparison with the root will either find the element or indicate which one subtree to search. The ordering is an invariant property of the search tree. All routines that operate on the tree can make use of it provided that they also keep it holding true.
 
.so tree/tree.search.t
 
It takes O(h) time to search a search tree of height h. Since a tree of height `h' can hold n=2h-1 elements, the search takes O(log(n)) time under favourable circumstances. The search-tree and its search routine should be compared with the use of the binary search algorithm on sorted arrays in the chapter on tables.
The use of the tree speeds up the insertion and deletion operations at the price of the space needed to hold the pointers. The tree has the speed advantage when the data in the structure changes rapidly.
The routine given here to insert an element does so as a side-effect by changing the tree.
 
[C/Tree/Insert.c]
.so tree/tree.insert.t
 
If elements are inserted in the order d, b, a, c, e, f, g the following tree is created:
 
----------->d
           . .
          .   .
         .     .
        b       e
       . .       .
      .   .       .
     a     c       f
                    .
                     .
                      g
 
Note that an insertion takes O(h) time. Under favourable circumstances, a balanced tree is created, as for b, a, c, giving O(log(n)) search time. If the input is sorted however an unbalanced tree approximating a list is created, as for e, f, g, and the search degenerates to an O(n) linear search. This problem is addressed later.
 
input: the land rover is a type of motor car
 
Search Tree:
 
          |type------|
the-------|
          |          |rover-----|
          |          |          |of--------|
          |          |          |          |motor-----|
          |land------|
          |          |is--------|
          |          |          |          |car-------|
          |          |          |a---------|
 
Example Search Tree.
 
A new element is added to the tree as a new peripheral, leaf node. However if an element can also be deleted it is possible for it to be internal. This makes deletion rather more difficult than insertion.
A leaf element is easily deleted by setting the pointer to it to emptyTree.
 
T:-----|                T:emptyTree
       |
       |
       v
       x
 
  before                 after
 
 
Deletion of a Leaf.
The node becomes garbage and can be freed.
An element with one child can be deleted by by-passing it.
 
T:------|               T:|
        |                 |
        |                 |
        v                 |
        x                 |
       . emptyTree        |
     .                    v
   e                      e
  . .                    . .
 .   .                  .   .
 
 before                after
 
 
Deletion of a Single-Child Parent.
An internal element x with two children cannot easily be bypassed without loosing one of its subtrees. The solution is to overwrite x with some other element y of the tree and then to delete the original copy of y. There are two obvious elements to choose from - either the largest element `A' less than x or the smallest element `B' greater than x. Each of these has at most one child! The sortedness of the tree is maintained if x is overwritten with either of these.
 
T:----------|               T:-----|
            |                      |
            |                      |
            v                      v
            x                      B
          .   .                  .   .
        .       .              .       .
      L           R          L           R
     . .         . .        . .         . .
    .   .       .   .      .   .       .   .
         A     B                A     C
        .       .              .     . .
       .         .            .     .   .
                 C
                . .
               .   .
 
 before                    after
 
 
Deletion of a Two-Child Parent.
Both of A and B fall into one of the cases previously dealt with. A can be found from x by taking one step left and then as many steps right as possible. B can be found by taking one step right and then as many steps left as possible.
 
.so tree/tree.delete.t
 
Deletion takes O(h) time.

Notes

  • [Tries] and [PATRICIA trees] are an alternatives to binary search trees.
  • The [Suffix Tree] is a data-structure for solving the string searching problem.

Height-Balanced Trees

Demonstration: There is a demonstration of AVL (& bin. s.) Trees [here].
As mentioned above, if elements are inserted into a search tree in sorted order, a tree is created that is equivalent to a list. This will lead to inefficient searches. A height-balanced tree or an AVL-tree (after G. M. Adel'son-Velskii and E. M. Landis) is a search tree in which the height of the right subtree minus the height of the left subtree equals 1, 0, or -1. This property also applies recursively to all subtrees. It can be shown that a height-balanced tree of `n' elements has height O(log(n)) and this guarantees efficient search. Fortunately fast O(log(n)) insertion and deletion is still possible.
 
.so tree/tree.AVLtype.t
 
A flag indicating the balance of each subtree is added to the node record.
There are four crucial cases during insertion. In the first case, the left subtree L grows too tall on its left:
 
        T                        L
       . .                      . .
      .   .                    .   .
     L     .     ----->       .     T
    . .    |                  |    . .
   .   .   |                  |   .   .
   .   .   |                  |   .   .
   |   |                      *   |   |
   |   |                      *   |   |
   |   |                      *   |   |
   *       new elt'           rebalanced
   *       disturbs
   *       balance
 
Left-2 Rotation.
 
By rotating about L and T the height balance is restored and the ordering in the tree is maintained. In the second case, L grows too tall on its right:
 
        T                      LR
      .   .                   .   .
     .     .                 .     .
    L       .               L       T
   . .      |              . .     . .
  .   .     |  ---->      .   .   .   .
 .    LR    |             .   .   .   .
 |    ..    |             |   |   |   |
 |   .  .   |             |   |   |   |
 |  .    .  |             |   |   |   |
 |  |    |  |             |   |   |   |
 |  |    |  |             |   |   |   |
 |  |    |  |             |   |   |   |
 |  |    |                |   *       |
 |  |    |                |   *       |
 |  |    |                |   *       |
    *
    *
    *
 
Left-3 Rotation.
 
The rotation restores the balance while maintaining the tree ordering. In the above example left(LR) grew; an alternative is that right(LR) grew but the same operations still restore the balance. There are mirror-image right-2 and right-3 rotations.
Maintaining the balance significantly complicates the tree insertion routine. However a fixed amount of work is done at each node that is visited and so it takes O(h) time.
 
.so tree/tree.AVLinsert.t
 
Note that if a tree has just grown in height, it can only be perfectly balanced if it is a single (new) leaf. Otherwise one subtree must be higher than the other.
 
input: the land rover is a type of motor car
 
          |          |type------|
          |the-------|
rover-----|
          |          |          |of--------|
          |          |motor-----|
          |          |          |land------|
          |is--------|
          |          |          |car-------|
          |          |a---------|
 
    Height-Balanced Tree.
 
Compare this with the tree given earlier and created by the non-balancing insert.

Notes

  • G. M. Adelson-Velskii and E. M. Landis [AVL]. An algorithm for the organization of information. Soviet Math. Dokl. 3, p1259-1263, 1962.
  • N. Wirth. Algorithms and Data Structures. p218, Prentice-Hall, 1986.
    AVL-tree insertion and deletion.
  • M. A. Weiss. Data Structures and Algorithm Analysis in C. s4.4, pp110. Addison Wesley 1997.
  • [2-3 trees], [2-3-4 trees] and [B-trees]

Implementation of Binary Trees by Arrays

A binary tree can be implemented as an array of records.
 
type TreeT :0..nmax
 
var Tree: array 1..nmax of
   record elt :Element_Type
          left, right :TreeT
   end record
 
The empty tree is represented by zero. Left and right are indexes to left and right subtrees.
This implementation has no advantage in a language supporting dynamic storage unless random access to nodes is also needed. It is useful in a language without dynamic storage.

Full Trees by Arrays

It is possible to define a full or weight balanced tree in contrast to a height balanced tree. The empty tree is a full tree of zero levels. If T and T' are full binary trees of n levels then fork(e,T,T') is a full binary tree of n+1 levels.
 
           1
          . .
         .   .
        .     .
       .       .
      2         3
     . .       . .
    .   .     .   .
   4     5   6     7
 
Full Binary Tree of 3 Levels.
 
The numbering of the nodes corresponds to a breadth-first traversal. This suggests that such a tree can be stored in an array:
 
const n = 2**Levels - 1
 
array 1..n of Element_Type
 
 
type tree = 0..n
 
empty(T)   = T=0
left(T)   = 2*T
right(T)  = 2*T+1
parent(T) = T div 2
leaf(T)   = T >= 2**(Levels-1)
 
Such an implementation is very efficient indeed, if the tree is full, because no space at all is used for pointers. See the chapter on sorting for an application of this technique in [heap-sort].

Testing and Debugging

Programming techniques for trees share a great deal in common with those for lists. Pre and post conditions and assertions should be included where possible. Minimise the use of side-effects in general and the manipulation of global variables in particular.
Note that there are really just two kinds of tree - emptyTree and non-emptyTree. Most operations on a tree follow this pattern:
 
f(emptyTree) = ... |
f(fork(e,T,T')) = ...often f(T)...f(T')...
 
If a case is missing it probably indicates an error. The empty case is usually very simple, often returning a constant. The main case often operates on one or both subtrees recursively.
When testing or debugging a tree routine, test cases should cover the above options. For non-emptyTree trees, it may be appropriate to try a tree of a "few" nodes and a tree of a single node. As usual it is invaluable to write the output routine first. The most common problems are: . unassigned pointers, particularly those that should be emptyTree, . lost pointer values through a wrong ordering of operations, . knotted pointers maybe introducing a cycle, . side-effects through shared pointers, intentional or otherwise.
The coding of reasonably complex routines such as those on AVL trees is prone to small typographical errors. Nothing can beat careful proof reading but one good testing trick is to use a set of ascending data and then its reversal in which case the mirror image tree should be produced. This test is not sufficient on its own! It is important to exercise all the paths through a complex routine and this requires a great deal of thought.

Exercises

  1. Classify tree operations (height, weight, min_element, max_element, flatten, insert, lookup, delete, pre,- in- and post-order traversal) on unordered binary trees and on search trees as either linear or binary recursive. Note that a linear-recursive operation may contain two or more calls on itself but only ever execute at most one of them under the control of an if statement. Write iterative versions of one or more of the linear operations.
  2. Add a full set of operators +, -, *, / to the expression parser.
  3. Write a routine to traverse the expression parse tree to generate reverse Polish code. eg. a*b+c*d ---> a b * c d * +
  4. Write a routine to do symbolic differentiation. Input is the parse tree of an arithmetic expression which may contain numerals, variables {x, y, ...}, + and *. Output is the tree of the expression that is the result of differentiation with respect to a given variable.
Use the rules:
diff(x, x) = 1
diff(y, x) = 0 if y is a constant or a variable different from x
diff(A+B, x) = diff(A,x)+diff(B,x)
diff(A*B, x) = A*diff(B,x)+diff(A,x)*B
eg. diff(x*x+3*x+4, x) = x*1+1*x+3*1+0*x+0 = 2*x+3
  1. Write a routine to free all of the nodes in a binary tree. Count the number of new and free operations and check that they are equal.
  2. Write a routine to collect only the elements in all the leaf nodes of a binary tree into a list.
  3. The path of a node in a tree is the sequence of elements contained in nodes from the root, down to and including the node itself. Give a procedure to print the path of every leaf node in a tree.
  4. A height-balanced tree is not necessarily full or weight-balanced. How un-weight-balanced can one be? ie. What is the minimum possible number `n' of nodes in a height-balanced tree for height h=1, 2, ..., 8. Is there a pattern? How does n compare with the value 2h-1 for a full tree?
  5. (Hard) Implement a delete operation for height-balanced trees.
-- L.A., Department of Computer Science, University of Western Australia 1984-86.
window on the wide world:
Algorithms in C
Sedgwick
8 The Darwin Awards 4: Intelligent Design


Linux
free op. sys.
OpenOffice
free office suite,
ver 2.2+

The GIMP
~ free photoshop
Firefox
we

Stack - Array Implementation


  1. Abstract idea of a stack:
The stack is a very common data structure used in programs. By data structure, we mean something that is meant to hold data and provides certain operations on that data.
One way to describe how a stack data structure behaves is to look at a physical analogy, a stack of books...
Now, a minimal set of things that we might do with the stack are the following:
Place a book on the top...
Take one off the top...
See if the stack is empty...
NOT Empty, there are books on the stack!
We'll consider other, more complex operations to be inappropriate for our stack. For example, pulling out the 3rd book from the top cannot be done directly because the stack might fall over.
How might you get the 3rd book from the top using only the simple operations?

Abstraction

Now, let's think about a stack in an abstract way. I.e., it doesn't hold any particular kind of thing (like books) and we aren't restricting ourselves to any particular programming language or any particular implementation of a stack.
Stacks hold objects, usually all of the same type. Most stacks support just the simple set of operations we introduced above; and thus, the main property of a stack is that objects go on and come off of the top of the stack.
Here are the minimal operations we'd need for an abstract stack (and their typical names):
    • Push: Places an object on the top of the stack.
    • Pop: Removes an object from the top of the stack and produces that object.
    • IsEmpty: Reports whether the stack is empty or not.
Because we think of stacks in terms of the physical analogy, we usually draw them vertically (so the top is really on top).
  1. Order produced by a stack:
Stacks are linear data structures. This means that their contexts are stored in what looks like a line (although vertically). This linear property, however, is not sufficient to discriminate a stack from other linear data structures. For example, an array is a sort of linear data structure. However, you can access any element in an array--not true for a stack, since you can only deal with the element at its top.
One of the distinguishing characteristics of a stack, and the thing that makes it useful, is the order in which elements come out of a stack. Let's see what order that is by looking at a stack of letters...
Suppose we have a stack that can hold letters, call it stack. What would a particular sequence of Push and Pops do to this stack?
We begin with stack empty:
-----
stack
Now, let's perform Push(stack, A), giving:
-----
| A |  <-- top
-----
stack
Again, another push operation, Push(stack, B), giving:
-----
| B |  <-- top
-----
| A |
-----
stack
Now let's remove an item, letter = Pop(stack), giving:
-----              -----
| A |  <-- top     | B |
-----              -----
stack              letter
And finally, one more addition, Push(stack, C), giving:
-----
| C |  <-- top
-----
| A |
-----
stack
You'll notice that the stack enforces a certain order to the use of its contents, i.e., the Last thing In is the First thing Out. Thus, we say that a stack enforces LIFO order.
Now we can see one of the uses of a stack...To reverse the order of a set of objects.
  1. Implementing a stack with an array:
Let's think about how to implement this stack in the C programming language.
First, if we want to store letters, we can use type char. Next, since a stack usually holds a bunch of items with the same type (e.g., char), we can use an array to hold the contents of the stack.
Now, consider how we'll use this array of characters, call it contents, to hold the contents of the stack. At some point we'll have to decide how big this array is; keep in mind that a normal array has a fixed size.
Let's choose the array to be of size 4 for now. So, an array getting A, then B, will look like:
-----------------
| A | B |   |   |
-----------------
  0   1   2   3
contents
Is this array sufficient, or will we need to store more information concerning the stack?
Answer: We need to keep track of the top of the stack since not all of the array holds stack elements.
What type of thing will we use to keep track of the top of the stack?
Answer: One choice is to use an integer, top, which will hold the array index of the element at the top of the stack.

Example:

Again suppose the stack has (A,B) in it already...
stack (made up of 'contents' and 'top')
-----------------   -----
| A | B |   |   |   | 1 |
-----------------   -----
  0   1   2   3      top
contents
Since B is at the top of the stack, the value top stores the index of B in the array (i.e., 1).
Now, suppose we push something on the stack, Push(stack, 'C'), giving:
stack (made up of 'contents' and 'top')
-----------------   -----
| A | B | C |   |   | 2 |
-----------------   -----
  0   1   2   3      top
contents
(Note that both the contents and top part have to change.)
So, a sequence of pops produce the following effects:
1.                  letter = Pop(stack)
2.  stack (made up of 'contents' and 'top')
3.  -----------------   -----    -----
4.  | A | B |   |   |   | 1 |    | C |
5.  -----------------   -----    -----
6.    0   1   2   3      top     letter
7.  contents
8.                  letter = Pop(stack)
9.  stack (made up of 'contents' and 'top')
10.-----------------   -----    -----
11.| A |   |   |   |   | 0 |    | B |
12.-----------------   -----    -----
13.  0   1   2   3      top     letter
14.contents
15.              letter = Pop(stack)
16.stack (made up of 'contents' and 'top')
17.-----------------   -----    -----
18.|   |   |   |   |   | -1|    | A |
19.-----------------   -----    -----
20.  0   1   2   3      top     letter
21.contents
so that you can see what value top should have when it is empty, i.e., -1.
Let's use this implementation of the stack with contents and top fields.

What happens if we apply the following set of operations?
22.              Push(stack, 'D')
    1. Push(stack, 'E')
    2. Push(stack, 'F')
    3. Push(stack, 'G')
giving:
stack (made up of 'contents' and 'top')
-----------------   -----
| D | E | F | G |   | 3 |
-----------------   -----
  0   1   2   3      top
contents
and then try to add H with Push(stack, 'H')?
Thus, what is the disadvantage of using an array to implement a stack?
  1. Dynamically-sized stack:
Now, we will add one more choice to how we'll implement our stack. We want to be able to decide the maximum size of the stack at run-time (not compile-time).
Thus, we cannot use a regular array, but must use a pointer to a dynamically-allocated array.
Now, will we need to keep track of any more information besides the contents and top?
Answer: Yes! We'll need to keep the size of this array, i.e., the maximum size of the stack. We'll see why this is necessary as we write the code.
  1. C code:
Now, let's think about how to actually code this stack of characters in C.
It is usually convenient to put a data structure in its own module, thus, we'll want to create files stack.h and stack.c.
Now, there are 2 main parts to a C data structure: the data types needed to keep track of a stack and the functions needed to implement stack operations.
0.                  The main data type we need is a type that people can use to declare new stacks, as in:
1.  type-of-a-stack s1, s2;
2.                  Some of the functions we'll need come directly from the operations needed for an abstract stack, like:
3.  StackPush(ref-to-s1, 'A');
4.  ch = StackPop(ref-to-s2);
Notice how each stack operation needs some way to refer to a specific stack (so that we can have more than one stack at a time).
We may need to add a few other operations to help implement a stack. These will become apparent as we start to implement the stack. Remember that we need to put prototypes for each stack function in stack.h and put the function definitions (bodies) in stack.c.
Before we ponder the details of the stack functions, we should decide on the types we need...
  1. Data types for a stack:
For the array implementation, we need to keep track of (at least) the array contents and a top index. How could we combine these 2 into a single C construct of type stackT?
Answer: Put them into a struct.
So, our stack types become:
typedef char stackElementT;  /* Give it a generic name--makes  */
                             /* changing the type of things in */
                             /* the stack easy.                */
 
typedef struct {
  stackElementT *contents;
  int top;
  /* Other fields? */
} stackT;
Note that the contents is a pointer since it will be dynamically-allocated.
Are any other fields needed? Well, remember that the maximum size of the array is determined at run-time...We'll probably need to keep that value around so that we can tell when the stack is full... The final type, thus, is:
typedef struct {
  stackElementT *contents;
  int top;
  int maxSize;
} stackT;
These types should go in the interface, stack.h.
  1. Filling in stack functions:
Now that we've decided on the data types for a stack, let's think about the functions we need...
First, we need the standard stack operations:
StackPush()
StackPop()
StackIsEmpty()
We'll use the convention of placing the data structure name at the beginning of the function name (e.g., StackIsEmpty). That will help us down the line. For example, suppose we use 2 different data structures in a program, both with IsEmpty operations--our naming convention will prevent the 2 different IsEmpty functions from conflicting.
We'll also need 2 extra operations:
StackInit()
StackDestroy()
They are not part of the abstract concept of a stack, but they are necessary for setup and cleanup when writing the stack in C.
Finally, while the array that holds the contents of the stack will be dynamically-allocated, it still has a maximum size. So, this stack is unlike the abstract stack in that it can get full. We should add something to be able to test for this state:
StackIsFull()
  1. StackInit():
The first function we'll implement is StackInit(). It will need to set up a stackT structure so that it represents an empty stack.
Here is what the prototype for StackInit() looks like...
void StackInit(stackT *stackP, int maxSize);
It needs to change the stack passed to it, so the stack is passed by reference (stackT *). It also needs to know what the maximum size of the stack will be (i.e., maxSize).
Now, the body of the function must:
0.                  Allocate space for the contents.
    1. Store the maximum size (for checking fullness).
    2. Set up the top.
Here is the full function:
void StackInit(stackT *stackP, int maxSize)
{
  stackElementT *newContents;
 
  /* Allocate a new array to hold the contents. */
 
  newContents = (stackElementT *)malloc(sizeof(stackElementT)
                                        * maxSize);
 
  if (newContents == NULL) {
    fprintf(stderr, "Insufficient memory to initialize stack.\n");
    exit(1);  /* Exit, returning error code. */
  }
 
  stackP->contents = newContents;
  stackP->maxSize = maxSize;
  stackP->top = -1;  /* I.e., empty */
}
Note how we make sure that space was allocated (by testing the pointer against NULL). Also, note that if the stack was not passed by reference, we could not have changed its fields.
  1. StackDestroy():
The next function we'll consider is the one that cleans up a stack when we are done with it. It should get rid of any dynamically-allocated memory and set the stack to some reasonable state.
This function only needs the stack to operate on:
void StackDestroy(stackT *stackP);
and should reset all the fields set by the initialize function:
void StackDestroy(stackT *stackP)
{
  /* Get rid of array. */
  free(stackP->contents);
 
  stackP->contents = NULL;
  stackP->maxSize = 0;
  stackP->top = -1;  /* I.e., empty */
}
  1. StackIsEmpty() and StackIsFull():
Let's look at the functions that determine emptiness and fullness. Now, it's not necessary to pass a stack by reference to these functions, since they do not change the stack. So, we could prototype them as:
int StackIsEmpty(stackT stack);
int StackIsFull(stackT stack);
However, then some of the stack functions would take pointers (e.g., we need them for StackInit(), etc.) and some would not. It is more consistent to just pass stacks by reference (with a pointer) all the time. Furthermore, if the struct stackT is large, passing a pointer is more efficient (since it won't have to copy a big struct).
So, our prototypes will be:
int StackIsEmpty(stackT *stackP);
int StackIsFull(stackT *stackP);

Emptiness

Now, testing for emptyness is an easy operation. We've said that the top field is -1 when the stack is empty. Here's a simple implementation of the function...
int StackIsEmpty(stackT *stackP)
{
  return stackP->top < 0;
}

Fullness

Testing for fullness is only slightly more complicated. Let's look at an example stack.
Suppose we asked for a stack with a maximum size of 1 and it currently contained 1 element (i.e., it was full)...
stack (made up of 'contents' and 'top')
-----    -----
| A |    | 0 |
-----    -----
  0       top
contents
We can see from this example that when the top is equal to the maximum size minus 1 (e.g., 0 = 1 - 1), then it is full. Thus, our fullness function is...
int StackIsFull(stackT *stackP)
{
  return stackP->top >= stackP->maxSize - 1;
}
This illustrates the importance of keeping the maximum size around in a field like maxSize.
  1. StackPush():
Now, pushing onto the stack requires the stack itself as well as something to push. So, its prototype will look like:
void StackPush(stackT *stackP, stackElementT element);
The function should place an element at the correct position in the contents array and update the top. However, before the element is placed in the array, we should make sure the array is not already full...Here is the body of the function:
void StackPush(stackT *stackP, stackElementT element)
{
  if (StackIsFull(stackP)) {
    fprintf(stderr, "Can't push element on stack: stack is full.\n");
    exit(1);  /* Exit, returning error code. */
  }
 
  /* Put information in array; update top. */
 
  stackP->contents[++stackP->top] = element;
}
Note how we used the prefix ++ operator. It increments the top index before it is used as an index in the array (i.e., where to place the new element).
Also note how we just reuse the StackIsFull() function to test for fullness.
  1. StackPop():
Finally, popping from a stack only requires a stack parameter, but the value popped is typically returned. So, its prototype will look like:
stackElementT StackPop(stackT *stackP);
The function should return the element at the top and update the top. Again, before an element is removed, we should make sure the array is not empty....Here is the body of the function:
stackElementT StackPop(stackT *stackP)
{
  if (StackIsEmpty(stackP)) {
    fprintf(stderr, "Can't pop element from stack: stack is empty.\n");
    exit(1);  /* Exit, returning error code. */
  }
 
  return stackP->contents[stackP->top--];
}
Note how we had the sticky problem that we had to update the top before the function returns, but we need the current value of top to return the correct array element. This is accomplished easily using the postfix -- operator, which allows us to use the current value of top before it is decremented.
  1. Stack module:
Finally, don't forget that we are putting this stack in its own module. The stack types and function prototypes should go in stack.h. The stack function definitions should go in stack.c.
People that need to use the stack must include stack.h and link their code with stack.c (really, link their code with its object file, stack.o).
Finally, since we wrote the types and functions for a stack, we know how to use a stack. For example, when you need stacks, declare stack variables:
stackT s1, s2;
When you pass a stack to stack functions, pass it by reference:
StackInit(&s1, 10);
StackPush(&s1, 'Z');
  1. Testing the implementation:
We've written the complete stack module for you (stack.h and stack.c). Also, here's a sample main program stacktest.c and a Makefile.
The program demonstrates one use of the ordering produced by a stack...What is it?

BU CAS CS - Stack - Array Implementation
Copyright © 1993-2000 by Robert I. Pitts <rip@bu.edu> All Rights Reserved.

No comments:

Post a Comment

Write your openion about my blog spot..To get automatic facebook updates like my Pagehttps://www.facebook.com/shivashankar4u ..It takes only 1 min to write the comment and to like the page.. Thanks.