Heap Sort

Paul Ngugi - Jul 17 - - Dev Community

A heap sort uses a binary heap. It first adds all the elements to a heap and then removes the largest elements successively to obtain a sorted list. Heap sorts use a binary heap, which is a complete binary tree. A binary tree is a hierarchical structure. It either is empty or it consists of an element, called the root, and two distinct binary trees, called the left subtree and right subtree. The length of a path is the number of the edges in the path. The depth of a node is the length of the path from the root to the node.

A binary heap is a binary tree with the following properties:

  • Shape property: It is a complete binary tree.
  • Heap property: Each node is greater than or equal to any of its children.

A binary tree is complete if each of its levels is full, except that the last level may not be full and all the leaves on the last level are placed leftmost. For example, in Figure below, the binary trees in (a) and (b) are complete, but the binary trees in (c) and (d) are not complete. Further, the binary tree in (a) is a heap, but the binary tree in (b) is not a heap, because the root (39) is less than its right child (42).

Image description

Heap is a term with many meanings in computer science. In this chapter, heap means a binary heap. A heap can be implemented efficiently for inserting keys and for deleting the root.

Storing a Heap

A heap can be stored in an ArrayList or an array if the heap size is known in advance. The heap in Figure below (a) can be stored using the array in Figure below (b). The root is at position 0, and its two children are at positions 1 and 2. For a node at position i, its left child is at position 2i + 1, its right child is at position 2i + 2, and its parent is (i - 1)/2. For example, the node for element 39 is at position 4, so its left child (element 14) is at 9 (2 * 4 + 1), its right child (element 33) is at 10 (2 * 4 + 2), and its parent (element 42) is at 1 ((4 - 1)/2).

Image description

Adding a New Node

To add a new node to the heap, first add it to the end of the heap and then rebuild the tree as follows:

Let the last node be the current node;
while (the current node is greater than its parent) {
Swap the current node with its parent;
Now the current node is one level up;
}

Suppose a heap is initially empty. That heap is shown in Figure below, after adding numbers 3, 5, 1, 19, 11, and 22 in this order.

Image description

Now consider adding 88 into the heap. Place the new node 88 at the end of the tree, as shown in Figure below (a). Swap 88 with 19, as shown in Figure below (b). Swap 88 with 22, as shown in Figure below (c).

Image description

Removing the Root

Often you need to remove the maximum element, which is the root in a heap. After the root is removed, the tree must be rebuilt to maintain the heap property. The algorithm for rebuilding the tree can be described as follows:

Move the last node to replace the root;
Let the root be the current node;
while (the current node has children and the current node is
smaller than one of its children) {
Swap the current node with the larger of its children;
Now the current node is one level down;
}

Figure below shows the process of rebuilding a heap after the root 62 is removed from Figure in Storing a Heap (a). Move the last node, 9, to the root, as shown in Figure below (a). Swap 9 with 59, as shown in Figure below (b); swap 9 with 44, as shown in Figure below (c); and swap 9 with 30, as shown in Figure below (d).

Image description

Figure below shows the process of rebuilding a heap after the root, 59, is removed from Figure above (d). Move the last node, 17, to the root, as shown in Figure below (a). Swap 17 with 44, as shown in Figure below (b), and then swap 17 with 30, as shown in Figure below (c).

Image description

The Heap Class

Now you are ready to design and implement the Heap class. The class diagram is shown in Figure below. Its implementation is given in code below.

Image description

package demo;

public class Heap<E extends Comparable<E>> {
    private java.util.ArrayList<E> list = new java.util.ArrayList<>();

    /** Create a default heap */
    public Heap() {
    }

    /** Create a heap from an array of objects */
    public Heap(E[] objects) {
        for(int i = 0; i < objects.length; i++)
            add(objects[i]);
    }

    /** Add a new object into the heap */
    public void add(E newObject) {
        list.add(newObject); // Append to the heap
        int currentIndex = list.size() - 1; // The index of the last node

        while(currentIndex > 0) {
            int parentIndex = (currentIndex - 1) / 2;
            // Swap if the current object is greater than its parent
            if(list.get(currentIndex).compareTo(list.get(parentIndex)) > 0) {
                E temp = list.get(currentIndex);
                list.set(currentIndex, list.get(parentIndex));
                list.set(parentIndex, temp);
            }
            else
                break; // The tree is a heap now

            currentIndex = parentIndex;
        }
    }

    /** Remove the root from the heap */
    public E remove() {
        if(list.size() == 0) return null;

        E removedObject = list.get(0);
        list.set(0, list.get(list.size() - 1));
        list.remove(list.size() - 1);

        int currentIndex = 0;
        while(currentIndex < list.size()) {
            int leftChildIndex = 2 * currentIndex + 1;
            int rightChildIndex = 2 * currentIndex + 2;

            // Find the maximum between two children
            if(leftChildIndex >= list.size()) break; // The tree is a heap
            int maxIndex = leftChildIndex;
            if(rightChildIndex < list.size()) {
                if(list.get(maxIndex).compareTo(list.get(rightChildIndex)) < 0) {
                    maxIndex = rightChildIndex;
                }
            }

            // Swap if the current node is less than the maximum
            if(list.get(currentIndex).compareTo(list.get(maxIndex)) < 0) {
                E temp = list.get(maxIndex);
                list.set(maxIndex, list.get(currentIndex));
                list.set(currentIndex, temp);
                currentIndex = maxIndex;
            }
            else
                break; // The tree is a heap
        }

        return removedObject;
    }

    /** Get the number of nodes in the tree */
    public int getSize() {
        return list.size();
    }
}

Enter fullscreen mode Exit fullscreen mode

A heap is represented using an array list internally (line 4). You can change the array list to other data structures, but the Heap class contract will remain unchanged.

The add(E newObject) method (lines 17–34) appends the object to the tree and then swaps the object with its parent if the object is greater than its parent. This process continues until the new object becomes the root or is not greater than its parent.

The remove() method (lines 37–70) removes and returns the root. To maintain the heap property, the method moves the last object to the root position and swaps it with its larger child if it is less than the larger child. This process continues until the last object becomes a leaf or is not less than its children.

Sorting Using the Heap Class

To sort an array using a heap, first create an object using the Heap class, add all the elements to the heap using the add method, and remove all the elements from the heap using the remove method. The elements are removed in descending order. The code below gives a program for sorting an array using a heap.

Image description

Heap Sort Time Complexity

Let us turn our attention to analyzing the time complexity for the heap sort. Let h denote the height for a heap of n elements. The height of a heap is the number of nodes in the longest path from the root to a leaf node. Since a heap is a complete binary tree, the first level has 1 node,
the second level has 2 nodes, the kth level has 2^(k - 1) nodes, the (h - 1) level has 2^(h - 2) nodes, and the hth level has at least 1 and at most 2^(h - 1) nodes. Therefore,

Image description

Since the add method traces a path from a leaf to a root, it takes at most h steps to add a new element to the heap. Thus, the total time for constructing an initial heap is O(n logn) for an array of n elements. Since the remove method traces a path from a root to a leaf, it takes at most h steps to rebuild a heap after removing the root from the heap. Since the remove method is invoked n times, the total time for producing a sorted array from a heap is O(n logn).

Both merge sorts and heap sorts require O(n logn) time. A merge sort requires a temporary array for merging two subarrays; a heap sort does not need additional array space. Therefore, a heap sort is more space efficient than a merge sort.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .