Exploring HashSet: A Dive into Unordered Collections

Arshi Saxena - Oct 14 - - Dev Community

Introduction

The HashSet class is part of the Java Collections Framework, providing a fast, unordered collection that doesn't allow duplicate elements. It is built on top of the HashMap, meaning it inherits the same time complexity benefits but focuses purely on element uniqueness. In this article, we’ll explore how HashSet works, what makes it unique, and why it's different from other collections.


What is a HashSet?

A HashSet is:

  • Unordered: The elements have no predictable sequence.
  • Unique: Duplicate elements are ignored.
  • Internally backed by a HashMap: It uses a HashMap to store elements, focusing only on the keys while discarding values.
  • O(1) average time complexity: Operations like insertion, deletion, and search are highly efficient.

1. Initializing a HashSet

// Parameterized constructor with initial capacity
Set<Integer> setWithInitialCapacity = new HashSet<>(5);

// Parameterized constructor using a collection
Set<Integer> setWithCollection = new HashSet<>(Arrays.asList(4, 4, 3));

// Default constructor with default capacity 16
Set<Integer> set = new HashSet<>();
Enter fullscreen mode Exit fullscreen mode

Explanation:

  • Default Constructor: Creates a HashSet with an initial capacity of 16.
  • Parameterized Constructor: You can specify the initial capacity, but note that capacity isn’t the same as size. Size refers to the actual number of elements in the set.
  • Using a Collection: A HashSet can be created from a collection like a list, ensuring that only unique elements are retained.

2. Adding Elements to a HashSet

set.add(1);
set.add(2);
set.add(1); // Duplicate value is ignored
System.out.println(set); // Output -> [1, 2]
Enter fullscreen mode Exit fullscreen mode

Explanation:

  • The add() method inserts elements into the HashSet.
  • Duplicate elements are ignored. When you try to add 1 twice, only the first occurrence is retained.

Key Takeaway

If you need to replace duplicate values instead of ignoring them, HashSet won’t be the right choice. This is because it prioritizes element uniqueness.


3. Checking Size vs Capacity

// Parameterized constructor with initial capacity
Set<Integer> setWithInitialCapacity = new HashSet<>(5);
System.out.println(setWithInitialCapacity.size()); // Output -> 0
Enter fullscreen mode Exit fullscreen mode

Even though the capacity of setWithInitialCapacity is 5, the size is 0 because size reflects the number of elements present in the set, not the initial capacity. You can think of capacity as the internal storage space, which adjusts as elements are added.


4. Using HashSet with Collections

// Parameterized constructor using a collection
Set<Integer> setWithCollection = new HashSet<>(Arrays.asList(4, 4, 3));
System.out.println(setWithCollection); // Output -> [3, 4] or [4, 3]
Enter fullscreen mode Exit fullscreen mode

Explanation:

  • Although three elements were provided in the list (4, 4, 3), the duplicate value 4 was discarded, leaving only two elements (3 and 4).
  • The order of elements is unpredictable because HashSet doesn’t maintain any insertion or natural ordering.

If you need to retain sorted elements, consider using a TreeSet, which ensures elements are arranged in ascending order.


5. Indexing in HashSet – Is it Possible?

In interviews, a common question is whether you can retrieve an index of an element in a HashSet. The answer is No, because HashSet uses a hashing mechanism to store elements, not an index-based structure like a list or an array.


Summary of Key Points

  1. Unordered and Unique: HashSet only retains unique elements, ignoring duplicates.
  2. Built on HashMap: It uses the keys of an internal HashMap to store elements.
  3. Fast Operations: Average time complexity is O(1) for adding, removing, and checking elements.
  4. Capacity vs Size: Capacity is the allocated space, while size is the actual number of elements.
  5. No Indexing: You cannot retrieve elements by index due to the hashing mechanism.

Relation to HashMap

Since HashSet is backed by a HashMap, it uses the keys of the map to store elements, while the values are irrelevant. This is why every element in a HashSet must be unique, just like the keys in a HashMap.


Conclusion

HashSet is a powerful tool when you need a fast, unordered collection that avoids duplicates. While it offers O(1) time complexity for most operations, it lacks features like sorting and indexing. For developers, knowing how HashSet relates to HashMap helps to understand its inner workings and make better use of the collections framework.

In the next post, we’ll explore a common interview question frequently asked in interviews to test candidates' knowledge of collections concepts.


Related Posts

Happy Coding!

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .