Hey There 👋🏻
We've previously talked about Arrays and Lists in this series, in this post, we'll dissect an interesting data structure, which is the HashSet data structure. This structure is useful in many cases and it does come up often times in coding problems.
Check out the end of this post for some practice problems that involve using the HashSet data structure for solving them.
Let's get started 🚀
we'll kick things off by asking the question, What's a HashSet?
A hash set is a collection that stores unique elements, meaning that there can't be duplicates inside of it. This data structure uses hashing as a mechanism for storing and retrieving the elements.
By the way, hashing is the process of assigning distinct codes for elements, the unique code is used to store and retrieve the element.
We can perform operations such as adding/removing elements, checking if a set contains an element, and a few set-based operations which we'll look at down the line.
HashSets Vs Lists
Both data structures are generic collections, meaning they're type-safe and are great options for storing items of some type, however, you may want to consider the distinguishing points in which they differ:
In lists, the elements are ordered, in hash sets, they're not
Lists allow duplication, hash sets don't
In short 🩳, lists are ordered collections while hash sets are not ordered
How do we create a HashSet?
There're two ways for creating a hash set, we can either create an empty hash set that we can add elements to later on, or create a hash set from an existing collection, we'll look at both ways.
Creating an empty hash set
HashSet<int> numberSet = new();
HashSet<int> numberSet = new HashSet<int>();
var numberSet = new HashSet<int>();
All of the above statements are valid ways to creating an empty hash set.
But now our hash set doesn't have anything, let's see how we can add elements to it.
numberSet.Add(5);
numberSet.Add(10);
These two lines add two values to our number set, the values are 5 and 10, we'll use a foreach loop to go print out the values to the console:
foreach (int number in numberSet)
{
Console.Write("{0}\t", number);
}
//OUTPUT: 5 10
That's kind of lame though, let's look at the second method of creating a hash set.
Creating a hash set from a collection
We can use the HashSet Class Constructor, which takes an IEnumerable as a parameter, and builds a hash set using the values inside the parameter:
List<int> numbers = new() {5, 10, 5, 15, 15, 10};
HashSet<int> uniqueNumbers = new(numbers);
The previous code snippet creates a list of integers, with 6 elements in it, then we create a hash set called uniqueNumbers
that uses the numbers
list as a parameter for its constructor, the resulting set is a set with 3 values, 5, 10, and 15. Why?
Because the set has removed all the duplicates and only stored the unique elements, you can now use a foreach loop to look at the elements of the set.
Removing elements from a hash set
We say how we can add elements to a set in a code snippet before, removing an element is no different than adding, we call the Remove();
method and pass the element we want to remove, let's look at an example:
HashSet<string> values = new HashSet<string> {"Value1", "Value2", "Value3"};
values.Remove("Value2");
The previous code creates a hash set of type string, and stores 3 values in it, we're then calling the remove method to remove the value "Value2", if you now print out the values in the set, you'll see that Value2 is in fact gone.
How to check if an element exists in a set
Sometimes we may want to search for a particular element in a hash set to see wether it does exist or not, for that purpose, we have the Contains();
method, it takes an element, and searches for it, if it finds it, it returns true, and false is returned otherwise, we're not going to look at a coding example, because I think it's really straightforward.
How to remove all elements from a set
To do that, we can use the Clear();
method, this method will remove all elements from a given set.
HashSet<string> values = new HashSet<string> {"Value1", "Value2", "Value3"};
values.Clear();
How to get the number of elements in a set
We may sometimes want to check if the set contains any elements, like what's the total number of elements contained in the Hashset, for that, we can use the Count
property, which as the name implies, counts the number of elements that exist in the set and return it.
Let's talk about some interesting functionality now
We've just discussed some of the fundamental operations that could be done with hash sets, now, we'll look at some really cool yet useful methods that could be used with hash sets.
Intersect
Say we have 2 sets, each one contains some unique values, and we want to get the values that are common/shared between the 2 sets, we can use the IntersectWith();
method, this method returns the values that are common between the set on which we're calling the IntersectWith
method, and the other set which we have to pass as an argument to the method, let's look at some code so the picture becomes clearer:
HashSet<char> firstSet = new HashSet<char> { 'b', 'A', 'c', 'R', 'n' };
HashSet<char> secondSet = new HashSet<char> { 'A', 'x', 'q', 'b', 'c' };
HashSet<char> commonChars = new HashSet<char>(firstSet);
commonChars.IntersectWith(secondSet);
foreach (var c in commonChars)
{
Console.Write($"{c,-5}"); //OUTPUT: b A c
}
We first begin by creating 2 sets, and initialize them we some random chars, and then I'm creating a new set called commonChars
, and I'm passing the first set to the hash set constructor, at first, this set will include the unique chars from the first set, but then when I make the call to IntersectWith
and I pass the second set as an argument, the commonChars
set will contain the commons between the 2 sets.
Union
We've seen how we can find the intersection between 2 sets, what if we want to combine 2 sets, we can do that too, using the UnionWith();
method, this method takes 2 sets, and combine them together while also keeping the duplicates out because you must recall that hash sets store only unique elements inside them, we'll use the sets from the previous example to see what we get as a result:
HashSet<char> firstSet = new HashSet<char> { 'b', 'A', 'c', 'R', 'n' };
HashSet<char> secondSet = new HashSet<char> { 'A', 'x', 'q', 'b', 'c' };
HashSet<char> union = new HashSet<char>(firstSet);
union.UnionWith(secondSet);
foreach (var c in union)
{
Console.Write($"{c,-5}"); //OUTPUT: b A c R n x q
}
Everything here is exact to what we had before, except for the name of the new set, and the method we're calling. You can see from the output that our new set union
is keeping hold of chars from both the sets, while also not keeping any duplicates.
Your practice question
For this post, the practice question will be slightly different, I'll tell you a story with the question included, and you'll have to figure out a solution for the core mechanic needed to be built.
Make sure to check the end of the post for the test data
An insurmountable(sorta) programming assignment
This post has been drafted for more than 3 weeks, I had my first semester college assignments and I'd gotten really busy hence the delay of this post which is part of the Data Structures Series I was doing. So I had a programming assignment, it was a poorly laid out assignment and figuring out what the assignment wanted was itself a hurdle, this aside, the assignment had insane constraints, only things allowed were loops, if statements and ASCII Codes, no built-in methods and no user defined methods, the main purpose was actually getting a user Id and storing the unique chars from it, and also generate a random string for a given length n, then the user must be prompted to type in the unique chars that are common between the question and the user Id, so, my question to you, with the constraints aside, how do you think we could do that? Using what we've done and learned from this post!
Now the assignment had much more than that, which was meant to be implemented manually, like ask the user to type in the number of questions they want to be asked, or the length of each question, that may sound simple and straightforward, but since no built-in methods were allowed, it was pretty painful doing every checking manually from the ground up, but for your practice question, I just want you to figure out the core mechanic that made up the whole assignment, that is finding the correct answer i.e the common chars between the question unique chars and the user id unique chars.
Constraint: The correct answer must have the chars appear in the order in which the chars appear in the question.
Test Data
1:
User ID: Rasheed Mozaffar
ID Distincts: RashedMozfr
Question: PnCCfRB3539ZvO3X69Q0rvJUU
Answer/Output: fRr
2:
User ID: Tony
ID Distincts: Tony
Question: b390cD45qn1y9sb
Answer/Output: ny
And here you go, the fundamentals of hash sets and a question to answer, have a go it, and remember, have fun!
...