DAY 38 - Alien Dictionary

Nayan Pahuja - Jul 11 '23 - - Dev Community

Hey everyone! Welcome back to another day of problem-solving. It's day 38, and I'm still going strong on my quest for consistency. Today, we have an interesting problem to tackle. But before we dive in, let's make sure we're all on the same page.

Incase you are unfamiliar with topological sort in graphs, I suggest reading the previous article 37 to understand working behind topological sort.

Question: Alien Dictionary

Given a sorted dictionary of an alien language having N words and k starting alphabets of standard dictionary. Find the order of characters in the alien language.

Note: Many orders may be possible for a particular test case, thus you may return any valid order and output will be 1 if the order of string returned by the function is correct else 0 denoting incorrect string returned.


Example 1:

  • Input:
  • N = 5, K = 4
  • dict = {"baa","abcd","abca","cab","cad"}
  • Output: 1
  • Explanation: Here order of characters is 'b', 'd', 'a', 'c' Note that words are sorted and in the given language "baa" comes before "abcd", therefore 'b' is before 'a' in output.

Example 2:

  • Input:
  • N = 3, K = 3
  • dict = {"caa","aaa","aab"}
  • Output: 1
  • Explanation: Here order of characters is 'c', 'a', 'b' Note that words are sorted and in the given language "caa" comes before "aaa", therefore 'c' is before 'a' in output.

Constraints:

  • 1 ≤ N, M ≤ 300
  • 1 ≤ K ≤ 26
  • 1 ≤ Length of words ≤ 50

Intuition:

Let's start by breaking down the question first.
We are given a new dictionary where the orders of alphabetical characters is not the same as it is in our English dictionary.
We follow the character ordering as 'a' -> 'b' -> 'c'........->'z'.
We are given a new dictionary and we have to find it's character order. To help us out they have told us that the dictionary is sorted(like it should be) means if we can differentiate the characters from one word to it's next word, we can find one thing, that in the that specific order it comes first.

Let's check out an example first.

Say in Example 1: baa comes before abcd.
If we compare both words character by character we can obtain from our observation that 'b' comes before 'a'. So we store that.
Next if we check 'abcd' comes before 'abca' ,hence 'd' > 'a'.
Similarly as we go ahead we get 'a' > 'c'

Our order at the end will be:
'b' -> 'd' -> 'a' -> 'c'

Hence we can see it's like topological sort where one must come before another. At the end we need to return the order that for that graph. We can apply that here.

Algorithm:

Hey there! Let's dive into the algorithm for solving the problem at hand. We need to follow a few steps to create the adjacency list for the graph and perform a topological sort. I'll break down the steps for you:

Step 1: Creating the Adjacency List

  • We'll start by running a loop from the starting index up to the second last index. This allows us to check the current index (let's call it s1) and the next index (s2).
  • Inside this loop, we'll compare two words: the word at the current index (s1) and the word at the next index (s2). To compare them, we'll run another loop up to the length of the smaller string.
  • Within this second loop, if we find any inequality at a particular index (s1[i] != s2[i]), we'll add them to the adjacency list as an edge from s1[i] to s2[i], considering them as numbers (subtracting 'a' from them). Once we find this differing character, we'll immediately break out of the loop.
  • This process allows us to find the first differentiating character for adjacent strings and create the graph. So at the end of this step, we'll have our adjacency list ready.

Step 2: Topological Sort

  • Moving on, we'll calculate the indegree of each node in the graph and store it in an indegree array. We can iterate through the adjacency list and for every node u->v, we'll increase the indegree of v by 1 in the indegree array.
  • Initially, there will always be at least one node with an indegree of 0. We'll push such nodes into a queue.
  • Next, we'll start popping nodes from the queue. For each popped node, we'll add it to our answer array. Then, for all its adjacent nodes, we'll decrease their indegree by 1 in the indegree array. For example, if node u (popped from the queue) has an edge towards node v (u->v), we'll decrease indegree[v] by 1.
  • If, after decrementing the indegree, any node's indegree becomes 0, we'll push that node back into the queue.
  • We'll repeat these steps until the queue is completely empty, ensuring we traverse the entire graph in a Breadth-First Search (BFS) manner. By the end, we'll have the linear ordering of nodes in our answer array.

Step 3: Finalizing the Answer

  • Finally, to obtain the final answer, we'll iterate over the elements in the answer array. We'll add each element to a final string, converting it back into a character by adding 'a' to each of them.
  • After this iteration, we'll have our final string ready.
  • We can return this string as our ultimate answer, solving the problem at hand.
  • That's it! Following these steps, we can create the adjacency list, perform a topological sort using BFS, and obtain the final answer as a string. I hope this breakdown helps you understand the algorithm better.

Code:

// User function Template for C++

class Solution{
    public:
    vector<int> topoSort(int V, vector<int> adj[]) 
    {
       vector<int> indegree(V,0);
       for(int i = 0; i < V; i++){
           for(auto it: adj[i]){
               indegree[it]++;
           }
       }
       queue<int> q;
       for(int i = 0;  i < V; i++){
           if(indegree[i] == 0){
               q.push(i);
           }
       }
       vector<int> res;
       while(!q.empty()){
           int node = q.front();
           q.pop();
           res.push_back(node);

           for(auto it: adj[node]){
               indegree[it]--;
               if(indegree[it] == 0){
                   q.push(it);
               }
           }
       }
       return res;
    }
    string findOrder(string dict[], int N, int K) {
        vector<int> adj[K];
        //step to get adjacency list
        for(int i = 0; i < N-1; i++){
            string s1 = dict[i];
            string s2 = dict[i+1];
            int len = min(s1.size(),s2.size());
            for(int j = 0; j < len; j++){
                if(s1[j] != s2[j]){
                    adj[s1[j] - 'a'].push_back(s2[j] - 'a');
                    break;
                }
            }
        }
       vector<int> res = topoSort(K,adj);
       string ans = "";
       for(auto it : res){
           ans  = ans + char(it + 'a');
       }
       return ans;

    }
};
Enter fullscreen mode Exit fullscreen mode

Complexity Analysis:

  • Time: O(N*L)
  • Space: O(K)

Feel free to let me know if you have any questions or need further clarification. Happy problem-solving!

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .