LeetCode 30 - Substring with Concatenation of All Words - Python


Question:


You are given a string s and an array of strings words. All the strings of words are of the same length.


A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.


For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated string because it is not the concatenation of any permutation of words.


Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.


Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]


Output: [0,9]


Explanation:


The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.

The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.

The output order does not matter. Returning [9,0] is fine too.


Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]


Output: []


Explanation:


There is no concatenated substring.


Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]


Output: [6,9,12]


Explanation:


The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].

The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].

The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].


Answer:


We will be given a string s and an array of strings words. The ask is to combine every element of the words array, in any order to make a long string and check if such string exists in the given string s. If it does, then we need to return the starting index of the combined string in the s.


For e.g., if s = barfoothefoobarman and words = ["foo","bar"], we have two concatenated permutations of strings in the words array which are foobar and barfoo.


Now, we need to search both these permutations in s and return their starting indices. In our case, we will return indices list containing 0 and 9.


It is important to note that all the strings in the words array will should be present in the combined string to be searched in s. If any string is not present in s, then we will return empty list.


Also, note that every string in words is of same length.


Approach:


The main rule we need to follow is that all the strings in the 'words' array must be part of the combination we're searching for. Even if a string is repeated in the array, it should be considered that many times in the combination.


For instance, if 'words' is ["foo", "foo"], then we should search for "foofoo" in the given string 's'. To handle this, we can use a Map to store the count of each string in the 'words' array.


Since all strings in 'words' are of equal length, we can calculate the length of the combination we need to search for in 's' using this formula:


wordLength = words[0].length()

wordArrayLength = wordLength * words.length


Now, we'll search for a string of length 'wordArrayLength' in 's'. In each iteration, we'll:


1. Get a substring of length 'wordArrayLength'.

2. Break this substring into further substrings of length 'wordLength'.

3. Store the count of these substrings found into another map.

4. Finally, we'll compare both maps. If they're equal, it means all strings from 'words' are present in the substring, so we'll add the current index to the result list.


Why are we comparing maps? Because if the combined string exists in 's', then the counts of all strings in 'words' will match the counts of all the partitions in the substring of length 'wordArrayLength'.


Time Complexity:

Since we are scanning every string in words and every character in s, the time complexity will be O(mn), where m => length of words and n => length of s


Space Complexity:

We are using two maps to store the contents of words and partitions of substrings of s therefore, the space complexity will be O(m + n).


Code:


class SubstringWithConcatenationOfAllWords:

    def findSubstring(self, s: str, words: List[str]) -> List[int]:

        # Resultant list

        indices = []

        # Base conditions

        if s is None or len(s) == 0 or words is None or len(words) == 0:

            return indices


        # Dictionary to store the count of each word in the words array

        wordCount = dict()

        # Loop to store count of each word in the array

        for i in range(len(words)):

            if words[i] in wordCount:

                wordCount[words[i]] += 1

            else:

                wordCount[words[i]] = 1

        # Length of each word in the words array

        wordLength = len(words[0])

        # Total length of all the words in the array

        wordArrayLength = wordLength * len(words)

        # Loop for each character in the string

        for i in range(0, len(s) - wordArrayLength + 1):

            # Get the current string

            current = s[i:i + wordArrayLength]

            # Map to store the count of each word in the current

            wordMap = dict()

            # Index to loop through the array

            index = 0

            # Index to partition the current string

            j = 0

            # Loop through the words array

            while index < len(words):

                # Partition the string

                part = current[j: j + wordLength]

                # Save this in wordMap

                if part in wordMap:

                    wordMap[part] += 1

                else:

                    wordMap[part] = 1

                # Update the indices

                j += wordLength

                index += 1

            # Compare the two maps

            if wordMap == wordCount:

                indices.append(i)

        return indices