Maximum Length of a Concatenated String with Unique Characters

Microsoft Online Assessment Maximum Length of a Concatenated String Length with unique Characters

Given an array of strings arr. String s is a concatenation of a sub-sequence of arr which have unique characters. Return the maximum possible length of s.

Also See: Microsoft Online Assessment Questions and Solution

Example 1:

Input: arr = ["un","iq","ue"]
Output: 4
Explanation: All possible concatenations are "","un","iq","ue","uniq" and "ique".
Maximum length is 4.

Example 2:

Input: arr = ["cha","r","act","ers"]
Output: 6
Explanation: Possible solutions are "chaers" and "acters".

Example 3:

Input: arr = ["abcdefghijklmnopqrstuvwxyz"]
Output: 26

Constraints:

  • 1 <= arr.length <= 16
  • 1 <= arr[i].length <= 26
  • arr[i] contains only lower case English letters.

Solution:

We have a given array of strings. We have to find all possible combinations of these strings. We have one limitation — we can’t combine strings which have the same letters, in other words we can combine strings only with unique letters. It easy to see that we need to invent some way of fast check if two strings have the same letters. And it’s easy to see that length of any result string cannot exceed length of the alphabet, in our case 26 letters.

We can make a map where given strings will be the values and each string will have a sorted array of letters which used in this string as a key.

For example array of strings “coco”,”dodo”,”interactive” will have following look as a map:

Maximum Length of a Concatenated String with Unique Characters

It is easy to compare sorted letters and find if there are common letters in two strings.

But may be there is a way to speedup this comparison?

If each letter can appears only one time in the result string then maximum result string is the alphabet and has length 26 letters. We can reflect the alphabet into a bitset. If a letter appears in the string the corresponding bit is set to 1.

Word “interactive” as a bitset will looks like this:

Concatenated String Length with unique Characters

26 bits is less than a register of CPU, so in ideal case, we can compare two bitsets in one CPU command. Nothing can be faster.

Therefore the final algorithm should looks like this:

  1. Iterate the input strings, skipping the strings that have duplicate characters.
  2. For each string with unique chars, check if it has the same chars as the result string.
  3. If they have intersection of characters, we skip it.
  4. If not, we append this new combination to the result.
  5. return length of the result string.

We should not work with the strings directly. We can convert them into the bitsets and work with the bits only. It significantly decrease memory consumption and increase speed of the program.

Program C++:

#include <iostream>
#include <vector>
#include <bitset>

using namespace std;

int solution(const vector<string>& v) {
    // We will keep in this vector bitmaps of used letters
    // for the processed words.
    // Add one empty bitset for comparison with the first
    // processed word. It makes the algorithm a bit shorter
    vector<bitset<26>> char_bits_vector = {bitset<26>()};
    int result = 0;
    // for each string in the vector make a bitset where all
    // bits corresponding to characters in alphabet are set.
    for (auto& str : v) {
        bitset<26> char_bits;
        // set bits corresponding to chars in the string.
        for (char c : str) { char_bits.set(c - 'a'); }
        // How many bits were set.
        int bit_num = char_bits.count();
        // the string contains duplicate characters so don't process it
        if (bit_num < str.size()) continue;

        // Check if current word has common letters with already processed words
        for (int i = char_bits_vector.size() - 1; i >= 0; --i) {
            auto& c_b = char_bits_vector[i];
            // if two bitsets have common 1 bits i.e.
            // if two words have common letters don't process current word
            if ((c_b & char_bits).any()) continue;
            // if current word has unique letters add to the vector a bitset where
            // all bits corresponding to letters of the current word are set to 1.
            char_bits_vector.push_back(c_b | char_bits);
            // add length of the current word to the result
            result = max<int>(result, c_b.count() + bit_num);
        }
    }
    return result;
}

int main() {

    cout << solution({"co","dil","ity"}) << " Expected 5" << endl;

    return 0;
}

Program Python:

Explanation

  • Basic idea is to turn arr to [(set(word), len(word))] if word has no repeat letter
  • Then we can use union (&) operation to check whether length of set after union is same as the sum of two size
    • if (union_len:=len(union:=arr[i][0] | cur_s)) == arr[i][1] + cur_l]
    • if len(arr[i][0] | cur_s) == arr[i][1] + cur_l same as above
  • Due to the small data sacle (max length of arr is 16), use backtracking to try out all possbilities
  • Maintain a maximum length with ans
class Solution:
    def maxLength(self, arr: List[str]) -> int:
        arr = [(s, l) for word in arr if (l:=len(s:=set(word))) == len(word)]
        ans, n = 0, len(arr)
        def dfs(arr, cur_s, cur_l, idx):
            nonlocal ans, n
            ans = max(ans, cur_l)
            if idx == n: return 
            [dfs(arr, union, union_len, i+1) for i in range(idx, n) if (union_len:=len(union:=arr[i][0] | cur_s)) == arr[i][1] + cur_l]
        dfs(arr, set(), 0, 0)                    
        return ans

Program Python: Code 2

from typing import List
def maxLength(arr: List[str]) -> int:
      maxlen = 0
      unique = ['']
      def isvalid(s):
          return len(s) == len(set(s))
      for word in arr:
          for j in unique:
              tmp = word + j
              if isvalid(tmp):
                  unique.append(tmp)
                  maxlen = max(maxlen, len(tmp))
      return maxlen
if __name__ == "__main__":
    word_list = input().split()
    print(maxLength(word_list))

Program Java:

import java.util.*;
import java.util.stream.Collectors;
class Solution {
    public static int maxLength(String[] args) {
      
        List<String> arr = Arrays.asList(args);
        int maxLen = 0;
        arr = arr.stream().filter(str-> isUnique(str)).collect(Collectors.toList());
        Map<String,Integer> mem = new HashMap<>();
        maxLen = dfs(arr, "", 0, maxLen, mem);
        return maxLen;
      }
      private static int dfs(List<String> arr, String path, int i, int maxLen, Map<String, Integer> mem) {
            if (mem.get(path) != null) return mem.get(path);
            boolean pathIsUnique = isUnique(path);
            if (pathIsUnique) {
                maxLen = Math.max(path.length(), maxLen);
            }
            if (i == arr.size() || !pathIsUnique) {
                mem.put(path, maxLen);
                return maxLen;
            }
            for (int j = i; j < arr.size(); j++) {
                maxLen = dfs(arr, path + arr.get(j), j + 1, maxLen, mem);
            }
            mem.put(path, maxLen);
            return maxLen;
      }
      public static boolean isUnique(String str) {
          char[] characters = str.toCharArray();
          if (characters != null) Arrays.sort(characters);
          for (int i = 0; i < characters.length - 1; i++) {
              if (characters[i] == characters[i + 1]) return false;
          }
          return true;
    }
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        String[] wordList = scanner.nextLine().split(" ");
        scanner.close();
        System.out.println(maxLength(wordList));
    }
}

Program Javascript:

const readline = require('readline');
function maxLength(arr) {
    let maxLen = 0;
    arr = arr.filter(isUnique);
    const mem = {};
    maxLen = dfs(arr, "", 0, maxLen, mem);
    function dfs(arr, path, i, maxLen, mem) {
        if (mem[path]) return mem[path];
        let pathIsUnique = isUnique(path);
        if (pathIsUnique) {
            maxLen = Math.max(path.length, maxLen);
        }
        if (i === arr.length || !pathIsUnique) {
            mem[path] = maxLen;
            return maxLen;
        }
        for (let j = i; j < arr.length; j++) {
            maxLen = dfs(arr, path + arr[j], j + 1, maxLen, mem);
        }
        mem[path] = maxLen;
        return maxLen;
    }
    function isUnique(str) {
        const set = new Set(str);
        return set.size === str.length
    }
    return maxLen;
}
const rl = readline.createInterface(process.stdin);
const inputs = [];
rl.on('line', (input) => {
    input !== '' ? inputs.push(input) : rl.close();
}).on('close', () => {
    const word_list = inputs[0].split(' ');
    console.log(maxLength(word_list));
});

Also See: AMCAT Study Materials, Preparation Guide

Also See: Amazon OA 2021 (Online Assessment) Questions Preparation

Microsoft Online Assessment 2021 Questions

Leave a Comment