Word Break Problem | DP-32

Given an input string and a dictionary of words, find out if the input string can be segmented into a space-separated sequence of dictionary words. See following examples for more details.
This is a famous Google interview question, also being asked by many other companies now a days.

Consider the following dictionary
{ i, like, sam, sung, samsung, mobile, ice,
cream, icecream, man, go, mango}

Input:  ilike
Output: Yes
The string can be segmented as "i like".

Input:  ilikesamsung
Output: Yes
The string can be segmented as "i like samsung"
or "i like sam sung".

Recursive implementation:
The idea is simple, we consider each prefix and search it in dictionary. If the prefix is present in dictionary, we recur for rest of the string (or suffix). If the recursive call for suffix returns true, we return true, otherwise we try next prefix. If we have tried all prefixes and none of them resulted in a solution, we return false.

We strongly recommend to see substr function which is used extensively in following implementations.

C++

 // A recursive program to test whether a given  // string can be segmented into space separated  // words in dictionary #include using namespace std;    /* A utility function to check whether a word is    present in dictionary or not. An array of strings    is used for dictionary.  Using array of strings for   dictionary is definitely not a good idea. We have    used for simplicity of the program*/ int dictionaryContains(string word) {     string dictionary[] = {"mobile","samsung","sam","sung",                             "man","mango","icecream","and",                              "go","i","like","ice","cream"};     int size = sizeof(dictionary)/sizeof(dictionary);     for (int i = 0; i < size; i++)         if (dictionary[i].compare(word) == 0)            return true;     return false; }    // returns true if string can be segmented into space  // separated words, otherwise returns false bool wordBreak(string str) {     int size = str.size();        // Base case     if (size == 0)  return true;        // Try all prefixes of lengths from 1 to size     for (int i=1; i<=size; i++)     {         // The parameter for dictionaryContains is          // str.substr(0, i) which is prefix (of input          // string) of length 'i'. We first check whether          // current prefix is in  dictionary. Then we          // recursively check for remaining string         // str.substr(i, size-i) which is suffix of           // length size-i         if (dictionaryContains( str.substr(0, i) ) &&             wordBreak( str.substr(i, size-i) ))             return true;     }        // If we have tried all prefixes and      // none of them worked     return false; }    // Driver program to test above functions int main() {     wordBreak("ilikesamsung")? cout <<"Yes ": cout << "No ";     wordBreak("iiiiiiii")? cout <<"Yes ": cout << "No ";     wordBreak("")? cout <<"Yes ": cout << "No ";     wordBreak("ilikelikeimangoiii")? cout <<"Yes ": cout << "No ";     wordBreak("samsungandmango")? cout <<"Yes ": cout << "No ";     wordBreak("samsungandmangok")? cout <<"Yes ": cout << "No ";     return 0; }

Java

 import java.util.*;    // Recursive implementation of  // word break problem in java public class WordBreakProblem {        // set to hold dictionary values     private static Set dictionary = new HashSet<>();            public static void main(String []args)     {                    // array of strings to be added in dictionary set.         String temp_dictionary[] = {"mobile","samsung","sam","sung",                              "man","mango","icecream","and",                              "go","i","like","ice","cream"};                                        // loop to add all strings in dictionary set          for (String temp :temp_dictionary)         {             dictionary.add(temp);         }                    // sample input cases         System.out.println(wordBreak("ilikesamsung"));         System.out.println(wordBreak("iiiiiiii"));         System.out.println(wordBreak(""));         System.out.println(wordBreak("ilikelikeimangoiii"));         System.out.println(wordBreak("samsungandmango"));         System.out.println(wordBreak("samsungandmangok"));                }            // returns true if the word can be segmented into parts such     // that each part is contained in dictionary     public static boolean wordBreak(String word)     {         int size = word.length();                    // base case         if (size == 0)         return true;                    //else check for all words         for (int i = 1; i <= size; i++)         {             // Now we will first divide the word into two parts ,             // the prefix will have a length of i and check if it is              // present in dictionary ,if yes then we will check for              // suffix of length size-i recursively. if both prefix and              // suffix are present the word is found in dictionary.                if (dictionary.contains(word.substring(0,i)) &&                      wordBreak(word.substring(i,size)))             return true;         }                    // if all cases failed then return false         return false;     } }    // This code is contributed by Sparsh Singhal

Output:

Yes
Yes
Yes
Yes
Yes
No

Dynamic Programming
Why Dynamic Programming? The above problem exhibits overlapping sub-problems. For example, see the following partial recursion tree for string “abcde” in worst case. // A Dynamic Programming based program to test whether a given string can // be segmented into space separated words in dictionary #include #include using namespace std;    /* A utility function to check whether a word is present in dictionary or not.   An array of strings is used for dictionary.  Using array of strings for   dictionary is definitely not a good idea. We have used for simplicity of   the program*/ int dictionaryContains(string word) {     string dictionary[] = {"mobile","samsung","sam","sung","man","mango",                            "icecream","and","go","i","like","ice","cream"};     int size = sizeof(dictionary)/sizeof(dictionary);     for (int i = 0; i < size; i++)         if (dictionary[i].compare(word) == 0)            return true;     return false; }    // Returns true if string can be segmented into space separated // words, otherwise returns false bool wordBreak(string str) {     int size = str.size();     if (size == 0)   return true;        // Create the DP table to store results of subroblems. The value wb[i]     // will be true if str[0..i-1] can be segmented into dictionary words,     // otherwise false.     bool wb[size+1];     memset(wb, 0, sizeof(wb)); // Initialize all values as false.        for (int i=1; i<=size; i++)     {         // if wb[i] is false, then check if current prefix can make it true.         // Current prefix is "str.substr(0, i)"         if (wb[i] == false && dictionaryContains( str.substr(0, i) ))             wb[i] = true;            // wb[i] is true, then check for all substrings starting from         // (i+1)th character and store their results.         if (wb[i] == true)         {             // If we reached the last prefix             if (i == size)                 return true;                for (int j = i+1; j <= size; j++)             {                 // Update wb[j] if it is false and can be updated                 // Note the parameter passed to dictionaryContains() is                 // substring starting from index 'i' and length 'j-i'                 if (wb[j] == false && dictionaryContains( str.substr(i, j-i) ))                     wb[j] = true;                    // If we reached the last character                 if (j == size && wb[j] == true)                     return true;             }         }     }        /* Uncomment these lines to print DP table "wb[]"      for (int i = 1; i <= size; i++)         cout << " " << wb[i]; */        // If we have tried all prefixes and none of them worked     return false; }    // Driver program to test above functions int main() {     wordBreak("ilikesamsung")? cout <<"Yes ": cout << "No ";     wordBreak("iiiiiiii")? cout <<"Yes ": cout << "No ";     wordBreak("")? cout <<"Yes ": cout << "No ";     wordBreak("ilikelikeimangoiii")? cout <<"Yes ": cout << "No ";     wordBreak("samsungandmango")? cout <<"Yes ": cout << "No ";     wordBreak("samsungandmangok")? cout <<"Yes ": cout << "No ";     return 0; }

Output:

Yes
Yes
Yes
Yes
Yes
No

Optimized Dynamic Programming:
In this approach, apart from the dp table, we also maintain all the indexes which have matched earlier. Then we will check the substrings from those indexes to the current index. If anyone of that matches then we can divide the string up to that index.

In this program, we are using some extra space. However, its time complexity is O(n*s) where s is the length of the largest string in the dictionary and n is the length of the given string.

 // A Dynamic Programming based program to test // whether a given string can  be segmented into // space separated words in dictionary #include using namespace std;    /* A utility function to check whether a word    is present in dictionary or not. An array of     strings is used for dictionary.  Using array    of strings for dictionary is definitely not     a good idea. We have used for simplicity of    the program*/ int dictionaryContains(string word) {     string dictionary[] = { "mobile", "samsung", "sam",                             "sung", "man", "mango",                             "icecream", "and", "go",                             "i", "like", "ice", "cream" };     int size = sizeof(dictionary) / sizeof(dictionary);     for (int i = 0; i < size; i++)         if (dictionary[i].compare(word) == 0)             return true;     return false; }    // Returns true if string can be segmented into space // separated words, otherwise returns false bool wordBreak(string s) {     int n = s.size();     if (n == 0)         return true;        // Create the DP table to store results of subroblems.     // The value dp[i] will be true if str[0..i] can be     // segmented into dictionary words, otherwise false.     vector dp(n + 1, 0); // Initialize all values     // as false.        // matched_index array represents the indexes for which     // dp[i] is true. Initially only -1 element is present     // in this array.     vector matched_index;     matched_index.push_back(-1);        for (int i = 0; i < n; i++) {         int msize = matched_index.size();            // Flag value which tells that a substring matches         // with given words or not.         int f = 0;            // Check all the substring from the indexes matched         // earlier. If any of that substring matches than         // make flag value = 1;         for (int j = msize - 1; j >= 0; j--) {                // sb is substring starting from matched_index[j]             // + 1  and of length i - matched_index[j]             string sb = s.substr(matched_index[j] + 1, i - matched_index[j]);                if (dictionaryContains(sb)) {                 f = 1;                 break;             }         }            // If substring matches than do dp[i] = 1 and         // push that index in matched_index array.         if (f == 1) {             dp[i] = 1;             matched_index.push_back(i);         }     }     return dp[n - 1]; }    // Driver code int main() {     wordBreak("ilikesamsung") ? cout << "Yes " : cout << "No ";     wordBreak("iiiiiiii") ? cout << "Yes " : cout << "No ";     wordBreak("") ? cout << "Yes " : cout << "No ";     wordBreak("ilikelikeimangoiii") ? cout << "Yes " : cout << "No ";     wordBreak("samsungandmango") ? cout << "Yes " : cout << "No ";     wordBreak("samsungandmangok") ? cout << "Yes " : cout << "No ";     return 0; }

Output:

Yes
Yes
Yes
Yes
Yes
No

Word Break Problem | (Trie solution)

Exercise:
The above solutions only finds out whether a given string can be segmented or not. Extend the above Dynamic Programming solution to print all possible partitions of input string.

Examples:

Input: ilikeicecreamandmango
Output:
i like ice cream and man go
i like ice cream and mango
i like icecream and man go
i like icecream and mango

Input: ilikesamsungmobile
Output:
i like sam sung mobile
i like samsung mobile

Refer below post for solution of exercise.
Word Break Problem using Backtracking