h*****g 发帖数: 312 | 1 Given a list of words, L, that are all the same length, and a string, S,
find the starting position of the substring of S that is a concatenation of
each word in L exactly once and without any intervening characters. This
substring will occur exactly once in S..
.
Example:.
L: "fooo", "barr", "wing", "ding", "wing".
S: "lingmindraboofooowingdingbarrwingmonkeypoundcake".
fooowingdingbarrwing.
trie的话,现场写程序不方便吧? | p*****2 发帖数: 21240 | 2 trie的话写起来也还好。实现add, find两个函数应该就可以了。
这样就是N*L的复杂度了。 | p*****2 发帖数: 21240 | 3 写了一个练练手。
import java.util.*;
class Node
{
Node[] children=new Node[26];
public Node Set(char c)
{
Node node=new Node();
children[c-'a']=node;
return node;
}
public Node Get(char c)
{
return children[c-'a'];
}
}
class Trie
{
Node root=new Node();
public void Add(String word)
{
int i=0;
Node node=root;
while(i
{
char c=word.charAt(i);
if(node.Get(c)==null)
node=node.Set(c);
else
node=node.Get(c);
i++;
}
}
public boolean Find(String word)
{
boolean ret=true;
int i=0;
Node node=root;
while(i
{
char c=word.charAt(i);
if(node.Get(c)!=null)
node=node.Get(c);
else
{
ret=false;
break;
}
i++;
}
return ret;
}
}
public class Solution
{
static String[] L;
static int len;
static Trie trie=new Trie();
static HashMap hm=new HashMap();
public static boolean Check(String str)
{
HashMap ht=new HashMap();
for(int i=0;i
{
String sub=str.substring(i*len,i*len+len);
if(trie.Find(sub))
{
if(ht.containsKey(sub))
ht.put(sub, ht.get(sub)+1);
else
ht.put(sub, 1);
}
else
return false;
}
if(hm.size()!=ht.size())
return false;
for(String s: hm.keySet())
{
if(hm.get(s)!=ht.get(s))
return false;
}
return true;
}
public static void main(String[] args)
{
L=new String[]{ "fooo", "barr", "wing", "ding", "wing"};
len=4;
for(String s:L)
{
trie.Add(s);
if(hm.containsKey(s))
hm.put(s, hm.get(s)+1);
else
hm.put(s, 1);
}
int totalLen=len*L.length;
String S="lingmindraboofooowingdingbarrwingmonkeypoundcake";
for(int i=0;i<=S.length()-totalLen;i++)
{
if(Check(S.substring(i,i+totalLen)))
{
System.out.println(S.substring(i,i+totalLen));
break;
}
}
}
} | c*****e 发帖数: 3226 | 4 不觉得你需要trie, hashtable is enough..
【在 p*****2 的大作中提到】 : 写了一个练练手。 : import java.util.*; : class Node : { : Node[] children=new Node[26]; : : public Node Set(char c) : { : Node node=new Node(); : children[c-'a']=node;
| p*****2 发帖数: 21240 | 5
没错。
【在 c*****e 的大作中提到】 : 不觉得你需要trie, hashtable is enough..
| H***e 发帖数: 476 | 6 这个find是不是有点问题?只要是前缀就return true了吧?
【在 p*****2 的大作中提到】 : 写了一个练练手。 : import java.util.*; : class Node : { : Node[] children=new Node[26]; : : public Node Set(char c) : { : Node node=new Node(); : children[c-'a']=node;
| p*****2 发帖数: 21240 | 7
他的题目说的是单词的长度一致,所以应该一定match.
【在 H***e 的大作中提到】 : 这个find是不是有点问题?只要是前缀就return true了吧?
| b***u 发帖数: 12010 | 8 嗯,trie的特性没什么帮助。
【在 c*****e 的大作中提到】 : 不觉得你需要trie, hashtable is enough..
| a*****n 发帖数: 158 | 9 直接用HASHMAP/RABIN-KARP测试就行了吧,,,加上长度都是4就更简单了。。。每次
产生4个字符的字符串到HASH-MAP里面匹配。全找到就结束,没找到就继续。。 | q********c 发帖数: 1774 | 10 Isn't it similar to strstr? Is it a phone or onsite question? | | | a*****g 发帖数: 13 | 11 c++ practice (use multiset):
#include
#include
using std::multiset;
using std::string;
int len_p = 4;
bool check_substr(string s, const multiset &L) {
multiset dup_L = L;
for (unsigned i=0; i
string token = s.substr(i, len_p);
multiset::iterator itor = dup_L.find(token);
if (itor == dup_L.end()) {
return false;
}
dup_L.erase(itor);
}
return dup_L.empty();
}
int main(int argc, char* argv[])
{
multiset L;
L.insert("fooo");
L.insert("barr");
L.insert("wing");
L.insert("ding");
L.insert("wing");
const string S = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int size_L = L.size();
int len_substr = len_p * size_L;
for (unsigned i=0; i < S.length() - len_substr; i++) {
if (check_substr(S.substr(i, len_substr), L)) {
printf("found. pos:%d\n", i);
return 0;
}
}
printf("not found.\n");
return 0;
}
It's not trivial to me how to apply RK algorithm to this problem but I guess
defining a hash function using histogram of characters might work. | g*********e 发帖数: 14401 | 12 如果某个字符串在table 里找到了重复的,但这时其他串还没全找到,怎么办?
【在 a*****n 的大作中提到】 : 直接用HASHMAP/RABIN-KARP测试就行了吧,,,加上长度都是4就更简单了。。。每次 : 产生4个字符的字符串到HASH-MAP里面匹配。全找到就结束,没找到就继续。。
| A**u 发帖数: 2458 | 13 你这不对
差多了
【在 a*****g 的大作中提到】 : c++ practice (use multiset): : #include : #include : using std::multiset; : using std::string; : int len_p = 4; : bool check_substr(string s, const multiset &L) { : multiset dup_L = L; : for (unsigned i=0; i: string token = s.substr(i, len_p);
| j****b 发帖数: 108 | 14 没想到什么特别巧的办法,brute force + hashmap?
public static void main(String[] args) {
String l[] = { "fooo", "barr", "wing", "ding", "wing" };
Map map = new HashMap();
for (int i = 0; i < l.length; i++){
if(map.containsKey(l[i]))
map.put(l[i], map.get(l[i]) +1);
else
map.put(l[i], 1);
}
String s = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int len = l[0].length();
for (int j = 0; j < len; j++) {
Map map2 = new HashMap();
for (int i = j; i <= s.length() - len; i+=len) {
String ss = s.substring(i, i + len);
if (map.containsKey(ss)) {
if(!map2.containsKey(ss) || map2.get(ss) < map.get(ss)){
if(map2.containsKey(ss))
map2.put(ss, map2.get(ss)+1);
else
map2.put(ss, 1);
if(map2.entrySet().equals(map.entrySet())){
String r =s.substring(i+len-l.length*len,i+len);
System.out.println(r);
return;
}
}else{
map2.clear();
}
}else{
map2.clear();
}
}
}
} | h********w 发帖数: 221 | | s*******e 发帖数: 35 | 16 用HASHMAP 找出每个WORD 所在的位置 occurrence map,在做一个反向的由所在位置(
position)到WORD 的reverse occurrence map。 吧这些position 排序,然后递归搜索。
package smartnose;
import java.util.Arrays;
import java.util.*;
public class SubWordSeq {
public static void main(String [] args){
List L = new LinkedList();
L.add("fooo");L.add("barr");L.add("wing");L.add("ding");L.add("wing"
);
String S= "lingmindraboofooowingdingbarrwingmonkeypoundcake";
HashMap> occurMap = new HashMap
Integer>>();
for(String word:L){
occurMap.put(word, new ArrayList());
}
int wordLen = L.get(0).length();
for(int i=0;i+wordLen
String sub = S.subSequence(i, i+wordLen).toString();
if(occurMap.containsKey(sub))//add the occurrance
occurMap.get(sub).add(i);
}
HashMap> rOccurMap = new HashMap
>();
for(String w: occurMap.keySet()){
List occ = occurMap.get(w);
for(Integer pos: occ){
if(rOccurMap.containsKey(pos)){
rOccurMap.get(pos).add(w);
}
else{
rOccurMap.put(pos, Arrays.asList(w));
}
}
}
List distinctPos = new ArrayList();
distinctPos.addAll(rOccurMap.keySet());
Collections.sort(distinctPos);
//now search the f*ck out of it.
for(Integer start:distinctPos)
if(recurSearch(L,rOccurMap,start,wordLen)){
System.out.println("found one instance starting at:"+start);
break;
}
}
public static boolean recurSearch(List L, HashMap
String>> rMap, int curPos, int wLen){
if(L.isEmpty())
return true;
List cWords = rMap.get(curPos);
for(String w: cWords){
if(L.contains(w)){
//L.remove(w);
int idx=L.indexOf(w);
L.remove(idx);//can be more efficient
if(recurSearch(L, rMap, curPos+wLen, wLen))
return true;
L.add(w);
}
}
return false;
}
} | a*****s 发帖数: 1121 | | l**********e 发帖数: 147 | | a*****g 发帖数: 13 | 19 I will really appreciate if you could elaborate more.
【在 A**u 的大作中提到】 : 你这不对 : 差多了
| k*****y 发帖数: 744 | 20 Python写了一个,练练手
L = ['fooo', 'barr', 'wing', 'ding', 'wing']
S = 'lingmindraboofooowingdingbarrwingmonkeypoundcake'
# L1 stores distinct stings in L
# multiples counts the corresponding multiples
L1 = []
multiples = []
for str in L:
if str not in L1:
L1.append(str)
multiples.append(L.count(str))
# k is the lenght of each substring
k = len(L[0])
match = [0]*len(S)
checklist = []
for i in range(0, len(L1)):
mul = multiples[i]
list_pos = []
# find all possible positions matching L1[i]
pos = 0; pos = S.find(L1[i], pos, len(S))
while pos != -1:
list_pos.append(pos)
pos = S.find(L1[i], pos+1, len(S))
for j in range(0, len(list_pos)):
# only count those without too many occurrence
if (j+mul >= len(list_pos)
or list_pos[j] + k*len(L) < list_pos[j+mul]):
match[list_pos[j]] = 1
checklist.append(list_pos[j])
for i in range(0, len(checklist)):
pos = checklist[i]
j = 0
while j < len(L):
if match[pos+j*k] == 0:
break
j += 1
if j == len(L):
print S[checklist[i]:checklist[i]+k*len(L)]
break | | | h*****g 发帖数: 312 | 21 Given a list of words, L, that are all the same length, and a string, S,
find the starting position of the substring of S that is a concatenation of
each word in L exactly once and without any intervening characters. This
substring will occur exactly once in S..
.
Example:.
L: "fooo", "barr", "wing", "ding", "wing".
S: "lingmindraboofooowingdingbarrwingmonkeypoundcake".
fooowingdingbarrwing.
trie的话,现场写程序不方便吧? | p*****2 发帖数: 21240 | 22 trie的话写起来也还好。实现add, find两个函数应该就可以了。
这样就是N*L的复杂度了。 | p*****2 发帖数: 21240 | 23 写了一个练练手。
import java.util.*;
class Node
{
Node[] children=new Node[26];
public Node Set(char c)
{
Node node=new Node();
children[c-'a']=node;
return node;
}
public Node Get(char c)
{
return children[c-'a'];
}
}
class Trie
{
Node root=new Node();
public void Add(String word)
{
int i=0;
Node node=root;
while(i
{
char c=word.charAt(i);
if(node.Get(c)==null)
node=node.Set(c);
else
node=node.Get(c);
i++;
}
}
public boolean Find(String word)
{
boolean ret=true;
int i=0;
Node node=root;
while(i
{
char c=word.charAt(i);
if(node.Get(c)!=null)
node=node.Get(c);
else
{
ret=false;
break;
}
i++;
}
return ret;
}
}
public class Solution
{
static String[] L;
static int len;
static Trie trie=new Trie();
static HashMap hm=new HashMap();
public static boolean Check(String str)
{
HashMap ht=new HashMap();
for(int i=0;i
{
String sub=str.substring(i*len,i*len+len);
if(trie.Find(sub))
{
if(ht.containsKey(sub))
ht.put(sub, ht.get(sub)+1);
else
ht.put(sub, 1);
}
else
return false;
}
if(hm.size()!=ht.size())
return false;
for(String s: hm.keySet())
{
if(hm.get(s)!=ht.get(s))
return false;
}
return true;
}
public static void main(String[] args)
{
L=new String[]{ "fooo", "barr", "wing", "ding", "wing"};
len=4;
for(String s:L)
{
trie.Add(s);
if(hm.containsKey(s))
hm.put(s, hm.get(s)+1);
else
hm.put(s, 1);
}
int totalLen=len*L.length;
String S="lingmindraboofooowingdingbarrwingmonkeypoundcake";
for(int i=0;i<=S.length()-totalLen;i++)
{
if(Check(S.substring(i,i+totalLen)))
{
System.out.println(S.substring(i,i+totalLen));
break;
}
}
}
} | c*****e 发帖数: 3226 | 24 不觉得你需要trie, hashtable is enough..
【在 p*****2 的大作中提到】 : 写了一个练练手。 : import java.util.*; : class Node : { : Node[] children=new Node[26]; : : public Node Set(char c) : { : Node node=new Node(); : children[c-'a']=node;
| p*****2 发帖数: 21240 | 25
没错。
【在 c*****e 的大作中提到】 : 不觉得你需要trie, hashtable is enough..
| H***e 发帖数: 476 | 26 这个find是不是有点问题?只要是前缀就return true了吧?
【在 p*****2 的大作中提到】 : 写了一个练练手。 : import java.util.*; : class Node : { : Node[] children=new Node[26]; : : public Node Set(char c) : { : Node node=new Node(); : children[c-'a']=node;
| p*****2 发帖数: 21240 | 27
他的题目说的是单词的长度一致,所以应该一定match.
【在 H***e 的大作中提到】 : 这个find是不是有点问题?只要是前缀就return true了吧?
| b***u 发帖数: 12010 | 28 嗯,trie的特性没什么帮助。
【在 c*****e 的大作中提到】 : 不觉得你需要trie, hashtable is enough..
| a*****n 发帖数: 158 | 29 直接用HASHMAP/RABIN-KARP测试就行了吧,,,加上长度都是4就更简单了。。。每次
产生4个字符的字符串到HASH-MAP里面匹配。全找到就结束,没找到就继续。。 | q********c 发帖数: 1774 | 30 Isn't it similar to strstr? Is it a phone or onsite question? | | | a*****g 发帖数: 13 | 31 c++ practice (use multiset):
#include
#include
using std::multiset;
using std::string;
int len_p = 4;
bool check_substr(string s, const multiset &L) {
multiset dup_L = L;
for (unsigned i=0; i
string token = s.substr(i, len_p);
multiset::iterator itor = dup_L.find(token);
if (itor == dup_L.end()) {
return false;
}
dup_L.erase(itor);
}
return dup_L.empty();
}
int main(int argc, char* argv[])
{
multiset L;
L.insert("fooo");
L.insert("barr");
L.insert("wing");
L.insert("ding");
L.insert("wing");
const string S = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int size_L = L.size();
int len_substr = len_p * size_L;
for (unsigned i=0; i < S.length() - len_substr; i++) {
if (check_substr(S.substr(i, len_substr), L)) {
printf("found. pos:%d\n", i);
return 0;
}
}
printf("not found.\n");
return 0;
}
It's not trivial to me how to apply RK algorithm to this problem but I guess
defining a hash function using histogram of characters might work. | g*********e 发帖数: 14401 | 32 如果某个字符串在table 里找到了重复的,但这时其他串还没全找到,怎么办?
【在 a*****n 的大作中提到】 : 直接用HASHMAP/RABIN-KARP测试就行了吧,,,加上长度都是4就更简单了。。。每次 : 产生4个字符的字符串到HASH-MAP里面匹配。全找到就结束,没找到就继续。。
| A**u 发帖数: 2458 | 33 你这不对
差多了
【在 a*****g 的大作中提到】 : c++ practice (use multiset): : #include : #include : using std::multiset; : using std::string; : int len_p = 4; : bool check_substr(string s, const multiset &L) { : multiset dup_L = L; : for (unsigned i=0; i: string token = s.substr(i, len_p);
| j****b 发帖数: 108 | 34 没想到什么特别巧的办法,brute force + hashmap?
public static void main(String[] args) {
String l[] = { "fooo", "barr", "wing", "ding", "wing" };
Map map = new HashMap();
for (int i = 0; i < l.length; i++){
if(map.containsKey(l[i]))
map.put(l[i], map.get(l[i]) +1);
else
map.put(l[i], 1);
}
String s = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int len = l[0].length();
for (int j = 0; j < len; j++) {
Map map2 = new HashMap();
for (int i = j; i <= s.length() - len; i+=len) {
String ss = s.substring(i, i + len);
if (map.containsKey(ss)) {
if(!map2.containsKey(ss) || map2.get(ss) < map.get(ss)){
if(map2.containsKey(ss))
map2.put(ss, map2.get(ss)+1);
else
map2.put(ss, 1);
if(map2.entrySet().equals(map.entrySet())){
String r =s.substring(i+len-l.length*len,i+len);
System.out.println(r);
return;
}
}else{
map2.clear();
}
}else{
map2.clear();
}
}
}
} | h********w 发帖数: 221 | | s*******e 发帖数: 35 | 36 用HASHMAP 找出每个WORD 所在的位置 occurrence map,在做一个反向的由所在位置(
position)到WORD 的reverse occurrence map。 吧这些position 排序,然后递归搜索。
package smartnose;
import java.util.Arrays;
import java.util.*;
public class SubWordSeq {
public static void main(String [] args){
List L = new LinkedList();
L.add("fooo");L.add("barr");L.add("wing");L.add("ding");L.add("wing"
);
String S= "lingmindraboofooowingdingbarrwingmonkeypoundcake";
HashMap> occurMap = new HashMap
Integer>>();
for(String word:L){
occurMap.put(word, new ArrayList());
}
int wordLen = L.get(0).length();
for(int i=0;i+wordLen
String sub = S.subSequence(i, i+wordLen).toString();
if(occurMap.containsKey(sub))//add the occurrance
occurMap.get(sub).add(i);
}
HashMap> rOccurMap = new HashMap
>();
for(String w: occurMap.keySet()){
List occ = occurMap.get(w);
for(Integer pos: occ){
if(rOccurMap.containsKey(pos)){
rOccurMap.get(pos).add(w);
}
else{
rOccurMap.put(pos, Arrays.asList(w));
}
}
}
List distinctPos = new ArrayList();
distinctPos.addAll(rOccurMap.keySet());
Collections.sort(distinctPos);
//now search the f*ck out of it.
for(Integer start:distinctPos)
if(recurSearch(L,rOccurMap,start,wordLen)){
System.out.println("found one instance starting at:"+start);
break;
}
}
public static boolean recurSearch(List L, HashMap
String>> rMap, int curPos, int wLen){
if(L.isEmpty())
return true;
List cWords = rMap.get(curPos);
for(String w: cWords){
if(L.contains(w)){
//L.remove(w);
int idx=L.indexOf(w);
L.remove(idx);//can be more efficient
if(recurSearch(L, rMap, curPos+wLen, wLen))
return true;
L.add(w);
}
}
return false;
}
} | a*****s 发帖数: 1121 | | l**********e 发帖数: 147 | | a*****g 发帖数: 13 | 39 I will really appreciate if you could elaborate more.
【在 A**u 的大作中提到】 : 你这不对 : 差多了
| k*****y 发帖数: 744 | 40 Python写了一个,练练手
L = ['fooo', 'barr', 'wing', 'ding', 'wing']
S = 'lingmindraboofooowingdingbarrwingmonkeypoundcake'
# L1 stores distinct stings in L
# multiples counts the corresponding multiples
L1 = []
multiples = []
for str in L:
if str not in L1:
L1.append(str)
multiples.append(L.count(str))
# k is the lenght of each substring
k = len(L[0])
match = [0]*len(S)
checklist = []
for i in range(0, len(L1)):
mul = multiples[i]
list_pos = []
# find all possible positions matching L1[i]
pos = 0; pos = S.find(L1[i], pos, len(S))
while pos != -1:
list_pos.append(pos)
pos = S.find(L1[i], pos+1, len(S))
for j in range(0, len(list_pos)):
# only count those without too many occurrence
if (j+mul >= len(list_pos)
or list_pos[j] + k*len(L) < list_pos[j+mul]):
match[list_pos[j]] = 1
checklist.append(list_pos[j])
for i in range(0, len(checklist)):
pos = checklist[i]
j = 0
while j < len(L):
if match[pos+j*k] == 0:
break
j += 1
if j == len(L):
print S[checklist[i]:checklist[i]+k*len(L)]
break | | | s*******f 发帖数: 1114 | 41 //码遍本版
//inside for LOOP:
//1st round: check (ling)(mind)(rabo)(ofooowingdingbarrwingmonkeypoundcake
//2nd round: check l(ingm)(indr)(aboo)(fooowingdingbarrwingmonkeypoundcake
//...
//this can avoid double check, and, O(n) time, for slice window
string SubStrWordsGreat(const char *str, const vector &vs){
map ms;
int len = strlen(str);
const char *end = str + len;
int lensub = 0;
if (vs.size() == 0)
return "";
int sz = vs[0].size();
for (vector::const_iterator it = vs.begin(); it != vs.end(); ++
it){
++ms[*it];
lensub += it->size();
}
for (int i = 0; i < sz; ++i){
const char *left = str + i;
const char *right = left;
int count = 0;
while (end - left >= lensub){
string s = string(right, right + sz);
if (ms.find(s) == ms.end()){
//recover map
for (; left < right; left += sz){
string ss = string(left, left + sz);
++ms[ss];
--count;
}
left += sz;
right = left;
} else if (ms[s] == 0){
for (; left < right; left += sz){
string ss = string(left, left + sz);
if (ss == s){
left += sz;
break;
} else {
++ms[ss];
--count;
}
}
} else {
--ms[s];
if (++count == vs.size()){
return string(left, right + sz);
}
right += sz;
}
}
}
return "";
}
void TestSubStrWords(){
vector vs;
vs.push_back("fooo");
vs.push_back("barr");
vs.push_back("wing");
vs.push_back("ding");
vs.push_back("wing");
string a = SubStrWordsGreat("
lingmindraboofooowingdingbarrwingmonkeypoundcake", vs);
}
of
【在 h*****g 的大作中提到】 : Given a list of words, L, that are all the same length, and a string, S, : find the starting position of the substring of S that is a concatenation of : each word in L exactly once and without any intervening characters. This : substring will occur exactly once in S.. : . : Example:. : L: "fooo", "barr", "wing", "ding", "wing". : S: "lingmindraboofooowingdingbarrwingmonkeypoundcake". : fooowingdingbarrwing. : trie的话,现场写程序不方便吧?
| d**u 发帖数: 1065 | 42 这题用hashtable的话算是比较平庸的解法,应该达不到facebook或google的要求。正
确的做法是用单词树作为查找容器。 | S******t 发帖数: 151 | 43 It is easy to come up with an $O(SL)$ algorithm for this problem but what is
good about this problem is that it actually has a linear time solution:
Let’s first make some notations clear:
1) |S| -> The length of S.
2) |L| -> The number of words in the list $L$.
3) $len_L$ -> The length of each string in $L$.
First we use an Aho-Corasick automaton to find out the occurrences of all
patterns. The time complexity of this step is $O(|S| + |L| * len_L + z)$
where $z$ is the number of output matches. However, since in this problem,
each index in the text string can correspond to at most one match. Therefore
, z < |S| and the complexity is therefore bounded by $O(|S| + |L| * len_L)$.
In the second step, we’ve now had an array $f$, where f[i] indicates
whether we can match a string in $L$ with substring $S[i..i+len_L-1]$. If
there is a match, f[i] = #the index of matched string in L$, otherwise f[i]
= -1.
Next, we divide this array into $len_L$ subarrays, where each subarray is of
the form: {f[i], f[i+len_L], f[i+2*len_L], …, f[i+k*len_L]}. So in each
array, what we are looking for is in fact a contiguous subarray of length |L
| which is in fact a permutation of numbers from 1 to L. We can use a deque
to achieve this aim. We first push_back the elements in each array till we
find a conflict. (i.e. the element we are going to push has already existed
in the deque). When this happens, we pop_front until the conflicted element
has been popped out of the deque. This algorithm continues till the number
of elements in the deque reaches |L|. (Yes, we do have to handle -1 cases
separately but that is quite easy.) The complexity to handle each subarray
is linear and the overall complexity is still linear.
Therefore the time complexity of the algorithm is dominated by the first
step, which is $O(|S| + |L| * len_L)$ and since |L| * len_L < |S|, the
algorithm is indeed $O(S)$.
Now let’s consider the case where there might be repetitive strings in L.
We first count the numbers of unique strings in $L$ and for each unique
string, we count the actual appearance of that string in $L$.
Then the Aho-Corasick part is still the same and we can guarantee the time
complexity is still bounded linearly.
The only difference is in the deque part where we change the conflicting
part a bit from whether that $f[i]$ already exists in deque to whether $f[i]
< cnt[f[i]]$ in deque.
And that still makes our algorithm linear. Problem solved. | h******8 发帖数: 55 | 44 Can you elaborate it? Thanks!
is
Therefore
【在 S******t 的大作中提到】 : It is easy to come up with an $O(SL)$ algorithm for this problem but what is : good about this problem is that it actually has a linear time solution: : Let’s first make some notations clear: : 1) |S| -> The length of S. : 2) |L| -> The number of words in the list $L$. : 3) $len_L$ -> The length of each string in $L$. : First we use an Aho-Corasick automaton to find out the occurrences of all : patterns. The time complexity of this step is $O(|S| + |L| * len_L + z)$ : where $z$ is the number of output matches. However, since in this problem, : each index in the text string can correspond to at most one match. Therefore
|
|