facebook一题 - JobHunting版

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

相关主题
● lengthOfLongestSubstring 最后一个test case总是超时	● 这个题目难度在leetcode里算什么
● 贡献一个中型软件公司面经	● 贴一下我google第一轮店面的题目
● 问一道string match的题目出自glassdoor facebook版	● 问几道较难的字符串题
● 问道题	● 问个简单的问题...
● Amazon常见设计题——设计电话簿求解	● Yahoo Platform组面经
● 上午偷闲把TopKFrequentWords写出来了	● 新鲜电面
● 电面不好，求bless。这题怎么答？	● share int2roman and roman2int java version
● 面试中遇到suffix tree / trie这种题，需要自己实现吗？	● 4sum o(n^2)超时

相关话题的讨论汇总
话题: string话题: len话题: integer话题: node话题: hashmap

进入JobHunting版参与讨论

(共1页)

h*****g
发帖数: 312

Given a list of words, L, that are all the same length, and a string, S,
find the starting position of the substring of S that is a concatenation of
each word in L exactly once and without any intervening characters. This
substring will occur exactly once in S..
.
Example:.
L: "fooo", "barr", "wing", "ding", "wing".
S: "lingmindraboofooowingdingbarrwingmonkeypoundcake".
fooowingdingbarrwing.
trie的话，现场写程序不方便吧？

p*****2
发帖数: 21240

trie的话写起来也还好。实现add, find两个函数应该就可以了。
这样就是N*L的复杂度了。

p*****2
发帖数: 21240

写了一个练练手。
import java.util.*;
class Node
{
Node[] children=new Node[26];

public Node Set(char c)
{
Node node=new Node();
children[c-'a']=node;
return node;
}

public Node Get(char c)
{
return children[c-'a'];
}
}
class Trie
{
Node root=new Node();

public void Add(String word)
{
int i=0;
Node node=root;
while(i {
char c=word.charAt(i);
if(node.Get(c)==null)
node=node.Set(c);
else
node=node.Get(c);
i++;
}
}

public boolean Find(String word)
{
boolean ret=true;

int i=0;
Node node=root;
while(i {
char c=word.charAt(i);
if(node.Get(c)!=null)
node=node.Get(c);
else
{
ret=false;
break;
}
i++;
}

return ret;
}
}
public class Solution
{
static String[] L;
static int len;
static Trie trie=new Trie();
static HashMap hm=new HashMap();

public static boolean Check(String str)
{
HashMap ht=new HashMap();

for(int i=0;i {
String sub=str.substring(i*len,i*len+len);
if(trie.Find(sub))
{
if(ht.containsKey(sub))
ht.put(sub, ht.get(sub)+1);
else
ht.put(sub, 1);
}
else
return false;
}

if(hm.size()!=ht.size())
return false;

for(String s: hm.keySet())
{
if(hm.get(s)!=ht.get(s))
return false;
}
return true;
}
public static void main(String[] args)
{
L=new String[]{ "fooo", "barr", "wing", "ding", "wing"};
len=4;

for(String s:L)
{
trie.Add(s);
if(hm.containsKey(s))
hm.put(s, hm.get(s)+1);
else
hm.put(s, 1);
}

int totalLen=len*L.length;

String S="lingmindraboofooowingdingbarrwingmonkeypoundcake";
for(int i=0;i<=S.length()-totalLen;i++)
{
if(Check(S.substring(i,i+totalLen)))
{
System.out.println(S.substring(i,i+totalLen));
break;
}
}
}
}

c*****e
发帖数: 3226

不觉得你需要trie, hashtable is enough..

【在 p*****2 的大作中提到】

: 写了一个练练手。
: import java.util.*;
: class Node
: {
: Node[] children=new Node[26];
:
: public Node Set(char c)
: {
: Node node=new Node();
: children[c-'a']=node;

p*****2
发帖数: 21240

没错。

【在 c*****e 的大作中提到】

: 不觉得你需要trie, hashtable is enough..

H***e
发帖数: 476

这个find是不是有点问题？只要是前缀就return true了吧？

【在 p*****2 的大作中提到】

: 写了一个练练手。
: import java.util.*;
: class Node
: {
: Node[] children=new Node[26];
:
: public Node Set(char c)
: {
: Node node=new Node();
: children[c-'a']=node;

p*****2
发帖数: 21240

他的题目说的是单词的长度一致，所以应该一定match.

【在 H***e 的大作中提到】

: 这个find是不是有点问题？只要是前缀就return true了吧？

b***u
发帖数: 12010

嗯，trie的特性没什么帮助。

【在 c*****e 的大作中提到】

: 不觉得你需要trie, hashtable is enough..

a*****n
发帖数: 158

直接用HASHMAP/RABIN-KARP测试就行了吧，，，加上长度都是４就更简单了。。。每次
产生４个字符的字符串到HASH-MAP里面匹配。全找到就结束，没找到就继续。。

q********c
发帖数: 1774

Isn't it similar to strstr? Is it a phone or onsite question?

相关主题
● 上午偷闲把TopKFrequentWords写出来了	● 这个题目难度在leetcode里算什么
● 电面不好，求bless。这题怎么答？	● 贴一下我google第一轮店面的题目
● 面试中遇到suffix tree / trie这种题，需要自己实现吗？	● 问几道较难的字符串题
进入JobHunting版参与讨论

a*****g
发帖数: 13

c++ practice (use multiset):
#include
#include
using std::multiset;
using std::string;
int len_p = 4;
bool check_substr(string s, const multiset &L) {
multiset dup_L = L;
for (unsigned i=0; i string token = s.substr(i, len_p);
multiset::iterator itor = dup_L.find(token);
if (itor == dup_L.end()) {
return false;
}
dup_L.erase(itor);
}
return dup_L.empty();
}
int main(int argc, char* argv[])
{
multiset L;
L.insert("fooo");
L.insert("barr");
L.insert("wing");
L.insert("ding");
L.insert("wing");
const string S = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int size_L = L.size();
int len_substr = len_p * size_L;
for (unsigned i=0; i < S.length() - len_substr; i++) {
if (check_substr(S.substr(i, len_substr), L)) {
printf("found. pos:%d\n", i);
return 0;
}
}
printf("not found.\n");
return 0;
}
It's not trivial to me how to apply RK algorithm to this problem but I guess
defining a hash function using histogram of characters might work.

g*********e
发帖数: 14401

如果某个字符串在table 里找到了重复的，但这时其他串还没全找到，怎么办？

【在 a*****n 的大作中提到】

: 直接用HASHMAP/RABIN-KARP测试就行了吧，，，加上长度都是４就更简单了。。。每次
: 产生４个字符的字符串到HASH-MAP里面匹配。全找到就结束，没找到就继续。。

A**u
发帖数: 2458

你这不对
差多了

【在 a*****g 的大作中提到】

: c++ practice (use multiset):
: #include
: #include
: using std::multiset;
: using std::string;
: int len_p = 4;
: bool check_substr(string s, const multiset &L) {
: multiset dup_L = L;
: for (unsigned i=0; i: string token = s.substr(i, len_p);

j****b
发帖数: 108

没想到什么特别巧的办法，brute force ＋ hashmap？
public static void main(String[] args) {
String l[] = { "fooo", "barr", "wing", "ding", "wing" };
Map map = new HashMap();
for (int i = 0; i < l.length; i++){
if(map.containsKey(l[i]))
map.put(l[i], map.get(l[i]) +1);
else
map.put(l[i], 1);
}
String s = "lingmindraboofooowingdingbarrwingmonkeypoundcake";
int len = l[0].length();
for (int j = 0; j < len; j++) {
Map map2 = new HashMap();
for (int i = j; i <= s.length() - len; i+=len) {
String ss = s.substring(i, i + len);
if (map.containsKey(ss)) {
if(!map2.containsKey(ss) || map2.get(ss) < map.get(ss)){
if(map2.containsKey(ss))
map2.put(ss, map2.get(ss)+1);
else
map2.put(ss, 1);
if(map2.entrySet().equals(map.entrySet())){
String r =s.substring(i+len-l.length*len,i+len);
System.out.println(r);
return;
}
}else{
map2.clear();
}
}else{
map2.clear();
}

}
}
}

h********w
发帖数: 221

mark

s*******e
发帖数: 35

用HASHMAP 找出每个WORD 所在的位置 occurrence map，在做一个反向的由所在位置(
position)到WORD 的reverse occurrence map。吧这些position 排序，然后递归搜索。
package smartnose;
import java.util.Arrays;
import java.util.*;
public class SubWordSeq {

public static void main(String [] args){
List L = new LinkedList();

L.add("fooo");L.add("barr");L.add("wing");L.add("ding");L.add("wing"
);

String S= "lingmindraboofooowingdingbarrwingmonkeypoundcake";

HashMap> occurMap = new HashMap Integer>>();
for(String word:L){
occurMap.put(word, new ArrayList());
}
int wordLen = L.get(0).length();

for(int i=0;i+wordLen String sub = S.subSequence(i, i+wordLen).toString();
if(occurMap.containsKey(sub))//add the occurrance
occurMap.get(sub).add(i);
}

HashMap> rOccurMap = new HashMap >();

for(String w: occurMap.keySet()){
List occ = occurMap.get(w);
for(Integer pos: occ){
if(rOccurMap.containsKey(pos)){
rOccurMap.get(pos).add(w);
}
else{
rOccurMap.put(pos, Arrays.asList(w));
}
}
}

List distinctPos = new ArrayList();
distinctPos.addAll(rOccurMap.keySet());
Collections.sort(distinctPos);

//now search the f*ck out of it.
for(Integer start:distinctPos)
if(recurSearch(L,rOccurMap,start,wordLen)){
System.out.println("found one instance starting at:"+start);
break;
}

}
public static boolean recurSearch(List L, HashMap String>> rMap, int curPos, int wLen){
if(L.isEmpty())
return true;

List cWords = rMap.get(curPos);

for(String w: cWords){
if(L.contains(w)){
//L.remove(w);
int idx=L.indexOf(w);
L.remove(idx);//can be more efficient
if(recurSearch(L, rMap, curPos+wLen, wLen))
return true;
L.add(w);
}
}

return false;
}
}

a*****s
发帖数: 1121

mark

l**********e
发帖数: 147

mark

a*****g
发帖数: 13

I will really appreciate if you could elaborate more.

【在 A**u 的大作中提到】

: 你这不对
: 差多了

k*****y
发帖数: 744

Python写了一个，练练手
L = ['fooo', 'barr', 'wing', 'ding', 'wing']
S = 'lingmindraboofooowingdingbarrwingmonkeypoundcake'
# L1 stores distinct stings in L
# multiples counts the corresponding multiples
L1 = []
multiples = []
for str in L:
if str not in L1:
L1.append(str)
multiples.append(L.count(str))
# k is the lenght of each substring
k = len(L[0])
match = [0]*len(S)
checklist = []
for i in range(0, len(L1)):
mul = multiples[i]
list_pos = []
# find all possible positions matching L1[i]
pos = 0; pos = S.find(L1[i], pos, len(S))
while pos != -1:
list_pos.append(pos)
pos = S.find(L1[i], pos+1, len(S))

for j in range(0, len(list_pos)):
# only count those without too many occurrence
if (j+mul >= len(list_pos)
or list_pos[j] + k*len(L) < list_pos[j+mul]):
match[list_pos[j]] = 1
checklist.append(list_pos[j])
for i in range(0, len(checklist)):
pos = checklist[i]
j = 0
while j < len(L):
if match[pos+j*k] == 0:
break
j += 1
if j == len(L):
print S[checklist[i]:checklist[i]+k*len(L)]
break

相关主题
● 问个简单的问题...	● share int2roman and roman2int java version
● Yahoo Platform组面经	● 4sum o(n^2)超时
● 新鲜电面	● 关于java synchronized statement和static method or variable
进入JobHunting版参与讨论

h*****g
发帖数: 312

p*****2
发帖数: 21240

trie的话写起来也还好。实现add, find两个函数应该就可以了。
这样就是N*L的复杂度了。

p*****2
发帖数: 21240

c*****e
发帖数: 3226

不觉得你需要trie, hashtable is enough..

【在 p*****2 的大作中提到】

: 写了一个练练手。
: import java.util.*;
: class Node
: {
: Node[] children=new Node[26];
:
: public Node Set(char c)
: {
: Node node=new Node();
: children[c-'a']=node;

p*****2
发帖数: 21240

没错。

【在 c*****e 的大作中提到】

: 不觉得你需要trie, hashtable is enough..

H***e
发帖数: 476

这个find是不是有点问题？只要是前缀就return true了吧？

【在 p*****2 的大作中提到】

: 写了一个练练手。
: import java.util.*;
: class Node
: {
: Node[] children=new Node[26];
:
: public Node Set(char c)
: {
: Node node=new Node();
: children[c-'a']=node;

p*****2
发帖数: 21240

他的题目说的是单词的长度一致，所以应该一定match.

【在 H***e 的大作中提到】

: 这个find是不是有点问题？只要是前缀就return true了吧？

b***u
发帖数: 12010

嗯，trie的特性没什么帮助。

【在 c*****e 的大作中提到】

: 不觉得你需要trie, hashtable is enough..

a*****n
发帖数: 158

q********c
发帖数: 1774

Isn't it similar to strstr? Is it a phone or onsite question?

相关主题
● 问一下OJ的Anagrams那道题	● 贡献一个中型软件公司面经
● 发个evernote的code challenge	● 问一道string match的题目出自glassdoor facebook版
● lengthOfLongestSubstring 最后一个test case总是超时	● 问道题
进入JobHunting版参与讨论

a*****g
发帖数: 13

g*********e
发帖数: 14401

如果某个字符串在table 里找到了重复的，但这时其他串还没全找到，怎么办？

【在 a*****n 的大作中提到】

A**u
发帖数: 2458

你这不对
差多了

【在 a*****g 的大作中提到】

j****b
发帖数: 108

h********w
发帖数: 221

mark

s*******e
发帖数: 35

a*****s
发帖数: 1121

mark

l**********e
发帖数: 147

mark

a*****g
发帖数: 13

I will really appreciate if you could elaborate more.

【在 A**u 的大作中提到】

: 你这不对
: 差多了

k*****y
发帖数: 744

相关主题
● 问道题	● 电面不好，求bless。这题怎么答？
● Amazon常见设计题——设计电话簿求解	● 面试中遇到suffix tree / trie这种题，需要自己实现吗？
● 上午偷闲把TopKFrequentWords写出来了	● 这个题目难度在leetcode里算什么
进入JobHunting版参与讨论

s*******f
发帖数: 1114

//码遍本版
//inside for LOOP:
//1st round: check (ling)(mind)(rabo)(ofooowingdingbarrwingmonkeypoundcake
//2nd round: check l(ingm)(indr)(aboo)(fooowingdingbarrwingmonkeypoundcake
//...
//this can avoid double check, and, O(n) time, for slice window
string SubStrWordsGreat(const char *str, const vector &vs){
map ms;
int len = strlen(str);
const char *end = str + len;
int lensub = 0;
if (vs.size() == 0)
return "";
int sz = vs[0].size();
for (vector::const_iterator it = vs.begin(); it != vs.end(); ++
it){
++ms[*it];
lensub += it->size();
}
for (int i = 0; i < sz; ++i){
const char *left = str + i;
const char *right = left;
int count = 0;
while (end - left >= lensub){
string s = string(right, right + sz);
if (ms.find(s) == ms.end()){
//recover map
for (; left < right; left += sz){
string ss = string(left, left + sz);
++ms[ss];
--count;
}
left += sz;
right = left;
} else if (ms[s] == 0){
for (; left < right; left += sz){
string ss = string(left, left + sz);
if (ss == s){
left += sz;
break;
} else {
++ms[ss];
--count;
}
}
} else {
--ms[s];
if (++count == vs.size()){
return string(left, right + sz);
}
right += sz;
}
}
}
return "";
}
void TestSubStrWords(){
vector vs;
vs.push_back("fooo");
vs.push_back("barr");
vs.push_back("wing");
vs.push_back("ding");
vs.push_back("wing");
string a = SubStrWordsGreat("
lingmindraboofooowingdingbarrwingmonkeypoundcake", vs);
}

of

【在 h*****g 的大作中提到】

: Given a list of words, L, that are all the same length, and a string, S,
: find the starting position of the substring of S that is a concatenation of
: each word in L exactly once and without any intervening characters. This
: substring will occur exactly once in S..
: .
: Example:.
: L: "fooo", "barr", "wing", "ding", "wing".
: S: "lingmindraboofooowingdingbarrwingmonkeypoundcake".
: fooowingdingbarrwing.
: trie的话，现场写程序不方便吧？

d**u
发帖数: 1065

这题用hashtable的话算是比较平庸的解法，应该达不到facebook或google的要求。正
确的做法是用单词树作为查找容器。

S******t
发帖数: 151

It is easy to come up with an $O(SL)$ algorithm for this problem but what is
good about this problem is that it actually has a linear time solution:
Let’s first make some notations clear:
1) |S| -> The length of S.
2) |L| -> The number of words in the list $L$.
3) $len_L$ -> The length of each string in $L$.
First we use an Aho-Corasick automaton to find out the occurrences of all
patterns. The time complexity of this step is $O(|S| + |L| * len_L + z)$
where $z$ is the number of output matches. However, since in this problem,
each index in the text string can correspond to at most one match. Therefore
, z < |S| and the complexity is therefore bounded by $O(|S| + |L| * len_L)$.
In the second step, we’ve now had an array $f$, where f[i] indicates
whether we can match a string in $L$ with substring $S[i..i+len_L-1]$. If
there is a match, f[i] = #the index of matched string in L$, otherwise f[i]
= -1.
Next, we divide this array into $len_L$ subarrays, where each subarray is of
the form: {f[i], f[i+len_L], f[i+2*len_L], …, f[i+k*len_L]}. So in each
array, what we are looking for is in fact a contiguous subarray of length |L
| which is in fact a permutation of numbers from 1 to L. We can use a deque
to achieve this aim. We first push_back the elements in each array till we
find a conflict. (i.e. the element we are going to push has already existed
in the deque). When this happens, we pop_front until the conflicted element
has been popped out of the deque. This algorithm continues till the number
of elements in the deque reaches |L|. (Yes, we do have to handle -1 cases
separately but that is quite easy.) The complexity to handle each subarray
is linear and the overall complexity is still linear.
Therefore the time complexity of the algorithm is dominated by the first
step, which is $O(|S| + |L| * len_L)$ and since |L| * len_L < |S|, the
algorithm is indeed $O(S)$.
Now let’s consider the case where there might be repetitive strings in L.
We first count the numbers of unique strings in $L$ and for each unique
string, we count the actual appearance of that string in $L$.
Then the Aho-Corasick part is still the same and we can guarantee the time
complexity is still bounded linearly.
The only difference is in the deque part where we change the conflicting
part a bit from whether that $f[i]$ already exists in deque to whether $f[i]
< cnt[f[i]]$ in deque.
And that still makes our algorithm linear. Problem solved.

h******8
发帖数: 55

Can you elaborate it? Thanks!

is
Therefore

【在 S******t 的大作中提到】

: It is easy to come up with an $O(SL)$ algorithm for this problem but what is
: good about this problem is that it actually has a linear time solution:
: Let’s first make some notations clear:
: 1) |S| -> The length of S.
: 2) |L| -> The number of words in the list $L$.
: 3) $len_L$ -> The length of each string in $L$.
: First we use an Aho-Corasick automaton to find out the occurrences of all
: patterns. The time complexity of this step is $O(|S| + |L| * len_L + z)$
: where $z$ is the number of output matches. However, since in this problem,
: each index in the text string can correspond to at most one match. Therefore

(共1页)

进入JobHunting版参与讨论

相关主题
● 4sum o(n^2)超时	● Amazon常见设计题——设计电话簿求解
● 关于java synchronized statement和static method or variable	● 上午偷闲把TopKFrequentWords写出来了
● 问一下OJ的Anagrams那道题	● 电面不好，求bless。这题怎么答？
● 发个evernote的code challenge	● 面试中遇到suffix tree / trie这种题，需要自己实现吗？
● lengthOfLongestSubstring 最后一个test case总是超时	● 这个题目难度在leetcode里算什么
● 贡献一个中型软件公司面经	● 贴一下我google第一轮店面的题目
● 问一道string match的题目出自glassdoor facebook版	● 问几道较难的字符串题
● 问道题	● 问个简单的问题...

相关话题的讨论汇总
话题: string话题: len话题: integer话题: node话题: hashmap

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

boards

未名新帖统计// 7月16日

历史上的今天