由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
JobHunting版 - 问个算法题3
相关主题
Is this a DP problem?谁给解释解释这题?
smallest snippet 问题你如何给一个百万页的书建立index?
list of words找两个没有相同字母的string S和T并且使得S.length()*T.length()最大面经
关于 unique paths,总是过不了 OJ, 请牛牛们帮忙看看~~~先谢过。。。rocket fuel 面试题
一道微软面试题求教一个F家的 design题目
问道老题some experience to share (8) --- final words
问个word排版问题google 电话面试题
longest word made of other words简历用哪种文件格式: Word, PDF, HTML 还是纯文本? /硅谷猎头
相关话题的讨论汇总
话题: word话题: index话题: words话题: count话题: document
进入JobHunting版参与讨论
1 (共1页)
F**********r
发帖数: 237
1
版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的
。。。。
Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know
the inverted lists of all K words, that is, for each word, you have a list
of all its occurrrences).
g***y
发帖数: 764
2
inverted list:
A: 3, 5, 99, 21
B: 4, 23
C: 9, 20, 35
D: 1, 6, 24
Merge sort all list to produce to below sorted array:
1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A
the problem is find the smallest sub array which contains all four letters (
A, B, C, D)
It can be solved by O(n) where n is the length of the array by idea of
dynamiac progeamming and also by looking up the original inverted list:
Starting from index 0, end index 3 (we know the min lenght of the subarry is
4),
scan the four items, bookkeeping the occruences:
for eaxmple: when start index is 2, window now is 4B 5A 6D, 9C, next step is
start index = start index + 1= 3, and find next available B which is 23,
and length is 23-5, bookkeep the 5A 6D, 9C, 20C, 21A, 23B, and then slide
windows to right (by start index = start index + 1)

【在 F**********r 的大作中提到】
: 版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的
: 。。。。
: Given a document and a query of K words, how do u find the smallest window
: that covers all the words at least once in that document? (given you know
: the inverted lists of all K words, that is, for each word, you have a list
: of all its occurrrences).

F**********r
发帖数: 237
3
不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢!

(

【在 g***y 的大作中提到】
: inverted list:
: A: 3, 5, 99, 21
: B: 4, 23
: C: 9, 20, 35
: D: 1, 6, 24
: Merge sort all list to produce to below sorted array:
: 1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A
: the problem is find the smallest sub array which contains all four letters (
: A, B, C, D)
: It can be solved by O(n) where n is the length of the array by idea of

g***y
发帖数: 764
4
每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿
然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以
供下一步用。
key observation是 :这个sliding windows肯定是那个sorted arry的sub array

谢!

【在 F**********r 的大作中提到】
: 不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢!
:
: (

F**********r
发帖数: 237
5
能给个pesudo code或是个链接么? 每一步收集的sliding window对下一步有什么用呢?

【在 g***y 的大作中提到】
: 每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿
: 然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以
: 供下一步用。
: key observation是 :这个sliding windows肯定是那个sorted arry的sub array
:
: 谢!

g**********r
发帖数: 27
6
An ugly pesudo code
MinSlidingWindow(document){
number_of_words=0;
count_word={};
for(word in document){
if(word not in count_word){
count_word[word]=1;
number_of_words++;
}
}
best_interval=[1,n];
min_index=max_index=0;
count_word={};
current_words=1;
for(word in document){
if(current_words==number_of_words){
//all words are in count_word, find one candinate
if(better([min_index,max_index],best_interval){
best_interval = [min_index, max_index];
}
}
if(count_word[word]==0){
//the word never appear, add it to count_word
current_words++;
count_word[word]++;
}else{
//the word appeared before, try to delete it from front
while(count_word[document[min_index]]>1){
count_word[document[min_index]]--;
min_index++;
}
}
max_index++;
}
}
F**********r
发帖数: 237
7
这两天光顾着关注地震形势了。。。写得很明白,多谢多谢啊!

【在 g**********r 的大作中提到】
: An ugly pesudo code
: MinSlidingWindow(document){
: number_of_words=0;
: count_word={};
: for(word in document){
: if(word not in count_word){
: count_word[word]=1;
: number_of_words++;
: }
: }

P********l
发帖数: 452
1 (共1页)
进入JobHunting版参与讨论
相关主题
简历用哪种文件格式: Word, PDF, HTML 还是纯文本? /硅谷猎头一道微软面试题
我也攒人品问道老题
如何用WORD编辑下标,上标,倒数等?问个word排版问题
monster上面的简历格式longest word made of other words
Is this a DP problem?谁给解释解释这题?
smallest snippet 问题你如何给一个百万页的书建立index?
list of words找两个没有相同字母的string S和T并且使得S.length()*T.length()最大面经
关于 unique paths,总是过不了 OJ, 请牛牛们帮忙看看~~~先谢过。。。rocket fuel 面试题
相关话题的讨论汇总
话题: word话题: index话题: words话题: count话题: document