F**********r 发帖数: 237 | 1 版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的
。。。。
Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know
the inverted lists of all K words, that is, for each word, you have a list
of all its occurrrences). | g***y 发帖数: 764 | 2 inverted list:
A: 3, 5, 99, 21
B: 4, 23
C: 9, 20, 35
D: 1, 6, 24
Merge sort all list to produce to below sorted array:
1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A
the problem is find the smallest sub array which contains all four letters (
A, B, C, D)
It can be solved by O(n) where n is the length of the array by idea of
dynamiac progeamming and also by looking up the original inverted list:
Starting from index 0, end index 3 (we know the min lenght of the subarry is
4),
scan the four items, bookkeeping the occruences:
for eaxmple: when start index is 2, window now is 4B 5A 6D, 9C, next step is
start index = start index + 1= 3, and find next available B which is 23,
and length is 23-5, bookkeep the 5A 6D, 9C, 20C, 21A, 23B, and then slide
windows to right (by start index = start index + 1)
【在 F**********r 的大作中提到】 : 版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的 : 。。。。 : Given a document and a query of K words, how do u find the smallest window : that covers all the words at least once in that document? (given you know : the inverted lists of all K words, that is, for each word, you have a list : of all its occurrrences).
| F**********r 发帖数: 237 | 3 不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢!
(
【在 g***y 的大作中提到】 : inverted list: : A: 3, 5, 99, 21 : B: 4, 23 : C: 9, 20, 35 : D: 1, 6, 24 : Merge sort all list to produce to below sorted array: : 1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A : the problem is find the smallest sub array which contains all four letters ( : A, B, C, D) : It can be solved by O(n) where n is the length of the array by idea of
| g***y 发帖数: 764 | 4 每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿
然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以
供下一步用。
key observation是 :这个sliding windows肯定是那个sorted arry的sub array
谢!
【在 F**********r 的大作中提到】 : 不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢! : : (
| F**********r 发帖数: 237 | 5 能给个pesudo code或是个链接么? 每一步收集的sliding window对下一步有什么用呢?
【在 g***y 的大作中提到】 : 每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿 : 然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以 : 供下一步用。 : key observation是 :这个sliding windows肯定是那个sorted arry的sub array : : 谢!
| g**********r 发帖数: 27 | 6 An ugly pesudo code
MinSlidingWindow(document){
number_of_words=0;
count_word={};
for(word in document){
if(word not in count_word){
count_word[word]=1;
number_of_words++;
}
}
best_interval=[1,n];
min_index=max_index=0;
count_word={};
current_words=1;
for(word in document){
if(current_words==number_of_words){
//all words are in count_word, find one candinate
if(better([min_index,max_index],best_interval){
best_interval = [min_index, max_index];
}
}
if(count_word[word]==0){
//the word never appear, add it to count_word
current_words++;
count_word[word]++;
}else{
//the word appeared before, try to delete it from front
while(count_word[document[min_index]]>1){
count_word[document[min_index]]--;
min_index++;
}
}
max_index++;
}
} | F**********r 发帖数: 237 | 7 这两天光顾着关注地震形势了。。。写得很明白,多谢多谢啊!
【在 g**********r 的大作中提到】 : An ugly pesudo code : MinSlidingWindow(document){ : number_of_words=0; : count_word={}; : for(word in document){ : if(word not in count_word){ : count_word[word]=1; : number_of_words++; : } : }
| P********l 发帖数: 452 | |
|