S******y (post count: 1123) | 1
I have 100,000+ text files; their total size is about 30 GB.
I would like to pre-index those files so that I can search them for a set of keywords.
For example, if I type "cat" + "dog", the Python program would return snippets of text from those files (much like a Google search), sorted by the distance between the two words.
Is there a smart algorithm to do that?
I am thinking:
- for 'cat', search all files, and record which file and which position the word appears at;
- for 'dog', search all files, and record the same.

D******n (post count: 2836) | 2
This is better asked on the CS board; it is an information retrieval (IR) problem.
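The plan sketched in the original post amounts to building a positional inverted index: for each word, record which files it appears in and at which token positions, then rank files containing both query words by the minimum gap between their positions. A minimal sketch in Python, assuming plain-text files under one root directory and simple lowercase tokenization (`build_index`, `min_distance`, and `search` are illustrative names, not from the thread):

```python
import os
import re
from collections import defaultdict

def build_index(root):
    """Positional inverted index: word -> {file path: [token positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                tokens = re.findall(r"[a-z0-9]+", f.read().lower())
            for pos, word in enumerate(tokens):
                index[word][path].append(pos)
    return index

def min_distance(pos_a, pos_b):
    """Smallest gap between two sorted position lists (merge scan, O(m+n))."""
    best, i, j = float("inf"), 0, 0
    while i < len(pos_a) and j < len(pos_b):
        best = min(best, abs(pos_a[i] - pos_b[j]))
        if pos_a[i] < pos_b[j]:
            i += 1
        else:
            j += 1
    return best

def search(index, w1, w2):
    """Files containing both words, sorted by minimum word distance."""
    hits = []
    for path in index[w1].keys() & index[w2].keys():
        hits.append((min_distance(index[w1][path], index[w2][path]), path))
    return sorted(hits)
```

The index is built once and reused for every query, which is the whole point of pre-indexing: a query then touches only the two posting lists instead of rescanning 30 GB of text.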
z**k (post count: 378) | 3
A trie; that is the most basic structure for this.
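The trie this reply suggests would hold the keyword vocabulary for fast exact or prefix lookup during tokenization. A minimal illustrative sketch (for a fixed keyword set, a plain Python `set` is usually simpler; the trie pays off for prefix queries):

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        """True only if the exact word was inserted."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word
```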