t*g Posts: 1758 | 1 I need to dump the tokens out of a Lucene index, possibly for a lot of data. How do I do that? Do I need to use Term?
I'm a newbie. Thanks! | t*******e Posts: 684 | 2 Depending on how the index files were created in the first place, Lucene may
store a full copy of the original text to be indexed, such that you can
restore the text from the query results. Otherwise, you only get other
fields like IDs from the Hit Documents. | t*g Posts: 1758 | 3 We did store the original text. I have no problem dumping the
original text; I can get it from the Hit Documents. However, what I
need is to dump the tokenized text, which doesn't exist in the Hit Documents.
It looks like I need to go into the index itself to get the tokenized documents, but I'm
new to Lucene and can't find a way to do it. Need help! Thx.
[Quoting t*******e's post] : Depending on how the index files were created in the first place, Lucene may : store a full copy of the original text to be indexed, such that you can : restore the text from the query results. Otherwise, you only get other : fields like IDs from the Hit Documents.
| t*******e Posts: 684 | 4
This is impossible. The inverted index in a search engine stores terms
(tokens) in a term index file as the search keys, which map to document IDs,
and returns the matched Documents as the query results, but not the other way around.
The terms you specified in your query are the tokens you can use to highlight
the original text.
[Quoting t*g's post] : We did store the original text. I have no problem dumping the : original text; I can get it from the Hit Documents. However, what I : need is to dump the tokenized text, which doesn't exist in the Hit Documents. : It looks like I need to go into the index itself to get the tokenized documents, but I'm : new to Lucene and can't find a way to do it. Need help! Thx.
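To make the "term to document IDs, not the other way around" point concrete, here is a minimal sketch of a toy inverted index in plain Java. It is not Lucene code: the class `TinyInvertedIndex`, its methods, and the lowercase/split tokenizer are all hypothetical names and assumptions made up for illustration. In this toy model the forward lookup is a single map access, while recovering the terms of one document means scanning every term, and the original token order is lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy inverted index (hypothetical, not Lucene): term -> list of document IDs. */
class TinyInvertedIndex {
    // TreeMap keeps the terms sorted, loosely like a sorted term dictionary.
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    /** Assumed, simplistic analyzer: lowercase, split on non-letter/digit runs. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Indexes one document; each docId is expected to be added once. */
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            List<Integer> docs = postings.computeIfAbsent(term, k -> new ArrayList<>());
            // Skip the ID if this term already occurred earlier in the same document.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) docs.add(docId);
        }
    }

    /** The cheap direction: term -> matching document IDs. */
    List<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptyList());
    }

    /** The expensive direction: scan every term to find those occurring in one document. */
    List<String> termsOf(int docId) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            if (e.getValue().contains(docId)) terms.add(e.getKey());
        }
        return terms; // sorted and de-duplicated; original token order is not recoverable here
    }

    public static void main(String[] args) {
        TinyInvertedIndex idx = new TinyInvertedIndex();
        idx.add(0, "Lucene stores terms");
        idx.add(1, "terms map to documents");
        System.out.println(idx.search("terms"));
        System.out.println(idx.termsOf(1));
    }
}
```

The sketch keeps only document IDs per term. An index that also records positions could in principle recover token order, but that would still require visiting every term.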
| b******y Posts: 9224 | 5 You will need to store the terms in the Lucene index. But I don't see why you
would want to do that.
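As a rough illustration of what "storing the terms" could look like, the sketch below keeps each document's token list alongside the postings at index time, so the tokenized text can be dumped later by document ID. It is plain Java, not Lucene's API: `TokenStoringIndex`, `dumpTokens`, `docsFor`, and the whitespace tokenizer are hypothetical. In Lucene itself this would presumably correspond to enabling term vectors on the field when indexing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy index (hypothetical, not Lucene) that also stores each document's tokens. */
class TokenStoringIndex {
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();
    // Per-document token lists, kept in original order: the extra data that must be stored.
    private final Map<Integer, List<String>> tokensByDoc = new HashMap<>();

    /** Assumed analyzer: lowercase and split on whitespace. */
    void add(int docId, String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (t.isEmpty()) continue;
            tokens.add(t);
            postings.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
        tokensByDoc.put(docId, tokens);
    }

    /** Normal search direction: term -> document IDs. */
    SortedSet<Integer> docsFor(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySortedSet());
    }

    /** Dumps the tokenized text of one document, in original order. */
    List<String> dumpTokens(int docId) {
        return tokensByDoc.getOrDefault(docId, Collections.emptyList());
    }

    public static void main(String[] args) {
        TokenStoringIndex idx = new TokenStoringIndex();
        idx.add(7, "Dump the tokenized text");
        System.out.println(idx.dumpTokens(7));
    }
}
```

The trade-off is index size: the tokens are stored twice, once in the postings and once per document.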