Another Lucene question - Java board - Weiming (未名) archive


t*g Posts: 1758	1 I need to dump the tokens out of a Lucene index, possibly a lot of data. How do I do that? Should I use Term? I'm a newbie. Thanks!
t*******e Posts: 684	2 Depending on how the index files were created in the first place, Lucene may store a full copy of the original text that was indexed, so that you can restore the text from the query results. Otherwise, you only get other fields, such as IDs, from the Hit Documents.
t*g Posts: 1758	3 (In reply to t*******e, post 2.) We did store the original text, and I have no problem dumping it; I can get it through the Hit Documents. What I need to dump is the tokenized text, which doesn't exist in the Hit Documents. It looks like I need to go into the indices to get the tokenized documents, but I'm new to Lucene and can't find a way to do it. Need help! Thx.
t*******e Posts: 684	4 (In reply to t*g, post 3.) This is impossible. The inverted index in a search engine stores terms (tokens) in a term index file as the search key, which maps to Document IDs and returns the matched Documents as the query results, but not the other way around. The terms you specified in your query are the tokens you may use to highlight the original text.
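Editor's note: to make the direction of the mapping in post 4 concrete, here is a minimal sketch of a toy inverted index in plain Java. It is not Lucene's API; the class `ToyInvertedIndex`, its methods, and the naive tokenizer are all hypothetical names made up for illustration. It shows the term-to-document lookup described above, and also that, at least in this toy model, a document's terms can still be collected by walking the whole term dictionary, at the cost of a full scan and of losing token order and repeats. Whether and how a real Lucene index exposes the same thing depends on the version and on how the fields were indexed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index for illustration only; NOT Lucene's API (all names are hypothetical).
class ToyInvertedIndex {
    // term -> sorted IDs of the documents that contain it
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Naive analysis: lowercase, split on anything that is not a-z or 0-9.
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) continue; // split can yield a leading empty string
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Search direction: term -> matching document IDs.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    // Dump direction: scan every term and keep those whose postings contain docId.
    // The result is in dictionary order, without repeats, not in original text order.
    List<String> termsOf(int docId) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, SortedSet<Integer>> e : postings.entrySet()) {
            if (e.getValue().contains(docId)) terms.add(e.getKey());
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Lucene stores terms");
        index.add(2, "Terms map to documents");
        System.out.println(index.search("terms")); // prints [1, 2]
        System.out.println(index.termsOf(2));      // prints [documents, map, terms, to]
    }
}
```

Note that `termsOf` returns a sorted set of distinct terms, so the original token sequence is not recoverable from this structure alone.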
b******y Posts: 9224	5 You will need to store the terms in the Lucene index. But I don't see why you want to do that.
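Editor's note: one way to read the suggestion in post 5 is to keep each document's token list at indexing time, so it can be read back later by document ID. The sketch below does that in plain Java under stated assumptions: `ToyTokenStore`, `tokenize`, and `tokensOf` are hypothetical names, the tokenizer is a naive stand-in for a real analyzer, and none of it is Lucene's API. As far as I know, Lucene's own counterpart to this idea is storing term vectors for a field; since the original text here was stored, re-running the same analyzer over that text is another option worth checking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy per-document token store for illustration only; NOT Lucene's API (names are hypothetical).
class ToyTokenStore {
    // docId -> tokens in original order, captured when the document is added
    private final Map<Integer, List<String>> tokensByDoc = new HashMap<>();

    // Naive stand-in for an analyzer: lowercase, split on anything that is not a-z or 0-9.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Store the tokenized form alongside whatever else is indexed for this document.
    void add(int docId, String text) {
        tokensByDoc.put(docId, tokenize(text));
    }

    // Read the tokenized document back by ID; empty list if the ID is unknown.
    List<String> tokensOf(int docId) {
        return tokensByDoc.getOrDefault(docId, Collections.emptyList());
    }

    public static void main(String[] args) {
        ToyTokenStore store = new ToyTokenStore();
        store.add(7, "Need help! Need it now.");
        System.out.println(store.tokensOf(7)); // prints [need, help, need, it, now]
    }
}
```

Unlike a scan of the term dictionary, this keeps token order and repeats, at the price of extra storage per document.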
