Lucene RangeQuery doesn't filter appropriately
Question
I'm using RangeQuery to get all the documents whose amount is between, say, 0 and 2. When I execute the query, Lucene also returns documents whose amount is greater than 2. What am I missing here?
Here is my code:
Term lowerTerm = new Term("amount", minAmount);
Term upperTerm = new Term("amount", maxAmount);
RangeQuery amountQuery = new RangeQuery(lowerTerm, upperTerm, true);
finalQuery.Add(amountQuery, BooleanClause.Occur.MUST);
Here is what I put in my index:
doc.Add(new Field("amount", amount.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
Recommended answer
UPDATE: As @basZero said in his comment, starting with Lucene 2.9 you can add numeric fields to your documents. Just remember to use NumericRangeQuery instead of RangeQuery when you search.
Lucene treats numbers as words, so they are ordered alphabetically:
0
1
12
123
2
22
That means that, for Lucene, 12 is between 0 and 2. To get a proper numerical range, you need to index the numbers zero-padded, then do a range search of [0000 TO 0002]. (The amount of padding you need depends on the expected range of values.)
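The effect can be reproduced without Lucene. The following is a minimal sketch that mimics term ordering with plain string comparison; the class `PaddedAmounts` and its methods are hypothetical helpers invented for illustration, not part of any Lucene API, and the width of 4 is an assumption matching the `[0000 TO 0002]` example.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
class PaddedAmounts {

    // Left-pads a non-negative integer with zeros to a fixed width, e.g. 2 -> "0002".
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        String padded = String.format(Locale.ROOT, "%0" + width + "d", value);
        if (padded.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        return padded;
    }

    // Inclusive range test using string comparison, which is how
    // string-valued terms are compared in a range over text.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Unpadded: "12" falls inside ["0", "2"] because '1' < '2'.
        System.out.println(inRange("12", "0", "2"));                      // true
        // Padded: "0012" falls outside ["0000", "0002"].
        System.out.println(inRange(pad(12, 4), pad(0, 4), pad(2, 4)));    // false
    }
}
```

The same padding has to be applied both when indexing the value and when building the lower and upper terms of the range.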
If you have negative numbers, just add another zero for non-negative numbers. (WRONG WRONG WRONG. See update.)
If your numbers include a fractional part, leave it as is and zero-pad the integer part only.
Example (struck out in the original answer; see the update for negative numbers):
-00002.12
-00001
000000
000001
000003.1415
000022
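For non-negative values, the "pad the integer part only" rule can be sketched as follows. `DecimalPadding` and `padIntegerPart` are hypothetical names for illustration, and the width of 6 is an assumption taken from the example list above.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Lucene.
class DecimalPadding {

    // Zero-pads the integer part of a non-negative decimal string and
    // leaves the fractional part untouched, e.g. "3.1415" -> "000003.1415".
    static String padIntegerPart(String number, int width) {
        if (number.startsWith("-")) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        int dot = number.indexOf('.');
        String intPart = dot < 0 ? number : number.substring(0, dot);
        String fraction = dot < 0 ? "" : number.substring(dot);
        if (intPart.length() > width) {
            throw new IllegalArgumentException("integer part does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = intPart.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(intPart).append(fraction).toString();
    }

    public static void main(String[] args) {
        String[] terms = {
            padIntegerPart("22", 6),
            padIntegerPart("3.1415", 6),
            padIntegerPart("1", 6),
            padIntegerPart("0", 6)
        };
        // Plain string sort now agrees with numeric order.
        Arrays.sort(terms);
        System.out.println(Arrays.toString(terms));
        // [000000, 000001, 000003.1415, 000022]
    }
}
```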
UPDATE: Negative numbers are a bit tricky, since -1 comes before -2 alphabetically. This article gives a complete explanation of dealing with negative numbers, and numbers in general, in Lucene. Basically, you have to "encode" numbers into something that makes the order of the items make sense.
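One possible encoding of this kind, shown here only as an illustrative sketch, flips the sign bit of a 32-bit integer and writes the result as eight hexadecimal digits, so that string order matches numeric order across negative and positive values. This is not necessarily the scheme described in the article, and `SortableInt` is a hypothetical class that is not part of Lucene.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
class SortableInt {

    // Flipping the sign bit maps Integer.MIN_VALUE..Integer.MAX_VALUE onto
    // 0x00000000..0xffffffff in the same order; fixed-width hex keeps that
    // order under string comparison.
    static String encode(int value) {
        return String.format(Locale.ROOT, "%08x", value ^ 0x80000000);
    }

    // Reverses encode(): parse the unsigned hex digits, then flip the sign bit back.
    static int decode(String term) {
        return Integer.parseUnsignedInt(term, 16) ^ 0x80000000;
    }

    public static void main(String[] args) {
        System.out.println(encode(-2)); // 7ffffffe
        System.out.println(encode(-1)); // 7fffffff
        System.out.println(encode(0));  // 80000000
        System.out.println(encode(2));  // 80000002
    }
}
```

With an encoding like this, both the indexed value and the range bounds go through `encode`, and results are turned back into numbers with `decode`.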