ElasticSearch returning only documents with distinct value
Problem description
Let's say I have the following data:
{
"name" : "ABC",
"favorite_cars" : [ "ferrari","toyota" ]
}, {
"name" : "ABC",
"favorite_cars" : [ "ferrari","toyota" ]
}, {
"name" : "GEORGE",
"favorite_cars" : [ "honda","Hyundae" ]
}
Whenever I query this data for people whose favorite car is toyota, it returns this:
{
"name" : "ABC",
"favorite_cars" : [ "ferrari","toyota" ]
}, {
"name" : "ABC",
"favorite_cars" : [ "ferrari","toyota" ]
}
The result is two records with the name ABC. How do I select distinct documents only? The result I want is just this:
{
"name" : "ABC",
"favorite_cars" : [ "ferrari","toyota" ]
}
This is my query:
{
"fuzzy_like_this_field" : {
"favorite_cars" : {
"like_text" : "toyota",
"max_query_terms" : 12
}
}
}
I am using ElasticSearch 1.0.0 with the Java API client.
Recommended answer
You can eliminate duplicates using aggregations. With a terms aggregation the results are grouped by one field, e.g. name. The aggregation also provides a count of the occurrences of each value of that field, and sorts the groups by this count (descending).
{
"query": {
"fuzzy_like_this_field": {
"favorite_cars": {
"like_text": "toyota",
"max_query_terms": 12
}
}
},
"aggs": {
"grouped_by_name": {
"terms": {
"field": "name",
"size": 0
}
}
}
}
In addition to the hits, the result will also contain the buckets, with the unique values in key and the counts in doc_count:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.19178301,
"hits" : [ {
"_index" : "pru",
"_type" : "pru",
"_id" : "vGkoVV5cR8SN3lvbWzLaFQ",
"_score" : 0.19178301,
"_source":{"name":"ABC","favorite_cars":["ferrari","toyota"]}
}, {
"_index" : "pru",
"_type" : "pru",
"_id" : "IdEbAcI6TM6oCVxCI_3fug",
"_score" : 0.19178301,
"_source":{"name":"ABC","favorite_cars":["ferrari","toyota"]}
} ]
},
"aggregations" : {
"grouped_by_name" : {
"buckets" : [ {
"key" : "abc",
"doc_count" : 2
} ]
}
}
}
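The aggregation returns the distinct names and their counts, but the hits array in the response above still contains both ABC documents. If the goal is one document per name, one option is to collapse the hits on the client side. The following is a minimal sketch in plain Java; it is not part of the original answer and does not use the Elasticsearch client. The `Person` class and the `distinctByName` and `countByName` helpers are hypothetical names introduced for illustration, and the sketch assumes the hits have already been mapped to simple objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DistinctByName {

    /** Hypothetical stand-in for the _source of one search hit. */
    static final class Person {
        final String name;
        final List<String> favoriteCars;

        Person(String name, List<String> favoriteCars) {
            this.name = name;
            this.favoriteCars = favoriteCars;
        }
    }

    /** Keeps the first hit seen for each name, preserving the hit order. */
    static List<Person> distinctByName(List<Person> hits) {
        Map<String, Person> firstPerName = new LinkedHashMap<>();
        for (Person hit : hits) {
            firstPerName.putIfAbsent(hit.name, hit);
        }
        return new ArrayList<>(firstPerName.values());
    }

    /** Counts hits per name, analogous to key and doc_count in the buckets. */
    static Map<String, Integer> countByName(List<Person> hits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Person hit : hits) {
            counts.merge(hit.name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The two hits returned for the "toyota" query in the question.
        List<Person> hits = Arrays.asList(
            new Person("ABC", Arrays.asList("ferrari", "toyota")),
            new Person("ABC", Arrays.asList("ferrari", "toyota")));

        List<Person> distinct = distinctByName(hits);
        System.out.println(distinct.size() + " distinct: " + distinct.get(0).name);
        System.out.println(countByName(hits));
    }
}
```

This only deduplicates the hits that were actually fetched, so it suits small result sets. Note also that it compares the original name ("ABC"), whereas the bucket key in the response above is the indexed term ("abc").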
Note that using aggregations can be costly because of duplicate elimination and result sorting.