Elasticsearch: Grouped Statistics on Query Results with Aggregations (group by stats)

Elasticsearch | Published 2021-03-05 / Updated 2021-09-15 04:05

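Preface

When building search on Elasticsearch, you often need to split the results into categories and return only a fixed number of hits from each category to the front end. Search on the Douyu live-streaming platform is one example.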

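Suppose the search results fall into just three groups: related streamers (anchors), related videos, and related uploaders (UP主). Suppose also that the data for all groups is stored in a single index.

Assumed index structure

data: summary data shown in the combined search results
content: the set of search keywords, e.g. "卢本伟,55开,世界第一小鱼人,英雄联盟"
type: category, e.g. streamer, video, uploader
weight: ranking weight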

{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "data": {
        "type": "object"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "type": {
        "type": "keyword"
      },
      "weight": {
        "type": "integer"
      }
    }
  }
}
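
Because the queries in this post sort on weight, the mapped type of that field matters: a keyword field is ordered as strings, while a numeric field such as integer is ordered by value. The following plain-Python sketch (it does not call Elasticsearch, and the weights are made up for illustration) shows how the two orderings differ:

```python
# Hypothetical weights; the values are invented for illustration only.
weights = [9, 10, 100, 25]

# String ordering, as a keyword field would sort: compared character by character.
as_keyword = sorted((str(w) for w in weights), reverse=True)

# Numeric ordering, as an integer field would sort: compared by value.
as_number = sorted(weights, reverse=True)

print(as_keyword)  # ['9', '25', '100', '10']
print(as_number)   # [100, 25, 10, 9]
```

With string ordering, a weight of 9 outranks a weight of 100, which is rarely what a "highest weight first" sort is meant to do.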

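Grouping query results

  1. Group the results by category, sort each group by weight in descending order, show only the top 6 per group, and keep only the data field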
{
  "size": 0,
  "query": {
    "match_phrase": {
      "content": "大神"
    }
  },
  "aggs": {
    "ret": {
      "terms": {
        "field": "type",
        "size": 3
      },
      "aggs": {
        "top": {
          "top_hits": {
            "sort": [
              {
                "weight": {
                  "order": "desc"
                }
              }
            ],
            "_source": [
              "data"
            ],
            "size": 6
          }
        }
      }
    }
  }
}
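
On the application side, a request of this kind can be built and its response flattened with a few lines of code. The sketch below is one possible way to do it in Python, assuming a single terms aggregation named ret with a top_hits sub-aggregation named top. The helper names build_grouped_query and group_hits are hypothetical, and the sample response is hand-written in the shape Elasticsearch returns for terms plus top_hits, not real output.

```python
from typing import Any, Dict, List


def build_grouped_query(keyword: str, groups: int = 3, per_group: int = 6) -> Dict[str, Any]:
    """Build a request body that groups matches by `type` and keeps the top hits per group."""
    return {
        "size": 0,  # skip top-level hits; only the aggregation results are needed
        "query": {"match_phrase": {"content": keyword}},
        "aggs": {
            "ret": {
                "terms": {"field": "type", "size": groups},
                "aggs": {
                    "top": {
                        "top_hits": {
                            "sort": [{"weight": {"order": "desc"}}],
                            "_source": ["data"],
                            "size": per_group,
                        }
                    }
                },
            }
        },
    }


def group_hits(response: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Map each bucket key (the category) to the `data` payloads of its top hits."""
    grouped: Dict[str, List[Any]] = {}
    for bucket in response["aggregations"]["ret"]["buckets"]:
        hits = bucket["top"]["hits"]["hits"]
        grouped[bucket["key"]] = [hit["_source"]["data"] for hit in hits]
    return grouped


# Hand-written sample response for illustration; the documents are made up.
sample_response = {
    "aggregations": {
        "ret": {
            "buckets": [
                {
                    "key": "1",
                    "doc_count": 2,
                    "top": {"hits": {"hits": [
                        {"_source": {"data": {"name": "streamer A"}}},
                        {"_source": {"data": {"name": "streamer B"}}},
                    ]}},
                },
                {
                    "key": "2",
                    "doc_count": 1,
                    "top": {"hits": {"hits": [
                        {"_source": {"data": {"name": "video C"}}},
                    ]}},
                },
            ]
        }
    }
}

grouped = group_hits(sample_response)
```

The front end then receives one list per category, already limited to the requested size, without any further sorting or slicing in application code.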
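  2. If each category needs its own parameters, for example streamers sorted by weight descending and videos sorted by publish time descending, use one filter aggregation per category. The query assumes that type stores the codes "1" (streamer) and "2" (video), and that the index has a create_time field, which the mapping above does not define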
{
  "size": 0,
  "query": {
    "match_phrase": {
      "content": "大神"
    }
  },
  "aggs": {
    "anchor": {
      "filter": {
        "term": {
          "type": "1"
        }
      },
      "aggs": {
        "top": {
          "top_hits": {
            "sort": [
              {
                "weight": {
                  "order": "desc"
                }
              }
            ],
            "_source": [
              "data"
            ],
            "size": 6
          }
        }
      }
    },
    "video": {
      "filter": {
        "term": {
          "type": "2"
        }
      },
      "aggs": {
        "top": {
          "top_hits": {
            "sort": [
              {
                "create_time": {
                  "order": "desc"
                }
              }
            ],
            "_source": [
              "data"
            ],
            "size": 6
          }
        }
      }
    }
  }
}
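
Since the per-category blocks differ only in the type value and the sort field, the request body can also be generated from a small configuration table instead of being written out by hand. The following is a rough sketch in Python under that assumption. CATEGORY_CONFIG and build_per_category_query are hypothetical names, and each category is expressed as a filter aggregation with a top_hits sub-aggregation named top.

```python
from typing import Any, Dict

# Hypothetical per-category settings: the `type` code to filter on and the sort field.
CATEGORY_CONFIG: Dict[str, Dict[str, str]] = {
    "anchor": {"type": "1", "sort_field": "weight"},
    "video": {"type": "2", "sort_field": "create_time"},
}


def build_per_category_query(
    keyword: str,
    config: Dict[str, Dict[str, str]],
    per_group: int = 6,
) -> Dict[str, Any]:
    """Build one filter aggregation per category, each with its own sort field."""
    aggs: Dict[str, Any] = {}
    for name, cfg in config.items():
        aggs[name] = {
            # Restrict this aggregation to documents of a single category.
            "filter": {"term": {"type": cfg["type"]}},
            "aggs": {
                "top": {
                    "top_hits": {
                        "sort": [{cfg["sort_field"]: {"order": "desc"}}],
                        "_source": ["data"],
                        "size": per_group,
                    }
                }
            },
        }
    return {
        "size": 0,
        "query": {"match_phrase": {"content": keyword}},
        "aggs": aggs,
    }


body = build_per_category_query("大神", CATEGORY_CONFIG)
```

Adding a third category, such as uploaders, then only requires one more entry in the configuration table.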

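Conclusion

Elasticsearch is very powerful. Personally, I prefer not to handle in PHP code anything that can be done in Elasticsearch. If you know a simpler or better way to write these queries, please share it.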

I am a full-stack independent development engineer from China. I love to participate in open source and focus on developing for the Web, iOS and Android apps (React Native), desktop applications (Electron), crawlers, back-end services, and system architecture.

