Elasticsearch notes

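Analyzers: "standard", "ik_max_word", "ik_smart"

  1. The standard analyzer splits Chinese text into individual characters; ik_max_word emits every word it can find in the text, including overlapping ones; ik_smart emits only the single segmentation it judges most likely.
# standard analyzer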
GET /_analyze
{
  "analyzer": "standard", 
  "text": "练习分词"
}

# Analysis result
{
  "tokens": [
    {
      "token": "练",
      "start_offset": 0,
      "end_offset": 1,
      "type": "<IDEOGRAPHIC>",
      "position": 0
    },
    {
      "token": "习",
      "start_offset": 1,
      "end_offset": 2,
      "type": "<IDEOGRAPHIC>",
      "position": 1
    },
    {
      "token": "分",
      "start_offset": 2,
      "end_offset": 3,
      "type": "<IDEOGRAPHIC>",
      "position": 2
    },
    {
      "token": "词",
      "start_offset": 3,
      "end_offset": 4,
      "type": "<IDEOGRAPHIC>",
      "position": 3
    }
  ]
}

**************************************************************************
# ik_max_word analyzer
GET /_analyze
{
  "analyzer": "ik_max_word", 
  "text": "练习分词"
}
# Analysis result
{
  "tokens": [
    {
      "token": "练习",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "分词",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "词",
      "start_offset": 3,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 2
    }
  ]
}
******************************************************************************************
# ik_smart analyzer
GET /_analyze 
{
  "analyzer": "ik_smart", 
  "text": "练习分词"
}

# Analysis result
{
  "tokens": [
    {
      "token": "练习",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "分词",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}
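
Which of these token sets ends up in the index depends on the analyzer configured for the field in the mapping. A minimal sketch, assuming the typed mapping syntax of Elasticsearch 5.x/6.x (which the adu/adu2 requests in these notes suggest), an installed IK plugin, and an index that does not exist yet; the choice of ik_smart here is only an illustration:

```
# Create the index with an explicit analyzer on the title field
PUT adu
{
  "mappings": {
    "adu2": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart"
        }
      }
    }
  }
}
```

The analyzer of an existing field cannot be changed in place, so it has to be chosen when the field is first mapped. On Elasticsearch 7 and later, mapping types are gone and the "adu2" level would be dropped.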
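  2. A term query does not analyze its input; it matches only when the query string is exactly one of the tokens stored in the index. For example, if the field was analyzed with ik_smart into "练习" and "分词", and you run the following query: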
GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
  "title": "练"
}}}}}

it returns no hits:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

Likewise, if the field was analyzed with standard into "练", "习", "分", "词", and your query is:

GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
  "title": "练习"
}}}}}

it also returns no hits:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

# You can only query by a token the analyzer actually produced
GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
  "title": "练"
}}}}}
# Only then is there a hit
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": [
      {
        "_index": "adu",
        "_type": "adu2",
        "_id": "1",
        "_score": 0,
        "_source": {
          "title": "练习分词",
          "city": "北京",
          "time": 1
        }
      }
    ]
  }
}
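
When a term query unexpectedly returns nothing, one way to see which tokens are actually queryable is to run _analyze against the index with the field parameter, so that the analyzer configured for that field is used. A sketch, reusing the adu index and title field from the queries above:

```
# Show the tokens that the title field's own analyzer produces
GET adu/_analyze
{
  "field": "title",
  "text": "练习分词"
}
```

Any token in the response can be used as the value of a term query on title; a string that is not in the list will not match.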

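  3. match_phrase, by contrast, analyzes the query text first and then matches the resulting terms in order, so it is a better fit for searching analyzed text fields.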
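
As a sketch of the match_phrase alternative, the term query for "练习" that found nothing under the standard analyzer can be rewritten as follows, assuming the same adu/adu2 index and that the field uses the same analyzer at index time and at search time:

```
# match_phrase runs "练习" through the field's analyzer before matching
GET adu/adu2/_search
{
  "query": {
    "match_phrase": {
      "title": "练习"
    }
  }
}
```

With standard, the query text becomes "练" followed by "习" in adjacent positions; with ik_smart it becomes the single token "练习". In both cases the document titled "练习分词" should match, because the query is split the same way the indexed text was.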
