开始使用Elasticsearch （2）

在上一篇文章中，我们已经介绍了如何使用REST接口来在Elasticsearch中创建index，文档以及对它们的操作。在今天的文章里，我们来介绍如何利用Elasticsearch来搜索我们的数据。Elasticsearch是近实时的搜索。我们还是接着我们上次的练习“开始使用Elasticsearch （1）”

在Elasticsearch中的搜索中，有两类搜索：

queries
aggregations

它们之间的区别在于：query可以帮我们进行全文搜索，而aggregation可以帮我们对数据进行统计及分析。我们有时也可以结合query及aggregation一起使用，比如我们可以先对文档进行搜索然后在进行aggregation:

GET blogs/_search
{
  "query": {
    "match": {
      "title": "community"
    }
  },
  "aggregations": {
    "top_authors": {
      "terms": {
        "field": "author"
      }
    }
  }
}

在上面的搜索中，先搜寻在title含有community的文档，然后再对数据进行aggregation。

搜索所有的文档

我们可以使用如下的命令来搜索到所有的文档：

GET /_search

在这里我们没有指定任何index，我们将搜索在该cluster下的所有的index。目前默认的返回个数是10个，除非我们设定size:

GET /_search?size=20

如果我们只想搜索我们特定的index，比如twitter，我们可以这么做：

GET twitter/_search

从上面我们可以看出来，在twitter index里我们有7个文档。在上面的hits数组里，我们可以看到所有的结果。同时，我们也可以看到一个叫做_score的项。它表示我们搜索结果的相关度。这个分数值越高，表明我们搜索匹配的相关度越高。在默认没有sort的情况下，所有搜索的结果的是按照分数由大到小来进行排列的。

在默认的情况下，我们可以得到10个结果。我们可以通过设置size参数得到我们想要的个数。同时，我们可以也配合from来进行page。

GET twitter/_search?size=2&from=2

并且只要两个文档显示。我们可以通过这个方法让我们的文档进行分页显示。

上面的查询类似于DSL查询的如下语句：

GET twitter/_search
{
  "size": 2,
  "from": 2, 
  "query": {
    "match_all": {}
  }
}

source filtering

我们可以通过_source来定义返回想要的字段：

GET twitter/_search
{
  "_source": ["user", "city"],
  "query": {
    "match_all": {
    }
  }
}

返回的结果:

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "city" : "北京",
          "user" : "张三"
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "city" : "北京",
          "user" : "老刘"
        }
      },
      ...
    ]

我们可以看到只有user及city两个字段在_source里返回。我们可以可以通过设置_source为false，这样不返回任何的_source信息：

GET twitter/_search
{
  "_source": false,
  "query": {
    "match": {
      "user": "张三"
    }
  }
}

返回的信息：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 3.0808902
      }
    ]

我们可以看到只有_id及_score等信息返回。其它任何的_source字段都没有被返回。它也可以接收通配符形式的控制，比如：

GET twitter/_search
{
  "_source": {
    "includes": [
      "user*",
      "location*"
    ],
    "excludes": [
      "*.lat"
    ]
  },
  "query": {
    "match_all": {}
  }
}

返回的结果是：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "location" : {
            "lon" : "116.325747"
          },
          "user" : "张三"
        }
      },
      ...
    ]

Script fields

有些时候，我们想要的field可能在_source里根本没有，那么我们可以使用script field来生成这些field。允许为每个匹配返回script evaluation（基于不同的字段），例如：

GET twitter/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "years_to_100": {
      "script": {
        "lang": "painless",
        "source": "100-doc['age'].value"
      }
    },
    "year_of_birth":{
      "script": "2019 - doc['age'].value"
    }
  }
}

返回的结果是：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "years_to_100" : [
            80
          ],
          "year_of_birth" : [
            1999
          ]
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "years_to_100" : [
            70
          ],
          "year_of_birth" : [
            1989
          ]
        }
      },
    ...
  ]

必须注意的是这种使用script的方法来生成查询的结果对于大量的文档来说，可能会占用大量资源。

Count API

我们经常会查询我们的索引里到底有多少文档，那么我们可以使用_count重点来查询：

GET twitter/_count

如果我们想知道满足条件的文档的数量，我们可以采用如下的格式：

GET twitter/_count
{
  "query": {
    "match": {
      "city": "北京"
    }
  }
}

在这里，我们可以得到city为“北京”的所有文档的数量：

{
  "count" : 5,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

修改settings

我们可以通过如下的接口来获得一个index的settings

GET twitter/_settings

从这里我们可以看到我们的twitter index有多少个shards及多少个replicas。我们也可以通过如下的接口来设置：

PUT twitter
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

一旦我们把number_of_shards定下来了，我们就不可以修改了，除非把index删除，并重新index它。这是因为每个文档存储到哪一个shard是和number_of_shards这个数值有关的。一旦这个数值发生改变，那么之后寻找那个文档所在的shard就会不准确。

修改index的mapping

Elasticsearch号称是schemaless，在实际所得应用中，每一个index都有一个相应的mapping。这个mapping在我们生产第一个文档时已经生产。它是对每个输入的字段进行自动的识别从而判断它们的数据类型。我们可以这么理解schemaless：

不需要事先定义一个相应的mapping才可以生产文档。字段类型是动态进行识别的。这和传统的数据库是不一样的
如果有动态加入新的字段，mapping也可以自动进行调整并识别新加入的字段

自动识别字段有一个问题，那就是有的字段可能识别并不精确，比如对于我们例子中的位置信息。那么我们需要对这个字段进行修改。

我们可以通过如下的命令来查询目前的index的mapping:

GET twitter/_mapping

它显示的数据如下：

{
  "twitter" : {
    "mappings" : {
      "properties" : {
        "address" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
          "type" : "long"
        },
        "city" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "country" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "location" : {
          "properties" : {
            "lat" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "lon" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "message" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "province" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "uid" : {
          "type" : "long"
        },
        "user" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

从上面的显示中可以看出来location里的经纬度是一个multi-field的类型。

        "location" : {
          "properties" : {
            "lat" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "lon" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }

这个显然不是我们所需的。正确的类型应该是：geo_point。我们重新修正我们的mapping。

注意：我们不能为已经建立好的index动态修改mapping。这是因为一旦修改，那么之前建立的索引就变成不能搜索的了。一种办法是reindex从而重新建立我们的索引。如果在之前的mapping加入新的字段，那么我们可以不用重新建立索引。

为了能够正确地创建我们的mapping，我们必须先把之前的twitter索引删除掉，并同时使用settings来创建这个index。具体的步骤如下：

DELETE twitter
PUT twitter
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

PUT twitter/_mapping
{
  "properties": {
    "address": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "age": {
      "type": "long"
    },
    "city": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "country": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "location": {
      "type": "geo_point"
    },
    "message": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "province": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "uid": {
      "type": "long"
    },
    "user": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}

重新查看我们的mapping:

GET twitter/_mapping

我们可以看到我们已经创建好了新的mapping。我们再次运行之前我们的bulk接口，并把我们所需要的数据导入到twitter索引中。

POST _bulk
{ "index" : { "_index" : "twitter", "_id": 1} }
{"user":"双榆树-张三","message":"今儿天气不错啊，出去转转去","uid":2,"age":20,"city":"北京","province":"北京","country":"中国","address":"中国北京市海淀区","location":{"lat":"39.970718","lon":"116.325747"}}
{ "index" : { "_index" : "twitter", "_id": 2 }}
{"user":"东城区-老刘","message":"出发，下一站云南！","uid":3,"age":30,"city":"北京","province":"北京","country":"中国","address":"中国北京市东城区台基厂三条3号","location":{"lat":"39.904313","lon":"116.412754"}}
{ "index" : { "_index" : "twitter", "_id": 3} }
{"user":"东城区-李四","message":"happy birthday!","uid":4,"age":30,"city":"北京","province":"北京","country":"中国","address":"中国北京市东城区","location":{"lat":"39.893801","lon":"116.408986"}}
{ "index" : { "_index" : "twitter", "_id": 4} }
{"user":"朝阳区-老贾","message":"123,gogogo","uid":5,"age":35,"city":"北京","province":"北京","country":"中国","address":"中国北京市朝阳区建国门","location":{"lat":"39.718256","lon":"116.367910"}}
{ "index" : { "_index" : "twitter", "_id": 5} }
{"user":"朝阳区-老王","message":"Happy BirthDay My Friend!","uid":6,"age":50,"city":"北京","province":"北京","country":"中国","address":"中国北京市朝阳区国贸","location":{"lat":"39.918256","lon":"116.467910"}}
{ "index" : { "_index" : "twitter", "_id": 6} }
{"user":"虹桥-老吴","message":"好友来了都今天我生日，好友来了,什么 birthday happy 就成!","uid":7,"age":90,"city":"上海","province":"上海","country":"中国","address":"中国上海市闵行区","location":{"lat":"31.175927","lon":"121.383328"}}

至此，我们已经完整地建立了我们所需要的索引。在下面，我们开始使用DSL（Domain Specifc Lanaguage）来帮我们进行查询。

查询数据

在这个章节里，我们来展示一下从我们的ES索引中查询我们所想要的数据。

match query

GET twitter/_search
{
  "query": {
    "match": {
      "city": "北京"
    }
  }
}

从我们查询的结果来看，我们可以看到有5个用户是来自北京的，而且查询出来的结果是按照关联（relavance)来进行排序的。

在很多的情况下，我们也可以使用script query来完成：

GET twitter/_search
{
  "query": {
    "script": {
      "script": {
        "source": "doc['city'].contains(params.name)",
        "lang": "painless",
        "params": {
          "name": "北京"
        }
      }
    }
  }
}

上面的script query和上面的查询是一样的结果，但是我们不建议大家使用这种方法。相比较而言，script query的方法比较低效。

如果我们不需要这个score，我们可以选择filter来完成。

GET twitter/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "city.keyword": "北京"
        }
      }
    }
  }
}

这里我们使用了filter来过滤我们的搜索，显示的结果如下：

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊，出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        }
      },

   ...
}

从返回的结果来看，_score项为0。对于这种搜索，只要yes或no。我们并不关心它们是的相关性。在这里我们使用了city.keyword。对于一些刚接触Elasticsearch的人来说，这个可能比较陌生。正确的理解是city在我们的mapping中是一个multi-field项。它既是text也是keyword类型。对于一个keyword类型的项来说，这个项里面的所有字符都被当做一个字符串。它们在建立文档时，不需要进行index。keyword字段用于精确搜索，aggregation和排序（sorting）。

所以在我们的filter中，我们是使用了term来完成这个查询。

我们也可以使用如下的办法达到同样的效果：

GET twitter/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "city": {
            "value": "北京"
          }
        }
      }
    }
  }
}

在我们使用match query时，默认的操作是OR，我们可以做如下的查询：

GET twitter/_search
{
  "query": {
    "match": {
      "user": {
        "query": "朝阳区-老贾",
        "operator": "or"
      }
    }
  }
}

上面的查询也和如下的查询是一样的：

GET twitter/_search
{
 "query": {
   "match": {
     "user": "朝阳区-老贾"
   }
 }
}

这是因为默认的操作是or操作。上面查询的结果是任何文档匹配：“朝”，“阳”，“区”，“老”及“贾”这5个字中的任何一个将被显示：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 4.4209847,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 2.9019678,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.8713734,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 0.4753614,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日，好友来了,什么 birthday happy 就成!",
          "uid" : 7,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.4356867,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      }
    ]

我们也可以设置参数minimum_should_match来设置至少匹配的term。比如：

GET twitter/_search
{
  "query": {
    "match": {
      "user": {
        "query": "朝阳区-老贾",
        "operator": "or",
        "minimum_should_match": 3
      }
    }
  }
}

上面显示我们至少要匹配“朝”，“阳”，“区”，“老”及“贾这5个中的3个字才可以。显示结果：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 4.4209847,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 2.9019678,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      }
    ]

我们也可以改为"and“操作看看：

GET twitter/_search
{
  "query": {
    "match": {
      "user": {
        "query": "朝阳区-老贾",
        "operator": "and"
      }
    }
  }
}

显示的结果是：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 4.4209847,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      }
    ]

在这种情况下，需要同时匹配索引的5个字才可以。显然我们可以通过使用and来提高搜索的精度。

Multi_match

在上面的搜索之中，我们特别指明一个专有的field来进行搜索，但是在很多的情况下，我们并胡知道是哪一个field含有这个关键词，那么在这种情况下，我们可以使用multi_match来进行搜索：

GET twitter/_search
{
  "query": {
    "multi_match": {
      "query": "朝阳",
      "fields": [
        "user",
        "address^3",
        "message"
      ],
      "type": "best_fields"
    }
  }
}

在上面，我们可以同时对三个fields: user，adress及message进行搜索，但是我们对address含有 “朝阳” 的文档的分数进行3倍的加权。返回的结果：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 6.1777167,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy good BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 5.9349246,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      }
    ]

Prefix query

返回在提供的字段中包含特定前缀的文档。

GET twitter/_search
{
  "query": {
    "prefix": {
      "user": {
        "value": "朝"
      }
    }
  }
}

查询user字段里以“朝”为开头的所有文档：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      }
    ]

Term query

Term query会在给定字段中进行精确的字词匹配。因此，您需要提供准确的术语以获取正确的结果。

GET twitter/_search
{
  "query": {
    "term": {
      "user.keyword": {
        "value": "朝阳区-老贾"
      }
    }
  }
}

在这里，我们使用user.keyword来对“朝阳区-老贾”进行精确匹配查询相应的文档：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.5404451,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      }
    ]

Terms query

如果我们想对多个terms进行查询，我们可以使用如下的方式来进行查询：

GET twitter/_search
{
  "query": {
    "terms": {
      "user.keyword": [
        "双榆树-张三",
        "东城区-老刘"
      ]
    }
  }
}

上面查询user.keyword里含有“双榆树-张三”或“东城区-老刘”的所有文档。

复合查询（compound query）

什么是复合查询呢？如果说上面的查询是leaf查询的话，那么复合查询可以把很多个leaf查询组合起来从而形成更为复杂的查询。它一般的格式是：

POST _search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "filter": {
        "term" : { "tag" : "tech" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tag" : "wow" } },
        { "term" : { "tag" : "elasticsearch" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}

从上面我们可以看出，它是由bool下面的must, must_not, should及filter共同来组成的。针对我们的例子，

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "city": "北京"
          }
        },
        {
          "match": {
            "age": "30"
          }
        }
      ]
    }
  }
}

这个查询的是必须是北京城市的，并且年刚好是30岁的。

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.4823241,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.4823241,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.4823241,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      }
    ]
  }
}

如果我们想知道为什么得出来这样的结果，我们可以在搜索的指令中加入"explained" : true。

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "city": "北京"
          }
        },
        {
          "match": {
            "age": "30"
          }
        }
      ]
    }
  },
  "explain": true
}

这样在我们的显示的结果中，可以看到一些一些解释：

我们的显示结果有2个。同样，我们可以把一些满足条件的排出在外，我们可以使用must_not。

GET twitter/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "city": "北京"
          }
        }
      ]
    }
  }
}

我们想寻找不在北京的所有的文档：

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 0.0,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日，好友来了,什么 birthday happy 就成!",
          "uid" : 7,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      }
    ]
  }
}

我们显示的文档只有一个。他来自上海，其余的都北京的。

接下来，我们来尝试一下should。它表述“或”的意思，也就是有就更好，没有就算了。比如：

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "age": "30"
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "message": "Happy birthday"
          }
        }
      ]
    }
  }
}

这个搜寻的意思是，age必须是30岁，但是如果文档里含有“Hanppy birthday”，相关性会更高，那么搜索得到的结果会排在前面：

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.641438,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.641438,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        }
      }
    ]
  }
}

在上面的结果中，我们可以看到：同样是年龄30岁的两个文档，第一个文档由于含有“Happy birthday”这个字符串在message里，所以它的结果是排在前面的，相关性更高。我们可以从它的_score中可以看出来。第二个文档里age是30，但是它的message里没有“Happy birthday”字样，但是它的结果还是有显示，只是得分比较低一些。

位置查询

Elasticsearch最厉害的是位置查询。这在很多的关系数据库里并没有。我们举一个简单的例子：

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address": "北京"
          }
        }
      ]
    }
  },
  "post_filter": {
    "geo_distance": {
      "distance": "3km",
      "location": {
        "lat": 39.920086,
        "lon": 116.454182
      }
    }
  }
}

在这里，我们查找在地址栏里有“北京”，并且在以位置(116.454182, 39.920086)为中心的3公里以内的所有文档。

{
  "took" : 58,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.48232412,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.48232412,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      }
    ]
  }
}

在我们的查询结果中只有一个文档满足要求。

下面，我们找出在5公里以内的所有位置信息，并按照远近大小进行排序：

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address": "北京"
          }
        }
      ]
    }
  },
  "post_filter": {
    "geo_distance": {
      "distance": "5km",
      "location": {
        "lat": 39.920086,
        "lon": 116.454182
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": "39.920086,116.454182",
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}

在这里，我们看到了使用sort来对我们的搜索的结果进行排序。按照升序排列。

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        },
        "sort" : [
          1.1882901656104885
        ]
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        },
        "sort" : [
          3.9447355972239952
        ]
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        },
        "sort" : [
          4.837769064666224
        ]
      }
    ]
  }
}

我们可以看到有三个显示的结果。在sort里我们可以看到距离是越来越大啊。另外我们可以看出来，如果_score不是sort的field，那么在使用sort后，所有的结果的_score都变为null。如果排序的如果在上面的搜索也可以直接写作为：

GET twitter/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "address": "北京"
        }
      },
      "filter": {
        "geo_distance": {
          "distance": "5km",
          "location": {
            "lat": 39.920086,
            "lon": 116.454182
          }
        }
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": "39.920086,116.454182",
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}

范围查询

在ES中，我们也可以进行范围查询。我们可以根据设定的范围来对数据进行查询：

GET twitter/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 30,
        "lte": 40
      }
    }
  }
}

在这里，我们查询年龄介于30到40岁的文档：

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "user" : "朝阳区-老贾",
          "message" : "123,gogogo",
          "uid" : 5,
          "age" : 35,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区建国门",
          "location" : {
            "lat" : "39.718256",
            "lon" : "116.367910"
          }
        }
      }
    ]
  }
}

如上所示，我们找到了3个匹配的文档。同样地，我们也可以对它们进行排序：

GET twitter/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 30,
        "lte": 40
      }
    }
  },"sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

我们对整个搜索的结果按照降序进行排序。

Exists 查询

我们可以通过exists来查询一个字段是否存在。比如我们再增加一个文档：

PUT twitter/_doc/20
{
  "user" : "王二",
  "message" : "今儿天气不错啊，出去转转去",
  "uid" : 20,
  "age" : 40,
  "province" : "北京",
  "country" : "中国",
  "address" : "中国北京市海淀区",
  "location" : {
    "lat" : "39.970718",
    "lon" : "116.325747"
  }
}

在这个文档里，我们的city这一个字段是不存在的，那么一下的这个搜索将不会返回上面的这个搜索。

GET twitter/_search
{
  "query": {
    "exists": {
      "field": "city"
    }
  }
}

如果文档里只要city这个字段不为空，那么就会被返回。反之，如果一个文档里city这个字段是空的，那么就不会返回。

如果查询不含city这个字段的所有的文档，可以这样查询：

GET twitter/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "city"
        }
      }
    }
  }
}

匹配短语

我们可以通过如下的方法来查找happy birthday。

GET twitter/_search
{
  "query": {
    "match": {
      "message": "happy birthday"
    }
  }
}

展示的结果：

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.9936417,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.9936417,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.733287,
        "_source" : {
          "user" : "朝阳区-老王",
          "message" : "Happy BirthDay My Friend!",
          "uid" : 6,
          "age" : 50,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市朝阳区国贸",
          "location" : {
            "lat" : "39.918256",
            "lon" : "116.467910"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 0.84768087,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日，好友来了,什么 birthday happy 就成!",
          "uid" : 7,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      }
    ]
  }
}

在默认的情况下，这个匹配是“或”的关系，也就是找到文档里含有“Happy"或者“birthday”的文档。如果我们新增加一个文档：

PUT twitter/_doc/8
{
  "user": "朝阳区-老王",
  "message": "Happy",
  "uid": 6,
  "age": 50,
  "city": "北京",
  "province": "北京",
  "country": "中国",
  "address": "中国北京市朝阳区国贸",
  "location": {
    "lat": "39.918256",
    "lon": "116.467910"
  }
}

那么我们重新进行搜索，我们可以看到这个新增加的id为8的也会在搜索出的结果之列，虽然它只含有“Happy"在message里。

如果我们想得到“与”的关系，我们可以采用如下的办法：

GET twitter/_search
{
  "query": {
    "match": {
      "message": {
        "query": "happy birthday",
        "operator": "and"
      }
    }
  }
}

经过这样的修改，我们再也看不见那个id为8的文档了，这是因为我们必须在message中同时匹配“happy”及“birthday”这两个词。

我们还有一种方法，那就是：

GET twitter/_search
{
  "query": {
    "match": {
      "message": {
        "query": "happy birthday",
        "minimum_should_match": 2
      }
    }
  }
}

在这里，我们采用了“minimum_should_match”来表面至少有2个是匹配的才可以。

我们可以看到在搜索到的结果中，无论我们搜索的是大小写字母，在搜索的时候，我们都可以匹配到，并且在message中，happy birthday这两个词的先后顺序也不是很重要。比如，我们把id为5的文档更改为：

PUT twitter/_doc/5
{
  "user": "朝阳区-老王",
  "message": "BirthDay My Friend Happy !",
  "uid": 6,
  "age": 50,
  "city": "北京",
  "province": "北京",
  "country": "中国",
  "address": "中国北京市朝阳区国贸",
  "location": {
    "lat": "39.918256",
    "lon": "116.467910"
  }
}

在这里，我们有意识地把BirthDay弄到Happy的前面。我们再次使用上面的查询看看是否找到id为5的文档。

显然，match查询时时不用分先后顺序的。我们下面使用match_phrase来看看。

GET twitter/_search
{
  "query": {
    "match_phrase": {
      "message": "Happy birthday"
    }
  },
  "highlight": {
    "fields": {
      "message": {}
    }
  }
}

在这里，我们可以看到我们使用了match_phrase。它要求Happy必须是在birthday的前面。下面是搜寻的结果：

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.6363969,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.6363969,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        },
        "highlight" : {
          "message" : [
            "<em>happy</em> <em>birthday</em>!"
          ]
        }
      }
    ]
  }
}

假如我们把我们之前的那个id为5的文档修改为：

PUT twitter/_doc/5
{
  "user": "朝阳区-老王",
  "message": "Happy Good BirthDay My Friend!",
  "uid": 6,
  "age": 50,
  "city": "北京",
  "province": "北京",
  "country": "中国",
  "address": "中国北京市朝阳区国贸",
  "location": {
    "lat": "39.918256",
    "lon": "116.467910"
  }
}

在这里，我们在Happy 和Birthday之前加入了一个Good。如果用我们之前的那个match_phrase是找不到这个文档的。为了能够找到上面这个修正的结果，我们可以使用：

GET twitter/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "Happy birthday",
        "slop": 1
      }
    }
  },
  "highlight": {
    "fields": {
      "message": {}
    }
  }
}

注意：在这里，我们使用了slop为1，表面Happy和birthday之前是可以允许一个token的差别。

Named queries

我们可以使用_name为一个filter或query来取一个名字，比如：

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "city": {
              "query": "北京",
              "_name": "城市"
            }
          }
        },
        {
          "match": {
            "country": {
              "query": "中国",
              "_name": "国家"
            }
          }
        }
      ],
      "should": [
        {
          "match": {
            "_id": {
              "query": "1",
              "_name": "ID"
            }
          }
        }
      ]
    }
  }
}

返回结果：

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.6305401,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊，出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        },
        "matched_queries" : [
          "国家",
          "ID",
          "城市"
        ]
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.6305401,
        "_source" : {
          "user" : "东城区-老刘",
          "message" : "出发，下一站云南！",
          "uid" : 3,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区台基厂三条3号",
          "location" : {
            "lat" : "39.904313",
            "lon" : "116.412754"
          }
        },
        "matched_queries" : [
          "国家",
          "城市"
        ]
      },
    ...
  ]

我们从上面的返回结果可以看出来多了一个叫做matched_queries的字段。在它的里面罗列了每个匹配了的查询。第一个返回的查询结果是三个都匹配了的，但是第二个来说就只有两项是匹配的。

SQL查询

对于与很多已经习惯用RDMS数据库的工作人员，他们更喜欢使用SQL来进行查询。Elasticsearch也对SQL有支持：

GET /_sql?
{
  "query": """
    SELECT * FROM twitter 
    WHERE age = 30
  """
}

通过这个查询，我们可以找到所有在年龄等于30的用户。在个搜索中，我们使用了SQL语句。利用SQL端点我们可以很快地把我们的SQL知识转化为Elasticsearch的使用场景中来。我们可以通过如下的方法得到它对应的DSL语句：

GET /_sql/translate
{
  "query": """
    SELECT * FROM twitter 
    WHERE age = 30
  """
}

我们得到的结果是：

{
  "size" : 1000,
  "query" : {
    "term" : {
      "age" : {
        "value" : 30,
        "boost" : 1.0
      }
    }
  },
  "_source" : {
    "includes" : [
      "address",
      "message",
      "region",
      "script.source",
      "user"
    ],
    "excludes" : [ ]
  },
  "docvalue_fields" : [
    {
      "field" : "age"
    },
    {
      "field" : "city"
    },
    {
      "field" : "country"
    },
    {
      "field" : "location"
    },
    {
      "field" : "province"
    },
    {
      "field" : "script.params.value"
    },
    {
      "field" : "uid"
    }
  ],
  "sort" : [
    {
      "_doc" : {
        "order" : "asc"
      }
    }
  ]
}

Multi Search API

使用单个API请求执行几次搜索。这个API的好处是节省API的请求个数，把多个请求放到一个API中来实现。

为了说明问题的方便，我们可以多加一个叫做twitter1的index。它的内容如下：

POST _bulk
{"index":{"_index":"twitter1","_id":1}}
{"user":"张庆","message":"今儿天气不错啊，出去转转去","uid":2,"age":20,"city":"重庆","province":"重庆","country":"中国","address":"中国重庆地区","location":{"lat":"39.970718","lon":"116.325747"}}

这样在我们的Elasticsearch中就有两个索引了。我们可以做如下的_msearch。

GET twitter/_msearch
{"index":"twitter"}
{"query":{"match_all":{}},"from":0,"size":1}
{"index":"twitter"}
{"query":{"bool":{"filter":{"term":{"city.keyword":"北京"}}}}, "size":1}
{"index":"twitter1"}
{"query":{"match_all":{}}}

上面我们通过_msearch终点来实现在一个API请求中做多个查询，对多个index进行同时操作。显示结果为：

多个索引操作

在上面我们引入了另外一个索引twitter1。在实际的操作中，我们可以通过通配符，或者直接使用多个索引来进行搜索：

GET twitter*/_search

上面的操作是对所有的以twitter为开头的索引来进行搜索，显示的结果是在所有的twitter及twitter1中的文档：

GET /twitter,twitter1/_search

也可以做同样的事。在写上面的查询的时候，在两个索引之间不能加入空格，比如：

GET /twitter, twitter1/_search

上面的查询并不能返回你所想要的结果。

Profile API

Profile API是调试工具。它添加了有关执行的详细信息搜索请求中的每个组件。它为用户提供有关搜索的每个步骤的洞察力
请求执行并可以帮助确定某些请求为何缓慢。

GET twitter/_search
{
  "profile": "true", 
  "query": {
    "match": {
      "city": "北京"
    }
  }
}

在上面，我们加上了"profile":"true"后，除了显示搜索的结果之外，还显示profile的信息：

  "profile" : {
    "shards" : [
      {
        "id" : "[ZXGhn-90SISq1lePV3c1sA][twitter][0]",
        "searches" : [
          {
            "query" : [
              {
                "type" : "BooleanQuery",
                "description" : "city:北 city:京",
                "time_in_nanos" : 1390064,
                "breakdown" : {
                  "set_min_competitive_score_count" : 0,
                  "match_count" : 5,
                  "shallow_advance_count" : 0,
                  "set_min_competitive_score" : 0,
                  "next_doc" : 31728,
                  "match" : 3337,
                  "next_doc_count" : 5,
                  "score_count" : 5,
                  "compute_max_score_count" : 0,
                  "compute_max_score" : 0,
                  "advance" : 22347,
                  "advance_count" : 1,
                  "score" : 16639,
                  "build_scorer_count" : 2,
                  "create_weight" : 342219,
                  "shallow_advance" : 0,
                  "create_weight_count" : 1,
                  "build_scorer" : 973775
                },
                "children" : [
                  {
                    "type" : "TermQuery",
                    "description" : "city:北",
                    "time_in_nanos" : 107949,
                    "breakdown" : {
                      "set_min_competitive_score_count" : 0,
                      "match_count" : 0,
                      "shallow_advance_count" : 3,
                      "set_min_competitive_score" : 0,
                      "next_doc" : 0,
                      "match" : 0,
                      "next_doc_count" : 0,
                      "score_count" : 5,
                      "compute_max_score_count" : 3,
                      "compute_max_score" : 11465,
                      "advance" : 3477,
                      "advance_count" : 6,
                      "score" : 5793,
                      "build_scorer_count" : 3,
                      "create_weight" : 34781,
                      "shallow_advance" : 18176,
                      "create_weight_count" : 1,
                      "build_scorer" : 34236
                    }
                  },
                  {
                    "type" : "TermQuery",
                    "description" : "city:京",
                    "time_in_nanos" : 49929,
                    "breakdown" : {
                      "set_min_competitive_score_count" : 0,
                      "match_count" : 0,
                      "shallow_advance_count" : 3,
                      "set_min_competitive_score" : 0,
                      "next_doc" : 0,
                      "match" : 0,
                      "next_doc_count" : 0,
                      "score_count" : 5,
                      "compute_max_score_count" : 3,
                      "compute_max_score" : 5162,
                      "advance" : 15645,
                      "advance_count" : 6,
                      "score" : 3795,
                      "build_scorer_count" : 3,
                      "create_weight" : 13562,
                      "shallow_advance" : 1087,
                      "create_weight_count" : 1,
                      "build_scorer" : 10657
                    }
                  }
                ]
              }
            ],
            "rewrite_time" : 17930,
            "collector" : [
              {
                "name" : "CancellableCollector",


                        
                        
                            
                            版权声明：本文来源CSDN，感谢博主原创文章，遵循 CC 4.0 by-sa 版权协议，转载请附上原文出处链接和本声明。

                            原文链接：https://blog.csdn.net/UbuntuTouch/article/details/99546568

                            站方申明：本站部分内容来自社区用户分享，若涉及侵权，请联系站方删除。
                        
                        
                        
                            
                                
                                    
                                    发表于 2020-03-01 12:53:01
                                
                                阅读 ( 1434 )
                                分类：


                

    
        
            你可能感兴趣的文章
            
                
                
                    Elasticsearch之中文分词器插件es-ik（博主推荐）
                    1479 浏览
                
                
                
                    [Logstash 6.3.2] 配置以mysql、filebeat为输入,Elasticsearch为输出的管道
                    1805 浏览
                
                
                
                    腾讯如何用Elasticsearch挖掘万亿数据价值？
                    1339 浏览
                
                
                
                    ElasticSearch - 新老选主算法对比
                    1792 浏览
                
                
                
                    ElasticSearch系列 - ElasticSearch读写原理分析
                    966 浏览
                
                
                
                    linux-elasticsearch:unable to load JNA native support library, native methods will be disabled ---记录
                    1935 浏览
                
                
                
                    elasticsearch  查询两个字段值相同的数据
                    2615 浏览
                
                
                
                    elasticsearch：实现查询某个字段为固定值，另一个字段必须不能存在
                    1463 浏览
                
                
                
                    详述 Elasticsearch 通过范围条件查询索引数据的方法
                    1753 浏览
                
                
            
        
        
            精选的优质文章
            
                
                
                    也许 Go 开发可以更简单！
                    10573 浏览
                
                
                
                    如何使用 Golang 日志监控你的应用程序？
                    12045 浏览
                
                
                
                    从Go语言实现模板设计模式浅谈Go的抽象能力
                    14098 浏览
                
                
                
                    阿里云基于 Go 的微服务架构分享
                    23961 浏览
                
                
                
                    java是否会被取代？Go会否给Java带来冲击？
                    28488 浏览
                
                
                
                    千万级规模高性能、高并发的网络架构经验分享
                    30043 浏览
                
                
                
                    阿里部分面试题汇总,对想进阿里的同学非常实用
                    62337 浏览
                
                
                
                    实用好文：知乎实时数仓架构实践及演进
                    31358 浏览
                
                
                
                    支撑马蜂窝「双11」营销大战背后的技术架构
                    228304 浏览
                
                
                
                    想进大厂？50个多线程面试题，你会多少？（一）
                    23097 浏览
                
                
            
        

    



                
                    0 条评论
                    
                        
                        
                        
                            
                                请先 登录 后评论