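Table of Contents

1. Introducing the problem
   1. Index two documents
   2. Add a new field to the index
   3. Index one more document
   4. View the documents in the index
2. must_not & exists
3. Assign an initial value to historical data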

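1. Introducing the problem

Our project has this requirement: Elasticsearch already holds a large amount of historical data, and a new field was later added to the index. We need to query the historical data by certain conditions, but the new field does not exist in those older documents. How can we find them?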

1. Index two documents

PUT /user/_doc/1

{

"first_name" : "John",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests": [ "sports", "music" ]

}

PUT /user/_doc/2

{

"first_name" : "zhangsan",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests": [ "sports", "music" ]

}

2. Add a new field to the index

PUT /user/_mapping

{

"properties": {

"height": {

"type": "long"

}

}

}

3. Index one more document

This document includes a value for the new height field.

PUT /user/_doc/3

{

"first_name" : "lisi",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests": [ "sports", "music" ],

"height":175

}

4. View the documents in the index

GET /user/_search

{

"took" : 817,

"timed_out" : false,

"_shards" : {

"total" : 1,

"successful" : 1,

"skipped" : 0,

"failed" : 0

},

"hits" : {

"total" : {

"value" : 3,

"relation" : "eq"

},

"max_score" : 1.0,

"hits" : [

{

"_index" : "user",

"_type" : "_doc",

"_id" : "1",

"_score" : 1.0,

"_source" : {

"first_name" : "John",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

]

}

},

{

"_index" : "user",

"_type" : "_doc",

"_id" : "2",

"_score" : 1.0,

"_source" : {

"first_name" : "zhangsan",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

]

}

},

{

"_index" : "user",

"_type" : "_doc",

"_id" : "3",

"_score" : 1.0,

"_source" : {

"first_name" : "lisi",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

],

"height" : 175

}

}

]

}

}

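As the result above shows, adding a new field to an existing index in Elasticsearch does not automatically add that field to the documents already stored, and no default value is assigned to them. That is why the first two documents have no height field.

In Elasticsearch, a field that is missing, or whose value is null, is not indexed. It is ignored at search time, so you cannot search for an empty value directly (unless the field's mapping defines a null_value).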

PUT /my_index/_doc/1

{

"first_name": "zhangsan"

}

PUT /my_index/_doc/2

{

"first_name": "wangwu",

"height": null

}

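For example, neither of the two documents above can be found by searching on the height field. So how do we query the historical data that was indexed before the field was added?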
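To make the rule concrete, here is a minimal Python sketch of which values count as "present" for a field. It is only a simplified local model written for this article, not Elasticsearch's implementation: the helper name has_indexed_value is hypothetical, it only looks at top-level fields, and it ignores any null_value configured in the mapping.

```python
def has_indexed_value(doc, field):
    """Simplified local model of whether `field` holds a searchable value."""
    value = doc.get(field)  # a missing key yields None, same as an explicit null
    if value is None:
        return False
    if isinstance(value, list):
        # [] and [None] hold nothing indexable; [None, 175] does
        return any(item is not None for item in value)
    return True  # a real value such as 0 or 175 counts as present


docs = {
    "1": {"first_name": "zhangsan"},                 # field missing
    "2": {"first_name": "wangwu", "height": None},   # explicit null
    "3": {"first_name": "lisi", "height": 175},      # real value
}
matched = [doc_id for doc_id, doc in docs.items() if has_indexed_value(doc, "height")]
print(matched)  # ['3']
```

Documents 1 and 2 behave the same way here: a missing field and an explicit null are both treated as "no value".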

2. must_not & exists

POST /user/_search

{

"query": {

"bool": {

"must_not": [

{

"exists": {

"field" : "height"

}

}

]

}

}

}

{

"took" : 7,

"timed_out" : false,

"_shards" : {

"total" : 1,

"successful" : 1,

"skipped" : 0,

"failed" : 0

},

"hits" : {

"total" : {

"value" : 2,

"relation" : "eq"

},

"max_score" : 0.0,

"hits" : [

{

"_index" : "user",

"_type" : "_doc",

"_id" : "1",

"_score" : 0.0,

"_source" : {

"first_name" : "John",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

]

}

},

{

"_index" : "user",

"_type" : "_doc",

"_id" : "2",

"_score" : 0.0,

"_source" : {

"first_name" : "zhangsan",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

]

}

}

]

}

}
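The original requirement was to query historical data by conditions, so in practice the must_not/exists clause is usually combined with other clauses. The sketch below only builds the request body as a Python dict; the helper name missing_field_query and its filters argument are assumptions made for illustration, and the term filter on age is just an example condition.

```python
import json


def missing_field_query(field, filters=None):
    """Build a search body that matches documents lacking `field`.

    `filters` is an optional list of extra query clauses (hypothetical argument).
    """
    bool_query = {"must_not": [{"exists": {"field": field}}]}
    if filters:
        # filter clauses narrow the result without affecting the score
        bool_query["filter"] = list(filters)
    return {"query": {"bool": bool_query}}


body = missing_field_query("height", [{"term": {"age": 25}}])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST /user/_search, in the same way as the request above.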

The exists query returns documents that contain at least one non-null value for the field:

GET /user/_search

{

"query": {

"exists" : { "field" : "height" }

}

}

{

"took" : 1,

"timed_out" : false,

"_shards" : {

"total" : 1,

"successful" : 1,

"skipped" : 0,

"failed" : 0

},

"hits" : {

"total" : {

"value" : 1,

"relation" : "eq"

},

"max_score" : 1.0,

"hits" : [

{

"_index" : "user",

"_type" : "_doc",

"_id" : "3",

"_score" : 1.0,

"_source" : {

"first_name" : "lisi",

"last_name" : "Smith",

"age" : 25,

"about" : "I love to go rock climbing",

"interests" : [

"sports",

"music"

],

"height" : 175

}

}

]

}

}

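3. Assign an initial value to historical data

Adding a field to an existing index does not affect historical data, so we can update the historical documents to give the field a default value and then search by that default. Use _update_by_query with a script to update the documents in bulk: if the field's value is null (which is also the case when the field is missing), set it to 0.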

POST /user/_update_by_query

{

"script": {

"lang": "painless",

"source": "if (ctx._source.height == null) { ctx._source.height = 0 }"

}

}

Query the index again (GET /user/_search):

{

"took" : 1,

"timed_out" : false,

"_shards" : {

"total" : 1,

"successful" : 1,

"skipped" : 0,

"failed" : 0

},

"hits" : {

"total" : {

"value" : 3,

"relation" : "eq"

},

"max_score" : 1.0,

"hits" : [

{

"_index" : "user",

"_type" : "_doc",

"_id" : "1",

"_score" : 1.0,

"_source" : {

"about" : "I love to go rock climbing",

"last_name" : "Smith",

"interests" : [

"sports",

"music"

],

"first_name" : "John",

"age" : 25,

"height" : 0

}

},

{

"_index" : "user",

"_type" : "_doc",

"_id" : "2",

"_score" : 1.0,

"_source" : {

"about" : "I love to go rock climbing",

"last_name" : "Smith",

"interests" : [

"sports",

"music"

],

"first_name" : "zhangsan",

"age" : 25,

"height" : 0

}

},

{

"_index" : "user",

"_type" : "_doc",

"_id" : "3",

"_score" : 1.0,

"_source" : {

"about" : "I love to go rock climbing",

"last_name" : "Smith",

"interests" : [

"sports",

"music"

],

"first_name" : "lisi",

"age" : 25,

"height" : 175

}

}

]

}

}

The historical documents now all have the default value 0 for the height field.
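The update above runs its script against every document in the index. One possible refinement, sketched here under stated assumptions, is to add a query so that only documents lacking the field are visited. The helpers backfill_request and apply_backfill_locally are hypothetical names: the first only builds the request body (it assumes the field name is a plain identifier), and the second is a local stand-in that mimics the effect on plain dicts without contacting a cluster.

```python
def backfill_request(field, default):
    """Build an _update_by_query body restricted to documents lacking `field`."""
    return {
        "query": {"bool": {"must_not": [{"exists": {"field": field}}]}},
        "script": {
            "lang": "painless",
            # assumes `field` is a simple identifier such as "height"
            "source": f"ctx._source.{field} = params.value",
            "params": {"value": default},
        },
    }


def apply_backfill_locally(docs, field, default):
    """Local stand-in for the update: set `field` where it is missing or null."""
    updated = 0
    for doc in docs:
        if doc.get(field) is None:
            doc[field] = default
            updated += 1
    return updated


docs = [
    {"first_name": "John"},
    {"first_name": "zhangsan"},
    {"first_name": "lisi", "height": 175},
]
updated = apply_backfill_locally(docs, "height", 0)
print(updated)  # 2
print(backfill_request("height", 0)["script"]["source"])
```

Passing the default through params rather than hard-coding it in the script text keeps the script source identical across different default values.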
