Elasticsearch安装及介绍(一)

Elasticsearch安装及介绍

安装Elasticsearch

Elasticsearch的安装很简单,直接下载官方包,运行即可

curl -L -O http://download.elasticsearch.org/PATH/TO/VERSION.zip 
unzip elasticsearch-$VERSION.zip
cd  elasticsearch-$VERSION
./bin/elasticsearch

使用curl ‘http://localhost:9200/?pretty’获取数据

{
  "name" : "node-0",
  "cluster_name" : "dongtai-es",
  "cluster_uuid" : "aIah5IcqS1y9qiJ4LphAbA",
  "version" : {
    "number" : "6.1.2",
    "build_hash" : "5b1fea5",
    "build_date" : "2018-01-10T02:35:59.208Z",
    "build_snapshot" : false,
    "lucene_version" : "7.1.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

基本概念介绍

Elasticsearch本质是一个分布式数据库,每个服务器上称为一个Elastic实例,这一组实例构成了ES集群

Document

ES中存储数据记录称为Document,Document使用JSON格式表示,同一个 Index 里面的Document没有要求有相同的结构,但最好一致,这样能提高效率,如下:

{
  "first_name": "Jane2",
  "last_name": "Smith",
  "age": 32,
  "about": "I like to collect rock albums",
  "interests": "music"
}

Index

ES文档的索引,定义了文档放在哪里,你可以理解为对于数据库名

Type

文档表示的对象类别,可以理解为数据库中的表,比如我们有一个机票数据,可以按舱位来分组(头等舱,经济舱),根据规则,Elastic 6.x 版只允许每个 Index 包含一个 Type,7.x 版将会彻底移除 Type

ID

文档的唯一标识,可以理解为表中的一条记录,根据index、type、_ID可以确定一个文档

基本用法

添加一个文档

PUT index/employee/6
{
  "first_name": "chuan",
  "last_name": "zhang",
  "age": 32,
  "about": "hi I'm hai na bai chuan",
  "interests": "read book"
}

上面PUT后面分别对应 index、Type、Id

获取文档

获取所有文档

ES默认会返回10条,可以指定size来改变

GET index/employee/_search
{
  "size": 3
}
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 1,
    "hits": [
      {
        "_index": "index",
        "_type": "employee",
        "_id": "5",
        "_score": 1,
        "_source": {
          "first_name": "Jane1",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": "music"
        }
      },
      {
        "_index": "index",
        "_type": "employee",
        "_id": "10",
        "_score": 1,
        "_source": {
          "first_name": "chuan",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": "music"
        }
      }
    ]
  }
}

按条件匹配

比如我需要查找firstname是’chuan‘的所有文档,如果要搜索firstname中多个关键字,中间用空格隔开就可以了,比如”firstname”: “chuan smith” 表示查询firstname包含chuan或smith的

GET index/employee/_search
{
  "query": {
    "match": {
            "first_name": "chuan"
        }
  }
}
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "index",
        "_type": "employee",
        "_id": "6",
        "_score": 0.6931472,
        "_source": {
          "first_name": "chuan",
          "last_name": "zhang",
          "age": 32,
          "about": "hi I'm hai na bai chuan",
          "interests": "read book"
        }
      }
    ]
  }
}

如果要使用and查询,比如我要查询firstname为chuan,lastname为chuan的文档记录

GET index/employee/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "first_name": "chuan" } },
        { "match": { "last_name": "zhang" } }
      ]
    }
  }
}

如果要根据范围查询或过滤,类似MySQL中’>’,或 ‘!=’操作,比如需要查询上面条件中,年龄大于30的文档

{
  "query": {
    "bool": {
      "must": [
        { "match": { "first_name": "chuan" } },
        { "match": { "last_name": "zhang" } }
      ]
      "filter": {
        "range" : {
            "age" : { "gt" : 30 } 
        }
      }
    }
  }
}

更新文档

更新记录使用PUT请求,重新发送一次数据,ES会根据id来修改,如果我需要更新id为6的about字段

PUT index/employee/6
{
  "first_name": "chuan",
  "last_name": "zhang",
  "age": 32,
  "about": "i modify my about",
  "interests": "read book"
}
{
  "_index": "index",
  "_type": "employee",
  "_id": "6",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 12,
  "_primary_term": 2
}

_version为版本号,由原来的1变成了2,result为”update“,以上为全量更新,如果需要局部更新,可以使用POST请求

POST index/employee/6/_update
{
  "doc":{
    "about": "i modify my about"
  }
}

删除文档

如下,我需要删除id为6的文档

DELETE index/employee/6
{
  "found" :    true,
  "_index" :   "index",
  "_type" :    "employee",
  "_id" :      "6",
  "_version" : 3
}

可以看到version也增加了1,如果对于的id没有找到,found将返回false

数据分析

ES同样有类似MySQL group by的用法,ES称为聚合,比如我想统计employee中最受欢迎的兴趣,注意:ES在5.x之后对排序、聚合操作用单独的数据结构(fielddata)缓存到内存里了,需要单独开启。

PUT index/_mapping/employee/
{
  "properties": {
    "interests": { 
      "type":     "text",
      "fielddata": true
    }
  }
}

GET index/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
"aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "music",
          "doc_count": 3
        },
        {
          "key": "forestry",
          "doc_count": 1
        },
        {
          "key": "read",
          "doc_count": 1
        }
      ]
    }
  }

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注