ELK-Elasticsearch


References

Elastic official website

Elasticsearch Reference


1. Elasticsearch : introduction

Elasticsearch is a highly scalable open-source full-text search and analytics engine. It lets you store, search, and analyze large volumes of data quickly and in near real time. Typical use cases include:

  • Product search for a web store
  • Collecting log or transaction data, then analyzing and mining it
  • Analytics and business intelligence

2. Elasticsearch : basic concepts

Elasticsearch is a near real time search platform.

  • Cluster
  • Node
  • Index
  • Type (deprecated in 6.0.0)
  • Document
  • Shards & Replicas

3. Elasticsearch : installation

  • Install Java first
  • Download
  • On macOS, install Elasticsearch via Homebrew

    $ brew install elasticsearch
  • Start

    # Start as a background service
    $ brew services start elasticsearch

    # Start in the foreground, setting the cluster name and the node name
    $ elasticsearch -Ecluster.name=my_name -Enode.name=node_name
  • Elasticsearch uses port 9200 to provide access to its REST API.

    $ curl localhost:9200
    {
      "name" : "zhangjie",
      "cluster_name" : "Jst",
      "cluster_uuid" : "D8DNTTPHRjSnkXr83jGJpQ",
      "version" : {
        "number" : "6.2.4",
        "build_hash" : "ccec39f",
        "build_date" : "2018-04-12T20:37:28.497551Z",
        "build_snapshot" : false,
        "lucene_version" : "7.2.1",
        "minimum_wire_compatibility_version" : "5.6.0",
        "minimum_index_compatibility_version" : "5.0.0"
      },
      "tagline" : "You Know, for Search"
    }
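
Scripts often need the version number from this response, for example to decide whether mapping types are still in use. A minimal sketch, assuming the response format shown above; `es_version` is a hypothetical helper name, not something Elasticsearch provides:

```shell
#!/bin/sh
# Print the value of "number" from the JSON returned by the root endpoint.
# Reads the response on stdin and uses sed only, so no JSON tool is required.
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage against a live node (assumes the default localhost:9200):
#   curl -s localhost:9200 | es_version
```

The pattern matches only the `"number"` key, so other version-like fields such as `lucene_version` are ignored.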

4. Elasticsearch : explore your cluster

  • Cluster Health
    To check the cluster health, use the REST API or run the command in Kibana's Console.

    $ curl 'localhost:9200/_cat/health?v'
    epoch      timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1524139385 20:03:05  Jst     green  1          1         0      0   0    0    0        0             -                  100.0%

    $ curl 'localhost:9200/_cat/nodes?v'
    ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    127.0.0.1 19           61          6   1.78                     mdi       *      zhangjie
  • List All Indices
    The following response simply means there are no indices in the cluster yet.

    $ curl 'localhost:9200/_cat/indices?v'
    health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
  • Create an Index

    # Create an index named customer
    $ curl -XPUT 'localhost:9200/customer?pretty'
    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "customer"
    }

    # List the indices
    $ curl 'localhost:9200/_cat/indices?v'
    health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   customer RwIFrnN6R9uAwyi2LLY2BQ 5   1   0          0            1.1kb      1.1kb

    # The index has 5 primary shards, 1 replica, and 0 documents.
    # yellow means some replicas are not (yet) allocated: with a single node,
    # a replica has no second node to be placed on.
  • Index and Query a Document

    # Index a document with ID 1 into the customer index
    $ curl -X PUT "http://localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'{"name": "John Doe"}'
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }

    # Retrieve the document by ID
    $ curl localhost:9200/customer/_doc/1
    {"_index":"customer","_type":"_doc","_id":"1","_version":1,"found":true,"_source":{"name": "John Doe"}}
  • Delete an Index

    $ curl -XDELETE 'localhost:9200/customer?pretty'
    {
      "acknowledged" : true
    }
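
The health check and the yellow status described in this section can be scripted. The sketch below assumes a single-node development cluster on localhost:9200; `cluster_status` and `replica_settings` are hypothetical helper names. The first reads the status column from `_cat/health?v` output, and the second builds a settings body that changes the replica count of an index, which on a single node is one way to turn a yellow index green:

```shell
#!/bin/sh
# Read `_cat/health?v` output on stdin and print the value of the status column.
# The column is located by its header name rather than by a fixed position.
cluster_status() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i }
       NR == 2 && col { print $col }'
}

# Build the index settings body for a given replica count.
# $1 = number of replicas
replica_settings() {
  printf '{"index":{"number_of_replicas":%d}}\n' "$1"
}

# Usage against a live node (assumes localhost:9200 and the customer index):
#   curl -s 'localhost:9200/_cat/health?v' | cluster_status
#   curl -s -XPUT 'localhost:9200/customer/_settings' \
#        -H 'Content-Type: application/json' -d "$(replica_settings 0)"
```

Running without replicas removes redundancy, so this only makes sense for local experiments.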

The commands above all follow the same pattern:

curl -X<REST Verb> localhost:9200/<indexName>/<type>/<id>
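
The index/type/ID pattern used by these commands can be captured in a small helper. This is only a sketch: `es_url` and the `ES_HOST` variable are hypothetical names chosen for illustration, and the default address assumes the local node used throughout this section.

```shell
#!/bin/sh
# Address of the node; override by exporting ES_HOST before sourcing this file.
ES_HOST="${ES_HOST:-localhost:9200}"

# Compose <host>/<index>[/<type>[/<id>]]; empty trailing parts are skipped.
# $1 = index name, $2 = type (optional), $3 = document ID (optional)
es_url() {
  url="$ES_HOST/$1"
  if [ -n "${2:-}" ]; then url="$url/$2"; fi
  if [ -n "${3:-}" ]; then url="$url/$3"; fi
  printf '%s\n' "$url"
}

# Usage against a live node:
#   curl -XGET    "$(es_url customer _doc 1)"   # fetch document 1
#   curl -XDELETE "$(es_url customer)"          # delete the whole index
```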


5. Elasticsearch : search requests

Elasticsearch accepts search requests in two forms: a simple syntax and a full syntax.
For example, using the simple syntax:

$ curl -XGET 'http://127.0.0.1:9200/logstash-2015.06.21/testlog/_search?q=first'
{
  "took": 240,
  "timed_out": false,
  "_shards": {
    "total": 27,
    "successful": 27,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": ...,
    ...
    "date": "1434966686000",
    "user": "Tom",
    "mesg": "first message into Elasticsearch"
  }
}

Searching with the following syntax returns the same result:

$ curl -XGET 'http://127.0.0.1:9200/logstash-2015.06.21/testlog/_search?q=user:"Tom"'
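
Only the simple syntax appears in these examples. As a rough sketch of the full syntax, the same search can be sent as a JSON request body using the `query_string` query, which accepts the same text that follows `?q=`. The `query_body` helper is a hypothetical name, and it escapes only backslashes and double quotes, which is enough for the queries in this post:

```shell
#!/bin/sh
# Wrap a query string in a query_string request body.
# Backslashes and double quotes are escaped so the result stays valid JSON.
# $1 = query string, for example: first   or   user:"Tom"
query_body() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"query":{"query_string":{"query":"%s"}}}\n' "$escaped"
}

# Usage against a live node (same index and type as the examples above):
#   curl -XGET 'http://127.0.0.1:9200/logstash-2015.06.21/testlog/_search?pretty' \
#        -H 'Content-Type: application/json' -d "$(query_body 'first')"
```

The request body form avoids URL-encoding issues and leaves room for other query types later.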

Query string syntax

In the examples above, the text after ?q= is written in query string syntax, which is also used frequently in Kibana.

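  • Full-text search: write the search term directly, such as first in the example above

    # Full-text search
    first

    # Exact search
    "first"
  • Full-text search on a single field: prefix the search term with the field name and a colon. For example, if you know that first appears in the mesg field:

    mesg:first
  • Exact search on a single field: wrap the search term in double quotes

    user:"Tom"
  • Combining multiple conditions: use NOT, AND, and OR. Note that they must be uppercase!

    user:("Tom" OR "Tomas") AND NOT mesg:first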
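  • Field existence

    _exists_:user    # the field must exist

    _missing_:user   # the field must not exist (removed in Elasticsearch 5.0; use NOT _exists_:user on newer versions)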
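  • Wildcards: ? matches a single character and * matches any number of characters, for example:

    fir?t mess*
  • Regular expressions: the regular expressions supported by ES (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax) perform poorly and are not very powerful, so avoid them where possible
  • Fuzzy search: append ~ to a term to request a fuzzy match. The term may be off by one or two letters, and ES returns results ranked by similarity.

    first~
  • Range search: ranges work on both dates and numbers; [] marks an inclusive bound and {} an exclusive bound

    rrt:>300

    date:["now-6h" TO "now"}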
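
Many characters used in query strings (spaces, quotes, colons, brackets) are not safe inside a URL, so a query should be percent-encoded before it is placed after `?q=`. The helper below is a minimal sketch in portable shell; `urlencode` is a hypothetical name, not part of curl or Elasticsearch:

```shell
#!/bin/sh
# Percent-encode a string for use as a URL query value.
# Letters, digits, and - . _ ~ are kept; every other byte becomes %XX.
urlencode() {
  printf '%s' "$1" | od -An -v -tx1 | tr ' ' '\n' | while read -r hex; do
    [ -n "$hex" ] || continue
    case "$hex" in
      2d|2e|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
        # Unreserved byte: print it unchanged (hex -> octal escape -> character)
        printf "\\$(printf '%03o' "0x$hex")" ;;
      *)
        # Any other byte: emit %XX with uppercase hex digits
        printf '%%%s' "$hex" | tr 'a-f' 'A-F' ;;
    esac
  done
  echo
}

# Usage against a live node (same index and type as the examples above):
#   q=$(urlencode 'user:("Tom" OR "Tomas") AND NOT mesg:first')
#   curl -XGET "http://127.0.0.1:9200/logstash-2015.06.21/testlog/_search?q=$q"
```

curl can also do the encoding itself: `curl -G --data-urlencode 'q=...' URL` appends the encoded value to the URL as a query parameter.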