PHP  Elasticsearch 6.7初步学习实践

1.Mac电脑安装

brew install elasticsearch

#最后几行可以看到相关的配置目录

Data:    /usr/local/var/lib/elasticsearch/
Logs:    /usr/local/var/log/elasticsearch/elasticsearch_guodong.log
Plugins: /usr/local/var/elasticsearch/plugins/
Config:  /usr/local/etc/elasticsearch/

可以安装
https://github.com/NLPchina/elasticsearch-sql

可以用SQL语句查询ES的内容。

2.安装中文分词插件

https://github.com/medcl/elasticsearch-analysis-ik

根据es版本选择分词插件的版本,刚刚安装的6.7。

mv elasticsearch-analysis-ik-6.7.0.zip /usr/local/var/elasticsearch/plugins/

3. 初步使用

3.1测试分词器

使用默认的分词器

curl -X POST "localhost:9200/_analyze" -H 'Content-Type: application/json' -d'
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog\u0027s bone."
}'

返回结果如下

{"tokens":[{"token":"the","start_offset":0,"end_offset":3,"type":"<ALPHANUM>","position":0},{"token":"2","start_offset":4,"end_offset":5,"type":"<NUM>","position":1},{"token":"quick","start_offset":6,"end_offset":11,"type":"<ALPHANUM>","position":2},{"token":"brown","start_offset":12,"end_offset":17,"type":"<ALPHANUM>","position":3},{"token":"foxes","start_offset":18,"end_offset":23,"type":"<ALPHANUM>","position":4},{"token":"jumped","start_offset":24,"end_offset":30,"type":"<ALPHANUM>","position":5},{"token":"over","start_offset":31,"end_offset":35,"type":"<ALPHANUM>","position":6},{"token":"the","start_offset":36,"end_offset":39,"type":"<ALPHANUM>","position":7},{"token":"lazy","start_offset":40,"end_offset":44,"type":"<ALPHANUM>","position":8},{"token":"dog's","start_offset":45,"end_offset":50,"type":"<ALPHANUM>","position":9},{"token":"bone","start_offset":51,"end_offset":55,"type":"<ALPHANUM>","position":10}]}%    

可以看出默认的分词器standard对英文分词有着不错的效果。

3.1 创建一个索引

一个索引相当于一个数据库,每个索引都可以有自己的单独的单独配置。

中文测试

curl -X POST "localhost:9200/_analyze" -H 'Content-Type: application/json' -d'       
{
  "analyzer": "standard",
  "text": "创业邦,帮助创业者走向成功的平台"                              
}
'
{"tokens":[{"token":"创","start_offset":0,"end_offset":1,"type":"<IDEOGRAPHIC>","position":0},{"token":"业","start_offset":1,"end_offset":2,"type":"<IDEOGRAPHIC>","position":1},{"token":"邦","start_offset":2,"end_offset":3,"type":"<IDEOGRAPHIC>","position":2},{"token":"帮","start_offset":4,"end_offset":5,"type":"<IDEOGRAPHIC>","position":3},{"token":"助","start_offset":5,"end_offset":6,"type":"<IDEOGRAPHIC>","position":4},{"token":"创","start_offset":6,"end_offset":7,"type":"<IDEOGRAPHIC>","position":5},{"token":"业","start_offset":7,"end_offset":8,"type":"<IDEOGRAPHIC>","position":6},{"token":"者","start_offset":8,"end_offset":9,"type":"<IDEOGRAPHIC>","position":7},{"token":"走","start_offset":9,"end_offset":10,"type":"<IDEOGRAPHIC>","position":8},{"token":"向","start_offset":10,"end_offset":11,"type":"<IDEOGRAPHIC>","position":9},{"token":"成","start_offset":11,"end_offset":12,"type":"<IDEOGRAPHIC>","position":10},{"token":"功","start_offset":12,"end_offset":13,"type":"<IDEOGRAPHIC>","position":11},{"token":"的","start_offset":13,"end_offset":14,"type":"<IDEOGRAPHIC>","position":12},{"token":"平","start_offset":14,"end_offset":15,"type":"<IDEOGRAPHIC>","position":13},{"token":"台","start_offset":15,"end_offset":16,"type":"<IDEOGRAPHIC>","position":14}]}%  

结果会把每个汉字都分词,效果不好。下面采用ik分词器进行分词测试。

 curl -X POST "localhost:9200/_analyze" -H 'Content-Type: application/json' -d'
{
  "analyzer": "ik_max_word",
  "text": "创业邦,帮助创业者走向成功的平台"
}
'

分词良好

{"tokens":[{"token":"创业","start_offset":0,"end_offset":2,"type":"CN_WORD","position":0},{"token":"邦","start_offset":2,"end_offset":3,"type":"CN_CHAR","position":1},{"token":"帮助","start_offset":4,"end_offset":6,"type":"CN_WORD","position":2},{"token":"创业者","start_offset":6,"end_offset":9,"type":"CN_WORD","position":3},{"token":"创业","start_offset":6,"end_offset":8,"type":"CN_WORD","position":4},{"token":"业者","start_offset":7,"end_offset":9,"type":"CN_WORD","position":5},{"token":"走向","start_offset":9,"end_offset":11,"type":"CN_WORD","position":6},{"token":"成功","start_offset":11,"end_offset":13,"type":"CN_WORD","position":7},{"token":"的","start_offset":13,"end_offset":14,"type":"CN_CHAR","position":8},{"token":"平台","start_offset":14,"end_offset":16,"type":"CN_WORD","position":9}]}% 

3.2 创建索引并使用分词器

 curl -X PUT "localhost:9200/cyzone" -H 'Content-Type: application/json' -d'
{
    "settings":{
        "index":{
            "number_of_shards":3,
            "number_of_replicas":2
        }
    },
    "mappings":{
        "goods":{ 
            "properties":{  
                "name":{
                    "type":"text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_smart"
                },                               
                "content":{
                    "type":"text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_smart"
                },                               
                "id":{
                    "type":"long"
                }                
            }
        }
    }
}'

3.3给索引的type为goods的添加一些数据

➜  ~ curl -X POST "localhost:9200/cyzone/goods" -H 'Content-Type: application/json' -d'
{
    "id" : "36",
    "name" : "创业邦会员 — 找人、找钱、找项目!",
    "content" : "这其实是内容,但是中间有双引号和单引号,我暂时不拼接了"
}'
{"_index":"cyzone","_type":"goods","_id":"1qtDo2oBACchbOnLTGLF","_version":1,"result":"created","_shards":{"total":3,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}%                                                                                                                                             ➜  ~ curl -X POST "localhost:9200/cyzone/goods" -H 'Content-Type: application/json' -d'
{
    "id" : "143",
    "name" : "创业邦会员 — 优惠价格 happy",       
    "content" : "just for test vip,创业"                               
}'
{"_index":"cyzone","_type":"goods","_id":"16tEo2oBACchbOnLZmK4","_version":1,"result":"created","_shards":{"total":3,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}%               

可以指定id,也可以不指定id,指定id的时候就是cyzone/goods/具体id号,但是要把POST改为PUT,不指定ID,ES内部有自增的id。

3.4测试搜索功能是否好用

简单搜索
https://www.elastic.co/guide/en/elasticsearch/reference/6.7/search-uri-request.html

➜  ~ curl -X GET "localhost:9200/cyzone/_search?q=id:143" 

{"took":1,"timed_out":false,"_shards":{"total":3,"successful":3,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"cyzone","_type":"goods","_id":"16tEo2oBACchbOnLZmK4","_score":1.0,"_source":
{
    "id" : "143",
    "name" : "创业邦会员 — 优惠价格 happy",
    "content" : "just for test vip,创业"
}}]}}%  

复杂的搜索

https://www.elastic.co/guide/en/elasticsearch/reference/6.7/search-request-body.html

➜  ~ curl -XPOST "http://localhost:9200/cyzone/_search?pretty"  -H 'Content-Type:application/json' -d'
{
    "query" : { "match" : { "content" : "创业" }}
}
'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "cyzone",
        "_type" : "goods",
        "_id" : "16tEo2oBACchbOnLZmK4",
        "_score" : 0.2876821,
        "_source" : {
          "id" : "143",
          "name" : "创业邦会员 — 优惠价格 happy",
          "content" : "just for test vip,创业"
        }
      }
    ]
  }
}

再次尝试搜索name

➜  ~ curl -XPOST "http://localhost:9200/cyzone/_search?pretty"  -H 'Content-Type:application/json' -d'
{
    "query" : { "match" : { "name" : "创业" }}
}
'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "cyzone",
        "_type" : "goods",
        "_id" : "16tEo2oBACchbOnLZmK4",
        "_score" : 0.2876821,
        "_source" : {
          "id" : "143",
          "name" : "创业邦会员 — 优惠价格 happy",
          "content" : "just for test vip,创业"
        }
      },
      {
        "_index" : "cyzone",
        "_type" : "goods",
        "_id" : "1qtDo2oBACchbOnLTGLF",
        "_score" : 0.2876821,
        "_source" : {
          "id" : "36",
          "name" : "创业邦会员 — 找人、找钱、找项目!",
          "content" : "这其实是内容,但是中间有双引号和单引号,我暂时不拼接了"
        }
      }
    ]
  }
}

4.导入数据库已有的数据

使用logstash工具同步数据库的数据到ES,不仅仅是logstash,但是这个推荐的比较多,官方网站上有这个。
https://www.elastic.co/cn/products/logstash

➜  Desktop brew install logstash     
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
imapsync
==> Updated Formulae
aliyun-cli      conan           i2pd            joplin          lxc             postgresql@9.6  socat           wtf
calicoctl       doctl           imagemagick     juju            netpbm          pygobject       ttyd            yarn
cfn-lint        glooctl         istioctl        kotlin          oniguruma       serverless      ucloud          youtube-dl
checkbashisms   gtk-doc         jhipster        kubeprod        opencoarrays    skaffold        utf8proc

==> Downloading https://artifacts.elastic.co/downloads/logstash/logstash-oss-7.0.1.tar.gz
######################################################################## 100.0%
==> Caveats
Configuration files are located in /usr/local/etc/logstash/

To have launchd start logstash now and restart at login:
  brew services start logstash
Or, if you don't want/need a background service you can just run:
  logstash
==> Summary
  /usr/local/Cellar/logstash/7.0.1: 12,515 files, 283.9MB, built in 22 minutes 49 seconds

进入logstash 相关目录下安装插件logstash-input-jdbc

cd  /usr/local/Cellar/logstash/7.0.1
./logstash-plugin install logstash-input-jdbc

插件使用文档:https://www.elastic.co/guide/en/logstash/7.0/plugins-inputs-jdbc.html

input {
      jdbc {
        type=> 'goods'
        jdbc_driver_library => "/usr/local/Cellar/logstash/7.0.1/bin/mysql-connector-java-5.1.45/mysql-connector-java-5.1.45-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://127.0.0.1:3307/topcms_cyzone_cn_test"
        jdbc_user => "localhost"
        jdbc_password => "123456"
        statement => "SELECT id,name,content from shop order by id desc"
        tracking_column => "id"
        jdbc_paging_enabled=>"true"
        jdbc_page_size => "5000"
        #tracking_column => "timestamp"
        #tracking_column_type=>numeric
        schedule => "* * * * *"
      }
    }
    filter {
    }
    output {
        if[type]=="goods"{
            elasticsearch {
                hosts => ["127.0.0.1:9200"]
                index => "cyzone"
                document_id => "%{id}"
                            document_type=>"goods"
            }
        }
        stdout {
            codec => json_lines
        }
    }

配置参数具体以官方文档为准,不同的版本有不同的设置效果,比如document_type这个字段文档说以后要废弃,但是我用的是7.0,这个地方还必须用。

document_type参数在下一个大版本废弃.png

重复之前的测试,发现数据确实导入了,这里只导入了goods表的数据。如果需要导入其他的数据,可以在上面脚本接着扩展,后面还要对参数具体研究,来实现数据库变化的时候自动同步数据到ES。

5.安装Elasticsearch的PHP库

https://github.com/elastic/elasticsearch-php

自己根据实际情况封装ES管理类。

1.基础抽象类,提供公共的增删改成方法。

<?php

namespace app\common\lib\es;

use Elasticsearch\ClientBuilder;
use app\common\es\lib\ESBaseDoc;

/**
 * 基本的ES工具类
 * Class CyEsTool
 * @package app\common\es\goods
 */
abstract class AbstractES
{
    /**
     * @var ClientBuilder
     */
    private $client;

    private static $instance;

    /**
     * 索引名称相当于数据库
     * @var string
     */
    protected $index = "cyzone";

    /**
     * 索引类型,相当于表
     * @var string
     */
    protected $type = "";

    /**
     * 创建索引的时候的mapping信息
     * @var array
     */
    protected $mappings = [

    ];

    /**
     * 默认的mappings信息
     * @var array
     */
    private $defaultMappings = [

        '_default_' => [ //默认配置,每个类型缺省的配置使用默认配置
            '_all' => [   //  关闭所有字段的检索
                'enabled' => 'false'
            ],
            '_source' => [   //  存储原始文档
                'enabled' => 'true'
            ],
        ]
    ];

    /**
     * 创建索引的时候的配置信息
     * @var array
     */
    private $setting = [
        "index" => [
            "number_of_shards" => 3,
            "number_of_replicas" => 2
        ]
    ];

    private function __construct()
    {
        $this->client = ClientBuilder::create()
            ->setHosts(['127.0.0.1:9200'])
            ->build();
    }

    public static function getInstance()
    {
        if (is_null(self::$instance)) {
            self::$instance = new static();
        }
        return self::$instance;
    }

    /**
     * 获取默认的搜索字段,就是mapping里面的配置
     * @param array $field
     * @param bool $exceptId
     * @return array
     */
    protected function getSearchFiled($field = [], $exceptId = true)
    {
        if ($field) {
            return $field;
        }
        $properties = $this->mappings[$this->type]['properties']??[];
        if (empty($properties)) {
            return [];
        }
        $fields = array_keys($properties);
        foreach ($fields as $key => $value) {
            if ($exceptId && strpos($value, "id") !== false) {
                unset($fields[$key]);
            }
        }
        return $fields;
    }

    /**
     * 查看Mapping
     */
    public function getMappings()
    {
        $params = [
            'index' => $this->index
        ];
        $res = $this->client->indices()->getMapping($params);
        return $res;
    }

    /**
     * 修改Mapping
     * @return array
     */
    public function putMappings()
    {
        $mappings = array_merge($this->defaultMappings, $this->mappings);
        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'body' => [
                $mappings
            ]
        ];

        return $this->client->indices()->putMapping($params);
    }

    /**
     * 插入单条的文档
     * @param ESBaseDoc $baseDoc
     * @return array
     */
    public function insertOneDoc(ESBaseDoc $baseDoc)
    {
        //可以对param适当做些检查
        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'body' => [
                $baseDoc->toArray()
            ]
        ];
        return $this->client->index($params);
    }

    /**
     * @param ESBaseDoc[] $docArray
     */
    public function postBulkDoc(array $docArray)
    {
        if (count($docArray) == 0) {
            return;
        }
        $params = [];
        for ($i = 0; $i < count($docArray); $i++) {
            $params['body'][] = [
                'index' => [
                    '_index' => $this->index,
                    '_type' => $this->type,
                ]
            ];
            $params['body'][] = [
                $docArray[$i]->toArray()
            ];
        }
        $this->client->bulk($params);
    }

    /**
     * 根据id获得doc
     * @param $id
     * @return array|bool
     */
    public function getDocById($id)
    {
        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'id' => $id
        ];
        try {
            return $this->client->get($params);
        } catch (\Exception $exception) {
            return false;
        }
    }

    /**
     * 根据id更新文档的内容
     * @param $id
     * @param ESBaseDoc $baseDoc
     * @return array|bool
     */
    public function updateDocById($id, ESBaseDoc $baseDoc)
    {

        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'id' => $id,
            'body' => [
                'doc' => [
                    $baseDoc->toArray()
                ]
            ]
        ];
        try {
            return $this->client->update($params);
        } catch (\Exception $exception) {
            return false;
        }
    }

    /**
     * 根据id删除文档的内容
     * @param $id
     * @return array |bool
     */
    public function deleteDocById($id)
    {

        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'id' => $id
        ];
        try {
            return $this->client->delete($params);
        } catch (\Exception $exception) {
            return false;
        }
    }


    //Query的参数 https://www.elastic.co/guide/en/elasticsearch/reference/6.7/query-filter-context.html
    //https://es.xiaoleilu.com/054_Query_DSL/70_Important_clauses.html
    /**
     * 多个字段查询搜索,默认搜索可以用这个
     * @param $keyWords
     * @param array $field
     * @return array
     */
    public function search($keyWords, $field = [])
    {
        $params = [
            'index' => $this->index,
            'type' => $this->type,
            'body' => [
                'query' => [
                    'multi_match' => [
                        'query' => $keyWords,
                        "fields" => $this->getSearchFiled($field)
                    ]
                ]
            ]
        ];

        return $this->client->search($params);
    }
}
  1. 具体业务类

商品相关的ES类

<?php
/**
 * Created by PhpStorm.
 * User: guodong
 * Date: 2019/5/12
 * Time: 下午5:04
 */

namespace app\common\es\lib;

class ESBaseDoc
{

    private $initParams;

    public function __construct(array $param)
    {
       foreach ($param as $key => $value){
           $reflect = new \ReflectionProperty(static::class,$key);
           if ($reflect->isPublic()){
               if (property_exists($this,$key)){
                   $this->$key = $value;
               }
           }
       }
       $this->initParams = $param;
    }

    public function toArray()
    {
        return $this->initParams;
    }
}
  1. 具体的数据对应document对象。
<?php
/**
 * Created by PhpStorm.
 * User: guodong
 * Date: 2019/5/12
 * Time: 下午5:04
 */

namespace app\common\es\lib;

class ESBaseDoc
{

    private $initParams;

    public function __construct(array $param)
    {
       foreach ($param as $key => $value){
           $reflect = new \ReflectionProperty(static::class,$key);
           if ($reflect->isPublic()){
               if (property_exists($this,$key)){
                   $this->$key = $value;
               }
           }
       }
       $this->initParams = $param;
    }

    public function toArray()
    {
        return $this->initParams;
    }
}

具体的业务类要继承这个类

namespace app\common\lib\es\goods;

use app\common\es\lib\ESBaseDoc;

class ESGoodsDoc extends ESBaseDoc
{
    public $id;
    public $name;
    public $content;
}

参考文档

https://www.cnblogs.com/ajianbeyourself/p/5529575.html
https://help.aliyun.com/document_detail/58107.html?spm=a2c4g.11186623.6.543.778473bf6G1rrB
https://www.elastic.co/guide/en/elasticsearch/reference/6.7/index.html
https://es.xiaoleilu.com/052_Mapping_Analysis/45_Mapping.html

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 158,847评论 4 362
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 67,208评论 1 292
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 108,587评论 0 243
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 43,942评论 0 205
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 52,332评论 3 287
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 40,587评论 1 218
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 31,853评论 2 312
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 30,568评论 0 198
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 34,273评论 1 242
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 30,542评论 2 246
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 32,033评论 1 260
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 28,373评论 2 253
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 33,031评论 3 236
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 26,073评论 0 8
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 26,830评论 0 195
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 35,628评论 2 274
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 35,537评论 2 269

推荐阅读更多精彩内容