Installing the dynamic-synonym plugin for Elasticsearch

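Today I'll walk through how to install the dynamic-synonym plugin in Elasticsearch (ES). First, download the plugin source from GitHub that matches your ES version. Most versions on GitHub only support a local synonym file or a remote text file; source that uses a database as the synonym store can be found on Gitee. The general idea is to adjust some configuration, create a table to hold the synonyms, drop the packaged plugin into the es-plugins directory, and restart ES. In practice it did not work for me at first and I ran into a lot of problems. I run ES in a Docker container, and the container kept reporting Java permission errors. I had to piece the fix together from many sources online, and I was very happy when it finally worked!

Here is the overall plan:

1. Download the source and modify the dynamic-synonym configuration
2. Add the MySQL code
3. Create a table for dynamic-synonym
4. Modify the java.policy file of the ES container in Docker **[very important]**
5. Put the packaged jar into the {es-root}/es-plugins directory
6. Restart the ES container with Docker
7. Create an ES index that uses dynamic-synonym and test it

**The plugin code I have already configured is provided at the end of the article.** You can jump straight to section 4 or 5, depending on your needs.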

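1. Download the source and modify the configuration

There are many forks of this plugin on GitHub, far too many to go through. I recommend starting from the original source: https://github.com/bells/elasticsearch-analysis-dynamic-synonym. After downloading, switch to the branch that matches your ES version. My ES version is 7.12.1, so the pom file needs to be adjusted accordingly.

1.1 Modify pom.xml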

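<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.bellszhu.elasticsearch</groupId>
    <artifactId>elasticsearch-analysis-dynamic-synonym</artifactId>
    <version>7.12.1</version>
    <packaging>jar</packaging>
    <name>elasticsearch-dynamic-synonym</name>
    <description>Analysis-plugin for synonym</description>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <elasticsearch.version>${project.version}</elasticsearch.version>
        <maven.compiler.target>1.8</maven.compiler.target>
        <elasticsearch.plugin.name>analysis-dynamic-synonym</elasticsearch.plugin.name>
        <elasticsearch.assembly.descriptor>${project.basedir}/src/main/assemblies/plugin.xml</elasticsearch.assembly.descriptor>
        <elasticsearch.plugin.classname>com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin</elasticsearch.plugin.classname>
        <elasticsearch.plugin.jvm>true</elasticsearch.plugin.jvm>
    </properties>

    <licenses>
        <license>
            <name>The Apache Software License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
        </license>
    </licenses>

    <parent>
        <groupId>org.sonatype.oss</groupId>
        <artifactId>oss-parent</artifactId>
        <version>9</version>
    </parent>

    <scm>
        <connection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git</connection>
        <developerConnection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git</developerConnection>
        <url>https://github.com/bells/elasticsearch-analysis-dynamic-synonym</url>
    </scm>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
        </dependency>
        <dependency>
            <groupId>org.codelibs.elasticsearch.module</groupId>
            <artifactId>analysis-common</artifactId>
            <version>7.10.2</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.13.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.13</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.22</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.13.2</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.11.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.codelibs</groupId>
            <artifactId>elasticsearch-cluster-runner</artifactId>
            <version>7.10.2.0</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>${maven.compiler.target}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.11</version>
                <configuration>
                    <includes>
                        <include>**/*Tests.java</include>
                    </includes>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-source-plugin</artifactId>
                <version>2.1.2</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
                        <goals>
                            <goal>jar</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                    <archive>
                        <manifest>
                            <mainClass>fully.qualified.MainClass</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>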

When connecting to MySQL, pay attention to the MySQL driver jar: the JDBC URL differs between driver versions.
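To make the driver-version difference concrete, here is a minimal sketch. `MySqlJdbcSettings` and its methods are hypothetical names used only for illustration, not part of the plugin. It assumes the common case that Connector/J 5.x uses the driver class `com.mysql.jdbc.Driver` while Connector/J 8.x uses `com.mysql.cj.jdbc.Driver`, and that the 8.x URL carries the `serverTimezone` parameter used later in this article.

```java
// Hypothetical helper for illustration only; not part of the plugin source.
final class MySqlJdbcSettings {
    private MySqlJdbcSettings() {}

    /** Returns the JDBC driver class name for a Connector/J major version (e.g. 5 or 8). */
    static String driverClass(int connectorMajorVersion) {
        return connectorMajorVersion >= 8
                ? "com.mysql.cj.jdbc.Driver"   // Connector/J 8.x
                : "com.mysql.jdbc.Driver";     // Connector/J 5.x
    }

    /** Builds a JDBC URL; for 8.x it appends serverTimezone, as in this article's config. */
    static String url(String host, int port, String database, int connectorMajorVersion) {
        String base = "jdbc:mysql://" + host + ":" + port + "/" + database;
        return connectorMajorVersion >= 8 ? base + "?serverTimezone=GMT" : base;
    }
}
```

With the 8.0.22 driver declared in the pom, `MySqlJdbcSettings.url("192.168.255.132", 3306, "word", 8)` yields the same URL that appears in the configuration file in section 3.2.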

2. Add the MySQL code

2.1 Add the MySqlRemoteSynonymFile class

package com.bellszhu.elasticsearch.plugin.synonym.analysis;

import com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.elasticsearch.common.io.PathUtils;
import org.elasticsearch.env.Environment;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class MySqlRemoteSynonymFile implements SynonymFile {

    /** Name of the database configuration file. */
    private final static String DB_PROPERTIES = "jdbc-reload.properties";
    private static Logger logger = LogManager.getLogger("dynamic-synonym");

    private String format;
    private boolean expand;
    private boolean lenient;
    private Analyzer analyzer;
    private Environment env;
    // Value of synonyms_path from the index settings ("MySql" for this class)
    private String location;

    /** Property key: database URL. */
    private static final String JDBC_URL = "jdbc.url";
    /** Property key: database driver class. */
    private static final String JDBC_DRIVER = "jdbc.driver";
    /** Property key: database user name. */
    private static final String JDBC_USER = "jdbc.user";
    /** Property key: database password. */
    private static final String JDBC_PASSWORD = "jdbc.password";

    /** Synonym version currently loaded on this node. */
    private LocalDateTime thisSynonymVersion = LocalDateTime.now();

    private static Connection connection = null;
    private Statement statement = null;
    private Properties props;
    private Path conf_dir;

    MySqlRemoteSynonymFile(Environment env, Analyzer analyzer,
                           boolean expand, boolean lenient, String format, String location) {
        this.analyzer = analyzer;
        this.expand = expand;
        this.format = format;
        this.lenient = lenient;
        this.env = env;
        this.location = location;
        this.props = new Properties();

        // Resolve <plugin dir>/config/jdbc-reload.properties relative to the plugin jar
        Path filePath = PathUtils.get(new File(DynamicSynonymPlugin.class.getProtectionDomain().getCodeSource()
                .getLocation().getPath())
                .getParent(), "config")
                .toAbsolutePath();
        this.conf_dir = filePath.resolve(DB_PROPERTIES);

        // Load the configuration; try-with-resources closes the stream
        try (InputStream input = new FileInputStream(conf_dir.toFile())) {
            props.load(input);
        } catch (FileNotFoundException e) {
            logger.info("database configuration file jdbc-reload.properties not found, " + e);
        } catch (IOException e) {
            logger.error("failed to load database configuration file jdbc-reload.properties, " + e);
        }

        // Record the version currently stored in the database as the starting point
        isNeedReloadSynonymMap();
    }

    /**
     * Loads the synonym dictionary into a SynonymMap.
     * @return SynonymMap
     */
    @Override
    public SynonymMap reloadSynonymMap() {
        try {
            logger.info("start reload local synonym from {}.", location);
            Reader rulesReader = getReader();
            SynonymMap.Builder parser = RemoteSynonymFile.getSynonymParser(rulesReader, format, expand, lenient, analyzer);
            return parser.build();
        } catch (Exception e) {
            logger.error("reload local synonym {} error! cause: {}", location, e.getMessage());
            throw new IllegalArgumentException(
                    "could not reload local synonyms file to build synonyms", e);
        }
    }

    /**
     * Decides whether the synonyms need to be reloaded.
     * @return true or false
     */
    @Override
    public boolean isNeedReloadSynonymMap() {
        try {
            LocalDateTime mysqlLastModify = getMySqlSynonymLastModify();
            // null means the query failed or the table is empty: nothing to reload
            if (mysqlLastModify != null && !thisSynonymVersion.isEqual(mysqlLastModify)) {
                thisSynonymVersion = mysqlLastModify;
                return true;
            }
        } catch (Exception e) {
            logger.error(e);
        }
        return false;
    }

    /**
     * Reads the synonym version (latest modification time) from MySQL.
     * Used to decide whether the synonyms need to be reloaded.
     *
     * @return the latest last_modify value, or null if unavailable
     */
    public LocalDateTime getMySqlSynonymLastModify() {
        ResultSet resultSet = null;
        LocalDateTime mysqlSynonymLastModify = null;
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            resultSet = statement.executeQuery(props.getProperty("jdbc.reload.swith.synonym.last_modify"));
            while (resultSet.next()) {
                Timestamp lastModify = resultSet.getTimestamp("last_modify");
                // MAX(last_modify) is NULL when the table is empty
                if (lastModify != null) {
                    mysqlSynonymLastModify = lastModify.toLocalDateTime();
                }
            }
        } catch (SQLException e) {
            logger.error("failed to query the synonym last_modify time", e);
        } finally {
            closeQuietly(resultSet);
        }
        return mysqlSynonymLastModify;
    }

    /**
     * Queries the synonyms stored in the database.
     * @return one entry per row of the synonym table
     */
    public List<String> getDbData() {
        List<String> rows = new ArrayList<>();
        ResultSet resultSet = null;
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            logger.info("querying synonym list, SQL: {}", props.getProperty("jdbc.reload.synonym.sql"));
            resultSet = statement.executeQuery(props.getProperty("jdbc.reload.synonym.sql"));
            while (resultSet.next()) {
                rows.add(resultSet.getString("words"));
            }
        } catch (SQLException e) {
            logger.error(e);
        } finally {
            closeQuietly(resultSet);
        }
        return rows;
    }

    /**
     * Builds a Reader over the synonym rules.
     * @return Reader
     */
    @Override
    public Reader getReader() {
        StringBuilder sb = new StringBuilder();
        try {
            for (String dbDatum : getDbData()) {
                logger.info("loading synonym: {}", dbDatum);
                // Each row is one synonym group: several words separated by ASCII commas
                sb.append(dbDatum)
                        .append(System.getProperty("line.separator"));
            }
        } catch (Exception e) {
            logger.error("failed to load synonyms", e);
        }
        return new StringReader(sb.toString());
    }

    /**
     * Opens the shared connection if necessary and creates a statement on it.
     * @param props database configuration
     * @throws SQLException if the connection cannot be obtained
     */
    private static Statement getConnection(Properties props) throws SQLException {
        try {
            Class.forName(props.getProperty(JDBC_DRIVER));
        } catch (ClassNotFoundException e) {
            logger.error("failed to load the JDBC driver", e);
        }
        if (connection == null) {
            connection = DriverManager.getConnection(
                    props.getProperty(JDBC_URL),
                    props.getProperty(JDBC_USER),
                    props.getProperty(JDBC_PASSWORD));
        }
        return connection.createStatement();
    }

    /** Closes a ResultSet, logging instead of throwing on failure. */
    private static void closeQuietly(ResultSet resultSet) {
        try {
            if (resultSet != null) {
                resultSet.close();
            }
        } catch (SQLException e) {
            logger.error("failed to close the result set", e);
        }
    }
}
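The reload decision in isNeedReloadSynonymMap comes down to comparing the node's remembered timestamp with the latest last_modify value in the database. A minimal standalone sketch of that comparison follows. `SynonymVersionTracker` is a hypothetical name, and treating a missing timestamp (for example an empty table) as "no reload" is my assumption rather than something the plugin guarantees.

```java
import java.time.LocalDateTime;
import java.util.Objects;

// Hypothetical standalone illustration of the version check; not part of the plugin.
final class SynonymVersionTracker {
    private LocalDateTime current;

    SynonymVersionTracker(LocalDateTime initial) {
        this.current = Objects.requireNonNull(initial, "initial");
    }

    /**
     * Returns true (and remembers the new version) only when the database
     * reports a timestamp that differs from the one currently loaded.
     */
    synchronized boolean needsReload(LocalDateTime lastModifyInDb) {
        if (lastModifyInDb == null || lastModifyInDb.isEqual(current)) {
            return false; // unknown or unchanged: keep the loaded synonyms
        }
        current = lastModifyInDb;
        return true;
    }
}
```

Because the tracker stores the new timestamp as soon as it reports a change, a second poll with the same value returns false, so each modification triggers one reload.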

2.2 Add the MySQL option to getSynonymFile

Modify the resource-loading code in DynamicSynonymTokenFilterFactory:

SynonymFile getSynonymFile(Analyzer analyzer) {
    try {
        SynonymFile synonymFile;
        if ("MySql".equals(location)) {
            synonymFile = new MySqlRemoteSynonymFile(environment, analyzer, expand, lenient, format, location);
        } else if (location.startsWith("http://") || location.startsWith("https://")) {
            synonymFile = new RemoteSynonymFile(
                    environment, analyzer, expand, lenient, format, location);
        } else {
            synonymFile = new LocalSynonymFile(
                    environment, analyzer, expand, lenient, format, location);
        }
        if (scheduledFuture == null) {
            scheduledFuture = pool.scheduleAtFixedRate(new Monitor(synonymFile),
                    interval, interval, TimeUnit.SECONDS);
        }
        return synonymFile;
    } catch (Exception e) {
        logger.error("failed to get synonyms: " + location, e);
        throw new IllegalArgumentException("failed to get synonyms : " + location, e);
    }
}

3. Create a table for dynamic-synonym

3.1 Create the database and table

My database is named word and the table is named synonym.

/*
 Navicat Premium Data Transfer

 Source Server         : localhost
 Source Server Type    : MySQL
 Source Server Version : 50717
 Source Host           : localhost:3306
 Source Schema         : auth

 Target Server Type    : MySQL
 Target Server Version : 50717
 File Encoding         : 65001

 Date: 05/01/2022 17:01:31
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for synonym
-- ----------------------------
DROP TABLE IF EXISTS `synonym`;
CREATE TABLE `synonym` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `words` text CHARACTER SET utf8 COLLATE utf8_bin NULL COMMENT 'synonyms',
  `last_modify` timestamp(0) NULL DEFAULT CURRENT_TIMESTAMP(0) ON UPDATE CURRENT_TIMESTAMP(0) COMMENT 'last update time',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8 COLLATE = utf8_bin ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of synonym
-- ----------------------------
INSERT INTO `synonym` VALUES (1, '西红柿,番茄,洋柿子', '2022-01-05 16:48:24');

SET FOREIGN_KEY_CHECKS = 1;
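Each row's words column holds one comma-separated synonym group, and the plugin code from section 2 concatenates the rows into a text with one group per line before handing it to the synonym parser. A minimal sketch of that conversion is below. `SynonymRules` is a hypothetical helper, and skipping null or blank rows is my own addition for robustness, not behaviour of the original code.

```java
import java.util.List;

// Hypothetical helper for illustration; mirrors what getReader() assembles from the table rows.
final class SynonymRules {
    private SynonymRules() {}

    /** Joins table rows into rule text: one comma-separated synonym group per line. */
    static String toRulesText(List<String> rows) {
        StringBuilder sb = new StringBuilder();
        for (String row : rows) {
            if (row == null || row.trim().isEmpty()) {
                continue; // assumption: ignore empty rows instead of emitting blank rules
            }
            sb.append(row.trim()).append('\n');
        }
        return sb.toString();
    }
}
```

For the single row inserted above, the resulting text is one line containing the three comma-separated words.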

3.2 Modify the database connection configuration file

Add a config/jdbc-reload.properties file at the same level as the project's src directory.

# permission java.net.SocketPermission "*", "connect,resolve";
# CHCP 65001
jdbc.url=jdbc:mysql://192.168.255.132:3306/word?serverTimezone=GMT
jdbc.user=root
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.password=123456
# Query that returns the synonym entries
jdbc.reload.synonym.sql=select words from synonym
# Query that returns the latest modification time
jdbc.reload.swith.synonym.last_modify=SELECT MAX(last_modify) last_modify FROM synonym
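A missing or misspelled key in this file only shows up later as a confusing error inside ES, so it can help to check the file before packaging. The following is a rough sketch of such a check. `JdbcReloadConfigCheck` is a hypothetical class; the key names are taken from the file above, including the `swith` spelling that the plugin code reads.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check for jdbc-reload.properties; not part of the plugin.
final class JdbcReloadConfigCheck {
    // Keys read by MySqlRemoteSynonymFile (spelling kept exactly as in the config file)
    static final String[] REQUIRED_KEYS = {
        "jdbc.url", "jdbc.user", "jdbc.driver", "jdbc.password",
        "jdbc.reload.synonym.sql", "jdbc.reload.swith.synonym.last_modify"
    };

    private JdbcReloadConfigCheck() {}

    /** Parses properties text the same way java.util.Properties loads a file. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the required keys that are absent or blank, in declaration order. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

An empty list from `missingKeys` means every key the plugin reads is present; it does not verify that the URL or credentials are correct.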

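4. Modify the java.policy file of the ES container in Docker **[very important]**

I deploy ES in a Docker container. If ES is installed directly on Windows or CentOS, modify the java.policy file of the JDK that ES depends on, that is, the system JDK. With a Docker deployment, editing the system JDK's java.policy has no effect, because ES inside the container runs on its own bundled JDK, independent of the host's JDK.

4.1 Locate java.policy

First enter the container with docker exec -it es /bin/bash, then change to the /usr/share/elasticsearch/jdk/conf/security/ directory and find the java.policy file.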

[root@localhost ~]# docker exec -it es /bin/bash
[root@ee5fd3f35131 elasticsearch]# cd /usr/share/elasticsearch/jdk/conf/security/
[root@ee5fd3f35131 security]# ls
java.policy  java.security  policy
[root@ee5fd3f35131 security]# vi java.policy

4.2 Modify the java.policy file

The full contents of the file are shown below:

//
// This system policy file grants a set of default permissions to all domains
// and can be configured to grant additional permissions to modules and other
// code sources. The code source URL scheme for modules linked into a
// run-time image is "jrt".
//
// For example, to grant permission to read the "foo" property to the module
// "com.greetings", the grant entry is:
//
// grant codeBase "jrt:/com.greetings" {
//     permission java.util.PropertyPermission "foo", "read";
// };
//

grant codeBase "file:${{java.ext.dirs}}/*" {
    permission java.security.AllPermission;
};

// default permissions granted to all domains
grant {
    // allows anyone to listen on dynamic ports
    permission java.net.SocketPermission "localhost:0", "listen";

    // "standard" properies that can be read by anyone
    permission java.util.PropertyPermission "java.version", "read";
    permission java.util.PropertyPermission "java.vendor", "read";
    permission java.util.PropertyPermission "java.vendor.url", "read";
    permission java.util.PropertyPermission "java.class.version", "read";
    permission java.util.PropertyPermission "os.name", "read";
    permission java.util.PropertyPermission "os.version", "read";
    permission java.util.PropertyPermission "os.arch", "read";
    permission java.util.PropertyPermission "file.separator", "read";
    permission java.util.PropertyPermission "path.separator", "read";
    permission java.util.PropertyPermission "line.separator", "read";
    permission java.util.PropertyPermission
        "java.specification.version", "read";
    permission java.util.PropertyPermission "java.specification.vendor", "read";
    permission java.util.PropertyPermission "java.specification.name", "read";
    permission java.util.PropertyPermission
        "java.vm.specification.version", "read";
    permission java.util.PropertyPermission
        "java.vm.specification.vendor", "read";
    permission java.util.PropertyPermission
        "java.vm.specification.name", "read";
    permission java.util.PropertyPermission "java.vm.version", "read";
    permission java.util.PropertyPermission "java.vm.vendor", "read";
    permission java.util.PropertyPermission "java.vm.name", "read";

    permission java.net.SocketPermission "*", "connect,resolve";
    permission java.lang.RuntimePermission "setContextClassLoader";
    permission java.lang.RuntimePermission "accessDeclaredMembers";
    permission java.lang.RuntimePermission "createClassLoader";
    permission java.security.AllPermission;
};

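5. Put the packaged jar into the {es-root}/es-plugins directory

5.1 Before packaging, double-check your ES version number

5.2 After packaging, unzip the archive and upload it to the plugins directory of ES on the server

I deploy with Docker here. For a local Windows installation, just find the plugins directory and put the files there.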

6. Restart the ES container with Docker

If ES is installed directly on the system, restart it from the elasticsearch/bin directory. Mine runs in a container.

docker restart es

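After the container restarts, check its console output for problems. Permission errors almost always mean that java.policy is not configured correctly. For database errors, create a small local Java project and try connecting to the database to see whether the connection works.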

docker logs -f es
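For the local database check mentioned above, something along these lines is usually enough. This is only a sketch: `DbConnectivityCheck` is a hypothetical class, it uses plain JDBC from the JDK, and it assumes the MySQL driver jar is on the classpath when you run it against a real database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical standalone check; run it outside ES with the MySQL driver on the classpath.
final class DbConnectivityCheck {
    private DbConnectivityCheck() {}

    /** Returns true if a connection can be opened and validated within 5 seconds. */
    static boolean canConnect(String url, String user, String password) {
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            return connection.isValid(5);
        } catch (SQLException e) {
            // Typical causes: wrong URL, missing driver jar, bad credentials, unreachable host
            System.err.println("connection failed: " + e.getMessage());
            return false;
        }
    }
}
```

Call it with the same jdbc.url, jdbc.user and jdbc.password values as in jdbc-reload.properties. If it returns false outside ES as well, the problem is in the database settings rather than in the plugin or java.policy.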

7. Create an ES index that uses dynamic-synonym and test it

PUT synonyms_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "synonym": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["synonym_custom"]
        }
      },
      "filter": {
        "synonym_custom": {
          "type": "dynamic_synonym",
          "synonyms_path": "MySql"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "synonym"
      }
    }
  }
}

GET /synonyms_index/_analyze
{
  "text": "西红柿",
  "analyzer": "synonym"
}

If the request succeeds, the plugin is running. Time to celebrate!

To remove the test index afterwards:

DELETE synonyms_index

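8. Summary

8.1 Source code

This project took me about a day. To save you the time, you can directly download the source I have already configured.

8.2 Wrap-up

After a day of digging, I finally have a rough understanding of how ES plugins run, which lays the groundwork for later work on auto-completion, search tuning, ad recommendation, and aggregation queries.

I'll write those up when I get to them. Finally, thank you for your support.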
