Efficient Article Search with Elasticsearch in Practice

Features

[Image: image-20250610211130171]

Creating the index and mapping

[Image: image-20250610213446050]

Adding the mapping and querying with Postman

[Image: image-20250610213733416]

Querying all article data and bulk-importing it into the ES index

[Image: image-20250610214358841]
server:
  port: 9999
spring:
  application:
    name: es-article
  datasource:
    driver-class-name: com.mysql.jdbc.Driver
    url: jdbc:mysql://localhost:3306/leadnews_article?useUnicode=true&characterEncoding=UTF-8&serverTimezone=UTC
    username: root
    password: root
# Location of the XML files for the Mapper interfaces; required if a Mapper interface defines custom methods
mybatis-plus:
  mapper-locations: classpath*:mapper/*.xml
  # Package scanned for type aliases; classes in this package are registered under an alias
  type-aliases-package: com.heima.model.article.pojos
# Custom elasticsearch connection settings
elasticsearch:
  host: 192.168.200.130
  port: 9200

Importing into the ES index

/**
 * Note: if the data set is large, import it page by page instead of in a single request.
 *
 * @throws Exception
 */
@Test
public void init() throws Exception {
    // Query all articles that match the conditions
    List<SearchArticleVo> searchArticleVos = apArticleMapper.loadArticleList();

    // Bulk-import them into the ES index
    BulkRequest bulkRequest = new BulkRequest("app_info_article");
    for (SearchArticleVo searchArticleVo : searchArticleVos) {
        IndexRequest indexRequest = new IndexRequest()
                .id(searchArticleVo.getId().toString())
                .source(JSON.toJSONString(searchArticleVo), XContentType.JSON);
        // Add the document to the bulk request
        bulkRequest.add(indexRequest);
    }
    restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
}
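
The note on the import test says that a large data set should be imported in pages. As a minimal sketch of that idea, the helper below splits a list into fixed-size batches using only the JDK. The class name BulkBatcher and the batch size are hypothetical and not part of the original project; in the real import, each batch would presumably be turned into its own BulkRequest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BulkBatcher {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> articleIds = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        for (List<Integer> batch : partition(articleIds, 3)) {
            // Assumption: one bulk request would be sent per batch here
            System.out.println(batch);
        }
    }
}
```

Keeping each request small bounds the memory used per call and limits how much work has to be retried if one request fails.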

Requirements

[Image: image-20250610221421008]

Search API definition

[Image: image-20250610222047807]

UserSearchDto

[Image: image-20250610222355123]

Implementation steps

[Image: image-20250612193842072]

[Image: image-20250612193949445]

[Image: image-20250612194110538]

Article search service implementation

/**
 * Paged article search against ES
 *
 * @param dto
 * @return
 */
@Override
public ResponseResult search(UserSearchDto dto) throws IOException {
    // 1. Validate parameters
    if (dto == null || StringUtils.isBlank(dto.getSearchWords())) {
        return ResponseResult.errorResult(AppHttpCodeEnum.PARAM_INVALID);
    }

    ApUser user = AppThreadLocalUtil.getUser();
    // Save the search record asynchronously (only when fromIndex == 0)
    if (user != null && dto.getFromIndex() == 0) {
        apUserSearchService.insert(dto.getSearchWords(), user.getId());
    }

    // 2. Build the query
    SearchRequest searchRequest = new SearchRequest("app_info_article");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

    // Bool query
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();

    // Match the analyzed keywords against title and content
    QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders
            .queryStringQuery(dto.getSearchWords())
            .field("title")
            .field("content")
            .defaultOperator(Operator.OR);
    boolQueryBuilder.must(queryStringQueryBuilder);

    // Only documents published before minBehotTime
    RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("publishTime")
            .lt(dto.getMinBehotTime().getTime());
    boolQueryBuilder.filter(rangeQueryBuilder);

    // Paging
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(dto.getPageSize());

    // Sort by publish time, newest first
    searchSourceBuilder.sort("publishTime", SortOrder.DESC);

    // Highlight the title
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.field("title");
    highlightBuilder.preTags("<font style='color: red; font-size: inherit;'>");
    highlightBuilder.postTags("</font>");
    searchSourceBuilder.highlighter(highlightBuilder);

    searchSourceBuilder.query(boolQueryBuilder);
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

    // 3. Wrap and return the results
    List<Map> list = new ArrayList<>();
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        String json = hit.getSourceAsString();
        Map map = JSON.parseObject(json, Map.class);
        // Handle highlighting
        if (hit.getHighlightFields() != null && hit.getHighlightFields().size() > 0) {
            Text[] titles = hit.getHighlightFields().get("title").getFragments();
            String title = StringUtils.join(titles);
            // Highlighted title
            map.put("h_title", title);
        } else {
            // Original title
            map.put("h_title", map.get("title"));
        }
        list.add(map);
    }
    return ResponseResult.okResult(list);
}
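
The search method always uses from(0) and pages by time instead: it keeps only documents whose publishTime is below minBehotTime, sorts newest first, and takes pageSize hits. The sketch below models that cursor-style paging in memory with plain JDK collections, so the behaviour can be checked without a cluster. The class TimeCursorPaging and its method are hypothetical illustrations, not part of the original service.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TimeCursorPaging {

    /** Returns up to pageSize publish times strictly older than the cursor, newest first. */
    static List<Long> nextPage(List<Long> publishTimes, long minBehotTime, int pageSize) {
        return publishTimes.stream()
                .filter(t -> t < minBehotTime)      // same role as the range filter .lt(minBehotTime)
                .sorted(Comparator.reverseOrder())  // same role as sort("publishTime", DESC)
                .limit(pageSize)                    // same role as size(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> publishTimes = Arrays.asList(50L, 10L, 40L, 20L, 30L);
        long cursor = Long.MAX_VALUE; // first request: no upper bound yet
        List<Long> page = nextPage(publishTimes, cursor, 2);
        while (!page.isEmpty()) {
            System.out.println(page);
            // The oldest item on this page becomes the cursor for the next request
            cursor = page.get(page.size() - 1);
            page = nextPage(publishTimes, cursor, 2);
        }
    }
}
```

One caveat of a strict less-than cursor: if several articles share exactly the same publishTime and straddle a page boundary, the ones left over are skipped on the next request.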

Adding to the index when a new article is created

/**
 * Create the article index entry
 *
 * @param apArticle
 * @param content
 * @param path
 */
private void createArticleEsIndex(ApArticle apArticle, String content, String path) {
    SearchArticleVo vo = new SearchArticleVo();
    BeanUtils.copyProperties(apArticle, vo);
    vo.setContent(content);
    vo.setStaticUrl(path);
    // Publish the article to Kafka; the search service consumes it and writes it to ES
    kafkaTemplate.send(ArticleConstants.ARTICLE_ES_SYNC_TOPIC, JSON.toJSONString(vo));
}

Syncing article data

@Component
@Slf4j
public class SyncArticleListener {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    /**
     * Sync article data
     *
     * @param message
     */
    @KafkaListener(topics = ArticleConstants.ARTICLE_ES_SYNC_TOPIC)
    public void onMessage(String message) {
        if (StringUtils.isNotBlank(message)) {
            SearchArticleVo searchArticleVo = JSON.parseObject(message, SearchArticleVo.class);
            IndexRequest indexRequest = new IndexRequest("app_info_article");
            // Use the article id as the document id
            indexRequest.id(searchArticleVo.getId().toString());
            indexRequest.source(message, XContentType.JSON);
            try {
                restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
            } catch (IOException e) {
                log.error("sync es error = {}", e.getMessage(), e);
            }
        }
    }
}

Search history

Requirements

[Image: image-20250613221520518]

Data storage

[Image: image-20250613221753764]

Saving search history: approach

[Image: image-20250613230006106]

[Image: image-20250613230334610]

[Image: image-20250613230414766]

[Image: image-20250613230621594]

User search service implementation

Saving a user's search history

/**
 * Save a user's search history
 *
 * @param keyword
 * @param userId
 */
@Override
@Async
public void insert(String keyword, Integer userId) {
    // 1. Look up this keyword in the current user's history
    Query query = Query.query(Criteria.where("userId").is(userId).and("keyword").is(keyword));
    ApUserSearch apUserSearch = mongoTemplate.findOne(query, ApUserSearch.class);

    // 2. Already present: refresh its created time
    if (apUserSearch != null) {
        apUserSearch.setCreatedTime(new Date());
        mongoTemplate.save(apUserSearch);
        return;
    }

    // 3. Not present: check whether the user already has 10 history records
    apUserSearch = new ApUserSearch();
    apUserSearch.setUserId(userId);
    apUserSearch.setKeyword(keyword);
    apUserSearch.setCreatedTime(new Date());

    Query query1 = Query.query(Criteria.where("userId").is(userId));
    query1.with(Sort.by(Sort.Direction.DESC, "createdTime"));
    List<ApUserSearch> apUserSearchList = mongoTemplate.find(query1, ApUserSearch.class);

    if (apUserSearchList == null || apUserSearchList.size() < 10) {
        mongoTemplate.save(apUserSearch);
    } else {
        // The list is sorted newest first, so the last element is the oldest record; replace it
        ApUserSearch lastUserSearch = apUserSearchList.get(apUserSearchList.size() - 1);
        mongoTemplate.findAndReplace(Query.query(Criteria.where("id").is(lastUserSearch.getId())), apUserSearch);
    }
}
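
The save logic follows three rules: a repeated keyword only refreshes its timestamp, a new keyword is added while the user has fewer than 10 records, and otherwise the oldest record is replaced. The sketch below reproduces those rules for a single user with an in-memory map so they can be exercised without MongoDB. The class SearchHistory and its methods are hypothetical stand-ins, not the project's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SearchHistory {
    static final int MAX_ENTRIES = 10;

    // keyword -> time of the most recent search (one user's history)
    private final Map<String, Long> lastSearched = new HashMap<>();

    /** Records a search, evicting the oldest keyword when the history is full. */
    void record(String keyword, long now) {
        boolean isNew = !lastSearched.containsKey(keyword);
        if (isNew && lastSearched.size() >= MAX_ENTRIES) {
            // Find and drop the entry with the smallest timestamp
            String oldest = Collections.min(lastSearched.entrySet(), Map.Entry.comparingByValue()).getKey();
            lastSearched.remove(oldest);
        }
        // Inserts a new keyword or refreshes the timestamp of an existing one
        lastSearched.put(keyword, now);
    }

    /** Returns the keywords ordered from most to least recently searched. */
    List<String> newestFirst() {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(lastSearched.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> keywords = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entries) {
            keywords.add(entry.getKey());
        }
        return keywords;
    }

    public static void main(String[] args) {
        SearchHistory history = new SearchHistory();
        for (int i = 0; i <= 10; i++) {
            history.record("keyword" + i, i);
        }
        // keyword0 was the oldest entry and has been evicted by keyword10
        System.out.println(history.newestFirst());
    }
}
```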

Querying a user's search history

/**
 * Query a user's search history
 *
 * @return
 */
@Override
public ResponseResult findUserSearch() {
    // Get the current user
    ApUser user = AppThreadLocalUtil.getUser();
    if (user == null) {
        return ResponseResult.errorResult(AppHttpCodeEnum.NEED_LOGIN);
    }

    // Query the current user's records, newest first
    List<ApUserSearch> apUserSearches = mongoTemplate.find(
            Query.query(Criteria.where("userId").is(user.getId()))
                    .with(Sort.by(Sort.Direction.DESC, "createdTime")),
            ApUserSearch.class);
    return ResponseResult.okResult(apUserSearches);
}

Deleting a user's search history

/**
 * Delete a user's search history record
 *
 * @param dto
 * @return
 */
@Override
public ResponseResult delUserSearch(HistorySearchDto dto) {
    // Validate parameters
    if (dto.getId() == null) {
        return ResponseResult.errorResult(AppHttpCodeEnum.PARAM_INVALID);
    }

    // Check that the user is logged in
    ApUser user = AppThreadLocalUtil.getUser();
    if (user == null) {
        return ResponseResult.errorResult(AppHttpCodeEnum.NEED_LOGIN);
    }

    // Delete the record, matching on both user id and record id
    mongoTemplate.remove(
            Query.query(Criteria.where("userId").is(user.getId()).and("id").is(dto.getId())),
            ApUserSearch.class);
    return ResponseResult.okResult(AppHttpCodeEnum.SUCCESS);
}

[Image: image-20250614173542679]

[Image: image-20250614173634456]

Keyword suggestion service implementation

[Image: image-20250614175924606]

[Image: image-20250614180314561]

[Image: image-20250614180446769]

Suggestion query

/**
 * Suggestion query
 *
 * @param dto
 * @return
 */
@Override
public ResponseResult search(UserSearchDto dto) {
    // Validate parameters
    if (StringUtils.isBlank(dto.getSearchWords())) {
        return ResponseResult.errorResult(AppHttpCodeEnum.PARAM_INVALID);
    }

    // Cap the page size
    if (dto.getPageSize() > 20) {
        dto.setPageSize(20);
    }

    // Fuzzy "contains" query. Pattern.quote (java.util.regex.Pattern) escapes the whole
    // search term; a single leading backslash would only escape its first character.
    Query query = Query.query(Criteria.where("associateWords")
            .regex(".*" + Pattern.quote(dto.getSearchWords()) + ".*"));
    query.limit(dto.getPageSize());
    List<ApAssociateWords> apAssociateWords = mongoTemplate.find(query, ApAssociateWords.class);
    return ResponseResult.okResult(apAssociateWords);
}
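
Because the suggestion lookup builds a regular expression from user input, characters such as + or . in the search words can change what the pattern matches. As a minimal sketch, the helper below builds a "contains" pattern with java.util.regex.Pattern.quote so that the input is always treated literally. The class SuggestionRegex and the method containsPattern are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

class SuggestionRegex {

    /** Builds a regex that matches any string containing userInput literally. */
    static String containsPattern(String userInput) {
        // Pattern.quote wraps the input in \Q...\E, so metacharacters lose their meaning
        return ".*" + Pattern.quote(userInput) + ".*";
    }

    public static void main(String[] args) {
        // "+" is matched as a plain character instead of a quantifier
        System.out.println(Pattern.matches(containsPattern("c++"), "learn c++ fast"));
        // "." only matches a literal dot, so "abc" is not a hit for "a.c"
        System.out.println(Pattern.matches(containsPattern("a.c"), "abc"));
    }
}
```

The \Q...\E quoting syntax produced by Pattern.quote is also understood by PCRE, the regular expression dialect MongoDB uses, so the same string should be usable in a Mongo regex criterion.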