当前位置：首页 > news >正文

记一次分表的实操（ShardingSphere）

news 2025/6/6 10:05:55

官网：

https://shardingsphere.apache.org/document/current/cn/overview/

介绍：

Apache ShardingSphere 是一款分布式的数据库生态系统，可以将任意数据库转换为分布式数据库，并通过数据分片、弹性伸缩、加密等能力对原有数据库进行增强。

Apache ShardingSphere 设计哲学为 Database Plus，旨在构建异构数据库上层的标准和生态。它关注如何充分合理地利用数据库的计算和存储能力，而并非实现一个全新的数据库。它站在数据库的上层视角，关注它们之间的协作多于数据库自身。

ShardingSphere-JDBC（无需独立部署，仅支持java语言，支持多种数据库和ORM框架）

ShardingSphere-JDBC 定位为轻量级 Java 框架，在 Java 的 JDBC 层提供的额外服务。

ShardingSphere-Proxy（相当于中间件实现，支持所有异构语言，需要独立部署，数据库的访问链接都要变成这个中间件）

ShardingSphere-Proxy 定位为透明化的数据库代理端，通过实现数据库二进制协议，对异构语言提供支持。

问题背景：

公司每月会有一批数据导入，大概800W，希望这些数据增删改查快，所以决定对这个表进行分表，分表的纬度按照按活动年月分表

解决方案：

:::info
技术栈：Springboot+MybatisPlus+ShardingSphere-JDBC

ShardingSphere的底层逻辑：

帮我们完成了对sql的改造，比如，我们增删改查的是STORE_PLANNING这张表，如果带上了分片字段（plan_date），那么就会根据我们的配置，映射到真正的物理表上，比如原本的语句是select * from STORE_PLANNING where plan_date=‘202506’，ShardingSphere会根据plan_date转成 select * from STORE_PLANNING_202506 where plan_date=‘202506’。

:::

表结构

create table STORE_PLANNING
(id              bigint                             not null comment 'ID'primary key,file_id         varchar(100)                       not null comment '文件指纹',plan_date       varchar(20)                        not null comment '生意计划年月（yyyymm）',war_zone        varchar(100)                       null comment '战区',province        varchar(100)                       null comment '省区',area            varchar(100)                       null comment '地区',city            varchar(100)                       null comment '城市',dealer_code     varchar(100)                       null comment '经销商编码',dealer_name     varchar(255)                       null comment '经销商名称',terminal_code   varchar(100)                       null comment '门店编码',terminal_name   varchar(400)                       null comment '门店名称',dealer_type     varchar(20)                        null comment '经销商类别（第一类、第二类、第四类）',dealer_category varchar(20)                        null comment '经销商品类（冲泡、即饮）',check_status    int      default 0                 not null comment '合规检验状态（默认0；0:待处理；1:检验失败；2:检验通过）',sync_status     int      default 0                 not null comment '同步数据状态（默认0；0:待同步；1:同步失败；2:同步通过）',col_a           text                               null comment '列a',col_b           text                               null comment '列b',col_c           text                               null comment '列c',check_error     text                               null comment '合规校验异常',sfa_sync_error  text                               null comment 'SFA同步数据异常',exp_sync_error  text                               null comment 'EXP同步数据异常',sys_error       text                               null comment '系统失败信息',remark          varchar(100)                       null comment '备注',del_flag        char     default '0'               not null comment '逻辑删除(0:正常 1:删除)',create_time     datetime default CURRENT_TIMESTAMP not null comment '创建时间',creator         varchar(100)                       null comment '创建人',update_time     datetime default CURRENT_TIMESTAMP not null comment '更新时间',updater         varchar(100)                       null comment '更新人'
)comment '门店规划表' row_format = DYNAMIC;

配置文件的修改

spring:shardingsphere:props:sql-show: truedatasource: # 数据源配置，可配置多个names: busi_plan  # 数据源名称，可填写多个，逗号隔开busi_plan: # 数据源名称type: com.zaxxer.hikari.HikariDataSourcedriver-class-name: com.mysql.cj.jdbc.Driverjdbc-url: jdbc:mysql://数据库ip:3306/busi_plan?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghaiusername: 数据库用户password: 数据库密码hikari:pool-name: UAC_HikariCPminimum-idle: 10                # 最小空闲连接数量idle-timeout: 180000            # 空闲连接存活最大时间，默认600000（10分钟）maximum-pool-size: 100          # 连接池最大连接数，默认是10auto-commit: true               # 此属性控制从池返回的连接的默认自动提交行为,默认值：trueconnection-timeout: 30000       # 数据库连接超时时间,默认30秒，即30000connection-test-query: SELECT 1rules: # 规则配置sharding:tables: # 数据分片规则配置STORE_PLANNING: # 要分片的表名，可以是多个actual-data-nodes: busi_plan.STORE_PLANNING_$->{202506..202512} # 由数据源名 + 表名组成（参考 Inline 语法规则）,此处的表必须是数据库真实存在的物理表，不然就会报错table-strategy:  # 分表策略standard: # 用于单分片键的标准分片场景sharding-column: plan_date # 分片字段sharding-algorithm-name: t_store_plan_inline # 分片的具体策略名key-generate-strategy: # 分布式Id策略column: id # 分布式ID对应的表字段key-generator-name: snowflake # 分布式ID策略名sharding-algorithms: # 分片策略，可以配置多个t_store_plan_inline: # 分片策略名type: INLINE # 用inline表达式props:algorithm-expression: STORE_PLANNING_$->{plan_date} # 字符串拼接分片字段值key-generators: # 分布式Id策略，可以配置多个snowflake: # 分布式ID策略名type: SNOWFLAKE   # 雪花算法

将需要用的分表全部创建（也可以动态创建表，就是要自己写代码进行改造，我这里是按最简单的配置化）
约定逻辑表中必须带有分片字段，也就是活动年月（planDate），增删改查尽量都带上这个活动年月，如果不带，就是会去查所有的逻辑表，尽量避免这个操作

踩坑点

合适的mybatis、ShardingSphere版本

<!-- mybatis -->
<dependency><groupId>com.baomidou</groupId><artifactId>mybatis-plus-boot-starter</artifactId><version>3.5.3.2</version>
</dependency>
<!--shardingsphere-->
<dependency><groupId>org.apache.shardingsphere</groupId><artifactId>shardingsphere-jdbc-core-spring-boot-starter</artifactId><version>5.1.1</version><exclusions><exclusion><groupId>org.mybatis</groupId><artifactId>mybatis</artifactId></exclusion></exclusions>
</dependency>

增删改查一定要带上分片键，不然会把逻辑表相关的所有物理表都增删改查一遍

无法更新分片键，会报错

**原因：**这是ShardingSphere的保护机制，因为分片键的更改意味着数据需要跨表更新/新增

mybatisPlus的批量更新失效，带上分片键，依旧是所有表全部更新

**原因：**比如mybatisPlus的updateBatchById()，底层的真实sql实际上是update table set xxxx=xxx where id in (xxx,xxx)，没有带上分片键，所有会把逻辑表相关的所有物理表都更新

**解决：**自己写批量更新的Sql

<update id="updateBatchSfaByIdAndPlantDate">UPDATE store_planningSETsync_status = 1,sfa_sync_error = CASE id<foreach collection="errList" item="item" index="index">WHEN #{errList[${index}].id} THEN #{errList[${index}].sfaSyncError}</foreach>END,update_time = NOW()WHERE plan_date = #{planDate}AND id IN<foreach collection="errList" item="item" open="(" separator="," close=")" index="index">#{errList[${index}].id}</foreach></update>