使用Sharing-JDBC实现分表
Sharing-JDBC介绍
在创建数据库时,我们最先考虑的是按模块对数据库进行划为,但即使这样,单表数据量还是出现大数量的情况,Sharing-JDBC可以对表进行水平切分,将数据均分到不同表中
通过日期水平切分
目前我们航班动态数据全球每天航班20多万,考虑到我们业务场景,用户都是通过航班号+日期来查询一个航班,所以我们采取使用日期来水平分表
添加依赖
<dependency> <groupId>com.dangdang</groupId> <artifactId>sharding-jdbc-core</artifactId> <version>1.5.4.1</version> </dependency> <dependency> <groupId>com.dangdang</groupId> <artifactId>sharding-jdbc-config-spring</artifactId> <version>1.5.4.1</version> </dependency>
<!--mysql flight数据源 --> <bean id="mysqlDataSource" class="com.zaxxer.hikari.HikariDataSource" destroy-method="close"> ... </bean> <!--shared jdbc --> <rdb:strategy id="tableShardingStrategy" sharding-columns="local_date" algorithm-class="com.huoli.songshan.sharding.FlyDateTableShardingAlgorithm" /> <rdb:data-source id="shardingDataSource"> <rdb:sharding-rule data-sources="mysqlDataSource" default-data-source="mysqlDataSource"> <rdb:table-rules> <rdb:table-rule logic-table="flight_info" actual-tables="flight_info_${2012..2020}${['01','02','03','04','05','06','07','08','09','10','11','12']}${0..3}${0..9}" table-strategy="tableShardingStrategy" /> </rdb:table-rules> <rdb:default-database-strategy sharding-columns="none" algorithm-class="com.dangdang.ddframe.rdb.sharding.api.strategy.database.NoneDatabaseShardingAlgorithm" /> </rdb:sharding-rule> <rdb:props> <prop key="sql.show">false</prop> </rdb:props> </rdb:data-source>
先配置一个dataSource数据源,这里我们使用的是hikari,再通过shardingDataSource对dataSource进行包装,按照表中localdate字段对表进行拆分,即表名依次为flightinfo20120101、flightinfo20120102,到flightinfo20201231,下面我们实现根据日期localdate映射到对应表上
public final class FlyDateTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<Date> { private DateTimeFormatter dt = DateTimeFormat.forPattern("yyyyMMdd"); @Override public String doEqualSharding(Collection<String> availableTargetNames, ShardingValue<Date> shardingValue) { DateTime datetime = new DateTime(shardingValue.getValue()); String flydate = datetime.toString(dt); for (String each : availableTargetNames) { if (each.endsWith(flydate)) { return each; } } throw new IllegalArgumentException(); } @Override public Collection<String> doInSharding(Collection<String> availableTargetNames, ShardingValue<Date> shardingValue) { Collection<String> result = new LinkedHashSet<>(availableTargetNames.size()); for (Date value : shardingValue.getValues()) { DateTime datetime = new DateTime(value); String flydate = datetime.toString(dt); for (String tableName : availableTargetNames) { if (tableName.endsWith(flydate)) { result.add(tableName); } } } return result; } @Override public Collection<String> doBetweenSharding(Collection<String> availableTargetNames, ShardingValue<Date> shardingValue) { Collection<String> result = new LinkedHashSet<>(availableTargetNames.size()); Range<Date> range = (Range<Date>) shardingValue.getValueRange(); for (Date value = range.lowerEndpoint(); value.before(range.upperEndpoint()) || value.equals(range.upperEndpoint()); value = addDays(value, 1)) { DateTime datetime = new DateTime(value); String flydate = datetime.toString(dt); for (String each : availableTargetNames) { if (each.endsWith(flydate)) { result.add(each); } } } return result; } private Date addDays(Date date, int days) { return new Date(new DateTime(new DateTime(date).plusDays(days)).toDate().getTime()); } }
这里实现了SQL的equal、in、between方法,即通过使用select * from flight_info where local_date=’2019-01-01′,会自动映射flight_info_20190101表中查询。
通过用户ID哈希取模进行拆分
在用户订阅航班动态数据后,我们需要保存用户的订阅数据,以便后期推送动态消息给用户,用户可以在登陆的情况下订阅航班,也可以在未登陆的情况下订阅航班,用户登陆后可以拿到用户的phoneId(设备ID)和userId(用户ID),用户未登陆只能获取到用户的phoneId(设备ID),所以我们区分两种情况,使用两种表来存放用户信息trip_subscribe_nologin和trip_subscribe_login,用户未登陆根据phoneId来分表,用户登陆的情况根据userId来分表
和之前一样,我们需要先配置一个分表策略
<!-- trip_subscribe_nonlogin分表策略 --> <rdb:strategy id="subscribeNonLoginTableShardingStrategy" sharding-columns="uid" algorithm-class="com.huoli.trip.dao.sharding.SubscribeNonLoginTableShardingAlgorithm" /> <!-- trip_subscribe_login分表策略 --> <rdb:strategy id="subscribeLoginTableShardingStrategy" sharding-columns="user_id" algorithm-class="com.huoli.trip.dao.sharding.SubscribeLoginTableShardingAlgorithm" /> <!-- 分表配置配置 --> <rdb:data-source id="shardingDataSource"> <rdb:sharding-rule data-sources="mysqlDataSource"> <rdb:table-rules> <rdb:table-rule logic-table="trip_subscribe_nonlogin" table-strategy="subscribeNonLoginTableShardingStrategy" actual-tables="trip_subscribe_nonlogin_${0..63}"> <rdb:generate-key-column column-name="id" column-key-generator-class="com.huoli.trip.dao.sharding.TripKeyGenerator" /> </rdb:table-rule> <rdb:table-rule logic-table="trip_subscribe_login" table-strategy="subscribeLoginTableShardingStrategy" actual-tables="trip_subscribe_login_${0..63}"> <rdb:generate-key-column column-name="id" column-key-generator-class="com.huoli.trip.dao.sharding.TripKeyGenerator" /> </rdb:table-rule> </rdb:table-rules> <rdb:default-database-strategy sharding-columns="none" algorithm-class="com.dangdang.ddframe.rdb.sharding.api.strategy.database.NoneDatabaseShardingAlgorithm" /> </rdb:sharding-rule> <rdb:props> <prop key="sql.show">true</prop> </rdb:props> </rdb:data-source>
这里是根据phoneId、userId分别分成了64张表,即trip_subscribe_nologin有64张表(trip_subscribe_nologin_1,trip_subscribe_nologin_2 …),trip_subscribe_login有64张表(trip_subscribe_login_1,trip_subscribe_login_2 …),具体代码实现
public interface Shard<T> { /** 根据参数计算分片标识 */ public T calculateShard(Object... args); } public class SubscribeNonLoginTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<String>, Shard<String> { /** 分片数量 */ private final int shardNum = 64; public String doEqualSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) { String shard = calculateShard(shardingValue.getValue()); for (String tableName : availableTargetNames) { if (tableName.endsWith("_" + shard)) { return tableName; } } throw new IllegalArgumentException("未找到该表:trip_subscribe_nonlogin_" + shard); } public Collection<String> doInSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) { Collection<String> result = new LinkedHashSet<String>(availableTargetNames.size()); for (String value : shardingValue.getValues()) { String shard = calculateShard(value); for (String tableName : availableTargetNames) { if (tableName.endsWith("_" + shard)) { result.add(tableName); } } } return result; } public Collection<String> doBetweenSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) { // 不会出现between两个uid之间的查询需求,所以无需实现该方法 return new LinkedHashSet<String>(availableTargetNames.size()); } public String calculateShard(Object... args) { String phoneId = (String) args[0]; return "" + (phoneId.hashCode() & 0x7fffffff) % shardNum; } } public class SubscribeLoginTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<String>, Shard<String> { /** 分片数量 */ private final int shardNum = 64; /** * doEqualSharding/doInSharding同SubscribeNonLoginTableShardingAlgorithm */ public String calculateShard(Object... args) { String userId = (String) args[0]; return "" + (userId.hashCode() & 0x7fffffff) % shardNum; } }
这里主要介绍下calculateShard方法,首先是先拿到了phoneId或userId,获取hashCode,然后和0x7fffffff进行&运算,0x7fffffff表示long的最大值,这一步是为了保证得到的index的第一位为0,也就是为了得到一个正数。因为有符号数第一位0代表正数,1代表负数,然后和分片数shardNum,使表数据分布在shardNum内
另外这里还用到了id的生成策略,即生成一个全局的自增ID,sharing-jdbc自带了DefaultKeyGenerator生成器可以实现
public class TripKeyGenerator implements KeyGenerator { private static Logger logger = LoggerFactory.getLogger(TripKeyGenerator.class); private DefaultKeyGenerator defaultKeyGenerator; public TripKeyGenerator() { defaultKeyGenerator = new DefaultKeyGenerator(); } static { /** 从配置中获取workId */ DefaultKeyGenerator.setWorkerId(workerId); } @Override public synchronized Number generateKey() { return defaultKeyGenerator.generateKey(); } }
通过twitter的snowlflake也可以生成唯一ID,具体可以参考:这里