使用Sharing-JDBC实现分表
Sharing-JDBC介绍
在创建数据库时,我们最先考虑的是按模块对数据库进行划为,但即使这样,单表数据量还是出现大数量的情况,Sharing-JDBC可以对表进行水平切分,将数据均分到不同表中
通过日期水平切分
目前我们航班动态数据全球每天航班20多万,考虑到我们业务场景,用户都是通过航班号+日期来查询一个航班,所以我们采取使用日期来水平分表
添加依赖
<dependency> <groupId>com.dangdang</groupId> <artifactId>sharding-jdbc-core</artifactId> <version>1.5.4.1</version> </dependency> <dependency> <groupId>com.dangdang</groupId> <artifactId>sharding-jdbc-config-spring</artifactId> <version>1.5.4.1</version> </dependency>
<!--mysql flight数据源 -->
<bean id="mysqlDataSource" class="com.zaxxer.hikari.HikariDataSource" destroy-method="close">
...
</bean>
<!--shared jdbc -->
<rdb:strategy id="tableShardingStrategy" sharding-columns="local_date" algorithm-class="com.huoli.songshan.sharding.FlyDateTableShardingAlgorithm" />
<rdb:data-source id="shardingDataSource">
<rdb:sharding-rule data-sources="mysqlDataSource" default-data-source="mysqlDataSource">
<rdb:table-rules>
<rdb:table-rule logic-table="flight_info" actual-tables="flight_info_${2012..2020}${['01','02','03','04','05','06','07','08','09','10','11','12']}${0..3}${0..9}"
table-strategy="tableShardingStrategy" />
</rdb:table-rules>
<rdb:default-database-strategy sharding-columns="none" algorithm-class="com.dangdang.ddframe.rdb.sharding.api.strategy.database.NoneDatabaseShardingAlgorithm" />
</rdb:sharding-rule>
<rdb:props>
<prop key="sql.show">false</prop>
</rdb:props>
</rdb:data-source>
先配置一个dataSource数据源,这里我们使用的是hikari,再通过shardingDataSource对dataSource进行包装,按照表中localdate字段对表进行拆分,即表名依次为flightinfo20120101、flightinfo20120102,到flightinfo20201231,下面我们实现根据日期localdate映射到对应表上
public final class FlyDateTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<Date> {
private DateTimeFormatter dt = DateTimeFormat.forPattern("yyyyMMdd");
@Override
public String doEqualSharding(Collection<String> availableTargetNames, ShardingValue<Date> shardingValue) {
DateTime datetime = new DateTime(shardingValue.getValue());
String flydate = datetime.toString(dt);
for (String each : availableTargetNames) {
if (each.endsWith(flydate)) {
return each;
}
}
throw new IllegalArgumentException();
}
@Override
public Collection<String> doInSharding(Collection<String> availableTargetNames, ShardingValue<Date> shardingValue) {
Collection<String> result = new LinkedHashSet<>(availableTargetNames.size());
for (Date value : shardingValue.getValues()) {
DateTime datetime = new DateTime(value);
String flydate = datetime.toString(dt);
for (String tableName : availableTargetNames) {
if (tableName.endsWith(flydate)) {
result.add(tableName);
}
}
}
return result;
}
@Override
public Collection<String> doBetweenSharding(Collection<String> availableTargetNames,
ShardingValue<Date> shardingValue) {
Collection<String> result = new LinkedHashSet<>(availableTargetNames.size());
Range<Date> range = (Range<Date>) shardingValue.getValueRange();
for (Date value = range.lowerEndpoint(); value.before(range.upperEndpoint()) || value.equals(range.upperEndpoint()); value = addDays(value, 1)) {
DateTime datetime = new DateTime(value);
String flydate = datetime.toString(dt);
for (String each : availableTargetNames) {
if (each.endsWith(flydate)) {
result.add(each);
}
}
}
return result;
}
private Date addDays(Date date, int days) {
return new Date(new DateTime(new DateTime(date).plusDays(days)).toDate().getTime());
}
}
这里实现了SQL的equal、in、between方法,即通过使用select * from flight_info where local_date=’2019-01-01′,会自动映射flight_info_20190101表中查询。
通过用户ID哈希取模进行拆分
在用户订阅航班动态数据后,我们需要保存用户的订阅数据,以便后期推送动态消息给用户,用户可以在登陆的情况下订阅航班,也可以在未登陆的情况下订阅航班,用户登陆后可以拿到用户的phoneId(设备ID)和userId(用户ID),用户未登陆只能获取到用户的phoneId(设备ID),所以我们区分两种情况,使用两种表来存放用户信息trip_subscribe_nologin和trip_subscribe_login,用户未登陆根据phoneId来分表,用户登陆的情况根据userId来分表
和之前一样,我们需要先配置一个分表策略
<!-- trip_subscribe_nonlogin分表策略 -->
<rdb:strategy id="subscribeNonLoginTableShardingStrategy" sharding-columns="uid" algorithm-class="com.huoli.trip.dao.sharding.SubscribeNonLoginTableShardingAlgorithm" />
<!-- trip_subscribe_login分表策略 -->
<rdb:strategy id="subscribeLoginTableShardingStrategy" sharding-columns="user_id" algorithm-class="com.huoli.trip.dao.sharding.SubscribeLoginTableShardingAlgorithm" />
<!-- 分表配置配置 -->
<rdb:data-source id="shardingDataSource">
<rdb:sharding-rule data-sources="mysqlDataSource">
<rdb:table-rules>
<rdb:table-rule logic-table="trip_subscribe_nonlogin" table-strategy="subscribeNonLoginTableShardingStrategy" actual-tables="trip_subscribe_nonlogin_${0..63}">
<rdb:generate-key-column column-name="id" column-key-generator-class="com.huoli.trip.dao.sharding.TripKeyGenerator" />
</rdb:table-rule>
<rdb:table-rule logic-table="trip_subscribe_login" table-strategy="subscribeLoginTableShardingStrategy" actual-tables="trip_subscribe_login_${0..63}">
<rdb:generate-key-column column-name="id" column-key-generator-class="com.huoli.trip.dao.sharding.TripKeyGenerator" />
</rdb:table-rule>
</rdb:table-rules>
<rdb:default-database-strategy sharding-columns="none" algorithm-class="com.dangdang.ddframe.rdb.sharding.api.strategy.database.NoneDatabaseShardingAlgorithm" />
</rdb:sharding-rule>
<rdb:props>
<prop key="sql.show">true</prop>
</rdb:props>
</rdb:data-source>
这里是根据phoneId、userId分别分成了64张表,即trip_subscribe_nologin有64张表(trip_subscribe_nologin_1,trip_subscribe_nologin_2 …),trip_subscribe_login有64张表(trip_subscribe_login_1,trip_subscribe_login_2 …),具体代码实现
public interface Shard<T> {
/** 根据参数计算分片标识 */
public T calculateShard(Object... args);
}
public class SubscribeNonLoginTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<String>, Shard<String> {
/** 分片数量 */
private final int shardNum = 64;
public String doEqualSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) {
String shard = calculateShard(shardingValue.getValue());
for (String tableName : availableTargetNames) {
if (tableName.endsWith("_" + shard)) {
return tableName;
}
}
throw new IllegalArgumentException("未找到该表:trip_subscribe_nonlogin_" + shard);
}
public Collection<String> doInSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) {
Collection<String> result = new LinkedHashSet<String>(availableTargetNames.size());
for (String value : shardingValue.getValues()) {
String shard = calculateShard(value);
for (String tableName : availableTargetNames) {
if (tableName.endsWith("_" + shard)) {
result.add(tableName);
}
}
}
return result;
}
public Collection<String> doBetweenSharding(Collection<String> availableTargetNames, ShardingValue<String> shardingValue) {
// 不会出现between两个uid之间的查询需求,所以无需实现该方法
return new LinkedHashSet<String>(availableTargetNames.size());
}
public String calculateShard(Object... args) {
String phoneId = (String) args[0];
return "" + (phoneId.hashCode() & 0x7fffffff) % shardNum;
}
}
public class SubscribeLoginTableShardingAlgorithm implements SingleKeyTableShardingAlgorithm<String>, Shard<String> {
/** 分片数量 */
private final int shardNum = 64;
/**
* doEqualSharding/doInSharding同SubscribeNonLoginTableShardingAlgorithm
*/
public String calculateShard(Object... args) {
String userId = (String) args[0];
return "" + (userId.hashCode() & 0x7fffffff) % shardNum;
}
}
这里主要介绍下calculateShard方法,首先是先拿到了phoneId或userId,获取hashCode,然后和0x7fffffff进行&运算,0x7fffffff表示long的最大值,这一步是为了保证得到的index的第一位为0,也就是为了得到一个正数。因为有符号数第一位0代表正数,1代表负数,然后和分片数shardNum,使表数据分布在shardNum内
另外这里还用到了id的生成策略,即生成一个全局的自增ID,sharing-jdbc自带了DefaultKeyGenerator生成器可以实现
public class TripKeyGenerator implements KeyGenerator {
private static Logger logger = LoggerFactory.getLogger(TripKeyGenerator.class);
private DefaultKeyGenerator defaultKeyGenerator;
public TripKeyGenerator() {
defaultKeyGenerator = new DefaultKeyGenerator();
}
static {
/** 从配置中获取workId */
DefaultKeyGenerator.setWorkerId(workerId);
}
@Override
public synchronized Number generateKey() {
return defaultKeyGenerator.generateKey();
}
}
通过twitter的snowlflake也可以生成唯一ID,具体可以参考:这里