sunnyxuebuhui 2020-03-03
有一个表tmp_test_course大概有10万条记录,然后有个json字段叫outline,存了一对多关系(保存了多个编码,例如jy1577683381775)
我们需要在这10万条数据中检索特定类型的数据,目标总数据量:2931
条
SELECT COUNT(*) FROM tmp_test_course WHERE `type`=5 AND del=2 AND is_leaf=1
我们在限定为上面类型的同时,还得包含下面任意一个编码(也就是OR查询)
jy1577683381775 jy1577683380808 jy1577683379178 jy1577683378676 jy1577683377617 jy1577683376672 jy1577683375903 jy1578385720787 jy1499916986208 jy1499917112460 jy1499917093400 jy1499917335579 jy1499917334770 jy1499917333339 jy1499917331557 jy1499917330833 jy1499917329615 jy1499917328496 jy1576922006950 jy1499916993558 jy1499916992308 jy1499917003454 jy1499917002952
下面分别列出4种方式查询outline字段,给出相应的查询时间和扫描行数
耗时248毫秒
SELECT * FROM tmp_test_course WHERE `type`=5 AND del=2 AND is_leaf=1 AND ( outline like ‘%jy1577683381775%‘ OR outline like ‘%jy1577683380808%‘ OR outline like ‘%jy1577683379178%‘ OR outline like ‘%jy1577683378676%‘ OR outline like ‘%jy1577683377617%‘ OR outline like ‘%jy1577683376672%‘ OR outline like ‘%jy1577683375903%‘ OR outline like ‘%jy1578385720787%‘ OR outline like ‘%jy1499916986208%‘ OR outline like ‘%jy1499917112460%‘ OR outline like ‘%jy1499917093400%‘ OR outline like ‘%jy1499917335579%‘ OR outline like ‘%jy1499917334770%‘ OR outline like ‘%jy1499917333339%‘ OR outline like ‘%jy1499917331557%‘ OR outline like ‘%jy1499917330833%‘ OR outline like ‘%jy1499917329615%‘ OR outline like ‘%jy1499917328496%‘ OR outline like ‘%jy1576922006950%‘ OR outline like ‘%jy1499916993558%‘ OR outline like ‘%jy1499916992308%‘ OR outline like ‘%jy1499917003454%‘ OR outline like ‘%jy1499917002952%‘ )
EXPLAIN分析结果如下,全表扫描
使用函数JSON_SEARCH,更多函数请查看MySQL官方文档
可以看到,查询耗时196毫秒,速度稍微快了一点
SELECT * FROM tmp_test_course WHERE `type`=5 AND del=2 AND is_leaf=1 AND ( JSON_SEARCH(outline, ‘one‘, ‘jy1577683381775‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683380808‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683379178‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683378676‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683377617‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683376672‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1577683375903‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1578385720787‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499916986208‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917112460‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917093400‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917335579‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917334770‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917333339‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917331557‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917330833‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917329615‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917328496‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1576922006950‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499916993558‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499916992308‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917003454‘) IS NOT NULL OR JSON_SEARCH(outline, ‘one‘, ‘jy1499917002952‘) IS NOT NULL )
EXPLAIN分析结果如下,还是全表扫描
下面为该表建立一个联合索引(本来想建一个type-del-is_leaf-outline的索引,但是outline字段太长限制,所以只加type-del-is_leaf的联合索引
ALTER TABLE tmp_test_course ADD KEY `type-del-is_leaf` (`type`,`del`,`is_leaf`)
加入索引后再执行like和json查询,明显提速。
like执行用了136毫秒,json查询用了82.6毫秒,由此可见针对json类型使用json函数查询比like快
EXPLAIN分析结果如下,两者查询扫描的行数都限定在了2931行
因为全文索引只支持CHAR、VARCHAR和TEXT,我们需要把JSON字段定义改一下
ALTER TABLE tmp_test_course MODIFY `outline` VARCHAR(1024) NOT NULL DEFAULT ‘[]‘
添加全文索引
ALTER TABLE tmp_test_course ADD FULLTEXT INDEX outline (outline);
现在再来用全文索引进行检索
SELECT * FROM tmp_test_course WHERE `type`=5 AND del=2 AND is_leaf=1 AND MATCH(outline) AGAINST (‘jy1577683381775 jy1577683380808 jy1577683379178 jy1577683378676 jy1577683377617 jy1577683376672 jy1577683375903 jy1578385720787 jy1499916986208 jy1499917112460 jy1499917093400 jy1499917335579 jy1499917334770 jy1499917333339 jy1499917331557 jy1499917330833 jy1499917329615 jy1499917328496 jy1576922006950 jy1499916993558 jy1499916992308 jy1499917003454 jy1499917002952‘)
耗时11.6毫秒,速度提升极其明显,可见全文索引的牛逼。
EXPLAIN分析结果如下,显示只扫描了一行
以下是4种情况的执行结果
全文索引: 11.6ms
联合索引:82.6ms(json)、136ms(like)
json函数查询:196ms
like查询: 248ms
结论:全文索引 > 联合索引 > json函数查询 > like查询
数据量越大,全文索引速度越明显,就10万的量,查询速度大概比直接查询快了20倍左右,如果是百万或千万级别的表,提升差距会更加大,所以有条件还是老老实实用全文索引吧