MySQL in Practice 15: Index Optimization

1. Index usage tests

1.1 Create the test table

drop table if exists test;
create table test
(
    id int primary key auto_increment,
    c1 varchar(10),
    c2 varchar(10),
    c3 varchar(10),
    c4 varchar(10),
    c5 varchar(10)
)
    ENGINE = INNODB
    default CHARSET = utf8;
insert into test(c1, c2, c3, c4, c5)
values ('a1', 'a2', 'a3', 'a4', 'a5');
insert into test(c1, c2, c3, c4, c5)
values ('b1', 'b2', 'b3', 'b4', 'b5');
insert into test(c1, c2, c3, c4, c5)
values ('c1', 'c2', 'c3', 'c4', 'c5');
insert into test(c1, c2, c3, c4, c5)
values ('d1', 'd2', 'd3', 'd4', 'd5');
insert into test(c1, c2, c3, c4, c5)
values ('e1', 'e2', 'e3', 'e4', 'e5');
insert into test(c1, c2, c3, c4, c5)
values ('f1', 'f2', 'f3', 'f4', 'f5');
insert into test(c1, c2, c3, c4, c5)
values ('g1', 'g2', 'g3', 'g4', 'g5');

1.2 Create the index

create index idx_test_c1234 on test(c1, c2, c3, c4);
show index from test;

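1.3 Analyze index usage in the following cases

Case 1:

explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 = 'a3' and c4 = 'a4';

explain select * from test where c4 = 'a4' and c3 = 'a3' and c1 = 'a1' and c2 = 'a2';

Analysis:

The composite index was created on the columns in the order c1, c2, c3, c4.
Both explain statements return the same result: type=ref, key_len=132, ref=const,const,const,const.

Conclusion: for equality comparisons against constants, changing the order of the predicates does not change the explain result, because the MySQL optimizer reorders them to match the index. It is still recommended to write the predicates in index column order.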
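
The key_len values quoted in these cases (33, 66, 99, 132) can be reproduced by hand. The sketch below is only an illustration of that arithmetic, under the assumptions that the utf8 charset reserves up to 3 bytes per character, that a VARCHAR column adds a 2-byte length prefix, and that a nullable column adds 1 byte. The function names are made up for this example.

```python
# Assumptions (not stated in the original text): utf8 reserves up to
# 3 bytes per character, a VARCHAR carries a 2-byte length prefix, and
# a nullable column adds 1 byte for the NULL flag.
BYTES_PER_CHAR = 3
VARCHAR_LENGTH_PREFIX = 2
NULL_FLAG = 1


def varchar_key_len(chars: int, nullable: bool = True) -> int:
    """Bytes counted in key_len for one VARCHAR(chars) index column."""
    null_bytes = NULL_FLAG if nullable else 0
    return chars * BYTES_PER_CHAR + VARCHAR_LENGTH_PREFIX + null_bytes


def prefix_key_len(columns_used: int, chars: int = 10) -> int:
    """key_len when the first `columns_used` columns of idx_test_c1234 are used."""
    return columns_used * varchar_key_len(chars)


# c1..c4 are all nullable varchar(10) columns in the test table.
for n in range(1, 5):
    print(n, "column(s) used -> key_len =", prefix_key_len(n))
```

Reading key_len this way makes it possible to tell how many leading index columns a query actually uses for the lookup.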

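Case 2:

explain select * from test where c1 = 'a1' and c2 = 'a2';

explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 > 'a3' and c4 = 'a4';

Analysis:

With the range condition on c3, type=range and key_len=99, up from key_len=66 for the query without the range, so the index is used through c3. Compared with key_len=132 in Case 1, however, c4 is no longer used for the index lookup.

Conclusion: index columns to the right of a range condition cannot be used for the lookup, but the column carrying the range (c3) still can, as key_len=99 shows.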

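Case 2.1:

explain select * from test where c1 = 'a1' and c2 = 'a2' and c4 > 'a4' and c3 = 'a3';

Analysis:

Compared with the previous result, key_len=132 shows that all four index columns are used. The optimizer reorders the predicates to match the index order (c1, c2, c3, c4), so the range falls on c4, the last index column. No index column lies to the right of c4, so nothing is lost.

Conclusion: "columns to the right of a range are not used" refers to the index order c1, c2, c3, c4, not to the order in the where clause. A range on c3 disables c4; a range on c4 disables nothing, so the whole index is used.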
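
The rule from Cases 2 and 2.1 can be written down as a small function. This is a simplified model rather than MySQL's actual optimizer logic: it only decides which leading index columns can serve the lookup, and it ignores the cost-based choice between index access and a full table scan. The function name is hypothetical.

```python
def index_columns_used(index_cols, eq_cols, range_cols):
    """Return the index columns that can serve the lookup.

    Walk the index from left to right: an equality column extends the
    usable prefix, the first range column is still used but ends it,
    and a column with no predicate ends it immediately.
    """
    used = []
    for col in index_cols:
        if col in eq_cols:
            used.append(col)
        elif col in range_cols:
            used.append(col)
            break
        else:
            break
    return used


IDX = ["c1", "c2", "c3", "c4"]  # column order of idx_test_c1234

# Case 2: range on c3, so c4 is not used.
print(index_columns_used(IDX, {"c1", "c2", "c4"}, {"c3"}))
# Case 2.1: range on c4, the last column, so all four are used.
print(index_columns_used(IDX, {"c1", "c2", "c3"}, {"c4"}))
# c2 and c3 missing from the where clause: only c1 is used.
print(index_columns_used(IDX, {"c1", "c4"}, set()))
```

The order in which the predicates appear in the where clause plays no role in the function, which mirrors the observation that the optimizer reorders them.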

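Case 2.2:

explain select * from test where c1 > 'a1' and c2 = 'a2' and c3 = 'a3' and c4 = 'a4';

Analysis:

With the range on c1, type=ALL and key=NULL: the index is not used and the whole table is scanned. A range on the leading column means that at most c1 can narrow the lookup, and here c1 > 'a1' matches six of the seven rows. Because select * also needs a lookup back to the table for every matching index entry, the optimizer evidently judges a full table scan to be cheaper.

The fix is to use a covering index, that is, to select only indexed columns.

Conclusion: under the leftmost prefix rule, if the leading column (the "big brother") cannot be used effectively, none of the columns after it can be used either.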

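Case 3:

explain select * from test where c1 = 'a1' and c2 = 'a2' and c4 = 'a4' order by c3;

Analysis:

By the leftmost prefix rule (the brothers in the middle must not be skipped), only c1 and c2 are used for the lookup, as key_len=66 and ref=const,const show. c4 cannot be used for the lookup because c3 has no predicate; c3 is used for sorting instead.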

Case 3.1:

explain select * from test where c1 = 'a1' and c2 = 'a2' order by c3;

Analysis:

The explain result shows key_len=66 and ref=const,const, so only c1 and c2 are used for the lookup, and c3 is used for sorting.

Case 3.2:

explain select * from test where c1 = 'a1' and c2 = 'a2' order by c4;

Analysis:

The explain result shows key_len=66 and ref=const,const, so c1 and c2 are used for the lookup. Sorting by c4 skips c3, so Using filesort appears.

Case 4:

explain select * from test where c1 = 'a1' and c5 = 'a5' order by c2, c3;

Analysis:

Only c1 is used for the lookup; c2 and c3 are used for sorting, and there is no Using filesort.

Case 4.1:

explain select * from test where c1 = 'a1' and c5 = 'a5' order by c3, c2;

Analysis:

The result is the same as in Case 4, except that Using filesort appears: the index was created in the order c1, c2, c3, c4, but the order by clause swaps c2 and c3.

Case 4.2:

explain select * from test where c1 = 'a1' and c2 = 'a2' order by c2, c3;

explain select * from test where c1 = 'a1' and c2 = 'a2' and c5 = 'a5' order by c2, c3;

Analysis:

Adding c5 to the where clause does not change the explain result, because c5 is not indexed.

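Case 4.3:

explain select * from test where c1 = 'a1' and c2 = 'a2' and c5 = 'a5' order by c3, c2;

Analysis:

Unlike Case 4.1, Extra does not show Using filesort. c2 is fixed to a constant by the where clause, so the optimizer drops it from the sort. What remains is order by c3, which follows the index order, so no filesort is needed.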
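
Cases 3 through 4.3 follow one pattern, which the sketch below tries to capture. It is a simplified model under stated assumptions, not a description of MySQL internals: every order by column is ascending, columns fixed by an equality predicate are ignored for ordering, and the optimizer's cost decisions are left out. The function name is hypothetical.

```python
def needs_filesort(index_cols, eq_cols, order_cols):
    """Simplified model: True if ORDER BY cannot follow index order.

    Columns pinned to a constant by an equality predicate cannot affect
    the ordering, so they are dropped from both lists first. What is left
    of ORDER BY must then be a prefix of what is left of the index.
    Assumes every ORDER BY column is ascending.
    """
    remaining_index = [c for c in index_cols if c not in eq_cols]
    remaining_order = [c for c in order_cols if c not in eq_cols]
    return remaining_order != remaining_index[:len(remaining_order)]


IDX = ["c1", "c2", "c3", "c4"]  # column order of idx_test_c1234

cases = {
    "Case 3":   ({"c1", "c2", "c4"}, ["c3"]),
    "Case 3.2": ({"c1", "c2"}, ["c4"]),
    "Case 4":   ({"c1", "c5"}, ["c2", "c3"]),
    "Case 4.1": ({"c1", "c5"}, ["c3", "c2"]),
    "Case 4.3": ({"c1", "c2", "c5"}, ["c3", "c2"]),
}
for name, (eq_cols, order_cols) in cases.items():
    verdict = "Using filesort" if needs_filesort(IDX, eq_cols, order_cols) else "index order"
    print(name, "->", verdict)
```

Case 4.3 is the interesting one: once the constant c2 is removed, order by c3, c2 reduces to order by c3, which the index can deliver directly.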

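Case 5:

explain select * from test where c1 = 'a1' and c4 = 'a4' group by c2, c3;

Analysis:

Only the c1 part of the index is used for the lookup. The where clause skips c2 and c3, so by the leftmost prefix rule c4 cannot be used; key_len=33 and ref=const confirm that a single index column is used. If ONLY_FULL_GROUP_BY is enabled (part of the default sql_mode since MySQL 5.7.5), this statement is rejected, so the result assumes that mode is disabled.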

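Case 5.1:

explain select * from test where c1 = 'a1' and c4 = 'a4' group by c3, c2;

Analysis:

Compared with Case 5, c2 and c3 are swapped in the group by clause, and the result shows both Using temporary and Using filesort, which is about as bad as it gets. The cause: the order c3, c2 is the reverse of the index creation order.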

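Case 6:

explain select * from test where c1 > 'a1' order by c1;

Analysis:

① The index covers c1, c2, c3, c4, but the range directly on c1 leads to a full table scan: type=ALL, ref=NULL. The range matches most of the table and select * needs columns that are not in the index, so the optimizer does not use the index.

② Because the index is not used, sorting by c1 produces Using filesort.

③ Fix: use a covering index.

explain select c1 from test where c1 > 'a1' order by c1;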

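Case 7:

explain select c1 from test order by c1 asc, c2 desc;

Analysis:

The order by columns follow the index order, but the index entries are stored in ascending order, which is also the order by default. Here c2 desc asks for descending order on the second column, which does not match the ordering of the index, so Using filesort appears.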
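
Why a mixed asc/desc sort cannot be read straight off an ascending index can be seen with a tiny model. The sketch below treats the index as a sorted list of (c1, c2) pairs; the four entries are made-up toy values, not rows from the test table.

```python
# Toy index on (c1, c2), both columns ascending. The values are
# hypothetical and chosen only to keep the example small.
entries = sorted([("a", 1), ("a", 2), ("b", 1), ("b", 2)])

forward_scan = entries          # reading the index front to back
backward_scan = entries[::-1]   # reading the index back to front

# ORDER BY c1 ASC, c2 DESC
mixed = sorted(entries, key=lambda e: (e[0], -e[1]))
# ORDER BY c1 DESC, c2 DESC
all_desc = sorted(entries, reverse=True)

print("mixed matches forward scan: ", mixed == forward_scan)
print("mixed matches backward scan:", mixed == backward_scan)
print("all desc matches backward scan:", all_desc == backward_scan)
```

Neither scan direction yields the mixed ordering, so an extra sort step is needed, while an all-descending sort is simply the index read backwards. MySQL 8.0 allows descending columns in an index definition, which is one way to support a mixed ordering.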

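Case 8:

explain select c1 from test where c1 in ('a1', 'b1') order by c2, c3;

Case 8.1:

explain select c1 from test where c1 in ('a1', 'b1', 'c1', 'd1') order by c2, c3;

Analysis:

For sorting purposes, an in list with several values behaves like a range condition: the rows for different c1 values are not ordered by c2, c3 as a whole, so the sort cannot rely on the index.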

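Summary:

① MySQL sorts in two ways: filesort and index. Index ordering means MySQL reads the rows in index order and needs no separate sort step. Index ordering is efficient; filesort is slow. In explain output, index ordering shows up as the absence of Using filesort (Using index in Extra indicates a covering index).

② order by can use index ordering in two situations:

1. The order by columns form a leftmost prefix of the index.

2. The where clause columns and the order by columns together form a leftmost prefix of the index.

③ Do the sorting on index columns wherever possible, following the leftmost prefix rule in the order in which the index was created.

④ If the order by columns are not in the index, Using filesort results.

⑤ group by is very similar to order by: in essence it sorts first and then groups, and it follows the leftmost prefix rule in index creation order. Note that where is applied before having, so a condition that can go in where should not be left to having.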

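A mnemonic for the rules:

Full-value matches are what I like best; always keep to the leftmost prefix.
The big brother in front must not die; the brothers in the middle must not be skipped.
Do little computation on indexed columns; everything after a range is lost.
Put the LIKE percent sign on the right; with a covering index, do not write the star.
Not-equal, NULL tests, and OR tend to defeat the index, so use them sparingly.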

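Addendum: optimizing in and exists

Principle: let the small table drive the large one, that is, the smaller data set drives the larger data set.

in: when table B's data set is smaller than table A's, in is better than exists.

select * from A where id in (select id from B)

This is equivalent to:

for select id from B
    for select * from A where A.id = B.id

exists: when table A's data set is smaller than table B's, exists is better than in.

Each row from the main query on A is passed into the subquery on B for verification, and the result (true or false) decides whether that row is kept.

select * from A where exists (select 1 from B where B.id = A.id)

This is equivalent to:

for select * from A
    for select * from B where B.id = A.id

The id columns of both A and B should be indexed.

EXISTS (subquery) only returns TRUE or FALSE, so the SELECT * in the subquery can equally be SELECT 1 or SELECT X. According to the official documentation, the select list is ignored during execution, so there is no difference.
The actual execution of an EXISTS subquery may be optimized rather than being the row-by-row comparison we picture.
An EXISTS subquery can often be replaced by a JOIN; which form is best depends on the specific case.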
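
The two loop shapes above can be imitated in a few lines to show what "small table drives large table" means. This is only a conceptual sketch of the pseudocode, not how MySQL executes either query: dictionaries and sets stand in for the indexes on id, ids are assumed unique in both tables, and the table contents are invented for the example.

```python
def in_style(a_rows, b_rows):
    """Outer loop over B, probing A by id (the IN shape)."""
    a_by_id = {a["id"]: a for a in a_rows}  # stands in for the index on A.id
    matches, outer_steps = [], 0
    for b in b_rows:
        outer_steps += 1
        a = a_by_id.get(b["id"])
        if a is not None:
            matches.append(a)
    return matches, outer_steps


def exists_style(a_rows, b_rows):
    """Outer loop over A, probing B by id (the EXISTS shape)."""
    b_ids = {b["id"] for b in b_rows}  # stands in for the index on B.id
    matches, outer_steps = [], 0
    for a in a_rows:
        outer_steps += 1
        if a["id"] in b_ids:
            matches.append(a)
    return matches, outer_steps


# Hypothetical data: A is large, B is small.
a_rows = [{"id": i} for i in range(1, 1001)]
b_rows = [{"id": 5}, {"id": 500}, {"id": 2000}]

in_matches, in_steps = in_style(a_rows, b_rows)
ex_matches, ex_steps = exists_style(a_rows, b_rows)
print(sorted(r["id"] for r in in_matches), "outer steps:", in_steps)
print(sorted(r["id"] for r in ex_matches), "outer steps:", ex_steps)
```

Both shapes return the same rows, but the number of outer iterations equals the size of the driving table, which is why the smaller table should be the one that drives.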

2. Index best practices

Table used:

CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(24) NOT NULL DEFAULT '' COMMENT 'Name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'Age',
  `position` varchar(20) NOT NULL DEFAULT '' COMMENT 'Position',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Hire time',
  PRIMARY KEY (`id`),
  KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='Employee records';

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei', 23,'dev',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy',23,'dev',NOW());

1. Full-value match

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age = 22;

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age = 22 AND position = 'manager';

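2. Leftmost prefix rule

When an index covers several columns, follow the leftmost prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.

EXPLAIN SELECT * FROM employees WHERE age = 22 AND position = 'manager';

EXPLAIN SELECT * FROM employees WHERE position = 'manager';

Both queries skip the leftmost index column, so the index is not used.

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';

name is the leftmost column, so the index is used.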

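3. Do not apply any operation (calculation, function, or automatic or manual type conversion) to an indexed column; doing so disables the index and falls back to a full table scan

Uses the index:

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';

Does not use the index:

EXPLAIN SELECT * FROM employees WHERE left(name, 3) = 'LiL';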

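4. The storage engine cannot use index columns to the right of a range condition

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age = 22 AND position = 'manager';

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age > 22 AND position = 'manager';

In the second query the range on age means that position is not used for the lookup.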

5. Prefer covering indexes (queries that access only the index, because the index columns include every selected column) and cut down on select *

EXPLAIN SELECT name, age FROM employees WHERE name = 'LiLei' AND age = 23 AND position = 'manager';

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age = 23 AND position = 'manager';

6. With not-equal (!= or <>), MySQL generally cannot use the index and falls back to a full table scan

EXPLAIN SELECT * FROM employees WHERE name != 'LiLei';

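7. is null and is not null often prevent index use as well

EXPLAIN SELECT * FROM employees WHERE name is null;

Whether these predicates can use an index depends on the column definition and on the optimizer's cost estimate. Here name is declared NOT NULL, so the condition can never be true.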

8. When like starts with a wildcard ('%abc...'), the index is not used and MySQL performs a full table scan

EXPLAIN SELECT * FROM employees WHERE name like '%Lei';

EXPLAIN SELECT * FROM employees WHERE name like 'Lei%';
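
The difference between the two patterns comes down to how a sorted index is searched. The sketch below models the index as a sorted list of names and is only an illustration: the extra names "Lei" and "LeiJun" are invented for the example, comparison is by plain code point rather than a MySQL collation, and the prefix is assumed to be non-empty.

```python
import bisect

# Sorted list standing in for the name part of the index.
# "Lei" and "LeiJun" are hypothetical extra rows.
names = sorted(["HanMeimei", "Lei", "LeiJun", "LiLei", "Lucy"])


def prefix_range(sorted_names, prefix):
    """LIKE 'prefix%': one contiguous slice, found by two binary searches."""
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)  # smallest string above every match
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_left(sorted_names, upper)
    return sorted_names[lo:hi]


def suffix_scan(sorted_names, suffix):
    """LIKE '%suffix': matches are scattered, so every entry is checked."""
    return [n for n in sorted_names if n.endswith(suffix)]


print(prefix_range(names, "Lei"))
print(suffix_scan(names, "Lei"))
```

Names that share a prefix sit next to each other in sorted order, while names that share a suffix can be anywhere, which is why a leading wildcard leaves nothing to seek on.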

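Question: how can the index still be used for like '%string%'?

a) Use a covering index: every selected column must be part of the index, so that MySQL can scan the index instead of the table.

EXPLAIN SELECT name, age, position FROM employees WHERE name like '%Lei%';

b) Caveat: reportedly, when the covering index includes a column of type varchar(380) or longer, the covering index stops working.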

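9. Omitting the single quotes around a string value disables the index

EXPLAIN SELECT * FROM employees WHERE name = '1000';

EXPLAIN SELECT * FROM employees WHERE name = 1000;

In the second query the string column is compared with a number, so MySQL converts name implicitly for every row, which is the type conversion that rule 3 warns about.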
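
One way to see why the numeric comparison cannot seek in a string index: several different strings convert to the same number, and they are not neighbours in string order. The sketch below is an analogy only. Python's float() stands in for MySQL's string-to-number conversion, whose exact rules differ, and the stored values are invented for the example.

```python
def to_number(text):
    """Stand-in for implicit string-to-number conversion (assumption:
    unparseable strings count as 0)."""
    try:
        return float(text)
    except ValueError:
        return 0.0


# Hypothetical name values, kept in string order as an index would keep them.
stored = sorted(["1000", "1e3", " 1000", "1000.0", "10x", "999", "LiLei"])

# name = '1000': a single exact position in the sorted strings.
string_matches = [s for s in stored if s == "1000"]

# name = 1000: every stored value has to be converted and compared.
number_matches = [s for s in stored if to_number(s) == 1000]
positions = [stored.index(s) for s in number_matches]

print(stored)
print(string_matches)
print(number_matches, "at positions", positions)
```

The numeric matches are spread over non-adjacent positions in the sorted list, so no single index range covers them and each row has to be examined.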

10. Use or sparingly; in many cases the index is not used when conditions are joined with or

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';