烧死10亿脑细胞的SQL长啥样？

技术分享 3年前 (2023-07-25) 0 999+

1 前言

今天在生产中碰到了一个让我十分费解的 SQL，十分有趣。

2 现象

SQL 很好复现，就是逻辑看起来有点唬人

postgres=# create table test(id1 int,id2 int); CREATE TABLE postgres=# insert into test values(1,3),(2,1),(3,1),(3,3); INSERT 0 4 postgres=# select * from test;  id1 | id2  -----+-----    1 |   3    2 |   1    3 |   1    3 |   3 (4 rows)

业务 SQL 如下👇🏻 此处用 test 表替代，真实情况表中字段存在一个父子关系，根据 parent_id 查找子 id

postgres=# select (exists (select 1 as one from test a where (test.id1 = a.id2))) as b from test;  b  ---  t  f  t  t (4 rows)  postgres=# explain select (exists (select 1 as one from test a where (test.id1 = a.id2))) as b from test;                           QUERY PLAN                           --------------------------------------------------------------  Seq Scan on test  (cost=0.00..3.14 rows=4 width=1)    SubPlan 2      ->  Seq Scan on test a  (cost=0.00..1.04 rows=4 width=4) (3 rows)

SQL 是 self-join ，a 是 test 表的一个别名。

让我们把子查询单独摘出来执行一下

postgres=# select 1 as one from test a where (test.id1 = a.id2); ERROR:  invalid reference to FROM-clause entry for table "test" LINE 1: select 1 as one from test a where (test.id1 = a.id2);                                            ^ HINT:  Perhaps you meant to reference the table alias "a".

可以看到报错了，说明此处的 test 是取自外层的 test（即 from test），根据 test.id1 去判断 a.id2，于是返回如下结果

postgres=# select * from test;  id1 | id2  -----+-----    1 |   3   ---true (id1=1，id2里面有，遍历)    2 |   1   ---false(id1=2，id2里面没有，遍历)    3 |   1   ---true (id1=3，id2里面有，遍历)    3 |   3   ---true (id1=3，id2里面有，遍历) (4 rows)

现在让我们改写一下 SQL，修改一下别名

postgres=# select (exists (select 1 as one from test a where (a.id1 = test.id2))) as b from test;  b  ---  t  t  t  t (4 rows)  postgres=# explain select (exists (select 1 as one from test a where (a.id1 = test.id2))) as b from test;                           QUERY PLAN                           --------------------------------------------------------------  Seq Scan on test  (cost=0.00..5.24 rows=4 width=1)    SubPlan 2      ->  Seq Scan on test a  (cost=0.00..1.04 rows=4 width=4) (3 rows)

这次可以看到，结果全部是真。老样子，也是相同的原理

postgres=# select 1 as one from test a where (a.id1 = test.id2); ERROR:  invalid reference to FROM-clause entry for table "test" LINE 1: select 1 as one from test a where (a.id1 = test.id2);                                                    ^ HINT:  Perhaps you meant to reference the table alias "a".

于是根据 test.id2 去探测 a.id1，于是返回如下结果

postgres=# select * from test;  id1 | id2  -----+-----    1 |   3   ---true (id2=3，id1里面有，遍历)    2 |   1   ---true (id2=1，id1里面有，遍历)    3 |   1   ---true (id2=1，id1里面有，遍历)    3 |   3   ---true (id2=3，id1里面有，遍历) (4 rows)

让我们再改写一下 SQL

postgres=# select (exists (select 1 as one from test a where (a.id1 = a.id2))) as b from test;  b  ---  t  t  t  t (4 rows)  postgres=# explain select (exists (select 1 as one from test a where (a.id1 = a.id2))) as b from test;                           QUERY PLAN                           --------------------------------------------------------------  Seq Scan on test  (cost=1.05..2.09 rows=4 width=1)    InitPlan 1 (returns $0)      ->  Seq Scan on test a  (cost=0.00..1.05 rows=1 width=0)            Filter: (id1 = id2) (4 rows)

这次执行计划变了，变成了 InitPlan，执行计划和结构都有所差异。那么 InitPlan 是什么意思？

This plan happens whenever there is a part of your query that can (or have to) be calculated before anything else, and it doesn't depend on anything in the rest of your query.

只要查询的一部分可以（或必须）在其他任何内容之前计算，并且它不依赖于查询的其余部分中的任何内容，就会发生此计划。

A special case of SubPlan that only needs to run once.

SubPlan 的一种特殊情况，只需要运行一次。

这就有点像相关子连接和非相关子连接的说法，相关子连接在子查询语句中引用了外层表的列属性，这就导致外层表每获得一个元组，子查询就需要重新执行一次；而非相关子连接是指在子查询语句是独立的，和外层的表没有直接的关联，子查询可以单独执行一次，外层表可以重复利用子查询的执行结果。

因此上述执行计划就变成了 a 表先进行一次独立的子查询

postgres=# select * from test where id1 = id2;  id1 | id2  -----+-----    3 |   3 (1 row)  postgres=# select exists (select 3,3) as b from test;  b  ---  t  t  t  t (4 rows)  postgres=# delete from test; DELETE 4 postgres=# insert into test values(5,4); INSERT 0 1 postgres=# select (exists (select 1 as one from test a where (a.id1 = a.id2))) as b from test;  b  ---  f (1 row)  postgres=# insert into test values(3,4); INSERT 0 1 postgres=# select (exists (select 1 as one from test a where (a.id1 = a.id2))) as b from test;  b  ---  f  f (2 rows)  postgres=# insert into test values(4,4); INSERT 0 1 postgres=# select (exists (select 1 as one from test a where (a.id1 = a.id2))) as b from test;  b  ---  t  t  t (3 rows)

可以看到，只要结果中有相等的 id1 和 id2，结果就会全部返回真。

那让我们又双叒叕改写下 SQL

postgres=# truncate table test; TRUNCATE TABLE postgres=# insert into test values(1,3),(2,1),(3,1),(3,3); INSERT 0 4 postgres=# explain select (exists (select 1 as one from test a where (test.id1 = test.id2))) as b from test;                              QUERY PLAN                              --------------------------------------------------------------------  Seq Scan on test  (cost=0.00..2.09 rows=4 width=1)    SubPlan 1      ->  Result  (cost=0.00..1.04 rows=4 width=0)            One-Time Filter: (test.id1 = test.id2)            ->  Seq Scan on test a  (cost=0.00..1.04 rows=4 width=0) (5 rows)  postgres=# select (exists (select 1 as one from test a where (test.id1 = test.id2))) as b from test;  b  ---  f  f  f  t (4 rows)

这次多了一个 One-Time Filter，那么这个又是什么玩意？

A qualification used by a Result operation. If it is false, an empty result set can be returned without further work.

如果为 false，则可以返回空结果集，无需进一步工作。

让我们瞅瞅代码，在代码中有这么一段注释

 *  Result nodes are also used to optimise queries with constant  *  qualifications (ie, quals that do not depend on the scanned data),  *  such as:  *  *    select * from emp where 2 > 1  *  *  In this case, the plan generated is  *  *      Result (with 2 > 1 qual)  *      /  *       SeqScan (emp.*)  *  *  At runtime, the Result node evaluates the constant qual once,  *  which is shown by EXPLAIN as a One-Time Filter.  If it's  *  false, we can return an empty result set without running the  *  controlled plan at all.  If it's true, we run the controlled  *  plan normally and pass back the results.

逻辑很清晰，因此上述逻辑就好比这么一串 SQL

postgres=# select * from test where 2 > 1;  id1 | id2  -----+-----    1 |   3    2 |   1    3 |   1    3 |   3 (4 rows)  postgres=# select * from test where 1 > 1;  id1 | id2  -----+----- (0 rows)  postgres=# select exists(select 1 from test where 1 > 1)as b;  b  ---  f (1 row)  postgres=# select exists(select 1 from test where 1 > 1)as b from test;  b  ---  f  f  f  f (4 rows)  postgres=# select (exists (select 1 as one from test a where (test.id1 = test.id2))) as b from test;  b  ---  f  f  f  t (4 rows)

因此此时的 SQL 逻辑就变成了这样：遍历 test 表，判断 id1 = id2 的行，所以结果是 false、false、false、true