当前位置: 首页 > news >正文

优化09-表连接

一、表连接介绍

表连接类型

表连接是关系型数据库关键特性,在关系型数据库中,表连接分为三类:循环嵌套连接(Nested Loops Join)、哈希连接(Hash Join)、合并排序连接(Merge Sort Join)。假设存在表A和表B,都存在ID列,通过id列连接。

Nested Loops Join:遍历A中所有id,依次拿id和表B中的id对比。

Hash Join:对A中的id做HASH运算,放入多个HASH Bucket中,根据B表的id列的hash值和HASH Bucket匹配。

Merge Sort Join:对A表和B表的id列进行排序,按照排序结果进行连接。

表连接分析

表的访问次数,由执行计划的starts列来表示

  • NL连接驱动表被访问0次或1次,被驱动表被访问0次或N次,N的值取决于驱动表的返回行数。
  • HASH连接,驱动表和被驱动表都是被访问0次或1次,大部分场景是驱动表和被驱动表都访问1次。
  • MS连接,驱动表和被驱动表都是被访问0次或1次,大部分场景是驱动表和被驱动表都访问1次。

驱动顺序

  • NL连接的性能与驱动顺序有关,一般小表做驱动表,性能更好。
  • HASH连接的性能和驱动顺序有关,一般小表做驱动表。
  • MS连接的性能和驱动顺序无关。

是否排序

  • NL连接不排序、不多余消耗内存
  • HASH连接不排序,但是hash area多消耗内存
  • MS连接排序,消耗sort area内存

使用限制

  • NL连接支持各种写法,无限制。
  • HASH连接支持等值连接,不支持>、<、like、<>等。
  • MS连接不支持>、<、like、<>等

适用场景

  • NL连接一般适用于OLTP系统
  • HASH连接、MS连接适用于OLAP系统

二、表连接测试

创建测试数据

--创建表T1和T2
CREATE TABLE t1 (
id NUMBER NOT NULL,
n NUMBER,
contents VARCHAR2(4000));CREATE TABLE t2 (
id NUMBER NOT NULL,
t1_id NUMBER NOT NULL,
n NUMBER,
contents VARCHAR2(4000));--插入测试数据
execute dbms_random.seed(0);
INSERT INTO t1
SELECT  rownum,  rownum, dbms_random.string('a', 50)
FROM dual
CONNECT BY level <= 100
ORDER BY dbms_random.random;
INSERT INTO t2 SELECT rownum, rownum, rownum, dbms_random.string('b', 50) FROM dual CONNECT BY level <= 100000
ORDER BY dbms_random.random;
COMMIT;--查看数据量 
SQL> select count(*) from t1;COUNT(*)
----------100
SQL> select count(*) from t2;COUNT(*)
----------100000

NL连接优化实验

--  两个表无索引执行计划
alter session set statistics_level=all;
select * from t1,t2 where t1.id=t2.t1_id and t1.n=19;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));----------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.02 |    1016 |       |       |          |
|*  1 |  HASH JOIN         |      |      1 |      1 |      1 |00:00:00.02 |    1016 |  1000K|  1000K|  407K (0)|
|*  2 |   TABLE ACCESS FULL| T1   |      1 |      1 |      1 |00:00:00.01 |       7 |       |       |          |
|   3 |   TABLE ACCESS FULL| T2   |      1 |    103K|    100K|00:00:00.01 |    1006 |       |       |          |
------------------------------------------------------------------------------------------------------------------驱动表创建索引create index t1_n on t1(n);select * from t1,t2 where t1.id=t2.t1_id and t1.n=19;select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |      1 |        |      1 |00:00:00.05 |    1008 |
|*  1 |  HASH JOIN                           |      |      1 |      1 |      1 |00:00:00.05 |    1008 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX RANGE SCAN                  | T1_N |      1 |      1 |      1 |00:00:00.01 |       1 |
|   4 |   TABLE ACCESS FULL                  | T2   |      1 |    103K|    100K|00:00:00.02 |    1006 |
-------------------------------------------------------------------------------------------------------
--被驱动表创建索引
CREATE INDEX t2_t1_id ON t2(t1_id);
select * from t1,t2 where t1.id=t2.t1_id and t1.n=19;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    |Starts| E-Rows|A-Rows|   A-Time   |Buffers|Reads |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |     1|       |     1|00:00:00.01 |      7|     4|
|   1 |  NESTED LOOPS                         |         |     1|      1|     1|00:00:00.01 |      7|     4|
|   2 |   NESTED LOOPS                        |         |     1|      1|     1|00:00:00.01 |      6|     4|
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1      |     1|      1|     1|00:00:00.01 |      3|     0|
|*  4 |     INDEX RANGE SCAN                  | T1_N    |     1|      1|     1|00:00:00.01 |      2|     0|
|*  5 |    INDEX RANGE SCAN                   | T2_T1_ID|     1|      1|     1|00:00:00.01 |      3|     4|
|   6 |   TABLE ACCESS BY INDEX ROWID         | T2      |     1|      1|     1|00:00:00.01 |      1|     0|
-----------------------------------------------------------------------------------------------------------
--大表驱动小表
SELECT /*+ leading(t2) use_nl(t1) */ *2  FROM t1, t23  WHERE t1.id = t2.t1_id4  AND t1.n = 19;
-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      1 |00:00:00.11 |    1013 |
|   1 |  NESTED LOOPS                |      |      1 |      1 |      1 |00:00:00.11 |    1013 |
|   2 |   NESTED LOOPS               |      |      1 |    103K|    100K|00:00:00.07 |    1011 |
|   3 |    TABLE ACCESS FULL         | T2   |      1 |    103K|    100K|00:00:00.01 |    1006 |
|*  4 |    INDEX RANGE SCAN          | T1_N |    100K|      1 |    100K|00:00:00.04 |       5 |
|*  5 |   TABLE ACCESS BY INDEX ROWID| T1   |    100K|      1 |      1 |00:00:00.02 |       2 |
-----------------------------------------------------------------------------------------------
--尽管两个表都建立了索引,但是如果大表驱动小表,反而消耗更大。
--总结:
--驱动表和被驱动表考虑创建索引
--确保小表驱动大表

HASH连接优化实验

--删除索引drop index t1_n;drop index t2_t1_id;select * from t1,t2 where t1.id=t2.t1_id and t1.n=19;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));----------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.02 |    1016 |       |       |          |
|*  1 |  HASH JOIN         |      |      1 |      1 |      1 |00:00:00.02 |    1016 |  1000K|  1000K|  407K (0)|
|*  2 |   TABLE ACCESS FULL| T1   |      1 |      1 |      1 |00:00:00.01 |       7 |       |       |          |
|   3 |   TABLE ACCESS FULL| T2   |      1 |    103K|    100K|00:00:00.01 |    1006 |       |       |          |
------------------------------------------------------------------------------------------------------------------创建索引
create index idx_t1_n on t1(n);--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |      1 |        |      1 |00:00:00.06 |    1008 |       |       |          |
|*  1 |  HASH JOIN                           |          |      1 |      1 |      1 |00:00:00.06 |    1008 |  1000K|  1000K|  402K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1       |      1 |      1 |      1 |00:00:00.01 |       2 |       |       |          |
|*  3 |    INDEX RANGE SCAN                  | IDX_T1_N |      1 |      1 |      1 |00:00:00.01 |       1 |       |       |          |
|   4 |   TABLE ACCESS FULL                  | T2       |      1 |    100K|    100K|00:00:00.02 |    1006 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------总结
--驱动表和被驱动表考虑创建索引
--确保小表驱动大表
--由于HASH操作需要额外内存区域(hash area),如果需要做hash运算的数据太多,则会用的临时表空间,涉及磁盘IO会大大降低性能,索引尽量保证hash运算在内存中完成。

MS连接优化实验

--Merge Sort Join两表限制条件皆无索引SQL> SELECT /*+ leading(t2) use_merge(t1)*/ *2  FROM t1, t23  WHERE t1.id = t2.t1_id4   and t1.n=19;---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |      1 |        |      1 |00:00:00.06 |    1007 |       |       |          |
|   1 |  MERGE JOIN                           |          |      1 |      1 |      1 |00:00:00.06 |    1007 |       |       |          |
|   2 |   SORT JOIN                           |          |      1 |    100K|     20 |00:00:00.06 |    1005 |  9762K|  1209K| 8677K (0)|
|   3 |    TABLE ACCESS FULL                  | T2       |      1 |    100K|    100K|00:00:00.01 |    1005 |       |       |          |
|*  4 |   SORT JOIN                           |          |     20 |      1 |      1 |00:00:00.01 |       2 |  2048 |  2048 | 2048  (0)|
|   5 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1       |      1 |      1 |      1 |00:00:00.01 |       2 |       |       |          |
|*  6 |     INDEX RANGE SCAN                  | IDX_T1_N |      1 |      1 |      1 |00:00:00.01 |       1 |       |       |          |
--------------------------------------------------------------------------------------------------------------------限制条件有索引
create index idx_t1_n on t1(n);
create index idx_t2_n on t2(n);
SQL> SELECT /*+ leading(t2) use_merge(t1)*/ *2  FROM t1, t23  WHERE t1.id = t2.t1_id4  and t1.n=195  and t2.n=12;SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |      1 |        |      0 |00:00:00.01 |       5 |       |       |          |
|   1 |  MERGE JOIN                           |          |      1 |      1 |      0 |00:00:00.01 |       5 |       |       |          |
|   2 |   SORT JOIN                           |          |      1 |      1 |      1 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T2       |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|*  4 |     INDEX RANGE SCAN                  | IDX_T2_N |      1 |      1 |      1 |00:00:00.01 |       2 |       |       |          |
|*  5 |   SORT JOIN                           |          |      1 |      1 |      0 |00:00:00.01 |       2 |  2048 |  2048 | 2048  (0)|
|   6 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1       |      1 |      1 |      1 |00:00:00.01 |       2 |       |       |          |
|*  7 |     INDEX RANGE SCAN                  | IDX_T1_N |      1 |      1 |      1 |00:00:00.01 |       1 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------连接条件创建索引
set autotrace traceonly
SQL> SELECT /*+ leading(t1) use_merge(t2)*/ *2  FROM t1, t23  WHERE t1.id = t2.t1_id;
Statistics
----------------------------------------------------------1  recursive calls0  db block gets1012  consistent gets0  physical reads0  redo size13999  bytes sent via SQL*Net to client673  bytes received via SQL*Net from client8  SQL*Net roundtrips to/from client2  sorts (memory)0  sorts (disk)100  rows processedCREATE INDEX idx_t1_id ON t1(id);
CREATE INDEX idx_t2_t1_id ON t2(t1_id);
set autotrace traceonly
SQL> SELECT /*+ leading(t1) use_merge(t2)*/ *2  FROM t1, t23  WHERE t1.id = t2.t1_id;100 rows selected.Statistics
----------------------------------------------------------1  recursive calls0  db block gets1021  consistent gets0  physical reads0  redo size13999  bytes sent via SQL*Net to client673  bytes received via SQL*Net from client8  SQL*Net roundtrips to/from client1  sorts (memory)0  sorts (disk)100  rows processed
http://www.xdnf.cn/news/884989.html

相关文章:

  • 各种排序算法的再整理
  • 【Nginx】使用 Nginx+Lua 实现基于 IP 的访问频率限制
  • 命令行运行python程序报错 ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6
  • Cursor AI编程助手模型选择对了吗?
  • mysql跨库关联查询及视图创建
  • 机器学习——什么时候使用决策树
  • PostgreSQL 入门教程
  • 边缘计算应用实践心得
  • 防反接电路设计浅谈
  • 在使用一些不用驱动大电流的设备就可以用stm32的自己的上下拉但是本身上下拉不就是给iicspi这些他通信给信号的吗中怎么还跟驱动能力扯上了有什么场景嘛
  • Wireshark使用教程(含安装包和安装教程)
  • Kafka存储机制核心优势剖析
  • 数据库-MySQL
  • Ubuntu中常用的网络命令指南
  • 8.axios Http网络请求库(1)
  • 洛谷题目:P2761 软件补丁问题 (本题简单)
  • Unity基础-Mathf相关
  • NoSQL 之 Redis 配置与优化
  • 护网面试题目2025
  • Windows下安装MySQL8.X
  • 渗透实战PortSwigger靶场-XSS Lab 14:大多数标签和属性被阻止
  • RK3588 RTL8211F PHY的LED灯调试
  • 能做超厚铜pcb工厂有哪些?
  • MLP实战二:MLP 实现图像数字多分类
  • 大中型水闸安全监测管理系统建设方案
  • Authpf(OpenBSD)认证防火墙到ssh连接到SSH端口转发技术栈 与渗透网络安全的关联 (RED Team Technique )
  • 机器学习的数学基础:决策树
  • 今日学习:ES8语法 | Spring整合ES | ES场景八股
  • Python html 库用法详解
  • Selenium 和playwright 使用场景优缺点对比