Greenplum , HAWQ outer join与motion问题讲解__Greenplum , HAWQ outer join与motion问题讲解__PostgreSQL中文社区: 世界上功能最强大的开源数据库....

PG中文社区 /

mdi-home

首页

社区新闻

中文文档

加入ACE

mdi-home

首页

mdi-chat-processing

社区新闻

mdi-book-open-variant

中文文档

mdi-account-multiple-check

加入ACE

mdi-file-multiple-outline

Greenplum , HAWQ outer join与motion问题讲解

2016 Postgres大象会官方报名通道：点此报名

作者: digoal

日期: 2016-09-09

标签: PostgreSQL , HAWQ , Greenplum , OUTER JOIN , Motion

背景

Greenplum，HAWQ是分布式的数据库，在建表时，我们可以选择分布列，或者选择随机分布。

多个表做等值JOIN时，如果JOIN列为分布列，则不需要进行数据的重分布。

但是，如果使用的是OUTER JOIN，情况就不一样了，你可能会发现多个表进行outer join时，如果JOIN列都是HASH分布列，某些写法就可能导致需要重分布。

下面给大家分析一下原因。

创建几张测试表

其中tab1,tab2,tab3的bucketnum一致，分布列一致。

tab4,tab5,tab6的分布列与前面几张表一致，但是bucketnum不一致（所以显然HASH取模后的值也不一致）。

bucketnum默认是实体segments的6倍（意思是每台实体segment上跑6个虚拟segment），用户可以根据表的大小来调试，例如要发挥最大的计算能力，实际设置时可以设置为主机CPU核数的0.8，小表则建议设置为较小的值。

tab7与tab8为随机分布，对于随机分布的表，bucketnum设置会忽略，在实体segments之间随机分布(查看hdfs对应的文件也能看出端倪)。

postgres=# create table tab1(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=6) distributed by(c1,c2);
postgres=# create table tab2(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=6) distributed by(c1,c2);
postgres=# create table tab3(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=6) distributed by(c1,c2);
postgres=# create table tab4(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=10) distributed by(c1,c2);
postgres=# create table tab5(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=12) distributed by(c1,c2);
postgres=# create table tab6(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=3) distributed by(c1,c2);
postgres=# create table tab7(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=3) distributed randomly;
postgres=# create table tab8(c1 int, c2 int, c3 text, c4 timestamp) with (bucketnum=6) distributed randomly;
postgres=# insert into tab7 select generate_series(1,1000000),generate_series(1,1000000),'test',now();
INSERT 0 1000000
postgres=# insert into tab8 select generate_series(1,1000000),generate_series(1,1000000),'test',now();
INSERT 0 1000000
postgres=# analyze tab7;
ANALYZE
postgres=# analyze tab8;
ANALYZE

下面开始几组测试，通过执行计划的motion node，观察query是否需要重分布，以及重分布那张表，重分布的键值是哪些。

测试1

HAWQ的hash分布取模值取决于bucket num，如果bucket num不一致，则JOIN时有一张表需要重分布

postgres=# explain select * from tab1 join tab6 on (tab1.c1=tab6.c1 and tab1.c2=tab6.c2);
                                              QUERY PLAN  
-----------------------------------------------------------------------------------------
 Gather Motion 6:1  (slice2; segments: 6)  (cost=0.00..862.00 rows=1 width=48)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=48)
         Hash Cond: tab1.c1 = tab6.c1 AND tab1.c2 = tab6.c2
         ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Redistribute Motion 6:6  (slice1; segments: 6)  (cost=0.00..431.00 rows=1 width=24)
                                              -----    重分布tab6
                     Hash Key: tab6.c1, tab6.c2
                     ->  Table Scan on tab6  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(10 rows)

测试2

如果JOIN的其中一个表为随机分布式，随机分布的表需要复制或按JOIN列进行重分布

postgres=# explain select * from tab6 join tab7 on (tab6.c1=tab7.c1 and tab6.c2=tab7.c2);
                                              QUERY PLAN  
----------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..862.00 rows=1 width=48)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=48)
         Hash Cond: tab6.c1 = tab7.c1 AND tab6.c2 = tab7.c2
         ->  Table Scan on tab6  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=24)
                                          ----重分布tab7
                     Hash Key: tab7.c1, tab7.c2
                     ->  Table Scan on tab7  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(10 rows)

测试3

随机分布策略的表，在每个segment上只有一个bucket

postgres=# explain select count(*) from tab7;
                                     QUERY PLAN      
----------------------------------------------------------------------------------
 Aggregate  (cost=0.00..452.11 rows=1 width=8)
   ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..452.11 rows=1 width=8)
         ->  Aggregate  (cost=0.00..452.11 rows=1 width=8)
               ->  Table Scan on tab7  (cost=0.00..450.25 rows=1000000 width=1)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(6 rows)

测试4

如果两个随机分布的表JOIN，数据需要在所有的实体segments(每个datanode对应一个实体segment)上按JOIN列重分布

postgres=# explain select * from tab8 join tab7 on (tab8.c1=tab7.c1 and tab8.c2=tab7.c2);
                                                 QUERY PLAN
----------------------------------------------------------------------------------------
 Gather Motion 1:1  (slice3; segments: 1)  (cost=0.00..2151.49 rows=10000 width=42)
   ->  Hash Join  (cost=0.00..2148.64 rows=10000 width=42)
         Hash Cond: tab8.c1 = tab7.c1 AND tab8.c2 = tab7.c2
         ->  Redistribute Motion 1:1  (slice1; segments: 1)  (cost=0.00..555.04 rows=1000000 width=21)
                                              ----------------重分布tab8
               Hash Key: tab8.c1, tab8.c2
               ->  Table Scan on tab8  (cost=0.00..450.25 rows=1000000 width=21)
         ->  Hash  (cost=555.04..555.04 rows=1000000 width=21)
               ->  Redistribute Motion 1:1  (slice2; segments: 1)  (cost=0.00..555.04 rows=1000000 width=21)
                                             ------------------重分布tab7
                     Hash Key: tab7.c1, tab7.c2
                     ->  Table Scan on tab7  (cost=0.00..450.25 rows=1000000 width=21)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(12 rows)

测试5

outer join，2张表的outer join，如果JOIN列就是分布列，不需要重分布

postgres=# explain select * from tab1 left join tab2 on (tab1.c1=tab2.c1 and tab1.c2=tab2.c2);
                                  QUERY PLAN            
-------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=0.00..862.00 rows=2 width=48)
   ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=48)
         Hash Cond: tab1.c1 = tab2.c1 AND tab1.c2 = tab2.c2
         ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Table Scan on tab2  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(8 rows)

测试6

outer join，3张表的outer join，需要注意JOIN的条件

1.tab1与tab2 left join后，关联不上时tab2可能会返回一些NULL值

因此再次与tab3 join时，如果JOIN条件是tab2与tab3，则不能圈定在虚拟segment内完成tab2与tab3的JOIN，必须要对tab1与tab2的outer JOIN中间结果进行重分布，对齐tab3的分布策略，再进行JOIN。

但是实际上，并不需要重分布，因为null和null比较返回的还是null，所以null实际上不需要重分布到一起去JOIN，本条SQL，对于HAWQ的优化器来说，是有优化余地的。

postgres=# explain select * from tab1 left join tab2 on (tab1.c1=tab2.c1 and tab1.c2=tab2.c2) 
                            left join tab3 on (tab2.c1=tab3.c1 and tab2.c2=tab3.c2);
                                           QUERY PLAN     
--------------------------------------------------------------------------------------------
 Gather Motion 6:1  (slice2; segments: 6)  (cost=0.00..1293.00 rows=3 width=72)
   ->  Hash Left Join  (cost=0.00..1293.00 rows=1 width=72)
         Hash Cond: tab2.c1 = tab3.c1 AND tab2.c2 = tab3.c2
         ->  Redistribute Motion 6:6  (slice1; segments: 6)  (cost=0.00..862.00 rows=1 width=48)
               Hash Key: tab2.c1, tab2.c2
               ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=48)
                     Hash Cond: tab1.c1 = tab2.c1 AND tab1.c2 = tab2.c2
                     ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
                     ->  Hash  (cost=431.00..431.00 rows=1 width=24)
                           ->  Table Scan on tab2  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Table Scan on tab3  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(14 rows)

2.这样的条件下，则不需要重分布，因为第一次LEFT JOIN后，TAB1不会产生空值，使用tab1再与tab3进行join也不需要重分布。

postgres=# explain select * from tab1 left join tab2 on (tab1.c1=tab2.c1 and tab1.c2=tab2.c2) 
                                left join tab3 on (tab1.c1=tab3.c1 and tab1.c2=tab3.c2);
                                   QUERY PLAN       
---------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=0.00..1293.00 rows=3 width=72)
   ->  Hash Left Join  (cost=0.00..1293.00 rows=1 width=72)
         Hash Cond: tab1.c1 = tab3.c1 AND tab1.c2 = tab3.c2
         ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=48)
               Hash Cond: tab1.c1 = tab2.c1 AND tab1.c2 = tab2.c2
               ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
               ->  Hash  (cost=431.00..431.00 rows=1 width=24)
                     ->  Table Scan on tab2  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Table Scan on tab3  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(12 rows)

3.如果第三张关联表是JOIN条件，而非OUTER JOIN，同样不需要重分布。

postgres=# explain select * from tab1 left join tab2 on (tab1.c1=tab2.c1 and tab1.c2=tab2.c2) 
                             join tab3 on (tab2.c1=tab3.c1 and tab2.c2=tab3.c2);
                                               QUERY PLAN   
--------------------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=0.00..1293.00 rows=1 width=72)
   ->  Hash Join  (cost=0.00..1293.00 rows=1 width=72)
         Hash Cond: tab2.c1 = tab3.c1 AND tab2.c2 = tab3.c2 AND tab1.c1 = tab3.c1 AND tab1.c2 = tab3.c2
         ->  Hash Join  (cost=0.00..862.00 rows=1 width=48)
               Hash Cond: tab1.c1 = tab2.c1 AND tab1.c2 = tab2.c2
               ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
               ->  Hash  (cost=431.00..431.00 rows=1 width=24)
                     ->  Table Scan on tab2  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Table Scan on tab3  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(12 rows)

4.只有JOIN时，也不需要考虑重分布。

postgres=# explain select * from tab1 join tab2 on (tab1.c1=tab2.c1 and tab1.c2=tab2.c2) 
                   join tab3 on (tab2.c1=tab3.c1 and tab2.c2=tab3.c2);
                                               QUERY PLAN 
---------------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=0.00..1293.00 rows=1 width=72)
   ->  Hash Join  (cost=0.00..1293.00 rows=1 width=72)
         Hash Cond: tab2.c1 = tab3.c1 AND tab2.c2 = tab3.c2 AND tab1.c1 = tab3.c1 AND tab1.c2 = tab3.c2
         ->  Hash Join  (cost=0.00..862.00 rows=1 width=48)
               Hash Cond: tab1.c1 = tab2.c1 AND tab1.c2 = tab2.c2
               ->  Table Scan on tab1  (cost=0.00..431.00 rows=1 width=24)
               ->  Hash  (cost=431.00..431.00 rows=1 width=24)
                     ->  Table Scan on tab2  (cost=0.00..431.00 rows=1 width=24)
         ->  Hash  (cost=431.00..431.00 rows=1 width=24)
               ->  Table Scan on tab3  (cost=0.00..431.00 rows=1 width=24)
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: PQO version 1.638
(12 rows)

小结

1. 随机分布的表与随机分布的表进行JOIN时，可能无法充分利用计算资源，因为每个物理节点只能用到一个核。

2. 随机分布的表与哈希分布的表JOIN时，会根据实际情况，重分布，并行计算。（如果哈希分布的表bucketnum较多，这种QUERY也能用上多核JOIN）。

3. outer join时，如果多次进行，请注意实际的场景逻辑，建议在JOIN时过滤，而不是JOIN完后过滤NULL，以避免重分布。

2016 Postgres大象会官方报名通道：http://www.huodongxing.com/event/8352217821400

扫描报名

评论:4　　

顶: 2907　

踩: 2987　

请在登录后发表评论，否则无法保存。

1# __ xcvxcvsdf 回答于 2025-02-12 15:56:12+08

https://changpozhen.tiancebbs.cn/mayi-news.xml http://www.wukong-b2b.com/sell/inquiry-htm-itemid-291.html https://aihuishou.tiancebbs.cn/szscyq/mayi-category.xml https://xunyang.tiancebbs.cn/mayi-news.xml https://ccqqita.tiancebbs.cn/mayi-category.xml https://wyzhoubian.tiancebbs.cn/mayi-store.xml https://www.tiancebbs.cn/ershoufang/499410.html https://aihuishou.tiancebbs.cn/hfsgxqhfs/mayi-category.xml https://cd.tiancebbs.cn/news/43302.html https://jingxigudaojing.tiancebbs.cn/mayi-category.xml https://aihuishou.tiancebbs.cn/lgxwfs/ https://cd.tiancebbs.cn/news/43808.html http://gy.gstcxxw.cn/fangchan/ https://zulin.tiancebbs.cn/sh/3356.html https://liuhua.tiancebbs.cn/mayi-store.xml https://aihuishou.tiancebbs.cn/gbzdgs/mayi-category.xml https://zulin.tiancebbs.cn/sh/4275.html

2# __ xcvxcvsdf 回答于 2025-01-23 03:42:04+08

https://www.tiancebbs.cn/news/38934.html https://www.tiancebbs.cn/news/39622.html https://fjqz.tiancebbs.cn/qths/504461.html https://hainan.tiancebbs.cn/mayi-info.xml https://xingbin.tiancebbs.cn/ https://ksgongyeyuan.tiancebbs.cn/mayi-category.xml https://www.tiancebbs.cn/news/42463.html https://dongyuanxian.tiancebbs.cn/qths/502106.html http://www.wukong-b2b.com/mall/658/ https://jdqshi.tiancebbs.cn/mayi-category.xml https://fkjingjikaifa.tiancebbs.cn/mayi-category.xml https://aihuishou.tiancebbs.cn/qlpesz/ https://aihuishou.tiancebbs.cn/heyuanshi/mayi-store.xml https://yangchun.tiancebbs.cn/ https://aihuishou.tiancebbs.cn/jaqfz/mayi-category.xml https://www.tiancebbs.cn/pgzp/54031.html http://www.wukong-b2b.com/mall/132/

3# __ xcvxcvsdf 回答于 2024-10-22 01:51:31+08

http://taiying.njtcbmw.cn/clnj/ http://shengshun.njtcbmw.cn/fc/ http://fuyang.tjtcbmw.cn/sczy/ http://ruanwen.xztcxxw.cn/sddy/ http://ruanwen.xztcxxw.cn/hnyy/ http://ruanwen.xztcxxw.cn/hjs/ http://huilong.sctcbmw.cn/luoyang/ http://yz.cqtcxxw.cn/baotou/ https://meijiangqu.tiancebbs.cn/ http://fuyang.tjtcbmw.cn/bjhr/ https://maerkang.tiancebbs.cn/ http://bjtcxxw.cn/putian/ https://haiyang.tiancebbs.cn/ http://huaguang.jxtcbmw.cn/ahla/ http://yuanbang.tjtcbmw.cn/zhabeish/ https://altqinghe.tiancebbs.cn/ http://ouyu.hftcbmw.cn/tianshui/

4# __ xiaowu 回答于 2024-04-22 09:24:45+08

抖音个人简介句子:https://www.nanss.com/wenan/1548.html 带符号网名:https://www.nanss.com/mingcheng/1819.html 梦幻游戏名字:https://www.nanss.com/mingcheng/1816.html 讽刺一个人死性不改的说说:https://www.nanss.com/wenan/1759.html 现实很骨感经典语句:https://www.nanss.com/yulu/1967.html 独自一人过生日的感言:https://www.nanss.com/wenan/1571.html 最浪漫的说老婆早安方式:https://www.nanss.com/yulu/1620.html 姐姐祝福妹妹结婚祝福语:https://www.nanss.com/yulu/1905.html 思念爸爸的伤感说说:https://www.nanss.com/wenan/1537.html 班级宣言:https://www.nanss.com/xuexi/1752.html 懂的说说心情短语:https://www.nanss.com/wenan/1935.html 感恩遇见姐妹情深:https://www.nanss.com/yulu/1552.html 再见过去说说:https://www.nanss.com/wenan/1794.html 睡一觉醒来又是美好的一天说说:https://www.nanss.com/wenan/1892.html 每日金句名人名言:https://www.nanss.com/xuexi/1596.html 陪伴孩子的文案:https://www.nanss.com/wenan/1708.html 对老人孝顺的感动句子:https://www.nanss.com/xuexi/1942.html 想父母了发朋友圈:https://www.nanss.com/wenan/1691.html 暗示做人别太过分经典语录:https://www.nanss.com/yulu/1532.html 期待我们再相聚的句子:https://www.nanss.com/yulu/1734.html 王者荣耀名字大全:https://www.nanss.com/mingcheng/1830.html 不要轻易听信别人的话的句子:https://www.nanss.com/yulu/1776.html 调侃血压高的句子:https://www.nanss.com/yulu/1885.html 晒弟照片的幽默句子:https://www.nanss.com/wenan/1904.html 送逝者最后一程的语句:https://www.nanss.com/yulu/1654.html 耳饰文案:https://www.nanss.com/yulu/1505.html 卡拉ok心情说说:https://www.nanss.com/wenan/1688.html 人心永远喂不饱的句子:https://www.nanss.com/yulu/1656.html 适合头疼的时候发的朋友圈:https://www.nanss.com/wenan/1735.html 有诗意的句子:https://www.nanss.com/xuexi/1952.html