PostgreSQL 最佳实践 - 水平分库(基于plproxy) 原作者:digoal / 德哥 创作时间:2016-10-09 18:04:34+08 |
doudou586 发布于2016-10-09 18:04:34 评论: 2 浏览: 15887 顶: 2781 踩: 2806 |
2016 Postgres大象会官方报名通道: 点此报名
我一直以来都比较推荐plproxy这个PostgreSQL代理软件, 因为它小巧灵活好用, 效率高.
最近朋友邀请我给他们做个分布式的方案, 所以又把plproxy翻出来了.
本文讲一讲在单节点中如何快速的部署plproxy环境.
PostgreSQL 9.3.1 plproxy 2.x
hostaddr 172.16.3.150 port 1921 user proxy password proxy dbname proxy schema digoal // 这个schema名和数据节点一致, 可以省去写plproxy language target的步骤.
hostaddr 172.16.3.150 port 1921 user digoal // plproxy将使用digoal用户连接数据节点. password digoal dbname db0 schema digoal dbname db1 schema digoal dbname db2 schema digoal dbname db3 schema digoal
首先在http://git.postgresql.org/gitweb/?p=plproxy.git;a=summary下载plproxy.
tar -zxvf plproxy-d703683.tar.gz mv plproxy-d703683 /opt/soft_bak/postgresql-9.3.1/contrib cd /opt/soft_bak/postgresql-9.3.1/contrib/plproxy-d703683 [root@db-172-16-3-150 plproxy-d703683]# export PATH=/home/pg93/pgsql9.3.1/bin:$PATH [root@db-172-16-3-150 plproxy-d703683]# which pg_config [root@db-172-16-3-150 plproxy-d703683]# gmake clean [root@db-172-16-3-150 plproxy-d703683]# gmake [root@db-172-16-3-150 plproxy-d703683]# gmake install
创建proxy库, proxy角色, 在proxy库创建plproxy extension.
pg93@db-172-16-3-150-> psql psql (9.3.1) Type "help" for help. postgres=# create role proxy nosuperuser login encrypted password 'proxy'; CREATE ROLE digoal=# create database proxy; CREATE DATABASE digoal=# \c proxy You are now connected to database "proxy" as user "postgres". proxy=# create extension plproxy; CREATE EXTENSION
调整proxy库权限
proxy=# grant all on database proxy to proxy; GRANT proxy=# \c proxy proxy You are now connected to database "proxy" as user "digoal".
创建digoal schema, 目的是和数据节点的schema匹配, 这样的话可以省去在代理函数中写target强行指定schema.
proxy=> create schema digoal; CREATE SCHEMA
创建节点数据库
proxy=> \c postgres postgres You are now connected to database "postgres" as user "postgres". postgres=# create role digoal nosuperuser login encrypted password 'digoal'; postgres=# create database db0; postgres=# create database db1; postgres=# create database db2; postgres=# create database db3;
调整权限, 赋予给后面将要给user mapping中配置的option user权限.
postgres=# grant all on database db0 to digoal; postgres=# grant all on database db1 to digoal; postgres=# grant all on database db2 to digoal; postgres=# grant all on database db3 to digoal;
使用超级用户在proxy数据库中创建server.
proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# CREATE SERVER cluster_srv1 FOREIGN DATA WRAPPER plproxy options (connection_lifetime '1800', p0 'dbname=db0 hostaddr=172.16.3.150 port=1921 application_name=test', p1 'dbname=db1 hostaddr=172.16.3.150 port=1921', p2 'dbname=db2 hostaddr=172.16.3.150 port=1921', p3 'dbname=db3 hostaddr=172.16.3.150 port=1921');
创建server时可以使用libpq中的选项. 例如本例使用了application_name.
将server权限赋予给proxy用户.
proxy=# grant usage on FOREIGN server cluster_srv1 to proxy; GRANT
配置proxy用户的连接cluster_srv1的选项.
proxy=# create user mapping for proxy server cluster_srv1 options (user 'digoal'); CREATE USER MAPPING
用户proxy连接到cluster_srv1时使用digoal用户连接, 这里不需要配置password, 因为我们将使用trust认证.
从proxy节点使用digoal用户连接数据库db0, db1, db2, db3使用trust认证.
vi $PGDATA/pg_hba.conf host db0 digoal 172.16.3.150/32 trust host db1 digoal 172.16.3.150/32 trust host db2 digoal 172.16.3.150/32 trust host db3 digoal 172.16.3.150/32 trust pg_ctl reload
使用超级用户创建plproxy函数, 然后把函数权限赋予给proxy权限.
proxy=# CREATE OR REPLACE FUNCTION digoal.dy(sql text) RETURNS SETOF record LANGUAGE plproxy STRICT AS $function$ cluster 'cluster_srv1'; run on all; $function$; proxy=# grant execute on function digoal.dy(text) to proxy; GRANT
proxy=# \c db0 digoal db0=# CREATE OR REPLACE FUNCTION digoal.dy(sql text) RETURNS SETOF record LANGUAGE plpgsql STRICT AS $function$ declare rec record; begin for rec in execute sql loop return next rec; end loop; return; end; $function$; db0=# \c db1 digoal ... db1=# \c db2 digoal ... db2=# \c db3 digoal ...
在proxy节点中就可以访问数据节点了。
例如查询这个动态SQL.
proxy=> select * from digoal.dy('select count(*) from pg_class') as t(i int8); i ----- 293 293 293 293 (4 rows) proxy=> select sum(i) from digoal.dy('select count(*) from pg_class') as t(i int8); sum ------ 1172 (1 row)
前面那个会话不要断开, 在另一个会话中观察proxy发起的连接到数据节点的连接.
postgres=# select * from pg_stat_activity where usename='digoal'; datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | waiting | state | query -------+---------+------+----------+---------+------------------+--------------+-----------------+-------------+---------------- 91246 | db0 | 8171 | 91250 | digoal | test | 172.16.3.150 | | 47937 | 2013-11-22 17:23:26 .138425+08 | | 2013-11-22 17:27:05.539286+08 | 2013-11-22 17:27:05.539745+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91247 | db1 | 8172 | 91250 | digoal | | 172.16.3.150 | | 47938 | 2013-11-22 17:23:26 .138688+08 | | 2013-11-22 17:27:05.53938+08 | 2013-11-22 17:27:05.539874+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91248 | db2 | 8173 | 91250 | digoal | | 172.16.3.150 | | 47939 | 2013-11-22 17:23:26 .138957+08 | | 2013-11-22 17:27:05.53938+08 | 2013-11-22 17:27:05.539841+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91249 | db3 | 8174 | 91250 | digoal | | 172.16.3.150 | | 47940 | 2013-11-22 17:23:26 .139178+08 | | 2013-11-22 17:27:05.539366+08 | 2013-11-22 17:27:05.539793+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) (4 rows)
再次在proxy的同一会话中查询时, 这些会话会复用, 不会断开. 前面已经讲了plproxy是使用长连接的.
如果修改了server, 那么这些连接会断开, 重新连接. 所以不需要担心修改server带来的连接cache问题.
postgres=# \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# alter server cluster_srv1 options (set p1 'dbname=db1 hostaddr=172.16.3.150 port=1921 application_name=abc'); ALTER SERVER
再次在proxy的同一会话中查询后, 我们发现4个连接都变了, 说明alter server后, 如果再次发起plproxy函数的查询请求, 那么proxy会重置连接.
proxy=> select sum(i) from digoal.dy('select count(*) from pg_class') as t(i int8); sum ------ 1172 (1 row)
在另一会话的查询结果 :
proxy=# select * from pg_stat_activity where usename='digoal'; datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | waiting | state | query -------+---------+------+----------+---------+------------------+--------------+-----------------+------------+----------------- 91246 | db0 | 8245 | 91250 | digoal | test | 172.16.3.150 | | 47941 | 2013-11-22 17:30:36 .933077+08 | | 2013-11-22 17:30:36.936784+08 | 2013-11-22 17:30:36.938837+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91248 | db2 | 8247 | 91250 | digoal | | 172.16.3.150 | | 47943 | 2013-11-22 17:30:36 .933502+08 | | 2013-11-22 17:30:36.936783+08 | 2013-11-22 17:30:36.938981+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91249 | db3 | 8248 | 91250 | digoal | | 172.16.3.150 | | 47944 | 2013-11-22 17:30:36 .933731+08 | | 2013-11-22 17:30:36.937147+08 | 2013-11-22 17:30:36.939015+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) 91247 | db1 | 8246 | 91250 | digoal | abc | 172.16.3.150 | | 47942 | 2013-11-22 17:30:36 .933288+08 | | 2013-11-22 17:30:36.93757+08 | 2013-11-22 17:30:36.939299+08 | f | idle | select i::int8 from digo al.dy($1::text) as (i int8) (4 rows)
在数据节点创建测试表.
proxy=# \c db0 digoal db0=> create table t(id int); CREATE TABLE db0=> \c db1 You are now connected to database "db1" as user "digoal". db1=> create table t(id int); CREATE TABLE db1=> \c db2 You are now connected to database "db2" as user "digoal". db2=> create table t(id int); CREATE TABLE db2=> \c db3 You are now connected to database "db3" as user "digoal". db3=> create table t(id int); CREATE TABLE
在数据节点创建插入数据的实体函数, 每个节点返回不一样的数字.
\c db0 digoal db0=> create or replace function digoal.f_test4() returns int as $$ declare begin insert into t(id) values (1); return 0; end; $$ language plpgsql strict; db1=> create or replace function digoal.f_test4() returns int as $$ declare begin insert into t(id) values (1); return 1; end; $$ language plpgsql strict; db2=> create or replace function digoal.f_test4() returns int as $$ declare begin insert into t(id) values (1); return 2; end; $$ language plpgsql strict; db3=> create or replace function digoal.f_test4() returns int as $$ declare begin insert into t(id) values (1); return 3; end; $$ language plpgsql strict;
在proxy节点创建代理函数, 并且将执行权限赋予给proxy用户.
proxy=> \c proxy postgres create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on 0; -- 在指定的数据节点上运行, 本例可以设置为0到3, 顺序和创建的server中的配置顺序一致. p0, p1, p2, p3 $$ language plproxy strict; proxy=# grant execute on function digoal.f_test4() to proxy; GRANT proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select * from digoal.f_test4(); f_test4 --------- 0 (1 row)
如果run on 的数字改成0-3以外的数字, 运行时将报错, 符合预期.
proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on 4; $$ language plproxy strict; CREATE FUNCTION proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select * from digoal.f_test4(); ERROR: PL/Proxy function digoal.f_test4(0): part number out of range
run on any表示随机的选择一个数据节点运行.
proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on any; $$ language plproxy strict; CREATE FUNCTION proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select * from digoal.f_test4(); f_test4 --------- 0 (1 row) proxy=> select * from digoal.f_test4(); f_test4 --------- 3 (1 row) proxy=> select * from digoal.f_test4(); f_test4 --------- 2 (1 row) proxy=> select * from digoal.f_test4(); f_test4 --------- 3 (1 row)
run on function() 则使用函数结果的hash值计算得到运行节点.
proxy=> create or replace function digoal.f(int) returns int as $$ select $1; $$ language sql strict; CREATE FUNCTION proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on digoal.f(10); $$ language plproxy strict; CREATE FUNCTION proxy=> select digoal.f_test4(); f_test4 --------- 2 (1 row) proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on digoal.f(11); $$ language plproxy strict; CREATE FUNCTION proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select digoal.f_test4(); f_test4 --------- 3 (1 row) proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on digoal.f(-11); $$ language plproxy strict; CREATE FUNCTION proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select digoal.f_test4(); f_test4 --------- 1 (1 row)
run on all表示所有数据节点运行. 代理函数必须使用returns setof返回.
proxy=> \c proxy postgres You are now connected to database "proxy" as user "postgres". proxy=# create or replace function digoal.f_test4() returns int as $$ cluster 'cluster_srv1'; run on all; $$ language plproxy strict; ERROR: PL/Proxy function digoal.f_test4(0): RUN ON ALL requires set-returning function proxy=# drop function digoal.f_test4(); DROP FUNCTION proxy=# create or replace function digoal.f_test4() returns setof int as $$ cluster 'cluster_srv1'; run on all; $$ language plproxy strict; CREATE FUNCTION proxy=# grant execute on function digoal.f_test4() to proxy; GRANT proxy=# \c proxy proxy You are now connected to database "proxy" as user "proxy". proxy=> select digoal.f_test4(); f_test4 --------- 0 1 2 3 (4 rows)
1. 设计时需要注意plproxy函数所在的schema尽量和数据节点上实际函数的schema一致.否则需要在plproxy函数中使用target指定 schema.functionname;
2. 数据节点的个数请保持2^n,这么做有利于后期的节点扩展, 例如2个节点扩展到4个节点时, 数据不需要发生跨节点的重分布.
例如
mod(x,2)=0 那么mod(x,4)=0或2 mod(x,2)=1 那么mod(x,4)=1或3
比较适合位运算的分布算法.
当然我们也可以使用一致性哈希的设计思路,参考《一致性哈希在分布式数据库中的应用探索》https://yq.aliyun.com/articles/57954
3. 如果业务为短连接的形式, 那么需要1层连接池, 在应用程序和plproxy数据库之间. 而不是plproxy和数据节点之间.
在应用程序和plproxy之间加连接池后, 其实对于plproxy来说就是长连接了, 所以在plproxy和数据节点之间也就不需要连接池了.
4. 长连接不需要连接池, 因为plproxy和数据节点之间的连接是长连接.
5. plproxy语法非常简洁,而且函数调用彻底避免了事务的问题 connect, cluster, run, select, split, target.
6. 关于连接密码, 出于安全考虑, 建议在任何配置中不要出现明文密码, 所以最好是plproxy服务器到数据节点是trust验证, 保护好plproxy即可.
假设plproxy在172.16.3.2上. 数据节点有4个, 库名和用户名都为digoal. 那么在4个节点上配置pg_hba.conf如下.
node0 host digoal digoal 172.16.3.2/32 trust node1 host digoal digoal 172.16.3.2/32 trust node2 host digoal digoal 172.16.3.2/32 trust node3 host digoal digoal 172.16.3.2/32 trust
7. run 详解:
run on, 是数字常量, 范围是0 到 nodes-1; 例如有4个节点 run on 0; (run on 4则报错). run on ANY, run on function(...), 这里用到的函数返回结果必须是int2, int4 或 int8. run on ALL, 这种的plproxy函数必须是returns setof..., 实体函数没有setof的要求.
8. 一个plproxy中只能出现一条connect语句, 符合预期, 否则报错.
digoal=# create or replace function f_test3() returns setof int8 as $$ connect 'hostaddr=172.16.3.150 dbname=db0 user=digoal port=1921'; connect 'hostaddr=172.16.3.150 dbname=db1 user=digoal port=1921'; select count(*) from pg_class; $$ language plproxy strict; ERROR: PL/Proxy function postgres.f_test3(0): Compile error at line 2: Only one CONNECT statement allowed
9. 不要把plproxy语言的权限赋予给普通用户, 因为开放了trust认证, 如果再开放plproxy语言的权限是非常危险的.
正确的做法是使用超级用户创建plproxy函数, 然后把函数的执行权限赋予给普通用户.
千万不要这样省事 :
update pg_language set lanpltrusted='t' where lanname='plproxy';
10. 如果有全局唯一的序列需求, 可以将序列的步调调整一下, 每个数据节点使用不同的初始值.
例如
db0=# create sequence seq1 increment by 4 start with 0; CREATE SEQUENCE db1=# create sequence seq1 increment by 4 start with 1; db2=# create sequence seq1 increment by 4 start with 2; db3=# create sequence seq1 increment by 4 start with 3;
考虑到扩容, 可以将步调调比较大, 例如1024. 那么可以容纳1024个节点.
2016 Postgres大象会官方报名通道:http://www.huodongxing.com/event/8352217821400
扫描报名