New Year's Day tech gift pack: what nuclear weapons is PostgreSQL 10.0, due in autumn 2017, already armed with?
Author: digoal
Date: 2017-01-01
Tags: PostgreSQL , 10.0 , autumn , New Year's Day , gift pack , commitfest
Background
Here is a New Year's gift pack for you this morning. One major release per year is a PostgreSQL community tradition; the release usually lands in autumn, so there is still some time to go, but I can't wait to see what nuclear weapons the 10.0 release due in autumn 2017 has already been armed with.
Rest assured, wave after wave of features and enhancements will still board the train bound for autumn 2017; what this article covers may be just one of its carriages. Fellow PGers, feeling a little excited already?
1. A dedicated dynamic shared memory area for parallel queries (speeding up support for parallel index scans and parallel foreign table scans)
Provide a DSA area for all parallel queries.
This will allow future parallel query code to dynamically allocate storage shared by all participants.
Thomas Munro, with assorted changes by me.
Parallel tuplesort (for parallel B-Tree index creation): https://commitfest.postgresql.org/12/690/
Parallel bitmap heap scan: https://commitfest.postgresql.org/12/812/
Parallel Index Scans: https://commitfest.postgresql.org/12/849/
Parallel Merge Join: https://commitfest.postgresql.org/12/918/
Parallel Append implementation: https://commitfest.postgresql.org/12/929/
parallelize queries containing subplans: https://commitfest.postgresql.org/12/941/
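The DSA area itself is internal infrastructure; the commitfest entries above are where users will eventually feel it. As a rough illustration (not taken from any of those patches), this is the shape of query that parallel index scan, parallel bitmap heap scan and friends aim to speed up, and on 9.6 you can already watch the planner go parallel with settings like these; the table and column names are made up for the demo:

```sql
-- Illustrative demo table, not from any of the patches above.
CREATE TABLE measurements (id bigserial PRIMARY KEY, sensor int, val numeric);
INSERT INTO measurements (sensor, val)
SELECT (random() * 100)::int, random() FROM generate_series(1, 1000000);

SET max_parallel_workers_per_gather = 4;   -- allow parallel plans in this session
EXPLAIN (ANALYZE)                          -- look for Gather / Parallel nodes
SELECT sensor, count(*), avg(val)
FROM measurements
GROUP BY sensor;
```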
2. Synchronous multi-replica replication adds two freely selectable modes: "any nodes" (ANY, quorum) and "in order" (FIRST, priority)
Support quorum-based synchronous replication.

This feature is also known as "quorum commit" especially in discussion on pgsql-hackers.

This commit adds the following new syntaxes into synchronous_standby_names GUC. By using FIRST and ANY keywords, users can specify the method to choose synchronous standbys from the listed servers.

FIRST num_sync (standby_name [, ...])
ANY num_sync (standby_name [, ...])

The keyword FIRST specifies a priority-based synchronous replication which was available also in 9.6 or before. This method makes transaction commits wait until their WAL records are replicated to num_sync synchronous standbys chosen based on their priorities.

The keyword ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to *at least* num_sync listed standbys. In this method, the values of sync_state.pg_stat_replication for the listed standbys are reported as "quorum". The priority is still assigned to each standby, but not used in this method.

The existing syntaxes having neither FIRST nor ANY keyword are still supported. They are the same as new syntax with FIRST keyword, i.e., a priority-based synchronous replication.

Author: Masahiko Sawada
Reviewed-By: Michael Paquier, Amit Kapila and me
Discussion:

Many thanks to the various individuals who were involved in discussing and developing this feature.
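A minimal sketch of the two new syntaxes, assuming three standbys registered as s1, s2 and s3 (the names are illustrative):

```sql
-- Quorum commit: wait until any 2 of the 3 listed standbys have the WAL.
ALTER SYSTEM SET synchronous_standby_names = 'ANY 2 (s1, s2, s3)';

-- Priority order: wait for the 2 highest-priority available standbys;
-- equivalent to the pre-10.0 syntax '2 (s1, s2, s3)'.
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 2 (s1, s2, s3)';

SELECT pg_reload_conf();  -- the parameter can be changed with a reload
SELECT application_name, sync_state FROM pg_stat_replication;
```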
3. Session-level temporary replication slots
(Replication slots were originally designed for long-lived use, such as standbys. For testing and one-off replication scenarios, you no longer have to worry about the fallout of forgetting to drop a slot.)
Add support for temporary replication slots

This allows creating temporary replication slots that are removed automatically at the end of the session or on error.

From: Petr Jelinek
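A sketch of creating a session-scoped slot through the SQL interface. The third, temporary argument to pg_create_physical_replication_slot() and the temporary column in pg_replication_slots reflect my reading of the patch; the exact SQL surface may differ in the final release:

```sql
-- Third argument marks the slot as temporary: it is dropped automatically
-- when this session ends or errors out, so no orphaned slot retains WAL.
SELECT pg_create_physical_replication_slot('scratch_slot', true, true);

SELECT slot_name, temporary, active FROM pg_replication_slots;
```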
4. A more secure password authentication protocol (SCRAM)
Refactor the code for verifying user's password.

Split md5_crypt_verify() into three functions:
* get_role_password() to fetch user's password from pg_authid, and check its expiration.
* md5_crypt_verify() to check an MD5 authentication challenge
* plain_crypt_verify() to check a plaintext password.

get_role_password() will be needed as a separate function by the upcoming SCRAM authentication patch set. Most of the remaining functionality in md5_crypt_verify() was different for MD5 and plaintext authentication, so split that for readability.

While we're at it, simplify the *_crypt_verify functions by using stack-allocated buffers to hold the temporary MD5 hashes, instead of pallocing.

Reviewed by Michael Paquier.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=babe05bc2b781eb3eb84a18d7010d08277e2e399
https://postgrespro.com/docs/postgresproee/9.6/auth-methods.html#auth-password
https://commitfest.postgresql.org/12/829/
Turn password_encryption GUC into an enum.
This makes the parameter easier to extend, to support other password-based
authentication protocols than MD5. (SCRAM is being worked on.)
The GUC still accepts on/off as aliases for "md5" and "plain", although
we may want to remove those once we actually add support for another
password hash type.
Michael Paquier, reviewed by David Steele, with some further edits by me.
Discussion:
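With the GUC now an enum, invalid values are rejected outright, while the on/off aliases mentioned above keep working for now. A quick check on a build carrying this patch might look like this (purely illustrative):

```sql
SET password_encryption = 'md5';   -- explicit enum value
SET password_encryption = on;      -- boolean alias, still accepted as 'md5'
SHOW password_encryption;
```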
21.3.2. Password Authentication
The password-based authentication methods are md5, scram, and password. These methods operate similarly except
for the way that the password is sent across the connection, namely MD5-hashed, SCRAM-SHA-256 and clear-text
respectively.
If you are at all concerned about password “sniffing” attacks then md5 is preferred. Plain password should always
be avoided if possible. However, md5 cannot be used with the db_user_namespace feature. If the connection is
protected by SSL encryption then password can be used safely (though SSL certificate authentication might be a
better choice if one is depending on using SSL).
scram has more advantages than md5 as it protects from cases where the hashed password is taken directly from
pg_authid in which case a connection using only the stolen hash is possible without knowing the password behind it.
It protects as well from password interception and data sniffing where the password data could be directly obtained
from the network as well as man-in-the-middle (MITM) attacks. So it is strongly encouraged to use it over md5 for password-based deployments.
Postgres Pro Enterprise database passwords are separate from operating system user passwords. The password for each
database user is stored in the pg_authid system catalog. Passwords can be managed with the SQL commands CREATE USER
and ALTER ROLE, e.g., CREATE USER foo WITH PASSWORD 'secret'. If no password has been set up for a user, the stored
password is null and password authentication will always fail for that user.
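To make the excerpt concrete, here is a minimal sketch of moving a role onto the stronger hash once SCRAM lands. The enum value 'scram-sha-256' is an assumption based on the in-progress patches (it may end up simply as 'scram'); everything else is stock SQL:

```sql
-- Assumed value name for the SCRAM patch set; may differ in the final release.
SET password_encryption = 'scram-sha-256';

-- Setting (or re-setting) a password hashes it under the current password_encryption.
CREATE USER app_user WITH PASSWORD 'secret';

-- What is stored is the hash, never the clear-text password.
SELECT rolname, rolpassword FROM pg_authid WHERE rolname = 'app_user';
```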
5. Support for SCRAM-SHA-256
6. Partitioning and multi-level partitioning support
Previously, we could already get high-performance multi-level partitioning from the pg_pathman extension:
《PostgreSQL 9.5+ 高效分区表实现 - pg_pathman》
《PostgreSQL 9.6 sharding based on FDW & pg_pathman》
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some
important differences.
The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints,
since those make no sense for a relation with no data of its own. The children are called partitions and contain
all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed,
and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless
the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing
ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross
partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column,
but range partitioning can involve multiple columns. A partitioning "column" can be an expression.
Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about
properties of partitions, and therefore that this will serve as a better foundation for a variety of possible
optimizations, including query planner optimizations. The tuple routing which this patch does based on the
implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations
are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker,
Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
Discussion: https://www.postgresql.org/message-id/3029e460-d47c-710e-507e-d8ba759d7cbb@iki.fi
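A minimal sketch of the declarative syntax described in the commit message; table and partition names are illustrative, and details could still shift before the 10.0 release:

```sql
-- The parent is a partitioned table and stores no rows of its own.
CREATE TABLE orders (
    order_id   bigint  NOT NULL,
    order_date date    NOT NULL,
    amount     numeric
) PARTITION BY RANGE (order_date);

-- Each partition declares its bounds; rows inserted into "orders"
-- are routed to the matching partition automatically.
CREATE TABLE orders_2017q1 PARTITION OF orders
    FOR VALUES FROM ('2017-01-01') TO ('2017-04-01');
CREATE TABLE orders_2017q2 PARTITION OF orders
    FOR VALUES FROM ('2017-04-01') TO ('2017-07-01');

INSERT INTO orders VALUES (1, '2017-02-14', 99.50);  -- lands in orders_2017q1
```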
7. Less context switching in aggregate computation (one projection for all aggregate arguments)
Perform one only projection to compute agg arguments.
Previously we did a ExecProject() for each individual aggregate argument. That turned out to be a performance
bottleneck in queries with multiple aggregates.
Doing all the argument computations in one ExecProject() is quite a bit cheaper because ExecProject's fastpath
can do the work at once in a relatively tight loop, and because it can get all the required columns with a single
slot_getsomeattrs and save some other redundant setup costs.
Author: Andres Freund
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/20161103110721.h5i5t5saxfk5eeik@alap3.anarazel.de
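For context, this is the kind of statement the change targets: several aggregates over the same input rows (reusing the illustrative orders table from the partitioning sketch above). Previously each aggregate argument was projected separately; now a single ExecProject() covers them all:

```sql
-- Multiple aggregates over the same rows benefit from the single projection.
SELECT count(*),
       sum(amount),
       avg(amount),
       min(order_date),
       max(order_date)
FROM orders;
```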
8. Hash index improvements
Improve hash index bucket split behavior.
Previously, the right to split a bucket was represented by a heavyweight lock on the page number of the primary
bucket page. Unfortunately, this meant that every scan needed to take a heavyweight lock on that bucket also,
which was bad for concurrency. Instead, use a cleanup lock on the primary bucket page to indicate the right to
begin a split, so that scans only need to retain a pin on that page, which they would have to acquire anyway,
and which is also much cheaper.
In addition to reducing the locking cost, this also avoids locking out scans and inserts for the entire lifetime
of the split: while the new bucket is being populated with copies of the appropriate tuples from the old bucket,
scans and inserts can happen in parallel. There are minor concurrency improvements for vacuum operations as well,
though the situation there is still far from ideal.
This patch also removes the unworldly assumption that a split will never be interrupted. With the new code, a split
is done in a series of small steps and the system can pick up where it left off if it is interrupted prior to completion.
While this patch does not itself add write-ahead logging for hash indexes, it is clearly a necessary first step, since
one of the things that could interrupt a split is the removal of electrical power from the machine performing it.
Amit Kapila. I wrote the original design on which this patch is based, and did a good bit of work on the comments and
README through multiple rounds of review, but all of the code is Amit's.
Also reviewed by Jesper Pedersen, Jeff Janes, and others.
Discussion: http://postgr.es/m/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
Improve handling of dead tuples in hash indexes.
When squeezing a bucket during vacuum, it's not necessary to retain any tuples already marked as dead, so ignore
them when deciding which tuples must be moved in order to empty a bucket page. Similarly, when splitting a bucket,
relocating dead tuples to the new bucket is a waste of effort; instead, just ignore them.
Amit Kapila, reviewed by me. Testing help provided by Ashutosh Sharma.
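Hash indexes themselves are not new; the sketch below (with a made-up table) just shows the index type these patches improve. Keep in mind that, as the first commit message notes, WAL logging for hash indexes is still to come:

```sql
-- Illustrative table; equality-only lookups are the sweet spot for hash indexes.
CREATE TABLE sessions (token text, user_id int, started timestamptz DEFAULT now());
CREATE INDEX sessions_token_hash ON sessions USING hash (token);

EXPLAIN SELECT * FROM sessions WHERE token = 'abc123';  -- equality lookup, the only case a hash index can serve
```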
9. Process-level condition variables to simplify sleep/wakeup design
Support condition variables.
Condition variables provide a flexible way to sleep until a cooperating process causes an arbitrary condition to
become true. In simple cases, this can be accomplished with a WaitLatch/ResetLatch loop; the cooperating process
can call SetLatch after performing work that might cause the condition to be satisfied, and the waiting process
can recheck the condition each time. However, if the process performing the work doesn't have an easy way to
identify which processes might be waiting, this doesn't work, because it can't identify which latches to set.
Condition variables solve that problem by internally maintaining a list of waiters; a process that may have
caused some waiter's condition to be satisfied must "signal" or "broadcast" on the condition variable.
Robert Haas and Thomas Munro
10. Aggregate push-down to sharding nodes (postgres_fdw enhancement)
postgres_fdw: Push down aggregates to remote servers.
Now that the upper planner uses paths, and now that we have proper hooks to inject paths into the upper planning process,
it's possible for foreign data wrappers to arrange to push aggregates to the remote side instead of fetching all of the
rows and aggregating them locally. This figures to be a massive win for performance, so teach postgres_fdw to do it.
Jeevan Chalke and Ashutosh Bapat. Reviewed by Ashutosh Bapat with additional testing by Prabhat Sahu. Various mostly
cosmetic changes by me.
Other entries queued in the same commitfest:
Push down more full joins in postgres_fdw
https://commitfest.postgresql.org/12/727/
https://commitfest.postgresql.org/12/853/
https://commitfest.postgresql.org/12/915/
https://commitfest.postgresql.org/12/850/
https://commitfest.postgresql.org/12/836/
11. Streaming base backups can back up data files and REDO (WAL) at the same time
Allow pg_basebackup to stream transaction log in tar mode
This will write the received transaction log into a file called pg_wal.tar(.gz) next to the other tarfiles instead of
writing it to base.tar. When using fetch mode, the transaction log is still written to base.tar like before, and when
used against a pre-10 server, the file is named pg_xlog.tar.
To do this, implement a new concept of a "walmethod", which is responsible for writing the WAL. Two implementations
exist, one that writes to a plain directory (which is also used by pg_receivexlog) and one that writes to a tar file
with optional compression.
Reviewed by Michael Paquier
12. Distributed transactions
13. Twophase transactions on slave, take 2
14. Scan key push down to heap
15. Indirect indexes
indirect indexes: https://commitfest.postgresql.org/12/874/
16. Logical Replication
This has in fact been possible since 9.4 through external plugins built on logical decoding and streaming, but it was never integrated into core. 10.0 is worth looking forward to; let's wait for the wind together in autumn 2017.