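1. Introduction
As databases keep growing, managing their data gets harder: when you aggregate by time or by region, for example, the sheer volume of rows becomes unwieldy. Many commercial databases offer partitioning, which stores data along different dimensions to make later management easier, and PostgreSQL is no exception.
In PostgreSQL, partitioning means splitting what is logically one large table into several smaller physical pieces. Partitioning can speed up access, but more importantly it makes administration and maintenance easier.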
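The specific benefits of partitioning are:
- Query performance can improve dramatically for certain kinds of queries.
- Update performance can improve as well, because the indexes on each partition are smaller than an index over the whole data set. When an index no longer fits entirely in memory, both reads and writes on it cause more disk access.
- Bulk deletes can be done by simply dropping a partition.
- Seldom-used data can be moved to cheaper, slower storage media.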
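In PG, table partitioning is implemented through table inheritance. The usual approach is to create an empty master table and have every partition inherit from it. The master table should be kept empty at all times.
Partitioning a small table is not worthwhile, so how large should a table be before you consider it? The official PostgreSQL documentation suggests considering partitioning once the table is larger than the server's physical memory ("the size of the table should exceed the physical memory of the database server").
As of this writing (9.2.2), PG supports only range partitioning and list partitioning; hash partitioning is not yet supported.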
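The rest of this article uses range partitioning on a date column. For comparison, list partitioning under the same inheritance scheme might look like the following minimal sketch. The table names tbl_by_dept, tbl_by_dept_ts and tbl_by_dept_sd_bmc are hypothetical and are not used elsewhere in this article.

```sql
-- Hypothetical master table, kept empty (illustration only).
CREATE TABLE tbl_by_dept (
    id   integer,
    name varchar(20),
    dept char(4)
);

-- Each child table accepts an explicit list of key values.
CREATE TABLE tbl_by_dept_ts (
    CHECK ( dept = 'TS' )
) INHERITS (tbl_by_dept);

CREATE TABLE tbl_by_dept_sd_bmc (
    CHECK ( dept IN ('SD', 'BMC') )
) INHERITS (tbl_by_dept);
```

The only difference from range partitioning is the shape of the CHECK constraint: a list of values instead of a lower and upper bound.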
2. Environment
Operating system: CentOS release 6.3 (Final)
PostgreSQL version: PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
3. Implementing partitioning
3.1 Create the master table
david=# create table tbl_partition (
david(#   id integer,
david(#   name varchar(20),
david(#   gender boolean,
david(#   join_date date,
david(#   dept char(4));
CREATE TABLE
3.2 Create the partition tables
david=# create table tbl_partition_201211 ( check ( join_date >= DATE '2012-11-01' AND join_date < DATE '2012-12-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201212 ( check ( join_date >= DATE '2012-12-01' AND join_date < DATE '2013-01-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201301 ( check ( join_date >= DATE '2013-01-01' AND join_date < DATE '2013-02-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201302 ( check ( join_date >= DATE '2013-02-01' AND join_date < DATE '2013-03-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201303 ( check ( join_date >= DATE '2013-03-01' AND join_date < DATE '2013-04-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201304 ( check ( join_date >= DATE '2013-04-01' AND join_date < DATE '2013-05-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
david=# create table tbl_partition_201305 ( check ( join_date >= DATE '2013-05-01' AND join_date < DATE '2013-06-01' ) ) INHERITS (tbl_partition);
CREATE TABLE
3.3 Create indexes on the partition key
david=# create index tbl_partition_201211_joindate on tbl_partition_201211 (join_date);
CREATE INDEX
david=# create index tbl_partition_201212_joindate on tbl_partition_201212 (join_date);
CREATE INDEX
david=# create index tbl_partition_201301_joindate on tbl_partition_201301 (join_date);
CREATE INDEX
david=# create index tbl_partition_201302_joindate on tbl_partition_201302 (join_date);
CREATE INDEX
david=# create index tbl_partition_201303_joindate on tbl_partition_201303 (join_date);
CREATE INDEX
david=# create index tbl_partition_201304_joindate on tbl_partition_201304 (join_date);
CREATE INDEX
david=# create index tbl_partition_201305_joindate on tbl_partition_201305 (join_date);
CREATE INDEX
Developers want the database to be transparent: they simply run insert into tbl_partition and expect the DB to decide which partition each row goes to. Oracle handles this natively, but PG (as of 9.2) does not, so some manual setup is needed up front.
3.4 Create the trigger function
david=# CREATE OR REPLACE FUNCTION tbl_partition_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.join_date >= DATE '2012-11-01' AND NEW.join_date < DATE '2012-12-01' ) THEN
        INSERT INTO tbl_partition_201211 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2012-12-01' AND NEW.join_date < DATE '2013-01-01' ) THEN
        INSERT INTO tbl_partition_201212 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-01-01' AND NEW.join_date < DATE '2013-02-01' ) THEN
        INSERT INTO tbl_partition_201301 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-02-01' AND NEW.join_date < DATE '2013-03-01' ) THEN
        INSERT INTO tbl_partition_201302 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-03-01' AND NEW.join_date < DATE '2013-04-01' ) THEN
        INSERT INTO tbl_partition_201303 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-04-01' AND NEW.join_date < DATE '2013-05-01' ) THEN
        INSERT INTO tbl_partition_201304 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-05-01' AND NEW.join_date < DATE '2013-06-01' ) THEN
        INSERT INTO tbl_partition_201305 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range. Fix the tbl_partition_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE FUNCTION
Note: if you do not want to lose data, you can change the ELSE branch above to INSERT INTO tbl_partition_error_join_date VALUES (NEW.*); You also need to create a table tbl_partition_error_join_date with the same structure as tbl_partition. Rows with an out-of-range join_date are then inserted into that table instead of raising an error.
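The catch-all variant described in the note could be sketched as follows. The table name tbl_partition_error_join_date comes from the note; creating it with LIKE is an assumption made here to copy the column layout, and the table is deliberately left outside the inheritance tree, so its rows do not show up when querying tbl_partition.

```sql
-- Standalone table with the same columns as tbl_partition.
CREATE TABLE tbl_partition_error_join_date (LIKE tbl_partition INCLUDING DEFAULTS);

-- Same routing as in 3.4, except for the ELSE branch.
CREATE OR REPLACE FUNCTION tbl_partition_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.join_date >= DATE '2012-11-01' AND NEW.join_date < DATE '2012-12-01' ) THEN
        INSERT INTO tbl_partition_201211 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2012-12-01' AND NEW.join_date < DATE '2013-01-01' ) THEN
        INSERT INTO tbl_partition_201212 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-01-01' AND NEW.join_date < DATE '2013-02-01' ) THEN
        INSERT INTO tbl_partition_201301 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-02-01' AND NEW.join_date < DATE '2013-03-01' ) THEN
        INSERT INTO tbl_partition_201302 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-03-01' AND NEW.join_date < DATE '2013-04-01' ) THEN
        INSERT INTO tbl_partition_201303 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-04-01' AND NEW.join_date < DATE '2013-05-01' ) THEN
        INSERT INTO tbl_partition_201304 VALUES (NEW.*);
    ELSIF ( NEW.join_date >= DATE '2013-05-01' AND NEW.join_date < DATE '2013-06-01' ) THEN
        INSERT INTO tbl_partition_201305 VALUES (NEW.*);
    ELSE
        -- Out-of-range (or NULL) join_date: keep the row instead of raising an error.
        INSERT INTO tbl_partition_error_join_date VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- the row is never stored in the master table itself
END;
$$
LANGUAGE plpgsql;
```

Because a NULL join_date satisfies none of the range conditions, such rows also end up in the catch-all table.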
3.5 Create the trigger
david=# CREATE TRIGGER insert_tbl_partition_trigger
david-#   BEFORE INSERT ON tbl_partition
david-#   FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger();
CREATE TRIGGER
4. Inspecting the tables
4.1 List all tables
david=# \dt
              List of relations
 Schema |         Name         | Type  |  Owner
--------+----------------------+-------+----------
 public | tbl_partition        | table | postgres
 public | tbl_partition_201211 | table | postgres
 public | tbl_partition_201212 | table | postgres
 public | tbl_partition_201301 | table | postgres
 public | tbl_partition_201302 | table | postgres
 public | tbl_partition_201303 | table | postgres
 public | tbl_partition_201304 | table | postgres
 public | tbl_partition_201305 | table | postgres
(8 rows)
4.2 Describe the master table
david=# \d tbl_partition
         Table "public.tbl_partition"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 id        | integer               |
 name      | character varying(20) |
 gender    | boolean               |
 join_date | date                  |
 dept      | character(4)          |
Triggers:
    insert_tbl_partition_trigger BEFORE INSERT ON tbl_partition FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger()
Number of child tables: 7 (Use \d+ to list them.)
4.3 Describe a partition table
david=# \d tbl_partition_201304
      Table "public.tbl_partition_201304"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 id        | integer               |
 name      | character varying(20) |
 gender    | boolean               |
 join_date | date                  |
 dept      | character(4)          |
Indexes:
    "tbl_partition_201304_joindate" btree (join_date)
Check constraints:
    "tbl_partition_201304_join_date_check" CHECK (join_date >= '2013-04-01'::date AND join_date < '2013-05-01'::date)
Inherits: tbl_partition
5. Testing
5.1 Insert data
david=# insert into tbl_partition values (1, 'David', '1', '2013-01-10', 'TS');
INSERT 0 0
david=# insert into tbl_partition values (2, 'Sandy', '0', '2013-02-10', 'TS');
INSERT 0 0
david=# insert into tbl_partition values (3, 'Eagle', '1', '2012-11-01', 'TS');
INSERT 0 0
david=# insert into tbl_partition values (4, 'Miles', '1', '2012-12-15', 'SD');
INSERT 0 0
david=# insert into tbl_partition values (5, 'Simon', '1', '2012-12-10', 'SD');
INSERT 0 0
david=# insert into tbl_partition values (6, 'Rock', '1', '2012-11-10', 'SD');
INSERT 0 0
david=# insert into tbl_partition values (7, 'Peter', '1', '2013-01-11', 'SD');
INSERT 0 0
david=# insert into tbl_partition values (8, 'Sally', '0', '2013-03-10', 'BCSC');
INSERT 0 0
david=# insert into tbl_partition values (9, 'Carrie', '0', '2013-04-02', 'BCSC');
INSERT 0 0
david=# insert into tbl_partition values (10, 'Lee', '1', '2013-01-05', 'BMC');
INSERT 0 0
david=# insert into tbl_partition values (11, 'Nicole', '0', '2012-11-10', 'PROJ');
INSERT 0 0
david=# insert into tbl_partition values (12, 'Renee', '0', '2013-01-10', 'TS');
INSERT 0 0
5.2 Query the master table
david=# select * from tbl_partition;
 id |  name  | gender | join_date  | dept
----+--------+--------+------------+------
  3 | Eagle  | t      | 2012-11-01 | TS
  6 | Rock   | t      | 2012-11-10 | SD
 11 | Nicole | f      | 2012-11-10 | PROJ
  4 | Miles  | t      | 2012-12-15 | SD
  5 | Simon  | t      | 2012-12-10 | SD
  1 | David  | t      | 2013-01-10 | TS
  7 | Peter  | t      | 2013-01-11 | SD
 10 | Lee    | t      | 2013-01-05 | BMC
 12 | Renee  | f      | 2013-01-10 | TS
  2 | Sandy  | f      | 2013-02-10 | TS
  8 | Sally  | f      | 2013-03-10 | BCSC
  9 | Carrie | f      | 2013-04-02 | BCSC
(12 rows)
5.3 Query a partition table
david=# select * from tbl_partition_201301;
 id | name  | gender | join_date  | dept
----+-------+--------+------------+------
  1 | David | t      | 2013-01-10 | TS
  7 | Peter | t      | 2013-01-11 | SD
 10 | Lee   | t      | 2013-01-05 | BMC
 12 | Renee | f      | 2013-01-10 | TS
(4 rows)
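To confirm where each row physically landed without querying the partitions one by one, one option is to group by the system column tableoid, and to use ONLY to check that the master table itself is still empty. This is a small sketch against the tables created above; the column aliases are arbitrary.

```sql
-- Row count per physical table; tableoid::regclass renders the table name.
SELECT tableoid::regclass AS stored_in, count(*) AS row_count
FROM tbl_partition
GROUP BY tableoid
ORDER BY tableoid::regclass::text;

-- ONLY restricts the scan to the master table, excluding all children.
SELECT count(*) AS rows_in_master
FROM ONLY tbl_partition;
```

If the trigger routes every row correctly, the second query should report zero rows.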
6. Managing partitions
6.1 Removing data or partitions
With partitioning in place, removing old data that is no longer needed becomes easy. The simplest way is:
david=# drop table tbl_partition_201304;
This removes a large amount of data quickly, without deleting rows one by one.
Another recommended approach is to detach the partition from the partitioned table while keeping its data accessible:
david=# alter table tbl_partition_201304 no inherit tbl_partition;
ALTER TABLE
Compared with dropping the table outright, this only detaches the child table from its master table. The data stored in the child table remains accessible, because the table has been turned back into an ordinary table. The DBA can then perform whatever maintenance is needed on it, such as data cleanup or archiving. Once those routine tasks are done, the DBA can decide whether to drop the table (DROP TABLE), or to empty it first (TRUNCATE TABLE) and then have it inherit from the master table again:
david=# alter table tbl_partition_201304 inherit tbl_partition;
ALTER TABLE
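Put together, the detach-then-archive workflow described in this section could be sketched as follows. The export path /tmp/tbl_partition_201304.csv is only an assumption for illustration, and a server-side COPY to a file requires superuser rights.

```sql
-- 1. Detach the partition; queries on tbl_partition no longer see its rows.
ALTER TABLE tbl_partition_201304 NO INHERIT tbl_partition;

-- 2. Archive the detached data (hypothetical target path).
COPY tbl_partition_201304 TO '/tmp/tbl_partition_201304.csv' WITH (FORMAT csv);

-- 3. Either drop the table for good ...
DROP TABLE tbl_partition_201304;

-- ... or, instead of step 3, empty it and attach it again:
-- TRUNCATE TABLE tbl_partition_201304;
-- ALTER TABLE tbl_partition_201304 INHERIT tbl_partition;
```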
6.2 Adding partitions
A new partition can be added the same way as before:
david=# create table tbl_partition_201306 ( check ( join_date >= DATE '2013-06-01' AND join_date < DATE '2013-07-01' ) ) INHERITS (tbl_partition);
david=# create index tbl_partition_201306_joindate on tbl_partition_201306 (join_date);
The trigger function must also be updated so that its insert conditions cover the new partition.
Note: when creating the trigger function, it is best to write the insert conditions well into the future, for example ten years ahead. That way, adding a new partition later does not require recreating the trigger function, and some avoidable mistakes are prevented.
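An alternative to listing many future months by hand is to compute the target table name from join_date at insert time. The following is a minimal sketch of that idea, not the method used in this article. It assumes every partition follows the tbl_partition_YYYYMM naming used here and that the target partition already exists when the row arrives; otherwise the insert fails with an error.

```sql
-- Sketch: route each row by building the partition name dynamically.
-- Replaces the static IF/ELSIF chain, so the existing trigger keeps working.
CREATE OR REPLACE FUNCTION tbl_partition_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    -- %I quotes the computed name as an identifier;
    -- ($1).* expands the NEW row passed in through USING.
    EXECUTE format('INSERT INTO %I SELECT ($1).*',
                   'tbl_partition_' || to_char(NEW.join_date, 'YYYYMM'))
    USING NEW;
    RETURN NULL;  -- nothing is stored in the master table
END;
$$
LANGUAGE plpgsql;
```

The trade-off is that the dynamic statement is prepared on every call, so it may be slower per row than the static conditions, but adding a month then only requires creating the new table and its index.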
New partitions can also be added as follows:
david=# create table tbl_partition_201307
david-#   ( LIKE tbl_partition INCLUDING DEFAULTS INCLUDING CONSTRAINTS );
CREATE TABLE
david=# alter table tbl_partition_201307 add constraint tbl_partition_201307_join_date_check
david-#   check ( join_date >= DATE '2013-07-01' AND join_date < DATE '2013-08-01' );
ALTER TABLE
david=# create index tbl_partition_201307_joindate on tbl_partition_201307 (join_date);
david=# copy tbl_partition_201307 from '/tmp/tbl_partition_201307.sql'; -- load data from a file; the data can be prepared in advance
david=# alter table tbl_partition_201307 inherit tbl_partition;
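7. Constraint exclusion
Constraint exclusion is a query optimization technique that improves performance for partitioned tables defined in the way described above.
Make sure the constraint_exclusion parameter in postgresql.conf is not set to off; with it off, queries are not optimized as intended. In 9.2 the default value is partition, which already applies constraint exclusion to inheritance child tables, so the setting only needs attention if it has been turned off explicitly.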
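A quick way to check and adjust the setting for the current session is sketched below. Changing it permanently is done in postgresql.conf, as noted in the comment.

```sql
-- Show the value in effect for this session (off, on, or partition).
SHOW constraint_exclusion;

-- Enable exclusion for inheritance children only; lasts until the session ends.
SET constraint_exclusion = partition;

-- Permanent equivalent in postgresql.conf, followed by a configuration reload:
--   constraint_exclusion = partition
```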
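Without constraint exclusion, a query scans every partition of tbl_partition. With it enabled, the planner examines the constraints of each partition and tries to prove that the partition need not be scanned because it cannot contain any rows satisfying the WHERE clause. If the planner can prove this, it excludes the partition from the query plan.
The EXPLAIN command shows how a plan differs with constraint_exclusion turned off and on: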
7.1 Constraint exclusion off
david=# set constraint_exclusion = off;
SET
david=# explain select count(*) from tbl_partition where join_date >= DATE '2013-04-01';
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=172.80..172.81 rows=1 width=0)
   ->  Append  (cost=0.00..167.62 rows=2071 width=0)
         ->  Seq Scan on tbl_partition  (cost=0.00..0.00 rows=1 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201211 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201212 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201301 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201302 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201303 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201305 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201304 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201306 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201307 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
(22 rows)
As the plan above shows, PostgreSQL scans every partition. Now let's look at the plan with constraint exclusion turned on:
7.2 Constraint exclusion on
david=# set constraint_exclusion = on;
SET
david=# explain select count(*) from tbl_partition where join_date >= DATE '2013-04-01';
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=76.80..76.81 rows=1 width=0)
   ->  Append  (cost=0.00..74.50 rows=921 width=0)
         ->  Seq Scan on tbl_partition  (cost=0.00..0.00 rows=1 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201305 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201304 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201306 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
         ->  Seq Scan on tbl_partition_201307 tbl_partition  (cost=0.00..18.62 rows=230 width=0)
               Filter: (join_date >= '2013-04-01'::date)
(12 rows)
As you can see, PostgreSQL scans only the master table and the partitions from April 2013 onward.
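Constraint exclusion is decided at planning time, so it relies on the WHERE clause comparing the partition key with constants. As a rough illustration, the two plans requested below can be compared; the second query is an assumption added here and is not part of the original test.

```sql
SET constraint_exclusion = partition;

-- Constant bound: the planner can compare it with each CHECK constraint.
EXPLAIN SELECT count(*) FROM tbl_partition
WHERE join_date >= DATE '2013-04-01';

-- CURRENT_DATE is not a plan-time constant, so in 9.2 the planner
-- should be unable to exclude any partition for this query.
EXPLAIN SELECT count(*) FROM tbl_partition
WHERE join_date >= CURRENT_DATE - 30;
```

When a query needs a date relative to today, computing the date in the application and passing it as a literal is one way to keep exclusion effective.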
8. An alternative partitioning method
Instead of a trigger, partitioning can also be implemented with rules.
CREATE RULE insert_tbl_partition_201211 AS
ON INSERT TO tbl_partition
WHERE ( join_date >= DATE '2012-11-01' AND join_date < DATE '2012-12-01' )
DO INSTEAD
    INSERT INTO tbl_partition_201211 VALUES (NEW.*);
CREATE RULE insert_tbl_partition_201212 AS
ON INSERT TO tbl_partition
WHERE ( join_date >= DATE '2012-12-01' AND join_date < DATE '2013-01-01' )
DO INSTEAD
    INSERT INTO tbl_partition_201212 VALUES (NEW.*);
...
CREATE RULE insert_tbl_partition_201306 AS
ON INSERT TO tbl_partition
WHERE ( join_date >= DATE '2013-06-01' AND join_date < DATE '2013-07-01' )
DO INSTEAD
    INSERT INTO tbl_partition_201306 VALUES (NEW.*);
CREATE RULE insert_tbl_partition_201307 AS
ON INSERT TO tbl_partition
WHERE ( join_date >= DATE '2013-07-01' AND join_date < DATE '2013-08-01' )
DO INSTEAD
    INSERT INTO tbl_partition_201307 VALUES (NEW.*);
CREATE RULE insert_tbl_partition_error_join_date AS
ON INSERT TO tbl_partition
WHERE ( join_date >= DATE '2013-08-01' OR join_date < DATE '2012-11-01' )
DO INSTEAD
    INSERT INTO tbl_partition_error_join_date VALUES (NEW.*);
9. Caveats
VACUUM or ANALYZE tbl_partition only processes the master table. To vacuum or analyze the partitions, run the command on each partition table individually.
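Rather than typing one command per partition, the statements can be generated from the pg_inherits system catalog. This is a sketch, assuming all partitions are direct children of tbl_partition; the output lines are meant to be copied or saved and then run as a script.

```sql
-- Emit one maintenance command per child table of tbl_partition.
SELECT format('VACUUM ANALYZE %s;', inhrelid::regclass) AS command
FROM pg_inherits
WHERE inhparent = 'tbl_partition'::regclass
ORDER BY inhrelid::regclass::text;
```

The statements are generated rather than executed in a loop because VACUUM cannot run inside a function or transaction block.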
10. References
- Official PostgreSQL documentation: http://www.postgresql.org/docs/9.2/static/ddl-partitioning.html
- ITEYE: http://diegoball.iteye.com/blog/713826
- kenyon(君羊): http://my.oschina.net/Kenyon/blog/59455