OmniTI, Inc
- Full-stack support for high-traffic websites & applications
- Millions of users
- Terabytes of data
- Gilt, Etsy, Ora.TV, Freelotto
- Surge Conference
- We're hiring!
PG Extractor
- Overcomes pg_dump/pg_restore limitations (-t, -n, -P)
- Filter by schema, table, view, function, type, owner
- Dumps each database object to its own file
- Includes all privileges (even for functions) in each file
- Regex matching
- Full Python class (requires Python 3)
- pg_dump only provides explicit name filters for tables & schemas, and it dumps everything into a single file.
- pg_restore has some additional filtering options, but requires you to first do a binary dump with pg_dump and then restore from that.
- Only dumps ACLs and comments for tables.
- Started as a Perl script that used SQL functions to dump and rebuild the schema from the system tables. 9.0 broke that real quick.
- Rewrote it to use only pg_dump/pg_restore and the magic of Perl to format things from there. Been using Python more lately, so rewrote it in Python for 2.0. Stable release this week!
- The Python version is a full class with public methods for getting the pg_restore -l listing as a dictionary/JSON-formatted object list.
- Requires Python 3. Text handling is more consistent in Python 3, and I'm trying to help encourage the move from 2 to 3. All major distros have Python 3 available (even OmniOS), so use it!
- - Object filter can also be a separate file that lists the things you want to include or exclude.
Demo database in my work vm
python3 pg_extractor.py --help (show all options)
python3 pg_extractor.py -d demo --getall --keep_dump
python3 pg_extractor.py -d demo --table_exclude_file=exclude_list --gettables --getdata
Python 3.3.1 (default, Sep 25 2013, 19:29:01)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pg_extractor import PGExtractor
>>> p = PGExtractor()
>>> object_list = p.build_main_object_list("demo/dump/pg_extractor_dump.pgr")
>>> table_object_list = p.build_type_object_list(object_list, ['TABLE'])
>>> for o in object_list:
...     print(o)
>>> for t in table_object_list:
...     print(t)
Do command from README for existing dump
Extensions
- Introduced in 9.1
- Logically grouped set of database objects
- CREATE EXTENSION pg_partman;
- Versioned
- ALTER EXTENSION pg_partman UPDATE TO '1.8.0';
- Update and revert changes predictably
- Ensure consistent code across multiple clusters
- Keep similar code on multiple systems consistent within the database itself.
- All old contrib modules are now extensions (all v1.0 that I've seen so far).
- - Reverting an update is only possible if the extension author provides a means to do so.
- - Any language, just like postgres functions (C, Perl, Python, etc)
- - Extension tables are currently intended for configuration data, so their contents are not dumped with pg_dump unless the extension flags them as config tables. And when that flag is set, the data also gets dumped with schema-only dumps. Needs work (see the sketch below).
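For illustration, here's a minimal sketch (with a made-up extension name, "myext") of how an extension script marks a configuration table so pg_dump will include its data, plus the install/upgrade commands from the bullets above:

-- myext--1.0.sql (hypothetical extension install script)
CREATE TABLE myext_config (
    setting text PRIMARY KEY,
    value   text
);
-- Flag the table as extension configuration so pg_dump includes its data.
-- The second argument is a row filter; an empty string means dump every row.
SELECT pg_extension_config_dump('myext_config', '');

-- Installing and later upgrading the (hypothetical) extension:
CREATE EXTENSION myext;
ALTER EXTENSION myext UPDATE TO '1.1';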
PG Jobmon
- PostgreSQL Extension
- Autonomous Functions
- Log steps of running functions
- Monitor logged functions to ensure completion
- If/When they fail, where and why
- An autonomous function can commit multiple times within a single run of the function. Oracle allows this. In PostgreSQL, if any part of a function fails before it completes successfully, everything that function did is rolled back. That has its advantages, but for long-running functions it can start to cause lock problems, especially if you run any DDL within the function.
- dblink is an extension that lets you connect to another PostgreSQL database from within the database.
- Each connection to PostgreSQL gets its own transaction. So connect back to the same database, and you get something like autonomous function capability (see the sketch after these notes).
- Log steps of a running function. If/when something breaks, those logged steps would normally be rolled back too; this provides a way to keep them around even when the function fails.
- Provides audit trail of what happened and monitoring capabilities.
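A minimal sketch of the dblink trick described above (the my_log table and the loopback connection string are made up for illustration): the write pushed through dblink commits in its own session, so it survives even though the outer transaction rolls back.

CREATE EXTENSION IF NOT EXISTS dblink;
CREATE TABLE my_log (msg text);

BEGIN;
-- Runs in a separate backend via the loopback connection and commits immediately
SELECT dblink_exec('dbname=' || current_database(),
                   $$INSERT INTO my_log (msg) VALUES ('step logged')$$);
ROLLBACK;

SELECT * FROM my_log;   -- the logged row is still there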
PG Jobmon
add_job('job name');
add_step(job_id, 'What this step will do');
... do some stuff ...
update_step(step_id, 'good_status', 'What this step did successfully');
add_step(job_id, 'What this next step will do');
... do some stuff in a loop ...
update_step(step_id, 'good_status', 'update every loop iteration to track progress');
add_step(job_id, 'One last step');
... do just a bit more stuff ...
update_step(step_id, 'good_status', 'Job finished ok');
close_job(job_id);
EXCEPTION
WHEN OTHERS THEN
update_step(step_id, 'bad_status', 'Uh..oh...: '||coalesce(SQLERRM,'wat'));
fail_job(job_id);
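Put together as a real function, the pattern above looks roughly like this. This is only a sketch, not code shipped with the extension: the table names are made up, the functions are assumed to live in a jobmon schema, and 'OK'/'CRITICAL' are assumed to be the default good/bad status labels (they match the show_job output on the next slide).

CREATE OR REPLACE FUNCTION refresh_stats() RETURNS void
LANGUAGE plpgsql AS $$
DECLARE
    v_job_id  bigint;
    v_step_id bigint;
    v_rows    bigint;
BEGIN
    v_job_id := jobmon.add_job('REFRESH STATS');

    v_step_id := jobmon.add_step(v_job_id, 'Reloading stats table');
    TRUNCATE my_stats;                                  -- hypothetical work
    INSERT INTO my_stats SELECT * FROM my_stats_source; -- hypothetical work
    GET DIAGNOSTICS v_rows = ROW_COUNT;
    PERFORM jobmon.update_step(v_step_id, 'OK', 'Rows inserted: ' || v_rows);

    PERFORM jobmon.close_job(v_job_id);
EXCEPTION WHEN OTHERS THEN
    PERFORM jobmon.update_step(v_step_id, 'CRITICAL', 'Failed: ' || coalesce(SQLERRM, 'unknown'));
    PERFORM jobmon.fail_job(v_job_id);
    RAISE;
END
$$;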
PG Jobmon
show_job('my job name', [int]);
show_job_like('I forgot my job''s whole name', [int]);
-[ RECORD 3 ]-----------------------------
job_id | 10
owner | keith
job_name | PG_JOBMON TEST BAD JOB
start_time | 2012-09-15 00:55:44.742176-04
end_time | 2012-09-15 00:55:44.851514-04
status | CRITICAL
pid | 5848
-[ RECORD 4 ]-----------------------------
job_id | 9
owner | keith
job_name | PG_JOBMON TEST GOOD JOB
start_time | 2012-09-15 00:55:44.293575-04
end_time | 2012-09-15 00:55:44.725483-04
status | OK
pid | 5848
PG Jobmon
show_detail(job_id);
show_detail('job_name', [int]);
-[ RECORD 1 ]+------------------------------
job_id | 9
step_id | 19
action | Test step 1
start_time | 2012-09-15 00:55:44.501825-04
end_time | 2012-09-15 00:55:44.593389-04
elapsed_time | 0.091564
status | OK
message | Successful Step 1
-[ RECORD 2 ]+------------------------------
job_id | 9
step_id | 20
action | Test step 2
start_time | 2012-09-15 00:55:44.643017-04
end_time | 2012-09-15 00:55:44.659336-04
elapsed_time | 0.016319
status | OK
message | Rows affected: 2
-[ RECORD 3 ]+------------------------------
job_id | 9
step_id | 21
action | Test step 3
start_time | 2012-09-15 00:55:44.692518-04
end_time | 2012-09-15 00:55:44.7087-04
elapsed_time | 0.016182
status | OK
message | Successful Step 3
PG Jobmon
check_job_status();
- Makes an easy Nagios check (command and service configs on my blog)
- Shameless plug – http://www.circonus.com (howto on my blog)
SELECT t.alert_text ||'('||c.alert_text||')' AS alert_status
FROM jobmon.check_job_status() c
JOIN jobmon.job_status_text t ON c.alert_code = t.alert_code;
alert_status
-------------------------------
OK(All jobs run successfully)
alert_status
--------------------------------------------------------------
CRITICAL(KEITH.SOME_OTHER_PROCESS: MISSING - Last run at 2012-09-13 07:17:07.86378-04; KEITH.ANOTHER_PROCESS: MISSING - Last run at 2012-09-13 07:16:30.169683-04;)
alert_status
--------------------------------------------------------------
WARNING(KEITH.SOME_CRITICAL_PROCESS: RUNNING; )
Mimeo
Per-table Replication Extension
“The stencil duplicator or mimeograph machine (often abbreviated to mimeo) is a low-cost printing press that works by forcing ink through a stencil onto paper...Mimeographs were a common technology in printing small quantities, as in office work, classroom materials, and church bulletins.” – Wikipedia
- Streaming & Log shipping replication are all-or-nothing
- Bucardo & Slony are other solutions, but they require a lot more setup and can be intimidating to new users or to someone who just needs to copy a few tables. They also require superuser.
- Logical replication slots in 9.4 are a great start to having this built in. BDR is even better.
Mimeo
- Traditional method: Streaming & Log Shipping Replication (omnipitr)
- Logical/Per-table replication
- Snapshot
- Incremental
- Trigger-based (DML)
- Quick setup and tear-down
- Installed & run from destination database(s)
- No superuser required
- Column filter / WHERE Condition
- Monitor & Audit Trail with PG Jobmon
Types of Replication
Mimeo
- Snapshot
- The easy way
- Whole table replication
- Two tables w/ single view
- Brief exclusive lock to swap view source
- Minimize transactional lock during data migration
- Ideal for small or static tables
- Faster than DML replay if majority of table changes often
- Replicate column changes (new, dropped, type)
- No replication if source data has not changed
- Table
- Single table, no views
- Options to handle FK (cascade) and reset sequences
- Good for dev database
- Snapshot version can be hard to use on dev due to view
- Functions are provided to monitor whether columns have changed on the source but haven't yet been replicated to the destination.
- The Table option is good for a dev database that you don't want to give write access to production.
- If the table has foreign keys, there's an option to cascade the truncate to the other tables it depends on.
- Run replication on the table with the FK first, then run the tables the truncate would cascade to after that one.
- If sequences are used, it can reset the sequence to the value it had on production.
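The two-tables-plus-view swap from the Snapshot bullets above can be illustrated with plain SQL. This is only a conceptual sketch (made-up table names, hard-coded dblink connection string), not mimeo's actual generated objects:

-- Two copies of the data on the destination, with a view pointing at the "current" one
CREATE TABLE customers_snap1 (id int PRIMARY KEY, name text);
CREATE TABLE customers_snap2 (id int PRIMARY KEY, name text);
CREATE VIEW customers AS SELECT * FROM customers_snap1;

-- On refresh: reload the copy the view is NOT using, then swap the view source.
-- Only the CREATE OR REPLACE VIEW needs the brief exclusive lock.
TRUNCATE customers_snap2;
INSERT INTO customers_snap2
    SELECT id, name
    FROM dblink('host=sourcehost dbname=prod',
                'SELECT id, name FROM customers') AS t (id int, name text);
CREATE OR REPLACE VIEW customers AS SELECT * FROM customers_snap2;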
Types of Replication
Mimeo
- Incremental
- Control timestamp or serial column
- High transaction tables w/ timestamp or ID set every insert
- With primary/unique key, can also support updates
- DST
- Run database in GMT/UTC
- Otherwise replication does not run during the DST change
- If time, talk about boundary issues (a refresh is sketched below). Refer to documentation for additional info and how to configure.
- At this time, mention that table, snapshot & incremental require no write privileges on the source (no triggers needed for tracking changes).
- Lowers load on source database
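A rough sketch of what one incremental (insert-only) refresh does, assuming a source table with an inserted_at timestamp set on every insert. Table names, the connection string, and the literal boundary value are made up; mimeo tracks the boundary in its own config tables, and the boundary issues mentioned above are why the documentation covers lagging it slightly.

-- Pull only rows newer than the boundary recorded by the previous run
INSERT INTO events (id, payload, inserted_at)
    SELECT id, payload, inserted_at
    FROM dblink('host=sourcehost dbname=prod',
                $q$SELECT id, payload, inserted_at
                     FROM events
                    WHERE inserted_at > '2015-03-01 00:00:00+00'$q$)
         AS t (id int, payload text, inserted_at timestamptz);

-- Then store max(inserted_at) of what was just pulled as the boundary for the next run.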
Types of Replication
Mimeo
- DML
- Replay Inserts, Updates, Deletes
- Trigger w/ queue table on source
- Doesn't actually replay
- Queue table of only primary/unique key values
- Distinct on queue table
- Delete all matches on destination & re-insert
- Supports multiple destinations
- - Only permissions needed on the source role are SELECT & TRIGGER (owner is required for removal due to the DROP TRIGGER restriction)
- - Queue table and trigger function are kept in a schema owned by the mimeo source user
- - Automatically grants the needed privileges on the trigger function & queue table to other source roles that have write privileges on the source table.
- - Limit of 100 destinations. It would be ridiculous to even come close, since each destination adds another queue table that has to be written to on every write to the source table.
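A conceptual sketch of the source-side pieces described above (made-up table names; this is not mimeo's generated code): the trigger records only the primary key of each changed row into a queue table, and the destination later takes the DISTINCT queued keys, deletes any matching rows it has, and re-inserts the current versions pulled over dblink.

-- Source database: queue table + trigger that captures only the primary key
CREATE TABLE orders_q (id int NOT NULL);

CREATE OR REPLACE FUNCTION orders_q_trig() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO orders_q (id) VALUES (OLD.id);
    ELSE
        INSERT INTO orders_q (id) VALUES (NEW.id);
    END IF;
    RETURN NULL;   -- AFTER row trigger; the return value is ignored
END
$$;

CREATE TRIGGER orders_q_trig
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH ROW EXECUTE PROCEDURE orders_q_trig();

-- Destination (per refresh, conceptually):
--   1. Pull DISTINCT id values from orders_q over dblink
--   2. DELETE FROM orders WHERE id IN (those ids)
--   3. Re-insert the current rows for those ids pulled from the source
--   4. Clear the processed queue entries on the source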
Types of Replication
Mimeo
- Log Deletes
- Same methods as DML but does not replay deletes
- Common in data warehousing
- Queue table stores entire row if it's deleted
- Destination has special column with timestamp of row's deletion
Use Cases
Mimeo
- Table audit
- Trigger to track all changes to audit table w/ audit_timestamp column
- Use incremental replication on audit table to pull to data warehouse.
- Time-based partitioning on source audit table to easily drop old data with minimal impact on production.
- Database Upgrade
- Can connect with dblink to any version that supports it
- Setup replication for larger tables to minimize downtime for pg_dump upgrade method.
- Special version of refresh functions for source databases without newer features (wCTE, window functions, array_agg ORDER BY). Works back to 8.1 so far.
- Mention shared table scenario from client
- Single table shared across many systems for performance, replicated to a single reporting database.
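A sketch of the audit-trigger half of that use case (made-up names, not code from any of the extensions): every change lands in an audit table with an audit_timestamp column, which incremental replication can then pull to the warehouse.

CREATE TABLE orders_audit (LIKE orders);
ALTER TABLE orders_audit
    ADD COLUMN audit_timestamp timestamptz NOT NULL DEFAULT now();

CREATE OR REPLACE FUNCTION orders_audit_trig() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Copy the changed row; audit_timestamp is the trailing column and gets its default
    IF TG_OP = 'DELETE' THEN
        INSERT INTO orders_audit SELECT OLD.*;
    ELSE
        INSERT INTO orders_audit SELECT NEW.*;
    END IF;
    RETURN NULL;
END
$$;

CREATE TRIGGER orders_audit_trig
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH ROW EXECUTE PROCEDURE orders_audit_trig();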
PG Partition Manager
- Current partition management is entirely manual
- Hand-write all the tables, triggers, functions, rules, etc. yourself
- Core devs working on getting it built in
- In the meantime ...
Automated Creation
pg_partman
- Time & Serial Based Partitioning
- Yearly, Quarterly, Monthly, Weekly, Daily, Hourly, ½ hour, ¼ hour
- Custom time interval
- Multi-level Sub-partitioning
- Static & Dynamic Triggers
- Pre-creates partitions
- Manage child table properties from parent
- Indexes, constraints, defaults, privileges, ownership, oids
- Foreign keys & unlogged coming soon!
- Automatically updates trigger functions as needed.
- Handles object name length limit (63 char)
- Constraint exclusion on non-partitioned column(s)
- Serial-based creates the next partition when the current one reaches 50% of its max constraint value by default.
- Time-based uses a scheduled job. Serial can as well if you have contention with the 50% method.
- Precreation avoids race conditions when inserting data at the time of table creation. You can customize how far ahead it precreates (setup is sketched after these notes).
- Python script to check primary/unique key values are actually unique across partitions
- Data that doesn't match trigger function or doesn't have a child table goes to parent.
- Constraint exclusion based on premake value.
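Setting up the daily partition set used in the trigger examples on the next slides looks roughly like this. The call is based on the pg_partman 1.x create_parent() documentation and assumes the extension is installed in a schema named partman, so treat the exact arguments as approximate:

CREATE SCHEMA partman_test;
CREATE TABLE partman_test.time_static_table (
    col1 int,
    col2 text,
    col3 timestamptz NOT NULL DEFAULT now()
);

-- Partition by col3, one child table per day, using the static trigger method
SELECT partman.create_parent('partman_test.time_static_table', 'col3', 'time-static', 'daily');

-- Scheduled from cron (or similar) to pre-create new children and apply retention
SELECT partman.run_maintenance();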
Automated Creation
pg_partman
- Python script to partition existing data
- Commits after each partition created
- Commit in smaller batches with configured wait
- Partition live, production tables
- Partitioned 74 mil row table by day (30 days of data)
- Committed in hourly blocks w/ 5 second wait
- Streaming slave never fell more than 100 seconds behind
- 2-3 second lock on parent was only interruption
Static Partitioning
pg_partman
Readable functions!
CREATE OR REPLACE FUNCTION partman_test.time_static_table_part_trig_func()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
IF TG_OP = 'INSERT' THEN
IF NEW.col3 >= '2013-03-21 00:00:00-04' AND NEW.col3 < '2013-03-22 00:00:00-04' THEN
INSERT INTO partman_test.time_static_table_p2013_03_21 VALUES (NEW.*);
ELSIF NEW.col3 >= '2013-03-20 00:00:00-04' AND NEW.col3 < '2013-03-21 00:00:00-04' THEN
INSERT INTO partman_test.time_static_table_p2013_03_20 VALUES (NEW.*);
ELSIF NEW.col3 >= '2013-03-22 00:00:00-04' AND NEW.col3 < '2013-03-23 00:00:00-04' THEN
INSERT INTO partman_test.time_static_table_p2013_03_22 VALUES (NEW.*);
ELSIF NEW.col3 >= '2013-03-19 00:00:00-04' AND NEW.col3 < '2013-03-20 00:00:00-04' THEN
INSERT INTO partman_test.time_static_table_p2013_03_19 VALUES (NEW.*);
ELSIF NEW.col3 >= '2013-03-23 00:00:00-04' AND NEW.col3 < '2013-03-24 00:00:00-04' THEN
INSERT INTO partman_test.time_static_table_p2013_03_23 VALUES (NEW.*);
ELSE
RETURN NEW;
END IF;
END IF;
RETURN NULL;
END $function$
Starts with the current partition, goes back one, then forward one, and continues in that pattern for however far premake was configured. This example used premake = 2, so it handles 2 in the past & 2 in the future.
Dynamic Partitioning
pg_partman
CREATE OR REPLACE FUNCTION partman_test.time_dynamic_table_part_trig_func()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
v_count int;
v_partition_name text;
v_partition_timestamp timestamptz;
v_schemaname text;
v_tablename text;
BEGIN
IF TG_OP = 'INSERT' THEN
v_partition_timestamp := date_trunc('day', NEW.col3);
v_partition_name := 'partman_test.time_dynamic_table_p'|| to_char(v_partition_timestamp, 'YYYY_MM_DD');
v_schemaname := split_part(v_partition_name, '.', 1);
v_tablename := split_part(v_partition_name, '.', 2);
SELECT count(*) INTO v_count FROM pg_tables WHERE schemaname = v_schemaname AND tablename = v_tablename;
IF v_count > 0 THEN
EXECUTE 'INSERT INTO '||v_partition_name||' VALUES($1.*)' USING NEW;
ELSE
RETURN NEW;
END IF;
END IF;
RETURN NULL;
END $function$
Custom time partitioning uses this method along with a lookup table, so it's the least performant although the most flexible for time interval needs.
Automated Destruction
pg_partman
- Configurable retention policy
- Time: Drop tables with values older than 3 months
- Serial: Drop tables with values less than 1000 minus current max
- By default only uninherits
- Can drop old tables or only their indexes
- Dump out tables for archiving
- Undo partitioning
- The dump-out option uses pg_dump and does not drop the table until pg_dump returns successfully.
- SHA-512 Hash of dump files.
- Can undo partitioning not made by pg_partman
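Retention is configured per partition set in pg_partman's config table. A sketch based on the documented part_config columns (treat the column names as approximate):

UPDATE partman.part_config
   SET retention = '3 months',        -- anything older than 3 months is eligible
       retention_keep_table = false   -- actually drop old children instead of just uninheriting
 WHERE parent_table = 'partman_test.time_static_table';

-- Retention is applied the next time maintenance runs
SELECT partman.run_maintenance();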
Demo (if time)
PgTAP
- Unit testing for PostgreSQL queries, schema & scripting
- Essential for extension development
- ... and your sanity
- http://pgtap.org
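A tiny example of what a pgTAP test file looks like (run with pg_prove; the table it checks is the hypothetical partman_test one from the earlier sketch):

BEGIN;
SELECT plan(2);

SELECT has_table('partman_test', 'time_static_table', 'parent table should exist');
SELECT results_eq('SELECT count(*)::int FROM ONLY partman_test.time_static_table',
                  ARRAY[0],
                  'parent table itself should hold no rows');

SELECT * FROM finish();
ROLLBACK;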
When PostgreSQL Can't, You Can
Presented by Keith Fiske / @keithf4
Database Administrator @ OmniTI
Follow along at http://slides.keithf4.com/whenpgcantucan