Battling a Legacy Schema With DBIx::Class
YAPC::{NA,EU} 2016
For any non-trivial application you're going to write a model.
Most trivial applications turn into non-trivial applications.
And you're probably sick of writing trivial SQL.
So your trivial SQL statements may become non-trivial SQL statements.
So "don't leak the abstraction"
DBIx::Class offers many useful features, as we shall see.
Other ORMs/toolkits are available.
The ORM isn't your model, it helps your model.
MVC
MVC
OMVC
Abstract this away into your model:
$model->resultset( "Piste" )->find( 17 )->name;
Like so:
SkiResort::Model::Piste->new( id => 17 )->name;
Your controllers are then not coupled to the data store
And they know nothing about the ORM
And you can better utilise exceptions:
try sub { ... my $piste = SkiResort::Model::Piste->new( id => $piste_id ); ... }, catch_when 'SkiResort::Exception::Database' => sub { # 500 error? }, catch_when 'SkiResort::Exception::Input' => sub { # 400 error? }, catch_default sub { # something else? };
Anything you didn't design yourself, right?
Well, anything that's grown organically.
Or used older tech and didn't keep up.
Maybe just full of technical debt?
Technical debt is hard to fix in your data store.
It's more like a mortgage on your stack.
So lets look at some examples.
A Ski Resort
Simple, right?
The "legacy" version:
Eh, not so good.
11:31 <@ribasushi> if you have an existing gnarly database you want to wrap your head around, vanstyn's rdbic is a superb tool building on top of the ecosystem: http://www.catalystframework.org/calendar/2014/16
rdbic.pl examples/db/legacy/resorts_legacy.db
If you don't leak the abstraction then this becomes a non-issue.
Because your SkiResort::Model::Piste class could be querying a table named potato.
But if doesn't matter, the confusing terminology is safely contained and you only ever interact with the correctly named class.
#!/bin/bash set -e -x -u folder=$1 db_path=$2 overwrite=$3 dbicdump \ -o debug=1 -o generate_pod=0 -o preserve_case=1 \ -o dump_directory=$folder \ -o components="[qw{InflateColumn::DateTime}]" \ -o overwrite_modifications=$overwrite \ -o datetime_timezone=UTC \ 'SkiResort::Model::LegacySchema' \ 'dbi:SQLite:dbname='$db_path''
use utf8; package SkiResort::Model::LegacySchema::Result::ResortItem; # Created by DBIx::Class::Schema::Loader # DO NOT MODIFY THE FIRST PART OF THIS FILE use strict; use warnings; use base 'DBIx::Class::Core'; __PACKAGE__->load_components("InflateColumn::DateTime"); __PACKAGE__->table("resort_item"); __PACKAGE__->add_columns( "resort_id", { data_type => "integer", default_value => \"null", is_nullable => 1 }, "item_source", { data_type => "varchar( 255 )", is_nullable => 1 }, "item_id", { data_type => "integer", is_nullable => 0 }, ); # Created by DBIx::Class::Schema::Loader v0.07036 @ 2016-04-07 14:16:33 # DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:kOaW4xm7IT+Y3qVDDBy2hg # You can replace this text with custom code or comments, and it will be # preserved on regeneration 1;
# You can replace this text with custom code or comments, and it will be # preserved on regeneration __PACKAGE__->belongs_to( "resort", "SkiResort::Model::LegacySchema::Result::Resort", { id => "resort_id" }, { is_deferrable => 0, on_delete => "CASCADE", on_update => "CASCADE" }, );
Which gives us:
$model->resultset( "ResortItem" ) ->first->resort->name;
And of course:
$model->resultset( "ResortItem" )->search( {},{ prefetch => [ qw/ resort / ] } )->first->resort->name;
package SkiResort::Model::LegacySchema::Result::ResortItem; ... __PACKAGE__->set_primary_key( "resort_id","item_source","item_id" ); __PACKAGE__->belongs_to( piste => 'SkiResort::Model::LegacySchema::Result::Piste', sub { my ( $args ) = @_; return { "$args->{self_alias}.item_source" => 'piste', "$args->{self_alias}.item_id" => { -ident => "$args->{foreign_alias}.id" }, }; }, );
Allows:
$model->resultset( "PisteItem" )->search( { item_source => 'piste' }, { prefetch => [ qw/ piste / ] } )->first->piste->name;
Using search_related bridge having added the previous belongs_to:
package SkiResort::Model::LegacySchema::Result::Resort; ... __PACKAGE__->has_many( resort_items => 'SkiResort::Model::LegacySchema::Result::ResortItem', { 'foreign.resort_id' => 'self.id' } );
Allows:
$model->resultset( "Resort" ) ->search_related( 'resort_items' ) ->search_related( 'piste' )->first->name;
And if you have many of these:
package SkiResort::Model::LegacySchema::Result::ResortItem; ... foreach my $source ( qw/ piste lift / ) { __PACKAGE__->belongs_to( $source => 'SkiResort::Model::LegacySchema::Result::' . ucfirst( $source ), sub { my ( $args ) = @_; return { "$args->{self_alias}.item_source" => $source, "$args->{self_alias}.item_id" => { -ident => "$args->{foreign_alias}.id" }, }; }, ); }
We can fix data and/or get objects from column data.
DateTime - You'll almost certainly want this one:
__PACKAGE__->load_components(qw/InflateColumn::DateTime/);
When $column is a date, timestamp or datetime data type:
$model->resultset( "Foo" )->first->$column->subtract->( months => 1 )->ymd( '-' );
DATE_FORMAT(DATE_SUB(c,INTERVAL 1 MONTH),'%Y-%m-%d')
Stop using your RDMS for date calculations / localisation.
DBIx::Class::Helper::ResultSet::DateMethods
# get count per year/month $rs->search(undef, { columns => { count => '*', year => $rs->dt_SQL_pluck({ -ident => '.start' }, 'year'), month => $rs->dt_SQL_pluck({ -ident => '.start' }, 'month'), }, group_by => [ $rs->dt_SQL_pluck({ -ident => '.start' }, 'year'), $rs->dt_SQL_pluck({ -ident => '.start' }, 'month'), ], }); # mysql (SELECT `me`.*, EXTRACT(MONTH FROM `me`.`start`), EXTRACT(YEAR FROM `me`.`start`) FROM `HasDateOps` `me` GROUP BY EXTRACT(YEAR FROM `me`.`start`), EXTRACT(MONTH FROM `me`.`start`)) # SQLite (SELECT "me".*, STRFTIME('%m', "me"."start"), STRFTIME('%Y', "me"."start") FROM "HasDateOps" "me" GROUP BY STRFTIME('%Y', "me"."start"), STRFTIME('%m', "me"."start"))
The resort table contains an active column that is a char(1).
It should be a boolean, so:
package SkiResort::Model::LegacySchema::Result::Resort; ... __PACKAGE__->load_components( "FilterColumn" ); __PACKAGE__->filter_column( active => { filter_to_storage => sub { return $_[1] ? 'Y' : 'N'; }, filter_from_storage => sub { return defined $_[1] && $_[1] =~ /Y/i ? 1 : 0; }, });
Cleans up significantly:
if ( $model->resultset( "Resort" )->first->active ) { ... }
How about sanity checking a column that contains a CSV list?
__PACKAGE__->filter_column( column_with_csv => { filter_from_storage => sub { return [ split( ',',$_[1] ) ]; }, filter_to_storage => sub { my ( $self,$values ) = @_; foreach my $value ( @{ $values // [] } ) { # maybe we can check $value here - if it's supposed # to be a reference to another table then we could # check $value exists in the child table? if not we # throw an exception } return join( ',',@{ $values // [] } ) || undef; }, });
my $sql = "Some complex SQL we don't want to rewrite as SQL::Abstract";
Options:
my @results = $schema->storage->dbh_do( sub { my ( $storage,$dbh,@binds ) = @_; # this is just pure DBI $dbh->selectrow_array( "Complex SQL Here",{},@binds ); }, @binds, );
package SkiResort::Model::LegacySchema::Result::PistesForRating; use base qw/DBIx::Class::Core/; __PACKAGE__->table_class('DBIx::Class::ResultSource::View'); __PACKAGE__->table('pistes_for_rating'); __PACKAGE__->result_source_instance->is_virtual(1); __PACKAGE__->add_columns( "name", { data_type => "varchar( 255 )", default_value => \"null", is_nullable => 1, } ); __PACKAGE__->result_source_instance->view_definition( " SELECT piste.name FROM piste WHERE rating = ? " ); 1;
$model->resultset( "PistesForRating" ) ->search( {},{ bind => [ $rating ] } )
package SkiResort::Model::LegacySchema::Result::PistesForRatingMatchingString; use base qw/SkiResort::Model::LegacySchema::Result::PistesForRating/; __PACKAGE__->table('pistes_for_rating_like'); __PACKAGE__->result_source_instance->view_definition( __PACKAGE__->SUPER::result_source_instance->view_definition . " AND piste.name like ? " ); 1;
And:
$model->resultset( "PistesForRatingMatchingString" ) ->search( {},{ bind => [ $rating,"%$string%" ] } )
use Carp qw/ cluck longmess shortmess /; sub resultset { my ( $self,$table ) = @_; if ( my $trace = $ENV{DBIC_TRACE} ) { my ( $level,$trace_file ) = split( /=/,$trace ); my $mess = $level == 1 ? shortmess( "RESULTSET: $table" ) : longmess( "RESULTSET: $table" ); if ( $trace_file ) { open( my $fh,'>>',$trace_file ) || cluck( ... ); print $fh $mess; close( $fh ); } else { print STDERR $mess; } } return $self->SUPER::resultset( $table ); }
UTF-8
mysql_enable_utf8 => 1,
Almost certainly
mysql_auto_reconnect => 1,
If only new code is going to use the dbic route:
$schema->storage->on_connect_do( [ "SET sql_mode=STRICT_ALL_TABLES", ] );
Prefetch prefetch prefetch.
Test your app with a representative dataset (where possible).
Keep your ResultSource classes up to date.
Logging
Deployment
Helpers
curl -XPOST api.metacpan.org/v0/release/_search?size=100 -d '{ "query": { "wildcard" : { "release.distribution" : "DBIx-Class*" } }, "size" : 5000, "filter" : { "term" : { "status" : "latest" } }, "fields": [ "release.distribution", "release.date", "provides" ] }' | jq -r '.hits.hits[].fields.distribution' | sort
Links and resources: