Database Versioning: The ignored aspect of version control.

Version control is an important aspect of writing code. From the initial check-in to the final build, it is crucial to keep track of every change that is made to the code. Like stacking blocks of legos, you will find what started with a single brick evolving into a city. Databases, however are often ignored from version control. If you are lucky you use some form of ORM to connect to your database like Active Record. This will handle database migrations for you and provide database version control to some degree. However for legacy projects you are out of luck. More often than not I have seen the code managed very carefully with awesome branch and merge strategies implemented. Whereas all the database scripts simply dumped in a directly called SQL. Or if you are lucky there will be sub directories with stored procedures, tables and functions in their own directories. 

 

Here's how it goes on a typical release day. The developer hands you a shiny release package with a nice bow on top. "Nice !" You say. Suddenly he reaches in his trench coat and hands you a soggy bundle wrapped hastily in brown paper. "What ...is that ?" you ask. "SQL scripts" comes the simple reply. "Well...what about rollback ?" you ask helplessly. "Restore from backup !" the developer smiles. "But...the database is 3 TerraBytes!" you yell. The developer leaves. 

Just kidding. That has never happened to me. And here is how you can avoid this from happening to you. You see there is a difference between versioning your database scripts and actually versioning the database. By versioning the database you get the ability to apply and rollback changes or "deltas" to your database. The simplest way to do this is , as mentioned above, is to use an ORM. However if you have an already established database and code and switching to ORM is not an option then there are other ways in which you can do this. There are products available on market that can help you version you database. For SQL Server Red Gate offers SQL Source Control that integrates with leading VCS. For MySQL there is dbv.php. Then there is software like Liquibase which is feature rich, free and supports a lot of databases. 

And when all else fails grab a scripting language of your choice and automate it yourself. With a mix of automation and policies you can bring any database under version control. Before you start you will need baseline of the database. A baseline will help you create the initial empty copy of the database on top of which you will apply all your changes. There are two way you can do this. Extract the schema of the database or else maintain a skeleton copy of the database i.e. truncate all the data in all the tables. Personally for established projects I like to create skeleton copies of the databases rather than extracting schemas because sometimes extracting schemas can be quite hairy. And secondly even if you manage to extract schema there is no guarantee you will be able to recreate the whole database from schema. So take it as a pro tip, a zipped up skeleton copy will make your life easier. After than implement some simple policies. All scripts must be checked into scripts directory. Every scripts must have a corresponding valid rollback script checked into rollback directory. Every script must start with a integer which must be the next available integer. So if the last script was called 04_create_blah.sql then the next one should be 05_delete_foo.sql. Now create a table in your database and called it VersionControl which will contain version control meta data. This will include last version applies, the date and time, user name who applied it. Then write your automation script to read from the directory and sort by name, read from the table and figure out the last version and figure out which scripts need to be applies. You can also have your script take parameters for rollback to apply the scripts in reverse. And your have poor man's DB version control. Remember this solution works if your database is not shared by multiple applications. You can always expand on this solution and make a more complex solution to suit your needs. Another problem you will face is very soon your scripts will reach in hundreds at which point you will find it is taking too long for you to apply all the scripts on top of the baseline. At some point you will have to take the baseline again. Just baseline your database, archive your scripts and rollback scripts. This is assuming your data loading scripts were handled else where and this mechanism was handling only DDL scripts. And you are all set again. 

What has been your experience with database version control ? Let me know in comments below. 

Next Article: Database Versioning: Using Liquibase

Comments   

0 #20 Bjørn 2014-03-26 07:04
Quoting Roman:
What about 'DELETE FROM' SQL commands? How would you rollback that? This is question that my team asking when I try to introduce some db versioning approach.


By creating a temporary table. If I need to preserve data from Table1 in case of rollback I usually create a _20140326_Table 1 copy of either the whole table or the deleted data. In case of rollback the data are copied back and table deleted. If the _yyyymmdd_ tables are not occupying too much space they are left and cleaned up by a cleanup procedure run once a year.
Quote
0 #19 Roman 2014-03-24 19:34
What about 'DELETE FROM' SQL commands? How would you rollback that? This is question that my team asking when I try to introduce some db versioning approach.
Quote
0 #18 Uri Margalit 2014-03-19 15:14
DB versioning is easy with DBmaestro. Check-Out & Check-In the database objects preventing code overrides, out-of-process changes, out-of--sync between the database and the file-based version control repository.
DBmaestro also integrates with Delphix to branch the database and merge it quickly.

http://www.toolsjournal.com/integrations-articles/item/2718-checkout-dbmaestro
Quote
0 #17 J M 2014-03-15 09:50
I have found git to version control Wordpress Development well.

A mysqldump of the Wordpress database is included in each commit.

To restore, drop all tables and run the .sql file.

Diffs accurately show changes in the database between commits.

For enhanced clarity make each table a separate sql file.
Quote
0 #16 Kyle Hailey 2014-03-15 07:41
DB versioning is easy with Delphix. Bookmark, rollback, branch a 10TB database in minutes with almost no storage overhead. Works for Oracle, SQL Server, and Postgres.
Quote
0 #15 Richard Gardiner 2014-03-15 02:12
+1 for FluentMigrator. It works well. I did add a couple of extensions, the main one being methods to add a basic data dictionary. This enabled the developers to code a database change along the meta data for usage and meaning, all without moving out of the language (c#), and solution they using for the rest of code of their code,
Quote
0 #14 Will Sargent 2014-03-15 01:00
I'm surprised no-one has mentioned Tungsten Replicator or Golden Gate yet. They're certainly useful for migrating between different schema versions.

Tungsten Replicator: https://code.google.com/p/tungsten-replicator/
Golden Gate: http://www.oracle.com/technetwork/middleware/goldengate/overview/index.html?ssSourceSiteId=otncn
Quote
0 #13 James Linden 2014-03-14 19:42
For MySQL/MariaDB, I generally use my schema generator script to create schemas with triggers to version data on a record level. It originally started as a simple "undo" feature for a custom app, but I ended up using it for pretty much any custom app now. It doesn't have any intrinsic capability to deal with table structure changes, so they have to be handled manually. It understands the concept of a 'user' (ie: a foreign key to a user table), so it also provides a history of record changes tied to that user.

http://jameslinden.com/code/mysql/schema-generator/
Quote
0 #12 lscharen 2014-03-14 17:39
Databases versioning and rollback is definitely underappreciate d. The minute you need it and don't have it, you are in for a world of hurt.

We do mostly ASP.NET MVC and rely on FluentMigrator as our migration a framework. It has been a great experience, not only for rollbacks, but getting a new developer set up with a local testing database in seconds is awesome.
Quote
0 #11 anders 2014-03-14 14:20
Yep, as mentioned Django's South library basically does this. I've actually done several projects now with legacy systems where I've pointed Django at the old database (it has a nice introspection tool that will generate your ORM code from the db schema) just to bring it under South's control so I could do automated schema migrations from that point on. The legacy app itself might still be crusty old Perl or Java or PHP code, but schema migrations become simple.
Quote

Add comment


Security code
Refresh