Versioning your development databasesOne of the questions that usually arises after a new customer starts using Plastic is ok, but what should I do with my data? Most development groups developing business applications use some sort of database backend (SQL Server, MySQL, Firebird, Oracle, whatever) and normally some sort of shared sandbox like the following:
So, when they decide they have enough problems breaking each others code, being locked by someone else’s changes, or continuously “trying to shoot a moving target”, they decide to implement some sort of version control system (or better version control system). Then they move to “stage two”, which can be depicted like:
Ok, but, the same troubles they were having with code are still present dealing with the centralized database: I make a change on the schema and then your “under development copy” stop working because it can’t handle my modification: you’re stopped until you change it (breaking mine again), or fix your code (losing your precious time)... Weren’t you supposed to work in isolation?
The answer is wonderfully explained at the book refactoring databases by Scott W. Ambler: get a private sandbox which includes the database too. If you manage to do so you’ll get rid of the database versioning hell.
Ok, but, if you’re used to directly make changes in the development database using the specific server GUI, how would you keep track of changes?
My recommendation here (and Ambler’s one, if you need to trust a real expert) is: keep your schema in DDL format! Yes, maybe it sounds too arcane for you, used to only delve into the rich SQL server user interfaces, but it will really help. Suppose your database is defined into a single (ok, or even more than one, it doesn’t really matter) file, then anyone making changes will register them in Plastic SCM, creating new versions of the files. Then merging those changes (add a new field, create a key, drop a table, whatever) will be just a matter of merging regular text files, and Plastic excels here!
Ok, but... how do I manage to set up the developer’s workspaces then? Well, my recommendation would be:
Well - I hear you saying –but then I’d only test with a minimal set of data... this will lead to performance problems. Yeah! You’re right! So, don’t get rid of your “old” central development server. Use it to run integration tests on top of real sets of data, highlighting possible performance bottlenecks. The new environment will be a combination of isolation and shared resources: the right balance between them will help you working better, a wrong one will, obviously, lead to problems.
Still– you complain – weren’t we supposed to move towards a more agile and decentralized development method? Now we need some way to update this central server, before it was done by each of the developers. Yes, but “before” your team were loosing time trying to figure out what was happening each time (or at least often) a change was made on the database. Now you just have to update this server when a new release (an internal one, not necessarily a customer milestone at all) is created. And you can use the same system your software uses in production to upgrade databases:
Ok, I’ll do that but, how will my software know the source and target versions? – you ask.
This question is greatly covered in the book I mentioned, and to be honest I was surprised to see there one of the things I’ve been doing for years, in different projects: you create a table inside your schema (you can name it “databaseinfo” for instance) which will hold versioning information. Then your system will be able to determine which the source database version is.
What about the destination one? Well, here my choice is: hardcode it in your software. Whether you do it embedding some sort of resource or with a simple constant, it doesn’t really matter, but it is very useful having your binaries know which is the database version they need to work.
You might be thinking now “yes, but creating a database upgrade system is not an easy task”. Of course it isn’t. If you need a really flexible one, take a look at www.red-gate.com. They have a powerful set of tools to cover all your needs.
But sometimes you don’t really need a wide-range solution, you just need to add some fields, perform some calculations or create some indexes from one version to the next: you can create your own upgrade system, which will be maintained together with the software, and which will be able to perform these very specific operations when it detects the source version is not the same as the destination one. Your upgrade system will have to check which one is the “starting point” and then apply all the rules until it hits the “desired destination version”.
So, you have a beautiful solution to move towards fully managed (in terms of version controlled) database development.
Ok, maybe some of you are actually disappointed because you expected some better integration, something you could just install and forget about all your data issues... I’m afraid there are no silver bullets and the solution will require some work on your side... The good thing is that in exchange you’ll get a better system, easier to modify and evolve...