A New Year has dawned and already there is a healthy degree of activity out there showing that the market awareness of Data Migration continues to expand.

First of all, to answer one of Nigel's points on my last blog: about the newsletter - I've shelved activity in that direction for now so that I can focus on the collaborative platform. We're just ironing out the legal details (nothing new there, then), but I should be in a position to make a more formal announcement shortly. Of course, an activity like this will require the active participation of more than just yours truly, so be prepared when the call comes.

And to answer another of Nigel's questions - an aggregator for Data Migration would be a great idea but probably not one that will be included in the first release of our new Data Migration platform. Let me take that away and dwell on it.

But what else has the New Year brought to my attention? Well, my friend Arvind of Informatica has been writing again on Data Migration - check it out at: http://searchstorage.techtarget.com.au/topics/article.asp?DocID=6101096

He's pushing an iterative approach built around four stages - analysis, extract and transform, validate and cleanse, and load and test. He's also keen on the idea of a pre-migration feasibility study.
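Just to make those stages a little more tangible, here's a minimal sketch of how such an iterative cycle might be wired together. To be clear, this is my own illustration, not Arvind's code - every function and type name below is a hypothetical placeholder.

```python
# A sketch of an iterative four-stage migration cycle:
# analyse -> extract & transform -> validate & cleanse -> load & test.
# All names here are hypothetical placeholders, not a real toolset.

from dataclasses import dataclass, field

@dataclass
class DataBatch:
    records: list = field(default_factory=list)
    issues: list = field(default_factory=list)   # findings fed back into the next pass

def analyse(source) -> DataBatch:
    """Profile the source and record the issues we already know about."""
    return DataBatch(records=list(source))

def extract_and_transform(batch: DataBatch) -> DataBatch:
    """Pull the data and map it onto the target structures."""
    return batch

def validate_and_cleanse(batch: DataBatch) -> DataBatch:
    """Apply validation rules; cleanse what we can, flag what we can't."""
    return batch

def load_and_test(batch: DataBatch) -> bool:
    """Load into the target and confirm the reconciliation checks pass."""
    return not batch.issues

def migrate_iteratively(source, max_passes: int = 3) -> bool:
    """Repeat the four stages until a load reconciles or we run out of passes."""
    for _ in range(max_passes):
        batch = analyse(source)
        batch = extract_and_transform(batch)
        batch = validate_and_cleanse(batch)
        if load_and_test(batch):
            return True     # this iteration reconciled cleanly
        # otherwise the issues gathered feed the next analysis pass
    return False
```

The point of the loop is simply that validation findings from one pass inform the analysis of the next - the iteration, not the stub functions, is what matters here.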

He and I have had conversations in the past about this and I think this approach has merit, especially in situations where the data migration activity has been left to the end of the project timeline. Personally, though, I would recommend that Data Migration activities commence at the start of the project. Data quality analysis of the source can begin before the target is known. This might seem perverse, but in most data migrations the data items in the real world (be they bond sales or the signals of a national railway) are not going to change, and the metadata relationships between them are not going to change either. We can validate the quality of that data (which probably accounts for 80% of the data quality issues) without knowing the target.
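To make that concrete, here is a minimal sketch - my own illustration, not from either article - of the kind of target-independent source checks I mean: null checks, domain checks and referential checks expressed purely against the legacy metadata. The railway-flavoured data below is a made-up example.

```python
# Target-independent source profiling: a hypothetical sketch.
# Real rules would be drawn from the legacy metadata and the business's
# own definitions, not hard-coded like this.

def check_not_null(rows, column):
    """Row indexes where a mandatory column is missing."""
    return [i for i, row in enumerate(rows) if row.get(column) in (None, "")]

def check_domain(rows, column, allowed):
    """Row indexes whose value falls outside the known domain."""
    return [i for i, row in enumerate(rows)
            if row.get(column) is not None and row[column] not in allowed]

def check_referential(rows, column, parent_keys):
    """Row indexes whose foreign key has no matching parent record."""
    return [i for i, row in enumerate(rows) if row.get(column) not in parent_keys]

if __name__ == "__main__":
    # Toy legacy extract: signals referencing routes.
    routes = [{"route_id": "R1"}, {"route_id": "R2"}]
    signals = [
        {"signal_id": "S1", "route_id": "R1", "status": "ACTIVE"},
        {"signal_id": "S2", "route_id": "R9", "status": "unknown"},  # orphaned FK, bad domain
        {"signal_id": None, "route_id": "R2", "status": "ACTIVE"},   # missing key
    ]
    route_keys = {r["route_id"] for r in routes}

    print("missing signal_id:", check_not_null(signals, "signal_id"))
    print("bad status values:", check_domain(signals, "status", {"ACTIVE", "INACTIVE"}))
    print("orphaned route_id:", check_referential(signals, "route_id", route_keys))
```

None of these checks needs the target model to exist - they only need the legacy definitions and someone from the business to say what "valid" means.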

In addition, performing this exercise uncovers all those hidden data sources that the business has been using to get around these data quality issues. Of course, you can still do this once the target has been defined (in fact you will be forced to do it), but by then most of the time and budget has been eaten up, and decisions made under that degree of pressure are not always the best ones.

Also, my attention was drawn to the new (at least I think it's new) BearingPoint wiki: http://mike2.openmethodology.org/index.php/Data_Migration_Solution_Offering

I like this article as well: it covers most of the technical bases and has a nice differentiation between different scales of migration. It also includes a metric at the end for calculating project complexity. No weighting values for the variables are given (nor would I expect there to be - BearingPoint have to retain some of their IP for themselves). Have a read and let me know what you think. I think it is a bold move that deserves support.

However, although both articles cover some of the necessary activities within Data Migration, neither is sufficient. There is no consideration of system retirement issues (and if we aren't going to retire legacy, why are we migrating?). There is no fallback planning (migrating without fallback is like driving a car without insurance). There is no business engagement model (what do we do with the data errors we find? How do we tune our migration to fit in with the bigger business change issues? How do we ensure that the data we migrate is acceptable in business as well as technical terms?). There is no consideration of selecting an appropriate migration approach (progressive, parallel run, big bang or even don't migrate), and so on.

But both articles are stimulating and to be applauded for being out there, promoting best practice, and signalling the need for a new breed of IT expert - the Data Migration Architect.

Johny Morris