Spoke to my publisher at BCS, Matthew Flynn, this week and we've raised the possibility of a second edition to "Practical Data Migration". So I suppose this is a good time to reflect on the ten or more years since I first got involved with Data Migration and see what has changed.
Back in '97, when I was engaged to work on a new COTS package implementation at Anglian Water, there was no such discipline as Data Migration.
We had a brand new system that was going to revolutionise the management of AW's copious quantities of above ground and underground assets, provide efficiencies in work patterns, better MIS, implement regulatory requirements and all round add value to the company's bottom line. But I was perplexed. Who was responsible for the data? Well, as anyone who has ever worked on a project will know, if you moan about something long enough it somehow becomes your problem. So needless to say, soon enough I was given responsibility for getting the data in.
I can't say I was disappointed at this. As a flinty-hearted independent contractor, getting a piece of the project to myself, however unglamorous, at least ensured my longevity. And unglamorous it certainly was. All the best and the brightest were clamouring to get experience on the new technology. I was left to muddle through.
I suppose, just like anyone in those circumstances, I looked for guidance. Now I'd tracked all the main trends in methodologies - data modelling, SSADM, Jackson, Yourdon, the newfangled (at the time) DSDM and even the emerging (again at the time) Object Modelling techniques. But I searched in vain for a guide to Data Migration.
Seems we had all the guidance we needed in building new systems but none in getting the data into them at start up.
The need for this was now pressing.
Five years earlier, in a previous incarnation at Anglian Water, I had been part of an enormous team writing, from scratch, a wall to wall system to cover their billing and payments systems. But there we were replacing largely paper-based systems. All the key data items had to be entered from scratch. In 1997 (well, 1998 when we did the phase one go live), we were migrating from the anarchy of the client-server revolution of the '90s.
We had so much more data to deal with and from so many different sources.
I have to say that (like 80% of people today) I floundered around for a while. There was at least one spectacular false start (anyone who's read the book will be aware of that one), but by dint of hard work and, crucially, learning to trust the business, we got our data ready on time and in it went without compromising the new environment.
I knew then that there was an unacknowledged gap in the socialised know-how of the IT industry and I've been working to close that gap ever since.
It has to be said that for the first five years or so I did feel like a lone voice crying in the wilderness. Not that I had to wear a hair shirt or eat locusts or wander around in the desert for a bit like the biblical prophets of old. Once word got out that there was someone who not only liked Data Migration but could actually be trusted to deliver, well, let's say I was rarely kicking my heels between projects.
Probably it was just the zeitgeist, but maybe it was the vague ripple on the collective unconscious that the publication of the first edition of my book, and its attendant publicity, caused. Either way, it seems to me that around the time of publication the earliest stirrings of interest in Data Migration as a market space in its own right started to be felt. And certainly the pace has quickened since then. I think Informatica were the first to come to my attention. They could see that there were synergies between their acknowledged competency in systems integration - especially in the Data Warehouse area - and Data Migration. Their subsequent acquisitions and the reworking of their suite of packages are now paying dividends both for their clients and for their bottom line, as their most recent profit statement pointed out.
There are also the new entrants to the market place - like Celona - who have developed innovative, ground-up technologies that change the way we approach our Data Migration tasks. We also have the likes of Golden Gate, Wipro, Seagate and so on, all of whom I have blogged about over the last 12 months.
There is now even an open source variant to be considered - see www.talend.com.
The market is maturing. These new technical approaches extend our Data Migration options. There is also a settling out of the industry-accepted Data Migration software architecture. None of this existed when I wrote the first edition. Back then everything was either hand crafted or cobbled together with bits of integration software we happened to have to hand. (Ah, the memories of trying to migrate using a management-mandated EAI tool that was little understood.)
So there is a lot to write about. Not that the core of my message is going to change. When it comes to migration we need to form a partnership with the business who really understand what is going on with the data that we see as bits and bytes, fields and columns. I will therefore be sticking with Data Quality Rules, System Retirement Policies, Key Data Stakeholder Analysis etc. etc. But I think we need new sections on:
- Automated Landscape Analysis
- Automated Data Quality
- ETL software (and all its various rivals)
- Software Architecture
- Selecting software and system integrator partners
- The new migration forms (like progressive migration) that the new tools permit
- Using an integrated Data Migration suite
- How all this fits with the unchanging softer products like DQR etc.
- How this fits with approaches like Agile and Extreme Programming
Well, that's enough for starters. But what do you think? For those of you familiar with the original, what would you enhance? For those of you not familiar (shame on you), what would you expect? Drop me a line and let me know.