Data Migration - DMM2: And Who Needs an ERD?

A look at the use of Data Models in Data Migration projects and an introduction to another speaker on the DMM2 programme.

I have been asked by a client to take part in a Data Modelling workshop, for which I'll be preparing this week. Now, metadata modelling is one of those arcane arts that had a tremendous vogue, got overhyped, then died the death of a thousand cuts when it became clear that it wasn't a panacea. It has of course reappeared in various guises. Most master data management programmes rely on a data model, but it is often one that is represented more in software and code than in formal models. Service-oriented architecture (SOA) also has to include implicit shared data models to work, even if their shared representation is limited to XML.

But the use of formal analytical metadata models is often dismissed as "Too theoretical".

However, I'm going to mount a defence of our use of ERDs (Entity Relationship Diagrams) in the complex world of Data Migrations.

In a Data Migration we are often faced with hundreds of legacy data stores, each built at a different time, often with different technology. Some will use corporate relational databases; some will be non-relational (think spreadsheet data here). Each will have been created to satisfy a particular need for a particular group at a particular time. How the heck do we even catalogue them, never mind provide some form of cross analysis, without a lingua franca?

Now, here I am not arguing in favour of recreating the role of the Corporate Data Architect with a detailed model. (Perhaps I will share with you my feelings on the challenges implicit in this naive transcendentalist activity in a future blog post.) But from a pragmatic, moving-things-on perspective, having a single model against which to compare and catalogue the plethora of legacy data stores brings order to confusion. Having a common modelling language to describe the multiple different structures (as opposed to the various rich text diagrams that are often offered) simplifies what is still a difficult task. Finally, metadata analysis allows us to see through the apparent differences in structure and naming conventions to the essential similarities and, just as importantly, the differences in the legacy data stores.
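
For anyone who prefers to see the idea in code, here is a minimal sketch of cataloguing legacy stores against a single common model. Everything in it is an assumption made up for illustration: the store names, the legacy field names, the `Customer` and `Order` entities, and the two helper functions. It is not a description of any real project or tool, just one way the cross analysis might be expressed.

```python
from collections import defaultdict

# Hypothetical catalogue: each legacy store maps its own field names
# onto (entity, attribute) pairs in the single common model.
LEGACY_MAPPINGS = {
    "crm_db": {
        "CUST_NM": ("Customer", "name"),
        "CUST_PCODE": ("Customer", "postcode"),
    },
    "sales_spreadsheet": {
        "Client": ("Customer", "name"),
        "Order Ref": ("Order", "reference"),
    },
}


def catalogue_by_entity(mappings):
    """Group legacy fields under the common-model attribute they represent."""
    catalogue = defaultdict(lambda: defaultdict(list))
    for store, fields in mappings.items():
        for legacy_field, (entity, attribute) in fields.items():
            catalogue[entity][attribute].append((store, legacy_field))
    return catalogue


def shared_attributes(catalogue):
    """Return common-model attributes that appear in more than one store."""
    return {
        (entity, attribute): sources
        for entity, attributes in catalogue.items()
        for attribute, sources in attributes.items()
        if len({store for store, _ in sources}) > 1
    }


catalogue = catalogue_by_entity(LEGACY_MAPPINGS)
shared = shared_attributes(catalogue)
```

In this made-up example, `CUST_NM` and `Client` look unrelated by name, but both land on the `Customer` name attribute of the common model, which is exactly the kind of similarity that differing naming conventions tend to hide.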

I'll perhaps let you know how we get on.

My other main preoccupation this week is, of course, DMM2 (our Data Migration Matters 2 conference on 1st October, for anyone who has dropped into this blog for the first time). I will be posting up a new programme list shortly, hopefully without any TBAs, but this is proving not so easy. The August holiday season has played havoc with getting the formal sign-off from all our speakers to allow us to put up proper biographies, mug shots etc. But we are in great shape.

One speaker who will be included is Nigel Key, a migration project director at BT. Nigel has been responsible for a very large data migration of the back-end systems responsible for broadband in the UK, and did such a good job that he got to pick up another data migration, of BT Wholesale and Operational data. Nigel is a very experienced guy when it comes to large migrations and, from a customer perspective, will be speaking about the challenges of managing third-party suppliers, an essential aspect of any large migration, especially in an era of offshoring.