Data migration - better than planning

Last time out I blogged about the impossibility of running a major programme on rigid plans, so what are the alternatives?

At the outset let me say that I am not going to address a truly agile approach here. It seems that each ERP or CRM ghetto has its own preferred approach, so SAP and MS Dynamics prefer waterfall and only Salesforce favours full-on agile. I will be addressing an agile approach in a white paper to be published later in the year, so for now this is about waterfall plus.

First let’s remind ourselves of my reservations about formal, detailed planning:

1. The complexity of large programmes is such that it may be impossible, even in theory, to create a plan to the nth degree of detail that will hold together.
2. In any case, in practice, projects that are tanking are never saved by repeated iterations of re-planning.
3. Data migration projects are more than 80 per cent about unplannable activities - with data quality unknowns, source system unknowns and target system unknowns (at the outset) they are more akin to risk and issue management than deliverables-based tasks.

Starting with points 1 and 2, how do we know we will reach our go-live date with healthy data and healthy ETL processes when there is so little we can actually plan?

We need a well structured approach that has controls built into it. I make no apology for using PDMv2 as my model here. Hopefully at an overview level this will make sense to anyone unfamiliar with it.

So within the waterfall variant of PDMv2 we have 4 technical modules:

Landscape analysis (LA) -> Gap analysis & mapping (GAM) -> Migration design & execution (MDE) -> Legacy decommissioning (LD).

And they pretty much work the way they sound. Within LA we perform our data profiling and data discovery, actively seeking out those hidden data stores that secretly run the business.

We start GAM once the target system is becoming clearer, mapping the (now) known sources to the target and dealing with the gaps this reveals. The mapping element of GAM is more amenable to classic planning but of course the gaps are unpredictable.

MDE should be planned down to the minute-by-minute cutover timeline, and the design, build and test elements are also intrinsically plannable once the preceding modules have done their thing. Of course any late-discovered gaps are unpredictable, but we will see later how they can be reduced to planned segments of activity.

LD is usually executed by the client after I have left site but is equally rigorously planned.

So we are going from unplanned to rigorously planned over the course of the programme. Our initial project plan should reflect this. It should show the dependencies with the bigger programme, especially for the inputs to GAM, and the timeframes involved.

But how do we control the landscape analysis phase?

Here we flick to a semi-agile approach and apply time boxing. We know the end date for this phase because of the plan timeframes. We perform our analysis within this time box and prioritise, prioritise, prioritise. One of the golden rules of PDMv2 is 'No organisation wants, needs or will pay for perfect quality data'. We deliver what we can in the time available.

If this is genuinely insufficient then we shout it out early through the DQR process and the programme has to provide more time, provide more resources or accept reduced quality.
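To make the time box idea concrete, here is a minimal sketch of filling a fixed box in priority order and listing what falls outside it. The task names, priorities and effort figures are invented for illustration and are not part of PDMv2; the greedy rule is just one simple way to prioritise.

```python
from dataclasses import dataclass


@dataclass
class AnalysisTask:
    name: str
    priority: int        # 1 = most important to the business (hypothetical scale)
    effort_days: float   # rough estimate, not a commitment


def fill_time_box(tasks, time_box_days):
    """Take tasks in priority order until the time box is full.

    Returns (selected, deferred). Deferred tasks are the ones to
    raise early with the programme: more time, more people, or
    accept reduced quality.
    """
    selected, deferred = [], []
    remaining = time_box_days
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.effort_days <= remaining:
            selected.append(task)
            remaining -= task.effort_days
        else:
            deferred.append(task)
    return selected, deferred


# Illustrative data only
tasks = [
    AnalysisTask("Profile customer master", 1, 5),
    AnalysisTask("Discover departmental spreadsheets", 2, 8),
    AnalysisTask("Profile archived orders", 3, 10),
    AnalysisTask("Check address formats", 4, 2),
]
selected, deferred = fill_time_box(tasks, time_box_days=15)
print("In the box:", [t.name for t in selected])
print("Raise with the programme:", [t.name for t in deferred])
```

The deferred list is the useful output: it is what gets shouted out early rather than discovered late.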

The data quality rules (DQR) module within PDMv2 is the glue that holds the technical stream and the business engagement stream together. Sitting astride the two, it brings the business and technical players together regularly to prioritise one activity over another. It gives a duration and a mini 'fix it' plan (even if that plan is 'ignore') to each issue and then manages and reports on these mini plans. This turns unplanned activity into planned activity and resolves issue number 3.
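As a rough illustration of how an issue becomes a mini plan that can be tracked, the sketch below records each issue with an action, an owner and a duration, then reports on status. The field names, action labels and dates are my own assumptions for the example, not the PDMv2 DQR template.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataQualityRule:
    issue: str
    action: str          # hypothetical labels, e.g. "fix in source", "ignore"
    owner: str
    raised: date
    duration_days: int   # agreed duration of the mini 'fix it' plan
    done: bool = False

    @property
    def due(self):
        return self.raised + timedelta(days=self.duration_days)

    def status(self, today):
        if self.done:
            return "closed"
        return "overdue" if today > self.due else "on track"


def report(rules, today):
    """Count rules by status so the mini plans can be reported on."""
    counts = {"closed": 0, "on track": 0, "overdue": 0}
    for rule in rules:
        counts[rule.status(today)] += 1
    return counts


# Illustrative data only
rules = [
    DataQualityRule("Duplicate customers", "fix in source", "Sales ops",
                    date(2024, 3, 1), 10),
    DataQualityRule("Missing postcodes", "fix in transit", "ETL team",
                    date(2024, 3, 4), 20),
    DataQualityRule("Legacy fax numbers", "ignore", "Data owner",
                    date(2024, 3, 4), 0, done=True),
]
summary = report(rules, today=date(2024, 3, 15))
print(summary)
```

Note that 'ignore' is recorded as a decision with an owner, the same as any other action, so it is reported on rather than forgotten.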

The key measure from GAM onwards is migration readiness - how many units of migration will migrate today, how many next week, how many the week after and so on up to cutover. We take the management outputs from GAM (mappings), DQR (gap resolution) and MDE (ETL build and test) to create a single view of actual and planned readiness.
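A single view of readiness can be sketched along these lines. I am assuming here that a unit of migration counts as ready when its mapping is complete, it has no open DQRs and its ETL has passed test; that definition, the field names and the sample units are illustrative assumptions rather than anything mandated by PDMv2.

```python
from dataclasses import dataclass


@dataclass
class MigrationUnit:
    name: str
    mapped: bool             # from GAM: mapping complete
    open_dqrs: int           # from DQR: unresolved issues against this unit
    etl_tested: bool         # from MDE: ETL built and tested
    planned_ready_week: int  # week number by which it is planned to be ready

    @property
    def ready(self):
        return self.mapped and self.open_dqrs == 0 and self.etl_tested


def readiness_view(units, current_week, cutover_week):
    """Return actual readiness now, planned readiness per week, and late units."""
    actual = sum(1 for u in units if u.ready)
    planned = {
        week: sum(1 for u in units if u.planned_ready_week <= week)
        for week in range(current_week, cutover_week + 1)
    }
    late = [u.name for u in units
            if not u.ready and u.planned_ready_week <= current_week]
    return actual, planned, late


# Illustrative data only
units = [
    MigrationUnit("Customers", True, 0, True, 1),
    MigrationUnit("Products", True, 2, True, 2),
    MigrationUnit("Orders", True, 0, False, 3),
    MigrationUnit("Invoices", False, 1, False, 4),
]
actual, planned, late = readiness_view(units, current_week=2, cutover_week=4)
print("Ready today:", actual)
print("Planned ready by week:", planned)
print("Behind plan:", late)
```

The gap between the actual figure and the planned figure for the current week is the number to watch each week up to cutover.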

And hey presto we have turned an unplannable work stream into a controlled and planned one.

Formal planning and re-planning is the fig leaf that covers the naked truth of plans that were not credible in the first place. We diligently report on them as if they were in any way a guide to the future when all along we know, but fail to acknowledge, that just around the corner yet another planning cycle looms.

I’m reminded here of a sentiment attributed to the Duke of Wellington after the battle of Waterloo - the French, he said, made their plans like the fancy harnesses on the horses of cavalrymen on parade. They look magnificent but one broken link and they are useless.

The Iron Duke, however, made his plans like the harnesses on a yeoman’s carthorse. Not decorative but if a link broke he could always 'Tie another bloody knot in it'. With PDMv2 we make a series of smaller plans and control the whole with a view to the end point - how can we be sure that we will reach the go-live goal on time and with data of sufficient quality to answer both technical and business requirements?

In the next blog in this series I will be looking at using a more iterative approach and its benefits but still within an overall waterfall framework. When it comes to agile - well that has its own challenges and for that I promise a white paper with an accompanying blog in the early autumn.

Johny Morris
Follow me on Twitter @johnymorris #PDMv2