How many employers or clients require us to have specific industry qualifications in order to take up relevant roles - and even check that our claims are correct? Where is the regulatory oversight that says: ‘You must ensure that job role XX is filled by someone who has the relevant professional qualifications’?
During the last 15 years of my full-time career I had the term ‘architect’ in my job title. Six years into this period I was enrolled on an ISEB (now BCS Professional Certification) ‘architecture’ course. This took a few days and, although it wasn’t trivial, it was rather lightweight. My employer at the time was a large, respectable company with a significant history.
There are of course many professional titles and disciplines other than architect, but it’s a good one to use for comparison.
Civil and Naval Architects undergo multi-year formal training, are overseen by professional bodies, and must carry substantial indemnity against malpractice. Such malpractice could cause significant financial loss and major loss of life.
Why are IT architects different? Not all our work is risk-critical, but a lot of it is, and that proportion is growing.
Professional indemnity insurance is available to us. Are we required to take it up, or do our employers have to provide it? How many do?
In the civil building world, architects consider not just the details of structure, form and shape of the building they design: they must take into account a complex and mandatory set of construction and environmental regulations, consider the relationship of their building with its surroundings and, critically, understand the way the building will be used by real people. It often happens that the architect even uses their skills to attempt to change human behaviour - witness Le Corbusier (although the way he wanted to change people may not be to everyone’s liking!).
Where are we on this scale?
Here’s an example:
A former colleague once worked in software development at British Airways (no: he was a programmer. Let’s stop the job title inflation nonsense that results in titles like ‘Waste Management Executive’). He said that the staff were always deeply respectful when ‘the architect’ appeared on their floor. It sounds like a serious operation, as you would expect from a long-established major airline.
Roll forward a few years, and BA’s systems were knocked out by two simultaneous ‘data centre failures’ - whatever that really means. I don’t imagine it was the same architect in charge of systems design by then, but the individual is irrelevant: this should not happen.
BA is a good example to use. If an equivalent catastrophic failure had occurred in one of their aircraft, a whole suite of mandatory processes would have kicked in. BA, the aircraft manufacturer, and any other involved party would have been required by the relevant national air accident investigation body to co-operate fully, and no stone would have been left unturned until the exact cause of the accident was understood by all. Then, the entire industry would be required to apply the lessons learned.
Where is the equivalent mandatory discovery and learning process that covers the data centre failure?
You may argue that the data centre failure didn’t involve any death or injury. Can you be sure that the inability to process hundreds of thousands of passenger journey details was less damaging to those affected than a 747 crashing?
Another news story has emerged in the UK recently. Because of a ‘problem with an IT upgrade’ (a frustrating but inevitable lack of detail in news reports!), the UK health service caused around half a million women to miss routine breast scans. Estimates of additional deaths are in the range of 135 to 270. Are we close to that 747? That is ONE system within the health service. There are lots more.
TSB, a major UK bank, suffered a multi-day outage in its customer-facing systems at the very least; it is not clear whether non-customer-facing systems were also affected. Many customers were unable to access their accounts, found errors in their accounts and, in a number of cases, were able to see details of others’ accounts. This isn’t the first such failure at a UK bank in recent years.
Of course, the bank claims that it ran comprehensive test migrations to prove the process. The response has to be: ‘No, you didn’t. Whatever you did must have missed some significant pathway. That is not good test planning.’
Once again, how much damage, both economic and non-economic, results? If even a regulated financial institution can suffer this sort of computing failure, surely the whole computing industry should sit up and take notice, and tighten up self-regulation before government imposes regulation?
In the TSB case it appears that the problem was probably due to a rushed data migration, possibly because of senior management pressure. We’ve all been there... If the senior IT team had been regulated and qualified to a suitable standard, they would have been far better able to resist such pressure: ‘Our professional standards are externally regulated and do not permit us to comply with your demands, due to excessive risk.’
Since then we have seen Visa, and now Mastercard, suffer significant outages.
The mantra in our industry is increasingly ‘move fast and break things’. All well and good, but suppose the entity ‘moving fast’ is an autonomous vehicle?
Of course, increased regulation, whether imposed from within or without, is likely to slow innovation. But there comes a point where an industry no longer needs protection, and is large, mature and ‘mission-critical’ enough that the balance between freedom to innovate and liability for failure must shift. It has happened in other industries, and I think ours is well past that tipping point.
This industry should set standards equivalent to those that are normal in the ‘traditional’ professions. If it doesn’t, governments will step in.
Note: BCS (and others) do a lot of good work to improve professional standards in the IT industry.