
Global IT outage disrupts key sectors, prompts calls for reform

A major global IT outage has severely disrupted industries around the world, including airlines, media and banking. The disruption, one of the largest since 2017, has exposed the vulnerabilities inherent in shared cloud-based platforms and drawn sharp criticism of the risks of relying on them.

Tom Simnett, founder and director of Manchester-based technology firm Initforthe, expressed concerns about the reliance on cloud services. “The unfortunate consequence of everyone relying on cloud platforms that everyone uses is that when they go down, everyone goes down and that’s hugely disruptive,” he said. Simnett stressed that while unified systems like Office 365 streamline business processes, they also create a single point of failure, potentially exposing companies to significant operational risk.

Dafydd Vaughan, Chief Technology Officer at Public Digital, echoed those sentiments. Vaughan, who played a key role in the UK’s Government Digital Service, said today’s outage illustrates the vulnerability of a connected digital world. “This issue appears to be most likely a bug – a faulty update sent to hundreds of millions of computers around the world,” he said. The ongoing disruption highlights the need for companies and governments to develop robust mitigation strategies to protect against such outages.

Vaughan explained that a major contributing factor to the crisis was the lack of gradual testing of updates before their widespread rollout. He urged organizations to adopt a gradual rollout process, testing updates on a limited number of machines to ensure compatibility and functionality. “Today’s crisis could have been avoided if companies had first rolled out computer updates to a few machines to see if they worked, rather than sending them to all machines at once,” he said.

Similarly, Tom Marsland, vice president of technology at Cloud Range, pointed to the challenges of recovering from such a disaster. The steps involved in the recovery process, including manually booting affected computers into recovery mode and administering patches one by one, can be laborious. “It’s not something that can be done remotely, and in many organizations it will require an administrator. That means someone in IT support will be going from computer to computer and doing it manually,” Marsland noted.

From Marsland’s perspective, the disaster was entirely avoidable had proper change and configuration management been in place. He emphasized the need for comprehensive internal testing and phased patching to quickly catch and fix potential vulnerabilities. “For larger organizations, this will take days, probably weeks. Unfortunately, as is the case with many cyberattacks, this is nothing new. Organizations’ failure to follow best practices in testing and patching… is a major cause of this,” he said.

Vaughan added a broader critique of monopolistic tendencies in the digital services industry. He stressed that the concentration of control among a few companies can increase risk, justifying increased competition and diversification of providers of essential digital services. “The government needs to consider the risks of having such a small number of companies controlling such a large part of our essential infrastructure,” he said. “That also introduces risk. We need to balance the rewards with the risks and be aware that these kinds of problems can — and increasingly will — happen.”

The outage and the resulting recriminations from the digital community underscore the urgent need for more resilient infrastructure, stricter testing protocols, and a more diverse ecosystem of digital services. As technologies advance and become increasingly intertwined with everyday life, ensuring their reliability and security becomes a priority.