The issue was linked to an update from cyber security firm CrowdStrike.
The flawed update only hit Microsoft Windows systems - with many users worldwide seeing the “blue screen of death” as their computers crashed and could not be restarted.
Richard Forno is Principal Lecturer in Computer Science and Electrical Engineering at the University of Maryland, Baltimore County.
OPINION
The global information technology outage on July 19, 2024, that paralysed organisations ranging from airlines to hospitals and even the delivery of uniforms for the Olympic Games represents a growing concern for cyber security professionals, businesses and governments.
The outage is emblematic of the way organisational networks, cloud computing services and the internet are interdependent, and the vulnerabilities this creates. In this case, a faulty automatic update to the widely used Falcon cyber security software from CrowdStrike caused PCs running Microsoft’s Windows operating system to crash. Unfortunately, many servers and PCs need to be fixed manually, and many of the affected organisations have thousands of them spread around the world.
For Microsoft, the problem was made worse because the company released an update to its Azure cloud computing platform at roughly the same time as the CrowdStrike update. Microsoft, CrowdStrike and other companies such as Amazon have issued technical workarounds for customers willing to take matters into their own hands. But for the vast majority of global users, especially companies, this isn’t going to be a quick fix.
Modern technology incidents, whether cyber attacks or technical problems, continue to paralyse the world in new and interesting ways. Massive incidents such as the CrowdStrike update fault not only create chaos in the business world but disrupt global society itself. The economic losses resulting from such incidents – lost productivity, recovery, disruption to business and individual activities – are likely to be extremely high.
As a former cyber rsecurity professional and current security researcher, I believe that the world may finally be realising that modern information-based society is based on a very fragile foundation.
Interestingly, on June 11, 2024, a post on CrowdStrike’s own blog seemed to predict this very situation – the global computing ecosystem compromised by one vendor’s faulty technology – though it probably didn’t expect that its product would be the cause.
Software supply chains have long been a serious cyber security concern and potential single point of failure. Companies such as CrowdStrike, Microsoft, Apple and others have direct, trusted access into organisations’ and individuals’ computers. As a result, people have to trust that the companies are not only secure themselves, but that the products and updates they push out are well-tested and robust before they’re applied to customers’ systems. The SolarWinds incident of 2019, which involved hacking the software supply chain, may well be considered a preview of today’s CrowdStrike incident.
CrowdStrike CEO George Kurtz said “this is not a security incident or cyber attack” and that “the issue has been identified, isolated and a fix has been deployed”. While perhaps true from CrowdStrike’s perspective – they were not hacked – it doesn’t mean the effects of this incident won’t create security problems for customers. It’s quite possible that in the short term, organisations may disable some of their internet security devices to try to get ahead of the problem, but in doing so they may have opened themselves up to criminals penetrating their networks.
It’s also likely people will be targeted by various scams preying on user panic or ignorance regarding the issue. Overwhelmed users might either take offers of faux assistance that lead to identity theft, or throw away money on bogus solutions to this problem.
What to do
Organisations and users will need to wait until a fix is available or try to recover on their own if they have the technical ability. After that, I believe there are several things to do and consider as the world recovers from this incident.
Companies will need to ensure that the products and services they use are trustworthy. This means doing due diligence on the vendors of such products for security and resilience. Large organisations typically test any product upgrades and updates before allowing them to be released to their internal users, but for some routine products such as security tools, that may not happen.
Governments and companies alike will need to emphasise resilience in designing networks and systems. This means taking steps to avoid creating single points of failure in infrastructure, software and workflows that an adversary could target or a disaster could make worse. It also means knowing whether any of the products organisations depend on are themselves dependent on certain other products or infrastructures to function.
Organisations will need to renew their commitment to best practices in cyber security and general IT management. For example, having a robust backup system in place can make recovery from such incidents easier and minimise data loss. Ensuring appropriate policies, procedures, staffing and technical resources is essential.
Problems in the software supply chain like this make it difficult to follow the standard IT recommendation to always keep your systems patched and current. Unfortunately, the costs of not keeping systems regularly updated now have to be weighed against the risks of a situation like this happening again.