An executive understanding of CrowdStrike and the software outage
7 min | Neil Khatod | Article | Information technology sector
The recent outage due to CrowdStrike’s software misconfiguration has caused an estimated $5.4 billion (and counting) in direct damages to the global economy. As such, many executives are trying to figure out what happened and what a good next move would be. With several of CrowdStrike’s competitors taking advantage of the situation, I thought it would be useful to lay out what happened in easy-to-understand terms, and to give enough background to explain how CrowdStrike became so integral that their outage impacted 25% of the Fortune 500 (Axios.com).
Let’s start with what CrowdStrike is and why they are a major player. CrowdStrike started in antivirus in 2011. However, they shifted focus to mapping advanced cyber threat actors in 2014. That year was critical for cyber security for two reasons:
1. Lockheed Martin published the cyber kill chain paper.
2. North Korea (according to the FBI) hacked Sony Pictures.
During this time, CrowdStrike was at the epicenter of a revolution in cyber security.
From then on, CrowdStrike grew both organically and through acquisition. They expanded their cyber threat analysis with advanced tools that analyze logs generated by network traffic and by actions on the endpoint (or computer). The tool they rolled out is Falcon X. As of 2020, CrowdStrike has been seen as one of the big three in cyber security, having been called to testify before Congress on their understanding of the Democratic National Committee breach.
So, with a company that big and advanced, what went wrong? To answer this, I will dissect what happened on July 19th to show what best practice looks like and how CrowdStrike deviated from industry standards last week.
Learn more about our Cyber Security Solutions Service
Cyber threat intelligence
CrowdStrike’s major contribution to cybersecurity has always been their ability to research malicious cyber actors, their techniques, and the systems vulnerable to them. They do a great job of assessing the severity and, if warranted, turning those techniques into something that companies can search for on their network, as the steps in the diagram below show:
1. CrowdStrike observes new techniques used by malicious cyber actors.
2. CrowdStrike decides the threat is credible and puts together a plan to detect and prevent it.
3. Developers at CrowdStrike use the indicators of compromise to build code that detects and mitigates on computer endpoints.
4. The new code is tested in a cloud sandbox with a mock-up of every environment it may encounter.
5. CrowdStrike certifies the update and deploys to clients.
6. CrowdStrike’s cloud servers set up a channel to deliver the new code and files to all customers’ computers.
7. Once deployed, the new SYS files (from CrowdStrike) are used to evaluate the Windows processes at the root level. On Friday, C-00000291.sys, delivered from CrowdStrike with a timestamp of 0409 (Greenwich Mean Time), caused internal Windows processes (called pipes) to stop working. This caused the major outage.
Why is the test environment critical?
CrowdStrike should then run these new rules against a test environment: a sandbox of all the various technologies on the endpoints and servers that the rules will change. The endpoint agent (Falcon X) can make code changes at the root process level on every machine it touches (see points 5-7 on the diagram). This is why the test environment is critical. This is where CrowdStrike failed, even though this standard has been applied to software since the days of Windows 3.11 (early 1990s).
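To make the idea of a test gate concrete, here is a minimal sketch in Python. It is not CrowdStrike’s actual tooling: the `Environment` class, the `certify` function, the sandbox entries, and the process names are all hypothetical, chosen only to show an update being certified when it passes in every sandboxed environment and rejected otherwise.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Environment:
    """A hypothetical sandbox target, e.g. one operating system build."""
    name: str
    critical_processes: frozenset[str]


# A hypothetical "update": given a process name, return True to terminate it.
Update = Callable[[str], bool]


def failing_environments(update: Update, environments: Iterable[Environment]) -> list[str]:
    """Return the names of environments where the update would kill a critical process."""
    failures = []
    for env in environments:
        if any(update(proc) for proc in env.critical_processes):
            failures.append(env.name)
    return failures


def certify(update: Update, environments: Iterable[Environment]) -> bool:
    """Certify the update only if it is safe in every sandboxed environment."""
    return not failing_environments(update, environments)


# Illustrative sandbox matrix (assumed names, not a real inventory).
SANDBOX = [
    Environment("windows-10", frozenset({"lsass.exe", "csrss.exe"})),
    Environment("windows-11", frozenset({"lsass.exe", "wininit.exe"})),
]


def good_update(process_name: str) -> bool:
    # Targets one specific, made-up malicious process.
    return process_name == "malware.exe"


def bad_update(process_name: str) -> bool:
    # Overbroad: would terminate every executable, including critical ones.
    return process_name.endswith(".exe")


print(certify(good_update, SANDBOX))
print(failing_environments(bad_update, SANDBOX))
```

The point of the sketch is the gate itself: nothing reaches customers unless the list of failing environments is empty.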
After testing, the quality control team releases the software to the operational network for implementation. The new rules get loaded into cloud-based servers for deployment to clients. From there, the servers establish a channel with each client computer.
Once the rules are deployed to the computer, the CrowdStrike Falcon software integrates the new rules and assesses the processes running on the computer. It is these processes (or pipes) that the software evaluates for legitimacy. On Friday, C-00000291.sys was stored on millions of computers in channel 291. It then misevaluated and killed legitimate processes that were critical for Windows to work.
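As a toy illustration of how a rule can misevaluate legitimate processes, the Python sketch below matches process names against simple wildcard patterns. This is an assumption-laden simplification, not the Falcon rule format or mechanism: the process list, the `flagged` function, and both rule sets are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical process names running on one machine (illustrative only).
RUNNING = ["explorer.exe", "lsass.exe", "svchost.exe", "dropper_x.exe"]


def flagged(processes, patterns):
    """Return the processes that match at least one rule pattern."""
    return [p for p in processes if any(fnmatch(p, pat) for pat in patterns)]


# A narrow rule targets only the made-up malicious process.
narrow_rules = ["dropper_*.exe"]

# An overbroad rule matches every executable, legitimate or not.
overbroad_rules = ["*.exe"]

print(flagged(RUNNING, narrow_rules))     # only the malicious process
print(flagged(RUNNING, overbroad_rules))  # every process, including critical ones
```

Because the agent acts on whatever the rules flag, a single faulty rule pushed to every machine has the same effect everywhere at once.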
By the time CrowdStrike resolved the problem, the damage had been done. So, there is a great lesson here: cyber tools get root access to the most sensitive parts of your network. Leaders (at all levels, not just CISOs/CIOs) need to understand the impact that information technology, and more specifically cyber security tools, have on their business operations.
How our cyber solutions can help your organization
For technical leaders who must evaluate products, it is critical to ask how the tools work and how vendors validate functionality before going live. If you can’t do it yourself, hire a third-party consultant who specializes in this. Too often I have watched companies adopt an industry leader’s tools, only to be disenchanted with their performance. I am sure that, if truth were told, many have buyer’s remorse for CrowdStrike. I can say that in my former role, we chose a smaller competitor to CrowdStrike, and we were pleased with both performance and ROI. I am happy to discuss this further with anyone who has a need.
About this author
Neil Khatod – Head of Cyber Security, Hays Americas
Neil Khatod brings decades of experience to the table. A retired Army officer with over 29 years of experience as a chief operating officer and chief executive officer in cyber operations, international relations, leadership development and strategy creation, Neil combines technical prowess with strategic vision.
Neil managed a $1.9 billion cybersecurity budget, built the U.S. Army’s world-class cyber analytics lab studying malware, and improved the Army’s cybersecurity posture by 200%. His approach is pragmatic, ensuring that security aligns seamlessly with business objectives.