Input Validation
Input validation ensures that only correctly formatted data enters an information system, so that malformed data cannot cause malfunctions in downstream components. Neglecting it can have severe consequences, as the incident described below shows.
The 2024 CrowdStrike incident
On 19 July 2024, American cybersecurity company CrowdStrike distributed a faulty update to a configuration (or channel) file for its security software that caused widespread problems with computers running Microsoft Windows. As a result, roughly 8.5 million systems crashed and were unable to properly restart in what has been called the largest outage in the history of information technology.
The CrowdStrike failure was a compound failure: one that could, and should, have been prevented at multiple steps in the business process.
How input validation could have prevented the outage
CrowdStrike provides its customers with a kernel driver named Falcon Sensor. The failure was not caused by an update to this kernel driver, but by an update to a configuration file that it uses. It turned out that this “channel” file was malformed and, according to early reports, contained only zeros. This caused a logic error in the CrowdStrike Falcon Sensor that resulted in the blue screen of death, bringing down roughly 8.5 million devices worldwide.
With the right input validation, the Falcon Sensor could have prevented the crash. The program should assume that the input file could be wrong. See the OWASP Input Validation Cheat Sheet for actionable guidance on providing input validation security functionality in your applications.
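To make "assume that the input file could be wrong" concrete, here is a minimal sketch in Python. The file layout (a `CHNL` magic number, a version, a field count, and 32-bit fields), the limits, and all names are assumptions invented for illustration. They do not describe CrowdStrike's real channel file format, and the real sensor is a kernel driver, not a Python program.

```python
import struct

# Hypothetical layout, invented for this sketch (NOT the real channel file format):
#   4-byte magic b"CHNL", 2-byte version, 2-byte field count (little-endian),
#   followed by `field count` unsigned 32-bit little-endian fields.
MAGIC = b"CHNL"
HEADER = struct.Struct("<4sHH")
SUPPORTED_VERSION = 1
EXPECTED_FIELDS = 20
MAX_SIZE = 1024 * 1024  # arbitrary upper bound for this sketch


class InvalidChannelFile(ValueError):
    """Raised when a channel file fails validation."""


def parse_channel_file(data: bytes) -> list[int]:
    """Validate the raw bytes and return the fields, or raise InvalidChannelFile."""
    # Check the size first, so the header can be unpacked safely.
    if not HEADER.size <= len(data) <= MAX_SIZE:
        raise InvalidChannelFile(f"unexpected size: {len(data)} bytes")
    # Reject a file that consists only of zero bytes.
    if not any(data):
        raise InvalidChannelFile("file contains only zero bytes")

    magic, version, field_count = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise InvalidChannelFile("bad magic number")
    if version != SUPPORTED_VERSION:
        raise InvalidChannelFile(f"unsupported version: {version}")
    if field_count != EXPECTED_FIELDS:
        raise InvalidChannelFile(
            f"expected {EXPECTED_FIELDS} fields, got {field_count}"
        )

    # The declared field count must match the bytes that are actually present.
    body = data[HEADER.size:]
    if len(body) != field_count * 4:
        raise InvalidChannelFile("body length does not match field count")
    return list(struct.unpack(f"<{field_count}I", body))
```

Each check rejects the file before the data is used anywhere else, so a malformed update results in a clear error rather than undefined behavior further down the line.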
Compound failure
So if this could have been prevented by proper input validation, why is it considered a compound failure?
The outage could also have been prevented or reduced if:
- The Falcon Sensor had implemented exception handling around the loading of the “channel” file.
- The Falcon Sensor had been developed using test-driven development, assuming that the developer would have identified the possibility of a corrupt “channel” file while writing the tests.
- The CI/CD pipeline at CrowdStrike had included steps for validating channel files.
- The CI/CD pipeline at CrowdStrike had included staged adoption.
- Microsoft had decided not to sign kernel drivers whose behavior can be modified by configuration files.
- Microsoft had created a process for validating configuration files for kernel drivers.
- … and probably many more
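The first item in the list, exception handling around the loading of the file, can be sketched as follows. This is an illustration of the pattern only: `load_with_fallback`, its parameters, and the "last known good" fallback are hypothetical names and design choices made for this example, and error handling in a real kernel-mode driver works differently from Python exceptions.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("sensor")  # hypothetical logger name


def load_with_fallback(
    path: str,
    parse: Callable[[bytes], T],
    last_known_good: Optional[T] = None,
) -> Optional[T]:
    """Load and parse a config file; keep the previous config if anything fails.

    `parse` is expected to raise ValueError (or a subclass) for invalid content.
    """
    try:
        with open(path, "rb") as fh:
            return parse(fh.read())
    except (OSError, ValueError) as exc:
        # Reject the update and keep running instead of crashing the host.
        log.error("rejecting %s: %s", path, exc)
        return last_known_good
```

The design choice here is that a bad update degrades to "no update" rather than to a crash: the caller keeps running on the previous configuration while the rejected file is logged for investigation.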
This list aims to show that this was not the failure of one person, but rather a result of insufficient business processes.
“A bad system will beat a good person every time.”
— W. Edwards Deming
References
- Input Validation Cheat Sheet - OWASP
- File Content Validation - OWASP
- Error Handling Cheat Sheet - OWASP
- 2024 CrowdStrike incident - Wikipedia
- Workaround - CrowdStrike
- CrowdStrike Update: Latest News, Lessons Learned from a Retired Microsoft Engineer - YouTube: Dave’s Garage
- CrowdStrike IT Outage Explained by a Windows Developer - YouTube: Dave’s Garage
- Test-driven Development - Wikipedia
- CI/CD - Wikipedia
- Phased adoption - Wikipedia