Here’s a good overview of what happened: https://www.thestack.technology/crowstrike-null-pointer-blamed-rca/
Lit, I’ve been waiting for this.
Edit: That’s mostly a high-level overview. Do you have some actual reverse-engineering you can point me to?
It’s a proprietary enterprise security product, so I think it’ll be difficult to get information until they give a proper post-mortem (if they do so). Here’s hoping someone can put it all together, though.
From what we have from CrowdStrike so far, the Channel File 291 update was meant to combat the use of Named Pipes by some Windows malware.
This seems to have triggered a null pointer exception in the Falcon kernel driver as it loaded this Channel File. CrowdStrike say it is not related to the large null sections in one of the files, but they haven’t really explained what did trigger it.
Regardless, the kernel driver ought to have been statically analysed to detect this kind of memory hazard, or written in a language that prevents this class of bugs altogether. This is a priority of the US government right now, but CrowdStrike doesn’t seem to have got the memo.
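To make the hazard class concrete: this is essentially an unchecked pointer dereference. Here’s a minimal sketch in C (the struct and function names are invented for illustration, not taken from the real driver) of the pattern a static analyser is meant to flag, next to the guarded version:

```c
#include <stddef.h>
#include <stdio.h>

/* Invented rule struct, purely for illustration. */
struct rule {
    const char *pattern;
};

/* The hazard: dereferences r without checking it.
 * Faults if r (or r->pattern) is NULL. */
static int match_unsafe(const struct rule *r, const char *input) {
    return r->pattern[0] == input[0];
}

/* The guarded version a static analyser would push you towards. */
static int match_guarded(const struct rule *r, const char *input) {
    if (r == NULL || r->pattern == NULL || input == NULL)
        return 0;
    return r->pattern[0] == input[0];
}

int main(void) {
    struct rule ok = { "\\pipe\\" };
    printf("%d\n", match_unsafe(&ok, "\\pipe\\evil"));   /* 1: fine with a valid rule */
    printf("%d\n", match_guarded(NULL, "\\pipe\\evil")); /* 0: handled, no crash */
    /* match_unsafe(NULL, "\\pipe\\evil") would dereference NULL,
     * and in a kernel driver that means a bugcheck, i.e. a BSOD. */
    return 0;
}
```

In a memory-safe language the unchecked version wouldn’t get past the compiler: in Rust, say, you’d be holding an Option<&Rule> and would be forced to handle the None case before touching it.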
I mean, even basic testing would have caught this. It’s not as though the crash is rarely triggered.
For this Channel File, yes. I don’t know what the failure rate is - this article mentions 40-70%, but there could well be a lot of variance between different companies’ machines.
The driver has presumably had this bug for some time, but they’ve never had a channel file trigger it before. I can’t find any good information on how they deploy these channel files, other than that they push several changes per day. One would hope these are always run against a diverse set of test machines to validate there’s no impact to functionality, but only they know the procedure there. It might vary based on how urgent a mitigation is or how invasive it’ll be - though they could just be winging it. It’d be interesting to find out exactly how this all went down.
Being a Linux person, I’m a bit OOTL on what exactly a channel file is or how it relates to a driver. Are they in userspace, then? That would make it slightly less insane that they didn’t check it thoroughly before their Friday update.
It’s a proprietary config file. I think it’s a list of rules to forbid certain behaviours on the system. Presumably it’s downloaded by some userland service, but it has to be parsed by the kernel driver. I think the files get loaded OK, but the driver crashes when iterating over an array of pointers. Possibly these are the rules and some have uninitialised pointers, but this is speculation based on some kernel dumps on Twitter. So the bug probably existed in the kernel driver for quite a while, but they pushed a (somehow) malformed config file that triggered the crash.
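To make that speculation concrete, here’s roughly what such a pattern could look like, sketched in C with entirely invented names and layout (a table of rule pointers built from the file, with the entry count trusted from the header and some slots never filled in):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RULES 8

/* Invented structures; the real channel file format is undocumented. */
struct rule {
    uint32_t id;
};

struct rule_table {
    size_t declared_count;            /* taken on trust from the file header */
    struct rule *entries[MAX_RULES];  /* slots the parser is meant to fill in */
};

/* Crash-prone: trusts declared_count and assumes every slot was populated. */
static uint32_t sum_rule_ids_unsafe(const struct rule_table *t) {
    uint32_t sum = 0;
    for (size_t i = 0; i < t->declared_count; i++)
        sum += t->entries[i]->id;     /* faults if entries[i] was never set */
    return sum;
}

/* Defensive: clamps the count and skips unset slots. */
static uint32_t sum_rule_ids_defensive(const struct rule_table *t) {
    size_t n = t->declared_count < MAX_RULES ? t->declared_count : MAX_RULES;
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (t->entries[i] == NULL)
            continue;                 /* malformed file: slot never filled */
        sum += t->entries[i]->id;
    }
    return sum;
}

int main(void) {
    struct rule r0 = { 291 };
    struct rule_table t;
    memset(&t, 0, sizeof t);          /* zeroing makes the unused slots NULL */
    t.entries[0] = &r0;

    t.declared_count = 1;
    (void)sum_rule_ids_unsafe(&t);    /* safe while the count matches reality */

    t.declared_count = 4;             /* a malformed header claims rules that were never parsed */
    (void)sum_rule_ids_defensive(&t); /* still fine: skips the three empty slots */
    /* sum_rule_ids_unsafe(&t) would now dereference NULL at i == 1,
     * and in kernel mode that takes the whole machine down. */
    return 0;
}
```

Again, purely a guess at the shape of it; whether the bad entries were NULL or just uninitialised garbage, only CrowdStrike can say.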
Sorry, I haven’t checked whether there’s a more detailed analysis yet.
Unfortunately most of the stuff I see linked is Twitter, and I’m not in the walled garden.
Same. I can see some of it in between popovers about my account being suspended, getting rate limited, or of course “something went wrong”. I don’t understand why there are people who still only post there.
I’m sure somebody will do a proper write-up in a few days.