Ask HN: Can we blame Windows for CrowdStrike outage?
30 points by pbrw on July 19, 2024 | 48 comments
A CrowdStrike update caused Windows to crash and fail to even start. It's definitely CrowdStrike's fault, but I feel that a good OS should prevent a 3rd party app from causing such damage. Can Windows take part of the blame for that outage? Would it happen on Linux?


Kinda. Yes. Apple moved away from deep kernel extensions years ago. They are no longer permitted on their latest releases.

Of course something like an EDR requires kernel level access otherwise it's too easy to bypass. But Apple has system extensions as a useful compromise. They're basically kernel level APIs that can be called by validated signed software. I think it's a good alternative to just allowing random code to run in the kernel.

The thing is, Apple has a habit of going to software vendors and saying: "We're changing this next year. There'll be a 2 year deprecation period and after that we'll lock you out. So change up or die off. We don't care."

Microsoft doesn't really do this and even if they do there's a lot of ifs and buts. They're much more receptive to the concerns of legacy software vendors because they represent a much bigger share of their market and the customer base (enterprise market) that cares about legacy is also very big and vocal.

Needless to say this is also the customer base that got heavily hammered by what happened today. But nobody thinks about that until it actually happens.


Back around 2000 or 2001 McAfee or Symantec (I can’t remember) released some virus definitions that caused Macs to kernel panic repeatedly. I worked at a college at the time which required students run it, and had to deal with the fallout.

OS X was using Unix back then as well, and the foundational design didn’t save it. But like you mention, Apple does more to protect the underlying system today than they did back then. I can’t even remember the last kernel panic I had. They used to be a semi-regular occurrence.


Given the dates 2000/2001 you mentioned it sounds more like Classic Mac OS than Mac OS X (which wasn't released until March 24, 2001)


Sorry, my timeline was messed up. It was around 2004/5.

It was OS X for sure, Tiger to be specific.


> OS X was using Unix back then

IIRC OS X, and MacOS, are and have been certified UNIX for 20ish years.


There are Linux distributions that have been certified UNIX, it really doesn't mean anything anymore.


Back in Windows Vista Microsoft removed access to the kernel making the CrowdStrike incident impossible. Microsoft had stated that no companies would be allowed to access Vista's core for security reasons, but Symantec launched an official complaint over the matter with the European Union and eventually Microsoft caved in.

https://arstechnica.com/information-technology/2006/10/7998/


> Would it happen on Linux?

It absolutely would. Windows is way ahead in terms of driver isolation and stability. Most drivers cannot bring down the system (your GPU driver can crash, your screen will flicker and maybe apps that were using it will crash, but the system recovers). Not so on Linux unfortunately, any driver will bring the system down.

More recently, certain classes of drivers have been making use of eBPF or virt2, which helps isolate the driver. But I know for a fact that CS on Linux is as low-level as it can be.

So long story short only MacOS is resistant to this, because they've simply deprecated any third party kernel extensions/modules.


> your GPU driver can crash, your screen will flicker and maybe apps that were using it will crash, but the system recovers

Only if the userspace part of the driver has crashed. It's not really relevant to what happened here (it was kernel code).


Yes, Crowdstrike specifically disabled Linux systems in the past: https://news.ycombinator.com/item?id=41005936


On my work machine, crowdstrike is using ebpf. Not loading any kernel modules.


> Not so on Linux unfortunately, any driver will bring the system down.

I work for a company that provides secure endpoints that are Linux based (and can also run Windows apps without issue). We do not ship Linux kernels that contain drivers that cause crashes.

Per our IT patterns and also mandated by our commercial contracts, we investigate any crashes that occur. Over the last five years, not a single crash has been caused by a device driver.

We also will never have a use for the style of security software that Crowdstrike has. Our security stack is proactively preventative, not reactive or looking for "anomalies".

So this "it absolutely would [happen on Linux]", should have an asterisk, a huge one glaring like the sun, and without said asterisk is inaccurate at an extreme.


It is a fact that Linux driver crashes cause a kernel crash, and it’s not that rare (USB drivers, for example). Even GPU drivers cause this [0].

But good job on not having a crash from a driver so far. Raspberry Pi users weren’t that lucky for example.

[0]: https://forums.developer.nvidia.com/t/bug-report-455-23-04-k...


The Linux kernel images my company ships do not use the shit driver you are citing. We intentionally do not deploy hardware with these chips for this reason.

Yes, there are tons of Linux images out there built by stupid people that crash all the time, which is not Linux's fault but the fault of those stupid people. As should already have been clear enough, or so I thought, from the comment you replied to: my company doesn't ship stupid images built by stupid people.


> It is a fact that Linux driver crashes cause a kernel crash

See what was already written in the comment you replied to:

  We do not ship Linux kernels that contain drivers that cause crashes.
We do not ship drivers that crash, hence they do not cause a crash.


CS needs some hardcore permissions to scan items on a VERY low level. So they can essentially push out shit code to screw up any OS. Including Mac and Linux. Don't see how MS/Windows are to blame for someone not testing a content update before pushing to the whole damn world. They should at least push out to a subset of users before destroying everyone's Friday!
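
The "push out to a subset of users first" idea can be sketched as below. This is a minimal illustration, not a description of how CrowdStrike's update pipeline works: the function names, the stage percentages, and the crash-rate threshold are all hypothetical. Each host is hashed into a stable bucket from 0 to 99, the update is only offered to hosts whose bucket falls under the current rollout percentage, and the rollout halts if crash reports exceed the threshold.

```python
import hashlib

# Hypothetical rollout stages: percentage of the fleet offered the update.
ROLLOUT_STAGES = [1, 10, 50, 100]


def bucket_for(host_id: str, update_id: str) -> int:
    """Map a host to a stable bucket in [0, 100) for a given update."""
    digest = hashlib.sha256(f"{update_id}:{host_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_receive(host_id: str, update_id: str, percent: int) -> bool:
    """True if this host is inside the current rollout percentage."""
    return bucket_for(host_id, update_id) < percent


def next_stage(current_percent: int, crash_reports: int, hosts_updated: int,
               max_crash_rate: float = 0.001) -> int:
    """Advance to the next stage only if the crash rate stays under the limit.

    Returns 0 to signal that the rollout should be halted.
    """
    if hosts_updated > 0 and crash_reports / hosts_updated > max_crash_rate:
        return 0
    for stage in ROLLOUT_STAGES:
        if stage > current_percent:
            return stage
    return 100
```

Because the bucket depends only on the host and update IDs, a host that received the update at 10% is still included at 50%, so widening the rollout never flips a machine back and forth.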


I think this is a cultural issue with the Windows world. Look at projects like ssh and git, which exist on both platforms, but on Windows they get shipped without documentation, not even for the Windows-specific things. Want to know where to put your GlobalKnownHosts on Windows? F** you.


Isn’t that better explained by the fact that neither was created with Windows in mind?

git was literally created to develop Linux.


The code running in the kernel can do anything, including crashing the machine on boot. The only real way to prevent it is to disallow third-party kernel code.

If you're sure you don't want to freely use your machine just because third-party code can be dangerous, then yes, you can blame Microsoft for not taking control away from you.


Yes.

MS provided a kernel-level entry point that other OSes didn't need.

MS have an aggressive auto-update policy that is anti-best practise.

MS have a signed binary agreement that doesn't catch the things it is meant to.


CrowdStrike and similar tools have this access so that they can update their agents: when they see a ransomware or attack pattern, they can push a detection out to as many devices as possible to stem the attack. Do you need all this crazy level of kernel access, probably not. I hope they will put some effort into refactoring in the future.


> Do you need all this crazy level of kernel access, probably not

You absolutely do. Otherwise, you'll be unable to detect malware that IS putting itself into the kernel.


Windows might be partly responsible for handling corrupted files poorly. According to this other post, the offending file was filled with NULs:

https://news.ycombinator.com/item?id=41009740

Such a file will not have the right signature and checksum to be considered a valid executable. Either these were not checked, or whatever was doing the checking responded poorly and left the system in a state that's difficult to recover.
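
The kind of check described here can be sketched as below. It is a minimal illustration under assumed conventions, not the real format of CrowdStrike's content files or of Windows' driver validation: the 4-byte magic value `DEMO`, the header layout, and the function names are hypothetical. The point is only that a file consisting entirely of NUL bytes fails the magic check and is rejected before anything tries to parse it.

```python
import struct
import zlib

MAGIC = b"DEMO"  # hypothetical magic number for this sketch
HEADER = struct.Struct(">4sI")  # magic, then CRC32 of the payload


def pack_content(payload: bytes) -> bytes:
    """Build a file image: header (magic + CRC32) followed by the payload."""
    return HEADER.pack(MAGIC, zlib.crc32(payload)) + payload


def validate_content(blob: bytes) -> bytes:
    """Return the payload if the blob is well-formed, else raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file too short to contain a header")
    magic, expected_crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    payload = blob[HEADER.size:]
    if zlib.crc32(payload) != expected_crc:
        raise ValueError("checksum mismatch")
    return payload
```

A loader built this way would treat a `ValueError` as "skip this file and keep the previous known-good one" rather than as a fatal error.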


You might be interested in this story about CrowdStrike killing people's Linux VMs 3 months ago: https://old.reddit.com/r/debian/comments/1c8db7l/linuximage6...


Windows and BSODs maybe still are trigger words for me, but IMO they are not to blame.

Windows is the platform, CrowdStrike makes a product on this platform and their users willingly install, accept all the security prompts and use it. Short of Apple-style locking users out of their own devices, there is little they can do here.

I'm not well-versed in Windows enough to be able to tell if they provide better ways and safe APIs to achieve what CrowdStrike does, but even if they did, there is no telling if CrowdStrike or anyone else would use that or not.

> but I feel that a good OS should prevent a 3rd party app from causing such damage

It does? It also allows the owner of the machine to bypass those preventions, which is what CrowdStrike seem to require for their product to function.

I think the "OS should protect me from myself" is a very iPad-style expectation from computers. Personally I'm happy there are OSes that don't work this way.

> Would it happen on Linux?

$ modprobe crap-mod

I guess it would.


I think the Linux market is far too fragmented for this kind of thing. I usually wait a few days before applying any updates, and definitely don't do them when I have something important coming up.


As far as I understand, yes. It's kind of astounding to me that the world has self-inflicted what is essentially a cyber attack trying to protect a poorly architected OS from actual cyber attacks when a much better architected OS is known and running on nearly all the servers in the world.

On Windows, software regularly mucks around in the kernel (device drivers, system-level tools like Wireshark, etc.), so security software like CrowdStrike must also muck around in the kernel to monitor what all the other kernel-level software is doing. As demonstrated today, anything that mucks around in the kernel runs the risk of crashing the kernel.

In Linux, software doesn't even get that option. Nothing ever gets kernel access except the kernel itself. Root is not kernel access. The kernel still decides what root is able to do. Drivers that require that access are built into the kernel. Software that requires deeper access like Wireshark tells the kernel what to do (through system calls as root) and the kernel does it on that program's behalf. Therefore, the kernel knows everything that any program does on the system. With a trustworthy kernel, all that security software must do is instruct the kernel to monitor activity on its behalf.


> poorly architected OS

Worth noting: Microsoft had a solution a few years ago that would have prevented this issue from happening, Windows 10X, due to atomic updates.

> In Linux, software doesn't even get that option. Nothing ever gets kernel access except the kernel itself. Root is not kernel access.

Root has kernel access: even if the kernel restricted it, root can write to the disk and change the boot process.

Also worth noting that a popular form of software distribution on Linux is curl http://randomscript.sh | sudo sh, which is arguably worse than anything on Windows.


I've yet to hear a convincing argument that any other installation method is more "secure".


Yeah that's true. Microsoft really needs to push forward with a new architecture at the core of windows. Stuff like what has happened today is inevitable under the current model where so much stuff has kernel level access. I just expected it to happen with something like anti cheat that doesn't have quite the oversight that I would assume CrowdStrike has in comparison.

Root has access to the kernel but the kernel knows everything that happens and that's my point. The kernel won't stop you from compiling a new kernel and setting it to run at the next boot. However, CrowdStrike running on Linux with eBPF for example would be able to identify and prevent such tampering without truly being in the kernel itself.

The most common way to install software on Linux is from your trusted distro repositories and from Flathub or the Snap store. Grabbing a script from the internet and piping it to a root shell is bad and something I'm sure we've all done. But take the most installed program on Windows which is likely Chrome, it really doesn't do anything differently. You download a small executable which requests admin, then it proceeds to download Chrome and install it. I'd argue grabbing a script might be the safer option because unlike installer executables from the internet, you at least have the option to read the script before running it if you choose.
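
One way to act on "read the script before running it" is to download the script to memory, check it against a digest, and only then execute it. The sketch below assumes the publisher posts a SHA-256 digest through some separate channel you trust; that assumption, and the function names, are hypothetical. A matching digest only shows the script is the one the publisher posted, not that it is safe.

```python
import hashlib
import hmac
import subprocess
import tempfile


def is_expected_script(script: bytes, expected_sha256: str) -> bool:
    """Compare the script's SHA-256 digest with the published one."""
    actual = hashlib.sha256(script).hexdigest()
    return hmac.compare_digest(actual, expected_sha256.lower())


def run_if_verified(script: bytes, expected_sha256: str) -> int:
    """Run the script with sh only if its digest matches; return the exit code."""
    if not is_expected_script(script, expected_sha256):
        raise ValueError("digest mismatch, refusing to run")
    with tempfile.NamedTemporaryFile(suffix=".sh") as handle:
        handle.write(script)
        handle.flush()
        return subprocess.run(["sh", handle.name], check=False).returncode
```

Piping curl straight into a shell skips both steps: nothing is checked, and a truncated download can still run partway.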


> In Linux, software doesn't even get that option. Nothing ever gets kernel access except the kernel itself.

what about loadable kernel modules?


How many more cases and years do you need to finally admit that Windows is a shitty platform?

Yes, all platforms have seen bad days and have their own issues.

But Windows is different: most of its troubles come from the human factor. Low-quality people are making (or made 10 years ago) low-quality decisions, and every new iteration is just another layer of bad decisions aimed at covering the holes in the previous layer.

You can see and feel it even in the UI: there are no aesthetics and no real usability, and there never were.


I have dealt with an Apple upgrade that ignored free disk space and let users continue to upgrade, eventually filling up the drive. This will happen at any company with humans, who can make mistakes. And Apple suffers from horrible UI issues too; most users just tend to ignore them. On macOS, activate Mission Control and try to lock the screen with the keyboard shortcut.


Cherry picking specific examples doesn’t prove anything.

Overall, the Windows UX is not even close to what Apple has invested in its ecosystem. It’s not even an apples-to-oranges comparison.

Even if there are a few nice and handy features here and there.


You're right. The UX is comparing apples to oranges. But if I can't even do a basic secure thing like lock my screen, is it though...

Apples can still be compared to oranges.


Windows and macOS are both total shit.


For serious commercial servers - for sure, there is currently no alternative to linux


That's really my point.


A key moment about Windows for me was when I read the official Windows driver development guide and discovered that the real name of the "BSOD" is in fact "BUGCHECK".

Then you understand that it is not the core that crashed. Rather, if there is any error in any driver, the mandated behavior is to trigger a "BUGCHECK", in the same way that you would usually just do a printf(error)...


A big problem is that the Windows platform has normalized the idea of 3rd party software running at the kernel level, and end users allow it because it's so normal. They've also normalized the idea of 3rd party software (even games) requiring Admin access to run, which is not as bad but a similar threat. Software that routinely requires elevated privileges seems to be a bad idea.


"They've also normalized the idea of 3rd party software (even games) requiring Admin access to run..." They have? Like what?


I want to know who wrote the part that appears to be crashing on a file full of NULs. CrowdStrike's code or Microsoft's?




HN commenters will defend the quality of Windows.

Windows is closed source for the vast majority of people who use it. No one outside Microsoft, not even those who may have signed NDAs and can read some of the code, is free to edit the Windows source and recompile. If a Windows user wants to prevent something like this outage from happening, he cannot obtain the Windows source and make changes to prevent it. Instead he is encouraged (perhaps compelled) to let Microsoft remotely install and run new code any time it wants.


Debatable, but shit, that's bad PR.


and it just disclosed how much life and death situations rely on windows.

needs better safeguards.


You can blame whoever you want. Personally, I blame pbrw.



