I'd hazard that every design is "flawed" in some regard: there's no way to achieve all the desirable qualities and none of the undesirable ones. For one, some desirable qualities contradict each other.
So "${thing} is flawed" is not precise enough; an interesting statement would be "${thing} is not the best choice for ${conditions}". A monolithic OS is not the best choice for a high-reliability system on unreliable hardware. A microkernel OS that widely uses hardware memory protection is not the best choice for a controller with 4KB of RAM. A unikernel setup is not the best choice for a desktop system where the user is expected to constantly install new software. Etc, etc.
In other words, the ancient concept of "right tool for the job" still applies.
The main point remains: server and desktop OS kernels are a critical piece of infrastructure, worthy of full machine-checked verification.
In this context, it should be pretty obvious that a design that fails to minimise the trusted computing base is flawed. Even if a kernel vulnerability is unlikely to kill any given user, the sum of all the crashes, hacks, patching effort… is huge.
I bet my hat that rewriting the entire kernel for all popular OSes would be far cheaper worldwide than not doing it. (Of course, this won't happen any time soon, because of path dependence and network effects. Unless maybe someone works seriously on the 30 million lines problem described by Casey Muratori.)
> In other words, the ancient concept of "right tool for the job" still applies.
Absolutely. Monolithic kernels are clearly the wrong tool for the job.
So say Simon Biggs, Damon Lee, and Gernot Heiser. Have you even read the abstract?
> The security benefits of keeping a system’s trusted computing base (TCB) small has long been accepted as a truism, as has the use of internal protection boundaries for limiting the damage caused by exploits. Applied to the operating system, this argues for a small microkernel as the core of the TCB, with OS services separated into mutually-protected components (servers) – in contrast to “monolithic” designs such as Linux, Windows or MacOS. While intuitive, the benefits of the small TCB have not been quantified to date. We address this by a study of critical Linux CVEs, where we examine whether they would be prevented or mitigated by a microkernel-based design. We find that almost all exploits are at least mitigated to less than critical severity, and 40% completely eliminated by an OS design based on a verified microkernel, such as seL4.
If the effect is that huge, of course security trumps pretty much all other considerations. These are consumer OS kernels we're talking about. One that fails to more or less maximise security is obviously the wrong tool for the job. As the title of the paper suggests, in case you failed to read that as well.
I don't see why any of those are mutually exclusive. Capability based operating systems provide least privilege security, experiments with capability-based UIs have shown they are quite intuitive and secure because they align user actions with explicit access grants, and they don't perform any worse than monolithic or microkernel operating systems.
The real problem is the flawed mental models people insist on bringing to these questions, despite decades of research showing those models are irreparably flawed.
Capability chains are relatively more expensive to check than a bitmask, or than nothing at all (as e.g. in an embedded system). Also, performance asks for the shortest code paths and out-of-order execution; security asks for prevention of timing attacks and Spectre-like attacks.
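To put the cost comparison in concrete terms, here's a minimal sketch of the cheap end of the spectrum (the flag names are made up for illustration, not any real kernel's): a bitmask permission check is one AND and one compare, with no indirection at all.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical permission bits, purely for illustration --
 * not any real kernel's flags. */
#define PERM_READ  (1u << 0)
#define PERM_WRITE (1u << 1)
#define PERM_EXEC  (1u << 2)

/* The cheap case: one AND and one compare, no memory indirection
 * beyond loading the mask itself. */
static inline bool has_perm(uint32_t granted, uint32_t wanted)
{
    return (granted & wanted) == wanted;
}
```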
Security requires identifying yourself by hard-to-fake means, which takes time and effort (recalling and typing a password, fumbling with a 2FA token); ease of use asks for trust and an immediate response (light switch, TV, etc).
Performance asks for uninterrupted execution and scheduling of tasks based on throughput; ease of use asks for maximum resources given to the interactive application, and scheduling based on lowest interactive latency.
Also feature set vs source code observability, simplicity vs configurability, performance vs modularity, build time vs code size vs code speed, etc.
I'm not sure you and I are referring to the same thing by "capability". There is no chain of capabilities that needs to be checked; the reference you hold is necessary and sufficient for the operations it authorizes. In existing capability operating systems, this is a purely local operation, requiring only two memory load operations. Hardly expensive.
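Roughly what I mean, as a sketch (the structure and field names are invented for illustration; this is not seL4's or any other kernel's actual layout): resolving a capability is just indexing a per-process table and checking the rights the entry carries.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative capability slot: an object pointer plus a rights mask.
 * Made-up layout for this sketch, not a real kernel's data structure. */
struct cap_entry {
    void     *object;   /* kernel object this capability designates */
    uint32_t  rights;   /* operations the holder may perform on it  */
};

struct cap_space {
    struct cap_entry *slots;   /* per-process capability table */
    size_t            nslots;
};

/* Resolving a capability is a purely local lookup: index the table,
 * check the rights carried by the entry, hand back the object.
 * There is no chain to walk. */
static void *cap_lookup(const struct cap_space *cs, size_t idx,
                        uint32_t required_rights)
{
    if (idx >= cs->nslots)
        return NULL;
    const struct cap_entry *e = &cs->slots[idx];
    if ((e->rights & required_rights) != required_rights)
        return NULL;
    return e->object;
}
```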
Every OS is vulnerable to the hardware upon which it runs, but capability security at least makes side channel attacks somewhat more difficult because of least privilege and limiting access to non-deterministic operations, like the clock.
Identity-based security built on an authorization-based model like capabilities also ensures it's difficult to make promises you can't keep. Access list models let you easily claim unenforceable properties and then people are surprised when they are easily violated.
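To make that concrete, here's a sketch using POSIX file descriptors as stand-ins for capabilities (the function names are hypothetical; this is the textbook confused-deputy contrast, not code from any real service).

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Access-list style: the service opens whatever path the client names,
 * using the service's own (ambient) authority. The promise "clients only
 * read what they're allowed to read" is not enforceable here -- the
 * classic confused-deputy setup. */
ssize_t serve_by_path(const char *client_supplied_path, char *buf, size_t len)
{
    int fd = open(client_supplied_path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}

/* Capability style: the client hands over a descriptor it already holds
 * (e.g. passed over a Unix socket with SCM_RIGHTS). The service can only
 * touch what it was given, so the access it exercises is exactly the
 * access that was granted. */
ssize_t serve_by_fd(int client_supplied_fd, char *buf, size_t len)
{
    return read(client_supplied_fd, buf, len);
}
```

In the second version there is nothing left for the service to promise that it can't actually enforce: the grant is the descriptor itself.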
> ease of use asks for maximum resources given to the interactive application, and scheduling based on lowest interactive latency.
That's unnecessary. You need a low latency upper bound, not the "lowest latency". This does not necessarily conflict with throughput; furthermore, workloads that require high throughput and interactive workloads rarely overlap.
I'm not even sure what the rest of the properties are supposed to be about. I don't think most of those are mutually exclusive either.
So "${thing} is flawed" is not precise enough; an interesting statement would be "${thing} is not the best choice for ${conditions}". A monolithic OS is not the best choice for a high-reliability system on unreliable hardware. A microkernel OS that widely uses hardware memory protection is not the best choice for a controller with 4KB of RAM. A unikernel setup is not the best choice for a desktop system where the user is expected to constantly install new software. Etc, etc.
In other words, the ancient concept of "right tool for the job" still applies.