A small factor more than aircraft in the short term. Zero in the medium term, when launch companies vertically integrate propellant production based on renewables. A net positive once we start relying on non-terrestrial resources instead of poisoning our planet to extract metals and power.
Ignoring the positive externalities, there is still some environmental cost to using synthetic CH4. You're still taking water from the ground and moving it into the upper atmosphere (CO2 + ground H2O -> CH4 + O2 -> CO2 + atmospheric H2O), which contributes to the greenhouse effect.
It's not worth worrying about, given all the other more impactful ways we can move the needle on global warming, but it's not actually zero.
Learning a programming language is easy. Learning software engineering takes months even just for the basics. The first time you learn to code, you’re doing both. Just hang in there: it gets easier with time.
Better question: why not use whatever hash is fastest on the system in question? We're talking about an RNG here -- it's not like we need to make sure that two different systems produce the same random numbers!
My answer to that is "maintainability". The more flexibility and moving parts we add to the RNG, the more things that can break. The delta from "one hash function" to "two hash functions" would involve not just adding a second hash function, but also a bunch of configuration code, target-specific logic, fallback handling, etc. There are plenty of places in the kernel that need to care about this kind of logic, but I don't believe the RNG is one of them, and I don't particularly want to require the RNG maintainers to spend their time caring about it.
Additionally, it's not just the hash function that would matter for speed here, but also the expansion. The Linux RNG uses ChaCha20 for that, so if you were going all-in on target-specific speed, you'd need additional logic for swapping that out for a hardware-accelerated cipher (probably AES, which would introduce even more considerations given that it has a 16-byte block size, vs ChaCha20's 64-byte blocks).
The title mentions performance, but it is not the primary motivation AFAICT. It is only mentioned to say “it is not slower”.
The main concern was security, so it makes sense to use BLAKE2, which benefits from existing cryptanalysis of the ChaCha20 permutation, which is already used in the RNG for number generation.
(And it makes sense to use BLAKE2s in particular, to support non-64-bit systems without penalty.)
Using a single hash (instead of picking one at runtime) simplifies the attack surface IMO.
The argument here is that many (most?) Linux systems have access to hardware SHA2, which is equally secure (in this setting) but faster. Attacks on SHA2 or Blake2 (really, even on SHA1, for that matter) aren't how viable real-world attacks on the LKRNG are going to happen. It's good to see SHA1 getting swept away for other reasons though!
Do most Intel (and by extension, Linux) machines have SHA2, though? I think it’s a pretty recent extension and at least initially, they were only shipping it in their low-end embedded models.
> I haven’t been able to buy a CPU that doesn’t have SHA2 acceleration for a number of years now.
This is incorrect. Intel only launched their 11th gen desktop processors March 30, 2021. The 10th gen and earlier desktop processors do not have the SHA instructions. You can still buy a new i9-10900k from Newegg today.
(Note that 10th gen Intel mobile/laptop processors are a different micro-architecture, and do support SHA.)
Edit: Perhaps you're thinking of the AES instructions? They've been around a lot longer.
The argument that not all CPUs running Linux have hardware SHA2 is valid, so its presence can't be assumed. But saying the feature shouldn't be used at all because one CPU line lacks it (even a significant one, though arguably it hasn't been the majority for a while) seems shortsighted at best. For decades the Linux kernel has enabled features that only a minority of hardware supports. Since when is the lowest common denominator the only desirable target?
Maybe this is a silly question, but why should RNG even be part of the kernel in the first place? It's convenient having it in a device file, but why couldn't that be provided by some userspace program or daemon?
1. The kernel has access to unpredictable events that make good key fodder for the CSPRNG itself, which would be more annoying and less efficient to percolate into userland.
2. The kernel can assure all the processes on the device that they're getting random bits from a known source of unpredictable bits, and refuse to deliver those bits if its state hasn't been sufficiently populated; this has been a recurring source of systems vulnerabilities in programs with userland CSPRNGs.
To that, add the convenience factor of always knowing how to get random bits and not having to set up userland dependencies and ensure they're started in the right order &c.
You should generally use your kernel's CSPRNG in preference to other ones, and break that rule of thumb only if you know exactly what you're doing.
1. The kernel has much more access to sources of indeterminism than a userspace application does. Things like disk seeks, packet jitter, asynchronous interrupts, etc. provide lots of "true" entropy to work with. Userspace programs, on the other hand, have very deterministic execution. In fact, the only way to introduce true indeterminism into a userspace program is to query a kernel-mediated resource (e.g. system call, shared memory mapping, etc.), or to invoke an unprivileged and unpredictable hardware instruction, of which there are very few (e.g. RDRAND on x64, LL/SC on ARM).
2. Userspace programs cannot be made as robust to accidental or malicious failures. Even if you have a userspace RNG daemon that randomly opens files or sockets to extract entropy, what happens if that daemon crashes? Or it fails to open a file or socket? Or an exploit gets an RCE into the daemon to read privileged files? By contrast, the kernel is already performing all these operations for userspace processes, so it might as well measure those things and stick the results into its own entropy pool to hand out to other processes on request.
The kernel needs randomness too. If it's done right once in the kernel, why duplicate the effort in userspace (and watch all the hilarious ways userspace solutions fail)?
getrandom was motivated by the fact that using device files has many failure modes.
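To illustrate why: the syscall path has no file descriptor to open, no /dev node to be missing in a chroot, and it blocks until the pool is ready. A minimal sketch (assuming glibc 2.25+ for the <sys/random.h> wrapper):

    #include <stdio.h>
    #include <sys/random.h>

    int main(void)
    {
        unsigned char buf[32];
        /* flags=0 blocks until the kernel CSPRNG is initialized; for
         * requests <= 256 bytes it returns all requested bytes at once. */
        if (getrandom(buf, sizeof buf, 0) != (ssize_t)sizeof buf) {
            perror("getrandom");
            return 1;
        }
        for (size_t i = 0; i < sizeof buf; i++)
            printf("%02x", buf[i]);
        printf("\n");
        return 0;
    }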
Oh, nice, a good 3rd security reason! A userland CSPRNG is essentially just an additional single point of failure, since you're going to end up depending on secure randomness in the kernel already.
A lot of security relies on some level of non-determinism. Things like TCP initial sequence number generation, where every TCP connection's sequence numbers start at a random value. There have been numerous attacks where the RNG wasn't good enough, so an attacker could predict the TCP sequence numbers and perform various malicious activities. Additionally, in-kernel VPNs like IPsec and WireGuard also need RNGs for their own internal functions. Calling out to userspace for that would be painful and could potentially break in a lot of unexpected ways.
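To make that concrete, here's a toy sketch of the ISN idea: a keyed hash of the connection 4-tuple plus a clock offset, so an off-path attacker without the secret key can't predict the next connection's numbers. (The mixing function below is made up for illustration; current kernels use SipHash and more care.)

    #include <stdint.h>
    #include <stdio.h>

    /* Toy keyed mix standing in for a real keyed hash --
     * illustrative only, NOT cryptographically secure. */
    static uint32_t toy_keyed_mix(uint32_t saddr, uint32_t daddr,
                                  uint16_t sport, uint16_t dport,
                                  uint64_t key)
    {
        uint64_t x = key;
        x ^= ((uint64_t)saddr << 32) | daddr;
        x ^= ((uint64_t)sport << 16) | dport;
        x ^= x >> 33; x *= 0xff51afd7ed558ccdULL; x ^= x >> 33;
        return (uint32_t)x;
    }

    static uint32_t choose_isn(uint32_t saddr, uint32_t daddr,
                               uint16_t sport, uint16_t dport,
                               uint64_t secret_key, uint32_t clock)
    {
        /* hash(4-tuple, secret) + time offset = unpredictable ISN */
        return toy_keyed_mix(saddr, daddr, sport, dport, secret_key) + clock;
    }

    int main(void)
    {
        /* secret_key would come from the kernel CSPRNG at boot */
        printf("ISN: %u\n", choose_isn(0x0a000001, 0x0a000002, 12345, 443,
                                       0x123456789abcdef0ULL, 1000000));
        return 0;
    }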
I don't recall exactly, but I think TCP retransmission delays and the handshake add randomness to the backoff, to avoid a thundering-herd situation that just repeats if all clients retry at the same time.
Assigning free ports to applications that listen on a socket is also randomized. Not sure why; it feels like it could be sequential, unless you want to deliberately obscure which ports are being used.
Address Space Layout Randomisation (ASLR) for one. ASLR moves around key parts of an executable to make exploitation of certain types of vulnerabilities more difficult or impossible.
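It's easy to see in action: compile the below as a position-independent executable (the default on most distros) and run it twice; both addresses change every run.

    #include <stdio.h>

    int main(void)
    {
        int stack_var;
        /* With ASLR and a PIE build, these differ between runs,
         * so an exploit can't hardcode addresses. */
        printf("code:  %p\n", (void *)main);
        printf("stack: %p\n", (void *)&stack_var);
        return 0;
    }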
The kernel can keep secrets from user space, which is necessary for maintaining a secure RNG state.
The kernel also has the hardware access that is used as entropy sources. If the RNG was in user space the kernel would have to provide some way of securely exporting that entropy to user space. It is simpler and more secure to just export the end result of random numbers through a simple API.
All modern OSes have made the same decision to have a kernel-based CSPRNG, for the same reasons.
Not a silly question at all. It's because having a reliable and uncompromised source of randomness is essential for cryptographic applications. Having the RNG in user space would make it more vulnerable to attack.
I normally care about reproducible RNG results and do try to seed all RNGs used. There are lots of applications where randomness is used but you still want repeatable behavior. ML experiments are one situation where randomness is common but the results are also intended to be repeatable.
I think the Python standard library's RNG, and those of several of the data science libraries, are platform independent.
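For what it's worth, the same idea in C; note that rand() is only repeatable on a given libc, which is exactly why the Python/NumPy generators pin down the algorithm rather than deferring to the platform:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        srand(42);  /* fixed seed: every run prints the same five values */
        for (int i = 0; i < 5; i++)
            printf("%d\n", rand());
        /* Caveat: the sequence is stable per libc, not across platforms;
         * a portable experiment should ship its own PRNG. */
        return 0;
    }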
I wouldn't say the lack of a seed is the difference between a CSPRNG and a PRNG. That's more the difference between a CSPRNG and a stream cipher, where the "seed" is called a "key" and "IV".
PRNGs (Pseudo Random Number Generators) are predictable. CSPRNGs (Cryptographically Secure Pseudo Random Number Generators) aren't (if they're working). HWRNGs (Hardware Random Number Generators) that aren't debiased via a CSPRNG or similar produce nonuniform output (not suitable for cryptography or most other uses directly). TRNGs (True Random Number Generators) might not exist in this universe (deterministic interpretations of quantum mechanics are consistent with observation); it's safer to assume they don't and avoid the term entirely.
Basically every RNG you interact with on general purpose computers is a pseudo random number generator. The kernel RNG being discussed here is a CSPRNG, as was the one it replaced.
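To make "predictable" concrete: in a textbook LCG the output is the internal state, so one observed value hands an attacker the entire future stream. A toy sketch:

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t lcg_state = 12345;

    /* Textbook LCG (Numerical Recipes constants): the state IS the output. */
    static uint32_t lcg_next(void)
    {
        lcg_state = lcg_state * 1664525u + 1013904223u;
        return lcg_state;
    }

    int main(void)
    {
        uint32_t observed  = lcg_next();  /* attacker sees one output... */
        uint32_t predicted = observed * 1664525u + 1013904223u;
        /* ...and can now compute the next one without asking. */
        printf("actual=%u predicted=%u\n", lcg_next(), predicted);
        return 0;
    }

A CSPRNG's defining property is that this step stays computationally infeasible even after seeing arbitrarily much output.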
AFAICT, BLAKE2s (previously SHA-1) is only being used for the forward secrecy element, in this case mixing a hash of the pool back into the core state, which is actually still using ChaCha20 for expansion. From quick inspection (never read this code before) I think the number of bytes in the pool is 416 (104 * 4). (See poolwords defined at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...) For such relatively small messages and based on the cycles-per-byte performance numbers from https://w3.lasca.ic.unicamp.br/media/publications/p9-faz-her... (SHA-NI benchmarks), https://www.blake2.net/blake2_20130129.pdf (BLAKE2 paper), and https://bench.cr.yp.to/results-hash.html (comprehensive table of measurements), I don't see any performance reasons for choosing BLAKE2s over SHA-256. Rather, software SHA-256 and BLAKE2s seem comparable (and that's being charitable to BLAKE2s), and SHA-NI is definitely faster.
Perhaps there were other considerations at play. Maybe something as simple as the author's preference. One thing that probably wasn't a consideration is FIPS compliance--the core function is ChaCha20 so FIPS kernels require a completely different CSPRNG, anyhow.
One thing switching from SHA1 to BLAKE2s does is increase the total entropy a single compression operation adds to ChaCha20. This increases speed: folded BLAKE2s adds 128 bits per operation where folded SHA-1 added 80 bits. So that's two calls instead of four (I'm assuming they kept the folding). Another speedup comes from the fact that the hash function constants are no longer being filled with RDRAND inputs on every call.
Finally, I'm not completely sure whether increasing the hash size itself adds computational security against an attack where the internal state is compromised once and the attacker tries to brute-force the new state from new output. My conjecture is that the reseeding operation is atomic, i.e. that ChaCha20 won't yield anything until the reseed is complete, so there shouldn't be any difference in this regard. I'd appreciate clarification on this.
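Assuming "folding" here means what the old extraction code did (XOR the two halves of the digest together), the width arithmetic is just this:

    #include <stddef.h>
    #include <stdint.h>

    /* Fold a digest by XORing its halves: a 20-byte SHA-1 digest
     * folds to 80 bits, a 32-byte BLAKE2s digest to 128 bits, so
     * half as many compression calls contribute the same number
     * of bits to the ChaCha20 key. */
    static void fold_digest(const uint8_t *digest, size_t len, uint8_t *out)
    {
        for (size_t i = 0; i < len / 2; i++)
            out[i] = digest[i] ^ digest[len / 2 + i];
    }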
> argues BLAKE2s is twice as fast compared to SHA256.
That's for 16KiB inputs.
> One thing switching from SHA1 to BLAKE2s does is increase the total entropy a single compression operation adds to ChaCha20. This increases speed: folded BLAKE2s adds 128 bits per operation where folded SHA-1 added 80 bits.
But the question was why BLAKE2s instead of SHA-256, not SHA-1. SHA-256 has the same digest length as BLAKE2s.
BLAKE3 needs 16 KiB of input to hit the numbers in that bar chart, but BLAKE2s doesn't. It'll maintain its advantage over SHA-256 all the way down to the empty string. You can see this in Figure 3 of https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blak.... (BLAKE3 is also faster than SHA-256 all the way down to the empty string, but not by as large a margin as the 16 KiB measurements suggest.)
On the other hand, these measurements were done on machines without SHA-256 hardware acceleration. If you have that (and Intel chips from the past year do), then SHA-256 does a lot better of course.
>But the question was why BLAKE2s instead of SHA-256, not SHA-1. SHA-256 has the same digest length as BLAKE2s.
Two things come to mind. Firstly, does it really matter for speed? The reseeding interval of the ChaCha20 DRNG (i.e. the BLAKE2 call frequency) is 300 seconds, and the hash call runs on the order of milliseconds. The best bang for the buck at this point would come from ChaCha-NI.
Secondly, there's the aspect of reducing reliance on an algorithm that suffers from length extension attacks. While the LRNG itself doesn't directly benefit from BLAKE2s's indifferentiability, it helps phase out SHA-2, which is less misuse-resistant and might be misused elsewhere.
(Finally, no more pointless flame wars about "An algorithm created by the NSA is being used in the LRNG!!")
My vague understanding was that KVM-accelerated virtualization still let you use Virtio RNG, although I guess I don't know what the relative performance looks like at that point.
Virtio-rng is good for getting quality seed material into the guest, but you don’t want to use it for bulk generation. Some hypervisor environments impose arbitrary and low throttles on guest virtio-rng use (think kB/s).
I think speed is definitely a big part of this, though the speedup comes primarily from getting rid of superfluous calls to RDRAND. Blake2s is already in the kernel (after a fashion) for WireGuard itself; I don't think Blake3 is. An additional nerdy point here is that the extraction phase of the LKRNG is already using ChaCha (there's a sense in which a CSPRNG is really just the keystream of a stream cipher), and Blake2s and ChaCha are closely related.
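To illustrate the "a CSPRNG is really just the keystream of a stream cipher" point, here's a from-scratch ChaCha20 block function: treat the key words as the secret RNG state, bump the counter, and the successive output blocks are your random bytes. (A toy sketch of the principle, not the kernel's actual code.)

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
    #define QR(a, b, c, d)                          \
        do {                                        \
            a += b; d ^= a; d = ROTL32(d, 16);      \
            c += d; b ^= c; b = ROTL32(b, 12);      \
            a += b; d ^= a; d = ROTL32(d, 8);       \
            c += d; b ^= c; b = ROTL32(b, 7);       \
        } while (0)

    /* One ChaCha20 block: 16 words of state in, 64 bytes of keystream out. */
    static void chacha20_block(const uint32_t in[16], uint8_t out[64])
    {
        uint32_t x[16];
        memcpy(x, in, sizeof x);
        for (int i = 0; i < 10; i++) {  /* 20 rounds = 10 column+diagonal pairs */
            QR(x[0], x[4], x[ 8], x[12]); QR(x[1], x[5], x[ 9], x[13]);
            QR(x[2], x[6], x[10], x[14]); QR(x[3], x[7], x[11], x[15]);
            QR(x[0], x[5], x[10], x[15]); QR(x[1], x[6], x[11], x[12]);
            QR(x[2], x[7], x[ 8], x[13]); QR(x[3], x[4], x[ 9], x[14]);
        }
        for (int i = 0; i < 16; i++) {
            uint32_t v = x[i] + in[i];
            out[4*i] = v; out[4*i+1] = v >> 8;
            out[4*i+2] = v >> 16; out[4*i+3] = v >> 24;
        }
    }

    int main(void)
    {
        uint32_t state[16] = {
            0x61707865, 0x3320646e, 0x79622d32, 0x6b206574, /* "expand 32-byte k" */
            1, 2, 3, 4, 5, 6, 7, 8,   /* 256-bit key = the secret RNG state */
            0,                        /* block counter */
            0, 0, 0                   /* nonce */
        };
        uint8_t buf[64];
        for (uint32_t blk = 0; blk < 2; blk++) {
            state[12] = blk;          /* each counter value yields a fresh block */
            chacha20_block(state, buf);
            for (int i = 0; i < 64; i++)
                printf("%02x", buf[i]);
            printf("\n");
        }
        return 0;
    }

BLAKE2s's compression function is built around a modified version of this same quarter-round, which is part of why the two sit so naturally side by side in the RNG.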
So should we expect a BLAKE3 switch sometime in the future? It seems to be a refinement of BLAKE2 that makes it more amenable to optimization while keeping most (all?) of its qualities. Being well suited for optimization across architectures would also make it ideal for the kernel, and it seems the reference implementation has already done a lot of the heavy lifting.
Just in case this isn't clear, BLAKE3 breaks a single large input up into many chunks, and it hashes those chunks in parallel. The caller doesn't need to provide multiple separate inputs to take advantage of the SIMD optimizations. (If you do have multiple separate inputs, you can actually use similar SIMD optimizations with almost any hash function. But because this situation is rare, libraries that provide this sort of API are also rare. Here's one of mine: https://docs.rs/blake2s_simd/1.0.0/blake2s_simd/many/index.h....)
Journalists have a limited First Amendment right not to be forced to reveal information or confidential news sources in court. There’s a lot of caveats related to this though. It’s called “reporter’s privilege.”
So you're essentially saying: Stop posting these articles because I don't like them? Just skip those articles then. Nobody is forcing you to read them.
But by reading these articles, we can learn about people who shaped our field (some of whom I learn about for the first time when I read their obituaries), what their contributions were, how they overcame difficulties in their careers, and sometimes how they changed the world.
> You can’t modify your age, a huge risk factor, but you can modify your vaccination status, and you can modify your weight and general health.
Not unless you are among the 4% or so of the population that is immune compromised. God forbid if you or anyone you love ever needs an organ transplant. In a world with endemic COVID-19 that’d be a death sentence.
It still seems like the response here is disproportionate to the risk. Is COVID the only serious risk that immune compromised people face? What if an immune compromised person had caught the flu in 2018? Or one of any number of other diseases that circulate?
Life is extremely dangerous for someone without a strong functioning immune system.