If you liked the first half of this site and want an extension, Cell Biology by the Numbers (2015, Milo, Phillips, https://book.bionumbers.org/) is great. It has a similarly fun, intuition-building feel for size, as well as for other measurements such as weight, time, and energy, from the atomic to the micro-organism level.
Down, but the linked status page shows mostly operational, except for "Support Portal Availability Issues" and planned maintenance. Since it was linked, I'm curious if others see differently.
edit: It now says "Cloudflare Global Network experiencing issues" but it took a while.
Definitely! This makes me think of petrichor, the earthy smell released during rain. In urban environments it is very local and memory-triggering because it's tied to the local bacterial, soil, rock, and plant composition. There's something visceral about it for me because my default model of memories is audio and video playback, and the smell hits me with a forgotten dimension when I go back to a place and it rains.
Claude Shannon was interested in this kind of thing and had a paper on the entropy per letter or word of English. He also has a section in his famous "A Mathematical Theory of Communication" that has experiments using the conditional probability of the next word based on the previous n=1,2 words from a few books. I wonder if the conditional entropy approaches zero as n increases, assuming ergodicity. But the number of entries in the conditional probability table blows up exponentially. The trick of combining multiple n=1 predictors at different distances sounds interesting, and reminds me a bit of contrastive prediction ML methods.
Anyway the experiments in Shannon's paper sound similar to what you describe but with less data and distance, so it should give some idea of how it would look:
From the text:
* 5. First-order word approximation. Rather than continue with tetragram, ..., n-gram structure it is easier and better to jump at this point to word units. Here words are chosen independently but with their appropriate frequencies.
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE.
6. Second-order word approximation. The word transition probabilities are correct but no further structure is included.
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED
*
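If you want to play with this yourself, here is a rough sketch of the two word-level approximations Shannon describes. The function names and the tiny corpus are made up for illustration; Shannon worked by hand from books, so treat this as my reading of the procedure rather than his exact method.

```python
import random
from collections import Counter, defaultdict


def first_order(words, length, rng):
    """Pick words independently, weighted by how often each occurs in `words`."""
    counts = Counter(words)
    vocab = list(counts)
    weights = [counts[w] for w in vocab]
    return rng.choices(vocab, weights=weights, k=length)


def second_order(words, length, rng):
    """Pick each word conditioned on the previous one (bigram transitions)."""
    successors = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        successors[prev].append(nxt)
    current = rng.choice(words)
    out = [current]
    while len(out) < length:
        options = successors.get(current)
        # Dead end (word only seen at the very end of the corpus): restart anywhere.
        current = rng.choice(options) if options else rng.choice(words)
        out.append(current)
    return out


# Toy corpus (an assumption for the demo); swap in the text of a real book.
corpus = (
    "the head of the line had the message and the time "
    "of the problem had the line"
).split()

rng = random.Random(0)
print(" ".join(first_order(corpus, 12, rng)).upper())
print(" ".join(second_order(corpus, 12, rng)).upper())
```

Going to n=3 and beyond only means keying `successors` on tuples of previous words, which is where the table size starts to explode.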
This is really one of the best things I've seen on HN in 15 years.
The mixed presentation of plug and play components interspersed with EE problems and solutions really helped make it more accessible. It also got me excited about the possibilities and made me realize that we might already be approaching another open architecture DIY boom.
I got the sense that this is a side project, but I'm sure many have noticed that it could be a legit framework-level company. Someone already mentioned the recruiters, but also you're sure to have investors knocking. Whatever you do, please keep having fun and sharing it.
[Edit: I suppose the correct answer was for the full model]. What I got was much worse when I ran with hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0. (I think the glyph-based reasoning is especially hard for models though, since they have a token-level blindness to them):
>>> Give me 5 odd numbers that don't have the letter 'e' in them.
<think>
Okay, so I need to come up with five odd numbers that don't have the letter 'e' in their name. Hmm, let's break this down step by step. First, what exactly is being asked here? The user wants numbers, specifically odd ones, and they can't contain the letter 'e'. So I guess it's about the written form of the numbers, not their mathematical properties. For example, 13 is an odd number, and when you write it out, it's "thirteen," which does have an 'e' in it. So that wouldn't work.
Alright, so I need to think of numbers where their names don't include 'e'. Let me start by listing some odd numbers and then check if each has an 'e' in their name. The first few odd numbers are 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, etc.
Let's go one by one:
1. "One": No 'e' here. So 1 is a candidate.
2. "Three": Yes, that has an 'e'.
3. "Five": Again, no 'e'. So 5 works.
4. "Seven": Has an 'e' as well.
5. "Nine": No 'e' in "nine". Good to go.
Wait, so far I have 1, 5, and 9 as possible numbers. Let me keep going.
<truncated by me...>
Wait, let me make sure. The word "forty" is F-O-R-T-Y, no 'e', so adding "-one" which doesn't add an 'e' either. So yes, "forty-one" has no 'e's. So that's another number: 41.
So now I have five numbers: 1, 5, 9, 19, and 41.
Let me double-check each to ensure they don't contain the letter 'e':
- One: No 'e'.
- Five: No 'e'.
- Nine: No 'e'.
- Nineteen: No 'e'.
- Forty-one: No 'e'.
Yes, all of these are odd numbers and their names do not include the letter 'e'. I think that's a solid list.
</think>
Here is a list of five odd numbers whose names do not contain the letter 'e':
1. **One**
2. **Five**
3. **Nine**
4. **Nineteen**
5. **Forty-one**
These numbers are all odd, and their written forms do not include the letter 'e'.
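For reference, a quick brute-force check of the model's list. This is a minimal sketch with a hand-rolled `number_name` helper (my own, not from any library) that only covers 1 to 999 in English:

```python
ONES = [
    "", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
    "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_name(n):
    """English name for an integer in 1..999 (illustrative helper)."""
    if not 1 <= n <= 999:
        raise ValueError("only 1..999 supported")
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        parts.append(TENS[tens] + ("-" + ONES[ones] if ones else ""))
    elif rest:
        parts.append(ONES[rest])
    return " ".join(parts)


# Check the model's answer, then search all odd numbers below 1000.
for n in (1, 5, 9, 19, 41):
    print(n, number_name(n), "e" in number_name(n))

odd_without_e = [n for n in range(1, 1000, 2) if "e" not in number_name(n)]
print(odd_without_e)  # []
```

Every odd number's name ends in one, three, five, seven, nine, or an odd teen, and all of those contain an 'e', so the search comes up empty and every entry in the model's list fails.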
All the 'founder attitude' book recs are probably a good idea, but I would recommend supplementing with some VC and investing vehicle stuff, such as Secrets of Sand Hill Road and The Power Law, especially if that seems unimportant and tedious. Understanding your options and what the incentives are for investors will help you better understand where you are likely to be aligned and where you have to be careful.
Most of the work on objective quality metrics (e.g. PESQ, POLQA, ViSQOL, DNS-MOS, NISQA) focuses on speech because of telecommunications demands, though some of these have an audio mode. There are also some promising new audio metrics that are ML-based.
I haven't tried it but you may want to look into PAM, which is relatively new and doesn't require a reference (you don't need the original uncompressed audio), and is open source.
However, all approaches are quite far from perfect. Human evaluation is still the gold standard.