> 1. The nerf is psychological, not actual. 2. The nerf is real, but in a way that is perceptible to humans, not to benchmarks.
They could publish weekly benchmarks to disprove this. They almost certainly have internal benchmarking.
The shift is certainly real. It might not be model performance but contextual changes or token throughput (tasks take longer even if the model stays the same).
Anyone can publish weekly benchmarks. If you think Anthropic is lying about not nerfing their models, you shouldn't trust benchmarks they release anyway.
American here, never heard of this. After some quick searching, I found an Australian dairy site that describes this as permeate milk. Going by that ad piece, it might be a way of ensuring the milk's fat/protein ratios can be easily adjusted to hit some target numbers.
I agree, it's much slower with worse output. It is substantially worse now than it was a few weeks ago.
It spends a lot of time coming up with “UI options” (select 1, 2, or 3 in a TUI) for me to consider when it could just ask me what I want, rather than building a five-layer flow chart of possibilities.
Overall I think it is just Anthropic tweaking things to reduce costs.
I am paying for a Max subscription, but I am going to start evaluating other options.
Faster than an H100 for solving 128 × 128 matrix equations. But it’s not clear to me how they tested this; the code is only available on request.
> We have described a high-precision and scalable analogue matrix equation solver. The solver involves low-precision matrix operations, which are suited well to RRAM-based computing. The matrix operations were implemented with a foundry-developed 40-nm 1T1R RRAM array with 3-bit resolution. Bit-slicing was used to guarantee the high precision. Scalability was addressed through the BlockAMC algorithm, which was experimentally demonstrated. A 16 × 16 matrix inversion problem was solved with the BlockAMC algorithm with 24-bit fixed-point precision. The analogue solver was also applied to the detection process in massive MIMO systems and showed identical BER performance within only three iterative cycles compared with digital counterparts for 128 × 8 systems with 256-QAM modulation.
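If the bit-slicing part is unclear: the idea is to decompose each high-precision matrix entry into several low-precision slices, run each slice's matrix operation at low precision (on the analogue crossbar, in their case), and recombine the partial results with shifts. Here is a minimal numerical sketch in Python; the 3-bit slice width matches their RRAM cells, but the exact decomposition and recombination are my assumptions for illustration, not necessarily the paper's dataflow:

```python
import numpy as np

BITS_PER_SLICE = 3   # matches the 3-bit RRAM cells described in the paper
NUM_SLICES = 8       # 8 slices x 3 bits = 24-bit fixed-point entries

def slice_matrix(A):
    """Split each entry of a non-negative integer matrix into
    NUM_SLICES chunks of BITS_PER_SLICE bits, least significant first."""
    mask = (1 << BITS_PER_SLICE) - 1
    return [(A >> (i * BITS_PER_SLICE)) & mask for i in range(NUM_SLICES)]

def sliced_matvec(A, x):
    """Compute A @ x by summing shifted low-precision partial products.
    Each partial product only involves 3-bit matrix entries, which is the
    kind of multiply an analogue crossbar can do natively."""
    result = np.zeros(A.shape[0], dtype=np.int64)
    for i, A_i in enumerate(slice_matrix(A)):
        result += (A_i @ x) << (i * BITS_PER_SLICE)
    return result

rng = np.random.default_rng(0)
A = rng.integers(0, 1 << 24, size=(16, 16), dtype=np.int64)  # 24-bit entries
x = rng.integers(0, 100, size=16, dtype=np.int64)
assert np.array_equal(sliced_matvec(A, x), A @ x)  # exact reconstruction
```

The point is that no slice of the matrix ever needs more than 3 bits of precision; the 24-bit result comes entirely from the digital shift-and-add recombination.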
I've used Tiger/SAINT/SATAN/COPS in the distant past. But I think they're somewhat obsoleted by modern packaging and security mechanisms like AppArmor and SELinux, not to mention Docker and similar isolation tools.
Most people like their distro to vet these things. uv et al. had a reason to exist back when Python 2 and 3 were a mess; I think that time is well behind us. pip is mostly for installing libraries, and even that is mostly handled by the distros already.
Thanks for the advice. I spend 99.99% of my time in a terminal, a browser and vscode.
The graphical environment is neither here nor there for me; I just want to run an update without the CUDA libraries/NVIDIA drivers breaking, and for my OS to boot!