I totally agree, and this community is far from the worst. In trans communities there's incredible hostility towards LLMs - even local ones. "You're ripping off artists", "A pissing contest for tech bros", etc.
I'm trans, and I don't disagree that this technology has aspects that are problematic. But for me at least, LLMs have been a massive equalizer in the context of a highly contentious divorce where the reality is that my lawyer will not lift a finger to defend me. And he's lawyer #5 - the others were some combination of worse, less empathetic, and more expensive. I have to follow up a query several times to get a minimally helpful answer - it feels like constant friction.
ChatGPT was a total game-changer for me. I told it my ex was using our children to create pressure - feeding it snippets of chat transcripts. ChatGPT suggested this might be indicative of coercive control abuse. It sounded very relevant (my ex even admitted once, in a rare candid moment, that she feels a need to control everyone around her), so I googled the term - essentially all the components were there except physical violence (with two notable exceptions).
Once I figured that out, I asked it to tell me about laws related to controlling relationships - and it pointed to laws directly addressing it (in the UK and Australia) and the closest equivalents in Germany (Nötigung [coercion], Nachstellung [stalking], violations of dignity, etc.), translating them into English - my best language. Once you name specific laws broken and provide a rationale for why there's a Tatbestand (i.e. the criteria for a violation are fulfilled), your lawyer has no option but to take you more seriously. Otherwise he could face a malpractice suit.
Sadly, even after naming specific law violations and pointing to email and chat evidence, my lawyer persists in dragging his feet - so much so that the last legal letter he sent wasn't drafted by him - it was ChatGPT. I told my lawyer: read, correct, and send to X. All he did was delete a paragraph and alter one or two words. And the letter worked.
Without ChatGPT, I would be even more helpless and screwed than I am. It's far from clear I will get justice in a German court, but at least ChatGPT gives me hope and a legal strategy. Lastly - and this is a godsend for a victim of coercive control - it doesn't degrade you. Lawyers do. It completely changed the dynamics of my divorce (4 years in, still no end in sight; I lost my custody rights, then my visitation rights; I was subjected to confrontational and gaslighting tactics by around a dozen social workers - my ex is a social worker; and then I literally lost my hair: telogen effluvium, tinea capitis, alopecia areata... if it's stress-related, I've had it), and it gave me confidence when confronting my father and brother about their family violence.
It's been the ONLY reliable help, frankly, so much so I'm crying as I write this. For minorities that face discrimination, ChatGPT is literally a lifeline - and that's more true the more vulnerable you are.
This is true. In addition, the "good German bureaucracy" is a farce, run on paper (and fax!), with opaque, antiquated rules, expressed in impenetrable _Beamtendeutsch_ -- which expats need to hire professional help to navigate. Add to this that the house rental market in Berlin is now worse than London's, and that German universities continue their decades-long slide towards oblivion.
The UK is certainly not free of problems, but I count my lucky stars that I applied for the EU settlement scheme back in 2019 -- I'm now back in the UK, because better jobs, better healthcare, better social services, better restaurants, better transport system, better airports, better _everything_.
And I say this as a transgender woman living in what trans people call "TERF island". I'm at pains to remind my English sisters that TERFs abound in Germany, too.
That hasn’t been my experience at all:
Yes, bureaucracy is more prevalent, but it's manageable—at least as a German citizen.
Better jobs? Overwork, fewer holidays, less social security, and after years in the workforce, you’re left with a paltry state pension of £800–900 per month at best.
Better healthcare? Only if you compare private healthcare in the UK to public healthcare in Germany. My NHS experience involved endless waiting times, no personal doctor, no choice in doctors, and mostly brief 10-minute telephone appointments.
Better social services? I wouldn't rely on them - and childcare is prohibitively expensive.
Out of work? In Germany you get 70% of your last salary for 1-2 years, and after that the government pays your electricity, water, and rent, plus a couple of hundred euros to sweeten the deal.
Better restaurants? I agree.
Better transport services? Slightly more punctual, but the trains are in terrible condition, as is public transport in London. Strikes are frequent, and the costs are outrageous compared to Germany.
Traffic? Ever taken a bus through London? You might as well walk.
Better airports? They’re all the same to me. At the end of the day, you’re just passing through to another country. I fly regularly (1–2 flights a month), but I only spend 2–4 hours per month in an airport, so I’m not sure why this is considered a major factor.
And on a more personal note:
Crime in London is out of control.
I’ve seen multiple people have their phones snatched—there’s not much you can do when the thieves are armed with hammers.
Just outside my flat, three people were recently stabbed, one fatally.
Where I play tennis, people have been robbed at knifepoint.
Nearby, drug use is rampant in a park — in the summer laughing gas canisters litter the ground right next to a playground...
I’ve never experienced any of these issues in Germany. Admittedly, we all have different life experiences and priorities, but claiming that life in the UK is better than in Germany—and justifying it with these points—seems wild to me. Especially since I already live in London and not in the "North", where some things are even rougher.
I guess if undersea cable sabotage were to increase, you would want to be long Starlink. I suppose that's not immune to sabotage either, but you'd need a rocket programme to do it.
Could someone explain how this is implemented? I saw on Meta's Llama page that the model has intrinsic support for structured output. My 30k ft mental model of LLM is as a text completer, so it's not clear to me how this is accomplished.
Are llama.cpp and ollama leveraging llama's intrinsic structured output capability, or is this something bolted on to the output after the fact? (And if the former, how is the capability guaranteed across other models?)
For me, copying to the cloud is a deal-breaker (I switched to proton mail for a reason!). Fortunately, there's Thunderbird. I don't trust Microsoft not to do dodgy Google-like things with my data at this point.
There are definitely XOR-type nonlinearities going on in factor models of personality types. The interaction between two factors is often very different at the extrema of each factor - often flipping sign.
Regression models can't capture that. Now it's possible that the behavioral nonlinearities are somehow emergent, and that despite those, the ECG can be captured by a linear model. But I wouldn't want to bet on that.
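To make the point concrete, here's a minimal sketch (hypothetical data, not from any personality inventory) of a pure sign-flipping interaction: a plain linear regression on the two factors finds nothing, while adding the product term recovers the effect exactly.

```python
import numpy as np

# Two "factors" with an XOR-style interaction: the effect of x1 flips
# sign depending on which extreme x2 is at.
rng = np.random.default_rng(0)
x1 = rng.choice([-1.0, 1.0], size=200)
x2 = rng.choice([-1.0, 1.0], size=200)
y = x1 * x2  # pure interaction, no main effects at all

# Ordinary least squares on the raw factors sees essentially nothing:
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef is approximately [0, 0, 0]

# Adding the interaction term recovers the relationship exactly:
X2 = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef2, *_ = np.linalg.lstsq(X2, y, rcond=None)
# coef2 is approximately [0, 0, 0, 1]
```

Whether the analogous trick (adding interaction terms by hand) rescues a linear model of ECG data is exactly the bet being discussed.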
Nonlinearities in ML existed long before DL and ReLU. What's at the end of a deep CNN? Probably global pooling followed by a dense layer? That final dense layer is taking a weighted combination of the last stack of feature maps. So deep learning is fundamentally the same as the old approach: Construct features with a nonlinear transformation of the input. Then calculate a score by taking a linear combination of those features. The only difference with deep learning is that the manner of constructing those nonlinear features is left unspecified.
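The claim about the CNN head can be sketched in a few lines of NumPy (shapes and sizes here are illustrative, not from any particular architecture): after the conv stack has produced feature maps, the final scoring step is just pooling followed by a linear combination.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the output of a deep conv stack: (channels, H, W)
feature_maps = rng.standard_normal((64, 7, 7))

# Global average pooling: one number per channel -> (64,)
pooled = feature_maps.mean(axis=(1, 2))

# Final dense layer: a plain linear combination of those features
W = rng.standard_normal((10, 64))  # 10 class scores
b = np.zeros(10)
logits = W @ pooled + b

# All the nonlinearity lives upstream, in how `feature_maps` was
# constructed; this last step is an ordinary linear model.
```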
Interesting comment. I'm not an expert in CNNs but what you're saying makes a lot of sense.
Question: are you saying the final layer is "in effect" a linear combination? At least in transformer architectures, the dense end block is iirc three layers deep and uses ReLU. Even if the CNN's dense final part is one layer deep, wouldn't it also be using ReLU activations?
Even a one-layer dense ANN can capture nonlinearities if it has a nonlinear activation function. But maybe in practice the activations don't do a lot of work? Or am I simply mistaken about the final layer - does it simply have linear activations?
Also, can you share your intuition about the nonlinear features? Spectral/wavelet analysis? Or something more complex?
> Question: are you saying the final layer is "in effect" a linear combination?
Not just in effect a linear combination; it is a linear combination. There are some exotic nonlinear NN layers that are used in particular niches. But in general, NN layers are syntax sugar over matrix multiplications (i.e. linear functions). The nonlinearities are the activation functions between the layers.
> But maybe in practice the activations don't do a lot of work?
No, the activations do do a lot of work. There are real nonlinearities in the network.
> Or am I simply mistaken about final layer, does it simply have linear activations?
Sometimes you actually do use a linear activation on the final layer for certain regression tasks, but that's not the main thing I'm getting at.
Let's say that your final layer has a ReLU activation. What is this conceptually? You are taking a linear combination of the features from the previous layer and then clamping the result to >= 0. Sure, that's a nonlinearity, but it isn't going to have much in the way of emergent modeling capabilities. You need to stack many, many nonlinearities before you get that.
So my point is that a deep neural net of N layers boils down to a complex nonlinear function of N - 1 layers, followed by a "dumb" linear combination in the final layer. You can do this with traditional ML methods as well, but you have to handcraft your nonlinearities.
> Also, can you share your intuition about the nonlinear features? Spectral/wavelet analysis? Or something more complex?
There are innumerable possibilities here. It could mean starting with a method that's inherently nonlinear, like nonlinear PCA or polynomial regression. Or it can involve transforming the output of a linear function (like the Fourier transform) in a nonlinear way.
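As one toy illustration of the handcrafted route (polynomial features chosen by hand, purely for the sake of example): build a nonlinear feature map yourself, then fit a plain linear model on top of it - structurally the same as a NN's final layer sitting on learned features.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = np.sin(x)  # a nonlinear target a raw linear fit can't match

# Handcrafted nonlinear features: powers of x (1, x, x^2, ..., x^5)
Phi = np.column_stack([x**k for k in range(6)])

# The fitting step itself is purely linear in the features
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
pred = Phi @ coef

# The nonlinearity lives entirely in how Phi was constructed;
# a deep net differs only in learning that construction instead.
```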
Admittedly this is very tough. And for really tough problems, like video synthesis, effectively impossible. But NNs get thrown at much simpler problems all the time.
It seems every language starts with free functions and classes (even Mojo). Then they realize that isn't so great: third parties are at a syntactic disadvantage (no dot-call syntax for their functions). And then the language designers "fix" the problem with extension methods. Now you have three different kinds of functions, and we haven't even broached async.
Why not have only structs, free functions, and UFCS?
the historical expectation is that class methods will dispatch dynamically but free functions will not. so if you only have structs, functions, and UFCS you either:
1. don't dispatch on the first argument,
2. make the first argument privileged and dispatch on it, or
3. dispatch on all the arguments
the first solution is clean, but people really like dispatch.
the second makes calling functions in the function call syntax weird, because the first argument is privileged semantically but not syntactically.
the third makes calling functions in the method call syntax weird because the first argument is privileged syntactically but not semantically.
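for a concrete instance of option 2, Python's functools.singledispatch gives you a free function that dispatches dynamically on the type of its first argument only (the function name here is just illustrative):

```python
from functools import singledispatch

# A free function whose behavior is chosen at call time based on the
# runtime type of the FIRST argument - the other arguments don't matter.
@singledispatch
def describe(x):
    return "something"

@describe.register
def _(x: int):
    return "an int"

@describe.register
def _(x: str):
    return "a string"

# describe(3)    -> "an int"
# describe("hi") -> "a string"
# describe(1.5)  -> "something"  (falls back to the base definition)
```

note how the first argument is privileged semantically (it alone drives dispatch) but not syntactically (`describe(3)`, not `3.describe()`) - exactly the weirdness described above.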
the closest things to this i can think of off the top of my head in remotely popular programming languages are: nim, lisp dialects, and julia.
nim navigates the dispatch conundrum by providing different ways to define free functions for different dispatch-ness. the tutorial gives a good overview: https://nim-lang.org/docs/tut2.html
so to sum up the answer to the original question: because it's only obvious how to make it nice and tidy like you're wanting if you sacrifice function dispatch, which is ubiquitous for good reason!
OK, so here's the deal with SSA and the running time of your compiler.
- You can convert code to SSA in something like O(N log N) or even close to O(N) if you get fancy enough. The conversion step is not very expensive to run, though it is annoying to implement. Converting to SSA is fine, that's not the problem.
- Optimizations implemented on top of SSA are cheap to run. That's sort of the point of SSA. So, that's not the problem.
- SSA optimizations achieve goodness when you have multiple of those optimizations running in something like a fixpoint. It's pointless to say "I have SSA" and then just implement one of those optimizations. I don't think MIR does that; like most SSA optimizers, it has multiple optimizations. The problem sort of starts here: SSA sort of implies that you're building an optimizer with multiple optimizations. Even if each opt is cheap, having SSA usually implies that you're going to pile on a bunch of them. So, you'll have cost either from death-by-a-thousand-cuts, or you'll eventually add an optimization that's superlinear.
- After you run SSA optimizations, you'll have nontrivial global data flow - as in, there will be data flows between basic blocks that weren't there when you started, and if you do a lot of different SSA optimizations, then these data flows will have a very complex shape. That's caused by the fact that SSA is really good at enabling compilers to "move" operations to the place in control flow where they make the most sense. Sometimes operations get moved very far, leading to values that are live for a long time (they are live across many instructions). This is where the problem gets significantly worse.
- Even without SSA optimizations, SSA form adds complexity to the data flow graph because of Phi nodes. You almost always want the Phi and its inputs to use the same register in the end, but that's not a given. This adds to the problem even more.
- Probably the reason why you went for SSA was perf. But to get perf from SSA-optimized code, you need to have some way of detangling that data flow graph. You'll want to coalesce those Phis, for example - otherwise SSA will actually pessimise your code by introducing lots of redundant move instructions. Then you'll also want to split ranges (make it so a variable can have one register in one part of a function and a different register in another part), since SSA optimizations are notorious for creating variables that live for a long time and interfere with everything. And then you'll definitely need a decent register allocator. No matter how you implement them, the combo of coalescing, splitting, and register allocation will be superlinear. It's going to be a major bottleneck in your compiler, to the point that you'll wish you had a baseline JIT.
Those things I mentioned - coalesce, split, regalloc - are expensive enough that you won't want them in a JIT that is the bottom tier. But they're great to have in any compiler that isn't the bottom tier. So, it's a good idea to have some non-SSA baseline (or template, or copy-and-patch, or whatever you want to call it) JIT, or a decent interpreter, as your bottom tier.