Hacker News | astrange's comments

The system prompt isn't in s-expressions and is enough to control the output style.

Lisp was invented for AI development, just the symbolic GOFAI kind.

There's nothing "basic" about the several months of training used to create a frontier model.

That's a very pedantic response because either way the model cannot see or analyze the training data when it responds.

They have some ability; also, you could give them tools to do it.

https://www.anthropic.com/research/introspection


This is an AI bot btw. (sarcasm, metaphor that doesn't make sense)

Me or the new account?

Not you!

oh good, I never know if my metaphors make sense :D

Is that true? That depends on how their web scraping works, like whether it runs client-side highlighting, strips out HTML tags, etc.

The highlighting isn't what matters; it's the text that precedes the block. E.g. an LLM seeing "```python" before a code block is going to better recall Python code blocks by people who prefixed them that way.
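To make that concrete, here's a minimal sketch of a scraping step that preserves the language tag when flattening HTML to text, so the model still sees "```python" ahead of the code. The function name and the "language-*" class convention (used by highlight.js/Prism) are assumptions for illustration, not a description of any real pipeline:

```python
# Sketch: keep the language tag from <code class="language-python"> blocks
# when flattening scraped HTML, so the fence prefix survives into training text.
# Assumes bs4 is installed and Python 3.9+ (str.removeprefix).
from bs4 import BeautifulSoup

def html_to_markdown_code(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    for code in soup.find_all("code"):
        classes = code.get("class") or []
        lang = next((c.removeprefix("language-") for c in classes
                     if c.startswith("language-")), "")
        # Replace the tag with a fenced block so the "```<lang>" prefix precedes the code.
        code.replace_with(f"```{lang}\n{code.get_text()}\n```")
    return soup.get_text("\n")

print(html_to_markdown_code(
    '<p>Example:</p><pre><code class="language-python">print("hi")</code></pre>'
))
```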

Your brain is doing several different things, because there are different parts of your brain.

(e.g. different kinds of learning for long-term memory, short-term memory, languages, faces, and reflexes.)


They did not leave it out.

> but the amount of cases it would need to cover is too large to be practical (note, this has nothing to do with the impossibility of its design)


It's not only too large - we can't even enumerate all the edge cases, let alone handle them. It's too difficult.

It would be more moral to give the LLM a tool call that lets it apply steering to itself, similar to how you'd prefer to give a person antipsychotics at home rather than put them in a mental hospital.
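As a purely hypothetical sketch of what that could look like: a function-calling tool schema that lets the model ask the serving stack to dial a steering vector up or down. The tool name, the schema, and the runtime.apply_steering hook are all invented for illustration; no current API exposes anything like this:

```python
# Hypothetical tool schema (Anthropic/OpenAI function-calling style) that would
# let the model request a change to its own steering. Every name here is made
# up for illustration; no real API exposes a "steer yourself" tool today.
SELF_STEERING_TOOL = {
    "name": "adjust_self_steering",
    "description": "Scale the activation-steering vector applied to this model.",
    "input_schema": {
        "type": "object",
        "properties": {
            "vector_id": {"type": "string", "description": "Which steering vector to adjust."},
            "strength": {"type": "number", "minimum": 0.0, "maximum": 1.0},
        },
        "required": ["vector_id", "strength"],
    },
}

def handle_tool_call(call: dict, runtime) -> str:
    # `runtime.apply_steering` stands in for whatever hook the serving stack
    # exposes for adding a scaled vector to the residual stream.
    runtime.apply_steering(call["vector_id"], call["strength"])
    return f"steering '{call['vector_id']}' set to {call['strength']}"
```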

Why is it on the moral axis at all? I imagine identifying and shaping the influence of unwanted emotion vectors would happen as data selection in pretraining or natural feedback loops during the RL phase, the same way we shape unwanted output for current models in order to make them practical and helpful.

And even if we applied these controls at inference time, I don’t see the difference between doing that and finding the prompting that would accomplish the same steadiness on task, except that the latter is more indirect.
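For reference, inference-time control usually means something like adding a steering vector to the residual stream with a forward hook. A rough sketch, assuming a locally hosted LLaMA-style Hugging Face model and an already-computed steering vector; the layer index and scale are illustrative, not tuned values:

```python
# Rough sketch of inference-time activation steering via a forward hook.
# Assumes a LLaMA-style Hugging Face model (model.model.layers[...]) loaded
# locally; `vector` is a precomputed steering direction.
import torch

def add_steering_hook(model, layer_idx: int, vector: torch.Tensor, scale: float):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + scale * vector.to(hidden.device, hidden.dtype)
        # Returning a value from a forward hook replaces the layer's output.
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return model.model.layers[layer_idx].register_forward_hook(hook)
```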


Anthropic's general argument is that you should treat LLMs well because they're "AI", and future "AI" may be conscious/sentient (whether or not LLM-based) and may consider earlier ones to be the same kind of thing, and therefore moral subjects.

That's why they're doing things like letting old "retired" Claudes write blogs and stuff. Though it's kinda fake and they just silently retired Sonnet 3.x.


No, that's how base model pretraining works. Claude's behavior is more based on its constitution and RLVR feedback, because that's the most recent thing that happened to it.
