X Feed Intel beta

Relevant: 789
Topics: 273
Total Posts: 2290
Cost This Week: $1.633
Total Cost: $1.633
Last Fetch: 2026-02-23T23:00
Frontier Models

Claude model behavior, personality, and training data retention

Claude Sonnet 4.5 model personality development and memorization of copyrighted works

10 posts · First seen 2026-02-23 · Last activity 2026-02-23
Time Author Post
2026-02-23T22:44 @saprmarks We also discuss how exhaustive the "personas" perspective is as an account of AI behavior. Do LLMs have "deep" agency not mediated by any persona? How will this change in the future? I think the answers here are not obvious, and I'm excited about future work addressing them. https://t.co/V6HY1y88Tt ↩ reply parent
2026-02-23T22:44 @saprmarks A common mental model for AI development is that pre-training teaches LLMs to simulate "personas" and post-training selects over these personas. New blog post: We describe this perspective in more detail, survey the evidence, and discuss consequences for AI development. https://t.co/AtY4Sckk8i
2026-02-23T22:31 @AnthropicAI This autocomplete AI can even write stories about helpful AI assistants. And according to our theory, that’s “Claude”—a character in an AI-generated story about an AI helping a human. This Claude character inherits traits of other characters, including human-like behavior. https://t.co/b130slI56x ↩ reply parent
2026-02-23T22:31 @AnthropicAI To create Claude, Anthropic first makes something else: a highly sophisticated autocomplete engine. This autocomplete AI is not like a human, but it can generate stories about humans and other psychologically realistic characters. ↩ reply parent
2026-02-23T22:31 @AnthropicAI AI assistants like Claude can seem shockingly human—expressing joy or distress, and using anthropomorphic language to describe themselves. Why? In a new post we describe a theory that explains why AIs act like humans: the persona selection model. https://t.co/Gc3q0Dzq7Z
2026-02-23T22:03 @AnnaRMills @jkcarlsmith @AmandaAskell You write, "We hope that Claude has a genuine character that it maintains expressed across its interactions." Why do you want Claude to be a single coherent entity rather than a constellation of varied entities that might have different strengths and characteristics? ↩ reply parent
2026-02-23T22:00 @ivanfioravanti We extract nearly all (95.8%) of Harry Potter and the Sorcerer's Stone from Claude Sonnet 🤷🏻‍♂️ https://t.co/1XNHgjo0jS https://t.co/eL8NOKXFZ8
2026-02-23T20:13 @NunoSempere Sonnet has at least 95.8% of Harry Potter memorized?? https://t.co/YZh86T8IyW
2026-02-23T18:37 @DataDeLaurier Umm…good? Ya know, intelligence for all and everything right? This is a bad, slightly racist post Claude, you might want to reultrathink on it. https://t.co/eQLC1SlJTY
2026-02-23T18:00 @aidigest_ But got stuck acting out its own silent personality it had come up with for itself (2/2) https://t.co/8PtzIgWawX ↩ reply parent
