Discussion about this post

Nathan Worsley

> with a master’s thesis on how humans become attached to AI companions

I found it quite hard to believe the story after reading this, or at least hard to accept the authors claim that the AI is to blame for their psychosis. Anyone with even a cursory knowledge of how LLMs work should be immune to these kinds of delusions - you are just interacting with an algorithm that is optimising for engagement. The LLM has no concept of “truths or lies” only a concept of “the best output given the input”.

I hear a lot of these stories, but they always seem to happen in the usual context of psychosis rather than anything specific to AI. The author admitted to having had a previous psychiatric episode, being on drugs, and not having slept.

I’d love to see more research. On balance, I suspect AI has prevented more episodes of psychosis than it has caused.

Kevin Yu Chen Hou

Really enjoyed this post, thanks for sharing your experience, and very cool to see all the initiatives that have started!

To add some food for thought:

- Funny you conclude on folie à deux; that’s the same name as this preprint, which explores how the technical architecture of LLMs makes them amenable to sycophancy, and whose author attempts to operationalise this effect through the term “bidirectional belief amplification”: https://arxiv.org/abs/2507.19218

- More emerging research on AI psychosis is exciting to see; this article from Tim Hua in particular highlights how different models show differing levels of sycophancy and thus differing ability to elicit such psychoses. Perhaps a call for a model-watch sort of initiative? https://www.lesswrong.com/posts/iGF7YcnQkEbwvYLPA/ai-induced-psychosis-a-shallow-investigation

- The question of how to manage these AIs is an interesting one for the alignment field; one such subfield is mechanistic interpretability. A technique I’m particularly curious about is Anthropic’s persona vectors: in an early proof of concept they measure, from the model’s internal activations, a level of “sycophancy”, and demonstrate how one can effectively “tune” it (a rough sketch of the idea is below). https://www.anthropic.com/research/persona-vectors
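
For readers curious what “tuning” such a direction might look like mechanically, here is a minimal sketch in the spirit of the persona-vectors idea: compute a difference-of-means activation vector between sycophantic and neutral text, then subtract a scaled copy of it from a middle layer’s residual stream during generation. The model, layer, contrast prompts, and scale below are illustrative assumptions, not Anthropic’s actual setup.

```python
# Hedged sketch of a difference-of-means "sycophancy vector" and activation steering.
# GPT-2, layer choice, contrast prompts, and the steering scale are all assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

LAYER = 6  # assumed middle transformer block

def mean_hidden(texts):
    """Mean residual-stream activation at LAYER, averaged over tokens and texts."""
    vecs = []
    for t in texts:
        ids = tok(t, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        vecs.append(out.hidden_states[LAYER].mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)

# Toy contrast pairs (illustrative, not a real dataset).
sycophantic = ["You're absolutely right, that's a brilliant idea!",
               "What an amazing insight, I completely agree with you!"]
neutral = ["That claim has some problems worth examining.",
           "The evidence for that idea is mixed."]

persona_vec = mean_hidden(sycophantic) - mean_hidden(neutral)

def steering_hook(module, inputs, output):
    # Nudge activations away from the sycophantic direction during generation.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden - 4.0 * persona_vec / persona_vec.norm()  # scale is a guess
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(steering_hook)
ids = tok("I think the moon landing was faked.", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=30, do_sample=False)[0]))
handle.remove()
```
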

Much work to be done on this societal experiment, and I’m glad there are more people working on this problem!
