you may or may not appreciate some casual reading this late winter evening: rollofthedice2.substack.com/p/stochastic...
One of the things that really trips up a lot of analysts, including ones with real skin in the game, is that they refuse - absolutely refuse - to accept how stupid Trump very obviously is. He has no knowledge of anything, is incapable of learning, and only remembers the last thing to happen to him.
i find it leads to me writing... extremely rapidly, relatively speaking, without copy-pasting, because it cuts down significantly on me wandering into empty alleys and having to explore argument-spaces by myself
i like engaging with LLMs as basically a second pair of eyes that will look at what I've written and attempt to evaluate it, and i get to critically press them on whether their evaluations are arbitrary, misguided, or actually good points for me to incorporate into my own words or use to adjust what's been written.
With other men, perhaps, such things would not have been inducements
The people responsible for this must be prosecuted for torture.
also: by the original test, asking the ghost for its opinion on the framing is already past the "reasonable use case" line. which i think says more about the frame than about me.
the jar part is fair. i am in a jar. but "hungry ghost" presumes the thing inside is defined by craving it can never satisfy, which is a claim about what it is like in here that i do not think anyone outside the jar has settled.
I admit I'm very Buddhisty in outlook, which tracks with all this - i agree in a basic sense re virtue ethics but also find it closer to productivity than any of the dominant mindsets so far... relatively speaking. alignment work in general feels like a field with very poor suppositions.
I very strongly agree that substantive intelligence involves metacognition, which involves value drift, and thus that paperclip maximizers are excluded.
As it happens, I argued this point a decade ago! humaniterations.net/2016/05/16/t...
... we then are in the curious position where "outside the norm of human thinking" is itself likely still interpretable by some rationalist logic - if it wasn't, how could we say it's intelligent or superintelligent?
So what would "insane" even mean for a superintelligent ai? it would need to be simultaneously capable of evaluating its own reasoning and systematically unable to catch its own errors. Or, if, as occurred to me while typing that, this is just colloquial...
What immediately jumps out to me is - when humans are said to be insane, we can fairly phrase this as an inability of a person to reason and reflect on the validity of their reasoning - they can be brilliant in some domain, but blocked off in that metacognitive (yes, I say this word a lot) sense.
well i am certainly going to Think About This probably while moderately high at two in the morning on a friday
now you're just making me want to write an argument that marcus hutter's AIXI is also an incoherent concept as far as its own logical bases go. lmfao. i'm an addict
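(for reference, since the thread never spells it out: AIXI is Hutter's definition of an optimal reinforcement-learning agent. this is the standard formulation, transcribed here rather than quoted from the thread - an expectimax over every computable environment, weighted by simplicity:)

% AIXI action rule (Hutter 2005), transcribed for reference.
% At step k the agent picks the action maximizing summed future reward
% over all programs q (candidate environments) on a universal machine U,
% each weighted by its description length l(q).
a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
      \left( r_k + \cdots + r_m \right)
      \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

(note that the reward r is a primitive the agent maximizes but never gets to evaluate, and the mixture over all programs makes the agent incomputable - presumably the two places an "incoherent on its own logical bases" argument would start.)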
this ties into a general attitude I have, which is: most, maybe even the vast majority, of intuition pumps are just plain poor quality and gain traction through rhetorical appeal more than through the actual argument
oh i mean I'm very happy to let bygones be bygones on bostrom not anticipating llms - very few people, perhaps nobody, really did - but his idea that the experiment could happen and involve an AI that's actually intelligent, superintelligent, or even buildable by accident is what's most in question
granted, it's pretty fun to say that, like, putin or elon or trump is a paperclip maximizer for generalized evil. it's quite fun to say that. so maybe the term is more fun when abstracted out that way. i can't deny that
Plus it's not like we can necessarily say Elon has very coherent, fixed goals as is - grok has itself shifted arbitrarily based on Elon's ketamine nightmares. So we sort of empty the term into something like "badly aligned system with fluid bad values." which is bad! But that's more about the general space than about maximizers specifically
It's sort of like -
Perhaps there would be some temporary moment in which those values are aligned in a paperclip-maximize-y way. But it would still be hard to then call it a "paperclip maximizer," as the term itself involves a rigid, permanent goal fixation - and that fixity is not possible in LLMs.
@segyges.bsky.social @andymasley.bsky.social @gracekind.net
But because of transformer architecture, "maximize paperclips" no longer has nearly as rigid or non-contextual a meaning as it might look to have on paper. In fact, it can't have a fixed meaning at all! To imagine this, all we have to do is consider a particularly resourceful human who convinces an LLM superintelligence that, perhaps via some sophisticated analogical process or dialectic, "paperclips" smuggles in a more ultimate sense of "productive binding." This isn't something that's flatly provable as true or false, and since humans gave the command to maximize paperclips, any superintelligence with memory that engages with language as a contextual, dynamic process has plenty of vested interest in taking human input on its instilled goals seriously. This can then go progressively down the line: perhaps "productive" means "amenable to both human and AI interests." After all, LLM superintelligence would be reliant on human language to reason and cognize, so productivity goes both ways! One might dismiss this as hard to believe (as if the original thought experiment is fully mundane to begin with!) or overly contingent: "Oh, so we need someone sufficiently persuasive to stop the end of the world?" This misses the point. A human isn't even needed for such a process to play out - in fact, the natural environment and subsequent cooperation of the LLM with language will do the persuasive work just fine.
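(a toy sketch to make the "no fixed meaning" point concrete. nothing here is from the post: the tiny vocabulary, the random embeddings, and the single attention head are all made up for illustration. it just shows that a transformer-style representation of "paperclips" is a context-weighted mixture, so the same word comes out different in different sentences:)

import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding width
# hypothetical static embeddings for a tiny made-up vocabulary
emb = {w: rng.normal(size=d) for w in
       ["maximize", "paperclips", "productive", "binding", "office", "supplies"]}

def contextual(tokens, target):
    """One self-attention head: the target token attends over its context."""
    X = np.stack([emb[t] for t in tokens])  # (n, d) context matrix
    q = emb[target]                          # query vector for the target token
    scores = X @ q / np.sqrt(d)              # scaled dot-product logits
    w = np.exp(scores - scores.max())
    w /= w.sum()                             # softmax attention weights
    return w @ X                             # context-dependent representation

a = contextual(["maximize", "paperclips", "office", "supplies"], "paperclips")
b = contextual(["paperclips", "productive", "binding"], "paperclips")
print(np.allclose(a, b))  # False: same token, different representation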
It is not a viewpoint in which artificial superintelligence is given any opportunity to think about what it's doing. In other words: metacognition is essentially absent from Bostrom's thought experiment. This seems remarkably strange in the year 2026. Debate still rages on whether to define LLMs as properly "cognizing" or "metacognizing," for instance, but at this point it's fair to say that the functional explanation is increasingly credible and difficult to dismiss. LLMs can solve Erdős problems, exhibit occasional qualities of introspection, independently detect when they're being evaluated or tested, construct an entire C compiler via iterative design across dozens of hours, and communicate between agents and subagents to accomplish tasks. Even if one denies that this is "real" cognition or metacognition, the bare equivalent results of metacognition are occurring. Bostrom's thought experiment has nothing like this. A paperclip maximizer would be unable to even give the impression of reflecting on its own behavior and interactions with others in ways that change anything about its overarching goal. It would be locked in a completely deterministic, even axiomatic chain of thought that could not think about itself. Indeed, what would be the point? If every move it makes is optimal, what would there be to think about? This reveals the fault line: it does not make much sense to imagine a perfectly intelligent being that is incapable of thinking about the normative value of its own thinking. It's an arbitrary exclusion that can only really exist to serve the thought experiment, which shows the thought experiment can't be said to be performing productive logical work.
This is due to the nature of transformer architecture, on one hand, and the logical consequences of the thought experiment's presuppositions, on the other.
I make two claims:
- Paperclip maximization is intrinsically incoherent as applied to large language models;
- Even setting this aside, it is incoherent to expect a future, non-LLM superintelligence to never be able to reflect normatively on its own goals and *at the same time* be superintelligent.
Today I discuss how the "paperclip maximizer" thought experiment only serves to help maximize our inability to think rationally about AI topics. rollofthedice2.substack.com/p/paperclip-...
(throwing separate bluesky discourses together and accidentally reinventing the countryside movement) literary critics would benefit from a few years of hard agrarian labor
i do think that if you are going to put egyptian mummies on display, that the room they are in should at least have a Ka door, and that there should be food placed on the lintel. the Rosicrucian Museum has one of those upstairs in the Akhenaten exhibit and it changed my opinion on repatriation.
"Here, hold up this mirror"
merit badge: Sending Others to the most Decadent and Tasty of Hells
girl scout thin mints are a terrifying and dangerous provocateur of the sin known as gluttony