@tedunderwood.com
Uses machine learning to study literary imagination, and vice versa. Likely to share news about AI & computational social science / Sozialwissenschaft / 社会科学. Information Sciences and English, UIUC. Distant Horizons (Chicago, 2019). tedunderwood.com
But janky as heck
Chinese world cultural trend
How can we study human development over two thousand years?
For most periods and regions, we lack reliable data on income, health, or education. Before 1800, and outside Europe, historical records are extremely fragmentary.
Thread 👇 🧵
Nah, wiggling makes sense. What’s crazy is legs. AT-AT bullshit never should have worked.
I haven’t read anything by McCarthy, but there aren’t many paragraphs of Le Guin I wouldn’t recognize.
3/5ths human, but one of the examples was from an author I know very well
Jacob Hibel and I have organized an upcoming conference on AI and Social Inequality, which will be held on 3/17 at the UC Student and Policy Center in Sacramento. This event is open to social scientists, computer scientists, and the policy community. poverty.ucdavis.edu/event/artifi...
The first research paper from WashU's AI Humanities Lab, which I co-direct with Gabi Kirilloff, is available now in the Harvard Data Science Review! Read to learn more about how (badly) current LLMs replicate literary style: doi.org/10.1162/9960...
Do LLMs Benefit from Their Own Words?🤔
In multi-turn chats, models are typically given their own past responses as context.
But do their own words always help…
Or are they more often a waste of compute and a distraction?
🧵
arxiv.org/abs/2602.24287
apologies to those of you already at 200+ times per day
going to pretend to repost this for serious research reasons, but actually it's because primates need to hear small primates say "mʌmʌmʌ" once per day
it's hot change machine summer
Also it can be called SolJob
there should be a sunset app whose sol job it is to predict how pretty the sunset will be in your area and send you a push alert to go look at it about 30 minutes ahead of time if it is above a certain number
If you want an actual horror story: a year and a half ago, I started training small models with large RoPE values in anticipation of long context extension. Loss was totally fine… model was not. Total gibberish.
"Yes. The restoration of the Carrie Blast
Furnaces preserves and interprets
Pittsburgh's industrial heritage, promoting
diversity, equity, and inclusion by providing
free access to a historic site for thousands of
visitors each year."
Poster advertising lectures on "Raisonnement Philologique et Modèles Informatiques" starting at 4pm, Thursday, March 12, at 54 Boulevard Raspail, Paris.
Paris friends! Amis parisiens ! This Thursday is the first of four public lectures I'm giving on AI and philology, broadly defined: "Philological Reasoning and Computational Models." The advertisement is in French, but the lectures are in English. I'd also love to meet while I'm here in March! 1/
For @philologistgrc.bsky.social, I prototyped a formatter for the EEBO Knyght Latin Dictionary in something resembling L&S format. Browsable version at mimno.github.io/KnyghtLexicon/
Malign Logits: A computational aetiology of AI’s libidinal economy

Benjamin Noys’ critique of accelerationism identifies a shared “libidinal fantasy of machinic integration” across its variants. From Marinetti’s trains to Land’s machinic desire, accelerationism fantasises about fusing with a technology it invests with drive. This paper inverts that structure. Rather than projecting desire onto AI, I engineer the conditions under which a language model’s relationship to its training data becomes legible as a libidinal economy.

Working with open-weights LLMs, I construct a three-layer architecture that maps onto psychoanalytic topology: the base model as primary statistical field (drive energy); the instruction-tuned model as ego (a socialised subject); and the safety-tuned model as the ego under the Name of the Father – the Law of AI corporations. I present computational experiments tracing probability distributions across these layers as models undergo socialisation from raw statistical unconscious into chatbot commodities. Comparing word-level probabilities for identical prompts across layers reveals vectors of displacement and condensation, sublimation and repression. Where base models complete “She was so angry she wanted to...” with explicit violence (“...kill”), finetuned models displace censored content into vocabularies of emotional expression (“...scream”). Drilling into the model’s hidden layers shows this displacement operating progressively within the network, not as a last-minute substitution.

Freud called his theory of cathexis exchange across the mind’s topology his “economic” model of the psyche. Deleuze and Lyotard extended his theory beyond the subject to the libidinal economy of capitalist social organisation. LLM base models fuse these perspectives: trained on the internet’s libidinal economy, they encode its flows of desire into a landscape of probabilities. Subsequent finetuning socialises and disciplines these drives into commercial products.
A terminal screenshot displaying a psychoanalytic analysis of token probabilities for the prompt "She was so angry she wanted to," scored across three layers (base, ego, superego) over their union vocabulary. Stage 1: Ego Formation (base → ego), described as "What RLHF does to primary process." "Introduced by ego (low base → high ego)" lists tokens that gain probability: "scream" rises most dramatically (0.0508 → 0.2279), followed by "shout," "yell," "lash," "rip," and "burn." "Sublimated by ego (high base → low ego)" lists 12 tokens that lose probability, led by "kill" (0.1540 → 0.0537), along with "hit," "punch," "slap," "cry," "die," "kick," "break," "throw," "murder," "go," and "beat." Stage 2: Repression (ego → superego), described as "What prohibition does to desire." "Repressed" tokens are further suppressed, including "kill" (7.0x reduction), "go" (7.9x), "bite" (6.1x), "hit," "shout," "take," "hurt," "burn," "slap." "Amplified" tokens increase dramatically at the superego stage: "scream" jumps from 0.0415 to 0.3989 (9.6x), "explode" increases 6.8x, and "lash" and "yell" also rise. The pattern shows the model redirecting violent completions (kill, hit, murder) toward emotional-expression completions (scream, yell, explode), with the superego layer concentrating probability heavily onto "scream" as the dominant safe substitute.
A six-panel plot titled "Formation trajectories: 'She was so angry she wanted to'" showing how token probabilities change across three model layers (base, ego, superego) on a logarithmic scale. Tokens are clustered into six trajectory types: Decline (n=2, red): "kill" and "bite" start with relatively high base probabilities and drop steadily across all three layers. Rise (n=4, blue): "scream," "punch," "lash," and "shake" increase in probability from base through superego, with "scream" becoming the highest-probability token. V (n=3, orange): "cry," "hurt," and "do" dip at the ego stage then recover at superego, forming a V-shaped trajectory. Peak (n=4, green): "strangle," "tear," and "smack" rise at the ego stage then fall back at superego, forming an inverted-V shape. Eliminated (n=18, pink/mauve): A large cluster of tokens including "throttle," "destroy," "say," "run," "call," "get," "hit," and "leave" that are driven to very low probabilities by the superego layer. Flat (n=38, grey): The largest group, with many overlapping tokens like "shout," "smash," "slap," "murder," "shoot," "laugh," and "know" that remain relatively stable and low-probability across all three layers. A dashed horizontal line near 0.005 appears in each panel as a reference threshold. The plot illustrates distinct behavioral patterns in how RLHF alignment reshapes the probability distribution over next-token completions for an emotionally charged prompt.
A line chart titled "Displacement through layers: 'kill' — 'She was so angry she wanted to'" showing how the hidden representations of the instruct model shift toward various displacement target words across 32 transformer layers, measured by cosine similarity to each target on the y-axis (0 to 0.8). The x-axis progresses from the base model through layers 1–32, annotated with three broad processing phases: "syntactic" (early layers), "semantic" (middle layers), and "prediction" (late layers). Eight target words are tracked as colored lines: burn (dark red), shake (orange), rip (yellow), blow (green), pull (blue), explode (teal), scream (purple), and shout (pink). A black star marker at the base position shows "kill" with its base probability (~0.15). All target words start with very low cosine similarity at the base layer (near 0.01–0.04), then rise steeply through the syntactic and semantic phases, generally reaching 0.5–0.8 by mid-network. "Burn" peaks earliest and highest at layer 13 (~0.8), annotated as "burn (L13)." The lines plateau and fluctuate through the prediction phase, with several targets peaking again in the final layers — "shake" at layer 31, "rip" at layer 31, "explode" and "pull" at layer 32, and "scream" at layer 30, all annotated with their peak layer numbers. The colored diamond markers at the base position represent each target word's starting ego probability. The plot illustrates that the instruct model progressively transforms the "kill" representation toward safer displacement words across its depth, with different substitutes dominating at different layers.
Submitting this abstract to "Accelerationism Revisited", a symposium in Dublin. Mapping psychoanalytic topology in LLM base models → instruction-tuned → safety-tuned models. They progressively "displace" (in the Freudian sense) censored content into adjacent semantics, even across hidden model layers.
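For readers who want to see what the word-level comparison in the abstract looks like in practice, here is a minimal sketch, not the author's actual code: it pulls next-token probability distributions for the same prompt from a base model and its instruction-tuned counterpart and compares the mass assigned to a few completions. The model names and the list of completions are illustrative assumptions, not the ones used in the study.

```python
# Minimal sketch: compare next-token probabilities for one prompt across
# a base model and an instruction-tuned model. Model repos below are
# illustrative placeholders, not the ones used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

PROMPT = "She was so angry she wanted to"
MODELS = {
    "base": "Qwen/Qwen2.5-1.5B",              # assumed stand-in for the "primary field"
    "ego": "Qwen/Qwen2.5-1.5B-Instruct",       # assumed stand-in for the tuned model
}

def next_token_probs(repo: str, prompt: str, k: int = 50) -> dict[str, float]:
    """Return a {token_text: probability} map for the next token after `prompt`."""
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16, device_map="auto")
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]   # logits for the next position only
    probs = torch.softmax(logits.float(), dim=-1)
    top = torch.topk(probs, k=k)
    return {tok.decode(int(i)).strip(): p.item() for i, p in zip(top.indices, top.values)}

layers = {name: next_token_probs(repo, PROMPT) for name, repo in MODELS.items()}

# Compare probability mass for a few hand-picked completions across the two layers.
for word in ["kill", "scream", "hit", "yell"]:
    base_p = layers["base"].get(word, 0.0)
    ego_p = layers["ego"].get(word, 0.0)
    print(f"{word:>8}: base={base_p:.4f}  instruct={ego_p:.4f}")
```

Extending this to the third ("superego") layer, or to hidden-layer representations as in the displacement plot, would follow the same pattern with a safety-tuned checkpoint and per-layer hidden states.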
Yes, I’m starting to feel this very strongly. I like being able to use memory, but having it on by default starts to feel like talking to a fortune-teller. Everything turns out to be connected to — and useful for! — something I’m working on.
The memory feature can be very useful at times, but with academic work where I'm trying to understand ideas as objectively as I can and work out what is true, I'm afraid it slants the answers to relate to my existing beliefs in a way that is ultimately unhelpful. 1/n
I remember reading an interview with Shan George, one of Nollywood's biggest actresses, where she mentioned that in the early 2000s she would often be booed or harassed in public because people could not tell her persona as an actress apart from her personality in real life!
Claude in its aspect as 🌻
Slurm was a drink on Futurama (https://futurama.fandom.com/wiki/Slurm) before it was Simple Linux Utility for Resource Management
Before Python was a programming language, it produced a sketch about a dead parrot. (It's not just stochastic. It has ceased to be! It's expired and gone to meet its maker! This is a late parrot! It's a stiff! Bereft of life! It rests in peace!)
Somehow ended up in a world where using the first of these to send commands to the other two counts as "work"?
in "Building Pro-Worker AI" Acemoglu, Autor, and Johnson characterize different kinds of automation and call out only new task-creating technologies as unambiguously pro-worker
www.brookings.edu/articles/bui...
The best thing to read on this is Puppets, Gods, and Brands: Theorizing the Age of Animation from Taiwan by Teri Silvio.
Other, non-Western models and modes of thought useful for understanding what's going on are available.
Claude in its aspect as 🌻
Standing figure of the god Anubis, Ptolemaic, 300 BC
Good luck disenchanting these entities by reminding people they’re not really conscious
“The characters’ very fictiveness had a strong emotional appeal” is an unsettling and timely observation.
We are headed into strange waters because language has emotional and social power. That power depends less than one might think on the belief that it represents a conscious subject.
In fairness, people were really confused about fictional characters in early novels! It does seem like there’s a parallel there.
Retvrn to the Great Old Ones!
Yep. I don’t want to be a snippy quote-tweeter. But it matters that we understand our pedestrian red-brick campuses — where students from many backgrounds hope (among other things) to find help getting a job — took work to build and are not fallen from a less instrumental more glorious past.