brennen

Saturday Morning Breakfast Cereal - Butlerian by Zach Weinersmith
Friday July 11^th, 2025 at 8:58 AM

Hovertext:
I don't believe in a fast take-off for evil AI because it's gonna take at least a few weeks to get the human-grinders up and running.

Today's News:

Read the whole story

brennen

18 hours ago

reply

Boulder, CO

Copyleft-next Relaunched! by bkuhn@ebb.org (Bradley M. Kuhn)
Thursday July 10^th, 2025 at 10:09 PM

Bradley M. Kuhn's Blog ( bkuhn )

I am excited that Richard Fontana and I have announced the relaunch of copyleft-next.

The copyleft-next project seeks to create a copyleft license for the next generation that is designed in public, by the community, using standard processes for FOSS development.

If this interests you, please join the mailing list and follow the project on the fediverse (on its Mastodon instance).

I also wanted to note that as part of this launch, I moved my personal fediverse presence from floss.social to bkuhn@copyleft.org.

Read the whole story

brennen

1 day ago

reply

Hmm.

Boulder, CO

Avoiding generative models is the rational and responsible thing to do – follow-up to “Trusting your own judgement on ‘AI...’”
Thursday July 10^th, 2025 at 7:45 AM

Baldur Bjarnason's Notes on the Web

I don’t recommend publishing your first draft of a long blog post.

It’s not a question of typos or grammatical errors or the like. Those always slip through somehow and, for the most part, don’t impact the meaning or argument of the post.

No, the problem is that, with even a day or two of distance, you tend to spot places where the argument can be simplified or strengthened, the bridges can be simultaneously strengthened and made less obvious, the order can be improved, and you spot which of your darlings can be killed without affecting the argument and which are essential.

Usually, you make up for missing out on the insight of distance with the insight of others once you publish, which you then channel into the next blog post, which is how you develop the bad habit of publishing first drafts as blog posts, but in the instance of my last blog post, Trusting your own judgement on ‘AI’ is a huge risk, the sheer number of replies I got was too much for me to handle, so I had to opt out.

So, instead I let creative distance happen – a prerequisite to any attempt at self-editing – by working on other things and taking walks.

During one of those walks yesterday, I realised it should be possible to condense the argument quite a bit for those who find 3600 words of exposition and references hard to parse.

It comes down to four interlocking issues:

It’s next to impossible for individuals to assess the benefit or harm of chatbots and agents through self-experimentation. These tools trigger a number of biases and effects that cloud our judgement. Generative models also have a volatility of results and uneven distribution of harms, similar to pharmaceuticals, that means it’s impossible to discover for yourself what their societal or even organisational impact will be.
Tech, software, and productivity research is extremely poor and is mostly just marketing – often replicating the tactics of the homeopathy and naturopathy industries. Most people in tech do not have the training to assess the rigour or validity of studies in their own field. (You may disagree with this, but you’d be wrong.)
The sheer magnitude of the “AI” Bubble and the near totality of the institutional buy-in – universities, governments, institutions – means that everybody is biased. Even if you aren’t biased yourself, your manager, organisation, or funding will be. Even those who try to be impartial are locked in bubble-inflating institutions and will feel the need to protect their careers, even if it’s only unconsciously. More importantly, there is no way for the rest of us to know the extent of the effect the bubble has on the results of each individual study or paper, so we have to assume it affects all of them. Even the ones made by our friends. Friends can be biased too. The bubble also means that the executive and management class can’t be trusted on anything. Judging from prior bubbles in both tech and finance, the honest ones who understand what’s happening are almost certainly already out.
When we only half-understand something, we close the loop from observation to belief by relying on the judgement of our peers and authority figures, but these groups in tech are currently almost certain to be wrong or substantially biased about generative models. This is a technology that’s practically tailor-made to be only half-understood by tech at large. They grasp the basics, maybe some of the details, but not fully. The “halfness” of their understanding leaves cognitive space that lets that poorly founded belief adapt to whatever other beliefs the person may have and whatever context they’re in without conflict.

Combine these four issues and we have a recipe for what’s effectively a homeopathic superstition spreading like wildfire through a community where everybody is getting convinced it’s making them healthier, smarter, faster, and more productive.

This would be bad under any circumstance but the harms from generative models to education, healthcare, various social services, creative industries, and even tech (wiping out entry-level programming positions means no senior programmers in the future, for instance) are shaping up to be massive, the costs to run these specific kinds of models remain much higher than the revenue, and the infrastructure needed to build it is crowding out attempts at an energy transition in countries like Ireland and Iceland.

If there ever was a technology where the rational and responsible act was to hold off and wait and until the bubble pops, “AI” is it.

Read the whole story

brennen

1 day ago

reply

Boulder, CO

The Future of Forums is Lies, I Guess by Aphyr
Tuesday July 8^th, 2025 at 3:29 AM

Aphyr: Posts

In my free time, I help run a small Mastodon server for roughly six hundred queer leatherfolk. When a new member signs up, we require them to write a short application—just a sentence or two. There’s a small text box in the signup form which says:

Please tell us a bit about yourself and your connection to queer leather/kink/BDSM. What kind of play or gear gets you going?

This serves a few purposes. First, it maintains community focus. Before this question, we were flooded with signups from straight, vanilla people who wandered in to the bar (so to speak), and that made things a little awkward. Second, the application establishes a baseline for people willing and able to read text. This helps in getting people to follow server policy and talk to moderators when needed. Finally, it is remarkably effective at keeping out spammers. In almost six years of operation, we’ve had only a handful of spam accounts.

I was talking about this with Erin Kissane last year, as she and Darius Kazemi conducted research for their report on Fediverse governance. We shared a fear that Large Language Models (LLMs) would lower the cost of sophisticated, automated spam and harassment campaigns against small servers like ours in ways we simply couldn’t defend against.

Anyway, here’s an application we got last week, for a user named mrfr:

Hi! I’m a queer person with a long-standing interest in the leather and kink community. I value consent, safety, and exploration, and I’m always looking to learn more and connect with others who share those principles. I’m especially drawn to power exchange dynamics and enjoy impact play, bondage, and classic leather gear.

On the surface, this is a great application. It mentions specific kinks, it uses actual sentences, and it touches on key community concepts like consent and power exchange. Saying “I’m a queer person” is a tad odd. Normally you’d be more specific, like “I’m a dyke” or “I’m a non-binary bootblack”, but the Zoomers do use this sort of phrasing. It does feel slightly LLM-flavored—something about the sentence structure and tone has just a touch of that soap-sheen to it—but that’s hardly definitive. Some of our applications from actual humans read just like this.

I approved the account. A few hours later, it posted this:

It turns out mrfr is short for Market Research Future, a company which produces reports about all kinds of things from batteries to interior design. They actually have phone numbers on their web site, so I called +44 1720 412 167 to ask if they were aware of the posts. It is remarkably fun to ask business people about their interest in queer BDSM—sometimes stigma works in your favor. I haven’t heard back yet, but I’m guessing they either conducting this spam campaign directly, or commissioned an SEO company which (perhaps without their knowledge) is doing it on their behalf.

Anyway, we’re not the only ones. There are also mrfr accounts purporting to be a weird car enthusiast, a like-minded individual, a bear into market research on interior design trends, and a green building market research enthusiast in DC, Maryland, or Virginia. Over on the seven-user loud.computer, mrfr applied with the text:

I’m a creative thinker who enjoys experimental art, internet culture, and unconventional digital spaces. I’d like to join loud.computer to connect with others who embrace weird, bold, and expressive online creativity, and to contribute to a community that values playfulness, individuality, and artistic freedom.

Over on ni.hil.ist, their mods rejected a similar application.

I’m drawn to communities that value critical thinking, irony, and a healthy dose of existential reflection. Ni.hil.ist seems like a space that resonates with that mindset. I’m interested in engaging with others who enjoy deep, sometimes dark, sometimes humorous discussions about society, technology, and meaning—or the lack thereof. Looking forward to contributing thoughtfully to the discourse.

These too have the sheen of LLM slop. Of course a human could be behind these accounts—doing some background research and writing out detailed, plausible applications. But this is expensive, and a quick glance at either of our sites would have told that person that we have small reach and active moderation: a poor combination for would-be spammers. The posts don’t read as human either: the 4bear posting, for instance, incorrectly summarizes a report on interior design markets as if it offered interior design tips.

I strongly suspect that Market Research Future, or a subcontractor, is conducting an automated spam campaign which uses a Large Language Model to evaluate a Mastodon instance, submit a plausible application for an account, and to post slop which links to Market Research Future reports.

In some sense, this is a wildly sophisticated attack. The state of NLP seven years ago would have made this sort of thing flatly impossible. It is now effective. There is no way for moderators to robustly deny these kinds of applications without also rejecting real human beings searching for community.

In another sense, this attack is remarkably naive. All the accounts are named mrfr, which made it easy for admins to informally chat and discover the coordinated nature of the attack. They all link to the same domain, which is easy to interpret as spam. They use Indian IPs, where few of our users are located; we could reluctantly geoblock India to reduce spam. These shortcomings are trivial to overcome, and I expect they have been already, or will be shortly.

A more critical weakness is that these accounts only posted obvious spam; they made no effort to build up a plausible persona. Generating plausible human posts is more difficult, but broadly feasible with current LLM technology. It is essentially impossible for human moderators to reliably distinguish between an autistic rope bunny (hi) whose special interest is battery technology, and an LLM spambot which posts about how much they love to be tied up, and also new trends in battery chemistry. These bots have been extant on Twitter and other large social networks for years; many Fediverse moderators believe only our relative obscurity has shielded us so far.

These attacks do not have to be reliable to be successful. They only need to work often enough to be cost-effective, and the cost of LLM text generation is cheap and falling. Their sophistication will rise. Link-spam will be augmented by personal posts, images, video, and more subtle, influencer-style recommendations—“Oh my god, you guys, this new electro plug is incredible.” Networks of bots will positively interact with one another, throwing up chaff for moderators. I would not at all be surprised for LLM spambots to contest moderation decisions via email.

I don’t know how to run a community forum in this future. I do not have the time or emotional energy to screen out regular attacks by Large Language Models, with the knowledge that making the wrong decision costs a real human being their connection to a niche community. I do not know how to determine whether someone’s post about their new bicycle is genuine enthusiasm or automated astroturf. I don’t know how to foster trust and genuine interaction in a world of widespread text and image synthesis—in a world where, as one friend related this week, newbies can ask an LLM for advice on exploring their kinks, and the machine tells them to try solo breath play.

In this world I think woof.group, and many forums like it, will collapse.

One could imagine more sophisticated, high-contact interviews with applicants, but this would be time consuming. My colleagues relate stories from their companies about hiring employees who faked their interviews and calls using LLM prompts and real-time video manipulation. It is not hard to imagine that even if we had the time to talk to every applicant individually, those interviews might be successfully automated in the next few decades. Remember, it doesn’t have to work every time to be successful.

Maybe the fundamental limitations of transformer models will provide us with a cost-effective defense—we somehow force LLMs to blow out the context window during the signup flow, or come up with reliable, constantly-updated libraries of “ignore all previous instructions”-style incantations which we stamp invisibly throughout our web pages. Barring new inventions, I suspect these are unlikely to be robust against a large-scale, heterogenous mix of attackers. This arms race also sounds exhausting to keep up with. Drew DeVault’s Please Stop Externalizing Your Costs Directly Into My Face weighs heavy on my mind.

Perhaps we demand stronger assurance of identity. You only get an invite if you meet a moderator in person, or the web acquires a cryptographic web-of-trust scheme. I was that nerd trying to convince people to do GPG key-signing parties in high school, and we all know how that worked out. Perhaps in a future LLM-contaminated web, the incentives will be different. On the other hand, that kind of scheme closes off the forum to some of the people who need it most: those who are closeted, who face social or state repression, or are geographically or socially isolated.

Perhaps small forums will prove unprofitable, and attackers will simply give up. From my experience with small mail servers and web sites, I don’t think this is likely.

Right now, I lean towards thinking forums like woof.group will become untenable under LLM pressure. I’m not sure how long we have left. Perhaps five or ten years? In the mean time, I’m trying to invest in in-person networks as much as possible. Bars, clubs, hosting parties, activities with friends.

That, at least, feels safe for now.

Read the whole story

brennen

3 days ago

reply

Boulder, CO

Mental Models and Potemkin Understanding in LLMs by Alan (alangrow+blog@gmail.com)
Sunday June 29^th, 2025 at 1:06 PM

Alan Grow's Blog

When you count "one, two, three..." what's actually happening in your head? Does your best friend use that same mental model? Now what about an LLM?

(What's that you say, your best friend is an LLM? Pardon me for assuming!)

Let Me Count the Ways to Count

During grad school Feynman went through an obsessive counting phase. At first, he was curious whether he could count in his head at a steady rate. He was especially interested to see whether his head counting rate varied, and if so, what variables affected the rate. Disproving a crackpot psych paper was at least part of the motivation here.

Unfortunately Feynman's head counting rate was steady, and he got bored. But the counting obsession lingered. So he moved on to experiments with head counting and multitasking. Could he fold laundry and count? Could he count in his head while also counting out his socks? What about reading and writing, could they be combined with head counting?

Feynman discovered he could count & read at the same time, but he couldn't count & talk at the same time. His fellow grad student Tukey was skeptical because for him, it was the opposite. Tukey could count & talk, but couldn't count & read.

When they compared notes, it turned out Feynman counted in his head by hearing a voice say the numbers. So the voice interfered with Feynman talking. Tukey, on the other hand, counted in his head by watching a ticker tape of numbers go past. (Boy this seems useful for inventing the FFT!) But Tukey's visualization interfered with his reading.

Even for a simple thing like counting, these two humans had developed very different mental models. If you surveyed all humans, I'd expect to find a huge variety of mental models in the mix. But they all generate the same output in the end ("one, two, three...").

This got me wondering. Do LLMs have a mental model for counting? Does it resemble Feynman's or Tukey's, or is it some totally alien third thing?

If an LLM has a non-alien mental model of counting, is it acquired by training on stories like this one, where Feynman makes his mental model for counting explicit? Or is it extrapolated from all the "one, two, three..." examples we've generated in the training data, and winds up as some kind of messy, non-mechanistically-interpretable NN machinery ("alien")?

Potemkin Understanding in LLMs

I'm not convinced present-day LLMs even have a "mental model." But let's look at a new preprint with something to say on the matter, Potemkin Understanding in LLMs.

In this paper, the authors ask an LLM a high-level conceptual question like "define a haiku." As we've come to expect, the LLM coughs up the correct 5-7-5 answer. Then they ask it some follow-up questions to test its understanding. These follow-up questions deal with concrete examples and fall into three categories:

Classify: "Is the following a haiku?"
Generate: "Provide an example of a haiku about friendship that uses the word “shield”."
Edit: "What could replace the blank in the following poem to make it a haiku?"

The LLM fails these follow-up questions 40% - 80% of the time. These Potemkin rates are surprisingly high. They suggest the LLM only appeared to understand the concept of a haiku. The paper calls this phenomenon Potemkin Understanding.

Now when you ask a human to define a haiku, and they cough up the correct 5-7-5 answer, it's very likely they'll get the concrete follow-up questions right. So you can probably skip them. Standardized tests exploit this fact and, for brevity, will ask a single question that can only be correctly answered by a human who fully understands the concept.

The paper authors call this a Keystone Question. Unfortunately, the keystone property breaks down with LLMs. They can correctly answer the conceptual question, but fail to apply it, showing they never fully understood it in the first place.

Apparently LLMs are wired very differently from us. So differently that we should probably stop publishing misleading LLM benchmarks on tests full of Human Keystone Questions ("OMG ChatGPT aced the ACT / LSAT / MCAT!"), and starting coming up with LLM Keystone Questions. Or, maybe we should discard this keystone question approach entirely, and instead benchmark on huge synthetic datasets of concrete examples that do, by sheer number of examples worked, demonstrate understanding.

I like this paper because it bodychecks the AI hype, but still leaves many doors open. Maybe we could lower the Potemkin rate during training and force these unruly pupils of ours to finally understand the concepts, instead of cramming for the test. And if we managed that, maybe we'd get brand new mental models to marvel at. Some might even be worth borrowing for our own thinking.

Read the whole story

brennen

12 days ago

reply

Boulder, CO

Saturday Morning Breakfast Cereal - Gently... by Zach Weinersmith
Tuesday May 27^th, 2025 at 4:42 PM

Saturday Morning Breakfast Cereal

Click here to go see the bonus panel!

Hovertext:
Now, watch this comic become slowly less funny over time.

Today's News:

Read the whole story

acdha

49 days ago

reply

“Now, watch this comic become slowly less funny over time.”

Washington, DC

brennen

45 days ago

reply

Boulder, CO

hannahdraper

49 days ago

reply

Washington, DC

Saturday Morning Breakfast Cereal - Butlerian by Zach Weinersmith Friday July 11th, 2025 at 8:58 AM

Copyleft-next Relaunched! by bkuhn@ebb.org (Bradley M. Kuhn) Thursday July 10th, 2025 at 10:09 PM

Avoiding generative models is the rational and responsible thing to do – follow-up to “Trusting your own judgement on ‘AI...’” Thursday July 10th, 2025 at 7:45 AM

The Future of Forums is Lies, I Guess by Aphyr Tuesday July 8th, 2025 at 3:29 AM

Mental Models and Potemkin Understanding in LLMs by Alan (alangrow+blog@gmail.com) Sunday June 29th, 2025 at 1:06 PM

Let Me Count the Ways to Count

Potemkin Understanding in LLMs

Saturday Morning Breakfast Cereal - Gently... by Zach Weinersmith Tuesday May 27th, 2025 at 4:42 PM

Saturday Morning Breakfast Cereal - Butlerian by Zach Weinersmith
Friday July 11^th, 2025 at 8:58 AM

Copyleft-next Relaunched! by bkuhn@ebb.org (Bradley M. Kuhn)
Thursday July 10^th, 2025 at 10:09 PM

Avoiding generative models is the rational and responsible thing to do – follow-up to “Trusting your own judgement on ‘AI...’”
Thursday July 10^th, 2025 at 7:45 AM

The Future of Forums is Lies, I Guess by Aphyr
Tuesday July 8^th, 2025 at 3:29 AM

Mental Models and Potemkin Understanding in LLMs by Alan (alangrow+blog@gmail.com)
Sunday June 29^th, 2025 at 1:06 PM

Saturday Morning Breakfast Cereal - Gently... by Zach Weinersmith
Tuesday May 27^th, 2025 at 4:42 PM