department of hack
2028 stories
·
17 followers

The Future of Forums is Lies, I Guess

2 Shares

In my free time, I help run a small Mastodon server for roughly six hundred queer leatherfolk. When a new member signs up, we require them to write a short application—just a sentence or two. There’s a small text box in the signup form which says:

Please tell us a bit about yourself and your connection to queer leather/kink/BDSM. What kind of play or gear gets you going?

This serves a few purposes. First, it maintains community focus. Before this question, we were flooded with signups from straight, vanilla people who wandered in to the bar (so to speak), and that made things a little awkward. Second, the application establishes a baseline for people willing and able to read text. This helps in getting people to follow server policy and talk to moderators when needed. Finally, it is remarkably effective at keeping out spammers. In almost six years of operation, we’ve had only a handful of spam accounts.

I was talking about this with Erin Kissane last year, as she and Darius Kazemi conducted research for their report on Fediverse governance. We shared a fear that Large Language Models (LLMs) would lower the cost of sophisticated, automated spam and harassment campaigns against small servers like ours in ways we simply couldn’t defend against.

Anyway, here’s an application we got last week, for a user named mrfr:

Hi! I’m a queer person with a long-standing interest in the leather and kink community. I value consent, safety, and exploration, and I’m always looking to learn more and connect with others who share those principles. I’m especially drawn to power exchange dynamics and enjoy impact play, bondage, and classic leather gear.

On the surface, this is a great application. It mentions specific kinks, it uses actual sentences, and it touches on key community concepts like consent and power exchange. Saying “I’m a queer person” is a tad odd. Normally you’d be more specific, like “I’m a dyke” or “I’m a non-binary bootblack”, but the Zoomers do use this sort of phrasing. It does feel slightly LLM-flavored—something about the sentence structure and tone has just a touch of that soap-sheen to it—but that’s hardly definitive. Some of our applications from actual humans read just like this.

I approved the account. A few hours later, it posted this:

A screenshot of the account `mrfr`, posting "Graphene Battery Breakthroughs: What You Need to Know Now. A graphene battery is an advanced type of battery that incorporates graphene, a single layer of carbon atoms arranged in a two-dimensional honeycomb lattice. Known for its exceptional electrical conductivity, mechanical strength, and large surface area, graphene offers transformative potential in energy storage, particularly in enhancing the performance of lithium-ion and other types of battery, Get more info @ a marketresearchfuture URL

It turns out mrfr is short for Market Research Future, a company which produces reports about all kinds of things from batteries to interior design. They actually have phone numbers on their web site, so I called +44 1720 412 167 to ask if they were aware of the posts. It is remarkably fun to ask business people about their interest in queer BDSM—sometimes stigma works in your favor. I haven’t heard back yet, but I’m guessing they either conducting this spam campaign directly, or commissioned an SEO company which (perhaps without their knowledge) is doing it on their behalf.

Anyway, we’re not the only ones. There are also mrfr accounts purporting to be a weird car enthusiast, a like-minded individual, a bear into market research on interior design trends, and a green building market research enthusiast in DC, Maryland, or Virginia. Over on the seven-user loud.computer, mrfr applied with the text:

I’m a creative thinker who enjoys experimental art, internet culture, and unconventional digital spaces. I’d like to join loud.computer to connect with others who embrace weird, bold, and expressive online creativity, and to contribute to a community that values playfulness, individuality, and artistic freedom.

Over on ni.hil.ist, their mods rejected a similar application.

I’m drawn to communities that value critical thinking, irony, and a healthy dose of existential reflection. Ni.hil.ist seems like a space that resonates with that mindset. I’m interested in engaging with others who enjoy deep, sometimes dark, sometimes humorous discussions about society, technology, and meaning—or the lack thereof. Looking forward to contributing thoughtfully to the discourse.

These too have the sheen of LLM slop. Of course a human could be behind these accounts—doing some background research and writing out detailed, plausible applications. But this is expensive, and a quick glance at either of our sites would have told that person that we have small reach and active moderation: a poor combination for would-be spammers. The posts don’t read as human either: the 4bear posting, for instance, incorrectly summarizes a report on interior design markets as if it offered interior design tips.

I strongly suspect that Market Research Future, or a subcontractor, is conducting an automated spam campaign which uses a Large Language Model to evaluate a Mastodon instance, submit a plausible application for an account, and to post slop which links to Market Research Future reports.

In some sense, this is a wildly sophisticated attack. The state of NLP seven years ago would have made this sort of thing flatly impossible. It is now effective. There is no way for moderators to robustly deny these kinds of applications without also rejecting real human beings searching for community.

In another sense, this attack is remarkably naive. All the accounts are named mrfr, which made it easy for admins to informally chat and discover the coordinated nature of the attack. They all link to the same domain, which is easy to interpret as spam. They use Indian IPs, where few of our users are located; we could reluctantly geoblock India to reduce spam. These shortcomings are trivial to overcome, and I expect they have been already, or will be shortly.

A more critical weakness is that these accounts only posted obvious spam; they made no effort to build up a plausible persona. Generating plausible human posts is more difficult, but broadly feasible with current LLM technology. It is essentially impossible for human moderators to reliably distinguish between an autistic rope bunny (hi) whose special interest is battery technology, and an LLM spambot which posts about how much they love to be tied up, and also new trends in battery chemistry. These bots have been extant on Twitter and other large social networks for years; many Fediverse moderators believe only our relative obscurity has shielded us so far.

These attacks do not have to be reliable to be successful. They only need to work often enough to be cost-effective, and the cost of LLM text generation is cheap and falling. Their sophistication will rise. Link-spam will be augmented by personal posts, images, video, and more subtle, influencer-style recommendations—“Oh my god, you guys, this new electro plug is incredible.” Networks of bots will positively interact with one another, throwing up chaff for moderators. I would not at all be surprised for LLM spambots to contest moderation decisions via email.

I don’t know how to run a community forum in this future. I do not have the time or emotional energy to screen out regular attacks by Large Language Models, with the knowledge that making the wrong decision costs a real human being their connection to a niche community. I do not know how to determine whether someone’s post about their new bicycle is genuine enthusiasm or automated astroturf. I don’t know how to foster trust and genuine interaction in a world of widespread text and image synthesis—in a world where, as one friend related this week, newbies can ask an LLM for advice on exploring their kinks, and the machine tells them to try solo breath play.

In this world I think woof.group, and many forums like it, will collapse.

One could imagine more sophisticated, high-contact interviews with applicants, but this would be time consuming. My colleagues relate stories from their companies about hiring employees who faked their interviews and calls using LLM prompts and real-time video manipulation. It is not hard to imagine that even if we had the time to talk to every applicant individually, those interviews might be successfully automated in the next few decades. Remember, it doesn’t have to work every time to be successful.

Maybe the fundamental limitations of transformer models will provide us with a cost-effective defense—we somehow force LLMs to blow out the context window during the signup flow, or come up with reliable, constantly-updated libraries of “ignore all previous instructions”-style incantations which we stamp invisibly throughout our web pages. Barring new inventions, I suspect these are unlikely to be robust against a large-scale, heterogenous mix of attackers. This arms race also sounds exhausting to keep up with. Drew DeVault’s Please Stop Externalizing Your Costs Directly Into My Face weighs heavy on my mind.

Perhaps we demand stronger assurance of identity. You only get an invite if you meet a moderator in person, or the web acquires a cryptographic web-of-trust scheme. I was that nerd trying to convince people to do GPG key-signing parties in high school, and we all know how that worked out. Perhaps in a future LLM-contaminated web, the incentives will be different. On the other hand, that kind of scheme closes off the forum to some of the people who need it most: those who are closeted, who face social or state repression, or are geographically or socially isolated.

Perhaps small forums will prove unprofitable, and attackers will simply give up. From my experience with small mail servers and web sites, I don’t think this is likely.

Right now, I lean towards thinking forums like woof.group will become untenable under LLM pressure. I’m not sure how long we have left. Perhaps five or ten years? In the mean time, I’m trying to invest in in-person networks as much as possible. Bars, clubs, hosting parties, activities with friends.

That, at least, feels safe for now.

Read the whole story
brennen
13 minutes ago
reply
Boulder, CO
Share this story
Delete

Mental Models and Potemkin Understanding in LLMs

1 Share

When you count "one, two, three..." what's actually happening in your head? Does your best friend use that same mental model? Now what about an LLM?

(What's that you say, your best friend is an LLM? Pardon me for assuming!)

Let Me Count the Ways to Count

During grad school Feynman went through an obsessive counting phase. At first, he was curious whether he could count in his head at a steady rate. He was especially interested to see whether his head counting rate varied, and if so, what variables affected the rate. Disproving a crackpot psych paper was at least part of the motivation here.

Unfortunately Feynman's head counting rate was steady, and he got bored. But the counting obsession lingered. So he moved on to experiments with head counting and multitasking. Could he fold laundry and count? Could he count in his head while also counting out his socks? What about reading and writing, could they be combined with head counting?

Feynman discovered he could count & read at the same time, but he couldn't count & talk at the same time. His fellow grad student Tukey was skeptical because for him, it was the opposite. Tukey could count & talk, but couldn't count & read.

When they compared notes, it turned out Feynman counted in his head by hearing a voice say the numbers. So the voice interfered with Feynman talking. Tukey, on the other hand, counted in his head by watching a ticker tape of numbers go past. (Boy this seems useful for inventing the FFT!) But Tukey's visualization interfered with his reading.

Even for a simple thing like counting, these two humans had developed very different mental models. If you surveyed all humans, I'd expect to find a huge variety of mental models in the mix. But they all generate the same output in the end ("one, two, three...").

This got me wondering. Do LLMs have a mental model for counting? Does it resemble Feynman's or Tukey's, or is it some totally alien third thing?

If an LLM has a non-alien mental model of counting, is it acquired by training on stories like this one, where Feynman makes his mental model for counting explicit? Or is it extrapolated from all the "one, two, three..." examples we've generated in the training data, and winds up as some kind of messy, non-mechanistically-interpretable NN machinery ("alien")?

Potemkin Understanding in LLMs

I'm not convinced present-day LLMs even have a "mental model." But let's look at a new preprint with something to say on the matter, Potemkin Understanding in LLMs.

In this paper, the authors ask an LLM a high-level conceptual question like "define a haiku." As we've come to expect, the LLM coughs up the correct 5-7-5 answer. Then they ask it some follow-up questions to test its understanding. These follow-up questions deal with concrete examples and fall into three categories:

  1. Classify: "Is the following a haiku?"
  2. Generate: "Provide an example of a haiku about friendship that uses the word “shield”."
  3. Edit: "What could replace the blank in the following poem to make it a haiku?"

The LLM fails these follow-up questions 40% - 80% of the time. These Potemkin rates are surprisingly high. They suggest the LLM only appeared to understand the concept of a haiku. The paper calls this phenomenon Potemkin Understanding.

Now when you ask a human to define a haiku, and they cough up the correct 5-7-5 answer, it's very likely they'll get the concrete follow-up questions right. So you can probably skip them. Standardized tests exploit this fact and, for brevity, will ask a single question that can only be correctly answered by a human who fully understands the concept.

The paper authors call this a Keystone Question. Unfortunately, the keystone property breaks down with LLMs. They can correctly answer the conceptual question, but fail to apply it, showing they never fully understood it in the first place.

Apparently LLMs are wired very differently from us. So differently that we should probably stop publishing misleading LLM benchmarks on tests full of Human Keystone Questions ("OMG ChatGPT aced the ACT / LSAT / MCAT!"), and starting coming up with LLM Keystone Questions. Or, maybe we should discard this keystone question approach entirely, and instead benchmark on huge synthetic datasets of concrete examples that do, by sheer number of examples worked, demonstrate understanding.

I like this paper because it bodychecks the AI hype, but still leaves many doors open. Maybe we could lower the Potemkin rate during training and force these unruly pupils of ours to finally understand the concepts, instead of cramming for the test. And if we managed that, maybe we'd get brand new mental models to marvel at. Some might even be worth borrowing for our own thinking.

Read the whole story
brennen
8 days ago
reply
Boulder, CO
Share this story
Delete

Saturday Morning Breakfast Cereal - Gently...

1 Comment and 4 Shares


Click here to go see the bonus panel!

Hovertext:
Now, watch this comic become slowly less funny over time.


Today's News:
Read the whole story
acdha
45 days ago
reply
“Now, watch this comic become slowly less funny over time.”
Washington, DC
brennen
41 days ago
reply
Boulder, CO
hannahdraper
45 days ago
reply
Washington, DC
Share this story
Delete

Life in plastic is fantastic

1 Comment

byruit:

Life in plastic is fantastic

Read the whole story
brennen
63 days ago
reply
Notably better interface design than most new cars in 2025.
Boulder, CO
Share this story
Delete

Down

2 Shares

https://www.oglaf.com/down/

Read the whole story
brennen
63 days ago
reply
Boulder, CO
Share this story
Delete

PhD Timeline

3 Shares
Rümeysa Öztürk was grabbed off the street in my town one month ago.
Read the whole story
brennen
69 days ago
reply
Boulder, CO
acdha
73 days ago
reply
Washington, DC
Share this story
Delete
Next Page of Stories