September 30, 2015 7:50 AM

TauntBot is an exercise in generated language. It uses a rather huge and growing corpus of hand-selected words, and an interesting set of rules for assembling them into verbose insults, which it then posts to Twitter. It also replies to people; I'm hoping to get it to trade barbs with them. Yes, I'm a terrible person. Now, with that out of the way...

Of course, most of these are currently directed at Martin Shkreli, who raised the price of an AIDS drug 5000%, and Donald Trump, but it'll reply to anyone who mentions it. It also spews random insults 15 minutes a day at the reader and humans in general.

The thing I've found most interesting in developing this is forcing myself to discover the structure of my native language as a complete amateur. Eventually I would like to use machine learning techniques to discover these structures, but doing it myself is more educational (at least to me).

It gets the language right the vast majority of the time, but even when it messes up it's still not terrible. English is an amazingly inconsistent language.

The structure of its sentences is generated according to random weights, but the current structure is limited to declaring that the subject is one or more qualified objects, or adjectives.
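
To make the idea concrete, here is a minimal sketch of weighted structure selection. It is in Python for brevity and is not TauntBot's actual code: the word lists, the weights, and the function names (`article`, `qualified_object`, `join_list`, `insult`) are all assumptions made up for illustration.

```python
import random

# Hypothetical vocabulary; the real corpus is hand-curated and much larger.
ADJECTIVES = ["pathetic", "croaking", "selfish", "appalling"]
NOUNS = ["slug", "dolt", "backside", "macaw"]


def article(phrase):
    """Pick 'a' or 'an' from the first letter (a rough rule English often breaks)."""
    return "an" if phrase[0] in "aeiou" else "a"


def qualified_object(rng):
    """A noun preceded by zero, one, or two adjectives, chosen by weight."""
    count = rng.choices([0, 1, 2], weights=[1, 3, 2])[0]
    adjectives = rng.sample(ADJECTIVES, count)
    noun = rng.choice(NOUNS)
    phrase = (", ".join(adjectives) + " " + noun) if adjectives else noun
    return article(phrase) + " " + phrase


def join_list(items):
    """Join items as 'x', 'x and y', or 'x, y and z'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def insult(rng):
    """Declare that the subject is one or more qualified objects, or adjectives."""
    structure = rng.choices(["objects", "adjectives"], weights=[3, 1])[0]
    count = rng.choices([1, 2, 3], weights=[2, 3, 1])[0]
    if structure == "objects":
        parts = [qualified_object(rng) for _ in range(count)]
    else:
        parts = rng.sample(ADJECTIVES, count)
    return "You are " + join_list(parts) + "."


print(insult(random.Random()))
```

Tuning the weights changes how often each shape appears without touching the grammar code, which is presumably the appeal of doing it this way.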

I could use some suggestions for coming up with more interesting insult structures from any interested language enthusiasts here. :-)
Role: programmer
posted by ephemerae (20 comments total) 3 users marked this as a favorite
This project was posted to MetaFilter by feckless fecal fear mongering on October 4, 2015: Go away, or I shall taunt you a second time!

I forgot to mention that TauntBot's vocabulary is pretty egalitarian: it contains no words which to my knowledge are racist, sexist, homophobic, or religion-baiting. Whenever I spot any, I remove them immediately.
posted by ephemerae at 8:19 AM on September 30, 2015 [2 favorites]

Pretty solid. It said I was "a ceaseless, large slug, a pathetic, incorrect dolt and an appalling clusterfuck." The first part was actually kinda complimentary, though.
posted by ignignokt at 9:57 AM on September 30, 2015 [1 favorite]

Re: egalitarianness: It is generally good at that, but it's probably not a good idea to call people "doughy."
posted by ignignokt at 10:05 AM on September 30, 2015

Oh my! This is exactly what my day needed.

> Everyone says you are a desolate, croaking backside and an infuriatingly selfish car accident.

Instead of hiding under my desk and hissing at anyone who tries to talk to me, maybe I will just be friends with this bot. Or hide under my desk and show its tweets to anyone who tries to talk to me.
posted by duien at 11:42 AM on September 30, 2015 [3 favorites]

ignignokt, thanks for the heads up. I'll remove that word and its synonyms. (Won't take effect 'til tonight though).
posted by ephemerae at 12:25 PM on September 30, 2015 [1 favorite]

> Of course, most of these are currently directed at Martin Shkreli […]
I get what you're doing here, and you can try the activism (or whatever) argument if you want, but know that doing this is a good way to get your account disabled/banned, righteous or not. Shkreli's got enough actual people going at him that it's debatable he'd even notice enough to act, but you probably shouldn't make a habit of it. Besides the TOS implications, unsolicited @mentioning is also frowned upon as a general bot-making practice.

I do like the core of it, though.
posted by Su at 1:10 AM on October 2, 2015

A good alternative is to just mention him as "Martin Shkreli." People often make bots that quote users they find from the Twitter firehose, and they usually reference them without the "@" to avoid bothering them.
posted by ignignokt at 3:56 AM on October 2, 2015

Ooh, good points Su and ignignokt. Deployed a fix. Thanks!

It's definitely getting better at forming interesting sentences.
posted by ephemerae at 7:46 AM on October 2, 2015 [1 favorite]


> We know you are a houseful of arse tears and a large number of exorbitant, cheesy felons.

We had a cheesy-shells-based dinner so when I showed this to my very-full husband he warned me not to poke his houseful of exorbitant, cheesy felons or he'd cry arse tears on me.
posted by bookdragoness at 9:05 PM on October 3, 2015 [2 favorites]

It's interesting: now that people are starting to play with it, the grammar formation bugs are becoming more apparent. Those will be fixed as I get to them. It is gratifying that folks think this idea is interesting enough to play with. You have my thanks.

In other news, its vocabulary is still very embryonic. Apart from some initial seeding of words I could think of after a mug of stout on a Thursday night, it is taking me some time to curate words from the dictionary, and I'm only up to the start of the J words. I'm looking forward to seeing what it can do armed with a full arsenal of pointy words.

After that we can start doing smarter conversions between word types, exceptions to rules, synonyms, antonyms, classifications, interesting sentence structures, negations, severity measurement etc. I should really just do machine learning but the algorithmic/probabilistic approach is turning out to be quite fun.

One person on the front page post asked, "How could you hook this thing up to Eliza?" I suppose we could, but I'm hoping Twitter can serve as the conversational medium instead, for getting into an argument with someone silly enough to believe they're interacting with a human. Something Eliza-like that tries to glean meaning from human replies could be good though. Even if it's just reacting to keywords, it might be good enough to either be interesting or fool someone.
posted by ephemerae at 10:37 PM on October 4, 2015

NoraReed retweeted this useful list of non-ableist slurs (which I can only assume is to be used for bot making).
posted by Just this guy, y'know at 8:40 AM on October 5, 2015 [1 favorite]

Now supported: alliteration and non-repetitive word selection. Also removed a bunch of ableist terms.
posted by ephemerae at 10:25 AM on October 6, 2015 [1 favorite]

Now supported: definite articles and titles, so you can say things like "the queen of a swimming pool of ninnies."

This is WAY too much fun.
posted by ephemerae at 11:39 PM on October 13, 2015

The problem is, once you start, you just keep going.
I have two bots currently running and another being written, mainly because I'm experimenting with a ridiculously insane structure with experimental function calls written in Tracery and other stuff.
Also I had an idea for a fourth this morning.

I don't think anyone ever stops at just one bot, do they?
posted by Just this guy, y'know at 3:35 AM on October 14, 2015 [1 favorite]

posted by EndsOfInvention at 11:41 AM on October 17, 2015 [1 favorite]

OK, since our last episode, TauntBot has gained some new abilities:

* The ability to assign a lack of a good thing, e.g. Martin has no integrity.
* Possessives: Your dog is a flatulent bore and a ninny.
* Tense awareness: past, present, and future. Your father had no reason to exist, unfortunately eats mud, and will eat bacteria.
* Plural subjects. Your friends are viruses and have no life.
* Constructing multiple-sentence insults, switching subject appropriately. The NRA is a bunch of fetid gibbons. They will be noisy.
* Titles. Donald is the king of a world of retching donkeys.
* Time constraints. You licked a bucket of toads on the way to work, and gobbled a plethora of stench while having sex.
* Parting shots. You are putrid, you steaming macaw.
* Pre-Subject Accusations. Total git Donald Trump is quite revolting.
* Only compare the subject to things which are not groups of people, e.g. nothing like this: Fred is a bag of gluttonous thieves. But this would be OK: Fred is a garrulous sack of frogs.

Way too much fun. Ideas for what to do next:

* ELIZA-style assembling of retorts using words the target has used in conversations.
* Awareness of depth of conversation, so that it could say things like "that all you got?" when a target responds a second time.
* Alternatives. Either you're a donkey or a festering boil. I can't decide.
* Questions. Your father was an orangutan? That explains why you are smelly.
* Other joinings such as: If you're so $good_adjective, why do you $verb_phrase?
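
The `$slot` template in the last idea can be sketched very simply. This is a Python illustration, not the bot's code: it leans on the standard library's `string.Template`, which happens to use the same `$name` placeholder syntax, and the `SLOTS` table and `fill` function are hypothetical names with made-up words.

```python
import random
from string import Template

# Hypothetical slot fillers, keyed by the placeholder name used in templates.
SLOTS = {
    "good_adjective": ["clever", "brave", "popular"],
    "verb_phrase": ["eat mud", "lick toads", "shout at clouds"],
}


def fill(template_text, rng):
    """Replace each $slot in the template with a random entry of that type."""
    chosen = {name: rng.choice(words) for name, words in SLOTS.items()}
    # substitute() raises KeyError if the template names a slot we don't have,
    # which is a useful way to catch typos in new templates.
    return Template(template_text).substitute(chosen)


print(fill("If you're so $good_adjective, why do you $verb_phrase?", random.Random()))
```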

People are not interacting with it much yet, though it has managed to fool some white supremacists and homophobes, so that's interesting.
posted by ephemerae at 1:51 PM on October 25, 2015 [1 favorite]

I'm not sure your last link is correct...

Also, what is this written in? (Did you say and I missed it?)
I was working on one using CBDQ / Tracery, but needed a bit more state intelligence (it's planned as a narrative, so it needs to know the time, remember previous actions, etc.), so I've started moving the existing Tracery over to Node.js.
I don't have the spare time for it at the moment though so it's going slowly.
posted by Just this guy, y'know at 3:35 PM on October 25, 2015

Oops, I meant this link for the run-in with a homophobe. I'm astonished the conversation lasted that long.

This is written in Java and runs in an embedded Jetty server inside a Docker container on DigitalOcean. It does the usual thing of spawning off a bunch of threads for the background tasks it needs to get done: maintaining a queue of stuff it wants to post to Twitter (it spaces the posts out so it doesn't hit the API rate limit), checking for mentions and assembling replies, maintaining the word selection queues*, managing exponential backoff for replies (to avoid spam by other bots; this has already happened!), and blurting out random barbs at regular intervals.
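
For anyone curious what exponential backoff for replies might look like, here is a rough Python sketch rather than the actual Java. The class name `ReplyBackoff`, the 60-second base delay, and the one-day cap are all assumptions; the idea is only that each reply to the same account doubles the wait before the next one is allowed.

```python
class ReplyBackoff:
    """Tracks, per account, the earliest time the bot may reply again."""

    def __init__(self, base_seconds=60.0, max_seconds=86400.0):
        self.base_seconds = base_seconds
        self.max_seconds = max_seconds
        self._reply_counts = {}   # account -> replies sent so far
        self._next_allowed = {}   # account -> timestamp of next permitted reply

    def may_reply(self, account, now):
        """True if enough time has passed since the last reply to this account."""
        return now >= self._next_allowed.get(account, 0.0)

    def record_reply(self, account, now):
        """Note a reply and return the delay imposed before the next one."""
        count = self._reply_counts.get(account, 0)
        # The delay doubles with every reply, up to the cap.
        delay = min(self.base_seconds * (2 ** count), self.max_seconds)
        self._reply_counts[account] = count + 1
        self._next_allowed[account] = now + delay
        return delay
```

A chatty bot on the other end quickly finds itself waiting hours between replies, while a human who mentions the bot once or twice barely notices.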

It doesn't actually serve any web content itself, but I was considering opening up an API for requesting insults or voting on villains du jour. Ideas welcome here.

It's at the point where the complexity warrants conversion to Scala. There's a class for each kind of grammatical structure in use, and these are composed into sentences in a top-down manner. It works well enough, but the classes' constructors have lots of options now (I'm all about the immutability), so I made builders, which are just verbose. Scala's case classes will kill two birds with one stone there. The type system is very useful for knowing what's what; I think I'd go mad if I had to write this in Node or Ruby or whatever. More power to you if you can do that :)

[*] The word selection is only semi-random to avoid repetition. When a word is used it is sent to the back of the queue for that word type, and only words in the front half of the queue are selected.
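
The queue rule in this footnote can be sketched in a few lines. This is Python for illustration, not the Java in the bot, and the `WordQueue` class and its method name are hypothetical:

```python
import random
from collections import deque


class WordQueue:
    """Semi-random selection: pick from the front half, then move the pick to the back."""

    def __init__(self, words, rng=None):
        if not words:
            raise ValueError("need at least one word")
        self._queue = deque(words)
        self._rng = rng or random.Random()

    def next_word(self):
        # Only the front half is eligible, so recently used words
        # (which sit at the back) rest for a while before coming up again.
        front = max(1, len(self._queue) // 2)
        index = self._rng.randrange(front)
        word = self._queue[index]
        del self._queue[index]
        self._queue.append(word)
        return word
```

With one such queue per word type, a word that was just used cannot come up again until enough other picks have moved it back into the front half.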
posted by ephemerae at 4:57 PM on October 25, 2015 [2 favorites]

Thanks for the link.
I love that he's trying to sealion a bot. It's sublime!

Also, woo, good answer. There are some words in there I have to go and look up!

I don't really know Node very well, but I have a few projects (work related and personal) that I think would benefit from it, hence starting out with a simple bot and adding complexity as I went on. (This might end badly!)
posted by Just this guy, y'know at 3:41 AM on October 26, 2015 [1 favorite]

Now with images of cats. OK, admittedly just grumpy cat so far. NEED MOAR CATS PLZ.

It was pretty fun picking apart the Java image and font APIs to make this work.

(HT to Smart Dalek for the idea.)
posted by ephemerae at 3:47 PM on November 8, 2015 [1 favorite]
