This is a work in progress. It is not expected to work properly. Future enhancements include:
More than anything else, what I need right now to improve this is real data to work from. If you've ever written a drunken e-mail to your ex-girl/boyfriend, or better yet, have received one, send it my way. I'll only use it for good, I promise...mostly.
Much less work went into this one (which is somewhat depressing because people seem to think this one works much better). The first pass in this algorithm (and the most important, in my opinion) is a stochastic context-free grammar: a context-free grammar with a probability attached to each production. For example:
| *Time* | 0.4 | → last week. |
| | 0.35 | → a couple weeks after I saw you at the *Place* |
| | 0.25 | → a month or two ago. |
That is an actual rule from the grammar I use (non-terminals are in italics). Note that the probabilities add up to 1.0 (as they should). Note also that the second production references another non-terminal, Place (as context-free grammars often do): whenever I want a place name, I just use that non-terminal. To generate an e-mail, I use these probabilities and a random number generator to pick one production each time a non-terminal needs expanding. Hooray.
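As a rough illustration of that generation step, here is a minimal Python sketch. It is not the code behind this page: the `GRAMMAR` layout, the `expand` helper, and the two Place productions are all made up for the example, and only the Time rule comes from the table above.

```python
import random

# Hypothetical grammar layout: each non-terminal maps to a list of
# (probability, production) pairs. A production is a list of symbols;
# any symbol that is not a key in GRAMMAR is treated as literal text.
GRAMMAR = {
    # This rule is the one shown in the table above.
    "Time": [
        (0.40, ["last week."]),
        (0.35, ["a couple weeks after I saw you at the", "Place"]),
        (0.25, ["a month or two ago."]),
    ],
    # Invented for illustration; the real Place rule is not shown.
    "Place": [
        (0.5, ["bar."]),
        (0.5, ["party."]),
    ],
}


def expand(symbol, rng):
    """Recursively expand a symbol into text by sampling productions."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    probabilities = [p for p, _ in GRAMMAR[symbol]]
    productions = [body for _, body in GRAMMAR[symbol]]
    # Pick one production, weighted by its probability.
    chosen = rng.choices(productions, weights=probabilities, k=1)[0]
    return " ".join(expand(s, rng) for s in chosen)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(expand("Time", rng))
```

Passing the random number generator in explicitly makes it possible to seed it and reproduce a given e-mail, which helps when tuning the probabilities.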
From here there are four somewhat less interesting "drunk" filters that get applied:
Finding suitable probabilities for each of these four filters was the hardest part and took a long time. Being out by even half a percentage point can have huge implications for how the e-mail comes out.