Dissociated press

Dissociated press is an algorithm that takes an input text and produces "kind of relevant" humorous garbage that "logically" connects random parts of the input. The name was coined as an "anti-name" of the Associated Press.


The best-known implementation of this algorithm is built into Emacs. Although the algorithm itself is not complex and has been a running joke among generations of software engineers, relatively few other implementations exist. The algorithm was first published in HAKMEM[1]. It is part of hacker culture[2], and has been applied with considerable satirical effect to the utterances of Usenet flamers[3].

Features

The dissociated press algorithm can be run either at character level or at word level. When run at character level, it produces "new words" (the reader likely needs to be a native speaker of the language to appreciate them). Depending on the language, it may sometimes even produce existing dictionary words that were not present in the input text. When run at word level, it produces broken sentences that may take on a new and strange meaning, such as "topic was discussed at the level that may have serious consequences". The algorithm has overlap and continuation parameters that control how closely the generated text resembles the original input. Both character and word levels can be applied to text in any language. Because the algorithm makes random choices, repeated runs with the same input and parameters produce different output.

While amusing in its own right, the algorithm can also be used to generate large volumes of text that share many characteristics of real text. Such data are valuable as input for some performance or regression tests. For instance, testing the search performance of a database may give more realistic results if the keys are produced by this algorithm rather than being purely random.

Algorithm

At every step, the algorithm tries to find in the input text a sequence that matches the end of the output generated so far. If such a sequence is found, several words (or letters) that directly follow it at the found location are appended to the end of the output. This changes the end of the output, so the loop can be repeated to generate more "dissociated" text.
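The loop described above can be sketched in Python as follows. This is a minimal illustration, not the Emacs implementation: the function name `dissociate` and the parameter names `overlap`, `continuation`, `length` and `rng` are hypothetical, and restarting from a random position when no match has a continuation is an assumption made here to keep the loop going.

```python
import random


def dissociate(tokens, overlap=2, continuation=3, length=50, rng=None):
    """Return up to `length` tokens of dissociated output.

    `tokens` may be a list of words (word level) or a string
    (character level); both are handled as sequences of tokens.
    """
    if overlap < 1 or continuation < 1:
        raise ValueError("overlap and continuation must be at least 1")
    rng = rng or random.Random()
    tokens = list(tokens)
    if len(tokens) <= overlap:
        return tokens[:length]

    # Seed the output with a random fragment of the input.
    start = rng.randrange(len(tokens) - overlap + 1)
    output = tokens[start:start + overlap]

    while len(output) < length:
        tail = output[-overlap:]
        # Positions where the tail occurs with at least one token after it.
        matches = [i for i in range(len(tokens) - overlap)
                   if tokens[i:i + overlap] == tail]
        if not matches:
            # Assumption: on a dead end, jump to a random place in the input.
            start = rng.randrange(len(tokens) - overlap + 1)
            output.extend(tokens[start:start + overlap])
            continue
        i = rng.choice(matches)
        # Append the tokens that follow the chosen match.
        output.extend(tokens[i + overlap:i + overlap + continuation])

    return output[:length]
```

For word level, pass `text.split()` and join the result with spaces; for character level, pass the string itself and join the result with `""`. Supplying a seeded `random.Random` as `rng` makes a run repeatable, which may be convenient when the output is used as test data.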

The algorithm can also be applied to the text repeatedly. The final output may be screened by a human to pick the most interesting parts.

The length of the subsequence that must match and the length of the fragment that is copied afterwards are the two parameters that largely define the level of similarity between the input and the output. If these values are too large, the only possible output is identical to the input. However, even large values may occasionally join two sentences into something unexpected.

References

  1. HAKMEM (1972), item 176
  2. In the positive sense of the word "hacker"
  3. Dissociated press entry in The Jargon File