In the three years since John F. Burrows presented Delta, his new measure of authorial difference, in his Busa Award lecture (2001), there has been a flurry of activity in the authorship attribution community and beyond. Delta measures the difference between a test text and a set of texts by possible authors in an elegantly simple way. The frequencies of the most frequent words in the test text and in the texts by each author in the primary set are compared with the mean frequencies of those words in the primary set, and each difference from the mean is expressed as a z-score. The absolute differences between the test text's z-scores and each primary author's z-scores are then summed over all the words and divided by the number of words, producing Delta, "the mean of the absolute differences between the z-scores for a set of word-variables in a given text-group and the z-scores for the same set of word-variables in a target text" (Burrows 2002a, 271). The primary author whose texts show the smallest Delta, that is, the smallest mean difference from the test text, has the best claim to being the author of the test text.
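The calculation just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Burrows's or Hoover's actual implementation (Hoover's runs in an Excel spreadsheet with macros): the function names, the whitespace tokenizer, the pooling of each author's texts into one sample, and the use of the sample standard deviation are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, stdev


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def rel_freqs(tokens, words):
    # Relative frequency of each listed word in one sample.
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in words]


def delta(test_text, primary, n_words=150):
    """Return {author: Delta} for test_text against each author in primary.

    primary maps an author name to that author's pooled text (an assumption);
    it needs at least two authors so that a standard deviation exists.
    """
    samples = {author: tokenize(text) for author, text in primary.items()}

    # The word list: the n_words most frequent words of the primary set.
    pooled = Counter(tok for toks in samples.values() for tok in toks)
    words = [w for w, _ in pooled.most_common(n_words)]

    # Mean and standard deviation of each word's frequency across the primary set.
    freqs = {author: rel_freqs(toks, words) for author, toks in samples.items()}
    columns = list(zip(*freqs.values()))
    mu = [mean(col) for col in columns]
    sd = [stdev(col) for col in columns]
    # Words with no variation cannot be z-scored, so they are skipped.
    keep = [i for i, s in enumerate(sd) if s > 0]

    def z_scores(vec):
        return [(vec[i] - mu[i]) / sd[i] for i in keep]

    z_test = z_scores(rel_freqs(tokenize(test_text), words))
    # Delta: mean absolute difference between the two sets of z-scores.
    return {
        author: mean([abs(a - b) for a, b in zip(z_test, z_scores(vec))])
        for author, vec in freqs.items()
    }
```

Under these assumptions, the likeliest author is simply the key with the smallest value, for example `min(scores, key=scores.get)` where `scores = delta(test_text, primary)`.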
Burrows has published two articles demonstrating the effectiveness of Delta on Restoration poetry, even for small texts (2002a, 2003), and has applied the technique to the interplay between translation and authorship in "The Englishing of Juvenal: Computational Stylistics and Translated Texts" (2002b). David L. Hoover has just published two studies involving Delta (2004a, 2004b) that automate the process of calculating and evaluating the results of Delta in an Excel spreadsheet with macros. Hoover's first article demonstrates Delta's effectiveness on early 20th-century novels, and shows that increasing the number of frequent words analyzed far beyond the 150 most frequent that Burrows uses (to the 700 or 800 most frequent) substantially improves the results, as does the removal of personal pronouns and of words that are frequent in the entire corpus only because they are extremely frequent in a single text. It also shows that large drops in Delta from the first to the second likeliest author are strongly associated with correct attributions. The second article shows that the accuracy of attribution by Delta can be improved by selecting subsets of the word frequency list for analysis and by changing the formula of Delta itself, and it extends the testing of the measures to contemporary literary criticism, where they continue to perform very well. These new methods recapture information about whether a word is more or less frequent than the mean, about how different the test text is from the mean, about the size of the absolute difference between the test text and each primary text, and about the direction of the difference between the test text and the primary text.
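The word-list refinements and the first-to-second comparison mentioned above can be sketched roughly as follows. Everything specific here is a labeled assumption rather than Hoover's procedure: the pronoun list is abbreviated and illustrative, the `max_share` threshold of 0.7 is a hypothetical cutoff for "frequent only because of a single text", and `delta_gap` is just one simple way to express the distance between the two likeliest authors.

```python
from collections import Counter

# Assumption: an abbreviated, illustrative list of personal pronouns.
PRONOUNS = {
    "i", "me", "my", "mine", "we", "us", "our", "you", "your",
    "he", "him", "his", "she", "her", "it", "its", "they", "them", "their",
}


def culled_word_list(texts, n_words=800, max_share=0.7, stop=PRONOUNS):
    """Return up to n_words frequent words after culling.

    texts maps a text label to its content. A word is dropped if it is in
    stop, or if one text supplies more than max_share of all its occurrences
    (max_share is a hypothetical threshold chosen for this sketch).
    """
    per_text = [Counter(t.lower().split()) for t in texts.values()]
    total = Counter()
    for counts in per_text:
        total.update(counts)

    words = []
    for word, n in total.most_common():
        if word in stop:
            continue
        if max(counts[word] for counts in per_text) / n > max_share:
            continue
        words.append(word)
        if len(words) == n_words:
            break
    return words


def delta_gap(scores):
    # Distance between the lowest and second-lowest Delta;
    # scores maps author -> Delta and needs at least two entries.
    ranked = sorted(scores.values())
    return ranked[1] - ranked[0]
```

A wide gap between the two lowest scores corresponds to the large drop in Delta that the first article associates with correct attributions; how large a gap should count as decisive is not specified here.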
Although Burrows's Delta is simple and intuitively reasonable, it lacks any compelling theoretical justification, as do previous statistical authorship attribution techniques and Hoover's alterations. Nonetheless, it and some of the variations upon it are manifestly and surprisingly effective, even in difficult open authorship attribution situations in which the claimants cannot be limited to a small number by traditional means. Several researchers have further studies underway that are not yet ready for public discussion, involving a 'real life' attribution problem in 19th-century prose, another on a Middle English saint's life, and an application of the technique and its variants to biology.
In this paper I investigate the effectiveness of Delta and Hoover's various Delta Prime candidates on a corpus of 1,430,000 words of Modern American Poetry by poets born between 1902 and 1943. This investigation returns to poetry but brings the techniques forward to the 20th century. Although it is well known that changes in language and style across long spans of time are very considerable, and that many authorship attribution techniques are sensitive to these differences, preliminary results show that Delta and the various Delta Primes are even more accurate on the corpus investigated here than on the Restoration poetry that Burrows investigated. They are so accurate, in fact, that the differences between the original Delta and the alternatives are relatively small (it is difficult to improve much on 100% accuracy). These results may be related to a greater individuality of poetic style in modern poetry, with some poets using rhyme and meter and others working in much looser forms, and to the presence of dialect. Whatever the cause, they further demonstrate the robustness of the techniques, which have now been tested on two corpora of poetry written nearly 300 years apart, on novels from 1900, and on contemporary literary criticism. Further tests on contemporary prose and on texts tagged for part of speech are ongoing, not so much in an attempt to confirm further the effectiveness and reliability of Delta and Delta Prime, which now seem very solidly validated, as in the hope of understanding more fully why these relatively simple techniques work so well, and of continuing to improve their already impressive power.