This paper discusses the controversy over the authorship of twelve
of the "Federalist" papers as seen and studied by over twenty non-traditional authorship attribution practitioners. The "Federalist"
papers were written during the years 1787 and 1788 by Alexander
Hamilton, John Jay, and James Madison. These 85 propaganda tracts
were intended to help get the U.S. Constitution ratified. They
were all published under the pseudonym "Publius." The
general consensus of traditional attribution scholars (although
varying from time to time) is that Hamilton wrote 51 of the papers,
Madison wrote 14, Jay wrote 5, while 3 papers were written jointly
by Hamilton and Madison, and 12 papers have disputed authorship —
either Hamilton or Madison.
In 1964, Frederick Mosteller and David Wallace, building on the
earlier unpublished work of Frederick Williams and Frederick
Mosteller, published their non-traditional authorship attribution
study, "Inference and Disputed Authorship: The Federalist." It is
arguably the most famous and well-respected of all the
non-traditional attribution studies. It is the most statistically
sophisticated non-traditional study ever carried out. There has
even been a 40-page paper explicating the statistical techniques of
the Mosteller and Wallace study (Francis). Since then, hundreds
of papers have cited the Mosteller and Wallace work, and over two
dozen non-traditional attribution practitioners have analyzed
and/or conducted variations of the original study.
These practitioners wanted to test their statistical approaches
against the Mosteller and Wallace touchstone study. Mosteller and
Wallace set the boundary conditions for the subsequent work (e.g.,
not using the Jay articles as a control). Their experimental design
and overall report are never questioned. Most of these later
practitioners do not select or prepare the input text as rigorously
as Mosteller and Wallace, whose own selection and preparation
were not as rigorous and complete as they should have been.
This section discusses the way the Federalist papers were
originally published (76 in newspapers and 8 in the book
compilation) and which editions the practitioners chose
for their non-traditional studies — how 84 papers became
85 and how some papers had different numbers in different
editions. The effect that the lack of Hamilton and Madison
holographs had on the studies is discussed. The choice of
edition can profoundly change the results
of the studies.
Project Gutenberg Etexts are usually created
from multiple editions, all of which are in the
Public Domain in the United States, unless a
copyright notice is included. Therefore we do NOT
keep these books in compliance with any particular
paper edition, usually otherwise.
(Front Material of Gutenberg Etext #1404)
The compounding problem of downloading texts via the Internet
is explicated (e.g., one of the texts includes every variant
of every paragraph). It is shown why none of the Federalist
studies used a 'valid' text of the Federalist papers. The
question, "Does this incorrect input data invalidate the final
'answer'?" is discussed.
The Hamilton sample cannot contain questionable Hamilton
writings. It must also fulfill the other criteria of a
valid sample (e.g., same genre, same constricted time
frame). A subset of this sample should also be set
aside for later analysis in order to guard against the
charge of cherry-picking the style-markers. This is not
the same as the Mosteller and Wallace "training sample."
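
The idea of setting a subset aside can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions, not a procedure taken from any of the studies: the
function name split_holdout, the 25% holdout fraction, the seed,
and the identifiers H01..H51 (standing in for the 51 papers
attributed to Hamilton) are all hypothetical.

```python
import random


def split_holdout(paper_ids, holdout_fraction=0.25, seed=1787):
    """Split known-author papers into an exploration set and a sealed holdout set.

    Style-markers would be chosen on the exploration set only; the holdout
    set is examined later, once, to check that the markers still work.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    ids = sorted(paper_ids)        # sort first so input order does not matter
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_holdout = max(1, round(len(ids) * holdout_fraction))
    holdout = sorted(ids[:n_holdout])
    exploration = sorted(ids[n_holdout:])
    return exploration, holdout


# Hypothetical identifiers, not real paper numbers.
hamilton_ids = [f"H{i:02d}" for i in range(1, 52)]
exploration, holdout = split_holdout(hamilton_ids)
print(len(exploration), "papers for choosing style-markers")
print(len(holdout), "papers held back for later analysis")
```

Recording the seed and the resulting lists is what would let a
later practitioner confirm that the held-back papers played no
part in selecting the style-markers.
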
This section discusses the way the Madison sample was
constructed and applies to it what was said about the
Hamilton sample.
Does the lopsided number of Hamilton papers over Madison
papers (51 to 14) pose a problem for the studies? Were the
Hamilton and Madison control texts from outside the Federalist
papers chosen correctly? Why are these "outside" controls not
used by most of the other practitioners? This section goes on
to discuss the control problems that arose with the Mosteller
and Wallace study and have been perpetuated through the
subsequent studies. This section also discusses the other
control problems introduced in these studies.
The cumulative effect of NEARLY A THOUSAND
SMALL CHANGES [emphasis mine] has been to
improve the clarity and readability of the
text without changing its original argument.
(Scigliano, lii)
In the Mosteller and Wallace study, a "little book of decisions"
is mentioned. This "book," originally constructed by Williams and
Mosteller, contained an extensive list of items that Mosteller
and Wallace unedited, de-edited, and edited before beginning the
statistical analysis of the texts — items such as quotations and
numerals. Unfortunately, neither Williams and Mosteller nor Mosteller
and Wallace published the contents of this "little book of decisions,"
and the published work mentions only five of their many decisions
[Mosteller and Wallace 7, 16, 38-41]. The little book has been lost
and cannot be recovered or even reconstructed [Mosteller]. This paper
goes on to discuss the many ramifications of the "little book" for
their study and the subsequent studies. It also discusses how the
loss of the "little book" casts a shadow of "scientific invalidity"
over the Mosteller and Wallace work (i.e., it cannot be replicated).
Their "little book" was not used by any of the following studies,
which rules out meaningful comparisons.
This section goes on to list many of the unediting, de-editing,
and editing items that need to be considered. It lists several of
the mistakes made by the many practitioners and what these mistakes
mean for the validity of the studies. For example:
- Wrong letters
- Quotes — e.g., 131 words of Federalist 5 are a quote from
Queen Anne, 334 words of Federalist 9 are a quote from
Montesquieu
- Footnotes — the author's and the editors'
- Numbers
- Foreign languages
- Spelling
- Homographic forms
- Contracted forms
- Hyphenation
- Word determination
- Disambiguation
- Editorial intervention — internal (e.g., Hamilton on
Madison) and external (e.g., from the first newspaper
copy editor to present-day editors)
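
How much such decisions matter can be seen in a small Python
sketch. It is an illustration only and does not reproduce the
"little book" or any practitioner's procedure: the passage is
invented, the function names are hypothetical, and the measure
(occurrences of one marker word per thousand words) is assumed
here simply as a typical function-word statistic.

```python
import re


def tokenize(text):
    """Lowercase word tokens; apostrophes and internal hyphens stay inside a token."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def prepare(text, drop_quotes=False, drop_notes=False, join_hyphens=False):
    """Apply explicit, recorded preparation decisions to a raw text."""
    if join_hyphens:
        # Rejoin words split across a line break: "con-\nstitution" -> "constitution"
        text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    if drop_quotes:
        # Remove passages in double quotation marks (words not written by the author)
        text = re.sub(r'"[^"]*"', " ", text)
    if drop_notes:
        # Remove bracketed notes, standing in here for footnotes
        text = re.sub(r"\[[^\]]*\]", " ", text)
    return text


def rate_per_thousand(text, word):
    """Occurrences of `word` per thousand tokens of `text`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000.0 * tokens.count(word) / len(tokens)


# Invented passage, not a quotation from any Federalist paper.
sample = (
    "The people rely upon the con-\nstitution. "
    'He said "upon my word, upon my honor" and left. '
    "[Note: upon is common.]"
)

raw_rate = rate_per_thousand(sample, "upon")
prepared = prepare(sample, drop_quotes=True, drop_notes=True, join_hyphens=True)
prepared_rate = rate_per_thousand(prepared, "upon")
print(f"rate of 'upon' before preparation: {raw_rate:.1f} per thousand")
print(f"rate of 'upon' after preparation:  {prepared_rate:.1f} per thousand")
```

Each keyword argument corresponds to one recorded decision, so
two practitioners who set the arguments differently are in
effect analyzing different texts.
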
Are practitioners (statisticians and non-statisticians) so
blinded by the statistical sophistication that the other
elements of a valid non-traditional authorship study are
ignored?
Do professional historians accept, deny, or show indifference
to the body of work that supports the Mosteller and Wallace
study? Why did I spend hours searching for a Mosteller and
LAWRENCE study of the Federalist papers?
Is the case put forth by Mosteller
and Wallace and buttressed by the other non-traditional
practitioners nothing but a "Monument" built on sand? What
effect does showing the flaws in the Federalist studies have
on non-traditional studies in general — i.e., if the best is
suspect, what about the rest?