April 7, 1991

For three years a Shakespeare Clinic of Claremont Colleges undergraduates has been using computers to see which of 58 claimed "true authors" of Shakespeare's poems and plays actually matched Shakespeare's style. The focus in the Clinic's final year was on 27 poet-claimants, [note 1] using both a new, modal test and a battery of more conventional tests. None of the poets tested matched Shakespeare. Walter Raleigh, the closest to Shakespeare by modal test, was 2.4 standard errors distant from Shakespeare's mean modal score, with not much better than a two percent chance of common authorship. [note 2] [note 3] John Donne, the most distant claimant, was 36.6 standard errors distant from Shakespeare. None of the three "leading" candidates with organised followings today -- Francis Bacon, Christopher Marlowe, and Edward de Vere, 17th Earl of Oxford -- came out anywhere near Shakespeare. This paper concentrates on Oxford, the candidate with the largest following and the largest body of recent supporting literature. [note 4]

Modal testing divides a text into blocks, counts 52 keywords in each block (middling common words such as 'about', 'again', 'ways', and 'words'), and measures and ranks eigenvalues, or modes. Modes do not directly represent keyword occurrences; instead they measure complex patterns of deviation from a writer's normal rates of word frequency. They measure the way an author uses, or avoids using, words together. In Shakespeare's poems, modal analysis has revealed a few very strong, characteristic modes, quickly tailing off into many weak modes. All 90 blocks of Shakespeare's poems, and a block or two of sonnets taken from his plays, show the same characteristic pattern, while many blocks of other authors do not. Thus, Shakespeare's lowest and 'best' modal score is minus 25.36; his highest and worst score is 187.65; his mean score is 56.23; his standard deviation 40.09. Oxford's best, mean, and worst scores were 233.81, 356.94 and 490.47, respectively, all worse than Shakespeare's worst. Overall, Oxford tested 18.37 standard errors distant from Shakespeare's mean, very distant indeed. [note 5]
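As a rough illustration of the first step only (dividing a text into blocks and counting keywords), the following Python sketch may help. It is not the Clinic's program: the function name, the 500-word block size, and the tokenising rule are assumptions made for the example, and the keyword list holds only the four words named above rather than all 52. The later eigenvalue analysis that produces the modal scores is not shown.

```python
import re
from collections import Counter

# Illustrative subset: the four keywords named in the text (the Clinic used 52).
KEYWORDS = ("about", "again", "ways", "words")


def keyword_counts_by_block(text, block_size=500, keywords=KEYWORDS):
    """Return one row of keyword counts per consecutive block of words.

    block_size is a hypothetical value chosen for the example; a trailing
    partial block is dropped so that every row covers the same number of words.
    """
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    rows = []
    for start in range(0, len(words) - block_size + 1, block_size):
        counts = Counter(words[start:start + block_size])
        rows.append([counts[k] for k in keywords])
    return rows


if __name__ == "__main__":
    sample = "about words " * 5  # ten words of toy text
    for row in keyword_counts_by_block(sample, block_size=4):
        print(dict(zip(KEYWORDS, row)))
```

Each row of counts could then be turned into rates and compared with the writer's normal rates, which is the point at which the modal analysis described above would begin.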

The Clinic also performed five more conventional tests on Oxford and 19 other Elizabethan poets. These tests were: hyphenated compound words (HCWs) and relative clauses per thousand; grade-level of writing, as measured by word- and sentence-length; and percentage of open- and feminine-ended lines. (In the phrase 'the evil that men do', 'that men do' is a relative clause. An open line is a line not ended by a piece of punctuation. A feminine ending ends a line on an unstressed syllable, with a word such as 'gotten'.) In general, Shakespeare used compound words and open and feminine endings more frequently than his contemporaries, and relative clauses less frequently.
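Two of these counts are simple enough to sketch in a few lines of Python. The sketch below is an illustration under stated assumptions, not the Clinic's procedure: the function names are hypothetical, a hyphenated compound is taken to be any whitespace-separated token with a hyphen between two letters, and an open line is taken to be any non-blank line whose last character is not an ASCII punctuation mark. Relative clauses, grade level, and feminine endings need grammatical or metrical judgment and are not attempted.

```python
import re
import string


def hcw_per_thousand(text):
    """Hyphenated compound words per thousand words (naive token test)."""
    words = text.split()
    if not words:
        return 0.0
    # A letter on each side of the hyphen excludes dashes typed as "--".
    hyphenated = [w for w in words if re.search(r"[A-Za-z]-[A-Za-z]", w)]
    return 1000 * len(hyphenated) / len(words)


def open_line_percentage(lines):
    """Percentage of non-blank lines that do not end in punctuation."""
    verse = [ln.rstrip() for ln in lines if ln.strip()]
    if not verse:
        return 0.0
    open_lines = [ln for ln in verse if ln[-1] not in string.punctuation]
    return 100 * len(open_lines) / len(verse)


if __name__ == "__main__":
    print(hcw_per_thousand("the well-known fault of love-sick men"))
    print(open_line_percentage([
        "The evil that men do lives after them;",
        "The good is oft interred with their bones",
    ]))
```

Because both counts depend on spelling and punctuation, results from a sketch like this would vary with the edition used.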

We found Shakespeare's patterns to be strikingly consistent, and often strikingly at variance with those of other Elizabethan poets. Only two of our 75 conventional tests on (roughly) 3,000-word blocks of Shakespeare fell outside of Shakespeare's profile, while almost half of our 170 conventional tests on other poets fell outside the profile. Table 1, comparing the poems of the Earl of Oxford and of 'Meritum Petere Grave' (a number of poems in the 1573 Hundreth Sundry Flowers are signed by this Latin phrase, which is considered by some Oxfordians to be an Oxfordian "posy") provides examples of the contrasts: [note 6]

Table 1. Six Tests Comparing Shakespeare and Oxford

               TRC    HCW    GRL    Fem    Open   Modal
               1000   1000          End    Line
                                    %      %

Shakespeare    11.9   5.7    11.6   11.9   15.3   56.2

Shakespeare     2.6   1.7     1.3    4.7    4.1   40.1
Std. error

Oxford score   19.7   1.0     7.0    0.0   11.1  356.5
or mean

Oxford dis.     2.9  -2.8    -3.7   -2.5?  -1.0   18.4
fr. Sh. (s.e.)

Meritum score  10.8   1.3     9.0    1.1   12.0  125.6
or mean

Meritum dis.    0.4  -2.4    -2.1   -2.2?  -0.8    4.9
fr. Sh. (s.e.)

Commonisation    no   minor  minor    no   minor    no

Time problems?   no    no     no      yes    yes    no?

Ox-Sh. match?    no    no     no      no?    yes    no

Mer-Sh. match?   yes   no     no      no?    yes    no

               Matches       Mismatches      ?Mismatches
Oxford            1              4                1
Meritum           2              3                1

*Does not include The Phoenix and the Turtle.

Oxford's poems have many more total relative clauses (TRCs) than Shakespeare's, and many fewer hyphenated compound words (HCWs) and feminine endings. Shakespeare wrote at the 11th-grade level (GRL), Oxford at the 7th. Even ignoring feminine endings tests as dubious, Oxford's poems fall outside Shakespeare's profile by four of the six tests.
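The distance rows of Table 1 can be approximated from the rounded entries in the table itself, as in the Python sketch below. Several things here are assumptions for illustration: the function names, the reading of each distance as (score minus Shakespeare's mean) divided by Shakespeare's standard error, and above all the two-standard-error cut-off used to flag a mismatch, which the text does not state. The modal column is left out because the text does not spell out how its standard error was derived. Recomputed from rounded inputs, the HCW, feminine-ending, and open-line distances agree with the printed row, while the TRC and grade-level distances come out a tenth or two different, presumably because the table was computed from unrounded figures.

```python
# Rounded values transcribed from Table 1:
# (Shakespeare mean, Shakespeare standard error, Oxford score).
TABLE_1 = {
    "TRC": (11.9, 2.6, 19.7),
    "HCW": (5.7, 1.7, 1.0),
    "GRL": (11.6, 1.3, 7.0),
    "Fem": (11.9, 4.7, 0.0),
    "Open": (15.3, 4.1, 11.1),
}


def distance_in_se(score, mean, se):
    """Signed distance of a score from the mean, in standard errors."""
    return (score - mean) / se


def outside_profile(distance, threshold=2.0):
    """Flag a mismatch; the 2.0 cut-off is an assumption, not the Clinic's rule."""
    return abs(distance) > threshold


if __name__ == "__main__":
    for test, (mean, se, oxford) in TABLE_1.items():
        d = distance_in_se(oxford, mean, se)
        print(f"{test:5s} {d:+.1f} outside={outside_profile(d)}")
```

Under the assumed cut-off, only the open-line test falls inside the profile, which agrees with the match row of Table 1.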

None of the tests in Table 1 is perfect. Some tests have commonisation problems. Texts vary, and many spelling and punctuation peculiarities are supplied by editors, not authors. Modern editors differ widely in their use of exclamation marks and parentheses, though not in their use of HCWs. [note 7] Other tests have genre or time problems. Modal tests which work on poems do not work nearly as well on plays or songs from plays. [note 8] Shakespeare's line-ending frequencies increased drastically over his lifetime; they make an excellent test for works of the same period, but not for works like Oxford's, written decades earlier. Hence, Oxford's mismatches with Shakespeare on feminine endings do little to disprove his candidacy -- unless one accepts the assertions of leading Oxfordian scholars that the true dates of Shakespeare's plays are 20 years earlier than is now supposed.

Many Oxfordians believe, moreover, that the dissimilarities we found between Shakespeare and Oxford are developmental, not intrinsic, more like the differences between a caterpillar and a butterfly than like those between a silk purse and a sow's ear.

Despite these imperfections, we believe that our tests are a severe setback for the Oxford candidacy. Excluding the first eight of Oxford's poems as possible songs moves him closer to Shakespeare, but not close enough: 7 standard errors, instead of 18. We could not test the 'developmental changes' argument directly on Shakespeare, from whom we have only ten or fifteen years of poetry. But we could test it on two other writers with large, firmly dated bodies of poetry, Milton and Spenser. Milton's earliest poems (before 1633) and his later poem, Samson Agonistes (1670-71), both fit within a profile set by Paradise Lost (1658-1665). Spenser's Epigrams and Sonnets (1569) and his Amoretti (1595) match his Shepherd's Calendar (1579) closely -- though, for some reason, his Faerie Queene tests very distant from the other four works mentioned. As far as we can tell from these improvised tests, [note 9] Milton was a butterfly all his life, and so was Spenser -- except when he wrote the Faerie Queene. [note 10]

A cross-check of each conventional test against 3,000-word samples from an early Shakespeare play (Richard III, 1592) and a late one (Macbeth, 1606) indicates very little change in Shakespeare's profiles, apart from line endings. [note 11] The old Shakespeare seemed just as likely as the young to pour out HCWs, to stint on relative clauses, and to write at a given grade level, though the grade-level of the plays, understandably, is consistently much lower than that of the poems. Hence, four tests -- relative clauses, hyphenated compound words, grade-level, and modal tests -- do not seem to suffer greatly either from time problems or from commonisation problems. [note 12] Oxford's poems fall outside the profile on all four.

Moreover, the line-ending trends in Shakespeare's plays, which make Oxford's early poems only a dubious mismatch, still hurt his candidacy because, as the plays are conventionally dated, the trends continued for years after Oxford's death in 1604. Figure 1 and Figure 2 illustrate the dilemma this poses for the Oxford candidacy. If one accepts the conventional Riverside Shakespeare dating, the rising frequency of feminine endings is unmistakable and continues after Oxford's death (Figure 1). If one accepts the earliest clear Oxfordian dating, argued by Eva Turner Clark, [note 13] Shakespeare's plays cluster two decades earlier and nothing happens after Oxford's death -- but the rising trend disappears (Figure 2), and Oxford's and Meritum's line endings are a complete mismatch with those in Shakespeare's plays. Oxfordian dating is speculative and could no doubt be reshuffled somehow to fit Oxford at both ends, but it would require a major overhaul of their present dating. [note 14]

Figure 1. Shakespeare Plays: Feminine Endings (Riverside Dating)

          |                                        *
          |                                       *
          |                                      *  *
        30|                               *  *       *
Percent   |                          *      * **
Feminine  |                            * *  * *
Endings   |                           * **     *
(Halliday)|                            *       *
        20|                     *      *
          |                      ** *
          |                   *  *    *
          |                   *
          |                        *
        10|                      * *
          |                  *    **
          |                        **
          |         |         |         |         |         |
         1570      1580      1590      1600      1610      1620


Figure 2. Shakespeare Plays: Feminine Endings (Clark Dating)

          |            *
          |              **
          |       *                        *
        30|       *          *
Percent   |        * * **
Feminine  |         ***      *
Endings   |           * *
(Halliday)|     **      * *
        20|        * **
          |     * **
          |        **
          |           *
        10|     *
          |       *  *  *  *
          |           * *
          |         |         |         |         |         |
         1570      1580      1590      1600      1610      1620


Our conclusion from the Clinic was that Shakespeare fit within a fairly narrow, distinctive profile under our best tests. If his poems were written by a committee, it was a remarkably consistent committee. If they were written by any of the claimants we tested, it was a remarkably inconsistent claimant. If they were written by the Earl of Oxford, he must, after taking the name of Shakespeare, have undergone several stylistic changes of a type and magnitude unknown in Shakespeare's accepted works. And, for both feminine endings and open lines, he must have somehow found a way to carry on trends which are well known in Shakespeare's plays for almost a decade after his own death. These are not easy assumptions to make. We do not claim to have said the last word on this subject, nor to have solved the Shakespeare authorship mystery. But, if it strains credulity to suppose that Will Shakspere, the Stratford grain dealer, could have written Shakespeare's poems and plays, it also strains credulity to suppose that people like Oxford, with entirely different stylistic idiosyncrasies from Shakespeare, could have been the true authors of his poems and plays.

Note 1

The claimants' works tested were:

Poems in Hundreth Sundry Flowers signed "Meritum Petere Grave", thought by some Oxfordians to be an Oxford posy, were also tested, as were poems by eight non-claimants: George Chapman, Giles Fletcher, William Herbert, Fulke Greville, Gervais Markham, John Milton (early), Mary Wroth, and Henry Willobie. The full list of fifty-eight claimants may be found in O.J. Campbell and E.G. Quinn (eds), The Reader's Encyclopedia of Shakespeare (New York, 1966), 115.

Note 2

See W. E. Y. Elliott and R. J. Valenza, "A Touchstone for the Bard", Computers and the Humanities 25 (1991): 199-209.

Note 3

Shakespeare Clinic, "Matching Shakespeare, 1990" (Claremont, 1990)

Note 4

For example, Charlton Ogburn, The Mysterious William Shakespeare (New York, 1985).

Note 5

In a normal distribution, about two-thirds of a population fall within one standard error above and below the mean; 95 percent of a population fall within two standard errors of the mean; 99.7 percent fall within three standard errors. We make no claim that our distributions of tested scores are normal (they are not), or that the individual observations are statistically independent. Hence, the standard-error numbers in note 2 and note 3 are for comparison only.
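The percentages quoted above for a normal distribution can be checked with Python's standard library, as in the sketch below. The function names are hypothetical, and the calculation applies only to the idealised normal case, which, as stated above, our distributions do not satisfy.

```python
import math


def share_within(k):
    """Share of a normal population within k standard errors of the mean."""
    return math.erf(k / math.sqrt(2))


def two_tailed_chance(k):
    """Share of a normal population lying more than k standard errors away."""
    return 1.0 - share_within(k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(100 * share_within(k), 1))
    # Raleigh's 2.4 standard errors, under the normal idealisation only.
    print(round(100 * two_tailed_chance(2.4), 1))
```

On these idealised assumptions, a distance of 2.4 standard errors corresponds to a chance of a little under two percent.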

Note 6

Our Oxford text is from Steven May, "The Poems of Edward de Vere, Seventeenth Earl of Oxford, and of Robert Devereux, Second Earl of Essex", Studies in Philology 78 (1980); our Meritum text is from Ruth Loyd Miller, ed., A Hundreth Sundrie Flowres (Port Washington, 1975). We modernized spelling and some punctuation of both texts to correspond with the conventions of the Riverside Shakespeare.

Note 7

W. E. Y. Elliott and R. J. Valenza, "Computers and the Oxford Candidacy" (Claremont, 1990), 2.

Note 8

Neither Shakespeare's plays nor his songs match his poems under modal testing, using the fifty-two keywords used on the poems. The problem with the songs is sing-song, repetitive lines and a superabundance of "Hey nonny nonnies". Shakespeare's plays do match each other closely under modal analysis but are not distinguishable from other people's plays. A different set of keywords might cure this problem, but we have not yet tried to develop one.

Note 9

Our improvisation was to use our keywords optimized for Shakespeare, rather than working up new sets of keywords optimized for Milton and Spenser.

Note 10

See note 2 and note 7.

Note 11

See W. E. Y. Elliott and R. J. Valenza, "Computers and the Oxford Candidacy" (Claremont, 1990), p. 5.

Note 12

For example, our counts of hyphenated compound words in three modern editions of all of Shakespeare's poems are as follows:
     1864 Trinity College edition       316
     1974 Riverside edition             255
     1986 Oxford edition                275
The Riverside and Oxford Shakespeares both have six times as many HCWs per thousand as Oxford's poems modernized by the Riverside rules; the Trinity College Shakespeare has seven times as many.

Note 13

Eva Turner Clark, Hidden Allusions in Shakespeare's Plays (New York, 1931).

Note 14

Clark (note 13) argues that most of Shakespeare's plays were written in the 1570s and 1580s but updated in the 1590s. Charlton Ogburn (note 4), rather than assigning his own dates to the plays, generally offers the earliest Stratfordian dating he can find, while avoiding dating altogether for most of Shakespeare's last plays by conventional dating. A table of five kinds of verse tests -- feminine endings, open endings, mid-line speech endings, light endings, and weak endings -- may be found in F. E. Halliday, A Shakespeare Companion, 1550-1950 (London, 1952), 681. All five tests showed sharp increases continuing after Oxford's death in 1604.

In addition to our own conventional tests from the Clinic, we also experimented with four tests adapted from D. W. Foster's Elegy by W. S.: A Study in Attribution (Cranbury, New Jersey, 1989). These tests are:

  1. incongruous "who" (modifying an inanimate noun);
  2. hendiadys (using two nouns where one would expect a noun and an adjective);
  3. redundant comparative or superlative ("the most unkindest cut of all"); and
  4. participial compound words ("back-breaking").
Neither Oxford's nor Meritum's poems have any of these features, which, Foster argues, appear with some frequency in Shakespeare's works, but much more rarely in the works of others. For various reasons, we could not validate these tests well enough to rely on them as independent disqualifiers for 3,000-word samples, but it is at least worth mentioning that we tried them with negative results.

Oxford, Meritum, and other claimants also differ sharply from Shakespeare in frequency of exclamations and parenthetical expressions. Shakespeare has many more exclamations than Oxford or Meritum, but many fewer parenthetical expressions than Meritum. But these are so much a matter of editorial, not authorial, discretion, and editorial variance has been so broad compared to HCWs, that we have not included them in Table 1.

Another test not covered in the Clinic report, but apparent just from reading Meritum's poems, is a simple "favourite word" test. Shakespeare and Meritum both used the archaic term "make" and the modern "mate" to describe a love-partner. But Shakespeare appears to have preferred "mate" five-to-one over "make", at least in his poems, while Meritum preferred "make" to "mate" by the same ratio.

