Checking Some Wellesley Index Attributions by Empirical ‘Internal Evidence’: The Case of Blackie and Burton

Since its inception, the Wellesley Index has been a great resource for scholars wanting to know the identity of the numerous anonymous contributors to the nineteenth-century periodicals. However, when all the available external evidence was exhausted Wellesley attributors began to rely on internal evidence, and some of these attributions are now being queried as unduly speculative. This is the case with the attribution of certain articles in Tait’s Edinburgh Magazine to John Stuart Blackie and John Hill Burton, two Scottish contributors in the 1830s and 40s, where the evidence is, as Eileen Curran noted in The Curran Index, often ‘tenuous.’ Developments in computational stylistics over the last thirty years now offer statistical techniques for testing such doubtful attributions. Use of the Burrows Method, based on an author’s relative usage or non-usage of common function words, allows the researcher to isolate an author’s distinctive stylistic traits and to use these to compare his known articles with others of more doubtful provenance and to make informed judgments about the likelihood of his authorship of these. These methods were used to test the authorship of eight articles attributed to Blackie and eight attributed to Burton. The use of function words in the doubtful articles was compared to that in six articles reliably attributed to Blackie and ten reliably attributed to Burton and then to that by contemporaries also writing for other major periodicals. It was found that only four of the Blackie articles tested and two of those by Burton appear to have been correctly attributed in the Wellesley Index.


Introduction
It was once thought that authors' use of the very common words of English showed no patterned variation: any competent user of the English language used them at the same rate, and individuality of style emerged only from less common lexical words. Following in the footsteps of 'function word' pioneers Ellegard and Mosteller and Wallace,2 John Burrows showed that the incidence of these words varies significantly between texts by different authors while remaining comparatively constant within a single author's work (Burrows, 1987: 1-2). This finding was at the heart of all his early work and led to the development of what is usually known as the "Burrows Method" at the Centre for Literary and Linguistic Computing [CLLC]3 at the University of Newcastle, Australia. By subjecting the most common words of a large set of texts to statistical procedures like Principal Component Analysis, Burrows was able to show that rates of usage vary systematically, and that most authors have a characteristic use, sometimes called a "wordprint," of these common words that distinguishes their style from that of their fellows (Burrows, 1996, 2004).4
One major area in which the Burrows Method has been used is the attribution of authorship (Love, 2002: Chap. 8). After many years of experimentation and refinement by Burrows and his colleagues, it has emerged as a particularly useful tool for testing hypotheses devised by the researcher on the basis of fragmentary or ambiguous "external" evidence, or on "internal" evidence of style, tone, or subject matter. Selected texts by known authors (including the putative author) are subjected to statistical analysis of the frequency of the most common words, in the expectation that authorial groupings based on the authors' characteristic word usage will emerge, and that if a text of doubtful or contested authorship is placed among these texts, the validity of the attribution can be assessed. The results of such tests are, of course, matters of probability, not of certainty. Factors other than the author's idiosyncratic preferences (for example the text's era, genre, and topic, and the gender, nationality and stance of the author) can determine which common words are used. It is therefore usual, when using this method, to take the precaution of choosing texts for comparison which share most of these characteristics with the text or texts being tested.
The authors of the present article have become interested in applying the methods of computational stylistics to the area of anonymous authorship in nineteenth-century British periodicals (see Jordan et al., 2006), and this article reports on our investigation of a particular case in the hope that the methods we have used may prove helpful to others undertaking the same sort of work.

Authorship and Internal Evidence
Hugh Craig has argued that, in spite of the fact that some scholars question the role of authorship "as a key to understanding textuality and as a basis for interpretation and editing," the continuing academic concern with attribution demonstrates that the "older authorship model […] which made authorship the chief guarantor and constituting power of meaning in texts" remains the basis of much academic work. "After two decades or so of work," he continues, "computational stylistics has established a strength and consistency to the author effect in practice that overturns the consensus about the invalidity of that effect in theory" (Craig, 2009-10). Walter Houghton (the founder of the Wellesley Index project) argued for the importance of attribution by pointing out that "the context in which one discusses an essay, and therefore its place in a work of scholarship, can depend on knowing the contributor and therefore the group he speaks for." An anonymous paper attacking the Thirty-nine Articles, he suggested, "would mean one thing if it were written by T.H. Huxley and something quite different if the author were the Bishop of London" (Houghton, 1972: 48-51).
In his book Attributing Authorship, Harold Love discusses "the problem of unattributed journalism" and suggests that for some authors, particularly minor ones, "the determinism of authorship is often crucial" to the mapping out of a career, defining a canon or "establishing whether the person concerned really held particular views." Although conceding that for many texts there may be no sure answer as to "how personal responsibility for given aspects of given texts might be distributed," he points out that there are good practical reasons for crediting authorship to the person who performed it. Further, he suggests, most writers "wish passionately to assert their responsibility for their creations" (Love, 2002: 2-3). These observations are pertinent to a consideration of the notion of authorship of the nineteenth-century periodicals.5
The nineteenth-century literary scene in Britain was notable for the number of serious quarterly and monthly periodicals dealing with political, literary, historical and scientific subjects of the kind that now have their own specialised academic journals. Anonymity in reviewing was an inheritance from the eighteenth century which the major quarterlies adopted without question in the first decades of the nineteenth century. As the century progressed, however, there was a "movement towards signature" which became the norm in the reviewing of the eighties and nineties (Maurer, 1948: 10). This issue, and the controversy surrounding it, is intimately connected with the relationship between authorial individuality and editorial responsibility, that is, with whether the journal's or the author's name is more important. It is not surprising that the debate concerning anonymity versus signature was so contentious, in view of the nature of the 'hybrid genre' created by the Edinburgh Review and adopted by the other journals. Reviews, according to Levine and Madden (1968: viii-ix), are supposed to be dependent "upon the works on which they feed" and are assumed to be at their best when they mediate "truth without calling attention to the author." The runaway success of the Edinburgh Review, however, lay in the fact that it was, even from the start, more than a review. It was a review where the bulk of the interest was to come from the so-called reviewer's own input: his or her "large and original views of all the important questions to which these works might relate" (Wellesley online introduction). While the authorial contribution had become all-important to the success of the genre, the policy of anonymity worked at merging authorial individuality with the editorial voice and house-style of the journal.
5 One notable example of the public's interest in the authorship of anonymous articles was the Saturday Review's Modern Women series. When a collection of these was published in 1868 under the title Modern Women and What is Said of Them, the publisher deliberately withheld authorship information in this tantalizing fashion: "The authorship of these papers has been attributed to different individuals, male and female; but it is more than probable that the writers whose names have been mentioned in this connection are precisely those who have had nothing to do with them" (Modern, "Advertisement"). Fifteen years later, Eliza Lynn Linton was finally able to claim authorship of ten of these articles, including the infamous "Girl of the Period" article. She thanked the authorities of the Saturday Review for their permission to republish them under her own name, saying she was "glad to be able at last to assume the full responsibility" for her own work and claiming she had twice been introduced to the author of the "Girl of the Period," one a clergyman, the other a society matron (Girl vii-viii).
For many years the authors of periodical articles could only be identified if the contributions had been republished by the author as part of a collection of essays. However, as the online guide to the Wellesley Index puts it, "the scholarly importance of this material created an imperative to provide indexes through which it could be accessed." During the last eighty years,6 both literary scholars and historians have been constructing such indexes by combing surviving publishers' records and personal memoirs and letters from the period for further information on the authorship of these articles. Between 1966 and 1989 much of the effort was harnessed into producing the multi-volume Wellesley Index, where the authors of many of the unsigned contributions to forty-three nineteenth-century periodicals have been identified (Colby, 1994: 287-8), and this has been a great resource for scholars since its inception.
Nevertheless, doubt has for some time been thrown on the accuracy of some of the Wellesley identifications (for example Hall, 1991). In the earlier stages of the Wellesley project, when scholars were dealing with periodicals for which substantial editorial archives had been preserved, attributions were restricted to those for which such "external evidence" could be found. When, however, the project moved on to journals for which archival material was scanty, the use of "internal evidence" began to be admitted, and scholars began to identify authors from such aspects as subject matter, point of view, and style, using what one of the editors described as "the warts, tics and scars" of individuals to differentiate one author's writing from another's (Hiller, 1979). There is growing evidence that some of this use of internal evidence has been rather cavalier. It was Eileen Curran's concern with the possible further misattribution of articles by Blackie and Burton, two Scotsmen with many similar characteristics (both born in 1809, educated at Marischal College, Aberdeen, and frequent contributors to the periodical press), that prompted us to undertake the project reported on here. Although the commonalities shared by Blackie and Burton created a situation where the usual methods of internal attribution were likely to fail,7 the ability of computational stylistics to reveal distinctive authorial stylistic patterns would seem to offer a better chance of success in assigning authorship to one or other of these authors.8
We received from Eileen Curran a list of the articles in Tait's Edinburgh Magazine ascribed to Blackie and Burton, and her own notes suggesting that although 94 articles are attributed to one or other of these men between 1833 and 1854, there is good external evidence for the authorship of only 16 of them. We believed that these 16 articles would provide us with a sufficient basis to build characteristic word usage profiles for Blackie and Burton, and that the methods of computational stylistics could be used to test their authorship of the doubtful articles. We have in consequence digitised 13 of the "known" articles (the remaining three were, we felt, too short to be useful) and 16 of the "doubtful," and we feel this has allowed us to reach fairly firm conclusions: of the 16 doubtful articles we tested, four are likely to be by Blackie and two by Burton, and it seems possible that two further, as yet unidentifiable, authors were responsible for seven of the remaining ten doubtful articles.

Using the Burrows Method
In using the Burrows Method one first assembles three groups of texts:9 a Test Piece or Pieces, a Base Set, and a Counter Set. In this case we prepared as Test Pieces the sixteen "doubtful" articles ascribed to Blackie or Burton which had been digitised, and two Base Sets of texts reliably attributed to each author. For constructing Counter Sets we had at our disposal a large corpus of 162 digitised reliably-attributed texts (more than one and a half million words) that had been published in major quarterlies and monthlies between 1830 and 1880.10 One then creates tables11 showing the proportional usage of words from a pre-determined list in the texts for all three Sets. We used a list of the 150 most common "function" words derived from our Victorian Periodicals corpus (see Appendix C). When the texts to be compared have been selected, their tables of numbers are pasted into a comprehensive statistics package like SPSS or MiniTab, and the comparative use of these words is measured.
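The table-building step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the short `FUNCTION_WORDS` list is a hypothetical stand-in for the 150-word list in Appendix C, the tokeniser is deliberately crude, and "proportional usage" is taken here to mean a word's percentage share of all tokens in a text. The function names are ours and do not describe the software actually used.

```python
import re
from collections import Counter

# Hypothetical stand-in for the 150-word list given in Appendix C.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "but"]


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def usage_profile(text, wordlist=FUNCTION_WORDS):
    """Return, for each listed word, its percentage share of all tokens."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in wordlist]
    return [100.0 * counts[word] / total for word in wordlist]


def usage_table(texts, wordlist=FUNCTION_WORDS):
    """Build one row of proportions per labelled text (Test, Base or Counter)."""
    return {label: usage_profile(text, wordlist) for label, text in texts.items()}
```

Each row of the resulting table corresponds to one article and each column to one word in the list, which is the shape a statistics package expects for the comparisons described here.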
For the presentation of our findings in this article we have used MiniTab 15 to create Multivariate Cluster Analyses with the results expressed as dendrograms.12 An increasing number of practitioners of computational stylistics are turning to multivariate analysis as the most suitable technique for the analysis of linguistic variables.13 Cluster analysis is a useful technique for representing the natural groupings, if any, of a set of texts. The main advantage of the method is that its results can be presented in a tree-diagram format, which is easy for the reader to interpret. The samples bearing most resemblance unite earliest, while the most different samples unite last of all. This simplicity of presentation is offset by a lack of transparency as to what underlies the pairings or separations, and the method occasionally produces an unexpected pairing when some other likeness causes a statistical affinity with a text by another author. Yet, as Hoover (2001: 438) writes, the failure of cluster analysis to produce "completely accurate clusters is hardly a catastrophe" since other methods are available.14 It is our normal practice, before accepting the validity of an initial cluster analysis, to run further tests where the results are produced in a different format (e.g. as scatter diagrams) to check that the interpretations of the findings are consistent.
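To make the "most similar unite earliest" behaviour concrete, the following Python sketch performs a simple agglomerative clustering and records the order of the merges, which is the information a dendrogram displays. It is an illustration only: the distance measure (Euclidean) and the linkage rule (average linkage) are our assumptions, since the MiniTab settings are not stated in this article, and the input is assumed to be a dictionary mapping each text label to its row of word proportions.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two rows of word proportions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, rows):
    """Average linkage (an assumption): mean distance over all cross-cluster pairs."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(euclidean(rows[i], rows[j]) for i, j in pairs) / len(pairs)


def agglomerate(table):
    """Repeatedly merge the two closest clusters; return the merge history.

    Each history entry is (sorted labels in the new cluster, merge distance),
    so the earliest entries are the texts that resemble each other most.
    """
    labels = list(table)
    rows = [table[label] for label in labels]
    clusters = [(i,) for i in range(len(labels))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], rows)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        history.append((sorted(labels[i] for i in merged), d))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return history
```

Reading the history from first entry to last corresponds to reading a dendrogram from its leaves towards its root: texts that join early are stylistically close on the chosen words, and the final merge joins the most dissimilar groups.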

Wordlist Tests
Our first step was to see whether our Victorian Periodicals wordlist would separate the firmly attributed Blackie and Burton articles on the basis of authorship. To do this we first compared the distribution of the words in our list in the articles by Blackie with the distribution in those by Burton. Figure 1 below15 shows the two sets of articles dividing on an authorial basis into quite distinct trees, suggesting that there is a significant difference in these authors' usage of the words in our Victorian Periodicals wordlist.
[Figure 1: dendrogram of the firmly attributed articles, with separate trees labelled Blackie and Burton]
14 Hoover also writes that "under carefully controlled conditions, however, a cluster analysis of the most frequent words of texts is still a useful first step in determining authorship."
15 Full bibliographic information for the texts used in the figures included in this article can be found in Appendix B.

Our second step was to see if this distinction was preserved when checked against firmly attributed articles written by contemporaries. Accordingly, we ran a large number of trials comparing the Blackie and Burton articles with different groups of texts from our corpus and found that the texts reliably attributed to each of our Base Set authors almost always grouped together, but were differentiated in varying degrees from those by other authors within the corpus. Figure 2 below shows one such test, where the word distribution in the Blackie and Burton articles was compared with the distribution in nine articles by three other authors in our corpus. The tree diagram confirms the Wellesley Index attribution of these articles and suggests that Blackie's word usage was similar to that of Kingsley and Martineau, though not as similar as that of these two to each other, whereas Burton's was more similar to that of Macaulay than to that of any of the other three.

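[Figure 2: dendrogram comparing the Blackie and Burton articles with articles by Kingsley, Martineau and Macaulay]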
Having confirmed the reliability of our Victorian Periodicals wordlist for distinguishing Blackie's and Burton's word use from that of their contemporaries, our next step was to run large numbers of tests where we compared the Test Pieces, either individually or in a group, with a Base Set of articles by the author to whom they were attributed and a Counter Set made up of articles reliably attributed to authors who were unlikely to have written them. If the Test Piece was usually positioned among or close to the articles of the author to whom it was attributed, we felt the attribution could be accepted. On the other hand, if the analyses only occasionally placed a Test Piece among or close to the articles of this author, we concluded that this was probably due to the fact that the word use of the authors in that particular Counter Set was on this occasion even less like that of the author of the Test Piece than that of the Base Set author, and felt the attribution was not necessarily confirmed.
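The logic of repeating a test against many Counter Sets can be illustrated with a small Python sketch. It is a deliberate simplification and not the procedure we ran: instead of inspecting where a Test Piece sits in a dendrogram, it applies a hypothetical nearest-neighbour rule (is the closest comparison text a Base Set text?) and reports the share of trials in which the Test Piece sides with the Base Set. The function names and the nearest-neighbour criterion are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two rows of word proportions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def groups_with_base(test_row, base_rows, counter_rows):
    """True if the nearest comparison text belongs to the Base Set.

    This nearest-neighbour rule is a stand-in for reading a dendrogram.
    """
    nearest_base = min(euclidean(test_row, row) for row in base_rows)
    nearest_counter = min(euclidean(test_row, row) for row in counter_rows)
    return nearest_base < nearest_counter


def agreement_rate(test_row, base_rows, counter_sets):
    """Share of trials (one per Counter Set) in which the Test Piece
    is closer to the Base Set than to that Counter Set."""
    hits = sum(groups_with_base(test_row, base_rows, cs) for cs in counter_sets)
    return hits / len(counter_sets)
```

A rate close to 1 across varied Counter Sets corresponds to a Test Piece that "usually" groups with the attributed author; a low or erratic rate corresponds to the occasional, Counter-Set-dependent groupings that we did not treat as confirmation.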

The Burton Tests
In the case of the Burton Test Pieces we were able to come to what we felt was a satisfactory conclusion with the use of the 150-word Victorian Periodicals wordlist and the Test Pieces combined into a single Test Set. For the Burton Base Set we were able to use not only the reliably attributed texts from Tait's, but also some of Burton's contributions to the Edinburgh Review. These two Sets were then compared with Counter Sets composed of texts by pairs of authors chosen randomly from our corpus. The results shown in Figures 3-4 (below) are typical of our findings. In almost all the tests we ran, six of the articles attributed to Burton showed considerable volatility, sometimes forming a single cluster and sometimes forming trees with the Counter Set authors. Two articles, however, "Von Raumer on the character and times of Charles I" and "St. Andrews," attached themselves to the Burton tree regardless of the authors used for the Counter Set. We therefore feel that there is good reason to believe that these two articles were written by Burton, but that the other six texts in the Test Set that are attributed to him in the Wellesley Index probably were not.

The Blackie Tests
The Blackie articles required more elaborate testing, since in tests of the type used for Burton not only the Test Pieces but also quite a number of the Blackie Base Set showed a tendency to group with authors used in the various Counter Sets. Our tests revealed that whereas Burton's writing style varied little from text to text, Blackie varied his style according to subject matter or admired models. We therefore employed a method of testing used at the CLLC to give more precise results: the use of a wordlist composed of those "function" words that Blackie used relatively more or relatively less frequently than his contemporaries. The list was created by running a T-test16 to compare the proportional usage of the 150 wordlist words in all the reliably attributed Blackie articles with their usage in the articles by the other authors in our Victorian Periodicals corpus. In this test, the larger the absolute T-value, the more systematic and marked are the differences between the two sets. We found that the T-tests isolated 56 words (see Appendix D) with a T-value of +/-2.0 or stronger.17 We then used these words as "Blackie Markers" in another series of tests. Once again we performed a great many tests using many combinations of the texts in our Counter Sets. We came to the conclusion that in the case of Blackie, it was better to compare the Test Pieces one by one against a range of Counter Sets. Just two of these tests are shown here: one where the Test Piece groups with Blackie (Figure 5) and one where it does not (Figure 6). These tests in combination led us to the conclusion that only half of the articles in our Test Set that are credited to Blackie in the Wellesley Index appear to be correctly attributed: "National vs. state education," "Carlyle's Past and Present," "Styria, and the Styrian Alps," and "The Scottish Universities and the Established Church."
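The marker-selection step can be sketched as follows. The code is a hedged illustration, not a record of our procedure: it assumes Welch's unequal-variance form of the T-test (the variant is not specified in this article), it assumes the input is, for each word, a list of per-article proportions for the author and another for the comparison authors, and the names `welch_t` and `marker_words` are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples of per-text proportions.

    Each sample needs at least two values so that a variance exists.
    """
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # No spread in either sample: identical means give 0, otherwise
        # the difference is treated as unboundedly strong.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def marker_words(author_usage, others_usage, threshold=2.0):
    """Keep words whose absolute t-value meets the threshold.

    A positive value means the author uses the word relatively more often
    than the comparison authors; a negative value, relatively less often.
    """
    markers = {}
    for word in author_usage:
        t = welch_t(author_usage[word], others_usage[word])
        if abs(t) >= threshold:
            markers[word] = t
    return markers
```

Words that pass the threshold in either direction are retained, because an author's habitual avoidance of a word is as distinctive as a habitual preference for it.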

Alternative Authors
In our earlier experiments with the 150-word Victorian Periodicals wordlist, we had also noticed that several of the articles attributed to Burton, though according to our tests not written by him, nevertheless appeared to share a common author. In several tests "Pitcairn's Criminal Trials" (July 1833), "Tytler's History of Scotland" (Dec. 1837), and "The Life and Rebellion of the Duke of Monmouth" (Jan. 1845) formed a single tree no matter how varied the combinations surrounding them. This suggested to us that the group of articles attributed to Burton might contain works by other individual authors whom it was possible to identify, though not name, using the Burrows Method. Again we performed a series of tests using texts by other authors from our corpus as Counter Sets, and in the process a second set of texts that almost invariably formed a single tree emerged. Figures 7 and 8 (above) are examples of the kind of tests run, and it can be seen that as well as the grouping described above, another group of texts, "Mr. Carlyle's Oliver Cromwell's Letters and Speeches" (Jan. 1846), "The Work of De La Motte Fouqué" (Aug. 1845), "The Rev. Dr. Lindsay Alexander's Switzerland and the Swiss Churches" (Nov. 1846), and "Monastic studies, jests, and eccentricities" (Oct. 1845), emerged as possibly being by a single author.

Conclusion
We believe that our analyses have established the following about the sixteen articles whose attribution to Blackie and Burton in the Wellesley Index we tested:
1. Half of the articles questionably credited to Blackie, "National versus state education" (Nov. 1837), "Thomas Carlyle's Past and Present" (June 1843), "Styria, and the Styrian Alps" (Aug. 1843), and "The Scottish Universities and the Established Church" (June 1845), appear to be correctly attributed.
2. Only two of the articles questionably attributed to Burton, "Von Raumer on the character and times of Charles I" (Feb. 1837) and "St. Andrews" (June 1844), appear to have been written by him.
Eileen Curran's The Curran Index: Additions to and Corrections of The Wellesley Index to Victorian Periodicals contains notes like the following to a number of attributions:
Tait's Edinburgh Magazine . . . 431 On a criticism of Niebuhr, 5 o.s. 1 n.s. (April 1834) 188-189. s/ B. Delete entry, with its attribution to John Stuart Blackie. Add: John Hill Burton, prob. A year earlier, Burton had contributed #229; see above. The first article that can safely be attributed to Blackie, #803, did not appear until 3 years later and carried no signature. Burton, a historian educated in the classics, contributed a great deal to Tait's in the 1830s. [Wellesley attributes to Burton 4 later articles signed B.--#s 1804, 1846, 1870, and 1878; and to Blackie 2 others--#s 1522 and 1719. The evidence is often tenuous for these and also for several other articles given to the two men.]
