Linking Words to Writers: Building a Reliable Corpus for Historical Sociolinguistic Research


Judith Nobels & Marijke van der Wal

In building a corpus of seventeenth-century private letters suitable for historical sociolinguistic research, we face a problem caused by the literacy rate of the period. Because part of the seventeenth-century population was either illiterate or only partly literate, we have to establish whether letters were written by the senders themselves before we can match specific language use with the social rank of the sender. We have therefore developed a procedure, based on both form and content analysis, to identify letters as autographical or non-autographical. The analyses in this identification procedure would not be possible without interdisciplinary research and collaboration with historians, archivists and artificial intelligence researchers, which shows that crossing the borders of linguistics can be vital for achieving a reliable corpus. A spin-off of our identification procedure is a sub-corpus of non-autographical letters, which will allow us to examine the practice of so-called encoding in Dutch private letters for the first time.

1 A preliminary, shorter version of this article, titled Tackling the Writer-Sender Problem: the newly developed Leiden Identification Procedure (LIP), was published in the journal Historical Sociolinguistics and Sociohistorical Linguistics 9 (2009).

1 A treasure for historical linguists

Examining the linguistic past from the perspective of language history from below, historical linguists consider the language of proximity, found in ego documents such as private letters and diaries, an indispensable source of reliable data.2 Until recently, linguists studying the history of Dutch in the seventeenth and eighteenth centuries had to rely mainly on ego documents written by men from the higher ranks of society.
Ego documents from women in general and from both men and women of the lower and middle classes were available only in small numbers, scattered over various provincial, municipal and personal archives in the Netherlands.3

This situation has changed considerably since historians rediscovered the Dutch documents in the High Court of Admiralty’s archives, kept in the National Archives (Kew, UK). Apart from a wide range of other material including treatises on seamanship, plantation accounts, textile samples, ships’ journals, poems and lists of slaves, this collection of so-called sailing letters comprises about 38,000 Dutch letters, both commercial and private, from the second half of the seventeenth to the early nineteenth centuries. These sailing letters were confiscated during the wars fought between the Netherlands and England. What makes this huge collection so interesting for linguists is the 15,000 private letters, written by men, women and even children of all social ranks, including the lower and middle classes. They offer an unprecedented opportunity to gain access to the everyday and colloquial language of the past and will consequently enable linguists to view the history of the Dutch language from below.

As part of the research project Letters as loot. Towards a non-standard view on the history of Dutch, we explore this highly valuable source for both the seventeenth and the eighteenth century.4 In this chapter we will focus on the challenges of the seventeenth-century material and discuss our fruitful collaboration with researchers from other disciplines.

2 For a discussion of the concept of ‘language history from below’, see for instance Elspaß (2007a: 3–9; 2007b: 155).

3 In this article we use the term social class as a synonym of social rank and not in its nineteenth-century meaning.

4 The research project Letters as loot. Towards a non-standard view on the history of Dutch was initiated by the programme leader Marijke van der Wal (Leiden) and funded by The Netherlands Organisation for Scientific Research; cf. also (Dutch and English version).

2 Confiscated letters

England and the Netherlands were rivals and enemies for centuries: no fewer than four Anglo-Dutch Wars were fought, and in various other wars of the eighteenth and early nineteenth centuries England and the Netherlands stood on opposite sides. Warfare implied privateering (in Dutch kaapvaart): private ships (privateers) authorized by a country’s government attacked enemy ships and seized their cargo. Privateering was a long-standing, legitimate activity, practised by all seafaring European countries and regulated by strict rules. The conquered ship and all its cargo, called a prize, were considered loot for the privateer, provided the rules had been followed by the book. In England it was the High Court of Admiralty (HCA) that had to establish whether the procedures in force had been properly followed. To allow the court to decide whether the ship was a lawful prize, all the papers on board, both commercial and private, were confiscated and claimed by the High Court of Admiralty. After the legal procedure, the confiscated letters stayed in the High Court of Admiralty’s archives. This is how a huge number of Dutch letters from Dutch ships taken by privateers ended up stored in hundreds of boxes in the British National Archives.

To fully appreciate the huge number of letters, it is important to note that in many cases the ships’ cargo contained far more mail than the crew’s own correspondence. Ships sailing to the Caribbean (West India) and to East India often took mailbags on board and thus functioned as mail carriers between the Netherlands and those remote regions, and vice versa (Van Vliet 2007: 47–55; Van Gelder 2006: 10–15).
The confiscated papers gathered dust in the HCA archives for centuries; only a very small part of the collection was examined, for specific historical research, in the last decade of the twentieth century. The actual size of the collection came to light in 2005, when the historian Roelof van Gelder made an indispensable but still rough inventory of the Dutch HCA material.5

5 Cf. van Gelder’s report (Van Gelder 2006).