Discovering African English through corpora

Corpus Linguistics and African Englishes is an exposition of the nature and applications of corpora and a glimpse into African English. In its 15 chapters, it offers a comprehensive picture of (African-English-based) corpus linguistics: from what the field is and what the process of compiling a corpus involves, to how a corpus can be exploited to detect specific usages and how these can inform real-life decisions. Chapter 1.1, by Esimaje & Hunston, is a simply written introduction to corpus linguistics. Section 1 extends the definition of ‘corpus linguistics’ beyond the traditional ‘collection of written or spoken material’ to include signed languages, outlines the field’s history, explains its significance, and describes the historical debates around it. Section 2 explains the three key concepts in corpus linguistics: ‘corpus design’, ‘corpus output’, and ‘corpus annotation’. Section 3 presents types of corpora, software to search them, and their applications. In Chapter 1.2, Fuchs, van Rooy and Gut stress the need for corpora to be representative. They highlight the role of the International Corpus of English (ICE) project in enabling research on (six) African English varieties and also describe other corpora of African Englishes, the most prominent of which is the Global Web-Based English Corpus (GloWbE). They then formulate ‘the guidelines that the compilers of a new corpus should bear in mind’ and illustrate corpus-based studies of African Englishes with three case studies. However, many African-English researchers are likely to take issue with one methodological point: the authors illustrate their discussion with features (appearing in GloWbE) which would be hard to ‘sell’ as ‘African English’, e.g. the spellings bra, bru, dat, dis, neva, and shem reported for Kenyan English.
In Chapter 1.3, Esimaje presents ‘[t]he purpose, design and use of the Corpus of Nigerian and Cameroonian English Learner Language (Conacell)’, a 442,939-word corpus consisting of data produced by ‘intermediate learners’ and ‘advanced learners’. She demonstrates the usefulness of Conacell in identifying misspelt words and in examining tense uses by university students. Chapter 1.4, by Steigertahl, describes the methodology used to compile the 190,000-word ‘Corpus of English(es) Spoken by Black Namibians post Independence (Corpus of ESBNaPI)’. Although the chapter opens the door for ‘Namibian English’ to be considered for addition to the map of African Englishes, the five morphosyntactic structures it discusses on the basis of such a small corpus cannot represent ‘southern Africa’, which, beyond Namibia and South Africa (the two countries referred to in the discussion), includes another six English-speaking countries. In Chapter 1.5, after concisely introducing a ‘600,000-word corpus of written Ghanaian English (GhE) from [ . . . ] 1966 to 1975’, Brato outlines ‘the sociolinguistic and historical evolution of English in Ghana’. One key statement he makes is that ‘Ghana is currently on the verge of moving into the endonormative stabilization phase [ . . . ]’ (p. 123). The chapter’s aim is to produce empirical evidence of what Ghanaian English looked like during the preceding stage, i.e. the nativization phase, and of ‘how GhE has changed in real time’ (p. 134). In Chapter 1.6, Ozón, FitzGerald & Green present a 240,000-word spoken corpus of Cameroon Pidgin English (CPE) and illustrate its potential uses through a set of case studies.