"Professional Corpora”: Teaching Strategies for Work with Online Documentation, Translation Memories and Content Management

The expansion of electronic memory capacity is having fundamental long-term effects on the way texts are produced and used, and thus on the way they are translated. Translators are increasingly working on databases in non-linear ways, separated from awareness of any active communicative context. This enhances productivity and consistency but challenges more humanistic values like understanding, cooperation, and job satisfaction. In order to address these changes, teaching practices should 1) make students aware of the communicative functions of texts, particularly the ways in which particular parts of texts become high-risk in particular situations, 2) teach students how to use electronic technologies within such a frame, and how to teach themselves about the technologies, and 3) train students for a range of professional communication jobs, incorporating both the technical sides and the various revision and editing techniques now required by the technologies.

Do not worry: there is no universal revolution this week. Translation is still what it has always been, more or less. In some particular fields, however, and indeed in a widening circle of fields, a series of memory-based technologies are fundamentally altering the way translations are produced, and thus the way translators need to be trained. Here we shall focus on those particular changes, in those particular sectors, with the rider that some of the novelties may be coming your way soon.

Where are the texts?

The most basic of these changes concerns the kind of linguistic material that translators work on when dealing with websites, software programs, and product documentation. There are still, no doubt, texts with a beginning, a middle and an end, of the kind Aristotle approved of. But much translational work is now carried out on sets of linguistic data, with the help of further sets of linguistic data, and with no visible beginning, middle or end. For example, a translator may be required to locate and translate a series of updates to a website. They will render just those updates, without necessarily seeing the entire website as any kind of text, and quite commonly not as any kind of image either. Or again, translators may work in a team using an online translation-memory system to render hundreds of pages of technical documentation for an impossible deadline: only the project manager, at best, will see the text as a whole (or indeed, as a project); the translators themselves will see only a series of small unconnected parts, like foot-soldiers in a battle.

As these examples suggest, the conceptual and cognitive changes are due to technology on two levels. On the one hand, the source material is being generated by piecing together fragments, often for users who are only going to use fragments (think of all the user manuals and software Help files, all produced and read through indexes). On the other, translation memories and content-management systems divide linguistic material into phrases and chunks, at cohesive levels much lower than anything traditionally called a text (usually at the paragraph or section levels). This fundamental change, the absence of initial textuality, underlies all the rest. People no longer use such contents in a linear way, starting at the beginning and reading through to the end. So the documents are not written in a linear way, and they are certainly not translated in a linear way. We are all working on memorized chunks, and on updates to memorized chunks.
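To make the idea of memorized chunks concrete, the following is a minimal sketch, in Python, of how a translation-memory workflow treats material. It is purely illustrative and not modelled on any particular commercial tool: the segmentation rule, the class name and the function names are assumptions made for the example. The source is split into segments, each segment is stored as a paired unit, and a later update requires fresh translation only for the segments the memory has not yet seen.

    # Minimal translation-memory sketch (hypothetical names; not any real tool's API).
    import re

    class TranslationMemory:
        """Stores source-target segment pairs and reuses them on later jobs."""

        def __init__(self):
            self.pairs = {}  # memorized source segment -> target segment

        def segment(self, text):
            # Naive split on sentence-final punctuation; real tools use configurable
            # rules, but the principle is the same: chunks, not whole texts.
            return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

        def translate(self, text, render):
            """Reuse stored pairs; call the translator only for unseen segments."""
            output = []
            for seg in self.segment(text):
                if seg not in self.pairs:          # only the update is new work
                    self.pairs[seg] = render(seg)  # the new pair is memorized
                output.append(self.pairs[seg])
            return output

    # The second job reuses everything except the one changed sentence.
    tm = TranslationMemory()
    tm.translate("Press the red button. Wait ten seconds.", lambda s: "[target] " + s)
    tm.translate("Press the red button. Wait twenty seconds.", lambda s: "[target] " + s)

A translator working inside such a loop sees only the segments passed to the rendering step, never the document as a whole, which is precisely the loss of initial textuality described above.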
In semiotic terms, the paradigmatic has drawn and quartered the syntagmatic. It took us a few decades to realize that translators work not on sentences but on texts, much to the chagrin of phrase-level linguists in search of equivalence. It then took us the 1990s to see that documents usually come with useful information on deadlines, quality required, readerships and rates of pay, so that what we actually work on are not just texts but all those things bundled into “projects”. And now another decade or so has been necessary for us to see that technology has changed translation activities into something like maintenance operations, of the kind you use when you have your car serviced. When we only translate the updates, when we do not see the beginning or end of the communication act, our work has moved from the “project” to the “program” (yes, like a car-service program). Linearity is no longer in texts, but in the possibility of keeping maintenance contracts over time. The segments, at whatever level, are universally memorized (memory is what our technologies work on) and paired across languages. Does that mean some kind of return to phrase-level equivalence? In many respects, yes, except that equivalence is now fixed by company convention, with little disturbance from natural usage. What we work from is increasingly not contextualized language, with its social connotations and the like. What we access and apply are sets of data, lists of linguistic material, or what are elsewhere known as corpora.

Where is the information?

Translators still need to know languages and cultures. In highly technical fields, however, good documentation skills, good organization skills, and some basic common sense can often replace developed language competence. For example, I occasionally have to render legalistic documents into Catalan, a language that I do not know well. This mostly concerns university regulations, suggested modifications to university regulations, and explanations of why I do not apply university regulations correctly – usually in my devious attempts to get foreign students enrolled. How do I locate the correct Catalan terms and phrases? Obviously I look at selected parallel texts on websites (texts in Catalan with the same or similar discursive function); I also look closely at the texts that are sent to me (mainly the university regulations themselves, and the official complaints about my non-applications); I might very occasionally check an online bilingual legal glossary; I learn a lot from my Catalan spell-checker; and finally, depending on the quality required, I learn from revision by a native speaker. On a bad day I might write a text in Spanish, a language I do know fairly well, then get a web-based machine translation into Catalan, which gives remarkably good results, and then run all or some of the above checks. Of course, when I do the translation with a translation memory, all future efforts draw on the matching phrases I produce. With all those modes of assistance, I still may not know much about the language or the culture, but I can certainly write and rewrite some effective official letters in Catalan.

My main point here is that all those processes draw on corpora of one kind or another. The parallel texts form a corpus, as do the glossaries and dictionaries and spell-checkers, and the translation memories, and perhaps even the knowledge stored in my reviser’s brain.
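The way such lists propose material can be sketched in the same illustrative terms. Assuming a small store of previously produced segment pairs (the example sentences, the rough Catalan renditions and the 0.75 threshold are all invented for the illustration), a fuzzy lookup returns the closest memorized phrase together with its stored rendition, which is the kind of proposal the translator then accepts, edits or discards.

    # Illustrative fuzzy lookup over a "professional corpus" of memorized segment pairs.
    from difflib import SequenceMatcher

    corpus = {
        "The committee approves the proposed modification.":
            "La comissió aprova la modificació proposada.",
        "Foreign students must submit the enrolment form.":
            "Els estudiants estrangers han de presentar el full de matrícula.",
    }

    def best_match(segment, pairs, threshold=0.75):
        """Return (stored source, stored target, score) for the closest pair, or None."""
        scored = [
            (src, tgt, SequenceMatcher(None, segment.lower(), src.lower()).ratio())
            for src, tgt in pairs.items()
        ]
        src, tgt, score = max(scored, key=lambda item: item[2])
        return (src, tgt, score) if score >= threshold else None

    # A near-identical update retrieves the memorized rendition as a proposal.
    print(best_match("The committee approves the proposed modifications.", corpus))

Whether the proposal is accepted is still a human decision, but the decision is made against a list rather than against a text.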
These are certainly not corpora in the sense used by mainstream Corpus Linguistics: the lists of data do not seek to represent a whole national language or generalist patterns of usage; the use of those lists does not require any systematic mining for terminology or phraseology; there is no sophisticated knowledge management at stake, and I am not about to suggest there should be (professional usage tends toward efficient knowledge management anyway). When we use those corpora, we are simply writing (or rewriting, or translating) in a way that works from lists rather than from texts alone. We are using what I want to term “professional corpora”. In order to use those materials, we need skills in addition to those associated with the use of languages and the weaving of texts in cultures. Those skills are what we now have to identify and somehow teach.

What is translation competence?

My second point is no less important, but it is harder to explain. Some years ago I proposed that translation competence properly involves solving problems to which there is more than one correct solution (Pym 1992). When there is just one solution (French “faire un discours” is English “make a speech”), then we are applying terminology, or phraseology, or whatever cultural authority legitimates the equivalence. However, there are many situations in which more than one solution is viable (French “élaborer un discours” might be “develop a speech”, “develop a discourse”, “elaborate a discourse”, and much more, depending on a hundred subtle semantic and rhetorical factors). In those cases, I proposed, properly translational skills were required. The questions of right vs. wrong (“binary” problems) were for language-learning and electronic memories; the problems with more than one solution (“non-binary” problems) were for translation competence. Ours were the skills that paid close attention to textuality, that adapted messages to new purposes, that enacted the interplay of cultures. That is how I sought to define the line between the training of translators and the learning of languages.

That theory enjoyed a certain success in its day. Yet I am now compelled to reconsider it. If we look at the many professional corpora now instantaneously available, can we really pretend that their use is not part of what translators do? Do we really want to say that translators somehow become more “translatory”, as it were, the more they work on the more-than-one, in effect the more they use and re-use solution-proposing technologies? That is not a very happy theory. It could even condemn translators to produce multiple alternative renditions, wasting time and in many cases reducing the quality of the outputs: Lorenzo (2002) finds that the longer student translators go over and revise their work, the lower the quality becomes. On that view, the more translatory the translator, the less efficient the translation process. And that is certainly no longer where the solutions lie. In the pre-Internet era, which was when I star