Joseph Olive, Caitlin Christianson, John McCary (eds.): Handbook of Natural Language Processing and Machine Translation

The Handbook of Natural Language Processing and Machine Translation is a 936-page, two-and-a-half-kilo hardcover heavyweight outlining the objectives, results and achievements of the GALE program (Global Autonomous Language Exploitation), funded by DARPA (Defense Advanced Research Projects Agency). It is the final public account of the work of three GALE teams on clearly defined language processing challenges, stemming from the practical objective of providing end-users with accurate and timely information relevant to the fulfilment of their duties. In the DARPA context, the end-users are mainly English-speaking military staff or intelligence analysts. The GALE task framework is strikingly simple: transcribe, translate, distil, present. The first task is to collect large amounts of training and evaluation material: text and speech in Arabic and Chinese from pre-defined source types (web, newswire, broadcast news and discussions). The next task is the translation of content and the extraction of relevant facts, which are then presented to the end-user in clear and concise English.

The book starts with an interesting introduction describing the history and background of DARPA language programs in general and GALE in particular. The titles of the subsequent chapters reflect the main phases of the GALE challenge: data collection, translation of text, translation of speech, distillation (meaning query-based fact-finding from text), evaluation, and the building of operational systems for end-users. Each chapter starts with an editorial introduction, followed by thematically grouped articles, which in most cases look like self-contained scientific papers. A few of these have been published earlier in journals and conference proceedings and appear in the book in an adapted or updated version. For many of the topics and articles, however, it is not possible to find any trace of previous publication, at least through a web search.