In a recent paper advocating a corpus-based and probabilistic approach to grammar development, Black, Lafferty, and Roukos (1992) argue that “the current state of the art is far from being able to produce a robust parser of general English” and advocate “steady and quantifiable,” empirically corpus-driven grammar development and testing. Black et al. are addressing a community in which armchair introspection has been and still is the dominant methodology in many quarters, but in some parts of Europe, corpus linguistics never died. For nearly two decades, the Nijmegen group led by Jan Aarts have been undertaking corpus analyses that, although motivated primarily by the desire to study language variation using corpus data, are particularly relevant to the issue of broad-coverage grammar development. In distinction to other groups undertaking corpus-based work (e.g., Garside, Leech, and Sampson 1987), the Nijmegen group has consistently adopted the position that it is possible and desirable to develop a formal, generative grammar that characterizes the syntactic properties of a given corpus and can be used to assign appropriate analyses to each of its sentences. Nelleke Oostdijk’s book provides a detailed description of the cumulative development of a grammar capable of analyzing a one million-word corpus of English written texts, drawn from a wide but balanced variety of sources. This task forms a significant component of the wider Tools for Syntactic Corpus Analysis (TOSCA) project being undertaken at Nijmegen. Oostdijk’s work provides an excellent example of the strengths and weaknesses of the approach advocated by Black et al. In addition, she discusses issues such as sampling and tokenization of corpus material, as well as the exploitation of the analyzed corpus in studies of language variation. However, in this review I will concentrate on the central core of her book: the development of the grammar and performance of the associated parser, since this is the part that is most relevant to computational linguistics. Oostdijk begins by locating her work and the TOSCA project within the field of computational linguistics (arguing that it is distinguished by “an interest in language itself as it is actually produced” (p. 2)) and contrasting it to the LSP system (Sager 1981) and Parsifal (Marcus 1980). The comparison is brief and the choice odd since more general broad-coverage grammars, such as DIAGRAM (Robinson 1982), PEG (Jensen et al. 1986) and ANLT (Grover et al. 1989), and more corpus-oriented parsing systems, such as FIDDITCH (Hindle 1983, 1993) or MITFP (de Marcken 1990), have been developed within the field, but are not discussed anywhere. A similar suspicion of isolationism recurs in the sections dealing with the grammatical formalism used;Â
PLACE YOUR ADVERT HERE
- ACCOUNTING PROJECT TOPICS AND MATERIALS3553
- EDUCATION PROJECT TOPICS AND MATERIALS3486
- ENGLISH AND LINGUISTIC PROJECT TOPICS AND MATERIALS2939
- COMPUTER SCIENCE PROJECT TOPICS AND MATERIALS FINAL YEAR1274
- BANKING AND FINANCE PROJECT TOPICS AND MATERIALS1250
- BUSINESS ADMINISTRATION PROJECT TOPICS AND MATERIALS1236
- EDUCATION FOUNDATION GUIDANCE AND COUNSELLING TOPICS AND MATERIALS1045
- ZOOLOGY PROJECT TOPICS AND MATERIALS1002
- MASS COMMUNICATION PROJECT TOPICS AND MATERIALS1001
- ANIMAL SCIENCE PROJECT TOPICS AND MATERIALS978
- LAW PROJECT TOPICS AND MATERIALS896
- ARTS EDUCATION PROJECT TOPICS AND MATERIALS844
- MARKETING PROJECT TOPICS AND MATERIALS690
- AGRICULTURAL EXTENSION PROJECT TOPICS AND MATERIALS676
- PUBLIC ADMINISTRATION PROJECT TOPICS AND MATERIALS654
LATEST PROJECTS
STUDIES ON SOME ASPECTS OF ANTHRACNOSE-BLIGHT-DIEBACK COMPLEX OF CULTIVARS OF GRAPEVINES (VITIS SPP.) IN...
GENETIC VARIABILITY STUDIES OF TWENTY POTATO GENOTYPES
RELATIONSHIP OF HAEMOGLOBIN AND POTASSIUM POLYMORPHISM WITH CONFORMATION, MILK PRODUCTION AND BLOOD BIOCHEMICAL PROFILES...
ADOPTION OF AGRICULTURAL INNOVATIONS AMONG MEMBERS AND NON-MEMBERS OF WOMEN CO-OPERATIVE SOCIETIES IN OJU...
SMALL FARMER CREDIT WITH PARTICULAR REFERENCE TO NIGERIA
DISCLAIMER
All undertaking works, records and reports posted on this website, modishproject.com are the property/copyright of their individual proprietors. They are for research reference/direction purposes and the works are publicly supported. Do not present another person’s work as your own to maintain a strategic distance from counterfeiting its results. Use it as a guide and not to duplicate the work in exactly the same words (verbatim). modishproject.com is a vault of exploration works simply like academia.edu, researchgate.net, scribd.com, docsity.com, coursehero and numerous different stages where clients transfer works. The paid membership on modishproject.com is a method by which the site is kept up to help Open Education. In the event that you see your work posted here, and you need it to be eliminated/credited, it would be ideal if you call us on +2348053692035 or send us a mail along with the web address linked to the work, to [email protected]. We will answer to and honor each solicitation. Kindly note notification it might take up to 24 - 48 hours to handle your solicitation.