Visualization of Text Document Corpus

Visualization is commonly used in data analysis to give the user an initial idea of the raw data and a visual representation of the regularities found in the analysis. Similarly, when the data consists of text documents, visualizing the document corpus can be very useful for automated text processing. From the point of view of automated text processing, natural language is highly redundant, in the sense that many different words share a common or similar meaning. For a computer, this can be hard to understand without some background knowledge. We describe an approach to visualizing a text document collection based on methods from linear algebra. We apply Latent Semantic Indexing (LSI) as a technique that helps extract some of this background knowledge from a corpus of text documents. This can also be viewed as the extraction of hidden semantic concepts from text documents. Such visualization can be very helpful in data analysis, for instance for finding the main topics that appear in larger sets of documents. Extracting the main concepts from documents with techniques such as LSI can make the resulting visualizations more useful. For example, given a set of descriptions of European research projects (6FP), one can find the main areas that these projects cover, including semantic web, e-learning, security, etc. In this paper we describe a method for visualizing a document corpus based on LSI, present the system implementing it, and give results of using the system on several datasets.
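The abstract does not spell out the computation, so the following is only a minimal sketch of how LSI can place documents in a low-dimensional space suitable for plotting. It assumes whitespace tokenization, TF-IDF weighting, and a truncated singular value decomposition computed with NumPy; the function name `lsi_embed`, its parameters, and the toy corpus are hypothetical and are not taken from the system described in the paper.

```python
import numpy as np


def lsi_embed(docs, k=2):
    """Project documents into a k-dimensional latent space (illustrative sketch)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document count matrix: rows are terms, columns are documents.
    counts = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            counts[index[w], j] += 1

    # TF-IDF weighting (an assumption): terms found in few documents weigh more.
    doc_freq = np.count_nonzero(counts, axis=1)
    idf = np.log(len(docs) / doc_freq)
    tfidf = counts * idf[:, None]

    # Scale each document vector to unit length; leave all-zero columns untouched.
    norms = np.linalg.norm(tfidf, axis=0)
    norms[norms == 0] = 1.0
    tfidf = tfidf / norms

    # Truncated SVD: keep only the k strongest latent "concepts".
    U, s, Vt = np.linalg.svd(tfidf, full_matrices=False)
    k = min(k, len(s))
    coords = (s[:k, None] * Vt[:k]).T  # one row of k coordinates per document
    concepts = U[:, :k]                # term loadings for each latent concept
    return coords, concepts, vocab


# Hypothetical toy corpus echoing the topic areas named in the abstract.
docs = [
    "semantic web ontology reasoning",
    "ontology languages for the semantic web",
    "e-learning platforms for schools",
    "adaptive e-learning for schools and universities",
]
coords, concepts, vocab = lsi_embed(docs, k=2)
```

With `k=2`, each row of `coords` can be used directly as a point in a scatter plot, and the largest-magnitude entries in each column of `concepts` indicate which terms characterize a latent topic. The signs of the singular vectors are arbitrary, so only relative positions and distances between documents are meaningful.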
