IMPORTANCE OF DATA MINING IN BIOLOGY


CHAPTER ONE

1.0   INTRODUCTION

1.1     Data Mining

Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large datasets (Adriaans and Zantinge, 1999).

Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information, a definition proposed by Zhu and Cao (2003). The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use (ACM SIGKDD, 2006). Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions, categorize it, and summarize the relationships identified.

Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Data mining consists of more than collecting and managing data; it also includes analysis and prediction. Wilson et al. (2003) refer to data mining as the utilization of statistical techniques within the knowledge discovery process. Data mining is used for a variety of purposes in both the private and public sectors. In the public sector, data mining applications were initially used as a means to detect fraud and waste. It has been reported that data mining has helped the Federal Government recover millions of dollars in fraudulent Medicare payments (Cashlink, 2000). Another example is the Federal Aviation Administration, which uses data mining to review plane crash data to recognize common defects and recommend precautionary measures.
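The idea of finding correlations among fields can be illustrated with a minimal sketch. It assumes a small table held in memory as one list per field and uses the Pearson correlation coefficient as the measure of association; the field names, the values, and the 0.9 threshold are all made up for illustration and do not come from any of the sources cited above.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_fields(table, threshold=0.9):
    """Return (field_a, field_b, r) for every pair of fields with |r| >= threshold."""
    found = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            found.append((a, b, round(r, 3)))
    return found


# Hypothetical records: each key is a field, each list holds one value per record.
records = {
    "claim_amount": [120.0, 340.0, 560.0, 780.0, 910.0],
    "visits": [1, 3, 5, 7, 8],
    "patient_age": [34, 71, 25, 52, 46],
}

pairs = correlated_fields(records)
print(pairs)
```

In this toy table only the claim_amount and visits fields move together strongly enough to be reported. A real data mining tool would run the same kind of pairwise search over many more fields and records, usually inside the database itself.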

1.2     Types of Data Mining

Two types of data mining are considered here: classification and clustering.

Classification

Classification is a learning function that maps (classifies) a data item into one of several predefined classes (Weiss and Kulikowski, 1991). Examples of classification methods used as part of knowledge discovery applications include the classification of trends in financial markets and the automated identification of objects of interest in large image databases.
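As a rough illustration of mapping a data item into one of several predefined classes, the sketch below uses a nearest-centroid classifier. This is only one simple technique chosen for brevity, not a method prescribed by Weiss and Kulikowski; the class labels "low" and "high" and the training values are hypothetical.

```python
from math import dist


def train_centroids(samples):
    """Learn one centroid per class from (features, label) training pairs."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    # The centroid of a class is the per-dimension mean of its training items.
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, item):
    """Map a data item to the predefined class whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], item))


# Hypothetical labelled training data with two predefined classes.
training = [
    ((1.0, 1.2), "low"),
    ((0.8, 1.0), "low"),
    ((4.9, 5.1), "high"),
    ((5.2, 4.8), "high"),
]

centroids = train_centroids(training)
label = classify(centroids, (4.5, 5.0))
print(label)
```

The classes are fixed by the labelled training data before any new item is seen, which is what distinguishes classification from the clustering task.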

Clustering    

Clustering is a common descriptive task where one seeks to identify a finite set of categories or clusters to describe the data. The categories can be mutually exclusive and exhaustive or consist of a richer representation, such as hierarchical or overlapping categories. Examples of clustering applications in a knowledge discovery context include discovering homogeneous subpopulations for consumers in marketing databases and identifying subcategories of spectra from infrared sky measurements (Wilson et al., 2003).
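The mutually exclusive and exhaustive case described above can be sketched with k-means, one widely used clustering algorithm. The sketch is a simplified version under stated assumptions: the number of clusters k is chosen in advance, the first k points serve as the starting centroids (a naive choice made here so that the run is repeatable), and the points themselves are invented for illustration.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Partition points into k clusters; returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: every point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(column) / len(cluster) for column in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so stop early
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabelled measurements forming two visibly separate groups.
points = [
    (1.0, 1.0), (9.0, 9.0), (1.2, 0.8),
    (8.8, 9.2), (0.9, 1.1), (9.1, 8.9),
]

centroids, clusters = k_means(points, k=2)
print(clusters)
```

Unlike classification, no labels are supplied: the groups emerge from the data alone. Every point ends up in exactly one cluster, so the result is mutually exclusive and exhaustive; hierarchical or overlapping categories would need a different algorithm.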
