ONLINE INFORMATION DATABASE ARCHIVES

CHAPTER ONE

1.0              INTRODUCTION

The number of documents in an online information database system usually grows rapidly over time. Storing, managing and searching these documents within the system is a challenging problem. Documents in an online information database system are stored as semi-structured data, whereas a traditional relational database stores structured data. A relational database management system cannot manage semi-structured data efficiently and cannot satisfy the requirements of content-based text retrieval.

Much research has been done on semi-structured data, covering data modeling, query languages for text retrieval, index methods, text retrieval algorithms and similarity search algorithms. The results are widely used in online information database systems. The SSREADER online information database system, the national online information database system and the Wanfang database are popular referencing systems in China. All of these systems classify documents into several classes and support querying inside a given class. Both metadata search and full-text search, through a single keyword or an expression, are supported.

Other examples of referencing systems are the Greenstone, UC Berkeley, Tufts and ACM online information database systems, and NCSTRL. These systems support similar functions, such as metadata searching, full-text searching, document classification and browsing. The Greenstone system has a suite of software that provides management tools for creating and maintaining an online information database system. The Tufts system is for the integration of collections that exist or may be developed in the future. Lore, a system developed at Stanford, is a database management system for managing semi-structured data. The NCSTRL at Cornell University is a distributed technical report referencing system developed by the ARPA-sponsored Computer Science Technical Report project. The NCSTRL collection is distributed among a set of interoperating servers operated by participating institutions.

None of the referencing systems described above supports the following functions: structure and content-based queries, automatic entry of external documents and parallel document processing.

The OIDS system described in this study has the following features:

(1) Generalization. OIDS is essentially a general document database management system. It can be used to build a referencing system for user needs and provides a suite of tools to maintain it.

(2) Parallelism. OIDS uses multiple processors to execute queries and manage documents, which improves both storage capacity and query efficiency.

(3) Structure and content-based retrieval. Users can query inside a document for a single element, e.g. a chapter of a book. This allows users to pose more accurate queries and also reduces the amount of information transmitted over the network.

(4) Personalization. OIDS can query according to a user's interests and recommend documents relevant to that user.

(5) Automatic external data entry. OIDS can work together with other search engines to find and add references automatically.

(6) Multi-format support. OIDS collects many document resources, including books, journal papers and proceedings, and supports document information retrieval for many document formats.

(7) DLSQL query. OIDS defines a query language similar to standard SQL, named DLSQL. By using DLSQL, users can program and perform all the operations in OIDS.

(8) Automatic document classification. OIDS creates a classifier from the sample documents loaded by the system manager and automatically classifies documents.
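Feature (3) can be illustrated with a minimal Python sketch. It is not the OIDS implementation: the `Element` class and the `find_elements` function are hypothetical names introduced here, and the sketch assumes a document is held in memory as a tree of typed elements. It only shows the idea of returning a matching element (a chapter) rather than the whole document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """One node of a semi-structured document (hypothetical model)."""
    kind: str                      # e.g. "book", "chapter", "section"
    title: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)


def find_elements(root, kind, keyword):
    """Return elements of the given kind whose own title or text mentions keyword."""
    keyword = keyword.lower()
    matches = []
    stack = [root]
    while stack:
        node = stack.pop()
        # Structure condition (kind) combined with content condition (keyword).
        if node.kind == kind and keyword in (node.title + " " + node.text).lower():
            matches.append(node)
        # Push children reversed so they are visited in document order.
        stack.extend(reversed(node.children))
    return matches


book = Element("book", "Database Systems", children=[
    Element("chapter", "Relational Model", "Tables, keys and SQL."),
    Element("chapter", "Semi-structured Data", "Trees, paths and text retrieval."),
])

hits = find_elements(book, "chapter", "retrieval")
print([h.title for h in hits])
```

Because only the matching chapter is returned, a client would receive one element instead of the full book, which is the reduction in transmitted information that feature (3) describes.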

A method for cross referencing material in a reference work having a plurality of portions therein, comprising the following steps:

(a) providing a reference work including a plurality of major sections, a plurality of minor sections within at least two of the major sections, and a plurality of instructional steps within at least two of the minor sections;

(b) further providing a first series of sequential numbers for referencing each of the major sections, with each of the numbers of the first series corresponding to one of the major sections in sequential order;

(c) further providing a second series of sequential numbers for referencing each of the minor sections, with each of the numbers of the second series corresponding to one of the minor sections in sequential order;

(d) further providing a series of sequential letters for referencing each of the instructional steps, with each of the letters of the instructional steps corresponding to one of the instructional steps in sequential order;

(e) indicating a specific major section, minor section, and instructional step from a first portion of the reference work in a second portion of the reference work, by placing one of the first series of numbers, one of the second series of numbers, and one of the series of letters referring to the major section, minor section, and instructional step of the first portion of the reference work, in the second portion of the reference work, thereby providing backward cross referencing in the reference work; and

(f) indicating a specific major section, minor section, and instructional step from the second portion of the reference work in the first portion of the reference work, by placing one of the first series of numbers, one of the second series of numbers, and one of the series of letters referring to the major section, minor section, and instructional step of the second portion of the reference work, in the first portion of the reference work, thereby providing both forward and backward cross referencing in the reference work.
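Steps (a) to (f) can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed implementation: the label layout `major.minor(letter)`, the function `make_label` and the class `CrossReferences` are all hypothetical, since the method above only requires two series of numbers and one series of letters.

```python
from string import ascii_lowercase


def make_label(major, minor, step):
    """Build a label from major/minor section numbers and a 1-based step index."""
    if not 1 <= step <= len(ascii_lowercase):
        raise ValueError("step must be between 1 and 26")
    # Assumed layout: numbers for sections, a letter for the instructional step.
    return f"{major}.{minor}({ascii_lowercase[step - 1]})"


class CrossReferences:
    """Records which labels are cited inside each portion of the work."""

    def __init__(self):
        self._refs = {}

    def link(self, first, second):
        # Step (e): the first portion's label is placed in the second portion.
        self._refs.setdefault(second, set()).add(first)
        # Step (f): the second portion's label is placed in the first portion.
        self._refs.setdefault(first, set()).add(second)

    def references_in(self, label):
        """Return the labels cited inside the portion identified by label."""
        return sorted(self._refs.get(label, ()))


earlier = make_label(1, 2, 3)   # major section 1, minor section 2, step c
later = make_label(4, 1, 1)     # major section 4, minor section 1, step a

xrefs = CrossReferences()
xrefs.link(earlier, later)
print(xrefs.references_in(later))    # backward reference to the earlier step
print(xrefs.references_in(earlier))  # forward reference to the later step
```

Storing the link in both directions is what gives a reader of either portion a pointer to the other, matching the forward and backward cross referencing named in step (f).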

1.1              STATEMENT OF PROBLEM

Owing to:

The difficulties people and staff face in locating materials,

The unwilling attitude of some staff when dealing with data/information,

The difficulties people encounter when searching for a given book title,

The time wasted in searching for books on shelves,

The time wasted in arranging books on shelves, and

The importance of a referencing method to the academic growth of any learning institution,

the need arises for the development of an online information database system for the national archives.
