Topic tracking for radio, TV broadcast, and newswire

We present our tracking system for the 1998 Topic Detection and Tracking project (TDT-2). The project covers multiple sources of information, in both text and speech, drawn from newswire and from radio and television news broadcasts. The technical challenge of TDT-2 tracking is to follow the topics discussed in stories from these multiple sources. Our tracking system is probability-based, and it solves the problem of score normalization across topics. Our automatic score normalization is simple, efficient, and very effective. Tested on the 20K TDT-2 stories collected between March and April 1998, our tracking system achieves a 1.5% miss error rate (on a combination of closed captions and newswire) and a 3.0% miss error rate (on a combination of automatic speech recognition output and newswire) at a false alarm error rate of 0.1%. In the 1998 TDT-2 evaluation, our tracking system ranked best, with an official topic-weighted Ctrack of 0.0057.
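To make the relationship between miss rate, false alarm rate, and a tracking cost concrete, the sketch below combines the two error rates into a single number. The cost form and its constants (`c_miss`, `c_fa`, `p_topic`) are assumptions taken from the usual TDT detection-cost convention; they are not stated in this abstract. The function name `tracking_cost` is hypothetical. The result is a pooled figure at one operating point, so it should not be expected to reproduce the official topic-weighted Ctrack of 0.0057.

```python
def tracking_cost(p_miss, p_fa, c_miss=1.0, c_fa=1.0, p_topic=0.02):
    """Combine miss and false alarm rates into a single cost.

    ASSUMPTION: cost = c_miss * p_miss * p_topic + c_fa * p_fa * (1 - p_topic),
    with c_miss, c_fa, and p_topic set to commonly cited TDT-style values.
    None of these constants come from the abstract itself.
    """
    return c_miss * p_miss * p_topic + c_fa * p_fa * (1.0 - p_topic)


# Error rates reported in the abstract, expressed as fractions.
cost_cc_nwt = tracking_cost(p_miss=0.015, p_fa=0.001)   # closed captions + newswire
cost_asr_nwt = tracking_cost(p_miss=0.030, p_fa=0.001)  # ASR output + newswire

print(f"closed captions + newswire: {cost_cc_nwt:.5f}")
print(f"ASR output + newswire:      {cost_asr_nwt:.5f}")
```

Under these assumed constants, the small topic prior means the false alarm term dominates, so doubling the miss rate from 1.5% to 3.0% raises the cost only modestly.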