ROBUST TOPIC INFERENCE FOR LATENT SEMANTIC LANGUAGE MODEL ADAPTATION

We perform topic-based, unsupervised language model adaptation in an N-best rescoring framework: previous-pass system hypotheses are used to infer a topic mixture, which in turn selects topic-dependent LMs for interpolation with a topic-independent LM. Our primary focus is on techniques for making topic inference for a given utterance robust to recognition errors, including the use of ASR confidence scores and contextual information from surrounding utterances. We describe a novel application of metadata-based pseudo-story segmentation to language model adaptation, and report solid improvements in character error rate on multi-genre GALE Project data in Mandarin Chinese.
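The core adaptation step can be illustrated with a minimal sketch. This toy example assumes unigram topic LMs, a uniform topic prior, and a fixed interpolation weight; the names (`topic_lms`, `background_lm`, `lam`) and the tiny vocabularies are hypothetical stand-ins, not the paper's actual models, which are large n-gram LMs with inference made robust via ASR confidence and surrounding-utterance context.

```python
import math

# Hypothetical toy topic-dependent LMs and a topic-independent (background) LM.
topic_lms = {
    "sports":  {"game": 0.5, "team": 0.4, "market": 0.1},
    "finance": {"game": 0.1, "team": 0.1, "market": 0.8},
}
background_lm = {"game": 0.3, "team": 0.3, "market": 0.4}

def infer_topic_mixture(words, topic_lms):
    """Posterior topic weights from hypothesis words (unigram likelihoods,
    uniform prior) -- a simplified stand-in for the paper's topic inference."""
    log_liks = {t: sum(math.log(lm[w]) for w in words)
                for t, lm in topic_lms.items()}
    m = max(log_liks.values())
    unnorm = {t: math.exp(v - m) for t, v in log_liks.items()}
    z = sum(unnorm.values())
    return {t: v / z for t, v in unnorm.items()}

def interpolate(word, mixture, topic_lms, background_lm, lam=0.5):
    """Linearly interpolate the mixture of topic LMs with the background LM."""
    topic_p = sum(mixture[t] * topic_lms[t][word] for t in mixture)
    return lam * topic_p + (1 - lam) * background_lm[word]

# Infer a mixture from a previous-pass hypothesis, then score a word with it.
hypothesis = ["market", "market", "team"]
mixture = infer_topic_mixture(hypothesis, topic_lms)
p = interpolate("market", mixture, topic_lms, background_lm)
```

In a rescoring setting, scores like `p` would replace (or combine with) the topic-independent LM scores when re-ranking the N-best list.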