#15  Infrequent Sense Identification for Mandarin Text to Speech Systems 

There are seven cases of grapheme-to-phoneme (GTP) conversion in a text-to-speech (TTS) system (Yarowsky, 1997). The most difficult of these is disambiguating homographs that share the same part of speech (POS) but differ in pronunciation. In this case, different pronunciations of the same word always correspond to different word senses, so once the word senses are disambiguated, the GTP problem is resolved.

This task differs slightly from traditional word sense disambiguation (WSD): two or more senses may correspond to one pronunciation, so the sense granularity is coarser than in WSD. For example, the preposition “为” has three senses: sense1 and sense2 share the pronunciation {wei 4}, while sense3 corresponds to {wei 2}. For each target word, both the pronunciations and the sense labels are provided for training, but only the pronunciations are evaluated at test time. The challenge of this task is the heavily skewed distribution in real text: the most frequent pronunciation usually accounts for over 80% of instances.
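The relation between sense labels and pronunciations can be illustrated with a minimal sketch. The dictionary below encodes only the “为” example given above; the key names and the helper function `pronunciation_correct` are hypothetical and are not part of the task's data format or official tooling.

```python
# Hypothetical sense inventory for the preposition "为", taken from the
# example in the text: two senses share {wei 4}, one sense maps to {wei 2}.
SENSE_TO_PRONUNCIATION = {
    "sense1": "wei4",
    "sense2": "wei4",
    "sense3": "wei2",
}


def pronunciation_correct(predicted_sense, gold_sense,
                          mapping=SENSE_TO_PRONUNCIATION):
    """Return True if the predicted sense yields the gold pronunciation.

    Only pronunciations are compared, so confusing two senses that share
    a pronunciation (sense1 vs. sense2) is not counted as an error.
    """
    return mapping[predicted_sense] == mapping[gold_sense]


# Confusing sense1 with sense2 is harmless for TTS; confusing either of
# them with sense3 changes the pronunciation and counts as an error.
print(pronunciation_correct("sense1", "sense2"))  # True
print(pronunciation_correct("sense1", "sense3"))  # False
```

A system may therefore predict fine-grained senses and map them to pronunciations, or predict pronunciations directly; either approach is scored the same way under this sketch.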

We will provide a large volume of training data (at least 300 instances for each homograph) that follows the true distribution in real text. The test data will contain at least 100 instances for each target word. To focus on performance in identifying the infrequent sense, we will intentionally split the test dataset evenly between infrequent-pronunciation and frequent-pronunciation instances. Evaluation follows the precision vs. recall method.
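As a rough illustration of precision vs. recall evaluation on such a balanced test set, the sketch below scores one pronunciation of one target word. It assumes precision and recall are computed per pronunciation class; the text does not spell out the exact scoring procedure, so the function `precision_recall` and the toy labels are illustrative assumptions, not the official scorer or real task data.

```python
def precision_recall(gold, predicted, target):
    """Compute precision and recall for one pronunciation label.

    gold, predicted: equal-length lists of pronunciation labels.
    target: the pronunciation being scored (e.g. the infrequent one).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    true_pos = sum(1 for g, p in zip(gold, predicted)
                   if g == target and p == target)
    predicted_pos = sum(1 for p in predicted if p == target)
    gold_pos = sum(1 for g in gold if g == target)

    # Define the score as 0.0 when the denominator is empty.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    return precision, recall


# Toy balanced test set: half frequent (wei4), half infrequent (wei2).
gold = ["wei4", "wei4", "wei2", "wei2"]

# A system that always outputs the frequent pronunciation never finds
# the infrequent one, despite being right on half of the instances.
majority = ["wei4", "wei4", "wei4", "wei4"]
print(precision_recall(gold, majority, "wei2"))  # (0.0, 0.0)

# A system that finds one of the two infrequent instances and raises
# one false alarm.
system = ["wei4", "wei2", "wei2", "wei4"]
print(precision_recall(gold, system, "wei2"))  # (0.5, 0.5)
```

The majority-class example shows why the test set is balanced: a system that ignores the infrequent pronunciation would look strong on skewed real-text data but is clearly penalized here.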

All instances come from the People's Daily newspaper (the most popular newspaper in Mandarin). Annotation is performed manually and double-blind, and a third annotator checks the annotations.

Yarowsky, David (1997). “Homograph disambiguation in text-to-speech synthesis.” In van Santen, Jan T. H.; Sproat, Richard; Olive, Joseph P.; and Hirschberg, Julia (eds.), Progress in Speech Synthesis. Springer-Verlag, New York, 157-172.

Organizers: Peng Jin, Yunfang Wu and Shiwen Yu (Peking University, Beijing, China)
  • Test data release: March 25, 2010
  • Result submission deadline: March 29, 2010
  • Organizers send the test results: April 2, 2010

