The goal of this task is to provide a framework for evaluating systems for cross-lingual lexical substitution. Given a paragraph and a target word, the goal is to provide several correct translations for that word in a given language, with the constraint that the translations fit the given context in the source language. This task is a cross-lingual follow-up to the English lexical substitution task from SemEval-2007 (McCarthy and Navigli, 2007).

While this task is related to machine translation, there are several major differences. First, cross-lingual lexical substitution targets one word at a time, rather than an entire sentence as machine translation does. Second, in cross-lingual lexical substitution we seek as many good translations as possible for the given target word, as opposed to the single translation that machine translation typically outputs. This task is also related to word sense disambiguation tasks that use distinctions in translations as word senses (Resnik and Yarowsky, 1997). However, in this task we do not restrict the translations to those found in a specific parallel corpus; the annotators and systems are free to choose translations from any available resource. Also, we do not assume a fixed grouping of translations into "senses", so any token instance of a word may share translations with other token instances that are not themselves directly related.

We will use English as the source language and Spanish as the target language: given an English paragraph and a target word in it, the task is to provide several correct Spanish translations for that word.
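As a rough illustration of the input and output described above, the following minimal Python sketch models one test item and one system answer. The class names, field names, the `is_well_formed` check, and the example sentence are all hypothetical and chosen only for illustration; they are not the official data format of the task.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """One item: an English context and the target word inside it (hypothetical layout)."""
    context: str  # English paragraph or sentence containing the target
    target: str   # English target word to be translated


@dataclass
class Answer:
    """A system's output: several Spanish translations meant to fit the context."""
    instance: Instance
    translations: List[str] = field(default_factory=list)


def is_well_formed(answer: Answer) -> bool:
    """Basic sanity check, not the task's scoring: the target must occur in the
    context, and the answer must list at least one translation, with no empty
    strings and no duplicates."""
    inst = answer.instance
    if inst.target.lower() not in inst.context.lower():
        return False
    cleaned = [t.strip().lower() for t in answer.translations]
    if not cleaned or any(t == "" for t in cleaned):
        return False
    return len(set(cleaned)) == len(cleaned)


# Illustrative example only (not taken from the task data).
item = Instance(context="She is a bright student.", target="bright")
answer = Answer(item, ["inteligente", "brillante", "listo"])
```

Here the answer lists several Spanish candidates for "bright" that fit this particular context, which is the kind of output the task asks for; a different context, such as a bright light, would call for a different set of translations.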


Organizers: Rada Mihalcea (University of North Texas), Diana McCarthy (University of Sussex), Ravi Sinha (University of North Texas)
Web Site: http://lit.csci.unt.edu/index.php/Semeval_2010

  • Test data availability: 1 March - 2 April, 2010
  • Result submission deadline: within 7 days after downloading the *test* data.
  • Closing competition for this task: 2 April

