Evaluation Exercises on Semantic Evaluation - ACL SigLex event
#17 All-words Word Sense Disambiguation on a Specific Domain (WSD-domain)
Description
Domain adaptation is a hot issue in Natural Language Processing, including Word Sense Disambiguation (WSD). WSD systems trained on general corpora are known to perform worse when moved to specific domains. The WSD-domain task will offer a testbed for domain-specific WSD systems and will make it possible to test domain portability issues.
Texts from ECNC and WWF will be used to build domain-specific test corpora (see example below). The data will be available in a number of languages: English, Dutch and Italian, and possibly Basque and Chinese (confirmation pending). The sense inventories will be based on the wordnets of the respective languages.
The test data will comprise three documents (a 6000-word chunk with approx. 2000 target words) for each language. The test data will be annotated by hand using double-blind annotation plus adjudication, and Inter-Tagger Agreement will be measured. No training data will be provided, but participants are free to use existing hand-tagged corpora and lexical resources. Participant systems will be evaluated with the traditional precision and recall measures, as implemented in past Senseval and SemEval WSD tasks.
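As a rough illustration of the precision and recall measures mentioned above, the following minimal Python sketch assumes the usual WSD convention: precision is the share of attempted instances that are answered correctly, and recall is the share of all gold instances that are answered correctly. The function name `score_wsd`, the instance identifiers and the sense labels are hypothetical, and the sketch assumes one answer per instance; it is not the official task scorer.

```python
def score_wsd(gold, predicted):
    """Return (precision, recall) for a set of WSD answers.

    gold:      dict mapping instance id -> set of acceptable sense labels
    predicted: dict mapping instance id -> a single sense label;
               instances the system skipped are simply absent
    """
    # Only answers for instances that exist in the gold standard count.
    attempted = [inst for inst in predicted if inst in gold]
    correct = sum(1 for inst in attempted if predicted[inst] in gold[inst])

    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: four target words, the system answers three.
gold = {
    "d001.t001": {"sense_a"},
    "d001.t002": {"sense_b", "sense_c"},  # two acceptable senses
    "d001.t003": {"sense_d"},
    "d001.t004": {"sense_e"},
}
predicted = {
    "d001.t001": "sense_a",  # correct
    "d001.t002": "sense_c",  # correct (matches one acceptable sense)
    "d001.t003": "sense_x",  # wrong
}

precision, recall = score_wsd(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A system that answers every target word obtains identical precision and recall under this definition; the two figures diverge only when some instances are left unanswered.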
Environment domain text example:
"Projections for 2100 suggest that temperature in Europe will have risen by between 2 to 6.3 °C above 1990 levels. The sea level is projected to rise, and a greater frequency and intensity of extreme weather events are expected. Even if emissions of greenhouse gases stop today, these changes would continue for many decades and in the case of sea level for centuries. This is due to the historical build up of the gases in the atmosphere and time lags in the response of climatic and oceanic systems to changes in the atmospheric concentration of the gases."