
SemEval-2

Evaluation Exercises on Semantic Evaluation - ACL SigLex event

2010

TASKS


#5  Automatic Keyphrase Extraction from Scientific Articles 

Description
Keyphrases are phrases that capture the main topics of a document. Because keyphrases represent a document's key ideas, extracting good keyphrases benefits various natural language processing (NLP) applications, such as summarization, information retrieval (IR) and question answering (QA). In summarization, keyphrases can be used as semantic metadata. In search engines, keyphrases can supplement full-text indexing and help users formulate good queries. The quality of keyphrases therefore has a direct impact on the quality of downstream NLP applications.

Recently, several systems and techniques have been proposed for keyphrase extraction. We therefore propose a shared task to provide an opportunity to compare and benchmark such technologies.

In the shared task, participants will be provided with a set of scientific articles and asked to produce the keyphrases for each article.

The organizers will provide trial, training and test data. The average length of the articles is between 6 and 8 pages, including tables and figures. We will provide two sets of answers: author-assigned keyphrases and reader-assigned keyphrases. All reader-assigned keyphrases are extracted from the papers, whereas some author-assigned keyphrases may not occur in the content.

The answer sets contain lemmatized keyphrases. We also accept two alternations of a keyphrase: "A of B" → "B A" (e.g. "policy of school" = "school policy") and "A's B" → "A B" (e.g. "school's policy" = "school policy"). However, when an alternation changes the semantics, we do not include it in the answer set.
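To illustrate the two accepted alternations, here is a minimal normalization sketch in Python. The function name is hypothetical and this is not the organizers' actual normalization code; it only shows the surface transformations described above (it does not perform lemmatization or check whether semantics change):

```python
import re

def normalize_keyphrase(phrase):
    """Illustrative sketch of the two accepted alternations:
    "A's B" -> "A B"   (e.g. school's policy -> school policy)
    "A of B" -> "B A"  (e.g. policy of school -> school policy)
    """
    phrase = phrase.lower().strip()
    # Drop the possessive marker: "school's policy" -> "school policy"
    phrase = re.sub(r"'s\b", "", phrase)
    # Swap around a single "of": "policy of school" -> "school policy"
    m = re.match(r"^(.+?)\s+of\s+(.+)$", phrase)
    if m:
        phrase = f"{m.group(2)} {m.group(1)}"
    # Collapse any doubled whitespace left by the substitutions
    return re.sub(r"\s+", " ", phrase).strip()
```

For example, both `normalize_keyphrase("policy of school")` and `normalize_keyphrase("school's policy")` yield `"school policy"`.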

In this shared task, we follow the traditional evaluation methodology: we match the keyphrases in the answer sets (i.e. author-assigned and reader-assigned keyphrases) against those the participants submit, and calculate precision, recall and F-score. Finally, we rank the participants by F-score.
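The evaluation described above can be sketched as an exact-match comparison of keyphrase sets. This is a simplified illustration, not the official scorer; the function name is our own:

```python
def score_keyphrases(predicted, gold):
    """Precision, recall and F-score for one document, matching the
    predicted keyphrases exactly against the answer set."""
    pred, ans = set(predicted), set(gold)
    tp = len(pred & ans)  # true positives: exact matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ans) if ans else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score
```

For instance, a system that predicts `["school policy", "nlp"]` against the answer set `["school policy", "information retrieval"]` scores precision 0.5, recall 0.5 and F-score 0.5.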

The Google-group for the task is at http://groups.google.com.au/group/semeval2010-keyphrase?lnk=gcimh&pli=1

Organizers: Su Nam Kim (University of Melbourne), Olena Medelyan (University of Waikato), Min-yen Kan (National University of Singapore), Timothy Baldwin (University of Melbourne)
Web Site: http://docs.google.com/Doc?id=ddshp584_46gqkkjng4


Timeline:

  • Test and training data release : Feb. 15th (Monday)
  • Closing competition : March 19th (Friday; 5 weeks for competition)
  • Results out : by March 31st
  • Submission of description papers: April 17, 2010
  • Notification of acceptance: May 6, 2010
  • Workshop: July 15-16, 2010 ACL Uppsala

© 2008 FBK-irst