Task 5 Automatic Keyphrase Extraction from Scientific Articles

In all tables, precision, recall and F-score are micro-averaged. The first P, R, F triple is computed over the top 5 candidates, the second over the top 10 candidates, and the last over the top 15 candidates. Systems are ranked by F-score over the top 15 candidates.

For the baselines, we use 1-, 2- and 3-grams as candidates and TF*IDF as features. In the baseline table, TF*IDF is an unsupervised method that ranks the candidates by their TF*IDF scores; NB and ME are supervised methods using Naive Bayes and Maximum Entropy in WEKA. In the second column, Reader means that the reader-assigned keyword set is used as the gold standard, and Combined means that both the author-assigned and reader-assigned keyword sets are used as answers.

Note that one team used author-assigned keywords obtained by crawling the original documents. Of the three outputs this team submitted, we evaluated the one that did not use author-assigned keywords. Hence, none of the evaluated systems used the author-assigned keywords.
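As a rough illustration of how micro-averaged scores at a cutoff can be computed, the Python sketch below pools the match counts over all documents before deriving P, R and F. The function name micro_prf_at_k, the dictionary-based inputs, and the exact-string matching are assumptions made for this example; it is not the official scorer, which may normalise keyphrases differently.

```python
from typing import Dict, List, Set, Tuple


def micro_prf_at_k(
    predictions: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
    k: int,
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F-score over the top-k candidates.

    Assumption: candidates and gold keyphrases are already normalised
    (for example lowercased) so that exact string comparison is meaningful,
    and each ranked list contains no duplicates.
    """
    matched = 0      # correct candidates, summed over all documents
    proposed = 0     # candidates kept after the cutoff, summed over all documents
    gold_total = 0   # gold keyphrases, summed over all documents

    for doc_id, gold_set in gold.items():
        top_k = predictions.get(doc_id, [])[:k]
        matched += sum(1 for candidate in top_k if candidate in gold_set)
        proposed += len(top_k)
        gold_total += len(gold_set)

    precision = matched / proposed if proposed else 0.0
    recall = matched / gold_total if gold_total else 0.0
    denominator = precision + recall
    f_score = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_score
```

Calling the function with k set to 5, 10 and 15 would give the three P, R, F triples that appear in each table row. Because the counts are pooled before dividing, documents with many gold keyphrases weigh more than they would under a per-document (macro) average.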

Baseline

                  top 5 candidates             top 10 candidates            top 15 candidates
Method  by        Precision  Recall  F-score   Precision  Recall  F-score   Precision  Recall  F-score
TF*IDF  Reader    17.80%     7.39%   10.44%    13.90%     11.54%  12.61%    11.60%     14.45%  12.87%
TF*IDF  Combined  22.00%     7.50%   11.19%    17.70%     12.07%  14.35%    14.93%     15.28%  15.10%
NB      Reader    16.80%     6.98%   9.86%     13.30%     11.05%  12.07%    11.40%     14.20%  12.65%
NB      Combined  21.40%     7.30%   10.89%    17.30%     11.80%  14.03%    14.53%     14.87%  14.70%
ME      Reader    16.80%     6.98%   9.86%     13.30%     11.05%  12.07%    11.40%     14.20%  12.65%
ME      Combined  21.40%     7.30%   10.89%    17.30%     11.80%  14.03%    14.53%     14.87%  14.70%
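The unsupervised TF*IDF baseline can be sketched roughly as follows: enumerate 1-, 2- and 3-grams as candidates, score each by term frequency times inverse document frequency, and keep the highest-scoring ones. The regular-expression tokeniser, the IDF formula log(N / df), the alphabetical tie-breaking, and the function names are all assumptions for illustration; the settings actually used for the baseline are not given here.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def ngram_candidates(text: str, max_n: int = 3) -> List[str]:
    """All 1- to max_n-grams of the text (assumed tokeniser: lowercase alphanumerics)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def rank_by_tfidf(docs: Dict[str, str], top_k: int = 15) -> Dict[str, List[str]]:
    """Rank each document's n-gram candidates by TF*IDF and keep the top_k."""
    term_counts = {
        doc_id: Counter(ngram_candidates(text)) for doc_id, text in docs.items()
    }

    # Document frequency: number of documents in which a candidate occurs.
    doc_freq: Counter = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    ranked: Dict[str, List[str]] = {}
    for doc_id, counts in term_counts.items():
        # Assumed weighting: raw term frequency times log(N / df).
        scores = {
            cand: tf * math.log(n_docs / doc_freq[cand])
            for cand, tf in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked[doc_id] = sorted(scores, key=lambda c: (-scores[c], c))[:top_k]
    return ranked
```

With top_k left at 15, the output lists can be truncated to 5 and 10 candidates to obtain all three cutoffs used in the tables. The supervised NB and ME baselines would instead feed the TF*IDF value to a classifier as a feature, which this sketch does not cover.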

Performance over Combined Keywords

           top 5 candidates             top 10 candidates            top 15 candidates
Team       Precision  Recall  F-score   Precision  Recall  F-score   Precision  Recall  F-score
SZTERGAK   49.60%     16.92%  25.23%    37.80%     25.78%  30.65%    30.80%     31.51%  31.15%
HUMB       39.00%     13.30%  19.84%    32.00%     21.83%  25.95%    27.20%     27.83%  27.51%
WINGNUS    40.20%     13.71%  20.45%    30.50%     20.80%  24.73%    24.93%     25.51%  25.22%
KP-Miner   36.00%     12.28%  18.31%    28.60%     19.51%  23.20%    24.93%     25.51%  25.22%
ICL        34.40%     11.73%  17.49%    29.20%     19.92%  23.68%    24.60%     25.17%  24.88%
SEERLAB    39.00%     13.30%  19.84%    29.70%     20.26%  24.09%    24.07%     24.62%  24.34%
KX_FBK     34.20%     11.66%  17.39%    27.00%     18.42%  21.90%    23.60%     24.15%  23.87%
DERIUNLP   27.40%     9.35%   13.94%    23.00%     15.69%  18.65%    22.00%     22.51%  22.25%
Maui       35.00%     11.94%  17.81%    25.20%     17.19%  20.44%    20.33%     20.80%  20.56%
DFKI       29.20%     9.96%   14.85%    23.30%     15.89%  18.89%    20.27%     20.74%  20.50%
BUAP       13.60%     4.64%   6.92%     17.60%     12.01%  14.28%    19.00%     19.44%  19.22%
SJTULTLAB  30.20%     10.30%  15.36%    22.70%     15.48%  18.41%    18.40%     18.83%  18.61%
UNICE      27.40%     9.35%   13.94%    22.40%     15.28%  18.17%    18.33%     18.76%  18.54%
UNPMC      18.00%     6.14%   9.16%     19.00%     12.96%  15.41%    18.13%     18.55%  18.34%
JU_CSE     28.40%     9.69%   14.45%    21.50%     14.67%  17.44%    17.80%     18.21%  18.00%
likey      29.20%     9.96%   14.85%    21.10%     14.39%  17.11%    16.33%     16.71%  16.52%
UvT        24.80%     8.46%   12.62%    18.60%     12.69%  15.09%    14.60%     14.94%  14.77%
POLYU      15.60%     5.32%   7.93%     14.60%     9.96%   11.84%    13.87%     14.19%  14.03%
UKP        9.40%      3.21%   4.79%     5.90%      4.02%   4.78%     5.27%      5.39%   5.33%

Performance over Reader-Assigned Keywords

           top 5 candidates             top 10 candidates            top 15 candidates
Team       Precision  Recall  F-score   Precision  Recall  F-score   Precision  Recall  F-score
SZTERGAK   34.60%     14.37%  20.31%    26.10%     21.68%  23.69%    21.47%     26.74%  23.82%
HUMB       30.40%     12.62%  17.84%    24.80%     20.60%  22.51%    21.20%     26.41%  23.52%
KX_FBK     29.20%     12.13%  17.14%    23.20%     19.27%  21.05%    20.33%     25.33%  22.56%
WINGNUS    30.60%     12.71%  17.96%    23.60%     19.60%  21.41%    19.80%     24.67%  21.97%
ICL        27.20%     11.30%  15.97%    22.40%     18.60%  20.32%    19.47%     24.25%  21.60%
SEERLAB    31.00%     12.87%  18.19%    24.10%     20.02%  21.87%    19.33%     24.09%  21.45%
KP-Miner   28.20%     11.71%  16.55%    22.00%     18.27%  19.96%    19.33%     24.09%  21.45%
DERIUNLP   22.20%     9.22%   13.03%    18.90%     15.70%  17.15%    17.53%     21.84%  19.45%
DFKI       24.40%     10.13%  14.32%    19.80%     16.45%  17.97%    17.40%     21.68%  19.31%
UNICE      25.00%     10.38%  14.67%    20.10%     16.69%  18.24%    16.00%     19.93%  17.75%
SJTULTLAB  26.60%     11.05%  15.61%    19.40%     16.11%  17.60%    15.60%     19.44%  17.31%
BUAP       10.40%     4.32%   6.10%     13.90%     11.54%  12.61%    14.93%     18.60%  16.56%
Maui       25.00%     10.38%  14.67%    18.10%     15.03%  16.42%    14.87%     18.52%  16.50%
UNPMC      13.80%     5.73%   8.10%     15.10%     12.54%  13.70%    14.47%     18.02%  16.05%
JU_CSE     23.40%     9.72%   13.73%    18.10%     15.03%  16.42%    14.40%     17.94%  15.98%
likey      24.60%     10.22%  14.44%    17.90%     14.87%  16.24%    13.80%     17.19%  15.31%
POLYU      13.60%     5.65%   7.98%     12.60%     10.47%  11.44%    12.00%     14.95%  13.31%
UvT        20.40%     8.47%   11.97%    15.60%     12.96%  14.16%    11.93%     14.87%  13.24%
UKP        8.20%      3.41%   4.82%     5.30%      4.40%   4.81%     4.67%      5.81%   5.18%