Task 5 Automatic Keyphrase Extraction from Scientific Articles
|
In all tables, precision, recall and F-score are micro-averaged. In each table, the first P, R, F triple is computed over the top 5 candidates, the second over the top 10 candidates and the last over the top 15 candidates. Systems are ranked by F-score over the top 15 candidates. For the baselines, we use 1-, 2- and 3-grams as candidates and TF*IDF as features. In the baseline table, TF*IDF is an unsupervised method that ranks the candidates by their TF*IDF scores; NB and ME are supervised methods using Naive Bayes and Maximum Entropy in WEKA. In the second column of the baseline table, Reader means that the reader-assigned keyword set is used as the gold-standard data, and Combined means that both the author-assigned and reader-assigned keyword sets are used as answers. Note that one team obtained author-assigned keywords by crawling the original documents. Of the three outputs this team submitted, we evaluated the one that did not use author-assigned keywords. Hence, none of the evaluated systems used the author-assigned keywords.
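As an illustration of how micro-averaged scores over the top k candidates can be computed, here is a minimal Python sketch. It is not the official scorer: the function name `micro_prf_at_k`, the toy documents and the use of plain exact string matching (with no normalization of the phrases) are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def micro_prf_at_k(
    ranked: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
    k: int,
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F-score over the top-k candidates.

    Counts are pooled over all documents before the ratios are taken,
    which is what distinguishes micro- from macro-averaging.
    Matching is exact string equality (an assumption of this sketch).
    """
    matched = proposed = relevant = 0
    for doc_id, gold_set in gold.items():
        top_k = ranked.get(doc_id, [])[:k]
        matched += len(set(top_k) & gold_set)
        proposed += len(top_k)
        relevant += len(gold_set)
    precision = matched / proposed if proposed else 0.0
    recall = matched / relevant if relevant else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f_score


# Hypothetical toy data: ranked system output and gold-standard keyphrases.
ranked = {
    "doc1": ["keyphrase extraction", "tf idf", "naive bayes", "corpus"],
    "doc2": ["maximum entropy", "ranking", "n gram"],
}
gold = {
    "doc1": {"keyphrase extraction", "naive bayes", "evaluation"},
    "doc2": {"maximum entropy", "n gram"},
}

for k in (1, 2, 3):
    p, r, f = micro_prf_at_k(ranked, gold, k)
    print(f"top {k}: P={p:.2%} R={r:.2%} F={f:.2%}")
```

The F-score here is the harmonic mean of the pooled precision and recall, which agrees with the table entries; for example, P = 17.80% and R = 7.39% give F = 2 × 17.80 × 7.39 / (17.80 + 7.39) ≈ 10.44%.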
|
Baseline
|
| | top 5 candidates | top 10 candidates | top 15 candidates |
Method | Gold standard | Precision | Recall | F-score | Precision | Recall | F-score | Precision | Recall | F-score |
TF*IDF | Reader | 17.80% | 7.39% | 10.44% | 13.90% | 11.54% | 12.61% | 11.60% | 14.45% | 12.87% |
| Combined | 22.00% | 7.50% | 11.19% | 17.70% | 12.07% | 14.35% | 14.93% | 15.28% | 15.10% |
NB | Reader | 16.80% | 6.98% | 9.86% | 13.30% | 11.05% | 12.07% | 11.40% | 14.20% | 12.65% |
| Combined | 21.40% | 7.30% | 10.89% | 17.30% | 11.80% | 14.03% | 14.53% | 14.87% | 14.70% |
ME | Reader | 16.80% | 6.98% | 9.86% | 13.30% | 11.05% | 12.07% | 11.40% | 14.20% | 12.65% |
| Combined | 21.40% | 7.30% | 10.89% | 17.30% | 11.80% | 14.03% | 14.53% | 14.87% | 14.70% |
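
The unsupervised TF*IDF baseline described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the organizers' implementation: whitespace tokenization, lowercasing, raw term counts for TF, `log(N / df)` for IDF, alphabetical tie-breaking, and the names `ngrams` and `rank_by_tfidf` are all choices made for this example.

```python
import math
from collections import Counter
from typing import Dict, List


def ngrams(tokens: List[str], max_n: int = 3) -> List[str]:
    """Return all contiguous 1- to max_n-grams, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def rank_by_tfidf(docs: Dict[str, str], top_k: int = 15) -> Dict[str, List[str]]:
    """Rank each document's n-gram candidates by TF*IDF, highest first."""
    # Term frequency of every candidate n-gram, per document.
    counts = {d: Counter(ngrams(text.lower().split())) for d, text in docs.items()}

    # Document frequency: in how many documents each candidate occurs.
    df: Counter = Counter()
    for doc_counts in counts.values():
        df.update(doc_counts.keys())

    n_docs = len(docs)
    ranked = {}
    for d, doc_counts in counts.items():
        scores = {g: tf * math.log(n_docs / df[g]) for g, tf in doc_counts.items()}
        # Sort by descending score; break ties alphabetically for determinism.
        ordered = sorted(scores, key=lambda g: (-scores[g], g))
        ranked[d] = ordered[:top_k]
    return ranked


# Hypothetical two-document collection.
docs = {
    "a": "keyphrase extraction ranks candidate phrases keyphrase extraction",
    "b": "naive bayes ranks candidate phrases",
}

for doc_id, candidates in rank_by_tfidf(docs, top_k=3).items():
    print(doc_id, candidates)
```

Candidates that occur in every document receive an IDF of zero and therefore fall to the bottom of the ranking; a practical system would typically also filter stopwords and normalize candidates, which this sketch omits.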
|
Performance over Combined Keywords
|
| top 5 candidates | top 10 candidates | top 15 candidates |
Team | Precision | Recall | F-score | Precision | Recall | F-score | Precision | Recall | F-score |
SZTERGAK | 49.60% | 16.92% | 25.23% | 37.80% | 25.78% | 30.65% | 30.80% | 31.51% | 31.15% |
HUMB | 39.00% | 13.30% | 19.84% | 32.00% | 21.83% | 25.95% | 27.20% | 27.83% | 27.51% |
WINGNUS | 40.20% | 13.71% | 20.45% | 30.50% | 20.80% | 24.73% | 24.93% | 25.51% | 25.22% |
KP-Miner | 36.00% | 12.28% | 18.31% | 28.60% | 19.51% | 23.20% | 24.93% | 25.51% | 25.22% |
ICL | 34.40% | 11.73% | 17.49% | 29.20% | 19.92% | 23.68% | 24.60% | 25.17% | 24.88% |
SEERLAB | 39.00% | 13.30% | 19.84% | 29.70% | 20.26% | 24.09% | 24.07% | 24.62% | 24.34% |
KX_FBK | 34.20% | 11.66% | 17.39% | 27.00% | 18.42% | 21.90% | 23.60% | 24.15% | 23.87% |
DERIUNLP | 27.40% | 9.35% | 13.94% | 23.00% | 15.69% | 18.65% | 22.00% | 22.51% | 22.25% |
Maui | 35.00% | 11.94% | 17.81% | 25.20% | 17.19% | 20.44% | 20.33% | 20.80% | 20.56% |
DFKI | 29.20% | 9.96% | 14.85% | 23.30% | 15.89% | 18.89% | 20.27% | 20.74% | 20.50% |
BUAP | 13.60% | 4.64% | 6.92% | 17.60% | 12.01% | 14.28% | 19.00% | 19.44% | 19.22% |
SJTULTLAB | 30.20% | 10.30% | 15.36% | 22.70% | 15.48% | 18.41% | 18.40% | 18.83% | 18.61% |
UNICE | 27.40% | 9.35% | 13.94% | 22.40% | 15.28% | 18.17% | 18.33% | 18.76% | 18.54% |
UNPMC | 18.00% | 6.14% | 9.16% | 19.00% | 12.96% | 15.41% | 18.13% | 18.55% | 18.34% |
JU_CSE | 28.40% | 9.69% | 14.45% | 21.50% | 14.67% | 17.44% | 17.80% | 18.21% | 18.00% |
likey | 29.20% | 9.96% | 14.85% | 21.10% | 14.39% | 17.11% | 16.33% | 16.71% | 16.52% |
UvT | 24.80% | 8.46% | 12.62% | 18.60% | 12.69% | 15.09% | 14.60% | 14.94% | 14.77% |
POLYU | 15.60% | 5.32% | 7.93% | 14.60% | 9.96% | 11.84% | 13.87% | 14.19% | 14.03% |
UKP | 9.40% | 3.21% | 4.79% | 5.90% | 4.02% | 4.78% | 5.27% | 5.39% | 5.33% |
|
Performance over Reader-Assigned Keywords
|
| top 5 candidates | top 10 candidates | top 15 candidates |
Team | Precision | Recall | F-score | Precision | Recall | F-score | Precision | Recall | F-score |
SZTERGAK | 34.60% | 14.37% | 20.31% | 26.10% | 21.68% | 23.69% | 21.47% | 26.74% | 23.82% |
HUMB | 30.40% | 12.62% | 17.84% | 24.80% | 20.60% | 22.51% | 21.20% | 26.41% | 23.52% |
KX_FBK | 29.20% | 12.13% | 17.14% | 23.20% | 19.27% | 21.05% | 20.33% | 25.33% | 22.56% |
WINGNUS | 30.60% | 12.71% | 17.96% | 23.60% | 19.60% | 21.41% | 19.80% | 24.67% | 21.97% |
ICL | 27.20% | 11.30% | 15.97% | 22.40% | 18.60% | 20.32% | 19.47% | 24.25% | 21.60% |
SEERLAB | 31.00% | 12.87% | 18.19% | 24.10% | 20.02% | 21.87% | 19.33% | 24.09% | 21.45% |
KP-Miner | 28.20% | 11.71% | 16.55% | 22.00% | 18.27% | 19.96% | 19.33% | 24.09% | 21.45% |
DERIUNLP | 22.20% | 9.22% | 13.03% | 18.90% | 15.70% | 17.15% | 17.53% | 21.84% | 19.45% |
DFKI | 24.40% | 10.13% | 14.32% | 19.80% | 16.45% | 17.97% | 17.40% | 21.68% | 19.31% |
UNICE | 25.00% | 10.38% | 14.67% | 20.10% | 16.69% | 18.24% | 16.00% | 19.93% | 17.75% |
SJTULTLAB | 26.60% | 11.05% | 15.61% | 19.40% | 16.11% | 17.60% | 15.60% | 19.44% | 17.31% |
BUAP | 10.40% | 4.32% | 6.10% | 13.90% | 11.54% | 12.61% | 14.93% | 18.60% | 16.56% |
Maui | 25.00% | 10.38% | 14.67% | 18.10% | 15.03% | 16.42% | 14.87% | 18.52% | 16.50% |
UNPMC | 13.80% | 5.73% | 8.10% | 15.10% | 12.54% | 13.70% | 14.47% | 18.02% | 16.05% |
JU_CSE | 23.40% | 9.72% | 13.73% | 18.10% | 15.03% | 16.42% | 14.40% | 17.94% | 15.98% |
likey | 24.60% | 10.22% | 14.44% | 17.90% | 14.87% | 16.24% | 13.80% | 17.19% | 15.31% |
POLYU | 13.60% | 5.65% | 7.98% | 12.60% | 10.47% | 11.44% | 12.00% | 14.95% | 13.31% |
UvT | 20.40% | 8.47% | 11.97% | 15.60% | 12.96% | 14.16% | 11.93% | 14.87% | 13.24% |
UKP | 8.20% | 3.41% | 4.82% | 5.30% | 4.40% | 4.81% | 4.67% | 5.81% | 5.18% |
|