51751 |
Creator |
3c9bbcc5b9dec5879be1be1348991a7e |
51751 |
Creator |
cd9399913909d9a164856bb74229a18a |
51751 |
Date |
2017-10 |
51751 |
Is Part Of |
repository |
51751 |
abstract |
This work looks in depth at several studies that have attempted to automate the process
of citation importance classification based on the publications’ full text. We offer
a comparison of their individual similarities, strengths and weaknesses. We analyse
a range of features that have been previously used in this task. Our experimental
results confirm that the number of in-text references is highly predictive of influence.
Contrary to the work of Valenzuela et al. (2015), we find abstract similarity to be one
of the most predictive features. Overall, we show that many of the features previously
described in the literature have been reported as not particularly predictive,
cannot be reproduced from their existing descriptions, or should not be used due
to their reliance on changing external evidence. Additionally, we find significant
variance in the results provided by the PDF extraction tools used in the pre-processing
stages of citation extraction. This has a direct and significant impact on the classification
features that rely on this extraction process. Consequently, we discuss challenges
and potential improvements in the classification pipeline, provide a critical review
of the performance of individual features and address the importance of constructing
a large-scale gold-standard reference dataset. |
51751 |
authorList |
authors |
51751 |
presentedAt |
ext-f0e6daecf8e678516f2cddd346a45279 |
51751 |
status |
peerReviewed |
51751 |
uri |
http://data.open.ac.uk/oro/document/635926 |
51751 |
uri |
http://data.open.ac.uk/oro/document/635927 |
51751 |
uri |
http://data.open.ac.uk/oro/document/635937 |
51751 |
uri |
http://data.open.ac.uk/oro/document/635938 |
51751 |
uri |
http://data.open.ac.uk/oro/document/635939 |
51751 |
uri |
http://data.open.ac.uk/oro/document/635940 |
51751 |
uri |
http://data.open.ac.uk/oro/document/661066 |
51751 |
type |
AcademicArticle |
51751 |
type |
Article |
51751 |
label |
Pride, David and Knoth, Petr (2017). Incidental or influential? – A decade of
using text-mining for citation function classification. In: 16th International Society
of Scientometrics and Informetrics Conference, 16-20 Oct 2017, Wuhan. |
51751 |
Title |
Incidental or influential? – A decade of using text-mining for citation function classification. |
51751 |
in dataset |
oro |