subject predicate object context
51751 Creator 3c9bbcc5b9dec5879be1be1348991a7e
51751 Creator cd9399913909d9a164856bb74229a18a
51751 Date 2017-10
51751 Is Part Of repository
51751 abstract This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publications’ full text. We offer a comparison of their individual similarities, strengths and weaknesses. We analyse a range of features that have been previously used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al. (2015), we find abstract similarity to be one of the most predictive features. Overall, we show that many of the features previously described in the literature have either been reported as not particularly predictive, cannot be reproduced from their existing descriptions, or should not be used because they rely on external, changing evidence. Additionally, we find significant variance in the results provided by the PDF extraction tools used in the pre-processing stages of citation extraction. This has a direct and significant impact on the classification features that rely on this extraction process. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features and address the importance of constructing a large-scale gold-standard reference dataset.
51751 authorList authors
51751 presentedAt ext-f0e6daecf8e678516f2cddd346a45279
51751 status peerReviewed
51751 uri http://data.open.ac.uk/oro/document/635926
51751 uri http://data.open.ac.uk/oro/document/635927
51751 uri http://data.open.ac.uk/oro/document/635937
51751 uri http://data.open.ac.uk/oro/document/635938
51751 uri http://data.open.ac.uk/oro/document/635939
51751 uri http://data.open.ac.uk/oro/document/635940
51751 uri http://data.open.ac.uk/oro/document/661066
51751 type AcademicArticle
51751 type Article
51751 label Pride, David and Knoth, Petr (2017). Incidental or influential? – A decade of using text-mining for citation function classification. In: 16th International Society of Scientometrics and Informetrics Conference, 16-20 Oct 2017, Wuhan.
51751 Title Incidental or influential? – A decade of using text-mining for citation function classification.
51751 in dataset oro