subject predicate object context
58880 Creator 8eb9378b0e3dcd225dfc47fcdc9b35f4
58880 Creator ext-e31d11cd5ec70db4bf99122c75fb81d0
58880 Creator ext-a8e4f56d6da87bb3a2e09b15d7edd03a
58880 Creator ext-e4bc5c5734488d465d57a63167f6a25c
58880 Date 2010
58880 Is Part Of repository
58880 abstract We present two complementary annotation schemes for sentence-based annotation of full scientific papers, CoreSC and AZ-II, which have been applied to primary research articles in chemistry. The AZ scheme is based on the rhetorical structure of a scientific paper and follows the knowledge claims made by the authors. It has been shown to be reliably annotated by independent human coders and has proven useful for various information access tasks. AZ-II is its extended version, which has been successfully applied to chemistry. The CoreSC scheme takes a different view of scientific papers, treating them as the human-readable representations of scientific investigations. It therefore seeks to retrieve the structure of the investigation from the paper as generic high-level Core Scientific Concepts (CoreSC). CoreSCs have been annotated by 16 chemistry experts over a total of 265 full papers in physical chemistry and biochemistry. We describe the differences and similarities between the two schemes in detail and present the two corpora produced using each scheme. There are 36 shared papers in the corpora, which allows us to quantitatively compare aspects of the annotation schemes. We show the correlation between the two schemes and their strengths and weaknesses, and discuss the benefits of combining a rhetorically based analysis of the papers with a content-based one.
58880 authorList authors
58880 presentedAt ext-0f1ec27d4ee64bfa7cc10ce5044e0bbb
58880 status peerReviewed
58880 uri http://data.open.ac.uk/oro/document/773189
58880 uri http://data.open.ac.uk/oro/document/773204
58880 uri http://data.open.ac.uk/oro/document/773205
58880 uri http://data.open.ac.uk/oro/document/773206
58880 uri http://data.open.ac.uk/oro/document/773207
58880 uri http://data.open.ac.uk/oro/document/773208
58880 uri http://data.open.ac.uk/oro/document/773688
58880 uri http://data.open.ac.uk/oro/document/773689
58880 type AcademicArticle
58880 type Article
58880 label Liakata, Maria; Teufel, Simone; Siddharthan, Advaith and Batchelor, Colin (2010). Corpora for the conceptualisation and zoning of scientific papers. In: LREC 2010, 7th International Conference on Language Resources and Evaluation, 2010, Valletta, Malta.
58880 Title Corpora for the conceptualisation and zoning of scientific papers
58880 in dataset oro