subject predicate object context
56443 Creator 2515c15e5a8e5ef71a6e3a3c05d159fc
56443 Creator 556b3989a60915a4b54e93e345f053c2
56443 Creator b0e94dfec7566d65fcff3a4ea91bc56f
56443 Date 2018
56443 Is Part Of repository
56443 abstract Human evaluations are broadly thought to be more valuable the higher the inter-annotator agreement. In this paper we examine this idea. We describe our experiments and analysis within the area of Automatic Question Generation. Our experiments show how annotators diverge in language annotation tasks due to a range of ineliminable factors. For this reason, we believe that annotation schemes for natural language generation tasks aimed at evaluating language quality need to be treated with great care. In particular, an unchecked focus on reducing disagreement among annotators runs the danger of creating generation goals that reward output that is more distant from, rather than closer to, natural human-like language. We conclude the paper by suggesting a new approach to the use of agreement metrics in natural language generation evaluation tasks.
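The abstract turns on inter-annotator agreement, so a concrete illustration may help. The following is a minimal sketch of Cohen's kappa, one common chance-corrected agreement coefficient for two annotators; the paper does not confirm this particular metric, and the annotator labels below are hypothetical examples, not data from the study.

```python
from collections import Counter

def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is the agreement expected by chance from label marginals.
    """
    assert len(ann_a) == len(ann_b) and ann_a, "need paired annotations"
    n = len(ann_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical annotators rating generated questions as "good"/"bad".
a = ["good", "good", "bad", "good", "bad", "bad"]
b = ["good", "bad", "bad", "good", "bad", "good"]
print(cohens_kappa(a, b))  # 0.333: raw agreement is 4/6, chance is 0.5
```

The example shows why the paper's caution matters: the two annotators agree on two thirds of the items, yet kappa is only 0.33 once chance agreement is discounted, and such a score would conventionally be read as weak even when the underlying divergence is a legitimate difference in linguistic judgement.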
56443 authorList authors
56443 presentedAt ext-06df43ccc4d6e2f0592d64a8284f91db
56443 status peerReviewed
56443 uri http://data.open.ac.uk/oro/document/658876
56443 uri http://data.open.ac.uk/oro/document/658877
56443 uri http://data.open.ac.uk/oro/document/658878
56443 uri http://data.open.ac.uk/oro/document/658879
56443 uri http://data.open.ac.uk/oro/document/658880
56443 uri http://data.open.ac.uk/oro/document/658881
56443 uri http://data.open.ac.uk/oro/document/673022
56443 type AcademicArticle
56443 type Article
56443 label Amidei, Jacopo ; Piwek, Paul and Willis, Alistair (2018). Rethinking the Agreement in Human Evaluation Tasks. In: Proceedings of the 27th International Conference on Computational Linguistics, 20-26 Aug 2018, Santa Fe, New Mexico, pp. 3318–3329.
56443 Title Rethinking the Agreement in Human Evaluation Tasks
56443 in dataset oro