subject predicate object context
51046 Creator 8eb9378b0e3dcd225dfc47fcdc9b35f4
51046 Creator ext-a375ad27f9bba79eaff7219da857a8f7
51046 Creator ext-1bd20fb2a8369a734b45a1b584d8833b
51046 Creator ext-1cd6e5f0578cd1928a68eccf77b9903c
51046 Creator ext-4a193ef2ae87a5e2160020b06c423d42
51046 Creator ext-7e2d742437412763d805bb99d671f62a
51046 Creator ext-8edebc372e764cc21362813fb4a3015b
51046 Creator ext-dee261f51438816dd34233bc799d20cc
51046 Date 2016-07-14
51046 Is Part Of repository
51046 Is Part Of p21576912
51046 abstract We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BeeWatch, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BeeWatch can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label.
51046 authorList authors
51046 issue 4
51046 status peerReviewed
51046 uri http://data.open.ac.uk/oro/document/633709
51046 volume 7
51046 type AcademicArticle
51046 type Article
51046 label Siddharthan, Advaith; Lambin, Christopher; Robinson, Anne-Marie; Sharma, Nirwan; Comont, Richard; O’Mahony, Elaine; Mellish, Chris and Van Der Wal, René (2016). Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size. ACM Transactions on Intelligent Systems and Technology, 7(4), article no. 45.
51046 Publisher ext-2af1883e4bbfa0356fcedb366171cb38
51046 Title Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size
51046 in dataset oro