ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
by Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, Jens Lehmann
Abstract:
Crowdsourcing has emerged as a powerful paradigm for dealing with data using a large number of people. For instance, crowdsourcing has been successfully employed for quality assessment and improvement of Linked Data. A major challenge of Linked Data quality assessment with crowdsourcing is the cold-start problem: how to estimate the reliability of crowd workers and assign the most reliable workers to tasks? We address this challenge by proposing a novel approach for generating test questions from DBpedia, a general knowledge base, based on topics that define the domain of the tasks. We then use these test questions to approximate the reliability of the workers. Subsequently, the tasks are dynamically assigned to reliable workers to help improve the accuracy of collected responses. Our proposed approach, ACRyLIQ, is evaluated using workers hired from Amazon Mechanical Turk, on two real-world datasets with tasks for Linked Data quality assessment. We validate our proposed approach in terms of accuracy and compare it against the baseline approach of reliability approximation using gold-standard tasks. The results demonstrate that our proposed approach achieves high accuracy without the need of gold-standard tasks.
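The core idea of the approach, using DBpedia facts about task-domain topics as test questions for estimating worker reliability, can be illustrated with a short sketch. The Python snippet below is not taken from the paper; it is a minimal illustration, assuming the public DBpedia SPARQL endpoint and the SPARQLWrapper library, of how candidate facts about a given topic might be retrieved as raw material for such test questions. The topic URI, query pattern, and helper name are illustrative assumptions, not the paper's actual question-generation procedure.

from SPARQLWrapper import SPARQLWrapper, JSON

def fetch_topic_facts(topic_uri, limit=10):
    # Illustrative only: pull English-language literal facts about a topic
    # from DBpedia; such facts could seed test questions for workers.
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        SELECT ?p ?o WHERE {{
            <{topic_uri}> ?p ?o .
            FILTER(isLiteral(?o) && lang(?o) = "en")
        }} LIMIT {limit}
    """)
    results = sparql.query().convert()
    return [(b["p"]["value"], b["o"]["value"])
            for b in results["results"]["bindings"]]

# Hypothetical usage: facts about a topic that defines the task domain.
for predicate, value in fetch_topic_facts("http://dbpedia.org/resource/Bologna"):
    print(predicate, value)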
Reference:
Umair ul Hassan, Amrapali Zaveri, Edgard Marx, Edward Curry, Jens Lehmann, "ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment", in Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016), Springer, Bologna, Italy, pp. 681–696, 2016.
Bibtex Entry:
@incollection{Hassan2016,
abstract = {Crowdsourcing has emerged as a powerful paradigm for dealing with data using a large number of people. For instance, crowdsourcing has been successfully employed for quality assessment and improvement of Linked Data. A major challenge of Linked Data quality assessment with crowdsourcing is the cold-start problem: how to estimate the reliability of crowd workers and assign the most reliable workers to tasks? We address this challenge by proposing a novel approach for generating test questions from DBpedia, a general knowledge base, based on topics that define the domain of the tasks. We then use these test questions to approximate the reliability of the workers. Subsequently, the tasks are dynamically assigned to reliable workers to help improve the accuracy of collected responses. Our proposed approach, ACRyLIQ, is evaluated using workers hired from Amazon Mechanical Turk, on two real-world datasets with tasks for Linked Data quality assessment. We validate our proposed approach in terms of accuracy and compare it against the baseline approach of reliability approximation using gold-standard tasks. The results demonstrate that our proposed approach achieves high accuracy without the need of gold-standard tasks.},
address = {Bologna, Italy},
author = {ul Hassan, Umair and Zaveri, Amrapali and Marx, Edgard and Curry, Edward and Lehmann, Jens},
booktitle = {20th International Conference on Knowledge Engineering and Knowledge Management (EKAW2016)},
doi = {10.1007/978-3-319-49004-5_44},
pages = {681--696},
publisher = {Springer},
title = {{ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment}},
url = {http://www.edwardcurry.org/publications/EKAW2016.pdf},
year = {2016}
}