Lora Aroyo, Anca Dumitrache, Oana Inel, Zoltán Szlávik, Benjamin Timmermans, and Chris Welty. 2019. Crowdsourcing Inclusivity: Dealing with Diversity of Opinions, Perspectives and Ambiguity in Annotated Data. In Proc. WWW Companion. 1294-1295.
Agathe Balayn and Alessandro Bozzon. 2019. Designing Evaluations of Machine Learning Models for Subjective Inference: The Case of Sentence Toxicity. In REAIS Workshop at HCOMP.
David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, and Antonio Torralba. 2020. Rewriting a deep generative model. In Proc. ECCV. Springer, 351-369.
Jonathan Bragg and Daniel S Weld. 2018. Sprout: Crowd-powered task design for crowdsourcing. In Proc. UIST. 165-176.
Guillermo F Cabrera, Christopher J Miller, and Jeff Schneider. 2014. Systematic labeling bias: De-biasing where everyone is wrong. In Proc. ICPR. 4417-4422.
Joseph Chee Chang, Saleema Amershi, and Ece Kamar. 2017. Revolt: Collaborative crowdsourcing for labeling machine learning datasets. In Proc. CHI. 2334-2346.
Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 16 (2002), 321-357.
Tianqi Chen and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system. In Proc. KDD. 785-794.
John Joon Young Chung, Jean Y Song, Sindhu Kutty, Sungsoo Hong, Juho Kim, and Walter S Lasecki. 2019. Efficient elicitation approaches to estimate collective crowd answers. Proc. ACM Hum. Comput. Interact. 3, CSCW (2019), 1-25.
Gennaro Costagliola, Vincenzo Deufemia, Giuseppe Polese, and Michele Risi. 2004. A parsing technique for sketch recognition systems. In Proc. VL/HCC. 19-26.
Arthur J Cropley. 1997. Fostering creativity in the classroom: General principles. The creativity research handbook 1, 84.114 (1997), 1-46.
Nilesh Dalvi, Anirban Dasgupta, Ravi Kumar, and Vibhor Rastogi. 2013. Aggregating crowdsourced binary ratings. In Proc. WWW. 285-294.
Burcu F. Darst, Kristen C. Malecki, and Corinne D. Engelman. 2018. Using recursive feature elimination in random forest to account for correlated variables in high dimensional data. BMC Genetics 19 (2018).
Luca De Alfaro, Vassilis Polychronopoulos, and Michael Shavlovsky. 2015. Reliable aggregation of boolean crowdsourced tasks. In Proc. HCOMP.
Mark Diaz, Isaac Johnson, Amanda Lazar, Anne Marie Piper, and Darren Gergle. 2018. Addressing Age-Related Bias in Sentiment Analysis. In Proc. CHI. 1-14.
Pinar Donmez, Jaime G Carbonell, and Jeff Schneider. 2009. Efficiently learning the accuracy of labeling sources for selective sampling. In Proc. KDD. 259-268.
Anca Dumitrache. 2015. Crowdsourcing disagreement for collecting semantic annotation. In Proc. ESWC. 701-710.
Anca Dumitrache, Lora Aroyo, and Chris Welty. 2018. Capturing ambiguity in crowdsourcing frame disambiguation. In Proc. HCOMP.
Anca Dumitrache, Lora Aroyo, and Chris Welty. 2018. Crowdsourcing ground truth for medical relation extraction. ACM Trans. Interact. Intell. Syst. 8, 2 (2018).
Sam Fletcher and Md Zahidul Islam. 2018. Comparing sets of patterns with the Jaccard index. Australas. J. Inf. Syst. 22 (2018).
Clive Frankish, Richard Hull, and Pam Morgan. 1995. Recognition accuracy and user acceptance of pen interfaces. In Proc. CHI, Vol. 95. 503-510.
Mikel Galar, Alberto Fernández, Edurne Barrenechea, Humberto Bustince, and Francisco Herrera. 2011. An overview of ensemble methods for binary classifiers in multi-class problems: Experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit. 44, 8 (2011), 1761-1776.
Bin-Bin Gao, Chao Xing, Chen-Wei Xie, Jianxin Wu, and Xin Geng. 2017. Deep label distribution learning with label ambiguity. IEEE Trans. Image Process. 26, 6 (2017), 2825-2838.
W. Gao, L. Wang, Y.-F. Li, and Z.-H. Zhou. 2016. Risk minimization in the presence of label noise. In Proc. AAAI. 1575-1581.
Mitchell L Gordon, Kaitlyn Zhou, Kayur Patel, Tatsunori Hashimoto, and Michael S Bernstein. 2021. The disagreement deconvolution: Bringing machine learning performance metrics in line with reality. In Proc. CHI. 1-14.
Melody Guan, Varun Gulshan, Andrew Dai, and Geoffrey Hinton. 2018. Who said what: Modeling individual labelers improves classification. In Proc. AAAI.
Andrew Hard, Chloé M Kiddon, Daniel Ramage, Francoise Beaufays, Hubert Eichner, Kanishka Rao, Rajiv Mathews, and Sean Augenstein. 2018. Federated Learning for Mobile Keyboard Prediction. In arXiv:1811.03604.
Ali Hasan, Sana Moin, Ahmad Karim, and Shahaboddin Shamshirband. 2018. Machine learning-based sentiment analysis for twitter accounts. Math. Comput. Appl. 23, 1 (2018), 11.
Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. 2015. Towards vandalism detection in knowledge bases: Corpus construction and analysis. In Proc. SIGIR. 831-834.
Matthias Hirth, Tobias Hoßfeld, and Phuoc Tran-Gia. 2013. Analyzing costs and accuracy of validation mechanisms for crowdsourcing platforms. Math. Comput. Modell. 57, 11-12 (2013), 2918-2932.
Jason I. Hong and James A. Landay. 2000. SATIN: A Toolkit for Informal Ink-Based Applications. In Proc. UIST. 63-72.
Bing Quan Huang, YB Zhang, and Mohand Tahar Kechadi. 2007. Preprocessing techniques for online handwriting recognition. In Proc. ISDA. 793-800.
Christoph Hube, Besnik Fetahu, and Ujwal Gadiraju. 2019. Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. In Proc. CHI. 1-12.
Panagiotis G Ipeirotis, Foster Provost, Victor S Sheng, and Jing Wang. 2014. Repeated labeling using multiple noisy labelers. Data Min. Knowl. Discov. 28, 2 (2014), 402-441.
Sanjay Kairam and Jeffrey Heer. 2016. Parting crowds: Characterizing divergent interpretations in crowdsourced annotation tasks. In Proc. CSCW. 1637-1648.
David R Karger, Sewoong Oh, and Devavrat Shah. 2011. Iterative learning for reliable crowdsourcing systems. In Proc. NeurIPS. 1953-1961.
Eugene Laksana, Tadas Baltrušaitis, Louis-Philippe Morency, and John P Pestian. 2017. Investigating facial behavior indicators of suicidal ideation. In Proc. FG. 770-777.
SangWon Lee, Rebecca Krosnick, Sun Young Park, Brandon Keelean, Sach Vaidya, Stephanie D O'Keefe, and Walter S Lasecki. 2018. Exploring real-time collaboration in crowd-powered systems through a ui design tool. Proc. ACM Hum. Comput. Interact. 2, CSCW (2018), 1-23.
Luis A. Leiva, Vicent Alabau, Verónica Romero, Alejandro H. Toselli, and Enrique Vidal. 2014. Context-Aware Gestures for Mixed-Initiative Text Editing UIs. Interact. Comput. 27, 6 (2014).
Christopher H Lin, Mausam Mausam, and Daniel S Weld. 2012. Crowdsourcing control: Moving beyond multiple choice. In Proc. AAAI Workshops.
James S. Lipscomb. 1991. A trainable gesture recognizer. Pattern Recognit. 24, 9 (1991), 895-907.
Tong Liu, Akash Venkatachalam, Pratik Sanjay Bongale, and Christopher Homan. 2019. Learning to predict population-level label distributions. In Proc. WWW Companion. 1111-1120.
Wenyin Liu. 2003. On-line graphics recognition: State-of-the-art. In International Workshop on Graphics Recognition. 291-304.
Robert Tyler Loftin, James MacGlashan, Bei Peng, Matthew E Taylor, Michael L Littman, Jeff Huang, and David L Roberts. 2014. A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback. In Proc. AAAI. 937-943.
A Chris Long Jr, James A Landay, Lawrence A Rowe, and Joseph Michiels. 2000. Visual similarity of pen gestures. In Proc. CHI. 360-367.
VK Chaithanya Manam and Alexander J Quinn. 2018. Wingit: Efficient refinement of unclear task instructions. In Proc. HCOMP.
Diana Maynard and Adam Funk. 2011. Automatic detection of political opinions in tweets. In Proc. ESWC. 88-99.
Danaë Metaxa-Kakavouli, Kelly Wang, James A Landay, and Jeff Hancock. 2018. Gender-inclusive design: Sense of belonging and bias in web interfaces. In Proc. CHI. 1-6.
Charles Kay Ogden and Ivor Armstrong Richards. 1923. The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism. Nature 111, 566 (1923).
Victoria C Oleynick, Todd M Thrash, Michael C LeFew, Emil G Moldovan, and Paul D Kieffaber. 2014. The scientific study of inspiration in the creative process: challenges and opportunities. Front. Hum. Neurosci. 8 (2014), 436.
Sameera Palipana, Dariush Salami, Luis A. Leiva, and Stephan Sigg. 2021. Pantomime: Mid-Air Gesture Recognition with Sparse Millimeter-Wave Radar Point Clouds. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 5, 1 (2021).
Brandon Paulson and Tracy Hammond. 2008. Paleosketch: accurate primitive sketch recognition and beautification. In Proc. IUI. 1-10.
Brandon Paulson, Pankaj Rajan, Pedro Davalos, Ricardo Gutierrez-Osuna, and Tracy Hammond. 2008. What!?! no Rubine features?: using geometric-based features to produce normalized confidence values for sketch recognition. In HCC Workshop: Sketch Tools for Diagramming. 57-63.
Martin Potthast. 2010. Crowdsourcing a Wikipedia vandalism corpus. In Proc. SIGIR. 789-790.
D.M.W. Powers. 2011. Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. J. Mach. Learn. Technol. 2, 1 (2011).
V. C. Raykar, Y. Shipeng, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy. 2010. Learning from crowds. J. Mach. Learn. Res. 11 (2010), 1297-1322.
Ryan Rifkin and Aldebaro Klautau. 2004. In Defense of One-Vs-All Classification. J. Mach. Learn. Res. 5 (2004), 101-141.
Dean Rubine. 1991. Specifying Gestures by Example. Proc. SIGGRAPH 25, 4 (1991), 329-337.
Mike Schaekermann, Joslin Goh, Kate Larson, and Edith Law. 2018. Resolvable vs. irresolvable disagreement: A study on worker deliberation in crowd work. Proc. ACM Hum. Comput. Interact. 2, CSCW (2018), 1-19.
Shilad Sen, Margaret E Giesel, Rebecca Gold, Benjamin Hillmann, Matt Lesicko, Samuel Naden, Jesse Russell, Zixiao Wang, and Brent Hecht. 2015. Turkers, Scholars, "Arafat" and "Peace": Cultural Communities and Algorithmic Gold Standards. In Proc. CSCW. 826-838.
Victor S Sheng, Foster Provost, and Panagiotis G Ipeirotis. 2008. Get another label? improving data quality and data mining using multiple, noisy labelers. In Proc. KDD. 614-622.
Beat Signer, Ueli Kurmann, and M Norrie. 2007. iGesture: a general gesture recognition framework. In Proc. ICDAR, Vol. 2. 954-958.
Padhraic Smyth, Usama Fayyad, Michael Burl, Pietro Perona, and Pierre Baldi. 1994. Inferring Ground Truth from Subjective Labelling of Venus Images. In Proc. NeurIPS. 1085-1092.
Rion Snow, Brendan O'Connor, Dan Jurafsky, and Andrew Y Ng. 2008. Cheap and fast-but is it good? evaluating non-expert annotations for natural language tasks. In Proc. EMNLP. 254-263.
Yu Suzuki and Satoshi Nakamura. 2016. Assessing the quality of Wikipedia editors through crowdsourcing. In Proc. WWW. 1001-1006.
Dapeng Tao, Jun Cheng, Zhengtao Yu, Kun Yue, and Lizhen Wang. 2019. Domain-Weighted Majority Voting for Crowdsourcing. IEEE Trans. Neural Netw. Learn. Syst. 30, 1 (2019), 163-174.
Fangna Tao, Liangxiao Jiang, and Chaoqun Li. 2020. Label similarity-based weighted soft majority voting and pairing for crowdsourcing. Knowl. Inf. Syst. 62 (2020), 2521-2538.
Charles C. Tappert, Ching Y. Suen, and Toru Wakahara. 1990. The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12 (1990), 787-808.
Todd M Thrash, Laura A Maruskin, Scott E Cassidy, James W Fryer, and Richard M Ryan. 2010. Mediating between the muse and the masses: Inspiration and the actualization of creative ideas. J. Pers. Soc. Psych. 98, 3 (2010), 469.
T. Tian, J. Zhu, and Y. Qiaoben. 2019. Max-margin majority voting for learning from crowds. IEEE Trans. Pattern Anal. Mach. Intell. 41, 10 (2019), 2480-2494.
Carlos Toxtli, Siddharth Suri, and Saiph Savage. 2021. Quantifying the Invisible Labor in Crowd Work. Proc. ACM Hum. Comput. Interact. 5, CSCW2 (2021), 1-26.
Jinzheng Tu, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Guoqiang Xiao, and Maozu Guo. 2018. Multi-label answer aggregation based on joint matrix factorization. In Proc. ICDM. IEEE, 517-526.
Alexandra Uma, Tommaso Fornaciari, Anca Dumitrache, Tristan Miller, Jon Chamberlain, Barbara Plank, Edwin Simpson, and Massimo Poesio. 2021. SemEval-2021 Task 12: Learning with Disagreements. In ACL Workshop on Semantic Evaluation.
Shaun Wallace, Brendan Le, Luis A Leiva, Aman Haq, Ari Kintisch, Gabrielle Bufrem, Linda Chang, and Jeff Huang. 2020. Sketchy: Drawing Inspiration from the Crowd. Proc. ACM Hum. Comput. Interact. 4, CSCW2 (2020), 1-27.
Shaun Wallace, Alexandra Papoutsaki, Neilly H Tan, Hua Guo, and Jeff Huang. 2021. Case Studies on the Motivation and Performance of Contributors Who Verify and Maintain In-Flux Tabular Datasets. Proc. ACM Hum. Comput. Interact. 5, CSCW2 (2021), 1-25.
Shaun Wallace, Lucy Van Kleunen, Marianne Aubin-Le Quere, Abraham Peterkin, Yirui Huang, and Jeff Huang. 2017. Drafty: Enlisting Users to be Editors who Maintain Structured Data. In Proc. HCOMP, Vol. 5. 187-196.
Jing Wang and Xin Geng. 2019. Classification with Label Distribution Learning. In Proc. IJCAI. 3712-3718.
Peter Welinder, Steve Branson, Pietro Perona, and Serge Belongie. 2010. The multidimensional wisdom of crowds. In Proc. NeurIPS. 2424-2432.
Jacob Whitehill, Ting-fan Wu, Jacob Bergsma, Javier Movellan, and Paul Ruvolo. 2009. Whose vote should count more: Optimal integration of labels from labelers of unknown expertise. In Proc. NeurIPS. 2035-2043.
Jin Xiangyu, Liu Wenyin, Sun Jianyong, and Zhengxing Sun. 2002. On-line graphics recognition. In Proc. Pacific Graphics. 256-264.
Anbang Xu, Zhe Liu, Yufan Guo, Vibha Sinha, and Rama Akkiraju. 2017. A New Chatbot for Customer Service on Social Media. In Proc. CHI. 3506-3510.
Arianna Yuan and Yang Li. 2020. Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods. In Proc. CHI. 1-12.
Biqiao Zhang, Georg Essl, and Emily Mower Provost. 2017. Predicting the distribution of emotion perception: capturing inter-rater variability. In Proc. ICMI. 51-59.
Hao Zhang, Liangxiao Jiang, and Wenqiang Xu. 2019. Multiple Noisy Label Distribution Propagation for Crowdsourcing. In Proc. IJCAI. 1473-1479.
Qian Zhao, F Maxwell Harper, Gediminas Adomavicius, and Joseph A Konstan. 2018. Explicit or implicit feedback? Engagement or satisfaction? A field experiment on machine-learning-based recommender systems. In Proc. SAC. 1331-1340.
Yudian Zheng, Guoliang Li, Yuanbing Li, Caihua Shan, and Reynold Cheng. 2017. Truth inference in crowdsourcing: Is the problem solved? Proc. VLDB Endowment 10, 5 (2017), 541-552.