Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany
6984
Anwitaman Datta Stuart Shulman Baihua Zheng Shou-De Lin Aixin Sun Ee-Peng Lim (Eds.)
Social Informatics Third International Conference, SocInfo 2011 Singapore, October 6-8, 2011 Proceedings
Volume Editors

Anwitaman Datta, Aixin Sun
Nanyang Technological University (NTU)
School of Computer Engineering
Block N4, Nanyang Avenue, 639798, Singapore
E-mail: {anwitaman,axsun}@ntu.edu.sg

Stuart Shulman
University of Massachusetts Amherst
Thompson Hall, 200 Hicks Way, Amherst, MA 01003, USA
E-mail: [email protected]

Baihua Zheng, Ee-Peng Lim
Singapore Management University
School of Information Systems
80 Stamford Rd, 178902, Singapore
E-mail: {bhzheng,eplim}@smu.edu.sg

Shou-De Lin
National Taiwan University
Graduate Institute of Networking and Multimedia
Department of Computer Science and Information Engineering
Roosevelt Rd., Taipei 10617, Taiwan
E-mail: [email protected]

ISSN 0302-9743
e-ISSN 1611-3349
ISBN 978-3-642-24703-3
e-ISBN 978-3-642-24704-0
DOI 10.1007/978-3-642-24704-0
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2011938200
CR Subject Classification (1998): C.2, H.5, H.4, H.3, I.2.6, J.4
LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI

© Springer-Verlag Berlin Heidelberg 2011
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
The International Conference on Social Informatics was first held in Warsaw, Poland, in 2009, followed by Laxenburg, Austria, in 2010. Singapore, a major hub in the Asia-Pacific region well known for its multi-racial and multi-cultural society, is proud to have hosted the third conference. Both SocInfo 2009 and SocInfo 2010 were small meetings that covered mainly the computing perspective of social informatics. This orientation broadened at SocInfo 2011, which reached out to diverse researchers worldwide in the true Singaporean spirit, where people from different backgrounds meet and mix: a computational social science melting pot.

Ultimately, the quality of a conference depends upon the papers included in its program. We received 68 full-paper and 13 poster-paper submissions from 40 different countries, showcasing the international character of the conference. The Program and General Co-chairs wish to acknowledge the great work of our publicity chairs, who advertised the conference across the different regions. Given the multidisciplinary nature of some of the papers, we solicited up to eight reviews for some papers to obtain adequate input from different perspectives before deciding on their inclusion in the program. Our Program Committee members, supported by some external reviewers, turned in over 290 reviews. Based on these expert reviews, we selected 15 full papers, 8 short papers, and 13 poster papers for the final program. The healthy number of submissions, followed by a rigorous review process by a multi-disciplinary Program Committee, led to a very exciting program that met our primary mission: to make SocInfo 2011 a premier venue for social and computer scientists to exchange their latest research ideas, as a basis for better-integrated scholarship across the two disciplines.

The conference hosted papers spanning social network analysis and dynamics, eGovernance, security and privacy, peer production, and several case studies, to mention some key topics. This was complemented by four keynote talks, by Michael Macy (Cornell University), Noshir Contractor (Northwestern University), Jaideep Srivastava (University of Minnesota), and Hsinchun Chen (University of Arizona); two invited talks, by Hiroshi Motoda (AFOSR/AOARD and Osaka University) and Sue B. Moon (Korea Advanced Institute of Science and Technology); and tutorials by Adam Wierzbicki (Polish-Japanese Institute of Information Technology) and Stuart W. Shulman (University of Massachusetts Amherst). The conference also featured a Symposium on Social Dynamics in Online Gaming, a forum for the discussion of topics regarding online gaming and the dynamics of player interactions.

Organizing SocInfo 2011 successfully would not have been possible without the support of several funding agencies, organizations, and individuals. These include sponsorships from AOARD, AFOSR, the Lee Foundation, the
Polish-Japanese Institute of Information Technology, the Media Development Authority of Singapore, and the Singapore Internet Research Centre of Nanyang Technological University. The support from the Singapore universities, namely SMU, NUS, and NTU, as well as from the supporting associations, the Singapore Infocomm Technology Federation and the International Communication Association, along with the Steering Committee's guidance and leadership and the student travel grant support, was of key importance. SocInfo 2011 was organized by a highly dedicated Organizing Committee, and we sincerely thank the committee members for their contribution. We would also like to express our gratitude to our Honorary General Chairs, Rajendra K. Srivastava and Steven Miller, for their advice, and to the Program Committee members for their review efforts. A special thanks also goes to the administrative support team: Dorcas Ho, Angela Kwek, Chua Kian Peng, Wong Cheok Lup, Fong Soon Keat, and others who helped with the various logistics. Last but not least, without the authors, presenters, and attendees, this conference, like any other, would be worth nothing. So a special thanks goes to all the participants for giving the conference the life it has gained, and we hope that it continues to grow as a multi-disciplinary event in the years to come, providing a platform for researchers and industry partners from many walks of life to come together for the exchange of ideas.

October 2011
Ee-Peng Lim Noshir Contractor Stephen E. Fienberg Anwitaman Datta Stuart Shulman Baihua Zheng Shou-De Lin
Organization
Organizing Committee

Honorary General Chairs
Rajendra K. Srivastava, Singapore Management University
Steven Miller, Singapore Management University

General Co-chairs
Ee-Peng Lim, Singapore Management University
Noshir Contractor, Northwestern University, USA
Stephen E. Fienberg, Carnegie Mellon University, USA

Program Co-chairs
Anwitaman Datta, Nanyang Technological University, Singapore
Richard Rogers, University of Amsterdam, The Netherlands
Stuart Shulman, University of Massachusetts Amherst, USA

Workshop Co-chairs
Lin Qiu, Nanyang Technological University, Singapore
Angela Leung, Singapore Management University

Tutorial Co-chairs
Do Quoc Anh, Singapore Management University
Dion Hoe-Lian Goh, Nanyang Technological University, Singapore

Poster/Demo Co-chairs
Baihua Zheng, Singapore Management University
Shou-De Lin, National Taiwan University

Sponsorship/Exhibition Co-chairs
David Lo, Singapore Management University
Jin-Cheon Na, Nanyang Technological University, Singapore

Publications Co-chairs
Aixin Sun, Nanyang Technological University, Singapore
Leonard Bolc, Polish-Japanese Institute of Information Technology, Poland

Registration Chair
Feida Zhu, Singapore Management University
Local Arrangement Co-chairs
Hady Wirawan Lauw, Institute for Infocomm Research, Singapore
Chei Sian Lee, Nanyang Technological University, Singapore

Publicity Co-chairs
Jie Tang, Tsinghua University, China
Paolo Boldi, University of Milan, Italy
Yan Wang, Macquarie University, Australia
Vineet Chaoji, Yahoo! Research, India
Dmitri Williams, University of Southern California, USA
Eun Ju Lee, Seoul National University, Korea
Giorgos Cheliotis, National University of Singapore
Tsuyoshi Murata, Tokyo Institute of Technology, Japan

Steering Committee
Adam Wierzbicki (Chair), Polish-Japanese Institute of Information Technology, Poland
Karl Aberer, EPFL Lausanne, Switzerland
Katsumi Tanaka, Kyoto University, Japan
Anwitaman Datta, Nanyang Technological University, Singapore
Program Committee

Research Papers
Adam Wierzbicki, Polish-Japanese Institute of Information Technology, Poland
Alice Robbin, Indiana University, USA
Anne Massey, Indiana University, USA
Asur Sitaram, HP Labs, Palo Alto, USA
Baden Hughes, SYL Semantics, New Zealand
Bamshad Mobasher, DePaul University, USA
Bernardo Sorj, Federal University of Rio de Janeiro, Brazil
Bruce Neubauer, Albany State University, USA
Carlos Nunes Silva, University of Lisbon, Portugal
Chirag Shah, Rutgers University, USA
Chris Cornelis, Ghent University, Belgium
Cosma Shalizi, Carnegie Mellon University, USA
Chris Hinnant, Florida State University, USA
Christopher Mascaro, Drexel University, USA
Daniele Quercia, University of Cambridge, UK
Dario Taraborelli, University of Surrey, UK
David Millard, University of Southampton, UK
Elin Rønby Pedersen, Google, USA
Emma Angus, University of Wolverhampton, UK
Ernesto Damiani, University of Milan, Italy
Flo Reeder, MITRE Corporation, USA
Francesco Bolici, University of Cassino, Italy
Gerhard Fuchs, University of Stuttgart, Germany
George Barnett, University of California Davis, USA
Geoffery Seaver, National Defense University, USA
Georgios Lappas, Technological Educational Institution of Western Macedonia, Greece
Graciela Selaimen, Federal University of Rio de Janeiro, Brazil
Hana Alnuaim, King Abdulaziz University, Saudi Arabia
Huseyin Oktay, University of Massachusetts, USA
Helja Franssila, University of Tampere, Finland
Ibrahim Kushchu, Mobile Government Consortium International, UK
Ido Guy, IBM Research, Haifa, Israel
Irwin King, Chinese University of Hong Kong
James Caverlee, Texas A&M University, USA
James Joshi, University of Pittsburgh, USA
Jana Diesner, Carnegie Mellon University, USA
Janusz Kacprzyk, Polish Academy of Sciences, Poland
Jérôme Lang, CNRS, France
Jie Tang, Tsinghua University, China
Julita Vassileva, University of Saskatchewan, Canada
Kalpana Shankar, University College Dublin, Ireland
Karine Nahon, University of Washington, USA
Keiichi Nakata, University of Reading, UK
Klaus Bredl, University of Augsburg, Germany
Marios Dikaiakos, University of Cyprus
Mark Weal, University of Southampton, UK
Marshall Poole, University of Illinois Urbana-Champaign, USA
Martine De Cock, Ghent University, Belgium
Maurizio Teli, Ahref Foundation, Italy
Mikolaj Morzy, Poznań University of Technology, Poland
Michael Conover, Indiana University, USA
Michael Baron, Baron Consulting, Australia
Miguel Vicente, University of Valladolid, Spain
Nadia Bennani, INSA Lyon, France
Neel Sundaresan, eBay, USA
Nicholas Rowland, Pennsylvania State University, USA
Nusa Ferligoj, University of Ljubljana, Slovenia
Paolo Massa, SoNet, Fondazione Bruno Kessler, Trento, Italy
Pedro Garcia-Lopez, University Rovira i Virgili, Spain
Peter Cruickshank, Edinburgh Napier University, UK
Przemyslaw Kazienko, Wroclaw University of Technology, Poland
Richard Forno, University of Maryland Baltimore County, USA
Richard Rogers, University of Amsterdam, The Netherlands
See-Kiong Ng, Institute for Infocomm Research, Singapore
Shawn Walker, University of Washington, USA
Sini Ruohomaa, University of Helsinki, Finland
Sonja Buchegger, KTH Stockholm, Sweden
Stuart Anderson, University of Edinburgh, UK
Stephan Humer, Berlin University of the Arts, Germany
Sun-Ki Chai, University of Hawaii, USA
Svetlin Bostandjiev, University of California Santa Barbara, USA
Taewoo Nam, University at Albany-SUNY, USA
Thanakorn Sornkaew, Ramkhamhaeng University, Thailand
Thomas Ågotnes, University of Bergen, Norway
Tony Moore, Deloitte Analytics, USA
Thorsten Strufe, Universität Mannheim, Germany
Timothy French, University of Bedfordshire, UK
Tsuyoshi Murata, Tokyo Institute of Technology, Japan
Ulrik Brandes, University of Konstanz, Germany
Vaclav Snasel, VŠB-Technical University of Ostrava, Czech Republic
Wai-Tat Fu, University of Illinois at Urbana-Champaign, USA
Weining Zhang, University of Texas at San Antonio, USA
Wenlian Hsu, Academia Sinica, Taiwan
Winter Mason, Yahoo! Research, USA
Xiaolin Shi, Stanford University, USA
Xiaoyong Du, Renmin University, China
Yefeng Liu, Waseda University, Japan
Ying Ding, Indiana University, USA
Yuh-Jong Hu, National Chengchi University, Taiwan
Yves-Alexandre De Montjoye, Massachusetts Institute of Technology, USA

Posters/Demos
Ken C.K. Lee, University of Massachusetts Dartmouth, USA
I-Hsien Ting, National University of Kaohsiung, Taiwan
James Cheng, Nanyang Technological University, Singapore
Man-Kwan Shan, National Chengchi University, Taiwan
Mi-Yen Yeh, Academia Sinica, Taiwan
Wen-Chih Peng, National Chiao Tung University, Taiwan
Haibo Hu, Hong Kong Baptist University, Hong Kong
Yu-Ru Lin, Harvard University and Northeastern University, USA
External Reviewers
Aixin Sun, Alexandru Iosup, Amirreza Masoumzadeh, Andreas Höfer, Anthony Ventresque, Bing He, Bing Tian Dai, Byung-Won On, Cheng-Wei Lee, Dapi Shih, Hady Lauw, Haiqin Yang, Hassan Takabi, Ikram Muhammad Khan, Ioannis Katakis, Jie Zhang, Jordi Duch, Mun Thye Mak, Nicholas Loulloudes, Palakorn Achananuparp, Roger Lee, Supaporn Spanurattana, Thomas Paul, Vera Liao, Wei Dong, William Yeoh, Xiaoli Li, Xin Xin, Xuelian Long, Zhenjiang Lin
Supporting Universities

Supporting Associations

Sponsors
Table of Contents

Keynotes

Digital Media and the Relational Revolution in Social Science .......... 1
   Michael W. Macy

Using Web Science to Understand and Enable 21st Century
Multidimensional Networks .............................................. 3
   Noshir Contractor

Data Mining as a Key Enabler of Computational Social Science ........... 4
   Jaideep Srivastava

Predicting Market Movements: From Breaking News to Emerging
Social Media ........................................................... 5
   Hsinchun Chen

Invited Talks

Learning Information Diffusion Models from Observation and Its
Application to Behavior Analysis ....................................... 6
   Hiroshi Motoda

Analysis of Twitter Unfollow: How Often Do People Unfollow in
Twitter and Why? ....................................................... 7
   Sue Moon

Network Analysis

Robustness of Social Networks: Comparative Results Based on
Distance Distributions ................................................. 8
   Paolo Boldi, Marco Rosa, and Sebastiano Vigna

Endogenous Control of DeGroot Learning ................................. 22
   Sridhar Mandyam and Usha Sridhar

Mathematical Continuity in Dynamic Social Networks ..................... 36
   John L. Pfaltz

eGovernance and Knowledge Management

Government 2.0 Collects the Wisdom of Crowds ........................... 51
   Taewoo Nam and Djoko Sigit Sayogo

Web Searching for Health: Theoretical Foundations for Analyzing
Problematic Search Engine Use .......................................... 59
   Pallavi Rao and Marko M. Skoric

The Role of Trust and ICT Proficiency in Structuring the
Cross-Boundary Digital Government Research ............................. 67
   Djoko Sigit Sayogo, Taewoo Nam, and Jing Zhang

Integration and Warehousing of Social Metadata for Search and
Assessment of Scientific Knowledge ..................................... 75
   Daniil Mirylenka, Fabio Casati, and Maurizio Marchese

Applications of Network Analysis

Comparing Linkage Graph and Activity Graph of Online Social
Networks ............................................................... 84
   Yuan Yao, Jiufeng Zhou, Lixin Han, Feng Xu, and Jian Lü

Context-Aware Nearest Neighbor Query on Social Networks ................ 98
   Yazhe Wang and Baihua Zheng

Using Tag Recommendations to Homogenize Folksonomies in
Microblogging Environments ............................................. 113
   Eva Zangerle, Wolfgang Gassler, and Günther Specht

Community Dynamics

A Spectral Analysis Approach for Social Media Community Detection ...... 127
   Xuning Tang, Christopher C. Yang, and Xiajing Gong

Design of a Reputation System Based on Dynamic Coalition Formation ..... 135
   Yuan Liu, Jie Zhang, and Quanyan Zhu

Guild Play in MMOGs: Rethinking Common Group Dynamics Models ........... 145
   Muhammad Aurangzeb Ahmad, Zoheb Borbora, Cuihua Shen,
   Jaideep Srivastava, and Dmitri Williams

Tadvise: A Twitter Assistant Based on Twitter Lists .................... 153
   Peyman Nasirifard and Conor Hayes

Case Studies

A Case Study of the Effects of Moderator Posts within a Facebook
Brand Page ............................................................. 161
   Irena Pletikosa Cvijikj and Florian Michahelles

Cognition or Affect? Exploring Information Processing on Facebook ...... 171
   Ksenia Koroleva, Hanna Krasnova, and Oliver Günther

Trust, Privacy, and Security

Trend Analysis and Recommendation of Users' Privacy Settings on
Social Networking Services ............................................. 184
   Toshikazu Munemasa and Mizuho Iwaihara

Semantics-Enabled Policies for Information Sharing and Protection
in the Cloud ........................................................... 198
   Yuh-Jong Hu, Win-Nan Wu, and Jiun-Jan Yang

Social Mechanism of Granting Trust Basing on Polish Wikipedia
Requests for Adminship ................................................. 212
   Piotr Turek, Justyna Spychala, Adam Wierzbicki, and Piotr Gackowski

Revealing Beliefs Influencing Trust between Members of the Czech
Informatics Community .................................................. 226
   Tomáš Knap and Irena Mlýnková

Peer-Production

High-Throughput Crowdsourcing Mechanisms for Complex Tasks ............. 240
   Guido Sautter and Klemens Böhm

Designing for Motivation: Focusing on Motivational Values in Two
Case Studies ........................................................... 255
   Fahri Yetim, Torben Wiedenhoefer, and Markus Rohde

A Bounded Confidence Approach to Understanding User Participation
in Peer Production Systems ............................................. 269
   Giovanni Luca Ciampaglia

Posters/Demos

Modelling Social Network Evolution ..................................... 283
   Radoslaw Michalski, Sebastian Palus, Piotr Bródka,
   Przemyslaw Kazienko, and Krzysztof Juszczyszyn

Towards High-Quality Semantic Entity Detection over Online Forums ...... 287
   Juan Du, Weiming Zhang, Peng Cai, Linling Ma, Weining Qian, and
   Aoying Zhou

"I'm Not an Alcoholic, I'm Australian": An Exploration of Alcohol
Discourse in Facebook Groups ........................................... 292
   Sarah Posner and Dennis Wollersheim

Impact of Expertise, Social Cohesiveness and Team Repetition for
Academic Team Recommendation ........................................... 296
   Anthony Ventresque, Jackson Tan Teck Yong, and Anwitaman Datta

CEO's Apology in Twitter: A Case Study of the Fake Beef Labeling
Incident by E-Mart ..................................................... 300
   Jaram Park, Hoh Kim, Meeyoung Cha, and Jaeseung Jeong

GViewer: GPU-Accelerated Graph Visualization and Mining ................ 304
   Jianlong Zhong and Bingsheng He

Sharing Scientific Knowledge with Knowledge Spaces ..................... 308
   Marcos Baez, Fabio Casati, and Maurizio Marchese

Analysis of Multiplayer Platform Users Activity Based on the
Virtual and Real Time Dimension ........................................ 312
   Jaroslaw Jankowski

Tracking Group Evolution in Social Networks ............................ 316
   Piotr Bródka, Stanislaw Saganowski, and Przemyslaw Kazienko

Gathering in Digital Spaces: Exploring Topical Communities on
Twitter ................................................................ 320
   Cate Huston and Michael Weiss

"Eco-MAME": Ecology Activity Promotion System Based on Human
Psychological Characteristics .......................................... 324
   Rie Tanaka, Shinichi Doi, Taku Konishi, Naoki Yoshinaga,
   Satoko Itaya, and Keiji Yamada

SPLASH: Blending Gaming and Content Sharing in a Location-Based
Mobile Application ..................................................... 328
   Dion Hoe-Lian Goh, Chei Sian Lee, Alton Y.K. Chua,
   Khasfariyati Razikin, and Keng-Tiong Tan

An Interactive Social Boarding System Using Home Infotainment
Platform ............................................................... 332
   Sounak Dey and Avik Ghose

Tutorials

From Computational to Human Trust: Problems, Methods and
Applications of Trust Management ....................................... 338
   Adam Wierzbicki

Text Analytics for Social Research ..................................... 339
   Stuart W. Shulman

Author Index ........................................................... 341
Digital Media and the Relational Revolution in Social Science

Michael W. Macy
Department of Sociology and Department of Information Science
Cornell University, Ithaca, NY 14853
Abstract. Social science paradigms are invariably grounded in the available methods of data collection. Beginning with administrative records in the late 19th Century, social scientists have collected stores of data on individual attributes, using surveys and records kept by governments and employers. Individual-level data is also aggregated as population statistics for groups of varying size, from households to nation states, and these data are analyzed using multivariate linear models that require the implausible assumption that the observations are independent, as if each respondent was the sole resident of a small island. In comparison, until recently, we have had very limited data about the interactions between people - such as influence, sanctioning, exchange, trust, attraction, avoidance, and imitation. Yet social relations and interactions are the foundation of social life. The entities that we most need to learn about are the things about which we know the least. The reason is simple: It is much easier to observe friends than to observe a friendship. Social interactions are fleeting and mostly private - one needs to be present at precisely the right moment. Moreover, relations are tedious and error-prone to hand-code and record, given the nuances of interaction, the need for repeated observations as relations unfold over time, and the rapid increase in the number of relations as the size of the group increases. As a consequence, studies of social interactions tend to be static, limited to the structures of interaction without regard to content, and based on very small groups. That is why social science has generally been limited mainly to the study of individuals with individual data aggregated for groups and populations. Except in very small groups, social relations have been just too hard to observe. All this is rapidly changing as human interactions move increasingly online. Interactions that for the history of humankind have been private and ephemeral in nature now leave a silicon record - literally footprints in the sand - in the form of publicly available digital records that allow automatic data collection on an unprecedented scale. However, social scientists have been reluctant to embrace the study of what is often characterized as the "virtual world," as if human interaction somehow becomes metaphysical the moment it is mediated by information technologies. While great care must be exercised in generalizing to the offline world, the digital traces of computer-mediated interactions are unique in human history, providing an exceptional opportunity for research on the dynamics of social interaction, in which individuals influence selected others in response to the influences they receive. In my presentation, I will survey recent studies using digital records of interpersonal interaction to address questions ranging from
social inequality to diurnal and seasonal mood changes to the spread of protest in the Arab Spring, including contributions by Rob Claxton, Nathan Eagle, Scott Golder, Jon Kleinberg, Noona Oh, Patrick Park, Michael Siemens, Silvana Toska, and Shaomei Wu.
Using Web Science to Understand and Enable 21st Century Multidimensional Networks

Noshir Contractor
Northwestern University, Evanston, IL, USA
Abstract. Recent advances in Web Science provide comprehensive digital traces of social actions, interactions, and transactions. These data provide an unprecedented exploratorium to model the socio-technical motivations for creating, maintaining, dissolving, and reconstituting multidimensional social networks. Multidimensional networks include multiple types of nodes (people, documents, datasets, tags, etc.) and multiple types of relationships (co-authorship, citation, web links, etc). Using examples from research in a wide range of activities such as disaster response, public health and massively multiplayer online games, Contractor will argue that Web Science serves as the foundation for the development of social network theories and methods to help advance our ability to understand and enable multidimensional networks.
Data Mining as a Key Enabler of Computational Social Science

Jaideep Srivastava
Department of Computer Science & Engineering
University of Minnesota, Twin Cities, USA
Abstract. Observation and analysis of a phenomenon at unprecedented levels of granularity not only furthers our understanding of it, but also transforms the way it is studied. For instance, the invention of gene-sequencing and computational analysis transformed the life sciences, creating fields of inquiry such as genomics, proteomics, etc.; and the Hubble space telescope has furthered the ability of humanity to look much farther beyond what we could otherwise. With the mass adoption of the Internet in our daily lives, and the ability to capture high-resolution data on its use, we are at the threshold of a fundamental shift not only in our understanding of the social and behavioral sciences (i.e., psychology, sociology, and economics), but also in the ways in which we study them. Massively Multiplayer Online Games (MMOGs) and Virtual Worlds (VWs) have become increasingly popular and have communities comprising tens of millions. They serve as unprecedented tools to theorize and empirically model the social and behavioral dynamics of individuals, groups, and networks within large communities. The preceding observation has led to a number of multi-disciplinary projects, involving teams of behavioral scientists and computational scientists, working together to develop novel methods and tools to explore the current limits of the behavioral sciences. This talk consists of four parts. First, we describe findings from the Virtual World Exploratorium, a multi-institutional, multi-disciplinary project which uses data from commercial MMOGs and VWs to study many fields of social science, including sociology, social psychology, organization theory, group dynamics, macro-economics, etc. Results from investigations into various behavioral sciences will be presented. Second, we provide a survey of new approaches for behavioral informatics that are being developed by multi-disciplinary teams, and their successes. We will also introduce novel tools and techniques that are being used and/or developed as part of this research. Third, we will discuss some novel applications that are not yet there, but are just around the corner, and their associated research issues. Finally, we present commercial applications of Game Analytics research, based on our experiences with a startup company that we've created.
Predicting Market Movements: From Breaking News to Emerging Social Media

Hsinchun Chen
Department of Management Information Systems
University of Arizona, 1130 E. Helen St., Tucson, AZ 85721
Abstract. In this talk I will present several studies conducted at the AI Lab of the University of Arizona that aim to understand and predict market movements using text mining, breaking news, and social media. In “User-Generated Content on Social Media: Predicting New Product Market Success from Online Word-of-Mouth,” we explore the predictive validity of various text and sentiment measures of online WOM for the market success of new products. The context of our study is the Hollywood movie industry where the forecast of movie sales is highly challenging and has started to incorporate online WOM. We first examine the evolvement patterns of online WOM over time, followed by correlation analysis of how various sentiment measures are related to the metrics of new product success. Overall, the number of WOM messages was found to be the most useful predictor of the five new product metrics. In “AZ SmartStock: Stock Prediction with Targeted Sentiment and Life Support,” we develop a text-based stock prediction engine with targeted sentiment and life support considerations in a real world financial setting. We focus on inter-day trading experiments, with the 5-, 10-, 20-, and 40-day trading windows. We focus on S&P 500 firms in order to minimize the potential illiquid problem associated with thinly traded stocks. News articles from major newswires were extracted from Yahoo! Finance. Life support of a company is extracted from aggregated energy (novelty) of terms used in the news articles where the company is mentioned. The combined Life-Support model was shown to out-perform other models in the 10-day trading window setting. In “A Stakeholder Approach to Stock Prediction using Finance Social Media,” we utilize firm-related finance web forum discussions for the prediction of stock return and trading of firm stock. Considering forum participants uniformly as shareholders of the firm, suggested by prior studies, and extracting forum-level measures provided little improvement over the baseline set of fundamental and technician variables. Recognizing the true diversity among forum participants, segmenting them into stakeholder groups based upon their interactions in the forum social network and assessing them independently, refined the measures extracted from the forum and improved stock return prediction. The superior performance of the stakeholder-level model represented a statistically significant improvement over the baseline in directional accuracy, and provided an annual return of 44% in simulated trading of firm stock.
Learning Information Diffusion Models from Observation and Its Application to Behavior Analysis

Hiroshi Motoda
The Institute of Scientific and Industrial Research
Osaka University, Japan
Abstract. We investigate how well different information diffusion models can explain observation data by learning their parameters, and discuss which model is more appropriate to which topic. We use two models, one from push-type diffusion (AsIC) and the other from pull-type diffusion (AsLT), both of which are extended versions of the well-known Independent Cascade (IC) and Linear Threshold (LT) models that incorporate asynchronous time delay. The model parameters are learned by maximizing the likelihood of the observed data being generated, using an EM-like iterative search, and the model selection is performed by choosing the one with better predictive power. We first show, using four real networks, that the proposed learning algorithm correctly learns the model parameters both accurately and stably, and the proposed selection method identifies the correct diffusion model from which the data are generated. We then apply these methods to a behavioral analysis of topic propagation using a real blog diffusion sequence, and show that although the inferred relative diffusion speed and range for each topic is rather insensitive to the model selected, there is a clear indication of which topic better follows which model. The correspondence between the topic and the model selected is indeed interpretable.
Analysis of Twitter Unfollow: How Often Do People Unfollow in Twitter and Why?

Sue Moon
Computer Science Department, KAIST
291 Daehak-ro, Yuseong-gu, Daejeon, Korea
Abstract. Unfollow in Twitter offers researchers a unique opportunity to study the dissolution of relationships. We collected daily snapshots of the follow relationships of 1.2 million Korean-speaking users for 51 days, together with all of their tweets. From careful statistical analysis, we confirm that unfollow is prevalent and irrelevant to the volume of interaction. We find that other factors, such as link reciprocity, tweet burstiness, and informativeness, are crucial for the unfollow decision. We conducted interviews with 22 users to supplement the results and figure out the motivations behind unfollow behavior. From this quantitative and qualitative research we draw significant implications in both theory and practice. We then use a multiple logistic regression model to analyze the impacts of the structural and interactional properties on unfollow in Twitter. Our model with 42 dependent variables demonstrates that both structural and interactional properties are important to explain the unfollow behavior. Our findings are consistent with previous literature about the multiple dimensions of tie strength in sociology, but also add a unique aspect of the unfollow decision: people appreciate receiving attention rather than giving it.
Robustness of Social Networks: Comparative Results Based on Distance Distributions

Paolo Boldi, Marco Rosa, and Sebastiano Vigna
Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Italia
Abstract. Given a social network, which of its nodes have a stronger impact in determining its structure? More formally: which node-removal order has the greatest impact on the network structure? We approach this well-known problem for the first time in a setting that combines both web graphs and social networks, using datasets that are orders of magnitude larger than those appearing in the previous literature, thanks to some recently developed algorithms and software tools that make it possible to approximate accurately the number of reachable pairs and the distribution of distances in a graph. Our experiments highlight deep differences in the structure of social networks and web graphs, show significant limitations of previous experimental results, and at the same time reveal clustering by label propagation as a new and very effective way of locating nodes that are important from a structural viewpoint.
1 Introduction
In recent years, there has been ever-increasing research activity in the study of real-world complex networks [WF94] (the world-wide web, the Internet autonomous-systems graph, coauthorship graphs, phone call graphs, email graphs and biological networks, to cite a few). These networks, typically generated directly or indirectly by human activity and interaction, appear in a large variety of contexts and often exhibit a surprisingly similar structure. One of the most important notions that researchers have been trying to capture is "node centrality": ideally, every node (often representing an individual) has some degree of influence or importance within the social domain under consideration, and one expects such importance to be reflected in the structure of the social network; centrality is a quantitative measure that aims at revealing the importance of a node. Among the types of centrality that have been considered in the literature (see [Bor05] for a good survey), many have to do with shortest paths between nodes; for example, the betweenness centrality of a node v is the sum, over all pairs of nodes x and y, of the fraction of shortest paths from x to y passing through v. The role played by shortest paths is justified by one of the most well known features of complex networks, the so-called small-world phenomenon. A small-world network [CH10] is a graph where the average distance between nodes is logarithmic in the size of the network, whereas the clustering coefficient
is large (that is, neighbourhoods tend to be denser) than in a random Erdős–Rényi graph with the same size and average distance.[1] Here, and in the following, by "distance" we mean the length of the shortest path between two nodes. The fact that social networks (either electronically mediated or not) exhibit the small-world property is known at least since Milgram's famous experiment [Mil67] and is arguably the most popular of all features of complex networks.

Based on the observation that the small-world property is by far the most crucial of all the features that social networks exhibit, it is quite natural to consider centrality measures that are based on node distance, like betweenness. On the other hand, albeit interesting and profound, such measures are often computationally too expensive to be actually computed on real-world graphs; for example, the best known algorithm to compute betweenness centrality [Bra01] takes time O(nm) and requires space for O(n + m) integers (where n is the number of nodes and m is the number of arcs): both bounds are infeasible for large networks, where typically n ≈ 10^9 and m ≈ 10^11. For this reason, in most cases other, strictly local measures of centrality are usually preferred (e.g., degree centrality).

One of the ideas that have emerged in the literature is that node centrality can be evaluated based on how much the removal of the node "disrupts" the graph structure [AJB00]. This idea also provides a notion of robustness of the network: if removing few nodes has no noticeable impact, then the network structure is clearly robust in a very strong sense. On the other hand, a node-removal strategy that quickly affects the distribution of distances probably reflects an importance order of the nodes.

Previous literature has mainly used the diameter or some analogous measure to establish whether the network structure changed. Recently, though, there have been some successful attempts to produce reliable estimates of the neighbourhood function of very large graphs [PGF02, BRV11]; an immediate application of these approximate algorithms is the computation of the number of reachable pairs of the graph (the number of pairs x, y such that there is a directed path from x to y) and of its distance distribution (the distance distribution of a graph is a discrete distribution that gives, for every t, the fraction of pairs of nodes that are at distance t). From these data, a number of existing measures can be computed quickly and accurately, and new ones can be conceived.

We thus consider a certain ordering of the nodes of a graph (that is supposed to represent their "importance" or "centrality"). We remove nodes (and of course their incident arcs) following this order, until a certain percentage ϑ of the arcs has been deleted;[2] finally, we compare the number of reachable pairs and the distance distribution of the new graph with those of the original one. The chosen ordering is considered to be a reliable measure of centrality if the measured difference increases rapidly with ϑ (i.e., it is sufficient to delete a small fraction of important nodes to change the structure of the graph).

In this work, we applied the described approach to a number of complex networks, considering different orderings, and obtained the following results:

– In all complex networks we considered, the removal of a limited fraction of randomly chosen nodes does not change the distance distribution significantly, confirming previous results.
– We test strategies based on PageRank and on clustering (see Section 4.1 for more information about this), and show that they (in particular, the latter) quickly disrupt the structure of a web graph.
– Maybe surprisingly, none of the above strategies seems to have an impact when applied to social networks other than web graphs. This is yet another example of a profound structural difference between web graphs and social networks,[3] along the same lines as those discussed in [BRV11] and [CKL+09]. This observation, in particular, suggests that social networks tend to be much more robust and cohesive than web graphs, at least as far as distances are concerned, and that "scale-free" models, which are currently proposed for both types of networks, fail to capture this important difference.

[1] The reader might find this definition a bit vague, and some variants are often spotted in the literature: this is a general problem, also highlighted recently in [LADW05].
[2] Observe that we delete nodes but count the percentage of arcs removed, and not of nodes: this choice is justified by the fact that otherwise node orderings that put large-degree nodes first would certainly be considered (unfairly) more disruptive.
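To make the cost trade-off concrete, the following sketch (our illustration, not part of the original paper) computes exact betweenness centrality with Brandes' algorithm as implemented in the networkx library; the toy random graph stands in for a real network, and the computation is practical for thousands of nodes but hopeless at the n ≈ 10^9 scale discussed above.

```python
# Illustrative sketch: exact betweenness centrality via Brandes' algorithm,
# O(nm) time on unweighted graphs.  Feasible only for small graphs.
import networkx as nx

G = nx.erdos_renyi_graph(n=1000, p=0.01, seed=42)  # toy stand-in for a real network
bc = nx.betweenness_centrality(G)                  # dict: node -> centrality score
top = sorted(bc, key=bc.get, reverse=True)[:10]    # the ten most central nodes
print(top)
```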
2 Related Work
The idea of grasping information about the structure of a network by repeatedly removing nodes from it is not new: Albert, Jeong and Barabási [AJB00] study experimentally the variation of the diameter on two different models of undirected random graphs when nodes are removed either randomly or in "connectedness order", and report different behaviours. They also perform tests on some small real data sets, and we will compare their results with ours in Section 6. More recently, node-centrality measures that look at how some graph invariant changes when some vertices or edges are deleted (sometimes called "vitality" [BE05] or "induced" measures) have been studied, for example, in [Bor06] (identifying nodes that maximally disconnect the network) or in [BCK06] (related to the uncertainty of data).

Donato, Leonardi, Millozzi and Tsaparas [DLMT08] study how the size of the giant component changes when nodes of high indegree or outdegree are removed from the graph. While this is an interesting measure, it does not provide information about what happens outside the component. They develop a library for semi-external visits that makes it possible to compute the strongly connected components of large graphs exactly.
[3] We remark that several proposals have been made to find features that highlight such structural differences in a computationally feasible way (e.g., assortative mixing [NP03]), but all the instances we are aware of have been questioned by the subsequent literature, so no clear-cut results are known as yet.
Finally, Fogaras [Fog03] considers how the harmonic diameter[4] (the harmonic mean of the distances) changes as nodes are deleted from a small (less than one million nodes) snapshot of the .ie domain, reporting a large increase (100%) when as little as 1000 nodes with high PageRank are removed. The harmonic diameter is estimated by a small number of visits, however, which gives no statistical guarantee on the accuracy of the results.

Our study is very different. First of all, we use graphs that are two orders of magnitude larger than those considered in [AJB00] or [Fog03]; moreover, we study the impact of node removal on the whole spectrum of distances. Second, we apply removal procedures to large social networks (previous literature used only web or Internet graphs), and the striking difference in behaviour shows that "scale-free" models fail to capture essential differences between these kinds of networks and web graphs. Third, we document in a reproducible way all our experiments, which have provable statistical accuracy.
3 Computing the Distance Distribution
Given a directed graph G, its neighbourhood function N_G(t) returns, for each t ∈ N, the number of pairs of nodes x, y such that y is reachable from x in no more than t steps. From the neighbourhood function, several interesting features of a graph can be estimated; in this paper we are especially interested in the distance distribution of the graph G, represented by the cumulative distribution function H_G(t), which returns the fraction of reachable pairs at distance at most t, that is, H_G(t) = N_G(t) / max_t N_G(t). The corresponding probability density function will be denoted by h_G.

Recently, HyperANF [BRV11] emerged as an evolution of the ANF tool [PGF02]. HyperANF can compute, for the first time, the neighbourhood function of graphs with billions of nodes in a few hours, with a small error and good confidence, using a standard workstation. The free availability of HyperANF opens new and interesting ways to study large graphs, of which this paper is an example.
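To fix the notation, the sketch below computes N_G(t) and H_G(t) exactly, with one BFS per node; it is our didactic stand-in for HyperANF, which instead keeps an approximate (HyperLogLog) counter per node and therefore scales to billions of nodes. Pairs with x = y are not counted here, a harmless simplification of our own.

```python
# Illustrative sketch (not HyperANF): exact neighbourhood function N_G(t)
# via one BFS per node.  Returns a dense list N with N[0] = 0 and
# N[t] = number of ordered pairs (x, y), x != y, with d(x, y) <= t.
from collections import Counter, deque

def neighbourhood_function(adj):
    """adj: dict mapping each node to a list of its successors."""
    per_distance = Counter()        # t -> number of pairs at distance exactly t
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    per_distance[dist[v]] += 1
                    queue.append(v)
    diameter = max(per_distance, default=0)
    N = [0] * (diameter + 1)
    for t in range(1, diameter + 1):
        N[t] = N[t - 1] + per_distance[t]
    return N

def distance_cdf(N):
    """H_G(t) = N_G(t) / max_t N_G(t), the cumulative distance distribution."""
    return [n / N[-1] for n in N]
```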
4 Removal Strategies and Their Analysis
In the previous section, we discussed how we can effectively approximate the distance distribution of a given graph G; we shall use such a distribution as the graph structural property of interest. Consider now a given total order ≺ on the nodes of G; we think of ≺ as a removal strategy in the following sense: when we want to remove ϑm arcs, we start by removing the ≺-largest node (and its incident arcs), go on removing the second ≺-largest node, and so on, and stop as soon as at least ϑm arcs have been removed.
[4] Actually, the notion had been introduced before by Marchiori and Latora and named connectivity length [ML00], but we find the name "harmonic diameter" much more insightful.
The resulting graph will be denoted by G(≺, ϑ). Of course, G(≺, 0) = G, whereas G(≺, 1) is the empty graph. We are interested in applying some measure of divergence[5] between the distribution H_G and the distribution H_{G(≺,ϑ)}. By looking at the divergence as ϑ varies, we can judge the ability of ≺ to identify nodes that will disrupt the network.
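For concreteness, the removal procedure can be transcribed directly; the sketch below is our illustration (using networkx, which the paper does not use), where `order` lists the nodes, most important first.

```python
# Sketch of the node-removal procedure: delete nodes in the given order
# until at least a fraction theta of the original arcs has been removed.
import networkx as nx

def remove_by_order(G, order, theta):
    """G: nx.DiGraph; order: iterable of nodes, most important first."""
    H = G.copy()
    m = G.number_of_edges()
    removed = 0
    for node in order:
        if removed >= theta * m:     # stop as soon as >= theta * m arcs are gone
            break
        removed += H.in_degree(node) + H.out_degree(node)
        H.remove_node(node)          # also deletes all incident arcs
    return H
```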
4.1 Some Removal Strategies
We considered several different strategies for removing nodes from a graph. Some of them actually embody significant knowledge about the structure of the graph, whereas others are very simple (or even independent of the graph) and will be used as a baseline. Some of them have been used in the previous literature, and will be useful to compare our results. As a first observation, some strategies require a symmetric graph (a.k.a. undirected); in this case, we symmetrise the graph by adding the missing arcs.[6] The second obvious observation is that some strategies might depend on available metadata (e.g., URLs for web graphs) and might not make sense for all graphs.

Random. No strategy: we pick random nodes and remove them from the graph. It is important to test against this "nonstrategy", as we can show that the phenomena we observe are due to the peculiar choice of nodes involved, and not to some generic property of the graph.

Largest-degree first. We remove nodes in decreasing (out)degree order. This strategy is an obvious baseline, as degree centrality is the first shot at centrality in a network.

Near-root. In web graphs, we can assume that nodes that are roots of web sites and their (quasi-)immediate successors (e.g., pages linked by the root) are most important in establishing the distance distribution, as people tend to link to the higher levels of web sites. This strategy essentially removes first the root nodes, then the nodes that are children of a root, and so on.

PageRank. PageRank [PBMW98] is a well-known algorithm that assigns ranks to nodes using a Markov chain based on the structure of the graph. It has been designed as an improvement over degree centrality, because nodes with high degree that are, however, connected to nodes of low rank will have a rather low rank, too (the definition is indeed recursive). There is a vast body of literature on the subject: see [BSV09, LM04] and the references therein.

Label propagation. Label propagation [RAK07] is a powerful technique for clustering symmetric graphs. Each node has a label (initially, the node number itself) and, through a number of rounds, each node changes its label by taking the label of the majority of its neighbours. At the end, node labels are used as cluster identifiers. Our removal strategy picks first, for each cluster in decreasing size order, the node with the highest number of neighbours in other clusters: intuitively, it is a representative of a set of tightly connected nodes (the cluster) which however has a very significant connection with the outside world (the other clusters), and thus we expect its removal to seriously disrupt the distance distribution. Once we have removed all such nodes, we proceed again, cluster by cluster, using the same criterion (thus picking the second node of each cluster that has the most connections towards other clusters), and so on.

[5] We purposely use the word "divergence" between distributions, instead of "distance", to avoid confusion with the notion of distance in a graph.
[6] It is mostly a matter of taste whether to use directed symmetric graphs or simple undirected graphs. In our case, since we have to cope with both directed and undirected graphs, we prefer to speak of directed graphs that are symmetric, that is, for every arc x → y there is a symmetric arc y → x.
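The clustering step is simple enough to sketch; the version below is our illustrative variant with random sweep order and random tie breaking in the spirit of [RAK07], followed by the derived removal order. For brevity, the boundary counts are computed once rather than recomputed after each round of removals, which is one possible reading of the description above.

```python
# Illustrative sketch of label propagation clustering on a symmetric
# graph, plus the derived node-removal order.
import random
from collections import Counter

def label_propagation(adj, max_rounds=100, seed=0):
    """adj: dict node -> list of neighbours (symmetric).  Returns node -> label."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}          # every node starts in its own cluster
    nodes = list(adj)
    for _ in range(max_rounds):
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            if not adj[v]:
                continue
            freq = Counter(labels[u] for u in adj[v])
            best = max(freq.values())
            new = rng.choice([l for l, c in freq.items() if c == best])
            if new != labels[v]:
                labels[v], changed = new, True
        if not changed:                   # stop when labels are stable
            break
    return labels

def cluster_removal_order(adj, labels):
    """Round-robin over clusters (largest first); within each cluster, nodes
    are ranked by their number of neighbours in *other* clusters."""
    boundary = {v: sum(labels[u] != labels[v] for u in adj[v]) for v in adj}
    clusters = {}
    for v in adj:
        clusters.setdefault(labels[v], []).append(v)
    ranked = sorted(clusters.values(), key=len, reverse=True)
    for c in ranked:
        c.sort(key=boundary.get, reverse=True)
    order, i = [], 0
    while any(i < len(c) for c in ranked):
        order.extend(c[i] for c in ranked if i < len(c))
        i += 1
    return order
```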
4.2 Measures of Divergence
Once we have changed the structure of a graph by deleting some of its nodes (and arcs), there are several ways to measure whether the structure of the graph has significantly changed. The first, basic raw datum we consider is the fraction of pairs of nodes that are still reachable (w.r.t. the number of pairs initially reachable). Then, to estimate the change of the distance distribution we considered the following possibilities (here P denotes the original distance distribution, and Q the distribution after node removal):

Relative average-distance change. This is somehow the simplest and most natural measure: how much has the average distance changed? We use the measure

    δ(P, Q) = μ_Q / μ_P − 1,

where μ denotes the average; in other words, we measure how much the average value changed. This measure is non-symmetric, but it is of course easy to obtain δ(P, Q) from δ(Q, P).

Relative harmonic-diameter change. This measure is analogous to the relative average-distance change, but the average on distances is harmonic and computed on all pairs, that is,

    n(n − 1) / Σ_{x≠y} 1/d(x, y) = n(n − 1) / Σ_{t>0} (1/t) (N_G(t) − N_G(t − 1)),

where n is the number of nodes of the graph. This measure, used in [Fog03], combines reachability information, as unreachable pairs contribute zero to the sum. It is easily computable from the neighbourhood function, as shown above.

Kullback-Leibler divergence. This is a measure of information gain, in the sense that it gives the number of additional bits that are necessary to code samples drawn from P when using an optimal code for Q. Also this measure is non-symmetric, but there is no way to obtain the divergence from P to Q given that from Q to P.

ℓ norms. A further alternative is given by viewing distance distributions as functions N → [0..1] and measuring their distance using some ℓ-norm, most notably ℓ1 or ℓ2. Such distances are of course symmetric.
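Under the assumption (ours, for illustration) that distributions are given as probability vectors where p[i] is the fraction of reachable pairs at distance i + 1, and that N is the dense neighbourhood function list from the sketch in Section 3, these measures can be computed directly:

```python
# Sketch: divergence measures between two distance distributions p and q,
# given as equal-length probability vectors (p[i] = fraction of reachable
# pairs at distance i + 1).  N is a dense list with N[0] = 0.
import math

def delta_avg(p, q):
    """Relative average-distance change: mu_Q / mu_P - 1."""
    mu = lambda d: sum(t * x for t, x in enumerate(d, start=1))
    return mu(q) / mu(p) - 1.0

def harmonic_diameter(N, n):
    """n(n - 1) / sum_{t>0} (1/t)(N(t) - N(t-1)); unreachable pairs add zero."""
    s = sum((N[t] - N[t - 1]) / t for t in range(1, len(N)))
    return n * (n - 1) / s

def kl_divergence(p, q):
    """Bits gained coding samples of P with an optimal code for Q.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def lp_distance(p, q, ell=2):
    """The l_ell norm of p - q (ell = 1 or 2 in the paper)."""
    return sum(abs(x - y) ** ell for x, y in zip(p, q)) ** (1.0 / ell)
```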
We tested, with various graphs and removal strategies, how the choice of distribution divergence influences the interpretation of the results obtained. In Figure 1 we show this for a single web graph and a single strategy, but the outcomes agree on all the graphs and strategies tested: the interpretation is that all divergences agree, and for this reason we shall use the (simple) measure δ applied to the average distance in the experimental section. The advantage of δ over the other measures is that it is very easy to interpret; for example, if δ has value, say, 0.3 it means that node removal has increased the average distance by 30%. We also discuss δ applied to the harmonic diameter. 0.12
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
probability
0.1 0.08 0.06 0.04 0.02 0 1
10
100
0
0.05
0.1
0.00 0.01
0.05 0.10
0.15 0.20
0.15
0.2
0.25
0.3
θ
length 0.30
Kullback-Leibler δ-average distance
L1 L2
Fig. 1. Testing various divergence measures on a web graph (a snapshot of the .it domain of 2004) and the near-root removal strategy. You can see how the distance distribution changes for different values of ϑ and the behaviour of divergence measures. We omitted to show the harmonic-diameter change to make the plot easier to read.
5
Experiments
For our experiments, we considered a number of networks with various sizes and characteristics; most of them are either web graphs or (directed or undirected) social graphs of some kind (note that for web graphs we can rely on the URLs as external source of information). More precisely, we used the following datasets: – Hollywood : One of the most popular undirected social graphs, the graph of movie actors: vertices are actors, and two actors are joined by an edge whenever they appeared in a movie together. – LiveJournal : LiveJournal is a virtual community social site started in 1999: nodes are users and there is an arc from x to y if x registered y among his friends (it is not necessary to ask y permission, so the graph is directed ). We considered the same 2008 snapshot of LiveJournal used in [CKL+ 09] for their experiments – Amazon: This dataset describes similarity among books as reported by the Amazon store; more precisely the data was obtained in 2008 using the Amazon E-Commerce Service APIs using SimilarityLookup queries.
– Enron: this dataset was made public by the Federal Energy Regulatory Commission during its investigations: it is a partially anonymised corpus of e-mail messages exchanged by some Enron employees (mostly part of the senior management). We turned this dataset into a directed graph, whose nodes represent people, with an arc from x to y whenever y was the recipient of at least one message sent by x.

– For comparison, we considered two web graphs of different sizes: a 2004 snapshot of the .it domain (≈ 40 million nodes), and a snapshot taken in May 2007 of the .uk domain (≈ 100 million nodes).

All our graphs are available from public sources, and the software is licensed under the GPL at http://law.dsi.unimi.it/.
6 Discussion
Table 1 shows that social networks suffer spectacularly less disconnection than web graphs when their nodes are removed using our strategies. Our most efficient removal strategy, label propagation, can disconnect almost all pairs of a web graph by removing 30% of the arcs, whereas it disconnects only half (or less) of the pairs on social networks. This entirely different behaviour shows that web graphs have a path structure that passes through fundamental hubs. Moreover, the average distance of the web graphs we consider increases by 50–80% upon removal of 30% of the arcs, whereas in most social networks there is just a 5% increase, the only exception being Amazon (15%).⁷

Note that random removal can separate a good number of reachable pairs, but the increase in average distance is very marginal. This shows that considering both measures is important in evaluating removal strategies.

Of course, we cannot state that there is no strategy able to disrupt social networks as much as a web graph (simply because this strategy may be different from the ones that we considered), but the fact that all strategies work very similarly in both cases (e.g., label propagation is by far the most disruptive strategy) suggests that the phenomenon is intrinsic.

There is of course an easy candidate explanation: shortest paths in web graphs pass frequently through home pages, which are linked more than other pages. But this explanation does not take into account the fact that clustering by label propagation is significantly more effective than the near-root removal strategy. Rather, it appears that there are fundamental hubs (not necessarily home pages) which act as shortcuts and through which a large number of shortest paths pass. Label propagation is able to identify such hubs, and their removal results in an almost disconnected graph and in a very significant increase in average distance.
⁷ We remark that in some cases the measure is negative or does not decrease monotonically. This is an artifact of the probabilistic technique used to estimate the number of pairs: small relative errors are unavoidable.
Fig. 2. Typical behaviour of social networks (Amazon, upper) and web graphs (.it, lower) when a ϑ fraction of arcs is removed using the various strategies (random, degree, PR, near-root, LP): percentage of reachable pairs, δ-average distance, and δ-harmonic diameter as functions of ϑ. None of the proposed strategies completely disrupts the structure of social networks, whereas the effect of the label-propagation removal strategy on web graphs is very visible.
Table 1. For each graph and a sample of fractions ϑ of removed arcs we show the change in average distance (by the measure δ defined in Section 4.2) and the percentage of reachable pairs. PR stands for PageRank, and LP for label propagation.

Graph        Strategy   ϑ=0.01         ϑ=0.05        ϑ=0.1         ϑ=0.15        ϑ=0.2         ϑ=0.3
Amazon       Random     0.008 (100%)   0.002 (93%)   0.031 (82%)   0.041 (79%)   0.056 (76%)   0.082 (70%)
Amazon       Degree    −0.005 (118%)   0.002 (97%)   0.006 (86%)   0.006 (87%)   0.028 (95%)   0.091 (80%)
Amazon       PR         0.001 (97%)    0.014 (99%)   0.032 (98%)   0.037 (94%)   0.069 (94%)   0.097 (80%)
Amazon       LP         0.006 (104%)   0.023 (104%)  0.054 (82%)   0.096 (87%)   0.112 (82%)   0.153 (64%)
Enron        Random     0.013 (99%)    0.014 (93%)   0.006 (83%)   0.003 (80%)   0.007 (76%)   0.022 (88%)
Enron        Degree     0.006 (97%)    0.017 (86%)   0.056 (75%)   0.061 (72%)   0.064 (67%)   0.13 (52%)
Enron        PR         0.007 (99%)    0.033 (81%)   0.055 (63%)   0.067 (53%)   0.093 (45%)   0.135 (34%)
Enron        LP         0.005 (99%)    0.029 (80%)   0.04 (72%)   −0.048 (59%)   0.061 (57%)   0.05 (52%)
Hollywood    Random    −0.003 (101%)   0.018 (104%)  0.009 (92%)   0.017 (87%)  −0.004 (74%)   0.021 (77%)
Hollywood    Degree     0.005 (87%)    0.015 (105%)  0.001 (98%)   0.006 (92%)   0.022 (112%)  0.02 (93%)
Hollywood    PR         0.001 (102%)   0.004 (94%)   0.023 (100%)  0.025 (100%)  0.03 (94%)    0.036 (90%)
Hollywood    LP         0.018 (90%)    0.038 (78%)   0.052 (65%)   0.066 (57%)   0.061 (54%)   0.058 (52%)
LiveJournal  Random     0.007 (97%)    0.006 (94%)   0.009 (89%)   0.014 (92%)   0.02 (84%)    0.032 (78%)
LiveJournal  Degree     0.003 (95%)    0.02 (91%)    0.053 (105%)  0.065 (108%)  0.064 (92%)   0.101 (91%)
LiveJournal  PR         0.002 (97%)    0.018 (102%)  0.042 (99%)   0.063 (112%)  0.07 (96%)    0.104 (99%)
LiveJournal  LP         0.006 (102%)   0.013 (103%)  0.02 (90%)    0.024 (89%)   0.043 (98%)   0.058 (93%)
.it          Random    −0.012 (94%)    0.025 (89%)   0.01 (75%)    0.013 (67%)   0.021 (58%)   0.035 (46%)
.it          Degree     0.035 (101%)  −0.025 (94%)  −0.013 (95%)  −0.005 (93%)   0.001 (90%)   0.002 (90%)
.it          PR        −0.002 (100%)   0.089 (87%)   0.191 (68%)   0.249 (62%)   0.293 (52%)   0.418 (35%)
.it          Near-Root  0.037 (90%)    0.309 (61%)   0.342 (40%)   0.344 (38%)   0.346 (36%)   0.376 (35%)
.it          LP         0.013 (86%)    0.219 (43%)   0.417 (20%)   0.53 (16%)    0.601 (11%)   0.83 (5%)
.uk          Random     0.002 (100%)   0.023 (85%)   0.044 (85%)   0.089 (93%)   0.054 (68%)   0.035 (49%)
.uk          Degree     0.015 (98%)    0.013 (96%)  −0.043 (75%)  −0.031 (78%)  −0.019 (80%)   0.001 (74%)
.uk          PR         0.032 (89%)    0.076 (80%)   0.125 (66%)   0.149 (59%)   0.173 (52%)   0.267 (39%)
.uk          Near-Root  0.054 (80%)    0.261 (54%)   0.286 (48%)   0.297 (45%)   0.311 (44%)   0.387 (41%)
.uk          LP         0.059 (87%)    0.235 (38%)   0.303 (22%)   0.394 (19%)   0.445 (14%)   0.505 (6%)
These hubs are not necessarily of high outdegree: quite the opposite, rather, is true. The behaviour of web graphs under the largest-degree strategy is illuminating: we obtain the smallest reduction in reachable pairs and an almost unnoticeable change of the average distance, which means that nodes of high outdegree are not actually relevant for the global structure of the network.

Social networks are much more resistant to node removal. There is no strict clustering, nor definite hubs, that can be used to eliminate or elongate shortest paths. This is not surprising, as networks emerging from social interaction are much less engineered (there is no notion of "site" or "page hierarchy", for example) than web graphs.

The second important observation is that the removal strategies based on PageRank and label propagation are always the best (with the exception of the near-root strategy for .uk, which is better than PageRank). This suggests that label propagation is actually able to identify structurally important nodes in the graph—in fact, significantly better than any other method we tested. Is the ranking provided by label propagation correlated to other rankings? Certainly not to the other rankings described in this paper, given the different level of disruption it produces on the network. The ranking with the closest behaviour is PageRank, but, for instance, Kendall's τ between PageRank and the ranking by label propagation on the .uk dataset is ≈ −0.002 (complete uncorrelation).

It is interesting to compare our results against those in the previous literature. With respect to [AJB00], we test much larger networks. We can confirm that random removal is less effective than rank-based removal, but the variation in diameter measured in [AJB00] was computed on a symmetrised version of the web graph. Symmetrisation destroys much of the structure of the network, and it is difficult to justify (you cannot navigate links backwards). We have repeated our experiments using the variation in diameter instead of the variation in average distance (not shown here), but the results are definitely inconclusive: the behaviour is wildly different even between graphs of the same type, and shows no clear trend. This was expected, as the diameter is defined by a maximisation property, so it is very unstable.

We also evaluated the variation in harmonic diameter (see Table 2), to compare our results with those of [Fog03]. The harmonic diameter is very interesting, as it combines reachability and distance. The data confirm what we already stated: web graphs react to the removal of 30% of their arcs by label propagation by increasing their harmonic diameter by an order of magnitude—something that does not happen with social networks. Table 2 is even more striking than Table 1 in showing that label propagation selects highly disruptive nodes in web graphs.

Our criterion for node elimination is a threshold on the number of arcs removed, rather than nodes, so it is not possible to compare our results with [Fog03] directly. However, for .uk, PageRank at ϑ = 0.01 removes 648 nodes; removal of a comparable number of nodes produced in the .ie graph a relative increment of 100%, whereas we find 14%. This is to be expected, due to the very small size of the dataset used in [Fog03]: experience shows that connectedness phenomena in web graphs are very different
Table 2. For each graph and a sample of fractions ϑ of removed arcs we show the change in harmonic diameter (by the measure δ defined in Section 4.2) and the percentage of reachable pairs. PR stands for PageRank, NR for near-root, and LP for label propagation.

Graph        Strategy   ϑ=0.01        ϑ=0.05        ϑ=0.1         ϑ=0.15        ϑ=0.2         ϑ=0.3
Amazon       Random    −0.01 (100%)   0.03 (93%)    0.13 (82%)    0.13 (79%)    0.13 (76%)    0.14 (70%)
Amazon       Degree    −0.15 (118%)   0 (97%)       0.09 (86%)    0.05 (87%)   −0.05 (95%)    0.1 (80%)
Amazon       PR         0.03 (97%)    0.02 (99%)    0.02 (98%)    0.06 (94%)    0.06 (94%)    0.23 (80%)
Amazon       LP        −0.04 (104%)  −0.04 (104%)   0.2 (82%)     0.15 (87%)    0.19 (82%)    0.47 (64%)
Enron        Random     0.01 (99%)    0.04 (93%)    0.11 (83%)    0.12 (80%)    0.16 (76%)    0.05 (88%)
Enron        Degree     0.03 (97%)    0.19 (86%)    0.41 (75%)    0.47 (72%)    0.59 (67%)    1.17 (52%)
Enron        PR         0.01 (99%)    0.27 (81%)    0.67 (63%)    0.99 (53%)    1.38 (45%)    2.27 (34%)
Enron        LP         0.01 (99%)    0.18 (80%)    0.29 (72%)    0.53 (59%)    0.55 (57%)    0.62 (52%)
Hollywood    Random    −0.02 (101%)  −0.07 (104%)  −0 (92%)       0.01 (87%)    0.11 (74%)   −0.02 (77%)
Hollywood    Degree     0.15 (87%)   −0.04 (105%)   0.02 (98%)    0.1 (92%)    −0.09 (112%)   0.09 (93%)
Hollywood    PR        −0.02 (102%)   0.06 (94%)    0.02 (100%)   0.02 (100%)   0.09 (94%)    0.14 (90%)
Hollywood    LP         0.02 (90%)   −0.12 (78%)   −0.11 (65%)   −0.11 (57%)   −0.12 (54%)   −0.15 (52%)
LiveJournal  Random     0.05 (97%)   −0.01 (94%)    0.05 (89%)   −0.02 (92%)    0.06 (84%)    0.13 (78%)
LiveJournal  Degree    −0.03 (95%)    0.12 (91%)    0.08 (105%)   0.01 (108%)  −0.07 (92%)    0.21 (91%)
LiveJournal  PR         0.04 (97%)    0 (102%)      0.11 (99%)    0.18 (112%)   0.12 (96%)    0.23 (99%)
LiveJournal  LP        −0.06 (102%)   0.04 (103%)   0.04 (90%)    0.03 (89%)   −0.02 (98%)    0.15 (93%)
.it          Random     0.04 (94%)    0.1 (89%)     0.17 (75%)    0.32 (67%)    0.45 (58%)    0.69 (46%)
.it          Degree     0.03 (101%)   0.12 (94%)    0.05 (95%)   −0.1 (93%)     0.13 (90%)    0.21 (90%)
.it          PR        −0.02 (100%)   0.25 (87%)    0.72 (68%)    1.05 (62%)    1.52 (52%)    3.17 (35%)
.it          NR         0.18 (90%)    1.17 (61%)    2.15 (40%)    2.32 (38%)    2.32 (36%)    2.83 (35%)
.it          LP         0.18 (86%)    1.68 (43%)    4.44 (20%)    6.58 (16%)    9.68 (11%)    22.32 (5%)
.uk          Random    −0 (100%)      0.13 (85%)    0.12 (85%)    0 (93%)       0.28 (68%)    0.58 (49%)
.uk          Degree    −0.02 (98%)   −0.01 (96%)    0.04 (75%)    0.28 (78%)    0.26 (80%)    0.1 (74%)
.uk          PR         0.14 (89%)    0.33 (80%)    0.79 (66%)    0.98 (59%)    1.16 (52%)    2.19 (39%)
.uk          NR         0.31 (80%)    1.27 (54%)    1.45 (48%)    1.37 (45%)    1.37 (44%)    1.84 (41%)
.uk          LP         0.2 (87%)     2.02 (38%)    3.71 (22%)    5.13 (19%)    7.33 (14%)    16.61 (6%)
in the "below ten million nodes" region. Nonetheless, the growth trend is visible in both cases. However, the experiments in [Fog03] fail to detect both the disruptive behaviour at ϑ = 0.3 and the striking difference in behaviour between the largest-degree and PageRank strategies.
7 Conclusions and Future Work
We have explored experimentally the alterations of the distance distribution of some social networks and web graphs under different node-removal strategies. We have confirmed some of the experimental results that appeared in the literature, but at the same time shown some basic limitations of previous approaches. In particular, we have shown for the first time that there is a clear-cut structural difference between social networks and web graphs⁸, and that it is important to test node-removal strategies until a significant fraction of the arcs has been removed. Probably the most important conclusion is that "scale-free" models, which are currently proposed for both web graphs and social networks, do not capture this important difference: for this reason, they can only make sense as long as they are adopted as baselines.

It might be argued that reachable pairs and distance distributions are too coarse as features. Nonetheless, we believe that they are the most immediate global features that are approachable computationally. For instance, checking whether node removal alters the clustering coefficient would not be so interesting, because the clustering coefficient of each node depends only on the structure of its own neighbourhood. Thus, by first removing the nodes with high coefficient it would be trivial to make the clustering coefficient of the graph decrease quickly. Such trivial approaches cannot possibly work with reachable pairs or with distance distributions, because these are properties that depend on the graph as a whole.

Finally, the efficacy of label propagation as a removal strategy suggests that it may be very interesting to study it as a form of ranking: an open question is whether it could be useful, for instance, as a query-independent ranking for information-retrieval applications.
⁸ In this paper, like in all the other experimental research on the same topic, conclusions about social networks should be taken with a grain of salt, due to the heterogeneity of such networks and the lack of a large repertoire of examples.

References

[AJB00] Albert, R., Jeong, H., Barabási, A.-L.: Error and attack tolerance of complex networks. Nature 406, 378–382 (2000)
[BCK06] Borgatti, S.P., Carley, K.M., Krackhardt, D.: On the robustness of centrality measures under conditions of imperfect data. Social Networks 28(2), 124–136 (2006)
[BE05] Brandes, U., Erlebach, T. (eds.): Network Analysis. LNCS, vol. 3418. Springer, Heidelberg (2005)
[Bor05] Borgatti, S.P.: Centrality and network flow. Social Networks 27(1), 55–71 (2005)
[Bor06] Borgatti, S.P.: Identifying sets of key players in a social network. Comput. Math. Organ. Theory 12, 21–34 (2006)
[Bra01] Brandes, U.: A faster algorithm for betweenness centrality. Journal of Mathematical Sociology 25(2), 163–177 (2001)
[BRV11] Boldi, P., Rosa, M., Vigna, S.: HyperANF: Approximating the neighbourhood function of very large graphs on a budget. In: Proceedings of the 20th International Conference on World Wide Web, pp. 625–634. ACM, New York (2011)
[BSV09] Boldi, P., Santini, M., Vigna, S.: PageRank: Functional dependencies. ACM Trans. Inf. Sys. 27(4), 1–23 (2009)
[CH10] Cohen, R., Havlin, S.: Complex Networks: Structure, Robustness and Function. Cambridge University Press, Cambridge (2010)
[CKL+09] Chierichetti, F., Kumar, R., Lattanzi, S., Mitzenmacher, M., Panconesi, A., Raghavan, P.: On compressing social networks. In: KDD 2009, pp. 219–228. ACM, New York (2009)
[DLMT08] Donato, D., Leonardi, S., Millozzi, S., Tsaparas, P.: Mining the inner structure of the web graph. Journal of Physics A: Mathematical and Theoretical 41(22), 224017 (2008)
[Fog03] Fogaras, D.: Where to start browsing the web? In: Böhme, T., Heyer, G., Unger, H. (eds.) IICS 2003. LNCS, vol. 2877, pp. 65–79. Springer, Heidelberg (2003)
[LADW05] Li, L., Alderson, D.L., Doyle, J., Willinger, W.: Towards a theory of scale-free graphs: Definition, properties, and implications. Internet Math. 2(4) (2005)
[LM04] Langville, A.N., Meyer, C.D.: Deeper inside PageRank. Internet Math. 1(3), 355–400 (2004)
[Mil67] Milgram, S.: The small world problem. Psychology Today 2, 60–67 (1967)
[ML00] Marchiori, M., Latora, V.: Harmony in the small-world. Physica A 285(3-4), 539–546 (2000)
[NP03] Newman, M.E.J., Park, J.: Why social networks are different from other types of networks. Phys. Rev. E 68(3), 036122 (2003)
[PBMW98] Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, Stanford University, USA (1998)
[PGF02] Palmer, C.R., Gibbons, P.B., Faloutsos, C.: ANF: A fast and scalable tool for data mining in massive graphs. In: KDD 2002, pp. 81–90. ACM, New York (2002)
[RAK07] Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76(3) (2007)
[WF94] Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge Univ. Press, Cambridge (1994)
Endogenous Control of DeGroot Learning Sridhar Mandyam and Usha Sridhar Ecometrix Research, Bangalore, India {sridhar.mandyam,usha.sridhar}@Ecometrix.in
Abstract. The DeGroot update cycle for belief learning in social networks models beliefs as convex combinations of older beliefs using a stochastic matrix of social influence weights. In this paper, we explore a new endogenous control scenario for this type of learning, where an agent on her own initiative, adjusts her private social influence to follow another agent, say, one which receives higher attention from other agents, or one with higher beliefs. We develop an algorithm which we refer to as BLIFT, and show that this type of endogenous perturbation of social influence can lead to a ‘lifting’ or increasing of beliefs of all agents in the network. We show that the per-cycle perturbations produce improved variance contractions on the columns of the stochastic matrix of social influences, resulting in faster convergence, as well as consensus in beliefs. We also show that this may allow belief values to be increased beyond the DeGroot beliefs, which we show are the lower bounds for BLIFT. The result of application of BLIFT is illustrated with a simple synthetic example. Keywords: DeGroot Model, Belief Learning, Social Networks, Endogenous Control.
1 Introduction

The study of the evolution of beliefs in social networks has been an important area of research in social dynamics and learning. Individual beliefs have often been modeled as scalar numbers in a fixed range denoting a level of confidence about some global truth. Beliefs are thought to evolve over time through a process of updating or social learning from connected neighbors, on the basis of how much "attention" an agent pays to connected neighbors [2, 3, 5, 14, 15]. The modeling of individual agents as nodes in a graph, where some degree of 'connectedness' or weight associated with an edge between the nodes could capture the notion of influence amongst the agents, has allowed matrix representation of such quantities. Among the many possible mechanisms that could be thought of for updating beliefs using the influences in such a matrix representation, a simple averaging procedure attributed to DeGroot [6] has drawn much attention in recent years.

At the core of the DeGroot update is a remarkably simple matrix-vector multiply update that obtains new beliefs as the product of a row-stochastic influence matrix and a column vector of previous beliefs. The influence matrix is row-stochastic because of the assumption that individuals use normalized influence weights to characterize the
attention paid to other agents, including the attention paid to oneself. The update process produces new beliefs as convex combinations of previous beliefs. Under certain conditions of connectivity the update cycle leads to convergence of the beliefs to a state of consensus [4].

This basic update cycle with fixed influences has an intuitive appeal for the purpose of explaining some observed behaviors in social networks, such as information aggregation, diffusion, and the impact of network structure on the emergence of consensus. However, in the process of achieving convergence the final beliefs become enclosed within the range of initial beliefs. The rather 'uncontrollable' manner of convergence after the receipt of the initial signal is determined by the fixed influence weights, quite akin to the achievement of stationarity of state transitions in an analogous homogeneous Markov Chain. This raises a basic question: what kind of endogenous change in the influence matrix, even if such changes render the analogous Markov Chain non-homogeneous, might result in a higher degree of control on the behavior of the average learned belief? One is interested specifically in the possibility that the resultant non-homogeneous Markov Chain representation might yield a higher average belief, even as it exhibits the properties of convergence and consensus that are certainly desirable [8].

It might be of interest to know how convergence and consensus of beliefs are impacted when some or all agents begin to pay higher attention to a specific agent over many cycles, perhaps 'following' an agent who has the higher 'prestige' of increased average attention. Conversely, we may then ask if a network of agents can collectively seek an improved average belief by only performing adjustments to one's own influence weights, as a "local" adjustment, as it were.

In this paper, we explore mechanisms for agents to apply endogenous control on the attention they pay to other agents, with the express goal of suitably modifying and lifting the achievable belief behavior. The primary motivation for exploring endogenous control of belief learning derives from the argument that social influences need to be modeled in a manner that reflects the potential need for agents to adjust these influences in every cycle on the basis of some dynamic of the observed beliefs. Otherwise, in the DeGroot model, the agents fail to demonstrate adaptation, or to fully utilize belief information learnt in one cycle in another. We develop the rationale for seeking such weight changes that increase average belief iteratively, achieving convergence and a higher consensus value than what is possible in the homogeneous-chain update analog. We also show that our method may offer a suitable framework for analyzing the possibilities for applying this type of control in more general scenarios where agents may learn to shape a belief profile.

The paper is organized as follows. In Section 2 we briefly review the DeGroot learning method and some of the essential results that have been reported in recent research. The core of our proposed method for changing influence weights to increase average belief is described in Section 3. We present the BLIFT algorithm in Section 4 and discuss its convergence properties in Section 5. A synthetic example to illustrate the results of BLIFT on a small network appears in Section 6. Concluding remarks appear in Section 7.
2 DeGroot Belief Learning

Let us first define the basic entities required to set up a DeGroot belief update cycle in a social network. We consider a social network of m agents. Their interconnection is represented by a directed graph G(V, E), with the set V of nodes representing the m agents, and the set of edges E denoting "connections". Let the adjacency matrix A be associated with graph G, with Aij = 1 if agent i is influenced by j, and Aij = 0 otherwise. Let A represent a structure in which we allow for directed connections, in the sense that agent i may be connected to agent j but not necessarily vice-versa, and an agent may be connected to fewer than (m−1) other agents. This implies that A need not be symmetric.

We define an (m x m) square matrix, T, of influences which expresses the 'attention' paid by any agent to other agents as a set of normalized weights, which implies that the elements in every row of T sum to 1. These weights are directly the weights associated with the edges in the graph. Hence Tij > 0 when Aij = 1, and Tij = 0 otherwise. The network is considered static; i.e., agents cannot alter the structure of their network by adding new connections or dropping old ones. This essentially means that A is constant, and we do not allow an agent to set a positive influence weight for another agent through the social learning cycles if she were not connected to that agent from the start.

The belief held by each agent about some global truth is thought to be captured by a real number in the range [0, 1]. We assume we are given initial values of beliefs, denoted by an (m x 1) vector, b0. An (m x 1) vector, bt−1, denotes the vector of beliefs for the cycle (t−1). Each of the m agents in the social network 'learns' by updating her own belief by obtaining a weighted average of the beliefs of her neighbors, using the product

    bt = T bt−1                                                        (1)

to produce an updated vector of beliefs for the cycle t. The DeGroot update cycle, constructed from the recursion in (1), represents an averaging process in which each new belief for every agent is a convex combination of previous beliefs of her neighbors, if the weights in every row i, representing the attention paid by agent i to all agents j = 1, .., m including self (Tii), are such that Tij ≥ 0 and normalized to sum to 1, i.e., for all i = 1, .., m, Σj Tij = 1. Thus all elements of T lie in the range [0, 1].

The recursion (1) leads to an alternative way of viewing the progression of updates:

    bt = T^t b0                                                        (2)

The belief vector in cycle t is also the result of multiplying the initial belief vector by the t-th power of T. Since T exhibits the properties of a stochastic matrix, as t → N, a suitably large number, the N-th power T^N converges, under conditions of aperiodicity and irreducibility, to a row-stochastic matrix with all its rows the same. Any such row represents the unit left eigenvector of the matrix T. The recursion in (1) is said to converge to a consensus in beliefs when the structure of T mirrors a transition
probability matrix associated with an aperiodic, irreducible, finite, homogeneous Markov Chain [14].

Since the update cycle produces bt as a convex combination of bt−1, the recursion monotonically reduces the range of values of beliefs in each cycle, until consensus is achieved. The stochastic matrix T is said to achieve a 'contraction', or reduction of the variance, in every column, leading to a final row vector to which every row of T^t converges. This 'contraction' is reflected, in the transformations of bt−1 to bt through T, as a monotonically decreasing variance of the beliefs around their mean as the cycles progress towards convergence, at which stage all elements of the final belief vector turn equal and constant, representing the achievement of consensus.

In its essence, the DeGroot update cycle represents what can be referred to as an 'endogenous' social learning cycle, i.e., learning from connected neighbors within the social network. It is well known that such 'endogenous' learning culminates in 'convergence' or a stable state, with or without consensus, depending upon the structure of the network.

There have been some previous explorations into the question of different ways of affecting the change of influence weights in the update cycle. Friedkin et al. [10, 11, 12, 13] investigated a different approach to use the influence matrix to study social influence. DeMarzo et al. [7] investigated a mechanism that allowed agents to determine their own 'idiosyncratic' weights differently from the way they set weights for the attention they pay to others. Acemoglu et al. [1, 2] have also investigated methods to link the network structure and the basic update cycle. Rules for some agents to learn differently from their 'close' circle than from those somewhat 'far' have also been explored, for example by changing the attention paid to agents who have higher 'prestige', and so on. We believe, however, that the central question of 'lifting' beliefs using only private information, in a manner that can help change the evolving consensus, has not been expressly explored before, especially as an issue of achieving convergence within the framework of the non-homogeneous Markov chain properties that these endogenous controls entail.
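As a minimal illustration of the update cycle (1)–(2), the following NumPy sketch iterates bt = T bt−1 for a small, made-up row-stochastic matrix; the numbers are ours, not the paper's.

import numpy as np

T = np.array([[0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3],
              [0.1, 0.5, 0.4]])   # illustrative row-stochastic influence matrix
b = np.array([0.1, 0.8, 0.3])     # illustrative initial beliefs b0

for t in range(100):
    b = T @ b                     # each entry is a convex combination of old beliefs
print(b)                          # all entries approach the same consensus value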
3 Endogenous Control

In this paper we explore notions of 'lifting' belief value, i.e., we seek values for belief that rise above what the DeGroot update cycle achieves. Hence we seek to modify T in every cycle; our recursion is of the form:

    bt = T′t−1 bt−1                                                    (3)

such that the T′ matrix can change in every update cycle. In order to better focus on operations within one cycle, let us drop cycle subscripts for the moment and rewrite (3) in the form:

    y′ = T′ b                                                          (4)
Consider agent i, for whom the perturbation in influence weights may be denoted as:

    y′i = T′i · b = (Ti + ∆Ti) · b = yi + ∆Ti · b                      (5)

Equation (5) expresses the fact that we seek a perturbation of the influence row elements as the sum of what the weighted-average update produces and an additional 'lift' for the belief. In order to obtain a positive 'lift', we must have:

    ∆Ti · b > 0                                                        (6)

Suppose now that agent i seeks to achieve (6) by adding an amount rij to the influence weight tij. Note that the agent will need to re-normalize the weights, by dividing each weight by the sum of the weights, as follows:

    T′ij = (tij + rij) / (1 + R),   j = 1, .., m,                      (7)

where R = Σj=1..m rij. We find that the perturbation vector ri = (ri1, .., rim) needs to be selected so as to make

    y′i − yi = (ri · b − R yi) / (1 + R) > 0,

where we have set y′i = T′i · b as before. To achieve the 'lift' for agent i, we need only pick ri such that

    ri · b > R yi                                                      (8)
While there are clearly many ways to achieve this selection, we can see that the belief system represented by (3) itself holds examples which will satisfy (8), i.e., other agents' beliefs. Noting that the averaging process in the operation of (2) produces a range of values for y between some maximum ymax and some minimum ymin, we may well find some k for which surely yk ≥ yi. Better still, if yk = ymax, we could simply select that row, say the k-th, and set ri = Tk. Since we know that Tk · b = yk ≥ yi, we can be sure that this selection of ri will guarantee us the lift, noting that the sum R is already unity for this selection. An iterative procedure is easy to devise around this method, in which we pick the row with the maximum belief value, and have each agent use the influence weights of this row to add to its own, and thereby produce a sure lift. Mathematically this strategy for 'lifting' beliefs is workable, for:

i) the perturbations applied to T generate a sequence of belief changes that are positive, and hence achieve the objective of 'lifting' beliefs;
ii) the consensus level of belief is higher than that which the basic DeGroot update converges to.
However, there are conceptual issues that arise with this strategy. They are related to the idea that beliefs are 'public' information in this general social learning framework, while the attention that an agent pays to others she is connected to is 'private' information, known only to that agent. Hence any perturbation sought to be applied by an agent to her own weights should only use 'public' information available in that cycle, and/or her own 'private' information. In other words, an agent cannot utilize information that other agents are still learning within the cycle, e.g., the values of y for other agents, or even the values of influence (attention) of other agents.

In this paper we shall propose a new mechanism to lift beliefs that does not violate the above requirements, and yet uses the basic idea of (6). We shall consider here only belief values known at the start of the cycle as 'public' information. We shall also allow agents to use only their own influence weight information to generate a perturbation. Note that this implies row-wise changes, with the understanding that agents can modify the weights they attach to others, and not the weights others attach to them.

In order to compare the results obtained for average belief with and without the modification proposed in (3), we shall assume in this paper that the structure of T represents a strongly connected network that is analogous to a strongly ergodic, aperiodic Markov Chain. We propose to apply perturbations to T in every cycle, and yet maintain its stochastic nature as before, as well as to preserve the network structure it represents. This implies that we shall not add or remove edges in T, but only modify their weights. The resultant sequence of Tt will represent a non-homogeneous Markov Chain. Later we shall examine the convergence behavior of the product of the sequence of matrices Tt.
4 BLIFT Algorithm

In this section we shall develop a new algorithm to achieve a lift of beliefs using only public information on beliefs and the private influence weights of each agent. We shall see that the method provides an interesting mechanism to collectively 'lift' the distribution of beliefs.

Let us first consider a single row perturbation, and simplify notation as before by dropping the cycle index subscripts for the moment. Consider first a perturbation r applied to the j-th column element Tij of row i, as shown below, to obtain a new row T′i, ensuring that the row is re-normalized to sum to 1:

    T′i = ( ti1/(1+r), .., (tij + r)/(1+r), .., tim/(1+r) )

Denoting the product of the original row of T with a belief vector as yi = Ti · b, and the product of the perturbed row of T above with the same belief vector as y′i = T′i · b, we find that:

    y′i = (yi + r bj) / (1 + r)                                        (9)

The change in the resultant belief can be written as:

    y′i − yi = r (bj − yi) / (1 + r)                                   (10)

Clearly, we can force y′i − yi > 0 if we select such a column j in the row Ti that bj − yi > 0. It is easy to see that, since yi is a convex combination of the beliefs b, increasing the value of the j-th column results in reducing all other values, and thus bj − yi > 0 will hold only for some j; we can select among them such a j that maximizes bj − yi. If we perform a similar perturbation on all the rows, it is clearly possible to 'lift' all the beliefs y′i, i = 1, .., m simultaneously, by selecting a column in each row so as to maximize each new belief element. This is depicted pictorially below.
[Figure: in cycle t, the DeGroot update Tt−1 bt−1 → yt contracts beliefs by the convexity of the social influences, while the BLIFT perturbation T′t−1 bt−1 → y′t lifts them, with y′t ≥ yt.]
Fig. 1. BLIFT perturbation lifts y’ over y for all agents
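A small worked instance of (9) and (10), with illustrative numbers of our own: take Ti = (0.5, 0.3, 0.2) and b = (0.2, 0.9, 0.4), so that yi = 0.45. The only column with bj − yi > 0 is j = 2, where b2 − yi = 0.45. Choosing r = 0.5, equation (9) gives y′i = (yi + r b2)/(1 + r) = (0.45 + 0.45)/1.5 = 0.60, a lift of 0.15 = r(b2 − yi)/(1 + r), exactly as predicted by (10).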
Due to the convexity of the rows of T, it is easy to see that every agent that applies the perturbation with the value r will, in fact, select the same column while finding the maximum in (10). In other words, we select a single column, say j = q, in the matrix of values in a cycle, perturb every row element of that selected column so as to increase its value by r, and perturb all its other columns, j ≠ q, by reducing them by a factor of (1 + r). Let one update cycle comprise 'lifting' beliefs for every agent in this manner. The new perturbed matrix T′ that results from such an operation will have every row perturbed as described above. Let us now describe an algorithm to accomplish the lifting of beliefs as described above. We refer to the algorithm as BLIFT.
Algorithm Belief-Lift (BLIFT)

Set up inputs:
• T: (m x m) row-stochastic influence matrix, Tij ≥ 0, Σj Tij = 1 ∀i; T is assumed to represent a strongly connected network (a path exists from any agent to any other agent) that is aperiodic (gcd of path lengths = 1).
• b0: (m x 1) vector of initial beliefs with 0 < b0i ≤ 1
• r: perturbation value, r > 0
• ε: suitably small number
• Initialize η = ‖T‖; k = 1; bk = b0; Tk = T

while η > ε and Lij > 0 for some i, j
  // STEP 1: Calculate DeGroot update
  y ← Tk bk
  // STEP 2: Calculate lift matrix
  for i = 1, m
    for j = 1, m
      Lij = r (bj − yi) / (1 + r)
  // STEP 3: Find max lift for each row i in L
  q = argmax_j { Lij : Lij > 0 }
  // STEP 4: Apply lift perturbation to column q of each row i of the T matrix
  for i = 1, m
    if Liq > 0
      for j = 1, m
        if j == q:  T′ij = (Tij + r) / (1 + r)
        else:       T′ij = Tij / (1 + r)
    else
      T′ij = Tij for all j
  // STEP 5: Set up recursion for next cycle
  η ← ‖Tk − T′‖;  Tk+1 ← T′;  bk+1 ← T′ bk;  k ← k + 1
end while
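The following NumPy sketch is one possible reading of the pseudocode above, assuming a uniform lift parameter r for all rows; the names are ours, and this is not the authors' implementation.

import numpy as np

def blift(T, b, r=0.5, eps=1e-9, max_cycles=1000):
    """Run BLIFT-style update cycles; returns the final belief vector."""
    T = np.asarray(T, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    for _ in range(max_cycles):
        y = T @ b                                     # STEP 1: DeGroot update
        L = r * (b[None, :] - y[:, None]) / (1 + r)   # STEP 2: lift matrix L_ij
        q = int(np.argmax(b))                         # STEP 3: same column for all rows
        mask = L[:, q] > 0                            # rows that can still be lifted
        if not mask.any():
            break                                     # no positive lift: bound reached
        T_new = T.copy()
        T_new[mask] /= (1 + r)                        # STEP 4: renormalised rows...
        T_new[mask, q] += r / (1 + r)                 # ...with column q boosted by r
        eta = np.abs(T_new - T).sum()                 # STEP 5: next cycle
        T, b = T_new, T_new @ b
        if eta < eps:
            break
    return b

Rows for which no positive lift remains are left untouched, mirroring the Else branch of STEP 4 in the pseudocode.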
It is clear that we pick one agent in every row on the basis of the maximum 'lift' it can offer. This selection is 'aspirational', in the sense that we pick, for agent i (row i), such an agent j whose older belief value agent i wishes to exceed over her own updated belief.
In BLIFT, we have used a single preset value of r > 0. Note that since r enters the expression (10) for the change in beliefs merely through the ratio r/(1 + r), the actual value of r is unimportant in so far as the selection of the column is concerned. We have assumed here that the same value r is applied as perturbation to every row; in principle these values can well be different. The update cycles continue until either the sequence of perturbed T matrices converges to a stable T, or there are no more positive entries in the lift matrix L. We shall prove below that the resultant non-homogeneous Markov chain must indeed converge. We also see that if in any cycle there are no positive values left in L, it must mean that it is not possible to increase any y′i further; hence an upper bound has been reached.

Before proceeding to the analysis of the algorithm from a convergence perspective, it is important to mention that other variants of the key concept for lifting beliefs are also possible. For example, we could consider applying some level of control on the perturbation by setting and varying r over the cycles. The perturbation transformations applied in Algorithm BLIFT are depicted pictorially below. The recursion starts at the right and proceeds to the left. We start with an initial (supplied) influence matrix and belief vector, T0 and b0.
Fig. 2. BLIFT Perturbation Cycles
The hexagon signifies the set of row perturbations applied to transform T0 into T′0 in the first cycle. While T0 in the upper row produces y1, T′0 in the lower row produces the modified b1. That concludes the first cycle. The next cycle begins with b1 from the lower row being used as the input in the upper row, and so on.
5 Convergence of BLIFT

In order to prove convergence, we examine the variance of the columns of the perturbed T matrices, for they hold the key to how the T matrix produces a
'contraction' of the beliefs. We use the following form of variance calculation for this purpose, which expresses the variance of any set of m values xi, i = 1, .., m, with mean x̄ = (1/m) Σi xi, as:

    Var(x) = (1/m) Σi (xi − x̄)²

We have the following Lemma, which obtains the variances of the columns of T.

Lemma 1. The application of the BLIFT perturbation to T results in reducing the variance of all the columns of T by a factor 1/(1 + r)² in every cycle.

Proof. In any cycle t, let us write the variance of the q-th column of T′ as

    Var′q = (1/m) Σi (T′iq − μ′q)²

where the column mean is μ′q = (1/m) Σi T′iq = (μq + r)/(1 + r). For a given r > 0, then

    T′iq − μ′q = (Tiq + r)/(1 + r) − (μq + r)/(1 + r) = (Tiq − μq)/(1 + r)

This simplifies to

    Var′q = 1/(1 + r)² [ (1/m) Σi (Tiq − μq)² ]

The expression in square brackets is nothing but the variance of the unperturbed column q. Hence,

    Var′q = Varq / (1 + r)²

For all other columns of T′, where j ≠ q, we have:

    Var′j = (1/m) Σi ( Tij/(1 + r) − μj/(1 + r) )²

which is simply:

    Var′j = Varj / (1 + r)²

Hence, for any column j = 1, .., m of T′,

    Var′j = Varj / (1 + r)²                                            □
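Lemma 1 is easy to check numerically. The sketch below applies the column-q perturbation to a small made-up matrix and verifies the (1 + r)² contraction; np.var uses the same 1/m form of the variance as above.

import numpy as np

r, q = 0.5, 1
T = np.array([[0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3],
              [0.1, 0.5, 0.4]])          # illustrative row-stochastic matrix
Tp = T / (1 + r)
Tp[:, q] += r / (1 + r)                  # BLIFT perturbation on column q

var, var_p = T.var(axis=0), Tp.var(axis=0)
print(np.allclose(var_p, var / (1 + r) ** 2))   # True: contraction by 1/(1+r)^2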
The recursion defined in BLIFT, with perturbations applied as above in every cycle, results in the belief in some cycle (t+1) being obtained as the backward product of a series of stochastic perturbed T′ matrices:

    bt+1 = T′t T′t−1 … T′0 b0                                          (11)

Had we not applied any perturbation at all, the normal DeGroot update process would have left the belief value at cycle (t+1) at:

    bt+1 = T^(t+1) b0                                                  (12)

It is well known [14] that as t → N, a suitably large positive integer, T^N tends to a constant matrix whose rows all equal the left (unit) eigenvector. Since the column variances of such a matrix at that stage would have vanished, we refer to the equivalent process of the column variances of a stochastic matrix tending to ε, a very small number falling below machine precision, as the process of achieving 'stability'.¹

Using the variance contraction property of T′, we can prove the following theorem on the convergence of the BLIFT algorithm, by comparing how the variance contracts in the products of the T′ matrices and the powers of T at any cycle (t+1).

Theorem. Given an initial stochastic matrix T0 of social influence weights, known to turn stable as t → N, a suitably large positive integer, the product

    T′t T′t−1 … T′0

obtained through BLIFT perturbation will also turn stable, with the variance of all its columns falling below a suitable machine precision ε, resulting in the convergence of BLIFT.

Proof. Let us consider two products P = Tt · Tt−1 and Q = T′t · Tt−1, where T′t is obtained by applying a BLIFT perturbation on Tt. The columns of both P and Q are convex combinations of the columns of Tt−1, hence their individual column variances will be lower than those of Tt−1. However, since the column variance of T′t is lower than that of Tt, the column variance of Q is lower than that of P. If we apply this result to the recursion of (11), it is easy to see that the column variance of the products of the T′ matrices reduces monotonically, until it falls below ε as t → N, a large enough positive integer. Hence BLIFT converges, and it converges to a consensus if T0 represents a strongly connected network.
¹ In its Markov chain analog the T matrix is said to be irreducible and aperiodic (equivalently, stable) if it represents a strongly connected social network where there is a 'path' connecting every agent to every other agent, and the gcd of path lengths is 1, implying that idiosyncratic weights are non-zero. Under these conditions the beliefs converge to a consensus in the DeGroot cycle [14].
Another way of viewing the variance contraction effect is to observe that in some cycle k, bk = T′ bk−1 will be 'contracted' by T′ by a factor 1/(1 + r)² over the contraction effected by T, where T′ is obtained from T by the BLIFT perturbation. The value of r > 0 is a useful way to tune the rate of convergence, since the higher the value of r, the higher the contraction of variance.

The implication of the selection of a single column in T to be perturbed by all agents (i.e., all rows) has an interesting consequence. If we define a prestige measure pj(t) for every agent j in cycle t as the column average of a column of T, i.e.,

    pj(t) = (1/m) Σi Tij(t),

where Tij(t) denotes the ij element of T in cycle t, we find that the BLIFT perturbation on column q renders the prestige value of this column higher than the other columns, because:

    p′q = (1/(1 + r)) [ pq + r ]  >  p′j = pj / (1 + r),   j ≠ q,  r > 0

This allows r to be designed so that a) all agents may collectively increase the prestige of an agent they all 'follow'; and b) the repeated attention to the same agent over the cycles increases the attention for that agent.

As a final comment, it is observed that while BLIFT selects an obvious extreme case of picking the maximum possible 'lift', which results in a contraction of the variance by the large factor of 1/(1 + r)² and in turn helps to produce rapid convergence in the social influence matrix, it is clearly possible to allow higher discrimination in the selection by an agent of whom to follow, and yet obtain convergence. In other words, different agents may choose to perturb their influence weights differently, and yet achieve convergence. These alternatives need further exploration.
6 Example

In this section we compare the convergence and consensus performance of the DeGroot and BLIFT algorithms for a simple 3-agent graph. The matrices T0 and the vector b0 represent the initial social influences and belief vector of the agents; T0 is a row-stochastic matrix. The DeGroot and BLIFT algorithms are run on this network and the results are shown below.
[Figure: two panels, "DeGroot Method: Evolution of the Belief Vector" and "BLIFT Method: Evolution of the Belief Vector", plotting belief value against update cycles c1, c2, … for agent1, agent2 and agent3.]
Fig. 3. Evolution of the belief vector using DeGroot and BLIFT
The plot on the left in Figure 3 shows the belief evolution with the basic DeGroot update, and we see that the beliefs converge to a consensus of about 0.27. As expected, the BLIFT algorithm converged to a solution faster, besides lifting the final consensus belief values to about 0.6. The higher belief values connote a stronger 'conviction' about a global truth, and reflect the fact that all agents tried to follow an agent with a higher belief in every cycle. The agents dynamically revise their beliefs and align their influence weights to arrive at a consensus at the end of the run.
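For readers who want to reproduce the qualitative behaviour, the two sketches above can be combined as follows. T0 and b0 here are stand-ins of our own (the paper's values appear only in the omitted figure), so the exact consensus levels will differ from 0.27 and 0.6.

import numpy as np

T0 = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.6, 0.2],
               [0.3, 0.3, 0.4]])   # hypothetical row-stochastic influences
b0 = np.array([0.1, 0.4, 0.2])    # hypothetical initial beliefs

b = b0.copy()
for _ in range(50):               # plain DeGroot: consensus within [min b0, max b0]
    b = T0 @ b
print("DeGroot consensus:", b)

print("BLIFT beliefs:   ", blift(T0, b0, r=0.5))  # blift() from the Section 4 sketch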
7 Concluding Remarks

We have developed a perturbation algorithm, BLIFT, that can provide endogenous control to agents to shape their beliefs, particularly to lift them to higher levels by following another agent. At its core is the DeGroot method, but with the additional feature that every agent makes a change to the influence matrix to signify a changed perception of one's neighbors. This results in a non-homogeneous, but row-stochastic, influence matrix that turns stable and produces convergence in the beliefs. Theoretical proofs for the convergence of the BLIFT algorithm are provided. The DeGroot belief values obtained with constant social influences and no endogenous control are a lower bound on the belief vector for the BLIFT algorithm. BLIFT also produces a higher rate of convergence, as the perturbation ensures larger variance contractions of the belief vector in consecutive iterations, attaining consensus much faster. Further work to explore practical applicability and possibilities for tuning the perturbation to capture the notion of 'prestige followers' is underway, together with a more thorough analysis of the convergence and consensus characteristics of the algorithm.
References

1. Acemoglu, D., Ozdaglar, A., ParandehGheibi, A.: Spread of Misinformation in Social Networks (2009)
2. Acemoglu, D., Dahleh, M., Lobel, I., Ozdaglar, A.: Bayesian Learning in Social Networks. M.I.T., Mimeo (2008)
3. Bala, V., Goyal, S.: Learning from Neighbors. The Review of Economic Studies 65(3), 595–621 (1998)
4. Berger, R.: A Necessary and Sufficient Condition for Reaching a Consensus Using DeGroot's Method. Journal of the American Statistical Association 76(374), 415–418 (1981)
5. Bikhchandani, S., Hirshleifer, D., Welch, I.: A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades. Journal of Political Economy 100(5), 992–1026 (1992)
6. DeGroot, M.H.: Reaching a Consensus. Journal of the American Statistical Association 69(345), 118–121 (1974)
7. DeMarzo, P.M., Vayanos, D., Zwiebel, J.: Persuasion Bias, Social Influence, and Uni-Dimensional Opinions. Quarterly Journal of Economics 118, 909–968 (2003)
8. Seneta, E.: Non-negative Matrices and Markov Chains. Springer, Heidelberg (1981)
9. Friedkin, N.E., Johnsen, E.C.: Social Influence Networks and Opinion Change. Advances in Group Processes, vol. 16, pp. 1–29 (1999)
10. Friedkin, N.E.: A Structural Theory of Social Influence. Cambridge University Press, New York (1998)
11. Friedkin, N.E., Johnsen, E.C.: Social Positions in Influence Networks. Social Networks 19, 209–222 (1997)
12. Friedkin, N.E., Cook, K.S.: Peer Group Influence. Sociological Methods and Research 19, 122–143 (1990)
13. Goyal, S.: Learning in Networks. In: Handbook of Social Economics (2010)
14. Jackson, M.O.: Social and Economic Networks. Princeton University Press, Princeton (2008)
15. Medhi, J.: Stochastic Processes. New Age International Publishers (2010)
Mathematical Continuity in Dynamic Social Networks John L. Pfaltz Dept. of Computer Science, University of Virginia
[email protected]
Abstract. A rigorous concept of continuity for dynamic networks is developed. It is based on closed, rather than open, sets. It is local in nature, in that if the network change is discontinuous it will be so at a single point and the discontinuity will be apparent in that point’s immediate neighborhood. Necessary and sufficient criteria for continuity are provided when the change involves only the addition or deletion of individual nodes or connections (edges). Finally, we show that an effective network process to reduce large networks to their fundamental cycles is continuous.
1 Introduction
Networks, or undirected graphs (which we regard as total synonyms), are fundamental for modeling social phenomena [5]. Yet they also abound in both the sciences and humanities; cf. [17] for its excellent survey and bibliography of over 400 applications. They may be huge; the connectivity of the world wide web is a network — they may be tiny; the atomic bonds in a molecule are an undirected graph. Such networks are dynamic; yet there has been little formal study of network change [4]. We introduce the concept of network transformation in Section 3. Typically, we are interested in those kinds of transformations which preserve elements of network structure. In particular, we are concerned with "continuous" transformations.

Like open sets in continuous manifolds, closed sets can be a powerful tool for analyzing the structure of discrete systems. Closure is associated with rational choice operators in economics [12,16,15]. Galois closure can be used to extract rules from data sets for subsequent use in A.I. reasoning systems [22,23]. When the system can be partially, or totally, ordered the closed sets are usually intervals, ideals or filters [11,14]. In this paper we employ the closed set structure of undirected graphs and networks.

Much of the current mathematical analysis of social networks is statistical [13,28] or combinatoric [27]. Both can provide valuable, broad-brush properties of the entire system. In contrast, our approach focuses on the decomposition of the system into its constituent closed set structure. The closed sets are created by a neighborhood closure introduced in Section 2.1.
In Section 3, we define the concept of continuous transformations of discrete systems in general, and use it in Section 3.1 to explore the behavior of continuous network transformations. All of the mathematical results associated with network closure in these two sections are original. Many of the other results about general closure are not widely available [20,21,24]; hence we have provided detailed proofs. These proofs can be skipped without losing the essential content of the paper. Section 4.1 presents a representative graph reduction process that is applicable to large networks; it is shown to be continuous. Lastly, Section 4.2 introduces the notion of “fuzzy” closure.
2 Closure

An operator ϕ is said to be a closure operator if for all Y, Z ⊆ P it is:
(C1) extensive, Y ⊆ Y.ϕ,
(C2) monotone, Y ⊆ Z implies Y.ϕ ⊆ Z.ϕ, and
(C3) idempotent, Y.ϕ.ϕ = Y.ϕ.

A subset Y is closed if Y = Y.ϕ. In this work we prefer to use suffix notation, in which an operator follows its operand. Consequently, when operators are composed, the order of application is read naturally from left to right. With this suffix notation, read Y.ϕ as "Y closure". It is well known that the intersection of closed sets must be closed. This latter property can be used as the definition of closure, with the operator ϕ defined by Y.ϕ = ∩ {Zi closed | Y ⊆ Zi}. By a closure system S = (P, ϕ) we mean a set P of "points" or "elements", together with a closure operator ϕ. By (C1) the set P must be closed. In a social network these points are typically individuals, or possibly institutions. The empty set, Ø, may, or may not, be closed.

A point y ∈ Y is said to be an ordinary point of Y if y ∈ (Y −{y}).ϕ. In contrast, a point y ∈ Y is said to be an extreme point of Y if y ∉ (Y −{y}).ϕ. (Extreme points have a central role in antimatroid closure theory [2,6].) A set is said to be whole if all of its points are ordinary points.

2.1 Neighborhood Closure
Let S = (P, A) be a set P of points, or elements, together with a symmetric adjacency relation A. By the neighborhood, or neighbors, of a set Y we mean the set Y.η = {x ∉ Y | ∃y ∈ Y, (x, y) ∈ A}. By the region dominated by Y we mean Y.ρ = Y ∪ Y.η.¹ Suppose P is a set of individuals and the relation A denotes a symmetric connection, such as mutual communication, between them. The neighborhood y.η about a person y is the set of individuals with which y directly communicates. The neighborhood, Y.η, of a set Y of individuals is the
¹ In graph theory, Y.η is often called the "open neighborhood of Y" and denoted N(Y), while Y.ρ, denoted N[Y], has been called the "closed neighborhood of Y" [1,8]. This is a rather different meaning of "closed".
A   a b c d e f g h
a   1 1 1 0 0 0 0 0
b   1 1 1 1 0 0 0 0
c   1 1 1 1 1 1 0 0
d   0 1 1 1 0 0 1 0
e   0 0 1 0 1 1 1 0
f   0 0 1 0 1 1 0 1
g   0 0 0 1 1 0 1 1
h   0 0 0 0 0 1 1 1
Fig. 1. A symmetric adjacency matrix A and corresponding undirected graph
set of individuals not in Y who directly communicate with at least one individual in Y. The region, Y.ρ, also includes Y itself. Members of Y may, or may not, communicate with each other. We can visualize the neighborhood structure of a discrete set of points, or individuals, as an undirected graph such as Figure 1. The neighbors of any point are those adjacent in the graph. In the graph of Figure 1 we have {a}.η = {b, c}, or more simply a.η = bc, and g.ρ = degh.

Given the neighborhood concepts η and ρ, we define the neighborhood closure, ϕη, to be

    Y.ϕη = {x | x.ρ ⊆ Y.ρ}                                             (1)

In a social system, the closure of a group Y of individuals consists of those additional individuals, x, all of whose connections match those of the group Y. A minimal set X ⊆ Y of individuals for which X.ϕη = Y.ϕη is sometimes called the nucleus, core, or generator of Y.ϕη. Readily, for all Y,

    Y ⊆ Y.ϕη ⊆ Y.ρ                                                     (2)

that is, Y closure is always contained in the region dominated by Y.

Proposition 1. ϕη is a closure operator.

Proof. Readily, Y ⊆ Y.ϕη by definition. Let X ⊆ Y and let z ∈ X.ϕη. By (1), z.ρ ⊆ X.ρ ⊆ Y.ρ, hence z ∈ Y.ϕη. Let z ∈ Y.ϕη.ϕη. Then z.ρ ⊆ Y.ϕη.ρ = ∪x∈Y.ϕη x.ρ ⊆ Y.ρ, hence z ∈ Y.ϕη.

Proposition 2. X.ϕη ⊆ Y.ϕη if and only if X.ρ ⊆ Y.ρ.

Proof. Let X.ϕη ⊆ Y.ϕη. For all x ∈ X.ϕη, x.ρ ⊆ X.ρ, so x ∈ Y.ϕη implies x.ρ ⊆ Y.ρ, or X.ρ ⊆ Y.ρ. Now suppose X.ρ ⊆ Y.ρ. Let z ∈ X.ϕη, implying z.ρ ⊆ X.ρ ⊆ Y.ρ. Hence z ∈ Y.ϕη.

An immediate consequence of Proposition 2 is
Corollary 1. X.ϕη = Y.ϕη if and only if X.ρ = Y.ρ.

Proposition 3. Let ϕη be the closure operator. If y.η ≠ Ø then there exists X ⊆ y.η such that y ∈ X.ϕη.

Proof. Readily, y.ρ ⊆ y.η.ρ, so y ∈ y.η.ϕη. Choose a minimal X ⊆ y.η such that y.ρ ⊆ X.ρ.

So, unless y is an isolated point, every point y is in the closure of some subset of its neighborhood. One might expect that every point in a discrete network must be closed, e.g. {x}.ϕη = {x}. But this need not be true, as shown in Figure 1. The region c.ρ = abcdef, while a.ρ = abc ⊆ c.ρ and b.ρ = abcd ⊆ c.ρ, so c.ϕη = abc. The points a and b are ordinary points of Y = {abc}, but Y is not whole because c ∉ (abc−c).ϕ = {ab}.

Equation (2) suggests an effective computer algorithm to calculate the closure Y.ϕη of any set Y. Initially, let Y.ϕη = Y; then, since Y.ρ = Y ∪ Y.η, examine only the points z in the neighborhood, Y.η, of Y. If z.ρ ⊆ Y.ρ, add z to Y.ϕη.

The following sequence of propositions regarding ordinary points and whole sets all assume that the closure operator is the neighborhood closure. They need not be true in general.

Proposition 4. If ϕη is the closure operator and y is an ordinary point of Y, then y.ρ ⊆ (Y −{y}).ρ ⊆ Y.ρ.

Proof. The first containment follows from the definition of y ∈ (Y −{y}).ϕη. The second containment is always true.

Proposition 5. Let ϕη be the closure operator. If X and Y are finite whole sets and X ∩ Y ≠ Ø, then X = Y.

Proof. Let z ∈ X ∩ Y, so z is an ordinary point of both X and Y. By Prop. 4, z.ρ ⊆ X.ρ ∩ Y.ρ. Consequently the iterated neighborhood z.ρ…ρ ⊆ X.ρ…ρ ∩ Y.ρ…ρ, and since both are finite this iteration must terminate with X ⊆ X ∩ Y, Y ⊆ X ∩ Y, so X = Y = X ∩ Y.

It is apparent that with respect to neighborhood closure, whole sets are effectively the non-trivial connected components of the network.
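A direct Python transcription of the closure algorithm sketched above (after equation (2)), using the adjacency of Figure 1; the dict encoding and names are ours.

adj = {'a': 'bc', 'b': 'acd', 'c': 'abdef', 'd': 'bcg', 'e': 'cfg',
       'f': 'ceh', 'g': 'deh', 'h': 'fg'}

def eta(Y):                      # neighbors of Y (points outside Y adjacent to Y)
    return {x for y in Y for x in adj[y]} - set(Y)

def rho(Y):                      # region dominated by Y: Y.rho = Y union Y.eta
    return set(Y) | eta(Y)

def closure(Y):                  # Y.phi_eta = {x | x.rho subset of Y.rho}
    Yr, result = rho(Y), set(Y)
    for z in eta(Y):             # by (2), only neighbors need be examined
        if rho({z}) <= Yr:
            result.add(z)
    return result

print(sorted(closure({'c'})))    # ['a', 'b', 'c'], matching c.phi_eta = abc above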
3 Transformations
Almost any book on graph theory mentions graph homomorphism, that is, a mapping h : (P, E) → (P′, E′), or a function h : P → P′ in which (x, y) ∈ E implies that (h(x), h(y)) ∈ E′ [1,8]. But a serious limitation of graph homomorphisms is that, since h : P → P′ is a function, the homomorphic "image" must always be "smaller". In the real world, networks expand and contract.
For this reason we introduce the notion of a graph, or network, transformation which is a function mapping the power set, 2^P, of P into the power set, 2^P′, of P′. That is, every subset of P has a unique image subset in P′. The operators η, ρ, and ϕη are transformations of a network (P, E) into itself, since every subset has a unique image. To emphasize this difference, a transformation f is denoted by our suffix notation, e.g. Y.f, rather than the customary prefix notation of functions and homomorphisms. Thus, in neighborhood notation, a graph homomorphism h would satisfy y.η.h ⊆ y.h.ρ′.

We denote transformations of network systems by f : (P, E) → (P′, E′), or possibly by f : (P, ϕ) → (P′, ϕ′), since we are often interested in the closure structure induced by the neighborhood system. Note that a transformation f may change only the neighborhood system of P, and hence ϕ′. In this paper we require that any transformation f be monotone, that is

X ⊆ Y implies X.f ⊆ Y.f   (3)

as seems to be always the case in real applications. Note that "monotone" in this sense only preserves containment relationships; it does not mean that the transformation is "increasing" or "decreasing". By convention [19,29], a transformation f is said to be continuous if for all Y ⊆ P

Y.ϕ.f ⊆ Y.f.ϕ′   (4)

Readily, (4) holds for all closed sets Y because Y.ϕ.f = Y.f ⊆ Y.f.ϕ′. If one visualizes ϕ to be an operative force which causes social cohesion, then "continuity" assures that cohesion observed in the pre-image network will be contained in the cohesion modeled in the resulting image network.
Proposition 6. Let f : (P, ϕ) → (P′, ϕ′) and g : (P′, ϕ′) → (P″, ϕ″) be transformations and let g be monotone. If both f and g are continuous, then so is f·g : P → P″.

Proof. We have X.ϕ.f ⊆ X.f.ϕ′ for any X ⊆ P and Y.ϕ′.g ⊆ Y.g.ϕ″ for any Y ⊆ P′. Consequently, as g is monotone, X.ϕ.f.g ⊆ X.f.ϕ′.g ⊆ X.f.g.ϕ″. Thus f·g is continuous.

Continuous transformations of discrete spaces exhibit many of the properties of continuous real functions with which we are more familiar [26]. For example, let f be a function f : R → R′; if (a) f is onto, then for all y′ ∈ R′ there exists y ∈ R such that f(y) = y′; if (b) f is continuous and X′ is open/closed in R′, then f⁻¹(X′) is open/closed in R; if (c) f is continuous and X is connected in R, then f(X) is connected in R′.
Proposition 7. Let f : (P, ϕ) → (P′, ϕ′) be monotone and continuous, and let Y′ = Y.f be closed. Then Y.ϕ.f = Y′.

Proof. Let Y.f be closed in P′. Because f is continuous, Y.ϕ.f ⊆ Y.f.ϕ′ = Y.f, since Y.f is closed. By monotonicity, Y.f ⊆ Y.ϕ.f, so Y.ϕ.f = Y.f.
Or, in effect, if the pre-image of a closed set exists it must also be, in a sense, closed.

One can also consider closed transformations which map closed sets in P onto closed sets in P′. The term "closed transformation" is traditional for such structure preserving maps, whether expressed in terms of open sets or closed sets. But it is most unfortunate in this context, where the multiple meanings can lead to confusion. It is apparent that the composition of closed transformations is another closed transformation.

Proposition 8. A monotone transformation f : (P, ϕ) → (P′, ϕ′) is closed if and only if ∀X ⊆ P, X.f.ϕ′ ⊆ X.ϕ.f.

Proof. Let f be closed. By monotonicity, X ⊆ X.ϕ implies X.f ⊆ X.ϕ.f. But, because X.ϕ is closed and f is closed, X.f.ϕ′ ⊆ X.ϕ.f. Conversely, let all subsets X ⊆ P fulfill X.f.ϕ′ ⊆ X.ϕ.f and let X be a closed subset of (P, ϕ). Then X.f.ϕ′ ⊆ X.f. But readily X.f ⊆ X.f.ϕ′, so equality holds.

Consequently,

Proposition 9. A monotone transformation f : (P, ϕ) → (P′, ϕ′) is closed and continuous if and only if, for all X ⊆ P, X.ϕ.f = X.f.ϕ′.
A common way of defining a graph transformation f : (P, E) → (P′, E′) is to first define {y}.f for all singleton sets in P and then extend this to all Y ⊆ P by Y.f = ∪y∈Y {y}.f. We call f an extended transformation if P.f = P′. Any extended transformation is, by construction, monotonic.
Proposition 10. If f : (P, E) → (P′, E′) is an extended transformation, then for all y′ ∈ Y′ = Y.f there exists y ∈ Y such that y′ ∈ {y}.f.

Proof. Let y′ ∈ Y′. By the extended construction Y′ = ∪y∈Y {y}.f, hence y′ ∈ {y}.f for some y ∈ Y.
Note that this is quite different from asserting a true inverse existence, that for all y′ ∈ Y′ there exists some y ∈ Y such that y.f = y′. To get some sense of the import of this "weak inverse existence" proposition, consider the simple transformation f of Figure 2. If we define f on P by x.f = x′ and y.f = y′, then
by extension {xy}.f = x′y′ and z′ has no pre-image; so P.f ≠ P′. However, if we let x.f = {x′z′}, y.f = {y′z′}, then {xy}.f = x′y′z′. Now P.f = P′, so f is an extended transformation, and Proposition 10 is clearly satisfied. Unless otherwise explicitly stated, all examples of this paper will be extended transformations.

Fig. 2. A simple transformation f with multiple definitions
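As an illustration, the following Python fragment (ours, not the paper's) lifts a singleton-level definition to an extended transformation; the singleton images mirror the second definition of f in Figure 2, in which x and y both map onto z′ as well.

```python
# A sketch (ours) of an extended transformation: f is given on singletons
# and lifted to every subset Y by Y.f = union of {y}.f over all y in Y.

def extend(singleton_image):
    """Lift a map point -> set of image points to a map on subsets."""
    return lambda Y: set().union(*(singleton_image[y] for y in Y))

# Second definition of f in Figure 2: x and y both map onto z' too.
f = extend({'x': {"x'", "z'"}, 'y': {"y'", "z'"}})
print(f({'x', 'y'}))   # {"x'", "y'", "z'"} = P', so f is extended
```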
3.1 Network Transformations
The preceding results are true for all closure systems. Now we focus specifically on network transformations. In the next set of propositions it is the neighborhood, y.η, which is central.

Proposition 11. Let x ∈ y.η; then x ∈ y.ϕη if and only if x.ρ ⊆ y.ρ, if and only if x.η−{y} ⊆ y.η.

Proof. The first equivalence is simply a restatement of the definition of neighborhood closure. The second equivalence follows because if x.ρ ⊆ y.ρ then ∀z ≠ y, z ∈ x.η, we have z ∈ y.η and y ∈ z.η by symmetry. The converse is similar.
Proposition 12. Let f : (P, E) → (P′, E′) be extended. If f is not continuous, there exists Y ⊆ P and y ∈ Y.η such that either (1) y′ ∉ Y.f.η′ or (2) y.η ⊆ Y.η and y′.η′ ⊄ Y.f.η′.

Proof. Since f is not continuous, there exists Y such that Y.ϕη.f ⊄ Y.f.ϕ′η. Thus, ∃y′ ∈ Y.ϕη.f, y′ ∉ Y.f.ϕ′η. By Prop. 10, ∃y ∈ Y.ϕη such that y′ ∈ y.f. Moreover y ∉ Y, else y′ ∈ Y.f. Consequently, y ∈ Y.η and y.η ⊆ Y.η. Now, since y′ ∉ Y.f.ϕ′η, we know that either y′ ∉ Y.f.η′ or y′.η′ ⊄ Y.f.η′. Y is technically unspecified, but since y is an ordinary point, by Prop. 11 y ∈ y.η.ϕη; hence we can assume {y} ⊆ Y ⊆ y.η.

This proposition establishes that if f is discontinuous anywhere, then it will be discontinuous at, or near, a point y. One need not consider all subsets of 2^P. Just as is the case with classical function theory, discontinuity, and thus continuity, is a local phenomenon. Secondly, it provides conditions (1) and (2) which are necessary, but not sufficient, to demonstrate discontinuity. If for a point y ∈ P neither condition (1) nor (2) holds, we say f is continuous at y. If either condition holds, other criteria must be used, c.f. Propositions 13, 14 or 16.

We have said that a transformation f : P → P′ is monotone if ∀X, Y, X ⊆ Y implies X.f ⊆ Y.f. Let f : (P, E) → (P′, E′) be a transformation between two neighborhood systems. The transformation f is said to be neighborhood monotone if X.ρ ⊆ Y.ρ implies X.f.ρ′ ⊆ Y.f.ρ′. A transformation that is monotone need not be neighborhood monotone, and conversely.
Proposition 13. Let f : (P, E) → (P′, E′) be monotone; then f is continuous if and only if f is neighborhood monotone.
Proof. Let f be continuous and let X.ρ ⊆ Y.ρ. By Prop. 2, X ⊆ X.ϕη ⊆ Y.ϕη. Thus, X.f ⊆ Y.ϕη.f ⊆ Y.f.ϕ′η by continuity. So X.f.ρ′ ⊆ Y.f.ρ′. Conversely, let f be neighborhood monotone. By definition Y.ϕη = Y ∪ {x ∉ Y | x.ρ ⊆ Y.ρ}. Since for all y ∈ Y, y.f ⊆ Y.f ⊆ Y.f.ϕ′η, we need only consider x ∉ Y but x.ρ ⊆ Y.ρ. Since f is neighborhood monotone, x.ρ ⊆ Y.ρ implies x′.ρ′ = x.f.ρ′ ⊆ Y.f.ρ′, so x′ ∈ Y.f.ϕ′η.

Caution: information regarding the region X.ρ dominated by a set X reveals very little about X itself. For example, in Figure 1 we have {bd}.ρ = abcdg ⊆ abcdefgh = {cg}.ρ, yet {bd} ∩ {cg} = Ø. There is an extensive literature regarding dominating sets, c.f. [9,10].
3.2 Network Growth
Unfortunately, both Propositions 12 and 13 can be awkward to use in practice. We look for local criteria. A network can grow by adding points and/or edges. Any transformation which just adds an isolated point z′ will be continuous, since if X is closed in (P, ϕ), X′ and X′ ∪ {z′} will be closed in (P′, ϕ′). But, if continuity is important, care must be taken when adding edges or connections.

Proposition 14. An extended network transformation f, which adds an edge (x′, z′) to A′ at x, will be continuous at x if and only if for all y ∈ x.η, x ∈ y.ϕη implies z ∈ y.η.

Proof. First we observe that x.ϕη.f ⊆ x.f.ϕ′η because f only expands x′.η′, so y ∈ x.ϕη must imply that y′ ∈ x′.ϕ′η. Moreover, z ∉ x.η, so ∀y ∈ x.η, if w ∈ y.ϕη, w ≠ x, then w′ ∈ y.ϕη.f or w′ ∈ y′.ϕ′η because the neighborhoods of y and w are unchanged. However, x ∈ y.ϕη implies x.ρ ⊆ y.ρ; hence by Prop. 13, f is continuous iff x′.ρ′ ⊆ y′.ρ′, that is, iff z′ ∈ y′.ρ′.

The transformation f1 in Figure 3, which adds the two edges (d′, i′) and (g′, i′) to G1, satisfies Prop. 14. For example, d ∉ b.ϕη = ab, d ∉ c.ϕη = abc and d ∉ g.ϕη = g, so the proposition is trivially satisfied. Similarly, examination at g shows that for all y ∈ g.η, {y} = y.ϕη, so f1 is continuous at g as well. Elsewhere it is the identity map, so f1 is continuous everywhere. We observe that f1 is not a closed transformation because {dg} is closed in G1, but {d′g′} is not closed in G2 because {d′g′}.ϕ′η = d′g′i′.

Expansion of G2 at a′ by creating the edge (a′, j″) is different. Because a′ ∈ b′.ϕ′η (and c′.ϕ′η), but (b′, j″) ∉ A″, by Prop. 14 f2 is discontinuous at b′ (and also c′). We would also observe that f2 is not neighborhood monotone at b′ because a′.η′ = a′b′c′ ⊆ b′.η′ = a′b′c′d′ but a″.η″ = a″b″c″j″ ⊄ b″.η″ = a″b″c″d″, so f2 is not continuous by Prop. 13 as well. Finally, we verify that b′.ϕ′η.f2 = a″b″ ⊄ b″ = b″.ϕ″η. As this example illustrates, the discontinuity need not occur at either x or z, but often at some point y in x.η or z.η.

Fig. 3. Two network transformations, f1 and f2
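The local criterion of Proposition 14 is easy to mechanize. Below is a minimal sketch (ours, not the paper's) reusing the eta/closure helpers sketched in Section 2; it assumes the image point z′ is identified with z and that the test is run on the pre-image graph before the edge is added.

```python
# A sketch (ours) of the Prop. 14 criterion: adding an edge between x and z
# is continuous at x iff every neighbor y of x whose closure captures x
# (x in y.phi_eta) already has z as a neighbor.

def edge_addition_continuous_at(G, x, z):
    return all(z in G[y]              # z in y.eta ...
               for y in G[x]          # ... for every y in x.eta ...
               if x in closure(G, {y}))  # ... such that x in y.phi_eta

# Attaching a brand-new isolated point z to x is therefore continuous
# exactly when no neighbor y of x has x in y.phi_eta.
```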
3.3 Network Contraction
Real networks lose members and connections; but this is hard to model mathematically with homomorphic functions. The problem is that every point in the existing network must map to some point in the image space, and to be a homomorphism it must bring its edges/connections with it. Of course, if the two network elements are truly combined in reality, then homomorphism is the right model. But when the member or connection simply disappears, it isn't. When we use the transformation model of this paper we can map a point, or subset, onto the empty set, Ø. We call it point removal. Removal of any point, or node, z must also delete all edges incident to z, that is all edges of the form (y, z) ∈ E. This is equivalent to deleting a row and column from the adjacency relation, A. We let δz denote the removal of z from P and (y, z) from E for all y ∈ z.η.

Proposition 15. δz is continuous at all y ∈ z.η.

Proof. Let X.ρ ⊆ Y.ρ. Readily, X.ρ−{z} ⊆ Y.ρ−{z}, so X.ρ.δz ⊆ Y.ρ.δz and by Prop. 13 δz is continuous.

Instead of deleting a point and all its incident edges, we can remove one, or more, connections by changing the neighborhood structure represented by A.

Proposition 16. An extended network transformation f, which deletes an edge (x′, z′) from A′ at x, will be continuous at x if and only if either z ∉ x.ϕη or x.ϕη = z.ϕη.

Proof. If z ∈ x.ϕη and x.ϕη ≠ z.ϕη, then f must be discontinuous because z′ ∉ x′.η′, so x.ϕη.f ⊄ x.f.ϕ′η. Now, consider y ∈ x.η, y ≠ z, so x ∈ y.η by symmetry. If x ∈ y.ϕη then x.η ⊆ y.η. Since A′ = A−(x′, z′), x′.η′ ⊆ x.f.η′, so y.ϕη.f ⊆ y.f.ϕ′η.

The second condition, x.ϕη = z.ϕη, is needed only for situations such as that of Figure 4, in which x.ϕη = z.ϕη regardless of what other nodes are connected to y1 and y2. Addition, or deletion, of the dashed edge (x, z) makes no change in the closed set structure whatever.
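The two contraction primitives admit short sketches (ours, not the paper's), again reusing the closure helper from Section 2.

```python
# delta_z of Prop. 15: remove point z together with all incident edges,
# i.e. delete z's row and column from the adjacency relation.
def delete_point(G, z):
    return {p: nbrs - {z} for p, nbrs in G.items() if p != z}

# The Prop. 16 criterion: deleting an existing edge (x, z) is continuous
# at x iff z is not in x.phi_eta, or x and z have identical closures.
def edge_deletion_continuous_at(G, x, z):
    cx, cz = closure(G, {x}), closure(G, {z})
    return z not in cx or cx == cz
```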
Fig. 4. Two points where x.ϕη = z.ϕη
Fig. 5. Contraction of a network by two successive deletions
The transformations f3 and f4 of Figure 5 illustrate network contractions; the dashed edges of Gi indicate the deletions made in Gi+1. By Prop. 16, removing the edge (a, b) from G3 is discontinuous. Indeed, we find that a.ϕη.f3 = a′b′ ⊄ a′ = a.f3.ϕ′η. However, f3 is continuous at c ∈ a.η. The transformation f4 illustrates that rather large network changes can be continuous, since by Proposition 15 the individual deletions δg and δh are each continuous, and by Proposition 6, G4.δg.δh must be continuous as well. However, removal of either connection (d′, g′) or (g′, h′) individually would be discontinuous. By Prop. 6 the composition of continuous transformations must be continuous; but as f4 illustrates, a continuous transformation need not be decomposable into primitive continuous steps.

In Propositions 14 and 16 we established continuity criteria for network transformations which added and/or deleted elements or connections in a network. But transformations can radically alter the adjacency structure, as shown in Figure 6, and still be continuous. Here, the graph G7 is the continuous image of G6 under f6. This is true because the only neighborhoods of G6 are abc, abd, acd, bcd and abcd, so Proposition 13 is trivially satisfied. On the other hand, c′.ρ′ = b′c′ ⊆ a′b′c′d′ = b′.ρ′, but c′.f7.ρ = acd ⊄ abd = b′.f7.ρ. So f7 cannot be continuous.

Fig. 6. f6 is continuous, f7 is not
4 Continuity in Practice

4.1 Network Reduction
In Figure 1 of Section 2.1, we observed that the point c is not closed: a and b are elements of c.ϕη. Although {a} and {b} are themselves closed sets, they must be contained in any closed set containing c. We say a point z is subsumed by a set Y if z is an ordinary point of Y, that is (by Prop. 4) if z.ρ ⊆ Y.ρ. For the reduction process we describe below we will only consider singleton sets Y, such as {c}. In a sense, subsumed points such as a and b of Figure 1 contribute little to the closure structure, or topology, of the network. They can be eliminated with little loss of structural information.

In [25], Richards and Seary provide a small 18 point network called the "Sampson" data. They use it to contrast various eigenvector algorithms; we will use it to illustrate graph reduction by point subsumption. Figure 7(a) is one visualization of this network. The circled points of Figure 7(b) denote all the points
that are subsumed by other singleton sets. For example, 7 is subsumed by 2, and 14 is subsumed by 15. Finally, Figure 7(c) is the reduced graph created by deleting all subsumed points.

Fig. 7. (a) Original "Sampson" network, (b) subsumed points, (c) reduced network

The reduced graph of Figure 7(c) is structurally simpler, yet its topology is faithful to the original. By recording [in brackets] the number of points subsumed by each individual node, it also conveys a measure of the original density near that node. The key elements of Figure 7(c) are chordless cycles of length 4 or greater. These are <3, 10, 2, 1, 3>, <18, 4, 2, 15, 17, 18> and <18, 17, 11, 16, 18> in the figure. These are fundamental cycles; no point on a fundamental cycle can be subsumed by another. These fundamental cycles define the topology of the network in much the same manner that 1-cycles can be used to define the topological structure of manifolds [7].

By Proposition 15 the removals of subsumed points, such as δ7 in Figure 7(b) above, are each individually continuous. Thus by Proposition 6, their composition is continuous. Figure 7(a) is rather simple to begin with. The continuous reduction by subsumed points is more useful in larger, more complex networks.
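The reduction just described admits a short sketch (ours; the paper's own reduction program is not reproduced here, and the order in which subsumed points are removed is an arbitrary choice of this sketch). It reuses the rho and delete_point helpers sketched earlier.

```python
# Repeatedly delete any point z subsumed by a singleton neighbor y,
# i.e. z.rho contained in y.rho.  By Props. 15 and 6 each deletion, and
# hence the whole composition, is continuous.

def reduce_by_subsumption(G):
    G = {p: set(n) for p, n in G.items()}      # work on a copy
    removed = True
    while removed:
        removed = False
        for z in list(G):
            # z is subsumed by a neighbor y when z.rho lies inside y.rho
            if any(rho(G, {z}) <= rho(G, {y}) for y in G[z]):
                G = delete_point(G, z)         # delta_z, as in Prop. 15
                removed = True
                break                          # rescan after each deletion
    return G
```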
In [18], Newman presents a 379 node network of collaborating scientists in which each edge denotes at least one co-authored paper. This was reduced, by the same program that generated Figure 7(c), to the 65 node network shown in Figure 8.
Fig. 8. Fundamental cycles in the collaboration network of [18]
As in Figure 7(c), values [n] in brackets denote the number of nodes directly, or indirectly, subsumed by the retained node. Dashed lines crudely approximate the extent of nodes in the original network. All of the retained nodes lie on at least one fundamental cycle. The reduced representation in terms of fundamental cycles is shown in Figure 8. It is a continuous image of the original 379 node network.
4.2 Fuzzy Closure
With neighborhood closure, as defined in Section 2.1, a point z in the neighborhood of a set Y is in Y's closure if its neighborhood, z.η, is completely contained in Y.ρ. Thus for z to be subsumed by a single point y, as in Section 4.1, all the neighbors/connections of z must already be neighbors of y. This is asking for a great deal, and it is rather surprising that the form of network reduction described above works as well as it does on real networks. When y and z are individuals we would be more likely to say z is tightly bound to y if "almost all" of z's attachments/connections/neighbors are neighbors of y. Can such a fuzzy concept of closure be made rigorous?
Let us define a fuzzy neighborhood closure, ϕf, by Y.ϕf = Y ∪ {w ∈ Y.η : |w.ρ−Y.ρ| ≤ 1}; that is, w can have one independent attached neighbor and still be considered to be in the closure Y.ϕf. We use the intersection property of closure systems to show:

Proposition 17. ϕf is a closure operator.

Proof. Let X and Z be closed w.r.t. ϕf. We claim that Y = X ∩ Z is also closed w.r.t. ϕf, that is Y.ϕf = Y. Suppose not; then ∃y ∈ (X ∩ Z).ϕf, y ∉ X ∩ Z. If y ∉ X, there exist at least two neighbors u, v ∈ y.η with u, v ∉ X, so u, v ∉ X ∩ Z, contradicting the assumption that y ∈ (X ∩ Z).ϕf. So y ∈ X. Assuming y ∉ Z leads to precisely the same contradiction, so y ∈ X ∩ Z.

Readily, Y ⊆ Y.ϕη ⊆ Y.ϕf, so this fuzzy closure yields a coarser network structure. For example, the only non-trivial fuzzy closed sets of the graph of Figure 1 are abd, efgh, and h. Because ϕf is a closure operator, many of the preceding propositions are still valid; some are not. For example, the fundamental property (2) does not hold; Y.ϕf ⊄ Y.ρ. If S = (Z, A) with Z being the integers {1, . . . , n} and (i, i + 1) ∈ A, then the only closed sets are Ø and Z. No non-empty proper subset of Z can be closed. Because of the behavior of fuzzy closure in this last example, reduction of the network of Figure 8 using it yields only a single point! Nevertheless, the fact that one can define a fuzzy closure indicates the possibility of its use in other kinds of social network analysis.
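In code, the fuzzy closure is a one-line variation on the earlier closure sketch (ours, with the same graph conventions):

```python
# Y.phi_f: Y together with every neighbor w of Y having at most one
# point of w.rho outside Y.rho (one "independent" attached neighbor).

def fuzzy_closure(G, Y):
    region = rho(G, Y)
    return set(Y) | {w for w in eta(G, Y) if len(rho(G, {w}) - region) <= 1}
```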
5 Summary
The results of this paper provide a rigorous mathematical foundation for studying the continuous transformation of large social networks. The characterization is based on local changes to the graph, or network, not massive, global transformations. But "continuity" has always been a local concept couched in terms of very small changes in the pre-image space.² However, Proposition 6, the example of f4 in Figure 5, and our application of Proposition 15 to network reduction demonstrate that global change, which is the composition of smaller continuous steps, may also be characterized as "continuous".

Unlike the traditional approach to continuity, the concept of the "closed set" structure of a network is fundamental. Perhaps the idea of a neighborhood, Y.η, comes closest to embodying the concept of "nearby points", and thus an "open" set.³ However, neighborhoods have few of the key properties of open sets, and trying to fit them into this role seems futile.

Mathematics is a formal exercise; yet surprisingly often it seems to mirror reality. For example, if connections are between individuals, as in social networks, then Proposition 14 would say that creating a connection (x, z) between two persons, x and z, where x is closely bound to a third individual y, is smoother, easier, or continuous if a connection already exists between y and z. This seems to be the case in numerous studies cited by [5]. On the other hand, Proposition 16 would assert that breaking a connection between x and z represents a discontinuity if z is tightly bound to x, that is, has the same shared connections to others nearby. This also seems to be true in the real world.

While the introduction of closed sets to the study of transformational change has resolved a number of key issues, there are many more yet to explore. For example, suppose there exists a bi-continuous transformation f between two graphs G and G′. In what way would they be similar? We might observe that we have yet to encounter a bi-continuous transformation other than a plain isomorphism: it may be that none can exist. In Section 4.2, we showed that a form of fuzzy closure can be defined, but we have not explored it rigorously. We only know that our reduction program, using fuzzy closure, always results in a network with only a single node! But what properties might fuzzy continuity have? Similarly, we have assumed that the relation A is symmetric. But many relationships, including friendship, need not be reciprocal. Is neighborhood closure well-defined for non-symmetric relations? Only Proposition 16 explicitly assumes symmetry; but it may be implicitly necessary elsewhere. Even with all these questions, we believe we have shown that a mathematically rigorous analysis of large social networks based on closed sets can be quite rewarding.

² E.g. the typical ε-δ definition of real analysis [26].
³ Many graph theory texts say that Y.η is an "open" neighborhood, c.f. [1,3,8].
References

1. Agnarsson, G., Greenlaw, R.: Graph Theory: Modeling, Applications and Algorithms. Prentice Hall, Upper Saddle River (2007)
2. Ando, K.: Extreme point axioms for closure spaces. Discrete Mathematics 306, 3181–3188 (2006)
3. Behzad, M., Chartrand, G.: Introduction to the Theory of Graphs. Allyn and Bacon, Boston (1971)
4. Bourqui, R., Gilbert, F., Simonetto, P., Zaidi, F., Sharan, U., Jourdan, F.: Detecting structural changes and command hierarchies in dynamic social networks. In: 2009 Advances in Social Network Analysis and Mining, Athens, Greece, pp. 83–88 (2009)
5. Christakis, N.A., Fowler, J.H.: Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives. Little Brown & Co., New York (2009)
6. Edelman, P.H., Jamison, R.E.: The Theory of Convex Geometries. Geometriae Dedicata 19(3), 247–270 (1985)
7. Giblin, P.J.: Graphs, Surfaces and Homology. Chapman and Hall, London (1977)
8. Harary, F.: Graph Theory. Addison-Wesley, Reading (1969)
9. Haynes, T.W., Hedetniemi, S.T., Slater, P.J. (eds.): Domination in Graphs, Advanced Topics. Marcel Dekker, New York (1998)
10. Haynes, T.W., Hedetniemi, S.T., Slater, P.J.: Fundamentals of Domination in Graphs. Marcel Dekker, New York (1998)
11. Jankovic, D., Hamlett, T.R.: New Topologies from Old via Ideals. Amer. Math. Monthly 97(4), 295–310 (1990)
12. Koshevoy, G.A.: Choice functions and abstract convex geometries. Mathematical Social Sciences 38(1), 35–44 (1999)
13. Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Statistical Properties of Community Structure in Large Social and Information Networks. In: WWW 2008, Proc. of 17th International Conf. on the World Wide Web, pp. 695–704 (2008)
14. McKee, T.A., McMorris, F.R.: Topics in Intersection Graph Theory. SIAM Monographs on Discrete Mathematics and Applications. Society for Industrial and Applied Math., Philadelphia (1999)
15. Monjardet, B.: Closure operators and choice operators: a survey. In: Fifth Intern. Conf. on Concept Lattices and their Applications, Montpellier, France (October 2007); Lecture notes
16. Monjardet, B., Raderinirina, V.: The duality between the antiexchange closure operators and the path independent choice operators on a finite set. Math. Social Sciences 41(2), 131–150 (2001)
17. Newman, M.E.J.: The structure and function of complex networks. SIAM Review 45, 167–256 (2003)
18. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(036104), 1–22 (2006)
19. Ore, O.: Mappings of Closure Relations. Annals of Math. 47(1), 56–72 (1946)
20. Pfaltz, J.L.: Closure Lattices. Discrete Mathematics 154, 217–236 (1996)
21. Pfaltz, J.L.: A Category of Discrete Partially Ordered Sets. In: Agnarsson, G., Shapiro, J. (eds.) Mid-Atlantic Algebra Conf. George Mason Univ., Fairfax (2004)
22. Pfaltz, J.L.: Logical Implication and Causal Dependency. In: Schärfe, H., Hitzler, P., Øhrstrøm, P. (eds.) ICCS 2006. LNCS (LNAI), vol. 4068, pp. 145–157. Springer, Heidelberg (2006)
23. Pfaltz, J.L.: Establishing Logical Rules from Empirical Data. Intern. Journal on Artificial Intelligence Tools 17(5), 985–1001 (2008)
24. Pfaltz, J.L., Šlapal, J.: Neighborhood Transformations. In: 40th Southeastern International Conf. on Combinatorics, Graph Theory and Computing, Boca Raton, FL (March 2009)
25. Richards, W., Seary, A.: Eigen Analysis of Networks. J. of Social Structure 1(2), 1–16 (2000)
26. Royden, H.L.: Real Analysis. Macmillan, New York (1988)
27. Saito, A. (ed.): Graphs and Combinatorics. Springer, Heidelberg (2010) ISSN 0911-0119
28. Smyth, P.: Statistical Modeling of Graph and Network Data. In: Proc. IJCAI Workshop on Learning Statistical Models from Relational Data, Acapulco, Mexico (August 2003)
29. Šlapal, J.: A Galois Correspondence for Digital Topology. In: Denecke, K., Erné, M., Wismath, S.L. (eds.) Galois Connections and Applications, pp. 413–424. Kluwer Academic, Dordrecht (2004)
Government 2.0 Collects the Wisdom of Crowds Taewoo Nam and Djoko Sigit Sayogo Center for Technology in Government, University at Albany, State University of New York 187 Wolf Road, Suite 301, Albany, NY12205, U.S. {tnam,dsayogo}@ctg.albany.edu
Abstract. A noteworthy trend is emerging: government agencies are tapping into citizens' innovative ideas. Government 2.0, the governmental adoption of Web 2.0 technologies, enables and empowers citizens to participate in various functions and processes of government such as service provision, information production, and policy making. Government 2.0 is a tool for government to collect the wisdom of crowds, which helps improve services, information, and policy. Crowdsourcing is not only for businesses but is now being implemented in the public sector. Currently government agencies chiefly use four strategies for crowdsourcing: contest, wiki, social networking, and social voting. This paper takes a close look at how government agencies utilize those strategies.

Keywords: Government 2.0, Web 2.0, Crowdsourcing, Wisdom of crowds.
1 An Emerging Wave

Information and communication technologies (ICTs) have moved to the second generation, Web 2.0, which is characterized by multi-directional digital connections and by participatory and collaborative technology. Such properties of Web 2.0 technologies are changing the citizen-government relationship. Web 2.0 enables and empowers citizens to engage in governmental workings that were previously closed, or only partially open, to the public. One can anticipate an advancement of digital government enhanced by Government 2.0, which denotes governmental adoption and use of Web 2.0 tools. Beyond the efficiency of information dissemination that was the primary value of the Web 1.0 age, today's digital government in the Web 2.0 age is gaining an unprecedented opportunity to improve citizen engagement and participation [1]. This paper discusses the contribution of Government 2.0 to citizen engagement and participation in various governmental functions including service provision, information production, and policy making. Government agencies tap into citizens' innovative ideas through crowdsourcing, which denotes "the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined network of people in the form of an open call" [12]. The trend can be an emerging paradigm for government operation, as Beth Noveck [16] identified an ongoing transition to a wiki-like working government: Wiki-government. Such a new paradigm indicates a more transparent, collaborative, and participatory mechanism for government operation enabled by Government 2.0 platforms. The spirit underlying the new move gains powerful institutional support from the Obama Administration in the
United States. The Open Government Directive endorsed by President Obama promised to usher in a new age of transparency, collaboration, and participation. The Directive stressed the novel mechanism through which government agencies learn from the wisdom of crowds. The participatory and collaborative potentials of Web 2.0 offer government agencies valuable opportunities to gather ideas and opinions from a larger populace of citizens and, furthermore, to bring them into the development of services, information, and policy. In this way government agencies can benefit from the wisdom of crowds. The new approach, however, cannot be a cure-all. Despite the technological potential for facilitating participation and collaboration, government's collection of the wisdom scattered across a large population is not without concerns and challenges. Government's crowdsourcing strategies require caution. While commentators have heaped hyperbole on crowdsourcing and Government 2.0 in the public sector, there have been only a few studies addressing the downside of current governmental initiatives to collect the wisdom of crowds. This paper aims to fill that research gap by discussing how government agencies use crowdsourcing. The paper is structured into three sections, including this introduction. The next section suggests four approaches to governmental crowdsourcing: contest, wiki, social networking, and social voting. It then delves into some cases in the U.S. context where the approaches are being used. The final section addresses further implications and offers concluding remarks.
2 How Does a Government Collect the Wisdom of Crowds?

We identify four types of collecting the wisdom of crowds from the current adoption of Web 2.0 technologies in the public sector. Many government agencies have recently launched projects to collect the wisdom of crowds. This section focuses on some popular cases, but more practices are now being operated with Government 2.0. Highlighted are the main characteristics (strengths and weaknesses) of the practices categorized into one of four types. Table 1 summarizes the strategy, mechanism, motivator, and main concern for citizen engagement.

Table 1. Government 2.0 Strategies to Collect the Wisdom of Crowds

Strategy           | Mechanism     | Motivator           | Main concern
Contest            | Competition   | Materials           | How to select ideas
Wiki               | Collaboration | Altruism            | Limited to internal communities
Social networking  | Networking    | Social relationship | How to filter input against noise
Social voting      | Voicing out   | Political efficacy  | Procedural democracy
2.1 Contest

Motivators for this competition-driven approach are extrinsic. Material incentives and career opportunities invigorate activism in contest-type crowdsourcing projects [4,5]. Material motivation, such as cash and prizes, is powerful in encouraging more active
participation in contests. The contest strategy is being used mainly for two purposes: collecting innovative ideas from the public, and collecting professional knowledge from semi-professional amateurs. There are many examples of getting ideas from general citizens. Some current practices successfully attract citizens' interest, but at the same time show weaknesses in design and implementation. The EPA sponsored a 30-to-60-second video competition to raise awareness of the connection between the environment and the stuff people use, consume, recycle, and throw away (Our Planet, Our Stuff, Our Choice). The agency asked for citizens' help in creating videos to inspire community involvement, spread information, and lead to action. The videos are posted on YouTube, where viewers can evaluate them by social voting (a thumbs up/down feature) and make comments. The topics and themes are clear and easy enough to increase citizens' interest in the contest, but viewers' evaluation lacks a set of consistent criteria. GSA ran a video contest on USA.gov, asking citizens to submit 30-to-90-second videos showing how the website made their lives easier. The best video chosen through the competition, "Get Your Voice Heard," uses a catchy song to bring attention to the many ways USA.gov is of service to the public: for example, renewing passports, applying for student aid, and contacting elected officials. In the selection process, submitted videos were rated by a panel of judges from GSA based on message, accuracy, appropriateness of theme, creativity, and entertainment value. Various pieces of wisdom were derived from the crowds, but the best wisdom was selected not by the crowds but only by government staff. Government could have borrowed a discerning eye from the crowds.

Government can also solicit semi-professional expertise. Technical solutions are also deliverables of government crowdsourcing. For example, when NASA scientists had difficulty devising a formula to predict solar flares in 2009, they posted their problem online and offered a $30,000 prize to anyone who could solve it. Tapping into knowledge distributed among amateur scientists enables a new approach to on-demand problem solving.

2.2 Wiki

Wiki, for open collaboration, is a strong strategy for tapping into the semi-professional knowledge of amateurs. Altruism (voluntary contribution to society) is the biggest motive for wikivism and open-source participation [3,8,9,10,14,15,17,18]. Some federal agencies have operated internal wikis where public employees can effectively share ideas, knowledge, and experiences. The wiki has become an online ideagora for active collaboration by concerned and informed professionals. Some federal agencies have established an internal virtual community to enable employees to share lessons, best practices, and subject matter expertise. Through these wikis, similar good ideas around a common topic continuously evolve toward the best idea. An agency benefits from innovative ideas that could not arise within the formal organizational structure of a public bureaucracy. The wikis act as a community of practice where people share know-how from experience. Examples are Bureaupedia of the Federal Bureau of Investigation (FBI), Intellipedia of the Central Intelligence Agency (CIA), Diplopedia of the Department of State (DOS), and Techipedia of the Department of Defense (DOD). For instance,
Bureaupedia fills the information gaps created when FBI employees and analysts left or retired from the agency, taking their tacit knowledge with them. A wiki platform is also available for engaging citizens with expertise in government processes. The White House Open Government Initiative and GSA created a new public engagement tool, ExpertNet. This wiki enables government officials to pose questions to the public about any topic they are working on and to reach citizens with the greatest expertise or enthusiasm for a topic. Another example is the Our Archives wiki, for the public, researchers, educators, genealogists, and staff of the National Archives and Records Administration (NARA) to share research tips, subject matter expertise, and knowledge about NARA records. This wiki creates an informal environment for cross-boundary information sharing and collaboration between those inside and outside the agency. However, most users of government wikis are people inside the government, and people outside the government have not been actively engaged. The virtual community is not yet a well-developed connection between technical expertise, knowledge embedded in experience, and concerned views from outside the organizational boundary.

2.3 Social Networking

Human relationship is a primary motive for activism on social networking sites. Social networking services, as a new genre of communication, motivate participation chiefly through expectations and desires for making new relationships and solidifying existing ones [6]. Those websites can also serve as a source for sharing and obtaining information, using networks of friendship. Governmental commitment to social networking sites facilitates the acquisition of grassroots information [19] and makes active visitors fans of governmental agencies. Strategic use of social networking services helps a government agency build a social consensus on, and mobilize popular support for, what the agency is doing and plans to do. Most federal agencies use their Facebook and Twitter sites to spread information and hear from citizens. The sites do not only act as top-down media to let more people know better what a government agency currently does; they also serve as social, interactive media that engage citizens in chatting and sometimes discussing the agency's policy issues. San Francisco launched SF311 on Twitter (Twitter.com/SF311) to improve 311 service and decrease service costs. The new Twitter-based platform for the existing citizen services allows residents to access 311 services online. Those who have Twitter accounts can send tweets containing service requests and complaints. After a Twitter request has been made, 311 staff can easily provide follow-up, allowing residents to track the resolution of the problem. However, Twitter is still too new for most people to harness its potential for sending civic input to government. While SF311 deals with non-emergency citizen services, the Twitter service of the Department of the Interior's U.S. Geological Survey (USGS) collects and provides emergency and disaster information (Twitter.com/USGSted). By mining real-time tweets, USGS expands its suite of seismically derived information and obtains firsthand accounts of shaking seconds after an earthquake occurs. The agency automatically gathers, summarizes, and maps earthquake tweets to provide a rapid overview of what people experienced during an earthquake [7]. However, from the side of the
agency, it is difficult to manage the volume of information about personal feelings toward an event, rather than about facts as the event occurs, though such information could be indirectly helpful for further analysis.

2.4 Social Voting

Social voting is another new strategy to collect the wisdom of citizens. It is a mechanism for interactive, communicative, and participatory evaluation (and also collection) of shared ideas. Participants in social voting post their own ideas, make comments on others' ideas, and rate them. Political efficacy is a powerful motivator to encourage their participation. They expect and believe that their voices will contribute to society, government, and ultimately their own lives. Social voting overcomes drawbacks inherent in the traditional voting mechanism. An unlimited number of ideas can be evaluated without temporal and spatial constraints, though that feature is a weakness as well, given that inputs can exceed the administrative capacity to deal with them. Social voting can start without a given agenda, and thus a priority agenda for discussion can also be chosen by votes. Government and participants can learn from the reasons behind ratings as well as from the results. Many platforms for social voting are currently available as freeware (e.g., IdeaScale, IdeaStorm, and UserVoice), and some federal agencies and municipal governments are now adopting them. There are various examples of active citizen engagement in social voting: e.g., Seattle's Ideas for Seattle, Santa Cruz's City Budget, and Austin's Open Austin. Ideas for Seattle (IdeasForSeattle.org) is full of hot debates about a variety of metropolitan issues (e.g., expanding light rail, installing sidewalks, and revitalizing a public park). Seattle residents share their own ideas about raised topics with others, evaluate posted ideas, and make comments on them. The city government learns from what citizens present in the website and reflects what it learned from social voting in policy. The direct democracy experiment driven by the White House also merits attention. The Obama administration in its very early days launched online engagement tools on its transition website, Change.gov. Their key function was to allow citizens to set priorities for national policy. Individual citizens were able to comment on and rate the ideas of others so that the best ideas rose to the top priority. The website also provided American citizens across the nation with a direct line to the administration to ask what they wanted to know about governmental efforts to get the economy back on track. It made a new type of town hall meeting possible. The President and policymakers got a better sense of what was on people's minds. The initiative for direct democracy and digital democracy proved the feasibility of policy experiments even on a nationwide scale and with national-level issues. The democracy experiment, however, showed drawbacks as well as new opportunities for digital democracy. Since the online forum lacked appropriate moderation, a sheer number of comments did not fit the topic under discussion, and participants often discussed the platform itself or talked about their personal stories. Optional (not required) anonymity increased the possibility of inappropriate comments and insults. Early submission bias meant that an idea with an early lead in social voting held the top spot throughout the process [2]. While a good idea submitted later was disadvantaged from its very beginning, ideas submitted early usually attracted more votes.
3 Further Discussions and Conclusions

Some government agencies now benefit from collecting the wisdom of crowds, but the processes are not without problems and weaknesses. The preceding section identified challenges to harnessing Government 2.0 to bring citizens into government processes. While current practices raise a variety of concerns, the more central issues are about citizen engagement rather than technical features. Government officials need to consider the following four points.

First, the éclat of crowdsourcing projects in the business sector does not guarantee the good performance of governmental crowdsourcing initiatives. The issues demanding mass participation and collaboration differ in nature between governmental crowdsourcing and business crowdsourcing. In particular, social and political issues distinguish government from businesses. Howe [13] argued that a community of like-minded peers creates better products than a corporate behemoth. However, that mechanism works well for the business sector. The collective wisdom from collaboration of like-minded peers is not likely to exist in politically sensitive discussions. Substantial discrepancies across citizens' policy preferences may result in time-consuming debates without a fruitful conclusion. Innovative ideas in business crowdsourcing may come from anybody. However, Howe's [13] statement that crowdsourcing promises a perfect meritocracy, in which demographics and occupations no longer matter, is not realistic but rather idealistic and rhetorical in the public sector. Crowdsourcing in policy-making may limit participation to only a small number of people who possess the typical demographic and occupational characteristics of social elites [11].

Second, mass participation involves a tradeoff. Current government practices may encounter two unpleasant situations related to participation inputs. Given only a handful of active participants in governmental crowdsourcing, the process cannot be considered democratic. On the contrary, some government agencies receive more comments than their administrative capacity can filter and classify for useful ideas. Hence crowdsourcing projects risk ending up as burdensome work for government staff. Government employees might think of crowdsourcing tools as low-cost investments with promising returns. However, government crowdsourcing projects will fail if a naïve optimism ("once set up, it just works well") prevails. Self-organizing crowdsourcing of political ideas is prone to digression. Participants often stray from a given issue. Often non-moderated activism fills the idea-sharing space with unhelpful opinions.

Third, so far governments have not actively bought the wisdom they have collected from citizens. In this sense, crowdsourcing through Government 2.0 could be no more than hype. Since current projects still seem very experimental, it is doubtful that the results of crowdsourcing are being given central importance. Participants in crowdsourcing expect governments to reflect their ideas in some way. In that regard, we need to step back and ask about the real performance of government crowdsourcing: Do government agencies buy the wisdom of crowds into their policies or decisions? If decision-makers do not adopt the wisdom of crowds, employing the various tools of Government 2.0 will look like rhetoric to citizen participants.
Finally, many government agencies start crowdsourcing projects via Government 2.0 but lack clear purposes for citizen engagement. There are different levels of
purposes for which governments adopt Government 2.0. At a high level of achievement, Government 2.0 could be an effective vehicle for deliberative democracy. Democracy via Government 2.0 is desirable for engaging people in public deliberation about policy, not for replacing referendums, votes, and opinion polls. It better connects government with citizens, and citizens with one another, in online spaces. Current utilization of Web 2.0, however, still inclines toward head-counting rather than learning from qualitative comments. Social voting enabling public deliberation and discussion should be distinguished from a general voting mechanism, in which a majority always simply wins over a minority. We should ask whether government agencies use high-level technologies with great potential for only low-level purposes. In conclusion, a government agency should clarify the problem it hopes to solve before launching a new project of gathering wisdom from crowds. Different public problems would require different strategies. The currently available strategies (contest, wiki, social networking, and social voting) need to be adapted to the various contexts and circumstances surrounding government agencies. For some government agencies, the wisdom of crowds is already an actual outcome of engaging citizens in governmental functions and processes. For many other agencies that are considering adopting and further developing Web-based tools of peer collaboration, this paper urges caution and careful consideration in collecting the wisdom of crowds. The bottom line is clear. A poorly prepared government would fail to actualize the ideal of networked collective intelligence and harness the collaborative potential of Web 2.0, facing unhelpful voices from unwise mobs or the apathy of citizens with little interest in participation. However, with proper design and management of technological tools for citizen engagement, Government 2.0 would offer public agencies greater hopes and feasibilities than fears and challenges.
References

1. Batorski, M., Hadden, D.: Embracing government 2.0: Leading transformative change in the public sector. Grant Thornton International Ltd., Alexandria (2010), http://www.freebalance.com/whitepapers/FreeBalance_Gov20_WP.pdf
2. Bittle, S., Haller, C., Kadlec, A.: Promising Practices in Online Engagement. Center for Advances in Public Engagement, New York (2009), http://publicagenda.org/files/pdf/PA_CAPE_Paper3_Promising_Mech2.pdf
3. Bonaccorsi, A., Rossi, C.: Altruistic individuals, selfish firms? The structure of motivation in Open Source software. First Monday 9(1) (2004), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1113
4. Brabham, D.C.: Crowdsourced advertising: How we outperform Madison Avenue. Flow: A Critical Forum on Television and Media Culture 9(10) (2009), http://flowtv.org/?p=3221
5. Brabham, D.C.: Crowdsourcing the public participation process for planning projects. Planning Theory 8(3), 242–262 (2009)
6. Burke, M., Marlow, C., Lento, T.: Feed me: Motivating newcomer contribution in social networking sites. In: CHI 2009, Boston (April 7, 2009)
7. Chavez, C., Repas, M.A., Stefaniak, T.L.: Local Government Use of Social Media to Prepare for Emergencies. International City/County Management Association (ICMA), Washington, DC (2010), http://icma.org/en/icma/knowledge_network/documents/kn/document/301647/local_government_use_of_social_media_to_prepare_for_emegencies
8. Ghosh, R.A.: FM interview with Linus Torvalds: What motivates free software developers? First Monday 3(3) (1998), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/583/504
9. Hars, A., Ou, S.: Working for free? Motivations for participating in open source projects. International Journal of Electronic Commerce 6(3), 25–39 (2002)
10. Hertel, G., Niedner, S., Hermann, S.: Motivation of software developers in the open source projects: An Internet-based survey of contributors to the Linux kernel. Research Policy 32(7), 1159–1177 (2003)
11. Hindman, M.: "Open-source politics" reconsidered: Emerging patterns in online political participation. In: Mayer-Schönberger, V., Lazer, D. (eds.) Governance and Information Technology: From Electronic Government to Information Government, pp. 183–207. MIT Press, Cambridge (2007)
12. Howe, J.: The rise of crowdsourcing. Wired 14(6), 176–183 (2006)
13. Howe, J.: Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business. Random House, New York (2009)
14. Moore, T.D., Serva, M.A.: Understanding member motivation for contributing to different types of virtual communities: A proposed framework. In: The SIGMIS-CPR, April 19-21, St. Louis, Missouri (2007)
15. Nov, O.: What motivates Wikipedians? Communications of the ACM 50(11), 60–64 (2007)
16. Noveck, B.S.: Wiki Government: How Technology Can Make Government Better, Democracy Stronger, and Citizens More Powerful. Brookings Institution Press, Washington, DC (2009)
17. Peddibhotla, N.B., Subramani, M.R.: Contributing to public document repositories: A critical mass theory perspective. Organization Studies 28(3), 327–346 (2007)
18. Rafaeli, S., Ariel, Y.: Online motivational factors: Incentives for participation and contribution in Wikipedia. In: Barak, A. (ed.) Psychological Aspects of Cyberspace: Theory, Research, Applications. Cambridge University Press, New York (2008)
19. Ramos, M., Piper, P.S.: Letting the grass grow: Grassroots information on blogs and wikis. Reference Services Review 34(4), 570–574 (2006)
Web Searching for Health: Theoretical Foundations for Analyzing Problematic Search Engine Use Pallavi Rao and Marko M. Skoric Wee Kim Wee School of Communication & Information, Nanyang Technological University
[email protected],
[email protected]
Abstract. Increasingly, consumers are searching online for health information. This rise in Web searching for health calls for a theoretical approach that explains the problems associated with consumers’ use of search engines for health information retrieval. In this context, this paper provides an exploratory framework for understanding problematic search engine use in the context of online health information retrieval. It extends Caplan’s (2005) theoretical framework of problematic Internet use by integrating users’ cognitive shift in the search process. The framework highlights the cognitive, behavioural and affective symptoms leading to negative outcomes of improper search engine use. Finally, the paper discusses implications of adopting the framework for understanding consumers’ search behaviour in health information retrieval. Keywords: Web Search, Online Health Information Retrieval, Cognitive Shift, Problematic Internet Use, Problematic Search Engine Use.
1 Introduction

Currently, searching for health information constitutes an important use of the Web. According to the recent Pew Internet & American Life study conducted in 2011, healthcare ranks high among Web searches (Freudenheim, 2011). According to the survey, four in five Internet users search the Web for health information. Freedom in accessing unlimited resources, quick retrieval of information and low cost are some of the crucial factors in the diffusion of online health information. In spite of the many advantages of online health information, there are also some disadvantages. Studies are being conducted on various problems in interacting with online health information that could pose risks to consumers. For instance, the quality of online health information is a big concern, and a number of studies have indicated that much of the health information available online is, to varying degrees, incomplete, inaccurate, oversimplified, and/or misleading (Eysenbach et al., 2002; Gualtieri, 2009). Information overload is another problem in online health, and studies have been conducted on the effect of information overload on people while searching for health information (Kim, Lustria & Burke, 2007). Apart from the quality of health information, there is a growing recognition among healthcare researchers of the role of literacy in individuals' health outcomes (Zarcadoolas, Pleasant & Greer, 2006). Online health literacy skills include locating and evaluating information for credibility and
quality, and analyzing relative risks and benefits of treatment options (Gualtieri, 2009). Benigeri and Pluye (2003) showed that exposing people with no medical training to complex medical terminology may put them at risk of harm from wrong self-diagnosis and self-treatment. According to a report by the London School of Economics, eight in ten Australians head online for health information, with 47 percent using it for self-diagnosis (Singh, 2011). White and Horvitz (2008) conducted a log-based study of how people search online for health information. Their study focused on the content retrieved from Web searches, and the results showed that search engines have the potential to escalate medical concerns. They called this kind of unfounded escalation of concerns about common symptoms, based on search results, "Cyberchondria". Often this leads people to assume the worst possible outcome, and in turn they might adopt risky health behaviours such as wrong self-diagnosis. The main challenge that researchers face is the increasing complexity involved in the process of interaction. Hence, it is crucial to understand consumers' interaction with online health information. Human Computer Interaction (HCI) deals with modelling the interaction between users and systems. Information searching is associated with this interaction process. HCI aspects of information searching have become more important as Web search engine use continues to grow (Spink & Jansen, 2004). Web search studies examine how people search the Web, including the cognitive processes involved in Web search activities (Du, 2009). In this context, information researchers have begun to identify and model cognitive shifts taking place during the Web search process (e.g. Spink & Dee, 2007). Cognitive shift, as defined in cognitive science, is the shift in cognitive focus as triggered by the brain's response and change due to some external force (Jacobs, 2002). Cognitive shifting studies are an important aspect of human computer interaction and cognitive science research. Interactive information retrieval (IR) involves many human cognitive shifts at different information behaviour levels (Spink & Dee, 2007). Traditionally, IR studies have been system oriented, focusing on the performance of the algorithms matching queries with relevant documents. However, interest in users' search strategies has increased in the IR community over the years (Kaki & Aula, 2008). For over a decade, there has been growing concern over the relationship between users' Internet use and reduced psychosocial health (e.g. depression, anxiety and loneliness). Davis (2001) introduced a cognitive-behavioural theory of Problematic Internet Use (PIU) and conceptualized PIU as a distinct pattern of Internet-related cognitions and behaviours that result in negative life outcomes. PIU can be applied to the online health information retrieval context, where search engine use is continuously growing for retrieving health information. As research in online health has identified different problems and their consequences for users, it is crucial to understand search-related cognitions and behaviours that could result in negative life outcomes. Prior research in e-health, HCI or IR has not studied problematic search engine use with respect to online health information retrieval. Hence, the objective of this paper is to build an exploratory framework to analyze problematic search engine use with respect to online health information retrieval.
2 Theoretical Foundations for Problematic Search Engine Use

2.1 Problematic Internet Use

Problematic Internet Use (PIU) is defined as “use of the Internet that creates psychological, social, school and/or work difficulties in a person’s life” (Beard & Wolf, 2001). Davis (2001) employed a cognitive-behavioural approach in analyzing PIU, conceptualizing it as a distinct pattern of Internet-related cognitions and behaviours that result in negative life outcomes. The cognitive-behavioural model of PIU proposes that the presence of maladaptive cognitions is critical to the development of PIU behaviours; moreover, over time, PIU cognitions and behaviours intensify and continue to produce negative outcomes, resulting in a diminished sense of self-worth and increased social withdrawal (Davis, 2001). Based on Davis’ (2001) cognitive-behavioural model of PIU, Caplan (2002) developed a theory-based measurement instrument of PIU by operationalizing the cognitive-behavioural symptoms and negative outcomes of PIU. He termed the maladaptive cognitive symptom of PIU “a preference for online social interaction” and defined it as a cognitive individual-difference construct characterized by beliefs that one is better off with online interpersonal interactions and relationships than with traditional face-to-face (FtF) social activities. The three behavioural symptoms of PIU were: (i) mood alteration (the extent to which people utilize the Internet when feeling socially isolated or down), (ii) compulsive Internet use (the inability to control, reduce or stop online behaviour, along with feelings of guilt about time spent online), and (iii) excessive Internet use (the degree to which an individual feels that he or she spends an excessive amount of time online or even loses track of time when using the Internet). Negative outcomes of Internet use were defined as personal, social and professional problems resulting from one’s Internet use (Caplan, 2002, 2003). Preliminary findings of Caplan’s (2002) study showed that the cognitive PIU symptoms were a strong predictor of the PIU behavioural symptoms, particularly mood alteration and compulsive Internet use. Furthermore, among all the PIU behavioural symptoms, compulsive Internet use emerged as the strongest behavioural predictor of negative outcomes stemming from Internet use. Later, Caplan (2005) tested a theoretical model of PIU postulating that people’s preference for online social interaction leads to overdependence on the Internet, consequently leading to compulsive Internet use, which in turn is likely to culminate in negative personal, social and professional consequences of Internet use. According to Caplan (2003), a preference for online social interaction may develop from one’s perceptions that computer-mediated communication is relatively easier (i.e., requiring less interpersonal sophistication), less risky (e.g., greater anonymity) and more exciting (e.g., more spontaneous, intense, and exaggerated) than FtF communication. Individuals with deficient social skills may also develop a heightened preference for online social interaction because they perceive online interaction to be less face threatening, and hence perceive themselves to be more socially efficacious when interacting with others online. A recent study conducted in Singapore (Ng, 2011) reported that depression, anxiety and stress are the top health issues discussed online in Singapore.
The anonymity of the Internet allows people to ask questions freely without fear of being judged; this suggests that people may prefer searching online for health information partly because they can remain anonymous.
PIU has been applied to various contexts. Researchers have analyzed PIU in the workplace (Davis, Flett & Besser, 2002), in education (Lin & Tsai, 2002) and in healthcare (Aboujaoude et al., 2006). Neo and Skoric (2009) applied the PIU framework to problematic instant messaging use. This paper adapts the PIU framework to specifically examine problematic search engine use in the context of health information retrieval.

2.2 Cognitive Shift

Spink (2002) proposed that cognitive shifts could be used to measure an IR system’s performance. She developed a Web search tool to explore a user-centred approach to the evaluation of Web search engines and reported that, using this tool, users experienced some level of shift or change in their information problem, information seeking, and personal knowledge due to their search interaction, with different study participants reporting different levels of cognitive shift. While searching the Web for health information, users expect search engines to provide relevant and useful results in response to some user input, typically a query. Web search engines use dozens of factors to score relevance and rank the retrieved results, and the user typically has no idea which factors lead to a particular result being retrieved and ranked. People have come to regard search engines as question-answering machines. In this context, White and Horvitz (2008) report that a significant proportion of users believe that the higher a result appears in the listing, the more credible it is, and that a majority of people misinterpret the ranking of search results as a list of likely ailments, in order of probable diagnosis. They note that such use of Web search for diagnostic inference is natural for people, yet is not typically considered in the design and optimization of general-purpose ranking algorithms. According to Spink and Dee (2007), during the Web searching process users experience various cognitive, emotional and physical reactions when they identify a gap in knowledge that needs to be filled with the information they are searching for. Studies have shown that interactive Web searching involves many human cognitive shifts at different levels of information behaviour (Du, 2009; Spink & Dee, 2007). Identifying the types of cognitive shifts may be meaningful in understanding the outcomes of users’ Web searching for health. Given the plethora of health information available, users are likely to experience cognitive shifts during Web searching that change their original problem; this change could be positive or negative depending on the type of cognitive shift experienced. Uncertainty is a “cognitive state that commonly causes affective symptoms of anxiety and lack of confidence” (Kuhlthau, 1993). Uncertainty is one of the cognitive shifts examined in IR research (Spink & Dee, 2007) and is considered an important concept in Web search studies. Researchers (e.g. Kuhlthau, 1993; Wilson et al., 1999) point out that uncertainty decreases as the searcher proceeds towards the completion of the search process. However, this may not always hold when the user searches for health information. With the proliferation of search tools and information sources, uncertainty continues to be a significant factor in the search process. Users may feel uncertain at any stage of the Web searching process, and uncertainty may remain even after the process is completed (Chowdhury et al., 2011). Hence, in addition to
the uncertainty that triggers the information search process (as proposed by Wilson et al., 1999), users are likely to suffer from uncertainty at every stage of the health information search. Uncertainty may result in negative feelings (affective symptoms) such as frustration, anxiety and lack of confidence (Chowdhury et al., 2011). Chowdhury et al. (2011) studied uncertainty issues in relation to the various stages of the information search and retrieval process. Their research showed that varying degrees of uncertainty exist among users in the context of various information seeking activities (e.g. choosing an appropriate source, formulating a search expression), information seeking problems (e.g. information overload, lack of information and communication skills, too many irrelevant results) and in relation to specific information channels and sources (e.g. different types of sources). Applying their findings to this paper, the problems identified in interacting with online health information (e.g. information overload, misinformation) could be the factors causing uncertainty among users. In this context, health information efficacy could be an important concept to study. The concept builds on existing research on self-efficacy, which refers to the degree of confidence individuals have in their ability to perform health behaviours, and which positively predicts the adoption of preventive behaviour (Bandura, 2002). Health information efficacy refers to the intrinsic consumer belief in his or her ability to search for and process health information (Dutta & Bodie, 2008). Hence, people with high health information efficacy are able to process information that may contain many uncertainties.

2.3 Research Framework

This paper extends Caplan’s psychosocial model of PIU to analyze problematic search engine use in the context of health information retrieval. As discussed above, cognitive shift (uncertainty), a significant factor in the search process, is added to the original model, along with the affective symptoms (negative feelings such as anxiety, frustration and lack of confidence) that it causes.
[Fig. 1 is a diagram whose components are: Anonymity; Preference for Web searching for health information; Problems in information seeking; Cognitive shift (uncertainty) during/after the Web search; Negative feelings (anxiety, frustration, lack of confidence); Health information efficacy; Compulsive search engine use; and Negative outcomes of search engine use.]

Fig. 1. Framework for Analyzing Problematic Search Engine Use in online Health IR
3 Implications

Users’ Web search behaviour continues to evolve with changing information conditions, and different people may search in different ways. Hence, there is a continuing need to understand users’ Web search behaviour within the broad framework of social science theories and models. The integration of cognitive shift into Caplan’s (2005) model provides a comprehensive model for analyzing problematic search engine use. These efforts will enhance knowledge of the search process and give designers the opportunity to incorporate human factors into IR system design. The main strength of this paper is its theoretical account of problematic search engine use in the process of health IR. One of the goals of IR research is to understand information retrieval theoretically in the form of models and theories; the field has by and large two communities, a computer-science-oriented experimental approach and a user-oriented information science approach with a social science background (Vakkari & Jarvelin, 2005). A combination of both is required for the growth of knowledge. Developing frameworks for understanding the interaction process is crucial in the healthcare context: if a system is familiar with users’ cognitive, behavioural and affective patterns, it may more easily adapt to and personalize users’ interactive processes. The framework characterizes problematic search engine use and its possible negative health outcomes. Search engine architects have a responsibility to ensure that searchers do not experience uncertainty generated by the ranking algorithms their designs use. Once specific causes of negative uncertainty are identified, efforts should be made to reduce it through improved search designs.
4 Future Avenues

The current paper is an initial step towards developing a new theory of problematic search engine use in health IR, and is part of ongoing research on the topic. Future studies will validate and test the framework. Although the current framework captures the cognitive, behavioural and affective aspects of problematic search engine use, it also raises an important question for future research: are the negative outcomes different for healthy versus ill individuals? A possible interpretation is that people who choose to search Internet health resources may be especially prone to hypochondriasis or excessive worry about minor health symptoms (Bessiere et al., 2010). The association between depression and seeking online health information is evidence of this. Search results might be compelling for such persons, as they present lists of symptoms, narratives of pain and grief, treatments and medicines, and even photos of diseased organs. Reading these may cause this group to imagine being ill and to inflate their perceptions of risk. Consistent with this argument is evidence suggesting that psychosocially distressed individuals have a stronger preference for online social interaction than non-distressed individuals (Caplan, 2003) and that people with high levels of health anxiety or hypochondriasis use health resources significantly more than their non-anxious counterparts (Bessiere et al., 2010). This shows the importance of including users’ health anxiety level when testing for negative outcomes.
Regarding uncertainty, some researchers (Anderson, 2006; Case, 2007) have identified its positive effects, arguing that increased uncertainty may motivate users to spend more time on information seeking or to explore alternative avenues. Hence, it is important to study the level of uncertainty at various stages of the search process and how it influences users. Although search engines are an important source of health information, there are other social media sources (e.g. health discussion forums, blogs, wikis) where people find health information. A recent study by IBM (“The future of connected health devices,” 2011) found that consumers want the ability to collaborate online with peers who have similar health issues and interests. This calls for further study of online health information searching in collaborative environments.
References

1. Aboujaoude, E., Koran, L.M., Gamel, N., Large, M.D., Serpe, R.T.: Potential Markers for Problematic Internet Use: A Telephone Survey of 2,513 Adults. CNS Spectrums (2006)
2. Anderson, T.D.: Uncertainty in action: Observing information seeking within the creative processes of scholarly research. Information Research 12(1) (2006)
3. Anonymous: The future of connected health devices: Liberating the Information Seeker (2011), http://www-935.ibm.com/services/us/gbs/thoughtleadership/ibv-connected-health-devices.html (accessed July 10, 2011)
4. Bandura, A.: Social cognitive theory of mass communication. In: Bryant, J., Zillman, D. (eds.) Media Effects: Advances in Theory and Research, pp. 121–154. Lawrence Erlbaum Associates, Hillsdale (2002)
5. Beard, K.W., Wolf, E.M.: Modification in the proposed diagnostic criteria for Internet addiction. CyberPsychology & Behavior 4, 377–383 (2001)
6. Benigeri, M., Pluye, P.: Shortcomings of health information on the Internet. Health Promotion International 18(4), 381–387 (2003)
7. Bessiere, K., Pressman, S., Kiesler, S., Kraut, R.: Effects of Internet Use on Health and Depression: A Longitudinal Study. Journal of Medical Internet Research 12(1) (2010)
8. Caplan, S.E.: Problematic Internet use and psychosocial well-being: Development of a theory-based cognitive-behavioral measurement instrument. Computers in Human Behavior 18, 553–575 (2002)
9. Caplan, S.E.: Preference for online social interaction: A theory of problematic Internet use and psychosocial well-being. Communication Research 30, 625–648 (2003)
10. Caplan, S.E.: A social skill account of problematic Internet use. Journal of Communication 55, 721–736 (2005)
11. Case, D.O.: Looking for information: A survey of research on information seeking, needs, and behaviour. Elsevier, Amsterdam (2007)
12. Chowdhury, S., Gibb, S., Landoni, M.: Uncertainty in information seeking and retrieval: A study in an academic environment. Information Processing and Management 47, 157–175 (2011)
13. Davis, R.A.: A cognitive-behavioral model of pathological Internet use. Computers in Human Behavior 17, 187–195 (2001)
14. Davis, R.A., Flett, G.L., Besser, A.: Validation of a new scale for measuring problematic Internet use: Implications for pre-employment screening. CyberPsychology and Behavior 5, 331–345 (2002)
15. Du, J.T.: Multitasking, Cognitive Coordination and Cognitive Shifts During Web Searching. Queensland University of Technology (2009)
16. Dutta, M.J., Bodie, G.D.: Web Searching for Health: Theoretical Foundations and Connections to Health Related Outcomes. In: Spink, A., Zimmer, M. (eds.) Web Search, Information Science and Knowledge Management. Springer, Heidelberg (2008)
17. Eysenbach, G., Powell, J., Kuss, O., Sa, E.-R.: Empirical studies assessing the quality of health information for consumers on the World Wide Web: a systematic review. Journal of the American Medical Association 287(20), 2691–2700 (2002)
18. Freudenheim, M.: Health care is high among Web searches (2011), http://www.pewinternet.org/Media-Mentions/2011/NYT-HealthCare-Is-High-Among-Web-Searches.aspx (accessed June 2, 2011)
19. Gualtieri, L.N.: The Doctor as the Second Opinion and the Internet as the First. Paper presented at CHI 2009, Boston, MA, USA (2009)
20. Jacobs, D.: Cognitive Strategies: Applied Psychology Today. Kendall Hunt Publishers, Dubuque (2002)
21. Kaki, M., Aula, A.: Controlling the complexity in comparing search user interfaces via user studies. Information Processing and Management 44, 82–91 (2008)
22. Kim, K., Lustria, M.L., Burke, D.: Predictors of cancer information overload: findings from a national survey. Information Research 12(4) (2007)
23. Kuhlthau, C.C.: A principle of uncertainty for information seeking. Journal of Documentation 49(4), 339–355 (1993)
24. Lin, S.S., Tsai, C.C.: Sensation seeking and internet dependence of Taiwanese high school adolescents. Computers in Human Behavior 18, 411–426 (2002)
25. Neo, R., Skoric, M.M.: Problematic Instant Messaging Use. Journal of Computer-Mediated Communication 14, 627–657 (2009)
26. Ng, G.: Netizens’ top concern: Mental woes. In: My Paper, Singapore (2011)
27. Singh, S.: The cyberchondriacs; WELL-BEING. Sydney Morning Herald (Australia) (June 25, 2011)
28. Spink, A.: A user-centered approach to evaluating human interaction with Web search engines: an exploratory study. Information Processing and Management 38, 401–426 (2002)
29. Spink, A., Dee, C.: Cognitive shifts related to interactive information retrieval. Online Information Review 31(6), 845–860 (2007)
30. Spink, A., Jansen, B.: How People Search the Web. In: Web Search: Public Searching of the Web. Kluwer Academic Publishers, Dordrecht (2004)
31. Vakkari, P., Jarvelin, K.: Explanation in Information seeking and retrieval. In: Spink, A., Cole, C. (eds.) New Directions in Cognitive Information Retrieval, pp. 113–138. Springer, Heidelberg (2005)
32. White, R., Horvitz, E.: Cyberchondria: Studies of the Escalation of Medical Concerns in Web Search (2008), ftp://ftp.research.microsoft.com/pub/tr/TR-2008-178.pdf (accessed November 10, 2010)
33. White, R., Horvitz, E.: Experiences with Web Search on Medical Concerns and Self Diagnosis. In: Annual Symposium of the American Medical Informatics Association (2009)
34. Wilson, T.D., Ellis, D., Ford, N., Foster, A.: Uncertainty in information seeking (1999), http://informationr.net/tdw/publ/unis/app3.html#upro (accessed May 2, 2011)
35. Zarcadoolas, C., Pleasant, A., Greer, D.: Advancing health literacy: A framework for understanding and action. Jossey-Bass, San Francisco (2006)
The Role of Trust and ICT Proficiency in Structuring the Cross-Boundary Digital Government Research

Djoko Sigit Sayogo¹, Taewoo Nam¹, and Jing Zhang²

¹ Center for Technology in Government, University at Albany-SUNY, Albany, New York
² Clark University, Worcester, MA
{dsayogo,tnam}@ctg.albany.edu, [email protected]
Abstract. This paper aims to ascertain the role of trust and communication in structuring the formation of digital government research collaboration. The data show that trust plays a prominent role in structuring collaboration, manifested in three instances of interpersonal linkage: network closure, reputation, and similarity of country of origin. This study also found that multicultural collaboration requires communication media, including online tools, that afford richer interpretation and discussion. These results suggest that venturing on multi-cultural or cross-boundary collaboration requires a well-thought-out and carefully planned approach, with closeness, interaction, and trust emerging as the major considerations.
1 Introduction

A study by Ulbrich et al. (2009) pointed to three complementary main components of collaborative capability: trust, communication, and commitment [19]. This triad is an essential determinant of the formation of collaboration [19]: communication ability works mutually, complementarily, and perhaps interactively with trust and commitment to ensure the successful development of collaborative activity. In addition, information technology has been found to be a significant determinant in cross-boundary research collaboration, accelerating research team formation [3] and reducing the cost of collaboration [1]. The majority of studies investigating scientific collaboration patterns infer them from the co-authorship structure of published articles [1, 3]. Co-authorship in published journals might indicate the outcomes of collaboration, but it does not capture the process of developing and creating collaboration and, in particular, lacks the explanatory power to explain the role of trust, communication, and commitment in developing collaboration. In addition, Rethemeyer [16] argues strongly that case-by-variables analysis is inadequate as a basis for studying networks and network relationships, and emphasizes measurement based on dyadic relationships [16]. This paper contributes insight into the role of trust and communication technology in structuring the formation of scientific collaboration based on dyadic relationship measurement. It addresses the following research questions: 1) How do collaborations emerge within a trans-national network? 2) How do trust and online collaborative tools affect the creation of digital government research collaborations? The data were derived using a sociometric questionnaire administered at two time points to members of the
North American Digital Government Working Group (NADGWG), a working group of digital government researchers from Canada, Mexico, and the United States.
2 Literature Review and Hypothesis Development

2.1 Multi-cultural Research Team Structure and the Role of Communication Medium

Comparative research has become a major theme in digital government research [18]. The significant internationalization and cross-national character of digital government research point to the challenge of forming multinational research teams for cross-boundary digital government research [4, 9]. Deciding on the composition of the research team, the countries represented, and the mechanism of collaboration becomes a significant issue in overcoming cultural bias and distortion [4]. Forming a team in multi-cultural research collaboration is challenged by the issue of “contextual distance”, the disparities of contexts that distance the participating members [2]. Overcoming contextual distance necessitates the promotion of affective-based trust built from dense and close networks [6] and close interpersonal linkages [14], and through the choice of communication media. Different communication media have different impacts on collaboration [5]. For instance, face-to-face meetings provide a more effective platform for conveying intention, leading to greater levels of trust and cooperation [5], while online collaboration enables greater reach and larger access to new information, expertise, and ideas that are often not available locally [21]. Thus, we hypothesize:

H1: To promote personal trust, researchers engage in network closure and short path distances in creating ties.
H2: Face-to-face meetings significantly determine the creation of ties in scientific collaboration over time.
H3: Online collaborative media positively determine the creation of ties.

2.2 Trust and Its Role in Collaboration

This study focuses on the interpersonal level of trust among the members of a transnational digital government team. Interpersonal trust can be categorized into affective-based and cognitive-based trust [8]. Affective- and cognitive-based trust play interchangeable roles in the formation of relationships within research collaboration, and their intensity and significance differ along the stages of collaboration [15]. Affective-based trust refers to the emotional ties linking individuals who participate in the relationship [8, 11]. Hence, affective-based trust is significant in the context of close social relationships [7] and is developed through frequent interactions among actors [8]. Cognitive-based trust, on the other hand, is based on calculative and rational motives [7]. It can be regarded as trust formed through an individual's cognitive assessment of the conditions and evidence of the trustworthiness of others [8, 11],
developed through acquiring, processing, and storing information [7]. As a result, a lack of interaction among collaborators in the initial stage of collaboration forces them to base their assessments on backgrounds, reputation, professional credentials, and affiliations [7, 15, 17]. Collaborators use reputation as the basis for evaluating a partner's trustworthiness when making decisions on partner selection. The significance of reputation in collaborative work diminishes, however, once the actors have more opportunities to interact after their first meeting. The emotional ties that underlie the creation of affective-based trust can also emerge in the form of “cultural homophily”, a similarity in culture [10]. Similarity in culture increases the likelihood of interaction [10], while contextual differences might create distance [2]. Arguably, in the initial stage of collaboration, researchers create ties and form collaborative work with others from a similar country in an effort to reduce the risk of cultural distance. This effect should diminish in subsequent stages, when researchers have had the opportunity to interact and get to know each other. We hypothesize:

H4: Reputation will significantly determine the creation of ties in the initial stages of collaboration.
H5: Cultural homophily will significantly determine the creation of ties in the initial stage of collaboration.
3 Research Design and Methodology

The sample is based on two waves of longitudinal complete network data on scientific collaboration within the North American Digital Government Working Group (NADGWG). NADGWG is a working group of researchers and practitioners from a variety of institutions and disciplines in Canada, the United States, and Mexico, formed to advance digital government research and practice across geographic and political boundaries in the region. Data were collected using sociometric questionnaires in two waves, the first in May 2007 and the second in November 2008. The dependent variable is the dyadic collaborative relationship, measuring whether two actors ever wrote a journal article, book chapter, conference paper, research grant proposal, or professional report together. There are five independent variables of interest: 1) network structure, referring to micro and macro structure such as reciprocity; 2) proficiency in using virtual collaborative tools, measured on a 5-point Likert scale across five sub-variables (e-mail, Microsoft SharePoint, wikis, blogs, and chats/forums); 3) face-to-face meeting, a dichotomous measure indicating whether the actors had ever met face-to-face before; 4) reputation, a dichotomous measure indicating whether the actor had ever heard of the person listed in the network before; and 5) similarity of country of origin, measured as a nominal value. The two kinds of analysis used in this research are graph-theoretic measures and the exponential random graph model (ERGM), both drawn from social network analysis; a sketch of how such network measures can be computed follows.
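As an illustration of the graph-theoretic side of the analysis, the following sketch builds a directed collaboration network from hypothetical dyadic questionnaire responses and computes some of the measures reported below (Table 1) with Python's networkx library. The original analysis used UCINET and specialized ERGM software; the data, actor labels, and the Freeman-style centralization formula here are illustrative assumptions, not the paper's actual computation.

    import networkx as nx

    # Hypothetical dyadic responses: (i, j) means actor i reported a
    # collaborative tie (co-authored paper, proposal or report) to actor j.
    ties = [(1, 2), (2, 1), (1, 3), (3, 4), (4, 2), (5, 1), (2, 5)]
    G = nx.DiGraph(ties)

    # Density: proportion of possible directed ties that are present.
    density = nx.density(G)

    # Average geodesic distance over reachable ordered pairs of distinct actors.
    dists = [d for src, lengths in nx.all_pairs_shortest_path_length(G)
             for tgt, d in lengths.items() if src != tgt]
    avg_distance = sum(dists) / len(dists)

    # Per-actor closeness centrality (cf. the closeness row of Table 1).
    closeness = nx.closeness_centrality(G)

    # Freeman degree centralization on the undirected version of the network.
    U = G.to_undirected()
    n, deg = U.number_of_nodes(), dict(U.degree())
    centralization = (sum(max(deg.values()) - d for d in deg.values())
                      / ((n - 1) * (n - 2)))

    print(density, avg_distance, centralization)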
4 Results and Discussion

4.1 The Structure of NADGWG Collaboration: Network Closure and “Small World” Network

The graph-theoretic results indicate that the density of the network increases over time by 109% from pre-collaboration to mid-collaboration (Table 1). The average distance, measuring how far actors are from one another, decreases by 22%. Compactness, the distance-based cohesion of actors to all other reachable actors, increases by almost 79%, and the closeness of immediate neighbours in the network is high. The results from this deterministic approach are also supported by the findings from the ERGM: there is a high propensity for tie creation in scientific collaboration driven by reciprocity of relationships and the out-2-star structure. The results also reveal that higher-order structures (out-k-stars, in-k-stars, and k-triangles) are not statistically significant for scientific collaboration formation.
Table 1. Network Measurement

Dimension                   Pre-collaboration   Mid-collaboration
Density                     0.1884              0.3953
Average distance            2.027               1.585
Compactness                 0.370               0.662
Breadth                     0.630               0.338
Number of triplets          12,144              21,924
Degree of centralization    47.83%              54.76%
Closeness centrality        45.384              66.174

Source: UCINET result
The results from both the graph-theoretic approach (Table 1) and the ERGM approach (Table 2) indicate a dense, cohesive network with small path distances and significant lower-order structure. Together, the findings suggest two structures of team formation in the multicultural NADGWG network: a) network closure and b) a small-world structure. Network closure is a condition in which the members of the network create dense and close connections among themselves. A closed and clustered network increases the chance of interaction [6], which can lead to an increase in affective-based trust [14]. Dense and close relations are also beneficial in decreasing search time and cost by shortening the channels of communication [20]. The results also suggest a possible “small-world” network structure, shown by the decrease in average distance along with the significant increase in compactness. In a small-world network, strong connections to adjacent actors are complemented by random connections to non-adjacent actors through short-cuts. These findings support the assertion that knowledge networks are best supported by ‘small-world’ structures, which assist in fostering knowledge generation through shared researchers’ specialization [13]. This interpretation can be checked informally, as sketched below.
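As a rough diagnostic of the small-world interpretation, one can compare the observed clustering and average path length with those of a random graph of the same size, in the spirit of the Watts-Strogatz criterion. The sketch below uses an arbitrary example graph in place of the (undirected) collaboration network; the use of networkx's built-in small-world coefficient is an assumption of this illustration, not part of the original analysis.

    import networkx as nx

    # Example graph standing in for the undirected collaboration network.
    G = nx.connected_watts_strogatz_graph(n=30, k=4, p=0.1, seed=1)

    C = nx.average_clustering(G)
    L = nx.average_shortest_path_length(G)

    # Reference random graph with the same numbers of nodes and edges.
    R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=1)
    if nx.is_connected(R):
        # Small-world signature: clustering much higher than random,
        # path length comparable to random.
        print(C / nx.average_clustering(R), L / nx.average_shortest_path_length(R))

    # Equivalent built-in diagnostic: sigma substantially above 1
    # suggests a small-world structure.
    print(nx.sigma(G, niter=5, nrand=2))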
4.2 Differential Impact of Communication Medium

This study found that different communication media have different impacts on the formation of a multi-cultural digital government research team, with face-to-face meetings a critical factor affecting the formation of research collaboration. The findings from Model 2 (Table 2) show that the variable “ever had a face-to-face meeting before” is statistically significant in both phases of collaboration, with the coefficient magnitude decreasing over time. Considering this constant significance with decreasing magnitude, we argue that this is suggestive evidence of the differential roles and importance of face-to-face meetings. In the initial stage, face-to-face contact facilitates the evaluation of trustworthiness in partner selection through the richer cues and interpretations afforded by physical engagement; in the second stage, it provides assurance and maintains trustworthiness. On the other hand, this research found that only similarity in proficiency in using SharePoint is a positive and significant determinant of team formation over time (Table 2), while similarities in proficiency in using blogs or chat have negative coefficients and are significant only in the pre-collaboration phase. This finding provides suggestive evidence of the impact of the different features of different online collaboration tools on the creation of ties in collaboration. Anecdotally, the lack of interactivity in blogs and the one-on-one basis of interaction in instant messaging limit their ability to provide a rich medium for sharing and discussion. This finding calls for further research examining how the different characteristics and features of online collaborative tools mediate and facilitate tie creation in collaboration.
4.3 The Role of Trust in Structuring Collaboration

This study found that reputation is statistically significant only in the initial stage of collaboration (Table 2). This finding supports the argument that when collaborators lack initial engagement or interaction, they base their trust creation on other references, such as reputation or professional credentials [7, 12, 15, 17]. Trust based on reputation and credentials is regarded as swift trust, which is crucial to the development of collaboration in temporary teams [12], virtual teams [7], and geographically dispersed teams. Rousseau et al. (1998) pointed out that trust is developed through frequent interactions in social relations [17]. Arguably, the swiftness of trust development based on reputation and other references, which supports virtual or geographically dispersed team formation, will not last; indeed, this study found that reputation is no longer significant in the second stage of collaboration. Thus, we argue that once collaborators are able to build frequent interactions and relationships, reputation loses its significance as the basis of trust. This study also found that “cultural homophily” is a significant basis of tie creation in the initial stage of collaboration (Table 2). Following the argument of “contextual distance” in multi-cultural research teams [2], researchers use similarity of country of origin as the initial basis for their decisions on team selection. The magnitude of the coefficient estimate, however, decreases by 68 percent, from 0.82 in pre-collaboration to 0.26 in mid-collaboration. This diminishing magnitude and significance from pre- to mid-collaboration suggests that researchers consider cultural homophily less important after the first meeting. Arguably, in the second stage researchers have the opportunity to interact and get to know others, and this increase in interaction reduces the impact of cultural homophily on tie creation.
Table 2. ERGM Result

Variable                   Model 1, Pre     Model 1, Mid     Model 2, Pre      Model 2, Mid
Reciprocity                4.55* (0.76)     5.38* (0.54)     3.65* (0.80)      5.28* (0.55)
Transitive triplets        0.43* (0.06)     0.18* (0.04)     0.49* (0.09)      0.20* (0.05)
3-cycles                   -0.81* (0.19)    -0.54* (0.10)    -0.88* (0.26)     -0.58* (0.13)
Out-2-stars                0.20* (0.07)     0.15* (0.03)     0.24* (0.09)      0.11* (0.03)
In-2-stars                 0.07 (0.09)      0.12* (0.03)     0.14 (0.11)       0.09* (0.03)
Alternating out-k-stars    -1.26* (0.62)    0.13 (0.79)      -2.03* (0.77)     -0.08 (0.79)
Alternating in-k-stars     -0.08 (0.66)     0.53 (0.82)      -1.54* (0.81)     0.27 (0.82)
Alternating k-triangles    0.37 (0.31)      0.10 (0.42)      0.27 (0.41)       0.16 (0.43)
Email proficiency                                            -0.28 (0.44)      0.08 (0.15)
SharePoint proficiency                                       0.83* (0.48)      0.33** (0.25)
Wiki proficiency                                             0.50 (0.55)       0.26 (0.25)
Blogs proficiency                                            -0.97** (0.64)    -0.18 (0.22)
Chat proficiency                                             -1.03** (0.73)    -0.07 (0.26)
Educational background                                       0.16 (0.42)       -0.36* (0.21)
Country of origin                                            0.82* (0.34)      0.26** (0.18)
Face-to-face meeting                                         2.14* (0.44)      0.73* (0.18)
Ever heard                                                   0.96* (0.53)      0.13 (0.24)

Standard errors in parentheses. * significant at 5% (df = ∞); ** significant at 10% (df = ∞)
5 Conclusion

This study found suggestive evidence of the prominent role of trust and communication media in structuring collaboration in the NADGWG research network, manifested in 1) the closed interpersonal linkage of the network structure, 2) the significance of reputation, and 3) the similarity of country of origin. The role of trust in structuring collaboration differs across the stages of collaboration. This study also found that face-to-face meetings (an offline medium) and Microsoft SharePoint (a tool with richer collaboration features) are significant determinants of multi-cultural research collaboration formation. These results suggest that conducting research collaboration across disciplines and boundaries imposes high psychological costs on researchers, and hence requires a well-thought-out, carefully planned approach in which closeness, interactivity, and trust are the major considerations in partner selection. The present findings should be interpreted in light of the study's limitations, which point to future research. First, the object of this study was a network of researchers born and residing in the North American region, which limits the generalizability of the findings; future research could test the hypotheses in different collaboration settings. Second, this study measures the impact of communication tools based on subjective perceptions of proficiency in using technology.
References

[1] Adams, J.D., Black, G.C., Clemmons, J.R., Stephan, P.E.: Scientific Teams and Institutional Collaborations: Evidence from US Universities, 1981-1999. Research Policy 34(3), 259–285 (2005)
[2] Dawes, S.S., Gharawi, M., Burke, B.: Knowledge and Information Sharing in Transnational Knowledge Network: A Contextual Perspective. In: 44th Hawaii International Conference on System Sciences. IEEE, Los Alamitos (2011)
[3] Ding, W.W., Levin, S.G., Stephan, P.E., Winkler, A.E.: The Impact of Information Technology on Academic Scientists’ Productivity and Collaboration Patterns. Management Science 56(9), 1436 (2010)
[4] Eglene, O., Dawes, S.S.: Challenges and Strategies for Conducting International Public Management Research. Administration & Society 38(5), 596 (2006)
[5] Frohlich, N., Oppenheimer, J.: Some Consequences of E-Mail vs. Face-To-Face Communication in Experiment. Journal of Economic Behavior & Organization 35(3), 389–403 (1998)
[6] Haythornthwaite, C., Wellman, B.: Work, Friendship, and Media Use for Information Exchange in a Networked Organization. Journal of the American Society for Information Science 49(12), 1101–1114 (1998)
[7] Kanawattanachai, P., Yoo, Y.: Dynamic Nature of Trust in Virtual Teams. Journal of Strategic Information Systems 11, 187–213 (2002)
[8] Lewis, J.D., Weigert, A.: Trust as a Social Reality. Social Forces 63(4), 967 (1985)
[9] Lim, L., Firkola, P.: Methodological issues in cross-cultural management research: Problems, solutions, and proposals. Asia Pacific Journal of Management 17(1), 133–154 (2000)
[10] Mark, N.P.: Culture and Competition: Homophily and Distancing Explanations for Cultural Niches. American Sociological Review 68(3), 319–345 (2003)
[11] McAllister, D.: Affect- and Cognition-Based Trust as Foundations for Interpersonal Cooperation in Organizations. Academy of Management Journal 38(1), 24–59 (1995)
[12] Meyerson, D., Weick, K.E., Kramer, R.M.: Swift Trust and Temporary Groups. In: Kramer, R.M., Tyler, T.R. (eds.) Trust in Organizations: Frontiers of Theory and Research, pp. 166–195. Sage, Thousand Oaks (1996)
[13] Müller, M., Cowan, R., Duysters, G., Jonard, N.: Knowledge Structures. Working Papers of BETA (2009)
[14] Nicholson, C.Y., Compeau, L.D., Sethi, R.: The Role of Interpersonal Liking in Building Trust in Long-Term Channel Relationships. Journal of the Academy of Marketing Science 29(1), 3–15 (2001)
[15] Nielsen, B.B.: The Role of Trust in Collaborative Relationships: A Multi-Dimensional Approach. Management 7(3), 239–256 (2004)
[16] Rethemeyer, R.K.: Making Sense of Collaboration and Governance: Issues and Challenges. Public Performance & Management Review 32(4), 565–573 (2009)
[17] Rousseau, D.M., Sitkin, S.B., Burt, R.S., Camerer, C.: Not so Different After All: A Cross-Discipline View of Trust – Introduction to Special Topic Forum. Academy of Management Review 23(3), 393–404 (1998)
[18] Scholl, H.: Profiling the EG Research Community and Its Core. Electronic Government, 1–12 (2009)
[19] Ulbrich, S., Troitzsch, H., van den Anker, F., Plüss, A., Huber, C.: Collaborative Capability of Teams in Network Organizations. In: Camarinha-Matos, L.M., Paraskakis, I., Afsarmanesh, H. (eds.) PRO-VE 2009. IFIP AICT, vol. 307, pp. 149–156. Springer, Heidelberg (2009)
[20] Walter, J., Lechner, C., Kellermanns, F.W.: Knowledge Transfer Between and Within Alliance Partners: Private versus Collective Benefits of Social Capital. Journal of Business Research 60(7), 698–710 (2007)
[21] Wasko, M., Faraj, S., Teigland, R.: Collective Action and Knowledge Contribution in Electronic Networks of Practice. Journal of the Association for Information Systems 5(11-12), 493–513 (2004)
Integration and Warehousing of Social Metadata for Search and Assessment of Scientific Knowledge

Daniil Mirylenka, Fabio Casati, and Maurizio Marchese

Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38123, Trento, Italy
{dmirylenka,casati,marchese}@disi.unitn.it

Abstract. With the advancement of the Web, novel types of science-related data and metadata are emerging from a growing number of sources. Alongside the traditional bibliographic data provided by digital libraries, great amounts of social metadata (such as bookmarks, “reads”, tags, comments and “likes”) are created and accumulated by social networking services. We believe that these metadata can be fruitfully used to improve the search and assessment of scientific knowledge. The individual sources of scientific metadata differ widely in their focus, functionality, data coverage and data quality, and are currently limited to their own databases and data types. We suggest that the current individual services can be enhanced by integrating their data and metadata. In this paper we discuss the opportunities and challenges of such integration for the purpose of facilitating the discovery and evaluation of scientific knowledge, and present a framework for the integration and warehousing of both bibliographic and social scientific metadata.
1 Introduction
Dissemination and evaluation of scientific knowledge are essential to the progress of science in any field. On a daily basis researchers search for scientific contributions, guided by various reputation metrics in judging their quality and relevance. With the advent of the Web, the opportunity for new models of scientific knowledge dissemination and evaluation has emerged. Digital libraries have enabled effective search over large collections of bibliographic metadata about published contributions and their authors, and have provided access to citation-based metrics such as the number of citations and the h-index [6]. The Social Web has created new types of scientific data and metadata: no longer restricted to published articles, scientific knowledge may now be contained in different types of resources such as publication preprints or user blogs. Social networking services have also influenced the way scientific knowledge is disseminated. Using the Web, researchers now generate large amounts of usage metadata by expressing their opinions on scientific resources, either explicitly or implicitly – by adding them to personal and group libraries, “liking”, sharing, downloading or e-mailing them to colleagues. Moreover, they
semantically enrich and structure the information by tagging, annotating and linking the resources. There are, however, a number of problems that prevent these social and bibliographic metadata from being fully exploited. First, today's scientific digital libraries differ in their focus, data coverage and data quality, most often restricting search to one particular database. Second, web users usually participate in a limited number of social networking services, thus partitioning the potentially available social metadata and, similarly, limiting search to a few sources at a time, typically one. In our current work, we propose a conceptual model, design and implementation of a socially-oriented Scientific Resource Space (SRS) – an IT platform for the integration of various sources of bibliographic and social scientific metadata. Our final goal is to use this platform and the warehoused metadata to facilitate the discovery, navigation and assessment of scientific knowledge. We argue that, by integrating metadata from various sources, we can improve upon existing services by providing: (a) enhanced search over a greater amount of data and metadata, (b) optimized search and navigation that takes into account larger amounts of user-provided structural metadata, such as tags and links between resources, and (c) improved ranking algorithms based on the combination of traditional citation-based metrics and novel social usage metrics. Moreover, we suggest that the proposed platform can be the primary tool for exploring and analyzing social metrics and the space of scientific resources in general. In the following section we present a critical overview of the state of the art in the integration of bibliographic and social metadata. In Section 3 we describe our conceptual model for the proposed socially-oriented Scientific Resource Space, while in Section 4 we detail the architecture and the processes involved in the implementation of the SRS platform. In Section 5 we discuss our early experiences and conclude the paper.
2 State of the Art
Many scholarly digital libraries are populated in part or in full by collecting data from other digital libraries or web pages; the problem of data and metadata integration has therefore been widely addressed in this field. Depending on the scope and size of their datasets, different libraries have employed different approaches to collecting and maintaining their data. The Digital Bibliography & Library Project (DBLP) is one of the most widely used bibliographic databases in the Computer Science domain. DBLP gets its data from a predefined number of sources, relying largely on human effort during both the data acquisition and cleaning phases [11]. This approach allows for high data quality, but is only feasible with small to medium datasets. Scholarly search engines, such as Google Scholar or Microsoft Academic Search, represent another approach to the integration of bibliographic metadata. Although they disclose no precise information about their architectures and algorithms,
it is known that they obtain their data by crawling the web and extracting bibliographic records from the publication texts (see http://scholar.google.com/scholar/inclusion.html and http://academic.research.microsoft.com/About/Help.htm). Attempts at the personalization and socialization of digital libraries have led to the creation of a number of specialized social networking services (SNS) for scientists [7]. Besides search functionality, sites like Mendeley, CiteULike or Connotea allow users to create and maintain libraries of scientific resources, and to tag, annotate, group, share and comment on them. In contrast to traditional digital libraries, resource metadata can also be provided by users, who are allowed (and encouraged) to add new resources to the database. In general, this approach creates the opportunity for collecting large amounts of metadata, but results in lower metadata quality, for instance resource duplication. Various data models and protocols for scholarly metadata integration have been proposed, focusing mainly on bibliographic metadata. The Dublin Core metadata standard [19] has been adopted by the Open Archives Initiative (OAI) [9] to enable interoperability between digital libraries through a metadata harvesting protocol (OAI-PMH) [10]. This has allowed the creation of large bibliographic databases, such as OAIster, that integrate data from sources supporting OAI-PMH (by exposing their data in a predefined format); a minimal harvesting client is sketched at the end of this section. Other models of scientific metadata include the Bibliographic Ontology [5], the SWRC Ontology [18], the Common European Research Information Format (CERIF) [1], the CCLRC Scientific Metadata Model [12], the MESUR ontology [17] and others. Attempts have also been made to address the integration of social metadata. The World Wide Web Consortium (W3C) has investigated the opportunities for a federated social web [16]. Among other activities, this includes work on OStatus (a standard for exchanging status updates), Portable Contacts, and the Semantic Microblogging Framework (SMOB) [14]. The Semantic Web community has proposed ontologies for modeling different aspects of the social web [3][2][4][15].
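Referring back to the OAI-PMH harvesting model mentioned above, the following minimal Python client pages through Dublin Core records exposed by a repository's OAI endpoint. The endpoint URL (arXiv's public OAI interface) is used here only as an example, and error handling and rate limiting are omitted.

    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"

    def harvest(base_url, max_pages=2):
        """Yield (title, creators) pairs from ListRecords responses,
        following resumption tokens across pages."""
        url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
        for _ in range(max_pages):
            root = ET.parse(urllib.request.urlopen(url)).getroot()
            for rec in root.iter(OAI + "record"):
                title = rec.findtext(".//" + DC + "title", default="")
                creators = [c.text or "" for c in rec.iter(DC + "creator")]
                yield title, creators
            token = root.findtext(".//" + OAI + "resumptionToken")
            if not token:
                break
            url = (base_url + "?verb=ListRecords&resumptionToken="
                   + urllib.parse.quote(token))

    for title, creators in harvest("http://export.arxiv.org/oai2", max_pages=1):
        print(title, "|", ", ".join(creators))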
3 Conceptual Model
In this work we rely on the definition of the Scientific Resource Space (SRS) [13] and extend it with the notion of social metadata. In brief, the SRS provides homogeneous programmatic access to scientific resources present on the web by abstracting the various kinds of data and metadata into a uniform conceptual model that supports uniform access logic. Our conceptual model for a socially-oriented SRS (Figure 1) revolves around scientific resources and the relations between them, as well as their bibliographic and social metadata. Scientific Resource is the central concept in our model; it represents any identifiable content on the web that is of scientific value or is involved in the research process, be it a publication, review, dataset, experiment or even a blog entry or wiki page. The main attributes of a Scientific Resource are URI, type, format and title. For example, consider the Scientific Resource with the following attributes: (a) title: “Data integration: A theoretical perspective”,
Fig. 1. Conceptual model of Scientific Resource Space (SRS)
(b) URI: http://doi.acm.org/10.1145/543613.543644, (c) type: conference paper, and (d) format: PDF. Connected to Scientific Resource through many-to-many relations are the entities Author, Publisher and Venue, representing the contributors to the creation of the resource and to its dissemination. For our example paper, the Author would be the entity representing the scientist Maurizio Lenzerini, the Venue would represent the Symposium on Principles of Database Systems, and the Publisher would represent the Association for Computing Machinery (ACM). Connections between scientific resources are modeled as relations of different types, an important example being the citation relation between publications. Others include relations between papers and the corresponding presentation slides, between experiments and the datasets they are based on, or between experiments and the papers reporting on them. Versioning and similarity between scientific resources are among the other aspects that can be modeled via relations. The main focus of our model is the social activity around scientific resources: how people use and disseminate the resources, and what they say about them. SocialMetadata captures these activities with its three subtypes. FreeText represents unstructured texts, such as comments or notes attached to resources, which we do not intend to interpret. LabelText is text that can serve for the classification of resources, a typical example being users’ tags in social bookmarking sites; we may or may not want to interpret these labels, establish relations between them or merge them into a single categorization scheme. The third type of SocialMetadata is Action, which models any kind of user activity towards a resource, such as sharing, ranking, “liking”, bookmarking or downloading. Depending on the type and value associated with it, an action
may express a user's interest in a resource, and their assessment of its quality or relevance. The interpretation of Actions is, however, left to applications. The presented conceptual model is also the underlying model of our metadata warehouse, and thereby explicitly includes some attributes of the data integration process. Source stands for the source system, such as DBLP, Mendeley or CiteULike, that provided the particular metadata element. Time is the time when the metadata element was acquired from the source. User is an optional attribute representing the web user who created, explicitly or implicitly, the metadata element within the Source. In the case of SocialMetadata, the User is the subject who performs an activity involving a scientific resource. A possible rendering of this model in code is sketched below.
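The following minimal sketch shows how the conceptual model of Figure 1 might be rendered in code. The class and attribute names follow the model; the concrete field types and the example instances are illustrative assumptions.

    from dataclasses import dataclass
    from datetime import datetime
    from typing import Optional

    @dataclass
    class ScientificResource:
        uri: str
        type: str       # e.g. "conference paper", "dataset", "blog entry"
        format: str     # e.g. "PDF", "HTML"
        title: str

    @dataclass
    class SocialMetadata:
        resource: ScientificResource
        source: str                 # source system, e.g. "Mendeley", "CiteULike"
        time: datetime              # when the element was acquired from the source
        user: Optional[str] = None  # web user who created the element, if known

    @dataclass
    class FreeText(SocialMetadata):   # uninterpreted comments or notes
        text: str = ""

    @dataclass
    class LabelText(SocialMetadata):  # classification labels, e.g. user tags
        label: str = ""

    @dataclass
    class Action(SocialMetadata):     # sharing, ranking, "liking", bookmarking, ...
        action_type: str = ""
        value: Optional[float] = None  # interpretation is left to applications

    paper = ScientificResource(
        uri="http://doi.acm.org/10.1145/543613.543644",
        type="conference paper", format="PDF",
        title="Data integration: A theoretical perspective")
    tag = LabelText(paper, source="CiteULike", time=datetime(2011, 7, 10),
                    label="data-integration")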
4 Socially-oriented Scientific Resource Space and Metadata Integration
The proposed SRS presents a facade between client applications and the various data sources, providing uniform access to the integrated data of the latter. It is composed of the integration layer and the set of APIs through which it is accessed by the applications. The integration layer consists of the adapter layer, the Metadata Warehouse, and the on-demand data acquisition engine. The adapter layer encapsulates the particularities of the data sources and their data models, and helps cope with the heterogeneity of scientific metadata. Each adapter is responsible for getting metadata according to the protocols and APIs of its source and transforming it into the model of the Scientific Resource Space. The transformed metadata can then be warehoused or, after being processed, served directly to the client application. In the following, we describe the metadata warehouse and the on-demand data acquisition in more detail.
4.1 Metadata Warehousing
The central component of the SRS integration layer is the Metadata Warehouse module (Figure 2), whose implementation largely follows the traditional ETL (Extract, Transform, Load) process. The scientific metadata is first gathered from a source by the corresponding adapter and stored in a so-called source dump – a set of preliminary tables dedicated to this source. The metadata is then loaded into the staging area, where it is joined with metadata from other sources. At this stage, metadata elements are preliminarily merged based on the identifiers provided by the sources, to ensure there are no duplicates at the source level. During the subsequent cleaning phase the staging area is analyzed to discover and merge entities duplicated across different sources. After being cleaned, the metadata is finally loaded from the staging area into the target database, where it is made available to the applications. At each stage of the process only incremental changes are made to the corresponding tables, achieved by computing the difference between the desired and the current state of the tables; a sketch of this step follows.
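The incremental-update step just described (computing the difference between the desired and the current state of a table) might look roughly like the following sketch. It assumes each metadata element carries a stable key and compares two snapshots held as dictionaries; the element keys are hypothetical, and a real implementation would run as set-based SQL over the staging tables.

    def table_diff(current: dict, desired: dict):
        """Compute the incremental changes that turn `current` into `desired`.
        Both arguments map a stable element key (e.g. a source-provided id)
        to the row contents; only the differences are applied to the table."""
        inserts = {k: v for k, v in desired.items() if k not in current}
        deletes = [k for k in current if k not in desired]
        updates = {k: v for k, v in desired.items()
                   if k in current and current[k] != v}
        return inserts, updates, deletes

    current = {"dblp:conf/pods/1": {"title": "Data integration", "year": 2002}}
    desired = {
        "dblp:conf/pods/1": {"title": "Data integration: A theoretical perspective",
                             "year": 2002},
        "msas:12345": {"title": "From databases to dataspaces", "year": 2005},
    }
    ins, upd, dele = table_diff(current, desired)
    # `ins` holds the new record, `upd` the corrected title, `dele` is empty.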
Fig. 2. High-level architecture of Scientific Resource Space (SRS)
Fig. 3. On-the-fly search application
The applications built on top of the SRS focus on different usages of the scientific metadata. In order to provide useful functionality with reasonable performance, they require efficient access to their own representations of the scientific resource space. For instance, Reseval [8] – a tool for evaluating scientific contributions, authors and institutions – uses various research impact metrics. The numbers of citations and self-citations of a paper or an author are the primary units of data for Reseval; they are accessed very frequently and used to construct more complex metrics. For efficient access, these numbers cannot be calculated dynamically and have to be precomputed. The SRS addresses this problem by creating application-specific views that contain all the data needed by an application in a suitable format, and that are updated at the final stage of the ETL process. In order to enable source-dependent requests, the SRS propagates the information about the sources of metadata elements through all stages of the process to the target database and the application-specific views. At any time, for any metadata element, it is possible to know which source or sources it originates from. In the case of Reseval this enables the computation of metrics with respect to any source or combination of sources.
4.2 On-demand Data Acquisition and Integration
For some applications it is possible to answer queries by forwarding requests to the services provided by the sources and integrating the results on the fly. This functionality is implemented by the on-demand data acquisition engine of SRS. It allows a small portion of up-to-date metadata to be fetched from the sources and used to answer the query, without making it undergo the heavy, offline warehousing process. The adapter layer is still involved, translating the query into the language of the source and mapping the results back into the SRS model. The integration can, however, be done on demand and in real time.
One example of an application using this implementation of SRS is a scientific metasearch tool (http://metasearch.mateine.org). In this application the search queries are forwarded to the sources (in this specific case, Mendeley, Microsoft Academic Search (MSAS) and CiteULike), and the search results are obtained and transformed into the model of SRS. Results from different sources are then matched against each other to identify results representing the same resources. The matched results are merged into a single resource combining the metadata of all of them. For instance, the search results for the term "dataspaces" (Figure 3) contain 8 entities corresponding to the paper "From databases to dataspaces...": 6 of them come from Mendeley, and one each from MSAS and CiteULike. In the search results of our system they are all merged into one resource, for which the citation data comes from MSAS, while readership statistics and tags are aggregated over the corresponding entities in Mendeley and CiteULike. The aggregated resources can optionally be augmented with metadata from other sources, re-ranked and filtered according to the user preferences. The user can explore the results by reordering and filtering them, and by following the links to resources within the various source systems.
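To illustrate the matching and merging step, here is a minimal Python sketch. The record format, the title-based matching key, and the aggregation rules are simplifications introduced for illustration; the actual SRS matching logic is more elaborate.

```python
from collections import defaultdict

def merge_results(results):
    """Group records that represent the same resource and combine their metadata."""
    groups = defaultdict(list)
    for r in results:
        # Hypothetical matching key: a normalized title. Real matching would
        # also use authors, year, identifiers, etc.
        key = "".join(ch for ch in r["title"].lower() if ch.isalnum())
        groups[key].append(r)

    merged = []
    for records in groups.values():
        merged.append({
            "title": records[0]["title"],
            "sources": [r["source"] for r in records],
            # Citation counts come from a single authoritative source...
            "citations": max((r.get("citations", 0) for r in records), default=0),
            # ...while readership and tags are aggregated over all matched records.
            "readers": sum(r.get("readers", 0) for r in records),
            "tags": sorted({t for r in records for t in r.get("tags", [])}),
        })
    return merged

results = [
    {"source": "Mendeley", "title": "From Databases to Dataspaces", "readers": 120, "tags": ["dataspaces"]},
    {"source": "MSAS", "title": "From databases to dataspaces", "citations": 310},
    {"source": "CiteULike", "title": "From databases to dataspaces", "readers": 15, "tags": ["data management"]},
]
print(merge_results(results))
```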
5 Preliminary Results and Conclusion
At present, we are using the first implementation of SRS and experimenting with a prototype search application (http://metasearch.mateine.org) following the on-demand metadata acquisition and integration approach. In our experiments we have used Microsoft Academic Search (MSAS), CiteULike and Mendeley as sources. All these sources have provided primary metadata about publications, such as authors, venue and publication year. In addition, citation statistics (the number of citations) have been obtained from Microsoft Academic Search, while CiteULike and Mendeley have also provided some usage statistics (mainly the number of users who bookmarked the publication). This application has allowed us to compare the search results returned by these services and to start exploring the differences between them. This has supported our intuition that joining the search results from different sources can improve the coverage and the diversity of search results. We have also learnt some lessons regarding the benefits and the limitations of on-demand data acquisition. On the one hand, this approach makes it possible to use more sources; specifically, we can leverage the fact that sources are normally more likely to provide a search API than direct access to their data. On the other hand, this approach does not allow us to influence the search algorithms of the sources, but only to reorder the retrieved results. In contrast, a full metadata warehousing solution requires all the data to be gathered and processed in advance, but it provides complete control over the implementation and fine-tuning of the search algorithms. Another limitation of the on-demand approach is the response time. We have built an initial implementation of the Metadata Warehouse and used it to build a number of research applications. One example of such applications is
a survey on how researchers find references for their papers (http://survey.mateine.org/). Given a user name, the application suggests a number of recent publications of the user. The user can choose a publication and specify, for each reference of this publication, the way in which it was found (for example, by searching in a digital library, as a suggestion of a colleague, etc.). The results of this survey can later be used as another source of metadata for SRS, and thus made available to other applications. Another application built on top of SRS investigates the potential of various social networks as sources of reference recommendations (http://discover.mateine.org). In this paper we have focused on the management and use of the social and bibliographic metadata available on the Web for the search and evaluation of scientific resources. We have discussed the challenges and opportunities of integrating these metadata, and proposed an integration solution called the Scientific Resource Space (SRS). We have then described the model and the architecture of SRS and discussed some preliminary results. Future work includes: (1) a rigorous investigation of the differences in the ranking of the search results obtained from different metadata, and (2) the exploration of novel social metrics based both on social metadata alone and on the combination of bibliographic and social metadata.
References
[1] Asserson, A., Jeffery, K., Lopatenko, A.: CERIF: past, present and future: an overview. In: CRIS (2002)
[2] Breslin, J., Decker, S.: SIOC: an approach to connect web-based communities. International Journal of Web Based Communities (IJWBC) 2(2) (2006)
[3] Brickley, D., Miller, L.: FOAF vocabulary specification (2005)
[4] Ding, Y., Toma, I., Kang, S., Fried, M., Yan, Z.: Data mediation and interoperation in social web: Modeling, crawling and integrating social tagging data. In: SWSM (2008)
[5] D'Arcus, B., Giasson, F.: Bibliographic ontology specification (retrieved October 8, 2010)
[6] Hirsch, J.: An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences of the United States of America 102(46) (2005)
[7] Hull, D., Pettifer, S.R., Kell, D.B.: Defrosting the digital library: bibliographic tools for the next generation web. PLoS Computational Biology 4(10) (2008)
[8] Imran, M., Marchese, M., Ragone, A., Birukou, A., Casati, F., Laconich, J.J.J.: Reseval: An open and resource-oriented research impact evaluation tool. Research Evaluation (2010)
[9] Lagoze, C., Van de Sompel, H.: The Open Archives Initiative: Building a low-barrier interoperability framework (2001)
[10] Lagoze, C., Van de Sompel, H., Nelson, M., Warner, S.: Open Archives Initiative protocol for metadata harvesting, v. 2.0 (2002)
[11] Ley, M., Reuther, P.: Maintaining an Online Bibliographical Database: the Problem of Data Quality (2006)
[12] Matthews, B., Sufi, S.: The CCLRC Scientific Metadata Model, Version 2 (2002)
[13] Parra, C., Baez, M., Daniel, F., Casati, F., Marchese, M., Cernuzzi, L.: A scientific resource space management system (2010)
[14] Passant, A., Bojars, U., Breslin, J., Hastrup, T., Stankovic, M., Laublet, P., et al.: An Overview of SMOB 2: Open, Semantic and Distributed Microblogging. In: ICWSM (2010)
[15] Passant, A., Laublet, P.: Meaning of a tag: A collaborative approach to bridge the gap between tagging and linked data. In: LDOW (2008)
[16] Prodromou, E., Halpin, H.: W3C Federated Social Web Incubator Group (2010)
[17] Rodriguez, M.A., Bollen, J., Van de Sompel, H.: A practical ontology for the large-scale modeling of scholarly artifacts and their usage. In: ICDL (2007)
[18] Sure, Y., Bloehdorn, S., Haase, P., Hartmann, J., Oberle, D.: The SWRC Ontology – Semantic Web for Research Communities. In: EPIA (2005)
[19] Weibel, S., Kunze, J., Lagoze, C., Wolf, M.: Dublin core metadata for resource discovery. Internet Engineering Task Force RFC 2413 (1998)
Comparing Linkage Graph and Activity Graph of Online Social Networks

Yuan Yao, Jiufeng Zhou, Lixin Han, Feng Xu, and Jian Lü

State Key Laboratory for Novel Software Technology, Nanjing, China
Department of Computer Science and Technology, Nanjing University, China
[email protected], {xf,lj}@nju.edu.cn
Department of Computer Science and Technology, HoHai University, China
{jfzhou,lhan}@hhu.edu.cn
Abstract. In the context of online social networks, the linkage graph – a graph composed of social links – has been studied for several years, while researchers have recently suggested studying the activity graph of real user interactions. Understanding these two types of graphs is important since different online applications might rely on different underlying structures. In this paper, we first analyze two specific online social networks, one of which stands for a linkage graph and the other for an activity graph. Based on our analysis, we find that the two networks exhibit several static and dynamic properties in common, but show a significant difference in degree correlation. This property of degree correlation is further confirmed as a key distinction between the two types of graphs. To further understand this difference, we propose a network generator which also captures the other examined properties. Finally, we provide some potential implications of our findings and generator.

Keywords: Linkage Graph, Activity Graph, Online Social Networks, Degree Correlation, Network Generator.
1 Introduction
Researchers have made remarkable achievements in analyzing the structural properties of the linkage graph, i.e., a graph composed of social links [10,15,1]. Several applications have used these properties, for example, to protect against Sybils [25] or to prevent unwanted communication [16,21]. Recently, researchers have suggested studying the activity graph of real user interactions instead, in order to enhance social-network-based applications [24,22]. We define the linkage graph as a graph where nodes stand for the people in the social network and edges are their friendship links. The activity graph is correspondingly defined as a graph where the nodes are still the people but the edges stand for their interactions. Understanding the linkage graph and the activity graph, as well as their similarities and differences, is important for developing future online applications. Wilson et al. [24] have shown that, if operating on the activity graph instead of the linkage graph, the RE application [7] actually performs better while the
SybilGuard system [25] behaves less effectively. In addition, different online applications might rely on different underlying structures. We identify two categories of applications: linkage-based applications and activity-based applications. Linkage-based applications need available links, for example, to disseminate information, and therefore should be built on the linkage graph. By comparison, activity-based applications, such as trust inference, are based on a history of interactions, and therefore should be built on the activity graph. Existing analyses of online social networks tend to experiment on bidirectional or undirected graphs, or map directed graphs to undirected graphs by removing the unidirectional edges (e.g., [10]). However, simply removing the unidirectional edges may result in a loss of information. Additionally, some applications rely on unidirectional edges, such as edges representing trust, which is asymmetric in nature [8]. In view of this, we primarily focus on the analysis of directed networks. In this paper, we mainly study two graphs mapped from online social networks: the Flickr network, which consists of social links, and the Epinions network of user interactions. These two graphs are both directed, and both have timestamps on every edge, allowing us to study dynamic properties. Dynamic properties are important as many online social networks keep evolving over time. To this end, in addition to analyzing several common static properties, we also explore some dynamic properties including densification, clustering, and diameter over time. Our results show that the two networks share some common static properties, i.e., they both exhibit a power-law degree distribution, a high clustering coefficient, and a small diameter. As to dynamic properties, we find that both networks follow the densification law and keep a relatively stable clustering coefficient over time. However, we do not observe diameter shrinking in Epinions' activity graph, while this shrinking exists in Flickr. One of our major findings is the difference in degree correlation, also known as degree mixing [17], between the two graph types. Traditional social networks have essentially positive degree correlation, indicating that gregarious people tend to make friends with each other. This property can also differentiate social networks from other networks such as technological and biological networks [17]. However, this positive degree correlation does not always hold in online social networks. Based on additional experiments, we find that linkage graphs still have positive degree correlation whereas activity graphs show neutral degree correlation. We then confirm degree correlation as a key distinction between activity graphs and linkage graphs. To further understand and capture this difference in degree correlation, we propose a network generator which captures the other static and dynamic properties as well. Our generator has only two parameters and follows the intuition that online linkage graphs have high reciprocity relative to activity graphs. In addition to helping understand the underlying factors of the emerged properties, the generator can also be used to evaluate algorithms on networks [8], and to generate realistic graphs when real data is unavailable or unusable.
Our findings and generator, while preliminary, could provide potential implications for a variety of online applications, including viral marketing, social search, and trust and reputation management. In this paper, we concentrate on two specific applications, one from the linkage-based applications and one from the activity-based applications. The rest of the paper is organized as follows. Section 2 covers related work on property analysis and network generators. Section 3 presents our analysis of the networks in several respects, including static and dynamic properties. Section 4 presents our network generator and discusses its simulation results. Section 5 discusses some implications of our findings and generator for different applications. Section 6 concludes the paper.
2 Related Work
There is a rich body of research on analyzing the structural properties of graphs mapped from online social networks. Kumar et al. studied datasets from Flickr and Myspace, and found that the networks are composed of three groups: singletons, a giant component, and isolated communities [10]. They also analyzed how nodes evolve among these three groups; for example, isolated communities might merge into the giant component. However, they mapped the networks to undirected graphs by leaving out the unidirectional edges. In contrast, all the measurements we choose to study are based on directed graphs. Mislove et al. took a set of measurements on Flickr, LiveJournal, Orkut, and YouTube [15]. They found that power laws exist in both the out-degree and in-degree distributions, and that nodes with high out-degree tend to have high in-degree. They also found that highly clustered nodes are usually of low degree, and that the clusters connect to each other through a relatively small number of high-degree nodes. Different from their work, we put special emphasis on the measurement of degree correlation, which is confirmed as a key indicator for distinguishing linkage graphs and activity graphs. Ahn et al. analyzed the Cyworld network, and observed a multi-scaling behavior in its degree distribution [1]. In addition, they compared the explicit linkage graph and the implicit activity graph constructed from messages exchanged on Cyworld's guestbook. They focus only on static properties, while we also consider dynamic properties. Wilson et al. studied the activity graph and linkage graph of Facebook [24]. Similar to Ahn et al. [1], they built the activity graph from real interactions between friends, and compared it to the linkage graph. However, findings based on this technique might be biased, because the extracted activity graph is actually a sub-graph of the linkage graph. Viswanath et al. also examined the activity graph of Facebook [22]. They found that although individuals' behavior changed rapidly, the properties of the network as a whole remained relatively stable. Different from the preceding work, we compare the linkage graph and activity graph based on two distinct datasets to eliminate this bias.
A parallel body of work focuses on network generators. We give a brief history of network generators with respect to our examined properties; a detailed discussion can be found in [5]. The BA generator [2] has influenced a large body of later generators. The idea of preferential attachment in this model is thought to be the cause of the power-law degree distribution: new nodes connect to existing high-degree nodes with greater probability, making their degree even higher and forming the heavy tail of the degree distribution. As to the clustering of networks, a well-known idea is perhaps the one underlying the edge-copying generator [9]. Its basic idea is to copy the links of an existing node with some probability when a new node arrives. This idea is analogous to the process of making friends in human communities. The small-world generator [23] is another well-known generator that can meet the clustering property of social networks. Based on a ring lattice and a rewiring process, networks generated by this generator also have a low average distance between nodes. However, none of the preceding generators exhibits the dynamic properties of networks. The forest fire generator [11] can meet a set of dynamic and static properties of social networks. However, when we tried to generate the Flickr network with it, the generated network did not have a positive degree correlation even when the other properties were met. Based on our experiments, this positive degree correlation is a key distinction between activity graphs and linkage graphs of online social networks. Our generator incorporates reciprocity together with the fire-burning idea from the forest fire generator. The results show that our generator can generate networks with positive degree correlation while capturing the other examined properties at the same time. The two generators complement each other, as the forest fire generator can meet several examined properties with neutral degree correlation.
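For concreteness, the following sketch grows a graph by preferential attachment in the spirit of the BA model; it is a minimal illustration, not the reference implementation of any of the cited generators.

```python
import random

def ba_like_graph(n, m):
    """Grow an undirected graph in which each new node attaches to m existing
    nodes, chosen with probability proportional to their current degree."""
    edges = [(0, 1)]
    # Each endpoint occurrence in `ends` represents one unit of degree, so
    # uniform sampling from `ends` realizes preferential attachment.
    ends = [0, 1]
    for v in range(2, n):
        targets = set()
        while len(targets) < min(m, v):
            targets.add(random.choice(ends))
        for t in targets:
            edges.append((v, t))
            ends.extend([v, t])
    return edges

edges = ba_like_graph(1000, 2)
print(len(edges))  # about 2 edges per node; degrees end up heavy-tailed
```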
3 Structural Property Analysis
In this section, we study several static and dynamic properties of the two graph types. We first describe our chosen datasets, and then present our results for these properties, including degree distribution, clustering coefficient, and diameter. After that, we give emphasis to the degree correlation property, which is further confirmed as a key indicator for distinguishing linkage graphs and activity graphs. Overall, we find that the two graphs are surprisingly similar to each other in many properties except degree correlation.
3.1 Datasets
The Flickr dataset [15,14] consists of a linkage graph with each edge representing a social friendship link. This data was continuously crawled for 104 days, from February 3, 2007 to May 18, 2007. According to the timestamps, we cut the data into 27 snapshots over time. The first snapshot is the initial graph of February 3, 2007, and we will refer to this snapshot as GF0. Each of the remaining snapshots adds four more days' data.
Table 1. High-level statistics of the two chosen online social networks as directed graphs

Statistic                        Flickr       Epinions
Initial nodes                    1,834,425    93,598
Initial edges                    24,828,241   588,616
Final nodes                      2,570,535    131,828
Final edges                      33,140,018   841,200
Time span                        104 days     31 months
Snapshots                        27           31
Node growth                      40.13%       40.84%
Edge growth                      33.48%       42.91%
Avg. node growth per snapshot    1.54%        1.36%
Avg. edge growth per snapshot    1.29%        1.43%
The Epinions dataset [13] consists of an activity graph where each edge stands for a user interaction. We use the data from January 12, 2001 to August 12, 2003. Similar to Flickr, we cut the Epinions data into 31 snapshots. The first snapshot is the initial graph of January 12, 2001, and we will refer to this snapshot as GE0. Every additional month of the remaining data forms a snapshot of the network. High-level statistics of the two graphs can be seen in Table 1. Although the datasets have different time spans, their growth rates are similar to each other. For the sake of simplicity, we will only give the results for GF0 and GE0 in the static analysis.
3.2 Static Properties
A power-law degree distribution [2] means that the count of nodes with degree k, plotted against the degree k, forms a line on a log-log scale. This skewed distribution is very common in social networks, and we observe it in both graphs. The degrees in Epinions are about one order of magnitude smaller than those in Flickr, but the power-law coefficients of the four distributions are quite close to each other. To verify this, we use the fitting method described by Clauset et al. [6] and apply it to the four distributions. We do not show the figures for these static properties due to space limits, but the statistics can be seen in Table 2. The clustering coefficient is widely used as an indicator of community structure [5], or of the transitivity of a graph [18]. We use the definition of the clustering coefficient by Watts and Strogatz [23].

Table 2. Statistics of static properties on GF0 and GE0

Graph  In-degree coef.*  Out-degree coef.*  Clustering coef.  Diameter
GF0    1.76, 0.0122      1.76, 0.0188       0.2090            7.5173
GE0    1.78, 0.0287      1.75, 0.0146       0.2357            5.8646

* Values in these columns are the power-law coefficient estimate and the corresponding Kolmogorov-Smirnov goodness-of-fit, respectively.
The two networks both have a high clustering coefficient: as Table 2 shows, the global clustering coefficient is 0.2090 for GF0 and 0.2357 for GE0. As to the local clustering coefficient (not shown here), nodes with lower degree tend to cluster more compactly at the edge of the graph, while nodes with higher degree stay in the middle of the graph and maintain high connectivity. The diameter can be calculated by many approaches, such as the characteristic path length, the average diameter, and the effective diameter. The effective diameter [20] is defined as the minimum number of hops within which 90% of all node pairs can reach each other, and we use the smoothed version described by Leskovec et al. [11]. To reduce the randomness of the effective-diameter algorithm, every diameter value in this paper is an average over four calculations. Our results show that GF0 has an effective diameter of 7.5173, while the value for GE0 is 5.8646. Unlike the results of Wilson et al. [24], we find that the two kinds of graphs are quite similar in the examined static properties. Although the diameter of Flickr is a little larger, we find later in the dynamic analysis that the Flickr network is in a diameter-shrinking stage while Epinions is much more stable in diameter. In addition, Flickr and Epinions are both small-world networks, with high clustering coefficients and low diameters.
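These two statistics are easy to reproduce on any graph. The sketch below uses NetworkX on an undirected stand-in graph, and approximates the effective diameter by sampling BFS distances and taking the 90th percentile; this is a simplification of the smoothed measure used in the paper, not the exact algorithm.

```python
import random
import networkx as nx

def effective_diameter(G, q=0.9, samples=200):
    """Approximate the effective diameter: the distance within which a
    fraction q of reachable node pairs lie, estimated from sampled BFS trees."""
    dists = []
    nodes = list(G.nodes())
    for s in random.sample(nodes, min(samples, len(nodes))):
        lengths = nx.single_source_shortest_path_length(G, s)
        dists.extend(d for d in lengths.values() if d > 0)
    dists.sort()
    return dists[int(q * len(dists)) - 1]

G = nx.barabasi_albert_graph(5000, 3, seed=42)  # illustrative stand-in graph
print("clustering coefficient:", nx.average_clustering(G))
print("effective diameter (approx.):", effective_diameter(G))
```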
3.3 Dynamic Properties
The dynamic properties we study include the densification law and diameter shrinking discovered by Leskovec et al. [11], and the clustering coefficient over time. The densification law indicates that the network is getting denser, and that the densification follows a power-law pattern: E(t) ∝ N(t)^α, where E(t) and N(t) are the edge and node numbers at time t, respectively. We find that both graphs exhibit densification over time. This means both graphs are in a stage where more and more edges are being created [10]. Moreover, we find that the coefficient of the densification power law increases slightly over time (see Fig. 1(a)). We also examine the clustering coefficient and effective diameter over time for Flickr and Epinions, as shown in Fig. 1(b) and Fig. 1(c). We find that the
Fig. 1. Some dynamic properties of Flickr and Epinions, with the horizontal axis representing the snapshot number over time: (a) the densification power-law coefficient, (b) the clustering coefficient, (c) the effective diameter.
clustering coefficient is relatively stable, with a slight decline over time, in both graphs, and that diameter shrinking appears only in Flickr, while Epinions exhibits a stable effective diameter over time. These two results are quite consistent with the results of Viswanath et al. [22], who also found their activity graph strikingly stable in clustering coefficient and diameter. In addition, the diameter-shrinking phenomenon has also been found in earlier work [11,10]. Analyzing dynamic properties can help to predict network growth, as well as to assess the quality of graph generators. Overall, except for the slight difference in diameter over time, the two graphs are again similar to each other in the examined dynamic properties.
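The densification exponent α in E(t) ∝ N(t)^α can be estimated with a least-squares fit in log-log space over the snapshots, as in the sketch below; the snapshot counts used here are made-up illustrative values, not the Flickr or Epinions figures.

```python
import numpy as np

# Hypothetical (node_count, edge_count) pairs, one per snapshot.
snapshots = [(100_000, 600_000), (120_000, 760_000),
             (140_000, 930_000), (160_000, 1_110_000)]

N = np.log([n for n, _ in snapshots])
E = np.log([e for _, e in snapshots])
alpha, log_c = np.polyfit(N, E, 1)  # slope of log E vs. log N is the exponent
print(f"densification exponent alpha = {alpha:.3f}")
```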
3.4 Degree Correlation
Degree correlation reflects how frequently nodes with similar degrees connect to each other. The degree correlation of a graph can be measured by the knn distribution and the corresponding assortativity coefficient r. The knn of an undirected graph is a mapping between a degree k and the average degree of all neighbors of the nodes with that degree k. The assortativity coefficient r, ranging between -1 and 1, gives a quantitative measurement of degree correlation. For example, a positive r value indicates a preference for high-degree nodes to connect to each other, and a random graph's r value should be 0 in theory. We can define four kinds of knn and assortativity r on our directed graphs. As an example, knn^{out-in} can be defined as a mapping between an out-degree k (the horizontal axis in Fig. 2(d)) and the average in-degree of all neighbors connected from nodes of that out-degree k (the vertical axis in Fig. 2(d)). We can further calculate the r values according to the formulae given by Newman [17]. As shown in Fig. 2(d), the knn^{out-in} distribution of Flickr is significantly upward, and the corresponding r value is 0.2329 (shown in Table 3). This upward trend of knn, along with the significantly positive value of r, indicates that Flickr nodes with high out-degree have a strong tendency to connect to nodes with high in-degree.
Fig. 2. The knn distributions of GF0 (panels (a)–(d): knn^{in-in}, knn^{in-out}, knn^{out-out}, knn^{out-in}) and GE0 (panels (e)–(h), in the same order).
Table 3. Degree correlation analysis of activity graphs and linkage graphs. All activity graphs studied here are based on interactions, and all linkage graphs on friendship links.

Graph type      Network           Size       Symmetry  r in-in  r in-out  r out-out  r out-in
Activity graph  Epinions          93,598     30.5%      0.0135   0.0698   -0.0164    -0.0556
                Advogato*         7,421      39.8%     -0.0250   0.0003   -0.0008    -0.0683
                Wiki-Vote**       7,115      14.4%      0.0051   0.0071   -0.0189    -0.0832
                Wiki-Talk**       2,394,385  5.6%      -0.0566  -0.0482    0.0231    -0.0809
Linkage graph   Flickr            1,834,425  62.3%      0.3383   0.2614    0.1830     0.2329
                Facebook [22]     46,952     62.9%      0.1830   0.2131    0.2719     0.2435
                LiveJournal [15]  5,204,176  73.5%      0.1759   0.3633    0.3763     0.1796

* Advogato dataset, available at http://www.trustlet.org/datasets/advogato/
** SNAP datasets, available at http://snap.stanford.edu/data/
The other three kinds of knn and assortativity r can be defined and calculated accordingly, as shown in Fig. 2. Compared to the upward tendencies of the four knn distributions of Flickr (Fig. 2(a)-2(d)), Epinions has relatively flat knn distributions (Fig. 2(e)-2(h)). This difference is again reflected by the assortativity coefficients: while Flickr has significantly positive assortativity coefficients, the r values of Epinions are largely neutral. This neutrality indicates that the edges of Epinions are relatively random, recalling that the r values of random graphs should be 0 in theory. Existing work already shows that social networks tend to have positive r values while the r values of other kinds of networks tend to be negative [17]. However, this rule does not hold in the activity graphs of online social networks. To further study the degree correlation of the two graph types, we analyzed the additional datasets shown in Table 3. We observe from the table that all the activity graphs we studied tend to have neutral r values around 0, while the linkage graphs have significantly positive r values. One possible reason for the neutrality of activity graphs could be the relative randomness of interactions, whereas linkage graphs have strong reciprocity. The reciprocity of the datasets is shown in the fourth column of Table 3, and we will discuss this further in the next section.
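The four directed assortativity coefficients can be computed with NetworkX, which lets one choose the degree type at either end of an edge. A minimal sketch on a random directed graph, whose r values should come out near 0, matching the intuition above:

```python
import networkx as nx

# Toy directed graph; random edges, so all four coefficients should be ~0.
G = nx.gnp_random_graph(2000, 0.005, directed=True, seed=1)

for x, y in [("in", "in"), ("in", "out"), ("out", "out"), ("out", "in")]:
    r = nx.degree_assortativity_coefficient(G, x=x, y=y)
    print(f"r^({x}-{y}) = {r:.4f}")
```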
4 Network Generator
In this section, we present our generator for online social networks. The generator captures the local behavior that forms positive degree correlation, and focuses especially on generating linkage graphs. Our goal in developing this generator is to understand the local behavior behind the global degree correlation property, and to generate networks that exhibit all the properties examined in the previous section.
4.1 Generator Description
The fire-burning idea of the forest fire generator aims at meeting a set of properties, including power-law degree distribution, high clustering, small diameter,
densification, and shrinking diameter [11]. However, the generator does not consider degree correlation. By exploring the parameter space of the forest fire generator, we find that the generated networks exhibit only neutral assortativity coefficients when the other examined properties are met. In contrast, networks produced by our generator achieve significantly positive assortativity coefficients of degree correlation while exhibiting the other properties at the same time. The forest fire generator was proposed based on the intuition of how authors find references in citation networks. In that generator, new nodes can only connect to old nodes, because authors cannot refer to unpublished papers. This is not the case in online social networks, which allow old nodes to connect to new ones. In fact, the linkage graphs are highly reciprocal, with high link symmetry [10,15]. In view of this, we incorporate the idea of symmetry into our generator, while retaining the fire-burning idea in order to obtain the other properties.
Algorithm 1. AddNode(v, Pb, Ps)
1:  upon the arrival of a new node v:
2:  v chooses an ambassador node w uniformly at random;
3:  Set S := Ø;
4:  Queue Q.add(w);
5:  while Q is not empty do
6:      generate a random number x that is geometrically distributed with mean 1/(1 − Pb);
7:      node c := Q.head();
8:      Q.remove(c);
9:      S.add(c);
10:     Set T := the out-link ends of c that are not in S;
11:     for i := 1 to min(x, T.size()) do
12:         node t := a node randomly sampled from T without replacement;
13:         Q.add(t);
14:         S.add(t);
15:     end for
16: end while
17: for each node u in S do
18:     v forms a link to u; u forms a link to v with probability Ps;
19: end for
Our generator has two parameters: the burning probability Pb, which is in charge of the burning process, and the symmetry probability Ps, which governs backward linking from old nodes to new ones. When a new node v arrives, it follows the process shown in Algorithm 1. Pb controls a BFS-based forward burning process, as users in these kinds of networks can check the out-linked friends of their out-linked friends. The fire burns increasingly fiercely as Pb approaches 1. Meanwhile, Ps adds fuel to the fire as it brings more links.
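A direct Python translation of Algorithm 1 is given below. The out-adjacency-set graph representation and the small seed graph are our own choices for the sketch; the logic follows the pseudocode.

```python
import random

def add_node(out_links, v, p_b, p_s):
    """Add node v to the directed graph `out_links` (dict: node -> set of
    out-neighbors) following Algorithm 1."""
    w = random.choice(list(out_links))          # ambassador, chosen uniformly
    visited, queue = set(), [w]                 # `visited` plays the role of S
    while queue:
        # Geometric number of out-links to burn, with mean 1/(1 - p_b).
        x = 1
        while random.random() < p_b:
            x += 1
        c = queue.pop(0)
        visited.add(c)
        candidates = list(out_links[c] - visited)
        random.shuffle(candidates)              # sample without replacement
        for t in candidates[:x]:
            queue.append(t)
            visited.add(t)
    out_links[v] = set()
    for u in visited:
        out_links[v].add(u)                     # v -> u always
        if random.random() < p_s:
            out_links[u].add(v)                 # u -> v with probability p_s

graph = {0: {1}, 1: {0}}                        # small seed graph
for v in range(2, 1000):
    add_node(graph, v, p_b=0.35, p_s=0.7)
```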
4.2 Experiments
We partially explore the parameter space of our generator in order to understand the degree correlation property, measured by the assortativity coefficients. Our exploration covers the burning probability Pb from 0 to 0.45 and the symmetry probability Ps from 0 to 1, with a step size of 0.01 for both. One problem with generating online social networks is the large size of the datasets; for simplicity, we fix the number of nodes to 90,000 in our experiments.
Fig. 3. The assortativity coefficients (r^{in-in}, r^{in-out}, r^{out-out}, r^{out-in}) over the partial parameter space of our generator; in each panel the vertical axis shows the coefficient and the two horizontal axes show the burning probability and the symmetry probability.
Fig. 3 shows our results for the four assortativity coefficients over the parameter space of our generator. The vertical axis of every subfigure represents the assortativity coefficient, and the two horizontal axes represent the burning probability and the symmetry probability, respectively. We observe that our generator produces significantly positive assortativity coefficients in general. This is probably because Ps gives big nodes chances to connect back to big nodes, while the links in the forest fire generator are created much more randomly. In addition, as Ps and Pb increase, the graphs generated by our generator show an upward trend in r values. This is because symmetry has an increasing impact when more links are burned. As discussed above, degree correlation is one of the major differences between linkage graphs and activity graphs, and the fraction of symmetric links in linkage
Table 4. Examined properties of our generator with symmetry probability Ps = 0.7 and burning probability Pb = 0.472

nodes: 90,000                  edges: 691,296
r in-in: 0.2231                r in-out: 0.2135
r out-out: 0.2031              r out-in: 0.2129
Power-law coef.: 2.30/2.33     Densification coef.: 1.18
Diameter: 8.1225               Diameter over time: shrinking
CC: 0.45                       CC over time: stable
graphs is relatively high. Based on our exploration of the parameter space, we also find that adding symmetry to a network generator can produce significantly positive assortativity coefficients. Symmetry is a reflection of reciprocity in linkage graphs, and therefore we argue that reciprocity is a key factor leading to positive degree correlation and positive assortativity coefficients. As an example, we try to generate a graph that meets all the observed properties of Flickr. When the number of nodes is fixed at 90,000, the corresponding edge number for Flickr is around 666,000, which can be estimated from the trend of the densification power-law coefficient. With symmetry probability Ps = 0.7 and burning probability Pb = 0.472, the results are shown in Table 4. The generated graph has significantly positive r values, similar to Flickr. Moreover, the graph follows the static properties of Flickr, including power-law degree distribution, small diameter, and high clustering coefficient, as well as the dynamic properties of shrinking diameter, stable clustering coefficient, and densification.
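Link symmetry can be measured as the fraction of edges whose reverse edge also exists, which NetworkX exposes directly; a toy example:

```python
import networkx as nx

G = nx.DiGraph([(1, 2), (2, 1), (2, 3), (3, 4), (4, 3)])
# Four of the five directed edges have a reverse edge, so reciprocity = 0.8.
print(nx.reciprocity(G))
```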
5 Discussion
In this section, we discuss some potential implications of our findings and experiments. Specifically, we concentrate on two applications: the linkage-based application of information dissemination, and the activity-based application of trust inference.
5.1 Information Dissemination
Online social networks have become a popular way to disseminate information. This kind of application should be built on the linkage graph, as the linkage graph is the common mechanism for information dissemination in content-sharing networks such as Flickr and YouTube. In addition, Cha et al. have found that a large portion of information spreads through social links [3,4], making the underlying structure worth a thorough study. They observe that information is limited to the vicinity of the source, although the small-world property implies that information could spread widely through the network. We give a possible explanation based on our observation of positive degree correlation: high-degree nodes tend to connect to each other, and thus their impact is limited within the highly connected core of the network. They also find that information spreads relatively widely when the source node is of high degree. Consequently, in order to spread information widely, the information sources should include
some high-degree nodes in the core of the graph, as well as some nodes at the edge of the graph.
5.2 Trust Inference
In the context of online social networks, we may have to interact with unknown participants. It is necessary to evaluate the trustworthiness of these unknown participants in an open environment, and one particular approach among others is to infer trust through existing trust relationships. To assist trust inference, researchers have proposed studying the structural properties of the underlying social networks [19,8,12]. We believe that it is more suitable to infer trust through the activity graph of real user interactions than through the linkage graph of social links in the online environment. First, edges in a linkage graph may be the result of reciprocity, and such edges cannot indicate trust. Second, we need explicit trust ratings along the path to the unknown participant to carry out the trust-inference computation [8]. Activity graphs can mitigate this problem, because we can obtain trust ratings from the feedback of every interaction [12]. Golbeck and Hendler have mentioned that a graph generator is necessary for evaluating trust-inference algorithms on networks [8]. However, they conducted their experiments on networks generated by the small-world generator, which captures only the clustering and diameter properties of social networks. To make such results more convincing, we need generators that produce more realistic graphs. Our generator can capture several dynamic properties of social networks, while remaining concise with only two parameters.
6 Conclusion
In this paper, we have studied several structural properties of two directed graphs mapped from online social networks. We recognize the two graphs as a linkage graph and an activity graph, respectively. Our results show that the two graphs are very similar to each other in several common static and dynamic properties, but quite different in degree correlation. We analyze several additional datasets and confirm that degree correlation is a key indicator for distinguishing the two graph types. To further understand this property, we propose our network generator and find that reciprocity is a key factor behind this difference. Future developers should consider and take advantage of the structural properties of the corresponding underlying network, and develop their applications accordingly. Moreover, our findings and generator together could be used to detect anomalies, predict network evolution, and generate realistic graphs of online social networks. Acknowledgments. The authors would like to thank Alan Mislove for sharing the Flickr data, and Paolo Massa for sharing the Epinions data. This work is supported by the National Natural Science Foundation of China (No. 60736015,
60721002, 61073030), the National 973 Program of China (2009CB320702), the National 863 Program of China (2009AA01Z117), and the "Climbing Program" of Jiangsu Province, China (BK2008017).
References
1. Ahn, Y.Y., Han, S., Kwak, H., Moon, S., Jeong, H.: Analysis of topological characteristics of huge online social networking services. In: Proceedings of the 16th International Conference on World Wide Web, WWW 2007, pp. 835–844. ACM, New York (2007)
2. Barabási, A., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)
3. Cha, M., Mislove, A., Adams, B., Gummadi, K.P.: Characterizing social cascades in Flickr. In: Proceedings of the First Workshop on Online Social Networks, WOSP 2008, pp. 13–18. ACM, New York (2008)
4. Cha, M., Mislove, A., Gummadi, K.P.: A measurement-driven analysis of information propagation in the Flickr social network. In: Proceedings of the 18th International Conference on World Wide Web, WWW 2009, pp. 721–730. ACM, New York (2009)
5. Chakrabarti, D., Faloutsos, C.: Graph mining: Laws, generators, and algorithms. ACM Comput. Surv. 38 (2006)
6. Clauset, A., Shalizi, C.R., Newman, M.E.J.: Power-law distributions in empirical data. SIAM Review 51, 661–703 (2009)
7. Garriss, S., Kaminsky, M., Freedman, M.J., Karp, B., Mazières, D., Yu, H.: RE: reliable email. In: Proceedings of the 3rd Conference on Networked Systems Design & Implementation, NSDI 2006, vol. 3, pp. 297–310. USENIX Association, Berkeley (2006)
8. Golbeck, J., Hendler, J.: Inferring binary trust relationships in Web-based social networks. ACM Transactions on Internet Technology 6, 497–529 (2006)
9. Kleinberg, J.M., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.S.: The web as a graph: Measurements, models, and methods. In: Asano, T., Imai, H., Lee, D.T., Nakano, S.-i., Tokuyama, T. (eds.) COCOON 1999. LNCS, vol. 1627, pp. 1–17. Springer, Heidelberg (1999)
10. Kumar, R., Novak, J., Tomkins, A.: Structure and evolution of online social networks. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2006, pp. 611–617. ACM, New York (2006)
11. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, KDD 2005, pp. 177–187. ACM, New York (2005)
12. Liu, G., Wang, Y., Orgun, M.A.: Optimal social trust path selection in complex social networks. In: Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010, pp. 1391–1398 (2010)
13. Massa, P., Avesani, P.: Trust-aware collaborative filtering for recommender systems. In: Chung, S. (ed.) OTM 2004. LNCS, vol. 3290, pp. 492–508. Springer, Heidelberg (2004)
14. Mislove, A., Koppula, H.S., Gummadi, K.P., Druschel, P., Bhattacharjee, B.: Growth of the Flickr social network. In: Proceedings of the First Workshop on Online Social Networks, WOSP 2008, pp. 25–30. ACM, New York (2008)
15. Mislove, A., Marcon, M., Gummadi, K.P., Druschel, P., Bhattacharjee, B.: Measurement and analysis of online social networks. In: Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, IMC 2007, pp. 29–42. ACM, New York (2007)
16. Mislove, A., Post, A., Druschel, P., Gummadi, K.P.: Ostra: leveraging trust to thwart unwanted communication. In: Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2008, pp. 15–30. USENIX Association, Berkeley (2008)
17. Newman, M.: Mixing patterns in networks. Physical Review E 67(2), 026126 (2003)
18. Newman, M.: The structure and function of complex networks. SIAM Review 45, 167–256 (2003)
19. Pujol, J.M., Sangüesa, R., Delgado, J.: Extracting reputation in multi agent systems by means of social network topology. In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 2002, pp. 467–474. ACM, New York (2002)
20. Tauro, S., Palmer, C., Siganos, G., Faloutsos, M.: A simple conceptual model for the Internet topology. In: Global Telecommunications Conference, GLOBECOM 2001, vol. 3, pp. 1667–1671. IEEE, Los Alamitos (2001)
21. Tran, T., Rowe, J., Wu, S.F.: Social email: A framework and application for more socially-aware communications. In: Bolc, L., Makowski, M., Wierzbicki, A. (eds.) SocInfo 2010. LNCS, vol. 6430, pp. 203–215. Springer, Heidelberg (2010)
22. Viswanath, B., Mislove, A., Cha, M., Gummadi, K.P.: On the evolution of user interaction in Facebook. In: Proceedings of the 2nd ACM Workshop on Online Social Networks, WOSN 2009, pp. 37–42. ACM, New York (2009)
23. Watts, D., Strogatz, S.: Collective dynamics of 'small-world' networks. Nature 393(6684), 440–442 (1998)
24. Wilson, C., Boe, B., Sala, A., Puttaswamy, K.P., Zhao, B.Y.: User interactions in social networks and their implications. In: Proceedings of the 4th ACM European Conference on Computer Systems, EuroSys 2009, pp. 205–218. ACM, New York (2009)
25. Yu, H., Kaminsky, M., Gibbons, P.B., Flaxman, A.: SybilGuard: defending against Sybil attacks via social networks. In: Proceedings of the 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, SIGCOMM 2006, pp. 267–278. ACM, New York (2006)
Context-Aware Nearest Neighbor Query on Social Networks

Yazhe Wang and Baihua Zheng

Singapore Management University
{yazhe.wang.2008,bhzheng}@smu.edu.sg
Abstract. Social networking has grown rapidly over the last few years, and social networks contain a huge amount of content. However, it is not easy to navigate social networks to find specific information. In this paper, we define a new type of query, namely the context-aware nearest neighbor (CANN) search over a social network, which retrieves the node nearest to the query node that matches the specified context. CANN considers both the structure of the social network and the profile information of the nodes. We design a hyper-graph based index structure to support approximated CANN search efficiently.
1 Introduction
Social network websites and applications have grown rapidly over the last few years. Take Facebook as an example: from an initial website used by Harvard students, it has grown into one of the most famous social networking websites, currently attracting more than 400 million active users worldwide [17]. Clearly, more and more people are using social networks to share ideas, activities, and interests with each other, and social networks contain a huge amount of content. However, it might not be easy to navigate social networks to find specific information. Consequently, we focus this paper on querying social networks. We model the social network as an undirected graph, and assume each node of the graph maintains some profile information. Take the co-authorship network G depicted in Fig. 1 as an example. Each node represents a researcher, and a link between two nodes states that those two researchers have collaborated at least once. Some descriptive information (e.g., name, profession, and research topics) is maintained for each node, as depicted in Fig. 1. A context-aware nearest neighbor (CANN) query is defined to search over a social network based on both the network structure and the profile information. It retrieves the node nearest to the query node that matches the specified context, as well as the shortest path between them. For example, Michael (i.e., node v3) may issue a CANN query Q1: "find me the shortest path to reach the nearest professor working in data mining".
We would like to acknowledge that this research/project was carried out at the Living Analytics Research Centre (LARC), sponsored by the Singapore National Research Foundation and the Interactive & Digital Media Programme Office, Media Development Authority.
Fig. 1. A collaboration social network G (nodes v1–v11, each labeled with a name, a profession — student, research fellow, or professor — and a research topic: t1 database management, t2 information retrieval, t3 data mining, t4 data privacy).
Here, the distance from the query node v3 to a node v is evaluated by the shortest path distance, and the context is represented by the keywords {professor, data mining}. The answer to Q1 is node v4 with its shortest path {v3, v4}. A CANN query considers both the network distance and the context, and it has a large application base. For example, researchers can issue CANN queries to find potential collaborators for new research, and employers can issue CANN queries to locate qualified employees for specific tasks. There are two naive approaches to CANN search. First, we can invoke a traditional shortest path search algorithm to visit nodes in ascending order of their distances to the query node until one that matches the queried keywords is found; we denote this the SPA-based approach. Second, we can employ well-known information retrieval techniques to locate all the nodes that match the queried keywords, and then order them by shortest path distance; we denote this the IR-based approach. However, both approaches are inefficient in terms of search performance and storage consumption. On the one hand, the SPA-based approach traverses the social network purely based on distance, not context, and hence might visit many unnecessary nodes before the answer node is found. On the other hand, the IR-based approach may find many nodes matching the queried keywords as intermediate results, and the ranking process based on the distances between the query node and all the intermediate nodes could be very costly. In addition, the inverted index used by the IR-based approach might take up a large amount of storage if the graph is big and/or the vocabulary of the context is large. Given that the exact evaluation of CANN queries is relatively expensive and some applications might be willing to trade accuracy for performance, we propose an approach, namely hyper-graph based approximation, that provides approximated results to CANN queries with high performance and a relatively low storage requirement. It exploits the power-law degree distribution that is characteristic of social networks, and identifies some nodes with very high degree (whose number will be small) as center nodes. It then partitions the social network into disjoint sub-graphs, each centering around a center node. Based on the assumption that a path linking a node in one sub-graph Gi to a node in another sub-graph Gj is very likely to pass through the corresponding center nodes of Gi and Gj, it builds a hyper-graph to index i) among sub-graphs, the shortest
paths between center nodes; and ii) within each sub-graph, the shortest paths between non-center nodes and the center node. Meanwhile, it attaches certain signature information, called a signature map, to each center node to facilitate search-space pruning based on the queried keywords. The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 defines CANN search and approximated CANN search. Section 4 presents the details of the hyper-graph based approximation technique. Section 5 analyzes the experimental results. Finally, Section 6 concludes this paper.
2 Related Work
In this section, we briefly review existing work related to CANN search, including (approximated) shortest path search, keyword queries, and the signature technique used in information retrieval.
2.1 Shortest Path Search and Keyword Query
The most well-known shortest path search algorithm on graphs is Dijkstra's algorithm [2]. It explores the graph in a best-first manner starting from the query node until the target node is reached. Faster solutions have been proposed that prune the graph exploration space based on domain-specific heuristics and pre-processing, such as A* search [5] and the reach-based method [6]. The algorithms mentioned above usually assume that the searched graph can be stored in main memory, and they do not scale very well to very large graphs. In recent years, efficient indexing techniques have been designed for shortest path search on large graphs. Some index techniques are designed based on partial pre-computation. For example, HEPV [10] and HiTi [11] build indexes by materializing local shortest paths of a number of disjoint subgraphs; the global shortest path is then obtained by combining selected local shortest paths. Recently, a novel tree-decomposition based graph index structure has been proposed [20], which supports efficient shortest path queries with an even smaller index size. Other works consider encoding all-pairs shortest paths of a graph in small-sized indexes. For instance, [19] proposes a quadtree-structured index utilizing the spatial coherence of the destination (or source and destination) nodes. The distance signature method [8] pre-computes the distance from each node v to a set of objects of interest, and maintains this information as a signature at v. The compact BFS-tree [21] is another example: it exploits symmetry properties of graphs to reduce the index size of all-pairs shortest paths, but it is only applicable to unweighted graphs. All these approaches support efficient shortest path search for given source and destination nodes. However, none of them considers the context of the nodes, or supports queries that do not specify the destination at query time. There are also techniques designed for approximated shortest path/distance queries. A spanner [1] is a subgraph obtained by deleting edges from the original graph. Due to its smaller size, a search performed on the spanner is much
faster. However, it is hard to decide which edges should be deleted in order to generate a good spanner in which the distances between nodes do not change substantially. Spanners perform worse on dense graphs with large girth. Distance labeling and embedding techniques [4,18] assign each node of a graph a label such that the (approximated) distance between two nodes can be computed directly from the corresponding labels. However, these approaches can only provide distance information, not the paths themselves. Keyword queries on graphs [7, 9, 12, 16] also consider both the distance and context information: they find closely connected clusters of nodes in the graph which contain specific keywords. Depending on the query semantics, the result of the query could be rooted trees or subgraphs embedded in the graph. Obviously, the definition of a keyword query is different from our CANN search.
2.2 Signature
Signature techniques have been studied extensively in information retrieval [13, 15]. A signature is basically an abstract of the keyword information of a data item. Given a set of keywords that index a data item i, the signature Si is typically formed by first hashing each keyword in the set into a bit string and then superimposing (i.e., bitwise-ORing, ∨) all these bit strings into a signature. Note that the size of the signature equals the size of the bit string. To decide whether a data item i matches/contains the query keywords Q, a query signature SQ is generated first, based on the same hash function. Thereafter, SQ is compared against the signature Si using bitwise AND (∧). The signatures match if, for every bit set in SQ, the corresponding bit in the compared signature Si is also set. If SQ does not match Si, then data item i does not match query Q. If a match happens, it could be a true match, where the data item is really what the query searches for, or it could be a false drop, where the data item in fact does not satisfy the search criteria.
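A minimal sketch of signature construction and matching is given below. The signature width, the number of bits set per keyword, and the MD5-based hash are illustrative choices of ours; real systems tune these parameters to control the false-drop rate.

```python
import hashlib

SIG_BITS = 64        # signature width (illustrative)
BITS_PER_KEY = 3     # bits set per keyword (illustrative)

def keyword_bits(keyword: str) -> int:
    """Hash one keyword into a bit string with BITS_PER_KEY bits set."""
    sig = 0
    for i in range(BITS_PER_KEY):
        h = hashlib.md5(f"{keyword}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(h[:4], "big") % SIG_BITS)
    return sig

def make_signature(keywords) -> int:
    """Superimpose (bitwise-OR) the keyword bit strings."""
    sig = 0
    for k in keywords:
        sig |= keyword_bits(k)
    return sig

def matches(item_sig: int, query_sig: int) -> bool:
    """Every bit set in the query signature must also be set in the item's."""
    return item_sig & query_sig == query_sig

item = make_signature({"professor", "data mining"})
print(matches(item, make_signature({"data mining"})))  # True
print(matches(item, make_signature({"biology"})))      # False (or a rare false drop)
```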
3 Problem Definition
In this section, we first describe the graph model of the social network, and then formally define the context-aware nearest neighbor (CANN) query and the approximated CANN (ACANN) query. In general, we model a social network as an undirected graph G(V, E), with V being the set of nodes and E being the set of edges. An edge e(vi, vj) ∈ E represents that nodes vi and vj are connected in the network. The weights of edges are captured by W; a non-negative weight w(vi, vj) ∈ W of edge e(vi, vj) ∈ E represents the strength of the linkage. In this paper, we assume that the context of each node vi ∈ V is maintained as a set of keywords, denoted as vi.k. The domain of keywords for a graph G is represented by L, with L = ∪_{vi ∈ V} vi.k. Given two nodes vi and vj of a graph G(V, E), a path and the shortest path connecting them are described in Definition 1.
Definition 1 (Path and Shortest Path). Given a social network G(V, E) and two nodes vi, vj ∈ V, a path P(vi, vj) connecting vi and vj sequentially passes nodes vp1, vp2, ..., vpm, denoted as P(vi, vj) = {vp0, vp1, vp2, ..., vpm, vpm+1}, with vp0 = vi and vpm+1 = vj. The length of P(vi, vj), denoted as |P(vi, vj)|, is Σ_{n=0}^{m} w(vpn, vpn+1). The shortest path SP(vi, vj) is the one with the smallest length among all the paths between vi and vj, and its length, denoted as ||vi, vj|| (= |SP(vi, vj)|), is the shortest distance between vi and vj.

Take the social network in Figure 1 as an example. Path P(v1, v3) = {v1, v9, v4, v3} is a path from v1 to v3 via nodes v9 and v4, and path P′(v1, v3) = {v1, v2, v3} is another one via v2. Assuming G(V, E) is an unweighted graph with ∀e(vi, vj) ∈ E, w(vi, vj) = 1, the path P′(v1, v3) is the shortest path between v1 and v3, i.e., SP(v1, v3) = {v1, v2, v3} and ||v1, v3|| = |SP(v1, v3)| = w(v1, v2) + w(v2, v3) = 2.

With vj.k capturing the context of vj, a CANN search locates the nearest node whose context matches the queried keywords, as given in Definition 2.

Definition 2 (Context-Aware Nearest Neighbor Search (CANN)). Given a graph G(V, E), a CANN search Q specifies a query node Q.v and a set of queried keywords Q.k, and asks for a shortest path P to a node vj ∈ V such that the context of vj matches the queried keywords and its distance to Q.v is the shortest among all the nodes with context matching Q.k. In other words, CANN(Q) = ⟨vj, P⟩ ⇒ vj.k ⊇ Q.k ∧ P = SP(Q.v, vj), and meanwhile there exists no vi ∈ V such that Q.k ⊆ vi.k ∧ ||Q.v, vi|| < |P|.

As the exact evaluation of a CANN query is relatively expensive, we focus in this paper on supporting an approximated CANN search, as defined in Definition 3.

Definition 3 (Approximated CANN Search (ACANN)). Given a graph G(V, E), an ACANN search Q specifies a query node Q.v and a set of queried keywords Q.k. It returns a path P to a node vj ∈ V such that the context of vj matches the queried keywords. However, it does not guarantee that i) vj is the nearest node that satisfies the query; or ii) P is the shortest path from Q.v to vj. The quality of the approximation is measured by the ratio of the length of the path returned by the ACANN search to that of the CANN query, i.e., |ACANN(Q).P| / |CANN(Q).P|.
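For reference, exact CANN can be answered by a plain Dijkstra expansion from Q.v that stops at the first dequeued node whose context covers Q.k; this is essentially the naive expansion baseline (Naive) evaluated in Section 5. A minimal sketch (ours; the dictionary-based graph representation is an illustrative assumption):

    import heapq

    def cann_exact(graph, contexts, qv, qk):
        """graph: {node: [(neighbor, weight), ...]}; contexts: {node: set of keywords}.
        Returns (answer node, distance, path from qv) or None if no node matches qk."""
        qk = set(qk)
        dist = {qv: 0}
        parent = {qv: None}
        heap = [(0, qv)]
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist.get(v, float("inf")):
                continue  # stale heap entry
            if qk <= contexts.get(v, set()):
                # Dijkstra dequeues nodes in non-decreasing distance order,
                # so the first matching node dequeued is the exact CANN answer.
                path = [v]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return v, d, path[::-1]
            for u, w in graph.get(v, []):
                nd = d + w
                if nd < dist.get(u, float("inf")):
                    dist[u] = nd
                    parent[u] = v
                    heapq.heappush(heap, (nd, u))
        return None

In the worst case this expansion visits the whole graph before a matching node is found, which motivates the index-based approximation developed next.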
4 Hyper-Graph Based Approximation
In this section, we present an index structure, namely the hyper graph, to support approximated CANN search. We first explain the basic idea of hyper-graph-index-based approximation, then present the structure of the hyper graph index and its construction algorithm, and finally explain the corresponding search algorithm.

4.1 Basic Idea
The idea of the hyper graph index comes from the intuition of how we search for information in a real social network. Usually, there are a small number of
important persons who have strong connections with people in their local social network. For example, Prof. Jiawei Han is a distinguished researcher in the data mining field. If a UIUC graduate wants to build a connection with another data mining researcher, it is very likely that Prof. Han can provide great help. Based on this observation, we first identify a small set of important persons as center nodes in the social network, and divide the social network into disjoint partitions Gi, each around one center node ci. We then employ the center node as the knowledge hub of its partition, i.e., each center node carries distance and context information of the nodes within its partition. We assume the center nodes serve as glue connecting nodes. In other words, a path linking nodes within a partition Gi will pass through the center node ci, and a path linking a node in partition Gi to a node in another partition Gj will pass through the center nodes ci and cj; that is, it is very likely that center nodes lie on the shortest paths between nodes. Consequently, we index the shortest paths from each node within a partition Gi to its center node, and the shortest paths between center nodes of neighboring partitions, forming the so-called hyper graph. With the help of the hyper graph, an ACANN query issued at a node v is first forwarded to the center node ci of the partition that covers v; ci then conducts a local search within its own partition. Meanwhile, ci expands the search to its neighboring partitions via an expanded search. The construction of the hyper graph, and the details of the local search as well as the expanded search, are presented in the following subsections.

4.2 Hyper Graph Index
The hyper graph index construction contains three steps, i.e., center node selection, network partition, and hyper graph formation, as detailed in the following.

Center Node Selection. There are multiple ways to select center nodes, such as random selection and betweenness-based selection. The former picks center nodes randomly, while the latter selects the nodes with the highest betweenness scores (the betweenness score of a node equals the number of shortest paths crossing it). However, random selection may pick nodes that do not lie on many shortest paths, and betweenness-based selection may suffer from very high computation cost. Consequently, we propose degree-based selection. The rationale is that in a social network, the persons with wide social connections tend to lie on many shortest paths linking different nodes. We evaluate these center node selection methods in Section 5.

Network Partition. Once the center nodes are fixed, we assign every other node to its nearest center node, as formally defined in Definition 4. In case a node shares the same distance to multiple center nodes, it is randomly assigned to one of them. Accordingly, we need to locate the shortest paths from each node to the center nodes. The graph partition can be computed in O(|E| + |V| + |V| log(|V|)) time using the algorithm proposed in [3]; a minimal sketch of this assignment step is given below.
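The following sketch (ours, not the authors' implementation; it runs one multi-source Dijkstra from all centers instead of the graph Voronoi algorithm of [3], which yields the same assignment) picks centers by degree and records, for every node, its nearest center, the distance to it, and the next hop toward it:

    import heapq
    from itertools import count

    def select_centers(graph, r):
        """Degree-based selection: the r nodes with the highest degrees."""
        return sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:r]

    def partition(graph, centers):
        """Multi-source Dijkstra over graph = {node: [(neighbor, weight), ...]}.
        Returns {node: (center, distance, next_hop)}, where next_hop is the first
        node on the shortest path from the node toward its assigned center."""
        best = {}
        tie = count()  # tie-breaker so the heap never compares node payloads
        heap = [(0, next(tie), c, c, None) for c in centers]
        heapq.heapify(heap)
        while heap:
            d, _, v, center, next_hop = heapq.heappop(heap)
            if v in best:
                continue  # v was already assigned via a shorter (or equal) path
            best[v] = (center, d, next_hop)
            for u, w in graph.get(v, []):
                if u not in best:
                    # From u, the first hop toward the center goes through v.
                    heapq.heappush(heap, (d + w, next(tie), u, center, v))
        return best

The ⟨ci, vnext⟩ two-tuple vectors stored at the non-center nodes (see the hyper graph formation step below) fall directly out of this computation.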
Definition 4 (Network Partition). Given a social network G(V, E) and a set of center nodes C = {c1, c2, ..., cr} with C ⊂ V, a network partition PG = {G1(VG1, EG1), G2(VG2, EG2), ..., Gr(VGr, EGr)} is a set of subgraphs Gi such that i) ∀ci ∈ C, ci ∈ VGi; ii) ∀v ∈ V, ∃Gi ∈ PG, v ∈ VGi; iii) ∀v ∈ VGi ∧ ∀j(≠ i) ∈ [1, r], ||v, ci|| ≤ ||v, cj||; iv) ∀v, v′(v ≠ v′) ∈ VGi, if e(v, v′) ∈ E, then e(v, v′) ∈ EGi; and v) ∀i, j(i ≠ j) ∈ [1, r], VGi ∩ VGj = ∅ ∧ EGi ∩ EGj = ∅ ∧ ∪_{1≤i≤r} EGi ⊆ E.

Hyper Graph Formation. As explained before, ACANN search consists of a local search and an expanded search. To support the local search, within each partition Gi we store the shortest path from each non-center node v to the center node ci via a two-tuple vector ⟨ci, vnext⟩. Here, the shortest path from v to ci is identified during the network partition process, and vnext is the next-hop node on SP(v, ci). In addition, to support space pruning based on the queried keywords, each center node ci maintains signatures representing the context of the nodes within its partition, via the signature map, denoted as ci^map. To be more specific, within each partition Gi, we order the non-center nodes by their distances to the center node ci and cluster them into groups. For each group, a signature is generated by superimposing the signatures of the contexts of the nodes within the group. Thereafter, when a search reaches a center node ci, we compare the queried keywords with the signatures of ci's groups, and examine the nodes within a group only when its signature indicates a match. Obviously, how the nodes are clustered into groups affects the search efficiency. In general, given a hash function for signature generation (i.e., a fixed signature size), the more nodes are clustered into a group, the higher the false drop rate is. In this work, we pre-define a false drop rate threshold γ (e.g., 0.01) and determine the maximal number of distinct keywords, denoted as η, that can be represented by a signature with an approximate false drop rate bounded by γ, based on Equation (1) [14], where |sig| is the length of the signature:

    η = (|sig| · (log_e 2)^2) / (−log_e γ)    (1)
The clustering algorithm then works as follows. First, all the nodes vj within a partition Gi are sorted in ascending order of their shortest distances to ci and maintained in a queue Que. Next, we dequeue the head node vj from Que, insert vj into a set S, and check the total number of distinct keywords associated with the nodes in S, denoted as ϕ. There are three cases. Case (i) ϕ > η: all the nodes in S except vj form a new group gl, and S is reset to {vj}; Case (ii) ϕ = η: all the nodes in S form a new group gl, and S is reset to ∅; and Case (iii) ϕ < η: no action. This process continues until Que is empty. The notation ci^map[l] is used to represent the signature map entry for the l-th group gl w.r.t. the center node ci, in the format ⟨sig, dis, nodes⟩. Here, ci^map[l].sig is the signature generated from all the keywords associated with nodes within the group gl; ci^map[l].dis is the lower bound of the shortest distance from any node within group gl to the center node ci (i.e., ∀vj ∈ gl, ||vj, ci|| ≥ ci^map[l].dis ∧ ∃v′ ∈ gl, ||v′, ci|| = ci^map[l].dis); and ci^map[l].nodes records all the nodes within group gl.
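A sketch of this grouping step (ours; it reuses the item_signature helper from the signature sketch in Section 2.2, assumes the sorted node list produced by the partition step, and closes a group just before η would be exceeded, a slight simplification of the three cases above):

    def build_signature_map(nodes_by_dist, keywords, eta):
        """nodes_by_dist: [(node, dist_to_center), ...] sorted ascending;
        keywords: {node: set of keywords}; eta: max distinct keywords per signature.
        Returns a list of groups, each a dict with keys sig, dis, nodes."""
        groups, members, kw_pool = [], [], set()
        for v, d in nodes_by_dist:
            if members and len(kw_pool | keywords.get(v, set())) > eta:
                # Adding v would exceed eta: close the current group without v.
                groups.append({"sig": item_signature(kw_pool),
                               "dis": members[0][1],  # first (closest) node's distance
                               "nodes": [n for n, _ in members]})
                members, kw_pool = [], set()
            members.append((v, d))
            kw_pool |= keywords.get(v, set())
        if members:
            groups.append({"sig": item_signature(kw_pool),
                           "dis": members[0][1],
                           "nodes": [n for n, _ in members]})
        return groups

Because the nodes arrive in ascending distance order, the dis field of each group is exactly the lower bound required by the pruning condition used later in Algorithm 1.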
Fig. 2. An example of the hyper graph index. (The figure shows the network partitioned around center nodes v4, v7, and v9, indicated by dashed circles; each center node vi carries a signature map vi.map with columns dis, sig, and nodes.)
In order to support the expanded search, we pre-compute the shortest paths between two center nodes whose partitions are adjacent. Two partitions Gi, Gj are adjacent if there is an edge in G that connects a node in Gi to a node in Gj. We then build the hyper graph, which includes all the center nodes as vertices and the shortest paths between center nodes of adjacent partitions as edges.

Definition 5 (Hyper Graph). Given a social network G(V, E) and a set of center nodes C = {c1, c2, ..., cr}, the hyper graph GH(VH, EH) consists of the set of center nodes and the connections between those center nodes whose corresponding partitions are adjacent, i.e., VH = C, and EH = ∪_{Gi, Gj adjacent ∧ |SP(ci, cj)| ≠ ∞} e(ci, cj) with w(ci, cj) = |SP(ci, cj)|.

An example of the hyper graph index is depicted in Fig. 2. Assume the number of center nodes is three. Using degree-based selection, nodes v4, v7, and v9, which have the top-three maximal degrees, are selected as the center nodes. Thereafter, the network partition takes place: each non-center node is attached to its nearest center node, as demonstrated by the dashed circles in Fig. 2. Once the social network is partitioned, we proceed to form the hyper graph. As all the partitions are adjacent, the hyper graph is actually a complete graph with vertices VH = C = {v4, v7, v9} and edges EH = {e(v4, v7), e(v7, v9), e(v4, v9)}. The content of each center node's signature map is also depicted. Take center node v7 as an example. Its partition has three nodes, sorted in ascending order of their shortest distances to v7. Suppose each signature contains up to four keywords (i.e., η = 4). Nodes v6 and v7 are clustered into the first group, and node v8 is clustered into the second group. For each group, the signature is formed by superimposing the signature of each node, and the distance is set to the shortest distance from the first node of the group to v7.
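Hyper graph formation itself is then mechanical; a sketch (ours) that derives partition adjacency from the assignment computed by partition() above and connects the centers of adjacent partitions with shortest-path weights:

    def build_hyper_graph(graph, assign, shortest_dist):
        """assign: output of partition(), {node: (center, dist, next_hop)};
        shortest_dist: callable (ci, cj) -> |SP(ci, cj)| in the original graph G,
        returning float('inf') when no path exists.
        Returns {ci: [(cj, weight), ...]}: the hyper graph G_H over the centers."""
        adjacent = set()
        for v, nbrs in graph.items():
            for u, _ in nbrs:
                ci, cj = assign[v][0], assign[u][0]
                if ci != cj:
                    adjacent.add(frozenset((ci, cj)))  # partitions Gi and Gj touch
        gh = {}
        for pair in adjacent:
            ci, cj = tuple(pair)
            d = shortest_dist(ci, cj)
            if d != float("inf"):
                gh.setdefault(ci, []).append((cj, d))
                gh.setdefault(cj, []).append((ci, d))
        return gh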
4.3 Approximated Search Algorithm
The hyper-graph-based ACANN search assumes that a path from a node v within a partition Gi to a node v′ within a partition Gj (i ≠ j) must pass through the corresponding center nodes ci, cj, i.e., a center node serves as the only entrance to and exit from its partition. To be more specific, a path from v to v′ consists of three path segments: the one from v to ci, the one from ci to cj, and the one from cj to
v′. Algorithm 1 lists the pseudo code of the ACANN search. For an ACANN query Q issued at node Q.v, if the query node matches the queried keywords Q.k, the search terminates (lines 2-3). Otherwise, we locate the center node cq that covers Q.v via Q.v's two-tuple vector ⟨ci, vnext⟩, with d being the shortest distance from Q.v to cq. We then enqueue cq into Que, a priority queue maintaining the center nodes of those partitions that might deserve examination (line 6). All the entries in Que are two-tuple vectors ⟨ci, ||ci, cq||⟩, ordered in ascending order of the distance between center node ci and cq.
Algorithm 1: ACANN Search based on Hyper Graph Index

Input: a social network G(V, E) with corresponding context L and weight W, a hash function H, the hyper graph GH(VH, EH), an ACANN query Q
Output: the approximated answer node vans, dans, and Pans

 1  vans = ∅, dans = ∞;
 2  if Q.k ⊆ Q.v.k then
 3      return vans = Q.v, dans = 0, Pans = {Q.v};
 4  for each ci ∈ VH do dci = ∞;
 5  cq = Q.v.ci; d = ||cq, Q.v||;
 6  Que = {⟨cq, 0⟩};
 7  while Que is not empty do
 8      ⟨ci, ||cq, ci||⟩ = dequeue(Que);
 9      if (d + ||cq, ci||) ≥ dans then
10          return vans, dans, Pans;
11      for each ci^map[l] ∈ ci^map do
12          if (d + ||cq, ci|| + ci^map[l].dis) ≥ dans then
13              break;
14          else if H(Q.k) ∧ ci^map[l].sig = H(Q.k) then
15              for each vj ∈ ci^map[l].nodes do
16                  if Q.k ⊆ vj.k and (d + ||cq, ci|| + ||ci, vj||) < dans then
17                      vans = vj; dans = d + ||cq, ci|| + ||ci, vj||;
18                      Pans = append(SP(Q.v, cq), P(cq, ci), SP(ci, vj));
19      for each neighboring node cn of ci in GH do
20          if dci + ||ci, cn|| < dcn then
21              enqueue(cn, dci + ||ci, cn||); P(cq, cn) = append(P(cq, ci), e(ci, cn));
22              dcn = dci + ||ci, cn||;
Thereafter, we continuously dequeue the head entry from Que until it becomes empty. Every time a head entry ⟨ci, ||ci, cq||⟩ is dequeued, the lower bound of the approximated distance from Q.v to any node in the partition Gi centered at ci (i.e., d + ||ci, cq||) is compared against the approximated distance dans from Q.v to the current answer node. If the lower bound is larger than dans, the partition Gi can be safely discarded. Similarly, all the remaining entries in Que, which have even larger ||ci, cq||
values, are pruned away, terminating the search (lines 9-10). Otherwise, partition Gi needs examination. We use ci^map to filter out unnecessary nodes. The first filtering condition is based on distance. We calculate (d + ||ci, cq|| + ci^map[l].dis), the lower bound of the approximated distance from a node in ci^map[l].nodes to Q.v. If it is larger than dans, there is no need to examine the nodes within this l-th group or the following groups (lines 12-13). The second filtering condition is based on the context. We can safely discard ci^map[l].nodes if ci^map[l].sig does not match the query context Q.k. If the l-th group is not filtered out by the previous two conditions, we need to examine the nodes in this group one by one, and update the answer when nodes vj ∈ ci^map[l].nodes that match the search context are found (lines 14-18). Up to this point, we have examined the partition centered at ci, i.e., the local search is finished. We then start the expanded search by inserting all the unexamined neighboring center nodes of ci in GH into Que for further examination (lines 19-22).
5 Experiments
In this section, we report the experimental evaluation. First, we evaluate various center node selection schemes for the hyper graph index construction. Next, we test the performance of the hyper-graph-index-based ACANN search, including pre-processing time, storage overhead, query time, and approximation quality. Two real social network datasets are used: dblp and gamma. The former is extracted from DBLP (http://dblp.uni-trier.de/xml/). We sample dblp graphs with the number of nodes varying from 0.5K to 8K. For each node, we extract 20 keywords from papers published by the author as its context. The latter is provided by MyGamma, a mobile social networking service provider (http://m.mygamma.com/). We sample gamma graphs with the node number varying from 10K to 20K. Each node has on average 10 keywords, including the user's nickname, race, country, and so on, extracted from the user's profile. For both datasets, the graphs are unweighted (i.e., the weight of every edge is 1). We implemented all the evaluated schemes in C++, running on an AMD 2.4 GHz dual-processor server with 4 GB RAM. In addition, the false drop rate γ is set to 0.01 and the signature size |sig| is set to 128 in our implementation. Due to space limitations, we skip some results w.r.t. gamma that share similar trends with dblp.

5.1 Evaluating Center Node Selection Schemes
As mentioned in Section 4, there are three center node selection schemes: random selection, betweenness-based selection, and degree-based selection, denoted as Random, Betweenness, and Degree, respectively. In the first set of experiments, we compare the performance of these three approaches in terms of selection time and quality of approximation. The test results on a 5K-node dblp graph are reported in Fig. 3 as a representative. Fig. 3(a) shows the selection time when the number of center nodes,
Fig. 3. Performance of the center node selection schemes (dblp, |V| = 5K): (a) selection time; (b) approximation quality.
presented as a percentage of the dataset size, changes. As we can see, Random is the most efficient in terms of selection time. Degree takes more time than Random, but is still efficient. However, Betweenness is very time consuming due to the high cost of computing node betweenness. Fig. 3(b) reports the approximation quality of ACANN under the different schemes. The approximation quality is measured by |ACANN(Q).P| / |CANN(Q).P|, as defined in Definition 3. We run 200 random queries, each with 1 to 5 keywords randomly selected from the keyword vocabulary, and report the average result. As shown in the figure, Random leads to very inaccurate results, while Betweenness offers the highest quality. The result of Degree is very close to that of Betweenness. Considering both the center node selection time and the approximation accuracy, we set Degree as the default center node selection approach in the following evaluation.

5.2 Performance of the ACANN Search Algorithm
Next, we evaluate the performance of ACANN search with the help of the hyper graph index. Two algorithms are implemented as baselines in our evaluation. One is the naive SPA-based approach introduced in Section 1, referred to as Naive. Starting from the query node, it explores the graph based on distance and does not rely on any index structure. The other is based on pre-computed all-pairs shortest paths, referred to as AllPath. In AllPath, for each node v in G, we construct a signature map for each of its neighboring nodes nvi, as described in Section 4.2. The signature map summarizes the context of the nodes u which can be reached from v via nvi (i.e., the shortest path from v to u passes through nvi). When a query is performed on v, the signature map can efficiently direct the search towards the potential result nodes whose context actually matches the query. We also implement the hyper graph index method, referred to as HGI.

Pre-processing Time. First, we evaluate the pre-processing time of the different approaches vs. the size of the datasets, as reported in Fig. 4(a). Note that Naive does not require any index structure and hence is not reported. It is observed that as the graph size grows, the index construction time increases as well. AllPath takes longer construction time due to the need of computing all-pairs shortest paths, and its construction time increases sharply with the increase of
Fig. 4. Performance vs. dataset size (dblp, 5% center nodes): (a) pre-processing time; (b) storage cost.

Fig. 5. HGI performance vs. # center nodes (dblp, |V| = 5K): (a) pre-computation cost; (b) storage cost.
graph size. On the other hand, HGI takes much shorter construction time, and hence the hyper-graph-based algorithm has better scalability. We also report the pre-processing time of HGI with various numbers of center nodes selected, as depicted in Fig. 5(a). Generally, when the number of center nodes increases, the index construction time increases.

Storage Costs. Next, we evaluate the storage costs of the various approaches in Fig. 4(b). Notice that Naive does not require any index. For the other methods, we record the storage space taken by the social network and the corresponding indexes. We observe that for both datasets the storage cost increases as the graph size grows, and HGI takes up much less space than AllPath. In addition, compared with Naive, the extra space consumed by HGI is smaller than 5% for both datasets. The storage cost of the hyper graph index is also affected by the number of center nodes selected. As shown in Fig. 5(b), the more center nodes are selected, the larger the hyper graph index is.

Query Time. The query performance is evaluated by the query time and the approximation quality. We first test the query time of the different approaches on different sizes of graphs, as reported in Fig. 6. Generally, Naive performs the worst, especially on large graphs. This is because it has to visit a large number of nodes, leading to extremely long processing time. On the other hand, AllPath and HGI both significantly shorten the query time by pre-computing certain information. For the dblp graphs, HGI even takes shorter query time than AllPath. This is probably because there are more nodes in a dblp graph whose contexts match the query keywords, so it takes more time for AllPath to filter out
Fig. 6. Query time vs. dataset size (5% center nodes): (a) dblp; (b) gamma.

Fig. 7. Approximation quality vs. dataset size (5% center nodes): (a) dblp; (b) gamma.
the non-result nodes based on distance. For the gamma graphs, HGI in most cases incurs query time similar to AllPath. Then, we fix the graph size, change the number of center nodes selected, and report the impact on the query time of HGI in Fig. 8. Similar to the previous observation, the more center nodes are selected, the larger the index is, and thus the longer the search time is.

Approximation Quality. We then evaluate the approximation quality of the ACANN search under the hyper graph index. First, we study the impact of dataset size on the approximation quality of HGI, as depicted in Fig. 7. For the dblp datasets, the approximated shortest path returned by HGI is on average about 0.3 times longer than the real shortest path, as shown in Fig. 7(a). Given that shortest distances between nodes of the dblp/gamma datasets are short (usually less than 5 for the dblp datasets, and around 3 for the gamma datasets), the approximated shortest paths are usually only one or at most two steps longer than the real shortest paths. Consequently, for applications with high demands on search performance, our ACANN search algorithm can provide considerably good approximations with fast response time. We further study the impact of the number of center nodes selected on the approximation quality of HGI, as reported in Fig. 9. Again, as observed from the results, the more center nodes are selected, the better the approximation quality of HGI is. This is because when more center nodes are selected, the graph is partitioned into finer partitions. Consequently, each partition contains fewer non-center nodes, and the average distance from a non-center node to its nearest center node is shorter.
Fig. 8. HGI query time vs. # center nodes (γ = 0.01, |sig| = 128): (a) dblp (|V| = 5,000); (b) gamma (|V| = 16,000).

Fig. 9. Approximation quality vs. # center nodes: (a) dblp (|V| = 5,000); (b) gamma (|V| = 16,000).
To sum up, we evaluated the pre-processing time, storage overhead, query time, and approximation quality of the HGI method. The results demonstrate that HGI has relatively low pre-processing and storage overhead at the cost of some query accuracy; the average error factor, however, is less than 1.3.
6 Conclusion
Motivated by the rapid growth of social networking, we formulate in this paper a new type of query over social networks, namely the context-aware nearest neighbor (CANN) search. It returns a node that is closest to the query node and whose context matches the query condition. A hyper graph index structure is designed to support approximate CANN search. Extensive evaluations show that the hyper-graph-based approach provides relatively accurate results with low pre-processing and storage overhead.
References

1. Cohen, E.: Fast algorithms for constructing t-spanners and paths with stretch t. SIAM J. Comput. 28, 210–236 (1999)
2. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numerische Mathematik 1(1), 269–271 (1959)
3. Erwig, M.: The graph Voronoi diagram with applications. Networks 36(3), 156–163 (2000)
4. Gavoille, C., Peleg, D., Pérennes, S., Raz, R.: Distance labeling in graphs. Journal of Algorithms 53(1), 85–112 (2004)
5. Goldberg, A.V., Harrelson, C.: Computing the shortest path: A* search meets graph theory. In: SODA, pp. 156–165 (2005)
6. Gutman, R.: Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In: ALENEX, pp. 100–111 (2004)
7. He, H., Wang, H., Yang, J., Yu, P.S.: BLINKS: ranked keyword searches on graphs. In: SIGMOD, pp. 305–316 (2007)
8. Hu, H., Lee, D.L., Lee, V.C.S.: Distance indexing on road networks. In: VLDB, pp. 894–905 (2006)
9. Hulgeri, A., Nakhe, C.: Keyword searching and browsing in databases using BANKS. In: ICDE, pp. 431–443 (2002)
10. Jing, N., Huang, Y.-W., Rundensteiner, E.A.: Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation. TKDE 10(3), 409–432 (1998)
11. Jung, S., Pramanik, S.: An efficient path computation model for hierarchically structured topographical road maps. TKDE 14(5), 1029–1046 (2002)
12. Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., Karambelkar, H.: Bidirectional expansion for keyword search on graph databases. In: VLDB, pp. 505–516 (2005)
13. Lee, D., Leng, C.: Partitioned signature file: Design considerations and performance evaluation. TOIS 7(2), 158–180 (1989)
14. Lee, D.L., Kim, Y.M., Patel, G.: Efficient signature file methods for text retrieval. TKDE 7(3), 423–435 (1995)
15. Leng, C., Lee, D.: Optimal weight assignment for signature generation. TODS 17(2), 346–373 (1992)
16. Li, G., Feng, J., Chin Ooi, B., Wang, J., Zhou, L.: An effective 3-in-1 keyword search method over heterogeneous data sources. Inf. Syst. 36, 248–266 (2011)
17. Burcher, N.: http://www.nickburcher.com/2010/03/facebook-usage-statistics-march-2010.html
18. Peleg, D.: Proximity-preserving labeling schemes. J. Graph Theory 33, 167–176 (2000)
19. Samet, H., Sankaranarayanan, J., Alborzi, H.: Scalable network distance browsing in spatial databases. In: SIGMOD, pp. 43–54 (2008)
20. Wei, F.: TEDI: efficient shortest path query answering on graphs. In: SIGMOD, pp. 99–110 (2010)
21. Xiao, Y., Wu, W., Pei, J., Wang, W., He, Z.: Efficiently indexing shortest paths by exploiting symmetry in graphs. In: EDBT, pp. 493–504 (2009)
Using Tag Recommendations to Homogenize Folksonomies in Microblogging Environments

Eva Zangerle, Wolfgang Gassler, and Günther Specht

Databases and Information Systems, Institute of Computer Science, University of Innsbruck, Austria
[email protected]
Abstract. Microblogging applications such as Twitter are experiencing tremendous success. Twitter users use hashtags to categorize posted messages, which aims at bringing order to the chaos of the Twittersphere. However, the percentage of messages including hashtags is very small, and the hashtags used are very heterogeneous, as hashtags may be chosen freely and may consist of any arbitrary combination of characters. This heterogeneity and the sparse use of hashtags lead to significant drawbacks for search functionality, as messages are not categorized in a homogeneous way. In this paper we present an approach for recommending hashtags suitable for the tweet the user is currently entering, which aims at creating a more homogeneous set of hashtags. Furthermore, users are encouraged to use hashtags at all, as they are provided with suitable recommendations.
1 Introduction
Microblogging has become immensely popular over the last years. Twitter, the most successful platform for microblogging, is experiencing tremendous popularity on the web. Essentially, microblogging allows users to post messages on the Twitter platform which are at most 140 characters long. These posted messages, also known as tweets, are available to the public. Users are able to "follow" other users, which basically means that if user A follows user B (the followee), user A subscribes to the feed of tweets of user B. These messages are then added to the user's timeline (an overview of his own tweets and the tweets of his followees), which enables him to always be up-to-date with the followees' tweets. Considering the fact that currently about 140,000,000 Twitter messages are posted every day, it becomes clear that the posted data is very diverse and heterogeneous. Therefore, Twitter users themselves started to manually categorize and classify their tweets: they started to use so-called hashtags as part of the message. The only requirement for a hashtag is that it has to be preceded by a hash symbol #, as e.g. in the hashtags #apple, #elections or #obama. There are no further restrictions regarding the syntax or semantics of hashtags, which makes them a very convenient, easy-to-use way of categorizing
tweets. Most importantly, hashtags can be used for searching messages and for following a certain thread or topic, and they therefore mark a set of tweets focusing on a certain topic described by the hashtag. Hence, the use of appropriate hashtags is crucial for the popularity of a message in terms of how quickly messages concerning a certain topic can be found. Hashtags can thus also be seen as a way to give a certain amount of "context" to a tweet. However, choosing the best hashtags for a certain message can be a difficult task. Hence, users often feel forced to use multiple hashtags having the same meaning (synonyms); e.g., for tweets regarding the SocInfo conference, one could use #socinfo, #socinfo2011 and #socinfo11. The usage of multiple synonymous hashtags decreases the possible length of the actual content of the tweet, as only 140 characters including hashtags are allowed per tweet. Furthermore, the usage of synonyms also motivates other users to cram their messages with hashtags to cover as many searches as possible. To avoid such a proliferation of hashtags, hashtags concerning a certain event, for example, are often predefined and propagated to all its participants in order to ensure that the hashtags used for tweets regarding this event are homogeneous. This often leads event organizers (e.g. of conferences) to announce an "official" hashtag. E.g., Tim O'Reilly (@timoreilly) posted on 2011-03-05: "At Wired Disruptive by Design conference, no hashtag announced. Hmmm..". Such scenarios could easily be avoided if the tag vocabulary of the folksonomy were kept homogeneous, which basically implies that no synonymous hashtags are used. In this paper we present an approach aiming at supporting the user and creating a more homogeneous set of hashtags within the Twittersphere by employing a recommender system that suggests suitable hashtags to the users. We show how such hashtag recommendations can be computed and demonstrate that this approach is able to provide the user with suitable hashtag recommendations. The remainder of this paper is organized as follows. Section 2 outlines the characteristics of the data set underlying our evaluations. Section 3 is concerned with the algorithms underlying our approach. Section 4 features the evaluation of our approach, and Section 5 describes work closely related to ours. The paper concludes with final remarks in Section 6.
2 Used Dataset for Recommendations
The approach presented in this paper and its evaluation are based on an underlying data set of tweets which is used to compute the hashtag recommendations. As there are no large Twitter datasets publicly available, we had to crawl tweets in order to build up such a database. The crawling of Twitter data has been constrained significantly by the abolishment of so-called whitelisting, which allowed users to query the Twitter API without any restrictions. Currently, the Twitter API only allows 350 requests per hour, with each call returning about 100 tweets on average. The dataset was crawled using the search API. As input for these search calls, we made use of an English dictionary consisting of more than 32,000 words. We used each of these words as input for the search process
Table 1. Overview of the Tweet Data Set

Characteristic                               Value       Percentage
Crawled messages total                       18,731,880  100%
Messages containing at least one hashtag      3,753,927   20%
Messages containing no hashtags              14,977,953   80%
Retweets                                      2,970,964   16%
Direct messages                               3,565,455   19%
Hashtag usages total                          5,968,571    –
Hashtags distinct                               585,140    –
Average number of hashtags per message           1.5932    –
Maximum number of hashtags per message               23    –
Hashtags occurring < 5 times in total           502,172    –
Hashtags occurring < 3 times in total           452,687    –
Hashtags occurring only once                    377,691    –
and stored the search results. This strategy enabled us to crawl about 18 million tweets between July 2010 and April 2011. Only 20% of these messages contained hashtags. Further details about the characteristics of the data set can be found in Table 1.
3 Hashtag Recommendations
The recommendation of hashtags supports the user during the process of creating a new tweet. While the user is typing, hashtags appropriate for the already entered message are computed on the fly. With every new keystroke, the recommendations are recomputed and refined. Due to the fact that both the cognition of the user and the space available for displaying the recommendations are limited, the size of the displayed set of suggested hashtags is restricted. In most cases a set of 5-10 recommendations is most appropriate, which also corresponds to the capacity of short-term memory (Miller, 1956). Therefore the top-k recommendations are shown to the user, where k denotes the size of the set of recommended hashtags. The value of k was chosen between 1 and 10 in our evaluation. For a given tweet (or part of it), the computation of recommendations for suitable hashtags based on the underlying data set comprises the following steps, which are illustrated in Figure 1 and sketched in code after the list.

1. For the given input tweet (or a part of it), retrieve the most similar messages featuring hashtags from the data set.
2. Extract the hashtags contained in the top-k similar messages. These hashtags constitute the hashtag recommendation candidate set.
Fig. 1. Workflow: Hashtag Recommendation Computation (user enters message → retrieve most similar messages → retrieve set of hashtags → apply ranking to the set of hashtags → top-k hashtag recommendations)
3. Rank the recommendation candidates computed in step 2 according to the ranking methods proposed in this paper.
4. Present the top-k ranked hashtags to the user.

These steps are described in detail in the following sections.
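As a hedged end-to-end sketch of this workflow (ours; the helpers most_similar, extract_hashtags, and rank are placeholders for the components detailed in Sections 3.1 and 3.2):

    def recommend_hashtags(input_text, dataset, k=5, candidates=500):
        """Return the top-k hashtag recommendations for a (partial) tweet."""
        # Step 1: retrieve the most similar hashtagged messages from the data set.
        similar = most_similar(input_text, dataset, limit=candidates)
        # Step 2: the hashtags of these messages form the candidate set.
        cands = extract_hashtags(similar)
        # Step 3: rank the candidates (see the ranking methods in Section 3.2).
        ranked = rank(input_text, similar, cands)
        # Step 4: present only the top-k to the user.
        return ranked[:k]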
3.1 Similarity of Messages
Retrieving the set of k most similar messages to the input (query) tweet is the first step in computing recommendations. The similarity between the input tweet and the messages within the data set is computed as the cosine similarity of the tf-idf weighted term vectors. The messages within the data set are ranked according to this similarity measure, and the top-k messages (k = 500 in our evaluations) are used for the further computation of recommendations, as these most similar messages are most likely to contain suitable hashtags for the current input message. The hashtags contained in these messages are therefore extracted; they are referred to as hashtag recommendation candidates throughout the remainder of this paper.
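A self-contained sketch of this retrieval step (ours; a real deployment would use an inverted index such as Lucene, as the prototype in Section 4 does, and the whitespace tokenization is an illustrative simplification):

    import math
    from collections import Counter

    def tfidf_vector(tokens, df, n_docs):
        """tf-idf weighted term vector of a token list, given document frequencies."""
        tf = Counter(tokens)
        return {t: c * math.log(n_docs / (1 + df.get(t, 0))) for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def most_similar(input_text, dataset, limit=500):
        """dataset: list of (tokens, hashtags) pairs for tweets carrying hashtags."""
        df = Counter(t for tokens, _ in dataset for t in set(tokens))
        n = len(dataset)
        q = tfidf_vector(input_text.split(), df, n)
        scored = [(cosine(q, tfidf_vector(tokens, df, n)), tokens, tags)
                  for tokens, tags in dataset]
        scored.sort(key=lambda x: x[0], reverse=True)
        return scored[:limit]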
3.2 Ranking
The ranking of the hashtag recommendation candidates is a crucial part of the recommendation process, as only the top-k (with k between 5 and 10) hashtags are shown to the user. Therefore, we propose four basic ranking methods for the recommendation of hashtags. These ranking methods are based either on the hashtags themselves (TimeRank, RecCountRank, PopRank) or on the messages in which the hashtags are embedded (SimRank).

– SimRank (1): this ranking method is based on the similarity values between the input tweet tinput and the tweets in CT containing the hashtag recommendation candidates. The cosine similarity values between the input tweet and these candidate tweets are used directly for ranking the recommendation candidates.
– TimeRank (2): this ranking method considers the recency of the usage of the hashtag recommendation candidates. The more recently a certain hashtag has been used, the higher its ranking. This ranking enables the detection and prioritization of currently trending hashtags (most probably about trending topics) which have been used only recently.
– RecCountRank (3): the recommended-count rank is based on the popularity of hashtags within the hashtag recommendation candidate set. This basically means that the more similar messages contain a certain hashtag, the more suitable that hashtag may be.
– PopRank (4): the popularity rank is based on the global popularity of hashtags within the whole underlying data set. As only a few hashtags are used at a high frequency, it is likely that such a popular hashtag matches the tweet entered by the user. Therefore, ranking the globally most popular hashtags from the candidate set higher is also a suitable approach.

The ranking methods are formally described in the following equations, where T is the crawled data set containing all tweets and CT is the candidate set consisting of all top-k tweets regarding the similarity measure to the input string. CH denotes the set of all hashtags extracted from the set CT. The function contains(t, h) returns 1 if the specified hashtag h is present in the specified message t and 0 if it cannot be found in the message text. The function now() returns the current UNIX timestamp, and createdAt(t) corresponds to the timestamp at which the respective tweet t was created.

    sim(tinput, tc) = (V(tinput) · V(tc)) / (‖V(tinput)‖ ‖V(tc)‖)   for each tc ∈ CT,   (1)

where V(tinput) and V(tc) are the weighted term vectors of tinput resp. tc.

    timeDiff(tc) = now() − createdAt(tc)   for each tc ∈ CT   (2)

    recCount(h) = Σ_{tc ∈ CT} contains(tc, h)   (3)

    pop(h) = Σ_{ti ∈ T} contains(ti, h)   (4)
After the computation of the sim, timeDiff, recCount and pop values, all suitable hashtag candidates in the set CH are ranked in descending order to compute the final ranking. Besides these basic ranking algorithms, we propose hybrid ranking methods which are based on the presented basic ranking algorithms. The combination of two ranking methods is computed by the following formula:

    hybrid(r1, r2) = α · r1 + (1 − α) · r2   (5)
where α is the weight coefficient determining the weight of the respective ranking within the hybrid rank. r1 and r2 are normalized to the range [0, 1] and can therefore be combined into a hybrid rank.
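The four base ranks and the hybrid combination can be sketched as follows (ours; tweet objects with created_at and hashtags fields are an assumed representation, and min-max scaling is one plausible reading of the normalization to [0, 1]):

    import time

    def normalize(scores):
        """Min-max normalize a {hashtag: score} dict to [0, 1]."""
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {h: (s - lo) / span for h, s in scores.items()}

    def base_ranks(candidates, similar, global_counts):
        """similar: [(sim_value, tweet), ...] from the retrieval step;
        global_counts: {hashtag: occurrences in the whole data set T}."""
        sim = {h: max(s for s, t in similar if h in t.hashtags) for h in candidates}
        # Negated timeDiff, so that more recent usage yields a higher score.
        recency = {h: -min(time.time() - t.created_at
                           for s, t in similar if h in t.hashtags) for h in candidates}
        rec_count = {h: sum(h in t.hashtags for s, t in similar) for h in candidates}
        pop = {h: global_counts.get(h, 0) for h in candidates}
        return {name: normalize(r) for name, r in
                [("sim", sim), ("time", recency),
                 ("recCount", rec_count), ("pop", pop)]}

    def hybrid(r1, r2, alpha=0.6):
        """hybrid(r1, r2) = alpha * r1 + (1 - alpha) * r2, per Equation (5)."""
        return {h: alpha * r1[h] + (1 - alpha) * r2[h] for h in r1}

Ranking the candidates then amounts to sorting, e.g., hybrid(ranks["sim"], ranks["recCount"]) in descending order of score.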
4 Evaluation
The evaluations were conducted with a prototype of the approach, implemented in Java on top of a Lucene fulltext index, using the data set described in Section 2. This implies that our Lucene index kept 3.75 million tweets. The evaluation was performed on a quad-core machine with 8 GB of RAM running CentOS release 5.1. Essentially, we performed leave-one-out tests on the collected tweets. For this purpose, we arbitrarily chose 10,000 sample tweets from the data set. For our tests we only use tweets which contain fewer than six hashtags, to exclude possible spam messages. Furthermore, we did not use any retweets or messages which are present several times in the dataset, as these would lead to hashtag recommendations based on identical messages and would therefore distort our evaluation. Such a leave-one-out test consists of the following steps, which were performed for each of the 10,000 test tweets (a sketch of the harness follows the list):

1. Remove the hashtags occurring in the test tweet.
2. Remove the test tweet from the index (underlying dataset), as leaving the original tweet in the index would lead to a perfect match when searching for similar messages; the original hashtags would then be recommended based on the very same tweet.
3. Use the test tweet (without hashtags), or a part of the message, as the input string for the recommendation computation.
4. Compute the hashtag recommendations using the recommendation approach, including the different ranking methods introduced in Section 3.
5. Evaluate the resulting hashtag recommendations against the originally used hashtags based on the measures described in Section 4.1.

In order to determine the quality and suitability of the hashtag recommendations provided to the users, we apply the traditional IR metrics recall, precision and F-measure (also known as F1-score). As a hashtag recommendation system should aim at providing the user with an optimal number of correct tags, the recall value is the most important quality measure for our approach.
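A minimal sketch of this harness (ours; recommend_hashtags is the pipeline sketched in Section 3, and test tweets are assumed to be (text, hashtags) pairs):

    def leave_one_out(test_tweets, dataset, k=5):
        """Average recall@k and precision@k over held-out tweets."""
        recalls, precisions = [], []
        for text, gold in test_tweets:
            # Steps 1-2: the gold hashtags are withheld and the tweet itself
            # is excluded from the corpus used for retrieval.
            corpus = [d for d in dataset if d[0] != text]
            # Steps 3-4: recommend based on the hashtag-free text.
            recommended = set(recommend_hashtags(text, corpus, k=k))
            # Step 5: compare against the originally used hashtags.
            hits = len(recommended & set(gold))
            recalls.append(hits / len(gold))
            precisions.append(hits / k)
        return sum(recalls) / len(recalls), sum(precisions) / len(precisions)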
4.1 Recall, Precision, and F-Measure
Figure 2 shows the top-k (k = 1, 2, ..., 10) plot of the recall values of the four basic ranking methods. The good performance of SimRank can be explained by the fact that the message in which a hashtag recommendation candidate is embedded is directly related to the relevancy of the hashtag. The other ranking methods are based on time or (global) hashtag popularity, which are only loosely coupled to the hashtag and the message it is contained in. It can be seen that already five shown hashtags are sufficient to reach a reasonable recall
Fig. 2. Top-k Recall for k=[1..10] for the Basic Ranking Methods
value of about 35%, and therefore allow building a lightweight recommendation interface without overwhelming the user with too many recommendations. Increasing the number of shown hashtags k yields only slight improvements of the recall value. As for the hybrid ranking approaches, we evaluated them with regard to their recall, precision and F-measure. SimRank proved to be the best-performing basic ranking method throughout our evaluations. Therefore, we chose to combine the other ranking methods proposed in this paper with the SimRank method. The recall values for the top-5 recommendations (recall@5) for the three hybrid ranking methods are displayed in Figure 3. On the x-axis we plot the weight coefficient α = [0...1], and on the y-axis the corresponding recall values for the proposed hybrid ranking mechanisms. Obviously, setting α to 1 corresponds to the result of the plain SimRank method. On the other hand, α = 0 leads to the same result as the sole execution of the second ranking method used within the hybrid. This way, the base ranking methods can also be compared to the hybrid methods: at α = 0, SimTimeRank corresponds to TimeRank, SimPopRank corresponds to PopRank, and SimRecCountRank corresponds to RecCountRank. The figure shows that SimRecCountRank performs best for all weight coefficients. The other ranking methods, especially SimTimeRank and SimPopRank, suffer from the poor performance of their base ranking methods (TimeRank, PopRank). This is due to the fact that both TimeRank and PopRank only consider the global factors of time and overall hashtag popularity, and do not consider the actual content of the tweet itself. Using the recency of the tweet might have a bigger effect on a long-term dataset. In contrast to the time- and popularity-based ranking methods, SimRecCountRank considers the context of the hashtag, which leads to a good performance. The context of the hashtag is characterized by both the similarity between the input tweet and the tweet containing the hashtag candidate, and the number of occurrences within the most similar messages. The overall best result is reached using SimRecCountRank with α set to 0.6.
Fig. 3. Recall@5 for Hybrid Ranking Methods

Fig. 4. Precision@5 for Hybrid Ranking Methods
The precision@5 values for the hybrid ranking methods are shown in Figure 4. In general, the precision values reached by our prototype are low. This can be explained by the fact that the number of hashtags used within a tweet is very small: on average, about 1.5 hashtags are used per message. Therefore, evaluating the precision of e.g. ten recommendations for tweets which only contain two hashtags naturally leads to very low precision values. Even if the recommendations were 100% correct, eight of the recommended hashtags would still not be suitable and would therefore decrease the precision value. The F-measure of the hybrid ranking methods with k = 5 is shown in Figure 5 and underlines the performance of the ranking method SimRecCountRank. In order to further investigate the behavior of the hybrid approaches, we also evaluated the precision/recall values for the described ranking methods. We set the merge coefficient to α = 0.6, as this has in general proven to lead to the best results. The resulting recall/precision plot can be seen in Figure 6, where the recall values with k = 1, 2, ..., 10 of the corresponding ranking methods are plotted on the x-axis and the precision values on the y-axis. It
Fig. 5. F-Measure@5 for Hybrid Ranking Methods

Fig. 6. Precision/Recall Plot for weight α=0.6 and k=[1..10]
turned out that the hybrid SimRecCountRank performed best overall, whereas the performance of the other two hybrid ranking methods was rather poor.

4.2 Refinement of Recommendations
In order to show how our recommendation approach performs and how the recommendations are refined with every keystroke, we compute the recall and precision values at ten different stages during the process of entering a tweet. We take the original tweet (without hashtags) and compute the precision and recall values for 10%, 20%, ..., 90%, 100% of its text. The average length of tweets in our dataset is 98 characters without hashtags. Thus, we started the evaluation using an input tweet containing about 10 characters of the original message and evaluated the proposed recommendation algorithms. We proceeded with the recommendation computations until the original length of the tweet without hashtags was reached. The results using a weight α of 0.6 can be seen in Figure 7. It can be seen that constraining
Fig. 7. Development of Recall Values as the User advances in entering the Tweet
the length of the input string directly influences the performance of the ranking methods. The plot shows that for a tweet which has only been entered partly, SimRecCountRank performs significantly better than the other ranking methods. However, it is remarkable that the ranking strategies which take global factors like time or popularity into account performed reasonably well for short input strings. Therefore, we elaborated on this further and analyzed the behaviour of the different ranking strategies when only 20% of the text has been entered. Figure 8 shows the recall values of the different ranking strategies, with the corresponding weight coefficients α plotted on the x-axis. As the available part of the message is very short, we expected an increasing performance of the ranking methods SimTimeRank and SimPopRank. We also evaluated the different weights of the hybrid ranking methods, as shown in Figure 8. Even if the tweet is cut down to 20% of its original length, SimRecCountRank still performs best, despite the lack of context. This ranking method has proven to be the best performing method regardless of the length of the input tweet.
Fig. 8. Recall Values for weight α=0.6 with 20% of the Message as Input
5 Related Work
The recommendation of hashtags within the Twittersphere is closely related to the fields of microblogging, tagging in Web 2.0 applications, and recommender systems as a whole. Tagging of online resources has become popular with the advent of Web 2.0 paradigms. However, the task of recommending traditional tags differs considerably from recommending hashtags. Our recommendation approach is based solely on 140 characters, whereas traditional tag recommender systems take much more data into consideration when computing tag recommendations. Furthermore, tweets, hashtags and trends within the Twittersphere change at a fast pace and are very dynamic. New hashtags may evolve around trending topics, and recommendations therefore have to take this dynamic nature of Twitter into account. Sigurbjörnsson et al. [23] presented an approach for the recommendation of tags within Flickr which is based on the co-occurrence of tags (also used in [7,15]). Two different tags co-occur if they are both used for the same photo. Based on this information about the co-occurrence of tags for Flickr photos, the authors developed a prototype which is able to recommend tags for photos that have been partly tagged. The recommendation is computed by finding those tags which have been used together with the tags the user has already specified for a certain photo. These tags are subsequently ranked and recommended to the user. It is important to note that such an approach is not feasible if a photo has not been tagged at all. Partly based on this work, Rae et al. [19] proposed a method for Flickr tag recommendations which is based on different contexts of tag usage. Rae distinguishes four different contexts which are used for the computation of recommendations: (i) the user's previously used tags, (ii) the tags of the user's contacts, (iii) the tags of the users which are members of the same groups as the user, and (iv) the tags most used by the whole community. A similar approach has also been taken by Garg and Weber in [6]. Furthermore, on the BibSonomy platform, which basically allows its users to manage bibliographic entries, users are provided with recommendations for suitable tags annotating these entries [15]. This approach extracts tags which might be suitable for the entry from the title of the entry, the tags previously used for the entry, and the tags previously used by the current user. Based on these resources, the authors propose different approaches for merging these sets of tags; the resulting set is subsequently recommended to the user. Tag recommendations based on Moviebase data have been presented in [22]. Jäschke et al. [11] propose a collaborative filtering approach for the recommendation of tags. The authors construct a graph based on the users, the tags and the tagged entities. Within this graph, the recommendations are computed and ranked based on a PageRank-like ranking algorithm for folksonomies. Recommendations based on the content of the entity to be tagged have been studied in [24]. Additionally, there have been numerous papers concerned with the analysis of the tagging behavior and motivation of users [2,16]. The social aspects of online social media, such as the Twitter platform, have been analyzed heavily throughout the last years. These analyses were con-
cerned with the motivations behind tweeting, as e.g. in [12]. Boyd et al. [4] showed how users make use of the retweet function and why users retweet at all. Honeycutt and Herring examined how direct Twitter messages can be used for online collaboration [9]. Recently, the work by Romero et al. [21] analyzed how the exposure of Twitter users to hashtags affects their hashtagging behavior and how the use of certain hashtags spreads within the Twittersphere. The authors found that the adoption of hashtags depends on the category of the tweet; e.g., hashtags concerned with politics or sports are adopted faster than hashtags concerned with other topic categories. Further analyses of Twitter data and the behavior of Twitter users can be found in [10,13,14,25]. As for the recommendation of items within Twitter or based on Twitter data, there have been numerous approaches dealing with these matters. Hannon et al. [8] propose a recommender system which provides users with recommendations for users who might be interesting to follow. Chen et al. present an approach aiming at recommending interesting URLs to users [5]. The work by Phelan, McCarthy and Smyth [18] is concerned with the recommendation of news to users. Traditionally, recommender systems are used in e-commerce, where users are provided with recommendations for interesting products, as e.g. on the Amazon website. Recommendations are typically computed based on one of the following two approaches: (i) collaborative filtering [1,20], which is based on finding similar users with similar behavior and recommending e.g. the tags used by these users, and (ii) content-based approaches [3,17], which aim at finding the items whose characteristics are most similar to those of the items the user has already used. However, to the best of our knowledge, there is currently no other approach aiming at the recommendation of tags in microblogging platforms and of hashtags for a certain Twitter message.
6 Conclusion
In this paper we presented an approach for recommending hashtags to microblogging users. Such recommendations help users to (i) use more appropriate hashtags, thereby homogenizing the set of hashtags, and (ii) encourage users to use hashtags at all, as suitable hashtag recommendations are provided. The approach is based on analyzing tweets similar to the tweet the user is currently entering and deducing a set of hashtag recommendation candidates from these Twitter messages. We furthermore presented different ranking techniques for these recommendation candidates. The evaluations we conducted showed that our approach is capable of providing users with suitable recommendations for hashtags. The best results were achieved by combining the similarity of messages and the popularity of hashtags within the recommendation candidate set. Future work will include incorporating the social graph of Twitter users into the process of computing hashtag recommendations to optimize the presented approach.
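The following is a simplified sketch of the similarity-plus-popularity ranking scheme described above. The corpus, the tokenization, and the popularity weight are illustrative assumptions, not the authors' exact implementation.

from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [                       # (tweet text, its hashtags) -- toy data
    ("obama speech on afghanistan tonight", ["#politics", "#afghanistan"]),
    ("watching the world cup final", ["#worldcup", "#soccer"]),
    ("senate votes on the new bill", ["#politics"]),
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(text for text, _ in corpus)

def recommend(new_tweet, k=2, top_n=3):
    sims = cosine_similarity(vectorizer.transform([new_tweet]), doc_matrix)[0]
    nearest = sims.argsort()[::-1][:k]        # k most similar stored tweets
    scores = Counter()
    for idx in nearest:
        for tag in corpus[idx][1]:
            # combine message similarity with candidate-set popularity;
            # the 0.1 per-occurrence popularity bonus is an assumed weight
            scores[tag] += sims[idx] + 0.1
    return [tag for tag, _ in scores.most_common(top_n)]

print(recommend("new bill on afghanistan"))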
References
1. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17(6), 734–749 (2005)
2. Ames, M., Naaman, M.: Why we tag: motivations for annotation in mobile and online media. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2007, pp. 971–980. ACM, New York (2007)
3. Balabanović, M., Shoham, Y.: Fab: content-based, collaborative recommendation. Commun. ACM 40, 66–72 (1997)
4. Boyd, D., Golder, S., Lotan, G.: Tweet, tweet, retweet: Conversational aspects of retweeting on twitter. In: HICSS, pp. 1–10. IEEE Computer Society, Los Alamitos (2010)
5. Chen, J., Nairn, R., Nelson, L., Bernstein, M., Chi, E.: Short and tweet: experiments on recommending content from information streams. In: Proceedings of the 28th International Conference on Human Factors in Computing Systems, pp. 1185–1194. ACM, New York (2010)
6. Garg, N., Weber, I.: Personalized, interactive tag recommendation for flickr. In: Proceedings of the 2008 ACM Conference on Recommender Systems, RecSys 2008, pp. 67–74. ACM, New York (2008)
7. Gassler, W., Zangerle, E., Specht, G.: The Snoopy Concept: Fighting Heterogeneity in Semistructured and Collaborative Information Systems by using Recommendations. In: The 2011 International Conference on Collaboration Technologies and Systems (CTS 2011), Philadelphia, PA (May 2011)
8. Hannon, J., Bennett, M., Smyth, B.: Recommending twitter users to follow using content and collaborative filtering approaches. In: RecSys 2010: Proceedings of the Fourth ACM Conference on Recommender Systems, pp. 199–206. ACM, New York (2010)
9. Honeycutt, C., Herring, S.C.: Beyond Microblogging: Conversation and Collaboration via Twitter. In: HICSS, pp. 1–10. IEEE Computer Society, Los Alamitos (2009)
10. Huberman, B., Romero, D., Wu, F.: Social networks that matter: Twitter under the microscope. First Monday 14(1), 8 (2009)
11. Jäschke, R., Marinho, L., Hotho, A., Schmidt-Thieme, L., Stumme, G.: Tag Recommendations in Folksonomies. In: Kok, J., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) PKDD 2007. LNCS (LNAI), vol. 4702, pp. 506–514. Springer, Heidelberg (2007)
12. Java, A., Song, X., Finin, T., Tseng, B.: Why we twitter: understanding microblogging usage and communities. In: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, pp. 56–65. ACM, New York (2007)
13. Krishnamurthy, B., Gill, P., Arlitt, M.: A few chirps about twitter. In: Proceedings of the First Workshop on Online Social Networks, pp. 19–24. ACM, New York (2008)
14. Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In: Proceedings of the 19th International Conference on World Wide Web, pp. 591–600. ACM, New York (2010)
15. Lipczak, M., Milios, E.: Learning in efficient tag recommendation. In: Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys 2010, pp. 167–174. ACM, New York (2010)
16. Marlow, C., Naaman, M., Boyd, D., Davis, M.: HT06, tagging paper, taxonomy, Flickr, academic article, to read. In: Proceedings of the Seventeenth Conference on Hypertext and Hypermedia, HT 2006, pp. 31–40. ACM, New York (2006)
17. Pazzani, M., Billsus, D.: Content-Based Recommendation Systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 325–341. Springer, Heidelberg (2007)
18. Phelan, O., McCarthy, K., Smyth, B.: Using twitter to recommend real-time topical news. In: Proceedings of the Third ACM Conference on Recommender Systems, pp. 385–388. ACM, New York (2009)
19. Rae, A., Sigurbjörnsson, B., van Zwol, R.: Improving tag recommendation using social networks. In: Adaptivity, Personalization and Fusion of Heterogeneous Information, RIAO 2010, Paris, France, pp. 92–99. Le Centre de Hautes Etudes Internationales d'Informatique Documentaire (2010)
20. Resnick, P., Varian, H.: Recommender systems. Communications of the ACM 40(3), 58 (1997)
21. Romero, D.M., Meeder, B., Kleinberg, J.M.: Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter. In: Srinivasan, S., Ramamritham, K., Kumar, A., Ravindra, M.P., Bertino, E., Kumar, R. (eds.) WWW, pp. 695–704. ACM, New York (2011)
22. Sen, S., Vig, J., Riedl, J.: Tagommenders: connecting users to items through tags. In: Proceedings of the 18th International Conference on World Wide Web, WWW 2009, pp. 671–680. ACM, New York (2009)
23. Sigurbjörnsson, B., van Zwol, R.: Flickr tag recommendation based on collective knowledge. In: Proceedings of the 17th International Conference on World Wide Web, pp. 327–336. ACM, New York (2008)
24. Tatu, M., Srikanth, M., D'Silva, T.: RSDC 2008: Tag Recommendations using Bookmark Content. In: Workshop at the 18th European Conference on Machine Learning (ECML 2008)/11th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2008 (2008)
25. Ye, S., Wu, S.F.: Measuring Message Propagation and Social Influence on Twitter.com. In: Bolc, L., Makowski, M., Wierzbicki, A. (eds.) SocInfo 2010. LNCS, vol. 6430, pp. 216–231. Springer, Heidelberg (2010)
A Spectral Analysis Approach for Social Media Community Detection

Xuning Tang1, Christopher C. Yang1, and Xiajing Gong2

1 College of Information Science and Technology, Drexel University, Philadelphia, USA
2 School of Biomedical Engineering, Drexel University, Philadelphia, USA
{xt24,chris.yang,xg33}@drexel.edu
Abstract. Online forums are ideal platforms for Internet users worldwide to share ideas, raise discussions and disseminate information. It is of great interest to gain a better understanding of the dynamics of user interactions and to identify user communities in online forums. In this paper, we propose a temporal coherence analysis approach to detect user communities in online forums. Users are represented by vectors of activeness, and communities are extracted by a soft community detection algorithm with the support of spectral analysis.
Keywords: Spectral Analysis, Community Detection, Soft Clustering.
1 Background

Due to the advance of Web 2.0 technologies, user interactions via online forums have become increasingly intensive. It is of great interest to gain a better understanding of the dynamics of user interactions and to identify user communities from online forums. Although a social network is an ideal representation for studying user behavior and social structure, constructing a precise social network is difficult. Typically, only direct interactions between users are used to construct the social network, while implicit or indirect interactions are largely ignored. For example, a real-world event may trigger a collection of threads in an online forum. Each of these threads may be followed by different groups of users. These users may not interact with each other directly, but they are indeed discussing an event of common interest. This type of implicit relationship, which represents a common interest, is not easily captured by traditional methods, although it is very useful. Besides implicit relationships, some recent research has investigated how to incorporate temporal information into social network analysis [1-7]. However, how to incorporate temporal information effectively into network analysis remains an open question. To handle implicit interactions between users and to incorporate temporal information properly, novel approaches are needed to bridge the gap and supplement social network analysis techniques. In this study, we employ spectral analysis techniques to extract users' implicit associations. Our key insight is that different users sharing interest in common external events will have comparable reactions/activities when these external events happen. As a result, if each user is represented by a time-series signal according to his/her daily activeness, we can detect users with similar interest and
behavior in the time and frequency domains. Similar approaches have been employed for Web mining in the literature. He et al. [8] detected aperiodic and periodic events by considering the word frequency of news streams in the time and frequency domains. Chien and Immorlica [9] identified semantically related search engine queries based on their temporal correlation. In this work, we argue that users who share common interest in external events may behave similarly when those events happen, leading to strongly correlated time-series signals, even though they do not directly interact with each other. In this paper, we first introduce the representation of forum users as user feature vectors. We then apply spectral analysis techniques to quantify a user's overall activeness by the dominant power spectrum and the user's associations with other users by the spectral coherence score. With the support of these spectral coherence scores between users, we propose a soft community detection algorithm to identify user communities with focused themes. We conducted our experiment on the Ansar AlJihad Network data in the Dark Web dataset. Without using any content analysis or explicit user relationships, we were able to identify user communities with focused themes.
2 Methodology

To detect user communities from an online forum, we propose a framework which consists of two steps: temporal coherence analysis and soft community detection.

2.1 Temporal Coherence Analysis

The objective of temporal coherence analysis is to calculate the temporal similarities between any pair of forum users, which results in a similarity matrix that serves as the input of soft community detection. To quantify the similarity between any pair of users, say i and j, we first represent them by user feature vectors, then compute the auto-spectrum of each individual vector and the cross-spectrum of i and j, and finally employ the spectral coherence of i and j to measure their similarity.

User Feature Vector. Given an online forum, let T be the period (in days) during which we investigate user behavior and interaction. We represent user activeness in an online forum by a vector. The vector representation of a user is defined as follows:

Definition (User Feature Vector): The vector of a user i can be denoted as

A_i = [A_i(1), A_i(2), ..., A_i(T)],    (1)

where each element A_i(t) represents the activeness of user i on the t-th day. In this paper, we assume that thread ID, message ID, user ID and timestamp are given. As a result, we define A_i(t) using a score similar to TF-IDF:

A_i(t) = (n_i(t) / n_i) · log(N / N(t)),    (2)

where n_i(t) is the number of threads that user i participated in on day t, n_i is the number of threads user i participated in over time T, N(t) is the number of threads of day t, and N is the number of threads over T. In this study, we consider a group of m users to form an m-dimensional multivariate process. By employing the user feature vector defined above, the m-dimensional multivariate process can be denoted as:

A(t) = [A_1(t), A_2(t), ..., A_m(t)]^T.    (4)
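A minimal sketch of the activeness score follows. Note that the exact TF-IDF variant in equation (2) was reconstructed from a garbled source, so the formula should be treated as an assumption; the participation records below are toy data.

import math
from collections import defaultdict

# (user, day, thread) participation records -- illustrative only
records = [("u1", 1, "t1"), ("u1", 1, "t2"), ("u1", 2, "t3"),
           ("u2", 1, "t1"), ("u2", 2, "t3"), ("u2", 2, "t4")]

T = 2                                       # observation period in days
threads_per_day = defaultdict(set)          # N(t)
user_day = defaultdict(set)                 # n_i(t)
for user, day, thread in records:
    threads_per_day[day].add(thread)
    user_day[(user, day)].add(thread)

N = len({r[2] for r in records})            # total number of threads over T

def activeness(user, day):
    """A_i(t) = (n_i(t) / n_i) * log(N / N(t)), per the reconstructed eq. (2)."""
    n_it = len(user_day.get((user, day), set()))
    if n_it == 0:
        return 0.0
    n_i = sum(len(user_day.get((user, d), set())) for d in range(1, T + 1))
    return (n_it / n_i) * math.log(N / len(threads_per_day[day]))

print([activeness("u1", d) for d in range(1, T + 1)])   # user feature vector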
Spectral Analysis. User behaviors in terms of multivariate time series are often rich in oscillatory content, which leads naturally to spectral analysis. To calculate the spectral estimate of user i, we first perform a Fourier transform on A_i. In this work, we apply K tapers successively to the i-th user feature vector and take the Fourier transform:

Ã_i^k(f) = ∑_{t=1}^{T} w_t^k A_i(t) exp(−2πift),    (5)

where w_t^k (k = 1, 2, ..., K) represent K orthogonal taper functions with appropriate properties. A particular choice of these taper functions, with optimal leakage properties, is given by the discrete prolate spheroidal sequences (DPSS). The multitaper estimate of the spectrum S_i(f) is then defined as:

S_i(f) = (1/K) ∑_{k=1}^{K} |Ã_i^k(f)|^2.    (7)

We further define the dominant power spectrum of user i as its maximum spectrum value across all potential frequencies:

Ŝ_i = max_f S_i(f),    (8)

which can be used to represent the overall activeness of user i in the online forum. Similarly, the cross-spectrum S_{ij}(f) between A_i and A_j is defined as:

S_{ij}(f) = (1/K) ∑_{k=1}^{K} Ã_i^k(f) Ã_j^k(f)^*,    (9)

where Ã_j^k(f)^* denotes the complex-conjugate transpose of Ã_j^k(f). We then have the spectral density matrix for the multivariate process as:

S(f) = [S_{ij}(f)], i, j = 1, ..., m.    (10)
Spectral Coherence. In this work, we quantify the similarity of two user feature vectors by using spectral coherence. The spectral coherency of any pair of user feature vectors A_i and A_j at frequency f is calculated as:

C_{ij}(f) = |S_{ij}(f)| / sqrt(S_{ii}(f) S_{jj}(f)).    (11)

We obtain an overall spectral coherence score for each pair of users, representing their similarity, by summing their spectral coherence values over the different frequencies, so that we have:

C_{ij} = ∑_f C_{ij}(f).    (12)
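A compact sketch of the multitaper coherence pipeline of equations (5)-(12) follows. The equations above were reconstructed from a garbled source, and the DPSS parameters (NW, K) are assumptions rather than values from the paper.

import numpy as np
from scipy.signal.windows import dpss

def multitaper_coherence(x, y, nw=3, k=5):
    tapers = dpss(len(x), nw, k)              # K orthogonal DPSS tapers
    fx = np.fft.rfft(tapers * x, axis=1)      # tapered transforms, eq. (5)
    fy = np.fft.rfft(tapers * y, axis=1)
    sxx = np.mean(np.abs(fx) ** 2, axis=0)    # auto-spectra, eq. (7)
    syy = np.mean(np.abs(fy) ** 2, axis=0)
    sxy = np.mean(fx * np.conj(fy), axis=0)   # cross-spectrum, eq. (9)
    coh = np.abs(sxy) / np.sqrt(sxx * syy)    # coherency, eq. (11)
    return coh.sum()                          # overall score, eq. (12)

rng = np.random.default_rng(0)
a = rng.normal(size=390)                      # 390 days, as in the dataset
b = a + 0.5 * rng.normal(size=390)            # a correlated user signal
print(multitaper_coherence(a, b))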
2.2 Soft Community Detection

Problem Formulation. Given the similarity (spectral coherence score) between any pair of users, C_{1,2}, C_{1,3}, ..., C_{m−1,m}, the research problem is to detect K overlapping clusters of users. Each of these clusters includes users with a common interest.

Soft Community Detection Algorithm. We propose a soft community detection algorithm which takes the similarity matrix of forum users as input and consists of three steps:

Filtering. Given a similarity matrix, for each user i, we sort the similarity scores between i and all other users in descending order and retrieve the t users with the top similarity scores with i, denoted Candidate(i). Parameter t is provided as an input. As a result, i has relatively higher similarity with the users in Candidate(i). Secondly, for each user j in Candidate(i), we check whether i also belongs to Candidate(j), to ensure that i has local high similarity with j. If so, we retain j in Candidate(i); otherwise we remove j from Candidate(i). If Candidate(i) then becomes an empty set, we correlate i with the user p that has the highest C_{i,p}. Finally, except for the relationships in Candidate(i) for i from 1 to N, we set all other elements in the similarity matrix to zero. Considering the original similarity matrix as a complete graph, this step removes edges with relatively lower weights and constructs a sparse graph.

Hard Clustering. Based on the sparse matrix, we construct a graph G where each node represents a user, and two nodes i and j are connected if i belongs to Candidate(j) and j belongs to Candidate(i). The weight of edge (i, j) equals C_{i,j}. We then employ the Girvan-Newman algorithm [10] to repeatedly remove the edge with the highest betweenness and decompose G into multiple sub-graphs. G is continually decomposed into sub-graphs as long as the normalized modularity is rising, and the decomposition terminates once the normalized modularity decreases.

Peripheral User Re-Identification. The above steps detect non-overlapping user communities from the similarity matrix. Our goal is to identify soft communities in which peripheral users can be members of more than one community. This can be conveniently achieved by analyzing the hard clustering result and re-identifying the potential peripheral users. Given a graph G = (V, E), where V denotes vertices and E denotes weighted edges, and k disjoint communities C_1, ..., C_k ⊆ V identified by the hard clustering step, the membership of a vertex v toward community C_i is defined as:

m(v, C_i) = ∑_{u ∈ C_i} w(v, u) / ∑_{u ∈ V} w(v, u).    (13)

According to (13), m(v, C_i) equals 1 when vertex v interacts only with vertices of community C_i, in which case we call v a core user of community C_i; m(v, C_i) equals 0 when vertex v does not interact with any vertices of community C_i. We call vertex v a peripheral user of community C_i if m(v, C_i) is between zero and one. In this step, given the hard clustering result, we calculate the membership score of each user toward all identified communities. We maintain the membership assignments of core users while assigning peripheral users to multiple communities if their membership scores are larger than a predefined threshold.
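A condensed sketch of the three steps follows, using networkx. It simplifies the paper's procedure in two places that should be treated as assumptions: the empty-candidate fallback is omitted, and the best-modularity partition is taken directly instead of stopping at the first modularity drop; t and the membership threshold are toy parameters.

import networkx as nx
from networkx.algorithms.community import girvan_newman, modularity

def soft_communities(sim, t=2, threshold=0.3):
    n = len(sim)
    # Step 1: filtering -- keep only mutual top-t similarities.
    cand = {}
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: -sim[i][j])
        cand[i] = set(order[:t])
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in cand[i]:
            if i in cand[j]:
                g.add_edge(i, j, weight=sim[i][j])
    # Step 2: hard clustering via Girvan-Newman, keeping the best-modularity split.
    best = max(girvan_newman(g), key=lambda part: modularity(g, part))
    # Step 3: peripheral re-identification via the membership score of eq. (13).
    communities = [set(c) for c in best]
    for v in g:
        total = sum(d["weight"] for _, _, d in g.edges(v, data=True))
        for c, members in zip(best, communities):
            w = sum(d["weight"] for _, u, d in g.edges(v, data=True) if u in c)
            if total and 0 < w / total < 1 and w / total >= threshold:
                members.add(v)          # peripheral user joins this community too
    return communities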
3 Experiment

3.1 Dataset

Our experiment is conducted on a Dark Web dataset which is available from the ISI-KDD Challenge 2010. The Dark Web dataset was exported from the Dark Web Portal, which consists of several complete multi-year extremist forums. In this dataset, there were 377 unique users, 11133 threads and 29056 messages. Each thread contained around 2.6 messages. Each message record consisted of the Thread ID, Message ID, User ID, Timestamp, and Content. The timestamps of these messages spanned from 12/08/2008 to 01/02/2010. Each thread consisted of messages written mostly in the Arabic language, which makes natural language processing a difficult task. In each discussion thread, only the list of participating users, with one of them as the initiator, is captured. The interaction relationships between users who are replying to the same post are not captured. Instead of multi-level hierarchical interaction patterns, only two-level, one-to-many interactions are captured.
Fig. 1. Visualization of Detected Communities from Dark Web Dataset
3.2 Experiment Results

In this dataset, only 58 out of 337 users were active on more than 3 days in the span of 390 days. We first removed the inactive users and studied only the active ones. It is important to note that these 58 active users contributed 19233 messages, which is 66.2% of the whole dataset, so the remaining data are still substantial. By applying the techniques introduced in Section 2, the detected user communities are displayed in Figure 1. In this figure, the color of the nodes represents the result of hard clustering. To visualize the soft clustering effect, each community is highlighted by an oval. Users who belong to more than one community are covered by more than one oval.

Table 1. Topic-Word Distribution of the Dark Web Dataset
Topic 1: Pakistan 0.019315, Military 0.017197, Police 0.016799, Iraq 0.01597, Taliban 0.014456, Official 0.010099, Security 0.008334, Bomb 0.007096, Forces 0.006523, Baghdad 0.005744, Suicide 0.005554, City 0.005485
Topic 2: Afghanistan 0.030411, Troops 0.011377, Military 0.009504, Taliban 0.008849, Country 0.008168, War 0.008122, Obama 0.006884, Forces 0.006498, Government 0.00622, President 0.005182, Pakistan 0.004785, American 0.00707
Topic 3: Mujahideen 0.0331, Afghanistan 0.0206, Soldier 0.019066, Islamic 0.017422, Province 0.016423, Army 0.015787, Emirate 0.014763, Terrorist 0.013966, Vehicle 0.013946, District 0.013716, Invader 0.01167, Puppet 0.011396
Topic 4: Al 0.035396, Somalia 0.027645, Government 0.01766, Israel 0.016088, Islamist 0.014475, Shabaab 0.009395, Qaeda 0.007431, Mogadishu 0.007064, Palestinian 0.006844, Gaza 0.006203, Hamas 0.005571, Sheikh 0.005051
Topic 5: Released 0.005775, Terrorist 0.005322, Authorities 0.003056, Court 0.003017, Family 0.002894, Arrested 0.002894, Information 0.002772, Prison 0.002656, Report 0.002302, Women 0.002189, CIA 0.002186, Guantanamo 0.00183

Note: the number beside each word represents the probability of observing that word given the topic.
3.3 Evaluation

Since there is no ground truth for forum users' community memberships, it is difficult to evaluate our result. In this work, we evaluated our result based on a reasonable assumption: users of the same community should have a common interest, so that the topics of the messages written by the users of one community should differ from the topics of the messages written by the users of another community. We treated each message as a document, translated it to English using Google Translate, and then employed an LDA model [11] to detect topics from the messages written by these 58 users. The topics detected by the LDA model are shown in Table 1. We predefined the number of topics to be eight. Each topic is represented by a bag of words, with the probability of assigning each word to the topic. We removed three general topics, namely greetings, forum operations, and the Muslim religion. We carefully reviewed the popular words within each remaining topic and provided an annotation for each topic, shown in Table 2. Besides the topic-word distribution, the LDA model also returned the document-topic distribution Pr(t_j | d). According to our soft community detection results, we first group forum users into different communities. Secondly, for each community i, we extract all messages written by the users of community i and denote them M_i, where M_i consists of a collection of messages {d_1, ..., d_{|M_i|}}. Given Pr(t_j | d) for every document (message), by the Bayesian chain rule it is easy to compute the probability of these messages being written if the users of the community are interested in topic t_j, i.e. Pr(M_i | t_j). For each community i, we computed Pr(M_i | t_j) for all identified topics and ranked the topics in descending order. The result is shown in Table 3.
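The ranking step can be sketched as follows. The distributions below are made-up stand-ins for the LDA output, and the score uses log Pr(t_j | d) summed over a community's messages, which ranks topics up to constants under a uniform-prior assumption.

import math

# Pr(topic_j | message) for each message, from LDA (toy numbers).
doc_topic = {
    "m1": [0.7, 0.1, 0.2],
    "m2": [0.6, 0.3, 0.1],
    "m3": [0.1, 0.2, 0.7],
}
community_msgs = {1: ["m1", "m2"], 2: ["m3"]}

def ranked_topics(community):
    n_topics = len(next(iter(doc_topic.values())))
    # Chain rule over a community's messages, in log space.
    scores = [sum(math.log(doc_topic[m][j]) for m in community_msgs[community])
              for j in range(n_topics)]
    return sorted(range(n_topics), key=lambda j: -scores[j])

print(ranked_topics(1))   # topic indices in descending order of interest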
Table 2. Annotation of Each Identified Topic

Topic 1: Suicide Bombing in Iraq and Pakistan
Topic 2: United States-Afghanistan Relationship
Topic 3: Islamic Fighters or Mujahideen
Topic 4: Relationship between Israel and Muslim Countries
Topic 5: Guantanamo Prison

Table 3. Topic Popularity of Each Detected Community (topics of interest in descending order)

Community 1: 5, 1, 2, 3, 4
Community 2: 5, 2, 4, 1, 3
Community 3: 1, 2, 5, 4, 3
Community 4: 3, 5, 2, 4, 1
Community 5: 5, 3, 2, 1, 4
From Table 3, we observed that community 3 was most interested in Suicide Bombing in Iraq and Pakistan, and community 4 was most interested in Islamic Fighters. Communities 1, 2 and 5 share a common primary interest in Guantanamo Prison, which explains why they are in close proximity in Figure 1. However, they also have different focuses (as the second most interesting topic): community 1 was more interested in Suicide Bombing in Iraq and Pakistan, community 2 in the U.S.-Afghanistan relationship, and community 5 in Islamic Fighters or Mujahideen. These results show that users with similar interests were indeed clustered together by our soft community detection technique, as indirectly confirmed by the topic popularity within each detected community, even though we did not cluster users by the content of their messages.
4 Conclusions

In this paper, we propose to employ spectral analysis techniques to analyze user behavior and user interactions. First, we represent each user by a vector in which each element stands for his/her activeness on an individual day. Second, we use the dominant power spectrum to quantify a user's overall activeness and the spectral coherence to measure the similarity between two users. We have introduced a soft community detection algorithm to extract clusters of users with common interests. Using a real-world Dark Web dataset as a test bed, we have tested our proposed techniques. Experiment results demonstrated that users from different detected communities exhibited different focuses/interests, which was confirmed by the topic analysis.
References 1. Sarkar, P., Moore, A.W.: Dynamic social network analysis using latent space models. SIGKDD Explor. Newsl. 7, 31–40 (2005) 2. Sun, J., Faloutsos, C., Papadimitriou, S., Yu, P.: Graphscope: parameter-free mining of large time-evolving graphs. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 687–696. ACM, New York (2007) 3. Toyoda, M., Kitsuregawa, M.: Extracting evolution of web communities from a series of web archives. In: Proceedings of the Fourteenth ACM Conference on Hypertext and Hypermedia, pp. 28–37. ACM, Nottingham (2003) 4. Asur, S., Parthasarathy, S., Ucar, D.: An event-based framework for characterizing the evolutionary behavior of interaction graphs. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 913–921. ACM, New York (2007) 5. Falkowski, T., Bartelheimer, J., Spiliopoulou, M.: Mining and Visualizing the Evolution of Subgroups in Social Networks. In: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 52–58. IEEE Computer Society, Los Alamitos (2006) 6. Tantipathananandh, C., Berger-Wolf, T., Kempe, D.: A framework for community identification in dynamic social networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 717–726. ACM, New York (2007) 7. Aggarwal, C., Yu, P.: Online analysis of community evolution in data streams. In: Proceedings of the SIAM International Conference on Data Mining (SDM 2005), pp. 56– 67 (2005) 8. He, Q., Chang, K., Lim, E.: Analyzing feature trajectories for event detection. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 207–214. ACM, New York (2007) 9. Chien, S., Immorlica, N.: Semantic similarity between search engine queries using temporal correlation. In: Proceedings of the 14th International Conference on World Wide Web, pp. 2–11. ACM, New York (2005) 10. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Physical Review E 69, 026113 (2004) 11. Blei, D., Ng, A., Jordan, M.: Latent dirichlet allocation. The Journal of Machine Learning Research 3, 993–1022 (2003)
Design of a Reputation System Based on Dynamic Coalition Formation

Yuan Liu1, Jie Zhang1, and Quanyan Zhu2

1 School of Computer Engineering, Nanyang Technological University, Singapore
2 Department of Electrical and Computer Engineering, UIUC, United States
[email protected], [email protected], [email protected]
Abstract. Reputation systems face some challenging problems: buyers have different subjectivity in evaluating their experience with sellers, and they may not have incentives to share their experience. In this paper, we propose a novel reputation system based on dynamic coalition formation, in which buyers with similar subjectivity and rich experience are awarded virtual credits for helping others find trustworthy sellers and successfully conduct business. Our theoretical analysis confirms that the coalitions formed in this way are stable.
1 Introduction

In a multiagent-based e-commerce environment, buying agents and selling agents involved in monetary transactions have asymmetric information. Sellers know more about their products, while buyers never fully know whether the products satisfy them until receiving the products for which they have paid. On the other hand, buyers' satisfaction is very important for the success of e-commerce. In addition, buyers are always, to some degree, uncertain about the future behaviors of sellers. Thus, the main motivations for introducing trust and reputation systems into e-commerce are to: i) mitigate the information asymmetry problem; ii) help buyers find trustworthy sellers to conduct satisfactory transactions; and iii) decrease the uncertainty of buyers about sellers' future behaviors. Compared to trust models, where only buyers' own experience with sellers is taken into account when modeling the trustworthiness of sellers, reputation systems are more useful, especially for new buyers that do not have much personal experience with sellers, because in reputation systems buyers share their experience/information about sellers with other buyers [3]. However, reputation systems also face two challenging problems. One is the subjectivity problem: the information about sellers shared by other buyers is their own subjective evaluation of the products delivered by the sellers and may be biased. Another is the incentive problem, in the sense that buyers may not have incentives to share their information with others. To address these two problems, in this paper, we design a dynamic coalition based reputation system. In our system, we introduce the notion of virtual credits to provide buyers with incentives to share their information about sellers. A
novel credit allocation algorithm is proposed to allocate credits to coalition members based on the quantified subjective difference among them and the amount of information they provide. The result is that buyers with similar subjectivity will form a coalition. Well-experienced buyers will join coalitions to share their information about sellers in exchange for virtual credits. Less-experienced buyers can join coalitions to gain information from buyers that have similar subjectivity. The coalitions formed in our system are also proven to be stable.
2 Uncertainty and Subjectivity in Trust Modeling

Feedback from buyers that have been directly involved in transactions with a seller s composes the evidence space for the trustworthiness of the seller. In the evidence space, a buyer i has a pair (P_i^s, N_i^s) expressing its direct experience with the seller s, where P_i^s ∈ ℕ is the number of satisfactory transactions and N_i^s ∈ ℕ is the number of unsatisfactory transactions. According to the Dempster-Shafer theory (DST) and Jøsang's trust metric [1], the evidence space can be mapped to a trust space T_i^s(b, d, u) as follows:

b_i^s = P_i^s / (P_i^s + N_i^s + 2),  d_i^s = N_i^s / (P_i^s + N_i^s + 2),  u_i^s = 2 / (P_i^s + N_i^s + 2),

where b_i^s, d_i^s and u_i^s represent the belief, disbelief and uncertainty parameters, respectively. Here, b_i^s represents the probability that the proposition that the seller s is trustworthy is true, and d_i^s represents the probability that the proposition is false. Note that b_i^s + d_i^s + u_i^s = 1 and b_i^s ∈ [0, 1), d_i^s ∈ [0, 1), u_i^s ∈ (0, 1]. We can then define the amount of information E_i^s the buyer i has about the seller s and link E_i^s to the uncertainty u_i^s as follows:

Definition 1. Amount of Information E_i^s: Given that a buyer i has done P_i^s + N_i^s transactions with a seller s, the amount of information i has about s, E_i^s, is defined as (P_i^s + N_i^s + 2)/2. Then, E_i^s = 1/u_i^s.

Given two buyers' modelings of the same seller in the trust space, we can also define their subjective difference in their trust modelings of the same seller:

Definition 2. Subjective Difference: Given the two respective trust tuples that the two buyers i and j have of the same seller, T_i^s(b_i^s, d_i^s, u_i^s) and T_j^s(b_j^s, d_j^s, u_j^s), the subjective difference of the buyers i and j regarding the seller s is defined as

D_{i,j}^s = (1/2) ( |b_i^s u_j^s − b_j^s u_i^s| / (b_i^s u_j^s + b_j^s u_i^s) + |d_i^s u_j^s − d_j^s u_i^s| / (d_i^s u_j^s + d_j^s u_i^s) ),    (1)

where D_{i,j}^s ∈ [0, 1), u_i^s ≠ 1 and u_j^s ≠ 1. Then, the overall subjective difference of i and j is D_{i,j} = ∑_{s∈S} D_{i,j}^s / |S|, where S is the set of sellers i and j have both encountered and |S| represents the number of sellers in S.
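The following is a direct transcription of Definitions 1-2 as reconstructed above; the example evidence counts are toy values.

def trust(p, n):
    """Map evidence (P, N) to (belief, disbelief, uncertainty)."""
    total = p + n + 2
    return p / total, n / total, 2 / total

def subjective_difference(t_i, t_j):
    # Eq. (1); assumes nonzero P and N so the denominators do not vanish.
    (bi, di, ui), (bj, dj, uj) = t_i, t_j
    belief_part = abs(bi * uj - bj * ui) / (bi * uj + bj * ui)
    disbelief_part = abs(di * uj - dj * ui) / (di * uj + dj * ui)
    return 0.5 * (belief_part + disbelief_part)

print(subjective_difference(trust(8, 2), trust(8, 2)))  # 0.0: identical evidence
print(subjective_difference(trust(8, 2), trust(2, 8)))  # 0.6: opposite experience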
3 Dynamic Coalition Formation
To address the problems of subjectivity and incentives in reputation systems, we propose a credit allocation algorithm for dynamic coalition formation.
3.1 Model Overview
In a typical multiagent-based electronic marketplace, buying agents conduct business with selling agents. After the transactions are finished, buyers evaluate whether the transactions are successful. In our work, we assume that the evaluation results are binary, either successful or unsuccessful. These evaluations are precisely the experience about sellers that the buyers will later share with other buyers in the system. In the e-marketplace, we assume that all sellers sell similar kinds of products; for sellers selling a different type of products, a different set of coalitions will be formed regarding those sellers. By this simplifying assumption, we do not deal with the transformation from buyers' subjectivity on one type of sellers to that on another type of sellers. Because of this assumption, we can also assume that each buyer will gain the same amount of profit, denoted α ∈ ℝ+, if its transaction with a seller is successful. If the transaction is unsuccessful, the buyer will lose a certain amount of profit, denoted β ∈ ℝ+. For the purpose of numerical analysis, we also assume that every buyer has the same amount of need for purchasing products, represented by the transaction rate r ∈ ℕ, the number of transactions the buyer will conduct with sellers over a fixed period of time. Based on this assumption, buyers in the system have different amounts of transaction history or personal experience with sellers only because they participate in the system for different periods of time. The longer a buyer participates in the system, the more experience it will be able to gain. Therefore, if the success rate of transactions is p_i ∈ [0, 1], then the profit F_i ∈ ℝ a buyer is able to gain within a specific time period t_0 can be calculated as:

F_i = r t_0 (p_i α − (1 − p_i) β).    (2)

In our system, buyers autonomously form coalitions. Within each coalition, buyers (coalition members) can share their experience (information about sellers) with other members. To create incentives for buyers to share their experience with their coalition members, the buyers will be rewarded with virtual credits if the transactions of their members with sellers are successful [2]. The number of credits rewarded to the buyers in the coalition is proportional to the profit gained by the members from successfully conducting transactions with the sellers. For simplicity, we make the number of credits generated by a successful transaction with a seller equal to the amount of profit gained from the transaction, which is α. These credits can be redeemed by buyers for discounts from sellers or privileges in the system; therefore, the attitude of buyers towards the credits is positive, i.e. the more credits the better. We assume here that a buyer's utility towards virtual credits is discounted by a constant θ ∈ (0, 1) set for the system. Thus, the utility of a buyer i has two parts, the profit gained by successfully conducting transactions with sellers and the virtual credits gained by sharing its experience with other coalition members, formalized as follows:

U_i = F_i + θ ∑_{j≠i} R_{ji},    (3)
where R_{ji} ∈ ℝ+ is the number of virtual credits rewarded to buyer i due to buyer j's successful transactions with sellers, and F_i is calculated using Equation (2). In the initiation stage of our coalition formation, each buyer is a singleton coalition. It evaluates the subjective difference with other buyers. Buyers with similar subjectivity will merge to form a coalition for two reasons. One reason is to increase the success rate of conducting business with sellers, so that their transaction profit F increases accordingly. Another reason is to gain more virtual credits, because their information about sellers will be more valuable to others with similar subjectivity; the number of virtual credits awarded to a buyer is determined partially by the subjectivity difference. More details about the virtual credit allocation algorithm are presented in the next section. When both the transaction profit and the virtual credits increase, the buyer's utility also increases, according to Equation (3). When a new buyer joins the system, every coalition is presented to the buyer as a coalition center (defined in the next section) together with the amount of information of this coalition. The new buyer can first randomly join one coalition; a buyer can take part in only one coalition at a time. It is possible that the random choice was wrong, but later, when the buyer gains more personal experience with sellers, the buyer will be able to switch to a correct coalition where it shares similar subjectivity with the other members.
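To make the two components of the utility concrete, here is a small numeric illustration of equations (2) and (3); the parameter values (α, β, r, t_0, θ) are arbitrary choices, not values from the paper.

def profit(p, alpha=10.0, beta=5.0, r=4, t0=30):
    """F_i = r * t0 * (p * alpha - (1 - p) * beta), eq. (2)."""
    return r * t0 * (p * alpha - (1 - p) * beta)

def utility(p, credits_received, theta=0.8):
    """U_i = F_i + theta * sum of credits R_ji from other members, eq. (3)."""
    return profit(p) + theta * sum(credits_received)

# A buyer succeeding in 90% of transactions, plus credits from two members:
print(utility(0.9, [12.0, 7.5]))   # 1020.0 + 15.6 = 1035.6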
3.2 Credit Allocation Algorithm
Virtual credits assigned to a coalition when a buyer in the coalition conducts a successful transaction with a seller are allocated to the other coalition members, depending on how much their information about the seller contributed to this successful transaction. This is affected both by the subjectivity of the coalition members regarding the seller and by how much information the coalition members have about the seller. The subjectivity of a coalition member is measured as the subjective difference between the member and the average opinion of all members in the coalition. Thus, we first define the center of a coalition as the average opinion of all its members, as follows:

Definition 3. Coalition Center c: In a coalition C, for any given seller s with which some members have conducted transactions, let P_c^s = ∑_{i∈c} P_i^s / m be the average number of satisfactory transactions between the members and s, and N_c^s = ∑_{i∈c} N_i^s / m be the average number of unsatisfactory transactions, where m is the number of such members. The coalition center c regarding s is defined as T_c^s(b_c^s, d_c^s, u_c^s), where b_c^s = P_c^s / (P_c^s + N_c^s + 2), d_c^s = N_c^s / (P_c^s + N_c^s + 2) and u_c^s = 2 / (P_c^s + N_c^s + 2). The coalition center c is then the collection of T_c^s for each s ∈ S with which at least one member of c has interacted.

Given the center c, we then calculate the discounted amount of information buyer i has about the seller s as follows:

Ê_i^s = (1 − D_{i,c}) × E_i^s,    (4)
where D_{i,c} is the subjective difference between the center and buyer i (see Definition 2), and E_i^s is the amount of information buyer i has about the seller (see Definition 1). The detailed credit allocation rule is described in Algorithm 1. The number of credits allocated to a buyer is proportional to the discounted amount of information it contributes to the coalition: if its subjectivity is similar to the coalition's average opinion, its information is discounted less.
Algorithm 1: Credit Allocation Rule
Input: C, the coalition formed by a number of buyers; e, a transaction conducted by a member j with a seller s; α, the profit gained by member j from transaction e.

if e is successful then
    foreach i in coalition C with i ≠ j do
        R_{ji} = (Ê_i^s / ∑_{l≠j} Ê_l^s) × α    // credits allocated to each member other than j
    R_{jj} = 0    // no credit is allocated to j itself
else
    foreach i in coalition C do
        R_{ji} = 0
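The rule translates directly into code. The following is a minimal implementation of Algorithm 1 as reconstructed above; the member names and discounted-information values are toy inputs.

def allocate_credits(coalition, j, successful, alpha, e_hat):
    """Return {i: R_ji} for a transaction by member j; e_hat maps each
    member to its discounted amount of information about the seller."""
    credits = {i: 0.0 for i in coalition}
    if not successful:
        return credits
    total = sum(e_hat[l] for l in coalition if l != j)
    for i in coalition:
        if i != j and total > 0:
            credits[i] = e_hat[i] / total * alpha   # share proportional to info
    return credits

members = ["b1", "b2", "b3"]
info = {"b1": 4.0, "b2": 1.0, "b3": 3.0}
print(allocate_credits(members, "b1", True, alpha=10.0, e_hat=info))
# -> b2 and b3 split the 10 credits 1:3; b1 itself receives nothing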
4 Stability Analysis and Proof
Stability is an important property for dynamic coalition formation. We analyze and prove that the coalitions formed based on our proposed credit allocation rule (Algorithm 1) are stable, by proving that they are split-proof and merge-proof.

4.1 Analysis
According to Equation (3), a buyer's utility has two parts: its profit from conducting successful transactions and the virtual credits gained by sharing its experience with other coalition members. When the buyer has successfully conducted a transaction with a seller, a certain number of virtual credits is awarded to the other coalition members. In this case, we can transfer the profit part of the buyer's utility to the number of credits awarded to other coalition members because of the buyer's successful transactions with sellers. We perform this transformation mainly for the purpose of the stability proof in the next section. According to Equation (2), Equation (3) and the credit allocation rule in Algorithm 1, Equation (3) can then be rewritten as:

U_i = ((α + β)/α) r t_0 p_i α + θ ∑_{s∈S} ∑_{j≠i} (Ê_i^s / ∑_{l≠j} Ê_l^s) R_j^s − r t_0 β,    (5)

where r t_0 p_i α is the number of credits awarded to other members because of buyer i's successful transactions with sellers, and θ ∑_{s∈S} ∑_{j≠i} (Ê_i^s / ∑_{l≠j} Ê_l^s) R_j^s is the
number of credits buyer i receives from successful transactions conducted by other coalition members. In Equation (5), since α, β, r and t_0 are fixed values, buyer i's probability of conducting successful transactions with sellers, p_i, is crucial to the buyer's utility: if p_i is higher, the buyer is likely to gain larger utility. This success probability is in fact affected by the total amount of information the buyer has about sellers, including the buyer's own information and the information shared by other coalition members. We denote it as a function p(E), where E is the total amount of information about sellers, and assume that p(E) is an increasing and concave function with an upper bound of 1. When there is little information about sellers, gaining more information helps a lot in increasing the probability of conducting successful transactions. But when there is already a lot of information about sellers and the probability of conducting successful transactions is already high, gaining more information does not help much further. Based on the amount of information/experience about sellers a buying agent contributes to its coalition, we classify buyers into three types, senior, common and junior, defined as follows:

Definition 4. Given a coalition C with m ∈ ℕ (m ≥ 2) members/buyers and center c, for any buyer i ∈ C, if i meets condition 1:

Ê_i^s / (E_i^s + ∑_{l≠i} Ê_l^s) ≥ 1/(m − 1),

then buyer i is a senior buyer; if i meets condition 2:

Ê_i^s / ∑_{l∈C} Ê_l^s > 1/m  and  Ê_i^s / (E_i^s + ∑_{l≠i} Ê_l^s) < 1/(m − 1),

then buyer i is a common buyer; if i meets condition 3:

Ê_i^s / ∑_{l∈C} Ê_l^s ≤ 1/m,

then buyer i is a junior buyer, where s ∈ S and S is the set of common sellers all the members have ever interacted with.
According to the definition, a senior buyer is well experienced and generally has a large amount of information about sellers. Its probability of conducting successful transactions is already high, and gaining more information by joining a coalition will not increase this probability much (because of the property of the probability function p(E)). Thus, the senior buyer's main purpose in joining a coalition is to gain more virtual credits in order to increase its utility. Indeed, the senior buyer's rich information about sellers will allow it to receive many credits according to our credit allocation rule. On the other hand, a junior buyer does not have much experience with sellers. Its little information will not bring many virtual credits to itself. Thus, its main purpose in joining a coalition is to increase its probability of conducting successful transactions with sellers by utilizing information about sellers shared by other buyers (mostly common and senior buyers), thereby increasing its utility. All in all, we classify buying agents into the three types mainly because senior and junior buyers have different purposes for joining or leaving coalitions. In the next section, we separately discuss their behaviors when proving the stability of our dynamic coalition formation.
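A sketch of this classification follows. Note that the threshold forms in Definition 4 were reconstructed from a heavily garbled source, so both the conditions and the code should be treated as an assumption; the information values are toy inputs.

def classify(i, coalition, e, e_hat):
    """Classify buyer i as senior/common/junior per the reconstructed Definition 4."""
    m = len(coalition)
    share = e_hat[i] / sum(e_hat[l] for l in coalition)
    ratio = e_hat[i] / (e[i] + sum(e_hat[l] for l in coalition if l != i))
    if ratio >= 1 / (m - 1):
        return "senior"
    if share > 1 / m:
        return "common"
    return "junior"

coalition = ["b1", "b2", "b3", "b4"]
e = {"b1": 9.0, "b2": 3.0, "b3": 2.0, "b4": 1.0}       # raw information E_i
e_hat = {"b1": 8.0, "b2": 2.5, "b3": 1.5, "b4": 0.5}   # discounted, eq. (4)
print({i: classify(i, coalition, e, e_hat) for i in coalition})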
4.2 Stability Proof
We first describe the stable status of our system and provide the properties associated with it. Given a partition P = {C_1, ..., C_n} of N (the set of all buyers in the system) and any two coalitions C (with center c) and C' (with center c') in P, when our system is in the stable stage, the following three properties hold. (P1) Disconnection: Defining τ_c ∈ (0, 1] as the radius of the coalition C, we have max_{i∈C} D_{i,c} < τ_c, meaning that the subjective difference between any buyer in the coalition and the center is smaller than the radius. Also, the subjective difference between the centers of any two coalitions C and C' should be larger than twice the maximum radius of these two coalitions, i.e. D_{c,c'} > 2 × max{τ_c, τ_{c'}}. (P2) Existence: In each coalition, there should be some senior buyers that have a fairly large amount of information about sellers. (P3) Equality: Given any junior buyer i (i ∈ C) and any junior buyer j (j ∈ C'), their probabilities of successfully conducting transactions with sellers are similar and approach 1, i.e. p_i ≈ p_j → 1. When the system evolves for a sufficiently long period of time and reaches the stable stage, buyers that share similar subjectivity will have formed a coalition, because only buyers with similar subjectivity can provide each other with useful information about their common sellers. In other words, different coalitions have different subjectivity towards sellers. This gives us the first property (disconnection): there is sufficient difference in subjectivity between any two coalitions, so buyers do not switch from one to another. Also, for a coalition to be worthwhile, the junior buyers should be able to gain information about sellers from the senior buyers. Thus, in a coalition there should exist some senior buyers that can provide information to other members so that they can successfully conduct transactions with sellers, which is the second property (existence). Based on the existence property, which implies that in each coalition some buyers become well experienced and gain much information about sellers to become senior members, it is safe to assume the equality property: junior buyers in different coalitions have similar probabilities of successfully conducting transactions with sellers by gaining sufficient information from the senior buyers in their coalitions, and this probability of success approaches 1. In the rest of this section, we build on the properties summarized above for the stable status of our system to theoretically prove that the coalitions formed in our system are both split-proof and merge-proof, and thus stable.

Proposition 1. Given a partition P = {C_1, ..., C_n} of N buyers (the set of all buyers in the system) that has the three properties of disconnection, existence and equality, in each coalition C with coalition center c, any senior buyer i gains more credits than the credits R_i^s generated due to buyer i's own successful transactions, where s ∈ S.
Proof. Without loss of generality, assume there are m buyers in coalition C. Since buyer i is a senior, its contributed personal experience/information Ê_i^s takes a larger proportion than that of the buyers that are not seniors. According to the definition of a senior agent in Definition 4,

Ê_i^s / (E_i^s + ∑_{l≠i} Ê_l^s) ≥ 1/(m − 1)

holds for any s ∈ S. Replacing E_i^s by Ê_i^s using Equation (4), we derive

Ê_i^s / ∑_{l∈C} Ê_l^s ≥ 1 / (m − 1 − D_{i,c}/(1 − D_{i,c})).

The disconnection property indicates that max_{i∈C} D_{i,c} < τ_c and that the subjective difference between any two coalitions C and C' is larger than 2 × max{τ_c, τ_{c'}}. Since the upper bound of the subjective difference in Definition 2 is 1, τ_c must be smaller than 1/2. Therefore, Ê_i^s / ∑_{l∈C} Ê_l^s > 1/(m − 1). According to the credit allocation rule in Algorithm 1, the number of credits allocated to i due to the successful transactions conducted by any other agent j in coalition C in a certain period of time t_0 can be formalized as R_{ji} = (Ê_i^s / ∑_{l≠j} Ê_l^s) p_j α r t_0. The equality property shows that p_i ≈ p_j → 1. Then, we can obtain:

R_{ji} = (Ê_i^s / ∑_{l≠j} Ê_l^s) p_j α r t_0 ≥ (Ê_i^s / ∑_{l∈C} Ê_l^s) p_j α r t_0 > (1/(m − 1)) p_i α r t_0 = (1/(m − 1)) R_i^s.

Buyer i can gain credits from the successful transactions conducted by the m − 1 agents in C other than i. Thus, for the total number of credits R(i) that buyer i is able to obtain, R(i) = ∑_{s∈S} ∑_{j∈C, j≠i} (Ê_i^s / ∑_{l≠j} Ê_l^s) p_j α r t_0 > ∑_{s∈S} R_i^s holds.
Theorem 1. Given a partition P = {C_1, ..., C_n} having the three properties of disconnection, existence and equality, any coalition C in P is split-proof.

Proof. According to the analysis of stability, a partition is split-proof if for each group of agents A in coalition C, there exists at least one agent whose utility in A is smaller than that in C. We analyze the behavior of each type of buyer (junior, common and senior) in coalition C. For a junior buyer i in the coalition C with center c, according to our analysis in Section 4.1, its main purpose of joining coalition C is to increase its probability of successfully conducting transactions with sellers by gaining information about sellers from the senior buyers in the coalition. Thus, it will choose a coalition that maximizes (1 − D_{i,c}) ∑_{l≠i} Ê_l^s. If the junior agent i splits out, the total amount of available information in the new coalition decreases, which further decreases i's utility. Therefore, junior buyers have no incentive to split out from coalition C with any group of other buyers. For a senior buyer j in the coalition C, its main purpose of joining C is to obtain more credits due to other members' successful transactions with sellers. Suppose that some of the seniors in coalition C split out to form a new coalition A. Because the seniors have similar amounts of information about their common sellers, the numbers of credits generated by them are similar. Thus, if those seniors split out as A, the number of credits each receives will be similar to the number it generates. However, according to Proposition 1, those seniors can gain more credits in C than they generate; hence these seniors gain more credits in coalition C than in A. In the case where some seniors have more information than other seniors, the seniors with less information will gain fewer credits in A than in C. Thus, senior buyers have no incentive to split out to form a new coalition with other seniors. For a common buyer k in the coalition C, it has some amount of experience, less than that of a senior buyer but more than that of a junior buyer, and it can also be allocated some credits. Some common buyers may prefer to gain more information about sellers; these buyers have no incentive to split out, similarly to the junior buyers analyzed earlier. Other common buyers may prefer to increase their credits and want to split out with seniors. But, due to their smaller amount of experience about sellers compared to the seniors, they would be allocated even fewer credits than in coalition C, according to the credit allocation rule. In conclusion, no group of buying agents splitting out from C to form a new coalition A can guarantee that each of the buyers in A gains more utility. Our dynamic coalition formation is thus proven to be split-proof.
some seniors have more information than other seniors, those seniors with less information will gain less credits in A than C. Thus, senior buyers do not have incentives to split out to form a new coalition with other seniors. For a common buyer k in the coalition C, it has some amount of experience, which is less than that of a senior buyer but more than that of a junior buyer. It can also be allocated some number of credits. Some of the common buyers may prefer to gain more information about sellers. These buyers do not have the incentive to split out, which is similar to junior buyers’ behavior analyzed earlier. Some other common buyers may prefer to increase credits and want to split out with seniors. But, due to their less amount of experience about sellers compared to seniors, they will be allocated with even less credits than that when they are in coalition C, according to the credit allocation rule. In conclusion, no group of buying agents splitting out from C to form a new coalition A can guarantee that each of the buyers in A can gain more utility. Our dynamic coalition formation is proven to be split-proof. Theorem 2. Given a partition P having the three properties: disconnection, existence and equality, any pair of coalitions C and C in P is merge-proof. Proof. According to the analysis of stability, the pair of coalitions C and C is merge-proof if given any group of buyers A from both the two coalitions, not all buyers in A can gain more credits than in C or C . We prove this by analyzing the behaviors of each type of buyers. For any junior buyer i in the coalition C, its purpose of joining a coalition is to gain more information about sellers. Therefore, it prefers to merge with a group of buyers that i has less subjective difference with but that have more information about sellers. According to the equality property, junior buyer i in coalition C with center c and another junior buyer j in coalition C with center c can both gain sufficient amount about sellers in their respective coalition, of information ˆ s = (1 − Dj,c ) ˆ s therefore, (1 − Di,c ) l∈C,l=i E l l∈C ,l=i El . The disconnection property indicates that Di,c < τc , Dj,c < τc and Dc,c > 2 × max{τc , τc }. Thus, we can derive Di,c > τc > Dj,c and (1 − Di,c ) l∈C ,l=i Eˆls < (1 − ˆ s , meaning that the amount of inDj,c ) l∈C ,l=i Eˆls = (1 − Di,c ) l∈C,l=i E l formation i can gain in coalition C will be less than that gained in coalition C. Junior buyers do not have incentives to merge with other coalitions. For any senior buyer j in C, the subjective difference between the agent j with any group of agents from another coalition C is larger than τ (C’s radius), making j’s information less useful. In consequence, the number of credits j can receive by joining coalition C will be smaller than that in C. Therefore, the seniors do not have incentives to merge with buyers from any other coalition. For a common buyer k in C, after merging with buyers from another coalition C , either its probability of successfully conducting transactions with sellers or the number of credits it can receive may be decreased. Based on the above analysis, our dynamic coalition is also merge-proof. From Theorems 1 and 2, we can conclude that our dynamic coalition is stable.
5 Conclusion and Future Work
In this paper, we design a reputation system based on dynamic coalition formation. A credit allocation algorithm is proposed to elicit buying agents to share their personal experience/information about selling agents. In this system, buyers with different subjectivity form disconnected coalitions, and we theoretically prove that the coalitions formed in this way are stable. The results of our work address two fundamental and important problems of existing reputation systems: subjectivity and incentives for sharing experience. In our current work, we make some assumptions for the purpose of simplifying the quantitative and theoretical analysis of agents' behaviors in the system. For future work, we will relax these assumptions in our experimental analysis to more extensively evaluate the effectiveness of our system.
References 1. Jøsang, A., Knapskog, S.J.: A metric for trusted systems. In: Proceedings of the 21st National Security Conference, pp. 16–29 (1998) 2. Wang, Y., Zhang, J., Vassileva, J.: Effective web service selection via communities formed by super-agents. In: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, pp. 549–556 (2010) 3. Zhang, J., Cohen, R.: A personalized approach to address unfair ratings in multiagent reputation systems. In: Proceedings of the AAMAS Workshop on Trust in Agent Societies (2006)
Guild Play in MMOGs: Rethinking Common Group Dynamics Models

Muhammad Aurangzeb Ahmad1, Zoheb Borbora1, Cuihua Shen2, Jaideep Srivastava1, and Dmitri Williams3

1 Department of Computer Science, University of Minnesota, MN 55455, USA
2 Emerging Media & Communication Program, University of Texas Dallas, Richardson, TX 75080, USA
3 Annenberg School for Communication, USC, Los Angeles, CA 90089, USA
{mahmad,zborbora,srivastav}@cs.umn.edu, [email protected], [email protected]
Abstract. Humans form and congregate into groups for a variety of reasons and in a variety of contexts, e.g., corporations in offline space and guilds in Massively Multiplayer Online Games (MMOGs). In recent years a number of models of group formation have been proposed. One such model is Johnson et al.'s [10] model of group evolution, motivated by commonalities observed in the evolution of street gangs in Los Angeles and guilds in an MMOG (World of Warcraft). In this paper we first apply their model to guilds in another MMOG (EverQuest II) and find results inconsistent with the model's predictions; additionally, we find support for the role of homophily in guild formation, which was ruled out in previous results. We then explore alternative models for guild formation and evolution in MMOGs by modifying earlier models to account for the existence of prior relationships between people.

Keywords: Guilds, MMOGs, Groups, Models of group evolution.
1 Introduction

How humans form groups and how these groups evolve over time has a long history of research [6,8,15]. Large-scale study of group formation has been limited by the unavailability of data. With the advent of the Internet and online systems where millions of people can simultaneously interact with one another in virtual communities and virtual worlds, data are no longer an obstacle. It is now possible to analyze human behavior and group formation in much more detail and at vast scales. Researchers have argued that, given the complex and interdependent nature of interaction and behavior in MMOGs, they are often sufficiently similar to the "real world" for us to gain important insights about the social [21], behavioral [11,16] and economic [5] aspects of the real world, a scientific analysis known as "mapping" [17]. Guilds are formal organizations of players in MMOGs, and they are ideal for studying the formation and evolution of human groups because they parallel existing, well-known and well-studied groups such as work teams and friendship networks [18].
Johnson et al. [10] posited a model of the evolution of groups and applied it to both guilds in World of Warcraft (WoW) and street gangs in Los Angeles. They discovered that the same model can be used to describe how guilds and street gangs evolve over time. Their model was based on the idea that the driving force in their evolution is the tendency to maximize the diversity of skillsets in the group. They also showed that a variation of their model based on homophily cannot reproduce the behavior of these groups over time, and thus that homophily can be ruled out as an explanation for the formation and evolution of these groups. In this paper we take Johnson et al.'s model and its homophily-based variation and apply them to data from another MMOG, EverQuest II (EQ2). Given that WoW and EQ2 are sufficiently similar, and that the results of Johnson et al.'s model are applicable to two very different domains, the model should be applicable to the EQ2 dataset. However, we find that the results we obtain for EQ2 diverge from what was reported previously: we observe that homophily-based models can also explain the evolution and formation of guilds. Based on these results we propose a new model of group evolution, obtained by modifying the team formation model and introducing the element of prior relationships between members.

2 Related Work

There is a growing body of research on MMOGs, and there are many studies on the multiple aspects of socialization [3] that highlight the importance of grouping and guilds. Previous work in MMOGs has studied multiple types of interactions in MMOGs [2], mentoring [1], and how task-oriented group formation depends upon the common challenge-oriented nature of its participants [9]. Keegan et al. [4] analyzed the trade networks of gold farmers and the social networks of drug dealers. Previous work on guilds includes looking at the factors that make them successful [10,18]. Thurau et al. [12] examined the evolution of social groups in World of Warcraft (WoW).
3 Models of Guild Formation in MMOGs

A guild is a formal and relatively stable organization supported by the code of an online game. Guilds can range in size from several players to a couple of hundred or even more. Player characters can belong to only one guild but are allowed to quit one guild and join another. Guilds form so that members have an easy way to play together and a common identity. Each guild has a guild master, analogous to a company president, and a hierarchy of players analogous to military or corporate forms. People join guilds for a variety of reasons, but typically for access to resources and knowledge, for social support, and to avoid playing with more anonymous strangers [18]. While a number of papers have been written about guilds in MMOGs, these studies suffer from the following deficiencies:

• Almost all previous studies use data from one MMOG, namely World of Warcraft. While some researchers think [10,18] that the results of these studies may be generalizable, this is an empirical question, and until the results are replicated in other MMOGs, generalizations cannot really be made.
• Most studies of MMOGs take a static snapshot of the data, and thus there have not been many longitudinal studies, with some exceptions [10].
• Most papers analyze guild data at the character level but not at the account level.
Prior relationships can also serve as a strong basis for the formation of bonds [9]. We first describe the group formation model of Johnson et al. [10] before describing the results of that model on our dataset. The Johnson et al. model consists of n agents. Each agent i is randomly assigned attribute parameters pi and Δpi, where Δpi describes how much the parameter pi can vary. Both of these attributes are sampled from Gaussian distributions with given means and standard deviations (σp and σΔp, respectively). Associated with each agent is a tolerance value τi. The parameter pi is an abstract representation of a person's attributes, and homophily is defined with respect to similarity in this attribute. The simulation starts with each agent being part of a "group"; at each time step an agent is picked at random and decides whether to stay with her current group, join a new group, or merge groups, based on certain pre-defined criteria. The following scenarios are possible:

(i) Joining a Guild: If the agent is not part of any guild and has to decide whether to join a guild, the agent considers the average attribute of the guild, PJ = (1/nJ) Σk∈J pk, and decides to join the guild if its attributes are sufficiently different from the attributes of the guild, i.e., if the condition |pi − PJ| > Δpi is met. Since not only the player but also the guild has to consent for the player to join, the attributes of the individual members of the guild are also compared with those of the applicant. The person is admitted if her attributes are sufficiently different. Consider the function f, which measures the range of attributes covered by the applicant as compared to the rest of the guild:

fi,J = (1/nJ) Σk∈J θ(Δpi − |pi − pk|),

where J is the guild under consideration, θ(x) = 1 for x > 0 and θ(x) = 0 otherwise. The new person is acceptable to the guild if the value of f is less than the average tolerance τJ of the guild.

(ii) Leaving a Guild: A person who is already part of a guild can decide to leave if she realizes that her abilities are sufficiently similar to those of the rest of the guild members. This is measured by the fraction of guild members with similar attribute ranges:

fi = (1/nI) Σk∈I θ(Δpk − |pk − pi|).

If fi < τi then the agent leaves the guild.

(iii) Switching Guilds: Even if an agent finds a guild tolerable, she can still switch guilds if she finds a more suitable one. In this case another agent j is selected at random and the characteristics of guild J of agent j are considered. The agent switches guilds if the following two criteria are met: |pi − PJ| > |pi − PI| and fi,J < τJ.

(iv) Guild Mergers: If nothing happens in the previous two steps, a merger of the two guilds is considered. Guild I, to which agent i belongs, merges with guild J, to which agent j belongs, if |PI − PJ| > ΔPI, where ΔPI = (1/nI) Σk∈I Δpk. Guild J considers merging with guild I if |PJ − PI| > ΔPJ.
Fig. 1 & 2. Distribution of Guild Sizes when the data from March (account level) and May (account level) is taken as the seed respectively
Fig. 3 & 4. Distribution of Guild Sizes at the account level when the data from May (account level) and July (character level) is taken as the seed respectively
5 Criticism and Alternative Models

While Johnson et al.'s model can replicate some features of group evolution in two different datasets over time, there are a number of areas where it falls short. First, we note that the comparison with the kinship model under the given model parameters does not map well, because the game mechanics of MMOGs incentivize class diversity rather than uniformity in groups. Additionally, the manner in which kinship and homophily [13] are defined greatly affects how the simulations are set up and consequently what type of results one obtains; e.g., in the Johnson et al. model homophily is defined unidimensionally, in terms of similarity in abilities. There are other dimensions of homophily, e.g., in terms of demographics, game play, etc. Moreover, in a game scenario pure similarity is a strategic liability. For example, a group of all healers or all wizards does not perform as well as a mixed group [14]. Consequently their model is not sufficient to refute the kinship hypothesis of group and guild formation. Secondly, the authors observe that different parameter values fit the data for different ethnic groups in gang memberships and for different servers, and conclude that servers are analogous to ethnicities. This conclusion is not warranted, since different servers usually represent different types of game play, where either the goals or the rules of the game are slightly different, thus creating varying social dynamics [18]. Equating these to ethnic groups does not fit. Thirdly, they only initialize their simulations at the character level. Players in MMOGs typically maintain one account but often create multiple characters. Based on these observations, we use the models proposed by Johnson et al. to replicate the distribution of guild sizes in another MMOG, EQ2. In terms of the nature of play and the setting of the game, EQ2 is quite similar to WoW, which was used in the original experiments. In addition to replicating the experiments based on the models given by Johnson et al., we also formulate new variations of their models based on observations regarding the social networks of players in MMOGs. Players in MMOGs form social relations for a variety of reasons and in a variety of contexts [9]. Considering the process of guild formation, guilds either form around existing social ties or facilitate the formation of new social ties. We therefore consider scenarios where a person's decision to join a guild is directly dependent upon social ties that may already be present.

(i) Joining a Guild: When the agent has to decide whether she wants to join a guild, she compares her abilities with those of the rest of the guild as before, but her tolerance is modified by a variable αi that captures the amount of socialization of the agent with the other agents in the network:

αi = Σj∈J µi(j) / Σk∈K µi(k),

where the function µi(j) is the number of interactions that agent i has had with agent j, J is the set of all members of the guild under consideration, and K is the set of all agents that agent i has interacted with. This quantity is a relative measure of the socialization of agent i with the members of guild J. The same condition still holds for joining a guild, i.e., |pi − PJ| > Δpi, but the tolerance of the agent is modified based on the amount of socialization: τi = τi · (1 − αi).

(ii) Leaving a Guild: The agent decides to leave the guild in an analogous manner, with the tolerance modified in the same way as in the previous step.

(iii) Switching Guilds: The same scheme is used for switching guilds as in the original model, but the preference formula is modified to account for the socialization factor α of the agent.

(iv) Guild Mergers: In the case of guild mergers we modify the tolerance of a guild based on the commonality of socialization between the two guilds. Considering guilds I and J from the previous examples, with αI given by Jaccard's coefficient between the two guilds, the tolerance of a guild is defined as τI = τI · (1 − αI).

The proposed model thus retains the features of the original model, keeping the emphasis on a minimalist model, but it introduces social interactions to determine how guilds grow over time based on the social network of the agents who participate in the group.
6 Experiments and Simulations

The data from EQ2 spans from January 1, 2006 to September 4, 2006. The dataset has a total of 2,122,612 characters and 675,281 unique accounts. Each account therefore
has a little more than three characters attached to it on average, suggesting a difference from the assumptions made by Johnson et al. Not all players, however, are part of a guild. We use data from one of the servers (Guk), where 45,800 players are observed, of whom 13,115 (28.67 percent) were part of a guild. We note that the data were already anonymized so that it is not possible to link accounts in the game to real-world people, and thus the privacy of the players is preserved. We first describe the results of replicating Johnson et al.'s model on the EQ2 dataset. We start with the parameters given in their paper [10]. Additionally, we used grid search to explore the space of parameters; we report only the best results because of space limitations. In order to determine the best set of parameters, we computed the KL divergence between the simulated and the real distribution of guild sizes. Figures 1 through 4 show the best results for the distribution of the guilds at the end of the simulation. The experimental setup is such that we start with different months as the starting point for the simulations and then compare the distributions at the end of the time span, i.e., September 2006. Consider Figures 1 and 3, which show the distribution of the guild sizes. The months in this case (March, May, July) refer to the starting point for the simulation, i.e., the data used as the seed. In Figures 1 through 4, "Actual" refers to the data collected from EQ2, "Team Formation" refers to the results of simulations using the team formation model proposed by Johnson et al., and "Kinship" refers to the kinship model described in their paper. The x-axis is the size of the guilds and the y-axis is the number of guilds of that size. In contrast to previous studies, the results here are given at both the character level and the account level. It should be noted that the distributions of guild sizes differ at the character level as compared to the account level, as participation rates of players may vary and in many cases the same account may have multiple characters. In general, the results are better at the account level than at the character level. If we compare these results to the results reported by Johnson et al. [10] for the self-organized guild model in their paper, it is clear that the discrepancy is much higher for the EQ2 data than for WoW. They also reported a poor fit between the guild distributions and the simulated data when the homophily-based model was used. In our case, while this is true for most settings, for extreme values of tolerance (0.95) the homophily model yields results comparable to the team formation model, as evident in Figures 1 through 4. This is in contrast to the results of Johnson et al., who did not find any support for homophily. This points towards a major difference between EQ2 on the one hand and WoW and street gangs in LA on the other.
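The fit criterion can be computed directly from the two guild-size histograms, as in the following sketch; the add-one smoothing used to avoid empty bins is our choice rather than a detail reported in the paper.

import numpy as np

def guild_size_kl(real_sizes, simulated_sizes):
    """KL divergence of the real vs. simulated guild-size distributions."""
    top = max(max(real_sizes), max(simulated_sizes))
    bins = np.arange(1, top + 2)
    p, _ = np.histogram(real_sizes, bins=bins)
    q, _ = np.histogram(simulated_sizes, bins=bins)
    # add-one smoothing keeps every bin probability strictly positive
    p = (p + 1.0) / (p + 1.0).sum()
    q = (q + 1.0) / (q + 1.0).sum()
    return float(np.sum(p * np.log(p / q)))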
Additionally, we report the results of our simulations using the modified model. In EQ2, a strong form of social relationship between players can be inferred from the trust between them, since the game has a built-in mechanism through which players can state how much they trust other players. We thus use this trust information to modify the model, so that if there is a trust relationship between two players they are more likely to come together in guilds and to stay in them. Due to space limitations, we give these results at the character level only, in Figures 5 and 6, for our modified kinship model as well as for the original model of Johnson et al.
Fig. 5 & 6. Distribution of Guild Sizes at the character level for the network and team formation models when March and May data is taken as the seed
From these figures it is apparent that the best results from the network-based model are indistinguishable from those of the group formation model. The main point to note is that it is possible to obtain the same results and fit as the team formation model by making only minimal changes to the homophily model. We observe the same dynamics for the various models described here: overall, the results are better at the account level than at the character level. These observations point us toward the conclusion that such models may be inadequate when characters are taken as the fundamental unit of analysis, but they work well at the account level.
7 Conclusion

In this paper we attempted to replicate a previous model of the evolution of groups in virtual worlds, critiqued flaws in its data and assumptions, and proposed a new model that can be used when back-end data are available. The results imply that online worlds can have distinct features and nuances, such that some social dynamics are not similar across virtual worlds. It is also important to map out the dimensions of what exactly is meant by "diversity" as a principle for group formation. Biologically based diversity operates on a different principle than skill-based diversity in an environment where identity may not play as strong a role as efficiency and strategy. Offline, while people may present different aspects of their personality, they are not physically different, with different appearances and skills, from context to context [7]. In virtual worlds, this is common and in many cases trackable. Any analysis must account for this multiple-personality equivalent before beginning calculations. We proposed an alternative model for the evolution of groups. Given that discrepant results were observed for WoW and EQ2, which are both MMOGs that one would expect to behave similarly, we caution against generalizing from either set of results, and we recommend that such models be explored in greater detail and on more datasets from other virtual systems before any generalizations are made.

Acknowledgement. The research reported herein was supported by the AFRL via Contract No. FA8650-10-C-7010, and the ARL Network Science CTA via BBN TECH/W911NF-09-2-0053. The data used for this research was provided by the
Sony Corporation. We gratefully acknowledge all our sponsors. The findings presented do not in any way represent, either directly or through implication, the policies of these organizations.
References
1. Ahmad, M.A., Huffaker, D., Wang, J., Treem, J., Poole, S., Srivastava, J.: GTPA: A Generative Model for Online Mentor-Apprentice Networks. In: 24th AAAI Conference on Artificial Intelligence, Atlanta, Georgia, July 11-15 (2010)
2. Ahmad, M.A., Borbora, Z., Srivastava, J., Contractor, N.: Link Prediction Across Multiple Social Networks. In: Domain Driven Data Mining Workshop (DDDM 2010), ICDM 2010 (2010)
3. Bainbridge, W.S.: The Warcraft civilization: Social science in a virtual world. The MIT Press, Cambridge (2010)
4. Keegan, B., Ahmad, M.A., Williams, D., Srivastava, J., Contractor, N.: Dark Gold: Statistical Properties of Clandestine Networks in Massively-Multiplayer Online Games. In: IEEE Social Computing Conference, Minneapolis, MN, USA, August 20-22 (2010)
5. Castronova, E.: Synthetic worlds: The business and culture of online games. University of Chicago Press, Chicago (2005)
6. Epstein, J., Axtell, R.: Growing Artificial Societies: Social Science from the Bottom Up. MIT Press, Cambridge (1996)
7. Goffman, E.: The presentation of self in everyday life. Doubleday, Garden City (1959)
8. Guimerà, R., Uzzi, B., Spiro, J., Amaral, L.A.N.: Team assembly mechanisms determine collaboration network structure and team performance. Science 308, 697–702 (2005)
9. Huang, Y., Zhu, M., Wang, J., Pathak, N., Shen, C., Keegan, B., Williams, D., Contractor, N.: The Formation of Task-Oriented Groups: Exploring Combat Activities in Online Games. In: Proceedings of IEEE SocialCom 2009 (2009)
10. Johnson, N.F., Xu, C., Zhao, Z., Ducheneaut, N., Yee, N., Tita, G.: Human group formation in online guilds and offline gangs driven by a common team dynamic. Physical Review E 79(6), 066117 (2009)
11. Huffaker, D., Wang, J., Treem, J., Ahmad, M.A., Fullerton, L., Williams, D., Poole, S., Contractor, N.: The Social Behaviors of Experts in Massive Multiplayer Online Role-playing Games. In: IEEE Social Computing (SocialCom 2009), August 29-31 (2009)
12. Thurau, C., Bauckhage, C.: Analyzing the Evolution of Social Groups in World of Warcraft. In: Proc. IEEE Conf. on Computational Intelligence and Games (2010)
13. McPherson, M., Smith-Lovin, L., Cook, J.M.: Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology 27, 415–444 (2001)
14. Mulligan, J., Patrovsky, B., Koster, R.: Developing online games: An insider's guide. Pearson Education, London (2003)
15. Palla, G., Barabási, A.-L., Vicsek, T.: Quantifying social group evolution. Nature 446(7136), 664–667 (2007)
16. Schrader, P.G., McCreery, M.: The acquisition of skill and expertise in massively multiplayer online games. Educational Technology Research and Development 56, 557–574 (2008)
17. Williams, D.: The Mapping Principle, and a Research Framework for Virtual Worlds. Communication Theory 20(4), 451–470 (2010)
18. Williams, D., Ducheneaut, N., Xiong, L., Zhang, Y., Yee, N., Nickell, E.: From Tree House to Barracks: The Social Life of Guilds in World of Warcraft. Games and Culture 1(4), 338–361 (2006)
Tadvise: A Twitter Assistant Based on Twitter Lists

Peyman Nasirifard and Conor Hayes

Digital Enterprise Research Institute, National University of Ireland, Galway
IDA Business Park, Lower Dangan, Galway, Ireland
[email protected]
Abstract. Micro-blogging is yet another dynamic information channel where the user needs assistance to manage incoming and outgoing information streams. In this paper, we present our Twitter assistant called Tadvise that aims to help users to know their followers/communities better. Tadvise recommends well-connected topic-sensitive followers, who may act as hubs for broadcasting a tweet to a larger relevant audience. Each piece of advice given by Tadvise is supported by declarative explanations. Our evaluation shows that Tadvise helps users to know their followers better and also to find better hubs for propagating community-related tweets. Keywords: Micro-blog, Twitter, People-Tag, Information Sharing.
1 Introduction

In this paper we present Tadvise (http://tadvise.net), a novel application to assist Twitter users in selecting which followers would best be able to propagate a message to a relevant community-oriented audience. Tadvise automatically adds such well-connected hubs to a tweet to attract their attention. Hubs are those followers who have more well-connected topic-sensitive followers than others. Our approach is mainly based on Twitter lists, which can be perceived as a way of tagging people [2]. Tadvise uses Twitter lists for building user profiles in order to make recommendations on tweet diffusion. Tadvise is most useful for those Twitter users interested in sharing information, recommendations and news (such as conference announcements and events) with like-minded users in a community. Earlier work [8,3] demonstrated the community (i.e., highly reciprocal network) structure of the Twitter network. As such, the scope of our work is focused on community-related pass-along tweets. For example, tweets like "deadline extended for next drupal conference..." are considered to be in the scope of Tadvise, as they are relevant to a particular interest group. On the other hand, informal status updates such as "having breakfast now..." are out of the scope of Tadvise. We analyse the followers of a seed user (followers at distance 1) plus the followers of the followers of the seed (followers at distance 2) when considering the relevant audience for a (re)tweet. While
not actually following the seed, followers at distance 2 may be influenced by or be interested in a seed's community-related tweets, due to the dense community structure of the network [8,3] and the principle of locality [1]. Our focus is not to prohibit users from generating and submitting novel content, but to help them understand their followers' communities better.
2 Tadvise Overview and Components

Tadvise builds user profiles for twitterers in order to recommend tweets or retweets that could be potentially relevant to a community of their followers. To register for Tadvise, a twitterer u chooses to follow the Tadvise Twitter account (i.e., @Tadvise). Once notified, Tadvise crawls the social network of u and builds user profiles of her followers. After completing these steps, which are performed offline, Tadvise sends a direct message to u, indicating that it is ready to provide advice. By visiting the Tadvise homepage, u can obtain advice and/or tweet a message directly to Twitter. The current version of Tadvise uses a traffic-light metaphor to present its advice. A green light means that the majority of u's followers were tagged with one or more (hash)tags that occur in the tweet. A red light means that none of u's followers were tagged with the (hash)tags in the tweet. Finally, an amber light means that some of u's followers, but not the majority, were tagged with the (hash)tags in the tweet. Tadvise has three main components: a crawler, a user profile builder and an advice engine. We describe all three components below. Before proceeding, we formally define a Twitter-like system: a system S with n nodes (users) U = {u1, u2, ..., un}, where there exists a set of unidirectional relationships R between users, such that if ui makes a relationship (rij ∈ R) with uj, we call ui a follower of uj and uj a followee of ui. We denote this relationship by ui → uj. We assume that the system S is open, so that any user can make relationships with other users. The sets of followees and followers of ui are denoted by Ui^fr and Ui^fo respectively. User ui can assign zero or more tags ({t1, t2, ..., tm}) to each of her followees. We define a function lists that takes a user uj as input and returns pairs (ui, tk), meaning that ui has assigned tk to uj.
2.1 Crawler of Tadvise
The crawling component of Tadvise takes a seed as input and uses the Twitter API to crawl twitterers. It works in two steps. First, it crawls the network of followers at distance one and two of the seed (i.e., a breadth-first mechanism). The second step takes the network of followers from the first step and crawls the Twitter lists associated with each follower. Each API call returns 20 list memberships of a user. We put a limit (300) on the number of Twitter lists crawled per user, as 300 tags are generally sufficient for building a high-quality user profile for our purpose.
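The two-step crawl can be summarised as in the sketch below, where api is a hypothetical client whose followers and list_memberships methods stand in for the corresponding Twitter API calls.

MAX_LISTS = 300  # cap on list memberships crawled per user

def crawl(api, seed):
    # step 1: breadth-first crawl of followers at distance 1 and 2
    distance1 = set(api.followers(seed))
    network = set(distance1)
    for follower in distance1:
        network.update(api.followers(follower))

    # step 2: crawl the Twitter lists associated with each follower,
    # paging through the API (each batch holds up to 20 memberships)
    memberships = {}
    for user in network:
        lists, page = [], 0
        while len(lists) < MAX_LISTS:
            batch = api.list_memberships(user, page)
            if not batch:
                break
            lists.extend(batch)
            page += 1
        memberships[user] = lists[:MAX_LISTS]
    return network, memberships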
2.2 User Profile Builder of Tadvise
In order to assess the relevance of a tweet to a single user uj, we create a weighted user profile for uj containing metadata on uj's communities, interests, expertise, etc. In short, each user profile is composed of metadata extracted from the Twitter lists (tags) assigned to the user by other users. In order to build a weighted user profile, we need to rank the tags that have been associated with a user (i.e., rank the result of lists(uj)). We do this by ranking the users who assigned the tags. There have been several studies of user ranking on Twitter [3,8,4], with no one technique demonstrating superiority. As such, we make use of Kwak et al.'s finding [4] that a simple in-degree measure behaves similarly to PageRank on the Twitter network (see equation 1). As Twitter is an open platform where connections are not necessarily reciprocal and do not require confirmation from the followee side for public accounts, we do not consider the outgoing links (i.e., followees) for ranking purposes.

rank(ui) = log(#Ui^fo)    (1)

Note that our ranking method can be generalised to a recursive one (see equation 2): users who have more high-ranked followers have higher ranks.

rank(ui) = Σ_{uj ∈ Ui^fo} rank(uj)    (2)

The weight of a particular Twitter list for a target user profile is calculated by summing up the ranks of the people who have assigned that Twitter list description to the target person (see equation 3).

weight(tk, uj) = Σ_{(ui,tk) ∈ lists(uj)} rank(ui)    (3)

As Twitter lists consist of arbitrary phrases, we use the Porter stemming algorithm [6] to reduce the number of unique terms. For tags that comprise more than one term, we use the stemmer on each term.
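Combining equations (1) and (3), a weighted profile can be built as in the following sketch; NLTK's Porter stemmer is used here as one concrete implementation of [6], and the input layout follows the lists function defined above.

import math
from collections import defaultdict
from nltk.stem import PorterStemmer

_stemmer = PorterStemmer()

def rank(num_followers):
    # equation (1): rank(u_i) = log(#U_i^fo)
    return math.log(num_followers) if num_followers > 1 else 0.0

def build_profile(assignments, follower_counts):
    """Weighted user profile from Twitter lists.

    assignments: iterable of (tagger, tag) pairs, i.e. lists(u_j).
    follower_counts: dict mapping each tagger to her follower count.
    """
    profile = defaultdict(float)
    for tagger, tag in assignments:
        # stem every term of a multi-word list name to merge variants
        key = " ".join(_stemmer.stem(term) for term in tag.lower().split())
        # equation (3): add the rank of each user who assigned the tag
        profile[key] += rank(follower_counts.get(tagger, 0))
    return dict(profile)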
2.3 Advice Engine of Tadvise
The advice engine component takes user profiles and a tweet as inputs and provides two kinds of real-time diffusion advice: (a) audience profiling, which allows users to identify the subset of their followers that were tagged with a term used in the tweet; and (b) recommending well-connected topic-sensitive users who may retweet the tweet. Given a tweet and a user ui, we first extract tags from the tweet. Typically, twitterers use hashtags to specify particular topics (e.g., #drupal). We extract such tags from the tweet and enrich them using Google Sets (http://labs.google.com/sets). Enriching hashtags is important, as it may
give us a set of tags that are semantically relevant to the original tags. Our analysis suggests that Google Sets provides more contextually relevant suggestions than lexical databases such as WordNet. Moreover, we also analyse the URLs within a tweet. Using regular expressions, we extract HTTP and FTP URLs from a tweet. We then use the delicious API (http://delicious.com/help/api) to retrieve the tags associated with each URL. We do not enrich delicious tags, as delicious already recommends sufficient tags for a given URL. We then merge the tags from delicious and Google Sets. For the first part of the diffusion advice (i.e., detecting the tags that are relevant to the majority of the followers), we build aggregated user profiles that comprise the user profiles of all followers of a seed at distance 1 and 2 (i.e., a summation). We represent these aggregated profiles as followersProfile1(ui) and followersProfile2(ui) respectively. These profiles contain the (sorted) weights of all tags assigned to followers and to followers of the followers of a seed. Moreover, we cluster the sorted weights in followersProfile1(ui) and followersProfile2(ui) into two partitions, representing frequently occurring (thus highly weighted) lists and infrequently occurring lists. Rather than applying a fixed threshold to each profile, we find a knee point between the two partitions by applying the k-means clustering algorithm with k = 2. The first partition, which groups high-ranked tags, is the source of the green light of the traffic light. The second partition is the source of amber-light advice. Tadvise shows the red light if it is unable to find any representative tags of the tweet within either partition. Note that the traffic-light metaphor is not intended to prohibit users from generating novel content. Algorithm 1 shows pseudocode for the second part of the diffusion advice (i.e., recommending several well-connected topic-sensitive followers). The input of this algorithm is a directed graph g, which is built as follows. The root of g is the seed ui. We add all members of Ui^fo to g (uj → ui), since when ui tweets a message all of her followers receive that tweet and can thus act as potential hubs. Then, those followers of each follower of ui who were tagged with one or more (hash)tags in the tweet are added to g (using followersProfile2(ui)). We pass g to the algorithm. The algorithm finds k hubs in g using in-degree, so that the hubs cover as many interested followers (at distance 2 of ui) as possible and have as few overlapping followers as possible with each other. The reason we also consider overlapping followers is to minimise redundant tweets; however, we envision allowing users to enable/disable this feature. The default value of k in Algorithm 1 is 3. The "hub score" in Algorithm 1 indicates the number of interested users who could potentially receive a tweet through a hub. As tweets are limited to 140 characters, we also consider the length of a hub's screen name when making a recommendation: if two hubs each expose a tweet to n further users, we choose the hub with the shorter screen name. We add the recommended candidates automatically to the tweet by inserting their screen names after the '@' sign and enable the user to tweet it directly from the Tadvise interface.
Input: directed graph g; integer k (number of recommended hubs)
Output: candidates ⊂ g

1: candidates ← ∅
2: covered ← ∅
3: while size(candidates) != k do
4:   calculate hubs in g and sort them based on hub scores
5:   node ← the node with the highest hub score, such that followers(node) ∩ covered is minimum
6:   candidates ← candidates ∪ {node}
7:   covered ← covered ∪ followers(node)
8:   g ← g − followers(node) − {node}
9:   if g == root(g) then break
10: end
11: return candidates

Algorithm 1. Finding Well-Connected Hubs
In order to convince end users that our recommendations are relevant, we provide simple text-based explanations.
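For reference, a direct Python rendering of Algorithm 1 might look as follows; representing the graph g as a map from candidate hubs to their interested followers is our simplification.

def find_hubs(followers, k=3):
    """Greedy selection of k well-connected, minimally overlapping hubs.

    followers: dict mapping each candidate hub to the set of interested
    users (followers at distance 2 of the seed) reachable through it.
    """
    candidates, covered = [], set()
    remaining = dict(followers)
    while len(candidates) < k and remaining:
        # highest hub score first; ties broken toward minimal overlap
        # with the followers already covered by chosen hubs
        node = max(remaining, key=lambda n: (len(remaining[n]),
                                             -len(remaining[n] & covered)))
        candidates.append(node)
        covered |= remaining.pop(node)
        # drop the already covered followers from the rest of the graph
        for n in remaining:
            remaining[n] = remaining[n] - covered
    return candidates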
3 Evaluation and User Study
We evaluated the following three main hypotheses. Hypothesis 1: Twitter lists assist twitterers in knowing each other better. Hypothesis 2: Users find it difficult to keep track of their followers; Tadvise helps users to know their followers (as a whole) better by identifying their communities, interests, expertise, etc. This hypothesis is important because such knowledge may help users to boost communication and collaboration opportunities and may encourage them to tweet community-related tweets more often. Hypothesis 3: Tadvise helps users to propagate their community-related tweets more efficiently and effectively by proposing well-connected followers for a particular topic (instead of blind and ad-hoc retweeting requests). The first two hypotheses are more general and aim to shed some light on (future) research on Twitter lists. The third hypothesis is the main one related to Tadvise's functionality.
3.1 Experiment – Design
In order to provide support for our hypotheses, we designed a survey that was personalised for each participant. For the survey design we followed the design recommendations of [7] and the well-known Questionnaire for User Interaction Satisfaction (QUIS) (http://lap.umd.edu/quis/). The survey had five main steps, with a number of questions in each step. Most of the questions in the survey had five possible replies: strongly agree, agree, neutral, disagree, and strongly disagree.
Step 1: General Questions - In the first step, the goal was to study: (a) whether subjects agree with the Twitter lists assigned to them; (b) whether the lists that were assigned to them fall into certain categories; and (c) whether the lists they assign(ed) to others fall into certain categories. The aforementioned categories refer to common people-tagging categories discovered in a large-scale analysis of tagging behaviour [5]. They are as follows: Characteristic (e.g., friendly, cool), Interest and Hobby, Affiliation (e.g., IBM), Working Group, Location, Name (e.g., Peter, Mary), Project, Role (e.g., boss), Skill and Expertise, Sport, and Technology (e.g., drupal, semantic-web). Steps 2-4 were presented in a game-like fashion, with the subject having to guess or choose from a set of answers. Each step had four sub-steps.
Step 2: Usefulness of Twitter Lists/People-Tags - In step 2, we collected data on the usefulness of Twitter lists. For the first three sub-steps of step 2, we picked one random follower who had been assigned to at least three Twitter lists by any user and was also a followee of the subject. We then asked the subject to assign three Twitter lists to the follower. After the subject clicked the submit button, we fetched the real Twitter lists assigned to the follower and asked the subject whether the result was useful in knowing the follower better. In sub-step 2.4, we focused on the community of the subject and asked the subject to guess three Twitter lists that fit the majority of her followers. After the result was submitted, we showed our analysis result (i.e., all Twitter lists in the first partition of followersProfile1(subject)) to the subject and asked whether it helped them to know the community of their followers better.
Step 3: Knowledge of Followers - Step 3 of the survey measured how well subjects know their followers. In each sub-step, we showed a random Twitter list (fetched from followersProfile1(subject)) to the subject and asked two questions: (1) the approximate percentage of the followers who were assigned to that list; and (2) which followers (from twenty random followers) were assigned to that Twitter list. In sub-steps 3.1 and 3.2, we picked a random Twitter list from the first partition of followersProfile1(subject) and ensured that at least 50% (if possible) of the 20 random followers were correct answers. In sub-steps 3.3 and 3.4, we picked a random Twitter list from the second partition. We allowed the subjects to skip a Twitter list (at most three times in each sub-step) if they could not understand its meaning. In order to prevent the subjects from selecting all followers, we put a maximum limit on the number of followers that could be selected. After the result was submitted, we showed the correct percentages and the missing followers from the list, and asked the subjects whether this information helped them in knowing their followers/communities better.
Step 4: Usefulness of Recommendations - In step 4, we investigated whether subjects found Tadvise recommendations useful. In sub-steps 4.1 and 4.2, we showed a random Twitter list (as a topic) from the first partition of followersProfile1(subject) and asked the subject to select two well-connected followers who could propagate a tweet about the topic to a broader audience. We enabled the subjects to select two followers from drop-down boxes, each containing twenty random followers, two of which were the correct answer.
For sub-steps 4.3 and 4.4, we carried out the same experiment, but with Twitter lists from the second partition. After the result was submitted, we presented the subject with our recommended hubs and provided explanations to justify our recommendations. Subjects were asked whether they were sufficiently convinced to use the recommendations.
Step 5: General Questions - In the final step, we asked subjects several general questions. Among others, we asked the subjects if they would find it useful to receive advice on whether their followers may be interested in a particular tweet. We also asked the subjects if they would find it useful to receive advice about the most effective and well-connected hubs.
3.2 Experiment – Result
Participants Overview. We created personalised online surveys for 112 Twitter candidates, among whom 11 candidates did not fulfil our requirements for the survey: each subject had to have at least three followers who had been assigned to at least three Twitter lists and who were also followees of the subject (i.e., a reciprocal link). The survey was online for four weeks and we asked all 101 eligible candidates via email, instant messaging or direct tweet to participate in our survey. In total, 76 eligible candidates participated, among whom 66 completed the survey. 47% of the participants who completed the survey (i.e., 31 participants) had 100 or more followers; among them, twelve participants had more than 500 followers, and four had 1000 or more.
Results. The results show that 79.1% of participants who were assigned to one or more Twitter lists stated that the Twitter lists represent them correctly. Only 1.6% of participants claimed that they were assigned incorrectly to a list. Whether assigning lists or being assigned to lists, participants indicated that 96% of lists came from the following categories: Affiliation: 24.3%, Technology: 14.6%, Interest and Hobby: 15.9%, Skill and Expertise: 13.8%, Working Group:
Fig. 1. Figure (a) is related to our first hypothesis: 58.1% of participants agreed that Twitter lists assist them to know their followers better, whereas 18.6% disagreed; figure (b) is related to our second hypothesis: 57.4% of participants agreed that Tadvise helps them to know their followers/community better, whereas 17.3% disagreed; figures (c) and (d) are related to our third hypothesis: 72% of participants found Tadvise recommendations and explanations for propagating community-related tweets convincing, whereas 13.7% disagreed (figure (c)); moreover, 49.3% of participants found Tadvise recommendations and explanations for propagating non-community-related tweets convincing, whereas 18.2% disagreed (figure (d)).
9.2%, Location: 8.4%, Characteristic: 6.3%, Project: 3.8%, Role: 1.7%, Name: 1.3%, and Sport: 0.8%. We used the results of sub-steps 2.1, 2.2, 2.3, 3.3 and 3.4 for evaluating our first hypothesis; sub-steps 2.4, 3.1, and 3.2 for evaluating our second hypothesis; and sub-steps 4.1, 4.2, 4.3, and 4.4 for evaluating our third hypothesis. Figures 1(a)-1(d) show the results for our hypotheses. In step 5, 48.4% of participants were positive about being advised whether a tweet is relevant to the majority of their community-related followers, whereas 28.1% of participants were negative. The rest (23.5%) selected the Undecided option. 78.1% of participants were positive about being recommended hubs that could efficiently retweet a tweet, and only 7.8% of participants found it useless. The rest (14.1%) selected the Undecided option.
4 Conclusion
In this paper we presented Tadvise, a system for helping users to manage the flow of messages in a micro-blogging network. We described our method for profiling the followers in a user's network and for giving advice on who the well-connected topic-sensitive hubs are in relation to a tweet. The results of our personalised evaluation surveys suggest that participants were mainly interested in being recommended hubs that can effectively retweet their messages, and that they found Tadvise recommendations for (mainly) community-related tweets useful and convincing. Acknowledgments. This work is partially supported by Science Foundation Ireland (SFI) under Grant No. SFI/08/CE/I1380 (Lion-2 project).
References
1. Chen, J., Nairn, R., Nelson, L., Bernstein, M., Chi, E.: Short and tweet: experiments on recommending content from information streams. In: CHI 2010, pp. 1185–1194. ACM, New York (2010)
2. Farrell, S., Lau, T., Nusser, S., Wilcox, E., Muller, M.: Socially augmenting employee profiles with people-tagging. In: UIST 2007, pp. 91–100. ACM, New York (2007)
3. Java, A., Song, X., Finin, T., Tseng, B.: Why we twitter: understanding microblogging usage and communities. In: WebKDD/SNA-KDD 2007: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, pp. 56–65. ACM, New York (2007)
4. Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In: WWW 2010, pp. 591–600. ACM, New York (2010)
5. Muller, M.J., Ehrlich, K., Farrell, S.: Social tagging and self-tagging for impression management. Tech. rep., IBM Watson Research Center (2007)
6. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)
7. Shneiderman, B., Plaisant, C.: Designing the User Interface: Strategies for Effective Human-Computer Interaction, 4th edn. Pearson Addison Wesley (2004)
8. Weng, J., Lim, E.P., Jiang, J., He, Q.: TwitterRank: finding topic-sensitive influential twitterers. In: WSDM 2010, pp. 261–270. ACM, New York (2010)
A Case Study of the Effects of Moderator Posts within a Facebook Brand Page

Irena Pletikosa Cvijikj and Florian Michahelles

Information Management, ETH Zurich
Scheuchzerstrasse 7, 8092 Zurich, Switzerland
{ipletikosa,fmichahelles}@ethz.ch
Abstract. Social networks have become an additional marketing channel that can be integrated with traditional ones, such as news and television media, as well as other online channels. User participation, as a main feature of social networks, imposes challenges on traditional one-way marketing, with the result that companies experiment with many different approaches, shaping a successful social media approach through trial-and-error experience. Our study analyses the effects of moderator post characteristics, such as post type, category and posting day, on user interaction in terms of number of comments and likes, and interaction duration, for the domain of a sponsored Facebook brand page. Our results show that there is a significant effect of the post type and category on likes and comments (p < 0.0001) as well as on interaction duration (p < 0.01). The posting day has an effect only on the comments ratio (p < 0.05). We discuss the implications of our findings for social media marketing. Keywords: Web mining, Facebook, social media marketing.
1 Introduction

Marketing has recently undergone significant changes in the way information is delivered to customers [1]. Social networks, as a part of Web 2.0 technology, provide the technological platform for individuals to connect, produce and share content online. They are becoming an additional marketing channel that can be integrated with the traditional ones as a part of the marketing mix. Through users' feedback or by observing conversations, a company can learn about customers' needs, which can result in involving members of the community in the co-creation of value through the generation of ideas [2]. Companies, from food to electronics, are starting to understand the possibilities offered by social network marketing. They have evolved their approach to customers, shifting from traditional one-to-many communication to a one-to-one approach, offering assistance at any time through social media sites such as Facebook, Twitter, MySpace, etc. [3]. Still, social network marketing is currently at a relatively early evolutionary stage and has yet to be studied from different perspectives.
The goal of our paper is to evaluate the effect of the post characteristics (1) post type, (2) post category and (3) posting day on the interaction level on a sponsored Facebook brand page. We measure the interaction level through (1) the number of comments on an individual post, (2) the number of likes and (3) the interaction duration. The question we try to answer is:
• What is the effect of the moderator posts on the level of interaction within a Facebook page?
In the remainder of the paper we present the related work, explain the method we used, and present and discuss the obtained results.
2 Related Work

A social network (SN) is an online service that allows an individual to create a public profile, connect to other users, and access and explore personal and other users' lists of connections [4]. At the moment, Facebook is the largest SN, with more than 500 million active users [5], and the second most visited web page [6]. SNs and Facebook have been studied from different perspectives. Usage patterns were investigated in [7], i.e., "social searching", to maintain/solidify existing offline relationships, as opposed to "social browsing" for meeting new people. In addition, [7] also revealed the surveillance function of Facebook, which allows users to "track the actions, beliefs and interests of the larger groups to which they belong". Other studies include usage motivations, such as social connection, shared identities, content, social investigation, social network surfing and status updating [8]; the existence and usage characteristics of communities with a high degree of internal interaction [9]; messaging activities in terms of regularities in daily and weekly traffic in relation to the users' demographics [10]; and high-level characteristics of the users [11]. User participation as a main feature of social networks imposes challenges on traditional one-way marketing, resulting in companies experimenting with many different approaches, thus shaping a successful social media approach based on trial-and-error experience [12]. Still, according to [13], social networks may play a key role in the future of marketing; they may increase customers' engagement and help to replace the traditional focus on control with a collaborative approach suitable for the modern business environment. Previous studies in the field have focused on the users, trying to identify the most influential target group [14] or explain their relation to the social media [15]. Others have addressed the challenges of social marketing, such as aggressive advertisement, lack of e-commerce abilities, invasion of user privacy and legal pitfalls [16]. In addition, companies should avoid over-commercialization and favor transparency instead of trying to fully control their image [12], [3]. An inappropriate approach to these challenges could lead to fan loss and expose the company to the risk of destroying its own credibility. Apart from the challenges, many opportunities have also been recognized, such as raising public awareness about the company, community involvement and gathering experience for future steps [16]. In addition, [17] argues that social networking
can also help find talent and new customers, and help conduct brand intelligence and market research. Based on exploratory findings and practical examples, scholars have tried to generate guidelines for successful social marketing. Guidelines that apply to online word-of-mouth [18] can also be used for Facebook marketing: (1) sharing control of the brand with consumers and (2) engaging them in an open, honest, and authentic dialog. According to [14], companies need to build an approach plan before diving into social marketing, in order to appropriately approach the frequent users who are most likely to virally spread their enthusiasm for a new product or service. The suggestions given include (1) focusing on a conversation, (2) developing a close relationship with the brand through "friending" with the social marketing pages and (3) building a plan for engagement and finding out what interactions, content, and features will keep users coming back. Our study analyses the effects of the posts shared by the moderator of a sponsored Facebook brand page in terms of user interactions, such as the number of comments and likes, and interaction duration. To the best of our knowledge, this study is the first to measure interaction on Facebook in relation to the actions undertaken by the page moderator. We discuss our results in order to identify the implications for social media marketing.
3 The Method

3.1 The Dataset

The dataset used for this study consists of posts shared on the ok.- Facebook brand page. ok.- is a Swiss consumer goods brand targeting younger customers with a social network marketing approach. This particular brand was selected for this study because we had access to the shared data since the first day of creation of its Facebook brand page. The data collection was performed over one year, from the official launch of the ok.- page in March 2010 to March 2011. To guarantee accuracy of the data and ensure independence from potentially changing Facebook policies, posts were fetched on a daily basis, using a script utilizing the Facebook Graph API1. For the selected period of time, 120 moderator posts were obtained.

3.2 Post Categories Assignment

In addition to the data fetched through the Facebook Graph API, we were interested in evaluating the effect of different post categories. The category definitions were created by the ok.- social media marketing manager in the communication planning phase, before the official launch of the Facebook page. As such, they represent a part of the company's social media marketing strategy for the ok.- Facebook brand page. The assignment of a category to each of the posts was also done by the ok.- social media manager, as a part of the interaction planning process. The explanation of each of the assigned categories and a corresponding example are given in Table 1.
1 http://developers.facebook.com/docs/reference/api/
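A daily fetch along these lines can be written as a short script; the sketch below uses placeholder identifiers and simply follows the Graph API's pagination links.

import json
import urllib.request

PAGE_ID = "PAGE_ID_PLACEHOLDER"     # the brand page id (placeholder)
ACCESS_TOKEN = "TOKEN_PLACEHOLDER"  # application access token (placeholder)

def fetch_wall_posts(page_id, token):
    """Fetch all posts on a page wall via the Facebook Graph API."""
    url = ("https://graph.facebook.com/%s/posts?access_token=%s"
           % (page_id, token))
    posts = []
    while url:
        with urllib.request.urlopen(url) as response:
            payload = json.load(response)
        posts.extend(payload.get("data", []))
        # follow the pagination link until no further pages remain
        url = payload.get("paging", {}).get("next")
    return posts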
Table 1. Post categories and examples

Product(s) announcement: Announcement of a new product launch. Example: "4 new ok.- chocolate bars are here!"
Information: Information regarding a sales location, number of page fans, etc. Example: "Two k kiosk Shops opened today in Egg. Have fun shopping!"
Designed question: Posts in the form of questions with the goal of engaging users in a dialog. Example: "Is it ok never to grow up?"
Questioner: Using the Facebook Poll to obtain answers to a specific question. Example: "There is a new questioner under "Polls/Quizzes+" on a topic…"
Competition: Posts related to a competition, i.e., announcements, rules, winners, etc. Example: "Do you want to be an ok.- star? Our displays wait for your post…"
Advertisement: Advertisement of existing products (mostly used in the form of photo posts). Example: "ok.- products, 5 new photos (photo post)"
Statement: Posts in the form of a statement, stating an opinion on a certain topic. Example: "The fact that sun and rain are changing at the moment is not ok.-"
3.3 Used Variables

There are two basic elements that relate to the posting activity of the moderator as a part of the engagement plan: (1) what a moderator should post on the "wall" to trigger more user interaction, and (2) when the content should be posted. Posts shared on Facebook can be characterized by the type of the post and by its content. The post type corresponds to the "sharing" action taken by the page moderator within a Facebook page. For the observed period, Facebook pages offered the possibility to share: (1) a status, (2) a photo, (3) a video and (4) a link. Depending on the selected sharing action, Facebook assigns the corresponding post type to each post. The content can be described through the topics reflected in the posts. Since classification of the posts into topics would result in too many groups, making the statistical analysis difficult, we decided to use the assigned post categories as a more general representation. In order to answer the second question, we selected the posting day of the week as a factor that might influence the level of user interaction. This particularly applies to the selected Facebook page, since it represents a regional brand and thus all of the users originate from the same time zone. We confirmed this reasoning with the demographics data from Facebook Insights. Based on this reasoning we selected the following independent variables for our study: (1) the post type, as defined by Facebook, (2) the assigned post category, as described in the previous section, and (3) the day of the week when posting was done. In terms of user interaction, apart from posting, Facebook offers the possibility to comment on or "like" the posts shared on the "wall". Based on this, we selected the number of comments and likes as measures of the level of user interaction. Since the number of comments and likes is not an absolute measure, but is related to the number of page fans at the moment of posting, we decided to use the likes and comments ratios as more accurate interaction measures. The dependent variables were thus calculated using the following formulas:
LR = NL / NF,    (1)

CR = NC / NF,    (2)

ID = DLI − DC,    (3)
where NL is the number of likes, NC is the number of comments and NF is the total number of fans on the day of posting. In addition, DC, the date of creation, and DLI, the date of last interaction, are used to calculate the interaction duration. Table 2 explains all of the independent and dependent variables used and their possible values.

Table 2. Independent and dependent variables used in the study

Variable  Description           Values                      Type         Source
PT        Post type             status, photo, video, link  Independent  Graph API
DOW       Day of week           Monday, Tuesday, …, Sunday  Independent  Graph API
C         Category              (see Section 3.2)           Independent  Valora
LR        Likes ratio           Numerical                   Dependent    Graph API
CR        Comments ratio        Numerical                   Dependent    Graph API
ID        Interaction duration  Numerical                   Dependent    Graph API
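Per post, formulas (1)-(3) reduce to a few lines of code; the field names in this sketch are ours.

from datetime import datetime

def interaction_metrics(n_likes, n_comments, n_fans, created, last_interaction):
    """LR, CR and ID for a single post.

    n_fans is the number of page fans on the day of posting; created and
    last_interaction are datetimes of the post and its last like/comment.
    """
    lr = n_likes / n_fans                                            # equation (1)
    cr = n_comments / n_fans                                         # equation (2)
    id_days = (last_interaction - created).total_seconds() / 86400  # equation (3)
    return lr, cr, id_days

# hypothetical example: 42 likes, 10 comments, 12,000 fans on the posting day
print(interaction_metrics(42, 10, 12000,
                          datetime(2010, 6, 1, 9, 0),
                          datetime(2010, 6, 3, 15, 30)))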
3.4 Data Analysis
In order to answer our research questions, we needed to analyze the effect that each of our independent variables has on each of the dependent variables. For that purpose we performed statistical tests to see whether there are significant differences in our results. We used the Kruskal–Wallis non-parametric test for one-way analysis of variance, since the normality test on our data gave a negative outcome for all three dependent variables (CI = 95%, p < 0.0001). Furthermore, for the post-hoc analysis we applied Mann–Whitney tests with Bonferroni correction.
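This procedure is easy to reproduce with standard tools. The sketch below, assuming per-post data grouped by a factor such as post type, mirrors the described pipeline using SciPy; deriving the effect size as r = Z/√N for each pairwise comparison is a common convention and an assumption about how the effect sizes in Tables 4 and 6 were obtained.

```python
from itertools import combinations

import numpy as np
from scipy import stats

def kruskal_then_posthoc(groups, alpha=0.05):
    """groups: dict mapping a factor level (e.g. post type) to a 1-D array
    of one dependent variable (e.g. likes ratio). Runs the Kruskal-Wallis
    omnibus test, then pairwise Mann-Whitney tests with Bonferroni
    correction and an effect size r = Z / sqrt(N)."""
    h, p = stats.kruskal(*groups.values())
    print(f"Kruskal-Wallis: H = {h:.2f}, p = {p:.4f}")
    pairs = list(combinations(groups, 2))
    for a, b in pairs:
        _, p_raw = stats.mannwhitneyu(groups[a], groups[b],
                                      alternative="two-sided")
        p_adj = min(p_raw * len(pairs), 1.0)   # Bonferroni correction
        n = len(groups[a]) + len(groups[b])
        z = abs(stats.norm.ppf(p_raw / 2.0))   # Z recovered from two-sided p
        if p_adj < alpha:
            print(f"{a} vs {b}: r = {z / np.sqrt(n):.2f}, p_adj = {p_adj:.4f}")
```

With three post types and three dependent variables, this reproduces the structure of the analysis in Section 4, up to the exact correction and effect-size conventions used by the authors.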
4 Results

4.1 Post Type
In the selected dataset only three post types were present: status, photo and link. A Kruskal–Wallis test has shown that there is a significant effect of post type on all three variables: the likes ratio (H(2) = 20.24, p < 0.0001), the comments ratio (H(2) = 21.90, p < 0.0001) and the interaction duration (H(2) = 11.32, p = 0.0035). Table 3 shows the descriptive statistics obtained from the Kruskal–Wallis test.
Table 3. Descriptive statistics for LR, CR and ID for each post type

Type   | N  | LR Median | LR Sum  | CR Median | CR Sum  | ID Median | ID Sum
Status | 74 | 0.00213   | 0.24853 | 0.00078   | 0.12424 | 0.464     | 393.825
Photo  | 29 | 0.00338   | 0.87995 | 0.00122   | 0.98923 | 2.121     | 1626.295
Link   | 17 | 0.00072   | 0.08528 | 0.00006   | 0.00418 | 0.067     | 20.379
The results of the post-hoc analysis have shown that there are also significant differences between individual post types. The detailed results are shown in Table 4.

Table 4. Effect size obtained from the post-hoc analysis (*p < 0.05, **p < 0.005, ***p < 0.0001)

Pair         | LR      | CR      | ID
Status–Photo | 0.25*   | –       | –
Status–Link  | 0.37**  | 0.44*** | –
Photo–Link   | 0.56*** | 0.57*** | 0.44**
4.2 Post Category
The obtained descriptive statistics for the post category effect are shown in Table 5.

Table 5. Descriptive statistics for LR, CR and ID for each post category

Category      | N  | LR Median | LR Sum   | CR Median | CR Sum   | ID Median | ID Sum
Statement     | 8  | 2.82E-03  | 3.08E-02 | 3.65E-04  | 8.05E-03 | 3.08E-01  | 1.02E+01
Des. Question | 24 | 2.21E-03  | 6.17E-02 | 1.91E-03  | 5.16E-02 | 1.24E+00  | 2.12E+02
Announcement  | 12 | 4.70E-03  | 1.34E-01 | 2.59E-03  | 7.13E-02 | 3.29E+00  | 6.81E+02
Information   | 44 | 2.09E-03  | 2.36E-01 | 3.66E-04  | 3.34E-02 | 3.52E-01  | 3.61E+02
Competition   | 21 | 5.54E-04  | 2.26E-02 | 2.53E-04  | 1.43E-02 | 1.39E-01  | 3.88E+01
Advertisement | 5  | 9.30E-03  | 7.07E-01 | 5.08E-03  | 9.03E-01 | 8.34E+01  | 6.03E+02
Questionnaire | 6  | 1.12E-03  | 2.21E-02 | 1.22E-03  | 3.57E-02 | 1.15E+01  | 1.33E+02
Table 6. Effect size obtained from the post-hoc analysis (*p < 0.05, **p < 0.005, ***p < 0.0001)

Pair                      | LR      | CR
Statement–Competition     | 0.57*   | –
Des. Question–Information | –       | 0.55***
Des. Question–Competition | 0.58*   | 0.55**
Announcement–Information  | –       | 0.49**
Announcement–Competition  | 0.74*** | 0.50*
Information–Competition   | 0.42**  | –
Information–Advertisement | –       | 0.42*
Competition–Advertisement | 0.59**  | 0.56*
A significant effect of post category was found on all three variables: the likes ratio (H(6) = 34.34, p < 0.0001), the comments ratio (H(6) = 35.54, p < 0.0001) and the interaction duration (H(6) = 17.28, p = 0.008). A post-hoc analysis revealed significant differences between individual post categories. Table 6 shows the results of the post-hoc analysis.

4.3 Day of Week
The descriptive statistics obtained from the Kruskal–Wallis test for the effect of the day of week are shown in Table 7.

Table 7. Descriptive statistics for LR, CR and ID for each posting day of the week

Day       | N  | LR Median | LR Sum   | CR Median | CR Sum   | ID Median | ID Sum
Monday    | 25 | 1.89E-03  | 1.21E-01 | 9.46E-04  | 6.75E-02 | 3.21E-01  | 4.97E+02
Tuesday   | 18 | 2.09E-03  | 7.06E-01 | 1.59E-03  | 9.15E-01 | 8.30E-01  | 5.52E+02
Wednesday | 24 | 2.55E-03  | 1.17E-01 | 8.96E-04  | 4.56E-02 | 9.19E-01  | 1.47E+02
Thursday  | 15 | 1.22E-03  | 6.16E-02 | 9.64E-05  | 2.81E-02 | 1.90E-01  | 4.12E+02
Friday    | 30 | 2.25E-03  | 1.57E-01 | 6.07E-04  | 4.93E-02 | 1.60E+00  | 3.89E+02
Saturday  | 4  | 2.35E-03  | 3.64E-02 | 2.34E-04  | 3.32E-03 | 1.72E-02  | 8.00E-01
Sunday    | 4  | 2.69E-03  | 1.51E-02 | 1.46E-03  | 8.95E-03 | 1.97E+00  | 4.31E+01
The results obtained from the statistical testing show no significant effect on the likes ratio or the interaction duration (p > 0.05). A significant effect of the day of the week occurs only for the comments ratio (H(6) = 14.00, p = 0.030). In addition, a significant difference in the comments ratio exists only between posts shared on Tuesday and Thursday (p = 0.019, r = 0.54).
5 Discussion and Conclusions

The results presented in the previous section have shown that different post characteristics have an effect on the interaction on the Facebook page. Post type has an effect on all three measures of user interaction, i.e. the likes ratio (H(2) = 20.24, p < 0.0001), the comments ratio (H(2) = 21.90, p < 0.0001) and the interaction duration (H(2) = 11.32, p = 0.0035). Photos caused the greatest level of interaction, followed by Statuses and Links. In addition, a significant difference exists between each of the post types. The likes ratio is significantly larger for Photos compared to Statuses and Links, and for Statuses compared to Links. The comments ratio is significantly smaller for Links compared to the other two post types. Finally, interaction on Photos lasts significantly longer compared to that on Links. Post category also displays a significant effect on all three measures of user interaction: the likes ratio (H(6) = 34.34, p < 0.0001), the comments ratio (H(6) = 35.54, p < 0.0001) and the interaction duration (H(6) = 17.28, p = 0.008). The results of the post-hoc analysis show that the likes ratio is significantly lower for Competitions compared to all other categories except Questionnaires. Furthermore,
Information and Competitions have a significantly lower comments ratio compared to Designed Questions, Announcements and Advertisements. For the interaction duration there is no significant difference between any individual categories. The posting day has shown no effect on the likes ratio and interaction duration (p > 0.05). The effect is only visible for the comments ratio (H(6) = 14.00, p = 0.030). In addition, a significant difference exists only between posts shared on Tuesday and Thursday, i.e. the comments ratio is much higher for posts shared on Tuesday. User engagement is social media marketing's new key metric [19]. In addition, [20] proposes triggering user interaction as one of the actions to be taken in order to optimize the marketing investment. In the context of social media marketing on Facebook, user engagement can be measured through the number of posts shared by users on the brand page, the number of comments and likes, and the duration of the interaction. Increasing the interaction can be achieved by finding out which actions, content, and features will keep users coming back. In our previous discussion, we proposed two basic questions that correlate with the posting activity of the moderator, i.e. what a moderator should post on the "wall" and when the content should be posted in order to increase user interaction. Furthermore, we suggested using the post type and category to classify the content, and the posting day of the week as a relevant factor for selecting the appropriate time for posting. Our results show that post type and category have a significant effect on user interaction and as such should be used when planning the communication strategy. Furthermore, the effects on the comments and likes ratios are larger than the effects on the interaction duration. We assume that this is related to the fact that the "wall" of a Facebook brand page can only display a limited number of posts. Once posts are no longer visible on the "wall", the interaction stops. Under a regular posting strategy, each post would be visible for approximately the same time, resulting in approximately the same interaction duration. The interaction on Photos lasts significantly longer because, when a user "clicks" on a photo post, Facebook opens the full photo album shared on the page. The user would then probably go through all the photos that are new to him, thus interacting with some photos long after they were posted. We plan to investigate this further in our future studies. Regarding the posting day of the week, our results indicate that it is not a valuable factor for interaction planning. Thus, the question of when a moderator should share a new post on the wall remains to be studied further. We are aware of the limitations of our study in terms of having a dataset containing only 120 posts from a single Facebook page. In order to overcome this limitation we plan to expand our analysis to a larger dataset gathered from other Facebook brand pages. Still, our results show clear evidence of moderator posts increasing the activity of fans on a Facebook brand page. This should encourage moderators of Facebook pages to prepare clear posting strategies that trigger the activity of users and drive adoption in the long run.
6 Future Work

In this paper we presented our results from evaluating the effect of the post characteristics (type, category and posting day) on the interaction level in terms of the number of comments, the number of likes and the interaction duration. Our results show that there is a significant effect of post type and category on all three interaction measures, while the posting day has an effect only on the comments ratio. The results presented in this paper are limited to the dataset obtained from a single Facebook page. In order to confirm our findings we plan to expand our analysis to posts gathered from other Facebook brand pages as well. In addition, we plan to introduce the post topic as an influencing factor. Furthermore, we would like to investigate the interaction on posts shared by the page fans, to understand whether they exhibit similar patterns. This would provide us with an insight into the level of influence of individual users, i.e. "superfans" [12], versus the moderator within the Facebook brand page. Finally, we want to compare our results to those from different categories of Facebook pages.
References

1. Brandt, K.S.: You Should be on YouTube. ABA Bank Marketing 40(6), 28–33 (2008)
2. Palmer, A., Koenig-Lewis, N.: An experiential, social network-based approach to direct marketing. Direct Marketing: An International Journal 3(3), 162–176 (2009)
3. Gordhamer, S.: 4 Ways Social Media is Changing Business. Mashable.com (2009), http://mashable.com/2009/09/22/social-media-business/
4. Boyd, D.M., Ellison, N.B.: Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication 13(1), 210–230 (2008)
5. Facebook Statistics, http://www.facebook.com/press/info.php?statistics
6. Alexa.com, http://www.alexa.com
7. Lampe, C., Ellison, N., Steinfield, C.: A Face(book) in the crowd: Social searching vs. social browsing. In: Proceedings of the 20th Anniversary Conference on CSCW, pp. 167–170 (2006)
8. Joinson, A.N.: Looking at, looking up or keeping up with people?: Motives and use of Facebook. In: Proceedings of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, pp. 1027–1036 (2008)
9. Nazir, A., Raza, S., Chuah, C.N.: Unveiling Facebook: a measurement study of social network based applications. In: Proceedings of the 8th ACM SIGCOMM Conference on Internet Measurement, pp. 43–56 (2008)
10. Golder, S., Wilkinson, D., Huberman, B.A.: Rhythms of Social Interaction: Messaging within a Massive Online Network. In: International Conference on Communities and Technologies (2007)
11. Gjoka, M., Sirivianos, M., Markopoulou, A., Yang, X.: Poking Facebook: Characterization of OSN Applications. In: Workshop on Online Social Networks (2008)
12. Coon, M.: Social Media Marketing: Successful Case Studies of Businesses Using Facebook and YouTube With An In-Depth Look into the Business Use of Twitter. M.A. Thesis (2010), http://comm.stanford.edu/coterm/projects/2010/maddycoon.pdf
13. Harris, L., Rae, A.: Social networks: the future of marketing for small business. Journal of Business Strategy 30(5), 24–31 (2009)
14. Li, C.: How consumers use social networks. Forrester Research Paper (2007), http://www.eranium.at/blog/upload/consumers_socialmedia.pdf
15. Agozzino, A.L.: Millennial students' relationship with 2008 top 10 social media brands via social media tools. Bowling Green State University (2010), http://gradworks.umi.com/34/05/3405748.html
16. Bolotaeva, V., Cata, T.: Marketing Opportunities with Social Networks. Journal of Internet Social Networking and Virtual Communities (2010)
17. Weston, R.: 7 Social Networking Strategies. Entrepreneur.com (2008), http://www.entrepreneur.com/technology/bmighty/article191312.html
18. Brown, J., Broderick, A.J., Lee, N.: Word of mouth communication within online communities: Conceptualizing the online social network. Journal of Interactive Marketing 21(3), 2–20 (2007)
19. Haven, B., Vittal, S.: Measuring Engagement. Forrester Research Group (2008), http://www.adobe.com/engagement/pdfs/measuring_engagement.pdf
20. Sterne, J.: Social Media Metrics: How to measure and optimize your marketing investment. John Wiley and Sons, Hoboken (2010)
Cognition or Affect? – Exploring Information Processing on Facebook

Ksenia Koroleva, Hanna Krasnova, and Oliver Günther

Humboldt-University Berlin, Institute of Information Systems, Spandauer Straße 1, 10178 Berlin, Germany
Abstract. Recognizing the increasing amount of information shared on Social Networking Sites (SNS), in this study we explore the information processing strategies of users on Facebook. Specifically, we investigate the impact of various factors on user attitudes towards the posts on their Newsfeed. To collect the data, we programmed a Facebook application that allows users to evaluate posts in real time. Applying Structural Equation Modeling to a sample of 857 observations, we find that it is mostly the affective attitude that shapes user behavior on the network. This attitude, in turn, is mainly determined by the communication intensity between users, outweighing the comprehensibility of the post, with post length and the posting frequency of the user playing almost no role.
Keywords: information processing, cognitive heuristics, attitude, cognitive and affective dimensions, social networking sites, Facebook.
1 Introduction

The Social Networking Site (SNS) Facebook currently counts over 500 million active users [13] and is deeply integrated into the daily lives of most of them. Its popularity lies in providing the opportunity for users to share their daily experiences, memorable moments, thoughts, feelings and opinions with each other. Facebook is the largest database of social information, growing at a rate of 30 billion pieces of shared content per month [13]. As the networks of users grow, the amount of information shared on the network increases manifold. Thus, users have to apply certain strategies to process and evaluate the information from their friends presented to them on the Newsfeed. As users are limited in their cognitive capacities and motivation, it becomes critical to understand what users like and consider useful. Overall, increased information sharing on the Newsfeed is accompanied by the problem of information overload, a phenomenon referring to the emotional state of dissatisfaction due to the increasing amount and decreasing quality of information [11], revealed in the qualitative study by [26]. Recognizing this problem, Facebook has introduced information filtering. As opposed to presenting information on the Newsfeed in order of appearance (Most Recent), the algorithm identifies the news it finds most relevant for the user (Top News) based on three criteria: time, post type and communication intensity [33]. However, the algorithm does not take into account
other seemingly important factors, such as a match in tastes or the posting frequency of users, and does not consider individual preferences for information presentation. As a result, users increasingly express negative attitudes towards the Newsfeed: "It takes so much effort to pick out the information I am curious about, in between this and this" [26]. Usual consequences of information overload include confusion, dysfunctional effects in the form of stress and anxiety [11], as well as diminishing decision quality [5]. On SNS, information overload may result in reduced levels of activity on the network, negatively impact the overall attitude towards the Newsfeed, and diminish the possibility to obtain social capital [26]. Thus it becomes important to reduce the perception of information overload on social applications such as Facebook. Motivated by the risks posed by the rising amount of information on the Newsfeed, as well as the absence of systematic research on the topic in the context of SNS [9], in this paper we empirically investigate how users process the information on their Newsfeed and the attitudes that they form as a result. We explore what users like (affective attitude) and find useful (cognitive attitude), as well as what creates information overload on the Newsfeed. In this way we can provide valuable insights for the improvement of Newsfeed filtering algorithms. To achieve our goals, we test the causal part of the model proposed by [26], namely, we explore how various characteristics of the post and the person who posted it (the 'poster') influence the formation of the cognitive and affective components of the attitude towards the post.
2 Theoretical Background

2.1 Attitude Formation Process

In order to investigate the behaviour of users on the Newsfeed, we make use of several theories. First, the Theory of Reasoned Action (TRA) postulates that users act with respect to their intentions, which, in turn, are influenced by attitudes and subjective norms [14]. The validity of the TRA has been confirmed for various contexts, including the acceptance of IT, e.g. [29], and various social phenomena [1], and thus it can be applied to the SNS context as well. Second, the Technology Acceptance Model (TAM), tailored specifically to evaluate the adoption of IT, explores the impact of perceived usefulness and ease of use on user attitudes and the subsequent intention to use a specific technology [8]. The model has been applied in a wide array of studies in IS, e.g. [19], and focuses on attitude as the main determinant of system use. In line with these models, we presume that the behaviour of users on the Newsfeed will be a function of their intention to do something with a post (such as read, like or comment on it) based on their attitude towards it. As the main component of both TRA and TAM, attitude is a function of beliefs that the object of evaluation possesses certain desired attributes, and of the importance of these attributes to the person [14]. The attitude of users towards a post on their Newsfeed will be determined by the expectation of certain positive outcomes associated with reading the post, for example finding out some new information, an increased sense of connectedness [25] or social capital [10]. Consistent with the model of attitude formation
described by [2, 14], this belief will be conditioned on the level of personal importance of either the content of the post or the person who posted it. Attitudes are complex and multidimensional [7, 34]: [28] mention intensity, importance, knowledge, accessibility and affective-cognitive consistency among a myriad of attitude components. Empirical studies show that in different contexts some dimensions of attitude are more important: e.g., for spreadsheet technology it is the cognitive, but not the affective attitude that is associated with the behavioural intention [35]. According to [26], on SNS both affective and cognitive components play a role. The affective attitude refers to feelings about the post, whereas the cognitive attitude involves the evaluation of the information contained in it [14]. Depending on the characteristics and content of the post, the attitude formation process may differ: affective components are likely to play a role in the evaluation of posts expressing feelings and emotional states, such as 'I am so happy today', whereas cognitive components rather correspond to posts that contain some valuable information, such as 'Does anyone know a good doctor?' Differentiating between the two components of attitude may allow us to uncover differing dynamics in attitude formation, as well as to explore the behavioural intention, which has been found to be a result of the cognitive and affective evaluations [2, 35]. Therefore, our first research question is:

- What is the impact of the cognitive and affective components of attitude on the behavioural intentions of users towards the posts on their Newsfeed?

2.2 Information Processing

Depending on the level of personal importance and motivation, as well as the ability to process information, there are two possible strategies of information processing: heuristic and systematic processing. If the level of importance is high, systematic processing, involving deliberate evaluation of an issue's pros and cons and its comparison to existing knowledge structures [2, 4], is applied. Systematic processing requires high motivation to process the message and the dedication of a significant amount of cognitive resources to the task [4]. In conditions of increasing information overload, however, the ability of subjects to systematically evaluate every post on the Newsfeed may be limited. On the contrary, when the level of importance is low or cognitive ability is constrained, users are said to rely on relatively simple cognitive heuristics to form their attitudes [2]. [3] finds that users are often prone to base their decisions on such arbitrary characteristics as the attractiveness of the person, expert opinions or stereotypes. Heuristics, however, are learned knowledge structures that require availability in and activation from memory [4]. It is interesting to explore whether users apply heuristic or systematic processing when evaluating the posts on their Newsfeed. [26] find support for the existence of both information processing strategies. More complex systematic processing is usually applied when the subject matter is personally relevant to the person: "This could be something more interesting because she is talking about Econometrics, and I am also taking Econometrics, so it's interesting for me to look at it" [26]. At the same time, users increasingly rely on friend-based, interest-based, distance-based, self-centered or explicit cues to identify relevant information: "Usually I check my close friends, or the people I like most..." [26]. However, information processing theory [2]
suggests that when confronted with posts from people with whom a user communicates a lot, a high-involvement situation should be created, inducing users to process information systematically. Our second research question is:

- Do SNS users process information on the Newsfeed heuristically or systematically?

2.3 The Model

In line with the theories presented above, Figure 1 presents the model that explores the attitude formation process of users towards the posts on their Newsfeed. The dependent variable is the intention to engage in certain behaviour on the Newsfeed: comment on, like, read or ignore the post. It has been repeatedly shown that a person's intention to perform certain behaviours is the immediate determinant of actual behaviour [14]. The behavioural intention is, in turn, determined by the cognitive and affective components of the attitude of users towards a respective post. We adopt this two-dimensional conceptualization of attitude, as affective and cognitive components have been found to impact behaviour distinctly [7, 34].
(Figure: the four post and poster characteristics, Post Length (H1a/H1b), Posting Frequency (H2a/H2b), Post Comprehensibility (H3a/H3b) and Communication Intensity (H4a/H4b), point to the affective and cognitive attitude constructs, which in turn point to Intention (H5a/H5b).)

Fig. 1. The research model and hypothesized relationships
The independent variables are based on the conceptual model proposed by [26]. Specifically, we explore the impact of the amount of information, reflected by (i) the length of the post and (ii) the posting frequency of the 'poster', and of the value of information, approximated by (iii) the comprehensibility of the post and (iv) the communication intensity between users. In line with TRA, we want to explore how these distinct characteristics impact the cognitive and affective dimensions of attitude towards the post, and how these, in turn, are translated into the behavioural intention.

2.4 Derivation of Hypotheses

On the Newsfeed, users are confronted with their share of the 30 billion pieces of content shared each month [13]. Thus, the amount of information, measured by the length of the post and the posting frequency of others, may determine the attitude of users towards the post. Although the amount of information has been found to initially correlate positively with the quality of decisions or reasoning [11], an increase in information input beyond a certain level may result in information overload [32], as empirically confirmed in
numerous studies, e.g. [5]. On SNS, [23] find that people are more likely to respond to simpler messages in the "overloaded mass interaction" of online forums, and [30], in their study of status message usage, report that shorter messages receive more useful responses. Therefore we hypothesize that:

H1: The length of the post is negatively related to the affective (H1a) and cognitive (H1b) dimensions of attitude towards the post.

The phenomenon of information overload is closely related to the growing networks of users and the increasing probability of having several "spammers" in the network, the connection to whom is superficial: "Every second message is from Sam and most of them are not useful to me" [26]. Moreover, [24] find that a high frequency of posting in online chats requires quicker and more sustained processing by group members and can cause information overload. Thus, an increased amount of messages from the same person on the Newsfeed may lead to the formation of a negative attitude towards the posts from this person. Therefore we hypothesize that:

H2: The posting frequency of the 'poster' is negatively related to the affective (H2a) and cognitive (H2b) dimensions of attitude towards the post.

Perceived post comprehensibility is a basic prerequisite of being able to evaluate the post, and thus it may impact attitude formation. In their qualitative study, [26] recognize that for the formation of a favourable attitude posts need to be understood, not only in terms of language, but also in terms of their meaning: "And I don't like it, because I do not know what she is talking about: 'I feel like I never left', left what, who, when?" Therefore we hypothesize that:

H3: The comprehensibility of the post is positively related to the affective (H3a) and cognitive (H3b) dimensions of attitude towards the post.

Communication intensity has been recognized as important for determining information relevance on Facebook [33]. This is due to the fact that communication intensity can be used as a proxy for the level of relationship with the 'poster', as users tend to communicate more with those they are close with. [30] find that the closeness of the relationship with the asker is an important factor in receiving feedback on status updates. In line with information processing theory [2], communication intensity can either act as a cognitive heuristic and result in automatic attitude formation, or create a high-importance situation in which a user is determined to dedicate significant resources to evaluating the posts from those with whom s(he) regularly communicates. In either case, high communication intensity should exert a positive impact on the attitude. Therefore we hypothesize that:

H4: The communication intensity between the 'poster' and the user is positively related to the affective (H4a) and cognitive (H4b) dimensions of attitude towards the post.

In different contexts, the affective and cognitive dimensions of attitude have been found to perform differently [7, 34]. On SNS both attitudes can be distinguished: on the one hand, SNS are hedonic information systems [27], responsible for the formation of the affective attitude; on the other hand, they can deliver a lot of valuable information [10] and thus promote the formation of the cognitive attitude. Therefore we hypothesize that:

H5: The affective (H5a) and cognitive (H5b) dimensions of attitude are positively related to the behavioural intentions of users on the Newsfeed.
3 Empirical Study

3.1 Survey Design and Sampling

In order to evaluate Newsfeed data in real time, the survey was designed and registered as a Facebook application. Users had to log in to their Facebook accounts and install the application, after which they were explicitly asked for permission to access 6 posts on their Newsfeed. The posts were retrieved from the Facebook database using the Facebook Query Language (FQL), an API provided by Facebook with a structure similar to SQL [12]. Out of all available posts on the user's Newsfeed over the last 72 hours, 3 status updates, 2 links and 1 picture were randomly selected and presented for evaluation, one at a time, together with an integrated survey tool (see Table 1 for the questions). The invitations to take part in the survey were posted in numerous Facebook groups, as well as virally marketed through friends and friends of friends of the authors. As a reward for participating in the study, users were provided with scores reflecting their Facebook usage patterns. In total, 158 people completed the survey. As each user evaluated up to 6 posts, 857 observations were obtained.

3.2 Development of Measurement Scales

The items of the survey are presented in Table 1. For each post, the participant was first asked about his/her attitude towards it. In line with [7, 34], the two dimensions of attitude were measured by two items each: affective attitude was operationalized as the likability and interest level of the post, whereas cognitive attitude was operationalized as the perceived usefulness and relevance of the post. These items were measured on a 6-point scale, where the 'neutral' answer option was omitted in order to induce respondents to make their choice in a particular direction. This approach is justifiable, as the authors believe that, given the possibility to answer neutrally, users might have over-preferred this option in order to avoid engaging in the complex process of attitude formation [16]. Then the participants had to answer a series of questions regarding the comprehensibility of the post, the communication intensity and the posting frequency of the 'poster' (see Table 1). Finally, participants answered a series of demographic questions. The number of words, the only manifest variable used in the model, was recorded by the application automatically.

3.3 Descriptive Statistics

Our sample of 158 people consists of 51% male and 49% female respondents. 80% of the respondents are below 30 years old, with an age range from 21 to 55 years. Considering that 70% of Facebook users are between 18 and 44 years of age and 55.60% of Facebook users are female [22], our sample is representative of a significant part of the Facebook population. Our respondents are frequent users of Facebook: 82% log in at least once a day, a quarter of whom have Facebook running in the background when they are online. Moreover, they maintain considerably large networks: the mean number of friends is 242 and the median 196, which is higher than the average of 130 reported by [13].
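For illustration, the retrieval step described in Section 3.1 can be sketched as follows. FQL has long since been deprecated; the table and field names below follow the old documented `stream` table, and the exact query and endpoint used by the authors are not given in the paper, so everything here is an assumption.

```python
import json
import random
import urllib.parse
import urllib.request

# Illustrative FQL query over the viewer's Newsfeed for the last 72 hours;
# `stream` and its fields follow the old FQL documentation (deprecated API).
FQL = ("SELECT post_id, actor_id, message, created_time FROM stream "
       "WHERE filter_key = 'others' AND created_time > now() - 72*3600")

def fetch_newsfeed_posts(access_token):
    # Hypothetical legacy REST endpoint for FQL; no longer operational.
    url = ("https://api.facebook.com/method/fql.query?format=json&query="
           + urllib.parse.quote(FQL) + "&access_token=" + access_token)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def sample_for_survey(posts):
    """Randomly select 3 status updates, 2 links and 1 photo, as in the
    study; assumes posts carry a normalized 'kind' field added upstream."""
    kind = lambda k: [p for p in posts if p.get("kind") == k]
    return (random.sample(kind("status"), 3)
            + random.sample(kind("link"), 2)
            + random.sample(kind("photo"), 1))
```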
Table 1. Survey Instrument and Evaluation of the Measurement Model

Construct (AVE / Cronbach's alpha / composite reliability) | Items | Mean | SD | Factor loading

Affective attitude (.89 / .88 / .94)
  "How do you feel about this post?"
  Dislike (1) – like very much (6) | 3.95 | 1.21 | 0.945
  Very boring (1) – very interesting (6) | 3.52 | 1.42 | 0.946

Cognitive attitude (.91 / .91 / .95)
  "How do you evaluate this post?"
  Very useless (1) – very useful (6) | 2.86 | 1.47 | 0.955
  Very irrelevant (1) – very relevant (6) | 2.95 | 1.49 | 0.963

Behavioural intention (single item; AVE = 1.0)
  "What will you do with this post?" comment (6); like (5); read (4); brief look (3); ignore (2); hide (1) | 3.34 | 1.34 | 1.0

Communication intensity (.70 / .86 / .90)
  "How often do you communicate with this person through Facebook?" almost never (1) – almost always (5)
  private communication | 1.64 | 0.90 | 0.835
  public communication | 1.81 | 0.98 | 0.902
  following | 2.12 | 1.04 | 0.885
  "How often do you communicate with this person in real life?" | 2.07 | 1.13 | 0.733

Posting frequency (.80 / .81 / .88)
  "How much does this person post on Facebook?" very little (1) – very much (5) | 3.48 | 0.88 | 0.988
  almost never (1) – always (5) | 3.33 | 0.98 | 0.794

Comprehensibility (.81 / .78 / .89)
  "How well do you understand the post?" not at all (1) – very well (3)
  language of the post | 2.58 | 0.79 | 0.943
  meaning of the post | 2.47 | 0.73 | 0.860
Fig. 2. The frequency distribution of post evaluations
Regarding the 857 evaluated posts, Figure 2 shows clear differences between affective and cognitive post evaluations. Whereas 70% of the posts are perceived as generally likable (like very much – slightly like), only 37% are perceived as generally useful (very useful – slightly useful). Most differences are at the extremes: more than 24% of the posts are considered completely useless, whereas
only 3% are extremely disliked. Although this result seems intuitive, it strengthens the necessity to explore which factors impact each dimension of attitude. Regarding the intentions of people with respect to the posts they encounter, only 20% of the posts are either liked or commented on, whereas ca. 50% are read or briefly looked at, hinting at heuristic rather than systematic processing of posts by users. At the same time, almost 30% of the posts are ignored and/or hidden, which corroborates the existence of information overload on the Newsfeed.

3.4 Evaluation of the Measurement Model

The Partial Least Squares (PLS) approach was used to evaluate the proposed model due to its suitability for testing and validating exploratory models [18]. The choice of the methodology is further justified by the fact that PLS is mainly used to measure multi-item latent constructs, which most of our variables represent. Moreover, as PLS requires fewer statistical assumptions, it can be used even when normality assumptions are violated [6], which is the case for most of our variables. As suggested by [6], first the measurement and then the structural model were evaluated. Only reflective measurement evaluations were used. All calculations were carried out using SmartPLS 2.0 [31]. As SmartPLS standardizes all indicators in the first step of the analysis, the differences in scale width across different constructs are addressed. To examine the validity of the measurement model, convergent and discriminant validity were assessed. The results are presented in the four right columns of Table 1. For convergent validity, three criteria have to be fulfilled. First, indicator reliability is ensured if all factor loadings are higher than the required cut-off criterion of 0.7 [21], which is the case for all indicators in our model. Second, the composite reliability of all our latent constructs is above 0.8, which exceeds the minimum required threshold of 0.6 [20]. Third, the Average Variance Extracted (AVE) of all latent variables in the model is greater than 0.5, as required by [15]. Taken together, convergent validity can be assumed. As the length of the post and the behavioral intention were measured by only one indicator per construct, their AVEs are equal to 1, and thus no evaluation of the above criteria was performed for these variables. Second, discriminant validity, indicating the degree of difference between constructs, was assessed by ensuring that the square root of the AVE of any latent variable is bigger than the correlations between this variable and all other latent variables in the model, as recommended by [15]. Judging by Table 2, in the tested model no correlation between two variables was bigger than the square root of the AVE. Hence, discriminant validity can be assumed.

Table 2. Square Root of AVE (Diagonal Elements) and Correlation between Latent Variables (Off-diagonal Elements)

Construct                  | AFF    | COG   | C      | BI    | L     | P     | U
Affective (AFF)            | 0.943  |       |        |       |       |       |
Cognitive (COG)            | 0.749  | 0.954 |        |       |       |       |
Comm. Intensity (C)        | 0.418  | 0.378 | 0.837  |       |       |       |
Behavioural Intention (BI) | 0.744  | 0.657 | 0.492  | 1.0   |       |       |
Length (L)                 | 0.048  | 0.075 | -0.017 | 0.073 | 1.0   |       |
Posting frequency (P)      | 0.147  | 0.152 | 0.418  | 0.203 | 0.037 | 0.894 |
Comprehensibility (U)      | 0.362  | 0.318 | 0.165  | 0.343 | 0.021 | 0.107 | 0.9
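The convergent- and discriminant-validity criteria applied here are mechanical and easy to script. A minimal sketch, with the affective-attitude loadings taken from Table 1 as input (the formulas are the standard ones for standardized loadings, which we assume SmartPLS uses as well):

```python
import numpy as np

def ave(loadings):
    """Average Variance Extracted: mean of the squared standardized loadings."""
    l = np.asarray(loadings)
    return float(np.mean(l ** 2))

def composite_reliability(loadings):
    """Composite reliability from standardized loadings, taking the error
    variance of each indicator as 1 - loading^2."""
    l = np.asarray(loadings)
    return float(l.sum() ** 2 / (l.sum() ** 2 + np.sum(1 - l ** 2)))

def fornell_larcker_ok(ave_values, corr):
    """Discriminant validity check (Fornell & Larcker): sqrt(AVE) of every
    construct must exceed its correlations with all other constructs."""
    sqrt_ave = np.sqrt(np.asarray(ave_values))
    corr = np.asarray(corr)
    off_diag = corr - np.diag(np.diag(corr))   # zero out the diagonal
    return bool(np.all(sqrt_ave[:, None] > np.abs(off_diag)))

print(ave([0.945, 0.946]))                    # affective attitude -> ~0.89
print(composite_reliability([0.945, 0.946]))  # -> ~0.94, as in Table 1
```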
We note that the correlations between the affective and cognitive dimensions of attitude as well as the behavioral intention are quite high (0.65–0.75), although the discriminant validity of our latent variables is assured. This leads us to conclude that although the dimensions of attitude are highly correlated, they can be empirically distinguished, as proposed by [2].

3.5 Evaluation of the Structural Model

Since PLS does not generate an overall goodness-of-fit index, model validity is assessed by examining the structural paths and R² values. R² measures the share of the variance of the latent endogenous variable that is explained by the latent exogenous variables in the model. For the purposes of explorative research, R² is considered high when it is above 0.65, an R² of over 0.33 is considered sufficient, and an R² of over 0.19 is also accepted [17]. The R² of our model of 0.577 is close to the high benchmark, suggesting that the two dimensions of attitude do indeed explain a large share of the variance in the behavioral intention, as suggested by previous studies [1, 19]. The R² values of the affective and cognitive dimensions of attitude in our model are 0.268 and 0.217 respectively, which are acceptable considering the few exogenous variables that predict them as well as the exploratory nature of our research. In the next step, the significance of the path coefficients was evaluated. PLS does not make any assumptions about the distributions of the latent variables, which makes standard parametric testing impossible. Instead, t-tests are performed on the basis of bootstrapping results (the final bootstrap was performed with 200 samples and the number of cases equal to the sample size of 857). Figure 3 presents the path coefficients and respective significance levels for our model.
(Figure: path coefficients of the structural model. Post Length → affective 0.051*, → cognitive 0.077**; Posting Frequency → affective -0.050*, → cognitive -0.024 (n.s.); Post Comprehensibility → affective 0.303***, → cognitive 0.262***; Communication Intensity → affective 0.390***, → cognitive 0.346***; Affective Attitude → Intention 0.574***; Cognitive Attitude → Intention 0.228***; R² = 0.268 for affective attitude, 0.217 for cognitive attitude, 0.577 for intention.)

Fig. 3. The structural model (***p < 0.01; **p < 0.05; *p < 0.10)
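For illustration, the resampling logic behind the bootstrap-based t-tests described above can be sketched as follows; the actual estimation was done in SmartPLS 2.0, and estimate_path here is a placeholder standing in for a full PLS re-estimation of a single path coefficient.

```python
import numpy as np

def bootstrap_t(data, estimate_path, n_boot=200, seed=42):
    """Resample cases with replacement (keeping the number of cases equal
    to the sample size, here 857), re-estimate the path coefficient on
    each resample, and form a t-statistic as estimate / bootstrap SE.
    `data` is an array of cases; `estimate_path` is a hypothetical
    function returning one path coefficient for a dataset."""
    rng = np.random.default_rng(seed)
    n = len(data)
    boot = np.array([estimate_path(data[rng.integers(0, n, size=n)])
                     for _ in range(n_boot)])
    return estimate_path(data) / boot.std(ddof=1)
```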
As we can see from the figure, communication intensity between the users and comprehensibility of the post are particularly salient in predicting both affective (0.390***/0.303***) and cognitive (0.346***/0.262***) components of attitude. Therefore, we confirm the hypotheses H3a and H3b, as well as H4a and H4b. Moreover, the impact of communication intensity on both the affective and cognitive components of attitude is higher than that of comprehensibility of the post (the corresponding t-test yielded a test statistic of 2.02 with affective and 1.97 with cognitive attitude).
Judging by the coefficients and significance levels, frequency of posting exerts a negative impact only on the affective attitude at the 10% significance level (-0.050*), and does not have any impact on the cognitive attitude. Thus, we can only marginally confirm the hypothesis H2a and have to reject hypothesis H2b. At the same time, length of post exerts a positive, rather than the hypothesized negative impact on the affective (0.051*) and cognitive (0.077**) components of attitude. Therefore we reject the hypotheses H1a and H1b and note this peculiar finding. Moreover, we find that both affective (0.574***) and cognitive (0.228***) components of attitude significantly impact the behavioural intention towards the post. Thus, we confirm the hypotheses H5a and H5b. Additionally, affective attitude is a more important predictor of the behavioural intention than cognitive: the corresponding t-test yields a test statistic of 6.5.
4 Discussion and Implications

This paper provides an array of theoretical and practical contributions and opens up several avenues for further research. First of all, we find that the communication intensity and the comprehensibility of the post are the most important factors that determine the attitude of users towards the posts on their Newsfeed. In contrast to previous studies, e.g. [24], which link the length of a post or the frequency of posting to information overload, these variables exert only a slight impact on the dimensions of attitude in our study. The length of the post has a positive, as opposed to the expected negative, impact, which is most likely due to the inverted-U shape of the information processing curve [32]: up to a certain amount, the received information has a positive impact on the attitude. Thus, in order for users to experience information overload on the Newsfeed, the post has to be very long, which is rarely the case on Facebook. The most interesting finding of the study is that the communication intensity between users is even more important than the comprehensibility of the post. It appears that unclear posts from people with whom one communicates a lot are perceived as more likable and useful than clear posts from people with whom one rarely communicates. This hints at the heuristic processing of information by users: communication intensity serves as a heuristic that increases the value of the post: "I just like it because I saw it was from her … I did not really read what it said" [26]. However, information processing theory suggests that when confronted with a post of high personal relevance, the user should rather engage in systematic processing. We can link this finding to the biased impact of heuristic cues explored by [4]: when confronted with posts from people with whom a user communicates regularly, the motivational prerequisites for systematic processing may be present, but as users are constrained in their abilities to process large amounts of information, personal relevance only enhances the persuasive impact of heuristic cues [4]. Thus, due to information overload, users seem to apply cognitive heuristics on the Newsfeed even when processing highly relevant information. Another important contribution of our study is that it is rather the affective than the cognitive attitude that determines the behavioural intention of users with respect to the post. That is, users are more prone to comment on and like posts that are funny and interesting, rather than useful, thus corroborating the hedonic function of SNS
recognized in previous studies, e.g. [27]. There is also a slight difference in the process of attitude formation between these two dimensions of attitude: posting frequency has an impact on the affective, but not the cognitive evaluations, whereas post length is more important for the formation of the cognitive attitude. This is quite intuitive, as a high posting frequency can easily serve as a source of irritation, especially from people with whom a user does not communicate a lot, whereas post length is necessary to evaluate the usefulness of the information. In order to fight information overload, users can hide posts from people they are less interested in or clean up their networks [26]. However, users rarely utilize these strategies, as they are constrained by time pressure, social pressure and other psychological distortions. Based on the findings of our study, we can provide valuable recommendations to network providers on what users like and consider useful on their Newsfeed. This, in turn, can be used by them to better design filtering algorithms. All of the factors that were recognized as important for the attitude formation process can be measured objectively using the available network data: communication intensity and posting frequency can be collected and analysed, whereas comprehensibility can be inferred, for example, from the match in the languages reflected in the profiles. At the same time, the results of this study can be generalized to other social media applications, as they provide valuable insights into how users process information in social settings.
5 Conclusion

This study identifies the critical role of communication intensity in the process of attitude formation of users towards the posts on their Newsfeed. We uncover that high communication intensity induces users to neglect the real cognitive value of the post and to base their attitude largely on affect: the level of communication with the user in question serves as a heuristic to form the attitude towards the post. Assuming that heuristic processing results in the formation of the affective attitude, whereas systematic processing is more typical for cognitive evaluations, the finding of this study that users mainly rely on affective attitudes when determining their behaviour on SNS leads us to conclude that users apply a heuristic rather than a systematic strategy to process information on their Newsfeed. Further validating this proposition, however, is an interesting avenue for further research.
6 Limitations and Next Steps

In order to accelerate data collection, each user evaluated up to 6 posts, leading to an important limitation of our study: a nested data structure. Unfortunately, PLS does not allow us to take the individual differences in post evaluation into account. Another major difficulty was measuring the 'intention' variable on an ordinal scale. Although the elements of this scale, such as commenting or liking, may not have been driven by the same motivations, commenting requires more effort and thus is placed at the higher end of the scale continuum. Additionally, the experimental environment might have induced users to process information systematically, rather than heuristically, but we believe that these
evaluations can be used as approximations of real post evaluations. Finally, the variables used in the study were mainly subjective perceptions of users, which may differ from their real attitude formation process. Thus, in the next step we aim to collect objective data and to compare it with the subjective evaluations of users. At the same time, we are working on the design of a filtering algorithm based on the findings of the study.
References

1. Ajzen, I.: Nature and Operation of Attitudes. Annual Review of Psychology 52, 27–58 (2001)
2. Ajzen, I., Sexton, J.: Depth of processing, belief congruence, and attitude-behavior correspondence. In: Chaiken, S., Trope, Y. (eds.) Dual-Process Theories in Social Psychology. The Guilford Press, New York (1999)
3. Chaiken, S.: Heuristic Versus Systematic Information Processing and the Use of Source Versus Message Cues in Persuasion. Journal of Personality and Social Psychology 39(5), 752–766 (1980)
4. Chaiken, S., Liberman, A., Eagly, A.H.: Heuristic and systematic processing within and beyond persuasion context. In: Uleman, J.S., Bargh, J.A. (eds.) Unintended Thought. The Guilford Press, New York (1989)
5. Chen, Y., Shang, R.-A., Kao, C.-Y.: The effects of Information Overload on consumers' subjective state towards buying decision in the Internet shopping environment. Electronic Commerce Research and Applications 8, 48–58 (2009)
6. Chin, W.W.: The Partial Least Squares Approach to Structural Equation Modeling. In: Marcoulides, G.A. (ed.) Modern Methods for Business Research, pp. 295–336. Lawrence Erlbaum Associates, Mahwah (1998)
7. Crites, S.L., Fabriger, L.R., Petty, R.E.: Measuring the Affective and Cognitive Properties of Attitudes: Conceptual and Methodological Issues. Personality and Social Psychology Bulletin 20(6), 619–634 (1994)
8. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance. MIS Quarterly 13, 319–322 (1989)
9. Davis, J.G., Ganeshan, S.: Aversion to Loss and Information Overload: An Experimental Investigation. In: International Conference on Information Systems Proceedings, paper 11 (2009)
10. Ellison, N.B., Steinfield, C., Lampe, C.: The benefits of Facebook "Friends": Social Capital and College Students' Use of Online Social Network Sites. Journal of Computer-Mediated Communication 12(4), article 1 (2007)
11. Eppler, M.J., Mengis, J.: The Concept of Information Overload: A Review of Literature from Organization Science, Marketing, Accounting, MIS and related Disciplines. The Information Society: An International Journal 20(5), 1–20 (2004)
12. Facebook Query Language, http://developers.facebook.com/docs/reference/fql/
13. Facebook Statistics, http://www.facebook.com/press/info.php?statistics/
14. Fishbein, M., Ajzen, I.: Belief, attitude, intention and behavior: An introduction to theory and research. Addison-Wesley, Reading (1975)
15. Fornell, C.G., Larcker, D.F.: Evaluating Structural Equation Models with Unobservable Variables and Measurement Error. Journal of Marketing Research 18(3), 39–50 (1981)
16. Friedman, H.H., Amoo, T.: Rating the rating scales. Journal of Marketing Management 9(3), 114–123 (1999)
17. Hansmann, K.W., Ringle, C.M.: Strategische Erfolgswirkung einer Teilnahme an Unternehmensnetzwerken: eine empirische Untersuchung. Die Unternehmung 59(3), 217–236 (2005)
18. Henseler, J., Ringle, C., Sinkovics, R.: The Use of Partial Least Squares Modeling. Advances in International Marketing 20, 277–319 (2009)
19. Hu, P.J., Chau, P.Y.K., Sheng, O.R.L., Tam, K.Y.: Examining the Technology Acceptance Model using physician acceptance of Telemedicine Technology. Journal of Management Information Systems 16(2), 91–112 (1999)
20. Homburg, C., Baumgartner, H.: Beurteilung von Kausalmodellen – Bestandsaufnahme und Anwendungsempfehlungen. Marketing ZFP 17(3), 162–176 (1995)
21. Hulland, J.: Use of Partial Least Squares (PLS) in Strategic Management Research: A Review of Four Recent Studies. Strategic Management Journal 20, 195–204 (1999)
22. Inside Facebook: Facebook's June 2010 US Traffic by Age and Sex: Users Aged 18-44 Take a Growth Break, http://www.insidefacebook.com/2010/07/06/facebooks-june2010-us-traffic-by-age-and-sex-users-aged-18-44-take-abreak-2/
23. Jones, Q., Ravid, G., Rafaeli, S.: Information Overload and Message Dynamics of Online Interaction Spaces: A Theoretical Model and Empirical Exploration. Information Systems Research 15(2), 194–210 (2004)
24. Jones, Q., Moldovan, M., Raban, D., Butler, B.: Empirical Evidence of Information Overload Constraining Chat Channel Community Interactions. In: Proceedings of CSCW, San Diego, California (2008)
25. Köbler, F., Riedl, C., Vetter, C., Leimeister, J.M., Krcmar, H.: Social Connectedness on Facebook – An Explorative Study on Status Message Usage. In: Proceedings of the Sixteenth Americas Conference on Information Systems, paper 247 (2010)
26. Koroleva, K., Krasnova, H., Günther, O.: 'STOP SPAMMING ME!' – Exploring information overload on Facebook. In: Proceedings of the Sixteenth Americas Conference on Information Systems, paper 447 (2010)
27. Krasnova, H., Kolesnikova, E., Günther, O.: It Won't Happen To Me!: Self-Disclosure in Online Social Networks. In: Proceedings of the 15th Americas Conference on Information Systems, San Francisco, USA, paper 343 (2009)
28. Krosnick, J.A., Boninger, D.S., Chuang, Y.C., Berent, M.K., Garnot, C.G.: Attitude Strength: one construct or many related constructs? Journal of Personality and Social Psychology 65(6), 1132–1151 (1993)
29. Moore, G.C., Benbasat, I.: Integrating Diffusion of Innovations and Theory of Reasoned Action models to predict utilization of information technology by end-users. In: Kautz, K., Pries-Heje, J. (eds.) Diffusion and Adoption of Information Technology, pp. 132–146. Chapman and Hall, London (1996)
30. Morris, M.R., Teevan, J., Panovich, K.: What do people ask their Social Networks and Why? A Survey Study of Status Message Q&A behaviour. In: Proceedings of CHI, Atlanta, Georgia, USA (2010)
31. Ringle, C.M., Wende, S., Will, A.: SmartPLS, Release 2.0.M3. University of Hamburg, Hamburg (2005), http://www.smartpls.de
32. Schneider, S.C.: Information Overload: causes and consequences. Human Systems Management 7, 143–153 (1987)
33. TechCrunch, http://techcrunch.com/2010/04/22/facebook-edgerank/
34. Voss, K.E., Spangenberg, E.R., Grohmann, B.: Measuring the hedonic and utilitarian dimensions of consumer attitude. Journal of Marketing Research 40(3) (2003)
35. Yang, H., Yoo, Y.: It's all about attitude: revisiting the technology acceptance model. Journal of Decision Support Systems 38(1) (2004)
Trend Analysis and Recommendation of Users' Privacy Settings on Social Networking Services

Toshikazu Munemasa and Mizuho Iwaihara

Graduate School of Information, Production and Systems, Waseda University, Japan
[email protected], [email protected]
Abstract. Social networking services (SNSs) are regarded as an indispensable social medium for finding friends and interacting with them. However, their search capabilities often raise privacy concerns. Usually, an SNS provides privacy settings for each user, so that he/she can specify who can access his/her online contents. But these privacy settings often become either too simplistic or too complicated. To assist SNS users in discovering their own appropriate settings, we propose a privacy-setting recommendation system, which utilizes privacy settings on public access collected from over 66,000 real Facebook users, as well as settings donated by participating users. We show privacy scores of the collected settings according to user categories. Our recommendation system utilizes these analysis results as well as correlations within privacy settings, and visualizes the distribution of the collected users' settings. Our evaluations on test users show the effectiveness of our approach.
1 Introduction

Social networking services (SNSs) are becoming one of the most popular services on the Internet. Facebook [6] is currently the world's largest SNS, having over 600 million user accounts. SNSs provide various functionalities for posting user profiles, pictures, messages, and friend links to assist users in discovering and communicating with friends. However, user profiles provided by SNS providers often include attributes that can identify persons, such as real names, mail addresses and phone numbers, as well as attributes that are highly private, such as personal photos, marital status, and education. If such attributes are disclosed to the public, attackers can track users' private matters, or the profile information and contents can be used in unexpected ways. Therefore, virtually every SNS provider allows users to define privacy settings. Profile attributes, such as age, marital status, and living address, as well as contents such as photos and posts, can be assigned an openness level. In the SNS orkut, two options, "Everyone" and "Only my friends," are available. In 2010, Facebook provided 6 openness levels, namely "Everyone," "Friends and Networks," "Friends of Friends," "Friends Only," "Specific Friends," and "Only Me." Note that Facebook's privacy specification has occasionally been changed. Since assigning such levels to over 20 attributes is not an easy task, Facebook also provides a default setting and several "Recommended" settings. However, whether such predefined settings are adequate for various types of users is questionable. For example, it is pointed out in [1] that the default settings of Facebook permit public access to the list of communities a user is participating in, and most of
users are not aware of such default open settings. Also, [2] reports that users of the photo-sharing site Flickr show a variety of different traits in their privacy settings, and that, to avoid tedious privacy settings, users sometimes lean toward either all-open or all-restricted simplistic settings. There is a trade-off in the openness level of privacy settings. If a user's settings are too restrictive, communication with his/her friends becomes difficult, and he/she may lose opportunities to become known to a larger scope of users. On the other hand, if the settings are too public, the risk of his/her personal information being misused or unexpectedly used rises. For example, there are incidents such as an employee being fired because of her writings on Facebook about her company, and [5] reports that a fake account with copied profiles makes it easier to create new friends and obtain permission to access their contents. In this paper, we propose a new framework for assisting SNS users in finding their appropriate privacy settings. Our approach is based on an understanding of the privacy practice of others. We try to collect as many other users' privacy settings as possible, and present overall tendencies of the privacy settings to the target user. We also provide recommendations of settings based on mining results on the settings and user categories. By knowing how other users behave in their privacy settings, the target user can estimate what settings are appropriate for him/her. However, this approach has a difficulty in collecting sufficient settings, since privacy settings are usually only accessible to their owners. To overcome the above obstacle, we utilize two different types of setting collections. One is called the open privacy settings, which consist of 0-1 bit vectors representing whether the profile attribute corresponding to each bit is open to the public or not, disregarding the inner levels. We also infer basic user categories, such as gender and social status, from the owners' profiles. The open privacy settings consist of only publicly available information, so that a limitless number of settings can be obtained. The evaluation in this paper is based on the settings of 66,246 Facebook users. The other type of setting collection is called shared privacy settings, which consist of privacy settings collected from users who agree to share their settings in return for privacy-setting assistance. To evaluate how open the target user is, we utilize the privacy score [10]. The privacy score is motivated by Item Response Theory [3], a modern test theory measuring the difficulty of test items and the abilities of the persons answering them. Although the original work [10] is insightful, its evaluation was done on only a few hundred participating users. In this paper, based on the open privacy settings, we show distributions of privacy scores of practical-scale real settings for different user categories. We also utilize the score results for diagnosing the target user's settings. Besides privacy score-based assistance, we also show recommendations based on co-occurrences within privacy settings and user categories. Our prototype system introduced in this paper visualizes privacy score distributions, presents diagnosis results, and pops up applicable recommendations. Since our system is seamlessly integrated with the original Facebook privacy setting page, users can easily grasp others' traits and decide on their own settings.
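To make the privacy-score idea concrete, the simplest (non-IRT) variant in the spirit of [10] can be sketched as follows: an attribute is treated as more sensitive the more rarely it is disclosed in the collected open settings, and a user's score sums the sensitivity of the attributes he/she exposes. This is a simplified illustration of the underlying intuition, not the IRT-based formulation used in the cited work.

```python
import numpy as np

def privacy_scores(S):
    """S: (users x attributes) 0-1 matrix of open privacy settings, with
    S[j, i] = 1 iff user j makes attribute i visible to Everyone.
    Naive privacy score: an attribute's sensitivity is the share of users
    who hide it, and a user's score sums sensitivity over the attributes
    he/she discloses (higher score = more exposed)."""
    S = np.asarray(S, dtype=float)
    sensitivity = 1.0 - S.mean(axis=0)   # rarely disclosed => sensitive
    return S @ sensitivity               # one score per user

S = np.array([[1, 1, 0],    # user A discloses attributes 0 and 1
              [1, 0, 0],    # user B discloses attribute 0 only
              [1, 1, 1]])   # user C discloses everything
print(privacy_scores(S))    # -> [0.333, 0.0, 1.0]
```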
In the following, we survey research related to our work. Squicciarini et al. [15] proposed a system for developing group privacy policies on shared contents, such as group photos, using a taxation model and bidding. Their model incorporates collective tendencies of privacy settings like ours, but their target is to reach a group agreement on a shared content. FaceCloak [11] provides a Firefox browser extension for replacing sensitive values in Facebook profiles with fake values, where only admitted users are given access to a server storing the true values. Their approach provides privacy protection between trusted users, but users need to figure out to whom they can disclose. In [8], an ontology-based approach for evaluating the potential risks of disclosing profile attributes is discussed. Privacy Wizards [7] performs user clustering based on open profiles and friend links, to suggest privacy settings based on the target user's own settings for similar users. This approach does not collect the privacy settings of others, but utilizes only the settings of the target user; therefore, it does not work well when the target user has little knowledge of privacy settings or is starting privacy settings from scratch. A3P [16] performs data mining over collected shared privacy settings, image contents, and metadata, and recommends privacy policies. However, their approach still relies on costly shared privacy settings. On the other hand, our approach overcomes the shortage of shared privacy settings by combining them with a vast collection of open privacy settings. We also provide settings recommendations as well as visualization of privacy score distributions. Regarding the utilization of user categories, the survey [9] reports a negative correlation between openness and the size of a regional community: users of a large community network tend to hide their profiles and friend lists, while users of a small community network tend to do the opposite. The rest of this paper is organized as follows: In Section 2, we discuss collecting open privacy settings and profiles. In Section 3, we show trend analysis results on open privacy settings. In Section 4, we describe three approaches for assisting users' privacy settings. Section 5 describes our prototype system implementation, Section 6 shows evaluation results, and Section 7 concludes this paper.
2 Collecting Open Privacy Settings and Profiles

2.1 Collection Method and User Categorization

In Facebook, if an attribute B of a user's profile is visible to the public, we can assume that he/she has set the privacy setting of that attribute to Everyone. On the other hand, if the value of an attribute B is missing from the profile, he/she has either left B blank or assigned an openness level lower than Everyone. In this way, we can construct his/her open profile, consisting of the attribute values visible to Everyone. Also, the open privacy setting s of the user is the bit vector such that s(B) = 1 if attribute B's value is visible to the public and s(B) = 0 otherwise. We collected Facebook user profiles in 2010. Facebook has been updating its user profile format and privacy settings; after our collection, the New Profile was introduced. We created a new Facebook user account with no linked friends, so that from this account we could only view attributes accessible to Everyone. Using this account, we collected about 80,000 randomly selected users' open profiles. We
successfully obtained 66,246 user profiles. The remaining roughly 14,000 profiles either belong to users who had already closed their accounts, or have a profile format that was not compatible with our crawler; the incompatibility was caused by the transition to New Profile, which coincided with our profile sampling. We inferred each user's social status as follows: if Job contains a description, we classify the user as a NonStudent. If Job has no description but Education does, and the enrollment year indicates that he/she is still in school, we classify him/her as a Student. Otherwise, his/her social status is Unknown.

2.2 Analysis of Collected Open Profiles

The gender of the 66,246 user profiles breaks down into 33,045 males, 32,212 females, and 989 unknown. We define the disclosure rate of an attribute as the percentage of users who disclose the attribute by choosing Everyone.

Fig. 1. Disclosure rates (%) by gender for each profile attribute. Left graph: Posts by me, Family, Relationships, Interested in and looking for, Bio and favorite quotations. Right graph: Website, Religious and political views, Birthday, Mobile phone, Other phone, Address, IM screen name, Mail address.

Fig. 2. Disclosure rates (%) by social status (Student vs. Non-Student) for the same profile attributes.

Figure 1 shows the disclosure rates for each profile attribute and gender. Note that the disclosure rates in the left graph range from 15 to 60 percent, while those in the right graph range from 0 to 5 percent. The disclosure rates of Mobile Phone, Other Phone, Address, and IM Screen Name are much lower than the others, regardless of gender. These attributes have a high possibility of
uniquely identifying a person in the real world, so many users do not disclose these attributes to Everyone. Female users are less likely to disclose attributes than males, except for the attributes Birthday and Family. Female users' lower disclosure rates suggest a more cautious attitude toward public users, perhaps considering the risk of attracting digital stalkers, or a stronger interest in interacting with known users rather than unknown ones. The higher disclosure rate of Birthday also suggests female users' interest in their birthday celebrations. On the contrary, male users' relatively higher disclosure rates suggest less caution against digital stalkers and more eagerness to publicize their profiles. Figure 2 shows the disclosure rates of profile attributes by social status. Here the disclosure rates in the left graph range from 24 to 83 percent, while those in the right graph range from 0 to 8 percent. The disclosure rates of NonStudents are higher than those of Students for all profile attributes except Family. This tendency can be explained as follows: NonStudents, mostly workers, are more active in advertising their profiles to obtain new business relationships, while Students are more interested in casual relationships. We utilize these observed trends in our system for privacy-setting recommendation.
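To make the above construction concrete, the following sketch shows how an open privacy setting (the 0-1 bit vector s of Section 2.1) and per-attribute disclosure rates could be computed from crawled public profiles. This is our illustrative reconstruction, not the authors' crawler; the attribute names and sample profiles are hypothetical.

```python
# Hypothetical attribute list; the paper's crawler covered 20+ attributes.
ATTRIBUTES = ["posts", "family", "birthday", "mobile_phone", "address"]

def open_privacy_setting(public_profile):
    """s(B) = 1 iff attribute B is visible to Everyone, i.e., it appears
    in the publicly crawled profile; s(B) = 0 otherwise."""
    return {a: int(a in public_profile) for a in ATTRIBUTES}

def disclosure_rates(profiles):
    """Percentage of users who disclose each attribute to Everyone."""
    n = len(profiles)
    settings = [open_privacy_setting(p) for p in profiles]
    return {a: 100.0 * sum(s[a] for s in settings) / n for a in ATTRIBUTES}

# Toy data: two crawled public profiles (only visible attributes appear).
crawled = [
    {"posts": "...", "birthday": "Jan 1"},                # user 1
    {"posts": "...", "family": "...", "address": "..."},  # user 2
]
print(disclosure_rates(crawled))  # posts: 100.0, birthday: 50.0, ...
```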
3 Trend Analysis of Open Privacy Settings

In this section, we describe the privacy score (or p-score for short) introduced in [10], which evaluates the openness of a user's privacy setting relative to a category of users. Then we conduct trend analysis over our collected open privacy settings.

3.1 Computing Privacy Score by Item Response Theory

Suppose that an SNS has N users, and each user j ∈ {1, . . . , N} has n attributes i ∈ {1, . . . , n}. The n×N response matrix R holds the privacy settings of the n attributes of the N users; the privacy setting of attribute i of user j is represented as R(i, j). Each privacy setting is an openness level k, an integer ranging from 0 to l, where a larger k means a wider disclosure scope. For example, Facebook has six openness levels: (0) Only Me, (1) Specific Friends, (2) Friends Only, (3) Friends of Friends, (4) Friends and Networks, (5) Everyone. If we can only detect whether a privacy setting is Everyone or lower, as for the open privacy settings described in Section 2, we use two openness levels such that R(i, j) = 1 for Everyone and R(i, j) = 0 for Not-Everyone. Item Response Theory (IRT) is a modern test theory in psychometrics for measuring the difficulty of tests and the abilities (or other attributes) of the persons answering test items [3]. As opposed to classical test theories, IRT can provide a theoretical justification for equating scores of different tests, like the TOEFL and SAT. IRT entails three assumptions: 1. a unidimensional trait denoted by θ; 2. local independence of items; 3. the response of a person to an item can be modeled by a mathematical item response function (IRF). As the unidimensional trait, we use the ability θj of person j, where θj is a real number in (−∞, ∞). We characterize a (test) item i by the discrimination parameter αi and the difficulty parameter βi. In the two-parameter logistic (2PL) model, the probability Pij of a correct response to an item i by a person j is given by the following IRF:
Pij = 1 / (1 + e^(−αi (θj − βi)))
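For reference, a direct transcription of this standard 2PL item response function (nothing here is specific to the paper's implementation):

```python
import math

def irf_2pl(theta, alpha, beta):
    """Two-parameter logistic IRF: probability that a person with
    ability theta gives a 'correct' (here: disclosing) response to an
    item with discrimination alpha and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

# A more 'difficult' (i.e., sensitive) attribute needs a higher theta
# (openness) for the same disclosure probability:
print(irf_2pl(theta=0.0, alpha=1.0, beta=0.0))  # 0.5
print(irf_2pl(theta=0.0, alpha=1.0, beta=2.0))  # ~0.12
```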
The above IRF implies that as the difficulty parameter βi becomes higher, a higher ability θj is necessary to keep the correct-response probability Pij. Also, as the discrimination parameter αi becomes smaller, the variance of the probabilities between persons becomes smaller, so item i becomes less effective in comparing persons' abilities. Suppose that the response matrix R holds responses of persons to items, where R(i, j) = 1 if person j's response to item i is correct, and R(i, j) = 0 otherwise. Also suppose that the discrimination and difficulty parameters αi and βi are given. Then, to estimate the ability θj of each person, the following log-likelihood can be maximized:

L = Σ_{i=1}^{n} [R(i, j) log Pij + (1 − R(i, j)) log(1 − Pij)]
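A minimal sketch of estimating θj from this log-likelihood follows. The grid search is our simplification (IRT software typically uses Newton-Raphson), and all item parameters below are toy values:

```python
import math

def log_likelihood(theta, responses, alphas, betas):
    """L(theta) for one person j over all items i, with R(i, j) in {0, 1}."""
    ll = 0.0
    for r, a, b in zip(responses, alphas, betas):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        ll += r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return ll

def estimate_theta(responses, alphas, betas):
    """Maximum-likelihood theta by coarse grid search over [-4, 4]."""
    grid = [t / 100.0 for t in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, responses, alphas, betas))

# Three items of increasing difficulty; the person disclosed the easier two.
print(estimate_theta([1, 1, 0], alphas=[1.0, 1.0, 1.0], betas=[-1.0, 0.0, 2.0]))
```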
Now we consider estimating privacy scores of SNS users from their privacy settings, utilizing IRT. Following [10], we map the ability θi of person i to his/her privacy score, where a larger θi means he/she is more extroverted and discloses his/her attributes at larger openness levels, while a smaller θi means he/she is more conservative/introverted and more likely to choose smaller openness levels. The difficulty parameter βj is mapped to the sensitivity of attribute j, meaning how likely j is to be chosen for disclosure. The discrimination parameter αj is mapped to the discrimination of attribute j, meaning how differently users choose their settings for j. [10] shows maximum-likelihood estimation of αj and βj from response matrices. In this paper, we also incorporate public knowledge on the risk of private information leakage in determining βj. The JNSA published surveys on information security incidents [13]. In this report, the Simple-EP Diagram is compiled to calculate the value of a person's leaked privacy data in terms of (a) economical loss and (b) emotional pain. The Simple-EP Diagram shows three-level risk values for typical private attributes such as real name, birth date, address, and phone. We reflect the risk values of the EP Diagram in weights on SNS profile attributes. When dealing with l (> 1) openness levels, we need to decompose the multi-level privacy settings into (l + 1) binary response matrices R*0, . . . , R*l, such that if user i sets k (0 ≤ k ≤ l) as the openness level of attribute j, then R*m(i, j) = 1 for k ≤ m ≤ l and R*m(i, j) = 0 for 0 ≤ m < k. After computing θik for each openness level k of user i, the privacy score is computed as a weighted sum of θik.
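The decomposition can be transcribed directly; the level matrix below is invented for illustration:

```python
def decompose(levels, l):
    """Decompose a matrix of openness levels (k in {0,...,l}) into l + 1
    binary response matrices R*_0,...,R*_l, following the definition
    above: the entry with chosen level k has R*_m = 1 iff k <= m."""
    return [[[1 if k <= m else 0 for k in row] for row in levels]
            for m in range(l + 1)]

# Three users (rows) x two attributes (columns), Facebook levels 0..5.
levels = [[5, 3], [2, 3], [0, 1]]
R = decompose(levels, l=5)
print(R[2])  # [[0, 0], [1, 0], [1, 1]] -- binary matrix for m = 2
```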
3.2 Privacy Score Distributions

Now we show privacy scores calculated on the open privacy settings collected in Section 2, and compare the attitudes of different user categories. As mentioned earlier, for the open privacy settings we can only observe either 1: Everyone or 0: Not-Everyone. Our analysis is therefore focused on people's attitudes toward public access; on the other hand, there is no access restriction in collecting such privacy settings, so we can conduct a large-scale analysis.

Fig. 3. Privacy score distribution by gender. The vertical dashed line marks the p-score of Facebook's default setting.

Fig. 4. Privacy score distributions by purpose: (a) p-scores of male users, (b) p-scores of female users.

Fig. 5. Privacy score distributions by relationship status: (a) p-scores of male users, (b) p-scores of female users.

Figure 3 shows the distributions of privacy scores for each gender. The vertical dashed line indicates the privacy score corresponding to the default privacy setting of Facebook. As we can see from the figure, most users have scores lower than the default setting. Thus the great majority of users change the default setting to tighter openness levels for attributes that can identify individuals, such as postal address and phone, so that these attributes are not visible to Everyone. Less than one percent of the users have privacy scores higher than the default setting; these users are disclosing
their identifying attributes such as phone and postal address. Samples of these extremely high-score users indicate that they are using their Facebook accounts for business purposes. The trends of male and female users are largely similar, with female users showing slightly higher openness; however, the highest score among male users is 8201, surpassing the highest female score of 8054. Privacy score distributions classified by purpose are shown in Figure 4 (a) (male users) and Figure 4 (b) (female users). Here purpose is one of the user profile attributes of Facebook, indicating the user's primary objective in using Facebook and taking a value from Undisclosed, Dating, Relationships, Networking, and Friendship. Male and female users show broadly similar curves, and users not disclosing their purposes have low scores. But there is a noticeable rise between 7230 and 7305 for female users with purposes Friendship and Networking, suggesting that female users seeking such relationships are more open than male users with the same purposes. Privacy score distributions classified by relationship status are shown in Figure 5 (a) (male users) and Figure 5 (b) (female users). The relationship status is a user profile attribute taking a value from Single, In a Relationship, and Married. Male and female users show similar trends; married users have lower scores than single and in-a-relationship users, seemingly because they become more conservative in revealing their personal attributes to the public.
4 Assisting Privacy Settings

In this section, we discuss assisting users in setting up their privacy settings, through trend analysis and recommendation based on the collected privacy settings of other users.

4.1 Collected Privacy Settings

We utilize the following two types of collected privacy settings:
– Open privacy settings: As described in Section 2, open privacy settings are those collected from publicly accessible profiles. Due to the accessibility from the public, the multiple openness levels are reduced to the two levels of whether or not an attribute is visible to Everyone. But we can collect virtually every valid user's open privacy settings without access restriction.
– Shared privacy settings: Shared privacy settings are donated voluntarily to our system by participating users. In return, the participating users receive diagnosis and recommendations from our system. The users need to agree that their privacy settings are collected and used for statistical analysis. Although we can only collect the settings of agreeing participants, full multi-level settings are obtained.

The major functionalities of our privacy-setting assistance are classified into the following three categories:
1. Settings Visualization
2. Privacy Score Diagnosis
3. Recommendation by Attribute Co-occurrence

In the subsequent sections, by User A we refer to a participating user who is using our system to configure his/her privacy settings.
4.2 Settings Visualization

In Settings Visualization, the histogram of the shared privacy settings for each attribute is presented to User A, who can thus get an overview of other users' trends. Furthermore, User A can specify a reference privacy score s using a sliding bar; the histograms of the settings having scores near s are then displayed. Through this reference privacy score, User A can examine how the distribution of privacy settings changes as the openness represented by s changes.

4.3 Privacy Score Diagnosis

In Privacy Score Diagnosis, the privacy settings entered by User A are sent to the server, and the privacy score of the settings is calculated. The score is then compared with the shared and open privacy settings. From the open privacy settings collected from 66,246 Facebook users, we select users whose profiles have high similarities with User A's, to explore how users similar to User A behave in their profile settings. We use the cosine similarity between the vectors of the profile attributes of User A and the open settings, and select the top-100 similar profiles as the neighboring open privacy settings. User A's score s is contrasted with both the neighboring open and shared privacy settings, by showing the average (μ), the standard deviation (σ), and the following three-level classification: User A is (1) introvert if s < μ−σ, (2) average if μ−σ ≤ s ≤ μ+σ, and (3) extravert if μ+σ < s. The same classification is carried out on both the neighboring open and shared privacy settings. After receiving this feedback on the average, standard deviation, and three-level classification of s, User A can modify his/her settings and repeat the process until he/she is satisfied with the result.

4.4 Recommendation by Attribute Co-occurrence

Our third method for privacy-setting assistance is to utilize the co-occurrence of profile attributes made public. Suppose that there exists an association rule such that if a user chooses attribute A1 as public, it is highly likely that he/she also chooses attribute A2 as public. Our system can exploit such co-occurrences for recommending settings: when User A chooses A1 as public, the system notifies him/her that A2 is also chosen as public by a high percentage of users. We can easily measure such co-occurrence probabilities between each pair of public attributes, for each user category; a sketch of this rule mining appears below. Notifying User A of co-occurrences should have a distinct effect, since such inter-attribute correlation is not based on privacy scores and is not supported in the other two methods. User A can decide whether to follow or discard such a recommendation. In Table 1, we show pairs of attributes that have high co-occurrence probabilities of being both public in the open privacy settings, for the user categories Male-NonStudent and Female-Student (not all categories are shown due to space limitations). For Male-NonStudent, the pairs "Religious view/Birthday" and "Mobile phone/Other phone" are characteristic; both pairs are highly likely to associate a user with a real-world person. On the other hand, Female-Student has high probabilities for pairs involving Family, indicating more activeness in posting family-related topics.
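The following sketch illustrates one plausible way to mine such recommendation triggers: conditional co-occurrence among users who already made A1 public. This is our assumption of how the rule could be computed, not the paper's exact algorithm (Table 1 reports joint probabilities; a deployed system might use either form), and the attribute names and toy settings are hypothetical.

```python
from itertools import permutations

def recommendation_rules(settings, attributes, threshold=0.3):
    """Among users who made A1 public, the fraction who also made A2
    public. Pairs above the threshold become messages like
    'N% users disclose ITEM to the public.'"""
    rules = {}
    for a1, a2 in permutations(attributes, 2):
        both = sum(1 for s in settings if s[a1] and s[a2])
        a1_public = sum(1 for s in settings if s[a1])
        if a1_public and both / a1_public >= threshold:
            rules[(a1, a2)] = both / a1_public
    return rules

# Toy open privacy settings (0-1 vectors per user).
settings = [{"posts": 1, "family": 1}, {"posts": 1, "family": 1},
            {"posts": 1, "family": 0}, {"posts": 0, "family": 1}]
for (a1, a2), p in recommendation_rules(settings, ["posts", "family"]).items():
    print(f"If {a1} is public: {p:.0%} of such users also disclose {a2}.")
```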
To estimate how similar the rankings of the co-occurrence probabilities are between different user categories, we compute the Kendall rank correlation coefficient between user categories C1 and C2 as follows: Let r1(pi) (resp. r2(pi)) denote the rank of the co-occurrence probability of public attribute pair pi in user category C1 (resp. C2). Then pi and pj are concordant if both r1(pi) < r1(pj) and r2(pi) < r2(pj), or if both r1(pi) > r1(pj) and r2(pi) > r2(pj). The Kendall rank correlation coefficient is given as

τ = 4P / (n(n − 1)) − 1,

where n is the number of public attribute pairs and P is the number of concordant pairs of public attribute pairs. The coefficient τ for each pair of user categories is shown in Table 2. Each coefficient is over 0.5, indicating that these rankings have a high degree of agreement, but they still vary between 0.59 and 0.93. Thus considering user categories is expected to give a more adapted attribute recommendation.

Table 1. Co-occurrence probabilities of attribute pairs being public

Male-NonStudent               %      Female-Student                %
Relationship/Interested in    59.3   Posts/Relationships           57.0
Posts/Relationships           57.6   Relationships/Interested in   55.7
Posts/Interested in           54.2   Posts/Interested in           50.6
Religious view/Birthday       43.2   Family/Relationships          47.7
Interested in/Bio             40.6   Interested in/Bio             41.9
Family/Relationships          38.4   Family/Interested in          41.6
Relationships/Bio             36.8   Posts/Family                  40.3
Mobile phone/Other phone      35.4   Relationships/Bio             34.2
Posts/Bio                     32.4   Family/Bio                    29.9
Family/Interested in          31.5   Posts/Bio                     29.5

Table 2. Kendall rank correlation coefficient τ on co-occurrence probabilities of two attributes being public

                  Male-Student  Female-Student  Male-NonStudent  Female-NonStudent
Male-Student      —             0.93            0.63             0.76
Female-Student                  —               0.59             0.75
Male-NonStudent                                 —                0.75
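As a concrete illustration, the sketch below computes Table 1-style joint co-occurrence probabilities for a user category and the Kendall τ of Table 2 between two categories. The attribute pairs and probabilities are toy values; the formula assumes strict rankings (no ties).

```python
from itertools import combinations

def cooccurrence_probs(settings, attributes):
    """Joint probability that both attributes of a pair are public,
    over 0-1 open privacy settings of one user category."""
    n = len(settings)
    return {(a, b): sum(1 for s in settings if s[a] and s[b]) / n
            for a, b in combinations(attributes, 2)}

def kendall_tau(prob1, prob2):
    """tau = 4P / (n(n - 1)) - 1, with P the number of concordant pairs
    of attribute pairs in the two categories' rankings (no ties)."""
    pairs = list(prob1)
    n = len(pairs)
    P = sum(1 for x, y in combinations(pairs, 2)
            if (prob1[x] - prob1[y]) * (prob2[x] - prob2[y]) > 0)
    return 4.0 * P / (n * (n - 1)) - 1.0

# Two categories ranking three attribute pairs in the same order: tau = 1.
p1 = {("posts", "family"): 0.5, ("posts", "bio"): 0.4, ("family", "bio"): 0.3}
p2 = {("posts", "family"): 0.6, ("posts", "bio"): 0.5, ("family", "bio"): 0.1}
print(kendall_tau(p1, p2))  # 1.0
```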
5 System Implementation

5.1 System Architecture

In this section, we describe our prototype privacy-settings recommendation system for SNSs. The system consists of a frontend (client) and a backend (server). The frontend collects the user's settings and sends them to the backend. The backend, implemented using the Linux OS, the Apache web server, and the PHP and Perl languages, stores the collected shared privacy settings as well as the open privacy settings in a MySQL database. The backend also computes privacy scores and carries out recommendations. The frontend visualizes the privacy score distribution and displays recommendations. The transmission of data between the frontend and backend is done in the JSON (JavaScript Object Notation) format. The frontend needs to access the user's Facebook privacy settings to be shared. Both the Facebook Graph API [12] and OpenSocial [14] provide access to the user's profile data, upon
his/her consent. However, these APIs do not provide access to the privacy settings of profile data, so we need to devise an alternative method. A simple way is to provide a web form asking for the privacy settings, with the user expected to copy his/her settings into the form manually. However, such a task is not easy for the user, since Facebook privacy settings have more than twenty items; furthermore, the separate web form approach is susceptible to human error. Therefore, automatic collection of privacy settings is necessary. We realize automatic collection of settings as follows: our system asks the user to install an extension in his/her Firefox web browser. The extension handles a session to log in to Facebook and obtains the HTML page of the user's privacy settings. The DOM (Document Object Model) tree of the settings page is then parsed, and the values of the privacy settings are extracted from the tree. To smoothly integrate our recommendation functionalities into the Facebook interface, the frontend injects HTML and JavaScript code into the original Facebook privacy settings page. When the user requests the privacy settings page, the frontend also sends a request to the backend and receives the necessary data in JSON and HTML formats.

5.2 User Interface

Figure 6 shows a screen shot of the frontend. The drop-down lists on the right are used for choosing settings. Area (1) lists the attributes; areas (2) and (3) are for Settings Visualization. The sliding bar in area (3) is used for specifying the reference privacy score. As the user shifts the sliding bar, a new reference score is sent to the backend, privacy settings near the reference privacy score are retrieved and sent to the frontend, and these are then visualized as bar charts in area (2). By moving the sliding bar, the user can examine how the distribution changes for each attribute. Figure 7 shows a screen shot when the reference privacy score is high. After the privacy settings are completed, the user can execute Privacy Score Diagnosis by clicking the button "Send and Evaluate Your Settings" in area (4) of Figure 6. A Facebook Application dialog then appears, requesting permission to collect the user's profile; the Facebook Graph API is used for accessing the profile. If permission is granted, the profile attributes Gender, Age, Current City, Hometown, and Relationship status are collected from the user's profile. If permission is denied, diagnosis is done without these profile attributes. One more dialog appears, as shown in Figure 8, which shows the collected profile values and check-boxes for specific permission for each profile attribute. The dialog further asks for work status and purpose, and for confirmation to send them to the backend. The profile attributes are sent only when the user explicitly permits this via the check-boxes. After the backend receives the user profile and privacy settings, the user's privacy score is calculated, and the Privacy Score Diagnosis and Recommendation by attribute co-occurrence described in Section 4 are carried out. Figure 9 shows an example of diagnosis results, consisting of the three-level classification (extrovert, average, introvert), the user's privacy score, the average score, and the standard deviation. Figure 10 shows an example of Recommendation by attribute co-occurrence, in which a recommendation message of the format "N% users disclose ITEM to the public." is inserted as an HTML element into the privacy settings page. After receiving this feedback, the user can revise his/her settings and receive diagnosis again.
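The extension's DOM parsing could look roughly like the following sketch. It is a stand-in only: the actual extension was browser-side (not Python), and the select-based markup below is invented, since Facebook's real settings markup differed and changed over time.

```python
from html.parser import HTMLParser

class SettingsExtractor(HTMLParser):
    """Collect (attribute, openness level) pairs from <select> elements
    on a hypothetical settings page."""
    def __init__(self):
        super().__init__()
        self.settings, self._attr = {}, None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "select":                      # one drop-down per attribute
            self._attr = a.get("name")
        elif tag == "option" and self._attr and "selected" in a:
            self.settings[self._attr] = a.get("value")

    def handle_endtag(self, tag):
        if tag == "select":
            self._attr = None

page = ('<select name="birthday">'
        '<option value="everyone" selected>Everyone</option>'
        '<option value="friends">Friends Only</option></select>')
parser = SettingsExtractor()
parser.feed(page)
print(parser.settings)  # {'birthday': 'everyone'} -- sent to the backend as JSON
```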
Fig. 6. Screen shot of privacy settings assistance

Fig. 7. Settings visualization when the privacy score is high

Fig. 8. Dialog before sending profiles and privacy settings

Fig. 9. Privacy score diagnosis

Fig. 10. Recommendation by attribute co-occurrence
6 Evaluation

We have conducted an evaluation of the usability of our system through questionnaire surveys of human test subjects. The test subjects were 15 postgraduate students of the authors' university. The questions and their responses are shown in Table 3. Q1 asked whether the original Facebook privacy settings are complicated; 53 percent of the subjects said no. This result rather contradicts our view that choosing among six levels for more than 20 attributes is not a simple task. We point out, however, that the task can indeed be simple for users with a simple policy, such as always choosing "Friends Only". This question also does not assume a situation in which a careful selection of settings for attracting more people is required.
Table 3. Questionnaire results from 15 test subjects

Question                                                       Responses
Q1 Do you think that Facebook privacy settings are             Yes 7 (47%) / No 8 (53%)
   complicated?
Q2 Do you think that the privacy score and result messages     Yes 12 (80%) / No 3 (20%)
   are appropriate to represent the openness level of your
   privacy settings?
Q3 Is visualizing accumulated data (bar charts) of privacy     Yes 11 (73%) / No 4 (27%)
   settings helpful for setting your privacy settings?
Q4 When you see suggestion messages such as "N% users          Yes 9 (60%) / No 2 (13%) /
   disclose ITEM to the public." in your settings page, are    Others 4 (27%)
   these suggestions helpful for setting your privacy          (no suggestion message)
   settings?
Q5 Which one is better: visualizing accumulated data           bar chart 6 (40%) /
   (bar chart), or suggestions?                                suggestions 9 (60%)
Q2 asked about the appropriateness of the privacy score, and 80 percent said yes. Q3 asked whether the bar-chart visualization of other users' privacy settings was helpful, and 73 percent said yes. Q4 asked whether recommendation messages appeared in the session and whether the messages were helpful: 27 percent received no recommendation, while 60 percent said the recommendations were useful. Overall, the results of Q2 - Q4 show the effectiveness of our approach of showing the tendencies of others' privacy settings. Q5 asked subjects to choose the better method between the bar-chart visualization and the recommendation by attribute co-occurrence, and 60 percent said the latter is better. Note that, as shown in Q4, not all test subjects received recommendations; such situations occur when no significant co-occurrence is observed in the collected settings. We surmise the reasons for the superiority of the recommendation messages over the bar-chart visualization to be: (1) the former can alert the user in a more targeted manner, (2) the reason for a recommendation is shown as the percentage of users choosing public, and (3) no learning effort is required of users. The bar-chart visualization, on the other hand, demands that users read the bar charts and determine their appropriate settings by themselves, which requires a certain training effort.
7 Conclusion

In this paper, we proposed a new framework for assisting the privacy settings of SNS users by visualizing the tendencies of similar users' settings and recommending settings based on attribute co-occurrence. Our approach of collecting open privacy settings and profiles does not rely on access-restricted full privacy settings, so we can mine a vast volume of users, who can be categorized by the available user attributes. In future work, we plan to improve the user interface and to improve the settings recommendation by reflecting friend links.
References
1. Acquisti, A., Gross, R.: Imagined communities: Awareness, information sharing, and privacy on the Facebook. In: Danezis, G., Golle, P. (eds.) PET 2006. LNCS, vol. 4258, pp. 36–58. Springer, Heidelberg (2006)
2. Ahern, S., Eckles, D., Good, N., King, S., Naaman, M., Nair, R.: Over-Exposed? Privacy Patterns and Considerations in Online and Mobile Photo Sharing. In: Proc. CHI (2007)
3. Baker, F.B.: The Basics of Item Response Theory, 2nd edn. Heinemann (1985)
4. BBC News: Crew sacked over Facebook posts (October 2008), http://news.bbc.co.uk/2/hi/uk_news/7703129.stm (accessed June 2010)
5. Bilge, L., Strufe, T., Balzarotti, D., Kirda, E.: All Your Contacts Are Belong to Us: Automated Identity Theft Attacks on Social Networks. In: Proc. WWW 2009, pp. 551–560 (2009)
6. Facebook web site, http://www.facebook.com
7. Fang, L., LeFevre, K.: Privacy Wizards for Social Networking Sites. In: Proc. WWW 2010, pp. 351–360 (2010)
8. Iwaihara, M., Murakami, K., Ahn, G.-J., Yoshikawa, M.: Risk evaluation for personal identity management based on privacy attribute ontology. In: Li, Q., Spaccapietra, S., Yu, E., Olivé, A. (eds.) ER 2008. LNCS, vol. 5231, pp. 183–198. Springer, Heidelberg (2008)
9. Krishnamurthy, B., Wills, C.E.: Characterizing privacy in online social networks. In: Proc. Workshop on Online Social Networks (WOSP 2008), pp. 37–42 (2008)
10. Liu, K., Terzi, E.: A framework for computing the privacy score of users in online social networks. In: Proc. Int. Conf. Data Mining (2009)
11. Luo, W., Xie, Q., Hengartner, U.: FaceCloak: An Architecture for User Privacy on Social Networking Sites. In: Proc. IEEE Int. Conf. Privacy, Security, Risk and Trust (PASSAT 2009), pp. 26–33 (August 2009)
12. Graph API Reference, http://developers.facebook.com/docs/reference/api/
13. NPO Japan Network Security Association: 2009 Survey Report of Information Security Incidents (April 2010)
14. OpenSocial - It's Open. It's Social. It's up to you, http://www.opensocial.org/
15. Squicciarini, A.C., Shehab, M., Paci, F.: Collective Privacy Management in Social Networks. In: Proc. WWW, pp. 521–530 (2009)
16. Squicciarini, A.C., Sundareswaran, S., Lin, D., Wede, J.: A3P: Adaptive Policy Prediction for Shared Images Over Popular Content Sharing Sites. In: Proc. ACM HT, pp. 261–169 (June 2011)
Semantics-Enabled Policies for Information Sharing and Protection in the Cloud

Yuh-Jong Hu, Win-Nan Wu, and Jiun-Jan Yang

Emerging Network Technology (ENT) Lab., Department of Computer Science, National Chengchi University, Taipei, Taiwan
[email protected], {99753505,98753036}@nccu.edu.tw
http://www.cs.nccu.edu.tw/~jong
Abstract. The cloud computing platform provides utility computing, allowing people to have convenient and flexible information sharing services on the web. We investigate the interdisciplinary area of information technology and law, and use semantics-enabled policies for modeling legal regulations in the cloud. The semantics-enabled policies for information sharing and protection are represented as a combination of ontologies and rules to capture the concepts of security and privacy laws. Ontologies are abstract knowledge representations of information sharing and protection, which are extracted manually from the data sharing and protection laws. Rules provide further enforcement power after the ontologies have been constructed. The emerging challenges of legalizing semantics-enabled policies for laws in the cloud include mitigating the gap between semantics-enabled policies and laws, to avoid any ambiguity in the policy representation, and resolving possible conflicts among policies when laws from multiple jurisdictions must be integrated.

Keywords: semantics-enabled policies, information sharing, data protection, national security, cloud computing, privacy for social network cloud.
1 Introduction
We are dealing with the problem of using information sharing services not only to enforce national security, but also to ensure the protection of personal privacy. Personal data on social networks, such as Facebook and Twitter, are usually collected, retained, processed, and retrieved across different jurisdictions in the cloud. Legal policy enforcement for cross-border information sharing and protection is much more difficult in the cloud than in a conventional computing environment. We show how national security and privacy laws can be modeled and enforced as semantics-enabled policies using ontologies and rules. These policies, which model real-world laws, can be mapped to the enforceable security policies in the cloud's data centers. In our formal policy framework, the semantics-enabled policies are integrated, managed, and enforced in order to provide cross-border information sharing and protection services.
By the NIST definition, cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction. The cloud model is composed of five essential characteristics: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid elasticity, and measured service. A spectacular new phenomenon of information sharing and service integration has been created on the social web 2.0 since cloud services, such as SaaS, PaaS, and IaaS, were offered. Cross-border data integration in the cloud allows legal authorities to exploit legitimate law enforcement processes to collect and use shareable personal information to counter international crimes. However, it is difficult to foresee the consequences that may arise in the enforcement of national security policies through cross-border information sharing on the social web without violating current privacy laws. Currently, 'model contracts' and the 'Safe Harbor' program are used for the cross-border transfer of personal data to third countries. Even though they are compliant with the EU Data Protection Directive (EC/95/46), they may not offer a workable solution for implementing data sharing and protection services in the cloud [1].

1.1 Research Issues and Contributions
Research issues. We have identified several research issues on the semantics-enabled policy enforcement of information sharing and protection in the cloud: (i) policies must be represented and interpreted without causing any ambiguity when they are enforced for services; (ii) to ensure that policies are compliant with the laws, the legal concepts of information sharing and protection must be manually extracted from the laws of each judicial domain; (iii) to deploy and enforce the policies for national security and privacy protection purposes, abstract semantics-enabled legal-aware policies are overlaid on the current OpenTC cloud infrastructure to activate the lower-level security policies1; (iv) semantics-enabled policies from multiple domains must be integrated and unified in order to provide cross-border data usage services; and (v) legalized policies must be enforced when semantics-enabled policies are deployed in the formal policy platform.

Our contributions. Our main contributions are: (i) semantics-enabled policies are presented as a combination of ontologies and rules; (ii) privacy protection policies are unified with national security policies in the social network cloud, so that each information disclosure request simultaneously complies with national security laws to counter crimes and with privacy laws to protect civil rights; (iii) automated policy integration is presented as ontology merging and rule integration across multiple domains, where the high-level abstract semantics-enabled policies are mapped to national security and privacy protection policies; and (iv) a data request for a counter-crime example is demonstrated to enforce national security, providing legal information sharing services without violating the data protection law of each jurisdiction.

1 The EU FP6 Open Trusted Computing (OpenTC) project, http://www.opentc.net/
Outline. The remainder of the paper is organized as follows. In Section 2, we introduce background information and related work. The important design issues related to semantics-enabled formal policies, including policy representation, policy compliance, policy framework, and policy deployment are explained in Section 3. In Section 4, we discuss unifying policies through policy integration. We focus on an example for unifying privacy protection and national security policies and explicitly describe the policy enforcement of this example. In Section 5, we conclude this paper and point out possible future work.
2 Background and Related Work
Security and privacy are the two major challenges in delivering trusted information sharing and protection services in the cloud [2]. Given a particular cloud service infrastructure, we should allow only authorized users to use the services permitted by security and privacy laws. Once information sharing services cross borders in the cloud, regulation compliance and enforcement become more difficult, because each domain is an independent judicial area regulated only by its own security and privacy protection laws. When information sharing crosses multiple domains, we have to integrate the laws from the different jurisdictions to provide the intended services; sometimes resolving regulation conflicts between different judicial areas is unavoidable. To counter crimes, such as fraud or terrorism, we address the issue of possible cross-border information sharing on social networks in the cloud. Although a cloud infrastructure can provide much easier data sharing services, a cloud provider must still respect the data protection laws of each legal domain when regulating its data collection, storage, and usage. We intend to apply the semantic web's ontology and rule technologies to represent the national security and privacy protection policies. The semantics-enabled policies are computable and machine-understandable, so legally compliant policies can be enforced automatically by a computer with little human intervention. An enterprise server usually declares its privacy protection policies either in legal language, as privacy statements, or in a computer privacy language, such as the Platform for Privacy Preferences (P3P) [3]. When a privacy language such as P3P is used for privacy statement declaration in a server, it always takes into account the Fair Information Principles (FIPs) extracted from international privacy protection law, such as the EU Data Protection Directive (EC/95/46). On the other hand, Enterprise Privacy Authorization Language (EPAL) policies can be expressed as access control policies (ACP) and data handling policies (DHP) [4]. An ACP states the conditions that a requester must satisfy to gain access to a resource. A DHP, on the other hand, indicates how the requester's information should be treated once it is revealed. In fact, ACP and DHP are a data controller's promised policies, created by unifying the client's user privacy preferences and the server's privacy declarations [5] [6]. For each data request, an ACP is used for authenticating a legitimate user to access the data; a DHP then governs other servers' requests for data usage, in the same way as for the original data collector. But the P3P and EPAL privacy languages lack the formal and unambiguous semantics
needed for a policy administrator to specify privacy protection policies. A formal semantic policy resolves this problem by allowing a software agent to enforce various privacy protection policies automatically. In addition, ACP and DHP can be enhanced and used for information sharing and protection in the cloud. One of the research challenges in solving the online privacy protection problem is to develop a privacy management framework using a formal semantic language, thus empowering agents to enforce privacy protection policies. Agents can also detect and avoid policy violations in each data request. We have established a semantic privacy protection model to address this issue [7]. We also intend to enhance the OpenTC two-layered trusted virtual cloud infrastructure with this semantic privacy protection model in the cloud [8]. A three-layered formal policy framework is presented to ensure legal data sharing and to avoid violating data protection laws in the social network cloud (see Figure 2). The purpose of information sharing is to permit the legal collection of personal identification information (PII), such as email, online location, and phone numbers, in the social network cloud to counter crimes. We conclude that the dual objectives of greater national security and greater privacy protection can be achieved through unifying national security policies and data protection policies in the cloud; this statement is similar to the viewpoints suggested in [9]. Semantic web technologies have previously been applied to a national security protection scenario to facilitate information sharing across intelligence community boundaries [10]. In another work, Information-Sharing Agreements (ISAs) were constructed through agreed, formally defined legal rules, derived from policies, to regulate and direct inter-agency data flows [11].
3 Semantics-Enabled Formal Policies
The well-known semantic web layered architecture2 has undergone revisions reflecting the evolution of the layers and their relationships. Semantics-enabled formal policies are formulated as ontology and rule knowledge bases, using the ontology and rule languages of the semantic web layered architecture. Many operations can thereby be automated, reducing ad-hoc program coding to a minimum and enabling automated documentation [12]. An ontology is a formal, explicit specification of a shared conceptualization [13]. One key aspect of managing policies is the semantic heterogeneity of, and conflicts among, policies. Using an ontology as a formal representation of a policy, together with a meta-policy for resolving policy semantic heterogeneity and conflicts, is very promising. Furthermore, rules empower policy enforcement for information sharing and protection once the policy and meta-policy have been described as ontologies.

3.1 Formal Policy Representation
A formal policy is a declarative expression of a legal regulation that can be executed in a computer system without causing semantic ambiguity.

2 http://www.w3.org/2007/03/layerCake.svg
A formal policy is created from a policy language, which is a combination of an ontology language and a rule language. Policy languages, such as Rein [14], KAoS [15], and Protune [12], have been proposed to allow agents to understand and enforce policies as intended by their semantics. A formal policy is composed of ontologies and rules, where the ontologies are created from an ontology language and the rules from a rule language [16]. A formal protection policy aims at representing and enforcing data protection directives and national security principles, where the structures of the privacy protection directives and national security principles are modeled as ontologies and the enforcement of these formal protection policies is expressed as rules. In the policy ontology, a Request for data hasCondition, such as DataUser, Purpose, etc. (see Figure 1). If multiple policies are applicable to a data request, we use hasPriority to set an execution priority. Otherwise, an Isolated Policy isBelongedTo a TLD (Trusted Legal Domain). When a Request getInTo a TLD (see Section 3.3), the policies for this legal domain will be integrated. DomainPolicy is a meta-policy, and it hasTLD to offer its DataPolicy. A meta-policy is a policy about policies that provides a set of rules for realizing the services needed for the management of policies [17]; here, a meta-policy consists of a set of rules for setting up the priority between privacy protection and national security policies. Policy management services are provided in the formal policy framework of Section 3.3. They could be implemented as meta-policies in Rein [14] or as policy administration tools in KAoS [15]. In Protune, the role of meta-policies is to govern policy behavior, reduce ad-hoc programming efforts, and improve policy readability and maintainability.
Fig. 1. A policy ontology is used for policy and data usage descriptions of a TLD
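As a toy illustration of how such a policy ontology and an enforcement rule might be operationalized, the sketch below encodes a Request and a policy as subject-predicate-object triples and evaluates one permit rule. This is our own minimal reconstruction, not the paper's implementation: the predicates permitsRole and permitsPurpose are invented for the example and do not appear in Figure 1.

```python
# The Figure 1-style concepts (Request, hasCondition, DataUser, Purpose,
# isBelongedTo) as triples; permitsRole/permitsPurpose are hypothetical.
triples = {
    ("req1", "hasCondition", "cond1"),
    ("cond1", "dataUser", "NationalSecurityOfficer"),
    ("cond1", "purpose", "CounterCrimeInvestigation"),
    ("policy1", "isBelongedTo", "TLD_b"),
    ("policy1", "permitsRole", "NationalSecurityOfficer"),
    ("policy1", "permitsPurpose", "CounterCrimeInvestigation"),
}

def value(s, p):
    """Look up the (unique) object of a subject/predicate pair."""
    return next((o for (s2, p2, o) in triples if s2 == s and p2 == p), None)

def permits(policy, request):
    """Rule: permit(policy, req) <- hasCondition(req, C),
       dataUser(C, U), permitsRole(policy, U),
       purpose(C, P), permitsPurpose(policy, P)."""
    c = value(request, "hasCondition")
    return (value(c, "dataUser") == value(policy, "permitsRole") and
            value(c, "purpose") == value(policy, "permitsPurpose"))

print(permits("policy1", "req1"))  # True
```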
3.2 Formal Policy Compliance
The cloud computing environment is an international, global computer system infrastructure. Once dispersed computing resources are installed and data resides in the cloud, we face the challenge of providing legalized data sharing and protection services across jurisdictions. In the cloud, anyone can use anything from anywhere at any time, so we must harmonize laws that come from different jurisdictions. This also raises the regulation compliance issue: the formal policies enforced in the cloud must satisfy the data usage criteria indicated in the related laws. Obviously, current data protection and national security laws are not up-to-date in handling the cloud's cross-border data sharing and protection problems. We need to address research issues not only of law refinement but also of technology re-engineering. The ultimate objective of this study is to empower the use of flexible and agile cloud resources without violating the laws. Semantics-enabled formal policies are inflexible if they are compliant only with current laws and cannot comply with new laws arising from emerging information technologies. We therefore propose a formal policy framework with flexible policy deployment, integration, and enforcement. In this framework, semantics-enabled data protection and national security policies are automatically unified to satisfy the purpose of national security enforcement through data sharing, while also ensuring that data protection laws are not violated. In this paper, the formal policy compliance of each data request is based on the data usage context of a user; this context is a pre-condition for retrieving shared information in a way that satisfies the laws. Which laws apply to a data request in a TLD depends on the data usage context of the data user, and the legal boundary of a TLD is likewise based on the data usage context.

3.3 Formal Policy Framework
A trusted policy framework is essential to facilitate automatic policy integration and to meet inter-domain service-access requirements in the cloud [2]. We need a framework to guarantee that formal policies are compliant with the laws and that they are properly specified, verified, and enforced for any possible data access across domains. Based on the trusted virtual domain's (TVD's) two-layered infrastructure [18], we present a three-layered semantics-enabled formal policy framework (see Figure 2):

1. Cloud Machine Domain (CMD) layer. A group of physical cloud computers with various virtual machines (VMs) is established within a trusted machine domain (TMD). A TMD allows a group of cloud computers connected by a VLAN switch to be protected as an isolated Intranet. Otherwise, a virtual private network (VPN) is set up to provide a secure channel between TMDs and secure data transmission between VMs.
Fig. 2. A semantics-enabled formal policy framework with three policy domain layers: cloud machine domain (CMD), cloud virtual domain (CVD), and cloud legalized domain (CLD)
In the CMD layer, data centers operate in the so-called physical cages model, wherein different customers' IT infrastructures run on distinct physical resources. The physical boundary of a TMD depends on whether the hosts belong to the same LAN within an Intranet. In the same LAN, hosts can communicate directly using the trusted physical link without traffic encryption.

2. Cloud Virtual Domain (CVD) layer. Although a group of VMs is dispersed across multiple physical cloud computers in TMDs, these VMs can still be configured into a virtual zone as a Trusted Virtual Domain (TVD) belonging to a specific customer in a private cloud. A TVD consists of a set of virtual machines, a network configuration, storage, and policies for access control and resource consumption. Protection policies are created for uniform secure services, such as storage, networking, and TVD membership, within a TVD [8].
The CVD layer allows resource sharing among customers in the logical cages model, which enables more flexible and efficient management of the data center's resources [8]. The logical boundary of a TVD is a secure logical domain, where security and storage usage policies are uniformly enforced across the TVD's members.

3. Cloud Legalized Domain (CLD) layer. Semantics-enabled policies are manually specified and are compliant with the current laws for data sharing and privacy protection in a Trusted Legal Domain (TLD). A TLD has a virtual legal boundary and uses law-compliant semantics-enabled policies to regulate data access; the semantics-enabled policies are translated into the network security and storage usage policies of a TVD. In the CLD layer, we use the legal cages model, in contrast to the logical cages model of the CVD layer, to provide uniformly legalized data sharing and protection services. Within the legal virtual boundary of a TLD, a person (or software) has limited data access rights, to serve a purpose within a particular data usage context. For example, a national security law enforcer has the right to access a suspect's Facebook IP and email addresses from the list of friends' contacts whenever an investigation backed by certain evidence is allowed to do so. However, whether a data request is granted or denied still depends on the additional data usage context, such as the data requester's location, the data center responsible for the data, and the laws applicable to the request. Furthermore, the semantics-enabled policies can also define a permissible data flow between any two TLDs and regulate the flow under each TLD's law.

3.4 Formal Policy Deployment
Semantics-enabled policies are deployed in TLDs and enforced in the CLD layer of the formal policy framework. We aim to represent and enforce the high-level, legally compliant semantics-enabled policies of TLDs. The legally compliant policies of TLDs can thus be flexibly mapped into the security and privacy policies of TVDs; in turn, the security and privacy policies of TVDs are mapped into the security services of TMDs. The possible mappings from TLD(s) to TVD(s) are one-to-one or many-to-many, similar to the mappings from TVD(s) to TVD data center(s) (TVDc) implemented in the Xen Cloud Platform (XCP)3 [8]. The legal virtual boundary of a TLD is determined by the particular law that regulates the data disclosure range and level, where the TLD's semantics-enabled policies are compliant with that law. An intersection area must comply with the applicable laws of multiple TLDs. When a data usage context is initiated for a data user's information request, the semantics-enabled policies related to the applicable laws are executed. A data usage context includes a purpose, a data user's role, a requester location, a data location, an action, etc. (see Condition in Figure 1).
3 XCP, http://www.xen.org/products/cloudxen.html
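To make the notion of a data usage context concrete, here is a minimal sketch of our own, with an invented jurisdiction-to-TLD mapping, for deciding which TLDs' policies govern a request:

```python
from dataclasses import dataclass

@dataclass
class DataUsageContext:
    purpose: str
    role: str
    requester_location: str  # jurisdiction of the data user
    data_location: str       # jurisdiction of the data center
    action: str

def applicable_tlds(ctx, tld_of):
    """All TLDs whose laws govern this request: the requester's and the
    data's jurisdictions. Their intersection corresponds to TLD d in
    Figure 2, where policies must be unified; tld_of is an assumed
    jurisdiction -> TLD mapping."""
    return {tld_of[ctx.requester_location], tld_of[ctx.data_location]}

ctx = DataUsageContext("counter-crime investigation",
                       "national security officer", "US", "EU", "read")
print(applicable_tlds(ctx, {"US": "TLD_c", "EU": "TLD_b"}))
# {'TLD_b', 'TLD_c'} -> policies from both legal domains must be unified
```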
Fig. 3. A layer structure from legal domain to virtual domain, where the semantics-enabled policy is enforced and managed through meta-data, including domain-policy and meta-policy, to locate the real information
In fact, this data usage context is based on the core definitions of the data protection laws or national security laws. When a user submits a data request, a data usage context is created for the request, with the policy enforcement ensuring that all information disclosure is legal under the laws. We face a law integration problem that turns into a formal policy integration problem. In [8], two types of policy govern cloud security in TVDs. The first, the security policy, limits the flow of networks and the usage of machine storage. The second, the membership policy, defines which VMs are allowed to join a TVD. The security policy is used for the security enforcement of TVDs on the CVD layer, but the real policy enforcement mechanisms are still executed on the CMD layer. Semantics-enabled protection policies leverage the cloud security services of these security policies, because the CVD layer is unaware of the legal requirements.
4 Unifying Formal Policies
When a data user asks for information, a formal policy provides the concept of the laws represented for a TLD, with possible enforcement constraints given through a data usage context. Whenever a data usage context falls into the intersection area of multiple TLDs, the formal policies of these TLDs are unified to enforce the data usage (see TLD d in Figure 2). In the procedure of unifying multiple formal policies, we map and merge the local ontologies of the policies and construct a global ontology for the unified formal policies [7]. For demonstration, two types of formal policies, privacy protection and national security, are unified to enforce a national security policy in the social network cloud (see Section 4.4).
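A minimal sketch of this unification step, assuming policies have already been reduced to triple sets (real ontology merging must also align vocabularies and resolve conflicts via meta-policies, which is omitted here):

```python
def unify_policies(*policies):
    """Ontology merging approximated as set union of policy triples;
    a unified request must then satisfy the constraints of all inputs."""
    return set().union(*policies)

# Hypothetical constraint triples from two legal domains.
privacy_policy = {("pii", "requires", "owner_consent")}
security_policy = {("pii", "disclosableFor", "counter_crime")}
unified = unify_policies(privacy_policy, security_policy)
print(len(unified))  # 2 -- the data usage is regulated by both policies
```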
4.1 Formal Policy Integration
People are becoming aware of more flexible and easier ways to provide information sharing services in the cloud. For example, it is much easier to counter terrorism
through collecting a suspect's profile in the social network cloud. A challenge exists in how to achieve privacy-preserving data integration and sharing services [19]. We attempt to apply semantics-enabled formal policies, integrated from various autonomous data sources in the cloud, to information sharing once the relevant laws are available. Information integration collects data from autonomous and heterogeneous sources and provides users with a unified view of these data through a so-called global schema. The global schema, which is a reconciled view of the information, provides a single point of query services for end users. But the design of a data integration system involves several different issues, so it is very complex [20]. In this paper, we use a data integration service for information sharing in order to achieve privacy-preserving data usage in the social network cloud.
4.2 Privacy Protection Policies
A privacy protection policy is a type of formal policy used for specifying a data usage constraint created by a data owner. After a policy is accepted, it represents a long-term promise made by an enterprise to its users. Therefore, it is undesirable to change an enterprise's promises to customers every time an internal access control rule changes. If possible, we should enable the P3P and EPAL policies to be accountable and transparent regarding information processing, so that a data owner can revise the data usage permissions in the future [3]. A data owner's PII is usually collected by a data controller, analyzed by a data processor, and accessed by a data user. All of these operations are protected under the privacy protection law's umbrella in a TLD b (see Figure 4). When a data request, covering collection, analysis, or use, is made, we first consider the data usage context of this request. This allows us to decide how much PII, and at what level, can be disclosed in order to comply with the privacy laws.
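A small Python sketch of such a context-driven disclosure decision; the context attributes mirror the Condition concept of Figure 1, while the decision table itself is a hypothetical policy invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class UsageContext:
    purpose: str
    role: str
    requester_location: str
    data_location: str
    action: str

def disclosure_level(ctx):
    """Decide how much PII may be disclosed for a given data usage context."""
    if ctx.role == "dataProcessor" and ctx.purpose == "analysis":
        return "pseudonymized"   # processing allowed only on masked data
    if ctx.role == "dataUser" and ctx.purpose == "service":
        return "partial"         # only the fields needed for the service
    return "none"                # default deny under the privacy law

ctx = UsageContext("analysis", "dataProcessor", "TW", "TW", "read")
print(disclosure_level(ctx))     # pseudonymized
```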
4.3 National Security Policies
When a national security officer intends to access a group of suspects' PII, a data usage context is also created for this request. The data usage context of this information request includes a national security officer as the data user role, an investigation for homeland security as the purpose, the location of this data user, and the data itself. The policy ontology in Figure 1 details this concept description. Formal policies, based on the national security laws, are fetched to circumscribe the virtual boundary of a data usage in a TLD. Once the laws are revised, the data usage context will be changed and the virtual boundary of a data usage will be updated. Thus the formal policy framework in Figure 2 provides flexible policy re-mapping when applying the new laws to redraw a TLD's virtual boundary. PII is originally protected by the data protection law in the TLD b. When a data usage context is created to enforce the national security policy, a data usage is moved and circumscribed in the TLD d, and eventually migrated into the TLD c (see Figure 4). If a piece of PII sits in the TLD b but cannot move
into the TLD d or the TLD c under any data usage context, this implies that this PII cannot be disclosed through the national security policy enforcement.
4.4 Unifying Privacy Protection and National Security Policies
Some believe that the objectives of greater national security and greater personal privacy can be reconciled, but others disagree. For example, the authors of [9] believe that the ultimate solution for balancing national security and privacy protection lies in utilizing information technologies both for counter-terrorism and to safeguard civil liberties. Pattern-based data queries face the challenge of privacy rights violations from false positives when identifying terrorist suspects. Therefore, pattern-based queries must be issued iteratively in a privacy-sensitive manner. In this paper, the privacy violation issue can be avoided by using the right data usage context in a TLD. When we retrieve PII, the semantics-enabled policies perform reasoning. This provides additional evidence for updating the data usage context to allow enforcing national security policies iteratively; however, the information disclosure still respects the data protection policies.

When a data usage context is moved into the intersection of TLDs, i.e. TLD d, it implies that the privacy protection and national security policies are unified. A data usage request is then regulated by these two types of policies. The ontologies of these policies will be mapped and merged. Rules will be further integrated to enforce the data usage within the intersection, TLD d, of the multiple legal domains (see Figure 4). However, when applying pattern-based data usage in the intersection area, we still have to follow the PII anonymous disclosure principles if the supporting evidence is not strong enough to allow a full information disclosure. Handling anonymous information requires multiple stages of human-driven analysis with reasoning over the unified policies. Therefore, national security analysts cannot act alone on the results of such queries until a third-party legal authority has established sufficient probable cause. Data analysts would refine queries in stages, seeking to gain more confirmation while involving privacy-protection techniques in the process [9]. Eventually, the data usage context will move to the TLD c, where it is beyond the TLD b's data protection boundary. Under that circumstance, the data usage context is only regulated and enforced by the national security laws. At this stage, the data protection laws are out of context because the national security officers have enough plausible evidence to prove that the suspects have committed a crime against the national security laws. Unifying privacy-protection policies with national security policies not only ensures privacy, but also encourages sharing data without fear of a privacy rights violation.

Sometimes, PII is collected and stored by a social network in multiple data centers dispersed across different judicial TLDs. Each TLD is an independent legal domain, regulated by its own data protection and national security laws. Unless (international) mutual agreements are established, a TLD's legal regulations do not allow its PII to be shared and transported to other TLDs. So the formal policies are only enforced locally without being unified
Fig. 4. A data usage context serves various information disclosures for TLDs
with each other. Given this situation, data usage and storage are restricted to a single legal domain, so the economic incentives of using a cloud's resources are hard to obtain.
4.5 Formal Policy Enforcement
Based on the policy ontology presented in Section 3.1, we reuse the vocabularies of this ontology to describe the concepts of domain-policy and data-policy for the policy enforcement rules in the TLD d. We demonstrate how to use the information sharing and privacy protection policies to serve the purposes of enforcing national security and privacy protection for a data request in the TLD d. According to the policy ontology (see Figure 1), when a data request ?x with its data usage context ?c satisfies a DomainPolicy(?d)'s data usage context ?dc, a user is allowed to enter the TLD ?tld enforcing a national security investigation (see rule (1)):
– A partial ontology for a domain policy:
hasTLD.DomainPolicy(d), hasTLD−.TLD(d).
hasCondition.DomainPolicy(d), hasCondition−.Condition(d).
hasPartOf.Condition(d), hasPartOf−.Purpose(investigation), hasPartOf−.DataUser(securityPersonnel), hasPartOf−.Location(TW), hasPartOf−.Evidence(things), hasPartOf−.Consent(nil).
– A rule for a domain policy enforcement:
Request(?x) ∧ hasCondition(?x, ?c) ∧ Condition(?c) ∧ DomainPolicy(?d) ∧ hasCondition(?d, ?dc) ∧ Condition(?dc) ∧ hasTLD(?d, ?tld) −→ getInTo(?x, ?tld)   (1)
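A toy Python evaluation of rule (1) as a condition match over the data usage context may help intuition; a production system would use an OWL/SWRL reasoner, and while the attribute names below mirror the partial ontology above, the dictionary encoding is our own invention:

```python
# Hypothetical dictionary encoding of the domain policy for TLD d.
domain_policy = {
    "tld": "TLD d",
    "condition": {"purpose": "investigation",
                  "data_user": "securityPersonnel",
                  "location": "TW",
                  "evidence": "things",
                  "consent": None},
}

def get_into(request_condition):
    """Return the TLD the request may enter if its context satisfies
    the domain policy's condition, mimicking rule (1); else None."""
    required = domain_policy["condition"]
    if all(request_condition.get(k) == v for k, v in required.items()):
        return domain_policy["tld"]
    return None

req = {"purpose": "investigation", "data_user": "securityPersonnel",
       "location": "TW", "evidence": "things", "consent": None}
print(get_into(req))  # TLD d
```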
An ontology and a rule for a data policy ?d in the TLD ?tld allow a request ?r to use PII ?pii of the social network information ?sInfo (see rule (2)) as follows:
– A partial ontology for a data policy:
isBelongedTo.DataPolicy(d), isBelongedTo−.TLD(d).
describes.DataPolicy(d), describes−.PII(d).
hasDisclosedFor.PII(d), hasDisclosedFor−.socialNetInfo(d).
socialNetInfo(d) ≡ Email(d) ⊔ OnlineLocation(d) ⊔ phoneNo(d).
– A rule for a data policy enforcement:
Request(?r) ∧ satisfy(?r, ?x) ∧ DataPolicy(?d) ∧ describes(?d, ?pii) ∧ hasDisclosedFor(?pii, ?sInfo) ∧ Evidence(things) −→ canUse(?r, ?pii) ∧ socialNetInfo(?sInfo)   (2)
5 Conclusion and Future Work
We present a three-layered formal policy framework to demonstrate how data usage crosses multiple judicial domains in the cloud. We focus on the design and modeling of the CLD layer, with numerous built-in TLDs, for the formal policy framework. In this innovative cloud framework, a TLD specifies its legal virtual boundary to accept a data request. When a data user asks for information disclosure using a role, with a purpose, from a location, a data usage context is created to determine which TLD, with its various policies, is eligible to constrain the data usage. A domain-policy is applied to select which data policies are applicable for a real information disclosure within a TLD. A meta-policy is used for setting up the data-policy's priority for policy management when policy conflicts exist. Semantics-enabled policies are shown as a combination of ontologies and rules, where the ontologies describe the concepts of the policies, including domain-policy, meta-policy, and data-policy. Rules further enforce these different types of policies. The semantics-enabled policies are applied to a scenario where the national security policies for information sharing and the privacy protection policies for data usage are both satisfied. Finally, the CLD layer's proof-of-concept prototype, based on the OpenTC architecture, has been implemented to validate our approach.

Acknowledgements. This research was partially supported by the NSC Taiwan under Grant No. NSC 99-2221-E-004-010 and NSC 100-2221-E-004-011-MY2.
References
1. Bruening, P.J., Treacy, B.C.: Cloud computing: privacy, security challenges. Privacy & Security Law Report (2009)
2. Takabi, H., et al.: Security and privacy challenges in cloud computing environments. IEEE Security & Privacy 8, 24–31 (2010)
3. Antón, A.I., et al.: A roadmap for comprehensive online privacy policy management. Comm. of the ACM 50, 109–116 (2007)
4. Vimercati, S.D.C.d., et al.: Second research report on next generation policies, project deliverable D5.2.2. Technical report, PrimeLife (2010)
5. Ardagna, C.A., et al.: A privacy-aware access control system. Journal of Computer Security 16, 369–397 (2008)
6. Karjoth, G., et al.: Translating privacy practices into privacy promises - how to promise what you can keep. In: POLICY 2003. IEEE, Los Alamitos (2003)
7. Hu, Y.J., Yang, J.J.: A semantic privacy-preserving model for data sharing and integration. In: International Conference on Web Intelligence, Mining and Semantics (WIMS 2011). ACM, Norway (2011)
8. Cabuk, S., et al.: Towards automated security policy enforcement in multi-tenant virtual data centers. Journal of Computer Security 18, 89–121 (2010)
9. Popp, R., Poindexter, J.: Countering terrorism through information and privacy protection technologies. IEEE Security & Privacy 4, 24–33 (2006)
10. Kettler, B., et al.: Facilitating information sharing across intelligence community boundaries using knowledge management and semantic web technologies. In: Popp, R.L., Yen, J. (eds.) Emergent Information Technologies and Enabling Policies for Counter-Terrorism, pp. 175–195. Wiley, Chichester (2005)
11. Buchanan, W., et al.: Interagency data exchange protocols as computational data protection law. In: Legal Knowledge and Information Systems - JURIX, pp. 143–146. IOS Press, Amsterdam (2010)
12. Bonatti, P., Olmedilla, D.: Policy language specification, enforcement, and integration. Project deliverable D2, working group I2. Technical report, REWERSE (2005)
13. Gruber, T.R.: A translation approach to portable ontology specifications. Knowledge Acquisition 5 (1993)
14. Kagal, L., et al.: Using semantic web technologies for policy management on the web. In: 21st National Conference on Artificial Intelligence (AAAI). AAAI, Menlo Park (2006)
15. Tonti, G., Bradshaw, J.M., Jeffers, R., Montanari, R., Suri, N., Uszok, A.: Semantic web languages for policy representation and reasoning: A comparison of KAoS, Rei, and Ponder. In: Fensel, D., Sycara, K., Mylopoulos, J. (eds.) ISWC 2003. LNCS, vol. 2870, pp. 419–437. Springer, Heidelberg (2003)
16. Hu, Y.J., Boley, H.: SemPIF: A semantic meta-policy interchange format for multiple web policies. In: 2010 IEEE/WIC/ACM Int. Conference on Web Intelligence and Intelligent Agent Technology, pp. 302–307. IEEE, Los Alamitos (2010)
17. Hosmer, H.H.: Metapolicies I. ACM SIGSAC Review 10, 18–43 (1992)
18. Berger, S., et al.: Security for the cloud infrastructure: Trusted virtual data center implementation. IBM Journal of Research and Development, 6:1–6:12 (2009)
19. Clifton, C., et al.: Privacy-preserving data integration and sharing. In: Data Mining and Knowledge Discovery, pp. 19–26. ACM, New York (2004)
20. Calvanese, D., Giacomo, G.D.: Data integration: A logic-based perspective. AI Magazine 26, 59–70 (2005)
Social Mechanism of Granting Trust Basing on Polish Wikipedia Requests for Adminship

Piotr Turek1, Justyna Spychala1, Adam Wierzbicki1, and Piotr Gackowski2
1 Polish-Japanese Institute of Information Technology, Warsaw, Poland
{piotr.turek,justynka,adamw}@pjwstk.edu.pl
2 Teleca Poland, Łódź, Poland
[email protected]
Abstract. The purpose of this paper is to describe research on Polish Wikipedia administrators and their behavior during Request for Adminship votings. An administrator is regarded as a trustworthy individual, and thus the social aspects of deciding about granting and revoking administrative permissions become relevant for the sustained growth of Wikipedia. We have conducted two kinds of experiments: the first gathers several statistics about current administrators and their contribution to the project; the second is based on an implicit social network created from the edit history and compares contributors' collaborative efforts with the votes actually cast during the Request for Adminship procedure.
Keywords: Wikipedia, Collaboration, Trust.
1 Introduction
Wikipedia is one of the most popular websites on the Internet. It is a collaborative effort to organize and present human knowledge, similarly to traditional encyclopedias. Its most distinctive feature is the fact that anyone may edit the content. Thanks to the wiki technology, anyone may become an editor. This fact causes the sustained growth of Wikipedia [8], but also possible scalability problems in the future. Nowadays, in the Web 2.0 era, there are a lot of sites where user-contributed content plays a major role. Many other public wiki sites may face problems similar to Wikipedia's. Due to Wikipedia's openness and lack of centralized supervision, authors need to overcome problems that are not found in the editing of traditional encyclopedias. The most notable example is vandalism, which is mostly the deliberate deletion of content or the insertion of false or irrelevant information. Vandalizing Wikipedia may have serious consequences for real people, especially when a biographical article is vandalized. While the global impact of this kind of damage is rather low, it is rising [7]. Even though the anti-vandalism bots created
to automatically prevent such damage do a good job, there is always a need for human reviewers. Another problem connected with the lack of central supervision arises when editors have different points of view, which may result in an edit war, where two or more contributors or groups try to enforce their version of an article. This violates one of the key Wikipedia rules, which mandates that contributors keep a neutral point of view. Viégas et al. [11] noted that edit wars are a threat not only to controversial articles. The mentioned problems are caused mostly by human factors, and at least some of their instances cannot be resolved without another human intervention. It is the role of administrators to constantly monitor Wikipedia and make sure that the rules established by the community are obeyed.
1.1 Problem Statement
With the growing amount of work for administrators, caused by the increased popularity and amount of content in Wikipedia [3], there is a potential risk that administrators may become overwhelmed and their response time may become longer. There are also debates claiming that Wikipedia administrators form a strong, tightly connected group of friends and that it is more difficult for new editors to enter that group. To avoid a possible degradation of Wikipedia quality, especially because there may be too little administrative workforce to accommodate Wikipedia's growth, we identified the need for new tools to evaluate potential new candidates for administrators, preferably from outside of the connected group. To get started, we took a look at the current situation among administrators and performed quantitative studies on past Requests for Adminship (RfAs), i.e. votes on new candidates for admins. After examining the current situation, we constructed implicit social networks from two sources, RfA votings and the Wikipedia edit history, and then compared how the relationships from the edit history relate to the cast votes. In this way, we have tried to find which criteria are dominant for voters when making the decision about a candidate. The rest of the paper is organized as follows: Next, we review past Wikipedia research, especially concerning adminship. In Section 3, we present a quantitative study of the current administrators and their RfA procedures. Section 4 focuses on the comparison of the voting-based network with the network of relationships based on the edit history. Finally, we summarize the results and draw conclusions.
2 Related Work
Wikipedia has been the subject of several studies in the past few years. The most notable example of a research topic is assessing content quality [7,1]. The trustworthiness of Wikipedia is one of the key concerns related to its usefulness and, generally, its success.
The problem of recommending and evaluating candidates for administrators has not been extensively studied, but this topic is slowly growing in popularity. The most similar work that we have found is [2]. The authors present an idea of recommending and evaluating candidates for administrators based on behavioral data and comments, not the page text. They counted each candidate's edits in various namespaces (article, article talk, Wikipedia, Wikipedia talk, wikiprojects, etc.) to calculate total contribution as well as contribution diversity. They also measured user interaction, mainly activity on talk pages, but also participation on arbitration or mediation committee pages and a few others. There are also several other statistics, but the ones mentioned seemed to be the most relevant to the candidate's success. Especially successful were candidates with strong edit diversity; mere edits in Wikipedia articles did not add much chance of success. Among user interactions, article talk page edits were the best predictor of success, with other authors' talk page edits being a rather poor one. The authors also confirmed Kittur's [4] results that the percentage of indirect work (coordination, discussion, etc.) grows over time, while the share of articles in all Wikipedia edits is decreasing.

The problem of evaluating voters and candidates has also been studied in the social context in [5]. The authors found that the probability of one person's vote being positive is correlated with basic relative figures such as: whether the voter or the candidate has more edits, who has more barnstars (awards given by other Wikipedia users), the extent of collaboration of the two, etc. The authors strongly noted that the vote value (positive or negative) is not just a function of the candidate, but of both voter and candidate. They also studied the relationship between past votes (which are public) and the next votes given by other voters. The "response function" (a function estimating the vote value based on the voter and previously cast votes) varied from one user to another. This suggests that each voter has a certain policy of looking or not looking at previous votes.
3 Polish Wikipedia Adminship
As Wikipedia itself defines, an Administrator (sysop) is a committed and trustworthy participant of a project who has received additional powers by a decision of the community. These powers do not imply editorial control over the project. Administrators also provide help in editing Wikipedia, especially to newcomers. The basic administrative permissions are as follows:
– deleting pages and un-deleting them, so administrators have access to content previously regarded as irrelevant or inappropriate for an encyclopedia,
– flagging and unflagging a page as editable only by administrators (mostly non-encyclopedic pages, such as the main page) or only by registered users,
– blocking (and unblocking) users' ability to edit pages, mostly used to disallow malicious individuals from damaging Wikipedia. Either a user account or an IP address (or a group of those) may be blocked.
As of November 1, 2010, Polish Wikipedia had 168 administrators. Since 2005, 281 Request for Adminship (hereafter RfA) votings have been held. 171 ended with administrator privileges being granted, 110 ended with the candidate being rejected, 39 were withdrawn before the end of the voting, and 34 were canceled (due to statutory requirements or the candidate not accepting the nomination). Approximately 38 administrators were selected before the introduction of the RfA procedure in March 2005. The data on the RfAs do not add up, inter alia, for the following reasons:
– "Verification" votings have been counted as ordinary ones (sometimes administrators want to confirm that they still have the support of the community and decide to verify their trustworthiness by standing for a re-voting).
– Some of the administrators gave up their powers. This happened both at the moment they stopped editing Wikipedia and in situations when they decided, after a break in editing, that they were not going to take it up again.
– Some administrators resigned and then applied for adminship again, as happened in the case of former administrators who returned to editing after previous conflicts within the community.
– A few administrators' permissions have been taken away by the Arbitration Committee.
– The first RfA procedure was performed on March 3, 2005. Previously, administrators were elected on a mailing list.
– Some charts use data from only 86 cases, due to the lack of complete data in the logs of Wikipedia. This applies particularly to the initial contributions of administrators.

In the beginning, when Wikipedia had only several active editors, adminship was granted solely based on technical needs, without social issues in mind. Soon after that, the mailing list was the place where the emerging community discussed social aspects and, in particular, nominated candidates for administrators. The procedure implemented on the mailing list worked on the principle that if nobody objected to a certain person getting the administrative permissions, they were granted. During the 4 years (up to 2005) of nominating candidates on the mailing list, 40 persons got the permissions, while only one candidate was rejected. This way of granting adminship was questionable and left no trace in Wikipedia itself of who received the permissions and when. At the beginning of 2005 the new voting-based procedure was introduced. It caused a lot of problems, for example because of "free riders" who had very little, and often disputable, contribution to the project and applied for the position of administrator. There was also a problem on the other side: people who voted often had very little experience in editing Wikipedia. Additionally, it was easy to rig the voting by using sock puppets (additional accounts owned by the same person). To remedy this situation the procedure was formalized, and its final version was created almost a year later (December 2005). The current version of the procedure mandates that a person standing for the voting must have had the account for at least 3 months and have at least 1,000 edits. To be able to vote, a user must have had the account for at least 2 weeks and have at least 500
edits in articles. The voting starts at the moment when a candidate confirms that he or she is willing to become an administrator, as candidates may apply by themselves or be nominated by others. To get the administrative permissions, a candidate must receive at least 20 "for" votes, and these must constitute at least 80% of the total "for" and "against" votes (a compact check of this criterion is sketched after Table 1). After being rejected (due to not having enough support votes or not meeting the formal requirements) or resigning, a candidate may re-apply 60 days after the voting ends.

Table 1. During which voting the candidate was accepted

Attempt   Accepted   Rejected   Percent successful
1st       145        83         63.60
2nd       18         21         46.15
3rd       4          10         28.57
4th       0          6          0.00
5th       0          1          0.00
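A minimal Python sketch of the acceptance criterion stated above; the threshold values come directly from the procedure, while the example vote counts are illustrative:

```python
def rfa_accepted(votes_for, votes_against):
    """Polish Wikipedia RfA rule: at least 20 'for' votes, and 'for'
    must be at least 80% of the combined 'for' + 'against' votes."""
    decisive = votes_for + votes_against
    return votes_for >= 20 and decisive > 0 and votes_for / decisive >= 0.80

print(rfa_accepted(20, 5))   # True  (exactly 80.0%)
print(rfa_accepted(85, 64))  # False (57.0%, despite 85 'for' votes)
```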
3.1 Basic RfA Statistics
During which Voting a Candidate was Accepted. The first analysis we dealt with was an attempt to determine during which voting a candidate is accepted. There is a noticeable and significant difference between the numbers of candidates who were admitted in the first and in subsequent attempts. It is also evident that less than half of the candidates who were rejected in the first approach try to get the administrator's rights a second time. In total, 228 candidates have applied for adminship at least once; 83 were rejected. Adminship was requested a second time by 39 candidates; over half of them, 21, were rejected. 14 candidates applied for the voting a third time, and four of them were accepted. In the case of both a fourth vote (6 candidates) and a fifth vote (one candidate), no one was accepted. Nobody applied a sixth time.

Frequency of Votings. Next, we proceeded to analyze the number of votings per year and the percentage of applications accepted yearly. On the chart with the number of votings (Fig. 1) a peak can be observed in 2006, when the figure reaches 95, while a year before it was 34 and a year later it decreased to 60. Apart from the period 2006–2007, the number of votings has never exceeded 38. Only in 2010 was the level lower than 34. The number of RfAs between 2006 and 2010 fell more than threefold (from 95 to 26). However, this may be due to an incomplete testing period (data for the study were collected on November 1, 2010). The percentage of accepted applications (see Fig. 2) can be divided into two periods: in the first, 2005–2008, the percentage of accepted candidates ranged between 57 and 70 percent; the second period, 2009–2010, saw values below 50 percent (47 and 42 percent, respectively). Between 2008 and 2010 the percentage of successful RfAs fell almost by half (from 70% to 42%).
Fig. 1. Number of votings per year (chart not reproduced)
Fig. 2. Percent of accepted Requests for Adminship (chart not reproduced)
Fig. 3. Mean number of votes in single RfA per year (chart not reproduced)
Fig. 4. Mean number of edits of successful candidates (chart not reproduced)
Fig. 5. Mean number of days since account registration of successful candidates (chart not reproduced)
Fig. 6. Year of registration of last 86 administrators (chart not reproduced)
Number of Votes in a Single Voting. Another issue is the number of votes cast during a single voting. We decided to use the arithmetic mean value, not the median. The biggest difference between the arithmetic mean and the median did not exceed 4.71 votes, and that happened only when the analyzed values reached 80. As can be seen in Fig. 3, the number of votes in the RfAs increased from a minimum of 20 votes in the first half of 2005 to a maximum of 88 votes in the second half of 2010. The chart shows two trends: one runs from the first half of 2005 to the first half of 2007, when the number of votes increased from 20 to 78; the second trend lasted from the first half of 2007 to the second half
of 2010, when the number of votes remained at a similar level, ranging from 70 in the second half of 2008 and the first half of 2009 to 88 in the second half of 2010.

Numbers of Votes. Next, we present the statistical data on the minimum and maximum numbers of votes "for", "against", and "abstain." The lowest numbers of votes with which a candidate received adminship were 12/22/25 (for/against/abstain), in various polls. The highest numbers of votes with which a candidate was not granted the powers were 85/64/28 (for/against/abstain), in various polls. The highest number of votes was cast during the voting on the nomination of WarX: 125 votes. In total, there were 14 votings in which the number of votes exceeded 100 (in 10 cases the candidate was accepted; in 4, rejected).
3.2 Candidates' Experience on Wikipedia
Another study, concerning candidates' experience prior to receiving administrator powers, was conducted for the last 86 users who were elected as administrators. In the case of previously selected administrators, collecting complete data was not possible due to gaps in the logs of Polish Wikipedia.

Number of Edits. One of the factors that causes the most discussion during the votings is the number of edits made by a candidate. The RfA Rules contain a sentence that reads: Candidates for administrators [...] may be users who have at least 1,000 undeleted edits1. However, this value is often considered too low by the voters. On the basis of an analysis of the number of edits at the time of granting the privileges, it can be observed that the minimum falls in the first half of 2006 and amounted on average to 2,037 edits. The values then grow, reaching just over 14,000 edits in 2010. This shows that in the subsequent years the acceptance of candidates required growing experience, and the difference between the level required by the Rules and the level at which a candidate is commonly accepted was constantly increasing. A similar phenomenon is observed on the German Wikipedia, where, according to the declarations of voters, candidates were accepted in the second half of 2010 when they had over 10,000 edits.

Time of Wikipedia Practice. Another factor that triggers emotions during the votings is the time of practice. It is required by the rules of voting: Candidates for administrators [...] may be users who have at least 1,000 undeleted edits, the first of which took place at least 3 months before requesting the adminship. We analyzed the time (in days) of the candidates' practice between the date of registration and the date of being granted adminship, which is not exactly the same value as required in the regulations. The examined time of practice in the first half of 2006 was 182 days. The values gradually grew, to 511 days in the second half of 2007 and 870 days in the first half of 2009, with a small decline in the second half of 2009 (682 days). In the second half of 2010 it reached 1,310 days, but this may be a slightly unreliable result due to only two votings in this period.
1 http://pl.wikipedia.org/wiki/Wikipedia:Przyznawanie_uprawnie%C5%84#Regulamin_przyznawania_uprawnie.C5.84
An overall analysis of the chart shows that in 2006 the candidates had less than one year of practice, and since mid-2008 it has been at least two years. The last two candidates with experience of less than one year were elected in February 2009 and November 2008.

Date of Registration of Recent Successful Candidates. The final factor we analyzed was the date of registration of the last 86 administrators (see Fig. 6). The analysis found that, as of November 2010, there were no administrators who created their accounts in 2009 or 2010. The latest was Magalia's account, created at the end of August 2008. Almost half of the administrators created their accounts in 2006 (41 of 86). The rest created their accounts in 2005 (20), 2007 (16), and 2008 (9).
4 Comparison of Votes with Wikipedia Implicit Social Network
In our experiment, we created networks where the nodes represent users who took part in a voting and the edges represent relationships between them. The first network represents who voted for whom and the second who voted against whom. Each vote has been converted to an arc in the graph connecting the person who cast the vote with the person for or against whom the vote was cast. We will take a look at each of those networks independently.
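A minimal sketch of this construction using the Python networkx library; the vote tuples below are invented sample data, not actual RfA records:

```python
import networkx as nx

# (voter, candidate, vote value) triples; sample data for illustration.
votes = [("UserA", "Candidate1", "for"),
         ("UserB", "Candidate1", "against"),
         ("UserA", "Candidate2", "for")]

votes_for, votes_against = nx.DiGraph(), nx.DiGraph()
for voter, candidate, value in votes:
    target = votes_for if value == "for" else votes_against
    target.add_edge(voter, candidate)   # arc: voter -> candidate

print(votes_for.edges())      # [('UserA', 'Candidate1'), ('UserA', 'Candidate2')]
print(votes_against.edges())  # [('UserB', 'Candidate1')]
```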
4.1 Wikipedia Implicit Social Network
WikiTeams [9,10] is our ongoing research effort to assess the collaboration of Polish Wikipedia contributors, with emphasis on the aspect of teamwork. The research tool we used there is social network analysis performed on the implicit social network mined from the Wikipedia edit history. To create this dataset, we analysed the entire edit history since the inception of Polish Wikipedia in 2001. The key was to find the real authors of content, not only those who copy or move information around, and to find the (implicit) relationships between authors, such as trust, criticism, acquaintance, and common interests. This was accomplished by using various algorithms similar to those used in plagiarism detection. The major obstacle was the amount of data present in the edit history; in the case of Polish Wikipedia it is over 220 GB of text. Firstly, we identified what is required in further processing. We needed a way to concisely represent the article text with authorship information. As a basic unit of content we considered a single word. We processed each revision of a particular article in the order of the changes that were made, and for each word we assigned its author. So the first revision consisted of words written by the creator of the page, and subsequent ones contained the text at a particular time with the respective authors. Between each two subsequent revisions we may have four kinds of actions: adding a word, deleting a word, moving a word from one place to another, and
changing a word. Adding is simply putting a new word in the text (whose author is the author of the revision where it first appeared). Deleting is simply removing a word from the text. Moving is removing a certain portion of text in one place and putting exactly the same sequence in another. Changing is an operation of replacing one word with another (including, for instance, spelling corrections). We needed to separate moving from deleting followed by adding in order to preserve authorship information. There is a threshold to avoid regarding the moving of single words or common phrases as moving text written by the previous author. It works by counting how many consecutive words were moved; if the count is below the threshold, the whole operation is considered a deletion followed by an addition by the new author. Replacements of single words are also considered a deletion followed by an addition.
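A small Python sketch of the move-detection threshold just described; the threshold value and function shape are our own illustration, since the paper does not state the exact value used:

```python
MOVE_THRESHOLD = 5  # minimum consecutive words to count as a move (assumed value)

def classify_removed_run(removed_words, new_revision_text):
    """A removed word run that reappears verbatim in the new revision is a
    'move' (original authorship preserved) only if it is long enough;
    otherwise it is a deletion followed by an addition by the new editor."""
    run = " ".join(removed_words)
    if len(removed_words) >= MOVE_THRESHOLD and run in new_revision_text:
        return "move"
    return "delete_then_add"

print(classify_removed_run(["the", "quick"], "the quick brown fox"))
# delete_then_add (the run is too short to be treated as a move)
```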
4.2 Network Structure
The Wikipedia implicit social network is a graph consisting of nodes, each representing one Wikipedia contributor, and edges, each representing one kind of relationship between them. Each edge has a specific weight represented by a numeric value. We have defined four dimensions (kinds) of relationships between authors: trust, criticism (distrust), acquaintance, and knowledge (interests). This network is completely implicit: it does not contain any explicit information about social relationships; it is based entirely on the edit history.

Trust. In the original WikiTeams dataset we defined trust as the amount of text written by one author and then copied somewhere else by another. We assumed that the one who copies a text fragment believes in its trustworthiness. Currently, in the new dataset, we calculate trust a bit differently. The main operation that influences the weights of trust relationships between contributors is not copying or moving text, but adding text in the vicinity of text written by another author. We believe that when someone edits article text, he or she has read (reviewed) the surrounding paragraphs.

Criticism (distrust). Criticism in the original WikiTeams was defined simply by the number of words written by one author and deleted by another. Deleting words may not always mean criticism or distrust. This measure allowed easy spotting of edit wars, where two or more authors or groups argue with each other. Now, in the new dataset, criticism is measured by the number of edits made by one author and reverted by another.

Acquaintance. Acquaintance is modeled by the amount of discussion between particular contributors. To calculate this measure we looked at the articles' and users' talk pages. The measure is proportional to the amount of text added by one author next to (that is, in response to) the text written by the other author.
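A minimal Python sketch of how these three weights might be accumulated from edit events; the event encoding and function names are our own simplification of the pipeline described above:

```python
from collections import defaultdict

trust = defaultdict(float)       # (editor, nearby_author) -> weight
criticism = defaultdict(int)     # (reverted_author, reverter) -> revert count
acquaintance = defaultdict(int)  # (replier, original_author) -> words in replies

def on_edit_near(editor, nearby_author, words_added):
    # Editing near someone's text is treated as having reviewed it.
    trust[(editor, nearby_author)] += words_added

def on_revert(reverter, reverted_author):
    criticism[(reverted_author, reverter)] += 1

def on_talk_reply(replier, original_author, words_added):
    acquaintance[(replier, original_author)] += words_added

on_edit_near("A", "B", 12)
on_revert("C", "A")
on_talk_reply("A", "B", 40)
print(dict(trust), dict(criticism), dict(acquaintance))
```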
4.3 Common Links in Implicit Social Network and RfA Networks
We have found the intersection of the voting network and the Wikipedia Implicit Social Network dimensions: trust, distrust, and acquaintance. This way, we obtain social network measures for each voter-candidate pair.
Fig. 7. Wikipedia Implicit Social Network measures for the Trust dimension (statistics; the cumulative relative frequency distribution chart is not reproduced)

Measure        votes-for     votes-against
Minimum        0.05          0.05
1st quartile   13.92         9.35
Median         57.26         41.08
Mean           442.24        287.71
3rd quartile   217.83        164.24
Maximum        170,972.51    54,129.07
Coverage       87.44%        79.02%

Fig. 8. Wikipedia Implicit Social Network measures for the Distrust dimension (statistics; the cumulative relative frequency distribution chart is not reproduced)

Measure        votes-for     votes-against
Minimum        1.000         1.000
1st quartile   1.000         1.000
Median         1.000         1.000
Mean           2.331         4.203
3rd quartile   2.000         3.000
Maximum        165.000       165.000
Coverage       11.17%        14.06%
To find out how those measures are related to the voting, we found the values in each dimension for each pair. Next, we present the cumulative relative frequency distributions of those three measures, separately for votes "for" and "against." Each distribution is accompanied by a table with basic statistics: the minimum link strength value in the given dimension (Minimum), the first quartile of those values (1st quartile), the second quartile (Median), the average value (Mean), the third quartile (3rd quartile), the maximum value (Maximum), and the percentage of votes which actually have a corresponding link in the Wikipedia implicit social network (Coverage).

Trust. As can be seen in Fig. 7, the votes "for" suggest a strong link between voter and candidate. The accompanying table summarizes the trust values for links between voter and candidate in the votes-for and votes-against networks. The high coverage means that the great majority of voter-candidate pairs have corresponding links in the trust network.

Criticism. Fig. 8 shows the relative frequency distribution of the Criticism measure; its table shows a summary of the values of this measure for votes-for and votes-against. Criticism is based on reverting edits. It is clearly visible that for measure values around 20 and higher there are practically only "against" votes. Low values (around 1-3) slightly suggest that the vote will be "for", but the difference is not large. In this dimension there is low coverage, which means that there are more non-existing links (which may be regarded as links with zero value). More of them are in the votes-for group, which suggests that not having any edit reverted supports getting a "for" vote.
Fig. 9. Wikipedia Implicit Social Network measures for the Acquaintance dimension (statistics; the cumulative relative frequency distribution chart is not reproduced)

Measure        votes-for    votes-against
Minimum        1            1
1st quartile   210          210
Median         492          405
Mean           1,035        751
3rd quartile   1,186        811
Maximum        25,093       72,362
Coverage       64.49%       53.93%
Acquaintance. Similarly to the above, Fig. 9 describes the results of the Acquaintance analysis. This dimension, although a bit sparser than trust (lower coverage), allows better discrimination between "for" and "against" votes for values above roughly 200. For lower values of acquaintance it is difficult to tell the outcome of a cast vote.
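These observations can be condensed into a rough rule-of-thumb predictor. The following Python sketch is our own illustration; the cut-off values (criticism around 20, acquaintance around 200) are read approximately off the distributions above and are not fitted parameters:

```python
def predict_vote(criticism_w=None, acquaintance_w=None):
    """Heuristic vote prediction from implicit-network link weights."""
    if criticism_w is not None and criticism_w >= 20:
        return "against"   # high criticism: practically only 'against' votes
    if acquaintance_w is not None and acquaintance_w >= 200:
        return "for"       # strong acquaintance discriminates 'for' votes
    return "unknown"       # low values do not reliably predict the vote

print(predict_vote(criticism_w=1, acquaintance_w=350))  # for
print(predict_vote(criticism_w=30))                     # against
```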
4.4 Significance of Difference between Mean Values
We have also studied the statistical significance of the results on the entire population of cast votes compared to the entire potential population of possible votes. One sample unit in the data sets corresponds to a pair of voter and candidate and their respective values in the implicit social network. Table 2 shows the results of Welch's two-sample t-test for the votes-for and votes-against data. This test is an adaptation of Student's t-test for use with samples with possibly unequal variances, and it shows the statistical significance of the difference between the mean values of the trust, criticism, and acquaintance measures. The table shows the hypotheses verified, the t statistic, the number of degrees of freedom (df), and the p-value, on which the acceptance or rejection of the hypotheses is based. The significance level of the test is chosen to be 0.01.

Table 2. The statistical significance of difference between mean values of network measures in votes-for and votes-against data sets
Measure        Hypothesis                                                            t       df      p-value    Result
Trust          Mean trust value is higher in votes-for than in votes-against        3.571   5537.2  0.00018    true
Criticism      Mean criticism value is higher in votes-against than in votes-for    -2.524  403.8   0.00600    true
Acquaintance   Mean acquaintance value is higher in votes-for than in votes-against 4.674   1674.0  0.000002   true
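A minimal Python sketch of how such a test can be computed with SciPy; equal_var=False selects Welch's variant, and the arrays below are random placeholder samples, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder per-pair trust values for the two groups (heavy-tailed, like Fig. 7).
trust_for = rng.lognormal(mean=4.0, sigma=1.5, size=3000)
trust_against = rng.lognormal(mean=3.7, sigma=1.5, size=2600)

# Welch's two-sample t-test (unequal variances).
t_stat, p_two_sided = stats.ttest_ind(trust_for, trust_against, equal_var=False)
# One-sided p-value for the directional hypothesis "mean(for) > mean(against)".
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
print(f"t = {t_stat:.3f}, one-sided p = {p_one_sided:.5f}")
```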
4.5 Results
Table 3 summarizes the data from the previous tables and presents only the relevant values, i.e. those which differ noticeably between "for" and "against." It is clearly seen that the average trust value is almost twice as high for "for" votes, even though the median is not as distinctive. This is caused by some outliers in the "for" network, much higher than the average. They certainly predict the vote value as positive.

Table 3. Statistics for networks created from votes

Measure                   Votes "for"   Votes "against"
Trust median              57.26         41.08
Trust mean                442.24        287.71
Criticism mean            2.331         4.203
Criticism 3rd quartile    2.000         3.000
Acquaintance median       492           405
Acquaintance mean         1,035         751
Criticism is heavily concentrated in low values; it has a lot of links of value one, so the minimum values, first quartiles, and medians are all equal to one. The mean is very distinctive here: even from very low values (below 4) we cannot reliably predict the vote, but if the value is much higher, the vote has a very high probability of being negative. The third variable, acquaintance, is better shown on a graph. According to the table, the differences in median and mean are not great, but the distributions presented earlier suggest a threshold value above which the probability of a positive vote is significantly higher.
5 Conclusions
Our findings in Section 3 clearly show that it is more difficult to become a new administrator than it used to be. Firstly, the average chances of success in an RfA are falling, so it is more challenging and demanding to get the administrative privileges. On the one hand, this may be regarded as a symptom of the maturity of the project; on the other hand, the cause might be the formation of a closed society, a clique of administrators and potential candidates. There is another fact supporting the hypothesis of a forming closed society: the mean number of days from registration to receiving adminship is nearly five times larger than it was five years earlier. This could also be caused by the maturity of Wikipedia and its rules, but the growing average number of days spent on Wikipedia is not matched by a proportionally growing number of contributions needed to become an administrator. It is simply a matter of being in the society for a particular amount of time.
The implicit social network created from the edit history may be used as a predictor of vote value for a given potential voter and potential candidate. By looking for individuals (preferably from outside the "core" administrators team) who have strong links in the network, we may recommend them as candidates for new administrators.
5.1 Further Research
This paper is only a brief description of our approach, which opens new research paths in the field of automatic candidate evaluation. There are many possible extensions of this approach. One example is finding other network measures which may be better suited to recommending administrators. Currently, we are using measures that were designed for the evaluation and recommendation of teams of editors for new articles. Administrators' tasks are quite different from editors', so it is desirable to assess them based on supportive work rather than on editing content. Another idea for further research is connected with the formation of elites, closed groups of users with certain privileges. Such a phenomenon works against the open nature of Wikipedia, but, as we noticed, there is a strong sub-network of interconnected nodes which in some way steers Wikipedia.

Acknowledgements. This research has been supported by the Polish Ministry of Science grant (69/N-SINGAPUR/2007/0). The authors would like to thank Maciej "Nux" Jaros for extracting data from the Wikipedia database for the statistics in Section 3.
References
1. Adler, B.T., Chatterjee, K., de Alfaro, L., Faella, M., Pye, I., Raman, V.: Assigning Trust to Wikipedia Content. In: WikiSym 2008: International Symposium on Wikis and Open Collaboration (2008)
2. Burke, M., Kraut, R.: Taking up the mop: identifying future wikipedia administrators. In: CHI 2008: CHI 2008 Extended Abstracts on Human Factors in Computing Systems, pp. 3441–3446. ACM, New York (2008)
3. Kittur, A., Chi, E., Pendleton, B., Suh, B., Mytkowicz, T.: Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie. In: Proc. CHI 2007. ACM Press, New York (2007)
4. Kittur, A., Suh, B., Pendleton, B.A., Chi, E.H.: He says, she says: conflict and coordination in Wikipedia. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2007, pp. 453–462. ACM, New York (2007)
5. Leskovec, J., Huttenlocher, D., Kleinberg, J.: Governance in Social Media: A case study of the Wikipedia promotion process. In: Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, ICWSM 2010. AAAI Press, Menlo Park (2010)
6. Ortega, F.: Wikipedia: A Quantitative Analysis. Universidad Rey Juan Carlos, Madrid (2009)
7. Priedhorsky, R., Chen, J., Lam, S.K., Panciera, K.A., Terveen, L.G., Riedl, J.: Creating, destroying, and restoring value in wikipedia. In: Proceedings of the 2007 International ACM Conference on Supporting Group Work, GROUP 2007, pp. 259–268. ACM, New York (2007)
8. Spinellis, D., Louridas, P.: The Collaborative Organization of Knowledge. Communications of the ACM 51(8) (August 2008)
9. Turek, P., Wierzbicki, A., Nielek, R., Hupa, A., Datta, A.: Learning About the Quality of Teamwork from Wikiteams. In: Proceedings of the 2010 IEEE Second International Conference on Social Computing, SocialCom / IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT, Minneapolis, pp. 17–24 (2010)
10. Turek, P., Wierzbicki, A., Nielek, R., Hupa, A., Datta, A.: WikiTeams: How do they achieve success? IEEE Potentials 30(5) (2011)
11. Viégas, F., Wattenberg, M., Kushal, D.: Studying Cooperation and Conflict between Authors with History Flow Visualization. In: Proceedings of the 2004 Conference on Human Factors in Computing Systems. ACM, New York (2004)
Revealing Beliefs Influencing Trust between Members of the Czech Informatics Community

Tomáš Knap and Irena Mlýnková

Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic
{tomas.knap,irena.mlynkova}@mff.cuni.cz
Abstract. In the project "Social Network of the Computer Scientists in the Regions of the Czech Republic" (SoSIReCR), our aim is to build a social network of the Czech informatics community, so that its members can better cooperate and exchange information. In such a social network, the aspect of trust of a member of the informatics community willing to depend on another member is of crucial importance. Unfortunately, trust, a rather complex concept, is typically comprehended as a black-box, indivisible concept, leading to confusion among social network members about what trust actually is. To minimize that confusion, we choose a different approach in this paper: trust is comprehended as a set of trusting beliefs (concepts simpler and more intuitive than trust), such as a belief that a trustee is honest or that (s)he is an expert in the given domain. To identify these beliefs, we conduct a survey of the trust literature. Consequently, we select a suitable set of these beliefs relevant for the SoSIReCR project and evaluate the selection process by consulting it (mainly) with the members of the informatics community. We believe that the presented approach is a generally promising way to properly define trust in social networking applications.
1 Introduction
The informatics community in the Czech Republic consists of various entities: persons (students, IT professionals, academics, employers), institutions (companies, universities), and other entities typically enabled/initiated by the institutions and formed by the persons (research groups, projects). The goal of the SoSIReCR project1 is to leverage the communication and cooperation among members of the informatics community by creating a social network with vertices representing the particular members of the community and edges representing relations between them, e.g. "a student/academic belongs to a research group", "a student/IT professional/academic works on a project/for a company", or "a student graduated at the given faculty".
1 The work presented in this article has been funded in part by the Czech Science Foundation (GACR, grant number 201/09/H057) and GAUK 3110. http://www.sosirecr.cz/index_en.php
Every member of the informatics community is associated with a personal and a professional profile. A personal profile holds basic information (e.g. name, email, working place) relevant for the given entity, together with information derived from relations with other entities (e.g. on which projects the entity participates). A professional profile holds information about the extent to which the entity (a person or group) is an expert in various domains of informatics, or the extent to which the entity (a company, university, or project) works in and knows the given domain of informatics. The project uses the two top levels of the ACM Computing Classification System2 to model the domains of informatics used as axes of the professional profile. The social network in the SoSIReCR project is accessible via a Web portal, where users can manage their profiles. The social network behind the portal will be exposed according to Linked Data principles3, which makes it possible to (1) reuse data already available on the Web (e.g. instances of the Friend of a Friend ontology4 to obtain friends of the given person) and (2) interconnect the social network in the SoSIReCR project with other open social networks. The whole architecture of the project is illustrated in [6]. The following important scenarios S1 – S5 are addressed by the project:

S1 To justify their price in the labor market properly, students/IT professionals want to compare their abilities with (i) other students or IT professionals, (ii) the typical abilities of employees working at certain positions, or (iii) the typical level of knowledge of other universities' graduates.
S2 Students/academics want to know who is working on similar research topics at other universities, so that they can unify their efforts to make the research more effective and publish at more prestigious conferences.
S3 Companies/universities searching for students/IT professionals for their projects want to quickly and easily find suitable candidates who would like to join the project and have the desired expertise.
S4 Students/IT professionals want to know which companies are looking for new employees and in which domains of expertise.
S5 Companies want to know the typical aggregated knowledge of students/IT professionals in various regions of the Czech Republic; this information helps them when setting up new branches.

Scenario S4 is sufficiently solved (at least in the Czech Republic) by various job portals5. Nevertheless, the SoSIReCR portal will provide additional features, such as semantically rich information or a detailed job applicant's expertise using professional profiles, which are not available at most job portals. Other scenarios are not addressed satisfactorily by any other application (see Section 6). Social networks are recognized as a valuable source of information [13]; however, they can be full of malicious entities [10]. Therefore, the aspect of trust of
2 http://www.acm.org/about/class/1998
3 http://www.w3.org/DesignIssues/LinkedData.html
4 http://www.foaf-project.org/
5 Such as http://www.jobs.cz/en/, http://www.prace.cz/ (in Czech), or http://www.hledampraci.cz/ (in Czech).
an entity willing to depend on another entity in the social network is of crucial importance for all the given scenarios S1 – S5. To motivate the problems of dealing with trust, let us suppose an instance of the scenario S2: "A young researcher (seeker, trustor) is searching for another researcher (target entity) for future academic collaboration". From the seeker's point of view, the crucial question is how much he can trust that the target entity is the right one for the collaboration. The seeker typically does not know the target entity; hence, (s)he cannot estimate trust in the target entity directly (there is no trust relation between him and the target entity). Since it was proved experimentally that trust in social networks is transitive6, the seeker can (and actually has to) rely on another entity (a recommender) having a trust relation to the target. There are many algorithms, e.g. [10,26,23,20], quantifying trust between two entities that do not have a trust relation between them. The problem is that these algorithms typically comprehend trust as a "black box", indivisible concept. Since trust is such a complex concept [14], the semantics of such "black box" trust is ambiguous: a seeker understands the semantics of his/her trust relations with other entities but is rather confused regarding the trust relations of others. For example, whereas one seeker may trust the target just because of reading his/her paper, another seeker may trust the same target only after personally verifying that (s)he has the desired competence and the interest to collaborate. As a result, any algorithm quantifying black-box trust between two entities without a trust relation between them has to rely on at least two trust relations and, thus, cannot assign clear semantics to the quantified (transitive) trust [13].

To address the issues with "black box" trust, we decided on a different approach. We comprehend trust (properly defined in Section 2) as a concept formed by a set of underlying trusting beliefs [4,21]. Trust is never quantified directly (neither explicitly by the entities nor implicitly by the portal); it is derived from the quantifications of the beliefs forming trust. By deriving trust from its beliefs, which are simpler and more intuitive concepts, the confusion of social network members about what trust actually is is minimized. Selection of the proper set of beliefs, which (1) is justified by the literature and (2) is suitable for the scenarios S1 – S5, is the main goal and contribution of this paper.

The paper is organized as follows. Section 2 formalizes the concept of trust used in SoSIReCR. Section 3 reviews trusting beliefs in the literature and Section 4 discusses their suitability for SoSIReCR. Section 5 evaluates the reasonability of the selected beliefs. Section 6 reviews related work; the paper is rounded off with a conclusion in Section 7.
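Returning to the "black box" problem described above, a minimal Python sketch (our own illustration, not an algorithm from the cited works) shows the kind of transitive propagation such approaches perform; the resulting number carries no clear semantics:

```python
def transitive_trust(path_weights):
    """Naive 'black box' propagation: multiply trust values in [0, 1]
    along a seeker -> recommender -> ... -> target chain."""
    result = 1.0
    for w in path_weights:
        result *= w
    return result

# Seeker trusts the recommender 0.9; the recommender trusts the target 0.8.
print(transitive_trust([0.9, 0.8]))  # 0.72, but what does 0.72 actually mean?
```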
2 Concept of Trust as "Trusting Intention"
Definition 1, based on the definition proposed by McKnight and Chervany in [21], drives the comprehension of trust in the further text. We selected Definition 1
If an entity A trusts an entity B and the entity B trusts an entity C, then, to some extent, the entity A trusts the entity C.
from among many other definitions because it comprehends trust as the subjective opinion of one entity about another entity, and it embodies five essential elements of trust: (a) potential negative consequences, (b) dependence, (c) a feeling of security, (d) a situation-specific context, and (e) lack of reliability and control; the necessity of these elements is justified in [21].
Definition 1. Trust (trusting intention) is the extent to which one entity (trustor) is willing to depend on the other entity (trustee) in a given situation with a feeling of relative security, even though negative consequences are possible.
As already introduced in Section 1, we need to distinguish two dimensions of trust in social networks [1,10] – the trustee in Definition 1 can be the target entity (e.g., the entity the trustor is considering working with in the scenarios S2 or S3) or the recommender (the entity who recommends the target entity or another recommender). The extent in Definition 1 can be quantified either on a discrete [18,10] or a continuous scale [19,12,26,24,10]. In general, discrete trust levels are easier for humans to grasp; on the other hand, continuous trust values provide more accurate expressions of trust. Since trust is never assessed manually by the entities in the SoSIReCR project, we use continuous trust values from the interval [−1, 1], ranging from absolute distrust (−1) to absolute trust (1). If the trust value is above a certain threshold κ ∈ [0, 1], the trustor is willing to depend on the trustee.
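As a minimal sketch of this quantification and decision rule (the interval [−1, 1] and the threshold κ come from the text above; the function name is ours):

```python
def willing_to_depend(trust_value: float, kappa: float) -> bool:
    """Trust is quantified on the continuous scale [-1, 1], from
    absolute distrust (-1) to absolute trust (1). The trustor is
    willing to depend on the trustee iff the derived trust value
    exceeds the threshold kappa in [0, 1]."""
    assert -1.0 <= trust_value <= 1.0 and 0.0 <= kappa <= 1.0
    return trust_value > kappa

print(willing_to_depend(0.65, kappa=0.5))   # True
print(willing_to_depend(-0.2, kappa=0.5))   # False
```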
3 Review of Trusting Beliefs in Literature
In this section we survey trusting beliefs (or simply beliefs) influencing interpersonal trusting intentions, with a special focus on trust in the informatics literature. We do not consider beliefs forming the trust of an entity in a resource – therefore, data provenance and all data quality dimensions, such as accuracy, timeliness, or relevance, are omitted [5]. Since many labels for beliefs obtained from the literature are synonyms representing the same beliefs, we clustered the obtained labels into the set of beliefs presented in Table 1; the reasons for considering two labels as representations of the same belief are discussed below. The first column in Table 1 contains the main label chosen to represent each belief, supplemented in brackets with the other labels in the same cluster. Labels for the beliefs are provided as they appear in the literature7. The further columns in Table 1 give, for every belief, a description and references to the papers containing the labels from the same cluster. If not specified otherwise, the beliefs are domain-specific – i.e., they are connected with the given axis of the professional profile. The description of the beliefs follows:
7 With one exception – the label for the belief practice was originally called "experience" in [13]; however, this label collides with the belief experience in Table 1.
Table 1. Identified trusting beliefs
Belief's labels | Description | Papers
Affinity (Similarity) | A trustee has characteristics in common with a trustor, such as shared tastes, standards, values, viewpoints, interests, or expectations | [13, 25, 9, 24, 20, 26]
Competence (Ability, Capability) | A trustee has an ability to do for a trustor what the trustor needs | [16, 3, 4, 1, 11]
Experience (Track record, History of encounters) | A trustor has experience with a trustee; the trustor has evidence about the trustee's previous interaction with him/her | [8, 13, 1, 19, 22]
Expertise (Authority) | A trustee is an expert in the particular domain | [13, 16, 8]
Honesty (Bias, Impartiality) | A trustee is honest w.r.t. the other beliefs; (s)he tells the truth | [13, 8, 1, 11]
Practice | A trustee has experience of solving similar problems in the given domain, but without extensive expertise | [13, 2]
Reputation | Reputation of a trustee is what is generally said or believed about the trustee's character or standing | [3]
Reliability (Dependability) | A trustee is reliable regarding the particular situation at hand | [16, 1, 4, 11, 17]
Willingness (Likely to help, Motivation) | A trustee will do what a trustor needs; (s)he is motivated to do that | [4, 16, 8]
Affinity (Similarity): Trust-based recommender systems [24,20,26] assume that trust reflects similarity between users. Papers [25,9] show a strong and significant correlation between trust and similarity; they state that "recommendations [of entities] only make sense when obtained from like-minded people exhibiting similar taste". Paper [13] defines affinity as the extent to which a trustor has characteristics – shared tastes, standards, values, viewpoints, interests, or expectations – in common with the trustee, and confirms that affinity is an important belief in subjective (taste-like) domains.
Competence (Ability, Capability): In [3], trust is presented as a function of capability. Paper [1] states that "to trust an entity [...] means to believe in its capabilities". In [4], they argue that in order to trust an agent, we need to know his/her competence – the ability to do what the trustor needs. Paper [16] indirectly states that a trustee's competence influences trust in that entity. In [11], they introduce trust as a complex concept formed by many beliefs, including the competence of the trusted person.
Experience (Track Record, History of Encounters): The label experience is rather ambiguous, because it is used to denote either the experience (practice) of a trustee in the particular domain, or the experience (track record) of a trustor with a trustee regarding the particular domain. To distinguish these two beliefs, the former one is denoted as practice and discussed later. Here, we discuss
experience according to the latter meaning. Papers [8,19] claim that the previous experience of a trustor with a trustee influences the amount of the trustor's trust in the trustee. Paper [13] defines the track record of a recommender as the experience of the trustor with recommendations from that recommender. In [1], when deriving trust, they compare the number of positive and negative experiences of a trustor with the trustee. Finally, paper [22] defines trust as a "subjective expectation an agent has about another's future behaviour based on the history of their encounter"; the "history of their encounter" is their mutual experience.
Expertise (Authority): The expertise of a trustee is the extent to which the trustee "has relevant expertise in the domain of the recommendation-seeking", which "may be formally validated through qualifications" [13]. Paper [16] explains the importance of experts' recommendations when seeking trustworthy information/recommendations. In [8], they introduce the label authority with the meaning "being an expert". Although the idea is understandable, it is rather confusing: authority of an entity can imply, and typically implies, the existence of expertise of the entity in the given domain; however, this does not always hold – someone (e.g., a general or an officer) can be an authority just because (s)he has power over others.
Honesty (Bias, Impartiality): Paper [1] emphasizes the role of the trustee's honesty when deciding how much to trust recommendations from that trustee. An impartial trustee is defined in [13] as someone who "does not have vested interests in a particular resolution to the scenario"; for example, a vendor of LCD monitors might be dishonest regarding the properties of his products. In [11], the honesty of a trustee is emphasized as one of the beliefs forming trust. In [8], they claim that "a biased source may convey certain information that is misleading or untrue". The label bias has the opposite meaning to honesty or impartiality; however, it still points to the same belief, only observed from the opposite side.
Practice: Paper [13] specifies that a trustworthy entity needs to have "practice of solving similar scenarios in the domain", but not necessarily extensive expertise. Many papers describing algorithms for locating experts, e.g., [2], are actually locating entities with high practice, who may be experts, but it is hard to verify that.
Reputation is "what is generally said or believed about a person's or thing's character or standing"8. In [3], a company's trust in an employee is presented (among other beliefs) as a function of the individual's reputation. Unfortunately, reputation is very often, e.g., in [15,22,14], comprehended as an indivisible concept, similarly to trust; whereas trust is considered a subjective opinion of a trustor, reputation is in this case considered a collective measure of trustworthiness [14]. We do not comprehend this kind of reputation as a belief.
8 Definition taken from The Concise Oxford Dictionary.
Reliability (Dependability): Paper [1] emphasizes the role of reliable recommendations in the process of determining trust. In [16], a trustworthy trustee is characterized as a reliable entity. Paper [4] distinguishes two meanings of trust – core trust and reliance trust; the latter one emphasizes the importance of
the trustee's reliability. According to [11], trust is a composition of many different beliefs, including reliability and dependability. In [17], trust is defined as "assumed reliance on some person or a confident dependence on the character, ability, strength or truth of someone". Apart from that, many definitions of trust include the reliability of the trusting intentions, not of the trustees; e.g., paper [7] presents a definition of trust as "what an observer knows about an entity and can rely upon to a qualified extent".
Willingness (Likely to Help, Motivation): Paper [4] states that the willingness to do what the trustor needs is a crucial belief. In [16], they specify that a trustworthy trustee is one who is likely to help the trustor. According to [8], a trustee may be more believable if there is a motivation for the trustee to provide accurate information or to participate in the project.
Many papers also argue that social proximity (a trustor and a trustee are friends, colleagues, or acquaintances) matters when seeking recommendations in social networks [10,13]. Nevertheless, the question is whether this is (1) because of higher trust in the recommenders closer to the trustor, or (2) because these recommenders are more easily accessible and the trustor can better assess their suitability to give recommendations in the given situation (the trustor is more aware of what knowledge they may possess) [13]. We agree with the latter reason; therefore, we assume that knowledge, friendship, or affinity are not beliefs but relations between entities, which really help when quantifying beliefs such as honesty, competence, or willingness; however, once the beliefs are quantified, these relations do not play a significant role when determining trust.
4 Trusting Beliefs in SoSIReCR
Table 2 identifies, for the scenarios S1 – S5, the relevant trusting beliefs Bsel ⊆ B selected from the set of beliefs B in Table 1; the abbreviation "T" ("R") in Table 2 indicates that the combination of belief and scenario is relevant when forming trusting intentions where the trustee is a target entity (a recommender). The selected beliefs Bsel do not involve reputation, which can be computed merely by looking at the quantification of the simpler beliefs – Heath [13] states that reputation is influenced by expertise, practice, honesty, and experience; Gil and Artz [8] argue that experience with the trustee can be used to compute the reputation of the trustee. The belief competence is not considered because it is comprehended as a combination of the beliefs expertise and practice. The belief reliability is not considered because it is a composition of other, simpler beliefs. Finally, we omit the belief affinity, because the influence of (character) affinity when deriving trust between entities is marginal in objective domains (such as informatics), where the trustees' competence is more important than their character similarities [13]. For all the scenarios S1 – S5, the honesty (truthfulness) of a trustee is selected. This is due to its importance – if a trustor does not know whether the trustee (and especially the target entity) is honest, it is hard to believe the quantification
Table 2. Trusting beliefs in the scenarios S1 – S5
Belief | Description | S1 | S2 | S3 | S4 | S5
Experience | Does the trustor have previous experience with the trustee? | T/R | T/R | T/R | T/R | –
Expertise | What is the trustee's expertise in the relevant axes of the prof. profile? | – | T/R | T/R | T/R | T/R
Honesty | Do the profiles of the trustee correspond with the reality? Is the recommender honest? | T/R | T/R | T/R | T/R | T/R
Practice | What is the trustee's practice in the relevant axes of the prof. profile? | – | T/R | T/R | T/R | T/R
Willingness (to cooperate) | Would the target entity be willing to cooperate with the trustor for the duration of the project/common work? | – | T | T | T | –
of practice, expertise, or willingness9. In SoSIReCR, honesty is considered a domain-wide concept. Similarly, the experience of a trustor with a trustee is of crucial importance and is used in the scenarios S1 – S4; in S5, the professional profiles of many trustees are collected during the aggregation of profiles; therefore, it can hardly be assumed that the trustor will evaluate his/her experience with all these trustees. Willingness is selected for the scenarios S2 – S4, where a trustor is looking for a collaboration with the target entity; in S1, S5, and when trusting a recommender in all the scenarios S1 – S5, the willingness of the trustee is not necessary – the information/profile details are provided automatically. Practice and expertise are important in the scenarios S2 – S5, where a trustor needs to know the competence of a trustee in the selected axes of the professional profile. In S1, we just compare two professional profiles, without any further quantification of the trustee's competence.
5 Selection Process Evaluation
In this section we evaluate the trusting beliefs' selection process, mainly by consulting members of the informatics community. To do that, we created a questionnaire consisting of four model situations S = {S2T, S2R, S3T, S5T}, successively corresponding to the scenarios: S2, where the trustee is a target entity (hence the abbreviation S2T); S2, where the trustee is a recommender; and S3 and S5, in which the trustees are target entities. We omitted the scenario S1, because it is rather simple, and S4, which is an analogy of S3, just seen from the opposite perspective. In each situation, the respondent is presented with the set of trusting beliefs Bsel introduced in Table 2, and the respondent's goal is to mark for each such belief b ∈ Bsel one choice (C_0^{b,s}, C_1^{b,s}, C_2^{b,s}, or C_3^{b,s}) expressing to which extent the belief b influences the trust of the trustor (respondent) in the trustee in the given situation s ∈ S.
9 Honesty is considered as an interpersonal belief (as the other beliefs), because the content of profiles is actually what the entities say about themselves.
Table 3. Number of choices C_i^{b,s} selected by the respondents
Belief | S2T: #0 #1 #2 #3 | S2R: #0 #1 #2 #3 | S3T: #0 #1 #2 #3 | S5T: #0 #1 #2 #3
Experience | 0 15 22 67 | 0 25 32 47 | 0 15 31 58 | – – – –
Expertise | 0 24 52 28 | 0 23 60 21 | 0 16 60 28 | 7 20 49 28
Honesty | 0 16 27 61 | 3 25 51 25 | 0 3 49 52 | 0 6 41 57
Practice | 15 21 49 19 | 20 55 25 4 | 0 18 39 47 | 8 23 53 20
Willingness | 3 4 37 60 | – – – – | 0 14 43 47 | – – – –
The choices (four levels of influence) are the same for all situations, with the following meanings: the given belief b has no influence (C_0^{b,s}), minimal influence (C_1^{b,s}), influence (C_2^{b,s}), or substantial influence (C_3^{b,s}) in the given situation s. The choice C_3^{b,s} means that if the quantification of the belief b is not satisfactory, it penalizes the considered entity heavily, possibly obstructing any potential trusting intention with that entity in the situation s. The choice C_2^{b,s} (C_1^{b,s}) means that if the quantification of the belief b is not satisfactory, it is a major (minor) issue, which penalizes (slightly penalizes) the trustee.
The questionnaire was completed during April 2011 by 104 respondents (81% men) aged between 20 and 69. Most of the respondents were informatics professionals (81%), the main target group of the SoSIReCR portal. We dispatched the questionnaire to the region coordinators cooperating on the SoSIReCR project and managing different regions of the Czech Republic. The respondents were selected mainly by the regions' coordinators, not by the authors themselves, based on purposive non-probability sampling10. The description of the situations S is in the Appendix; the full questionnaire (translated to English) is available at http://www.ksi.mff.cuni.cz/~knap/files/Questionnaire.pdf.
Table 3 summarizes, for each belief b ∈ Bsel and each situation s ∈ S, the number of choices C_i^{b,s} (abbreviated as #C_i^{b,s}, or #i if the belief and the situation are obvious) selected by the respondents; i ∈ {0, 1, 2, 3}. For some combinations of belief and situation, the results are not defined, which corresponds with the empty cells in Table 2. Suppose that for a belief b ∈ Bsel, a situation s ∈ S, and numbers i, j ∈ {0, 1, 2, 3}, i ≠ j, we have a null hypothesis H_0^{b,s,i,j}: "#C_i^{b,s} is equal to #C_j^{b,s}". Then, using the binomial test, suppose that we reject the null hypothesis H_0^{b,s,i,j} with p-value p_j < 0.05. If, for all k ∈ {0, 1, 2, 3}, k ≠ i, the hypothesis H_0^{b,s,i,k} can be rejected in the way described and #C_i^{b,s} > #C_k^{b,s}, we accept the hypothesis H^{b,s,i}: "#C_i^{b,s} is the prevailing number of choices for the belief b in the situation s, i.e., the belief b has in the situation s the level of influence C_i", and this result is statistically significant with p = max_{k≠i}{p_k}; for p < 0.01 (respectively 0.01 ≤ p < 0.05), the appropriate #C_i^{b,s} is highlighted in Table 3 with a dark grey (light grey) color. A minimal sketch of this test procedure follows below.
10 We have not tracked the size of the sample, only the number of respondents who answered the questionnaire.
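A minimal sketch of this acceptance procedure, replayed on one row of Table 3 (Python with SciPy; we assume the binomial test compares two counts #C_i and #C_j under a fair-coin null on their sum – the paper does not spell out this detail, so treat it as an assumption):

```python
from scipy.stats import binomtest

# Choice counts #C_0..#C_3, e.g. belief 'Experience' in situation S2T.
counts = {0: 0, 1: 15, 2: 22, 3: 67}

def prevailing_choice(counts, alpha=0.05):
    """Accept hypothesis H^{b,s,i} if, for every k != i, the two-sided
    binomial test rejects '#C_i == #C_k' at level alpha and
    #C_i > #C_k; the reported p is the max over the pairwise p-values."""
    for i, ci in counts.items():
        p_values = []
        for k, ck in counts.items():
            if k == i:
                continue
            if ci <= ck:
                break                      # choice i cannot prevail over k
            p_values.append(binomtest(ci, ci + ck, 0.5).pvalue)
        else:
            if all(p < alpha for p in p_values):
                return i, max(p_values)
    return None

print(prevailing_choice(counts))
# (3, ...) -> 'substantial influence' prevails, here with p < 0.01
```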
In the situation S3T, for the beliefs b ∈ B = {honesty, practice, willingness}, we cannot accept the hypothesis H^{b,S3T,i} for any i ∈ {0, 1, 2, 3}; however, when comparing the sum #C_2^{b,S3T} + #C_3^{b,S3T} with the sum #C_0^{b,S3T} + #C_1^{b,S3T}, the first sum prevails, and this result is statistically significant with p < 0.01. Therefore, we can accept the hypothesis that the beliefs B have influence or substantial influence in the situation S3T (denoted by boldface in Table 3).
Table 3 shows that experience has a substantial influence in all situations; simply put, if a trustor has a positive experience with a trustee, (s)he is much more willing to depend on him/her. The belief honesty does not have a substantial influence in the situation S2R, probably because of the lower influence of the expertise and practice of a trustee (recommender) and, thus, the lower need for honesty of the recommender in S2R. Whereas honesty and willingness have a substantial influence in S2T, we cannot say that for S3T; the reason may be that there is a lack of system trust [21] in S2T – the trustee is not bound by any contract – and thus a higher need for honesty and willingness in S2T. The belief expertise does not have a substantial influence in any of the situations, which is rather surprising, especially in the situation S3T. The reason may be that most of the respondents comprehend the position of a mobile application programmer as a rather standard position not requiring any extensive expertise beyond generic programming skills. Practice is more important in S3T (hiring a programmer) than in S2T (writing a paper); this corresponds with the previous hypothesis that programming is comprehended as a rather routine job, whereas writing a good paper requires expertise more than practice. Finally, practice has minimal influence in S2R – a trustee with vast expertise is more useful when searching for recommendations.
6 Related Work
McKnight and Chervany [21] conducted an extensive survey of the various beliefs trusted entities should have (the survey is based on interdisciplinary papers published between 1960 and 1995) and group these beliefs into four categories: (1) Benevolence: A trustee cares about the welfare of a trustor and is therefore motivated to act in the trustor's interest. A benevolent person does not act opportunistically. (2) Competence: A trustee has the ability to do for a trustor what the trustor needs to have done. (3) Honesty: A trustee makes good-faith agreements, tells the truth, and fulfils any promises made. (4) Predictability: A trustee's actions are consistent enough that a trustor can forecast what the trustee will do in a given situation. Although focused on the informatics literature, our survey has many similarities. Competence and honesty are comprehended similarly. The category predictability is a function of the belief experience – if a trustor knows what the actions of a trustee were in the past, (s)he can forecast the future behavior of the trustee. The category benevolence has a substantial overlap with the belief willingness and is related to our belief honesty ("A benevolent person does not act opportunistically").
The quantification and propagation of trust and distrust in social networks has been studied in many papers, e.g., [26,10,12]. Although we have explained in
this paper why the propagation of trust as a black-box concept is problematic, we can consider the techniques proposed in these papers when quantifying and propagating trusting beliefs, e.g., honesty11.
Apart from general-purpose social networking applications, such as Facebook12, the SoSIReCR project focuses on the needs of the Czech informatics community. ResearchGate13, Epernicus14, and iamResearcher15 are examples of foreign projects with similar goals to the SoSIReCR portal – to ensure information sharing and collaboration among members of the informatics community. Nevertheless, they focus merely on the academic domain.
11 Quantification and propagation of beliefs is outside the scope of the paper.
12 http://facebook.com
13 http://www.researchgate.net
14 http://www.epernicus.com
15 http://www.iamresearcher.com
7 Conclusions and Future Work
In this paper, we reviewed the relevant literature to identify the trusting beliefs forming trusting intentions between two entities – members of the Czech informatics community – and we summarized the identified beliefs in Table 1. Subsequently, we selected the relevant trusting beliefs for the scenarios S1 – S5 in the SoSIReCR project. The selection process was evaluated by 104 people – mostly members of the Czech informatics community – and the evaluation confirmed (with the exception of practice in the situation S2R) that all the selected beliefs in Table 2 have influence or substantial influence on trust; these results are statistically significant, with the significance level α = 0.01, respectively α = 0.05. In this paper, we clarified which trusting beliefs form the trusting intentions in the SoSIReCR project. In the future, we will focus on (1) the quantification of these beliefs and (2) the estimation of the weights these beliefs carry when deriving trust between two entities w.r.t. the given scenario S1 – S5, task criticality, and domain; the evaluation in this paper provided first preliminary estimates of these weights (see Table 3). We are persuaded that the approach presented in this paper is a promising general way to properly define trust between entities in social networking applications.
References
1. Beth, T., Borcherding, M., Klein, B.: Valuation of Trust in Open Networks. In: Gollmann, D. (ed.) ESORICS 1994. LNCS, vol. 875, pp. 3–18. Springer, Heidelberg (1994)
2. Breslin, J.G., Bojars, U., Aleman-Meza, B., Boley, H., Nixon, L.J., Polleres, A., Zhdanova, A.V.: Finding Experts using Internet-based Discussions in Online Communities and Associated Social Networks. In: First International ExpertFinder Workshop (2007)
3. Essin, D.J.: Patterns of trust and policy. In: Proceedings of the 1997 Workshop on New Security Paradigms, NSPW 1997, pp. 38–47. ACM, New York (1997)
4. Falcone, R., Castelfranchi, C.: Social Trust: A Cognitive Approach. In: Trust and Deception in Virtual Societies, pp. 55–90. Kluwer Academic Publishers, Dordrecht (2001)
5. Freitas, A., Knap, T., O'Riain, S., Curry, E.: W3P: Building an OPM based provenance model for the Web. Future Generation Comp. Syst. 27(6), 766–774 (2011)
6. Galgonek, J., Knap, T., Kruliš, M., Nečaský, M.: SMILE – A Framework for Semantic Applications. In: Meersman, R., Dillon, T., Herrero, P. (eds.) OTM 2010. LNCS, vol. 6428, pp. 53–54. Springer, Heidelberg (2010)
7. Gerck, E.: Toward Real-World Models of Trust: Reliance on Received Information. Basil Blackwell, Oxford (1990), http://www.safevote.com/papers/trustdef.htm
8. Gil, Y., Artz, D.: Towards Content Trust of Web Resources. Web Semant. 5(4), 227–239 (2007)
9. Golbeck, J.: Trust and nuanced profile similarity in online social networks. ACM Trans. Web 3(4), 1–33 (2009)
10. Golbeck, J.A.: Computing and Applying Trust in Web-based Social Networks. PhD thesis, College Park, MD, USA (2005)
11. Grandison, T., Sloman, M.: A survey of trust in internet applications. IEEE Communications Surveys and Tutorials 3(4) (2000)
12. Guha, R., Kumar, R., Raghavan, P., Tomkins, A.: Propagation of Trust and Distrust. In: International World Wide Web Conference (2004)
13. Heath, T.: Information-seeking on the Web with Trusted Social Networks – from Theory to Systems. PhD thesis, Milton Keynes, UK (2008)
14. Josang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decision Support Systems 43(2), 618–644 (2007)
15. Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The EigenTrust algorithm for reputation management in P2P networks. In: Proceedings of the 12th International Conference on World Wide Web, pp. 640–651. ACM, New York (2003)
16. Kautz, H., Selman, B., Shah, M.: The Hidden Web. AI Magazine 18, 27–36 (1997)
17. Kini, A., Choobineh, J.: Trust in electronic commerce: definition and theoretical considerations, pp. 51–61 (1998)
18. Levien, R.: Attack-Resistant Trust Metrics, pp. 121–132 (2009)
19. Marsh, S.: Formalising Trust as a Computational Concept (1994)
20. Massa, P., Bhattacharjee, B.: Using Trust in Recommender Systems: An Experimental Analysis, vol. 2995 (February 2004)
21. McKnight, D.H., Chervany, N.L.: The Meanings of Trust. Technical report, University of Minnesota, Carlson School of Management (1996)
22. Mui, L., Mohtashemi, M., Halberstadt, A.: A computational model of trust and reputation, pp. 2431–2439 (2002)
23. Richardson, M., Agrawal, R., Domingos, P.: Trust management for the semantic web. In: Fensel, D., Sycara, K., Mylopoulos, J. (eds.) ISWC 2003. LNCS, vol. 2870, pp. 351–368. Springer, Heidelberg (2003)
24. Walter, F., Battiston, S., Schweitzer, F.: A model of a trust-based recommendation system on a social network. Autonomous Agents and Multi-Agent Systems 16(1), 57–74 (2008)
25. Ziegler, C.-N., Golbeck, J.: Investigating interactions of trust and interest similarity. Decision Support Systems (2007) (in press, corrected proof)
26. Ziegler, C.-N., Lausen, G.: Propagation Models for Trust and Distrust in Social Networks. Information Systems Frontiers 7(4-5), 337–358 (2005)
Appendix – Description of the Situations in Evaluation
Situation S2T: Imagine you are a young fellow with an interesting idea for a journal article, and you plan to contact a researcher who knows the domain "searching in object databases" to ask whether he would help you with the preparation of the paper for the prestigious conference VLDB. You have at hand the professional profiles of other researchers and their preliminary expressions of interest. Which beliefs (factors) from Table 4 influence your choice of the most suitable researchers for the academic collaboration?

Table 4. Description of beliefs for Situation S2T
Experience | You or your colleagues at the university have good previous experience with the given researcher
Expertise | The researcher has already published several papers at the most prestigious conferences
Honesty | The researcher is telling the truth; his/her professional profile corresponds with reality (this is verified by several persons you trust)
Practice | The researcher has already published lots of papers at average conferences
Willingness (to cooperate) | The researcher is willing to cooperate with you in the following 3 months (the idea is interesting for him/her, (s)he has no deadlines for other projects, the priority of the cooperation with you is high)
Situation S2R: If you have not found any suitable researcher you could contact and collaborate with, which beliefs from Table 5 influence your choice of the persons (recommenders) you ask for a recommendation of a suitable researcher for the collaboration?

Table 5. Description of beliefs for Situation S2R
Experience | You or your colleagues at the university have good previous experience with the given recommender ((s)he has already given you/them good advice in the past)
Expertise | The recommender is an expert in the given domain; (s)he works in an important research center
Honesty | The recommender is telling the truth; his/her professional profile corresponds with reality (this is verified by several persons you trust)
Practice | The recommender has practice in the given domain ((s)he worked 5 years in a company XY; however, (s)he was doing only routine tasks)
Situation S3T: The European project, which continues into the next year, is looking for a programmer of mobile applications to complete the existing team of programmers. You are responsible for the selection of that programmer. Now
you have at hand tens of professional profiles of programmers. Which beliefs from Table 6 influence your choice of the three programmers – appropriate (trustworthy) candidates for the given position?

Table 6. Description of beliefs for Situation S3T
Experience | You or your colleagues at the university have good previous experience with that programmer (you/they have already worked with the programmer on a project of similar scope)
Expertise | The programmer has lots of certificates regarding the programming of mobile applications or programming in general
Honesty | The programmer is telling the truth; his/her professional profile corresponds with reality (this is verified by several persons you trust)
Practice | The programmer worked for 5 years in the company XY developing applications for mobile devices
Willingness (to cooperate) | The programmer is willing to participate in the project (the job description is interesting for him/her, the salary conditions are acceptable for him/her)

Situation S5T: Imagine you are an employee of a fast-growing IT company that is programming applications for mobile devices. Your task is to create a report for the Executive Director of the company, who wants to establish a new branch in some region of the Czech Republic and, thus, wants to know the potential of students and recent graduates in the various regions of the Czech Republic. The SoSIReCR portal will provide you with the aggregated professional profile of trustworthy students and graduates for each region of the Czech Republic. Which beliefs from Table 7 are important for you to denote the given student or graduate as trustworthy – that is, as a person whose professional profile is included in the aggregated professional profile for the given region?
Table 7. Description of beliefs for Situation S5T
Expertise | The student or graduate has some certificates regarding the programming of mobile applications or programming in general; (s)he is an expert in the given domain of applications for mobile devices
Honesty | The student or graduate is telling the truth; his/her professional profile corresponds with reality
Practice | The student or graduate has practice in programming applications for mobile devices
High-Throughput Crowdsourcing Mechanisms for Complex Tasks Guido Sautter and Klemens Böhm KIT, Am Fasanengarten 5, 76128 Karlsruhe, Germany {guido.sautter,klemens.boehm}@kit.edu
Abstract. Crowdsourcing is popular for large-scale data processing endeavors that require human input. However, working with a large community of users raises new challenges. In particular, both possible misjudgment and dishonesty threaten the quality of the results. Common countermeasures are based on redundancy, giving way to a tradeoff between result quality and throughput. Ideally, measures should (1) maintain high throughput and (2) ensure high result quality at the same time. Existing work on crowdsourcing mostly focuses on result quality, paying little attention to throughput or even to that tradeoff. One reason is that the number of tasks (individual atomic units of work) is usually small. A further problem is that the tasks users work on are small as well. In consequence, existing result-improvement mechanisms do not scale to the number or complexity of tasks that arise, for instance, in the proofreading and processing of digitized legacy literature. This paper proposes novel result-improvement mechanisms that (1) are independent of the size and complexity of tasks and (2) allow trading result quality for throughput to a significant extent. Both mathematical analyses and extensive simulations show the effectiveness of the proposed mechanisms.
Keywords: Crowdsourcing, Data Quality, Throughput.
1 Introduction
Recently, crowdsourcing has become popular for tasks that require human input to increase data quality. Crowdsourcing distributes small pieces of a large effort to many users who make small contributions, usually over the Internet. Crowdsourcing has been used successfully for many tasks, e.g., image labeling [4, 10], double-keying individual words for OCR correction [11], grading the relatedness of word pairs for ontology construction [3, 7], or word sense disambiguation [8]. Crowdsourcing poses a number of challenges. In particular, there is no guarantee that user inputs are correct. Here, an input is correct if it is identical to what respective experts would agree on [3]. There are several reasons for incorrect inputs. We distinguish:
- Users can accidentally make mistakes due to sloppiness or misjudgment, even if they contribute solely because of interest in the project, like in [2, 4].
- Especially if they receive some reward for their inputs, users may cheat to reduce their effort. In particular, they may contribute arbitrary random input instead of
working thoughtfully. Especially if the reward is external, e.g., monetary, like in [3, 8, 11], gathering the reward might well be the only motivation. [3] has observed users following this strategy.
To ease presentation, we introduce several notions: A Task T is the unit of work assigned to contributors. Each task consists of Decisions D1 … Dd the user is supposed to take. In [11], for instance, a task consists of two decisions, namely on the correct transcriptions of two words from images. In [3] in turn, tasks consist of 12 decisions, on the relatedness of 12 term pairs. The original state of a task is its status before any user has worked on it. Further, the final status, i.e., after a crowdsourcing system regards the task as completed, is its result. Finally, inputs are the contributions of individual users who work on a task. We formalize these notions in Section 2.
In our context, it is important that incorrect inputs occurring for different reasons (see above) exhibit different properties and require different countermeasures. The crowdsourcing projects mentioned before have developed different respective strategies. One countermeasure against mistakes is redundancy, i.e., obtaining contributions from several users for each task. However, redundancy severely reduces throughput. A mechanism to discourage cheating is to probe users with tasks the system already knows the correct result for, e.g., CAPTCHAs [9].
The tasks crowdsourced in previous projects were relatively simple, e.g., double-keying words [11], grading the relatedness of word pairs [3], or finding meaningful labels for images [10]. If tasks consist of more than one decision, like in [3, 8], the individual decisions are mutually independent and can be freely combined into tasks. Tasks in other applications are much more complex. An example is the generation of semantic markup for legacy documents. In general, the tasks consist of multiple decisions that belong together, or the decisions are very complex. In Distributed Proofreaders [5] for instance, decisions are transcriptions of entire document pages, including both the word level and the structure of the page. Tasks of similar complexity arise in the Madagascar Project [6]. Because of the complex decisions and the high level of redundancy, throughput is low in Distributed Proofreaders, around 18,000 documents in 10 years. A more promising approach would be to use a mechanism like reCAPTCHA [11] for the word-level transcription and to proofread the page structure separately (we argue). Even if structuring a page can be broken into several decisions, these decisions still form a unit that a user should work on as a whole.
This calls for crowdsourcing mechanisms that (1) effectively counter errors and thus enforce data quality, (2) yield a high throughput, and (3) work with large tasks. Previous crowdsourcing projects have mostly addressed (1), but not in combination with (2) or (3). In particular, they have not addressed the tradeoff between data throughput and result quality. In this paper, we therefore study generic quality-enforcement techniques that are independent of the nature of the tasks and counter both mistakes and cheating:
- v-Voting counters mistakes. For each task, it obtains inputs from several users and aggregates them to the overall result. Unlike static redundancy, it uses a voting mechanism (controlled by the parameter v), which reduces the number of inputs required.
- Vote Boosting builds upon v-Voting to further increase throughput. It increases the weight of inputs from users who are known from prior observations to make
few mistakes, thus reducing the number of inputs required. If a reward system is in place, the reward can be specified to increase with the weight of the vote. We expect this to foster high-quality inputs.
These mechanisms assume that most users contribute useful inputs, an assumption common to crowdsourcing projects. If most inputs were arbitrary, there would be no chance of obtaining any meaningful data at all. Note that the mechanisms ensure data quality in the presence of cheating, but do not prevent or discourage dishonest user behavior in itself; this would require some sort of user-probing mechanism, e.g., one akin to CAPTCHA.
To assess the effectiveness of our mechanisms, we have conducted a thorough evaluation, considering both mistakes and cheating. It comprises theoretical analyses of the expected throughput and result quality, as well as simulations. The results are that v-Voting and Vote Boosting serve their respective purposes well; in particular, they yield the same result quality as static redundancy with fewer inputs. This paper is part of a larger effort that will also cover user experiments. Since such experiments are expensive even when covering only a few points in the parameter space, it is mandatory to study the alternatives with other methods beforehand. This paper reports on the respective results.
Paper Outline. Section 2 introduces the formal notions required for our analysis. Section 3 reviews related work. Section 4 provides an in-depth explanation and mathematical assessment of the data-quality-enforcement mechanisms. Section 5 features simulation results, and Section 6 concludes.
2 Formal Notions
To facilitate a formal analysis of crowdsourcing, this section formalizes some notions.
2.1 Decisions, Tasks and Functions
Definition: A Decision D is an atomic parameter set by a user. □
For instance, a decision is to classify a named entity, or to specify whether a given paragraph belongs to a document's main text or is a page header or a caption.
Notation: Opts(D) := {O1, …, Oo} denotes the options available for D. N ∉ Opts(D) denotes the null option, which models the case that D is undecided. □
For instance, the options for a decision can be the classes available for named entities or the paragraph types. Note that Opts(D) can be large. In particular, this is the case when users have to type words into a text field, like in [10, 11]. At every point of its time of residence in the crowdsourcing system, a decision D has an option S(D) ∈ Opts(D) ∪ {N} assigned to it. We refer to S(D) as the state of D. There are several dedicated states to be distinguished:
Notation: SO(D) ∈ Opts(D) ∪ {N} is the original state of D, i.e., the state assigned to D when it enters the crowdsourcing system. SI,U(D) ∈ Opts(D) denotes the state a user U has assigned to D in his input, i.e., the option this user has selected. An input
state cannot be N. SR(D) ∈ Opts(D) ∪ {N} is the result of D, i.e., the state of D when leaving the system. A null result, i.e., SR(D) = N, indicates that the system could not determine a meaningful result for D. SC(D) ∈ Opts(D) is the correct state of D, i.e., the outcome respective experts would agree on. Input(D) = (SI,U1(D), …, SI,Uu(D)) is the input list of D, containing the inputs that users U1, …, Uu have contributed to D. □
For instance, the original state can be the class an NLP tool has assigned to a named entity.
Definition: A Task T = (D1, …, Dd) is the unit of work assigned to users, consisting of one or more decisions D1, …, Dd. □
The individual decisions that make up a task can be connected or independent. In the first case, a crowdsourcing system cannot modify tasks by adding or removing decisions. In the latter case, the system can freely put decisions together into tasks. At any point of its time of residence in the crowdsourcing system, a task T has a state S(T). The state of a task is the composition of the states of the individual decisions it consists of, namely S(T) = (S(D1), …, S(Dd)). Analogously to individual decisions, we make the following distinctions:
Notation: SO(T) = (SO(D1), …, SO(Dd)) is the original state of T. SI,U(T) = (SI,U(D1), …, SI,U(Dd)) is the input of U to T. SR(T) = (SR(D1), …, SR(Dd)) is the result of T, i.e., its state after all user interactions. SC(T) = (SC(D1), …, SC(Dd)) is the correct state of T. Input(T) = (SI,U1(T), …, SI,Uu(T)) is the input list of T, comprising the inputs that users U1, …, Uu have contributed to T. □
Definition: An abstract input-aggregation function Result(Input(T)) is a function of type Input(T) → {∅, SR(T)} that computes the result of T from Input(T). □
A crowdsourcing system successively obtains inputs from users and adds them to Input(T). It evaluates Result(Input(T)) after the addition of each input; once Result(Input(T)) does not return ∅, T is complete, and no further input is required.
Notation: Work(T) denotes the expected value of |Input(T)| at the moment the input-aggregation function returns a non-empty result. □
In other words, Work(T) is the expected number of inputs to collect.
2.2 Types of Errors
This section investigates which errors can occur in the inputs that users contribute to crowdsourced tasks. Note that it is not our goal to enable crowdsourcing systems to distinguish between these errors. In general, this is not possible, because an error typically does not reveal the motivation of the user who incurred it. However, errors occurring for different reasons differ in their statistical nature, i.e., follow different patterns of occurrence. They thus require specific countermeasures. In general, there is an error in a decision D if S(D) ≠ SC(D). We are interested in the prevention of errors in the result of D, namely that SR(D) ≠ SC(D). Orthogonal to the distinction discussed below, there are two types of errors: (1) Miss Errors are errors that remain undetected; formally, a miss error exists if SO(D) ≠ SC(D), and
SR(D) ≠ SC(D). (2) Added Errors are errors introduced by users; i.e., SO(D) = SC(D), and SR(D) ≠ SC(D).
Accidental Errors are errors in the inputs of benevolent users incurred by mistake, be it out of sloppiness, lack of focus, or erroneous judgment. We assume that accidental errors occur randomly. Further, errors resulting from sloppiness tend to be miss errors, while the ones resulting from misjudgments can be of both types.
Notation: P('accidental miss') is the average probability across all users that some user accidentally misses an error in a decision D of a task T. P('accidental add') is the average probability that some user accidentally adds an error in a decision D. □
Cheating Errors occur because users do not bother to contribute thoughtful input. If the original state of a task SO(T) is a valid input, we assume that cheating users simply submit SO(T) as their input because this is the least effort possible. If the original state of a task consists of null values, like the initially empty text fields in [10, 11], we assume cheating users to randomly select an option from Opts(D) as their input. In the former case, adding an error requires making a change to the original state of a task. So submitting the original state as an input without changing anything cannot add any error. Thus, cheating errors generally are miss errors in this case.
Notation: P('cheat') is the average probability that some user cheats on a task T and thereby contributes an input with miss errors for all errors in SO(T). □
Combined Error Probability. To simplify subsequent computations, we aggregate the individual error probabilities.
Notation/Observation: P('miss') is the average probability of a miss error in a single input. This happens if a user cheats on T, or if he does not cheat and misses the error in some decision D ∈ T by mistake, namely:
P('miss') = P('cheat') + (1 − P('cheat')) · P('accidental miss')
P(’add’) is the average probability of an add error in a single input. This happens if a user does not cheat and adds an error in some decision D by mistake, namely: P(’add’) = (1-P(’cheat’)) · P(’accidental add’)
□
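A minimal sketch of these two aggregation formulas (Python; the function and variable names are ours, and the sample numbers are purely illustrative):

```python
def combined_error_probabilities(p_cheat, p_acc_miss, p_acc_add):
    """P('miss') and P('add') as defined above: a miss occurs if the
    user cheats, or does not cheat but accidentally misses an error;
    an add error can only come from a non-cheating user's mistake."""
    p_miss = p_cheat + (1.0 - p_cheat) * p_acc_miss
    p_add = (1.0 - p_cheat) * p_acc_add
    return p_miss, p_add

print(combined_error_probabilities(0.05, 0.06, 0.03))
# (0.107, 0.0285)
```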
2.3 Parameters and Figures
This section lists the exogenous and endogenous parameters of crowdsourcing systems and describes the optimization goals. The exogenous parameters are: (1) The nature of the tasks, i.e., the number of decisions they consist of, the number of options in the decisions, and whether the decisions are connected or not. (2) The accuracy of the initial states of the tasks, or, in other words, the number of errors to correct in each task. (3) The probabilities of users to make accidental errors and to cheat on tasks. The sole endogenous parameter is the input-aggregation function in use and its parameterization. The numbers to optimize are: (1) the expected accuracy of task results, namely P('SR(T) = SC(T)'), and (2) the expected number of inputs required to achieve this accuracy, i.e., the expected value of |Input(T)|. The latter is particularly important
when using third-party crowdsourcing platforms that require a fixed monetary reward per input, like the Amazon Mechanical Turk [1].
3 Related Work
This section discusses recent crowdsourcing projects, the mechanisms used to enforce data quality, and some experiences.
3.1 r-Redundancy
Many projects [3, 4, 8] use a simple input-aggregation function, namely r-Redundancy, where r is the parameter specifying the number of inputs required. r-Redundancy means that, once r inputs are given for a task T, the most frequently given input in Input(D) becomes the result of D, for each decision D in T. r usually is an odd number. r-Redundancy is suboptimal with regard to throughput, because a task always takes r inputs to complete, even if the first (r+1)/2 inputs agree completely. Eckert et al. [3] use a 5-redundant approach to arrange terms into a concept hierarchy. Each task consists of 12 independent decisions; each decision is to compare a pair of terms with regard to relatedness and relative generality. To detect inputs of low quality, each task included two very easy decisions P and Q. If users got them wrong, this served as an indicator that they were not paying attention. With this mechanism, [3] achieved a degree of data quality comparable to that of a concept hierarchy constructed from the same terms by domain experts. However, embedding decisions with known results like P and Q in every task only works with independent decisions that a crowdsourcing system can freely bundle into tasks. It is impossible to use with tasks that consist of connected decisions. Snow [8] successfully used 10-Redundancy-based crowdsourcing for detail-level NLP tasks like word sense disambiguation, achieving a result quality similar to [3]. All tasks consist of 30 independent decisions bundled randomly. The system did not include any mechanism to detect or filter inputs of low quality.
3.2 Agreement Games
Agreement Games synchronously obtain inputs from two random users, referred to as U and V. Each task T usually consists of a single decision D, and usually SO(D) = N. If the two inputs agree, they count as correct, and both users get a reward. Von Ahn has successfully used this approach for image labeling [10]. OntoGame [7] has shown that it also works well for ontology construction and alignment, and for named entity disambiguation. However, the agreement approach is unlikely to work well for tasks with multiple decisions, because such tasks make it much harder for users to make inputs that agree in all decisions – a single mistake in one input renders both inputs useless.
3.3 Other Approaches
ReCAPTCHA [11] is a crowdsourcing project that double-keys images of document pages in a word-by-word fashion. The CAPTCHAs users have to solve consist of two
random word images. One of them is the crowdsourcing task T, a single decision D on the correct transcription of the given word image. The other one is the actual CAPTCHA, referred to as C in the following, a word image whose correct transcription SC(C) is already known to the system. The presence of the CAPTCHA C, which is indistinguishable from the actual task T (= {D}), counters cheating well. ReCAPTCHA considers an input for D only if the CAPTCHA is solved, i.e., SI(C) = SC(C). A task is complete as soon as there are 3 agreeing inputs. However, reCAPTCHA tasks are tiny; tasks that take more time are impractical as CAPTCHAs. Furthermore, insisting on agreeing inputs is impractical with regard to throughput if tasks consist of multiple decisions, as we will show. Another crowdsourcing project related to the digitization of legacy documents is Distributed Proofreaders [5]. Its purpose is to correct OCR errors by means of redundancy. Tasks consist of one very large decision, namely the transcript of an entire page. Data throughput has been low so far, around 18,000 works in roughly eight years. A more sophisticated process separating the pages into smaller chunks might be more promising, e.g., using reCAPTCHA on the word level. The GalaxyZoo [4] project had over a million galaxy images classified into six basic categories by over 10,000 volunteers in less than 200 days. Their system presented each user with randomly selected images. However, this approach requires the whole set of tasks to be available from the start, which is not a given in digitization efforts. In addition, GalaxyZoo computed results only at the very end, using a centrality measure to weight the inputs of individual users.
4 High-Throughput Crowdsourcing
To facilitate the crowdsourcing of large numbers of complex tasks like proofreading digitized documents, this section now introduces respective data-quality-enforcement mechanisms. To ease presentation, we first investigate a base case that assumes a single input completes a task. We then present our mechanisms and evaluate them. We use the following running example: Think of a task T = {D1, D2, D3, D4}. Di is determining the type of the i-th paragraph in a page. Further suppose that
Opts(Di) = {'page header', 'main text', 'caption', 'footnote'},
SO(T) = ('main text', 'main text', 'caption', 'footnote'), and
SC(T) = ('page header', 'main text', 'main text', 'main text').
This corresponds to only 25% accuracy in automated classification, a very low value. We chose this below-standard value for presentation purposes. For our analysis, we use conservative, yet realistic figures. Namely, we assume that on average, for an individual decision D in a generic task T P(‘SO(D) = SC(D)’) = 80%, P(’miss’) = 10%, and P(’add’) = 5%.
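For use in the sketches below, the running example and the assumed exogenous parameters can be encoded as plain data (the representation is ours):

```python
# Options shared by the four paragraph-type decisions D1..D4.
OPTS = ("page header", "main text", "caption", "footnote")

# Original and correct states of the running-example task T;
# only D2 is initially correct, i.e., 25% automated accuracy.
S_O = ("main text", "main text", "caption", "footnote")
S_C = ("page header", "main text", "main text", "main text")

# Assumed per-decision figures for the analysis.
P_ORIG_CORRECT = 0.80   # P('SO(D) = SC(D)')
P_MISS = 0.10           # P('miss')
P_ADD = 0.05            # P('add')
```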
4.1 Base Case
As the baseline for assessing the effectiveness of the individual countermeasures, we first formalize the base case ('BC'), i.e., that exactly one user contributes to each task.
Then, the probabilities PBC(’miss’) of a miss error and PBC(’add’) of an add error occurring in a decision D are PBC(’miss’) = P(’miss’), PBC(’add’) = P(’add’)
This results in the following probability of a correct result: PBC(’SR(D)=SC(D)’) = 1 - P(‘SO(D)=SC(D)’) · PBC(’add’) - P(‘SO(D)≠SC(D)’) · PBC(’miss’)
Note that always WorkBC(T) = 1, representing the optimal throughput. With the values from the running example, we obtain PBC(‘SR(D)=SC(D)’) = 0.94 and PBC(‘SR(T)=SC(T)’) ≈ 0.7807.
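A minimal sketch reproducing these base-case figures (Python; the names are ours):

```python
def p_correct_result(p_orig_correct, p_err_add, p_err_miss, d=1):
    """P('SR = SC') for one decision (d = 1) or a task of d decisions,
    given the add/miss error probabilities of the mechanism at hand
    (in the base case, simply the raw per-input probabilities)."""
    p_decision = (1.0
                  - p_orig_correct * p_err_add
                  - (1.0 - p_orig_correct) * p_err_miss)
    return p_decision ** d

print(round(p_correct_result(0.80, 0.05, 0.10), 4))        # 0.94
print(round(p_correct_result(0.80, 0.05, 0.10, d=4), 4))   # 0.7807
```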
4.2 v-Voting
v-Voting ('V') is a means to counter accidental errors. Like r-Redundancy, it does so by obtaining and aggregating several inputs for each task. As opposed to r-Redundancy, it uses an agreement-based input-aggregation function. That is, there is a fixed level of agreement to reach, but no fixed number of inputs to obtain. [11] uses this technique for individual words, with a fixed v = 3. We generalize it here to a parametric level of agreement, referred to as v, and to any multi-decision task.
Notation: ResultV(Input(T)) is the input-aggregation function for v-Voting. RV(Input(D)) is an auxiliary function that computes whether there is an agreed-upon result for a decision D. Formally, this is:
RV(Input(D)) := O0, if ∃ O0 ∈ Opts(D) such that |{SI(D) ∈ Input(D) | SI(D) = O0}| ≥ v; N, otherwise.
ResultV(Input(T)) := ∅, if ∃ D ∈ T : RV(Input(D)) = N; (RV(Input(D)) for all D ∈ T), otherwise. □
Note that ResultV(Input(T)) avoids the ambiguous cases that can occur with r-Redundancy. Another advantage of ResultV(Input(T)) is that it requires fewer inputs than r-Redundancy for the same expected result quality. Further note that ResultV(Input(T)) computes the result decision-wise and does not require whole inputs to agree, in contrast to [11].
Example 1. Suppose that v = 2, that three users U1, U2, and U3 contribute inputs to the task T from the running example, and that the inputs are as follows:
SI,U1(T) = ('page header', 'main text', 'main text', 'footnote')
SI,U2(T) = ('main text', 'main text', 'main text', 'main text')
SI,U3(T) = ('page header', 'main text', 'caption', 'main text')
Even though no two inputs are equal, and all deviate from SC(T) in one decision, at least two inputs agree for each decision. Namely, the agreed-upon overall result SR(T) is ('page header', 'main text', 'main text', 'main text'), which is equal to SC(T), even though none of the users actually provided this input. Had users U1 and U2 given the same overall input, U3 would not have been asked to contribute an input to T at all. ■
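A minimal sketch of this decision-wise aggregation (Python; the data layout – an input as a tuple of selected options, one per decision – is ours), replayed on Example 1:

```python
from collections import Counter

def result_v(inputs, v):
    """ResultV: per decision D, the option selected by at least v of
    the inputs becomes R_V(Input(D)); if any decision lacks such an
    option, the task is not yet complete and None (i.e., the empty
    result) is returned, so another input must be obtained."""
    if not inputs:
        return None
    result = []
    for decision_inputs in zip(*inputs):     # one column per decision D
        option, count = Counter(decision_inputs).most_common(1)[0]
        if count < v:
            return None                      # R_V(Input(D)) = N
        result.append(option)
    return tuple(result)

inputs = [
    ("page header", "main text", "main text", "footnote"),    # U1
    ("main text",   "main text", "main text", "main text"),   # U2
]
print(result_v(inputs, v=2))      # None -> a third input is needed
inputs.append(("page header", "main text", "caption", "main text"))  # U3
print(result_v(inputs, v=2))
# ('page header', 'main text', 'main text', 'main text')  == SC(T)
```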
Decision-wise voting can considerably decrease the number of inputs required for an agreed-upon result, as illustrated in the example. The larger the number of decisions a given task comprises, the higher the advantage.
Formal Analysis. What is the overall probability of a correct result for a task T, i.e., PV('SR(T) = SC(T)')? We compute this in the following. For ease of presentation, we assume v = 2. To keep the computation simple, we further assume the worst case that, if several inputs contain add errors on a decision D of a task T, these errors are identical and become part of the result of T. This actually holds only for binary decisions, i.e., |Opts(D)| = 2. In non-binary decisions like those of the task from the running example, the assumption heavily increases the probability of an error. It helps us because it restricts |Input(T)| and thus reduces the number of cases to consider. Our simulations will show that |Input(T)| barely increases for |Opts(D)| > 2, in the range of a few percent, over a wide range of values for the other exogenous parameters.
Notation: PV('miss') and PV('add') denote the probabilities of a miss error and an add error occurring in the result of a decision D ∈ T, respectively. □
Informally, an error in the result of a decision D occurs if the first two inputs are erroneous, or if one of the first two inputs and the third input are erroneous. Formally, PV('miss') and PV('add') are as follows:
PV('miss') = 3·P('miss')² − 2·P('miss')³
PV('add') = 3·P('add')² − 2·P('add')³
The overall probability for a decision D ∈ T to be correct in the result then is:

PV(‘SR(D) = SC(D)’) = 1 − P(‘SO(D)=SC(D)’) · PV(’add’) − P(‘SO(D)≠SC(D)’) · PV(’miss’)

The overall probability of obtaining a correct result for a task T consisting of d decisions D1…Dd is:

PV(‘SR(T) = SC(T)’) = PV(‘SR(D) = SC(D)’)^d

Example 2. To illustrate the above, we compute the probability of a correct result for the task from the running example, with the exogenous parameters given there:

PV(‘SR(D) = SC(D)’) = 0.9886 and PV(‘SR(T) = SC(T)’) ≈ 0.9552
In the base case, the respective values are 0.94 and 0.7807. With no user input at all, the probability of a correct result would be, just for comparison, 0.8^4 = 0.4096. ■

Discussion. In Example 2, 2-Voting increases the probability of a correct result for the example task T to about 96%, from about 78% in the base case. This corresponds to a reduction of error by a factor of almost five, for at most threefold effort. Note that accuracy, for instance that of classifiers, is usually measured for individual objects; this corresponds to the individual decisions of a task. In this example, 2-Voting increases the probability of a correct final result for a decision D of a task T from 94% to about 99%. This corresponds to a reduction of error by a factor of more than five compared to the base case, again for at most three times the effort.
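As a quick numerical check, the formulas above reproduce the figures of Example 2. The exogenous values in the sketch below (P(‘SO(D)=SC(D)’) = 0.8, P(’add’) = 0.05, P(’miss’) = 0.1, d = 4) are inferred from the numbers reported for the running example; they are not stated explicitly in this excerpt.

```python
# Sketch checking Example 2; the exogenous values are inferred, not quoted.
p_state, p_add, p_miss, d = 0.8, 0.05, 0.1, 4

pv_miss = 3 * p_miss**2 - 2 * p_miss**3   # PV('miss')
pv_add  = 3 * p_add**2  - 2 * p_add**3    # PV('add')

pv_decision = 1 - p_state * pv_add - (1 - p_state) * pv_miss
pv_task = pv_decision ** d

print(round(pv_decision, 4))   # 0.9886
print(round(pv_task, 4))       # 0.9552
print(round(p_state ** d, 4))  # 0.4096, the no-input comparison value
```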
The following notions are auxiliary; we use them to formally derive our main result, the computation of the expected throughput WorkV2(T).

Notation: P(‘SI,U1(D)=SI,U2(D)’) is the probability that the first two inputs SI,U1(D) and SI,U2(D) agree for a decision D. □

Informally, this is the probability that either none or both of SI,U1(D) and SI,U2(D) are erroneous in some way. Formally, it is as follows:

P(‘SI,U1(D)=SI,U2(D)’) = P(‘SO(D)=SC(D)’) · (P(‘add’)^2 + (1−P(‘add’))^2) + P(‘SO(D)≠SC(D)’) · (P(‘miss’)^2 + (1−P(‘miss’))^2)

Notation: P(‘SI,U1(T) = SI,U2(T)’) denotes the probability that the first two inputs SI,U1(T) and SI,U2(T) agree for an entire task T. □

Formally, this is:

P(‘SI,U1(T) = SI,U2(T)’) = P(‘SI,U1(D) = SI,U2(D)’)^|T|

Throughput. The actual increase in effort in comparison to the base case depends on the probability P(‘SI,U1(T)=SI,U2(T)’) of the first two inputs agreeing on all decisions in T, namely:

WorkV2(T) = 2 · P(‘SI,U1(T) = SI,U2(T)’) + 3 · P(‘SI,U1(T) ≠ SI,U2(T)’)

Further, WorkV2(T) / WorkBC(T) is the overhead 2-Voting incurs in comparison to the base case. Likewise, 1 − WorkV2(T) / WorkR(T) is the reduction in effort 2-Voting yields in comparison to 3-Redundancy.

Example 3. With the values from the running example, the probability of the first two inputs agreeing and the expected number of inputs required are:

P(‘SI,U1(T) = SI,U2(T)’) = 0.6162 and thus WorkV2(T) = 2.3838

Compared to the reduction in error, the overhead over the base case is relatively low. The reduction in effort compared to 3-Redundancy is 21%, corresponding to a 26% increase in throughput, at no increase in the probability of errors at all. ■
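The throughput formulas can be evaluated in the same sketch, under the same inferred exogenous values as before; they yield roughly 0.62 and 2.38, close to, but not exactly, the 0.6162 and 2.3838 of Example 3, since the paper's precise inputs are not fully recoverable from this excerpt.

```python
# Sketch of the 2-Voting throughput computation (inferred exogenous values).
p_state, p_add, p_miss, d = 0.8, 0.05, 0.1, 4

# probability that the first two inputs agree on a single decision
p_agree_d = (p_state * (p_add**2 + (1 - p_add)**2)
             + (1 - p_state) * (p_miss**2 + (1 - p_miss)**2))
p_agree_t = p_agree_d ** d                      # agreement on the whole task
work_v2 = 2 * p_agree_t + 3 * (1 - p_agree_t)   # expected number of inputs

print(round(p_agree_t, 4), round(work_v2, 4))   # ~0.6218  ~2.3782
```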
Discussion. Note that in reality both P(’miss’) and P(’add’) will be far lower than the pessimistic values from our example computations. Further, the probability of a correct original state P(‘SO(D)=SC(D)’) is often higher, resulting in a higher probability P(‘SI,U1(D)=SI,U2(D)’) of the first two inputs agreeing. On the other hand, tasks can comprise far more decisions, so the exponent in the computation of P(‘SI,U1(T) = SI,U2(T)’) increases, resulting in lower values. Depending on the actual numbers, the effect can go either way:

Example 4. If P(‘SI,U1(D)=SI,U2(D)’) is 99% in a task with 20 decisions, P(‘SI,U1(T)=SI,U2(T)’) is 82%; in a task with 50 decisions, in turn, it drops to 61%. ■

4.3 Vote Boosting

Vote Boosting (‘VB’) increases the weight of the inputs of users who make few mistakes. It exploits the fact that presumably not all users make mistakes with the same probability, and that v-Voting allows observing the frequency of mistakes for each user U. If U has made few mistakes recently, Vote Boosting gives higher weight to an input from U in the aggregation function. Thus, it reduces the number of inputs required to compute a result.

Definition: CoinFlip(c) is a random function that returns 1 with a probability of c and 0 with a probability of (1−c). □
Notation: BoostProb(U,T) is the function that computes the probability that the input SI,U(T) of a user U for a task T receives a vote boost (referred to as the boost probability in the following). □

We derive a formula for this probability below.

Definition: ResultVB(Input(T)) is the input-aggregation function for Vote Boosting, as follows:

ResultVB(Input(T)) := SI,U(T)            if |Input(T)| = 1 and CoinFlip(BoostProb(U,T)) = 1
                      ResultV(Input(T))  otherwise □
With this definition of ResultVB(Input(T)), with a probability of BoostProb(U,T), SI,U(T) immediately becomes the result of T, bypassing the v-Voting mechanism. This reduces WorkVB(T) to 1, the baseline level, completely eliminating the overhead. However, it also abandons the error-prevention functionality of v-Voting. Thus, BoostProb(U,T) may return a boost probability considerably above 0 only for users who are very unlikely to make mistakes. We formalize BoostProb(U,T) as follows:

Notation: C denotes the minimum probability required for the result of a task T to be correct, i.e., the required result quality. P(‘SI,U(D)=SC(D)’) is the probability that a user U contributes a correct input to a decision D; the respective probability for a task T is P(‘SI,U(T)=SC(T)’) = P(‘SI,U(D)=SC(D)’)^|T|. Further, Correct(U) denotes the observed number of correct inputs from user U since his last erroneous input. m denotes the maximum probability the system accepts for user U to receive a vote boost for a task T even though actually P(‘SI,U(T)=SC(T)’) < C. □

The actual value of P(‘SI,U(T)=SC(T)’) is unknown, but we can estimate it with high certainty from Correct(U). In particular, we can compute BoostProb(U,T) by means of a significance test for accepting the hypothesis “P(‘SI,U(T)=SC(T)’) ≥ C” based on Correct(U) observed correct inputs. This hypothesis states that user U has a sufficiently high probability of providing correct input for T to be eligible for a vote boost. We derive an upper bound b for BoostProb(U,T) from the significance test, namely the highest value b for which the hypothesis holds at a significance level of m / b:

P(“P(‘SI,U(T) = SC(T)’) < C”) ≤ m / b ⇔ C^(Correct(U)/|T|) ≤ m / b ⇔ m · C^(−Correct(U)/|T|) ≥ b

Note that b increases exponentially with Correct(U) / |T|. To prevent voting from being completely deactivated for any user (i.e., his boost probability rising to 1), we use (1 − m) as an additional upper bound. Further, we want BoostProb(U,T) to be 0 for Correct(U) = 0 and therefore subtract m.

Definition:

BoostProb(U,T) := min((1 − m), m · (C^(−Correct(U)/|T|) − 1)) □
Example 5. This example illustrates how the boost probability increases as a user provides more and more correct inputs. Suppose that a given task T consists of 3 decisions. Further, suppose user U has contributed correct inputs to the previous Correct(U) = 100 decisions. Finally, let m = 1% and C = 99%. Then the probability of boosting the vote of U is:

BoostProb(U,T) = 0.01 · (0.99^(−100/3) − 1) ≈ 0.4%
For the boost probability to exceed 50% for the given T, m, and C, Correct(U) has to exceed 1173. This means that U has to contribute inputs to 391 tasks without any mistake for this to happen. After increasing Correct(U) to 1374, i.e., after 458 tasks the size of T, the boost probability finally reaches its upper limit of (1 - m) = 99%. ■
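A small sketch of the BoostProb definition (our own code) reproduces Example 5 and the two thresholds just mentioned; the threshold search reports 1375 for the cap, which matches the text's 1374 up to rounding.

```python
# Sketch of BoostProb(U,T) := min(1 - m, m * (C ** (-Correct(U)/|T|) - 1)).
def boost_prob(correct, task_size, m=0.01, C=0.99):
    return max(0.0, min(1 - m, m * (C ** (-correct / task_size) - 1)))

print(round(boost_prob(100, 3), 4))  # 0.004, i.e. 0.4% as in Example 5
# smallest Correct(U) with a boost probability above 50%:
print(next(c for c in range(5000) if boost_prob(c, 3) > 0.5))    # 1174 (~391 tasks)
# smallest Correct(U) at which the cap 1 - m = 99% is reached:
print(next(c for c in range(5000) if boost_prob(c, 3) >= 0.99))  # 1375 (text: 1374)
```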
5 Evaluation

To evaluate our mechanisms, we have run extensive simulations, testing many variations of input-aggregation functions.

5.1 Experimental Setup

The sets of tasks used here have two parameters: the number of options per decision, and the accuracy of the initial states. We generated 9 sets of 1,000,000 tasks each, with 2, 3, or 4 options per decision and 80%, 90%, or 95% accuracy of the original states. Each task consists of 5 to 10 decisions, normally distributed over that interval.

The user populations tested have two parameters: their mean probabilities of cheating and of making mistakes. We used values of 1%, 4%, and 15% for both, generating populations of 1,000 users for each of the resulting 9 combinations. For the individual users, the probabilities of cheating and of making errors by mistake were exponentially distributed over [0,1] around the respective mean values.

We have implemented users as follows: In the case of an add error on a decision with more than two options, a user selects one of the erroneous options at random. Users take a fixed time t per decision when contributing thoughtfully. Changing the state of a decision increases this time to 2·t; cheating decreases it to t/2. At runtime, each user is a separate thread, so users are independent of each other and work concurrently.

In all, we ran simulations for 46 input-aggregation functions. One is the base case, i.e., each task receives one input. The other 45 are as follows: r-Redundancy with r = 3, 5, 7, and v-Voting with v = 2, 3, 4 combined with 14 different parameter settings for Vote Boosting, one of which deactivates it.

5.2 Results

From a total of 14,661 simulated scenarios, we report only on the four analyses we deem the most interesting, to save space.
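The following is a hedged sketch of how such task sets and user populations might be generated (a toy version with smaller counts; distribution details beyond those stated above, such as the spread of the decision count, are our assumptions).

```python
# Toy sketch of the experimental setup; parameter details beyond the text
# (e.g., the standard deviation of the decision count) are assumptions.
import random

def make_task(n_options, state_accuracy):
    # 5 to 10 decisions per task, roughly normally distributed over the interval
    n_decisions = max(5, min(10, round(random.gauss(7.5, 1.0))))
    correct = [random.randrange(n_options) for _ in range(n_decisions)]
    # original state: correct with the given accuracy, else a random wrong option
    state = [c if random.random() < state_accuracy
             else random.choice([o for o in range(n_options) if o != c])
             for c in correct]
    return correct, state

def make_population(n_users, mean_cheat, mean_mistake):
    # per-user probabilities, exponentially distributed around the means,
    # clipped to [0, 1]
    return [(min(1.0, random.expovariate(1 / mean_cheat)),
             min(1.0, random.expovariate(1 / mean_mistake)))
            for _ in range(n_users)]

tasks = [make_task(n_options=3, state_accuracy=0.9) for _ in range(1000)]
users = make_population(1000, mean_cheat=0.04, mean_mistake=0.04)
```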
v-Voting vs. r-Redundancy. Table 1 shows the average result quality and the average number of inputs per task for v-Voting and r-Redundancy. For fairness, the numbers for v-Voting come exclusively from input-aggregation functions that do not use Vote Boosting. All numbers are aggregated over all user populations and task sets. Clearly, v-Voting yields better throughput than r-Redundancy. This substantiates the results of the analysis. Interestingly, result quality also improves slightly with 2-Voting and 3-Voting in comparison to 3-Redundancy and 5-Redundancy, respectively. We figure that this is because v-Voting avoids ambiguous decisions.

Table 1. Inputs per task and remaining error

                         Base Case   3-Red.   2-Voting   5-Red.   3-Voting   7-Red.   4-Voting
Remaining Error (in %)   4.25        1.11     1.01       0.48     0.46       0.27     0.27
Inputs per Task          1           3        2.36       5        3.57       7        4.75
Vote Boosting. Figure 1 visualizes the impact of Vote Boosting, namely the increase in throughput and in errors. The effect of changes to C and m is similar for all three values of v we have tested: the more liberal the parameter settings, the higher the increase in throughput, but also the higher the number of errors. The dependency seems almost linear for both. For a given required result quality, this predictable behavior allows for tuning to achieve the highest throughput possible.

Cost of High-Quality Results. Table 2 shows the average number of inputs required for each task to achieve at least 99.5% accuracy in the result, broken down across the 9 different user populations. The accuracy actually achieved is given in parentheses, with the parameters of the input-aggregation function listed beneath. The input-aggregation function always uses v-Voting (parameter v), mostly with Vote Boosting (parameters m and C); a value of 0 for m indicates that Vote Boosting has not been used. These results point out the correlation between the capability and honesty of the contributing users and crowdsourcing throughput; the latter translates directly into the per-task cost in scenarios with a per-input payoff, e.g., the Amazon Mechanical Turk. With low probabilities for both mistakes and dishonesty, 1.14 inputs per task are sufficient to achieve the desired accuracy. This number increases sharply if either of the two probabilities increases. With pessimistic values for both, even 5.38 inputs per task are not enough to reach the goal. This highlights the importance both of fostering high-quality inputs and of deterring users from cheating.
Fig. 1. Effects of Vote Boosting
Crowdsourcing Strategy. As our simulations have shown, the best-suited strategy to achieve a desired result quality at a high throughput depends on the exogenous parameters. These parameters are hardly predictable at the start of a crowdsourcing project. Thus, we recommend starting out on pessimistic assumptions, i.e., favoring result quality over throughput. Then, experts can assess the quality achieved (e.g., from a sample of task results) and deduce values of the exogenous parameters. Afterwards, the endogenous parameters can be adjusted to optimize throughput.

Table 2. Inputs required to achieve 99.5% result accuracy

Mean Prob.                      Mean Prob. of Cheating
of Mistakes    1%                    4%                    15%
1%             1.14 (99.51%)         1.42 (99.57%)         3.94 (99.65%)
               v=2, m=8%, C=92%      v=2, m=4%, C=96%      v=4, m=2%, C=98%
4%             1.78 (99.63%)         1.93 (99.51%)         4.6 (99.61%)
               v=2, m=4%, C=96%      v=2, m=4%, C=96%      v=4, m=2%, C=98%
15%            3.78 (99.55%)         4.48 (99.51%)         not achieved: 5.38 (98.62%)
               v=3, m=2%, C=98%      v=4, m=4%, C=96%      v=4, m=0
6 Conclusions

Crowdsourcing is popular for large-scale data-processing endeavors that require human input. However, both the potential inability and the dishonesty of users threaten the quality of the results. This causes a tradeoff between data throughput and result quality. In this paper, we have studied mechanisms that enforce data quality with as small an impact on throughput as possible, independently of the actual tasks. In particular, v-Voting increases throughput over static redundancy-based approaches. Vote Boosting further increases throughput by capitalizing on especially capable users. Extensive simulations over a wide range of exogenous parameters have confirmed the suitability of the mechanisms, substantiating our findings from the theoretical analyses. In particular, the simulation results show (1) that v-Voting yields higher result quality than r-Redundancy with fewer inputs per task, and (2) that Vote Boosting allows trading off result quality in favor of throughput in a predictable fashion.
References

1. The Amazon Mechanical Turk, http://www.mturk.com
2. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popovic, Z.: Predicting protein structures with a multiplayer online game. Nature 466 (2010)
3. Eckert, K., Niepert, M., Niemann, C., Buckner, C., Allen, C., Stuckenschmidt, H.: Crowdsourcing the assembly of concept hierarchies. In: Proceedings of JCDL 2010, Brisbane, Australia (2010)
4. Lintott, C.J., Schawinski, K., Slosar, A., Land, K., Bamford, S., Thomas, D., Raddick, M.J., Nichol, R.C., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J.: Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389 (2008), doi:10.1111/j.1365-2966.2008.13689.x
5. Newby, G.B., Franks, C.: Distributed proofreading. In: Proceedings of JCDL 2003, Houston, TX (2003), doi:10.1109/JCDL.2003.1204888
6. Sautter, G., Böhm, K., Agosti, D., Klingenberg, C.: Digital Resources from Legacy Documents - an Experience Report from the Biosystematics Domain. In: Aroyo, L., Traverso, P., Ciravegna, F., Cimiano, P., Heath, T., Hyvönen, E., Mizoguchi, R., Oren, E., Sabou, M., Simperl, E. (eds.) ESWC 2009. LNCS, vol. 5554, pp. 738–752. Springer, Heidelberg (2009)
7. Siorpaes, K., Hepp, M.: OntoGame: Towards overcoming the incentive bottleneck in ontology building. In: Chung, S., Herrero, P. (eds.) OTM-WS 2007, Part II. LNCS, vol. 4806, pp. 1222–1232. Springer, Heidelberg (2007)
8. Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast—but is it good? Evaluating non-expert annotations for natural language tasks. In: EMNLP 2008, Morristown, NJ, USA (2008)
9. Von Ahn, L., Blum, M., Hopper, N., Langford, J.: CAPTCHA: Using Hard AI Problems for Security. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 294–311. Springer, Heidelberg (2003), doi:10.1007/3-540-39200-9_18
10. Von Ahn, L.: Games with a Purpose. IEEE Computer 39(6), 92–94 (2006)
11. Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., Blum, M.: reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science 321(5895) (2008), doi:10.1126/science.1160379
Designing for Motivation: Focusing on Motivational Values in Two Case Studies Fahri Yetim, Torben Wiedenhoefer, and Markus Rohde Department of Information Systems and New Media, University of Siegen, Hoelderlinstr. 3, 57076 Siegen, Germany {fahri.yetim,torben.wiedenhoefer,markus.rohde}@uni-siegen.de
Abstract. This paper presents our investigations into how the value sensitive design of interactive systems can motivate people to contribute to semantic web applications. In two case studies we adopted the Value Sensitive Design (VSD) framework (Friedman et al., 2006), relying on three levels of investigation. The conceptual investigation focused on a literature analysis and identified a set of motivational values. The empirical investigation involved understanding the motivations of users within the two cases. Finally, a technical investigation was conducted to determine design features which may support and facilitate these values. This study illustrates the use of the VSD framework for investigating motivational values and provides a review of design features to support end users' motivation to contribute to public goods.

Keywords: Motivation, value sensitive design, user participation, annotation, case studies.
1 Introduction

The triumph of social web applications (like Facebook, Twitter, Flickr, etc.) has highlighted the increasing importance of user contributions in general. Concepts such as “Crowdsourcing” (Howe 2008) or “Swarm Intelligence” (Hinchey et al. 2007) mark milestones on the way to the assumed mass production of public goods. But what does in fact motivate users of web applications to become “prosumers” (Tapscott and Williams 2008) and to contribute their knowledge (personal property) to the (public) swarm?

Recent approaches to semantic web system design rely on the contributions of end users. Enabling users to annotate different media resources and to provide additional pieces of information is regarded as crucial for valuable content enrichment. Therefore, there has been a growing interest in designing systems that motivate people to participate and to annotate in order to enrich web content (e.g., Zhang, 2008; Preece & Shneiderman, 2009; Cuel et al., 2011). The current literature investigates the effects of different motivation strategies (such as intrinsic and extrinsic motivations) on user participation. These studies show that human values and social needs, such as the sense of belonging to a community, altruism, fun, etc., play a significant motivational role and thus should be considered when designing systems that motivate user contribution (Kuznetsov, 2006; Farzan et al., 2008).
The purpose of this paper is to report on the results of our investigations, which were aimed at understanding the motivational values in two cases and at suggesting design features to support them. This research was conducted within the context of the EU-funded research project INSEMTIVES, whose goal is to bridge the gap between human and computational intelligence in current semantic content authoring research. It investigates incentive mechanisms to motivate users to contribute semantic annotations, in order to increase the amount and quality of annotations for a broad range of different resource types. The two cases considered in this paper are two different companies, one a telecommunication provider and the other a web service portal provider. This paper addresses the following research questions:

1. What motivates users in different use cases? Are there significant commonalities/differences between the cases?
2. What mechanisms/design features may support motivational values and thus motivate people to annotate in these use cases?
To deal with these issues we took a Value Sensitive Design (VSD) perspective and employed the VSD methodology developed by Friedman et al. (2006). The VSD methodology has been applied in several contexts (Yetim, 2011a, 2011b), including the analysis of Wikipedia for understanding the motivational values of Wikipedians (Kuznetsov, 2006). Our study is design-oriented, searching for design features to support motivational values in the use context. The VSD methodology is used to frame our investigations, both for understanding motivational values and for designing to promote them. The paper contributes to the current literature by demonstrating how the VSD methodology can be employed for investigating motivational values in the context of a design project and by providing a review of design features to support end users' motivation to contribute to public goods. In the following, we will first provide a brief introduction to the VSD methodology and clarify how it has been employed in this study. Then we will describe the different types of investigations that we have conducted, and finally we will present our findings.
2 Value Sensitive Design as a Methodological Framework

The VSD approach (Friedman et al., 2006) is viewed as a viable, principled way to systematically consider human values throughout the design and deployment of information technologies (Le Dantec et al., 2009; Yetim, 2011a, 2011b). Methodologically, at the core of VSD lies an iterative process that integrates conceptual, empirical, and technical investigations. Conceptual investigations involve philosophically informed analyses of the central value concepts and issues under investigation. Empirical investigations may focus on the analysis of the social context (e.g., to understand how individuals apprehend specific values) as well as on the evaluation of a particular design. Technical investigations involve proactive design to support values identified during the conceptual investigations, and analyzing current technical mechanisms with respect to how they hinder or support human values.
The VSD framework has also been criticized with respect to the order of its basic activities. For example, Le Dantec et al. (2009) argued that the VSD methodology starts with a set of human values as part of the conceptual investigations and then analyzes the chosen value concepts. Instead, they suggest beginning with empirical investigations in the context in order to identify contextual values to be considered in design. Rather than a fixed classification of values, they suggest a classification of values derived from the empirical work. On the other hand, some researchers have argued that the VSD framework does not prescribe the sequence of investigations and is open in this respect (Yetim, 2011a). This means that it is in accordance with the VSD methodology if predefined abstract value categories, identified within conceptual investigations, are combined with local values, identified within the empirical investigations. The value categories can be used as an analytic tool with respect to the locally expressed values, as Le Dantec et al. (2009) state: “With an empirical investigation shaping the understanding of values, the conceptual investigation becomes a tool through which the designer can reflectively evaluate the values presented through the empirical investigation and those that may be expressed through a more generally defined classification.” (p. 1148). This is how we employ the empirical and conceptual investigations of the VSD methodology in our research.
3 Investigating Motivations from a VSD Perspective For investigating the motivational aspects in two case studies we started with empirical investigations to understand the motivational values of the users, i.e., what is it that motivates them to annotate. Simultaneously, we conducted conceptual investigations to identify motivational value categories in the literature. The results of the conceptual investigations, i.e. the value categories, did not influence the empirical investigations. Instead, they were used for the analysis of the data collected within the empirical investigations. Yet, for presentation purposes, we will first introduce the categories of motivational values that were derived from the literature analysis as part of our conceptual investigations, and then present the results of our empirical investigations in two case studies, in which these categories were used to analyze the data. Finally, we will present the results of our technical investigations that were conducted to determine design features that may support or facilitate the values. In other words, starting from what we know about what motivates people individually and collectively in our design contexts, we analyzed the state of the art knowledge to identify and suggest relevant design features or mechanisms (cf. Cuel et al., 2011). 3.1 Conceptual Investigations Several studies indicate that motivational values play a significant role in motivating people to participate. For example, most people value their own welfare and are motivated to increase it when opportunities to do so arise. Goals that are valued induce motivation directed towards achieving these goals (Batson et al, 2002). The current literature offers a set of motivational values. For example, Batson et al. (2002) differentiate four types of motivation for community involvement: egoism, altruism,
collectivism, and principlism. The differentiation is based on the ultimate goal of each motive. For egoism, the ultimate goal is to increase one's own welfare; for altruism, it is to increase the welfare of another individual or other individuals; for collectivism, the goal is to increase the welfare of a group; and for principlism it is to uphold one or more moral principles. Kuznetsov (2006) states that the motivations of Wikipedians to contribute are grounded in values such as reputation, community, reciprocity, altruism, and autonomy (see also Wagner and Prasarnphanich 2007). Hars and Qu (2002) show that altruism and identification with a community, as internal motivation factors, played an important role for participation in open source projects. Wasko and Faraj (2005) also found that the prospect of increasing their professional reputation motivates people to contribute their knowledge. Oreg and Nov (2008) considered the categories reputation-building, self-development, and altruism in the context of open source projects and showed that software contributors placed a greater emphasis on reputation-gaining and self-development, whereas content contributors placed a greater emphasis on altruistic motives. Fang and Neufeld (2009) provide other categories of motives, including software use value, status and recognition, learning, personal enjoyment, reciprocity, getting paid, sense of ownership and control, career advancement, free software ideology, and social identity. All of these values are potentially relevant and may therefore be used as ex post categories to analyze and interpret the results of the empirical investigations, in order to understand which values dominate in the fields studied, as presented next.

3.2 Empirical Investigations

The empirical investigations consist of two case studies conducted with a telecommunication provider and a web service portal provider. For reasons of confidentiality, we refer to them as the Telco Corp case and the adfind case.

Telco Corp. Case

Purpose and Method

The first case study was conducted with a telecommunication provider located in Spain. The overall purpose of this investigation was to understand how semantic annotation tools can help users organize and locate content within their work context, and what motivates them to actively participate in semantic content creation. For this purpose, we organized an interview session and a focus group. We first conducted the interview session with 11 representative employees (heads of departments, project managers, developers, etc.). Each semi-structured interview took about 1 hour and was divided into four major issues, dealing with the understanding of (a) existing communication processes and entities, (b) existing usage practices to retrieve information/knowledge, (c) benefits and problems with respect to semantic annotation mechanisms that can support information retrieval, and finally (d) factors that motivate or would motivate people to participate actively in semantic content creation.

After the interview session, we arranged a three-hour focus group workshop with five participants. The goal of this workshop was to find out what appropriate semantic
tools would look like and what would motivate users to use them. In this workshop, participants first conducted a scenario-oriented walkthrough and completed work-related tasks using the existing content and knowledge management systems of the company. We used the thinking-aloud and constructive-interaction methodology, and the constructive and critical comments of the users were recorded. After the scenario-oriented walkthrough, the participants discussed what tools or techniques could help them find information more effectively and efficiently. They also discussed the problem of participation, i.e., what can motivate employees to actively contribute to semantic content creation.

The results recorded through interviews and workshops were transcribed and analyzed. This paper focuses only on the results related to motivational aspects. Two of the authors interpreted the related data separately and assigned them to the motivational value categories mentioned before. They also discussed their differing interpretations and came to the following findings.

Results

Community support was mentioned by some interviewed participants as one of the main reasons why they would contribute. Two of the interviewed participants emphasized the relevance of community for them. One of the participants stated that the internal network is very strong in their group and that people tend to help each other as much as they can. One interviewee emphasized the usefulness of expert-allocation support to locate contact persons for specific knowledge, documents, or technologies more easily, and pointed out that he would annotate or add additional information to make this work.

Reputation was also a relevant motivational value for some users, as they expressed their desire to gain recognition from others in their company. One of the participants indicated that a strong incentive would be to allow users to build a reputation in a certain area; for example, when switching projects (which happens often for Telco Corp. workers) they could be channeled towards activities they actually like.

Self-development through learning from the annotations of others was highly valued by several participants. In particular, they emphasized the importance of the usefulness of annotations. For example, one participant stated that annotations would allow him to find information from other projects more easily in order to enhance the collective knowledge. Another participant would be motivated to use annotation tools if such tools allowed him to track the expertise of people in an area based on the content they generate.

Personal enjoyment (or having fun) while annotating was also desired by one participant in our interview. He stated that the “entertaining” side of the tools is very important to him in order to be motivated to use the annotation tools.

Finally, the expectation of self-benefit can also be regarded as a reason for using the annotation tools. We could infer this indirectly from the statements of some participants. For example, one participant stated that he would not use the annotation tools as he could not see any benefit in annotating and regards using them as a “waste of time”. Another participant, who also could not see any value in using annotation tools, stated that he would only use the tools if annotating were his main job.
Adfind Case

Purpose and Method

The second case study was conducted with a web service portal provider located in Austria. The main purpose of this study was to find out how semantic annotation mechanisms can improve web service search engines and what can motivate users to actively participate in semantic content creation. As in the previous study, we organized an interview session and a focus group. We first conducted the interview session with 8 participants. Half of them were staff members of the portal provider and the others were developers who frequently use web service search engines. Each semi-structured interview took about 1 hour and was divided into four major issues, dealing with the understanding of (a) the general work tasks (if staff member) and the reasons for using web service search engines (if developer), (b) the existing usage practices of web service retrieval processes, (c) semantic annotation mechanisms to improve retrieval processes, and (d) motivation factors for participation in semantic content creation.

After the interview session, a two-hour focus group workshop was conducted with 14 participants, including staff members from the portal provider and external developers who are experienced web service users. We presented the existing web service search engine to all participants and discussed shortcomings of the existing search engine, ways of overcoming them, how semantic annotation tools could support web service retrieval processes, and finally, what motivates users to actively contribute to semantic content creation. The results recorded through interviews and workshops were transcribed and analyzed. Two of the researchers came up with separate interpretations of the comments related to motivational aspects and assigned them to the motivational value categories. They discussed their differing interpretations and came to the following findings.

Results

Self-benefit was of particular importance for motivating participants in this case. The data showed that all participants would annotate web services to reach a personal goal, even though their goals differed. In a commercial context, service providers/developers usually have the goal of selling as many web services as possible (in the case of open source, they may also aim to make web services popular, which is related to the reputation value, as discussed below). One participant, for example, annotates his own web services by providing a detailed description and specific tags in order to make them more visible and searchable for web service consumers. Web service consumers use annotation tools mainly to enhance their personal web service retrieval processes. One participant mentioned that he would not mind if others saw his tags, but that he uses tags to bookmark visited web services so as to find them more easily and faster. Another interviewee pointed out that there must be an extra value for consumers in order for them to enrich web service descriptions; getting extra access to withheld information or functionality would be a valuable mechanism. For example, one annotation could entitle the user to three additional searches, or for a specific number of annotations the user would be shown related web services from which he might benefit.
Community also appeared to be relevant. In fact, no visible community exists around the web service search engine yet; still, the interviews as well as the data from the focus group session indicate that a community would foster the motivation to actively contribute to semantic content creation. Two participants, non-commercial web service developers, stated that it is important for them to give something back to a community, especially if they have already profited from the work or knowledge of that community (e.g., web service recommendations). Another participant also pointed out that he would be more willing to spend time on annotating web services if others also contributed.

Reputation played a significant role, especially for those people who have been part of a community. For the web service developers of a company, it is important to gain a good reputation. One participant stated that he would value the possibility of making his own web services more visible within the company by using annotation tools. Another interviewee pointed out that the possibility of advancing from newcomer status to experienced developer or user status can motivate people to participate.

Self-development was desired and articulated either directly, by emphasizing the need for personal development, or indirectly, by emphasizing the need for appropriate instruments that may be helpful for personal development. As already mentioned in the description of self-benefit, one interviewee explained the benefit of getting extra information or functionality after making a specific number of annotations. During the focus group, several participants emphasized the value of retrieving expertise from other developers through annotations, and all participants agreed that they would benefit from the annotations immediately.

Finally, personal enjoyment (or having fun) during the annotation process was also emphasized by some participants.

Summary

The two case studies show some similarities with respect to their results, i.e., with respect to motivational values. In both cases we found indications for five values: reputation, self-development, self-benefit, community, and personal enjoyment. Yet, we should mention that the values cannot be strictly separated and that participants can value several things simultaneously. For example, one participant stated that he would annotate first of all for himself, but would not mind if others saw the tags and would like it if they were useful to them. This expresses both self- and community benefits. Despite the similarities between the results of these two cases, we do not claim that there would be no differences if we had involved more users than the limited number we actually interviewed. Moreover, as user communities can change, there may also be other values which we were not able to identify in our groups. Nevertheless, what we can claim based on our findings is that the motivational values identified in our cases are good starting points for investigating how they can be supported by means of technology, as presented in the following.
3.3 Technical Investigations

As part of the VSD methodology, technical investigations involve activities in which designers bring state-of-the-art knowledge to bear on design specifications that might be used to realize given values within the context of a design project. Accordingly, we will consider the state-of-the-art knowledge on technical mechanisms or design features and suggest some ways of supporting and facilitating the motivational values in our contexts. The suggested features may have already been realized and tested in existing prototypes, or they may be untested ideas and thus of a hypothetical nature. Whether their implementations support the motivational values remains a question for evaluation after implementation, an issue which is not addressed in this paper. Table 1 summarizes the motivational values identified in both cases as well as the design features identified in our literature analysis. Some of the features may be case-specific, whereas others can be applied in both cases. Due to the limitation of space, we will not discuss their appropriateness for each case independently and in detail. We should also note that some of the features mentioned can simultaneously support multiple motivational values, such as self-benefit and community benefits.

Facilitating Reputation Building: A number of mechanisms have been suggested that can motivate people based on the value of reputation. One design feature that promotes contributions is visibility to the community. The contributor can be identified by a login name. This visibility offers contributors recognition that adds to their social presence online (Preece & Shneiderman, 2009). This has been observed to motivate tagging on Flickr (Ames & Naaman, 2007; Nov et al., 2008) and to increase editing contributions on Wikipedia (Nov, 2007), in turn creating a growing reputation (Farzan et al., 2008). A way to increase credibility was observed in the community of Wikipedia (Forte et al., 2008): that members invest more of themselves in the community can be seen through their presence on multiple discussion channels, such as discussion pages, meta-pages, or mailing lists. Registered contributors to Wikipedia develop online identities in order to be respected, trusted, and appreciated by their peers (Kuznetsov, 2006). A reputable identity is rewarding, as it signifies success and accomplishment. Registered users can develop elaborate online profiles on their User Pages. Many Wikipedians include links to articles they have previously worked on, which allows users to learn quickly about each other's interests and level of expertise. Furthermore, users can nominate and award contributors for distinguished work. Outstanding articles can be listed as “Featured Articles” on the front page of Wikipedia; similarly, useful portals can be listed as “Featured Portals”. Users who contributed to Featured Articles and Featured Portals acquired a respectable reputation, as their work was rewarded by the community. Evaluating the quantity and quality of the contributions of others, as well as celebrating status, seems to be important. It has been argued that relative rankings of contributions strongly motivate contributions to information repositories (Cheshire and Antin, 2008). Some systems provide a way for people to recognize and evaluate another's contribution.
For example, eBay's rating system allows purchasers to rate vendors according to the condition of the goods purchased, the timeliness of delivery, the quality of the purchase, and so on (Cheng and Vassileva, 2006). Another example
was introduced by Farzan et al. (2008): a point-based reward system within a social-network system. While the designed mechanism motivated participants to contribute more to a social network site, they pointed out that it is important to analyze, for each individual social network, which action (e.g., posting a photo, making a comment, etc.) leads to which number of points. As their study showed, decay functionality, as well as the opportunity to adjust the point system to the users' behavior on the site, are crucial. Variations on this theme involve rating people's ratings, awarding points, or rewarding contributions with money (Kollock, 1999; Hars and Qu, 2002; Hummel et al., 2005; Cheng & Vassileva, 2006; Farzan et al., 2006). Flickr (http://www.flickr.com) addresses user reputation by highlighting specific content, e.g., “the most interesting photos” (Wasko et al., 2005). Reality shows, talent competitions, and YouTube, blog, and Flickr posts of pictures are all manifestations of the need to be noticed. Thus, recognizing and rewarding contributions and, in so doing, enabling the contributors to stand out are techniques used by researchers and designers to encourage online contributions (Preece & Shneiderman, 2009).

Table 1. Design features for supporting motivational values
Motivational Values      Supportive Design Features/Mechanisms
Reputation(-building)    Visibility to the community; Multiple channels;
                         Building reputable online identities;
                         Point and status reward systems
Self-benefit             Feedback through rating of actions/choices;
                         Explaining self-benefits
Self-development         Rewarding through access to extra information;
                         Incenting by tagging awareness
Community                Promoting reciprocity; Explaining community benefits;
                         Informing about the beneficiaries of contributions;
                         Incenting by goal-setting; Rewarding cooperative behavior;
                         Social comparison through visualization of contributions
Personal Enjoyment       Integrating fun features; Packaging the task as a game
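As a purely hypothetical illustration of the “point and status reward systems” feature, the following sketch mimics the spirit of the decaying point scheme discussed above (Farzan et al., 2008); the action names, point values, and decay factor are invented for illustration and are not taken from the cited study.

```python
# Hypothetical point-based reward scheme with decay; all numbers invented.
ACTION_POINTS = {'post_photo': 10, 'comment': 3, 'rate': 1}

def update_score(score, actions, decay=0.95):
    """Decay the previous score, then add points for the day's actions."""
    return score * decay + sum(ACTION_POINTS.get(a, 0) for a in actions)

score = 0.0
for day_actions in [['post_photo', 'comment'], [], ['rate']]:
    score = update_score(score, day_actions)
print(round(score, 2))  # 12.73: contributions fade unless the user stays active
```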
Facilitating Self-benefits: There are also mechanisms that can support the users' need for self-benefit. Hars and Qu (2002) showed that the personal need for a software solution is a key factor. Some approaches motivate users to participate by turning their feedback into an activity that is important and meaningful to them. For example, Farzan and Brusilovsky (2006) utilized student ratings of courses in a course recommendation system to show the students' progress towards their career goals. This approach assumed that the main goal of students is to take courses that help them find an interesting career in the future. Rating the relevance of courses enables students to observe their progress towards each of their career goals (Farzan et al., 2008).
There are also studies arguing that explaining the self-benefits of an activity can motivate users to perform it. Beenen et al. (2004) applied this principle to the MovieLens system to address the problem of under-contribution. They studied the effect of revealing to users the uniqueness and benefit of their contributions. Their results showed that users were more likely to participate when they were reminded about their own benefit and the benefit of others (Farzan et al., 2008). Also using the MovieLens system, Rashid et al. (2006) studied the effect of identifying the beneficiary of a user's contribution. Their results suggested that how much the individual identifies with and likes the group correlates with the user's contribution level to the community. Thus, we conclude that the system should provide immediate feedback to the users with respect to the benefits or other positive effects of their contributions for their self-interests.

Facilitating Self-development: Reward mechanisms have been built into several systems to support the users' desire to receive something for their contribution, which promotes their self-development (Farzan et al., 2008). For example, the Comtella system (Bretzke & Vassileva, 2003) rewards more cooperative users with incentives such as greater bandwidth for downloading or higher visibility in the community. Hummel et al. (2005) showed that announcing extra personal access to specific information as a reward for active participation triggered increases during their experimentation; participants continued to contribute after the reward was withdrawn. In addition, Thom-Santelli et al. (2008) suggested a mechanism for higher-quality tag recommendation in relation to the users' roles. They pointed out that users need to be able to see the tags they have used within and across systems, and that the current visualizations of one's tags and of the body of tags within a system (e.g., tag clouds) are not appropriate for meeting this need.

Facilitating Community Building: Wasko and Faraj (2005) argued that giving something in return to the community for its help was by far the most cited reason why people participated. Reciprocity facilitates the creation of a community. For example, Kuznetsov (2006) showed that Wikipedia creates a community of contributors, which is subdivided into smaller spheres that unite people by area of interest, background, age, political opinion, etc. The community fosters a motivation to contribute by sharing information and thus helping the collective to which one belongs. The wiki technology entails many tools, such as “Community Portals”, “Collaborations”, and discussion pages, which encourage Wikipedians to work together, thereby causing them to meet other members with similar interests. Through this cooperation, Wikipedians develop a connection to other contributors and begin to feel needed by the Wikipedia community.

Several studies emphasize the significance of explaining the community benefit for motivation (Rashid et al., 2006; Farzan et al., 2008). According to the “collective effort” model, people are more likely to work hard if they feel their contribution is important or identifiable to the group (Ling et al. 2005). Ling et al. showed that email messages explaining the value of a contribution caused members to contribute less compared to those whose messages did not mention value at all. In the context of the MovieLens system, Beenen et al. (2004) showed that users are more likely to participate when they are reminded about their benefit and the benefit of others. Thus, a system should provide information about the effects or benefits of a contribution for the community while users annotate documents or information.
Rashid et al. (2006) found that how much the individual identifies with and likes the group correlates with the user's contribution level to the community. They suggest that designers can use information about the beneficiaries of contributions to create subtle and integrated messages to increase motivation. Some studies emphasized the relevance of goal-setting for motivating contribution in online communities. For example, Beenen et al. (2004) conducted an experiment in MovieLens, and their results showed that specific, challenging goals resulted in a higher number of ratings and that group goals stimulated higher contribution than individual goals did. Beenen et al. (2004) concluded that designers should be more specific about assigning goals, or should provide opportunities for individuals to declare contribution goals for themselves. An application of goal-setting theory can be observed in social-networking sites such as LinkedIn (http://www.linkedin.com), which provides information about how complete a user's profile is (see Farzan et al., 2008). Another approach addressing the users' desire to share information with the community has been realized in the context of Comtella. The system allows the members of an online community to share web resources amongst each other and rewards more cooperative users (Bretzke and Vassileva, 2003; Cheng and Vassileva, 2006). Motivating social comparison in the quality of contributions, comments, and ratings is an interesting approach. Vassileva and Sun (2007) showed that visualization effectively encouraged social comparison and competition, which resulted in increased participation. This means that designers can encourage user participation in desired activities by showing a representation of the community members' contributions along these activities. Finally, there are many other factors that may have a motivating effect, for example a welcoming environment, safety, support for newcomers, and contacts to ask questions (Preece & Shneiderman 2009), as well as telling participants that their contributions are valued because of their expertise (Ling et al., 2005).

Facilitating Personal Enjoyment: Personal enjoyment or fun has been viewed as a separate design space. According to Shneiderman (2004), designers must address three important goals that contribute to fun-in-doing: (1) provide the right functions so that users can accomplish their goals, (2) offer usability plus reliability to prevent frustration from undermining the fun, and (3) engage users with fun features. Fun features such as alluring metaphors, compelling content, attractive graphics, appealing animations, and satisfying sounds can promote user engagement. Game approaches are the most popular means of motivating people with fun and intellectual challenge as the predominant user experience. Games package a task as a game in order to use the computational power of humans (von Ahn, 2006). They are significant features of a system that motivates people. Players have the incentive of playing a game that is competitive and entertaining and at the same time produces useful semantic annotations. The task of annotation is thereby well hidden behind a motivating concept and an extremely simple interface. There are different options for designing incentive-driven game tools. Example games are described in (Siorpaes & Hepp, 2008), including games for the construction of ontologies and the semantic annotation of data (OntoTube and OntoBay).
For example, OntoTube is a two-player quiz game for annotating videos. Both players have to answer questions about the video, and the answers are used for data generation.
4 Conclusion

In this paper, we have presented the results of our research aiming at the value sensitive design of semantic web applications in order to motivate people to contribute to semantic web content. Adopting the VSD framework, which distinguishes three levels of investigation, we started with empirical investigations to understand what motivates people in two application contexts. We then suggested design features that would support those values and thus motivate users. Our study contributes to current research and practice by illustrating how the VSD framework can be employed for investigating motivational values and by providing a review of design features to support end users' motivation to contribute to public goods. Researchers may use the VSD methodology in the way advocated here to conduct further research. Practitioners may implement the design features suggested here and evaluate whether they support motivational values as claimed in this paper. Yet, the research also has some limitations, due to the limited number of participants involved in both empirical studies. Thus, we cannot claim with certainty whether values other than those identified here play a role in these contexts. In addition, as user communities can change, there may also be emerging values. Finally, our claims that the suggested design features may support the motivational values are hypothetical in nature; the features need to be implemented and tested in both application contexts. This will be one of our future research efforts. All in all, we conclude that incentives enable and motivate a particular course of action. Understanding the motivational values present in the application contexts is a good starting point for designing systems that motivate.

Acknowledgment. This work has been supported by the EU-funded project INSEMTIVES - Incentives for Semantics (www.insemtives.eu, FP7-ICT-2007-3, Contract Number 231181).
References

1. Ames, M., Naaman, M.: Why we tag: motivations for annotation in mobile and online media. In: Proceedings of the 25th Annual ACM Conference on Human Factors in Computing Systems (2007)
2. Batson, C.D., Ahmad, N., Tsang, J.: Four motives for community involvement. Journal of Social Issues 58(3), 429–445 (2002)
3. Beenen, G., Ling, K., Wang, X., Chang, K., Frankowski, D., Resnick, P., Kraut, R.E.: Using Social Psychology to Motivate Contributions to Online Communities. In: Proceedings of CSCW 2004, pp. 212–221. ACM Press, New York (2004)
4. Bretzke, H., Vassileva, J.: Motivating Cooperation in Peer to Peer Networks. In: Brusilovsky, P., Corbett, A.T., de Rosis, F. (eds.) UM 2003. LNCS, vol. 2702, pp. 218–227. Springer, Heidelberg (2003)
5. Cheng, R., Vassileva, J.: Design and evaluation of an adaptive incentive mechanism for sustained educational online communities. User Modeling and User-Adapted Interaction 16(3-4), 321–348 (2006)
6. Cheshire, C., Antin, J.: The social psychological effects of feedback on the production of Internet information pools. Journal of Computer-Mediated Communication (13), 705–725 (2008)
7. Cuel, R., Morozova, O., Rohde, M., Simperl, E., Siorpaes, K., Tokarchuk, O., Wiedenhoefer, T., Yetim, F., Zamarian, M.: Motivation Mechanisms for Participation in Human-driven Semantic Content Creation. International Journal of Knowledge Engineering and Data Mining 1(4), 331–349 (2011)
8. Friedman, B., Kahn, P., Borning, A.: Value Sensitive Design and Information Systems. In: Zhang, P., Galletta, D. (eds.) Human-Computer Interaction and Management Information Systems: Foundations, pp. 348–372. M.E. Sharpe, New York (2006)
9. Fang, Y., Neufeld, D.: Understanding Sustained Participation in Open Source Software Projects. Journal of Management Information Systems 25(4), 9–50 (2009)
10. Farzan, R., Brusilovsky, P.: Social navigation support in a course recommendation system. In: Wade, V., Ashman, H., Smyth, B. (eds.) AH 2006. LNCS, vol. 4018, pp. 91–100. Springer, Heidelberg (2006)
11. Farzan, R., DiMicco, J., Brownholtz, B., Dugan, C., Geyer, W., Millen, D.R.: Results from deploying a participation incentive mechanism within the enterprise. In: Proceedings of the CHI 2008 Conference on Human Factors in Computing Systems (2008)
12. Forte, A., Bruckman, A.: Why do people write for Wikipedia? Incentives to contribute to open-content publishing. In: Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS) (2008)
13. Hars, A., Qu, S.: Working for free - Motivations for participating in open-source projects. International Journal of Electronic Commerce (6), 25–39 (2002)
14. Hinchey, M.G., Sterritt, R., Rouff, C.: Swarms and Swarm Intelligence. IEEE Computer 40(4), 111–113 (2007)
15. Howe, J.: Crowdsourcing. Why the Power of the Crowd is Driving the Future of Business. Crown Business Publishing, New York (2008)
16. Hummel, H.G.K., Burgos, D., Tattersall, C., Brouns, F., Kurvers, H., Koper, R.: Encouraging contributions in learning networks using incentive mechanisms. Journal of Computer Assisted Learning (21), 355–365 (2005)
17. Kollock, P.: The economies of online cooperation: gifts and public goods in cyberspace. In: Smith, M.A., Kollock, P. (eds.) Communities in Cyberspace. Routledge, London (1999)
18. Kuznetsov, S.: Motivations of contributors to Wikipedia. ACM SIGCAS Computers and Society 36(2), Article 1 (2006)
19. Le Dantec, C.A., Poole, E.S., Wyche, S.P.: Values as Lived Experience: Evolving Value Sensitive Design in Support of Value Discovery. In: Proceedings of CHI 2009, Boston, MA, USA, pp. 1141–1150 (2009)
20. Ling, K., Beenen, G., Ludford, P., Wang, X., Chang, K., Li, X., Cosley, D., Frankowski, D., Terveen, L., Rashid, A.M., Resnick, P., Kraut, R.: Using social psychology to motivate contributions to online communities. JCMC 10(4) (2005)
21. Nov, O.: What motivates Wikipedians? Communications of the ACM 50(11), 60–64 (2007)
22. Nov, O., Naaman, M., Ye, C.: What drives content tagging: The case of photos on Flickr. In: Proceedings of the 26th Annual ACM Conference on Human Factors in Computing Systems, pp. 1097–1100 (2008)
23. Oreg, S., Nov, O.: Exploring motivations for contributing to open source initiatives: The roles of contribution context and personal values. Computers in Human Behavior 24(5), 2055–2073 (2008)
268
F. Yetim, T. Wiedenhoefer, and M. Rohde
24. Preece, J., Shneiderman, B.: The Reader-to-Leader Framework: Motivating TechnologyMediated Social Participation. AIS Transactions on Human-Computer Interaction 1(1), 13–32 (2009) 25. Rashid, A.M., Ling, K., Tassone, R.D., Resnick, P., Kraut, R., Reidl, J.: Motivating participation by displaying the value of contribution. In: Proceedings of CHI 2006 Conference on Human Factors in Computing Systems, pp. 955–958 (2006) 26. Shneiderman, B.: Designing for Fun: How can we design user interfaces to be more fun? Interactions 11(5), 48–50 (2004) 27. Siorpaes, K., Hepp, M.: Games with a Purpose for the Semantic Web. IEEE Intelligent Systems 23(3), 50–60 (2008) 28. Thom-Santelli, J., Muller, M.J., Millen, D.R.: Social tagging roles: publishers, evangelists, leaders. In: Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, pp. 1041–1044. ACM, New York (2008) 29. von Ahn, L.: Games With A Purpose. IEEE Computer 39(6), 96–98 (2006) 30. Vassileva, J., Sun, L.: Using Community Visualization to Stimulate Participation in Online Communities. e-Service Journal 6(1), 3–39 (2007) 31. Wagner, C., Prasarnphanich, P.: Innovating collaborative content creation: The role of altruism and wiki technology. In: Proceedings of the 40th Annual Hawaii International Conference on System Sciences, pp. 18–27 (2007) 32. Wasko, M., Faraj, S.: Why Should I Share? Examining Social Capital and Knowledge Contribution in Electronic Networks of Practice. MIS Quarterly 29(1), 35–57 (2005) 33. Yetim, F.: Bringing Discourse Ethics to Value Sensitive Design: Pathways to Toward a Deliberative Future. AIS Transactions on Human-Computer Interaction 3(2), 133–155 (2011a) 34. Yetim, F.: Focusing on Values in Information Systems Development: A Critical Review of three Methodological Frameworks. In: Proceedings of the International Conference “Wirtschaftsinformatik”, Zurich, Switzerland, February 16-18, pp. 1197–1204 (2011b) 35. Zhang, P.: Motivational affordances: Reasons for ICT Design and USE. Communications of the ACM 61(11), 145–147 (2008)
A Bounded Confidence Approach to Understanding User Participation in Peer Production Systems

Giovanni Luca Ciampaglia
[email protected]
http://www.inf.usi.ch/phd/ciampaglia
Abstract. Commons-based peer production seems to rest upon a paradox. Although users produce all content, participation is commonly voluntary and largely incentivized by the achievement of the project's goals. This means that users have to coordinate their actions and goals in order to keep themselves from leaving. While this situation is easily explainable for small groups of highly committed, like-minded individuals, little is known about large-scale, heterogeneous projects such as Wikipedia. In this contribution we present a model of peer production in a large online community. The model features a dynamic population of bounded confidence users and an endogenous process of user departure. Using global sensitivity analysis, we identify the most important parameters affecting the lifespan of user participation. We find that the model presents two distinct regimes, and that the shift between them is governed by the bounded confidence parameter. For low values of this parameter, users depart almost immediately. For high values, however, the model produces a bimodal distribution of user lifespans. These results suggest that user participation in online communities could be explained in terms of group consensus, and provide a novel connection between models of opinion dynamics and commons-based peer production.
1 Introduction
In the past decade mass collaboration platforms have become common in several production contexts. The term commons-based peer production has been coined to refer to a broad range of collaborative systems, such as those used for producing software, sharing digital content, and organizing large knowledge repositories. These systems, however, seem to be based upon a paradox. In wikis, there is a link between quality and cooperation [30], but, at the same time, contribution is voluntary, based on non-monetary incentives [23,26]. For small teams, this might not be a problem. In large scale wikis, where low access barriers are necessary to attract vast masses of contributors [8], and where expert users play a crucial role in maintenance and governance [2], user retention becomes instead crucial [10].
An established fact about participation in online groups is the preferential behavior of users, that is, a newcomer's long-term participation can be predicted by the outcome of his or her early interactions [1,21]. This could be explained in terms of socialization theory [6], as users assess the willingness of the community to accept them and vice versa. It is also true, however, that quality assessment of the produced contents, and in particular comparison of the objectives of an individual with those of the community, is important in determining user participation [17]. This could be explained as a form of day-to-day coordination or group consensus taking place among editors [15].

In this paper we study user participation as a collective social phenomenon [4]. Other models of peer production have been proposed already, for example for social information filtering platforms [13]. Here, we draw specifically from the modeling work on social influence under bounded confidence [9,12]. Let us consider a community of users engaged in editing a collection of pages, e.g. Wikipedia. Pages are characterized by a certain number of features upon which users can find themselves in agreement or not. For example, let us consider the writing style of pages. Users try to modify pages according to their objectives, i.e. using their own style. At the same time, by interacting with contents, users can also be influenced by the style of other users. This reciprocal influence, however, happens only to a certain extent, that is, only when user and page (that is, their styles) are similar enough. Vandals, to illustrate with the same example, might not be interested in learning the encyclopedic writing style. In the context of social psychology this phenomenon is known as bounded confidence, and is regarded as a general feature of human communication within groups that try to reach consensus [12]. It can also be seen as a form of herding, in that people are influenced by the social context they are in [22]. The population of users in our model is dynamic, with user departure determined endogenously by the social influence process. Although others have already extended Deffuant's model to a dynamic population [3], here we explicitly link the process of social influence to user participation. We implemented these ideas in an agent-based model of a peer production system. In this model, several factors affect the behavior of agents, such as user activity, content popularity, and community growth. To understand what factors are truly important for the resulting dynamics of user participation, we performed a factor screening using global sensitivity analysis.
1.1 Related Work
The subject of user participation in mass collaboration systems has already been touched upon by several authors, for example on social networking sites [16] and knowledge sharing platforms [31]. A "momentum" law has been proposed for the distribution of the life edits of inactive users [29]. The distribution of user account lifespans has been shown to decay with a heavy tail, and a power-law model has been proposed after this observation [11]. Empirical data from Wikipedia, however, seem to support a superposition of different regimes [7]; a feature of the model we present here is indeed a bimodal distribution of user
lifespans. In the context of wikis and other free open source initiatives, some authors have used survival analysis to outline the differences between communities [20], but this modeling technique is not suited to understanding the connection between social influence, group coordination, and user retention. We advocate the need to model such processes explicitly. The paper is organized as follows: in Sec. 2 we introduce our model of peer production; in Sec. 3 we briefly describe global sensitivity analysis and Gaussian Processes, the two statistical techniques we used for the factor screening study; in Sec. 4 we present our main results, and we discuss them in Sec. 5.
2 An Agent-Based Model of Commons-Based Peer Production
In this section we introduce our model of peer production. While we make explicit use of the terminology of wiki platforms (e.g. "users" who "edit pages"), we stress that ours is a general model of consensus building in a dynamic bipartite population, and not merely a description of a wiki platform. We also stress that in our model the state of an agent need not represent an opinion in the classic sense of other studies of opinion dynamics, i.e. extremes of the spectrum do not necessarily denote – say – political extremism, nor do we speak of "moderates" to identify the center of the opinion space. To keep things simple, we consider only the unidimensional case, i.e. the state of an agent is a scalar in the interval [0, 1]. We denote with x(t) the state of a generic user at time t and with y(t) the state of a generic page.

The interaction rule between a user and a page captures the dynamics of social influence. Let us imagine that at time t a user edits a page. Let μ ∈ [0, 1/2] be the speed (or uncertainty) parameter and ε ∈ [0, 1] the confidence [18]. If |x(t) − y(t)| < ε, then:

x(t) ← x(t) + μ (y(t) − x(t))   (1)
y(t) ← y(t) + μ (x(t) − y(t))   (2)

else, if |x(t) − y(t)| ≥ ε, we allow only Eq. (2) to take place, with probability r. This addition to the bounded confidence averaging rule reflects the fact that, in peer production systems, users often deal with content they do not agree with without being influenced by it, as when a vandalized page is reverted to a previous, non-vandalized revision (also known as rollback).

Different pages can reflect different topics and hence receive attention from users based on their popularity. We employ a simple reinforcement mechanism to model this. Let cp ≥ 0 be a constant. If mt is the number of edits a page has received up to time t, then the probability of it being selected at that time will be proportional to mt + cp. When cp → ∞, pages are chosen for editing with uniform probability, regardless of the number of edits they have received. Hence, we can study the impact of content popularity on user participation by setting cp to a small or a large value. Of course, users do not always choose to edit an existing page. Sometimes, a user can decide to create a new page. We model this by considering a rate of new page creations ρp. Whenever a new page is created, its state y is equal to the state x of its creator. Creators are chosen at random among existing users.

In order to model user participation, the population of users is dynamic. First, we consider an input rate of new users ρu, whose state is chosen at random within the interval [0, 1]. Second, we consider an inhomogeneous departure rate that depends on the experience of users. Let us consider a generic user at time t and let us denote with nt the number of edits he (or she) did up to t, and with st the number of these edits that resulted in the application of Eq. (1). Let cs ≥ 0 be a constant and r(t) be the ratio
r(t) = (st + cs) / (nt + cs)   (3)

The rate of departure λd(t) is then defined as:

λd(t) = r(t)/τ0 + (1 − r(t))/τ1   (4)

with τ0, τ1 time-scale parameters. Depending on the value of r(t), the expected lifetime τ will interpolate between two values: τ = τ0 (long lifetime) for r(t) = 1, and τ = τ1 (short lifetime) for r(t) = 0. If cs → ∞, we recover a homogeneous process with rate τ0⁻¹, so we can set cs to control how sensitive the departure rate is to unsuccessful interactions.
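To make the mechanics concrete, the following is a minimal Python sketch of the ingredients above: the bounded confidence edit rule of Eqs. (1)-(2) with the rollback extension, the preferential page selection, and the departure rate of Eqs. (3)-(4). Parameter values and all names are illustrative choices of ours, not the paper's simulator.

import random

EPS, MU, R_PROB = 0.2, 0.3, 0.5        # confidence, speed, rollback probability
C_S, TAU0, TAU1 = 10.0, 100.0, 1 / 24  # successes constant; long/short lifetimes (days)

def interact(x, y):
    """Bounded confidence edit; returns (new_x, new_y, success)."""
    if abs(x - y) < EPS:
        return x + MU * (y - x), y + MU * (x - y), True
    # disagreement: only the page may move (Eq. (2)), with probability r,
    # e.g. a rollback of a vandalized revision
    if random.random() < R_PROB:
        y = y + MU * (x - y)
    return x, y, False

def pick_page(edit_counts, c_p=1.0):
    """Preferential selection: page i is chosen with probability proportional to m_i + c_p."""
    return random.choices(range(len(edit_counts)),
                          weights=[m + c_p for m in edit_counts])[0]

def departure_rate(n_edits, n_successes):
    """Eqs. (3)-(4): inhomogeneous departure rate of a user."""
    r = (n_successes + C_S) / (n_edits + C_S)
    return r / TAU0 + (1 - r) / TAU1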
3 Evaluation Methods

3.1 Computer Code Emulation via Gaussian Processes
Although we can perform the statistical evaluation of our peer production model using the computer simulator directly, this approach is not desirable, as evaluation of the computer code can be quite time consuming. We rely instead on emulation of the computer code output. We use a Gaussian Process (GP) as a surrogate model of the average lifetime τ of users in our peer production system. Gaussian processes (or Gaussian Random Functions, GRF) are a supervised learning technique used for functional approximation of smooth surfaces and for prediction purposes: see [25] for the application of GP to computer code evaluation. Given input sites Θobs = (θ1, θ2, ..., θN) we can evaluate our model as specified above, and obtain observations of the average user lifetime Tobs = (τ1, τ2, ..., τN). Based on these observations, we wish to predict the value of τ at an untested input site θ, i.e. τ(θ). With a GP, this value is τ̂(θ) = E[τ(θ) | Θobs]; the uncertainty in the prediction, that is, Var[τ̂(θ)], is equal to Var[τ(θ) | Θobs]. With it we can compute a confidence interval that characterizes the uncertainty of the prediction of τ based on the training data (Θobs, Tobs).
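As an illustration, a GP emulator of this kind can be fitted in a few lines. The sketch below uses the modern scikit-learn API as a stand-in for the 2011 SciKits toolkit cited later; the training data are placeholders.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

Theta_obs = np.random.rand(50, 10)   # input sites (N x d), placeholder
T_obs = np.random.rand(50)           # observed mean lifetimes, placeholder

kernel = ConstantKernel() * RBF(length_scale=np.ones(10))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(Theta_obs, T_obs)

theta_new = np.random.rand(5, 10)
tau_hat, tau_std = gp.predict(theta_new, return_std=True)  # prediction and its uncertainty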
There are several strategies for selecting the input sites Θobs at which we will run our computer simulator. Here we choose to employ a uniform, space-filling design generated via Latin Hypercube Sampling (LHS) because it yields better error bounds than those produced with uniform random sampling [19]. The space-filling requirement is attained using a maximin design. A maximin design is any collection of points Θ that maximizes the minimum distance between points: max_Θ min_{i<j} ‖θi − θj‖.
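A sketch of this selection procedure, under the assumption that one simply draws many random Latin hypercubes and keeps the best one (as done in Subsec. 4.3); SciPy's qmc module is our choice here, not the paper's.

import numpy as np
from scipy.stats import qmc
from scipy.spatial.distance import pdist

def maximin_lhd(n=50, d=10, n_candidates=10_000, seed=0):
    """Return the LHD with the largest minimum pairwise distance."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        design = qmc.LatinHypercube(d=d, seed=rng).random(n)
        score = pdist(design).min()   # minimum pairwise distance
        if score > best_score:
            best, best_score = design, score
    return best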
3.2 Global Sensitivity Analysis
A computational or mathematical model usually comprises a number of parameters, or factors, which are meant to affect in some way its output, or response. Hence, in general, a model can be thought of as a mapping between factors (input) and responses (output). One might be interested in the problem of quantifying how much output "variability" in this mapping can be apportioned to each of the inputs. Global Sensitivity Analysis (GSA) is a set of statistical techniques used to answer this problem. See [24] for a primer on GSA. One application of GSA is factor screening. The ranking of parameters is usually done by computing the sensitivity indices of each input parameter (factor). There are various techniques for computing the sensitivity indices, each with its own properties and assumptions. In this study we computed sensitivity indices by decomposing the output variance of our surrogate model. We used other techniques as well, namely partial correlation coefficients and standardized regression coefficients, and they gave concordant results. We choose to report here only the results of the decomposition of variance because it applies more naturally to non-linear models like ours.

The method we use was proposed by Sobol and is based on the analysis of variance (ANOVA) [28]. The idea is to decompose the variance of the output into several components that are attributable to independent factors, in our case the parameters of the model. Let us assume that the space of parameters is [0, 1]^d, where d is the number of parameters. Sobol proposes to write the output Y as:

Y(θ1, ..., θd) = Y0 + Σ_{i=1}^{d} Yi(θi) + Σ_{1≤i<j≤d} Yi,j(θi, θj) + ··· + Y1,2,...,d(θ1, θ2, ..., θd)   (5)

and shows that this decomposition is unique under the assumption that the components are orthogonal and have zero mean. In Eq. (5), Y0 = E[Y], Yi(θi) is the main effect of parameter θi, Yi,j(θi, θj) is the 2-way interaction effect between the i-th and j-th parameters (i ≠ j), and so on. Each summand is computable from suitable integrals. For example, the main effect Yi is:

Yi(θi) = ∫0^1 ··· ∫0^1 Y(θ1, ..., θd) dθ¬i − Y0   (6)
where with θ¬i we mean the reduced parameter vector obtained by considering all parameters except θi. Similar formulas can be obtained for higher order effects. Let us now consider the variances of the summands of Eq. (5). We can decompose σ², the total variance of Y, as:

σ² = Σ_{i=1}^{d} σi² + Σ_{1≤i<j≤d} σi,j² + ··· + σ²1,2,...,d   (7)

The sensitivity indices proposed by Sobol are obtained by standardizing all summands of Eq. (7), obtaining:

1 = Σ_{i=1}^{d} Mi + Σ_{1≤i<j≤d} Ci,j + ··· + C1,2,...,d   (8)

Mi is the main sensitivity index of parameter θi, Ci,j is the two-way interaction index between θi and θj, etc. Two quantities are of interest for assessing the importance of a parameter: the already cited main sensitivity index Mi, and the total interaction index Ti, which is defined as the sum of all terms that involve parameter θi:

Ti = Σ_{j≠i} Ci,j + Σ_{j<k; j,k≠i} Ci,j,k + ··· + C1,2,...,d   (9)
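The paper estimates these indices with the Winding Stairs scheme (see Subsec. 4.3). Purely as an illustration of Eqs. (8)-(9), the sketch below uses the more common Saltelli/Jansen pick-and-freeze Monte Carlo estimators instead; f stands for the (emulated) model and all names are ours.

import numpy as np

def sobol_indices(f, d, n=10_000, seed=0):
    """Monte Carlo estimates of main (M_i) and total (T_i) indices."""
    rng = np.random.default_rng(seed)
    A, B = rng.random((n, d)), rng.random((n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    M, T = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]            # replace only the i-th column of A
        fABi = f(ABi)
        M[i] = np.mean(fB * (fABi - fA)) / var        # main effect index
        T[i] = 0.5 * np.mean((fA - fABi) ** 2) / var  # total effect index
    return M, T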
4 Results

4.1 Simulation Scenario
Table 1 lists all parameters of the model, together with simulation settings. Two quantities have been held fixed: the simulation time and the transient time. Two other parameters, the initial number of users Nu and the initial number of pages Np, are determined after a transient, see Subsec. 4.2 below. All remaining parameters, instead, were assigned a range of values. To sum up, we had an input space of 10 independent dimensions. We chose the long (τ0) and short (τ1) user lifetimes to range in non-overlapping intervals corresponding to different time scales, consistently with empirical observations of user participation from Wikipedia [7]. The value of the simulation time T was chosen so that a simulation would comprise more than one generation of long-term users. Intervals for event rates such as the daily rate of edits (λe), of new user arrivals (ρu), and of new page creations (ρp), were chosen looking at plausible values from the public statistics on the Wikipedia project.¹ These parameters have a strong influence on simulation time, therefore ranges for them were set trying to strike a balance between exhaustiveness of the sensitivity analysis and simulation wall clock time.

¹ These statistics are freely available at http://stats.wikimedia.org
Table 1. Parameter settings for global sensitivity analysis

Parameter             Variable name   Symbol  Value(s)   Unit    Distribution
Const. popularity     const pop       cp      (0, 100)   -       uniform
Const. successes      const succ      cs      (0, 100)   -       uniform
Confidence            confidence      ε       (0, 1/2)   -       uniform
Daily edit rate       daily edits     λe      (1, 20)    1/day   uniform
Daily rate of pages   daily pages     ρp      (1, 20)    1/day   uniform
Daily rate of users   daily users     ρu      (1, 20)    1/day   uniform
Initial no. of pages  -               Np      -          -       see Subsec. 4.2
Initial no. of users  -               Nu      -          -       see Subsec. 4.2
Long lifetime         long life       τ0      (10, 100)  day     uniform
Rollback probability  rollback prob   r       (0, 1)     -       uniform
Short lifetime        short life      τ1      (1/24, 1)  day     uniform
Simulation time       -               T       1          year    -
Speed                 speed           μ       (0, 1/2)   -       uniform
Transient time        -               T0      2          year    -
The choice of ranges for the constant popularity term (cp) and for the constant successes term (cs) was a bit more problematic. To our knowledge, neither of them has ever been studied before in the context of peer production communities. We settled for ranges we deemed would be large enough for our purposes. Finally, consider the opinion dynamics parameters. It is clear that μ < 1/2. Regarding the confidence ε, the literature on bounded confidence models in one dimension suggests that for ε > 1/2 the dynamics of consensus does not change noticeably. This should apply also to the dynamics of user participation in our model. We ran some simulations of the average lifetime, and found confirmation of this intuition. We thus restricted ε to the interval (0, 1/2).
4.2 Transient
Transient duration T0 was determined empirically: we plotted the daily number of users Nu(d; θ), d = 1, 2, ..., for various values of the parameters θ and chose T0 as the time after which all curves look stationary. Figure 1a reports the results of this exercise. In the figure, the shaded region corresponds to the transient interval (0, T0). The value of T0 is 730 days. The values of θ were taken from a maximin LHD with 50 points. Each curve is scaled by its average value N̄u(θ) computed over the interval d ∈ [731, 1095]. The yellow solid line is a B-spline fit of 50 evenly spaced observations of the expected scaled number of users Nu/N̄u, and serves as a guide for the eye. During the transient phase we did not record any data, so that the estimation of τ, on which our sensitivity analysis is based, did not reflect the dynamics of opinion formation during the transient.
Fig. 1. (a): transient time determination (scaled number of users Nu/N̄u versus time in days; the shaded region marks the transient). (b): main effects plot (main effect versus scaled parameter value, one curve per parameter).
4.3 Factor Screening via Global Sensitivity Analysis
We sampled a maximin Latin Hypercube Design (LHD) with 50 points using the intervals listed in Tab. 1. To sample a decent maximin design, we generated 10⁴ hypercubes at random and selected the one that maximized the maximin criterion of Subsec. 3.1. We computed the average user lifetime τ(θ) by running 10 replications for each θ and averaging the values obtained. We first plotted the values of the response variable τ versus each input parameter to check visually for any linear trend. Scatter plots are shown in Fig. 2. A multiple linear regression gave a coefficient of determination R² = 0.83. However, no clear trend emerges from the plots for all parameters except for the confidence ε and the long lifetime τ0. For the latter, something similar to a linear trend can be seen, whereas for the former the relationship looks more of a sigmoidal type. We tried fitting a sigmoid function to τ as a function of ε. The result of a K-S test (p-value < 3.5 × 10⁻⁴) rejected the normality of the residuals, and therefore led us to exclude a sigmoid model as a possible functional form of τ(ε). Next, we fitted a GP emulator to the average user lifetime data, using the open source machine learning toolkit from the SciKits collection.² We then discarded the simulator and used τ̂(θ) in lieu of it. To compute the sensitivity indices we used the Winding Stairs (WS) method, a resampling technique proposed in [14]. We computed main (Mi) and total interaction (Ti) effect indices for each parameter (i = 1 ... 10) using a WS matrix W with 10⁴ rows. The results are shown in Tab. 2. The total variance σ̂² was also computed from W (each column of a WS matrix is an independent sample). The WS method yields better estimates of the total interaction effects than other methods [5], so we impute the presence of some slightly negative values of Mi to the uncertainty in the estimation of the total output variance σ² and to the presence of factors with almost null total effect.
² Home page: http://scikit-learn.sourceforge.net/
Fig. 2. Scatter plots of ⟨τ⟩ (days) versus each component of θ = (λe, ρu, ρp, ε, μ, cs, cp, r, τ0, τ1). Error bars (standard error of the mean lifetime computed over 10 realizations) are all smaller than the data points.

Table 2. Variance decomposition. Winding Stairs sample size 10⁴ rows, total variance 635.365 days².

Parameter   Mi       Ti
λe          -0.002   0.014
ρu          -0.003   0.02
ρp          0.003    0.027
ε           0.65     0.73
μ           -0.004   0.03
cs          0.004    0.03
cp          -0.005   0.016
r           -0.005   0.026
τ1          0.002    0.03
τ0          0.18     0.23
Only two factors have Ti > 3%. These are the confidence ε and the long lifetime τ0. We explored further the individual contribution of each parameter to the output variance by looking at the main effect plots. These are plots of Yi(θi) as a function of θi, and can be obtained by evaluating Eq. (6) using Monte Carlo averaging and the GP emulator. To facilitate comparison of the different parameter ranges, in Fig. 1b we plotted the main effect as a function of the scaled parameter value. Figure 1b shows that ρp and τ1 have a slight effect on user lifetime too, the first negative and the second positive.
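A minimal sketch of how such a main-effect curve can be computed from the emulator, per Eq. (6): hold θi fixed on a grid, average the emulator's prediction over random draws of the remaining parameters, and subtract the overall mean. Here predict stands for the fitted GP; grid and sample sizes are our assumptions.

import numpy as np

def main_effect(predict, i, d, grid=20, n_mc=2_000, seed=0):
    rng = np.random.default_rng(seed)
    values = np.linspace(0.0, 1.0, grid)
    curve = []
    for v in values:
        theta = rng.random((n_mc, d))
        theta[:, i] = v                  # hold parameter i fixed
        curve.append(predict(theta).mean())
    curve = np.array(curve)
    return values, curve - curve.mean()  # centered, approximating Y_i(θ_i)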
Fig. 3. Two-way interaction plots: expected lifetime as a function of the confidence ε and, respectively, the short lifetime τ1, the long lifetime τ0, the speed μ, and the rollback probability r.
The difference between Ti and Mi is the fraction of variance that is due only to interactions between θi and any other parameter or group of parameters. For ε this difference is 0.08 and for τ0 it is 0.05. Summed up together, this residual interaction effect amounts to almost three quarters (77%) of the total interaction effects from all remaining parameters. Thus we expect ε and τ0 to have some interesting interactions with other parameters. We explored two-way interactions systematically using two-way interaction plots, which are the 3D counterparts of the curves of Fig. 1b. Given two parameters θi and θj, with i ≠ j, we computed Yi,j(θi, θj) by evaluating Eq. (6) in a similar way, this time holding fixed the values of two parameters instead of one. Here we report the results on the interaction between ε and the other parameters, including τ0. The plots are shown in Fig. 3 and 4.
Fig. 4. Two-way interaction plots (cont'd): expected lifetime as a function of the confidence ε and, respectively, the daily rate of pages ρp, the daily rate of users ρu, and the daily rate of edits λe.
Almost all parameters show just a weak interaction with ε, which occurs at low (ε < 0.1) and high (ε > 0.4) values of it. Only the pair {ε, τ0} shows a significant degree of interaction.

4.4 User Lifetime Distribution
Previous studies on continuous opinion dynamics under bounded confidence show that, as ε grows, the population of agents undergoes a gradual change from a regime with no consensus to a regime of total consensus with a single cluster [9,12]. In our model this shift must somehow reflect in the average user lifetime, but what shape does the user lifetime distribution take during it? The findings from the previous section let us restrict the field of study to just two of the original ten parameters, namely ε and τ0. In this section we focus only on them, and try to understand the actual distribution of user lifetimes by simulating from our model.
Fig. 5. GMM fit of the log-lifetime u = log(τ) of user accounts in two different runs of the model (left: ε = 0; right: ε = 0.3). For ε > εc a bimodal pattern is a clear feature of user participation.
We performed simulations holding fixed the user lifetime parameters (τ0 = 100 days and τ1 = 1 hour), while changing the value of ε. The values of all other parameters were fixed to the midpoints of the respective ranges listed in Tab. 1. We computed the log-lifetime u = log(τ) and fitted a 2-component Gaussian Mixture Model (GMM) to u. Figure 5 reports the result of the fitting, showing the densities of the individual components using stacked area plots. We report here only two values of ε: ε = 0 and ε = 0.3, the latter being greater than the threshold for consensus in Deffuant's model, to show the difference between the two regimes.
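The fit itself is routine; a sketch with scikit-learn's mixture module is below, where lifetimes is a placeholder array standing in for the lifetimes recorded in a simulation run.

import numpy as np
from sklearn.mixture import GaussianMixture

lifetimes = np.random.lognormal(mean=0.0, sigma=2.0, size=5_000)  # placeholder data
u = np.log(lifetimes).reshape(-1, 1)

gmm = GaussianMixture(n_components=2).fit(u)
print(gmm.means_.ravel(), gmm.weights_)  # component locations and mixing weights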
5 Discussion
In this section we discuss the main findings of the present study. We presented an agent-based model of user participation in a peer-production community. We model participation as a bounded confidence consensus process, where users modify content according to their objectives and skills (represented by a continuous state x), and are in turn indirectly influenced by the rest of the community. We use global sensitivity analysis to study the importance of the model's parameters in explaining the average user lifetime. The first interesting – and rather surprising – finding is that, as shown in Tab. 2, of the overall ten parameters of the model, only two affect the average user lifetime in a considerable way. This is interesting because it suggests that several other factors like content popularity, user community growth, and user activity rate, are not as important as the general level of "tolerance" of the community (given by the confidence ε) in affecting the process of group consensus. Moreover, interaction plots show that relevant interactions occur between ε and τ0: this confirms the intuition that the role of τ0 is to set the support of the distribution of τ, and that ε acts as a switch, controlling the transition from a regime where only short-term forms of participation are possible, due to the low rate of successful user-page interactions, to a consensus regime where a cluster of long-term users is able to emerge. Of course, the results from the factor screening should also be viewed in light of our simulation setup. We decided to focus on a stable community, where
the number of users Nu is stationary, and not on the initial phase of community formation. Plausibly, during this transient phase other parameters, such as the speed μ and the rollback probability r, might have more importance in determining the span of user participation. The second interesting finding is about the actual distribution of user participation, which is markedly bimodal. From Fig. 5 it is possible to appreciate, for ε = 0.3, a clear subdivision into two groups of users based on their participation span. We can also see a subdivision for ε = 0, which is probably related to the fact that cs = 50 in that setup. Although we did not perform a proper model calibration, this finding is encouraging, as previous studies on the distribution of user account lifetimes in Wikipedia have shown a similar bimodal pattern [7]. In general, both findings show that agent-based models can be studied through the systematic use of simulations and computer code emulation, and provide a novel connection between models of opinion dynamics, whose study has so far been notoriously lacking on the empirical side [4,27], and peer production. Acknowledgments. We thank Alberto Vancheri and Paolo Giordano for the insightful discussions; the anonymous reviewers, for the suggestions on how to improve the manuscript; and the conference organization, for their generous financial support.
References 1. Backstrom, L., Kumar, R., Marlow, C., Novak, J., Tomkins, A.: Preferential behavior in online groups. In: WSDM 2008, pp. 1–11 (December 2007) 2. Beschastnikh, I., Kriplean, T., McDonald, D.W.: Wikipedian self-governance in action: Motivating the policy lens. In: Proc. of ICWSM 2008 (2008) 3. Carletti, T., Fanelli, D., Guarino, A., Bagnoli, F., Guazzini, A.: Birth and death in a continuous opinion dynamics model. Eur. Phys. J. B 64(2), 285–292 (2008) 4. Castellano, C., Fortunato, S., Loreto, V.: Statistical physics of social dynamics. Rev. Mod. Phys. 81(2), 591–646 (2009) 5. Chan, K., Saltelli, A., Tarantola, S.: Winding stairs: A sampling tool to compute sensitivity indices. Stat. and Comp. 10, 187–196 (2000) 6. Choi, B., Alexander, K., Kraut, R.E., Levine, J.M.: Socialization tactics in wikipedia and their effects. In: Proc. of CSCW 2010, New York, NY, USA, pp. 107–116 (2010) 7. Ciampaglia, G.L., Vancheri, A.: Empirical analysis of user participation in online communities: the case of wikipedia. In: Proc. of ICWSM 2010 (2010) 8. Ciffolilli, A.: Phantom authority, self-selective recruitment and retention of members in virtual communities: The case of wikipedia. First Monday 8(12) (December 2008) 9. Deffuant, G., Neau, D., Amblard, F., Weisbuch, G.: Mixing beliefs among interacting agents. Adv. Comp. Sys. 3, 87–98 (2001) 10. Goldman, E.: Wikipedia's labor squeeze and its consequences. Telecomm. and High Tech. Law 8, 157–184 (2009) 11. Grabowski, A., Kosiński, R.A.: Life span in online communities. Phys. Rev. E 82(6), 066108 (2010) 12. Hegselmann, R., Krause, U.: Opinion dynamics and bounded confidence–models, analysis, and simulation. J. Art. Soc. Soc. Sim. 5(3), paper 2 (2002)
13. Hogg, T., Lerman, K.: Stochastic models of user-contributory web sites, pp. 50–57 (2009) 14. Jansen, M., Rossing, W., Daamen, R.: Monte-Carlo Estimation Of Uncertainty Contributions From Several Independent Multivariate Sources. In: Grasman, J., van Straten, G. (eds.) Predictability And Nonlinear Modelling In Natural Sciences And Economics, pp. 334–343 (1994) 15. Kittur, A., Kraut, R.E.: Beyond wikipedia: Coordination and conflict in online production groups, pp. 215–224 (2010) 16. Leskovec, J., Backstrom, L., Kumar, R., Tomkins, A.: Microscopic evolution of social networks. In: Proc. of KDD 2008, New York, NY, USA, pp. 462–470 (2008) 17. Lin, H.F., Lee, G.G.: Determinants of success for online communities: an empirical study. Behav. & Inf. Tech. 25(6), 479–488 (2006) 18. Lorenz, J.: Continuous opinion dynamics under bounded confidence: A survey. Intl J. Mod. Phys. C 18, 1819–1838 (2007) 19. McKay, M.D.: Latin hypercube sampling as a tool in uncertainty analysis of computer models. In: Proc. of WSC 1992, New York, NY, USA (1992) 20. Ortega, F., Izquierdo-Cortazar, D.: Survival analysis in open development projects. In: Proc. of ICSE 2009, Washington, DC, USA, pp. 7–12 (2009) 21. Panciera, K., Halfaker, A., Terveen, L.: Wikipedians are born, not made. In: Proc. of GROUP 2009 (2009) 22. Raafat, R.M., Chater, N., Frith, C.: Herding in humans. Trends in Cog. Sci. 13(10), 420–428 (2009) 23. Rafaeli, S., Ariel, Y.: Online Motivational Factors: Incentives for Participation and Contribution in Wikipedia. In: Psychological Aspects of Cyberspace: Theory, Research, Applications, pp. 243–267. CUP (2008) 24. Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M.: Sensitivity Analysis in Practice–A guide to Assessing Scientific Models. John Wiley & Sons, Ltd., Chichester (2004) 25. Santner, T., Williams, B., Notz, W.: The Design and Analysis of Computer Experiments. Springer, Heidelberg (2003) 26. Schroer, J., Hertel, G.: Voluntary engagement in an open web-based encyclopedia: Wikipedians and why they do it. Media Psych. 12(1), 96–120 (2009) 27. Sobkowicz, P.: Modelling opinion formation with physics tools: Call for closer link with reality. J. Art. Soc. and Soc. Sim. 12(1), 11 (2009) 28. Sobol', I.M.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. and Comp. in Sim. 55(1-3), 271–280 (2001) 29. Wilkinson, D.M.: Strong regularities in online peer production. In: Proc. of EC 2008, Chicago, Illinois, USA (2008) 30. Wilkinson, D.M., Huberman, B.A.: Cooperation and quality in wikipedia. In: Proc. of WikiSym 2007, Montréal, Québec, Canada, October 21-23 (2007) 31. Yang, J., Wei, X., Ackerman, M., Adamic, L.: Activity lifespan: An analysis of user survival patterns in online knowledge sharing communities. In: Proc. of ICWSM 2010 (2010)
Modelling Social Network Evolution

Radosław Michalski, Sebastian Palus, Piotr Bródka, Przemysław Kazienko, and Krzysztof Juszczyszyn

Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
{radoslaw.michalski,sebastian.palus,piotr.brodka,kazienko,krzysztof}@pwr.wroc.pl
Abstract. Most real social networks extracted from various data sources evolve and change their profile over time. For that reason, there is a great need to model the evolution of networks in order to enable complex analyses of their dynamics. The model presented in this paper focuses on the definition of differences between successive network snapshots by means of the Graph Differential Tuple.

Keywords: social network evolution, graph distance measures.
1 Introduction

Real social networks extracted from data about IT user activities [3] have a dynamic nature. They evolve over time, and for that reason there is a great need to model, analyse and measure social network evolution. In this work, a new method of modelling the dynamic evolutionary patterns of a network by measuring similarity between graphs is proposed. The new approach, which differs from other ones [1], [2], [5], may be applied to any dynamic network, but it is also dedicated to the analysis of multi-layered networks [3], [4], for which we can either compare successive network snapshots within a given layer or analyse differences between various layers within the same time window. For that purpose, first, we need to define a difference between two graphs and next, to measure this difference using dedicated measures. The overall question is: how to measure the change, i.e. how to extract simple values from the object the authors call the 'Graph Differential Tuple'? As a result, a variety of measures are proposed in the paper to extract simple, meaningful, numerical values describing changes, i.e. characterizing the Graph Differential Tuple.
2 The Difference of Graphs – Graph Differential Tuple

2.1 General Concept

Two graphs can differ in many ways. There can be different vertices, different edges and – for edges between the same vertices – different weights. With this concept we want to define the difference of graphs – the basic operations needed to transform one graph into the other one.
Two graphs can be defined as G1 = ⟨V1, E1⟩ and G2 = ⟨V2, E2⟩, where:
V1 – set of vertices in graph G1,
V2 – set of vertices in graph G2,
E1 = {⟨x, y⟩ : x, y ∈ V1} – edges in G1,
E2 = {⟨x, y⟩ : x, y ∈ V2} – edges in G2,
w(x, y) ∈ [0, 1] – weight of the edge between x and y.

Using the above definition, the difference between graphs can be introduced as a set of different vertices, different edges and different weights.

2.2 Graph Differential Tuple

In order to present the graph difference in a consistent, detailed shape, the Graph Differential Tuple can be defined as follows:

G1G2 = ⟨V+, V−, E+, E−, EΔ⟩, where:
V+ = V2 \ V1 – set of added vertices,
V− = V1 \ V2 – set of removed vertices,
E+ = {⟨x, y⟩ ∈ E2 : ⟨x, y⟩ ∉ E1} – set of added edges,
E− = {⟨x, y⟩ ∈ E1 : ⟨x, y⟩ ∉ E2} – set of removed edges,
EΔ = {⟨x, y, w1(x, y), w2(x, y)⟩ : ⟨x, y⟩ ∈ E1 ∩ E2 ∧ w1(x, y) ≠ w2(x, y)} – set of modified weight tuples.
2.3 Case Study

Using the Graph Differential Tuple, the evolution of a social network can be presented. Having, e.g., two time windows of the same social network, this approach reveals the differences and shows how the social network is changing in time. Let G1 and G2 be defined as follows:

V1 = {A, B, C, D, E, F},
E1 = {⟨A,B⟩, ⟨B,A⟩, ⟨B,C⟩, ⟨C,D⟩, ⟨D,C⟩, ⟨D,E⟩, ⟨E,D⟩, ⟨C,E⟩, ⟨F,B⟩, ⟨F,E⟩},
V2 = {A, B, C, D, E, G},
E2 = {⟨A,B⟩, ⟨B,A⟩, ⟨B,C⟩, ⟨C,D⟩, ⟨D,C⟩, ⟨E,C⟩, ⟨G,A⟩, ⟨A,G⟩},

with edge weights given in Table 1.

Table 1. Weight matrix for the graphs G1 and G2

     ⟨A,B⟩ ⟨B,A⟩ ⟨B,C⟩ ⟨C,D⟩ ⟨D,C⟩ ⟨D,E⟩ ⟨E,D⟩ ⟨C,E⟩ ⟨F,B⟩ ⟨F,E⟩ ⟨E,C⟩ ⟨G,A⟩ ⟨A,G⟩
G1   0.3   0.5   0.8   1.0   0.9   0.2   0.1   0.7   0.6   0.4   -     -     -
G2   0.3   0.5   0.8   0.3   0.9   -     -     -     -     -     0.1   0.4   0.3

Then the Graph Differential Tuple will look as follows: G1G2 = ⟨V+, V−, E+, E−, EΔ⟩, where:
V+ = {G}, V− = {F},
E+ = {⟨E,C⟩, ⟨G,A⟩, ⟨A,G⟩},
E− = {⟨D,E⟩, ⟨E,D⟩, ⟨C,E⟩, ⟨F,B⟩, ⟨F,E⟩},
EΔ = {⟨C, D, 1.0, 0.3⟩}.
Fig. 1. Visualization of G1
Fig. 2. Visualization of G2
3 The Measures of Distance between Graphs

Based on the distance vector G1G2 = ⟨V+, V−, E+, E−, EΔ⟩ described in the previous section, a few distance measures, which present the distance between two graphs in numbers, can be introduced.

3.1 The Sum

The sum measure is the simplest measure possible. It is represented by a weighted sum of the cardinalities of all sets from the G1G2 vector, i.e.:

ds(G1, G2) = α+|V+| + α−|V−| + β+|E+| + β−|E−| + γ|EΔ|   (1)

where α+, α−, β+, β−, γ ∈ [0; 1] are coefficients which reflect the importance of each vector element.

3.2 The Normalised Sum
The second measure is based on the first one, but it is normalised by the number of nodes and edges in both graphs. It returns a value from the range [0; 1], where 0 means that the two graphs are identical, and 1 that the graphs are completely different. It is defined as follows:

dn(G1, G2) = (α+|V+| + α−|V−| + β+|E+| + β−|E−| + γ|EΔ|) / (N1 + N2 + |E1| + |E2|)   (2)
3.3 The Relative Sum

The relative sum informs how the graphs differ relative to the first graph:

dfn(G1, G2) = (α+|V+| + α−|V−| + β+|E+| + β−|E−| + γ|EΔ|) / (N1 + |E1|)   (3)
3.4 Based on Edge Modification

The last measure is built on EΔ, i.e., edge modifications, and is computed as follows:

dmw(G1, G2) = ( Σ_{a,b ∈ N1 ∩ N2} mw(a, b) ) / |E1 ∩ E2|   when E1 ∩ E2 ≠ ∅, and 0 in the other case   (4)

where mw(a, b) = |w2(a, b) − w1(a, b)|.
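To illustrate, a minimal Python sketch of the Graph Differential Tuple and the sum measure ds follows; representing a weighted directed graph as a dictionary from edges to weights is our choice, not the paper's.

def diff_tuple(V1, E1, V2, E2):
    """E1, E2 map directed edges (x, y) to weights in [0, 1]."""
    V_add, V_rem = V2 - V1, V1 - V2
    E_add = {e for e in E2 if e not in E1}
    E_rem = {e for e in E1 if e not in E2}
    E_mod = {(x, y, E1[x, y], E2[x, y])
             for (x, y) in E1.keys() & E2.keys() if E1[x, y] != E2[x, y]}
    return V_add, V_rem, E_add, E_rem, E_mod

def d_sum(t, a_add=1.0, a_rem=1.0, b_add=1.0, b_rem=1.0, g=1.0):
    """Eq. (1): weighted sum of the tuple's cardinalities."""
    V_add, V_rem, E_add, E_rem, E_mod = t
    return (a_add * len(V_add) + a_rem * len(V_rem)
            + b_add * len(E_add) + b_rem * len(E_rem) + g * len(E_mod))

# The case study of Subsec. 2.3:
E1 = {("A", "B"): 0.3, ("B", "A"): 0.5, ("B", "C"): 0.8, ("C", "D"): 1.0,
      ("D", "C"): 0.9, ("D", "E"): 0.2, ("E", "D"): 0.1, ("C", "E"): 0.7,
      ("F", "B"): 0.6, ("F", "E"): 0.4}
E2 = {("A", "B"): 0.3, ("B", "A"): 0.5, ("B", "C"): 0.8, ("C", "D"): 0.3,
      ("D", "C"): 0.9, ("E", "C"): 0.1, ("G", "A"): 0.4, ("A", "G"): 0.3}
V1 = {v for e in E1 for v in e}
V2 = {v for e in E2 for v in e}
print(d_sum(diff_tuple(V1, E1, V2, E2)))  # 1 + 1 + 3 + 5 + 1 = 11 with unit weights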
4 Conclusions and Future Work

In this paper we proposed the concept of the Graph Differential Tuple and measures based on that tuple. Those measures are to be used as a comparison tool for social network graphs. The next step is to compare those measures against each other and against other typical measures for graph comparison, in terms of the information we may acquire about the direction, speed and anomalies of social network evolution, as well as in terms of computational complexity.

Acknowledgements. The work was supported by a fellowship co-financed by the European Union within the European Social Fund, The Polish Ministry of Science and Higher Education, the research project 2010-13, and the training within the "Green Transfer" project co-financed by the European Union from the European Social Fund.
References 1. Bunke, H.: On a relation between graph edit distance and maximum common subgraph. Pattern Recognition Letters 18(8), 689–694 (1997) 2. Eroh, L., Schultz, M.: Matching graphs. Journal of Graph Theory 29(2), 73–86 (1998) 3. Kazienko, P., Bródka, P., Musiał, K., Gaworecki, J.: Multi-layered Social Network Creation Based on Bibliographic Data. In: SocialCom 2010, pp. 407–412. IEEE Computer Society Press, Los Alamitos (2010) 4. Kazienko, P., Bródka, P., Musiał, K.: Individual Neighbourhood Exploration in Complex Multi-layered Social Network. In: WI-IAT 2010, pp. 5–8. IEEE Computer Society Press, Los Alamitos (2010) 5. Zager, L.: Graph Similarity and Matching. MIT Thesis, USA (2005)
Towards High-Quality Semantic Entity Detection over Online Forums

Juan Du, Weiming Zhang, Peng Cai, Linling Ma, Weining Qian, and Aoying Zhou

Institute of Massive Computing, Software Engineering Institute, East China Normal University
Abstract. User-generated content (UGC) implies user behaviors. Mining such data helps in understanding the relationship between social media and the real world. However, UGC is usually of low quality, which results in the difficulty of semantic entity extraction. In this paper, we propose a method towards high-quality semantic entity refinement on forums by employing external resources. Experiments on real-life Chinese online forums show the effectiveness of our method.
1 Introduction

User-generated content (UGC) has been spreading widely. It has become a key kind of resource for information retrieval because of its huge volume and rich information. Unfortunately, the low quality of UGC makes information extraction rather difficult. To utilize UGC, we aim at detecting semantic entities, i.e. entities with annotated terms. With semantic entities, UGC can be well organized and more understandable, so that further analysis can be more effective. High-quality semantic entities can therefore contribute to further analysis such as personalized recommendation, meme detection and collective-behavior analysis.

UGC is usually of low quality. The pieces of information are often short, while the expression is informal. Abbreviations, aliases and symbols are often used. Existing semantic entity detection methods, such as those based on graphical models, highly depend on the context of documents. Thus, they cannot be applied to low-quality UGC directly. In this paper, we take another approach that utilizes external resources to refine the annotations of semantic entities. We build a model to choose related resources. We also deeply analyze the contents and the structure of an external resource, proposing six heuristic rules for extraction.
2 Approach with External Resources

A semantic entity is a list of terms describing a specific entity. The entity can be an event or a topic. Existing methods can extract named entities that use only one term to describe the entity. However, a single term is usually insufficient. We take an approach with four steps. The first step generates a list of candidate terms Lw based on UGC. Then, the second one generates queries to external resources based on Lw. After that, in the third step, the external resources are queried and results are retrieved. Finally, in the last step, the term list is refined.
The first step is preprocessing. Lw, as the input, can be generated by clustering. A forum post is represented as a document vector, in which each dimension is a term. A single-pass clustering algorithm is adopted to cluster the terms. Terms with high weights are used as the description of the cluster, i.e. the Lw.

The second step is query generation. We generate queries through different combinations of terms in Lw. To minimize the number of searches, we apply the Apriori Principle [1] to query generation. According to this, if we just go through the first page of the search results, we crawl [0, 2^Nl − 1] times, where Nl is the number of terms used for query generation. Actually, on web search engines, when searching with more than three different terms, the probability of finding results is low.

The third step is querying external resources. There are two problems here. The first one is the crawling mode, and the other one is result ranking.

Crawling mode. Compared with using archives, we prefer to crawl the pages on external resources manually, to get a better refined result according to the latest update. In our case, interested topics are stimulated by time and therefore earlier information is of less importance. Therefore, we use online crawling. Despite the time consumption of crawling, the number of search pages Nsp(SEi) for a semantic entity is under control, that is, Nsp(SEi) ∈ [0, 2^Nl − 1].

Result ranking. The returned results are ranked based on their importance and relationships to the entity. Three features are used for ranking the results: the weight of the search terms δw according to the ordered list Lw, the occurrence percentage δo of the returned result in the candidate pool, and the position of the appearance of a term in the result δm (the more important the position, e.g. in the title, the higher). All δw, δo, and δm are normalized to [0, 1], and the final score of a result is defined as a weighted combination of δw, δo and δm. Then, the results are ranked in descending order of the score.

Table 1. Structure of a page in Wikipedia

Field                  Description
Page title (Pt)        The name of the semantic entity that the page introduces
First paragraph (Pa)   A brief introduction to the semantic entity, like an abstract
Infobox (Pi)           A small table to illustrate the related information of the semantic entity
The left (Pl)          Full description of the semantic entity
The last step is the refinement phase. In this part, information on the related page is extracted to annotate the semantic entity. We consider words with a link on the page as critical terms. Furthermore, the contents of an item page from Wikipedia are well organized. The composition of a page is shown in Table 1. Essentially, the ranking of relevance follows Rule 1:

Rule 1. Pt ≻ Pa^S ≻ Pi^S ≻ Pl^S ≻ Pa^U ≻ Pi^U ≻ Pl^U, where "S" denotes words appearing in the post while "U" stands for the opposite.

According to the characteristics of the page organization, we provide six heuristic rules for picking important words. The first three are: L′w1 = Pt ∪ Pa^S, L′w2 = Pt ∪ Pa^S ∪ Pi^S, and L′w3 = Pt ∪ Pa^S ∪ Pi^S ∪ Pl^S. Continuing to add one part at a time according to Rule 1, we form the remaining three heuristic rules, i.e. L′w4, L′w5, and L′w6. For different external resources, the best heuristic rule to be applied varies.
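A small Python sketch of the query-generation and scoring steps described above. The search helper, the feature names, and the weighted-sum form of the score are our assumptions; only the Apriori-style pruning and the three ranking features come from the text.

from itertools import combinations

def generate_queries(terms, search, max_len=3):
    """Yield (query, results); skip supersets of queries that returned nothing."""
    fruitless = []
    for k in range(1, min(max_len, len(terms)) + 1):
        for combo in combinations(terms, k):
            if any(f <= set(combo) for f in fruitless):
                continue                    # Apriori principle: a subset was empty
            results = search(combo)         # hypothetical search-engine call
            if not results:
                fruitless.append(set(combo))
            else:
                yield combo, results

def score(result, d_w=0.9, d_o=0.1, d_m=0.3):
    """Assumed weighted combination of the three normalized features."""
    return (d_w * result["term_weight"]    # weight of matched search terms
            + d_o * result["occurrence"]   # occurrence share in the candidate pool
            + d_m * result["position"])    # position of the match (title ranks highest)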
3 Experiment

The proposed method is evaluated on a data set containing posts of a Chinese online forum (Liba Forum: ) between Oct. 2009 and Oct. 2010. After preprocessing, 224 different semantic entities are extracted, implying 224 topics that people are mainly talking about. The ground truth for the related external links and the semantic annotation has already been labeled. We tune the parameters δw, δo and δm to get a high accuracy, with δw = 0.9, δo = 0.1 and δm = 0.3. In addition, we give the precision@N measurements with different numbers of search words, where N refers to the top N results in the candidate page pool. The result of precision@N is shown in Fig. 1. The highest precision@N rate is 85%, with three search words in the top five of the candidate pages.

Table 2 represents the number of pages to crawl during search. It indicates that the number of pages to crawl does not increase linearly with more search words. We set 1.43 seconds as the time for crawling a page. This value is estimated by the arithmetic mean over ten random search words. The total time consumption, Ttotal, includes searching, retrieving and analyzing. To process 224 clusters, the time for searching and retrieving with three words is about 30 minutes (8 seconds per cluster).

We use the top three words from Lw to generate queries and get five candidate pages. L′w is obtained from the most related page, based on the words co-occurring in the user-created data and the page. According to the six heuristic rules, we get seven kinds of lists including the original one: Lw stands for the original list, and L′w1, L′w2, ..., L′w6 correspond to the six heuristic rules respectively. We provide the F-scores with different weights α in Figs. 2–4.
Table 2. Time consumption for retrieving external resources on-line

#(search word)  #(page)  #(cluster)  #(page/cluster)  maximal #(page/cluster)  Ttotal (m)
1               219      224         0.977            1                        5.3
2               650      224         2.901            2                        15.61
3               1409     224         6.290            7                        33.7
5               4561     224         20.361           31                       108.82
¼
Fig. 1. Precision@N versus the number of search words
Fig. 2. F-score of the seven kinds of lists, α = 0.3
Fig. 3. F-score of the seven kinds of lists, α = 0.5
Fig. 4. F-score of the seven kinds of lists, α = 0.7
4 Related Work

Well-extracted semantic entities are the basis of many advanced analytical applications [2,3]. There are two general approaches for extraction. One is based on sequence annotation techniques, and the other is to regard extraction as a classification problem. In addition, a framework integrating both of the above-mentioned approaches outperforms traditional algorithms [4]. However, these methods are all based on high-quality data sets. There are many external resources being widely used in information retrieval [5,6]. Essentially, an external resource is well organized. For example, Wikipedia can be used as a thesaurus, as in [7], to build a concept-based vector representation of documents. Retrieving related information exactly from external resources is essential during the refinement. Tsagkias et al. [8] have linked online news with social media, finding implicit relations between social media and online news.
5 Conclusion

In this paper, we introduce an approach to refine the annotation of a semantic entity from UGC by employing external resources. During the refinement, we propose six heuristic rules to extract information from external resources at different levels. Experiments show that our approach performs better in mining user-created data in terms of effectiveness and efficiency.

Acknowledgement. This work is partially supported by National Science Foundation of China under grant numbers 60833003 and 61070051, National Basic Research (973 program) under grant number 2010CB731402, and National Major Projects on Science and Technology under grant number 2010ZX01042-002-001-01.
References 1. Toivonen, H.: Apriori algorithm. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning, pp. 39–40. Springer, US (2010) 2. Qian, W., Chen, F., Du, J., Zhang, W., Zhang, C., Ma, H., Cai, P., Zhou, M., Zhou, A.: UCW: A prototype for analyzing user-created web data. In: Yu, J.X., Kim, M.H., Unland, R. (eds.) DASFAA 2011, Part II. LNCS, vol. 6588, pp. 442–445. Springer, Heidelberg (2011)
3. Tanev, H., Piskorski, J., Atkinson, M.: Real-time news event extraction for global crisis monitoring. In: Kapetanios, E., Sugumaran, V., Spiliopoulou, M. (eds.) NLDB 2008. LNCS, vol. 5039, pp. 207–218. Springer, Heidelberg (2008) 4. Cai, P., Luo, H., Zhou, A.: Semantic entity detection by integrating CRF and SVM. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds.) WAIM 2010. LNCS, vol. 6184, pp. 483–494. Springer, Heidelberg (2010) 5. Liu, M., Li, W., Wu, M., Hu, H.: Event-based extractive summarization using event semantic relevance from external linguistic resource. In: ALPIT, pp. 117–122 (2007) 6. Hersh, W., Bhupatiraju, R., Price, S.: Phrases, boosting, and query expansion using external knowledge resources for genomic information retrieval. In: TREC, pp. 503–509 (2003) 7. Wang, P., Domeniconi, C.: Building semantic kernels for text classification using wikipedia. In: SIGKDD, pp. 713–721. ACM, New York (2008) 8. Tsagkias, M., de Rijke, M., Weerkamp, W.: Linking online news and social media. In: WSDM, pp. 565–574. ACM, New York (2011)
"I'm Not an Alcoholic, I'm Australian": An Exploration of Alcohol Discourse in Facebook Groups

Sarah Posner and Dennis Wollersheim

La Trobe University, Melbourne, Australia
[email protected], [email protected]
Abstract. This paper discusses the characteristics of alcohol discourse in alcohol-related Facebook groups, through a discourse analysis of their wall posts and discussion sections. We created an analytical framework to analyse the content. Our findings on alcohol culture and binge drinking were similar to those stated in the existing literature. This study raises important questions about how online discussions on alcohol are creating unhealthy online drinking communities, and how this impacts actual drinking patterns.
1 Introduction

This exploratory study aims to understand alcohol discourse on Facebook. With the rise in popularity of social networking sites (SNS) as a means to communicate, it is important to understand how health issues such as unhealthy drinking cultures are being discussed in these forums. Worldwide, alcohol is one of the leading causes of mortality and morbidity, due to its causal relationship with over 60 types of diseases and injuries (WHO, 2009). In Australia, the impact of excessive alcohol consumption is extensive; "each year approximately 3,100 people die as a result of excessive alcohol consumption and around 72,000 people are hospitalised" (Department of Health and Aging, 2009). SNS can influence alcohol consumption through direct (advertising) or indirect (fan pages and gift applications) promotion. We discuss how SNS influence alcohol consumption culture, through an analysis of alcohol communication on SNS.
2 Literature Review

There are many benefits of using SNS, especially for young people. SNS can foster learning and the development of critical thinking skills, provide psychosocial skills by facilitating identity development, enhance cognitive skills through perspective taking, and provide social support (Mitchell & Ybarra, 2009). SNS provide users with new ways of communicating online, give users the ability to create and display images about themselves, and let them share information with online friends.
One study looked at the way this communication was being conducted on SNS. To determine the prevalence of displaying ‘risk behaviours’, 500 publicly available, self-reported MySpace profiles of 18-year-olds were analysed. The authors conclude that the most frequent ‘risk behaviour’ references were about alcohol use (41%) (Moreno et al., 2009a). This study raises significant questions about the influence of peers online, especially in relation to communicating about alcohol. Facebook research has focused on privacy, disclosure and the site’s functionality. Kushin & Kitchener (2009) used computer-mediated discourse analysis to examine political issue discussions in Facebook groups; they found that 73% of wall posts supported the political position of the groups. The study concluded that Facebook’s multilayered user functions, such as news feeds on the user’s home page, may influence a user’s decision to join or interact in a particular group. The study also discussed how interaction among Facebook group members could be an important peer communication tool. Kolek & Saunders (2008) studied online disclosure, examining displays of risk behaviours such as underage drinking in student Facebook profiles. They did a content analysis of the “interests” section, the “about me” section and the “groups” section of an individual’s profile page. This study discusses how displays of alcohol, through photographs and text in these three sections of an individual’s homepage, can have serious implications for future opportunities. The study found that over 1/3 of the profiles contained positive references to drinking or alcohol, and over 1/2 of the profiles contained at least one photo of someone consuming an alcoholic drink. A final study, by Tyma (2008), is based on the Virginia Tech shootings in 2007. The aim of this study was to understand how people used Facebook as a way to communicate about the shootings. This study used critical discourse analysis as a method, sampling publicly accessible Facebook groups. The study analysed the text, including the title, location, discussion boards and wall posts of the groups. The study found that there were multiple ways in which this incident was being discussed, such as expressions of support for those who experienced loss or anger about the incident. All these studies attempt to understand the different ways users can communicate on Facebook.
3 Methodology and Results

We used discourse analysis of alcohol-related statements from a set of Facebook group pages. We took a motivated single sample from all the publicly accessible groups returned by a query on the term alcohol, restricted to Australian networks. The data was downloaded on 31/7/09. The Facebook search function returned 167 groups, which we separated into six categories. The data was read through twice, and the groups categorised. Each category was analysed by dividing the text into subject positions, objects, and institutions. Results can be seen in Table 1.
Table 1. Alcohol related Facebook group category definitions

Policy – Discussions related to government policies, such as alcopop taxes. Subject positions: the bludger, lay expert, the activist. Objects: tax, public opinion, alcohol prices, violence, evidence. Institutions: government, police.

Product – Discussions related to specific alcoholic beverages. Subject positions: health expert, the lower-status drinker, the underage drinker, sexualised drinker. Objects: alcohol product, cheap alcohol, alcohol names. Institutions: alcohol companies, media.

Social activity – Groups promoting a particular social behaviour, e.g. smuggling alcohol into clubs. Subject positions: the drinking expert, the rebels, the hero, the macho man. Objects: alcohol tools, alternative names for drunkenness, parties, friends. Institutions: security, education.

Social association – Social groups with alcohol as a valued point of reference. Subject positions: patriotic drinker, Trashbags, Loosey Goosey, sexualised drinker, the proud deviant. Objects: alcohol products (e.g. Bundy), songs and singing, games, fun. Institutions: police, government, media.

Social places – Groups formed around a particular alcohol venue. Objects: pubs, transport, entry. Institutions: alcohol suppliers.

Other – Content retrieved by the query “alcohol” but only peripherally related.
4 Conclusion

This research has contributed to an understanding of alcohol discussions on Facebook. Alcohol groups in the Australian Facebook network present unique insight into the nature of discourse in areas such as binge drinking, alcohol culture and the reinforcement of that culture. This study contributes to the literature on SNS and communication, addressing a gap in the literature by using a method that can be developed further in a larger study. We used discourse analysis to examine statements about alcohol and identify the social, cultural and historical context in which they are located. From this analysis, a picture of alcohol culture in Facebook groups emerged, emphasising binge drinking and describing some beliefs and values that reinforce this behaviour. In discussions, binge drinking was socially accepted within the groups, and even presented in a positive light. This can be seen especially with groups such as Trashbags and Loosey Goosey, which encourage drinking at risky to high-risk levels.
In the groups, discussions reinforcing this culture were found to stem from peer influences. Group members were exposed to statements related to alcohol culture, posted on the Facebook groups’ walls or discussion sections. Although this was a small exploratory study, the consistency and strength of the themes were striking. Although it is not possible to collect reliable demographic information about participants in Facebook discourse, the popularity of Facebook among young demographic groups, combined with the strength of the binge drinking theme, suggests that SNS need to be studied both for their potential for harm and as a tool in the framing of interventions to address binge drinking in younger population groups. The ways participants ‘spoke’ about alcohol could be useful for drafting social marketing messages used in interventions.
References

Department of Health and Aging: Australia: The Healthiest Country by 2020 – National Preventative Health Strategy – the roadmap for action, ch. 4: Alcohol – reshaping the drinking culture in Australia (2009)
Kolek, E., Saunders, D.: Online Disclosures: An Empirical Examination of Undergraduate Facebook Profiles. NASPA Journal 45(1), 1–25 (2008)
Kushin, M., Kitchener, K.: Getting Political on Social Networking Sites: Exploring Online Political Discourse on Facebook. Paper presented at the 2009 Annual Convention of the Western States Communication Association, Phoenix, AZ (2009)
Mitchell, K., Ybarra, M.: Social Networking Sites: Finding a Balance Between Their Risks and Benefits [Editorial]. Archives of Pediatrics & Adolescent Medicine 163(1), 87–89 (2009)
Moreno, M., Parks, M., Zimmerman, F., Brito, T., Christakis, D.: Display of Health Risk Behaviours on MySpace by Adolescents. Archives of Pediatrics & Adolescent Medicine 163(1), 27–33 (2009a)
Tyma, A.: Expressions of Tragedy – Expressions of Hope: Facebook, Discourse, and the Virginia Tech Incident. Unpublished doctoral thesis, North Dakota State University, Fargo, North Dakota (2008)
World Health Organisation: Alcohol: Facts and Figures (2009), http://www.who.int/substance_abuse/facts/alcohol/en/index.html (retrieved April 1, 2009)
Impact of Expertise, Social Cohesiveness and Team Repetition for Academic Team Recommendation Anthony Ventresque, Jackson Tan Teck Yong, and Anwitaman Datta School of Computer Engineering, Nanyang Technological University, Singapore {aventresque,jacktty,anwitaman}@ntu.edu.sg
Abstract. Forming multidisciplinary teams is key to carrying out complex tasks, which is increasingly the case higher up in the knowledge value chain. In this paper, we study academic teams by proposing a representation of the information available from various data sources, through (i) competence, (ii) social and (iii) team networks. Each of these projections of the interactions between individuals and concepts has specific characteristics. We then empirically evaluate the impact of these notions on the team formation process. The objective is to guide the design of team recommendation systems. Keywords: Team Recommendation, Expertise, Cohesiveness, Team Repetition.
1 Introduction

Collaboration has become vital for successfully carrying out creative work, which is crucial in knowledge-based economies. In the academic/research world, this is manifested, for instance, by high and increasing average numbers of authors per paper and collaborators per author [2,5]. However, finding the right collaborators is not an easy problem: people usually tend to work with the same set of personal acquaintances and miss new colleagues (and, as a result, opportunities), as people are generally barely aware of the experts and promising newcomers in the various topics involved in carrying out complex multidisciplinary work. One can thus benefit from tools for (multidisciplinary) expert search and team recommendation. Some existing solutions focus on specific sub-problems, like expert finding (SmallBlue [4]) or implicit relation identification (WikiNetViz [3]). T-RecS (Team Recommendation System) [1] is at the moment a single comprehensive framework for team recommendation.1 The social and semantic relations and dynamics among experts and their areas of expertise are complex. Various perspectives on them may be obtained, depending on the kind of available information as well as the manner of representing it. The most ubiquitous representations include the multidimensional social network of users and the bipartite semantic graph of users and their competences. It has been argued [6] that more complex structures, going beyond such egocentric representations, are needed to keep track of the presence and evolution of the set of users and the set of concepts.
This work was supported in part by A*Star SERC grant 072 134 0055. A web demo instantiation of the framework with the NTU researchers network can be found at http://sands.sce.ntu.edu.sg/T-RecS
In this paper, we study three networks representing socio-semantic interactions in academic teams, and we demonstrate their importance for the purpose of team recommendation. In particular, we show that each representation captures crucial and somewhat complementary characterizations for team prediction.
2 Representation of Socio-semantic Interactions

We derive several representations of the academic social network, the network itself having been derived from multiple data sources, such as databases of bibliographic records (e.g. DBLP, Medline), websites of research projects and academics, etc. The competence network is defined as a weighted bipartite graph of users and concepts linking the former to the latter. Given a set of users U = {u1, . . . , un} and a set of concepts C = {c1, . . . , cm}, C = (U, C, Ec, Wc) is a competence network, where Ec is the set of links between users and competencies, and Wc is the weight of these connections. Given ui ∈ U and cj, ck ∈ C, (ui, cj), (ui, ck) ∈ Ec denotes that ui has the competencies cj and ck. Likewise, Wc(ui, cj) > Wc(ui, ck) means that ui has better expertise in cj than in ck. Given also ul ∈ U with (ul, cj) ∈ Ec, Wc(ui, cj) > Wc(ul, cj) indicates that ui is a better expert than ul regarding topic cj. Another important relation between the entities is determined by their social connections. Here we assume various social overlays, representing the many different explicit (e.g. Facebook, LinkedIn, Twitter, MySpace), implicit (e.g. Last.FM, Amazon) or hybrid social networks (emails, blogs, etc.). A social network is a graph Sm = (U, Em), where Em is the set of links between users. Given ui, uj ∈ U, (ui, uj) ∈ Em denotes that uj is a neighbour of ui in Sm. The relation Em can be directed or undirected, depending on whether ∀ui, uj, (ui, uj) ∈ Em ⇐⇒ (uj, ui) ∈ Em holds or not. S = (U, E) is a multidimensional social network, i.e. the projection of various social networks {S1, . . . , Sn}, each Si being a directed or undirected graph. Given ui, uj ∈ U, (ui, uj) ∈ E denotes that uj is a neighbour of ui in some social network Si ∈ {S1, . . . , Sn}. In most models, only the above two dyadic representations are considered; however, these egocentric representations put emphasis only on the relation between egos and alters, not on the team dynamics itself. Many properties are very difficult, or even impossible, to model with the competence and social networks alone, for instance the co-occurrence of academics and concepts: e.g. that ui and uj always worked in teams about concepts ck and cl but not on cx, etc. That is why we introduce the idea of the team network, i.e. a hypergraph T = (U, C, Et), with Et ⊆ P(U ∪ C), i.e. the set of hyperedges describing the joint appearance of users and concepts. Given {u1, . . . , ui} ⊆ U and {c1, . . . , cj} ⊆ C, t = (u1, . . . , ui, c1, . . . , cj) ∈ Et denotes that {u1, . . . , ui} are in the same team (e.g. published together), whose topics of interest are the concepts {c1, . . . , cj}.
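To make the three projections concrete, the sketch below renders them as plain data structures. This is our own minimal Python illustration, not code from the T-RecS system; all names and the toy data are hypothetical.

```python
from collections import defaultdict

class SocioSemanticNetworks:
    """Minimal container for the three projections: competence (bipartite,
    weighted), social (multidimensional, here flattened), and team (hypergraph)."""

    def __init__(self):
        self.competence = defaultdict(float)  # (user, concept) -> expertise weight W_c
        self.social = defaultdict(set)        # user -> neighbours, union of overlays
        self.teams = []                       # hyperedges: (frozenset users, frozenset concepts)

    def add_competence(self, user, concept, weight=1.0):
        self.competence[(user, concept)] += weight

    def add_social_edge(self, u, v, directed=False):
        self.social[u].add(v)
        if not directed:
            self.social[v].add(u)

    def add_team(self, users, concepts):
        self.teams.append((frozenset(users), frozenset(concepts)))

# A paper by u1 and u2 about concepts c1, c2 feeds all three views at once.
nets = SocioSemanticNetworks()
for u in ("u1", "u2"):
    for c in ("c1", "c2"):
        nets.add_competence(u, c)
nets.add_social_edge("u1", "u2")
nets.add_team({"u1", "u2"}, {"c1", "c2"})
```

The hyperedge keeps users and concepts together, which is exactly the co-occurrence information that the two dyadic views discard.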
3 Experiments

We now have three projections of the various interactions involving academics, concepts and their dynamics. Each of these data structures is well designed to capture some specific characteristic of the teams. Given any team (concepts and academics), it is very easy to compute an expertise value of the team within the competence network, as it contains every individual expertise value. Likewise, the social network is very efficient to
find out the social cohesiveness of a team, e.g. how socially close the members are (see for instance [1]). Team repetition, i.e. the co-occurrence of individuals and concepts together in teams over time, is more difficult to capture using the previous two representations, which focus on individuals but not on the team evolution itself. The team network bridges the gap. The objective of the following experiments is to show that these three characteristics are all key elements for team quality assessment, and hence for team prediction and recommendation systems.

We conduct an empirical analysis on a corpus collected by [6]2. It consists of papers extracted from the Medline web database about a specific topic (zebrafish, a laboratory fish) over a 20-year period (1985-2004). The 13,084 authors involved in this dataset published 6,145 papers (=teams) in this period. An expert of the field selected terms from each paper’s abstract among 70 specific keywords. The problem of weighting the links in the various networks extracted from the dataset is out of the scope of this paper.

We now define three characteristics of teams. The expertise value of a team is the percentage of its members that have already published about one or several of its concepts. The cohesiveness value is like a global clustering coefficient: any academic’s local value is high when she is socially close to the other team members and these neighbours are in turn close to each other; we then average these local values (see [1] for more details). Team repetition can be seen from the perspective of academics (the same set of individuals appear together in previous teams) or concepts (likewise for sets of concepts).

Figure 1(a) shows the ratio of teams with a particular percentage of expertise, cohesiveness, academic or conceptual repetition. We see for instance (top right point of conceptual repetition) that 45% of teams have 100% of their concepts appearing in previous teams: i.e. half of new papers have similar concepts as previous papers. Essentially, the figure shows that teams are more likely composed of experts or novices (high and low expertise values, i.e. extreme values, are more represented), with a lower spike when team expertise is balanced (around 50%); the same applies for cohesiveness (high and low values are more probable) and academic repetition (repeated teams or entirely new combinations are more frequent). On the contrary, the conceptual repetition distribution shows that most teams work on concepts that already appeared together in previous papers, and not on new conceptual combinations.

However, we are not (only) interested in the actual team compositions, but also in the key elements of team dynamics. To capture this, we consider a null model, i.e. a model that generates new teams in a fully random way. The model keeps track of the individuals, concepts and their numbers in the actual teams, and creates for each year new teams by shuffling the academics (resp. concepts). We then compare, for each year, these generated teams with the actual ones and compute a ratio for each characteristic between observed and generated teams. Figure 1(b) shows that high expertise values in teams, as well as high cohesiveness and repetition, are much more represented than by chance: the value would be one if teams in real life were formed at random, but we can see that for high team expertise, cohesiveness or repetition the ratio becomes orders of magnitude higher.
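The following sketch shows one way to implement such a size-preserving shuffling null model and the observed/generated comparison. It is a hedged reading of the procedure above: the authors' exact shuffling and their set-based repetition measure may differ (we use a simplified pairwise proxy), and the data is hypothetical.

```python
import random

def shuffle_teams(teams, seed=0):
    """Null model: keep every team's size, but reassign members at random
    from the pool of that year's team memberships."""
    rng = random.Random(seed)
    pool = [a for team in teams for a in team]
    rng.shuffle(pool)
    out, i = [], 0
    for team in teams:
        out.append(frozenset(pool[i:i + len(team)]))
        i += len(team)
    return out

def repetition(team, past_teams):
    """Simplified proxy for academic repetition: fraction of the team's
    pairs that already co-occurred in some past team."""
    members = sorted(team)
    pairs = [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]
    if not pairs:
        return 0.0
    seen = sum(any(a in t and b in t for t in past_teams) for a, b in pairs)
    return seen / len(pairs)

past = [frozenset({"u1", "u2"}), frozenset({"u3", "u4"})]
observed = [frozenset({"u1", "u2"}), frozenset({"u3", "u5"})]
obs = sum(repetition(t, past) for t in observed) / len(observed)
gen = sum(repetition(t, past) for t in shuffle_teams(observed)) / len(observed)
print(obs, gen)  # an obs/gen ratio well above 1 signals non-random repetition
```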
In conclusion, we observed in this work that the three characteristics defined in this paper are all important in actual teams and crucial in their evolution.

2 http://camille.roth.free.fr/software.php
Fig. 1. (a) Distributions of expertise, cohesiveness, academic and conceptual repetitions in observed teams; (b) ratio of previous characteristics in observed/generated teams
4 Conclusion

Hypergraphic models for teams first appeared in the domain of sociology and have been used only very recently in computer science to model socio-semantic interactions [6]. We argue in favor of this projection to complement the classical competence and social networks that have been used many times in other works. However, we argue that none of the representations is adequate in itself; all of them are useful and capture some specific perspective of a complex reality. Particularly, we demonstrate in this paper that individuals’ expertise (which is mainly captured by the competence network), social cohesiveness (from the social network) and team repetition (from the team network) are all very important criteria for team quality assessment. Any team recommendation system, which of course aims to simulate team formation, should thus consider all these properties and implement modules for information extraction and manipulation accordingly.
References

1. Datta, A., Tan Teck Yong, J., Ventresque, A.: T-RecS: Team recommendation system through expertise and cohesiveness. In: WWW, pp. 201–204 (2011)
2. Grossman, J.W.: The evolution of the mathematical research collaboration graph. In: Congressus Numerantium, pp. 201–212 (2002)
3. Le, M.T., Dang, H.V., Lim, E.P., Datta, A.: WikiNetViz: Visualizing friends and adversaries in implicit social networks. In: International Conference on Intelligence and Security Informatics (2008)
4. Lin, C.Y., Cao, N., Liu, S.X., Papadimitriou, S., Sun, J., Yan, X.: SmallBlue: Social network analysis for expertise search and collective intelligence. In: ICDE (2009)
5. Newman, M.: Who Is the Best Connected Scientist? A Study of Scientific Coauthorship Networks. Complex Networks, 337–370 (2004)
6. Taramasco, C., Cointet, J.P., Roth, C.: Academic team formation as evolving hypergraphs. Scientometrics 14 (2010)
CEO’s Apology in Twitter: A Case Study of the Fake Beef Labeling Incident by E-Mart

Jaram Park, Hoh Kim, Meeyoung Cha, and Jaeseung Jeong
Graduate School of Culture Technology, KAIST
{jaram.park,hoh.kim,meeyoungcha,jsjeong}@kaist.ac.kr

Abstract. We present a preliminary study on how followers and non-followers of a popular CEO respond differently to a public apology by the CEO in Twitter. A sentiment analysis tool was used to measure the effect of the apology. We find that the CEO’s apology had clear benefits in this case. As expected, it was more effective for followers than non-followers. However, followers showed a higher degree of change in both positive and negative sentiments. We also find that negative sentiments have stronger dynamics than positive sentiments, in terms of the degree of change. We provide insights on the potential for efficient crisis communication in online social media and discuss a future research agenda. Keywords: Twitter, Apology, Corporate mistakes, Sentiment analysis.
1 Introduction

Social media platforms like Twitter have changed corporate communication dynamics in several major ways. First, while only elite journalists could produce news on corporate mistakes or wrongdoings in the past, now any Internet user can publicly discuss his or her negative experience with a company. Second, social media enable direct interactions between individual customers and high-profile corporate figures such as CEOs. Customers, who could access CEOs only through TV or magazine interviews in the past, now have a direct conversation channel with CEOs through platforms like Twitter. Third, social media have expanded the scope of corporate communication in general. Corporate communication used to exist only between the public relations team and journalists, but now it exists virtually between any corporate personnel and customers. Given this paradigm shift, many CEOs worldwide are actively utilizing social media to reach out to their customers [1]. Such change has made corporate communication departments nervous, because what CEOs post on social media can no longer be carefully and selectively drafted by the public relations team. Their posts are personal and ad hoc. Nonetheless, CEOs’ social media posts and tweets carry weight. Both consumers and journalists pay great attention to what CEOs say in real time. CEOs’ tweets are quoted in the mainstream media, sometimes circulating much more widely than the official announcements made through corporate communication channels. In the era of social media, a number of serious challenges arise, especially upon a corporate mistake. Should a CEO apologize for the mistake in social media? When a CEO apologizes, to what extent does it help or hurt the corporation’s reputation? Is there a right timing and tone of voice for an effective apology?
Public apologies by leaders have become important. As corporate mistakes are disclosed and discussed openly via social media these days, there are more pressure from citizens on leaders to apologize. However, to the best of our knowledge, the effect of corporate apologies in social media has not been studied. In this paper, we conduct a preliminary analysis on corporate apologies and try to answer the following two questions. First, how do followers and non-followers of a CEO respond to the CEO’s apology in social media? What benefits does a CEO’s apology bring in terms of the reputation of the corporation and crisis management? In order to answer these questions, we gathered and analyzed the responses of Twitter users on a famous Korean CEO’s apology incident that happened in 2010 summer.
2 Data Methodology On July 27th, 2010, the government-run veterinary service in Korea announced that, after a round of investigation on large retailers, an E-Mart branch was caught selling imported beef as domestic beef [3]. In Korea, people prefer domestic beef, which is far more expensive than imported beef. This prompted angry reaction among many customers and Twitter users requested that owner and vice chairman of Shinsegae Group (the mother company of E-Mart), Yongjin Chung, directly take care of the issue.1 The issue was soon circulated widely within social media. The next day, E-Mart representative director Byungryul Choi apologized on Twitter posting “I express my sincere regret for selling falsely-labeled Korean beef, but I have to make clear to you that this was not intentional. The negligence of employees led to mislabeling and we won’t make this mistake again.” Chung immediately re-tweeted Choi’s tweet and added “I sincerely apologize for all the concerns on the beef scandal.” We gathered all tweets that mentioned the word “E-Mart” for a two month period in July and August in 2010. Figure 1 shows the number of tweets before the corporate mistake (July 1–26), on the day of the mistake (July 27th), on the day of the CEO’s apologies (July 28th), and on the days afterwards (July 29th–August 30th). The number of tweets suddenly increase on July 27th, indicating the dispute and wide sharing of sentiments on the fake beef labeling incident in Twitter. For detailed analysis, we focus on a nine-day period from July 24th to August 1st. Table 1 shows the number of tweets, mentions (including replies to other tweets), and retweets (RTs) for this period. A total of 4,500 users, including 1,177 followers and 3,323 non-followers of Chung, engaged in the event. We also show the number of mentions directed to Chung, which account for 18% of the tweets for followers and fewer than 1% for non-followers. Interestingly, most tweets are either retweets or mentions, and only a small fraction of tweets (25%) are of fresh content. Table 1. Summary of data set Followers of the CEO Non-followers Total 1
# users 1,177 3,323 4,500
# tweets 1,491 3,889 5,380
# mentions 1,213 2,819 4,032
# RTs 625 1,596 2,221
# mentions to Chung 275 35 310
1 Chung is a grandson of Samsung’s founder and the CEO of Shinsegae Group. Unlike the typically reclusive owners of other Korean conglomerates, Chung has been exceptionally well known for being active on Twitter. As of May 2011, he had more than 110,000 followers on Twitter.
Fig. 1. The number of tweets per day containing the word “E-Mart”
3 Sentiment Analysis

In order to quantify the positive and negative moods embedded in tweets, we used a Korean version of the LIWC (Linguistic Inquiry and Word Count) sentiment tool, K-LIWC, which has been widely used by Korean psychology researchers [2,4]. LIWC is a transparent text analysis program that counts words in psychologically meaningful categories (e.g., happy, angry). Empirical results demonstrate that LIWC can detect meanings in a wide variety of experimental settings, including attention focus, emotionality, social relationships, thinking styles, and individual differences [5]. Throughout the nine-day period, not all tweets were classified as having negative sentiments. Some tweets were unrelated to the beef incident and had positive sentiments, while some tweets were explicit positive feedback on Chung’s apology. Figure 2 shows how the positive sentiment (left figure) and negative sentiment (right) evolve for the followers and non-followers of Chung. Overall, Twitter users exhibited positive sentiment towards E-Mart before the fake beef labeling incident and negative sentiment after the incident. We make two observations. First, followers’ reactions towards negative corporate behaviors (e.g., mistake, apology) were more intense than those of non-followers. The positive mood of followers sharply increased on the day of the CEO’s apology, while their negative mood sharply decreased the day after the apology (July 29th).
Fig. 2. Temporal evolution of the positive and negative sentiment scores
This observation indicates that the apology was more effective for Chung’s followers than for non-followers. Although to a lesser extent, negative sentiments of non-followers also showed signs of abating after the apology. However, positive sentiments of non-followers did not increase like those of followers. Second, compared to the variations in positive sentiment, the degree of change in negative sentiment is larger for both non-followers and followers. This observation indicates that negative sentiments have stronger dynamics than positive sentiments. The continued negative sentiments of non-followers after the apology have twofold causes. Some users, not knowing about the apology, continued to request an apology, while others criticized that the apology was not serious.
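As a rough illustration of the LIWC-style scoring used here, the sketch below computes daily positive and negative scores as the share of words falling into a category lexicon. The toy word lists and tweets are our own stand-ins; K-LIWC uses validated Korean dictionaries and more careful morphological analysis.

```python
# Toy category lexicons standing in for K-LIWC's validated dictionaries.
POSITIVE = {"thank", "sincere", "good", "trust"}
NEGATIVE = {"angry", "fake", "mislabeled", "scandal"}

def daily_sentiment(tweets_by_day):
    """Per day, the fraction of words that fall into each category,
    mirroring LIWC's word-count-per-category approach."""
    scores = {}
    for day, tweets in tweets_by_day.items():
        words = [w.strip(".,!?") for t in tweets for w in t.lower().split()]
        n = max(len(words), 1)
        scores[day] = (sum(w in POSITIVE for w in words) / n,   # positive
                       sum(w in NEGATIVE for w in words) / n)   # negative
    return scores

demo = {
    "07-27": ["fake beef scandal at E-Mart!", "so angry about the mislabeled beef"],
    "07-28": ["a sincere apology, thank you", "I trust Chung, good response"],
}
print(daily_sentiment(demo))  # negative spikes on the 27th, positive on the 28th
```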
4 Conclusion and Future Work

Based on the analysis, we conclude the following. First, the CEO’s apology shows clear benefits in this case: (i) negative sentiments of both non-followers and followers decreased; (ii) positive sentiments after the apology, however, increased only among followers. Second, in terms of the degree of sentiment change, the apology had more influence on decreasing negative sentiment than on increasing positive sentiment. Considering that a major goal of apologizing is reducing anger and negative sentiment, this result is reasonable. Third, followers engaged more actively with the CEO, directly replying to Chung, than non-followers. Chung’s Twitter apology was perhaps more effective than other CEOs’ apologies could have been, since he had been very active on Twitter for a long time before the corporate mistake. In social media, crisis management starts from building relationships in normal times. Starting to tweet in times of crisis to manage negative sentiments on Twitter is not recommended, as relationships cannot be built suddenly. It is common sense to “make friends before you need them.” There are several exciting future directions. First, having confirmed that Twitter followers are friendly to a CEO even under a corporate crisis situation, we would like to further study the network effect of followers. In particular, do followers influence their own followers who are not directly connected to the CEO? Second, we want to analyze the different ways that bad news spreads among followers and non-followers. Third, we are interested in comparing the effect of apology in offline and online settings. Which CEO apology is more effective, the one in social media or the one in offline announcements? Finally, we would like to analyze customer sentiments by the hour rather than by day, since Twitter is a real-time medium.
References

1. BusinessWeek: CEOs Who Use Twitter (2009), http://tinyurl.com/ole6wv
2. Korean-Linguistic Inquiry and Word Count, http://k-liwc.ajou.ac.kr
3. The Korea Herald: Shinsegae’s Chung Apologizes (2010), http://tinyurl.com/3s4ue5q
4. Lee, C.H., Sim, J.-M., Yoon, A.: The Review about the Development of Korean Linguistic Inquiry and Word Count. The Korean Journal of Cognitive Science 16(2), 32–121 (2005)
5. Tausczik, Y.R., Pennebaker, J.W.: The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology 29(1), 24–54 (2010)
GViewer: GPU-Accelerated Graph Visualization and Mining Jianlong Zhong and Bingsheng He Nanyang Technological University
1 Introduction

Visualization is an effective way of identifying patterns of interest (such as communities) in graphs, including social networks and the Web [8,6]. A number of tools have been developed for graph visualization, e.g., Tulip, Gephi and GMine [8]. All of these tools use the CPU as the main power to calculate the graph layouts for visualization, such as the force-directed layout [2]. However, the layout calculation is usually computation intensive; for example, the force-directed layout has a complexity of O(N^3), where N is the number of vertexes in the graph. In our experiments, the CPU-based solution takes more than half an hour to lay out a graph with 14.5 thousand vertexes. Instead of laying out the entire graph, existing tools usually address this performance issue with an off-line multi-scale approach, where the entire graph is partitioned with a multi-level partitioning algorithm. The graph layout is limited to the graph data at the lowest level, and each partition consists of dozens of vertexes. While the multi-level approach improves the response time, the static graph partitioning limits the flow and the scope of graph exploration. Users can only follow the pre-computed multi-level graph layout to explore the graph. Additionally, there is little information visualized for boundary vertexes at each graph partition. The limited flexibility hurts the effectiveness of visualization for graph mining. With the limitations of existing graph visualization tools in mind, we propose to accelerate the graph layout calculation with graphics processors (GPUs), and further to support interactive graph visualization and mining. The adoption of the GPU is motivated by the recent success of GPGPU (General-Purpose computation on GPUs), where GPUs have become many-core processors for various database tasks [4,5]. As a start, we develop a graph layout library on the GPU. The library includes multiple commonly used graph layouts [8], such as the force-directed layout [2], spectral layout [1] and tree layout [3]. The inherent data parallelism of calculating the graph layouts facilitates implementing the algorithms on the GPU. Moreover, we utilize GPU hardware features to reduce the memory latency. As a result, the GPU-based graph layout calculation on an NVIDIA Quadro 5000 GPU is over 8.5 times faster than its CPU-based counterpart on an Intel quad-core W3565 CPU. As a side product, calculating the graph layout on the GPU eliminates the overhead of data transfer between the main memory and the GPU memory. Note that existing approaches need to transfer the graph layout data from the main memory to the GPU for rendering. With the accelerated layout calculation as a building block, we develop user interactions for graph visualization and mining. Currently, user interactions include the simple
graph operations, i.e., filtering, vertex selections, zooming in/out, and drilling in/out. Thanks to the GPU acceleration, these user interactions offer a good interactive user experience. We have implemented these techniques in a system named GViewer. We will demonstrate the following two key aspects of GViewer: (1) Efficient graph layout calculation. GViewer performs the graph layout calculation at runtime for the subgraph specified in the user interaction. Additionally, we also perform a side-by-side comparison between the GPU-based algorithm and its CPU-based counterpart. (2) User interactions in GViewer to support graph visualization and mining.
2 System Implementation

We implement GViewer with a recent GPU programming framework named CUDA. Currently, GViewer supports the commonly used graph layouts [8], such as the force-directed layout [2], spectral layout [1] and tree layout [3]. We use OpenGL for graphics rendering and the CUDA-OpenGL inter-operability support for collaboration between computation and visualization.

GPU-Accelerated Graph Layout. The force-directed layout produces good-quality layout results and has strong theoretical foundations, simplicity and interactivity [2]. The basic idea of the force-directed layout is physical simulation, where vertexes are modeled as objects with mechanical springs and electrical repulsion among them. The edges tend to have uniform length because of the mechanical spring force, and vertexes that are not connected tend to be drawn further apart due to the electrical repulsion. The force-directed layout calculation is an iterative process. In each iteration, the algorithm calculates the new position for each vertex based on its current position, the total spring force from its neighbor vertexes and the total electrical repulsion from its unconnected vertexes. That is, in each iteration, we need to calculate the force (either spring force or repulsion) between any two vertexes. A basic implementation is that each GPU thread calculates the force for one vertex, scanning the vertex list and accumulating the force during the scan. While the basic implementation takes advantage of the thread parallelism of the GPU, it incurs excessive memory accesses. We improve the memory performance of the basic implementation with two hardware features of the GPU, i.e., coalesced accesses and shared memory. In CUDA, T GPU threads are grouped into a warp (T = 32 in current CUDA). If the threads in a warp access consecutive memory addresses, these accesses are coalesced into a single request such that the bandwidth utilization is improved. The shared memory is a piece of fast on-chip memory for storing frequently accessed data. Combining these two features, a warp first reads T vertexes into the shared memory, and then each thread in the warp calculates the partial forces from those T vertexes. This calculation repeats until the vertex list is exhausted. With the coalesced accesses and the shared memory, the number of memory requests is significantly reduced. The spectral layout [1] is based on the calculation of the eigenvectors of the adjacency matrix of the graph. We implement the Lanczos algorithm for the eigenvector calculation [7] with the CUDA BLAS library.
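For reference, a single iteration of the basic force-directed computation that the GPU kernel parallelizes can be sketched as follows. This is our own NumPy illustration under the usual spring-plus-repulsion model (the constants k, c and the step size are arbitrary); in the CUDA version, each row of the pairwise computation belongs to one thread, and the shared-memory tiling described above amortises the reads of the position array.

```python
import numpy as np

def force_directed_step(pos, adj, k=0.1, c=0.01, step=0.05):
    """One iteration: spring attraction along edges plus all-pairs
    electrical repulsion, O(N^2) work per iteration."""
    n = len(pos)
    diff = pos[:, None, :] - pos[None, :, :]       # pairwise displacement (n, n, 2)
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, 1.0)                    # avoid divide-by-zero; masked below
    unit = diff / dist[:, :, None]
    off_diag = ~np.eye(n, dtype=bool)
    # Repulsion between every pair of distinct vertexes.
    rep = np.where(off_diag[:, :, None], (c / dist**2)[:, :, None] * unit, 0.0)
    # Spring attraction only along edges (adj is a boolean adjacency matrix).
    att = np.where(adj[:, :, None], -k * (dist - 1.0)[:, :, None] * unit, 0.0)
    return pos + step * (rep.sum(axis=1) + att.sum(axis=1))

rng = np.random.default_rng(0)
pos = rng.random((5, 2))
adj = np.zeros((5, 5), dtype=bool)
adj[0, 1] = adj[1, 0] = True
for _ in range(1000):                              # iterate until the layout stabilises
    pos = force_directed_step(pos, adj)
```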
The tree layout shows a rooted tree-like formation of a graph and is suitable for tree-like graphs. We use breadth-first traversal (BFS) to generate the tree layout. The GPU-based BFS is performed in k iterations. Initially, the input set includes only s, where s is a root vertex defined by the user. In each iteration, we span one hop from the input set of vertexes in order to get all the neighbor vertexes within one hop. We use an array flag to indicate whether a vertex is first accessed in the kth iteration. Initially, only the flag for s is set to zero, and the other flags are -1. At the ith iteration, we get the neighbor lists of the vertexes whose flags equal (i − 1). This is implemented using a map primitive [4]. A map is similar to a database scan, with coalesced memory access optimizations for bandwidth utilization. Next, we set the flag for each vertex in the neighbor list: if the flag is -1, it is set to i; otherwise, the flag does not change. The iteration ends when no flag is set within an iteration. Given the BFS result, we can calculate the position of each vertex in the display region by considering the tree height and the fanout [3].
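A sequential sketch of this flag-array BFS and the resulting layout is given below (our own illustration; the GPU version sets the flags of a whole frontier in parallel with the map primitive). The even spacing of each level across the display width is a simplification of the bubble-tree placement of [3].

```python
def bfs_flags(neighbours, s):
    """flag[v] = iteration at which v is first reached; absent means -1 in the text."""
    flag, frontier, i = {s: 0}, [s], 0
    while frontier:
        i += 1
        nxt = []
        for v in frontier:                       # one "map" over the frontier
            for w in neighbours.get(v, ()):
                if w not in flag:                # flag == -1: first access
                    flag[w] = i
                    nxt.append(w)
        frontier = nxt
    return flag

def tree_layout(neighbours, s, width=1.0):
    """Place each BFS level on its own row, spreading vertexes evenly."""
    flag = bfs_flags(neighbours, s)
    levels = {}
    for v, d in sorted(flag.items()):
        levels.setdefault(d, []).append(v)
    return {v: ((j + 1) * width / (len(vs) + 1), -d)
            for d, vs in levels.items() for j, v in enumerate(vs)}

g = {"s": ["a", "b"], "a": ["c"], "b": ["c", "d"]}
print(tree_layout(g, "s"))  # depth gives y; siblings share a row
```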
3 Case Studies

We evaluate GViewer on a commodity machine with 2GB RAM, one NVIDIA Quadro 5000 GPU and one Intel quad-core W3565 CPU. The operating system is Windows 7. We extract an undirected graph from DBLP (http://dblp.uni-trier.de/xml/) for the demonstration: each author is a vertex, and two connected vertexes indicate co-authorship between the corresponding authors. The co-authorship represents the relationship between any two authors of the same paper. The extracted graph consists of 820 thousand vertexes and 5.7 million edges. We present the major results, including the comparison between the CPU- and the GPU-based implementations, and community discovery.
Fig. 1. Side-by-side comparison between the GPU- and the CPU-based force-directed layout
GPU vs. CPU-based layouts. We conduct a side-by-side comparison between the CPU and the GPU. Figure 1 shows screen shots taken during the CPU- and GPU-based visualization of the graph, with D = 96 and C = 8 in the force-directed layout. Along the timeline, we can imagine the difference in user experience between the CPU- and the GPU-based visualizations. For example, in order to see the fourth screen shot, the user needs to wait 512 seconds with the CPU-based visualization, but only 60 seconds with the GPU-based visualization. Note that the force-directed layout algorithm takes around one thousand iterations before the layout becomes stable.
Community Discovery. We demonstrate the flow of exploring the graph from a specific author in order to find his/her co-authorship community. We use Jiawei Han as an example of community discovery: (1) As the first step, we select “Jiawei Han” and highlight his neighbors within two hops. The result is omitted here. (2) We drill down from Jiawei with two hops. GViewer visualizes the subgraph with the force-directed layout (the figure is omitted due to space constraints). We observe that Jiawei has a large two-hop co-author community. (3) If we set the number of hops to one, we can easily find Jiawei’s most important coauthors (Figure 2).
Fig. 2. One hop from “Jiawei Han”
Acknowledgement. This work is supported by an NVIDIA Academic Partnership (2010-2011) and an AcRF Tier-1 grant in Singapore.
References

1. Beckman, B.: Theory of Spectral Graph Layout. Technical report, MSR-TR-94-04 (1994)
2. Fruchterman, T.M.J., Reingold, E.M.: Graph drawing by force-directed placement. Softw. Pract. Exper. 21(11) (1991)
3. Grivet, S., Auber, D., Domenger, J.-P., Melancon, G.: Bubble tree drawing algorithm. In: International Conference on Computer Vision and Graphics (2004)
4. He, B., Yang, K., Fang, R., Lu, M., Govindaraju, N., Luo, Q., Sander, P.: Relational joins on graphics processors. In: SIGMOD (2008)
5. He, B., Yu, J.X.: High-throughput transaction executions on graphics processors. Proc. VLDB Endow. 4, 314–325 (February 2011)
6. Koenig, P.-Y., Zaidi, F., Archambault, D.: Interactive searching and visualization of patterns in attributed graphs. In: Proceedings of Graphics Interface (2010)
7. Parlett, B.N.: The Symmetric Eigenvalue Problem. Prentice-Hall, Inc., Upper Saddle River (1998)
8. Rodrigues Jr., J.F., Tong, H., Traina, A.J.M., Faloutsos, C., Leskovec, J.: GMine: a system for scalable, interactive graph visualization and mining. In: VLDB (2006)
Sharing Scientific Knowledge with Knowledge Spaces Marcos Baez, Fabio Casati, and Maurizio Marchese Dipartimento di Ingegneria e Scienza dell’Informazione, University of Trento, Italy {baez,casati,marchese}@disi.unitn.it
Abstract. This paper presents a set of models and an extensible social web platform (namely, Knowledge Spaces) that supports novel and agile social scientific dissemination processes. Knowledge Spaces is based on a model for the representation of structured, evolving, and multi-facet scientific knowledge and meta-knowledge, on effective “viral” algorithms for helping scientists find the knowledge they need, and on interaction metaphors that facilitate its usage. Keywords: knowledge dissemination, social web, scientific publications.
1 Introduction

Knowledge Spaces (kspaces for short) are a metaphor, a set of models and processes, and a social web platform that help you capture, share and find scientific knowledge, in all of its forms. The principle behind kspaces is to allow knowledge dissemination in the scientific community to occur in a way similar to the way we share knowledge with our colleagues in informal settings. The rationale behind this is that when we interact informally with a small team of colleagues, dissemination is very effective. We are free to choose the best format for communicating our thoughts and results, we share both established results and our latest ideas, we interact and carry on a conversation (synchronously or via email), and we comment on other people's contributions and papers and observe relations among the various contributions. Even when we remain in the domain of papers, we often find that we come to know interesting papers not by doing a web search or scanning the proceedings, but because we "stumble upon" them; that is, colleagues point them out to us via email or mention them in a conversation (along with their comments). In other words, knowledge spreads virally. Kspaces aim at providing a set of models, processes, metrics and tools to support this informal, viral and social way of disseminating knowledge among the scientific community at large and via the Web, complementing the well-established method of papers published in conferences and journals after peer review. The goal is to use a web-based system to enable the capturing of these evolutionary bits of knowledge and data, however they may be expressed, as well as the capturing of ideas and opinions about knowledge, and to leverage this information and meta-information to spread knowledge socially. Capturing opinions on knowledge is particularly important. The fact, for example, that somebody (and especially somebody we "trust") shares a paper tells us a lot about the value of this paper, much more than a citation can do. As readers,
we relate specific papers, in our minds, to prior knowledge. When listening to a talk, we think of other work relevant to the one being presented, and often we jot it down in our own personal notes. In a world where information comes out of the web as from a hose, this knowledge about knowledge becomes essential to dissemination. Tagging, annotating and connecting the dots (linking resources in a way much more useful to science than citations) become almost as important as the dots themselves. Kspaces support this not only by using web technologies as the basis for the implementation, but by using web 1.0 and 2.0 concepts in the way scientific resources and their relationships are modeled and in the way knowledge sharing is supported. In essence, kspaces are characterized by a conceptual model and a repository for scientific resources (or for pointers to them if stored elsewhere). Resources are linked in arbitrary ways, and relationships are typed and can be annotated. This is analogous to the Web, although it is oriented to linking scientific resources and to supporting (and then leveraging) relationship types and annotations. Indeed, building this evolving web of annotated resources and leveraging it to find knowledge is a key goal of kspaces. The intuition is that having such a web of connected knowledge can be as instrumental or even more instrumental (because it contains more metadata) to finding knowledge than the Web is to finding web pages. Today this web of resources is simply not there, and this is part of what makes finding interesting and relevant scientific knowledge hard. On top of this space of resources, kspaces define specific processes, permissions, and interaction modes people use to share knowledge. Kspaces manifest themselves in various forms, called designs, tailored to capturing different forms of scientific knowledge shared in different ways: from maintaining a library of related work, talks, datasets, etc. in an area (including our own, evolving work) to forming knowledge communities, writing and publishing (liquid) books, supporting the collection of the knowledge that emerges in the minds of attendees during a talk, and many others. It is through spaces with a specific design that knowledge and meta-knowledge are collected and disseminated. The dissemination and search of knowledge over kspaces is then based on the “social interest”, on the goals of a search (e.g., related work vs. introductory material), and on the meta-knowledge (e.g., tags and annotations). Kspaces, although richer and more flexible than many existing systems, is not the first and only platform that exploits some form of social meta-knowledge to support search. Mendeley, CiteULike, and Connotea, just to name a few, all have some elements of this. We believe that the key to a successful platform here lies in how such meta-knowledge can be collected and how it is used, and here lies a key contribution of kspaces.
2 Knowledge Spaces

We see scientific contributions as structured, evolving, and multi-facet objects. Specifically, we see the Scientific Resource Space (SRS) we want to collect, organize, share, evaluate, and search as consisting of scientific resources, organized as a set of nodes in a graph, that can be connected and annotated by authors or readers. We do not discuss or formalize the SRS model further here, as it was discussed in our earlier work [1], to which we refer the reader for details.
A Knowledge Space is defined as KS = {R, Q, M, Tr, C, S}, i.e., a collection of SRS content, with the following characteristics:

• The content is defined intensionally (in terms of the properties the content should have) or extensionally (content is explicitly added). A space can be only intensional, only extensional, or a mix. In case the content is defined intensionally, the KS defines in essence a query over the SRS, denoted as Q, while R denotes resources explicitly added.
• A KS has members M = {O, E, V} that can be owners O, editors E, and viewers V. Viewers can only access the resources. Editors can add or remove content. Owners are editors and can add new viewers, editors, or owners.
• Tr = {transparent | opaque} denotes the transparency flag. An opaque space is a space where the comments, tags, annotations on resources, and the existence of the space itself are only visible to the members of the space. In a transparent space, comments, tags, and the posted resources “percolate” down to the resource space. Non-members cannot see what is in the space, but can see the tags and comments on the resources.
• C = {RST, RLT, ENT} denotes the configuration of the space, i.e., a container for a specific KS application. Because containers are used for a purpose, they typically include specific types of resources and relationships that acquire a particular meaning, and require a specific UI representation.
• Spaces can also follow a lifecycle defined by a particular design: for instance, in an implementation of a KS modeling panel discussions, the space will go through phases involving, at least, the pre-, during- and post-panel discussions. At each stage S in the lifecycle, the permissions and the way the UI renders the content may differ.
A KS is itself a resource; as such, a KS can be included in other KSs, and it can be annotated and linked as resources are.
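To make the definition concrete, the sketch below renders KS = {R, Q, M, Tr, C, S} as a record type. It is our own minimal Python reading of the model, not the platform's actual code; the field names and the intensional-query representation (a predicate over SRS resources) are our assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional

class Tr(Enum):
    TRANSPARENT = "transparent"   # tags/comments percolate down to the SRS
    OPAQUE = "opaque"             # metadata visible to members only

@dataclass
class Members:                    # M = {O, E, V}
    owners: set = field(default_factory=set)
    editors: set = field(default_factory=set)
    viewers: set = field(default_factory=set)

@dataclass
class KnowledgeSpace:
    resources: set = field(default_factory=set)          # R: extensional content
    query: Optional[Callable] = None                      # Q: intensional content
    members: Members = field(default_factory=Members)     # M
    transparency: Tr = Tr.OPAQUE                          # Tr
    configuration: dict = field(default_factory=dict)     # C: RST, RLT, ENT
    stage: str = "pre-panel"                              # S: lifecycle stage

    def contents(self, srs):
        """Union of explicitly added resources and the query's matches."""
        matched = {r for r in srs if self.query and self.query(r)}
        return self.resources | matched

panel = KnowledgeSpace(query=lambda r: "panel" in r, resources={"slides.pdf"})
print(panel.contents({"panel-notes.txt", "unrelated.csv"}))
```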
3 Knowledge Spaces Applications

Kspaces essentially are a general-purpose repository and a related API collection that can be used to develop applications for specific purposes around collecting, linking, sharing, and finding scientific knowledge. For example, a particular kind of kspace, “Instant Communities”, can reuse the kspace infrastructure and related APIs as foundations. Specifically, the “Instant Communities” kspace provides an IT infrastructure that helps create a “community of interest” in real time during a panel or session. Initially, material is created and posted before the panel by the panelists. This is an immediate body of knowledge that can be shared among panelists and participants. Then, during the panel, attendees who are listening and have a tablet or laptop available can add papers, comments, questions, slides, links, interesting datasets, and whatever else they feel is useful. After the panel, the goal of Instant Communities is to facilitate the collection and sharing of material, to keep the attendees in touch, and to extend the community with other interested people. People can also create their own “view” on this body of knowledge with a few clicks and drag and drop. One can do so by explicit selection or by filtering by poster, topic, and the like.
They can then share this view, or the entire space, with their team at home, with colleagues, with the entire instant community, etc. Incidentally, all this adding, selecting, and sharing of knowledge provides an implicit way to connect people, connect knowledge, and identify interesting knowledge (by looking at what people share). It is therefore a way to provide information that can be used for facilitating search and for assigning reputation to scientific resources. The detailed list of features, user stories, screenshots and implementation details of Instant Communities is available at http://open.instantcommunities.net. The application has been used in various conferences and seminar series and will be deployed this fall for intra-company usage. It is one of the ways in which kspaces tackle the challenges of bootstrapping and of usage: by providing knowledge capturing and sharing applications for specific purposes and communities.
4 Findings, Status and Next Steps

Kspaces are the result of several attempts and failures at arriving at a model for capturing knowledge, which we initially tackled by trying to impose a specific knowledge collection mechanism (that is, in our terminology, a single, specific kspaces application). The finding from the years of work on this tool is that, besides a proper conceptual model, we need very domain-specific and targeted applications if we want to lower the barriers to knowledge sharing based on the principles described in the introduction. The concept and a preliminary implementation of kspaces, in their various forms and designs, are being exploited in several different pilots in cooperation with the EU Commission (which used it at their flagship event for future and emerging technologies, fet11.eu), IEEE, Springer, the archaeology museum in Cambridge and major international conferences, to support the collection and sharing of knowledge in conferences, in technical communities, among scholars visiting museums, and in the generation of teaching material among groups of lecturers.

Acknowledgements. We acknowledge the great contributions to the ideas and the code from all the LiquidPub project team, with particular thanks to Alex Birukou and to all our kspaces developers, including Delsi Ayala, Simone Dalcastagne, Nicola Dorigatti, Lyubov Kolosovska, Muhammad Imran, Michele Lunelli, Aliaksey Minyukovich, Daniil Mirilenko, Alejandro Mussi, Cristhian Parra, Laura Pop. This work has been supported by the LiquidPub project (FP7 FET-Open 213360).
Reference 1. Baez, M., Birukou, A., Casati, F., Marchese, M.: Addressing Information Overload in the Scientific Community. IEEE Internet Computing 14(6), 31–38 (2010), doi:10.1109/MIC.2010.107
Analysis of Multiplayer Platform Users Activity Based on the Virtual and Real Time Dimension Jaroslaw Jankowski Faculty of Computer Science and Information Technology West Pomeranian University of Technology ul. Zolnierska 49, 71-410 Szczecin, Poland [email protected]
Abstract. The paper proposes an approach to modelling the behaviour and segmentation of online multiplayer systems’ users, based on the frequency and patterns of visits. The results presented are based on the analysis of time series in both real and virtual time, with the objective of quantitatively capturing the usage characteristics of online multiplayer platforms. Keywords: multiplayer platforms, time series analysis, web users’ behavior.
1 Introduction

Together with the development of web systems, the need to conduct analyses focused on studying users’ behaviour increases. Research in the field of online communities, virtual worlds and massively multiplayer platforms and games relates, among others, to user engagement [8] and social dynamics [2]. The big volume of collected data requires the usage of data mining methods, focused on areas such as personalisation [4], semantic processing [3] and user segmentation [7]. The changeability and time evolution of user behaviour, due to developing new technologies as well as a variety of other factors, is emphasized in the works of V. Venkatesh and M.G. Morris [11]. In this paper we focus on the visiting frequency and patterns of system usage. Earlier research in this field is based, among others, on World of Warcraft usage patterns. A study presented by P.Y. Tarng et al. [9] focused on predicting imminent discontinuation of service usage by analyzing online and offline periods. In their research, R. Thawonmas et al. [10] focused on revisitation patterns within online game environments and on the identification of the main groups of users. Here, we propose an alternative approach based on quantitative measurements of platform usage characteristics and user segmentation in both real and virtual time.
2 Motivation and Conceptual Framework

Existing studies have focused on various aspects of data analysis; however, they offer limited support for the specifics of a virtual reality that runs parallel to the real world. For mapping and parameter-determination purposes, a two-dimensional (bi-temporal) model of time representation is assumed in this paper, where occurrences are
registered both in real and virtual time. The real time vector can be represented linearly as a reference point for time series of events registered in virtual environments. A different real time interpretation can be used as well, as, for example, in the approach presented by S. Radas and S.M. Shugan, who proposed changing the speed at which time flows in selected periods [6]. In our analysis, the time spent online and the time spent offline without contact with multiplayer environments were treated as separate dimensions. The relation between virtual and real time is defined according to the formula:
\[ VTF = \frac{v \cdot n}{d} \tag{1} \]
where VTF is the virtual time factor, d is the number of real time days, v is the number of virtual days assigned to a single real day, and n is the number of time intervals taken into consideration for monitoring the user’s activity. To evaluate the similarity of time series in real and virtual time, the time warping distance method was applied. For series X(x1,x2,…,xn) and Y(y1,y2,…,ym), with lengths n and m respectively, the matrix M is defined as an interpoint relation between series X and Y, where the element Mij indicates the distance d(xi,yj) between xi and yj. The warping path with the lowest cost between the two series can be determined according to the formula [5]:
\[ D(X,Y) = \min_{W} \left\{ \sum_{k=1}^{K} d_k \right\}, \qquad w = w_1, w_2, \ldots, w_K \tag{2} \]
where dk = d(xi,yj) indicates the distance represented by wk = (i,j) on path w. Series with a higher level of similarity can be compared better because of the alignment and dependencies resulting from the dynamic time distance. In the proposed procedure, the users’ sequences in the virtual time dimension can be compared to a pattern of ideal time lapse with maximal possible system usage.
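To make the warping-path computation concrete, the sketch below implements the standard dynamic-programming recurrence behind formula (2). It is a minimal illustration rather than the author's implementation; the example series values and the absolute-difference point distance are assumptions.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between series x and y.

    Fills an (n+1) x (m+1) cost matrix where cell (i, j) holds the
    cheapest cumulative cost of any warping path aligning x[:i] with
    y[:j]; d(x_i, y_j) is taken here as the absolute difference.
    """
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # d(x_i, y_j)
            # A path may reach (i, j) by insertion, deletion or a match.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# Distance of a user's virtual-time sequence from an "ideal" vector of
# uninterrupted daily usage (the values here are illustrative only).
user_sequence = [1, 2, 2, 3, 5, 8, 9]
ideal_vector = list(range(1, 8))  # one virtual day per real day
print(dtw_distance(user_sequence, ideal_vector))
```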
3 Empirical Data Analysis Based on Real and Virtual Time

In the next step, real and virtual time were identified for a typical dataset acquired from a multiplayer platform, recording users’ visits over time. We focused on the number of visits and usage patterns; however, other measures of user activity could be added as additional dimensions. Fig. 1 shows the relation between real time Rd and virtual time Vd for three different users. For example, user u1 shows a consistently high frequency of system usage, with real and virtual time being equal at all times, indicating continuous system use. For user u6 the distance between real and virtual time initially increased (for example, at point A, Rd=20 and Vd=12). In the following days the distance changes only slightly, up until virtual day Vd=19, after which there is stability up to the point with Vd=41.
Fig. 1. Relative behaviour characteristics for users u1, u5 and u6
Fig. 2. Distance from ideal vector calculated with dynamic time warping
After that the distance rapidly increases and eventually, at point B, Vd=49 and Rd=100. Similar activity can be observed for user u5. Fig. 2 presents the changeable dependency for a set of 4410 users, which showed a change of sequence length in the virtual dimension. The D axis shows the distance to the ideal vector calculated using the DTW method. Users (U axis) are ordered by sequence length, and together with its change the minimum and maximum distance to the ideal vector increases. For example, point A represents user u92, with a distance to the ideal vector equal to 68 and sequence length 74. Point B represents user u183 with distance 574 and sequence length 64. In the next step users were divided into four classes, with users at low risk of terminating service usage in the first class, up to high-risk users in the fourth class. The assumed number of classes was based on generalised results from R. Thawonmas et al. [10]. Patterns of behaviour of users active for at least fifty days were analysed in relation to the ideal vector, in clusters with 68, 71, 115 and 144 users respectively. Forecasts of behaviour in further periods were made using classification and regression trees [1]. Table 1 presents the classification matrix based on regression trees for the acquired results.

Table 1. Classification matrix for 10- and 20-day sequences (rows: actual class; columns: forecast class)

Period: 10 days

Class    | Forecast 1 | Forecast 2 | Forecast 3 | Forecast 4
1        |  4.41%     | 13.24%     | 11.76%     | 70.59%
2        |  0.00%     | 25.00%     | 26.39%     | 48.61%
3        |  0.00%     |  8.62%     | 33.62%     | 57.76%
4        |  0.69%     |  2.78%     | 14.58%     | 81.94%
Elements |  4         | 41         | 87         | 266
Total %  |  1.00%     | 10.25%     | 21.75%     | 67.00%

Period: 20 days

Class    | Forecast 1 | Forecast 2 | Forecast 3 | Forecast 4
1        | 64.71%     |  5.88%     | 23.53%     |  5.88%
2        | 29.17%     | 26.39%     | 23.61%     | 20.83%
3        | 27.59%     |  5.17%     | 57.76%     |  9.48%
4        | 15.97%     |  7.64%     | 25.69%     | 50.69%
Elements | 120        | 40         | 137        | 101
Total %  | 30.00%     | 10.00%     | 34.25%     | 25.75%
For every user ui, distance parameters were determined, dependent on real time Rd, virtual time Vd and the distance D from the ideal vector. When the analysed period was extended from ten to twenty days, the accuracy of prediction increased for all classes apart from the second class, where the results were burdened with a larger deviation. The
combination of dynamic time warping and classification methods in the virtual and real time dimensions enables one to use quantitative measures of similarity in relation to the ideal vector, at relatively low computational cost. Compared to the solutions presented by P.Y. Tarng et al. [9] and R. Thawonmas et al. [10], our approach makes it possible to compute a single quantitative measurement of a user’s activity, instead of having to deal with offline and online time directly. Measurements of players’ similarity make it possible to identify usage patterns based on the distance function from the ideal vector.
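As a rough illustration of the forecasting step, the sketch below trains a classification tree on per-user features (Rd, Vd and the DTW distance D) and prints a row-normalized classification matrix in the spirit of Table 1. It assumes scikit-learn and uses synthetic feature arrays; the paper's actual features, labels and tree settings are not specified beyond reference [1].

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
# Synthetic stand-in for per-user parameters: [Rd, Vd, D-to-ideal].
X = rng.random((400, 3))
y = rng.integers(1, 5, size=400)  # risk classes 1..4

tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
pred = tree.predict(X)

# Rows: actual class, columns: forecast class, as in Table 1.
cm = confusion_matrix(y, pred, labels=[1, 2, 3, 4])
print(cm / cm.sum(axis=1, keepdims=True))  # row percentages
```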
4 Summary and Future Work

The proper recognition of the needs and behavioural tendencies of web users provides the basis for making rational decisions and better adapting online services to users’ needs. The solutions presented so far in the literature did not include a two-dimensional approach to the character of the data and time series. The approach presented in this paper is an attempt to quantitatively estimate users’ behaviour with the use of reference sequences. The proposed analysis makes it possible to use measurements of users’ similarity with a bi-temporal representation across several dimensions of system usage. One of the areas open for future research is assessing the accuracy of sequence similarity calculations, as well as the possibility of including measurements of users’ social activity (and not only visits) in our analysis.
References

1. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Chapman and Hall, New York (1984)
2. Ducheneaut, N., Yee, N., Nickel, E., Moore, R.: Alone Together: Exploring the Social Dynamics of Massively Multiplayer Online Games. In: Proceedings of the ACM CHI 2006 Conference on Human Factors, Quebec, pp. 407–416 (2006)
3. Eirinaki, M., Lampos, H., Vazirgiannis, M., Varlamis, I.: SEWeP: Using Site Semantics and a Taxonomy to Enhance the Web Personalization Process. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York (2003)
4. Mobasher, B., Cooley, R., Srivastava, J.: Automatic Personalization Based on Web Usage Mining. Communications of the ACM 43(8), 142–151 (2000)
5. Rabiner, L.R., Juang, B.: Fundamentals of Speech Recognition. Prentice-Hall, Englewood Cliffs (1993)
6. Radas, S., Shugan, S.M.: Seasonal Marketing and Timing New Product Introductions. Journal of Marketing Research 35(3), 296–315 (1998)
7. Rho, J., Moon, B., Kim, Y., Yang, D.: Internet Customer Segmentation Using Web Log Data. Journal of Business & Economics Research 2(11), 234–249 (2004)
8. Sweetser, P., Wyeth, P.: GameFlow: A Model for Evaluating Player Enjoyment in Games. ACM Computers in Entertainment 3(3), 1–23 (2005)
9. Tarng, P.Y., et al.: An Analysis of WoW Players’ Game Hours. In: NetGames 2008, Worcester (2008)
10. Thawonmas, R., Yoshida, K., Lou, J.-K., Chen, K.-T.: Analysis of Area Revisitation Patterns in World of Warcraft. In: Natkin, S., Dupire, J. (eds.) ICEC 2009. LNCS, vol. 5709, pp. 13–23. Springer, Heidelberg (2009)
11. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User Acceptance of Information Technology: Toward a Unified View. MIS Quarterly 27(3), 425–478 (2003)
Tracking Group Evolution in Social Networks

Piotr Bródka1,2, Stanisław Saganowski1, and Przemysław Kazienko1,2

1 Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland
2 Research Engineering Center Sp. z o.o., ul. Strzegomska 46B, 53-611 Wrocław, Poland
[email protected], [email protected], [email protected]
Abstract. Easy access to vast amounts of data, especially collected over long periods of time, makes it possible to divide a social network into timeframes and create a temporal social network. Such a network enables the analysis of network dynamics. One aspect of these dynamics is the analysis of social community evolution, i.e., how a particular group changes over time. To do so, the complete group evolution history is needed. That is why this paper presents a new method for group evolution extraction called GED. Keywords: social network, community evolution, GED.
1 Introduction

One area of science that has been growing rapidly in recent years is social network analysis. Among the main reasons for this are the growing number of different social networking systems and the growth of the Internet, together with simple and continuous ways to obtain the data from which social networks can be extracted. Group extraction and group evolution are among the topics that arouse the greatest interest in the domain of social network analysis. However, while grouping methods for social networks are developed very dynamically, methods for group evolution discovery and analysis are still ‘uncharted territory’ on the social network analysis map. In recent years only a few methods for tracking changes of social groups have been proposed: [2], [3], [5], [6]. Therefore, in this paper the new method for group evolution discovery called GED is proposed, analysed and compared with two methods, by Asur and by Palla. It should also be mentioned that this article is an extension and continuation of the research presented in [1].
2 Group Evolution

Before the method can be presented, it is necessary to describe a few concepts related to social networks. A temporal social network TSN is a list of succeeding timeframes (time windows) T. Each timeframe is in fact one social network SN(V,E), where V is a set of vertices and E is a set of directed edges <x,y>: x,y∈V, x≠y:
\[ TSN = \langle T_1, T_2, \ldots, T_m \rangle, \quad m \in \mathbb{N} \]
\[ T_i = SN_i(V_i, E_i), \quad i = 1, 2, \ldots, m \tag{1} \]
\[ E_i = \{ \langle x, y \rangle : x, y \in V_i,\ x \neq y \}, \quad i = 1, 2, \ldots, m \]

The evolution of a particular social community can be represented as a sequence of events (changes) following each other in the successive time windows (timeframes) within the temporal social network. The possible events in social group evolution are:

1. Continuing (stagnation) – the group continues its existence when two groups in consecutive time windows are identical, or when the two groups differ only by a few nodes but their size remains the same.
2. Shrinking – the group shrinks when some nodes have left the group, making its size smaller than in the previous time window. A group can shrink slightly, i.e., by a few nodes, or greatly, losing most of its members.
3. Growing (opposite to shrinking) – the group grows when some new nodes have joined the group, making its size bigger than in the previous time window. A group can grow slightly as well as significantly, doubling or even tripling its size.
4. Splitting – the group splits into two or more groups in the next time window when a few groups from timeframe Ti+1 consist of members of one group from timeframe Ti. We can distinguish two types of splitting: (1) equal, which means the contributions of the groups in the split group are almost the same, and (2) unequal, when one of the groups has a much greater contribution in the split group; for this one group the event might be similar to shrinking.
5. Merging (reverse to splitting) – the group has been created by merging several other groups when one group from timeframe Ti+1 consists of two or more groups from the previous timeframe Ti. A merge, just like a split, might be (1) equal, which means the contributions of the groups in the merged group are almost the same, or (2) unequal, when one of the groups has a much greater contribution to the merged group. In the second case, for the biggest group the merging might be similar to growing.
6. Dissolving happens when a group ends its life and does not occur in the next time window, i.e., its members have vanished or stopped communicating with each other and scattered among the rest of the groups.
7. Forming (opposite to dissolving) of a new group occurs when a group which did not exist in the previous time window Ti appears in the next time window Ti+1. In some cases, a group can be inactive over several timeframes; such a case is treated as the dissolving of the first group and the forming of the second, new one.
3 Tracking Group Evolution in Social Networks

To match two groups from consecutive timeframes, the GED method takes into consideration both the quantity and the quality of the group members. To express group member quality, one of the centrality measures may be used. In this article the authors have decided to utilize the social position (SP) measure [4] to reflect the quality of group members.
To track social community evolution in a social network, the new method called GED (Group Evolution Discovery) was developed. The key element of this method is a new measure called inclusion, which evaluates the inclusion of one group in another. The inclusion of group G1 in group G2 is calculated as follows:

\[ I(G_1, G_2) = \underbrace{\frac{|G_1 \cap G_2|}{|G_1|}}_{\text{group quantity}} \cdot \underbrace{\frac{\sum_{x \in (G_1 \cap G_2)} SP_{G_1}(x)}{\sum_{x \in G_1} SP_{G_1}(x)}}_{\text{group quality}} \tag{2} \]
Naturally, instead of social position (SP), any other measure which indicates user importance can be used, e.g., degree centrality, betweenness centrality, PageRank, etc. It is important, however, that this measure is calculated for the group and not for the whole social network, in order to reflect a node’s position in the group rather than in the network as a whole. As mentioned earlier, the GED method, used to track group evolution, takes into account both the quantity and quality of the group members. The quantity is reflected by the first part of the inclusion measure, i.e., what portion of G1 members is shared by both groups G1 and G2, whereas the quality is expressed by the second part, namely what contribution of the important members of G1 is shared by both groups. This provides a balance between groups which contain many less important members and groups with only a few but key members. It is assumed that only one event may occur between two groups (G1, G2) in consecutive timeframes; however, one group in timeframe Ti may have several events with different groups in Ti+1.
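A direct transcription of the inclusion measure into code might look like the sketch below; the group membership sets and the per-group social position scores are assumed inputs (any per-group importance measure can be substituted, as noted above).

```python
def inclusion(g1, g2, sp_g1):
    """Inclusion I(G1, G2) of group g1 in group g2, formula (2).

    g1, g2 -- sets of member identifiers
    sp_g1  -- dict mapping each member of g1 to its social position
              (or any other importance measure) computed within g1
    """
    if not g1:
        return 0.0
    overlap = g1 & g2
    quantity = len(overlap) / len(g1)
    quality = sum(sp_g1[x] for x in overlap) / sum(sp_g1[x] for x in g1)
    return quantity * quality

g1 = {"a", "b", "c", "d"}
g2 = {"c", "d", "e"}
sp = {"a": 0.4, "b": 0.1, "c": 0.3, "d": 0.2}
print(inclusion(g1, g2, sp))  # 0.5 * 0.5 = 0.25
```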
GED – Group Evolution Discovery Method

Input: a TSN in which, at each timeframe Ti, groups are extracted by any community detection algorithm, and any user importance measure is calculated.

1. For each pair of groups in consecutive timeframes Ti and Ti+1, the inclusion of G1 in G2 and of G2 in G1 is calculated according to equation (2).
2. Based on the inclusions and the sizes of the two groups, one type of event may be assigned:
   a. Continuing: I(G1,G2) ≥ α and I(G2,G1) ≥ β and |G1| = |G2|
   b. Shrinking: I(G1,G2) ≥ α and I(G2,G1) ≥ β and |G1| > |G2|, OR I(G1,G2) < α and I(G2,G1) ≥ β and |G1| ≥ |G2| and there is only one match (matching event) between G2 and all groups in the previous time window Ti
   c. Growing: I(G1,G2) ≥ α and I(G2,G1) ≥ β and |G1| < |G2|, OR I(G1,G2) ≥ α and I(G2,G1) < β and |G1| ≤ |G2| and there is only one match (matching event) between G1 and all groups in the next time window Ti+1
   d. Splitting: I(G1,G2) < α and I(G2,G1) ≥ β and |G1| ≥ |G2| and there is more than one match (matching events) between G2 and all groups in the previous time window Ti
   e. Merging: I(G1,G2) ≥ α and I(G2,G1) < β and |G1| ≤ |G2| and there is more than one match (matching events) between G1 and all groups in the next time window Ti+1
   f. Dissolving: for G1 in Ti and each group G2 in Ti+1, I(G1,G2) < 10% and I(G2,G1) < 10%
   g. Forming: for G2 in Ti+1 and each group G1 in Ti, I(G1,G2) < 10% and I(G2,G1) < 10%
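The rules above can be expressed as a simple cascade of comparisons. The sketch below is one possible reading for a single pair of groups, with α, β and the match counts supplied by the caller; note that dissolving and forming additionally require the 10% condition to hold against every group in the neighbouring timeframe, which a caller would verify across all pairs.

```python
def assign_event(i12, i21, size1, size2, matches_prev, matches_next,
                 alpha=0.5, beta=0.5):
    """Assign a GED event type to a pair of groups (G1 in Ti, G2 in Ti+1).

    i12, i21     -- I(G1,G2) and I(G2,G1)
    matches_prev -- number of matches between G2 and groups in Ti
    matches_next -- number of matches between G1 and groups in Ti+1
    """
    if i12 < 0.1 and i21 < 0.1:
        # Candidate for dissolving/forming; must hold for ALL groups
        # in the neighbouring timeframe, checked by the caller.
        return "dissolving/forming"
    if i12 >= alpha and i21 >= beta:
        if size1 == size2:
            return "continuing"
        return "shrinking" if size1 > size2 else "growing"
    if i12 < alpha and i21 >= beta and size1 >= size2:
        return "shrinking" if matches_prev == 1 else "splitting"
    if i12 >= alpha and i21 < beta and size1 <= size2:
        return "growing" if matches_next == 1 else "merging"
    return "no event"
```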
The scheme that facilitates understanding of the event selection for a pair of groups is presented in Figure 1.
Fig. 1. The decision tree for assigning the event type to the group
The indicators α and β are the GED method parameters, which can be used to adjust the method to a particular social network and community detection method. Based on their experimental analysis, the authors suggest that the values of α and β should be in the range [50%, 100%].

Acknowledgments. The work was supported by: the Fellowship co-financed by the European Union within the European Social Fund; The Polish Ministry of Science and Higher Education research project, 2010–13; and the training in "Green Transfer" co-financed by the EU from the European Social Fund.
References

1. Bródka, P., Saganowski, S., Kazienko, P.: Group Evolution Discovery in Social Networks. In: ASONAM 2011, Taiwan, July 25-27. IEEE Computer Society, Los Alamitos (2011)
2. Chakrabarti, D., Kumar, R., Tomkins, A.: Evolutionary Clustering. In: KDD 2006, Philadelphia, Pennsylvania, USA, August 20-23 (2006)
3. Kim, M.-S., Han, J.: A Particle and Density Based Evolutionary Clustering Method for Dynamic Networks. In: Proceedings of the 2009 Int. Conf. on Very Large Data Bases (2009)
4. Musial, K., Kazienko, P., Bródka, P.: User Position Measures in Social Networks. In: SNA-KDD 2009, Article 6, 9 pages. ACM, New York (2009)
5. Palla, G., Barabási, A.L., Vicsek, T.: Quantifying Social Group Evolution. Nature 446, 664–667 (2007)
6. Sun, J., Papadimitriou, S., Yu, P., Faloutsos, C.: GraphScope: Parameter-free Mining of Large Time-evolving Graphs. In: KDD 2007 (2007)
Gathering in Digital Spaces: Exploring Topical Communities on Twitter

Cate Huston1 and Michael Weiss2

1 Google Canada [email protected]
2 Carleton University [email protected]
Abstract. On Twitter, hashtags allow users to gather around a topic in a digital space, something that has been common since early IRC and internet chat rooms. However, there are three important differences when gathering on Twitter: persistence, invitation, and device independence. In this paper, we search for patterns in these digital spaces through the use of visualization to explore the temporal rhythms that emerge. Keywords: social networking, communities, visualization, twitter.
1 Introduction and Related Work

On Twitter, hashtags allow users to gather around a topic. Sometimes these are one-off and transient ("#lessambitiousmovies"), but some are recurring ("workwednesday"). Event hashtags allow users to gather in a digital space as well as a physical one. Gathering in digital spaces has been common since early internet chat rooms. However, there are three important differences when gathering on Twitter. First, persistence: users who did not participate can find the content afterwards. Second, invitation is not necessary: users who do not participate, but follow users who do, may see the topic in their stream and join in. Third, device independence: users do not need to be at a computer, as there is a good user experience on all smart phones. Many conferences now specify a hashtag in order to aid discoverability of other participants. The use of micro-blogging for documenting conferences was explored in [1]. However, the use of un-moderated user-generated tweets at conferences can pose new issues for speakers and organizers; a live Twitter stream at Web 2.0 Expo proved to be an unpleasant experience for one speaker [2, 3]. In this paper, we use visualization to explore temporal rhythms. The rest of this paper is organized as follows. The next section describes our research method and data, and presents the results of two case studies (future of the news, ESE conference). Finally, we discuss our results and identify future work.
2 Research Method and Results

We collected data about two digital spaces. The first dataset comprises two months of tweets from 20 users deemed influential on the future of the news (via screen
scraping). The second dataset was collected using the Twitter search feature (http://search.twitter.com) to identify tweets containing the hashtag for the Eclipse Summit Europe 2010 conference (#ese), using the API (http://dev.twitter.com). The second dataset has fewer tweets, but more users. It is richer in other attributes such as client usage and join date. We also collected overall statistics for each user: specifically, the total number of tweets, the number of tweets containing @mentions, the number of @replies, and the number of distinct users mentioned.

Future of the News

An overview of the Future of the News sample is in Figure 1. We notice some outliers in terms of volume: particularly engaged users (using @replies to interact with large numbers of people). We created graphs for each user to show weekly rhythms and the relative prevalence of each kind of tweet (Figure 2) – directed (starts with @); undirected, but containing an @; containing a link, but not an @; and "none of the above". We created graphs of each type for all users, identifying users who have different patterns of use. Figure 3 shows a breakdown of directed tweets; Figure 4 shows tweets containing neither mentions nor links. These graphs aid the identification of different types of behavior. For users who are highly interactive, we expect to see spikes in directed @replies. Users who credit others will spike in @mentions. Some users will be a source of information; for them we expect to see spikes in the links graphs. Users tweeting thoughts without credit or the support of a longer article (no link), or users who "me-form" [4], will spike in the graph of tweets containing neither mentions nor links. The day breakdowns for individual users (Figure 2) are insufficiently fine-grained to see the rhythms of a user's day. Thus, we created a custom visualization using Processing. The color scheme is as follows: pink is directed, purple contains a mention, orange contains a link, and grey is "none of the above". We can see that Dave Winer's stream (containing mostly URLs without mentions) barely slows at night (Figure 5); we also see bursty replies (Figure 6), suggesting that the user batches replies to messages directed at them, and series of tweets containing neither mentions nor links (Figure 7), suggesting that the user might be tweeting a "stream of consciousness" or constructing an argument in a series of tweets (rather than a blog post).

Exploring a Conference Hashtag

Because this dataset was collected via the Twitter API, we have more information about client usage, location, etc. It contains more users (181) and 640 tweets (only one could not be retrieved). The distribution amongst users was extremely skewed. The vast majority of users tweeted the hashtag only once (Figure 8); a few users, the subset "live-tweeting", tweeted between 14 and 26 times. The temporal rhythms that emerged (Figure 9) did not yield information as to which sessions were particularly popular ("tweetable"), but the rhythms are fairly consistent, quickly tapering off post-conference. The color scheme is as before.
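The four-way breakdown used in the graphs (directed; mentioning; link-only; none of the above) can be reproduced with a few string tests. The sketch below is an illustrative simplification that ignores Twitter's exact entity parsing; the example tweets are invented.

```python
def categorize(tweet: str) -> str:
    """Bucket a tweet for the FOTN-style breakdown.

    directed -- starts with @ (an @reply)
    mention  -- contains an @ elsewhere (an @mention)
    link     -- contains a URL but no @
    none     -- neither mentions nor links
    """
    if tweet.startswith("@"):
        return "directed"
    if "@" in tweet:
        return "mention"
    if "http://" in tweet or "https://" in tweet:
        return "link"
    return "none"

for t in ["@davewiner agreed!",
          "great piece by @davewiner http://t.co/x",
          "http://t.co/x worth reading",
          "thinking out loud"]:
    print(categorize(t), "-", t)
```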
Fig. 1. User Stats for FOTN
Fig. 2. Dave Winer Day Breakdown
Fig. 3. FOTN Directed Day Breakdown
Fig. 4. FOTN Mentions, Day Breakdown
Fig. 5. FOTN Temporal Rhythms - Dave Winer
Fig. 6. FOTN Temporal Rhythms - Dr. Mark Drapeau
Fig. 7. FOTN Temporal Rhythms - Kirk LaPointe
Fig. 8. Exploring a Conference Hashtag, Tweet Count Frequency
Fig. 9. Temporal Rhythms for #ESE (with and without directed)
3 Conclusion and Future Work

In this paper, we presented our visualizations of the temporal rhythms of two topical communities on Twitter. These allow us to identify behavioral patterns at the macro level (engagement) and micro level (bursting, stream of consciousness). Visualizing the conference stream did not seem to expose any interesting patterns; perhaps combining the visualization with textual analysis (keywords, sentiment) would yield more information. We plan to apply these techniques to more datasets to compare and contrast patterns. For example, how are the communities that surround more transient hashtags (those that appear and disappear in a short amount of time, such as #lessambitiousmovies) different from the communities we explored here? What do the temporal rhythms of conference tweeting tell us about the conference itself?
References

1. Saunders, N., Beltrao, P., Jensen, L., Jurczak, D., Krause, R., Kuhn, M., Wu, S.: Microblogging the ISMB: A New Approach to Conference Reporting. PLoS Computational Biology 5 (2009)
2. Boyd, D.: Spectacle at Web2.0 Expo... from my perspective, http://www.zephoria.org/thoughts/archives/2009/11/24/spectacle_at_we.html
3. Gumption: The Dark Side of Digital Backchannels in Shared Physical Spaces, http://gumption.typepad.com/blog/2009/12/the-dark-side-of-digital-backchannels-in-shared-physical-spaces.html
4. Naaman, M., Boase, J., Lai, C.H.: Is it Really About Me?: Message Content in Social Awareness Streams. In: Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pp. 189–192 (2010)
“Eco-MAME”: Ecology Activity Promotion System Based on Human Psychological Characteristics Rie Tanaka, Shinichi Doi, Taku Konishi, Naoki Yoshinaga, Satoko Itaya, and Keiji Yamada C&C Innovation Research Laboratories, NEC Corporation, 8916-47 Takayama-cho, Ikoma-Shi, Nara, Japan [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. This study addresses the construction of an activity promotion system called "Eco-MAME: Ecological platform for Motivating Activities with Mutual Effect" to encourage people to start or continue environmentally conscious activities in local communities. We use a major psychological model explaining the factors behind the intention to perform ecological behaviors, and focus on people who already have intentions for goals but lack intentions for execution, namely, people who cannot carry out activities even though they understand their importance. The Eco-MAME system visualizes one's own and others' activities to activate the factors of intention. We emphasize showing the activities of others to activate factors related to social norms or responsibility. We implemented the proposed system as a web site and conducted a local experiment; the results showed that the more a user viewed the page, the more he/she reduced energy usage, that is, carried out more activities. Keywords: Activity Promotion, Visualization, Personalization.
1 Introduction
When people carry out environmentally conscious activities or voting activities in elections, which are seen as important activities these days, each individual activity has a small effect, and a set of activities has larger effects on the whole environment or society. However, some people do not try to carry out such activities because they cannot achieve the whole effect or they cannot see similar activities by the people around them. Our purpose is to construct a framework to promote or change such kinds of activities. We take environmentally conscious activities as one example and design a system called "Eco-MAME: Ecological platform for Motivating Activities with Mutual Effect", which promotes ecological activities in local communities by visualizing them. Some people make no attempt to put activities into action even if they know the activities are important, and we focus on such people in this study. We incorporate psychological knowledge in the Eco-MAME system: Eco-MAME promotes people's activities by activating activity intentions through integrating and visualizing information related to their activities
while considering individual traits. In this paper, we introduce the psychological factors relevant to ecological activities and the whole system in Section 2. Then, we explain the actual system, implementing part of the proposed architecture, and the results of the social experiment.
Fig. 1. System architecture of “Eco-MAME”
2 Psychological Factors Related to Environmentally Conscious Behaviors and System Architecture
As one of the major psychological models for anticipating and explaining the factors of intention for ecological behaviors, a model regarding environmentally conscious behaviors and their determinants was proposed by Hirose [1]. This model defines two kinds of intention: (A) intention for a goal and (B) intention for execution. The following perceptions affect the intention for a goal: (A1) perception of risk: "Is our earth in danger?"; (A2) perception of responsibility: "Do we have a responsibility to save the earth?"; (A3) perception of control: "Can we control the environment?". The following perceptions affect the intention for execution: (B1) perception of execution possibility: "How easy is the behavior?"; (B2) perception of cost (benefit): "How great is the cost of the behavior?"; (B3) perception of social norm: "Is the behavior socially acceptable?". If one or more of the determinants A1, A2 and A3 are activated, the person has an intention for a goal, and if one or more of the determinants B1, B2 and B3 are activated, he/she has an intention for execution and carries out some actions. We propose to stimulate these determinants by showing one's own and others' activities along with environmental information. To activate determinant A2 or B3, it is necessary to show the behaviors of other people in the same community in addition to one's own activity. Since it is important that many people carry out ecological activities together to achieve a whole goal, we emphasize showing the activities of others to activate A2 or B3. Figure 1 shows the system architecture. The system receives activity data of users from the input interface or sensors and visualizes it to users (1), 2), 3) and 4) in Fig. 1). What kind of information in Fig. 1 is important differs from one person to another; namely, different people value different determinants from A1
to B3 (as reported in [2], [3]), and the system changes the method of visualization according to such individual traits. For instance, the system gives this advice to people who care about others in the community or value togetherness, namely, who value the perception of the social norm (B3): "People in this town set their air conditioners to 29 degrees. We have reduced CO2 emissions by 100 tons this year."
Fig. 2. Implemented Web Site
3 Implemented System and Local Experiment
This study focuses on people who have intentions for goals but do not have intentions for execution. We implemented information items 1), 2) and 3) in Fig. 1 as a web page and conducted an experiment with local residents. In the experiment, users manually input energy consumption, such as power or gas usage, and the web page shows the following information as realizations of determinants B1, B2 and B3.

Visualization 1: history of a user's actions – 1) in Fig. 1. This can be an indication of the next action and is effective for activating B1. The web page shows energy use as a historical graph.

Visualization 2: other users in the community – 2) in Fig. 1. This shows how others in the community act and is effective for activating B3. The social norm means roughly: "I should do this action because everyone else does." The web page shows the average of all participants as an overview within the historical graph (visualization 1). It also shows, as detailed information, the ranking and icons of each user on the local map, reflecting the degree of energy saving.

Visualization 3: indication or effect of an action – 1) and 3) in Fig. 1. This shows what kind of result or benefit we can get by carrying out the activity, and is effective for activating B1 or B2. The web page shows the amount of CO2 emissions and the expense accompanying energy usage.

Figure 2 shows the important parts of the web page: the graph comprising visualization 1 and the overview part of visualization 2, and the ranking and icons comprising the detailed part of visualization 2. We conducted the experiment for 5 months with 13 families and obtained valid power usage data from 8 families. At first, we checked the change in usage: the tendency of change was almost the same in all 8 families; 3 families reduced usage compared to the same month of the previous year, while 5 families increased usage.
Second, we analyzed the correlation between visualization and the reduction of usage compared to the previous year by focusing on the relationship between the number of accesses to each page and the rate of reduction in each month. As a result, for the page showing the graph, there is a correlation, which means that the more a user viewed the page, the more he/she could reduce usage, as shown in Fig. 3. For the page showing the ranking and user icons, access frequency was half that of the graph page; over the whole period of the experiment there was little positive correlation for the ranking page and a negative correlation for the user icon page. However, in the months when people viewed the page frequently, they reduced their usage. Therefore we can say that visualizations 1 and 2 had an effect in the periods when users visited the web page frequently. This means it is necessary to add some framework or functions that continue to attract users' attention in order to sustain the promotion of the activities.
[Figure 3 plots, for each month from Nov. 2009 to Mar. 2010, the number of accesses (y-axis, 0 to 30) against the rate of reduction in % (x-axis, -100 to 40), with a fitted gradient of 0.0528.]
Fig. 3. Traffic of the page showing the graph and rate of reduction
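The reported gradient is the slope of a least-squares fit of monthly page accesses against the reduction rate. A minimal recomputation would look like the sketch below; the (reduction, accesses) pairs are hypothetical stand-ins, not the study's raw data.

```python
import numpy as np

# Hypothetical monthly (rate of reduction %, number of accesses) pairs.
reduction = np.array([-80.0, -40.0, -10.0, 0.0, 15.0, 30.0])
accesses = np.array([2, 5, 9, 10, 14, 18])

# Slope of the least-squares line: accesses ~ slope * reduction + intercept.
slope, intercept = np.polyfit(reduction, accesses, deg=1)
print(round(slope, 4))  # the paper reports a gradient of 0.0528
```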
4 Conclusion
This study addressed the construction of a mechanism to promote people's social behaviors relevant to other people in the community, and proposed an activity promotion system called "Eco-MAME" for ecological activities. We designed the system using psychological knowledge and visualization methods for people's behaviors, focusing on people who cannot carry out activities even though they understand their importance. We conducted a local experiment with the implemented web site, and the results showed that the visualization had an effect on the promotion of ecological activities when participants visited the web page frequently. In future work, we will improve our web site to continue to attract users' attention, and will address other visualization methods referring to individual traits that we have not implemented yet.
References

1. Hirose, Y.: Determinants of Environment-Conscious Behavior. Japanese Journal of Social Psychology 10(1), 44–55 (1994)
2. M1-F1 Research Institute: Analysis Report Vol. 11. Environmental Awareness of Young People: 7 Types of Classification Related to Real Intention Toward Ecology (2009), http://m1f1.jp/m1f1/files/report_090909.pdf
3. Mitsubishi Research Institute and ASATSU-DK: ECODAS 2009 (2009), http://www.mri.co.jp/NEWS/press/2009/__icsFiles/afieldfile/2009/11/26/pr090603_mcu00.pdf
SPLASH: Blending Gaming and Content Sharing in a Location-Based Mobile Application Dion Hoe-Lian Goh, Chei Sian Lee, Alton Y.K. Chua, Khasfariyati Razikin, and Keng-Tiong Tan Wee Kim Wee School of Communication and Information, Nanyang Technological University, Singapore {ashlgoh,leecs,altonchua,khasfariyati,kttan}@ntu.edu.sg
Abstract. In this demonstration, we introduce SPLASH (Seek, PLAy, SHare), a mobile application which blends gaming with content sharing and socializing activities. SPLASH is a human computation game that generates location-based content as a byproduct of gameplay. The entertainment derived from gameplay is harnessed to motivate users to contribute content. A detailed description of the features in SPLASH and its distinctive characteristics will also be presented. Keywords: Mobile application, content sharing, human computation game.
1 Introduction

Fueled by technological advancements in mobile devices, social computing applications have empowered users to create, share and seek media-rich location-based content. These applications are fast becoming popular, in part due to people's increasing reliance on mobile phones and their myriad uses beyond voice calling [5]. However, it remains a challenge to motivate users to contribute useful information in the long run. This is because the repetitive actions required to perform these tasks will dull enthusiasm over time [2], and contributors share content on the basis of their goodwill and other intrinsic motivations [3]. One promising approach to promoting content sharing is to incorporate games into such activities. These applications, termed Human Computation Games (HCGs) [1, 4], exploit the element of fun as motivation to harness human intelligence. Computations or tasks (e.g., sharing content) are executed by players while they derive entertainment from gameplay. Mobile applications embodying these characteristics have become increasingly popular recently, which is unsurprising given the tremendous growth of the gaming industry. In this demonstration, we introduce SPLASH (Seek, PLAy, SHare), a mobile application which blends content sharing with gaming activities. It is a mobile HCG designed to promote the seeking and sharing of content. That is, while users are entertained through playing the game, they generate location-based content as a byproduct [3]. Such content can be accessed for other users' benefit. The different features that support content sharing and gameplay will be demonstrated. Various usage scenarios of SPLASH will also be highlighted.
2 Introducing SPLASH

At its core, SPLASH allows users to contribute and access location-based content. Layered upon this service are gaming features that give users the opportunity to concurrently engage with content through play. The application was developed for the Android mobile platform. Content in SPLASH takes the form of media-rich location-based information known as "comments". Each comment comprises a title, tags, a description, one or more media elements (e.g., photos) and ratings, which are an indicator of content quality (Figure 1). Other information such as author, date and location is implicitly captured by the system at creation time. In SPLASH's content model, a real-world location is organized into two conceptual levels. "Places" represent an arbitrary geographic area that holds comments; examples include buildings, parks, points of interest, and so on. Places may be further divided into "units", with each unit containing its own set of comments. For example, a mall in the real world could be represented by a place in SPLASH. As the mall has multiple stores, each store is considered a SPLASH unit and contains comments related to it. Note, however, that units are optional, and a place need not be subdivided if there is no necessity for doing so.
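The place/unit/comment hierarchy maps naturally onto a small set of record types. The sketch below shows one hypothetical encoding; the field names are ours for illustration and do not describe SPLASH's actual API or storage schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Comment:
    title: str
    tags: List[str]
    description: str
    media: List[str]                       # e.g. photo URLs
    ratings: List[int] = field(default_factory=list)
    author: str = ""                       # captured implicitly at creation
    created: str = ""
    location: Optional[Tuple[float, float]] = None  # (lat, lon)

@dataclass
class Unit:                                # e.g. one store inside a mall
    name: str
    comments: List[Comment] = field(default_factory=list)

@dataclass
class Place:                               # e.g. the mall itself
    name: str
    comments: List[Comment] = field(default_factory=list)
    units: List[Unit] = field(default_factory=list)  # optional subdivision
```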
Fig. 1. A user contributed location-based comment
Fig. 2. Markers on the map indicate availability of virtual rooms
In SPLASH, content sharing features are entwined with gaming features through virtual rooms where users interact, share content and play games. These rooms are designed to establish a sense of community among users, which has been demonstrated to foster content sharing [3]. Put differently, each place or unit is represented by a virtual room, which in turn provides a platform for accessing SPLASH's content sharing and gaming features. SPLASH offers a map interface for accessing virtual rooms (Figure 2). Each marker on the map represents a real-world location (place or unit) where virtual rooms are available. Users navigate the map by panning and zooming. Accessing a virtual room
is accomplished by selecting a marker of interest or by keyword search. In densely populated areas with many virtual rooms, users are first presented with a list of available rooms for selection. Figure 3 shows an example of a virtual room. Each virtual room contains a comment board for accessing content associated with that location. Further, the room serves as a community-owned space which users may decorate with items purchased from a virtual store. The latter includes furniture, decorative items, musical instruments, games and other objects. By contributing these objects, users are accorded recognition (explained subsequently), which motivates sustained usage. Virtual rooms also offer entertainment via three types of mini-games. Information mini-games utilize nearby content to help users learn about a particular location. An example is a game that randomly selects a captured image around a user's current location to create a jigsaw puzzle; by solving the puzzle, users are able to see images shared by others. Mini-HCGs elicit information from users about the current location for the purposes of sharing. One example is a game about pets that reside in the virtual rooms: pets need to be fed with information in order to thrive, and such information is then made available to users. Finally, casual mini-games offer pure entertainment, such as a shooting game. Mini-games are represented by an arcade machine in the virtual room (see Figure 3) and are played by selecting it. Virtual rooms are also designed to promote socializing. Each user is represented by an avatar which can be customized with items purchased with gold (explained next). These avatars are displayed on users' individual profile pages (Figure 4) and in the virtual rooms that users visit. A "friend" function allows users to add other users as friends. This allows them to quickly view updates of comments posted by their friends, and also to send private messages to them.
Fig. 3. A virtual room with user contributed item
Fig. 4. A user profile with avatar and currency earned
To further promote usage of the application, SPLASH offers a number of reward systems. First, users earn an in-game currency called gold (Figure 4) when they contribute comments, rate comments or perform well at mini-games. Gold can be used to buy items or customize one's avatar. Second, the application awards badges for various milestones achieved. These include contributing targeted numbers of comments or ratings, contributing items to virtual rooms, and meeting mini-game objectives. These awards are displayed on a user's profile page. Third, public scoreboards rank users based on different accomplishments, including rankings by the amount of gold amassed, the number of comments contributed, and the number of comments rated.
3 Conclusion

Several features differentiate SPLASH from other mobile applications for sharing content. First, the application blends gaming with content sharing to encourage users to contribute location-based information. The sense of fun derived from the games in SPLASH provides an additional benefit to users, compared to other mobile content sharing applications where contributions rely solely on goodwill and other intrinsic motivations. Next, the concept of virtual rooms as an extension of a physical location represents a pioneering approach for mobile content sharing applications. Within the virtual rooms, users are able to play games and socialize, facilitating the contribution of content. Also, the different genres of mini-games provide diversity and challenge for users, and encourage repeated use. Finally, SPLASH supports an API that allows developers to contribute new games, thus promoting diversity within the gaming environment. Further, these APIs also allow developers to extract and synthesize data found in SPLASH to create content mashups that can reside within SPLASH or as separate applications.

Acknowledgments. This work was supported by the Singapore National Research Foundation Interactive Digital Media R&D Program, under research Grant NRF NRF2008IDM-IDM004-012.
References

1. Goh, D.H., Ang, R.P., Lee, C.S., Chua, A.Y.K.: Fight or Unite: Investigating Game Genres for Image Tagging. JASIST 62(7), 1311–1324 (2011)
2. Goh, T.T., Liew, C.L.: SMS-based Library Catalogue System: A Preliminary Investigation of User Acceptance. The Electronic Library 27, 394–408 (2009)
3. Lee, C.S., Goh, D.H., Chua, A.Y.K., Ang, R.P.: Indagator: Investigating Perceived Gratifications of an Application that Blends Mobile Content Sharing with Gameplay. JASIST 61(6), 1244–1257 (2010)
4. von Ahn, L., Dabbish, L.: Designing Games with a Purpose. CACM 51(8), 58–67 (2008)
5. Generations and Their Gadgets, http://www.pewinternet.org/Reports/2011/Generations-and-gadgets.aspx
An Interactive Social Boarding System Using Home Infotainment Platform

Sounak Dey and Avik Ghose

Tata Consultancy Services Limited, Kolkata, India {Sounak.d,avik.ghose}@tcs.com
Abstract. The authors propose a customer-interactive and connected boarder experience for the hospitality industry, using the in-room TV set and a Home Infotainment Platform (HIP) connected to it. This aims at building an interactive and social platform for interacting with hotel services and facilities. It also facilitates the replacement of the in-room EPABX with more interactive means of providing services, including but not restricted to restaurant table booking, conference room booking, wake-up call service and external communications. This leads to a more cost-effective deployment of services with a richer feature set. Keywords: Home Infotainment Platform, VOIP, Social Network, TV, Hospitality.
1 Introduction

Interactivity with the service provider is essential for an enhanced user experience; in particular, visual interaction is preferred over telephonic interaction. Hospitality services are no exception to this trend [7]. Currently, in-room services depend on voice communication, and the only feedback the customer gets is from the attending person on the other end. Thus the user experience is entirely dependent on the attendee; it is non-visual and non-feedback-centric, and fails to leverage the vast world of the Internet and social networks. Consider, for example, ordering food or booking a restaurant table from the hotel room: the current system is entirely non-visual and non-feedback-centric. Other than very basic details like vegetarian or non-vegetarian categorization and price, customers rarely get to know the primary details of a food item. However, the ingredients and the recipe of an item can easily be captured in a short streaming video [2], which could include recommendations from other customers and comments from the chef. Similarly, while booking a table at a restaurant, customers are not able to see the existing seating arrangement or which tables are already booked or reserved, which could easily be provided using a networked display. The proposed system is an internet-based interactive in-room hospitality service model which uses the Home Infotainment Platform (HIP) [1] with the in-room television (TV) as a display; it can also render to the boarder's smart devices. In the proposed system, the boarder views the service interface on his in-room TV. The HIP interacts with the internet and the hotel server to fetch multimedia and text data, renders them in widget-based form on the TV screen, and interacts with an Internet-based server or the hotel server. The HIP also fetches collective data from social network sites (from the user's social
profile) and relates it to hospitality services depending on the context. For example, when the user logs in, he can view which of his buddies are staying in the same hotel or another hotel of the same chain. He can even video conference with his friends using the TV and a microphone. Further, personalized data such as the boarder's flight status can be shown on demand on the TV by matching the boarder profile against the airline website and its regular RSS feed [4]. The system also proposes to replace the hotel EPABX with a LAN service. This enables boarders to make telephone calls using VOIP technology (with SIP- or H.323-based implementations) [5], [6]. This reduces the maintenance cost and effort of running both a LAN and an EPABX, and is easier to implement. The whole system is a low-cost deployment because it re-uses the in-room TV as the display. The platform itself is a very low-cost device, implemented on a Texas Instruments DaVinci processor-based STB running a Linux 2.6.10 kernel. It supports all kinds of internet connectivity (e.g., wireless, wired, proxy etc.).
2 System Architecture

The system consists of a hotel in-room television connected to an over-the-top (OTT) box. The OTT box takes A/V input from the STB and blends internet-based interactive hospitality applications with the TV video content. Boarders may use the existing TV remote, a keyboard/mouse, or a special remote with a joystick and keyboard to interact with these applications. The system is flexible enough to accommodate full-screen applications with the TV content off, or partial-screen applications with the TV content resized to fit the rest of the screen. Some parts of these applications, especially those sharing personal information, can be rendered on a second screen such as a smart phone or tablet carried by the boarder. In terms of connectivity, the system connects to both the internet and the hotel application server. All connectivity is implemented via a LAN, which accommodates a security firewall to restrict the open internet, this being a concern for the local hotel server. The LAN also provides a VOIP-based telephone service, thus reducing the cost of maintaining a separate EPABX-based telephone system. The overall architecture is depicted below [Fig. 1].
[Figure 1 shows the hotel room containing the TV screen and the HIP OTT box, connected via the LAN to the Internet, the hotel server (PMS), and the boarder's tablets and smart phones.]
Fig. 1. Overall Architecture of the proposed system
The internal architecture of the HIP platform is depicted below [Fig. 2]. As the figure shows, the system has four basic layers, namely the application layer, the middleware, the driver layer, and the hardware layer. Of these, the application layer consists of a graphical TV user interface module which interacts with a lower sub-layer consisting of a connectivity manager, an application controller and a specific hospitality application manager. We discuss this application layer briefly below.

[Figure 2 shows the layer stack: the application layer (a graphical TV-friendly user interface on top of the connectivity manager, controller and hospitality application manager), above the middleware, driver, OS/kernel and hardware layers.]
Fig. 2. Internal Architecture of HIP Platform
The graphical TV user interface module is the GUI module which acts as the HMI for the system. The controller module handles user requests made through the GUI and sends them to the respective modules for handling. The hospitality application manager acts as a coordinator between the existing hotel Property Management System (PMS) and the OTT box; it handles the web services for interacting with the internet or the hotel PMS, and parses data (from the internet or PMS) before sending it to the GUI in a presentable form. The connectivity manager handles external connectivity, be it wired or wireless.
3 Proposed Methodology

To illustrate how the proposed system works in the real world, we consider two use cases for in-room hotel services, described one by one with example screenshots.

3.1 Use Case 1: Ordering Food and Booking a Table in a Restaurant

In the first case, a boarder wants to order some food from his hotel room using this application on the TV. A step-wise description and analysis of the application is as follows: a) First, the boarder opens the "Order Food" application by browsing through a menu in the TV GUI. b) In the second step, the system finds the restaurants available in and around the hotel property. Restaurants inside the property are fetched from the hotel's own PMS system, and those outside the property are fetched from geo-location data available
from the internet. This data request is based on the REST protocol [3], and the reply is in standard XML or JSON format. Since this requires data from both the PMS and the open world of the internet, a secure firewall-based connection is required, which smoothly caters to both without any conflict or security issue. All the restaurants found are listed on the screen, each with a brief description or a video about the property, and each restaurant is presented with a link to go into the details for ordering food or booking a table. c) The boarder then clicks on the link for one of the restaurants of his choice and is presented with the audio-visual social food menu of the restaurant, along with the table booking status [Fig. 3]. This food menu, along with the price and name of each item, allows the boarder to watch multimedia content about the item (such as how it is made, or any history worth knowing). Every item may also carry feedback comments and ratings from other customers, making it easier for this customer to decide which food to order. d) The customer then orders the food and/or books the table [Fig. 3] through this menu GUI. Consequently, the PMS system of the respective restaurant is updated.
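The REST exchange in step b might return a payload along the lines of the following sketch. The endpoint shape, field names and values are purely illustrative assumptions; the paper does not specify the actual schema of the hotel PMS or geo-location services.

```python
import json

# Hypothetical shape of a REST reply listing nearby restaurants.
reply = json.loads("""
{
  "restaurants": [
    {"id": "r1", "name": "Rooftop Grill", "inProperty": true,
     "video": "http://hotel.example/r1/intro.mp4",
     "tables": [{"no": 1, "status": "free"}, {"no": 2, "status": "booked"}]},
    {"id": "r2", "name": "Corner Bistro", "inProperty": false,
     "distanceKm": 0.4}
  ]
}
""")

# Render a simple listing such as the one shown on the TV GUI.
for r in reply["restaurants"]:
    where = "in hotel" if r.get("inProperty") else "nearby"
    print(r["name"], "-", where)
```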
Fig. 3. Screenshots of use case 1 (multimedia-based food menu and interactive table booking)
3.2 Use Case 2: Connecting Social Network Friends

The next use-case application finds the boarder's social network (e.g., Facebook or Twitter) friends. A step-wise description and analysis of the application is as follows: a) As above, the boarder opens the "Find your friends" application by browsing through the menu in the TV GUI [Fig. 3]. On opening the application, the user is prompted to log in to his social network account. b) The boarder enters his user id and password in the TV GUI [Fig. 4]. The application then fetches the friend list from his social network profile (i.e., the open internet) and matches which of those friends are currently staying at this hotel or another hotel of the same chain (i.e., using the PMS system). This fetch depends on the available APIs of the corresponding social network sites. c) A comprehensive list is presented to the boarder, with the option to hold a video conference with friends using the same in-room TV [Fig. 4]. This video conference is made possible by the proposed VOIP-based room-to-room calling service.
Fig. 4. Screenshots of use case 2 (user log in and his friend list with video conf link)
3.3 Other Possible Use Cases

Though we have discussed only two applications as demonstrative examples, there could be many such in-room service applications (e.g., spa booking, sightseeing, flight alerts, conference room booking) presented on the in-room TV screen. Again, all these applications may mash up data from the open internet and the hotel PMS at the same time, using a secure firewall-based LAN, which can in turn be used for VOIP-based tele-calling services (e.g., housekeeping, car rental). Also, application scope and rendering can be extended from the TV screen to second screens such as smart phones and tablets carried by the boarder.
4 Conclusion

In this paper, we have discussed the present-day demands and trends of hospitality services, and how these services can be improved using social network concepts and by mashing up data from the open internet, through a plug-in hardware device based on the Home Infotainment Platform attached to the in-room TV set, and by replacing the existing EPABX-based telephone service of a hotel with a VOIP-based calling service (using SIP- or H.323-based implementations) over the same LAN used for providing Internet services. The advantages of the proposed system are that it is low cost, easy to deploy, and infrastructure friendly.

Acknowledgments. We are thankful to our colleagues Ranjan Dasgupta, Somnath Ghosh Dastidar, Sanjay Debnath, Chandramohan Km, Smita Rao, and Ravi Chand of Tata Consultancy Services Limited.
References

1. Pal, A., Prashant, M., Ghose, A., Bhaumik, C.: Home Infotainment Platform – A Ubiquitous Access Device for Masses. In: Tomar, G.S., Grosky, W.I., Kim, T.-h., Mohammed, S., Saha, S.K. (eds.) UCMA 2010. Communications in Computer and Information Science, vol. 75, pp. 11–19. Springer, Heidelberg (2010)
2. HTTP Live Streaming, http://tools.ietf.org/html/draft-pantos-http-live-streaming-01
3. Representational State Transfer (REST), http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
4. RSS specifications, http://www.rssboard.org/rss-specification
5. SIP specification, http://www.ietf.org/rfc/rfc3261.txt
6. H.323, http://www.itu.int/rec/T-REC-H.323-200912-I/en
7. Bjorkqvist, A., Haaga-Helia University of Applied Sciences: Concepting the Hotel for Tomorrow, http://www.tekes.fi/fi/gateway/PTARGS_0_201_403_994_2095_43/http%3B/tekes-ali1%3B7087/publishedcontent/publish/programmes/vapari/documents/projektien_tuloksia/concepting_hotel.pdf
From Computational to Human Trust: Problems, Methods and Applications of Trust Management Adam Wierzbicki Polish-Japanese Institute of Information Technology (PJIIT) Ul. Koszykowa 86, 02-008 Warsaw, Poland [email protected]
Abstract. The tutorial will be devoted to trust management mechanisms and their practical applications, such as reputation systems for online auctions and recommendation systems.
1 Introduction

The tutorial will be devoted to trust management mechanisms and their practical applications, such as reputation systems for online auctions (like eBay) and recommendation systems (like Epinions). The first aim of the tutorial is to familiarize participants with trust management methods and to give a background for researchers interested in this subject. The tutorial will also demonstrate various research methods used in the area, from the analysis of large datasets to surveys and experimental games. It will include a specially designed experimental game that allows participants to get a feeling for the more abstract concepts used in trust management, as well as to understand the basic issues of trust management system design. The second goal of the tutorial is to cover two areas of trust management in depth: reputation systems for online auctions and trust propagation algorithms for recommendation systems. These areas will be discussed in detail, covering relevant research results and the author's own work. The tutorial will conclude with a discussion of possible research directions in the area of trust management.
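For intuition only, the Python sketch below implements a generic, textbook-style trust propagation rule over a web of trust (trust along a path is the product of edge weights; the estimate is the maximum over paths up to a fixed depth). It is an illustration of the concept, not necessarily one of the algorithms covered in the tutorial.

def propagate_trust(trust, source, target, depth=3):
    """Estimate source's trust in target through intermediaries.

    trust: dict mapping (truster, trustee) -> direct trust in [0, 1].
    """
    if (source, target) in trust:
        return trust[(source, target)]
    if depth == 0:
        return 0.0
    best = 0.0
    for (truster, trustee), w in trust.items():
        if truster == source:
            # Discount the intermediary's opinion by our trust in them.
            best = max(best, w * propagate_trust(trust, trustee, target, depth - 1))
    return best

# Example web of trust: A trusts B (0.9), B trusts C (0.8).
web = {("A", "B"): 0.9, ("B", "C"): 0.8}
print(propagate_trust(web, "A", "C"))  # 0.72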
Text Analytics for Social Research Stuart W. Shulman University of Massachusetts Amherst 200 Hicks Way Amherst, MA 01003 [email protected]
Abstract. This tutorial provides software training in "DiscoverText," a text analytics software package developed by Professor Shulman. His work advances text mining and natural language processing research. The training links these worlds via straightforward, easy-to-understand explanations of software features that can be tailored to all experience levels and industries.

Keywords: software, text analytics, archiving, classification, metadata.
1 Introduction

DiscoverText, a new Web-based software application launched by Texifter, LLC, allows users to make sense of email archives, social media content, and other electronic document collections. Utilizing the "Graph API" feature of Facebook, or the public API for Twitter, users of DiscoverText can log in with their credentials and begin archiving thousands of posts and comments on selected pages; a rough sketch of this style of ingestion is given at the end of this section. In addition to analyzing text from social media and a variety of other sources, the software is designed to improve standard research, government, and business processes. Users can securely upload emails or project documents and quickly redact sensitive, confidential, classified, or potentially offensive information before circulating or posting it, saving companies hours of monotonous work. The platform also includes de-duplication and near-duplicate clustering features that improve the ability of rule-writing agencies to sort through tens or hundreds of thousands of electronic public comments. With DiscoverText it is also possible to crowdsource data analysis in novel ways, leveraging peer relationships and Web-verifiable credentials. Ingesting hundreds of thousands of items from social media, email, and electronic document repositories is easier than ever. Advanced social search leveraging metadata, networks, credentials, and filters will change the way users interact with text data over time. This innovative platform brings topic modeling, sentiment detection, and other information retrieval and natural language technologies into an active learning loop where user-created choices customize and improve our text processing algorithms.
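As a rough, illustrative Python sketch of this style of ingestion (not DiscoverText's actual implementation), the snippet below pulls posts from a public Facebook page through the Graph API and matching tweets through Twitter's public search API as they existed around 2011. The page name, query, and access token are hypothetical placeholders.

import json
import urllib.parse
import urllib.request

def fetch_facebook_posts(page, access_token):
    """Archive recent posts from a public Facebook page via the Graph API."""
    url = f"https://graph.facebook.com/{page}/posts?access_token={access_token}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8")).get("data", [])

def fetch_tweets(query):
    """Archive matching tweets via the 2011-era public Twitter search API."""
    url = "http://search.twitter.com/search.json?q=" + urllib.parse.quote(query)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8")).get("results", [])

# Build a small archive for later coding, classification, and de-duplication.
archive = fetch_facebook_posts("SomePage", "ACCESS_TOKEN_HERE") + fetch_tweets("DiscoverText")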
Author Index
Ahmad, Muhammad Aurangzeb 145
Baez, Marcos 308
Böhm, Klemens 240
Boldi, Paolo 8
Borbora, Zoheb 145
Bródka, Piotr 283, 316
Cai, Peng 287
Casati, Fabio 75, 308
Cha, Meeyoung 300
Chen, Hsinchun 5
Chua, Alton Y.K. 328
Ciampaglia, Giovanni Luca 269
Contractor, Noshir 3
Datta, Anwitaman 296
Dey, Sounak 332
Doi, Shinichi 324
Du, Juan 287
Gackowski, Piotr 212
Gassler, Wolfgang 113
Ghose, Avik 332
Goh, Dion Hoe-Lian 328
Gong, Xiajing 127
Günther, Oliver 171
Han, Lixin 84
Hayes, Conor 153
He, Bingsheng 304
Hu, Yuh-Jong 198
Huston, Cate 320
Itaya, Satoko 324
Iwaihara, Mizuho 184
Jankowski, Jaroslaw 312
Jeong, Jaeseung 300
Juszczyszyn, Krzysztof 283
Kazienko, Przemyslaw 283, 316
Kim, Hoh 300
Knap, Tomáš 226
Konishi, Taku 324
Koroleva, Ksenia 171
Krasnova, Hanna 171
Lee, Chei Sian 328
Liu, Yuan 135
Lü, Jian 84
Ma, Linling 287
Macy, Michael W. 1
Mandyam, Sridhar 22
Marchese, Maurizio 75, 308
Michahelles, Florian 161
Michalski, Radoslaw 283
Mirylenka, Daniil 75
Mlýnková, Irena 226
Moon, Sue 7
Motoda, Hiroshi 6
Munemasa, Toshikazu 184
Nam, Taewoo 51, 67
Nasirifard, Peyman 153
Palus, Sebastian 283
Park, Jaram 300
Pfaltz, John L. 36
Pletikosa Cvijikj, Irena 161
Posner, Sarah 292
Qian, Weining 287
Rao, Pallavi 59
Razikin, Khasfariyati 328
Rohde, Markus 255
Rosa, Marco 8
Saganowski, Stanislaw 316
Sautter, Guido 240
Sayogo, Djoko Sigit 51, 67
Shen, Cuihua 145
Shulman, Stuart W. 339
Skoric, Marko M. 59
Specht, Günther 113
Spychala, Justyna 212
Sridhar, Usha 22
Srivastava, Jaideep 4, 145
Tan, Jackson Teck Yong 296
Tan, Keng-Tiong 328
Tanaka, Rie 324
Tang, Xuning 127
Turek, Piotr 212
Ventresque, Anthony 296
Vigna, Sebastiano 8
Wang, Yazhe 98
Weiss, Michael 320
Wiedenhoefer, Torben 255
Wierzbicki, Adam 212, 338
Williams, Dmitri 145
Wollersheim, Dennis 292
Wu, Win-Nan 198
Xu, Feng 84
Yamada, Keiji 324
Yang, Christopher C. 127
Yang, Jiun-Jan 198
Yao, Yuan 84
Yetim, Fahri 255
Yoshinaga, Naoki 324
Zangerle, Eva 113
Zhang, Jie 135
Zhang, Jing 67
Zhang, Weiming 287
Zheng, Baihua 98
Zhong, Jianlong 304
Zhou, Aoying 287
Zhou, Jiufeng 84
Zhu, Quanyan 135