Lecture Notes in Computer Science 6777

Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board

David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany
Vincent G. Duffy (Ed.)
Digital Human Modeling

Third International Conference, ICDHM 2011
Held as Part of HCI International 2011
Orlando, FL, USA, July 2011
Proceedings
Volume Editor

Vincent G. Duffy
Purdue University
School of Industrial Engineering and Department of Agricultural and Biological Engineering
315 N. Grant Street, West Lafayette, Indiana 47907, USA
E-mail: [email protected]
ISSN 0302-9743
e-ISSN 1611-3349
ISBN 978-3-642-21798-2
e-ISBN 978-3-642-21799-9
DOI 10.1007/978-3-642-21799-9
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2011929223
CR Subject Classification (1998): H.5, H.1, H.3, H.4.2, I.2-6, J.3
LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI
© Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Foreword
The 14th International Conference on Human–Computer Interaction, HCI International 2011, was held in Orlando, Florida, USA, July 9–14, 2011, jointly with the Symposium on Human Interface (Japan) 2011, the 9th International Conference on Engineering Psychology and Cognitive Ergonomics, the 6th International Conference on Universal Access in Human–Computer Interaction, the 4th International Conference on Virtual and Mixed Reality, the 4th International Conference on Internationalization, Design and Global Development, the 4th International Conference on Online Communities and Social Computing, the 6th International Conference on Augmented Cognition, the Third International Conference on Digital Human Modeling, the Second International Conference on Human-Centered Design, and the First International Conference on Design, User Experience, and Usability. A total of 4,039 individuals from academia, research institutes, industry and governmental agencies from 67 countries submitted contributions, and 1,318 papers that were judged to be of high scientific quality were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of human–computer interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas. This volume, edited by Vincent G. Duffy, contains papers in the thematic area of digital human modeling (DHM), addressing the following major topics:

• Anthropometry applications
• Posture and motion modeling
• Digital human modeling and design
• Cognitive modeling
• Driver modeling
The remaining volumes of the HCI International 2011 Proceedings are:

• Volume 1, LNCS 6761, Human–Computer Interaction—Design and Development Approaches (Part I), edited by Julie A. Jacko
• Volume 2, LNCS 6762, Human–Computer Interaction—Interaction Techniques and Environments (Part II), edited by Julie A. Jacko
• Volume 3, LNCS 6763, Human–Computer Interaction—Towards Mobile and Intelligent Interaction Environments (Part III), edited by Julie A. Jacko
• Volume 4, LNCS 6764, Human–Computer Interaction—Users and Applications (Part IV), edited by Julie A. Jacko
• Volume 5, LNCS 6765, Universal Access in Human–Computer Interaction—Design for All and eInclusion (Part I), edited by Constantine Stephanidis
• Volume 6, LNCS 6766, Universal Access in Human–Computer Interaction—Users Diversity (Part II), edited by Constantine Stephanidis
• Volume 7, LNCS 6767, Universal Access in Human–Computer Interaction—Context Diversity (Part III), edited by Constantine Stephanidis
• Volume 8, LNCS 6768, Universal Access in Human–Computer Interaction—Applications and Services (Part IV), edited by Constantine Stephanidis
• Volume 9, LNCS 6769, Design, User Experience, and Usability—Theory, Methods, Tools and Practice (Part I), edited by Aaron Marcus
• Volume 10, LNCS 6770, Design, User Experience, and Usability—Understanding the User Experience (Part II), edited by Aaron Marcus
• Volume 11, LNCS 6771, Human Interface and the Management of Information—Design and Interaction (Part I), edited by Michael J. Smith and Gavriel Salvendy
• Volume 12, LNCS 6772, Human Interface and the Management of Information—Interacting with Information (Part II), edited by Gavriel Salvendy and Michael J. Smith
• Volume 13, LNCS 6773, Virtual and Mixed Reality—New Trends (Part I), edited by Randall Shumaker
• Volume 14, LNCS 6774, Virtual and Mixed Reality—Systems and Applications (Part II), edited by Randall Shumaker
• Volume 15, LNCS 6775, Internationalization, Design and Global Development, edited by P.L. Patrick Rau
• Volume 16, LNCS 6776, Human-Centered Design, edited by Masaaki Kurosu
• Volume 18, LNCS 6778, Online Communities and Social Computing, edited by A. Ant Ozok and Panayiotis Zaphiris
• Volume 19, LNCS 6779, Ergonomics and Health Aspects of Work with Computers, edited by Michelle M. Robertson
• Volume 20, LNAI 6780, Foundations of Augmented Cognition: Directing the Future of Adaptive Systems, edited by Dylan D. Schmorrow and Cali M. Fidopiastis
• Volume 21, LNAI 6781, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
• Volume 22, CCIS 173, HCI International 2011 Posters Proceedings (Part I), edited by Constantine Stephanidis
• Volume 23, CCIS 174, HCI International 2011 Posters Proceedings (Part II), edited by Constantine Stephanidis

I would like to thank the Program Chairs and the members of the Program Boards of all Thematic Areas, listed herein, for their contribution to the highest scientific quality and the overall success of the HCI International 2011 Conference. In addition to the members of the Program Boards, I also wish to thank the following volunteer external reviewers: Roman Vilimek from Germany, Ramalingam Ponnusamy from India, Si Jung “Jun” Kim from the USA, and Ilia Adami, Iosif Klironomos, Vassilis Kouroumalis, George Margetis, and Stavroula Ntoa from Greece.
This conference would not have been possible without the continuous support and advice of the Conference Scientific Advisor, Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications and Exhibition Chair and Editor of HCI International News, Abbas Moallem. I would also like to thank the members of the Human–Computer Interaction Laboratory of ICS-FORTH, and in particular Margherita Antona, George Paparoulis, Maria Pitsoulaki, Stavroula Ntoa, Maria Bouhli and George Kapnas, for their contribution toward the organization of the HCI International 2011 Conference.

July 2011
Constantine Stephanidis
Organization
Ergonomics and Health Aspects of Work with Computers

Program Chair: Michelle M. Robertson

Arne Aarås, Norway Pascale Carayon, USA Jason Devereux, UK Wolfgang Friesdorf, Germany Martin Helander, Singapore Ed Israelski, USA Ben-Tzion Karsh, USA Waldemar Karwowski, USA Peter Kern, Germany Danuta Koradecka, Poland Nancy Larson, USA Kari Lindström, Finland
Brenda Lobb, New Zealand Holger Luczak, Germany William S. Marras, USA Aura C. Matias, Philippines Matthias Rötting, Germany Michelle L. Rogers, USA Dominique L. Scapin, France Lawrence M. Schleifer, USA Michael J. Smith, USA Naomi Swanson, USA Peter Vink, The Netherlands John Wilson, UK
Human Interface and the Management of Information

Program Chair: Michael J. Smith

Hans-Jörg Bullinger, Germany Alan Chan, Hong Kong Shin’ichi Fukuzumi, Japan Jon R. Gunderson, USA Michitaka Hirose, Japan Jhilmil Jain, USA Yasufumi Kume, Japan Mark Lehto, USA Hirohiko Mori, Japan Fiona Fui-Hoon Nah, USA Shogo Nishida, Japan Robert Proctor, USA
Youngho Rhee, Korea Anxo Cereijo Roibás, UK Katsunori Shimohara, Japan Dieter Spath, Germany Tsutomu Tabe, Japan Alvaro D. Taveira, USA Kim-Phuong L. Vu, USA Tomio Watanabe, Japan Sakae Yamamoto, Japan Hidekazu Yoshikawa, Japan Li Zheng, P. R. China
Human–Computer Interaction

Program Chair: Julie A. Jacko

Sebastiano Bagnara, Italy Sherry Y. Chen, UK Marvin J. Dainoff, USA Jianming Dong, USA John Eklund, Australia Xiaowen Fang, USA Ayse Gurses, USA Vicki L. Hanson, UK Sheue-Ling Hwang, Taiwan Wonil Hwang, Korea Yong Gu Ji, Korea Steven A. Landry, USA
Gitte Lindgaard, Canada Chen Ling, USA Yan Liu, USA Chang S. Nam, USA Celestine A. Ntuen, USA Philippe Palanque, France P.L. Patrick Rau, P.R. China Ling Rothrock, USA Guangfeng Song, USA Steffen Staab, Germany Wan Chul Yoon, Korea Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics

Program Chair: Don Harris

Guy A. Boy, USA Pietro Carlo Cacciabue, Italy John Huddlestone, UK Kenji Itoh, Japan Hung-Sying Jing, Taiwan Wen-Chin Li, Taiwan James T. Luxhøj, USA Nicolas Marmaras, Greece Sundaram Narayanan, USA Mark A. Neerincx, The Netherlands
Jan M. Noyes, UK Kjell Ohlsson, Sweden Axel Schulte, Germany Sarah C. Sharples, UK Neville A. Stanton, UK Xianghong Sun, P.R. China Andrew Thatcher, South Africa Matthew J.W. Thomas, Australia Mark Young, UK Rolf Zon, The Netherlands
Universal Access in Human–Computer Interaction

Program Chair: Constantine Stephanidis

Julio Abascal, Spain Ray Adams, UK Elisabeth André, Germany Margherita Antona, Greece Chieko Asakawa, Japan Christian Bühler, Germany Jerzy Charytonowicz, Poland Pier Luigi Emiliani, Italy
Michael Fairhurst, UK Dimitris Grammenos, Greece Andreas Holzinger, Austria Simeon Keates, Denmark Georgios Kouroupetroglou, Greece Sri Kurniawan, USA Patrick M. Langdon, UK Seongil Lee, Korea
Zhengjie Liu, P.R. China Klaus Miesenberger, Austria Helen Petrie, UK Michael Pieper, Germany Anthony Savidis, Greece Andrew Sears, USA Christian Stary, Austria
Hirotada Ueda, Japan Jean Vanderdonckt, Belgium Gregg C. Vanderheiden, USA Gerhard Weber, Germany Harald Weber, Germany Panayiotis Zaphiris, Cyprus
Virtual and Mixed Reality

Program Chair: Randall Shumaker

Pat Banerjee, USA Mark Billinghurst, New Zealand Charles E. Hughes, USA Simon Julier, UK David Kaber, USA Hirokazu Kato, Japan Robert S. Kennedy, USA Young J. Kim, Korea Ben Lawson, USA Gordon McK Mair, UK
David Pratt, UK Albert “Skip” Rizzo, USA Lawrence Rosenblum, USA Jose San Martin, Spain Dieter Schmalstieg, Austria Dylan Schmorrow, USA Kay Stanney, USA Janet Weisenford, USA Mark Wiederhold, USA
Internationalization, Design and Global Development

Program Chair: P.L. Patrick Rau

Michael L. Best, USA Alan Chan, Hong Kong Lin-Lin Chen, Taiwan Andy M. Dearden, UK Susan M. Dray, USA Henry Been-Lirn Duh, Singapore Vanessa Evers, The Netherlands Paul Fu, USA Emilie Gould, USA Sung H. Han, Korea Veikko Ikonen, Finland Toshikazu Kato, Japan Esin Kiris, USA Apala Lahiri Chavan, India
James R. Lewis, USA James J.W. Lin, USA Rungtai Lin, Taiwan Zhengjie Liu, P.R. China Aaron Marcus, USA Allen E. Milewski, USA Katsuhiko Ogawa, Japan Oguzhan Ozcan, Turkey Girish Prabhu, India Kerstin Röse, Germany Supriya Singh, Australia Alvin W. Yeo, Malaysia Hsiu-Ping Yueh, Taiwan
Online Communities and Social Computing

Program Chairs: A. Ant Ozok, Panayiotis Zaphiris

Chadia N. Abras, USA Chee Siang Ang, UK Peter Day, UK Fiorella De Cindio, Italy Heidi Feng, USA Anita Komlodi, USA Piet A.M. Kommers, The Netherlands Andrew Laghos, Cyprus Stefanie Lindstaedt, Austria Gabriele Meiselwitz, USA Hideyuki Nakanishi, Japan
Anthony F. Norcio, USA Ulrike Pfeil, UK Elaine M. Raybourn, USA Douglas Schuler, USA Gilson Schwartz, Brazil Laura Slaughter, Norway Sergei Stafeev, Russia Asimina Vasalou, UK June Wei, USA Haibin Zhu, Canada
Augmented Cognition

Program Chairs: Dylan D. Schmorrow, Cali M. Fidopiastis

Monique Beaudoin, USA Chris Berka, USA Joseph Cohn, USA Martha E. Crosby, USA Julie Drexler, USA Ivy Estabrooke, USA Chris Forsythe, USA Wai Tat Fu, USA Marc Grootjen, The Netherlands Jefferson Grubb, USA Santosh Mathan, USA
Rob Matthews, Australia Dennis McBride, USA Eric Muth, USA Mark A. Neerincx, The Netherlands Denise Nicholson, USA Banu Onaral, USA Kay Stanney, USA Roy Stripling, USA Rob Taylor, UK Karl van Orden, USA
Digital Human Modeling

Program Chair: Vincent G. Duffy

Karim Abdel-Malek, USA Giuseppe Andreoni, Italy Thomas J. Armstrong, USA Norman I. Badler, USA Fethi Calisir, Turkey Daniel Carruth, USA Keith Case, UK Julie Charland, Canada
Yaobin Chen, USA Kathryn Cormican, Ireland Daniel A. DeLaurentis, USA Yingzi Du, USA Okan Ersoy, USA Enda Fallon, Ireland Yan Fu, P.R. China Afzal Godil, USA
Ravindra Goonetilleke, Hong Kong Anand Gramopadhye, USA Lars Hanson, Sweden Pheng Ann Heng, Hong Kong Bo Hoege, Germany Hongwei Hsiao, USA Tianzi Jiang, P.R. China Nan Kong, USA Steven A. Landry, USA Kang Li, USA Zhizhong Li, P.R. China Tim Marler, USA
Ahmet F. Ozok, Turkey Srinivas Peeta, USA Sudhakar Rajulu, USA Matthias Rötting, Germany Matthew Reed, USA Johan Stahre, Sweden Mao-Jiun Wang, Taiwan Xuguang Wang, France Jingzhou (James) Yang, USA Gulcin Yucel, Turkey Tingshao Zhu, P.R. China
Human-Centered Design

Program Chair: Masaaki Kurosu

Julio Abascal, Spain Simone Barbosa, Brazil Tomas Berns, Sweden Nigel Bevan, UK Torkil Clemmensen, Denmark Susan M. Dray, USA Vanessa Evers, The Netherlands Xiaolan Fu, P.R. China Yasuhiro Horibe, Japan Jason Huang, P.R. China Minna Isomursu, Finland Timo Jokela, Finland Mitsuhiko Karashima, Japan Tadashi Kobayashi, Japan Seongil Lee, Korea Kee Yong Lim, Singapore
Zhengjie Liu, P.R. China Loïc Martínez-Normand, Spain Monique Noirhomme-Fraiture, Belgium Philippe Palanque, France Annelise Mark Pejtersen, Denmark Kerstin Röse, Germany Dominique L. Scapin, France Haruhiko Urokohara, Japan Gerrit C. van der Veer, The Netherlands Janet Wesson, South Africa Toshiki Yamaoka, Japan Kazuhiko Yamazaki, Japan Silvia Zimmermann, Switzerland
Design, User Experience, and Usability

Program Chair: Aaron Marcus

Ronald Baecker, Canada Barbara Ballard, USA Konrad Baumann, Austria Arne Berger, Germany Randolph Bias, USA Jamie Blustein, Canada
Ana Boa-Ventura, USA Lorenzo Cantoni, Switzerland Sameer Chavan, Korea Wei Ding, USA Maximilian Eibl, Germany Zelda Harrison, USA
Rüdiger Heimgärtner, Germany Brigitte Herrmann, Germany Sabine Kabel-Eckes, USA Kaleem Khan, Canada Jonathan Kies, USA Jon Kolko, USA Helga Letowt-Vorbek, South Africa James Lin, USA Frazer McKimm, Ireland Michael Renner, Switzerland
Christine Ronnewinkel, Germany Elizabeth Rosenzweig, USA Paul Sherman, USA Ben Shneiderman, USA Christian Sturm, Germany Brian Sullivan, USA Jaakko Villa, Finland Michele Visciola, Italy Susan Weinschenk, USA
HCI International 2013
The 15th International Conference on Human–Computer Interaction, HCI International 2013, will be held jointly with the affiliated conferences in the summer of 2013. It will cover a broad spectrum of themes related to human–computer interaction (HCI), including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. More information about the topics, as well as the venue and dates of the conference, will be announced through the HCI International Conference series website: http://www.hci-international.org/

General Chair
Professor Constantine Stephanidis
University of Crete and ICS-FORTH
Heraklion, Crete, Greece
Email: [email protected]
Table of Contents
Part I: Anthropometry Applications

The Effects of Landmarks and Training on 3D Surface Anthropometric Reliability and Hip Joint Center Prediction . . . . . 3
Wen-Ko Chiou, Bi-Hui Chen, and Wei-Ying Chou

An Automatic Method for Computerized Head and Facial Anthropometry . . . . . 12
Jing-Jing Fang and Sheng-Yi Fang

3D Parametric Body Model Based on Chinese Female Anthropometric Analysis . . . . . 22
Peng Sixiang, Chan Chee-kooi, W.H. Ip, and Ameersing Luximon

Anthropometric Measurement of the Feet of Chinese Children . . . . . 30
Linghua Ran, Xin Zhang, Chuzhi Chao, and Taijie Liu

Human Dimensions of Chinese Minors . . . . . 37
Xin Zhang, Yanyu Wang, Linghua Ran, Ailan Feng, Ketai He, Taijie Liu, and Jianwei Niu

Development of Sizing Systems for Chinese Minors . . . . . 46
Xin Zhang, Yanyu Wang, Linghua Ran, Ailan Feng, Ketai He, Taijie Liu, and Jianwei Niu
Part II: Posture and Motion Modeling

Motion Capture Experiments for Validating Optimization-Based Human Models . . . . . 59
Aimee Cloutier, Robyn Boothby, and Jingzhou (James) Yang

Posture Reconstruction Method for Mapping Joint Angles of Motion Capture Experiments to Simulation Models . . . . . 69
Jared Gragg, Jingzhou (James) Yang, and Robyn Boothby

Joint Torque Modeling of Knee Extension and Flexion . . . . . 79
Fabian Guenzkofer, Florian Engstler, Heiner Bubb, and Klaus Bengler

Predicting Support Reaction Forces for Standing and Seated Tasks with Given Postures - A Preliminary Study . . . . . 89
Brad Howard and Jingzhou (James) Yang

Schema for Motion Capture Data Management . . . . . 99
Ali Keyvani, Henrik Johansson, Mikael Ericsson, Dan Lämkull, and Roland Örtengren

Simulating Ingress Motion for Heavy Earthmoving Equipment . . . . . 109
HyunJung Kwon, Mahdiar Hariri, Rajan Bhatt, Jasbir Arora, and Karim Abdel-Malek

Contact Area Determination between a N95 Filtering Facepiece Respirator and a Headform . . . . . 119
Zhipeng Lei and Jingzhou (James) Yang

Ergonomics Evaluation of Three Operation Postures for Astronauts . . . . . 129
Dongxu Li and Yan Zhao

In Silicon Study of 3D Elbow Kinematics . . . . . 139
Kang Li and Virak Tan

Implicit Human-Computer Interaction by Posture Recognition . . . . . 143
Enrico Maier

Optimization-Based Posture Prediction for Analysis of Box Lifting Tasks . . . . . 151
Tim Marler, Lindsey Knake, and Ross Johnson

Planar Vertical Jumping Simulation - A Pilot Study . . . . . 161
Burak Ozsoy and Jingzhou (James) Yang

StabilitySole: Embedded Sensor Insole for Balance and Gait Monitoring . . . . . 171
Peyton Paulick, Hamid Djalilian, and Mark Bachman

The Upper Extremity Loading during Typing Using One, Two and Three Fingers . . . . . 178
Jin Qin, Matthieu Trudeau, and Jack T. Dennerlein

Automatic Face Feature Points Extraction . . . . . 186
Dominik Rupprecht, Sebastian Hesse, and Rainer Blum

3D Human Motion Capturing Based only on Acceleration and Angular Rate Measurement for Low Extremities . . . . . 195
Christoph Schiefer, Thomas Kraus, Elke Ochsmann, Ingo Hermanns, and Rolf Ellegast

Application of Human Modeling in Multi-crew Cockpit Design . . . . . 204
Xiaohui Sun, Feng Gao, Xiugan Yuan, and Jingquan Zhao

A Biomechanical Approach for Evaluating Motion Related Discomfort: by an Application to Pedal Clutching Movement . . . . . 210
Xuguang Wang, Romain Pannetier, Nagananda Krishna Burra, and Julien Numa

Footbed Influences on Posture and Perceived Feel . . . . . 220
Thilina W. Weerasinghe and Ravindra S. Goonetilleke

Postural Observation of Shoulder Flexion during Asymmetric Lifting Tasks . . . . . 228
Xu Xu, Chien-Chi Chang, Gert S. Faber, Idsart Kingma, and Jack T. Dennerlein

An Alternative Formulation for Determining Weights of Joint Displacement Objective Function in Seated Posture Prediction . . . . . 231
Qiuling Zou, Qinghong Zhang, Jingzhou (James) Yang, Robyn Boothby, Jared Gragg, and Aimee Cloutier
Part III: Digital Human Modeling and Design

Videogames and Elders: A New Path in LCT? . . . . . 245
Nicola D’Aquaro, Dario Maggiorini, Giacomo Mancuso, and Laura A. Ripamonti

Research on Digital Human Model Used in Human Factor Simulation and Evaluation of Load Carriage Equipment . . . . . 255
Dayong Dong, Lijing Wang, Xiugan Yuan, and Shan Fu

Multimodal, Touchless Interaction in Spatial Augmented Reality Environments . . . . . 263
Monika Elepfandt and Marcelina Sünderhauf

Introducing ema (Editor for Manual Work Activities) – A New Tool for Enhancing Accuracy and Efficiency of Human Simulations in Digital Production Planning . . . . . 272
Lars Fritzsche, Ricardo Jendrusch, Wolfgang Leidholdt, Sebastian Bauer, Thomas Jäckel, and Attila Pirger

Accelerated Real-Time Reconstruction of 3D Deformable Objects from Multi-view Video Channels . . . . . 282
Holger Graf, Leon Hazke, Svenja Kahn, and Cornelius Malerczyk

Second Life as a Platform for Creating Intelligent Virtual Agents . . . . . 292
Larry F. Hodges, Amy Ulinski, Toni Bloodworth, Austen Hayes, John Mark Smotherman, and Brandon Kerr

A Framework for Automatic Simulated Accessibility Assessment in Virtual Environments . . . . . 302
Nikolaos Kaklanis, Panagiotis Moschonas, Konstantinos Moustakas, and Dimitrios Tzovaras

Cloth Modeling and Simulation: A Literature Survey . . . . . 312
James Long, Katherine Burns, and Jingzhou (James) Yang

Preliminary Study on Dynamic Foot Model . . . . . 321
Ameersing Luximon and Yan Luximon

Three-Dimensional Grading of Virtual Garment with Design Signature Curves . . . . . 328
Roger Ng

A Model of Shortcut Usage in Multimodal Human-Computer Interaction . . . . . 337
Stefan Schaffer, Robert Schleicher, and Sebastian Möller

Multimodal User Interfaces in IPS2 . . . . . 347
Ulrike Schmuntzsch and Matthias Rötting

The Application of the Human Model in the Thermal Comfort Assessment of Fighter Plane’s Cockpit . . . . . 357
Haifeng Shen and Xiugan Yuan

Mass Customization Methodology for Footwear Design . . . . . 367
Yifan Zhang, Ameersing Luximon, Xiao Ma, Xiaoling Guo, and Ming Zhang
Part IV: Cognitive Modeling

Incorporating Motion Data and Cognitive Models in IPS2 . . . . . 379
Michael Beckmann and Jeronimo Dzaack

Study on Synthetic Evaluation of Human Performance in Manually Controlled Spacecraft Rendezvous and Docking Tasks . . . . . 387
Ting Jiang, Chunhui Wang, Zhiqiang Tian, Yongzhong Xu, and Zheng Wang

Dynamic Power Tool Operation Model: Experienced Users vs. Novice Users . . . . . 394
Jia-Hua Lin, Raymond W. McGorry, and Chien-Chi Chang

An Empirical Study of Disassembling Using an Augmented Vision System . . . . . 399
Barbara Odenthal, Marcel Ph. Mayer, Wolfgang Kabuß, Bernhard Kausch, and Christopher M. Schlick

Polymorphic Cumulative Learning in Integrated Cognitive Architectures for Analysis of Pilot-Aircraft Dynamic Environment . . . . . 409
Yin Tangwen and Shan Fu

A Context-Aware Adaptation System for Spatial Augmented Reality . . . . . 417
Anne Wegerich and Matthias Rötting

Using Physiological Parameters to Evaluate Operator’s Workload in Manual Controlled Rendezvous and Docking (RVD) . . . . . 426
Bin Wu, Fang Hou, Zhi Yao, Jianwei Niu, and Weifen Huang

Task Complexity Related Training Effects on Operation Error of Spaceflight Emergency Task . . . . . 436
Yijing Zhang, Bin Wu, Xiang Zhang, Wang Quanpeng, and Min Liu

The Research of Crew Workload Evaluation Based on Digital Human Model . . . . . 446
Yiyuan Zheng and Shan Fu
Part V: Driver Modeling

A Simulation Environment for Analysis and Optimization of Driver Models . . . . . 453
Ola Benderius, Gustav Markkula, Krister Wolff, and Mattias Wahde

Learning the Relevant Percepts of Modular Hierarchical Bayesian Driver Models Using a Bayesian Information Criterion . . . . . 463
Mark Eilers and Claus Möbus

Impact and Modeling of Driver Behavior Due to Cooperative Assistance Systems . . . . . 473
Florian Laquai, Markus Duschl, and Gerhard Rigoll

Predicting the Focus of Attention and Deficits in Situation Awareness with a Modular Hierarchical Bayesian Driver Model . . . . . 483
Claus Möbus, Mark Eilers, and Hilke Garbe

The Two-Point Visual Control Model of Steering - New Empirical Evidence . . . . . 493
Hendrik Neumann and Barbara Deml

Automation Effects on Driver’s Behaviour When Integrating a PADAS and a Distraction Classifier . . . . . 503
Fabio Tango, Luca Minin, Raghav Aras, and Olivier Pietquin

What is Human? How the Analysis of Brain Dynamics Can Help to Improve and Validate Driver Models . . . . . 513
Sebastian Welke, Janna Protzak, Matthias Rötting, and Thomas Jürgensohn

Less Driving While Driving? An Approach for the Estimation of Effects of Future Vehicle Automation Systems on Driver Behavior . . . . . 523
Bertram Wortelen and Andreas Lüdtke

Author Index . . . . . 533
The Effects of Landmarks and Training on 3D Surface Anthropometric Reliability and Hip Joint Center Prediction

Wen-Ko Chiou 1, Bi-Hui Chen 2, and Wei-Ying Chou 1

1 Chang Gung University, Wen-Hwa 1st Road, Kwei-Shan Tao-Yuan, Taiwan, 333, R.O.C.
2 Chihlee Institute of Technology, Sec. 1, Wunhua Rd., Banciao District, New Taipei City, Taiwan, 313, R.O.C.
[email protected],
[email protected],
[email protected]
Abstract. Deforming 3D scanned data is an important and necessary procedure for the development of dynamic three-dimensional (3D) scanned anthropometry. Inaccuracies in the joint center will cause errors in deformation. Bell et al. developed equations to predict the hip joint center (HJC) based on the anthropometric measurement of the inter-anterior superior iliac spine distance (IAD). However, no previous study has reported on the reliability of IAD measurements in 3D scanned data, and therefore the effect on HJC estimates needs to be determined. Four measurers (2 trained / 2 untrained) were recruited into this study to collect measurements of IAD in 3D scanned data under two situations (with / without landmarks). The intra-class correlation (ICC) and technical error of measurement (TEM) were used to assess the reliability of the measurements. Results showed the untrained group had the lowest reliability and validity of IAD measurement in the without-landmarks situation, and the error of HJC prediction in this situation was significantly higher than in the other situations (p < 0.001). Both training and the use of landmarks improved the validity of measurement and HJC prediction; compared with training alone, attaching landmarks can significantly improve the reliability of measurement.
Keywords: Reliability; Hip joint center; Three-dimensional body scanner; Trained; Landmarks.
1 Introduction

With the general application requirements of dynamic three-dimensional (3D) scanning data, an accurate and reliable method to deform these data needs to be defined. The accurate identification of joint centers is of great importance for 3D scanning data deformation. Studies have shown that inaccuracies in identifying joint centers will cause joint translations, and have a considerable influence on body kinematics and kinetics [1-2]. The hip, knee, and ankle joints are used to define the anatomical frame of the lower extremities [3]. While the joint centers of the knee and
ankle are easier to locate, the deeply located hip joint is not easily identified. Predictive methods are commonly used to estimate the hip joint center (HJC) clinically. Bell, Pedersen, & Brand [4] described a method to estimate the location of the HJC in all three planes (x, y, z) using a fixed percentage of the inter-anterior superior iliac spine distance (IAD). However, the reliability of these measurements in 3D scanning data and the effect of IAD measurement results on HJC estimates have yet to be determined.

ISO 20685:2010 addresses protocols for the use of 3D body scan systems in the acquisition of body shape data and measurements that can be extracted from 3D scanning data. To reduce error in 3D scanning, this international standard proposes attaching reflective landmarks to the skin over anatomical landmarks prior to scanning. Landmarks are tools which help in characterizing both the size and the shape of skeletal structures in human populations. Previous studies indicated that most of the anatomical landmarks are difficult to detect without palpating the body and placing a reflective landmark on the site prior to scanning [5]. This procedure can improve the reliability and accuracy of the anthropometric results. However, it requires trained staff to attach the landmarks to the subjects, costing considerable time and money. Moreover, the effect of the landmarks on the 3D surface anthropometric reliability of IAD has yet to be determined. This paper investigates the effect of landmarks on the measurement reliability of IAD and on HJC prediction.

Previous studies have described the effects of anthropometric training on measurement reliability. Almost all previous studies of anthropometric measurements used trained staff to take the measurements [6-9]. Sebo, Beer-Borst, Haller, & Bovier [10] also demonstrated that the reliability of measurements improved after one hour of training in anthropometric measurement. However, most staff engaged in 3D scan image deformation have little opportunity to receive training in human anatomy. This may raise issues regarding the reliability and accuracy of 3D surface anthropometrics. This paper separates the measurers into two groups, a trained group and an untrained group, to investigate the effects of landmarks on the measurement reliability of IAD and HJC prediction in both groups.

The purposes of this study are to quantify the 3D surface anthropometric reliability of IAD measurement, and to report on the effect of IAD measurement differences on HJC prediction, calculated from the equations developed by Bell et al. [4]. We compare the results of four groups: no landmarks / no training (NLNT); no landmarks / trained (NLT); landmarks / no training (LNT); landmarks / trained (LT). The effects on the measurement reliability of IAD and the prediction error of HJC are discussed.
2 Methods

2.1 Data Collection of 3D Whole Body Scanned Models

In this study, the Chang Gung Whole-Body Scanner (CGWBS) was utilized to capture the intricacies of the human body [11]. The CGWBS scans a maximum cylindrical volume of 190 cm in height and 100 cm in diameter, which accommodates most human subjects. The CGWBS system is designed to withstand shipping and repeated use without the need for alignment or adjustment.
Twenty subjects (9 male, 11 female), with a mean (SD) age of 21.4 (0.7) years, weight of 56.8 (9.37) kg, and height of 166.1 (1.7) cm, were recruited into this study. The subjects had no history of lower limb problems, and gave informed consent to participate in this study, which was approved by the Chang Gung Memorial Hospital Institutional Review Board (no: 97-2538B). The subjects, wearing a standard garment and cap, stood erect with their shoulders held in 20° to 30° of abduction and feet shoulder-width apart. During scanning, the subjects had to hold their breath for about 10 s. All procedures and parameters were in accordance with the previous study [11]. Each subject completed two trials of scanning. The relevant anatomical landmarks of the right and left antero-superior iliac spine (ASIS) were marked in the first trial by a trained researcher. The line distance of IAD in eight of the subjects was measured by the trained researchers (using a Martin-type anthropometer) after scanning.

2.2 Data Collection from Scanned Data

Four measurers were recruited for this study. Two were familiar with human anatomy (trained measurers); the other two had not received any training in human anatomy (untrained measurers). Following a simple introduction to human anatomy and the location of the ASISs through three pictures from anatomy textbooks [12], the measurers were asked to locate the coordinates of the right and left ASISs (x, y, z) on the scan images in two trials. Software then calculated the line distance of IAD from the right and left ASIS coordinates, and the test-retest reliability of the IAD measurement was calculated. Anthro3D software (Logistic Technology, Taiwan) was used to perform these location tasks on 3D scanning data (Fig. 1) and to calculate the line distance of IAD. The x axis was defined as the anterior-posterior direction, the y axis as the medial-lateral direction, and the z axis as the superior-inferior direction. The coordinates of HJC (x, y, z) were calculated using the equations presented by Bell et al. [4]. With the coordinate system origin located halfway between the right and left ASISs, the equation estimates the HJC at 22% of IAD posterior, 30% of IAD inferior, and 36% of IAD lateral to the origin.
Fig. 1. (a) After the coordinates of the right ASIS (point X) and left ASIS (point X’) are located on the scan images without landmarks, the software will calculate the IAD (line XX’). (b) After the coordinates of the right ASIS (point Y) and left ASIS (point Y’) are located on the scan images with landmarks, the software will calculate the IAD (line YY’).
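To make the Bell et al. [4] estimate concrete, here is a minimal sketch (not the authors' code) of the HJC prediction from the two ASIS coordinates. The function name is ours, and the sign conventions (+x anterior, +y toward the subject's right, +z superior) are assumptions for illustration; the paper fixes only the axis names.

```python
import numpy as np

def predict_hjc(asis_right, asis_left, side="right"):
    """Sketch of the Bell et al. HJC estimate described in Sect. 2.2.

    With the origin halfway between the ASISs, the HJC is placed at
    22% of IAD posterior, 30% of IAD inferior, and 36% of IAD lateral
    to the origin. Axis names follow the paper (x: anterior-posterior,
    y: medial-lateral, z: superior-inferior); the signs below assume
    +x anterior, +y toward the subject's right, and +z superior.
    """
    r = np.asarray(asis_right, dtype=float)
    l = np.asarray(asis_left, dtype=float)
    iad = np.linalg.norm(r - l)      # inter-ASIS distance
    origin = (r + l) / 2.0           # midpoint between the ASISs
    lateral = 1.0 if side == "right" else -1.0
    return origin + np.array([
        -0.22 * iad,                 # posterior offset
        lateral * 0.36 * iad,        # lateral offset (away from midline)
        -0.30 * iad,                 # inferior offset
    ])
```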
2.3 Statistical Analyses

This study calculated the IAD from the coordinates of the right and left ASISs to assess the intra-measurer and inter-measurer reliability of the four situations (NLNT, NLT, LNT, and LT). Intraclass correlation (ICC) was used for this purpose [13-15]. Good to excellent reliability was accepted at an ICC of 0.75 or higher, as in previous research [16]. For the inter-measurer reliability, each measurer's first-trial measurements were compared with the second-trial measurements. For the intra-measurer reliability, the data gathered from trained measurer 1 were compared with those from trained measurer 2, and the data from untrained measurer 1 were compared with those from untrained measurer 2, using the first-trial measurement data.

The technical error of measurement (TEM) was also used to verify the degree of imprecision when performing and repeating anthropometric measurements (within-measurer) and when comparing them with measurements from other measurers (between-measurer). It is the most common way to express the error margin in anthropometry. The values of TEM in this study were calculated with the commonly used formula. The lower the TEM obtained, the better the precision of the examiner in performing the measurement.

ISO 20685:2010 describes the methodology for validating measurements extracted from 3D scanning images. The standard for accuracy is the result of the corresponding traditional measurement taken by a skilled anthropometrist. The difference between an extracted measurement and the corresponding traditional measurement on actual subjects should be calculated to show the accuracy of the extracted measurement. In this study, the IAD of eight subjects was measured by the trained researchers using a Martin-type anthropometer. The differences between the results of the traditional measurement and the measurements extracted from the 3D scanning images were calculated to show the accuracy of the IAD measurements.

A one-way ANOVA (SPSS 16.0 for Windows, 2008) was performed to detect the group effect (NLNT, NLT, LNT, and LT) on IAD measurement and on the signed / absolute errors of HJC prediction. The coordinates of HJC (x, y, z) were calculated using the equations presented by Bell et al. [4]. Duncan's post hoc test was used to detect which group was significantly different from the others. The significance level was set at 0.001. The IAD results derived from the traditional measurement were used to calculate the HJC coordinates of the eight subjects (as a gold standard) by Bell's equations [4]. The differences between the results gathered from the four groups and the traditional measurement were calculated to represent the HJC prediction errors. The differences in absolute distance along each axis were compared to determine the HJC prediction errors of the four groups in the three anatomical planes (x / y / z). The differences in signed distance were also compared to determine the general direction of the difference.
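Section 2.3 invokes "the commonly used formula" for TEM without reproducing it. For two repeated measurements per subject, the standard form (see, e.g., Perini et al. [17]) is the following, where d_i is the difference between the paired measurements of subject i, N is the number of subjects, and x̄ is the overall mean used for the relative TEM:

```latex
\mathrm{TEM} = \sqrt{\frac{\sum_{i=1}^{N} d_i^{2}}{2N}},
\qquad
\mathrm{relative\ TEM}\ (\%) = \frac{\mathrm{TEM}}{\bar{x}} \times 100
```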
3 Results

3.1 Reliability of IAD Measurement

Table 1 summarizes the TEM results of IAD measurement. Generally, the intra-measurer TEMs were greater than the inter-measurer TEMs. NLNT has the highest
intra-measurer and inter-measurer TEMs (16.8 mm and 13.2 mm), and NLT has the second highest (11.8 mm and 7.1 mm). LT shows the lowest intra-measurer and inter-measurer TEMs (2.3 mm and 2.1 mm); it is interesting to note that Duncan's post hoc test showed no significant difference between LNT and LT.

Table 1. TEM results of IAD measurement (n=20)

                            NLNT (1)   NLT (2)   LNT (3)   LT (4)   Duncan's post hoc test
Intra-measurer TEM (mm)       16.8       11.8      5.2       2.3    1 > 2 > 3 > 4
  Relative TEM (%)             7.0        5.5      2.2       1.2    1 > 2 > 3 = 4
Inter-measurer TEM (mm)       13.2        7.1      2.7       2.1    1 > 2 > 3 = 4
  Relative TEM (%)             4.9        3.0      1.0       0.9    1 > 2 > 3 = 4
Fig. 2. The intra-measurer and inter-measurer ICCs of IAD measurement (n=20)
The inter-measurer ICCs were higher than the intra-measurer ICCs (Fig. 2). All groups showed good reliability (ICC > 0.75), except for the intra-measurer ICC of NLNT. NLNT showed the lowest inter-measurer and intra-measurer ICCs of the four groups. The groups measured with landmarks (LNT and LT) showed excellent reliability (ICC > 0.9).

3.2 Validity of IAD Measurement

The mean value obtained from NLNT showed the greatest discrepancy with the traditional measurement (25.2±15.1 mm). Duncan's post hoc test showed the result of NLNT was significantly greater than those of the other groups (NLT: 10.2±7.2 mm; LNT: 7.9±4.5 mm; LT: 7.7±3.8 mm).
3.3 Validity of HJC Prediction

The coordinates of HJC (x, y, z) were calculated using the equations presented by Bell et al. [4]. Fig. 3 shows the general direction of the HJC prediction errors. NLNT tended to predict the HJC locations as more posterior, lateral, and inferior. Fig. 5 shows the HJC prediction errors of the four groups (NLNT/NLT/LNT/LT). Overall, the maximum errors were found in the y axis. The maximum prediction errors in the x, y, and z axes were found in the NLNT group, at 4.8 mm, 9.1 mm and 7.6 mm, respectively. The ANOVA showed significant differences between the four groups (NLNT/NLT/LNT/LT) in the x, y and z axes. Duncan's post hoc test showed NLNT was significantly different from the other groups (NLT/LNT/LT). There were no significant differences between NLT, LNT and LT.
Fig. 3. Comparison of the absolute HJC prediction errors (SD) of the four groups (n=8). *Significantly different (p < 0.001).
4 Discussion

4.1 The Effect of Training and Landmark Placement on the Reliability of IAD Measurement

NLNT showed the lowest IAD measurement reliability among the four groups (NLNT/NLT/LNT/LT). Even though NLT showed better reliability than NLNT, the relative TEM results showed the imprecision of NLT to be 5.5% (intra-measurer) and 3.0% (inter-measurer); these values are not considered acceptable (relative TEM < 2%) [17]. These results showed that, whether or not the measurer is trained, reliability was lower when landmarks were not used for measuring IAD on the 3D scanning data. The results of LNT and LT showed excellent reliability, with
ICCs > 0.9. Compared with training, attaching landmarks significantly increased the reliability of measurement. These results demonstrate that landmarks can play a more important role than training in improving reliability.

4.2 The Effect of Training and Landmarks on the Validity of IAD Measurement

ISO 20685:2010 describes the 3D scanning methodology used to establish internationally compatible anthropometric databases. In this methodology, the measurers should be skilled anthropometrists, and landmarks are tools for improving the accuracy of measurement. Our study compared the four groups (NLNT/NLT/LNT/LT) and found that, except for NLNT, the results of NLT, LNT and LT are similar. The results showed that landmarks were the only assistive tool needed to improve measurement accuracy for the trained measurers; however, when the measurers had not received training in anatomy, landmarks were essential to the correctness of the measurement.

Location of the ASIS by trained measurers with landmarks is considered the best method to improve the reliability and the validity of the measurement (ISO 20685:2010). However, our study still found a mean measurement difference of about 7 mm in the LT group. These errors can be explained by an additional source of error related to the placement of the landmarks during the identification of ASIS locations. Small variances in landmark placement can therefore affect measurement results significantly [16]. A previous study also reported a maximum measurement difference of 30 mm when waist circumference was measured at four different, yet closely located points [18].

4.3 The Effect of Training and Landmarks on the Validity of HJC Prediction

Our study compared the four groups (NLNT/NLT/LNT/LT) and found that the results of three groups (NLT/LNT/LT) are similar. Their results for the reliability and validity of IAD measurement are also similar, showing that the predicted HJC coordinates were affected by the measurement differences of IAD. NLNT shows a large difference compared with the other groups, with a maximum error in the y axis of 20.9 mm. These results show that measurements obtained by untrained measurers have low validity in HJC prediction when no landmarks are used. Since most of the staff engaged in 3D scanning image deformation have little chance to receive anatomical training, there is a need to solve the reliability and validity problem in IAD measurement on 3D scanning data. Moreover, our results showed that trained measurers also needed landmarks to improve measurement reliability. This indicates that, to improve the correctness of IAD measurement and decrease the HJC prediction errors, attaching landmarks on the ASISs is a better solution than providing training.
5 Conclusion

This study successfully quantified the magnitude and reliability of intra- and inter-measurer anthropometric measurements, and showed the effect that measurement differences have on predicted HJC locations. Results showed the predicted HJC
locations were affected by IAD measurement differences. Landmarks can help to decrease IAD measurement differences as well as the prediction errors of HJC, especially in measurements obtained by untrained measurers.
References

1. Piazza, S.J., Okita, N., Cavanagh, P.R.: Accuracy of the functional method of hip joint center location: Effects of limited motion and varied implementation. Journal of Biomechanics 34, 967–973 (2001)
2. Stagni, R., Leardini, A., Cappozzo, A., Grazia Benedetti, M., Cappello, A.: Effects of hip joint centre mislocation on gait analysis results. Journal of Biomechanics 33, 1479–1487 (2000)
3. Cappozzo, A., Catani, F., Della Croce, U., Leardini, A.: Position and orientation in space of bones during movement: Anatomical frame definition and determination. Clinical Biomechanics 10, 171–178 (1995)
4. Bell, A.L., Pedersen, D.R., Brand, R.A.: A comparison of the accuracy of several hip center location prediction methods. Journal of Biomechanics 23, 617–621 (1990)
5. Nurre, J.H.: Locating landmarks on human body scan data. In: Proceedings of the International Conference on Recent Advances in 3-D Digital Imaging and Modeling, pp. 289–295 (1997)
6. Moreno, L.A., Joyanes, M., Mesana, M.I., González-Gross, M., Gil, C.M., Sarría, A., Gutierrez, A., Garaulet, M., Perez-Prieto, R., Bueno, M., Marcos, A.: Harmonization of anthropometric measurements for a multicenter nutrition survey in Spanish adolescents. Nutrition 19, 481–486 (2003)
7. Nagy, E., Vicente-Rodriguez, G., Manios, Y., Beghin, L., Iliescu, C., Censi, L., Dietrich, S., Ortega, F.B., De Vriendt, T., Plada, M., Moreno, L.A., Molnar, D.: Harmonization process and reliability assessment of anthropometric measurements in a multicenter study in adolescents. International Journal of Obesity 32, 58–65 (2008)
8. Johnson, W., Cameron, N., Dickson, P., Emsley, S., Raynor, P., Seymour, C., Wright, J.: The reliability of routine anthropometric data collected by health workers: A cross-sectional study. International Journal of Nursing Studies 46, 310–316 (2009)
9. Weiss, E.T., Barzilai, O., Brightman, L., Chapas, A., Hale, E., Karen, J., Bernstein, L., Geronemus, R.G.: Three-dimensional surface imaging for clinical trials: Improved precision and reproducibility in circumference measurements of thighs and abdomens. Lasers in Surgery and Medicine 41, 767–773 (2009)
10. Sebo, P., Beer-Borst, S., Haller, D.M., Bovier, P.A.: Reliability of doctors’ anthropometric measurements to detect obesity. Preventive Medicine 47, 389–393 (2008)
11. Lin, J.D., Chiou, W.K., Weng, H.F., Fang, J.T., Liu, T.H.: Application of three-dimensional body scanner: observation of prevalence of metabolic syndrome. Clinical Nutrition 23, 1313–1323 (2004)
12. Moore, K.L., Dalley, A.F.: Clinically Oriented Anatomy. Lippincott Williams & Wilkins, Philadelphia (1999)
13. McGraw, K.O., Wong, S.P.: Forming Inferences about Some Intraclass Correlation Coefficients. Psychological Methods 1, 30–46 (1996)
14. Shrout, P.E., Fleiss, J.L.: Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86, 420–428 (1979)
15. Weir, J.P.: Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research 19, 231–240 (2005)
16. Burkhart, T.A., Arthurs, K.L., Andrews, D.M.: Reliability of upper and lower extremity anthropometric measurements and the effect on tissue mass predictions. Journal of Biomechanics 41, 1604–1610 (2008)
17. Perini, T.A., de Oliveira, G.L., dos Santos Ornellas, J., Palha de Oliveira, F.: Technical error of measurement in anthropometry. Revista Brasileira de Medicina do Esporte 11, 81–90 (2005)
18. Wang, J., Thornton, J.C., Bari, S., Williamson, B., Gallagher, D., Heymsfield, S.B., Horlick, M., Kotler, D., Laferrère, B., Mayer, L., Xavier Pi-Sunyer, F., Pierson Jr., R.N.: Comparisons of waist circumferences measured at 4 sites. American Journal of Clinical Nutrition 77, 379–384 (2003)
An Automatic Method for Computerized Head and Facial Anthropometry

Jing-Jing Fang and Sheng-Yi Fang

Department of Mechanical Engineering, National Cheng-Kung University, Tainan, Taiwan
{fjj,n18981220}@mail.ncku.edu.tw
Abstract. Facial anthropometry plays an important role in ergonomic applications. Most ergonomically-designed products depend on stable and accurate human body measurement data. Head and facial anthropometric dimensions provide detailed information on head and facial surfaces to develop well-fitting, comfortable and functionally-effective facial masks, helmets or customized products. Accurate head and facial anthropometry also allows orthognathic surgeons and orthodontists to plan optimal treatments for patients. Our research uses an automatic, geometry-based facial feature extraction method to identify head and facial features, which can be used to develop a highly-accurate feature-based head model. In total, we have automatically located 17 digital length measurements and 5 digital tape measurements on the head and face. Compared to manual length measurement, the average error, maximum error and standard deviation are 1.70 mm, 5.63 mm and 1.47 mm, respectively, for intra-measurement, and 2.07 mm, 5.63 mm and 1.44 mm, respectively, for inter-measurement. Compared to manual tape measurement, the average error, maximum error and standard deviation are 1.52 mm, 3.00 mm and 0.96 mm, respectively, for intra-measurement, and 2.74 mm, 5.30 mm and 1.79 mm, respectively, for inter-measurement. Nearly all of the length measurement data and tape measurement data meet the 5 mm measuring-error standard.

Keywords: anthropometry, head and face, feature-based.
1 Introduction

Facial anthropometry is very important for the design and manufacture of products which rely on access to a database of accurate head-size measurements to ensure comfort and utility, such as helmets, masks, eyeglasses and respirators [1, 2]. Traditionally, anthropometric measurements were taken subjectively by an experienced technician using an anthropometer, sliding calipers, spreading calipers, and a measuring tape. This traditional measurement method is very time-consuming and highly dependent on the measurer's skill, and thus cannot provide objective, accurate and reproducible measurement data. An automated measurement system could therefore be very useful to ergonomic designers.
Three-dimensional body scanning technology has matured considerably in recent years with the development of scanning platforms such as AnthroScan [3], Cyberware [4], Vitronic [5], AC2 [6], and others. The scanner establishes a three-dimensional data point cloud, which can then be used to produce a mesh model. However, the raw data from the scanner and the resulting point cloud lack detail and related information, thus limiting their utility in follow-up applications. The post-processing software is mostly designed for general purposes and tends to use general triangulation methods, such as Delaunay triangulation, leaving it unable to generate finer meshes or retain important geometric data. Fang's [7] work used image processing methods to identify 67 more facial feature points and 24 more feature lines than those identified in the MPEG-4 definitions [8], and is much more suitable for anthropometric applications.

A number of large three-dimensional anthropometric surveys have been conducted around the world in the past few years, such as the CAESAR (Civilian American and European Surface Anthropometry Resource) project [9] and the Taiwan Human Body Bank [10]. These anthropometric data can be used in ergonomic design to create more comfortable earphones, eyeglasses, respirators, masks, helmets, and other products to enhance the safety of workers in hazardous environments [11-20]. However, manual data measurement presents some problems. The face presents a smaller surface area relative to the rest of the body, and even slightly misplaced markers on the face can result in disproportionately serious errors. Geometric variations on the face are also more complex than those found on the rest of the body, thus increasing the scope for error.

This research continues a robust line of research in marker-less facial feature identification. We establish a highly-accurate optimized mesh model from the original point cloud scanned by a non-contact infrared body scanner according to these facial features. A B-spline curve is used to simulate tape measurement and automatically measure feature-line lengths. Fig. 1 shows the industrial applications of our research. First, the results can be used to design more effective and comfortable ergonomic products. Second, highly-accurate head and facial measurements can lead to better clinical diagnosis and assessment, be used to better plan surgical symmetry, predict post-surgical results, and be used for comparing pre-surgical goals with post-surgical results.
Fig. 1. Industrial applications of our research
The paper is organized as follows: Section 2 introduces the anthropometry landmarks on the head and face. We divide the head model into longitudinal-and-latitudinal meshes, allowing us to use automatic methods to make head and facial anthropometry measurements. Section 3 presents all 17 length measurements and 5 tape measurements, and compares them with manual measurements. Finally, we discuss the advantages and disadvantages of this study, and suggest possible future work.
2 Methods

2.1 Anthropometry Landmarks

We used a computed tomography scanner (Biograph; Siemens AG, Berlin, Germany) to obtain high-density scan data from a plastic mannequin, identifying facial features using an automatic feature identification method [21]. This automatic marker-less method detects head and facial features according to geometric variations. In total, the method can detect 67 feature points and 24 feature lines on the head, and is an objective identification method in that it avoids the subjective feature identification results inherent in manual measurements by differently-skilled technicians.

In our previous work, the primary landmarks were located on the head scan data. Most anthropometric landmarks can be detected by geometry-based feature identification methods, but some are defined on the bone surface and do not have geometric characteristics on soft tissue. For instance, to resolve the unclear geometric characteristics of soft tissue, the gonion feature on the chin line is defined according to the golden ratio between the chin feature and the ear position. All the identified features used in the following applications are listed below:

• Centerline: Op-Opisthocranion, V-Vertex, G-Glabella, S-Sellion, Prn-Pronasale, Sn-Subnasale, Sto-Stomion, Chi-Chin
• Eyes: P-Pupilla
• Nose: Al-Alare
• Ears: Pa-Preaurale
• Mouth: Ch-Chelion
• Bone: Zyf-Zygofrontale, Ft-Frontotemporale, Zy-Zygomatic, Go-Gonion, Me-Menton

2.2 Model Construction

We used the method described in [7] to develop multi-resolution head models suited to different applications. For medical applications, we can construct the head model using high-resolution, highly-accurate meshes, while other applications require only lower-resolution meshes with small file sizes. The method locates longitude and latitude lines based on the location of facial feature points, and adjusts the grid position so that feature points in the grid can be maintained without distortion after reconstruction. We also used a genetic algorithm to optimize the number of meshes
and the errors between the original point cloud and the constructed mesh model, using the following equation:

$\min f = W_1 \cdot 2x_1(x_2 - 1)^2 + W_2\,e + W_3\,p^2$
s.t. $g_1 = e - 2 \le 0$, $g_2 = p - 0.05 \le 0$   (1)

where x_1 and x_2 are the numbers of longitudinal and latitudinal lines, respectively, e is the error of the constructed model, p is the ratio of long-narrow triangular meshes to the total number of triangular meshes, and W_1, W_2, W_3 are the weighting factors (0.7, 0.2 and 0.1, respectively). The errors can be divided into three categories: the distance between the original scan point and the mesh faces, the distance between the original scan point and the mesh edges, and the distance between the original scan point and the mesh vertices. For each scan point, we take the minimum of these three as its error.

2.3 Head and Facial Anthropometry

Head and facial anthropometry can be classified into two main categories: straight-line distances and surface distances. A straight-line distance is the projection distance between two feature points on a specific plane: the sagittal plane, coronal plane, or transversal plane. For example, the nose protrusion is the projection distance on the sagittal plane between the pronasale and the subnasale. In our research, the straight-line distance is called the length measurement, which can easily be determined by the Euclidean distance:
$\mathrm{Length} = d(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$   (2)
A surface distance is the length of a feature curve which passes through three specific feature points along the head surface. For example, the bitragion-subnasale curve passes through the left and right tragion and the subnasale, and its arc length is the total length of this curve. In the garment industry, the arc length is also called the tape measurement. In practice, we define a plane passing through F_A, F_B and F_C as a plane E: ax + by + cz + d = 0. Then we can determine the intersection point set {F_i} of plane E and the head model. The intersection point set is represented as a polyline Γ. However, this polyline may be a concave polygon, which differs from reality: when taking tape measurements, the tape only contacts the convex part of the body, forming a convex hull around it. The convex hull of the polygon Γ is obtained by

$H_{\mathrm{convex}}(\Gamma) = \left\{ \sum_{i=1}^{k} \alpha_i p_i \;\middle|\; p_i \in \Gamma,\ \alpha_i \in \mathbb{R},\ \alpha_i \ge 0,\ \sum_{i=1}^{k} \alpha_i = 1,\ k = 1, 2, \ldots \right\}$   (3)
A rough estimate of the tape measurement can be given by summing the inter-vertex distances of $H_{\mathrm{convex}}(\Gamma)$, but its precision depends on the resolution of the mesh surface. To obtain a better estimate, Leong employed a B-spline approximation technique to regenerate the surface curve [22]. In our research, we adopt this method and use an order k = 4 B-spline curve to fit the polyline:

$C(u) = \sum_{i=1}^{n} N_{i,k}(u)\, P_i, \quad u \in [t_{k-1}, t_{n+1})$   (4)

where P_i is a control point and N_{i,k} is the kth-order B-spline basis function. After constructing the B-spline curve corresponding to the specific polyline, the tape measurement can be determined by numerical integration:

$\mathrm{Length} = \int_{u=0}^{u=1} \left\| C'(u) \right\| du$   (5)
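The pipeline of Eqs. (3)-(5) can be sketched in a few lines. The code below is an illustration under our own assumptions: SciPy's convex-hull and spline routines stand in for the authors' implementation, and the cross-section is assumed to be already expressed in the cutting plane's 2D coordinates.

```python
import numpy as np
from scipy.spatial import ConvexHull
from scipy.interpolate import splprep, splev
from scipy.integrate import quad

def tape_measurement(section_pts):
    """Tape length around a planar cross-section: convex hull, Eq. (3);
    closed B-spline fit, Eq. (4); arc length by integration, Eq. (5).
    section_pts is an (N, 2) array of plane/mesh intersection points."""
    hull = ConvexHull(section_pts)
    pts = section_pts[hull.vertices]      # the tape touches only the convex part
    pts = np.vstack([pts, pts[:1]])       # close the loop for a circumference
    # degree-3 spline, i.e. order k = 4 in the paper's convention
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=0, k=3, per=True)
    speed = lambda u: float(np.hypot(*splev(u, tck, der=1)))   # |C'(u)|
    length, _ = quad(speed, 0.0, 1.0, limit=200)
    return length
```

For the open arcs (e.g., the bitragion-subnasale arc) one would instead fit a non-periodic spline over the hull segment between the two end landmarks.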
In Fig. 3, the black points F_i, F_{i+1} and F_{i+2} are the intersection points of the plane E and the triangular meshes Δ_j and Δ_{j+1}. The red line is the original polyline connecting the intersection points, and the blue curve (the dashed line) is the corresponding fitted B-spline curve.
Fig. 2. Plastic mannequin
Fig. 3. Convex-hull B-spline curve approximation
3 Results

Our research uses an automatic feature extraction method to identify head and facial features from a scanned head model. We propose a multi-resolution mesh construction method to reconstruct the head mesh model, and apply a genetic optimization method to obtain the optimized mesh numbers with good accuracy. The optimum mesh number is 120×70, the resulting error is about 0.069mm, and the long-narrow triangular mesh percentage is about 2.66%. The head mesh model is shown in Fig. 4, where the black lines are the mesh edges, the green points are the feature points, and the red lines are the feature lines.
Fig. 4. Optimized head mesh model
We also present a digitized head and facial anthropometry method providing a total of 17 length measurements (shown in Fig. 5) and 5 tape measurements (shown in Fig. 6). The length measurements are listed in Table 1. The Proj. plane column in Table 1 lists the projection plane used to determine the corresponding distance between the two features listed in the Anthropometry landmarks column. The tape measurements are listed in Table 2; most of the feature curves are open curves running from the start point, through a mid-point, to the end point. Only one feature curve, the head circumference, is a closed-loop curve, determined by measuring the circumference of the head surface above the eyebrow ridges.
Fig. 5. The length measurements
Fig. 6. The tape measurements
Table 1. Definitions of length measurement anthropometry
No. | Feature name | Proj. plane | Anthropometry landmarks
1 | Maximum frontal breadth | coronal | Left and right zygofrontale
2 | Minimum frontal breadth | coronal | Left and right frontotemporale
3 | Interpupillary breadth | coronal | Left and right pupilla
4 | Nasal root breadth | coronal | The root of the left and right bridge of the nose
5 | Nose breadth | coronal | Left and right alare
6 | Bizygomatic breadth | coronal | Maximum horizontal breadth between left and right zygomatic arches
7 | Bigonial breadth | coronal | Left and right gonion
8 | Lip length | coronal | Left and right chelion
9 | Menton-sellion length | sagittal | Menton and sellion
10 | Subnasale-sellion length | sagittal | Subnasale and sellion
11 | Nose protrusion | sagittal | Pronasale and subnasale
12 | Facial projection | sagittal | Sellion and tragion
13 | Chin projection | sagittal | Menton and tragion
14 | Sellion Z | sagittal | Vertex and sellion
15 | Right tragion X | sagittal | Sellion and opisthocranion
16 | Head breadth | coronal | The maximum horizontal breadth of the head above the ears
17 | Head length | sagittal | Glabella and opisthocranion
Table 2. Definitions of tape measurement anthropometry
No. | Feature name | Start point | Mid-point | End point
1 | Bitragion coronal arc | left tragion | vertex | right tragion
2 | Bitragion frontal arc | left tragion | forehead | right tragion
3 | Bitragion subnasale arc | left tragion | subnasale | right tragion
4 | Bitragion chin arc | left tragion | chin | right tragion
5 | Head circumference | Maximum circumference of the head above the eyebrow ridges
Five volunteers used traditional measuring tools, including a measuring tape (accuracy: 0.1mm) and calipers (accuracy: 0.01mm), to obtain these head and facial anthropometry data for the plastic mannequin, for comparison with our results. The volunteers' measurement errors are listed in Table 3. From these experiments we can see that the maximum error often occurs on the maximum and minimum frontal breadth features (i.e., the distance between the zygofrontale or frontotemporale) in the length measurement group, and on the bitragion chin arc (i.e., the arc length between the left and right tragion passing through the chin point) in the tape measurement group.
Volunteer No. 1 repeated the measuring process 5 times for intra-volunteer comparison, with results listed in Table 4. The maximum error often occurred in the feature lines related to the chin point, e.g., the menton-sellion length or the bitragion chin arc. Because the zygofrontale and frontotemporale are features on the orbital bone and the chin point is a feature point defined by the mandible, it is difficult to distinguish these features from the nearby soft tissue; volunteers may measure these feature lines or arcs from different positions, resulting in more significant errors. We can also see that more than 88% (in most cases, nearly 100%) of the length measurement data and more than 60% (in most cases, nearly 100%) of the tape measurement data show errors under 5mm, which is the accepted standard in anthropometry.

Table 3. Comparison results of inter-volunteer measurement data (unit: mm)
Volunteer | Length measurement: Mean / SD / Max. / <5mm | Tape measurement: Mean / SD / Max. / <5mm
1 | 1.69 / 1.26 / 4.24 / 100% | 1.80 / 0.85 / 3.00 / 100%
2 | 1.86 / 1.34 / 4.15 / 100% | 3.80 / 1.74 / 5.00 / 80%
3 | 2.50 / 1.53 / 4.97 / 100% | 1.56 / 2.01 / 5.00 / 80%
4 | 2.51 / 1.59 / 5.63 / 94.12% | 3.04 / 1.81 / 4.80 / 100%
5 | 1.80 / 1.42 / 4.70 / 100% | 3.50 / 1.75 / 5.30 / 60%
Table 4. Comparison results of intra-volunteer measurement data (unit: mm)
Trial | Length measurement: Mean / SD / Max. / <5mm | Tape measurement: Mean / SD / Max. / <5mm
1 | 1.69 / 1.26 / 4.24 / 100% | 1.80 / 0.85 / 3.00 / 100%
2 | 1.81 / 1.52 / 4.92 / 100% | 1.44 / 0.91 / 2.40 / 100%
3 | 1.53 / 1.36 / 4.60 / 100% | 1.56 / 1.24 / 3.00 / 100%
4 | 1.82 / 1.76 / 5.63 / 88.24% | 1.36 / 0.71 / 2.00 / 100%
5 | 1.68 / 1.57 / 5.02 / 94.12% | 1.76 / 1.16 / 3.00 / 100%
4 Conclusions

Our research presents an optimized method for multi-resolution mesh generation according to the head and facial features of a scanned head. The method can construct a mesh with accuracy levels suited to different applications. We also applied a genetic algorithm to determine the optimized mesh resolution satisfying both the minimum mesh number and the minimum error with a good geometric configuration (i.e., fewer long-narrow triangular meshes).
Anthropometry traditionally relies on the measurer's experience and skill in manually identifying and measuring features. Our research demonstrates an automatic digital head and facial anthropometry method for 17 length measurements and 5 tape measurements defined in the NIOSH anthropometry survey for respirator design, which is used as a common industrial standard. Our method provides more objective and highly reproducible measurement data than traditional measurement methods. Compared to manual length measurement, the average error, maximum error and standard deviation are 1.70mm, 5.63mm and 1.47mm, respectively, for intra-measurement, and 2.07mm, 5.63mm and 1.44mm, respectively, for inter-measurement. Compared to manual tape measurement, the average error, maximum error and standard deviation are 1.52mm, 3.00mm and 0.96mm, respectively, for intra-measurement, and 2.74mm, 5.30mm and 1.79mm, respectively, for inter-measurement.
5 Discussion

Our method can obtain highly accurate and highly reproducible anthropometry data for well-defined features. Some anthropometry data show large errors (more than 5mm), and these often occur on ambiguous feature points or on feature points that are not clearly defined on soft tissue. Resolving this problem requires the establishment of more accurate and suitable feature definitions to replace the current ambiguous ones. The results can be used in the design of products like helmets, face shields and respirators, and are also helpful in the analysis of human facial surfaces for clinical diagnosis and assessment.
Acknowledgments

This study was supported by the National Science Council under grant NSC 98-2221-E-006-244.
References

1. Li, J.C., Lin, Y.H., Yang, Y.X., Chen, C.Y.: A Study of Safety Helmet Size in Relation to Size Classification in Taiwan's Worker Craniofacial Database. Journal of Occupational Safety and Health 14, 48–57 (2006)
2. Liu, H., Li, Z., Zheng, L.: Rapid Preliminary Helmet Shell Design Based on Three-Dimensional Anthropometric Head Data. Journal of Engineering Design 19, 45–54 (2008)
3. Human Solutions of North America Inc., http://www.human-solutions.com/
4. Cyberware Inc., http://www.cyberware.com/
5. Vitronic Machine Vision Ltd., http://www.vitronic.de/
6. Textile/Clothing Technology Corp., http://www.tc2.com/
7. Fang, S.-Y., Fang, J.-J.: A Geometric Features Preserving Method for Facial Reconstruction
8. Koenen, R.: Overview of the MPEG-4 Standard. ISO/IEC JTC1/SC29/WG11 N4030 (2001)
9. Robinette, K.M., Daanen, H., Paquet, E.: The CAESAR Project: a 3-D Surface Anthropometry Survey. In: Proceedings of the IEEE 2nd International Conference on 3-D Digital Imaging and Modeling, Ottawa, Ont., Canada, pp. 380–386 (1999)
10. Taiwan Human Body Bank, http://3d.cgu.edu.tw/
11. Enciso, R., Alexandroni, E.S., Benyamein, K., Keim, R.A., Mah, J.: Precision, Repeatability and Validation of Indirect 3D Anthropometric Measurements with Light-based Imaging Techniques. In: IEEE International Symposium on Biomedical Imaging: Nano to Macro, vol. 2, pp. 1119–1122 (2004)
12. Bailar, J.C., Meyer, E.A., Pool, R.: Assessment of the NIOSH Head-and-Face Anthropometric Survey of U.S. Respirator Users. National Academies Press, Washington (2007)
13. Shiang, T.Y.: A Statistical Approach to Data Analysis and 3-D Geometric Description of Human Head and Face. Proc. Natl. Sci. Counc. Rep. China Pt. B Life Sci. 23, 19–26 (1999)
14. Yang, L., Shen, H.: A Pilot Study on Facial Anthropometric Dimensions of the Chinese Population for Half-Mask Respirator Design and Sizing. International Journal of Industrial Ergonomics 38, 921–926 (2008)
15. Yokota, M.: Head and Facial Anthropometry of Mixed-race US Army Male Soldiers for Military Design and Sizing: A Pilot Study. Applied Ergonomics 36, 379–383 (2005)
16. Kim, H., Han, D.H., Roh, Y.M., Kim, K., Park, Y.G.: Facial Anthropometric Dimensions of Koreans and Their Associations with Fit of Quarter-Mask Respirators. Industrial Health 41, 8–18 (2003)
17. Zhuang, Z., Bradtmiller, B.: Head-and-Face Anthropometric Survey of U.S. Respirator Users. Journal of Occupational and Environmental Hygiene 2, 567–576 (2005)
18. Kim, K., Kim, H.: Three-Dimensional Shape Analysis of Commercial Half-Facepiece Respirators for Koreans: Structure Approach. Journal of the International Society for Respiratory Protection 23, 89–99 (2006)
19. Roberge, R.J., Zhuang, Z., Stein, L.M.: Association of Body Mass Index with Facial Dimensions for Defining Respirator Fit Test Panels. Journal of the International Society for Respiratory Protection 23, 44–52 (2006)
20. Du, L., Zhuang, Z., Guan, H., Xing, J., Tang, X., Wang, L., Wang, Z., Wang, H., Liu, Y., Su, W., Benson, S., Gallagher, S., Viscusi, D., Chen, W.: Head-and-Face Anthropometric Survey of Chinese Workers. Annals of Occupational Hygiene 52, 773–782 (2008)
21. Fang, S.-Y., Fang, J.-J.: Automatic Head and Facial Features Extraction Based on Geometry Variations. Computer-Aided Design (2010)
22. Leong, I.-F., Fang, J.-J., Tsai, M.-J.: A Feature-based Anthropometry for Garment Industry. International Journal of Clothing Science and Technology
3D Parametric Body Model Based on Chinese Female Anthropometric Analysis

Peng Sixiang1, Chan Chee-kooi1, W.H. Ip2, and Ameersing Luximon1

1 Institute of Textiles & Clothing, Hong Kong Polytechnic University, Hong Kong, People's Republic of China
2 Department of Industrial and Systems Engineering, Hong Kong Polytechnic University, Hong Kong, People's Republic of China
[email protected], {tcchanck,mfwhip,tcshyam}@inet.polyu.edu.hk
Abstract. This study describes the construction of a 3D parametric body model based on anthropometric analysis. Compared to traditional anthropometric surveys, the 3D body scanner provides more accurate body dimension information, including not only traditional measurements but also new body shape measurements. An anthropometric survey was completed to collect 3D body information of Hong Kong females, and a series of 3D models was built from these scan data. A number of body dimensions and body shape descriptors were extracted from the 3D models. Body shape analyses were then performed, such as bust shape analysis, front and back proportion analysis, body cross-section comparison, and correlation analysis between body dimensions. Based on the analysis of this body information, a parametric body model was built, which takes the most common body shape of Hong Kong females and is able to change critical body dimensions according to the user's inputs. Keywords: 3D, parametric, body model.
1 Introduction

Computer aided design (CAD) has become commonplace in many fields of industry. With the advancement of computer aided design (CAD) and computer aided manufacture (CAM), virtual mannequins have begun to be adopted in various areas such as garment drape simulation [1], female figure identification [2] and virtual garment design [3]. D'Apuzzo [4] reviewed the existing market usage of 3D body scanning technology for the fashion and apparel industry; virtual try-on has become possible for the industry. The body model is an essential part of these applications: it provides garment designers with a virtual three-dimensional body to which they can refer their designs, on which to try them, and with which to demonstrate their products in a virtual environment. However, there are problems with such body models: the body scan data is not in a feasible format for commercially available CAD systems, and whenever a different body size is required, a real human body of that specific size has to be scanned and the whole body model building process has to start over again [1, 5], which is relatively costly. Furthermore, the 3D body model
is usually very large, which brings inconvenience in data storage. Some researchers have been working on parametric body models [1] as a new approach to building 3D body models. First, the original shape of the parametric body model is generated from 3D body scan data; with a set of parametric surfaces and a deformation algorithm it can easily be changed into different body sizes. It can also be integrated with CAD and CAM software. However, there are still several limitations of the parametric body model. The first is how precisely the deformation algorithm can describe body shape change. Although the original body model is built according to real human or dummy scan data, the accuracy of the deformation algorithm is not validated against any anthropometric data; in other words, it is not guaranteed that the parametric body model changes according to real human body shape variation. Secondly, different races and age groups have their own patterns of body shape variation, which should be studied before setting the deformation algorithm. This paper presents a new approach to building a 3D parametric body model. Firstly, the various bodily profiles of the human body, obtained in the form of point clouds and avatars, are extracted from the body scanner database and studied. Secondly, the morphological and topological profiles of the body parts used in the 3D model design are studied in digital format. Thirdly, the digital body model is built using common CAD software for easy integration with other software.
2 Anthropometry Survey

The parametric body model is designed to change itself into the desired body shape configuration accurately, so it is essential to develop a comprehensive understanding of body shape variations. 130 Hong Kong females aged 17-22 were scanned using the TC2 body scanner. Participants were recruited for the anthropometric studies launched by the Institute of Textiles and Clothing of the Hong Kong Polytechnic University. In the recruitment process, participants were chosen who had reasonably symmetrical body proportions without any obvious deformities or postural problems caused by neurological or musculoskeletal diseases. Participants were scanned in light-colored close-fitting undergarments, which allow the 3D body scanner to capture the body shape without deforming it. Participants were positioned in an erect but relaxed posture, with arms and legs abducted slightly so that the cameras in the scanner could capture the full torso. The scan process finished in 10 seconds, and the raw data was then processed by the 3D body scanner software into .obj and .rbd files as the output. The critical measurements were then extracted and analyzed to support the following research stages.

2.1 Girth Distribution Analyses

The body shape dimensions with relatively distinct variations that belong to the garment construction dimensions defined in ISO 8559 were selected for
statistical analysis, because these dimensions are critical in garment design and the major dimension changes of the parametric body model take place in these areas. These dimensions and the cup size are set to be the driving dimensions. The selected dimensions include: neck girth, bust girth, waist girth, abdomen girth, hip girth, bust width, neck to bust vertical length, bust to waist vertical length, waist to hip vertical length and shoulder length. Table 1 shows the distribution of the selected dimensions.

Table 1. Driving dimensions distribution

Dimensions | Mean (cm) | Standard deviation (cm) | μ ± 3σ (cm)
Neck girth | 33.6 | 2.0 | 27.6 to 39.6
Bust girth | 83.5 | 6.6 | 63.7 to 103.3
Waist girth | 70.0 | 5.7 | 52.9 to 87.1
Abdomen girth | 80.2 | 7.6 | 57.4 to 103.0
Hip girth | 93.0 | 5.9 | 75.3 to 110.7
Shoulder length | 11.2 | 1.6 | 6.4 to 16.0
Bust width | 17.1 | 1.8 | 11.7 to 22.5
Neck to bust vertical length | 20.4 | 1.9 | 14.7 to 26.1
Bust to waist vertical length | 18.2 | 2.4 | 11.0 to 25.4
Waist to hip vertical length | 21.9 | 3.2 | 12.3 to 31.5
2.2 Breast Size Distribution Analyses

Breast size is an important and essential dimension for women's foundation garments and for anthropometric and figure classification. Breast and bra classification follows the ISO DIS 4416 standard (shown in Table 2), and the breast size distribution analysis results are shown in Table 3.

Table 2. ISO DIS 4416 preferred size scales for women's foundation garments (bust girth in cm per cup size)

Underbust girth (cm) | 64 | 68 | 72 | 76 | 80 | 84 | 88 | 92
Cup A | 76 | 80 | 84 | 88 | 92 | 96 | 100 | 104
Cup B | 80 | 84 | 88 | 92 | 96 | 100 | 104 | 108
Cup C | 84 | 88 | 92 | 96 | 100 | 104 | 108 | 112
Cup D | 88 | 92 | 96 | 100 | 104 | 108 | 112 | 116
Cup E | 92 | 96 | 100 | 104 | 108 | 112 | 116 | 120
Cup F | 96 | 100 | 104 | 108 | 112 | 116 | 120 | 124
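For illustration, the regular structure of Table 2 (each cup letter sits 4 cm of bust girth above the previous one, starting 12 cm above the underbust girth) can be captured in a small helper; this reading of the table is ours, not code from the study.

```python
def cup_size(underbust_cm, bust_cm):
    """Map an (underbust, bust) pair onto the ISO DIS 4416 scale of Table 2:
    cup A at underbust + 12 cm, then one letter per additional 4 cm."""
    cups = "ABCDEF"
    idx = round((bust_cm - underbust_cm - 12) / 4)
    return cups[idx] if 0 <= idx < len(cups) else None

# cup_size(64, 76) -> 'A'; cup_size(80, 100) -> 'C'
```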
Table 3. Breast size distribution

Underbust girth (cm) | 64 | 68 | 72 | 76 | 80 | 84 | 88 | 92 | Percentage
Cup A | 14% | 18% | 17% | 6% | 1% | 1% | 0% | 0% | 57%
Cup B | 8% | 12% | 7% | 3% | 0% | 1% | 2% | 1% | 34%
Cup C | 2% | 5% | 1% | 1% | 0% | 0% | 0% | 0% | 8%
Cup D | 0% | 0% | 0% | 0% | 0% | 1% | 0% | 0% | 1%
2.3 Front and Back Proportion Analyses

Little research has been done on the relationship between the front half and the back half of the body girths. As body size grows, both the front and the back part of each girth increase, but at different rates. With the help of the 3D body scanner, the point cloud body model was generated. It was then split into a front part and a back part using a plane passing through the two side waist points and the crotch point. The front half waist and hip girths were then compared to the back halves, respectively. The result shown in Fig. 1 reveals that, as body size increases, the front part of the waist girth increases faster than the back part, while the front part of the hip girth increases more slowly than the back part. When the parametric body model changes its size, the cross-section shape changes should follow these rules.
Fig. 1. Front and back proportion comparison
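The split itself is geometrically simple; the sketch below shows one hedged way to do it, classifying each scan point by the sign of its distance to the plane through the two side waist points and the crotch point. The landmark variables are assumed inputs, and which sign corresponds to "front" depends on the landmark ordering.

```python
import numpy as np

def split_front_back(points, waist_left, waist_right, crotch):
    """Split an (N, 3) point cloud by the plane through three landmarks."""
    normal = np.cross(waist_right - waist_left, crotch - waist_left)
    normal /= np.linalg.norm(normal)
    side = (points - waist_left) @ normal        # signed distance to the plane
    return points[side >= 0], points[side < 0]   # (one half, other half)
```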
2.4 Relationship between Dimensions of Different Body Parts

If any dimension changes, it normally affects other body dimensions as well. For example, when the hip girth becomes larger, the abdomen girth increases as well. To gain a further understanding of the relationship between these critical
dimensions during size changes, the correlation relationships between several critical dimensions were computed. A sample correlation coefficient (r), sometimes referred to as a Pearson product-moment correlation, is a measure of the strength of the linear relationship between two variables. For a series of n measurements of x and y written as x_i and y_i, where i = 1, 2, ..., n, r is given by Eq. (1):

$r_{xy} = \dfrac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(n\sum x_i^2 - (\sum x_i)^2\right)\left(n\sum y_i^2 - (\sum y_i)^2\right)}}$   (1)
Its value ranges from -1.00 to +1.00, with -1.00 representing a perfect negative linear relationship (as x increases, y decreases), 0.00 representing a total lack of linear relationship, and +1.00 representing a perfect positive linear relationship (as x increases, y increases) between the variables. As shown in Table 4, the results are similar to the correlation coefficient results of the 1988 anthropometric survey of US Army females [6], indicating that these relationships hold not only for young Hong Kong females but also for young American females. With the correlation relationships found, it is possible to determine how the driven dimensions change following the driving dimensions of the parametric model. There are two driven dimensions: the scye circumference and the underbust girth.

Table 4. Correlation coefficients between critical body dimensions

Correlation Coefficients | US Army (female) | Hong Kong young female
Bust Girth - Shoulder Length | 0.074 | 0.118
Bust Girth - Neck Girth | 0.645 | 0.457
Bust Girth - Scye Circumference | 0.768 | 0.708
Bust Girth - Waist Girth | 0.863 | 0.706
Waist Girth - Hip Girth | 0.740 | 0.774
Neck Girth - Shoulder Length | 0.080 | 0.105
Bust Girth - Underbust Girth | 0.875 | 0.907
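Eq. (1) is straightforward to compute; the sketch below reproduces it directly (NumPy's corrcoef would give the same result), with illustrative variable names.

```python
import numpy as np

def pearson_r(x, y):
    """Sample correlation coefficient, Eq. (1)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    num = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    den = np.sqrt((n * np.sum(x**2) - np.sum(x)**2) *
                  (n * np.sum(y**2) - np.sum(y)**2))
    return num / den

# e.g. pearson_r(bust_girth_cm, underbust_girth_cm) over the 130 subjects
# should reproduce the 0.907 entry in Table 4.
```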
3 3D Parametric Body Model

The shape of the 3D parametric body model was developed with the software Solidworks. It is controlled by specific cross sections (see Fig. 3 and Fig. 4). To construct the shapes of these cross sections, the point cloud data of a normal size-12 full body form was collected using the body scanner. The noise in the point cloud data was then filtered out, and the remaining points were connected using the B-spline method to generate the basic curves of the cross sections (Fig. 5). Compared against the point cloud data from the anthropometric survey, these basic curves were modified manually to bring them closer to the most common body shape of young Hong Kong women. Then
the dimensions listed in Table 1 and the cup size were set to be the driving dimensions, which come from the user's inputs, while dimensions such as the scye circumference and the underbust girth were set to be the driven dimensions. Here is an example to explain how the parametric model works. When the user increases the waist girth, the parametric body model first increases the front part and back part separately, according to the rules given in Section 2.3, until the whole waist girth reaches the value the user specified. Then the driven dimensions, such as the scye circumference and the underbust girth, change following the driving dimensions as Section 2.4 describes. Fig. 6 shows an example in which the parametric model changes from size A to size B (see Table 5), demonstrating the body shape variation.
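A heavily simplified sketch of this driving/driven update logic is shown below. The front/back growth share and the linear driven-dimension relation are placeholders standing in for the survey-derived rules of Sections 2.3 and 2.4; they are not the paper's actual coefficients or its Solidworks implementation.

```python
def update_waist(front, back, target_total, front_share=0.55):
    """Grow the front and back waist girths at different rates (Section 2.3).
    front_share is a placeholder, not a value from the survey."""
    delta = target_total - (front + back)
    return front + front_share * delta, back + (1 - front_share) * delta

def update_driven(driving_value, slope, intercept):
    """A driven dimension (e.g. underbust girth) follows its driving
    dimension through a regression line fitted to the survey data."""
    return slope * driving_value + intercept
```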
Fig. 3. Parametric body model
Fig. 4. The parametric mannequin shape is controlled by the cross sections
Fig. 5. The cross section at the bust level constructed from the points and the B-spline curve
Table 5. Size A and B
Measurements | A (cm) | B (cm)
Neck girth | 11.2 | 10.5
Bust girth | 86.0 | 80.0
Waist girth | 62.5 | 55.5
Abdomen girth | 82.0 | 77.0
Hip girth | 93.0 | 85.0
Neck to bust vertical length | 23.2 | 20.4
Bust to waist vertical length | 16.5 | 14.0
Waist to hip vertical length | 20.0 | 18.5
Fig. 6. Shape comparison between size A and B
4 Conclusion

A 3D parametric body model has been built based on the anthropometric survey of young Hong Kong women. It can model most of the various body figure types of women in this age group from a few driving dimensions supplied by the user. The related analyses have been performed to develop a comprehensive understanding of human morphology. The outcome of the study provides valuable anthropometric data and a 3D parametric body model to be used in garment drape simulation, female figure identification and virtual garment design. Further work will include bringing in more driving and driven dimensions to refine the shape and make the body model more flexible. A benchmark system will be added to check whether the user's inputs are reasonable dimensions, and a user interface will be built as well to make the system more user-friendly.
References

1. Kim, S., Park, C.: Parametric body model generation for garment drape simulation. Fibers and Polymers 5(1), 12–18 (2004)
2. Simmons, K., et al.: Female figure identification technique (FFIT) for apparel. Part I: describing female shapes. Journal of Textile and Apparel, Technology and Management 4(1) (2004)
3. Hardaker, C.H.M., Fozzard, G.J.W.: Towards the virtual garment: three-dimensional computer environments for garment design. International Journal of Clothing Science and Technology 10(2), 114–127 (1997)
4. D'Apuzzo, N.: 3D body scanning technology for fashion and apparel industry. In: Beraldin, J.-A., Remondino, F., Shortis, M.R. (eds.) Videometrics IX, Proceedings of the SPIE, vol. 6491, p. 64910O (2007)
5. Carrere, C., et al.: Automated garment development from body scan data. National Textile Center Annual Report I (2000)
6. IL, N.U.E., et al.: Anthropometric Survey of US Army Personnel (1988): Correlation Coefficients and Regression Equations (1990)
Anthropometric Measurement of the Feet of Chinese Children

Linghua Ran, Xin Zhang, Chuzhi Chao, and Taijie Liu

China National Institute of Standardization, Zhichun Road 4, Haidian District, Beijing 100088, China
[email protected]
Abstract. This paper presents the results of a nationwide anthropometric survey conducted on children in China. Foot length and foot breadth were measured from 20,000 children aged 4 to 17 years old using a 3D foot scanner. Mean values, standard deviations, and the 5th and 95th percentiles for the two items were estimated. The dimension differences between age groups and genders are discussed, and the classification of foot shape is analyzed. It was found that the mean values of the dimensions show a gradual increase with age. The dimensions show no significant gender difference for children from 4 to 12, but the difference becomes significant for children from 13 to 17. For both boys and girls, the intermediate foot type has the greatest proportion. These data, previously lacking in China, can benefit the design of children's products. Keywords: Foot; anthropometric measurement; Chinese children.
1 Introduction

Anthropometric data are essential for the correct design of various facilities. Without such data, designs cannot fit people properly. This is especially true for children: the comfort and functional utility of workspaces, equipment and products designed based on anthropometric data are related to children's health and safety. Many anthropometric studies have been undertaken to determine the size of children [1][2][3][4]. In China, a nationwide anthropometric survey project for children from 4 to 17 was completed from 2005 to 2008. This survey measured more than 100 anthropometric dimensions, including body size as well as head, foot and hand size. The foot length and foot breadth data for the children are presented in this paper. The purpose is to determine foot length and foot breadth dimensions in different age groups to facilitate the design of such products as shoes, socks and other components of their daily life.
2 Methods

2.1 Subjects

China is a vast country with an area of over 9.6 million square kilometers. Children in different regions show large differences in body development status and body shape.
To make the anthropometric survey more representative, a stratified cluster sampling method was used to determine the distribution of the samples. The whole country was divided into six geographical areas, in accordance with the adult anthropometric survey in 1988 [5]: the north and northeast area, the central and western area, the lower reaches of the Changjiang River area, the middle reaches of the Changjiang River area, the Guangdong-Guangxi-Fujian area, and the Yunnan-Guizhou-Sichuan area. From the statistical point of view, the people within each area have similar body shape and size, but body shape and size differ between areas. The sample size in each area was determined based on the distribution of the children's population reported by the China National Bureau of Statistics [6]. One or two cities in each area were selected, and some kindergartens, primary schools and high schools were taken from these cities. Within each kindergarten, primary school or high school selected, a number of classes were taken, and all the children in those classes were measured until the desired number of children per age group was met. According to the Report on the Physical Fitness and Health Surveillance of Chinese School Students (2000) [7] and the Report on the Second National Physical Fitness Surveillance (2000) [8], the children were subdivided into five age groups: preschool (4-6 years old), lower primary (7-10 years old), upper primary (11-12 years old), middle school (13-15 years old) and high school (16-17 years old). In this survey, for example, 10 years old means those whose age is from 9.5 to 10.5 years old. The sample size in each age group was distributed according to the children's body development status: the sample size of the preschool age group may be smaller; within the lower primary and middle school age groups, the sample size should be increased; and for the upper primary and high school age groups the sample size can be reduced appropriately. Based on this sampling plan, body dimension data were obtained from about 20,000 children in ten provinces distributed across the six geographical areas.

2.2 Dimension Measurements

Instead of a traditional Martin-type anthropometer, the PEDUS 3D foot scanner was adopted for the foot anthropometric survey. The accuracy of the scanner is 1mm, and the scan time is less than 10 seconds per person. The 3D scanning system is much faster than the Martin method for collecting foot data, and it is more applicable for large-scale anthropometric surveys; it also provides a permanent record from which any measurement dimensions can be taken as needed. To achieve greater scientific uniformity, measurements were always carried out on the right foot. After each scan, the scanning results were reviewed to prevent scan failures caused by foot shifting. Before the start of the survey, the measurement team was trained in anthropometric techniques and was checked for consistency in their survey procedures to ensure the reliability of the anthropometric data. At each survey spot, the parents or teachers were asked to fill in a form including their children's name, sex, birth date and place, nationality, school and grade, etc. The whole survey was completed in a period of about two years.
2.3 Data Processing and Statistical Analysis

Scanworx Foot Measure software was used to calculate the foot data. This software can calculate some foot dimensions automatically, and it also allows the user to calculate foot data interactively. In this paper, foot length and foot breadth were taken from this software. The definitions of both foot dimensions were taken from ISO 7250:2004 [9]. The dimension values obtained were categorized according to sex and age groups, and the data were examined for abnormal values. Extreme outliers and unreasonable results were identified and eliminated carefully using the 3σ test, a peak value test and a logical value test. The Statistical Package for the Social Sciences (SPSS) for Windows version 16.0 was used in the subsequent statistical analysis. Descriptive statistics, including arithmetic means (M), standard deviations (SD), and percentiles (5th and 95th) of the above measurements, were calculated for both boys and girls.
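A minimal sketch of this screening-plus-summary step (our illustration, not the survey's actual scripts) is:

```python
import numpy as np

def clean_and_describe(values_mm):
    """3-sigma outlier test, then M, SD and 5th/95th percentiles."""
    x = np.asarray(values_mm, float)
    m, sd = x.mean(), x.std()
    x = x[np.abs(x - m) <= 3 * sd]        # drop extreme outliers
    return {"M": x.mean(), "SD": x.std(),
            "P5": np.percentile(x, 5), "P95": np.percentile(x, 95)}
```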
3 Results

The statistical values of foot length and foot breadth are presented in Table 1, including the number of subjects, by gender (boys and girls) and age group (4 to 17 years old). Estimates of the mean (M), standard deviation (SD) and the 5th and 95th percentiles are included. All dimensions are reported in mm.

Table 1. The statistical values of foot anthropometric dimensions

Gender | Age group | Number | Foot length: M / SD / P5 / P95 | Foot breadth: M / SD / P5 / P95
boys | 4-6 | 1026 | 172 / 14 / 149 / 194 | 67 / 8 / 52 / 79
boys | 7-10 | 2113 | 205 / 15 / 180 / 231 | 77 / 8 / 63 / 90
boys | 11-12 | 1987 | 229 / 15 / 206 / 254 | 83 / 9 / 69 / 97
boys | 13-15 | 2794 | 248 / 14 / 225 / 270 | 88 / 9 / 73 / 104
boys | 16-17 | 1726 | 252 / 12 / 232 / 272 | 90 / 9 / 75 / 104
girls | 4-6 | 1065 | 169 / 14 / 145 / 191 | 64 / 7 / 53 / 75
girls | 7-10 | 2146 | 201 / 16 / 173 / 228 | 73 / 9 / 58 / 87
girls | 11-12 | 1941 | 225 / 13 / 204 / 245 | 80 / 8 / 68 / 93
girls | 13-15 | 2733 | 231 / 11 / 213 / 250 | 83 / 8 / 71 / 95
girls | 16-17 | 1800 | 231 / 11 / 213 / 248 | 83 / 8 / 70 / 95
4 Discussion

4.1 Differences between Age Groups

From Table 1, it can be found that the mean values for the two dimensions increase gradually with age. Tables 2 and 3 show the interclass increase value and relative
odds ratio of the mean values. Both foot length and breadth show a significant increase with age in boys and girls, and there are clear differences between the five age groups. For boys, the difference in foot length between age groups (4-6) and (7-10) is 33mm. From (7-10) to (11-12), the foot length of boys increases by 24mm, and from (11-12) to (13-15) the increase is 19mm. For girls, the increases in the mean values of foot length are 32mm, 24mm and 7mm, respectively, for the age groups from (4-6) to (13-15). The age group from (13-15) to (16-17) is the only exception, with an interclass increase of -1mm. Tables 2 and 3 also reveal that for both boys and girls there is a stage in which the feet grow relatively fast: for boys it is from 4 to 15 years old, but for girls it is from 4 to 12 years old. After boys reach 15 and girls reach 12, the foot growth rate slows down. According to the Report on the Physical Fitness and Health Surveillance of Chinese School Students (2000), children have a sudden growth spurt in puberty, during which their physical size changes markedly. In that report, the periods are 12-14 and 10-12 for boys and girls, respectively. It can thus be seen that there is a certain degree of correlation between the foot dimension changes and age group.

Table 2. Mean value increase of foot length and breadth in different age groups (for boys)
Age group | Foot length: interclass increase (mm) | Foot length: relative odds ratio (%) | Foot breadth: interclass increase (mm) | Foot breadth: relative odds ratio (%)
(4-6) to (7-10) | 33 | 119 | 10 | 115
(7-10) to (11-12) | 24 | 111 | 7 | 109
(11-12) to (13-15) | 19 | 108 | 5 | 106
(13-15) to (16-17) | 4 | 102 | 2 | 102
Table 3. Mean value increase of foot length and breadth in different age groups (for girls)
Age group | Foot length: interclass increase (mm) | Foot length: relative odds ratio (%) | Foot breadth: interclass increase (mm) | Foot breadth: relative odds ratio (%)
(4-6) to (7-10) | 32 | 119 | 8 | 113
(7-10) to (11-12) | 24 | 112 | 8 | 110
(11-12) to (13-15) | 7 | 103 | 3 | 103
(13-15) to (16-17) | -1 | 100 | 0 | 100
4.2 Gender Differences

The differences in foot length and foot breadth between boys and girls can be obtained from Table 1. The differences in the mean values of foot length range from 3mm to 22mm. In age groups (4-6) to (11-12), the boys' foot length is slightly greater than the girls',
but the differences are not obvious. In the (13-15) and (16-17) age groups, the gender differences become significant: in the (13-15) age group the mean difference reaches 17mm, and it keeps increasing, reaching 22mm in the (16-17) age group. The significance of the differences between boys and girls across age groups was also examined by Mollison's method [10][11]. The formula is as follows:

$S = \dfrac{A_{I} - A_{II}}{S_{A_{II}}} \times 100$   (1)
where A_I is the arithmetic mean of boys in each age group, A_II is the arithmetic mean of girls in each age group, and S_{A_II} is the standard deviation of girls in each age group. Differences between the means of boys and girls are expressed for each measurement as a percentage deviation. When the indicator of mean deviation is positive, the mean of the boys is larger than the mean of the girls; the situation is reversed when the indicator is negative. If the result exceeds 100, there is a significant difference between the two groups. The indicator of mean deviation was calculated. The results showed that from 4 to 12 years old, no significant differences were found between boys and girls in either foot length or foot breadth, whereas in age groups (13-15) and (16-17) the differences in both foot length and foot breadth were significant. This may imply that it is not necessary to consider gender differences in the design of some foot-related products for children younger than 12 years old, but for children older than 12 the difference should be taken into consideration.
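Eq. (1) reduces to one line of arithmetic; the sketch below applies it to the published means, with |S| > 100 flagging a significant gender difference.

```python
def mollison_deviation(mean_boys, mean_girls, sd_girls):
    """Percentage deviation S of Eq. (1)."""
    return (mean_boys - mean_girls) / sd_girls * 100.0

# e.g. foot length, age 13-15, from Table 1:
# mollison_deviation(248, 231, 11) -> ~155, i.e. a significant difference
```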
Table 4. Classification of the foot shape

Gender | Age group | Narrow & long (%) | Intermediate type (%) | Short & wide (%)
boys | 4-6 | 16 | 46 | 38
boys | 7-10 | 24 | 54 | 22
boys | 11-12 | 32 | 56 | 12
boys | 13-15 | 41 | 48 | 10
boys | 16-17 | 28 | 41 | 31
girls | 4-6 | 17 | 53 | 30
girls | 7-10 | 35 | 52 | 13
girls | 11-12 | 38 | 54 | 8
girls | 13-15 | 38 | 53 | 9
girls | 16-17 | 34 | 56 | 10
4.3 Classification of the Foot Shape

According to the foot shape classification method of the former Soviet expert [12], the foot shape index was calculated as foot breadth / foot length × 100%. If the index is less than 34.9, the foot is the narrow & long type; if the index is between 35.0 and 39.9, the foot is the intermediate type; if the index is higher than 40.0, the foot is the short & wide type. Table 4 shows the percentage of each classification in each age group. From Table 4, it can be found that the intermediate type has the greatest percentage, covering almost 50% of all the children; the next most common is the narrow & long type. This classification can be used in the design of children's products.
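The classification rule translates directly into code; the comparison operators below are our choice for values falling in the narrow gaps between the published thresholds.

```python
def classify_foot_shape(foot_length_mm, foot_breadth_mm):
    """Foot shape index = breadth / length x 100, then threshold."""
    index = foot_breadth_mm / foot_length_mm * 100.0
    if index < 35.0:
        return "narrow & long"
    if index <= 39.9:
        return "intermediate"
    return "short & wide"

# e.g. boys 13-15 mean values from Table 1: classify_foot_shape(248, 88)
# gives an index of ~35.5, i.e. "intermediate"
```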
5 Conclusion

This study was conducted to provide foot anthropometric information on Chinese children from 4 to 17 years old, which can be used for the ergonomic design of workspaces and products. The foot length and foot breadth dimensions extracted from 20,000 children are listed in the form of mean, standard deviation and percentile values. The differences among age groups and between boys and girls are discussed, and three foot shape types are analyzed. The results show that the differences between the age groups are significant. In the (13-15) and (16-17) age groups, the gender difference is significant in foot length. The foot shape of most Chinese children is the intermediate type. This survey provides the first foot anthropometric database of Chinese children.
References

1. Wang, M.-J.J., Wang, E.M.-Y., Lin, Y.-C.: The Anthropometric Database for Children and Young Adults in Taiwan. Applied Ergonomics 33, 583–585 (2002)
2. Kayis, B., Ozok, A.F.: Anthropometry Survey Among Turkish Primary School Children. Applied Ergonomics 22, 55–56 (1991)
3. Steenbekkers, L.P., Molenbroek, J.F.: Anthropometric Data of Children for Non-specialist Users. Ergonomics 33(4), 421–429 (1990)
4. Prado-Leon, L.R., Avila-Chaurand, R., Gonzalez-Munoz, E.L.: Anthropometric Study of Mexican Primary School Children. Applied Ergonomics 32, 339–345 (2001)
5. Chinese National Standard, GB10000-1988: Human Dimension of Chinese Adults. Standards Press of China, Beijing (1988)
6. National Bureau of Statistics: Chinese Demographic Yearbook. China Statistics Press (2003)
7. Ministry of Education of the People's Republic of China, General Administration of Sports of China, Ministry of Health of the People's Republic of China, Ministry of Science and Technology, Sports and health study group of Chinese Students Allocation: Report on the Physical Fitness and Health Surveillance of Chinese School Students. Higher Education Press, Beijing (2000)
8. General Administration of Sports of China: Report on the Second National Physical Fitness Surveillance. Sports University Press, Beijing (2000)
9. International Standard, ISO 7250: Basic Human Body Measurements for Technological Design. International Organization for Standardization (2004)
10. Hu, H., Li, Z., Yan, J., Wang, X., Xiao, H., Duan, J., Zheng, L.: Anthropometric Measurement of the Chinese Elderly Living in the Beijing Area. International Journal of Industrial Ergonomics 37, 303–311 (2007)
11. Nowak, E.: Workspace for Disabled People. Ergonomics 32(9), 1077–1088 (1989)
12. Xiao, H., Ai, Q.: Foot Shape Study of Children of Kazak Nationality. Chinese Journal of Anatomy 23(3) (2000)
Human Dimensions of Chinese Minors

Xin Zhang1, Yanyu Wang2, Linghua Ran1, Ailan Feng2, Ketai He2, Taijie Liu1, and Jianwei Niu2

1 China National Institute of Standardization, Beijing, 100088, China
2 Department of Logistics Engineering, University of Science and Technology Beijing, Beijing, 100083, China
[email protected]
Abstract. This paper presents the preliminary statistical results of the latest national anthropometric survey of minors in mainland China. About 20,000 minors (9,666 males and 9,699 females) were recruited from six geographical areas in China. These subjects were divided into five age groups. Body weight plus a total of 134 static dimensions were selected for measurement. Non-contact three-dimensional (3D) scanning technology was used in this survey, with manual measuring and two-dimensional (2D) imaging measurement as subsidiary methods. There is no significant difference in body height between genders at the 0.05 significance level when the subjects are less than 12 years old. The 5th, 50th and 95th percentile values for nineteen selected dimensions are presented. This work provides the first nationwide anthropometric database of minors in mainland China and will benefit product design for the potential users. Keywords: Anthropometric, minors, three dimension, survey.
1 Introduction

Anthropometric data are essential for the ergonomic design of numerous products, facilities and work places. Great efforts have been made in many countries and areas to establish anthropometric databases for different population groups such as students, farmers and workers [1-2]. However, there is not enough 3D anthropometric data of the Chinese population, despite the fast development of its economy and its status as the biggest labor market. Most products sold in the Chinese market are designed according to the body data of Europeans and Americans, and do not conform to the Chinese bodily form. Although there are a large number of minors in mainland China, anthropometric data related to minors have either been lacking or are quite obsolete. This has resulted in many problems related to the functional utility of classrooms, furniture, and clothes, all relevant to the minors' health, safety and comfort [3-4]. A national anthropometric survey of minors has recently been completed in mainland China. This survey, conducted from 2005 to 2008, is regarded as the most comprehensive, influential and technologically advanced anthropometric project in China since the last national anthropometric survey was finished in 1986. Great anthropometric changes have taken place during the past decades with the rapid
progress of the economy and the living standards of the Chinese people. It is reported that secular changes in anthropometric values have arisen continuously, e.g., as demonstrated in the general trend of increasing stature [5]. As for the minors, notable anthropometric differences can be recognized compared with their ancestors' counterparts. Anthropometric measuring techniques have also advanced greatly in recent years. Traditional univariate anthropometric measurements could not satisfy the fitting requirements of products [6-7], while 3D laser surface imaging technology has the ability to digitize the surface information of the subjects as a high-density 3D point cloud [7-11]. Three-dimensional scanning technology is one of the most important methods for digitizing objects and environments in the real world [12]. It is increasingly popular and has attracted great attention in anthropometry. Three-dimensional anthropometric measurements have been used in various fields such as workplace design [13-14], sizing for clothes [6], and protective wear for motorcycle riders [15]. At present, more and more 3D scanning projects are being launched in many countries [14, 16-18]. Three-dimensional scanning techniques have been applied more and more widely and may eventually substitute for the traditional manual measuring method. The objective of this paper is to present the preliminary anthropometric data for minors in mainland China. The remainder of this paper is organized as follows. In Section 2, we introduce the method. Section 3 reports the results. Some discussions are given in Section 4. Finally, Section 5 summarizes this study.
2 Method

2.1 Subjects

About 20,000 subjects (9,666 males and 9,699 females) participated in this survey. The population database covers ages ranging from four to seventeen years old. The children were classified into five age groups: preschool (4-6), junior primary school (7-10), senior primary school (11-12), junior high school (13-15), and senior high school (16-17). The age stratification criterion is based on ISO 15535:2003, General requirements for establishing anthropometric databases [19]. The population database includes 2,117 pre-school students, 4,263 junior primary school students, 3,930 senior primary school students, 5,527 junior high school students, and 3,527 senior high school students. The subjects were recruited from six geographical areas in China: northeast-north China, central and western China, the lower reach of the Yangtze River, the middle reach of the Yangtze River, south-east China and south-west China. The sample size in each area was determined based on the distribution of the children's population reported by the China National Bureau of Statistics [20]. One or two cities in each area were selected, and some kindergartens, elementary schools and high schools were chosen from these cities. The distribution of the sample size among regions and age groups can be seen in Tables 1 and 2,
respectively. The parents or teachers of the minors filled in the subjects' name, sex, birth date and place, nationality, school name and grade, etc. [5]. The whole measuring process was in accordance with ethics regulations.

Table 1. Distribution of sample size among regions
Region | Male | Female
Lower reach of Yangtze River | 1804 | 1738
Middle reach of Yangtze River | 1797 | 1804
Northeast-north China | 1734 | 1805
South-east China | 975 | 941
South-west China | 1704 | 1773
Central and western China | 1652 | 1638
Total | 9666 | 9699
Table 2. Distribution of sample size among age groups
Age | Male | Female | Total
4-6 | 1043 | 1074 | 2117
7-10 | 2113 | 2150 | 4263
11-12 | 1988 | 1942 | 3930
13-15 | 2795 | 2732 | 5527
16-17 | 1727 | 1800 | 3527
Total | 9666 | 9699 | 19365
2.2 Measuring Apparatus and Procedure

The following instruments were used for the measurements: weighing scales, Human Solutions Vitus 3D full body measuring equipment, Human Solutions 3D head measuring equipment, Human Solutions 3D foot measuring equipment, Martin-type anthropometers and self-produced adjustable chairs. 3D body scanning was the primary measuring technique in this survey, while weight, stature and middle fingertip height over head were measured manually. Hand dimensions were measured by ordinary 2D scanners. During the measuring process, the male subjects wore shorts, while the females wore underwear. As shown in Fig. 1, each subject wore a uniform tight white-colored cap to avoid light absorption by the hair; otherwise there would be outliers and holes in the scanned head data. Since some anatomical landmarks cannot be recognized by computer software, twenty-one landmark markers were attached to the skin of the subjects to help distinguish these points more easily.
Fig. 1. White-colored elastic cap worn during scanning [21]
There were three postures during body scanning in this survey, i.e., standing posture I, standing posture II, and sitting posture, as shown in Fig. 2.

Fig. 2. Postures for body scanning: (a) standing posture I; (b) standing posture II; (c) sitting posture
2.3 Measurements

Body weight plus a total of 134 static dimensions were selected for measurement. Sixty-nine of the static dimensions relate to the standing posture and twenty-three to the sitting posture. The other forty-two are measurements of specific body segments, i.e., 15 head dimensions, 23 hand dimensions and 4 foot dimensions. The dimensions measured were all named according to International Standard ISO 7250, Basic Human Body Measurements for Technological Design [22].
3 Results

Principles based on expertise accepted through common practice in traditional anthropometry were used to ensure data validity, to detect outliers caused by equipment errors during measurement, and to reduce both inter- and intra-investigator variability to a minimum. The statistical software
SPSS 16.0 was used for analysis. An ANOVA test with a significance level of 0.05 was used to test the gender effect on stature among different age groups; results are listed in Table 3. It can be seen that there is no significant difference between genders within the 4-12 age group at the 0.05 significance level, whereas there is a significant difference between genders in the age groups over 12 years. This coincides with common sense: Chinese female minors usually enter adolescence after they turn 12 years old, and once in this period, significant physiological and anthropometric changes take place. Therefore it is not necessary to separate female from male when presenting the descriptive statistics of height for subjects younger than 12 years.

Table 3. Descriptive statistics of height for male and female participants
Age | Gender | N | M | Std. | Minimum | Maximum | p-value*
4-12 | Male | 4942 | 1332 | 150 | 934 | 1785 | 0.15
4-12 | Female | 4810 | 1328 | 160 | 922 | 1768 |
13-15 | Male | 2656 | 1617 | 88 | 1327 | 1862 | 0.00
13-15 | Female | 2628 | 1564 | 61 | 1067 | 1754 |
16-17 | Male | 1617 | 1695 | 63 | 1280 | 1901 | 0.00
16-17 | Female | 1675 | 1581 | 55 | 1403 | 1812 |

* p-values indicate the significance of the difference between genders within each age group.
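As a hedged sketch of the test behind Table 3 (SPSS was used in the study; the SciPy call below is an equivalent stand-in), note that a one-way ANOVA with exactly two groups is equivalent to an independent-samples t-test.

```python
from scipy import stats

def gender_effect_on_stature(stature_male_mm, stature_female_mm, alpha=0.05):
    """One-way ANOVA with gender as the factor, as in Table 3."""
    f_stat, p_value = stats.f_oneway(stature_male_mm, stature_female_mm)
    return p_value, p_value < alpha    # (p, significant at alpha?)
```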
Descriptive statistics - the 5th, 50th and 95th percentiles - were generated for female and male subjects within each age group. The statistical results are presented in Tables 4-6 for the nineteen most commonly applied anthropometric measurements for minors. Table 4 presents the anthropometric mean values of the 5th, 50th and 95th stature percentile groups for both genders aged no more than 12 years. Tables 5 and 6 present the anthropometric mean values of the 5th, 50th and 95th stature percentile groups for male and female subjects, respectively, for the two older age groups, i.e., 13-15 and 16-17 years.
4 Discussions

Great efforts have been made in data pre-processing, statistical analysis, and the establishment of sizing systems since the anthropometric survey was finished. This paper only presents the preliminary statistical analysis results of the survey, mainly addressing the 5th, 50th and 95th percentiles of dimensions for each stature percentile group. It is common in ergonomics to use percentiles, for instance the 5th, 50th and 95th. However, this is not suitable for multidimensional issues [23-24]. Strictly speaking, the percentiles presented in this paper are more suitable for one-dimensional design problems, such as defining subway handrails, and not for multidimensional issues such as designing a vehicle cockpit [23]. For those kinds of ergonomic design problems, other statistical approaches should be taken into consideration, such as anthropometric data integrated into digital human modeling software where multidimensional design tools are available [24].
Table 4. Anthropometric mean values of the 5th, 50th and 95th stature percentile groups for both genders (age ≤ 12)

Anthropometric dimension(a)            4-6 years                7-10 years               11-12 years
                                  5%ile  50%ile  95%ile    5%ile  50%ile  95%ile    5%ile  50%ile  95%ile
1. stature                         996    1107    1222     1172    1302    1450     1345    1468    1601
2. chest circumference             535     580     637      592     649     772      684     736     815
3. weight                         15.2    19.0    25.0     20.5    27.2    41.0     30.4    38.2    50.4
4. hip breadth                     199     213     238      220     246     288      256     281     312
5. hip circumference               552     592     659      607     682     800      714     778     865
6. waist circumference             486     512     562      508     566     677      591     632     693
7. abdominal circumference         499     532     584      529     594     717      614     663     732
8. shoulder breadth                228     246     266      258     285     317      299     319     345
9. neck circumference              247     254     266      252     268     297      276     289     311
10. cervical height                801     899     998      956    1075    1213     1125    1234    1357
11. upperarm length                175     199     221      210     238     265      250     272     298
12. forearm length                 129     145     164      158     179     199      186     204     224
13. iliocristale height, sitting   129     146     161      155     173     192      181     198     211
14. inner ankle height              44      49      54       51      57      64       59      64      70
15. body depth                     171     184     199      179     199     231      200     215     232
16. eye height                     882     993    1098     1054    1183    1327     1230    1350    1479
17. maximum body breadth           289     303     329      305     336     391      351     376     417
18. shoulder height                756     853     951      910    1027    1155     1074    1179    1294
19. iliospinale anterius height    502     569     640      614     702     796      735     809     887
(a) Weight is given in kg; all other values are given in mm.
Table 5. Anthropometric mean values of the 5th, 50th and 95th stature percentile groups for males (age > 12)

Anthropometric dimension(a)           13-15 years              16-17 years
                                  5%ile  50%ile  95%ile    5%ile  50%ile  95%ile
1. stature                        1457    1623    1749     1596    1696    1793
2. chest circumference             722     820     875      848     868     881
3. weight                         35.1    51.8    62.1     53.1    58.8    64.9
4. hip breadth                     269     312     334      319     328     342
5. hip circumference               743     860     914      882     908     935
6. waist circumference             602     687     715      697     716     733
7. abdominal circumference         630     721     753      729     750     773
8. shoulder breadth                320     360     380      376     383     394
9. neck circumference              281     324     339      335     347     349
10. cervical height               1225    1378    1488     1357    1442    1532
11. upperarm length                273     304     327      300     321     336
12. forearm length                 203     226     249      220     235     252
13. iliocristale height, sitting   194     215     229      217     221     233
14. inner ankle height              64      72      77       70      75      79
15. body depth                     215     233     241      235     242     247
16. eye height                    1341    1504    1624     1482    1576    1665
17. maximum body breadth           367     422     450      432     446     456
18. shoulder height               1168    1311    1423     1287    1374    1458
19. iliospinale anterius height    810     894     965      868     925     986
(a) Weight is given in kg; all other values are given in mm.
Table 6. Anthropometric mean values of the 5th, 50th and 95th stature percentile groups for females (age > 12)

Anthropometric dimension(a)           13-15 years              16-17 years
                                  5%ile  50%ile  95%ile    5%ile  50%ile  95%ile
1. stature                        1465    1566    1663     1490    1580    1677
2. chest circumference             786     821     843      839     846     855
3. weight                         41.2    48.9    56.1     46.3    51.0    57.3
4. hip breadth                     305     323     342      322     334     346
5. hip circumference               835     884     928      876     905     936
6. waist circumference             655     681     709      680     703     702
7. abdominal circumference         690     725     760      717     750     758
8. shoulder breadth                331     345     360      339     352     359
9. neck circumference              292     304     316      301     308     313
10. cervical height               1228    1323    1410     1258    1338    1429
11. upperarm length                272     292     313      276     294     313
12. forearm length                 201     215     228      203     216     233
13. iliocristale height, sitting   202     217     227      205     217     225
14. inner ankle height              64      69      73       66      70      74
15. body depth                     221     234     239      231     239     241
16. eye height                    1346    1446    1541     1377    1462    1557
17. maximum body breadth           389     412     429      409     421     429
18. shoulder height               1177    1262    1351     1202    1276    1364
19. iliospinale anterius height    795     856     919      806     855     918
(a) Weight is given in kg; all other values are given in mm.
Anthropometric data have applications in the establishment of sizing systems, the development of digital manikins, aviation, virtual maintenance and assembly, clinical diagnostics, cosmetic surgery, forensics, the arts and other fields. As the first nationwide anthropometric survey of minors in China, this work has attracted great interest from various fields, in both academia and industry. At present, five national standards for minors in China, i.e., the body dimensions, hand sizing, foot sizing, garment sizing and head sizing standards, are to be revised based on the data and statistical results of this work.
5 Conclusions
Anthropometric data of minors are very important in product design, but such data have long been lacking or obsolete in mainland China. Minors in China are a tremendous group of potential customers with high consumption power for manufacturers both domestic and foreign, and their body dimensions have changed greatly compared with their counterparts two decades ago. The anthropometric data for minors in China presented here will therefore bring great benefits to industry and academia.
Acknowledgments The study is supported by the National Natural Science Foundation of China (No.51005016) and General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China.
References
1. Prado-Leon, L.R., Avila-Chaurand, R., Gonzalez-Munoz, E.L.: Anthropometric study of Mexican primary school children. Appl. Ergon. 32, 339–345 (2001)
2. Wang, M.J., Wang, E.M.Y., Lin, Y.C.: The anthropometric database for children and young adults in Taiwan. Appl. Ergon. 33, 583–585 (2002)
3. Kayis, B., Ozok, A.I.: The anthropometry of Turkish army men. Appl. Ergon. 22(1), 49–54 (1991)
4. Georgia, P., Kosmas, C., Anthoula, P., Konstantinos Mandroukas, M.: Classroom furniture dimensions and anthropometric measures in primary school. Appl. Ergon. 35, 121–128 (2004)
5. Ran, L.H., Zhang, X., Chao, C.Z., Liu, T.J., Dong, T.T.: Anthropometric Measurement of the Hands of Chinese Children. In: Duffy, V.G. (ed.) ICDHM 2009. LNCS, vol. 5620, pp. 46–54. Springer, Heidelberg (2009)
6. Meunier, P., Yin, S.: Performance of a 2D image-based anthropometric measurement and clothing sizing system. Appl. Ergon. 31, 445–451 (2000)
7. Ball, R., Shu, C., Xi, P.C., Rioux, M., Luximon, Y., Molenbroek, J.: A comparison between Chinese and Caucasian head shapes. Appl. Ergon. 41(6), 832–839 (2010)
8. Goonetilleke, R.S., Luximon, A.: Designing for Comfort: A Footwear Application. In: Das, B., Karwowski, W., Mondelo, P., Mattila, M. (eds.) Proceedings of the Computer-Aided Ergonomics and Safety Conference, Maui, Hawaii, July 28-August 2 (2001)
9. Robinette, K., Blackwell, S., Daanen, H., Fleming, S., Boehmer, M., Brill, T., Hoeferlin, D., Burnsides, D.: Civilian American and European Surface Anthropometry Resource (CAESAR). Final Report. In: Summary, AFRL-HE-WP-TR-2002-0169, vol. I. Air Force Research Laboratory, Human Effectiveness Directorate, Bioscience and Protection Division, 2800 Q Street, Wright-Patterson AFB, OH 45433-7947 (2002)
10. Ball, R.M., Molenbroek, J.F.M.: Measuring Chinese heads and faces. In: Proceedings of the Ninth International Congress of Physiological Anthropology, Human Diversity: Design for Life, Delft, Netherlands, pp. 150–155 (2008)
11. Witana, C.P., Goonetilleke, R.S., Xiong, S., Au, E.Y.L.: Effects of surface characteristics on the plantar shape of feet and subjects' perceived sensations. Appl. Ergon. 40(2), 267–279 (2009)
12. Chang, D.F., Lu, H.J., Mi, W.J.: Bulk Terminal Stockpile Automatic Modeling Based on 3D Scanning Technology. In: International Conference on Future Information Technology and Management Engineering, October 9-10 (2010)
13. Botha, W.E., Bridger, R.S.: Anthropometric variability, equipment usability and musculoskeletal pain in a group of nurses in the Western Cape. Appl. Ergon. 29, 481–490 (1998)
14. Wang, E.M.Y., Wang, M.J., Yeh, W.Y., Shih, Y.C., Lin, Y.C.: Development of anthropometric work environment for Taiwan workers. Int. J. Ind. Ergonomics 23, 3–8 (1999)
15. Robertson, S.A., Minter, A.: A study of some anthropometric characteristics of motorcycle riders. Appl. Ergon. 27(4), 223–229 (1996)
16. China Standard GB/T 10000-1988: Human dimensions of Chinese adults (1988) (in Chinese)
17. Research Institute of Human Engineering for Quality Life: Japanese body size data. Japan (1994) (in Japanese)
18. Lee, Y.S.: Applied Korean anthropometric database for product design: clothing design. In: Agency for Technology and Standards, MOCIE, Korea (2000)
19. International Standard, ISO 15535: General requirements for establishing anthropometric databases. International Standard Organization (2003)
20. National Bureau of Statistics: Chinese Demographic Yearbook. Statistics Press, China (2003) (in Chinese)
21. China National Institute of Standardization, Technical Report: Human dimensions of Chinese minors (2009) (in Chinese)
22. International Standard, ISO 7250: Basic Human Body Measurements for Technological Design. International Standard Organization (2008)
23. Robinette, K.M., Hudson, J.A.: Anthropometry. In: Salvendy, G. (ed.) Handbook of Human Factors and Ergonomics, 3rd edn., pp. 322–339. John Wiley & Sons, New York (2006)
24. Hanson, L., Sperling, L., Gard, G., Ipsen, S., Vergara, C.O.: Swedish anthropometrics for product and workplace design. Appl. Ergon. 40, 797–806 (2009)
Development of Sizing Systems for Chinese Minors

Xin Zhang(1), Yanyu Wang(2), Linghua Ran(1), Ailan Feng(2), Ketai He(2), Taijie Liu(1), and Jianwei Niu(2)

(1) China National Institute of Standardization, Beijing, 100088, China
(2) Department of Logistics Engineering, University of Science and Technology Beijing, Beijing, 100083, China
[email protected]
Abstract. The purpose of this study is to develop sizing systems for Chinese minors. This work is based on the most up-to-date and complete nationwide anthropometric survey in mainland China. About 20,000 minors (9,666 males and 9,699 females) were recruited from six geographical areas in China, and body weight plus a total of 134 static dimensions were collected. Through factor analysis, stature, the bust waist girth difference (BWGD) and bust girth were identified as the three most critical parameters out of thirteen anthropometric dimensions frequently used in sizing systems. Three sizing systems were established systematically by stature group and gender, i.e., both genders shorter than 130 cm, male minors taller than 130 cm and female minors taller than 130 cm. The accommodation rates of the developed sizing systems are 47.23%, 60.19% and 81.87%, respectively, and the number of sizes in each system is 25, 25 and 19, respectively. The results provide valuable references for garment and manikin design related to Chinese minors.

Keywords: Anthropometric data, Sizing system, Chinese minors, Factor analysis.
1 Introduction
A great deal of emphasis has been placed on the fitting design of wearing products. Among the various approaches to fitting design, developing a sizing system is regarded as the most effective and cost-saving. A sizing system classifies a specific population into homogeneous subgroups [1], choosing the subgroups in such a way that a limited number of sizes provides products that fit most individuals in the population [2]. One of the important inputs to a sizing system is the anthropometric characteristics of the intended users [3]. In spite of the tremendous amount of anthropometric data collected recently, such data have not been fully utilized, owing to a failure to consider the convenience of users [4]. This problem is especially serious for minors. According to statistics of the Population Reference Bureau [5], about 29% of the world population are children (under the age of 15 years). Being the largest potential market, demand related to Chinese minors is expanding day by day; however, little anthropometric research has served this group. Thus, as a fundamental research direction, there is an urgent need to establish
sizing systems that represent the specific minors group using the most recent anthropometric data. A national anthropometric survey of minors has recently been completed in mainland China. This survey, conducted from 2005 to 2008, is regarded as the most comprehensive, influential and technologically advanced anthropometric project in China, and the anthropometric data from this survey are used here to build the sizing systems.
Sizing systems not only benefit the apparel industry [6], but also serve as a reference for digital manikin design. With the rapid progress of computer technology, manikins incorporated into digital human modeling packages provide a flexible solution for many applications and significantly reduce time and cost. Digital manikins have long been used to facilitate efficient fitting considerations in the design process [7, 8], such as in the automotive [9], aircraft assembly [10] and health care [8] industries, and they have been integrated into many digital human modeling tools, such as Jack, Delmia, RAMSIS, SAMMIE, SAFEWORK, ANYBODY and CREW CHIEF. However, these tools have some inherent shortcomings. First, in Jack for example, no Chinese anthropometric data were incorporated, because Jack was originally developed more than two decades ago in the USA, when no large-scale anthropometric survey of Chinese minors was available. Second, when designing digital manikins of different sizes within such tools, either the percentile, a widely misused concept, or the "average man" concept is usually employed. There is no such thing as "the average man", because no one is average on all measurements [11], and Zehner et al. [12] further demonstrated that the range of people accommodated with percentiles is much smaller than desired. An explanation for the widespread use of percentiles is that they are easy to list in tabular form [11]. Fortunately, new sizing systems based on the most recent anthropometric data may improve the effectiveness of digital manikin design.
The rest of this paper is organized as follows. Section 2 introduces the proposed method. Results are presented in Section 3, and discussions in Section 4. Finally, Section 5 concludes this paper.
2 Method
2.1 Subjects
About 20,000 subjects (9,666 males and 9,699 females) participated in this survey, with ages ranging from four to seventeen years. The children were classified into five age groups: preschool (ages 4-6), junior primary school (ages 7-10), senior primary school (ages 11-12), junior high school (ages 13-15), and senior high school (ages 16-17). The criterion for age stratification is based on ISO 15535:2003, General requirements for establishing anthropometric databases [13]. The population database includes 2,117 pre-school students, 4,263 junior primary school students, 3,930 senior primary school students, 5,527 junior high school students, and 3,527 senior high school students. The subjects were recruited from six geographical areas in China: northeast-north China, central and western China, the lower reach of the Yangtze River, the middle reach of the Yangtze River, the south-east and the south-west. The sample size in each area was determined based on the distribution of the children's population reported by the China National Bureau of Statistics
[14]. One or two cities in each area were selected, and kindergartens, elementary schools and high schools were chosen from these cities. The distribution of the sample size among the age groups can be seen in Table 1. The parents or teachers of the minors filled in each subject's name, sex, birth date and place, nationality, school name and grade, etc.

Table 1. Distribution of sample size among age groups

Age     Male     Female   Total
4-6     1,043    1,074    2,117
7-10    2,113    2,150    4,263
11-12   1,988    1,942    3,930
13-15   2,795    2,732    5,527
16-17   1,727    1,800    3,527
Total   9,666    9,699    19,365
2.2 Measuring Apparatus, Procedure and Dimensions
There were three postures during body scanning in this survey, i.e., standing posture Ⅰ, standing posture Ⅱ, and sitting posture. The following instruments were used for the measurements: weighing scales, a Human Solutions Vitus 3D full-body measuring system, a Human Solutions 3D head measuring system, a Human Solutions 3D foot measuring system, Martin-type anthropometers, and self-produced adjustable chairs. 3D body scanning was the primary measuring technique in this survey, while weight, stature and middle fingertip height over head were measured manually. During the measuring process, the male subjects wore shorts, while the female subjects wore underwear.
Body weight plus a total of 134 static dimensions were selected for measurement. Sixty-nine of the static dimensions were related to the standing posture and twenty-three to the sitting posture. The other forty-two were measurements of specific body segments, i.e., 15 head dimensions, 23 hand dimensions and 4 foot dimensions. The dimensions measured were all named according to International Standard ISO 7250: Basic Human Body Measurements for Technological Design [15].
2.3 Control Dimensions
Key dimensions help to effectively identify the control parameters for classifying figure types [1]. Thirteen dimensions were selected out of the 134 anthropometric dimensions as the measurements most frequently used in sizing systems. Following general practice in international standards, further effort was made to extract a few key dimensions as the control parameters for figure type classification. In this study, BWGD, stature and bust girth were adopted as the control dimensions.
Conventionally, it has been regarded as unnecessary to classify minors' figure types in the development of sizing systems. Even in the most recent national standard
garment sizing systems for minors in China, issued in 2009 [16], figure type classification is still not adopted. However, based on the statistical results of the latest survey, it is necessary to take BWGD into consideration for minors' figure type classification. An ANOVA with a significance level of 0.01 was used to test the gender effect on BWGD within each stature group. As shown in Table 2, for minors shorter than 1300 mm no gender effect on BWGD was found in any stature group, but for minors taller than 1300 mm there is a significant difference between genders in each stature group. Consequently, the samples were divided into three subgroups, i.e., both genders shorter than 130 cm, male minors taller than 130 cm and female minors taller than 130 cm; the sample size of each subgroup is 3,910, 7,206 and 6,908, respectively. After careful removal of outliers caused by equipment errors during measurement, a total of 18,024 subjects were retained for this study.

Table 2. ANOVA of bust waist girth difference (BWGD) between genders for each stature group
Stature (cm)   F value   Sig.
100-110        0.296     0.587
110-120        0.140     0.708
120-130        4.778     0.029
130-140        7.709     0.006(a)
140-150        19.017    0.000(a)
150-160        7.606     0.006(a)
160-170        6.970     0.008(a)
(a) Indicates the significance level equals 0.01.
The second and third control dimensions are stature and bust girth, identified through factor analysis. Factor analysis is a multivariate statistical method that examines the inter-relationships among a large number of variables and extracts the underlying factors [1]. The procedure for developing a sizing system for minors of both genders shorter than 1300 mm is presented here as an example.
The Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test should be examined first to determine whether the data are fit for factor analysis. It is often suggested that the KMO measure should be at least 0.5 and that Bartlett's test should be significant. Here the KMO measure is 0.937, well above the minimal level, and Bartlett's test of sphericity is significant (p = 0.000); consequently, the samples are suitable for factor analysis.
The dimensions with factor loadings greater than 0.8 on factor one were stature-related variables, including stature, thigh length, cervical height, waist height, cervical height sitting, and shoulder height. Thus, factor one was named the stature factor; similarly, factor two was named the girth factor. Only the first two factors were considered, because they explain about 87.212% of the total variance of the original variables. The factor loadings of the body dimensions on the first two factors are shown in Table 3. The loading of stature on factor one is 0.953, the highest among all selected anthropometric dimensions. This conforms with the common observation that stature is the most distinct morphologic characteristic and the most widely used dimension in sizing systems, and with the recommendation of ISO/TC 133 that stature be taken as the key variable for growing children. The loadings of waist girth, abdominal girth, neck girth and bust girth on factor two are 0.947, 0.925, 0.841 and 0.824, respectively, waist girth having the highest. We use BWGD instead of waist girth, following the national garment sizing systems for adults [17, 18].

Table 3. Factor loadings of body dimensions
Dimension                   Factor 1    Factor 2
Stature                     0.953(a)    0.256
Bust girth                  0.422       0.824(a)
Thigh length                0.891(a)    0.182
Hip girth                   0.570       0.754
Waist girth                 0.193       0.947(a)
Abdominal girth             0.248       0.925(a)
Shoulder breadth            0.736       0.402
Neck girth                  0.104       0.841(a)
Cervical height             0.954(a)    0.253
Full arm length             0.678       0.602
Waist height                0.951(a)    0.218
Cervical height, sitting    0.849(a)    0.277
Shoulder height             0.949(a)    0.259
% of total variance         70.457      87.212
(a) Factor loading > 0.8
Though the loadings of abdominal girth (0.925) and neck girth (0.841) on factor two are higher than that of bust girth (0.824), they are replaced by bust girth. This can be explained as follows. First, during the measuring procedure the subject was asked to breathe quietly so that the abdominal girth would not fluctuate, but this is too difficult for minors, especially pre-school and elementary school students, so the abdominal girth measurement is not stable enough for further analysis. Second, measuring the neck girth is more difficult than measuring the bust girth. Third, based on correlation analysis, the correlation coefficients of bust girth with abdominal girth and with neck girth are 0.836 and 0.690 (significant at the 0.05 level), respectively, meaning that bust girth can represent abdominal girth and neck girth with high correlation. Finally, bust girth is much more widely used than abdominal girth and neck girth in practice. Hence, it is reasonable to select BWGD, stature and bust girth as the three most influential dimensions and use them as the control dimensions.
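The selection procedure above can be sketched with open-source tooling. The authors used SPSS; in the snippet below the factor_analyzer package plays the same role, assuming the thirteen candidate dimensions are columns of a DataFrame, and the varimax rotation is an assumption on our part rather than a documented choice of the authors.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity, calculate_kmo)

def two_factor_loadings(df: pd.DataFrame) -> pd.DataFrame:
    chi2, p = calculate_bartlett_sphericity(df)   # should be significant
    _, kmo_total = calculate_kmo(df)              # should exceed 0.5
    print(f"Bartlett p = {p:.3g}, KMO = {kmo_total:.3f}")
    fa = FactorAnalyzer(n_factors=2, rotation="varimax")  # rotation assumed
    fa.fit(df)
    # fa.get_factor_variance()[2] holds the cumulative variance explained
    return pd.DataFrame(fa.loadings_, index=df.columns,
                        columns=["stature factor", "girth factor"])
```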
3 Results
In accordance with GB/T 1335.3-2009 [16], the size interval for stature was 10 cm, for BWGD 5 cm and for bust girth 4 cm. Tables 4, 5 and 6 present the accommodation rates of each sizing system; only sizes whose accommodation rate exceeded 1.2% were retained in each set. The accommodation rates of the developed sizing systems are 47.23%, 60.19% and 81.87%, respectively, and the number of sizes in each subgroup is 25, 25 and 19, respectively. For example, for minors shorter than 130 cm, figure types Y, A, B, C and D correspond to BWGD ranges of 19-23 cm, 14-18 cm, 9-13 cm, 4-8 cm, and -1-3 cm, respectively.

Table 4. Accommodation rates(a) (%) for minors shorter than 130 cm
Figure type   Bust girth(b)   Stature(b)
                              120-130   110-120   100-110   90-100
D             54                                  3.32
C             54                        6.73      1.43      1.2
C             58              6.34      8.57      5.7
C             62              7.06      4.27      1.23
C             66              3.07
C             70              1.43
B             58              3.84      4.94      1.71
B             62              9.8       4.58
B             66              5.09
B             70              1.56
(a) The accommodation rate is computed as the ratio of the number of samples covered by the sizing system to the whole sample number in each subgroup.
(b) All dimensions are given in cm.
Table 5. Accommodation rates(a) (%) for male minors taller than 130 cm

Figure type   Bust girth(b)   Stature(b)
                              170-180   160-170   150-160   140-150   130-140
B             65                                                      1.28
B             69                                            1.62      3.23
B             73                                  1.32      3.18      2.73
B             77              1.82      2.37      1.21      1.73      1.25
B             81              1.35
B             85              1.32
A             77                                  1.30
A             81                        2.48
A             85              1.73      3.51
A             89              1.79      2.93
A             93              1.48      1.40
Y             85              1.8       1.46
Y             89              1.22      1.72
(a) The accommodation rate is computed as the ratio of the number of samples covered by the sizing system to the whole sample number in each subgroup.
(b) All dimensions are given in cm.

Table 6. Accommodation rates(a) (%) for female minors taller than 130 cm
Figure type   Bust girth(b)   Stature(b)
                              160-170   150-160   140-150   130-140
B             61                                            1.42
B             65                                  1.8       3.07
B             69                        2.72      2.07
B             73                        2.71
B             77                        3.24
B             81                        2.87
B             85                        1.75
A             69                                            1.27
A             73                                  2.43      1.72
A             77              2.72      1.65      4.95      1.49
A             81              2.11      5.6
A             85              2.37      4.11
A             89              1.51      2.17
Y             81                        1.59
Y             85                        1.58
Y             89                        1.27
(a) The accommodation rate is computed as the ratio of the number of samples covered by the sizing system to the whole sample number in each subgroup.
(b) All dimensions are given in cm.
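A hedged sketch of the accommodation-rate bookkeeping behind Tables 4-6 is given below: each subject is binned by stature (10 cm interval), BWGD (5 cm) and bust girth (4 cm), only sizes covering more than 1.2% of the subgroup are kept, and the accommodation rate is the kept share. The column names and the simple bin anchoring are illustrative assumptions; the standard's actual interval boundaries differ.

```python
import pandas as pd

def accommodation_rate(df: pd.DataFrame, threshold: float = 1.2):
    """df columns assumed: 'stature', 'bwgd', 'bust', all in cm."""
    size_id = pd.DataFrame({
        "stature": (df["stature"] // 10) * 10,   # 10 cm interval
        "bwgd":    (df["bwgd"]    // 5)  * 5,    # 5 cm interval
        "bust":    (df["bust"]    // 4)  * 4,    # 4 cm interval
    })
    share = size_id.value_counts(normalize=True) * 100  # % per candidate size
    kept = share[share > threshold]                     # sizes above 1.2%
    return kept.sum(), len(kept)   # accommodation rate (%), number of sizes
```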
Table 7 shows the control dimensions of the medoid sample of each size for minors shorter than 130 cm; the medoid sample has the least Euclidean distance to the mean sample of its size. Besides weight, stature, bust girth and waist girth, each size provides additional reference dimensions. The size labels used in this study follow the format of ISO 3636 and ISO 3637 [19, 20]. For example, the size label 125/58C indicates that the size is suitable for figure type C, with BWGD between 4 and 8 cm, stature between 120.0 and 129.9 cm and bust girth between 56.0 and 59.9 cm.

Table 7. Size chart of both genders shorter than 130 cm
Size      Weight  Stature  Cervical  Full arm  Waist   Bust   Waist  Abdominal  Neck
                           height    length    height  girth  girth  girth      girth
125/58C   22.2    125.2    102.6     48.4      75.6    58.1   52.3   54.8       24.7
125/62C   25.2    125.9    102.8     49.0      73.9    61.8   54.6   56.5       26.8
125/66C   28.1    125.1    104.8     52.1      75.8    66.5   60.7   61.4       27.9
125/70C   30.2    127.7    105.5     51.4      75.5    69.3   64.3   64.5       28.7
125/58B   22.9    124.8    102.7     46.7      74.6    59.3   49.2   52.6       23.5
125/62B   23.7    124.8    102.9     47.8      74.3    63.6   51.7   53.8       25.7
125/66B   25.8    125.9    103.5     49.8      75.8    66.2   54.9   57.4       26.7
125/70B   28.4    127.3    103.4     51.4      77.2    69.5   59.6   61.0       25.6
115/54C   18.4    114.7    93.0      43.7      66.8    55.9   48.5   50.7       24.7
115/58C   19.0    114.6    94.0      46.6      67.4    56.7   51.1   52.7       26.3
115/62C   22.2    116.2    93.8      46.5      67.3    61.2   52.8   56.7       26.5
115/58B   18.9    115.0    94.5      45.5      67.4    58.6   49.1   50.5       24.1
115/62B   21.8    115.8    94.9      45.7      68.1    61.3   52.1   53.7       26.3
105/54D   17.3    104.6    84.6      42.7      60.4    55.6   52.5   52.9       23.9
105/54C   16.7    105.3    84.8      44.3      61.3    53.5   47.7   50.4       25.2
105/58C   18.4    106.7    86.3      44.1      61.6    58.4   52.7   52.9       25.7
105/62C   20.7    105.7    86.6      46.1      62.7    60.8   55.2   58.3       27.5
105/58B   18.6    106.8    88.3      44.3      61.1    57.1   48.5   50.4       23.5
95/54C    15.6    97.8     78.9      40.4      56.0    54.2   48.0   48.1       26.3
(a) Weight is given in kg, and all dimensions are given in cm.
4 Discussions
An effective sizing system should satisfy three main criteria: fewer sizes, a higher accommodation rate and better fit. Regarding fit, the aggregate loss is usually recommended as the evaluation index [2]; a good sizing system should have a low aggregate loss [1]. Such an evaluation should be made to verify whether the sizing systems developed in this study are reasonable in practice.
In this paper, linear measurements were used as key dimensions. However, this may lead to deficiencies in fitting comfort, since much of the three-dimensional information on the surface of the subjects is ignored. In recent years, new sizing methods that operate directly on three-dimensional data have been proposed to find appropriate classifications of body shape.
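As one example of such an evaluation, the aggregate loss can be computed as the mean Euclidean distance between each subject's control dimensions and the representative values of the size assigned to him or her, following the general definition used in the sizing literature cited above. The sketch below assumes subjects and size centers are given as arrays of (stature, BWGD, bust girth) in cm; it illustrates the criterion and is not the evaluation actually performed in this study.

```python
import numpy as np

def aggregate_loss(subjects: np.ndarray, size_centers: np.ndarray) -> float:
    """subjects: (n, 3); size_centers: (k, 3). Lower means better fit."""
    # Distance from every subject to every size; assign the nearest size.
    d = np.linalg.norm(subjects[:, None, :] - size_centers[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```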
5 Conclusions
Minors in China are a tremendous group of potential customers with high consumption power for manufacturers both domestic and foreign. This study proposed sizing systems for minors based on the most up-to-date and complete nationwide survey in mainland China. This work can provide useful information for the fitting design of products, and of study and creation environments, for minors.

Acknowledgements. The study is supported by the National Natural Science Foundation of China (No. 51005016) and the General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China.
References
1. Chung, M.J., Lin, H.F., Wang, M.J.: The development of sizing systems for Taiwanese elementary- and high-school students. International Journal of Industrial Ergonomics 37, 707–716 (2007)
2. Ashdown, S.P.: An investigation of the structure of sizing systems: A comparison of three multidimensional optimized sizing systems generated from anthropometric data with the ASTM standard D5585-94. International Journal of Clothing Science and Technology 10(5), 324–341 (1998)
3. Kim, J.H., Whang, M.C.: Development of a set of Korean manikins. Applied Ergonomics 28(5/6), 407–410 (1997)
4. Lee, S.L., Jeong, S.Y., Lee, J.Y., Chung, H.W.: A Study on the Body Type Classification of Korean Men in their 20's for Pattern Making. In: Proceedings of 17th World Congress on Ergonomics (2009)
5. Population Reference Bureau: World Population Data Sheet of the Population Reference Bureau. Washington, DC, USA (2006)
6. Zheng, R., Yu, W., Fan, J.T.: Development of a new Chinese bra sizing system based on breast anthropometric measurements. International Journal of Industrial Ergonomics 37, 697–705 (2007)
7. Chaffin, D.B.: Improving digital human modeling for proactive ergonomics in design. Ergonomics 48(5), 478–491 (2005)
8. Högberg, D., Hanson, L., Lundström, D., Jönsson, M., Lämkull, D.: Representing the elderly in digital human modeling. In: Proceedings of the 40th Annual Nordic Ergonomic Society Conference, Reykjavik, Iceland, August 11-1 (2008)
9. Niu, J.W., Zhang, X.W., Zhang, X., Ran, L.H.: Investigation of Ergonomics in Automotive Assembly Line Using Jack. In: IEEE International Conference on Industrial Engineering and Engineering Management (IEEM 2010), Macau, China, December 7-10 (2010)
10. Li, J.X., Zheng, G.L.: Research on application of the DELMIA system in aircraft assembly simulation. Aeronautical Manufacturing Technology 2(11), 90–93 (2008) (in Chinese)
11. Whitestone, J.J., Robinette, K.M.: Fitting to maximize performance of HMD systems. In: Melzer, J.E., Moffitt, K.W. (eds.) Head-Mounted Displays: Designing for the User, pp. 175–206. McGraw-Hill, New York (1997)
12. Zehner, G., Meindl, R., Hudson, J.: A multivariate anthropometric method for crew station design: abridged. AL-TR-1992-0164 (1992)
13. ISO 15535: General requirements for establishing anthropometric databases. International Standard Organization (2003)
14. National Bureau of Statistics: Chinese Demographic Yearbook. Statistics Press, China (2003) (in Chinese)
15. International Standard, ISO 7250: Basic Human Body Measurements for Technological Design. International Standard Organization (2008)
16. GB/T 1335.3-2009: Standard sizing systems for garments-Children (2009) (in Chinese)
17. GB/T 1335.1-2008: Standard sizing systems for garments-Men (2008) (in Chinese)
18. GB/T 1335.2-2008: Standard sizing systems for garments-Women (2008) (in Chinese)
19. ISO 3636: Size Designation of Clothes-Men's and Boys' Outerwear Garments. International Organization for Standardization, Geneva (1977)
20. ISO 3637: Size Designation of Clothes-Women's and Girls' Outerwear Garments. International Organization for Standardization, Geneva (1977)
Motion Capture Experiments for Validating Optimization-Based Human Models Aimee Cloutier, Robyn Boothby, and Jingzhou (James) Yang Department of Mechanical Engineering Texas Tech University, Lubbock, TX 79409 Tel.: 806-742-3563; Fax: 806-742-3540
[email protected]
Abstract. Optimization-based research has gained significant momentum among the various digital human models. Any task can be formulated as an optimization problem, and such a model can predict not only postures but also motions. However, these optimization-based digital human models need validation by experiment, and a motion capture system is one way to validate predicted results. This paper summarizes the progress of the motion capture experiment efforts at the Human-Centric Design Research (HCDR) Laboratory at Texas Tech University. An eight-camera motion capture system has been set up in our research lab. Marker placement protocols have been developed in which markers are placed on the subjects to highlight bony landmarks and identify segments between joints, in line with previously identified guidelines and suggestions in the literature. A posture reconstruction algorithm has been developed to map joint angles from motion capture experiments to digital human models. Various studies have been conducted in the lab involving motion capture experiments for jumping, standing and seated reach, and pregnant women's walking, sit-to-stand, seated reach, and reach with external loads. The results show that the posture reconstruction algorithm accurately transfers motion capture experiment data to joint angles, and that the marker placement protocol reliably captures all joints. The main task of the motion capture system is to validate all optimization-based digital human models developed by other research members at the HCDR Lab.
Keywords: Digital human models, motion capture, validation.
1 Introduction
Digital human modeling has gained significant momentum due to advances in computer hardware and the need for virtual prototyping. Different digital human models have been developed, such as Jack, Safework, Ramsis, Anybody, and OpenSim. Some of them are based on experimental data and statistical models, and others are based on optimization. However, all of them need validation, and motion capture is one way to acquire information about subject posture and motion that can be used to validate predictive models.
The objective of this paper is to report the ongoing validation effort and progress at the HCDR Lab at Texas Tech University. The ongoing work includes standing reach, seated reach, and jumping validation for regular subjects, and standing and seated stability, walking, sit-to-stand, and reach with external loads for pregnant women.
Significant experimental work using marker-based motion capture has been reported in the recent literature. Motion capture has become a popular tool to estimate and validate predictions of human movement, many of which are developed using optimization techniques. It has especially received attention in the field of optimization-based posture prediction, in which it is used to validate the results of digital models for upper body [1], lower body [2], whole body [3-4], and neck posture [5]. It has also been used for the validation of optimization-based predictions of walking [6], reaching [7], and pointing [8] motions. Marker-based motion capture has helped researchers determine and validate optimization approaches for finding the centers of rotation in the finger [9] and the joint motion of the finger and hand [10-11]. For clinical purposes, it has been used to both develop and verify methods for performing gait analysis [12-13], and in biomechanics it has been used to create optimization functions which estimate human muscle forces [14] and bone movement [15] and perform movement tracking [16] based on joint angles.
This paper is organized as follows: Section 2 introduces the digital human model developed at the HCDR Lab. Section 3 reviews optimization-based human models. Section 4 briefly introduces the motion capture system. Section 5 details the marker placement protocol. Section 6 presents experiments and validation. Section 7 gives concluding remarks.
2 Digital Human Model
The human body can be modeled as a kinematic system: a series of links connected by revolute joints that represent musculoskeletal joints such as the spine, shoulder, arm, and wrist. The rotation of each joint in the human body is described by a generalized coordinate q_i. The 55-DOF whole-body human model includes six global DOFs from a global origin as well as the lower/upper limbs, the torso, and the head (Fig. 1). The pelvis is chosen as the base link for the global DOFs under a rigid-body assumption. The configuration of the human open-loop kinematic chains is described by the Denavit-Hartenberg (DH) notation [17]. The homogeneous transform includes the position and orientation of one coordinate frame relative to another, where each frame is attached to a rigid body in space. The n × 1 vector of joint variables q = [q_1, ..., q_n]^T ∈ R^n uniquely determines the configuration of a system with n DOFs; its components are called the generalized coordinates. The global motion (translations and rotations) of the body reference point located at the center of the human pelvis can also be described using the DH method, with a free link-joint structure representing the global DOFs. There are three local frames attached to the pelvis as the origins of the trunk, right leg, and left leg. The orientation of the local frame of the trunk is regarded as the orientation of the pelvis, where only the rotations from the previous frame are considered.
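For illustration, the classic DH convention referred to above chains one homogeneous transform per joint. A minimal numpy sketch (not the authors' implementation) is given below.

```python
import numpy as np

def dh_transform(theta: float, d: float, a: float, alpha: float) -> np.ndarray:
    """4x4 homogeneous transform from frame i-1 to frame i (classic DH)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_rows, q):
    """End-effector position given joint angles q (one per DH row)."""
    T = np.eye(4)
    for (d, a, alpha), theta in zip(dh_rows, q):
        T = T @ dh_transform(theta, d, a, alpha)  # chain the transforms
    return T[:3, 3]
```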
Fig. 1. 55-DOF human model
3 Optimization-Based Human Models
The human body is a highly redundant system in terms of degrees of freedom and muscle activation: for any given task there exist multiple postures or motions, and the difficulty is to determine which one among the multiple solutions is realistic. On the other hand, we believe that certain functions govern how humans move as they do. Therefore, human posture or motion prediction can be formulated as an optimization problem. The design variables are joint angles or joint angle profiles. The objective function is a human performance measure, such as joint displacement, delta potential energy, discomfort, or dynamic effort; these performance measures govern how the avatar moves. Constraints entail joint angle limits, joint torque limits, and others. This approach ensures autonomous movement regardless of the scenario, and it can be implemented in real time.

    Find: q ∈ R^DOF
    to minimize: human performance measures            (1)
    subject to: constraints

All optimization problems are solved using the software SNOPT [18].
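A hedged sketch of formulation (1) follows. The authors solve their problems with SNOPT; here SciPy's SLSQP stands in, and the joint displacement objective and single end-effector constraint are illustrative choices rather than the lab's actual performance measures.

```python
import numpy as np
from scipy.optimize import minimize

def predict_posture(q_neutral, q_lo, q_hi, fk, target, w=None):
    """fk(q) -> 3-D end-effector position; returns predicted joint angles."""
    w = np.ones_like(q_neutral) if w is None else w

    def joint_displacement(q):            # illustrative performance measure
        return np.sum(w * (q - q_neutral) ** 2)

    cons = [{"type": "eq", "fun": lambda q: fk(q) - target}]  # touch target
    res = minimize(joint_displacement, x0=q_neutral, method="SLSQP",
                   bounds=list(zip(q_lo, q_hi)),  # joint angle limits
                   constraints=cons)
    return res.x
```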
4 Motion Capture System
Several technologies are available on the market for measuring 3D motion data, including electromagnetic sensors, optical sensors, fiber-optic-based sensors, and inertial sensors. The in-house motion capture system is an optical system that employs reflective markers and infrared cameras that capture the light reflected from the markers. Optical systems have proven both effective and efficient at collecting objective data for 3D motion analysis, as they have been shown to be accurate, repeatable, and consistent [19], and they have been employed in many biomechanical studies [20-21]. The in-house motion capture system is an eight-camera Eagle-4 system (Motion Analysis®). Each camera in the system has 4-megapixel resolution, shutter speeds
from 0-2000 μs, focal lengths ranging from 18 mm to 52 mm, and a maximum frame rate of 500 frames per second. The eight cameras are set up to enclose a 10 × 10 ft square envelope with a height of 9 ft. For position tracking, it is necessary for three cameras to see a marker so that its position can be triangulated in Cartesian space. Therefore, the cameras are set up in an array surrounding the capture volume in such a way that each marker on the body can be in range of at least three cameras.
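The triangulation step mentioned above is commonly solved as a linear least-squares problem (the direct linear transformation). The sketch below is an illustration rather than the Motion Analysis software's internal method; it assumes a calibrated 3×4 projection matrix is available for each camera that sees the marker.

```python
import numpy as np

def triangulate(projections, points_2d):
    """projections: list of 3x4 camera matrices; points_2d: list of (u, v)."""
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        rows.append(u * P[2] - P[0])   # u * (p3 . X) - p1 . X = 0
        rows.append(v * P[2] - P[1])   # v * (p3 . X) - p2 . X = 0
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)        # least-squares null vector of A
    X = vt[-1]
    return X[:3] / X[3]                # homogeneous -> Euclidean coordinates
```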
5 Marker Placement Protocol
Markers are positioned at various locations on the body to target joint locations. To ensure accurate results, they are arranged so that their placement on the body moves as little as possible; placing markers on bony landmarks and strapping down any loose areas of the subject's clothing help guarantee that marker placement remains consistent. Asymmetrical markers, which can be seen on the upper legs and chest in the front view and on the upper arms and shoulder blade in the back view of Fig. 2, help the motion capture system distinguish between the right and left sides of the body. In total, there are 57-60 markers on the regular standing and seated reach subjects (three additional markers have been added to the neck to obtain more accurate results) and 63 markers on the pregnant subjects shown in Fig. 3 (three additional markers on the top, middle, and bottom of the belly).
6 Experiments and Validation
6.1 Subject Selection
Subjects for the regular seated and standing reach experiments are chosen to represent the full range of height variation, covering four specific sizes: females in the 5th percentile of height, average females and males (50th percentile), and males in the 95th percentile [3]. Three individuals from each height category participate in the experiments, for a total of twelve subjects. Participants in the pregnancy study are between 6 and 9 months pregnant and above the age of 18.
(a) Front view
(b) Back view
Fig. 2. Regular subject marker protocol
Fig. 3. Pregnant subject marker protocol
6.2 Tasks
For regular subjects, the following tasks are tested:
Standing Reach and Stability Study. Two variations of the standing reach experiments have been conducted. The first study consists of thirty target points at different locations in the motion capture volume which vary based on three vertical heights (high, middle, and low), two horizontal distances (near or far from the participant), and the point's degree of rotation relative to the subject (5 locations based on a range of 45 degrees counterclockwise to 45 degrees clockwise). Two trials per point are done in each experiment, and one trial per point is post processed for later use. In the second study, six reach points—one near and one far target for three locations in the motion capture volume—are chosen. Target points are located high and to the left of the subject, low and to the right of the subject, and directly in front of the subject. Five trials per point are taken in each experiment, and all trials are used for validation (a total of 30 tracks per subject).
Seated Reach and Stability Study. Two types of experiments have also been conducted for the seated reach and stability study. For both versions, a backless chair is used to ensure that the markers are not concealed. The first study has a total of 18 target points which vary based on vertical height (high, middle, low), horizontal distance (near or far from the participant), and the degree of rotation relative to the subject (3 locations from 45 degrees counterclockwise to 45 degrees clockwise). Two motion trials per subject are captured, and one trial per subject is post processed. The second variation of the seated reach experiments is the same as the 6-point version of the standing reach study.
For pregnant women, the following tasks are studied:
Standing Reach Posture Prediction with External Loads. The subject holds weights of 2.5 or 5 lbs. in each hand and lifts them in front of her up to shoulder height. Three trials of this posture are conducted. This task is also performed without weight—the subject simply lifts her arms in front of her to shoulder height—to find the difference in posture between the two cases.
Seated Reach. The seated reach experiment for pregnant subjects is conducted the same way as the 6-point variation of the seated reach task for normal participants. Posture is observed for six different reach points, and each task is repeated five times.
Walking. Pregnant subjects are asked to walk naturally in a straight line from one end of the motion capture volume to the other. They repeat this three times.
Sit to Standing. Each subject begins seated in a chair with handles on the sides and is asked to stand. The subject is encouraged to use the handles if needed. This process is repeated three times for each participant.
Reach in a Car. The vehicle experiments are designed to represent posture while reaching for the seatbelt, the car door, the rearview mirror, the visor, the gearshift, the
parking brake, and the glove box. The distances for each reach point are measured from the driver’s seat of a 2007 Chevrolet Cobalt LT Coupe. Subjects are told to reach for each point as naturally as possible. Three trials per task are recorded for each experiment.
Fig. 4. Seated reach experiment
Fig. 5. Seatbelt reach task for pregnant subject
6.3 Data Collection and Post Processing
After the motion capture data have been recorded, a template is created for each individual study. Within the template, each marker is given a name, and links are created between markers to represent the body. A good template covers the full range of body movements performed by an individual and accurately represents all skin markers. Once a template has been created, its defined markers and links are used to post process the data from all the subjects in a study. During post processing, all skin markers are identified, and links are automatically created between them from the template. Because occlusion, the disappearance of skin markers, is common with this style of motion capture, it is also necessary to add virtual data for each marker that briefly disappears. Virtual data for each occluded marker are created based on the movement of an origin marker, a long-axis marker which creates a line with the origin, and a plane marker which forms a plane with the other two markers; these three markers are chosen based on the similarity of their movement patterns to the occluded marker. The virtual data must be checked to ensure that they accurately represent the motion, and the completed data are smoothed once through a 6-Hz Butterworth filter.
The creation of skin markers, links, and virtual data leads to the development of 27 virtual markers which represent joints and are used as end-effectors in the posture reconstruction algorithm. Virtual markers are created relative to either two or three skin markers—the origin and long-axis markers, and the plane marker in the case of a joint based on three surface markers. The crown and the right and left elbow, wrist, knee, ankle, and foot virtual markers can all be accurately created using a two-marker ratio; head, neck, clavicle, spine, and central hip markers are created using a three-marker ratio. Body markers, virtual markers, and links for a regular standing reach participant can be seen in Fig. 6 below.
Fig. 6. Standing reach links, surface markers, and virtual markers
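Two of the post-processing steps just described, rebuilding an occluded marker from its origin, long-axis and plane markers, and the 6-Hz Butterworth smoothing pass, can be sketched as follows. The fixed local offset and the sampling rate are assumptions for illustration, not the lab's actual values.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def local_frame(origin, long_axis, plane):
    """Right-handed 3x3 rotation built from three reference markers."""
    x = long_axis - origin; x /= np.linalg.norm(x)
    z = np.cross(x, plane - origin); z /= np.linalg.norm(z)
    y = np.cross(z, x)
    return np.column_stack([x, y, z])

def virtual_marker(origin, long_axis, plane, offset_local):
    """Reconstruct a marker from its fixed offset in the local frame."""
    return origin + local_frame(origin, long_axis, plane) @ offset_local

def smooth(traj, fs=120.0, cutoff=6.0):
    """Zero-phase 6-Hz low-pass of an (n_frames, 3) trajectory (fs assumed)."""
    b, a = butter(4, cutoff / (fs / 2))          # 4th-order Butterworth
    return filtfilt(b, a, traj, axis=0)
```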
The x, y, and z coordinates of all skin markers and virtual markers are found for the frame during which the subject's reach is maximized. The coordinates of all virtual markers and of the surface markers for the fingers, toes, head, and reach points (if applicable) are exported and run through a posture reconstruction algorithm to obtain joint angles. The resulting information is then used to validate each optimization-based human model.
6.4 Validation
The HCDR Lab has worked on the following validations:
Posture Reconstruction Methodology Study. The data collected during motion capture experiments are presented in the form of Cartesian coordinates, and in order for a motion capture experiment to effectively validate a human model, these data must be transferred from Cartesian space to joint angle space. The posture reconstruction algorithm developed in [22] is designed to obtain joint angles from Cartesian coordinates and make direct comparison between experimental and simulated postures possible. The effectiveness of the posture reconstruction algorithm is validated and demonstrated using three motion capture tasks performed by four subjects. The first subject, a 50th percentile male, performs standing reach tasks for the right and left hand. The other three subjects, a 5th percentile female, a 95th percentile male, and a different 50th percentile male, all perform the same reach task to demonstrate the algorithm's capability of handling different subjects performing the same task. Comparison with the information obtained through the motion capture experiments shows that the algorithm is highly accurate in obtaining joint angles from Cartesian coordinates: differences between the experimental and posture reconstruction data are always below 5 mm and often below 1 mm for all subjects.
Determining the Weights of the Joint Displacement Function in Posture Prediction. Experimental data from the seated reach study involving 18 reach points and 4 subjects, one from each height percentile, are used to validate posture prediction using
the optimization function for weights of joint displacement. Linear regression and deviations in joint angles are considered when comparing the experimental and predicted postures. The comparison shows that the global weights found through the optimization-based algorithm predict the postures reasonably well; the R-squared value is above 0.7 for all subjects [23]. The same experimental data are used again to validate an alternative formulation for posture prediction based on the optimization function for weights of joint displacement; with this method, the predictions improve [24].
Whole Body Standing and Seated Posture Prediction. A whole-body standing and seated posture prediction formulation has been developed based on a new support reaction force algorithm [25-26]. Validation experiments are still ongoing. Subjects include both regular participants and pregnant women. Joint angles will be compared between simulation and experimental results, and statistical analysis will be carried out to assess the accuracy of the formulation.
Standing and Seated Stability. Standing and seated tasks are common in daily life. A stability algorithm has been developed that requires no measurements [26], and its validation is still ongoing. The algorithm covers not only regular flat support surfaces (e.g., the ground or a chair) but also generic support surfaces (e.g., surfaces at an angle to the horizontal plane). Prediction accuracy is critical to standing and seated posture prediction, and stability will be validated using in-house experiments.
Motion Prediction. The HCDR Lab is working on the following motion prediction tasks: i) pregnant women walking; ii) pregnant women sit-to-stand; iii) pregnant women standing, pulling, and pushing; iv) human vertical jumping and landing; and v) slipping and falling for senior people. After the simulation algorithms are developed, subjects will be recruited to validate them; joint determinants will be used for motion prediction validation.
7 Conclusion
Motion capture systems have become helpful tools for validating the predicted results of digital human models. The continuing experimental and validation efforts of the HCDR Laboratory at Texas Tech University were described in this paper. A posture reconstruction algorithm was created to map joint angles from Cartesian coordinates. Marker placement was designed to target bony landmarks, which helps ensure accurate results. Experiments for standing and seated reach, stability, and jumping have been conducted for regular subjects, and seated reach, standing reach with external loads, in-vehicle reach, walking, and sit-to-stand tasks were studied for pregnant subjects using an eight-camera Eagle-4 system. The surface marker information from these experiments was post processed to create virtual markers representing joints. The x, y, and z coordinates of all virtual markers and selected skin markers were found for the maximum reach and transferred to joint angle space using the posture reconstruction algorithm, and the resulting information was used to validate predicted results. Validation showed that the posture reconstruction algorithm is useful and accurate, and that the optimization-based models created in the HCDR Lab produce reasonable predictions.
Acknowledgements. This work is partly supported by National Science Foundation (Award # 0926549 and 1048931).
References
1. Yang, J., Rahmatalla, S., Marler, T., Abdel-Malek, K., Harrison, C.: Validation of Predicted Posture for the Virtual Human Santos. In: Duffy, V.G. (ed.) HCII 2007 and DHM 2007. LNCS, vol. 4561, pp. 500–510. Springer, Heidelberg (2007)
2. Rahmatalla, S., Xiang, Y., Smith, R., Meusch, J., Li, J., Marler, T., Smith, B.: Validation of Lower-Body Posture Prediction for the Virtual Human Model Santos. In: SAE Digital Human Modeling Conference, Goteborg, Sweden (June 2009)
3. Yang, J., Marler, T., Rahmatalla, S.: Multi-Objective Optimization-Based Method for Kinematic Posture Prediction: Development and Validation. Robotica 29(2), 245–253 (2011)
4. Marler, T., Rahmatalla, S., Shanahan, M., Abdel-Malek, K.: A New Discomfort Function for Optimization-Based Posture Prediction. In: SAE Digital Human Modeling for Design and Engineering Conference, Iowa City, IA (June 2005)
5. Marler, T., Farrell, K., Kim, J., Rahmatalla, S., Abdel-Malek, K.: Vision Performance Measures for Optimization-Based Posture Prediction. In: SAE Human Modeling for Design and Engineering Conference, Lyon, France (July 2006)
6. Xiang, Y., Arora, J.S., Rahmatalla, S., Abdel-Malek, K.: Optimization-based dynamic human walking prediction: One step formulation. Int. J. Numer. Methods Eng. 79(6), 667–695 (2009)
7. Faraway, J.: Modeling Reach Motions Using Functional Regression Analysis. In: SAE Digital Human Modeling for Design and Engineering Conference, Dearborn, MI (June 2000)
8. Simonidis, C., Stein, T., Bauer, F., Fischer, A., Schwameder, H., Seemann, W.: Determining the Principles of Human Motion by Combining Motion Analysis and Motion Synthesis. In: 9th IEEE-RAS International Conference on Humanoid Robots, pp. 317–322 (2009)
9. Zhan, G.X., Lee, S.-W., Braido, P.: Determining Finger Segmental Centers of Rotation in Flexion-Extension Based on Surface Marker Measurement. J. Biomech. 36(8), 1097–1102 (2003)
10. Cerveri, P., De Momi, E., Lopomo, N., Baud-Bovy, G., Barros, R.M.L., Ferrigno, G.: Finger Kinematic Modeling and Real-time Hand Motion Estimation. Ann. Biomed. Eng. 35(11), 1989–2002 (2007)
11. Cerveri, P., De Momi, E., Marchente, M., Lopomo, N., Baud-Bovy, G., Barros, R.M., Ferrigno, G.: In Vivo Validation of a Realistic Kinematic Model for the Trapezio-Metacarpal Joint Using an Optoelectronic System. Ann. Biomed. Eng. 36(7), 1268–1280 (2008)
12. Frigo, C., Rabuffetti, M., Kerrigan, D.C., Deming, L.C., Pedotti, A.: Functionally Oriented and Clinically Feasible Quantitative Gait Analysis Method. Med. Biol. Eng. Comput. 36(2), 179–185 (1998)
13. Davis, I., Ounpuu, S., Tyburski, D., Gage, J.R.: A Gait Analysis Data Collection and Reduction Technique. Human Movement Science 10(5), 575–587 (1991)
14. Yamane, K., Fujita, Y., Nakamura, Y.: Estimation of Physically and Physiologically Valid Somatosensory Information. In: IEEE International Conference on Robotics and Automation, pp. 2624–2630 (2005)
15. Lu, T.-W., O'Connor, J.J.: Bone Position Estimation from Skin Marker Co-ordinates Using Global Optimisation with Joint Constraints. J. Biomech. 32(2), 129–134 (1999)
16. Herda, L., Urtasun, R., Fua, P.: Hierarchical Implicit Surface Joint Limits for Human Body Tracking. Comput. Vision Image Understanding 99(2), 189–209 (2005)
17. Denavit, J., Hartenberg, R.S.: A kinematic notation for lower-pair mechanisms based on matrices. Journal of Applied Mechanics 77, 215–221 (1955)
18. Gill, P., Murray, W., Saunders, A.: SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization. SIAM Journal of Optimization 12(4), 979–1006 (2002)
19. Miller, C., Mulavara, A., Bloomberg, J.: A quasi-static method for determining the characteristics of a motion capture camera system in a 'split-volume' configuration. Gait & Posture 16(3), 283–287 (2002)
20. Hagio, K., Sugano, N., Nishii, T., Miki, H., Otake, Y., Hattori, A., Suzuki, N., Yonenobu, K., Yoshikawa, H., Ochi, T.: A novel system of four-dimensional motion analysis after total hip arthroplasty. Journal of Orthopaedic Research 22(3), 665–670 (2004)
21. Robert, J.J., Michele, O., Gordon, L.H.: Validation of the Vicon 460 Motion Capture System for Whole-Body Vibration Acceleration Determination. In: ISB XXth Congress-ASB 29th Annual Meeting, Cleveland, Ohio, July 31-August 5 (2005)
22. Gragg, J., Yang, J., Boothby, R.: Posture Reconstruction Method for Mapping Joint Angles of Motion Capture Experiments to Simulation Models. In: HCI International, Orlando, FL (July 2011)
23. Zou, Q.L., Zhang, Q.H., Yang, J.: Determining weights of joint displacement function in direct optimization-based posture prediction – A pilot study. In: The 3rd International Conference on Applied Human Factors and Ergonomics, Miami, FL (July 2010)
24. Zou, Q., Zhang, Q., Yang, J., Boothby, R., Gragg, J., Cloutier, A.: Alternative Formulation for Determining Weights of Joint Displacement Objective Function in Seated Posture Prediction. In: HCI International, Orlando, FL (July 2011)
25. Howard, B., Yang, J.: Optimization-based seated posture prediction considering contact with environment. In: ASME 2011 International Design Engineering Technical Conference & Computers and Information in Engineering Conference, Washington, DC, USA, August 28-31 (2011)
26. Howard, B., Yang, J.: Ground reaction forces for various standing tasks considering generic terrain. In: ASME 2011 International Design Engineering Technical Conference & Computers and Information in Engineering Conference, Washington, DC, USA, August 28-31 (2011)
Posture Reconstruction Method for Mapping Joint Angles of Motion Capture Experiments to Simulation Models

Jared Gragg, Jingzhou (James) Yang, and Robyn Boothby

Human-Centric Design Research Laboratory, Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409
[email protected]
Abstract. Motion capture experiments are often used in coordination with digital human modeling to offer insight into the simulation of real-world tasks or as a means of validating existing simulations. However, there is a gap between the motion capture experiments and the simulation models, because the motion capture system is based on Cartesian space while the simulation models are based on joint space. This paper bridges the gap and presents a methodology that enables one to map joint angles of motion capture experiments to simulation models in order to obtain the same posture. The posture reconstruction method is an optimization-based approach where the cost function is a constant and the constraints require that (1) the distances between simulation model joint centers and the corresponding experimental subject joint centers are equal to zero, and (2) all joint angles are within joint limits. Examples are used to demonstrate the effectiveness of the proposed method.

Keywords: digital human modeling, posture reconstruction, motion capture.
1 Introduction

Human modeling and simulation has gained momentum in recent years due to the advantages of incorporating digital human models into the design process early on. A new approach to design, human-centric design, focuses on the interactions between human beings and their environment. Recent advancements in technology enable a designer to first test a product in a virtual environment. Inside a virtual environment, a designer can subject the product to a series of virtual tests using the aid of a digital human. A digital human, in a general sense, is a software application that tries to replicate the human body in form and function. Through various virtual tests, it is possible for a designer to gather useful information for refining the design. Multiple testing scenarios incorporating digital humans of variable anthropometry are made possible in the digital human simulation environment. Therefore, it is a useful design tool, allowing a designer to stage virtual testing of a product that would have previously only been possible through prototype experiments. Digital human modeling reduces the need for prototype testing, thus saving money and reducing design times.
Motion capture experiments are often a useful aid for digital human modeling simulations. In a typical motion capture experiment, reflective markers are placed on a subject's skin and the Cartesian position of each marker relative to the global coordinate system is tracked through the use of high-speed infrared cameras. As the subject carries out a given physical task, a history of each marker's position is recorded using an array of cameras. Typical output data of a motion capture experiment include the Cartesian coordinates of each of the markers.

Motion capture experiments are often used in coordination with digital human modeling to enhance the simulation by three means. The first is that motion capture data often provide regression models for commercial software such as Jack, Ramsis, and Safework. By capturing motion data for various subjects and tasks, a library of movements can be built and used to create fluid, realistic movements for digital humans. Combining several movements into one task can be accomplished rather easily, and any holes in the animation can be smoothed by manipulating position, velocity, and acceleration profiles and by interpolation. Many technical applications such as LifeMod, Anybody, and DI-Guy incorporate motion capture animation to simulate complex physical tasks such as running or dunking a basketball (Veloso et al., 2004; Rasmussen and Ozen, 2007; DI-Guy, 2008).

The second way motion capture is incorporated into digital human simulations is as a study aid for complex physical task simulations. Many physical tasks are difficult to model without the aid of motion capture. By studying the position, velocity, and acceleration profiles of a motion capture experiment, it is possible to gather useful information that can be incorporated into the mathematical simulations of complex movements. Motion capture data often provide important details about initial conditions or constraints of movement. Incorporating this knowledge is often essential to the simulation of complex physical tasks such as walking, running, or jumping. Zou et al. (2010) and Ozsoy et al. (2011) provide examples of digital human simulations where motion capture aids in the simulation of walking and jumping, respectively.

Finally, the third way motion capture is used in coordination with digital human modeling is for validation. It is always necessary to ensure that digital human simulations mimic real-world scenarios closely; otherwise, the usefulness of the simulation is lost. If a digital human is to be used for the design of a product, the simulation must provide realistic results. Comparing digital human simulations to motion capture experiments often provides a means of validation.

This paper is dedicated to developing a methodology to map joint angles of motion capture experiments to simulation models in order to obtain the same posture. The current digital human employed by the Human-Centric Design Research Laboratory (HCDRL) has 49 degrees-of-freedom (DOF) based on human anatomy. The movement of the digital human is tied to a kinematic joint chain that represents the human skeletal system. Thus, a single posture is uniquely defined by a single set of real numbers, or joint angles. When motion capture is used as a method of validation, it is necessary to be able to directly compare the results of the simulation with results from the motion capture experiments.
The experimental data obtained through motion capture are typically the Cartesian coordinates of the markers with respect to the global coordinate system. Thus, it is necessary to map the experimental data from Cartesian position space to joint angle space. The motion capture software has the capability to transform Cartesian space to joint space, but the error is large. To better compare the
experimental data to simulation results, a posture reconstruction algorithm has been developed. Posture reconstruction involves calculating the necessary joint angles that will recreate the experimental posture with the digital human. Once the joint angles are calculated, they can be directly compared to predicted joint angles from the simulation of the same task or posture. Since predicting human posture is a redundant problem, the posture reconstruction algorithm is formulated as an optimization problem. This paper develops a posture reconstruction algorithm to directly compare digital human simulations to motion capture experimental data by mapping the experimental data from Cartesian space to joint space. Several examples of posture reconstruction are provided to demonstrate the algorithm's effectiveness for various subjects and tasks. This paper is organized as follows: Section 2 describes the posture reconstruction formulation, Section 3 details the motion capture experiments and equipment, Section 4 provides posture reconstruction examples with four subjects performing three different tasks, along with validation, and Section 5 contains a discussion of the posture reconstruction algorithm and extended applications.
2 Posture Reconstruction Formulation

The idea behind posture reconstruction is the same as posture prediction, albeit with considerably more end-effectors. In a posture prediction simulation, typically only a human's natural end-effectors (hands, feet, and head) are prescribed to a certain point in Cartesian space. The other joints in the body are typically free to move according to joint limits or some other criteria. In our previous work, digital human movement is dictated by human performance measures, such as joint displacement, joint torque, musculoskeletal discomfort, visual displacement, and delta-potential energy (Yang et al., 2004). Posture prediction is formulated as an optimization problem with human performance measures serving as objective functions. A typical posture prediction problem formulation is as follows:

• Find: joint angles that represent a specific posture
• Minimize: human performance measures
• Subject to: distance constraints on multiple end-effectors
Human performance measures are combined into a single weighted multi-objective function in the multi-objective optimization (MOO) problem described by Yang et al., (2004). A typical posture prediction problem is described in detail as part of previous work (Gragg et al., 2010a). In a general sense, there are user-defined constraint functions and objective function(s). The objective functions, constraint functions, and function gradients are defined and the optimization problem is solved by SNOPT. SNOPT is a general purpose system that solves constrained optimization problems (Gill et al., 2002). The posture reconstruction algorithm expands upon the posture prediction formulation by adding more end-effectors in a highly-constrained optimization problem. The digital human employed by the HCDRL is controlled through movement of a kinematic joint chain that represents a human’s skeletal system. Deformable skin is attached to the skeleton to aid in visualization, but the kinematics of the digital human are only associated with the skeletal structure (Gragg et al., 2010b). The skeletal structure is constructed according to the Denavit-Hartenberg (DH) convention
(Denavit and Hartenberg, 1955). All joints in the chain are single-DOF revolute joints, each described by a single real number, or joint angle. For the i-th joint, we associate the joint variable q_i and rigidly attach a local coordinate system. Each subsequent joint is connected by a link of varying length. This does not cause loss of generality, as physical human joints with multiple associated DOFs can be represented as multiple DH joints connected by links of length zero. The set of all joint angles, q = [q_1 ⋯ q_n]^T ∈ R^n, represents a specific posture. x(q) ∈ R^3 is the position vector in Cartesian space that describes the location of an end-effector with respect to the global coordinate system. x(q) is determined using the DH method and expressed as the product of the individual transformations associated with each joint. The 4×4 transformation matrix T_i^j denotes the position and orientation of joint i with respect to joint j. x(q) and T_i^j are expressed mathematically as follows:

$$\mathbf{x}(\mathbf{q}) = \Big( \prod_{i=1}^{n} T_i^{\,i-1} \Big)\, \mathbf{x}_n \qquad (1)$$

$$T_i^{\,i-1} = \begin{bmatrix} \cos\theta_i & -\cos\alpha_i \sin\theta_i & \sin\alpha_i \sin\theta_i & a_i \cos\theta_i \\ \sin\theta_i & \cos\alpha_i \cos\theta_i & -\sin\alpha_i \cos\theta_i & a_i \sin\theta_i \\ 0 & \sin\alpha_i & \cos\alpha_i & d_i \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (2)$$

x_n is the position of the end-effector with respect to the n-th frame and n is the number of DOFs. The rotational displacement q_i changes the value of θ_i. The constants α, a, and d are the DH parameters link twist, link length, and link offset, and along with θ they define the geometric relationships between joint i−1 and joint i.
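For concreteness, a short sketch of Eqs. (1)-(2) in Python follows. It is an illustration rather than the authors' code; NumPy is assumed, and dh_params is a hypothetical list of (θ, α, a, d) tuples for the chain.

```python
import numpy as np

def dh_transform(theta, alpha, a, d):
    """4x4 homogeneous transform of joint i w.r.t. joint i-1, Eq. (2)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -ca * st,  sa * st, a * ct],
        [st,  ca * ct, -sa * ct, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def end_effector_position(dh_params, x_n):
    """Eq. (1): chain the joint transforms and map the end-effector
    position x_n (a homogeneous 4-vector in the n-th frame) to the
    global coordinate system."""
    T = np.eye(4)
    for theta, alpha, a, d in dh_params:
        T = T @ dh_transform(theta, alpha, a, d)
    return (T @ x_n)[:3]
```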
In the posture reconstruction formulation, each physical joint center of the human body is treated as an end-effector. That is, every joint center that represents the position of a physical joint in the human body is prescribed to a point in Cartesian space: the position of the corresponding joint center from the motion capture experiment. Joints like the shoulders, clavicles, and elbows are now considered "end-effectors" and thus have an associated distance constraint. Each of the following joint centers is an end-effector: four spinal joints, two clavicle joints, two shoulder joints, two elbow joints, two wrist joints, two hand joints, three neck/head joints, one hip point, two hip joints, two knee joints, two ankle joints, and two foot joints. Fig. 1 shows each of the joints whose position is prescribed. As seen in Fig. 1, each physical joint in the human body is an end-effector. For joints with multiple DOFs, such as the hip, the first DH joint in the kinematic chain is the end-effector joint. There are 26 joint centers that are considered end-effectors. There are a total of 55 DOF for the digital human: 49 human DOF and 6 global DOF that describe the position and orientation of the hip point in Cartesian space. Since the number of distance constraints, equal to the number of end-effectors, is almost half the total DOF of the system, the problem is considered highly constrained. Thus, for computational efficiency, there are no human performance
measures to minimize in the optimization formulation. The objective function is given as a constant, with the objective gradient with respect to all angles thus being zero. Since there is no minimization of a human performance measure, the optimization solver is merely trying to find a feasible solution. If a human performance measure were used in the optimization problem, the computational cost of the simulation would increase. Digital human simulations typically strive to be as close to real-time simulations as possible to enhance the benefits of using digital human modeling as a design tool. A typical posture prediction problem can be solved in less than half a second, thus able to claim real-time simulation capability. The posture reconstruction formulation is defined as follows:

• Find: x = [q_0  q_1  ⋯  q_m]
• Minimize: f = 1
• Subject to:
  g_j = (x_j − TP_x)² + (y_j − TP_y)² + (z_j − TP_z)²,  j = 1, …, M
  −∞ ≤ g_j ≤ ε
  q_i^L ≤ q_i ≤ q_i^U
where m = 55, M = 26, and ε = 0.001. x_j, y_j, and z_j are the x, y, and z positions of the individual joint. TP_x, TP_y, and TP_z are the x, y, and z positions of the joint's target position. Since the units associated with the simulation are meters, each joint center is required to be within 1 mm of the experimental position of the joint center. Since all of the physical human joint centers' positions are prescribed, the result is a posture that fully recreates the experimental posture.
Fig. 1. Important joint centers for posture reconstruction
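To make the formulation concrete, the sketch below poses the same constant-objective feasibility problem in Python. It is a minimal illustration under stated assumptions, not the authors' implementation: SciPy's SLSQP stands in for SNOPT, and forward_kinematics() is a hypothetical placeholder for the DH chain that returns all 26 joint-center positions for a posture q.

```python
import numpy as np
from scipy.optimize import minimize

M, m = 26, 55    # joint-center end-effectors; total DOF (49 human + 6 global)
EPS = 0.001      # distance tolerance from the formulation above

def forward_kinematics(q):
    """Placeholder: (M, 3) array of joint-center positions for posture q,
    computed from the DH transformation chain of Eqs. (1)-(2)."""
    raise NotImplementedError

def reconstruct_posture(targets, q_lower, q_upper, q0):
    """targets: (M, 3) experimental joint-center positions (virtual markers)."""
    constraints = [{
        "type": "ineq",
        # SLSQP expects fun(q) >= 0, so encode g_j <= EPS as EPS - g_j >= 0
        "fun": lambda q, j=j: EPS - np.sum((forward_kinematics(q)[j] - targets[j]) ** 2),
    } for j in range(M)]
    res = minimize(
        lambda q: 1.0,                # constant objective: feasibility only
        q0,
        jac=lambda q: np.zeros(m),    # objective gradient is identically zero
        bounds=list(zip(q_lower, q_upper)),
        constraints=constraints,
        method="SLSQP",
    )
    return res.x                      # joint angles that recreate the posture
```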
The output of the posture reconstruction simulation is the joint angles for the experimental posture. These joint angles can then be directly compared to predicted
74
J. Gragg, J. (James) Yang, and R. Boothby
joint angles from a simulation for the purpose of validation. This is why the posture reconstruction simulation is said to map the joint angles from motion capture experiments to the digital human.
3 Motion Capture Experiment

Several technologies are available on the market that allow for measuring 3D motion data, including electromagnetic sensors, optical sensors, fiber-optic-based sensors, and inertial sensors. The in-house motion capture system is an optical system that employs reflective markers and infrared cameras that capture the reflected light from the markers. Optical systems have proven both effective and efficient at collecting objective data for 3D motion analysis, as they have been shown to be accurate, repeatable, and consistent (Miller et al., 2002). Optical systems have also been employed in many biomechanical studies (Hagio et al., 2004; Robert et al., 2005).

The in-house motion capture system is an eight-camera Eagle-4 system (Motion Analysis®). Each camera in the system has 4-megapixel resolution, shutter speeds from 0-2000 μs, focal lengths ranging from 18 mm to 52 mm, and a maximum of 500 frames per second. The eight cameras are set up to enclose a 10 x 10 ft square envelope with a height of 9 ft. For position tracking, it is necessary for three cameras to see a marker so that its position can be triangulated in Cartesian space. Therefore, the cameras are arranged in an array surrounding the capture volume in such a way that each marker on the body can be in range of at least three cameras.
Fig. 2. Marker placement protocol
Marker placement is an important aspect of any motion capture experiment, and proper marker placement is essential for quality results. Several protocols have been introduced in the literature, with plug-in gait being a typical protocol that has been adopted by systems such as Motion Analysis, VICON, and LifeMod. According to the plug-in gait protocol, markers are attached to bony landmarks on the subject's body.
Posture Reconstruction Method for Mapping Joint Angles
75
The physical markers are then related to joint centers by virtual markers. For instance, to find the joint center position of the wrist, it is necessary to place two markers on the ends of the radius and ulna bones. The position of the joint center is then taken to be halfway between these two physical markers. A virtual marker is then established in the software which represents the position of the joint center. Fig. 2 details the marker placement protocol. The total number of physical markers on the subject is 48, with additional markers for target points. The position of each joint center is approximated by using two or three physical markers for each important joint center. The position is approximated relative to the positions of the real markers, so error is introduced into the system. Thus, it could happen that a posture captured by the motion capture experiment cannot be completely recreated by the digital human. In some instances the posture reconstruction algorithm will fail to meet the distance requirements fully, instead minimizing the error.
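As a small illustration of the virtual-marker construction described above, the snippet below computes a wrist joint center as the midpoint of the two forearm markers; the coordinates are made-up placeholders.

```python
import numpy as np

radius_marker = np.array([0.42, 0.10, 0.95])   # placeholder position [m]
ulna_marker   = np.array([0.46, 0.10, 0.94])   # placeholder position [m]
wrist_center  = 0.5 * (radius_marker + ulna_marker)   # virtual marker
```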
4 Posture Reconstruction Examples

Four subjects were chosen for demonstration of the posture reconstruction algorithm: a 5% female, two 50% males, and a 95% male. The first subject, Subject 1, is a 50% male required to perform two standing tasks, a right-hand reach and a left-hand reach. This example demonstrates the algorithm's capability of handling a single subject with multiple tasks. The second demonstration requires Subject 2, Subject 3, and Subject 4 (a 5% female, a different 50% male, and a 95% male, respectively) to perform a seated reaching task. This example demonstrates the algorithm's capability of handling multiple subjects with the same task. Together these examples demonstrate the posture reconstruction algorithm's versatility in handling multiple subjects and multiple tasks.

Subject 1 performs two standing tasks, a right-hand reach and a left-hand reach. The subject is required to touch the same point with first the right hand and then the left hand, without moving the feet to get closer. Fig. 3 shows the results for the right-hand and left-hand standing reach tasks. Table 1 compares the x, y, and z positions (in m) of each of the joint centers for the motion capture data and posture reconstruction data for the two standing tasks. The motion capture figure shows both the positions of the physical markers, depicted as colored dots, and the positions of the virtual markers, depicted as dark blue crosshairs. The colored lines represent relationships between physical markers. In order to distinguish individual markers, relationships are built that determine which marker is which based on the distance to other markers. For example, the two right elbow markers are about the same distance from each other throughout the entire experiment, so the software can distinguish these markers from the wrist markers based on proximity. The digital human is shown without skin to allow direct comparison to the motion capture view.

The second example requires three subjects to touch a point with their right index finger while seated. Each subject is required to touch the same point with the chair in the same position for each trial. Fig. 4 shows the results for the Subject 2, Subject 3, and Subject 4 seated tasks. Table 2 gives the difference between the x, y, and z positions (in m) of each of the joint centers for the motion capture data and posture reconstruction data for Subject 2, Subject 3, and Subject 4.
76
J. Gragg, J. (James) Yang, and R. Boothby
Fig. 3. Subject 1: (a) right-hand reach, motion capture posture and reconstructed posture; (b) left-hand reach, motion capture posture and reconstructed posture

Table 1. Motion capture vs. posture reconstruction for Subject 1: (a) right-hand reach; (b) left-hand reach
Fig. 4. Seated right hand reach: (a) Subject 2- motion capture posture and reconstructed posture; (b) Subject 3- motion capture posture and reconstructed posture; (c) Subject 4- motion capture posture and reconstructed posture
As seen in Fig. 3, the posture reconstruction algorithm is able to handle a subject performing a variety of tasks with high accuracy. Fig. 4 demonstrates the ability of the algorithm to handle multiple subjects performing the same task. Error introduced through the approximation of the joint centers by virtual markers accounts for slight
Posture Reconstruction Method for Mapping Joint Angles
77
differences in the positions of the corresponding joint centers, but all joint centers fall within a maximum of 5 mm of the joint center positions from the motion capture experiments. The posture reconstruction algorithm thus provides a useful tool to obtain the joint angles from motion capture experiments corresponding to the simulation model.

Table 2. Motion capture vs. posture reconstruction for Subject 2
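The accuracy claim above amounts to a worst-case joint-center deviation check, which a few lines can express; the helper below is illustrative (not the authors' code), with array shapes following the 26 joint centers of the formulation.

```python
import numpy as np

def max_joint_center_error(reconstructed, experimental):
    """Both inputs: (26, 3) arrays of joint-center positions in meters.
    Returns the largest Euclidean deviation across all joint centers."""
    return np.linalg.norm(reconstructed - experimental, axis=1).max()
```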
5 Conclusion When simulating human movement in a digital environment, it is necessary to validate the predicted posture or motion to ensure realistic simulations. One means of validation is provided by the use of motion capture experiments for comparison with predicted results. The current digital human simulations predict a set of joint angles that represents a physical posture. The in-house motion capture equipment provides only Cartesian coordinates of various joint centers during a task. Therefore, a posture reconstruction algorithm has been developed that maps the motion capture data from Cartesian space to joint space, allowing for direct comparison of digital human simulations and motion capture experiments. With the posture reconstruction algorithm, it is now possible to validate any future digital human simulations to ensure reliability. Once a particular task is simulated, a series of experiments can be run with subjects of varying anthropometry and the results can be directly compared to predicted values of joint angles. After comparison, if the predicted results are unacceptable, alterations can be made to the simulations, providing higher accuracy and realism to the digital human simulation and virtual environment. Future work with posture reconstruction will be dedicated to comparing the current algorithm with two alternate methods. The first alternate method is to incorporate a human performance measure such as joint displacement into the optimization formulation. The second alternate method is to implement a two-stage approach to posture reconstruction. The first stage will consist of the current algorithm. The results from the first stage will be used as an initial guess for the second stage. The second stage will incorporate a human performance measure into the optimization formulation as well as lowering the distance tolerance, and the initial joint angle guess will come
78
J. Gragg, J. (James) Yang, and R. Boothby
from the first stage. The three methods will be compared according to accuracy and run-time. Other future work includes using the newly developed algorithms to validate past simulations. Digital human simulations have proven useful as an aid in design, reducing the need for prototypes, reducing costs, and shortening design time to market. Validation of digital human simulations is important to ensure that the simulations provide realistic, reliable results. This posture reconstruction algorithm will provide a means to validate current and future digital human simulations at the HCDRL. Acknowledgments. This work is partly supported by National Science Foundation Project (Award # NSF09-265499).
References

1. Denavit, J., Hartenberg, R.S.: A Kinematic Notation for Lower-Pair Mechanisms Based on Matrices. Journal of Applied Mechanics 22, 215–221 (1955)
2. DI-Guy – Human Simulation Software, Boston Dynamics (2008), http://www.bostondynamics.com/diguy/index.htm
3. Hagio, K., Sugano, N., Nishii, T., Miki, H., Otake, Y., Hattori, A., Suzuki, N., Yonenobu, K., Yoshikawa, H., Ochi, T.: A novel system of four-dimensional motion analysis after total hip arthroplasty. Journal of Orthopaedic Research 22(3), 665–670 (2004)
4. Gill, P.E., Murray, W., Saunders, M.A.: SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal on Optimization 12(4), 979–1006 (2002)
5. Gragg, J., Yang, J., Long, J.: Optimization-Based Approach for Determining Driver Seat Adjustment Range for Vehicles. International Journal of Vehicle Design (2010a) (in press)
6. Gragg, J., Yang, J., Long, J.: Digital Human Model for Driver Seat Adjustment Range Determination. In: SAE 2010 World Congress and Exhibition, Detroit, MI, April 12-15 (2010b)
7. Miller, C., Mulavara, A., Bloomberg, J.: A quasi-static method for determining the characteristics of motion capture camera systems in a 'split-volume' configuration. Gait & Posture 16(3), 283–287 (2002)
8. Ozsoy, B., Yang, J., Boros, R., Hashemi, J.: Direct Optimization-Based Planar Human Vertical Jumping Simulation. International Journal of Human Factors Modelling and Simulation (2010) (submitted)
9. Rasmussen, J., Ozen, M.: AnyBody – ANSYS Interface: CAE Technology for the Human Body. CADFEM Medical (2007)
10. Robert, J.J., Michele, O., Gordon, L.H.: Validation of the Vicon 460 Motion Capture System for Whole-Body Vibration Acceleration Determination. In: ISB XXth Congress-ASB 29th Annual Meeting, Cleveland, Ohio, July 31-August 5 (2005)
11. Veloso, A., Esteves, G., Silva, S., Ferreira, C., Brandão, F.: Biomechanics Modeling of Human Musculoskeletal System Using Adams Multibody Dynamics Package. Faculty of Human Movement Sciences – Technical University of Lisbon (2004)
12. Yang, J., Marler, R.T., Kim, H., Arora, J., Abdel-Malek, K.: Multi-Objective Optimization for Upper Body Posture Prediction. In: 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Albany, New York, USA, August 30-September 1 (2004)
13. Zou, Q., Zhang, Q., Yang, J.: Determining Weights of Joint Displacement Objective Function in Optimization-Based Posture Prediction. In: 1st International Conference on Applied Digital Human Modeling, Miami, Florida, July 17-20 (2010)
Joint Torque Modeling of Knee Extension and Flexion

Fabian Guenzkofer, Florian Engstler, Heiner Bubb, and Klaus Bengler

Institute of Ergonomics, Technische Universität München, Boltzmannstr. 15, 85747 Garching, Germany
{guenzkofer,engstler,bengler,bubb}@lfe.mw.tum.de
Abstract. The purpose of this experiment is to obtain isometric knee extension and flexion joint torque – joint angle functions considering necessary biomechanical aspects. In order to examine gender and age effects, four different subject groups (10 young males and females, 8 old males and females) were used. Age and gender had a significant influence for both force directions. Not only different maximum values but also different curve shapes were identified for the different age groups. Additionally, the hip flexion angle significantly influenced joint torque production.

Keywords: joint torque, age effects, strength, knee, force.
1 Introduction

Knowledge of executable forces in different postures is crucial for product design [1] as well as for workplace design [2]. Strength prediction for arbitrary postures presupposes correlations between joint angles and joint torques [3] due to the length–tension relationship of muscles [4]. Furthermore, torque–angle curves permit posture prediction by a minimization of all joint torques [5]. Of all relevant torque–angle relationships, this paper concentrates on those of isometric knee extension and knee flexion.

Several authors have examined the variation of knee flexion torque within the range of knee flexion motion while keeping the hip flexion angle constant [6-10]. All studies revealed a descending curve. However, taking into account the responsible muscles indicates that an important aspect is hereby neglected: the biggest muscles responsible for knee flexion are the m. biceps femoris, the m. semitendinosus, and the m. semimembranosus. Except for the short head of the m. biceps femoris, all of these muscles are biarticular; that is to say, they cover the knee as well as the hip joint. Therefore the position and posture of the hip should have a distinct influence on the maximum joint torque of knee flexion. Some authors have considered this interrelation and have additionally varied the hip flexion angle [11-13]. All of them state that the joint torques significantly increased with the hip being more flexed.

In her review of human strength curves, Kulig [14] mentions knee extension measurements that only varied the knee flexion angle [15-16]. Several more authors have examined the knee extension joint torque – knee joint angle relationship [7-9, 10, 17, 18].
In some cases the hip flexion angle was not controlled, as no backrest was available for the subjects [16, 19]. In almost all of these studies the researchers were faced with an ascending limb, followed by a plateau and finally by a descending limb. Regarding knee extension, the m. rectus femoris is the only biarticular muscle. Herzog et al. [20] compared knee extensor strength curves with two different hip flexion angles obtained theoretically and experimentally. They hypothesized an offset between the strength curves, with a right shift for higher hip flexion angles (seated instead of prone). The results agreed well with each other. Houtz et al. [11] also examined the influence of varied hip flexion angles on knee extension torque and found a shift due to hip flexion angle.

Another important aspect for product and workplace design is the loss of strength due to aging [10, 21-24]. Some authors have compared strength differences between young and old subjects [19, 27]. Samuel et al. [10] compared knee joint torques of 60-, 70- and 80-year-old subjects. The 80-year-old subjects had 20% less strength than the 60-year-olds. In this study women produced approximately 40% less knee torque than men.

In the context of the research project DHErgo, functional joint torque data are collected for digital human models. In contrast to most studies, which solely concentrate on particular joints, complete individual torque models of the subjects should be developed by comprehensive torque measurements combining all relevant degrees of freedom and adjacent joints. To account for the low sample size, the idea is to combine these detailed measurements with a large database of strength data in order to obtain percentile values for joint torques [28].

This paper describes isometric knee joint torque measurements in numerous joint positions, considering necessary biomechanical aspects, using four different subject groups. The aim of the data collection is not to create a huge amount of data for reliable population prognosis. Rather, taking detailed measurements on 18 subjects should reveal qualitative information about general curve shapes and influences of adjacent joint angles.
2 Methods and Materials

2.1 Subjects

Eighteen subjects volunteered to take part in this study. In order to examine age and gender effects, young and old as well as female and male subjects were recruited (see Table 1). Age group definitions were agreed on within the DHErgo consortium. All subjects had to fulfill specific requirements. On the one hand, competitive athletes were rejected, as the results should approximate average population strengths as closely as possible. On the other hand, subjects had to be healthy and free of orthopaedic and neurological disorders. All subjects gave their written consent after having been informed about the purpose of the study and its procedures.
Table 1. Mean and standard deviation of age, height, and weight of subjects in each group

Group       | N | Age               | Height           | Weight
Young men   | 5 | 30.2 ± 2.95 years | 180.67 ± 4.03 cm | 77.96 ± 11.11 kg
Young women | 5 | 24.2 ± 4.32 years | 165.68 ± 2.42 cm | 57.28 ± 4.76 kg
Old men     | 4 | 75.5 ± 3.11 years | 172 ± 3.34 cm    | 81.03 ± 3.98 kg
Old women   | 4 | 73.25 ± 5.44 years| 162.4 ± 3.47 cm  | 67.43 ± 8.05 kg
2.2 Apparatus

Joint torques were measured using a modified training device with several adjustment possibilities (see Fig. 1). The backrest can be adjusted in height and inclination, and the distance to the seat pan can be varied to account for different thigh lengths. A burster torque sensor (burster präzisionsmesstechnik gmbh & co kg, Gernsbach, Germany) directly measures the joint torque. Straps for the upper body ensured maximum stabilization of the subjects. The torque sensor read-out was conducted using LabVIEW 8.5 (National Instruments Corporation, Austin, USA).
Fig. 1. Adjustable knee joint torque measuring device
Postures of the subjects were recorded using the six-camera optoelectronic Vicon MX T10 motion analysis system (Oxford Metrics Ltd., Oxford, UK). However, motion reconstruction is not part of this paper; for the analysis the requested joint angles were used.

2.3 Experimental Design

In order to obtain the knee joint torque – knee flexion angle relationship dependent on different hip flexion angles, both joint angles were varied for the experiment (see Table 2). Unpublished pretests have shown that the differences resulting from 45 and 0 degrees (torso and thighs in line) hip flexion angles were quite negligible for knee flexion. Therefore only two different but extreme hip flexion angles (0 and 90 degrees) are taken into account. In order to best capture the course of the function, four different knee flexion angles are used. A knee flexion angle of 0 degrees means full
extension. Furthermore, unpublished pretests have revealed that the ascending and descending limbs of knee extension were almost linear, with a maximum at 60 degrees, independent of the hip flexion angle. Therefore only three different knee flexion angles have been chosen. A knee flexion angle of 60 degrees was only combined with one hip flexion angle.

Table 2. Measurement positions for the experiment

Position | Hip flexion angle | Knee flexion angle | Body position | Direction
1  |  0 |   0 | Prone  | Flexion
2  |  0 |  30 | Prone  | Flexion
3  |  0 |  60 | Prone  | Flexion
4  |  0 | 110 | Prone  | Flexion
5  | 90 |   0 | Seated | Flexion
6  | 90 |  30 | Seated | Flexion
7  | 90 |  60 | Seated | Flexion
8  | 90 | 110 | Seated | Flexion
9  |  0 |  20 | Supine | Extension
10 |  0 | 110 | Supine | Extension
11 | 45 |  20 | Seated | Extension
12 | 45 |  60 | Seated | Extension
13 | 45 | 110 | Seated | Extension
14 | 90 |  20 | Seated | Extension
15 | 90 | 110 | Seated | Extension
For the isometric joint torque measurements the plateau method was used, in which the subjects build up their maximum force in the first second and maintain the force for four seconds [29-32]. The average value over a three-second interval within the four-second hold is taken as the maximum value. Between two trials the subjects had a rest period of 2 minutes. In order to ensure that the real maximum torque value was obtained, the subjects performed two trials in every position. Throughout the experiment the subjects did not receive any kind of motivation, neither verbal encouragement nor visual feedback [29, 30].

2.4 Procedure

Each subject was instructed on the experimental procedure and some anthropometrical measures were taken. After the opportunity to pose questions, they ran through a warm-up phase consisting of 15 minutes on a cross-trainer and 10 squats. The subjects were then familiarized with the measuring device, which was adjusted to their anthropometric dimensions. The right lateral femoral epicondyle was used for the correct alignment of the anatomical flexion axis and the sensor axis. By means of a goniometer the different knee flexion and hip flexion angles were set. Thereupon the subjects had the opportunity to perform training trials to get used to the measurement positions and the strategy to optimally apply the maximum possible force. After performing the maximum torque measurements, the subject's lower leg
was tied to the bolster and the subject had to relax. Thus the torque induced by the body segment weight was recorded in order to take gravitational influences into account.

2.5 Data Analysis

All data were analyzed using the SPSS statistical package, release 18 (SPSS, Inc., Chicago, IL, USA). Of the two repetitions per trial only the higher value was further processed [29]. The level of statistical significance was set at p < .05. A mixed-design ANOVA was conducted using age and gender as between-subjects factors and hip and knee angle as repeated-measures within-subject variables. Non-linear multiple regression analyses were performed using Matlab R2009a (Mathworks, Inc., Natick, MA, USA).
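The plateau read-out and the best-of-two-repetitions rule are simple to express in code; the sketch below is an illustrative Python helper (the actual read-out used LabVIEW), assuming a uniformly sampled torque signal at fs Hz.

```python
import numpy as np

def plateau_value(torque, fs, window_s=3.0):
    """Mean torque over the best 3 s window of the recorded hold phase."""
    n = int(window_s * fs)
    c = np.cumsum(np.insert(torque, 0, 0.0))   # cumulative sum for a fast moving average
    return ((c[n:] - c[:-n]) / n).max()

def trial_value(rep1, rep2, fs):
    """Per Sec. 2.5, only the higher of the two repetitions is processed."""
    return max(plateau_value(rep1, fs), plateau_value(rep2, fs))
```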
3 Results

The resulting joint torques for each group are listed in Table 3.

Table 3. Mean (M) and standard deviation (SD) of the resulting joint torques (Nm) for each group

Direction | Knee | Hip | Men young, M (SD) | Men old, M (SD) | Women young, M (SD) | Women old, M (SD)
Flexion   |   0 |  0 |  83.78 (25.64) | 38.10 (11.89) |  52.84 (9.10)  | 30.23 (9.64)
Flexion   |  30 |  0 |  76.42 (21.64) | 40.68 (7.08)  |  43.88 (2.62)  | 27.75 (10.12)
Flexion   |  60 |  0 |  71.40 (15.28) | 38.03 (6.79)  |  35.14 (5.01)  | 22.93 (9.68)
Flexion   | 110 |  0 |  40.08 (6.74)  | 22.00 (7.42)  |  22.72 (3.26)  | 19.60 (0.40)
Flexion   |   0 | 90 |  64.34 (14.91) | 35.15 (14.48) |  44.62 (5.08)  | 18.90 (5.21)
Flexion   |  30 | 90 |  71.58 (26.95) | 28.20 (7.40)  |  43.86 (13.03) | 18.13 (7.06)
Flexion   |  60 | 90 |  73.82 (16.79) | 28.83 (9.72)  |  49.00 (9.46)  | 18.93 (7.34)
Flexion   | 110 | 90 |  61.22 (13.34) | 31.80 (8.73)  |  41.50 (8.41)  | 23.80 (10.87)
Extension |  20 |  0 |  91.42 (26.57) | 57.03 (12.22) |  70.24 (17.30) | 39.55 (9.98)
Extension | 110 |  0 | 107.66 (14.83) | 53.48 (22.25) |  71.56 (12.57) | 37.98 (2.90)
Extension |  20 | 45 |  92.72 (21.07) | 53.50 (15.75) |  73.28 (17.27) | 40.55 (3.45)
Extension |  60 | 45 | 131.78 (25.74) | 72.80 (17.87) | 101.88 (16.03) | 45.35 (11.11)
Extension | 110 | 45 | 117.64 (9.89)  | 64.30 (13.70) |  83.96 (11.76) | 43.08 (11.99)
Extension |  20 | 90 |  88.12 (22.58) | 45.48 (18.64) |  71.34 (22.61) | 33.43 (2.95)
Extension | 110 | 90 | 111.28 (9.77)  | 60.85 (15.59) |  82.72 (5.97)  | 40.05 (12.47)
Joint angle – joint torque diagrams are depicted in Fig. 2. Standard deviations (included in Table 3) are omitted in favor of clarity.
Fig. 2. Mean torque values for each measurement posture for each group. Panels: knee flexion at 0 degrees hip flexion, knee flexion at 90 degrees hip flexion, and knee extension under 45 degrees hip flexion; torque [Nm] is plotted against knee flexion angle [degrees] for young men, young women, old men, and old women.
Knee Flexion. The main effects gender, F(1,13)=15.20, p<.01, and age, F(1,13)=34.67, p<.001, had a significant effect on torque production. The effect of age on torque production was not different between male and female subjects, F(1,13)=3.92, p>.05. As expected, the resulting joint torques were significantly dependent on the knee flexion angle, F(3,39)=13.78, p<.001. Contrasts revealed that the knee flexion angle of 110 degrees differs significantly from all other knee flexion angles. The main effect of hip angle did not show a significant influence, F(1,13)=.14, p>.05. However, interaction effects show that the influence of the knee angle has to be considered in the context of hip position, F(3,39)=17.32, p<.001, and age, F(3,39)=4.04, p<.05. The diagrams in Fig. 2 support the results from the contrasts: for a hip angle of 90 degrees, different knee flexion angles have only a minor effect, whereas for a hip angle of 0 degrees, increasing knee angles lead to decreasing joint torques. Concerning age, interaction graphs show that up to 60 degrees knee flexion, young and old groups show parallel lines; the torque decay beyond 60 degrees is significantly higher for young subjects. Finally, the way the hip angle influences torque production in certain knee positions is again dependent on age (interaction effect knee x hip x age), F(3,39)=3.32, p<.05.

Multiple regression analyses revealed the following equations for young males (1), young females (2), old males (3), and old females (4). The variable h stands for the hip flexion angle and k for the knee flexion angle. In brackets, two coefficients of determination are given. The first refers to the averaged values for each group
(see Table 3) and the second to a regression using the individual values. The low amount of shared variance for the second correlation is not unexpected, but due to the different strengths per group.

T = 82.54 - 0.2008h - 0.0097k + 0.0040hk - 0.0034k²  (R1² = .99; R2² = .35)   (1)

T = 51.58 - 0.0734h - 0.2171k + 0.0028hk - 0.0005k²  (R1² = .96; R2² = .58)   (2)

T = 38.35 - 0.0383h + 0.0759k - 0.0044hk - 0.0002k² + 5.254e-5hk² - 1.693e-5k³  (R1² = .99; R2² = .34)   (3)

T = 31 - 0.1394h - 0.1658k + 0.0016hk + 0.0006k²  (R1² = .98; R2² = .28)   (4)
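Evaluating these fitted surfaces is straightforward; the sketch below encodes Eqs. (1)-(4) in Python with the coefficients as printed (h and k in degrees, T in Nm). The extension models, Eqs. (5)-(8) below, follow the same pattern.

```python
FLEXION_TORQUE = {
    "young_male":   lambda h, k: 82.54 - 0.2008*h - 0.0097*k + 0.0040*h*k - 0.0034*k**2,
    "young_female": lambda h, k: 51.58 - 0.0734*h - 0.2171*k + 0.0028*h*k - 0.0005*k**2,
    "old_male":     lambda h, k: (38.35 - 0.0383*h + 0.0759*k - 0.0044*h*k
                                  - 0.0002*k**2 + 5.254e-5*h*k**2 - 1.693e-5*k**3),
    "old_female":   lambda h, k: 31.0 - 0.1394*h - 0.1658*k + 0.0016*h*k + 0.0006*k**2,
}

# Example: young male, seated (hip 90 degrees), knee flexed 60 degrees
print(round(FLEXION_TORQUE["young_male"](90, 60), 1))  # ~73.2 Nm, cf. 73.82 in Table 3
```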
Knee Extension. Due to the results of the pretests, the knee flexion angle of 60 degrees was only combined with one hip flexion angle of 45 degrees. Therefore the mixed-design ANOVA could only be conducted with the two knee flexion angles of 20 and 110 degrees. The two knee angles lead to significantly different joint torques, F(1,14)=7.27, p<.05. Similarly to knee flexion, gender, F(1,14)=15.56, p<.01, as well as age, F(1,14)=75.77, p<.001, had a significant influence. Furthermore, an interaction effect of knee x hip could be detected, F(2,28)=4.01, p<.05. Contrasts revealed significant differences when comparing the torques at 0 and 45 degrees as well as at 0 and 90 degrees hip flexion for both knee angles. No interaction effect between knee position and age could be identified. However, looking at Fig. 2, it becomes evident that age influences the shape of the functions when the knee flexion angle of 60 degrees is taken into account. Therefore an additional repeated-measures ANOVA with only 45 degrees hip flexion and three knee positions was conducted. Again, significant influences of knee position, F(2,28)=18.87, p<.001, gender, F(1,14)=15.03, p<.01, and age, F(1,14)=57.01, p<.001, were detected. Contrasts show differences between all knee flexion angles. In this case the interaction effect knee x age also turned out to be significant, F(2,28)=4.25, p<.05. Contrasts revealed a significantly different torque increase from 20 to 60 degrees knee flexion between the young and old groups. Similarly to knee flexion, multiple regression analyses lead to the following equations (5-8):

T = 55.98 + 0.1934h + 2.006k - 0.0027h² + 0.0009hk - 0.0139k²  (R1² = .99; R2² = .42)   (5)

T = 42.84 + 0.1943h + 1.571k - 0.0023h² + 0.0012hk - 0.0119k²  (R1² = .99; R2² = .34)   (6)

T = 40.89 + 0.0336h + 0.9094k - 0.0023h² + 0.0023hk - 0.0072k²  (R1² = .98; R2² = .23)   (7)

T = 35.85 + 0.0923h + 0.2216k - 0.0020h² + 0.0010hk - 0.0018k²  (R1² = 1; R2² = .17)   (8)
4 Conclusions

The purpose of the current study was to examine joint torque – joint angle relationships for knee extension and flexion for different subject groups.
In many comparable studies, knee flexion torques follow a descending limb over the range of knee motion [6-12] for a hip flexion angle of 0 degrees (prone position). This is also true for our experiments; only old males applied less force than expected at a knee flexion angle of zero degrees. Figure 2 illustrates a correlation between slope and intercept; that is to say, strong persons show higher strength variability than weaker ones. Some authors report that hip flexion changes the shape from descending to ascending-descending [11-13]. In our experiments this is only true for young males; old males show the contrary behavior. Furthermore, these authors identified a torque shift induced by different hip positions, which could also be found in our experiments. Adopting a knee flexion angle of 0 degrees, the subjects applied a significantly higher torque in the prone position than in the seated position. It is assumed that the increased hip flexion leads to an excess of the ideal muscle length of the hamstrings. Conversely, subjects showed significantly higher torques in the seated position than in the prone position at a knee flexion angle of 110 degrees; in this case further flexing the hip lengthens the hamstrings towards their ideal muscle length. It seems that the weaker the subjects are, the smaller the influence of the knee position within the range of motion on the knee flexion torque. Knapik et al. [7] discovered that the angle at which peak torques occurred was highly variable among subjects. This can also be seen in our tests with a hip flexion angle of 90 degrees: Fig. 2 shows that the occurrence of the maximum depends on the age group. The young groups have their maximum at a knee flexion of 60 degrees, old men at zero degrees, and old women at 110 degrees.

All reviewed papers showed ascending – (plateau) – descending shapes for knee extension [7-11, 17-20]. Houtz et al. [11] and Herzog et al. [20] report a torque shift due to changing hip flexion angles. Exactly this behavior was evident in our experiments: a hip flexion angle of zero degrees leads to higher torques at a knee flexion angle of 20 degrees than a hip flexion angle of 90 degrees, and the reverse is the case for a knee flexion angle of 110 degrees. Pincivero et al. [17] report a higher slope for men than for women. In our experiments we found that the stronger the subjects, the higher the slopes of the ascending and descending limbs.

To sum up, gender, age, and hip flexion angle showed influences for both force directions. Knowing the general curve shape and just translating the functions using the maximum value turns out to be inaccurate, as the slope depends on the intercept. The resulting joint torque – joint angle functions can now be implemented in digital human models as an intermediate step for posture and strength prediction. Thus workplaces and products can be evaluated in terms of relative joint loads [28] and resulting postures. Combinations with extensive measurement campaigns will finally lead to absolute strength values expressed in percentiles [28].

Acknowledgments. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 218525. The authors gratefully acknowledge the support of the TUM Graduate School at Technische Universität München, Germany.
References

1. Daams, B.: Human force exertion in user-product interaction. Backgrounds for design. Delft: Fac. of Industrial Design Engineering, Delft Univ. of Technology (Series in physical ergonomics, 2) (1994)
2. Chaffin, D.B.: Prediction of Population Strength. In: Proceedings of the SAE DHMC (Digital Human Modeling Conference), SAE 981307 (1998)
3. Schwarz, W.: 3D-Video-Belastungsanalyse. Ein neuer Ansatz zur Kraft- und Haltungsanalyse. Als Ms. gedr. Düsseldorf: VDI-Verl. (Fortschritt-Berichte VDI, Reihe 17, Biotechnik/Medizintechnik, 166) (1997)
4. Winter, D.A.: Biomechanics and motor control of human movement, 3rd edn. Wiley, Hoboken (2005)
5. Seitz, T., Recluta, D., Zimmermann, D., Wirsching, H.-J.: FOCOPP – An approach for a human posture prediction model using internal/external forces and discomfort. In: Proceedings of the SAE DHMC (Digital Human Modeling Conference), SAE 2005-01-2694 (2005)
6. Campney, H.K., Wehr, R.W.: Significance of strength variation through a range of joint motion. Phys. Ther. (45), 773–779 (1965)
7. Knapik, J.J., Wright, J.E., Mawdsley, R.H., Braun, J.: Isometric, Isotonic, and Isokinetic Torque Variations in 4 Muscle Groups Through a Range of Joint Motion. Physical Therapy 63(6), 938–947 (1983)
8. Yoon, S.T., Park, D.S., Kang, S.W., Chun, S.-I., Shin, J.S.: Isometric and Isokinetic Torque Curves at the Knee Joint. Yonsei Medical Journal 32(1), 33–43 (1991)
9. Anderson, D.E., Madigan, M.L., Nussbaum, M.A.: Maximum voluntary joint torque as a function of joint angle and angular velocity: Model development and application to the lower limb. Journal of Biomechanics 40(14), 3105–3113 (2007)
10. Samuel, D., Rowe, P.J.: Effect of Ageing on Isometric Strength through Joint Range at Knee and Hip Joints in Three Age Groups of Older Adults. Gerontology 55(6), 621–629 (2009)
11. Houtz, S., Lebow, M., Beyer, F.: Effect of posture on strength of the knee flexor and extensor muscles. Journal of Applied Physiology (11), 475–480 (1957)
12. Williams, M., Stutzmann, L.: Strength variation through the range of motion. Phys. Ther. (39), 145–152 (1959)
13. Mohamed, F., Perry, J., Hislop, H.: Relationship between wire EMG activity, muscle length, and torque of the hamstrings. Clinical Biomechanics 17(8), 569–579 (2002)
14. Kulig, K., Andrews, J.G., Hay, J.G.: Human strength curves. In: Terjung, R.L. (ed.) Exercise and Sport Science Reviews, vol. (12), pp. 417–466. The Collamore Press, Lexington (1984)
15. Campney, H.K., Wehr, R.W.: An interpretation of the strength differences associated with varying angles of pull. Res. Q (36), 403–412 (1965)
16. Lindahl, O., Movin, A., Rindqvist, I.: Knee extension measurement of the isometric force in different positions of the knee joint. Acta Orthop. Scand. (40), 79–85 (1969)
17. Pincivero, D.M., Salfetnikov, Y., Campy, R.M., Coelho, A.J.: Angle- and gender-specific quadriceps femoris muscle recruitment and knee extensor torque. Journal of Biomechanics, 1689–1697 (2007)
18. Becker, R., Awiszus, F.: Physiological alterations of maximal voluntary quadriceps activation by changes of knee joint angle. Muscle & Nerve 24(5), 667–672 (2001)
19. Haffajee, D., Moritz, M., Svantensson, G.: Isometric knee extension strength as a function of joint angle, muscle length and motor unit activity. Acta Orthop. Scand. (43), 138–147 (1972)
20. Herzog, W., Hasler, E., Abrahamse, S.K.: A comparison of knee extensor strength curves obtained theoretically and experimentally. Medicine and Science in Sports and Exercise 23(1), 108–114 (1991)
21. Hurley, B.F.: Age, Gender, and Muscular Strength. Journals of Gerontology Series A-Biological Sciences and Medical Sciences 50, 41–44 (1995)
22. Era, P., Lyyra, A.L., Viitasalo, J.T., Heikkinen, E.: Determinants of isometric muscle strength in men of different ages. European Journal of Applied Physiology and Occupational Physiology 64(1), 84–91 (1992)
23. Herzog, W.: Determinants of muscle strength. Muscle Strength, 45–82 (2004)
24. Goodpaster, B.H., Park, S.W., Harris, T.B., Kritchevsky, S.B., Nevitt, M., Schwartz, A.V., et al.: The loss of skeletal muscle strength, mass, and quality in older adults: The health, aging and body composition study. Journals of Gerontology Series A-Biological Sciences and Medical Sciences 61(10), 1059–1064 (2006)
25. Bemben, M.G., Massey, B.H., Bemben, D.A., Misner, J.E., Boileau, R.A.: Isometric Muscle Force Production as a function of age in healthy 20- to 74-yr-old men. Medicine and Science in Sports and Exercise 23(11), 1302–1310 (1991)
26. Frontera, W.R., Hughes, V.A., Lutz, K.J., Evans, W.J.: A cross-sectional study of muscle strength and mass in 45- to 78-yr-old men and women. Journal of Applied Physiology 71(2), 644–650 (1991)
27. Izquierdo, M., Ibañez, J., Gorostiaga, E., Garrues, M., Zúñiga, A., Antón, A., et al.: Maximal strength and power characteristics in isometric and dynamic actions of the upper and lower extremities in middle-aged and older men. Acta Physiologica (167), 57–68 (1999)
28. Engstler, F., Bubb, H.: Generation of Percentile Values for Human Joint Torque Characteristics. In: Duffy, V.G. (ed.) ICDHM 2009. LNCS, vol. 5620, pp. 95–104. Springer, Heidelberg (2009)
29. Kumar, S.: Muscle strength. CRC Press, Boca Raton (2004)
30. Mital, A., Kumar, S.: Human muscle strength definitions, measurement, and usage: Part I – Guidelines for the practitioner. International Journal of Industrial Ergonomics 22(1-2), 101–121 (1998)
31. Brown, L.E., Weir, J.P.: ASEP Procedures Recommendation I: Accurate Assessment of Muscular Strength and Power. Journal of Exercise Physiology (online) 4(3) (2001)
32. Chaffin, D.B., Andersson, G., Martin, B.J.: Occupational biomechanics, 4th edn. Wiley-Interscience, Hoboken (2006)
Predicting Support Reaction Forces for Standing and Seated Tasks with Given Postures – A Preliminary Study

Brad Howard and Jingzhou (James) Yang

Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409, USA
[email protected]
Abstract. This paper proposes a systematic approach for predicting the support reaction forces (SRFs) acting on a digital human model with a given posture. In addition, a generic method has been developed to determine the accurate body segment inertia properties (BSIPs) needed for subject-specific simulation. Experiments based on motion capture are used to track the posture and to find subject’s link lengths. The prediction model calculates the support reaction forces by using the zero moment point (ZMP) formulation. This study considers two general postural cases: standing and seated. The standing tasks include standing on two planes with arbitrary orientations. The seated tasks include sitting on a seat where the seat pan is parallel to the floor and both feet are on the floor. Keywords: Support reaction forces; digital human model; ZMP; posture.
1 Introduction

The advances in the field of digital human modeling (DHM) have allowed researchers to accurately predict human posture and motion while interacting with the environment. One of the major environmental interactions, and long-standing problems in DHM, is how to predict the support reaction forces (SRFs) for various standing and seated tasks without measuring them. Traditional methods used to determine the SRFs are based on experiments that measure these forces using force plates or load cells. However, in the digital human modeling environment, it is impossible to measure the SRFs. Therefore, it is important to accurately predict the SRFs for various standing and seated tasks. To date, most DHM models only have the capability to predict SRFs on flat ground. Many human tasks, such as climbing stairs or standing on uneven terrain, are not considered in currently available prediction models. Recent work has extended that capability to predict SRFs acting on a human in a seated position while ignoring foot-ground contacts (Kim et al., 2009). However, most seated tasks include those reactions. Therefore, it is necessary to develop prediction models that include these types of tasks.

Body segment inertia properties (segment mass, centers of mass, and moments of inertia) are important sets of data for any type of biomechanical analysis. Any prediction model using human dynamics that considers human subjects instead of general populations needs subject-specific data, including BSIPs. Consequently, it is
necessary to have a generic method that can estimate these data based on the subject's total body stature and mass.

The objective of this study is to predict the support reaction forces acting on a human in a known posture by (1) developing a method to accurately estimate the body segment inertia properties of specific subjects and (2) creating a prediction model based on the zero moment point formulation that can account for arbitrary support planes while standing, and for seated postures that include foot contacts.

Recently, the static joint torques of a digital human model were calculated in Yang et al. (2009). Yang et al. (2009) also used the ZMP formulation and a linear distribution model to predict ground reaction forces on flat ground. The ZMP formulation can be found in Vukobratović and Borovac (2004) and Goswami et al. (1999). Its coincidence with the center of pressure is described in Sardain and Bessonet (2004). Kim et al. (2009) extended the Yang et al. (2009) formulation to include seated reaction forces without considering foot contacts. Applying these methods to subject-specific models has not yet been done and, as mentioned, requires the addition of subject-specific body segment inertia properties.

Dempster (1955) completed a study for the Air Force that served as a benchmark for many future studies. Since then, many linear and non-linear regression equations have been developed that can estimate the BSIP information given body stature and weight (Zatsiorski et al., 1990; Hatze et al., 1980; McConville et al., 1980; Young et al., 1983; Niklova et al., 2007; Niklova et al., 2010). Many recent studies have improved the regression equations found in these studies so that they apply more to the general population or to a specific application (de Leva, 1996; Dumas et al., 2006; Dumas et al., 2007).

This paper is organized as follows. First the kinematics of the human model will be discussed. Next the formulation of the problem will be covered, including estimation of body segment inertia properties, calculation of static joint torques, the zero-moment point formulation, and the SRF prediction model. Examples will then be presented and discussed. Finally, conclusions and future work considerations will be offered.
2 Human Model and Kinematics

The human body can be modeled as a kinematic chain consisting of revolute joints, which represent the musculoskeletal joints, connected by links that represent the bones. The vector $\mathbf{q} = [q_1 \ \cdots \ q_n]^T$ comprises the generalized coordinates and represents a unique posture. The positions of the end effectors, the hands and the feet, are described by vectors in Cartesian space and are functions of $\mathbf{q}$: $\mathbf{x}(\mathbf{q}) \in \mathbb{R}^3$ (Yang et al., 2004). Since the generalized coordinates are measured about the local z axis, transformation matrices are needed for each degree of freedom in order to calculate $\mathbf{x}(\mathbf{q})$. The transformation matrices are found through the Denavit–Hartenberg (DH) method (Denavit and Hartenberg, 1955). The general form of the matrix is denoted $^{i-1}T_i$. By successively multiplying these matrices together, as in Eq. (1), the global coordinates of any point in the mechanism can be calculated.
$$A_j = T_1 T_2 \cdots T_j = A_{j-1} T_j \qquad (1)$$

Likewise, the time derivatives of Eq. (1) provide the global velocities and accelerations of any point in the mechanism.
The human model used in this study is based on the kinematics above and consists of 55 DOF: 6 global DOF, 12 spinal DOF, 9 DOF in each arm, 5 DOF in the head and neck, and 7 DOF in each leg.
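To make the chain multiplication of Eq. (1) concrete, the following minimal sketch (not the authors' implementation) composes standard DH transforms; the 3-DOF chain and its parameters are hypothetical placeholders.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard Denavit-Hartenberg transform from frame i-1 to frame i."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def global_point(q, dh_params, local_point):
    """Eq. (1): A_j = T_1 T_2 ... T_j, applied to a point in the last frame."""
    A = np.eye(4)
    for theta, (d, a, alpha) in zip(q, dh_params):
        A = A @ dh_transform(theta, d, a, alpha)   # A_j = A_{j-1} T_j
    return (A @ np.append(local_point, 1.0))[:3]

# Hypothetical 3-DOF chain with 0.3 m links
dh = [(0.0, 0.3, 0.0)] * 3                         # (d, a, alpha) per joint
print(global_point([0.1, 0.2, -0.1], dh, np.zeros(3)))
```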
3 Problem Formulation

The formulations of the different aspects of this study are covered in this section. The general problem is to find the support reaction forces for a given posture q and given subject-specific BSIPs. First, the estimation methods for the BSIPs are briefly introduced. Then the joint torques and the ZMP are discussed, and finally the SRF prediction models are presented.

3.1 Subject-Specific Body Segment Inertial Properties
The body segment inertial properties of each segment need to be accurately calculated in order to predict or calculate accurate support reaction forces. The method in this study is based on the studies of Dumas et al. (2006) and de Leva (1996). Fig. 1 and Table 1 show the data collected from these studies (the moments of inertia are not shown, as only static postures are considered in this study). For segments of the body such as the thigh, shank, forearm, and upper arm, the data from these studies can be used directly. However, some of the data could not be used directly because certain parts of the body were lumped together. For example, it can be seen in Fig. 1 that the torso includes all spinal segments. The BSIP data for these segments need further decomposition, due to the nature of the human model used for simulation.
Fig. 1. Center of mass percentages (de Leva, 1996)
This decomposition is accomplished by transforming the 50th-percentile model into a subject-specific model. To illustrate this transformation, the decomposition of the torso is presented. First, the mass percentage within the torso of each individual torso segment, $\%m_i^{torso}$, must be calculated:
$$\%m_i^{torso} = m_i \Big/ \sum_{i=1}^{9} m_i \qquad (2)$$

where $m_i$ ($i = 1, 2, \ldots, 9$) is the individual mass of each segment in the torso. Next, the mass percentage of the total body mass of each individual segment in the torso, $\%m_i^{total}$, must be found. This is calculated using the data in Table 1:

$$\%m_i^{total} = \%m_i^{torso} \times 33.3\% \qquad (3)$$
Table 1. Segment mass

Body Segment     Mass (%)
Head and Neck      6.7
Torso             33.3
Arm                2.4
Forearm            1.7
Hand               0.6
Pelvis            14.2
Thigh             12.3
Shank              4.8
Foot               1.2
Now the mass of each individual torso segment of any subject can be calculated from the subject's body mass, $m_{subject}$, recorded during the experiment:

$$m_i = m_{subject} \times \%m_i^{total} \qquad (4)$$
The center of mass of each segment was estimated to be at 50% of the segment length. This brought the estimated center of mass of the whole torso very close to that found in Fig. 1.
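The decomposition of Eqs. (2)-(4) is straightforward to script. The sketch below is a minimal illustration under assumed placeholder values for the nine 50th-percentile torso segment masses; it is not the authors' code.

```python
# Torso-mass decomposition of Eqs. (2)-(4); the nine 50th-percentile
# torso segment masses are hypothetical placeholders (kg).
m_torso_50th = [4.0, 3.5, 3.0, 2.5, 2.5, 2.0, 2.0, 1.5, 1.0]

def subject_torso_masses(m_subject, m_segments_50th, torso_pct=0.333):
    total = sum(m_segments_50th)
    masses = []
    for m_i in m_segments_50th:
        pct_torso = m_i / total               # Eq. (2)
        pct_total = pct_torso * torso_pct     # Eq. (3): torso = 33.3% of body
        masses.append(m_subject * pct_total)  # Eq. (4)
    return masses

masses = subject_torso_masses(70.32, m_torso_50th)
print(sum(masses))  # = 0.333 * 70.32, the subject's whole-torso mass
```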
3.2 Joint Torque and ZMP

Joint torques are obtained through the backward recursive dynamics formulation (Yang et al., 2009). The ZMP is defined as the point on the ground at which the tipping moments due to the reaction forces are zero. The reaction forces at the ZMP, $F^{ZMP}$ and $M^{ZMP}$, can then be calculated (Yang et al., 2009). It is important to note that $P^{ZMP}$ should coincide with the center of pressure (CoP) (Sardain and Bessonnet, 2004). Since the postures are static, the CoP/ZMP should be the projection of the body's center of mass onto the ground or support plane. Because the SRF distribution is based on the ZMP position, the human models, specifically the BSIPs and body segment lengths, are required to be subject-specific.
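As a rough illustration of the static case (not the authors' code), the sketch below projects a hypothetical whole-body center of mass onto flat ground to obtain the ZMP/CoP; all segment data are made-up placeholders, with y taken as the vertical axis.

```python
import numpy as np

# Hypothetical segment masses (kg) and global per-segment CoM positions (m)
seg_mass = np.array([10.0, 35.0, 25.0])
seg_com = np.array([[0.05, 0.90, 0.00],
                    [0.00, 1.10, 0.10],
                    [0.02, 0.50, 0.00]])

com = (seg_mass[:, None] * seg_com).sum(axis=0) / seg_mass.sum()
P_zmp = np.array([com[0], 0.0, com[2]])              # CoM dropped onto y = 0
F_zmp = np.array([0.0, seg_mass.sum() * 9.81, 0.0])  # static vertical reaction
print(P_zmp, F_zmp)
```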
3.3 Support Reaction Forces

The support reaction force prediction model is formulated in this section. Generally, it is a linear distribution based on the distances between the application points of the SRFs and $P^{ZMP}$, Newton's first law, and the angles of the support planes. The distribution model can be organized into two cases: the standing case (two-force distribution) and the seated case (four-force distribution).
Before the actual prediction model is presented, the application of Newton's first law is explained. Fig. 2 is a schematic showing the distribution of $F^{ZMP}$ and $M^{ZMP}$ to the two feet ($F^R$, $F^L$, $M^R$, and $M^L$), where $D^R$ and $D^L$ are the distance vectors from the ZMP to the application points $P^R$ and $P^L$:

$$F_y^R = \frac{D_x^L + D_z^L}{D_x^R + D_z^R + D_x^L + D_z^L} F_y^{ZMP}; \quad F_y^L = \frac{D_x^R + D_z^R}{D_x^R + D_z^R + D_x^L + D_z^L} F_y^{ZMP} \qquad (5)$$

Eq. (5) shows the model only for one component of a general two-force case, but the method applies equally to the other two components. This model provides the basis for all the prediction models.
Fig. 2. General schematic for two-SRF distribution
(1) General Standing Case
The general standing case is a two-force distribution between the right and left feet. The distribution is compensated by the squared cosines of the support plane angles:

$$F_x^R = \frac{(D_y^L + D_z^L)\cos^2\alpha}{(D_y^R + D_z^R)\cos^2\beta + (D_y^L + D_z^L)\cos^2\alpha} F_x^{ZMP}, \quad F_y^R = \frac{(D_x^L + D_z^L)\cos^2\alpha}{(D_x^R + D_z^R)\cos^2\beta + (D_x^L + D_z^L)\cos^2\alpha} F_y^{ZMP}, \quad F_z^R = \frac{(D_x^L + D_y^L)\cos^2\alpha}{(D_x^R + D_y^R)\cos^2\beta + (D_x^L + D_y^L)\cos^2\alpha} F_z^{ZMP} \qquad (6)$$

$$F_x^L = \frac{(D_y^R + D_z^R)\cos^2\beta}{(D_y^R + D_z^R)\cos^2\beta + (D_y^L + D_z^L)\cos^2\alpha} F_x^{ZMP}, \quad F_y^L = \frac{(D_x^R + D_z^R)\cos^2\beta}{(D_x^R + D_z^R)\cos^2\beta + (D_x^L + D_z^L)\cos^2\alpha} F_y^{ZMP}, \quad F_z^L = \frac{(D_x^R + D_y^R)\cos^2\beta}{(D_x^R + D_y^R)\cos^2\beta + (D_x^L + D_y^L)\cos^2\alpha} F_z^{ZMP} \qquad (7)$$
Here α is the angle of the right foot support plane and β is the angle of the left foot support plane, as shown in Fig. 3. The moment vectors of the SRFs are then calculated using the same distribution scheme; however, the distribution of the moments also requires the parallel axis theorem:
$$M_x^R = \frac{(D_y^L + D_z^L)\cos^2\alpha}{(D_y^R + D_z^R)\cos^2\beta + (D_y^L + D_z^L)\cos^2\alpha} M_x^{ZMP} + (\mathbf{D}^R \times \mathbf{F}^R) \cdot \vec{i}, \quad M_y^R = \frac{(D_x^L + D_z^L)\cos^2\alpha}{(D_x^R + D_z^R)\cos^2\beta + (D_x^L + D_z^L)\cos^2\alpha} M_y^{ZMP} + (\mathbf{D}^R \times \mathbf{F}^R) \cdot \vec{j}, \quad M_z^R = \frac{(D_x^L + D_y^L)\cos^2\alpha}{(D_x^R + D_y^R)\cos^2\beta + (D_x^L + D_y^L)\cos^2\alpha} M_z^{ZMP} + (\mathbf{D}^R \times \mathbf{F}^R) \cdot \vec{k} \qquad (8)$$
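A compact way to read Eqs. (5)-(7) is that each force component at one foot is weighted by the other foot's two complementary distance components, scaled by the squared cosine of the support plane angle. The following sketch (illustrative only; distances and loads are hypothetical) implements the force part; with α = β = 0 it reduces to the flat-ground model of Eq. (5).

```python
import numpy as np

def standing_forces(F_zmp, D_R, D_L, alpha, beta):
    """Linear SRF distribution of Eqs. (6)-(7). D_R, D_L hold the absolute
    distance components from the ZMP to the right/left application points."""
    ca2, cb2 = np.cos(alpha) ** 2, np.cos(beta) ** 2
    F_R, F_L = np.zeros(3), np.zeros(3)
    for i in range(3):
        j, k = [c for c in range(3) if c != i]      # the other two axes
        w_R = (D_L[j] + D_L[k]) * ca2               # right-foot weight (Eq. 6)
        w_L = (D_R[j] + D_R[k]) * cb2               # left-foot weight (Eq. 7)
        F_R[i] = w_R / (w_R + w_L) * F_zmp[i]
        F_L[i] = w_L / (w_R + w_L) * F_zmp[i]
    return F_R, F_L

FR, FL = standing_forces(np.array([0.0, 690.8, 0.0]),
                         np.array([0.10, 0.0, 0.05]),
                         np.array([0.20, 0.0, 0.05]),
                         alpha=0.0, beta=np.radians(15.0))
print(FR, FL)   # vertical load split between right and left feet
```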
Fig. 3. Visual representation of α and β
(2) General Seated Case
The seated case is more complicated because there are four points of contact to which the SRFs have to be distributed. This study considers only a seat pan parallel to the floor, so the cosine functions used in the distribution reduce to 1 since the angles are zero. The general template derived in Eq. (5) can still be followed; however, another step must be added. Essentially, two 2-force distribution models are used in series. First, the loads at the ZMP are distributed between the seat pan and the floor. Fig. 4 shows the locations of the points of distribution ($P^{SP}$ and $P^F$) for the seat pan and the floor, where $D^T$ and $D^F$ are the vectors from the ZMP to the thigh and foot application points, respectively. $D^{RT}$, $D^{LT}$, $D^{RF}$, and $D^{LF}$ are the distance vectors between the intermediate application points and the end application points; these vectors can also be seen in Fig. 4. The equations for the distribution are as follows:

$$F_x^T = \frac{D_y^F + D_z^F}{(D_y^T + D_z^T) + (D_y^F + D_z^F)} F_x^{ZMP}, \quad F_y^T = \frac{D_x^F + D_z^F}{(D_x^T + D_z^T) + (D_x^F + D_z^F)} F_y^{ZMP}, \quad F_z^T = \frac{D_x^F + D_y^F}{(D_x^T + D_y^T) + (D_x^F + D_y^F)} F_z^{ZMP} \qquad (9)$$

$$F_x^F = \frac{D_y^T + D_z^T}{(D_y^T + D_z^T) + (D_y^F + D_z^F)} F_x^{ZMP}, \quad F_y^F = \frac{D_x^T + D_z^T}{(D_x^T + D_z^T) + (D_x^F + D_z^F)} F_y^{ZMP}, \quad F_z^F = \frac{D_x^T + D_y^T}{(D_x^T + D_y^T) + (D_x^F + D_y^F)} F_z^{ZMP} \qquad (10)$$

Once the loads are distributed to $P^{SP}$ and $P^F$, they can be distributed again to the points of contact ($P^{RT}$, $P^{LT}$, $P^{RF}$, and $P^{LF}$). This is accomplished through another two-force distribution:

$$F_x^{RT} = \frac{D_y^{LT} + D_z^{LT}}{(D_y^{RT} + D_z^{RT}) + (D_y^{LT} + D_z^{LT})} F_x^T, \quad F_y^{RT} = \frac{D_x^{LT} + D_z^{LT}}{(D_x^{RT} + D_z^{RT}) + (D_x^{LT} + D_z^{LT})} F_y^T, \quad F_z^{RT} = \frac{D_x^{LT} + D_y^{LT}}{(D_x^{RT} + D_y^{RT}) + (D_x^{LT} + D_y^{LT})} F_z^T \qquad (11)$$

$$F_x^{RF} = \frac{D_y^{LF} + D_z^{LF}}{(D_y^{RF} + D_z^{RF}) + (D_y^{LF} + D_z^{LF})} F_x^F, \quad F_y^{RF} = \frac{D_x^{LF} + D_z^{LF}}{(D_x^{RF} + D_z^{RF}) + (D_x^{LF} + D_z^{LF})} F_y^F, \quad F_z^{RF} = \frac{D_x^{LF} + D_y^{LF}}{(D_x^{RF} + D_y^{RF}) + (D_x^{LF} + D_y^{LF})} F_z^F \qquad (12)$$

Substituting equations (9) and (10) into (11) and (12), respectively, gives the full distribution prediction model.
$\mathrm{Rate}_{\vec{n}_i}^{P}$ will be used as the template for the intermediate distribution ratios for clarity, where $P$ denotes the application point and $\vec{n}_i$ ($i = 1, 2, 3$) denotes the global x, y, or z unit vector, respectively. The full general prediction models are as follows:

$$F_{\vec{n}_i}^{RT} = \mathrm{Rate}_{\vec{n}_i}^{RT}\,\mathrm{Rate}_{\vec{n}_i}^{SP}\,F_{\vec{n}_i}^{ZMP}, \quad F_{\vec{n}_i}^{LT} = \mathrm{Rate}_{\vec{n}_i}^{LT}\,\mathrm{Rate}_{\vec{n}_i}^{SP}\,F_{\vec{n}_i}^{ZMP} \qquad (13)$$

$$F_{\vec{n}_i}^{RF} = \mathrm{Rate}_{\vec{n}_i}^{RF}\,\mathrm{Rate}_{\vec{n}_i}^{F}\,F_{\vec{n}_i}^{ZMP}, \quad F_{\vec{n}_i}^{LF} = \mathrm{Rate}_{\vec{n}_i}^{LF}\,\mathrm{Rate}_{\vec{n}_i}^{F}\,F_{\vec{n}_i}^{ZMP} \qquad (14)$$
The moment vectors of the SRFs follow the same model but include the parallel axis theorem:

$$M_{\vec{n}_i}^{RT} = \mathrm{Rate}_{\vec{n}_i}^{RT}\,\mathrm{Rate}_{\vec{n}_i}^{SP}\,M_{\vec{n}_i}^{ZMP} + [(\mathbf{D}^{T} + \mathbf{D}^{RT}) \times \mathbf{F}^{RT}] \cdot \vec{n}_i, \quad M_{\vec{n}_i}^{LT} = \mathrm{Rate}_{\vec{n}_i}^{LT}\,\mathrm{Rate}_{\vec{n}_i}^{SP}\,M_{\vec{n}_i}^{ZMP} + [(\mathbf{D}^{T} + \mathbf{D}^{LT}) \times \mathbf{F}^{LT}] \cdot \vec{n}_i \qquad (15)$$

$$M_{\vec{n}_i}^{RF} = \mathrm{Rate}_{\vec{n}_i}^{RF}\,\mathrm{Rate}_{\vec{n}_i}^{F}\,M_{\vec{n}_i}^{ZMP} + [(\mathbf{D}^{F} + \mathbf{D}^{RF}) \times \mathbf{F}^{RF}] \cdot \vec{n}_i, \quad M_{\vec{n}_i}^{LF} = \mathrm{Rate}_{\vec{n}_i}^{LF}\,\mathrm{Rate}_{\vec{n}_i}^{F}\,M_{\vec{n}_i}^{ZMP} + [(\mathbf{D}^{F} + \mathbf{D}^{LF}) \times \mathbf{F}^{LF}] \cdot \vec{n}_i \qquad (16)$$
Note that when the distance vectors are used in the distribution model, only the magnitudes of their components are used and the direction is ignored, for numerical simplicity.
Fig. 4. General schematic for the seated case
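Because Eqs. (13)-(14) chain two 2-force splits, a small helper that returns component-wise distribution ratios is enough to sketch the seated model. The distances below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def two_force_rates(D_a, D_b):
    """Component-wise ratios for a generic two-force split (the Eq. (5)
    template): point a is weighted by point b's distances and vice versa."""
    rate_a, rate_b = np.zeros(3), np.zeros(3)
    for i in range(3):
        j, k = [c for c in range(3) if c != i]
        w_a, w_b = D_b[j] + D_b[k], D_a[j] + D_a[k]
        rate_a[i], rate_b[i] = w_a / (w_a + w_b), w_b / (w_a + w_b)
    return rate_a, rate_b

F_zmp = np.array([0.0, 690.8, 0.0])                    # N, static vertical load
# First split: ZMP load between seat pan (thigh point) and floor (foot point)
rate_T, rate_F = two_force_rates(np.array([0.05, 0.00, 0.02]),
                                 np.array([0.25, 0.40, 0.02]))
# Second split: seat-pan load between right and left thigh contact points
rate_RT, rate_LT = two_force_rates(np.array([0.00, 0.05, 0.10]),
                                   np.array([0.00, 0.05, 0.10]))
F_RT = rate_RT * rate_T * F_zmp                        # Eq. (13), right thigh
F_LT = rate_LT * rate_T * F_zmp                        # Eq. (13), left thigh
print(F_RT, F_LT)
```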
4 Simulation Examples

The results of the simulations are presented in this section. Four different tasks were applied to two different postural orientation cases: standing with each foot on an angled plane (uneven terrain), and seated on a ground-parallel seat pan with the feet on the ground.

(1) Standing Case
The support plane angles α and β were selected to be 0° and 15°, respectively, for the standing case. The results for the four tasks are now presented. The SRFs predicted for each of the tasks in Fig. 5 can be found in Table 2.
Fig. 5. Visualization of the predicted postures for each task: (a) Task 1, (b) Task 2, (c) Task 3, (d) Task 4

Table 2. SRFs for all standing tasks

                         Force Vector (N)          Moment Vector (N-m)
Task     SRF             x    y         z          x          y    z
Task 1   Right Foot      0    552.005   0          16.456     0    32.989
         Left Foot       0    138.841   0          4.06614    0    -43.423
Task 2   Right Foot      0    485.278   0          8.716      0    59.1513
         Left Foot       0    205.569   0          12.67      0    -52.519
Task 3   Right Foot      0    122.59    0          -2.86      0    39.119
         Left Foot       0    568.255   0          -1.838     0    -38.489
Task 4   Right Foot      0    219.69    0          -0.43353   0    60.275
         Left Foot       0    471.15    0          14.501     0    -43.8107
4.1 Seated Case
The seated case also had four tasks, but the targets were repositioned so that they were reachable from a seated position. There is only one postural orientation for the seated case. The SRFs for all four seated tasks are shown in Table 3.

Table 3. SRFs for all four tasks in the seated case

                         Force Vector (N)           Moment Vector (N-m)
Task     SRF             x    y          z          x          y    z
Task 1   Right Thigh     0    333.0565   0          -42.1194   0    20.049
         Left Thigh      0    134.9282   0          -15.23     0    -24.578
         Right Foot      0    155.233    0          36.79      0    26.896
         Left Foot       0    67.629     0          17.84      0    -25.09
Task 2   Right Thigh     0    295.89     0          -34.22     0    28.002
         Left Thigh      0    192.187    0          -19.72     0    -30.87
         Right Foot      0    131.378    0          32.45      0    25.501
         Left Foot       0    71.393     0          17.515     0    -25.384
Task 3   Right Thigh     0    33.675     0          -4.056     0    7.432
         Left Thigh      0    420.242    0          -54.436    0    -7.739
         Right Foot      0    58.429     0          14.5       0    24.858
         Left Foot       0    178.5      0          42.91      0    -23.468
Task 4   Right Thigh     0    28.41      0          -2.524     0    6.447
         Left Thigh      0    430.22     0          -53.4322   0    -7.45
         Right Foot      0    41.772     0          10.246     0    19.099
         Left Foot       0    190.446    0          45.866     0    -18.251
4.2 Discussion
The SRFs predicted from the simulations can all be considered logical or realistic. They follow the general trends that would be expected from the postural orientations
that were required to complete the tasks. The ZMP positions for the tasks are not shown, even though the force distribution is a direct reflection of the ZMP value. These data were not presented because of the nature of the given postures. The given postures were found through posture reconstruction of motion capture data. The locations of the feet, though in reality identical for all tasks, were calculated to be different in the reconstruction because the reconstruction assumes that the hip point is at the origin. This renders the comparison of ZMP locations between tasks ineffectual, as it does not reflect the differences in postural stability or SRF distribution across the postural orientations.

One important aspect to note is the relationship between the body segment inertia properties and the calculated support reaction forces. For any given trial, the summation of the vertical components of the SRF force vectors gives a total force of 690.846 N. Dividing by the acceleration of gravity, the total mass of the DHM is calculated to be 70.42 kg. When the subject was weighed before the experiment, he registered 155 lbs, or 70.32 kg. This means that the body mass of the subject was estimated to within 0.142% of the actual body mass, an error that is certainly acceptable.
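The mass check in the preceding paragraph can be reproduced in a few lines (a sketch; g = 9.81 m/s² is assumed):

```python
g = 9.81                               # m/s^2, assumed
m_model = 690.846 / g                  # 70.42 kg from summed vertical SRFs
m_measured = 70.32                     # kg (155 lb), weighed before the test
print(100 * abs(m_model - m_measured) / m_model)   # ~0.15 %, cf. the 0.142 % reported
```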
5 Conclusions and Future Work

This paper presented a method for predicting support reaction forces from a known posture obtained from motion capture experimentation. Also presented was a generic method for determining the body segment inertia properties of the specific subjects used in experimentation. Based on these methods, motion capture experiments and simulations were run in order to test the methods presented. The predicted SRFs followed the general trends expected from the capture and provided realistic values. The body segment inertial property estimation provided a body mass with 0.142% error. Validation of the SRF prediction model is still needed to evaluate the accuracy of the predicted loads.

Overall, the proposed methodology is sound and provides the ability to predict SRFs for general cases. As mentioned in the discussion portion of this paper, it is important to define accurate locations of application for the reaction forces. The addition of a pressure mapping system in the validation of the model presented in this paper would be very useful.

Acknowledgments. This research work was partly supported by the National Science Foundation (Award # 0926549).
References

1. Denavit, J., Hartenberg, R.S.: A Kinematic Notation for Lower-Pair Mechanisms Based on Matrices. Journal of Applied Mechanics 22, 215–221 (1955)
2. Fujimaki, G., Mitsuya, R.: Study of the Seated Posture for VDT Work. Displays 23(1-2), 17–24 (2002)
3. Goswami, A.: Postural Stability of Biped Robots and the Foot Rotation Indicator (FRI) Point. International Journal of Robotics Research 18(6) (1999)
4. Kim, J., Yang, J., Abdel-Malek, K.: Multi-objective Optimization Approach for Predicting Seated Posture Considering Balance. International Journal of Vehicle Design 51(3/4), 278–291 (2009)
5. Sardain, P., Bessonnet, G.: Forces Acting on a Biped Robot. Center of Pressure–Zero Moment Point. IEEE Transactions on Systems, Man and Cybernetics Part A 34(5), 630–637 (2004)
6. Vukobratović, M., Borovac, B.: Zero-moment Point–Thirty Five Years of its Life. International Journal of Humanoid Robotics 1(1), 157–173 (2004)
7. Yang, J., Xiang, Y., Kim, J.: Determining the Static Joint Torques of Digital Human Models Considering Balance. In: ASME 2009 International Design Engineering Technical Conference, San Diego, California (2009)
8. Hatze, H.: A Mathematical Model for the Computational Determination of Parameter Values of Anthropomorphic Segments. Journal of Biomechanics 13(10), 833–843 (1980)
9. Yang, J., Marler, T., Kim, H.J., Arora, J., Abdel-Malek, K.: Multi-Objective Optimization for Upper Body Posture Prediction. In: 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Albany, New York (2004)
10. Zatsiorsky, V., Seluyanov, V., Chugunova, L.G.: Methods of Determining Mass-Inertial Characteristics of Human Body Segments. In: Chernyi, G.G., Regirer, S.A. (eds.) Contemporary Problems of Biomechanics, pp. 272–291. CRC Press, Boston (1990)
11. Dempster, W.T.: Space Requirements for the Seated Operator. WADC Technical Report TR-55-159, Wright Air Development Center, Wright–Patterson Air Force Base, Dayton, Ohio (1955)
12. de Leva, P.: Adjustments to Zatsiorsky–Seluyanov's Segment Inertia Parameters. Journal of Biomechanics 29, 1223–1230 (1996)
13. McConville, J.T., Churchill, T.D., Kaleps, I., Clauser, C.E., Cuzzi, J.: Anthropometric Relationships of Body and Body Segment Moments of Inertia. Technical Report AFAMRL-TR-80-119, Aerospace Medical Research Laboratory, Wright–Patterson Air Force Base, Dayton, Ohio (1980)
14. Young, J.W., Chandler, R.F., Snow, C.C., Robinette, K.M., Zehner, G.F., Lofberg, M.S.: Anthropometric and Mass Distribution Characteristics of Adult Females. Technical Report FA-AM-83-16, FAA Civil Aeromedical Institute, Oklahoma City, Oklahoma (1983)
15. Nikolova, G., Toshev, Y.: Estimation of Male and Female Body Segment Parameters of the Bulgarian Population Using a 16-segmental Mathematical Model. Journal of Biomechanics 40, 3700–3707 (2007)
16. Nikolova, G.: Anthropometric Measurements and Model Evaluation of Mass-Inertial Parameters of the Human Upper and Lower Extremities. In: IFMBE Proceedings, vol. 29, pp. 574–577 (2010)
17. Dumas, R., Cheze, L., Verriest, J.P.: Adjustments to McConville et al. and Young et al. Body Segment Inertial Parameters. Journal of Biomechanics 40, 543–553 (2006)
18. Dumas, R., Cheze, L., Verriest, J.P.: Corrigendum to "Adjustments to McConville et al. and Young et al. Body Segment Inertial Parameters" [J. Biomech. 40 (2007) 543–553]. Journal of Biomechanics 40, 1651–1652 (2007)
19. Gragg, J., Yang, J.: Posture Reconstruction for Digital Human Models - Methodology Study. Ergonomics (2010) (submitted)
Schema for Motion Capture Data Management Ali Keyvani1,2, Henrik Johansson1, Mikael Ericsson1, Dan Lämkull3, and Roland Örtengren4 1
University West, Department of Engineering Science, 46186 Trollhättan, Sweden 2 Innovatum AB, Production Technology Centre, 46129 Trollhättan, Sweden 3 Volvo Car Corporation, Dept. 81022/PVÖS32, SE-405 31 Göteborg, Sweden 4 Chalmers University of Technology, Department of Product and Production Development, SE-412 96 Göteborg, Sweden {Ali.Keyvani,Mikael.Ericsson,Henrik.Johansson}@hv.se
[email protected],
[email protected]
Abstract. A unified database platform capable of storing both motion captured data and information about these motions (metadata) is described. The platform stores large amounts of motion capture data to be used by different applications for searching, comparing, analyzing, and updating existing motions. The platform is intended to be used to choose realistic motions in simulations of production lines. It is capable of supporting and handling different motion formats, various skeleton types, and distinct body regions in a uniform data model. An extended annotation system is also introduced to mark the captured data not only in the time domain (temporal) but also on different body regions (spatial). To demonstrate the platform's functionality, sample tests were performed. Several motion capture recordings were uploaded to the database, while MATLAB was used to access the data, ergonomically analyze the motions based on the OWAS standard, and add the results to the database by automatic tagging of the postures.

Keywords: Motion Capture Database, Virtual Production Systems, Digital Human Modeling, Computerized Ergonomic Analysis.
1 Introduction

Many production systems face challenges because of costly changes imposed by late modifications in the product design or production line. When there is a need to simulate such production systems, one challenge is to integrate human behavior together with the other production elements in the simulation environment. Digital Human Modeling (DHM) tools address this problem by integrating human motions into the simulation platform. Process planners often use these DHM tools to simulate different work postures for ergonomic analysis or to investigate possible hand reach and line of sight. The advantage of DHM tools is the ability to make ergonomic analyses of work environments to prevent musculoskeletal disorders and thereby increase quality and productivity [1]. However, generating natural human motions is not an easy task. Redundancy, caused by additional degrees of freedom, restrains conventional mathematical
solutions. Motion-solving algorithms based on real motion captured data have been identified as one promising solution. This paper describes how database techniques can be used to facilitate the usage of motion capture (Mocap) data within DHM tools. The database techniques provide a framework for handling motion data, not only for DHM tools but for every tool that uses motion data. The outputs from Mocap systems are generally stored in datasets that include the motion data and information about the captured data. In section 4, a general workflow is presented showing how datasets are used to populate the database. Annotation methods are also introduced to describe information about the motions. The experiments in section 5 verify that the database can handle data according to the desired requirements.
2 Background

The need to use manikins and DHM tools in production processes often comes from big companies such as car manufacturers. As a result, many of today's process simulation vendors have integrated DHM tools into their software, such as Jack from Siemens PLM Software [2]. One of the first applications of DHM was to analyze whether it is possible to reach a specific location or control [3]. Today, DHM models are commonly used during the design of vehicles. Common industrial applications are to model the interaction between driver, passengers, and the interior of a car [4-6]. Another example of an industrial task is to analyze whether a human can reach a specified hand position in an assembly operation in a digital simulation of a factory [7, 8]. A large sector of DHM outside industrial usage is entertainment, such as games and movies [9].

The simulation of human motions and postures in the production line is often done using mouse and keyboard. Whether a posture of the manikin is natural is often judged by a production engineer via visualization and assumptions [10]. However, manual manipulation of manikins is both time consuming and not always reliable. To help end-users generate postures, several techniques have been introduced to predict realistic postures. Some of them are based on pure mathematical methods, normally referred to as inverse kinematics methods [11]. Optimization tools are also used in order to minimize joint stresses and predict a 2D posture based on these [12, 13]. Furthermore, other methods try to use real human motions as a reference to predict postures.

A common need in all the techniques that use real human motions is to have structured databases of recorded human motions. These databases shall include the changes in body joints over time and information about the characteristics of the motions (annotations). Mocap is a common way to collect real human motion data. Several different commercial Mocap systems are available today and can be categorized in three main groups: optical systems, magnetic systems, and mechanical systems. Other methods exist, such as ultrasonic and inertial systems [14]. As a result, considerable work has been done to introduce efficient techniques for collecting and manipulating human Mocap data. These techniques have been divided into two main approaches: database techniques [15], and annotation of motion sequences [16]. Furthermore, to generate a posture from real human data, different methods have been suggested. Examples are: using statistical models [17], treating recorded motions as a manikin memory [18],
warping an existing motion by adjusting the motion graphs [19], synthesizing recorded data with the help of annotations [20], and mixing recorded motions with inverse kinematic techniques [21, 22].
3 Problem Description

Mocap data is recorded using a human subject performing a specified task. The subject usually wears a special suit with markers, and the position of each marker is tracked by a Mocap system. It is both time consuming and expensive to use a Mocap system to acquire and record a new motion for each specific task. Recorded Mocap data from different sources can normally not be reused, for a number of reasons. One reason is that different file formats are used to save the Mocap data due to the different types of systems. Another obstacle to reusing Mocap data is the unique nature of each motion, which in turn is sensitive to changes in the skeleton configuration. Depending on the application, different skeleton configurations are used. Variations in these skeleton configurations include, e.g., the number of segments, the joint configuration, and the length of individual segments. Fig. 1 shows an example of two skeletons with different numbers of segments. A third reason is the variation in naming conventions due to a lack of standards. Each research group can have its own naming conventions both for segment parts and/or for tagging the performed action, for example, the use of trunk/chest in naming segments and walk/pace/stride in naming an action.
Fig. 1. Two different types of human skeleton, where lines and dots represent the segments and the joints
Generally, Mocap data are collected in datasets, which are usually folders named with a term describing the action and populated with Mocap files. One problem with large datasets of motions is the limited possibility to search within Mocap files. One possibility to facilitate search is to integrate the Mocap data in a database. A database is a system that provides means to manage, store, search, retrieve, and process large amounts of structured data [23]. Such a system should also allow the user to combine several existing motions and store the result as a new entity.

In this article, a unified database platform is described that is capable of storing both Mocap data and information about these motions (metadata). It stores Mocap data that can be used by different applications for searching, comparing, analyzing, and updating existing motions. A new annotating system is proposed to mark the captured
data. Compared to existing methods [1], the proposed technique applies annotations both on the time domain (temporal) and on the different body regions (spatial). The advantage of this annotation method is that it will improve the search functionality by providing better-segmented search criteria.
4 Database Architecture

The main concept of the proposed platform is to provide a generic workspace that can manipulate Mocap data in an organized way. It also provides efficient ways for other applications, especially industrial virtual production/ergonomic tools, to access the data, analyze them, and feed the results back to the platform. Later on, these data will be the source for DHM tools to find and generate realistic and valid motions. For this purpose, a standard database approach with a three-tier (Data, Query, and Application) architecture was implemented [23]; see Fig. 2. The inner layer, called the data layer, keeps the data within tables. The middle layer, called the query layer, is the link between the inner and outer layers. It handles the queries generated from the outer layer and communicates with the data layer. The query layer also ensures that the data are consistent and well defined by means of relationships, attributes, and rules. In fact, an accurate presentation of this layer, called the data schema, is one of the contributions of this paper. The outer layer, called the application layer, is the interface communicating with the user. It can be any application that needs to communicate with the data layer. The data can travel in both directions (publish and update, or search and retrieve) as long as the applications talk to the query layer in the standard language.

One benefit of such an architecture is that the application layer can access the data regardless of the application type. This means that many applications can communicate with the data core using a standard query builder (SQL in this case) without worrying about how the data is handled within the system. Furthermore, different functions such as visualization, analysis, updates, and data combination can be performed by different applications simultaneously. Preparations are needed to populate the database with motion files: each Mocap file must be interpreted and parsed into the database once, according to a specific workflow. To accomplish this, two separate procedures are discussed in the rest of the paper, data storage and data retrieval. Before going into these details, the different data types to be handled by the database are presented in the next subsection.

4.1 Data Classes/Objects Definition

In this section, the implementation of the data schema is presented. The target of the schema is to satisfy a number of key requirements for a Mocap database to be efficiently utilized. These key requirements are:

• Support of skeleton variations
• Support of customized body segmentation and partial (upper body, hands, etc.) motion description
• Support of two domains (spatial and temporal) for annotating
• Support of a categorized annotation system
Fig. 2. Data, Query and Application layers in a three-tier architecture
The data schema is based on several data objects (D.O.s), described as follows:

Marker D.O.: Markers (in optical Mocap) are reflective objects attached to the subject's (actor's) body, and are the targets for the Mocap device to track accurately. To define a motion from the Mocap output, one needs to map the marker data to the joint centroids of the subject's body. This means that, first, there should be a skeleton (hierarchy) defining how the body segments are connected to each other, and second, there should be a table defining the lengths of these segments for each subject (person) performing the motion. Finally, the relative movements of these segments shall be defined by means of changes in the joint values (rotations). Accordingly, three different D.O.s are needed in the schema to precisely define a motion independent of the motion subject.

Skeleton D.O.: As mentioned, the hierarchy of the body segments shall be defined for a specific motion in the Skeleton D.O. Notice that, depending on the application, different skeletons may be defined. This type of D.O. defines a hierarchy tree of parent-child segments connected to each other.

Actors D.O.: Assuming an identical skeleton hierarchy, different subjects can be distinguished by assigning their specific segment lengths to the same skeleton type. An Actors D.O. contains these data in addition to other general data about the motion subject, such as gender, weight, height, etc.

Joint Value D.O.: Having a skeleton and the subject's segment length data, a motion can be sufficiently defined by the relative rotational values in each joint over time. Generally, three degrees of freedom are needed to define a generic rotation, and therefore Joint Value D.O.s are pieces of data containing the joint name and three rotation values at a specific time.

BodyZone D.O.: In addition to the above, motions are not necessarily performed full-body, and retrieval of full-body motions is not always needed. Therefore, there should be a way to define a motion partially, based on different body zones. The BodyZone D.O. defines how a number of joints can be grouped to form a zone. Moreover, it is possible to form a new body zone by combining existing body zones (Both_Hands = Right_Hand + Left_Hand).
104
A. Keyvani et al.
Annotation D.O.: The proposed data model is able to annotate motions both temporally (frame range) and spatially (body zones). Therefore, an Annotation D.O. contains information about the type of annotation, the tag name, the target motion object, the frame range, and the applicable body zone.

4.2 Data Storage Workflow

In order to make the Mocap data generic and reusable, several actions need to be performed while they are being populated into the database (Fig. 3). When receiving a Mocap file, a new object called a Basic_Motion D.O. is created. This object is the core object in the database, placed in a table called "Basic_Motion". This table consists of mandatory and optional attribute fields, which primarily describe the motion, and of relational fields, which connect the Basic_Motion object to other tables. Examples of mandatory attribute fields are the unique motion ID (primary key), the number of frames, and the data format. Examples of optional fields are the frame rate, a primary motion description, the date, and the number of cameras. Examples of mandatory relational fields are a relation key to the "Skeleton_Type" table and a relation key to the "Body_Zone" table. An example of an optional relation is the relation key to the "Actors" table, which shows who was the primary subject of the motion. Notice that the schema assumes that motions can be considered independent of the subject and can be applied to a new subject by changing the reference key in the segment data table. As a result, this is an optional rather than a mandatory relation when defining the Basic_Motion object.
Fig. 3. Data Storage Workflow
For the next step, skeleton data, subject data (segment length) and joint data are separated, analyzed and stored in the database. Most of the current commercial
Mocap file formats (.htr by Motion Analysis, .asf/.amc by Acclaim) include these data. The "htr" format keeps the hierarchy, segment lengths, and joint values in one file, while "asf/amc" keeps the skeleton information (.asf) and the joint data (.amc) separately. As mentioned before, there can be different skeleton definitions, and a motion can be performed by many subjects. When receiving a new motion, the skeleton hierarchy is checked to see whether it exists in the database. In the case of a new skeleton, the database is populated by adding the skeleton and mapping the previously defined body zones to it. For example, if a body zone called Right_Hand was defined for previous skeletons, the definition shall be updated by assigning which joints from the new skeleton are also members of this body zone. Next, the Basic_Motion D.O. is updated with the correct skeleton type and the assigned body zone (notice that not all motions are full-body motions). A similar scenario is executed for a new subject definition. Finally, the joint values are stored in tables. For each joint, three values corresponding to Euler rotation angles are stored. In case the file format was not based on Euler angles, a conversion is performed on the data to find the equivalent Euler angles and store them in the joint data tables.

4.3 Annotate, Search and Retrieve

The main goal of storing motions in a database is to provide access for different applications, for example to further analyze the motion data. The results of these analyses are metadata, which are stored in the database as annotations (tags). The proposed model is able to assign tags not only temporally (frame range) but also spatially (body zone). This means that the annotating system provides a 2D space in which to describe the motion pieces. This approach helps the user applications request specific motions by searching for part of the body and part of the motion in the database. In addition, the proposed data schema provides the flexibility to generate both simple and complex queries to search and retrieve motions. These queries are sent from the user applications to the database for such purposes as:

• In a simple scenario, retrieving a complete motion or part of a specific motion for visualization.
• Analysis tools can send queries to retrieve part of a motion, analyze it, and send back the results in the form of annotations to the database.
• Motion synthesizers can generate queries to search for a specific action (previously annotated) with the closest match to a certain scenario.
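As a hedged illustration of such a spatio-temporal annotation query (the table and column names below are hypothetical stand-ins, not the authors' Access schema), the idea can be sketched with an in-memory SQL database:

```python
import sqlite3

# Minimal stand-in schema: a core motion table plus a 2D (temporal + spatial)
# annotation table, loosely mirroring the D.O.s described above.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE basic_motion (motion_id INTEGER PRIMARY KEY, n_frames INTEGER,
                           skeleton_id INTEGER, body_zone_id INTEGER);
CREATE TABLE annotation   (motion_id INTEGER, tag TEXT, body_zone TEXT,
                           frame_start INTEGER, frame_end INTEGER);
""")
con.execute("INSERT INTO basic_motion VALUES (1, 1500, 1, 1)")
con.execute("INSERT INTO annotation VALUES (1, 'walk_backward', 'Lower_Body', 700, 1000)")
con.execute("INSERT INTO annotation VALUES (1, 'OWAS_3121', 'Full_Body', 100, 400)")

# Retrieve motions whose lower body walks backward within a frame window
rows = con.execute("""
    SELECT motion_id, frame_start, frame_end FROM annotation
    WHERE tag = 'walk_backward' AND body_zone = 'Lower_Body'
      AND frame_end >= 600 AND frame_start <= 1200
""").fetchall()
print(rows)
```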
5 Experiments

Sample motions were generated using an in-house Mocap laboratory. The laboratory consists of 10 Hawk cameras and the Cortex 2.0 software from Motion Analysis. The generated motions were all full-body motions of different lengths (150 to 2500 frames), and the generated output was in the ".htr" format. Two types of skeleton definitions (27 and 15 segments) were used, and four different subjects performed the motions. Moreover, eight different body zones were defined and assigned to the skeleton definitions. The data schema was implemented in Microsoft Access, and MATLAB was used as a user application to access the database using its SQL toolbox.
5.1 Human Motion Analysis and Data Update

The focus of this work is to reuse real human motions in different virtual production applications. Here, MATLAB reuses the motion data for ergonomic analyses, evaluating and annotating postures frame by frame. Among other possible standards, the Ovako Working Posture Analysis System (OWAS) [24] is used as an example of an ergonomic tool to analyze the motion. OWAS concerns four different categories: back, upper limbs, lower limbs, and carrying load. Based on the joint values in each frame, the positions of the body segments are decided. This results in a 4-digit code, where each digit corresponds to a category. The final combination decides the ergonomic condition of a posture. These results are uploaded to the database, so other applications can use them while searching for an ergonomically acceptable posture (Fig. 4, up-left).

5.2 Annotation Report

As mentioned, the OWAS analysis results were fed to the data schema by means of annotations. In addition, the performed motions were also annotated based on the type of action the subjects were performing. Tags such as walk forward, walk backward, seated, stand on one foot, etc. were applied to the corresponding frame ranges and body zones. The annotation tables enable the user applications to generate several types of reports and diagrams. Fig. 4 (right) shows the overall ergonomic condition of a selected motion: based on its OWAS code, each frame is placed in a colored zone.
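The frame-by-frame OWAS coding and tagging described above can be sketched as follows; the thresholds and category rules are simplified placeholders, not the full OWAS definitions or the authors' MATLAB code.

```python
# Simplified placeholder rules for building a 4-digit OWAS code per frame
def owas_code(frame):
    back = 2 if abs(frame["trunk_flexion_deg"]) > 20 else 1   # bent vs. straight
    arms = 1 if frame["hands_above_shoulder"] == 0 else 2
    legs = 2 if frame["standing_on_both_feet"] else 7
    load = 1 if frame["load_kg"] < 10 else 2
    return f"{back}{arms}{legs}{load}"

frame = {"trunk_flexion_deg": 35.0, "hands_above_shoulder": 0,
         "standing_on_both_feet": True, "load_kg": 0.0}
print(owas_code(frame))  # e.g. "2121", stored as a spatio-temporal annotation
```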
5.3 Visualization

Again, MATLAB and Cortex 2.0 were used to search for part of a motion in the database, retrieve it, and visualize it. Search functions can be any type of query, and the result can be a single frame or a range of frames for a full-body or a partial-body motion. Fig. 4 (down-left) shows a sample search result for a range of motion tagged with "Walk Backward".

Fig. 4. Up-left: frame samples presenting postures tagged with OWAS code 3121 (twisted, standing on both feet and with hands below shoulder). Down-left: frame samples presenting walk backward. Right: OWAS report of a sample motion during time, with each frame's OWAS code plotted against the frame number (100–1300) and placed in a green, yellow, orange, or red zone.
6 Conclusion and Discussion

The data schema presented in this article successfully stores and retrieves different motions regardless of variations in the skeleton types. It is also possible to annotate the motions based on customized body zones, independent of the skeleton types. The presented data schema was validated using different types of queries, such as motion-level queries, which search for annotations, and joint-level queries, which search for a specific posture. The implemented annotation system enables applications to search for a specific motion in a part of the body. The data schema is confirmed to be a suitable platform for DHM and ergonomic tools. It has been shown that, via interpreters, ergonomic tools can communicate with the platform and examine motions frame by frame for unhealthy work postures. It has also been shown that DHM tools can search among real human data for a required posture or motion.
References

1. Eklund, J.A.E.: Relationships between Ergonomics and Quality in Assembly Work. Appl. Ergon. 26, 15–20 (1995)
2. Siemens Product Lifecycle Management Software Inc., http://www.plm.automation.siemens.com/en_us/products/tecnomatix/assembly_planning/jack/index.shtml
3. Ryan, P., Springer, W., Hiastala, M.: Cockpit Geometry Evaluation (1970)
4. Chaffin, D.B.: Digital human modeling for vehicle and workplace design. Society of Automotive Engineers, Warrendale, PA (2001)
5. Yang, J.Z., Kim, J.H., Abdel-Malek, K., Marler, T., Beck, S., Kopp, G.R.: A new digital human environment and assessment of vehicle interior design. Comput. Aided Design 39, 548–558 (2007)
6. Reed, M.P., Huang, S.: Modeling vehicle ingress and egress using the Human Motion Simulation Framework. In: SAE Digital Human Modeling for Design and Engineering Conference (2008)
7. Wenzel, S., Jessen, U., Bernhard, J.: Classifications and conventions structure the handling of models within the Digital Factory. Comput. Ind. 56, 334–346 (2005)
8. Mavrikios, D., Karabatsou, V., Pappas, M., Chryssolouris, G.: An efficient approach to human motion modeling for the verification of human-centric product design and manufacturing in virtual environments. Robot. Cim.-Int. Manuf. 23, 533–543 (2007)
9. Menache, A.: Understanding motion capture for computer animation. Morgan Kaufmann, Burlington (2011)
10. Lämkull, D., Hanson, L., Örtengren, R.: Uniformity in manikin posturing: a comparison between posture prediction and manual joint manipulation. International Journal of Human Factors Modelling and Simulation 1, 225–243 (2008)
11. Jung, E.S., Kee, D., Chung, M.K.: Upper body reach posture prediction for ergonomic evaluation models. Int. J. Ind. Ergonom. 16, 95–107 (1995)
12. Zhang, X.D., Kuo, A.D., Chaffin, D.B.: Optimization-based differential kinematic modeling exhibits a velocity-control strategy for dynamic posture determination in seated reaching movements. J. Biomech. 31, 1035–1042 (1998)
13. Lin, C.J., Ayoub, M.M., Bernard, T.M.: Computer motion simulation for sagittal plane lifting activities. Int. J. Ind. Ergonom. 24, 141–155 (1999)
14. Kitagawa, M., Windsor, B.: MoCap for artists: workflow and techniques for motion capture. Elsevier/Focal Press, Amsterdam, Boston (2008)
15. Awad, C.: A Database Architecture for Real-Time Motion Retrieval. In: 7th International Workshop on Content-Based Multimedia Indexing (CBMI 2009), pp. 225–230 (2009)
16. Kahol, K., Tripathi, P., Panchanathan, S.: Documenting motion sequences with a personalized annotation system. IEEE Multimedia 13, 37–45 (2006)
17. Faraway, J.J.: Regression analysis for a functional response. Technometrics 39, 254–261 (1997)
18. Park, W., Chaffin, D.B., Martin, B.J., Yoon, J.: Memory-based human motion simulation for computer-aided ergonomic design. IEEE T. Syst. Man Cy. A 38, 513–527 (2008)
19. Witkin, A., Popovic, Z.: Motion warping. In: Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, pp. 105–108. ACM, New York (1995)
20. Arikan, O., Forsyth, D.A., O'Brien, J.F.: Motion synthesis from annotations. ACM T. Graphic. 22, 402–408 (2003)
21. Faraway, J.J.: Data-based motion prediction. Society of Automotive Engineers, New York, NY (2003)
22. Aydin, Y., Nakajima, M.: Database guided computer animation of human grasping using forward and inverse kinematics. Comput. Graph.-UK 23, 145–154 (1999)
23. Garcia-Molina, H., Ullman, J.D., Widom, J.: Database Systems: The Complete Book. Prentice Hall, Englewood Cliffs (2008)
24. Louhevaara, S.: OWAS—A method for the evaluation of postural load during work. Institute of Occupational Health and Centre for Occupational Safety (1992)
Simulating Ingress Motion for Heavy Earthmoving Equipment HyunJung Kwon, Mahdiar Hariri, Rajan Bhatt, Jasbir Arora*, and Karim Abdel-Malek Center for Computer Aided Design, The University of Iowa, 116 Engineering Research Facility 330 S. Madison Street Iowa City, IA 52246, USA 4110 Seamans Center for the Engineering Arts and Sciences, The University of Iowa, Iowa City, IA 52242-1527
[email protected]
Abstract. The design of heavy earthmoving equipment is based primarily on feedback from the driver. Most design studies on ingress focus on the motion itself and rely heavily on experimental data. This process requires the physical construction of an expensive mockup before any feedback can be obtained. Moreover, most research and development on the subject of ingress is limited to studies on passenger vehicles. Although the design of heavy vehicles requires more consideration of human safety and comfort, very little attention has been given to simulating ingress movement for those vehicles. This paper describes the development of a model to perform ingress motion for heavy equipment and its application to studying the response of the operator for different cab designs using SantosTM, the digital human model developed at Virtual Soldier Research at The University of Iowa.

Keywords: Human Modeling, Ingress, Predictive Dynamics, Heavy Equipment.
1 Introduction

Collecting feedback from customers is an important consideration when designing the layout of fixtures inside a heavy vehicle. However, it is time consuming and expensive to conduct experiments with multiple human subjects. Moreover, experiments can be performed only if an actual physical model of the cab, also known as a mock-up, exists. Building different vehicle mock-ups for experiments is also expensive and time consuming, and it is even more difficult to rearrange the layout in a physical model in order to test various conceptual designs. Virtual tools that allow the designer to visualize how an operator would see the cab and surrounding environment once a physical prototype is built could be very helpful in surmounting these difficulties. In addition, if the virtual tools also allow the user to receive feedback in terms of physics-based data that correspond to how the operator feels during cab ingress, designers could try different layout designs and perform comparative studies without having to build prototype mock-ups. Hence, this study
Corresponding author.
focuses on simulating the ingress motion of a virtual human, especially on a virtual model of heavy equipment. By generating different motions on different models, designers can integrate the results of virtual human simulation tools into heavy vehicle cab design.

In the current field of study, most work has focused on safety aspects, due to the nature of heavy equipment and its hazardous environment. Falls-from-equipment injuries in U.S. mining were studied [1], and the impact of those injuries on industry was investigated [2]. Injury databases were utilized for this purpose, drawing on the injury narrative, nature of injury, body part injured, mine type, and age at injury. Impact forces during different ingress and egress methods have been compared [3, 4]; this approach emphasizes the point that optimal design of entry and exit aids will minimize injuries. Ingress and egress motions were studied from the injury perspective before the late 2000s. However, those investigations concentrated on ingress/egress motion outside of the vehicle, such as on the ladder or steps while going up or down. Moreover, the studies merely analyzed the motions of certain subjects and did not suggest or study design improvements.

Recently, the attention of studies has shifted to the ingress/egress motion itself. Reed [5] presented an approach to suggest guidelines for locating steps relative to door openings and seating positions. Ingress/egress motion for a truck cabin is usually measured by a motion capture system [6]. Two different methods for evaluating joint loads were presented: a force-and-moment method and a force-only method. Two different stepping motions were analyzed and validated. However, that study [6] focused on stepping up and down motion, and on one specific truck cabin. Although numerous studies on ingress and egress have been performed recently, the greater part of them focuses on the exterior of the heavy vehicle (the ladder or the steps), and none has reported the proper locations of the cab components. Therefore, the present study predicts the ingress/egress motion of a heavy vehicle using digital human modeling. The motion can be predicted either within different vehicle geometry environments or with different personal characteristics. Ingress motion is tested in environments with different placements of components, such as the controller, handle, and seat, which affect the driver's safety, health, and efficiency.
2 Problem Definition

2.1 Heavy Equipment Environment

Unlike passenger cars, heavy equipment vehicles have complex component connections, transmit large loads, and have attachment points for implements. In addition, a heavy equipment vehicle is designed to travel over rough terrain. The operating room of the vehicle is located high off the ground and is accessible through a ladder. This high position protects the operator from the harsh construction environment and provides a wider view of the construction area. While the driver's seat is located close to an entrance door in a passenger car, in heavy equipment vehicles the operator's seat is located in the center of the room. The operating room of a heavy vehicle has more space for the driver to move in and out. In addition, the operating room ceiling is placed at a higher level, making it
easy for the operator to walk upright and reach the operator's seat. The machine's frame, the articulation (control panel), and, for wheeled equipment, the steering are the major components of the heavy equipment operating cab. Once the operator steps off the access ladder, he or she has to avoid these components to get in or out. Since the cab is off the ground and is a restricted space, the locations of those components become important, and their placement should also guarantee the operator's safety.

2.2 Safety Factors (Regulations)

Heavy equipment operators are prone to frequent falls, both on and off the work zone premises, and for many different reasons [7]. Use of foot- and handholds, as well as non-skid coating on the deck, can help prevent falls. In addition, drivers are expected to maintain a "three-point stance" during truck ingress and egress; that is, they are expected to have two feet on the steps and one hand on the handhold at all times. This stance should also be used when climbing up and down or between the cab and trailer to connect brake and light lines. Furthermore, the driver must be aware that jumping from cabs or trailers is dangerous.
3 Video Analysis

Ingress and egress movements are complex, as almost the whole body is involved in the motion within a highly constrained environment. These motions are affected not only by the physical capability of the operator, but also by the dimensions and anthropometry of the operator's body. The size, layout, width, and shape of the door influence the motion as well. A detailed motion analysis must be carried out in order to be able to generate different movements for motion prediction. The motion is defined based on the structure of the object (vehicle). Stepping up or climbing a ladder is included in the ingress motion for a heavy equipment vehicle, but for some larger vehicles, such as the wheel loader, which have large operating rooms, entering the operating room and sitting could also be part of ingress.

3.1 Motion Analysis

Experienced drivers were asked to get in and out of a wheel loader normally. The open-door motion strategy is studied as a pilot study in this paper. The observation is mostly focused on how to move to or from the interior of the cab while avoiding obstacles. Key components were determined as the obstacles to avoid. The movement of the body and the direction of the feet were tracked within the cab geometry. The ingress motion is analyzed in chronological phases. Ingress consists of three distinct motion phases called subtasks: walking forward (to approach the cab), ladder climbing, and the ingress maneuver. The ingress maneuver is again divided into two subtasks: entering the cab and sitting. For the continuity of the ingress motion, the end position of each subtask should be identical with the starting position of the next. The key frames for each subtask are summarized below:
Subtask 1 – Entering the cab. Since climbing the ladder and opening the door are not considered in this subtask, the starting position is a standing posture. Subtask 1 therefore starts from a standing posture and ends in a sit-ready position in the cab.

Subtask 2 – Sitting. This is the phase in which the person fits their body into the narrow area between the dashboard and the seat; the hips should then be located in the proper area of the seat.
Fig. 1. Motion Analysis
3.2 Key Components

To improve computational efficiency, not all components of the cab are included. The key components were determined by video analysis. When the operator gets in through the door frame, it becomes a key component to avoid: the arms cannot penetrate the frame, the feet cannot go through a step, and the operator has to bend to avoid hitting his head on the top of the frame. While moving in and out of the narrow seat area, the lower body must avoid the seat; therefore, the seat is defined as another key component. The key components are provided to the ingress model in the form of simple primitives that represent the complicated cab geometry.

3.3 Model

A CAD model of the cab is redefined and reduced to a primitive model. The prop model consists of the key components, giving users the ability to test and interact with these components easily and study the resulting motion. It also simplifies the model, allowing the user to easily define the necessary parameters for a given task. The prop model takes on the size and shape of a generic model representing the object on which the operator is performing the task. The user can then scale the prop model to size, and orient and position it as required for the task. The properties defined by the model are then passed to the digital human modeling evaluator, and the task analysis can be performed. It is imperative that the user has the ability to easily define the parameters, and it is also important that the user be able to extend the ingress models to different cab models. For these reasons, it was determined that the best way to give the user both ease of use and flexibility is to use a prop model more simplistic than the full CAD model of Fig. 2(a).
Fig. 2. Digital models: (a) actual model; (b) simplified prop model
3.4 Inputs

Obstacles. The first type of input needed from the user is information on the particular cab model. This is provided to the ingress model in the form of simple primitives that represent the complex cab geometry that SantosTM will encounter while entering the cab. The task accepts two types of primitives, cylinders and planes, each of which can be defined as either finite or infinite. In the finite case, a cylinder is defined by two end points and a radius, while a plane is defined by three points and a thickness. These primitives can then be used to approximate the geometry with which Santos will come in contact during ingress.

Foot Position. Foot position is defined by the combination of the size of the seat cushion, whose shape is assumed to be square, and the location of the door, which is used as a criterion for Santos to recognize the entry point. Foot position is also defined by the available room area in which Santos can walk; as the available area in the room gets larger, bigger step sizes are needed. Once the user makes changes to the cab geometry, the steps are automatically adjusted.

Hand Positions. To adhere to the three-points-of-contact safety regulations in Section 2.2, the operator must grab something for safety while moving into the cab. In order to satisfy this condition, the user has to specify the hand location as an input. Hand position is measured in Cartesian coordinate space from the global axes of the digital human, and the user can choose either the right hand or the left hand.

Assumptions. For this study, in its preliminary stage, several assumptions are made. Walking to the ladder and ladder climbing are not considered; instead of a connected motion between ladder climbing and entering, it is assumed that Santos is standing on the deck area as his starting position. The door of the operating room is open, so that the driver does not need to open and close it during the motion. The operator's movable range is limited to a minimum of three steps, and the adjustable area for the objects should not interrupt those limited steps. Other assumptions in this study include the exclusion of vision effects and the absence of reaction forces and friction.
4 Simulation Concepts
First, assuming that the operating-room area does not change, the seat position is selected at three different locations along the x axis (case 1) and the z axis (case 2), as shown in Fig. 3.
Fig. 3. Simulation cases (Case 1, Case 2, Case 3)
The maximum movable points are marked as points 1 and 3 in Fig. 3; point 2 is the midpoint of the line joining points 1 and 3. The movable seat area is measured excluding the foot-rest area (case 1) and the control panel area (case 2). Ceiling height is the variable in case 3, where point 3 corresponds to the sitting height of the digital human and point 1 to the standing height. The default locations of the actual model are point 2 for cases 1 and 2 and point 1 for case 3. Therefore, a total of nine different cases are simulated to investigate the effects of the locations of the key components on the movement of the digital human.
5 Formulation
The entire simulation is governed by an optimization formulation. The design variables for the ingress motion are the joint angle profiles qᵢ(t), where i = 1, …, n indexes the degrees of freedom. Besides the joint profiles, the initial posture is also optimized rather than specified from the experiment, while the torque profiles are calculated from the joint profiles using the recursive Lagrangian dynamics equations. The dynamic effort (the integral of the squares of all normalized joint torques) is used as the objective function and is defined as:

Minimize   ∫₀ᵀ Σᵢ ( τᵢ(t) / |τᵢ|max )² dt    (1)

where T is the total time for one step and |τᵢ|max is the maximum absolute value (limit) of joint torque i. Several constraints are proposed and implemented in this work to satisfy the laws of physics and the boundary conditions throughout the ingress process. These constraints include joint angle limits, ground penetration, foot contact positions, ZMP location [8], obstacle avoidance, and continuity conditions.
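As a minimal numerical illustration of Eq. (1), the sketch below evaluates the dynamic effort for sampled joint torque profiles. The array layout and the simple rectangle-rule time integration are assumptions made for illustration, not the discretization used in the authors' formulation.

```python
import numpy as np

def dynamic_effort(tau: np.ndarray, tau_max: np.ndarray, T: float) -> float:
    """Eq. (1): integral over [0, T] of the summed squared normalized joint torques.

    tau     : (n_steps, n_dof) torque profiles sampled from the recursive Lagrangian dynamics
    tau_max : (n_dof,) joint torque limits |tau_i|max
    """
    dt = T / (tau.shape[0] - 1)                   # uniform time step
    normalized_sq = (tau / tau_max) ** 2          # (n_steps, n_dof)
    return float(np.sum(normalized_sq) * dt)

# Toy example: 2 DOF over a 1 s step
tau = np.random.default_rng(0).normal(scale=10.0, size=(100, 2))
print(dynamic_effort(tau, tau_max=np.array([50.0, 80.0]), T=1.0))
```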
5.1 Obstacle Avoidance
An avatar must avoid collisions of its body segments with other non-adjacent body segments as well as with objects in the environment while performing a task. The avatar body segments and the obstacles in the environment are modeled using surrogate geometry. The body segments are represented by one or more spheres rigidly attached to a local reference frame, such that the spheres move with the body segments. The objects in the environment are modeled using one or more of five primitive geometries: spheres, infinite cylinders, infinite planes, finite cylinders, and finite planes. A generic collision-avoidance strategy is developed for avoiding collisions between the spheres and all five primitive geometries used to represent obstacles. As discussed in [9], the gradients of the collision-avoidance constraint between two primitives should be continuous. For this purpose, the primitives used in collision avoidance need to have smooth surfaces (no edges), but finite cylinders and finite planes do not. Hence, as discussed in [9], we use additional spheres and cylinders to smooth out the edges of the finite cylinders and planes and consider these modified elements instead, so that the constraint gradients are continuous. Note that collision avoidance includes both self (avatar) avoidance and obstacle (virtual environment) avoidance. The (modified) finite cylinders and planes used in the collision-avoidance modules are shown in Fig. 4.
Fig. 4. Finite cylinders and planes
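The clearance computations underlying such sphere-based avoidance constraints can be sketched as follows for the two unbounded primitive types. This is a hedged illustration only, not the authors' implementation; see [9] for the full treatment, including the smoothed finite primitives.

```python
import numpy as np

def sphere_plane_clearance(center, r, point_on_plane, normal):
    """Signed clearance between a sphere and an infinite plane (>= 0 means no collision)."""
    n = np.asarray(normal, float)
    n /= np.linalg.norm(n)
    return abs(np.dot(np.asarray(center, float) - point_on_plane, n)) - r

def sphere_infinite_cylinder_clearance(center, r, axis_point, axis_dir, cyl_radius):
    """Signed clearance between a sphere and an infinite cylinder."""
    d = np.asarray(axis_dir, float)
    d /= np.linalg.norm(d)
    v = np.asarray(center, float) - axis_point
    radial = v - np.dot(v, d) * d        # component of v perpendicular to the cylinder axis
    return np.linalg.norm(radial) - cyl_radius - r

# One avoidance constraint g >= 0: a body sphere against a floor plane
g = sphere_plane_clearance(center=[0.2, 1.0, 0.1], r=0.08,
                           point_on_plane=[0.0, 0.0, 0.0], normal=[0.0, 1.0, 0.0])
print(g)
```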
5.2 Continuity Conditions
The entire ingress task is divided into separate subtasks for convenience of calculation. To combine the subtasks, a continuity-condition constraint is imposed: the final posture of the previous subtask and the initial posture of the subsequent subtask should be identical. These conditions are expressed as

qᵏ(T) = qᵏ⁺¹(0),   q̇ᵏ(T) = q̇ᵏ⁺¹(0)    (2)

where k and k+1 denote the previous and following subtasks, and q and q̇ represent the angle and velocity of each DOF at the final time (T) of one subtask and the initial time (0) of the next. The continuity condition is imposed at every subtask boundary.
6 Results
The outputs of the formulation are joint angle profiles and joint torque profiles. The predicted profiles are presented in Fig. 5 (case 1), Fig. 6 (case 2), and Fig. 7 (case 3), plotted against time. Ingress is a whole-body motion, so all joint angles and torques could be plotted and compared; however, only one specific joint, the 1st spine joint, is presented in this paper because of its relevance to low back pain, a pervasive disorder that affects 70% to 80% of adults at some time during their lives, costs at least $16 billion each year, and disables 5.4 million Americans [10]. The entering motion of subtask 1 occurs from 0 s to 3 s, followed by the sitting motion of subtask 2 from 3 s to 4.5 s. Since the hand inputs do not change, the sitting motions are more sensitive to the changing geometry than the entering motions. Most of the plots have similar shapes in cases 1 and 2 because the foot position is formulated to step around the seat area. In case 3, a change in the amount of 1st-spine flexion/extension is observed owing to the varying ceiling height.
Fig. 5. Joint Angle and Torque for case 1
Fig. 6. Joint Angle and Torque for case 2
Fig. 7. Joint Angle and Torque for case 3
7 Conclusions
Initial results obtained from this study are promising and allow a user to perform comparative virtual studies of ingress motion for different geometry configurations and subsequently analyze the kinematic and kinetic results. This, in turn, helps in designing better cab interiors for drivers to move in without having to build expensive prototypes. The study also provides information about the critical effects of joint actuation requirements under different loading conditions, so that designers can help prevent injuries to workers. By investigating changes in cab geometry during the design process with respect to ingress, working conditions for operators can be improved. Such a capability will support better design of the in-cab environments of heavy vehicles and could be a useful tool for reducing injury-related problems for operators. The same process can also be applied to simulating egress motion.
References 1. Moore, S.M., Porter, W.L., Dempsey, P.G.: Fall from equipment injuries in U.S. mining: Identification of specific research areas for future investigation, http://www.cdc.gov 2. Gavan, G.R., Strassel, D.P., Johnson, D.: The Development of Improved Ingress/Egress Systems for Large Haulage Trucks, SAE (1980) 3. Giguere, D., Marchand, D.: Perceived safety and biomechanical stress to the lower limbs when stepping down from fire fighting vehicles. Applied Ergonomics 36, 107–109 (2005) 4. Fathallah, F., Cotnam, J.P.: Maximum forces sustained during various methods of exiting commercial tractors, trailers and trucks. Applied Ergonomics 31, 25–33 (2000) 5. Reed, M.P.: Simulating Crew Ingress and Egress for Ground Vehicles. In: Ground Vehicle Systems Engineering and Technology Symposium (2009) 6. Monnier, G., Chateauroux, E., Wang, X., Roybin, C.: Inverse Dynamic Reconstruction of Truck Cabin Ingress/Egress Motions. In: Digital Human Modeling for Design and Engineering Conference and Exhibition (2009)
7. Flatow, S.: Safety in the trucking industry: non-driving incidents. Professional Safety (2000) 8. Kwon, H., Xiang, Y., Rahmatalla, S.: Optimization-based Digital Human Dynamics: santosTM walking backwards. In: ASME, International Design Engineering Technical Conferences & Computers and Information in Engineering Conference (2007) 9. Hariri, M., Bhatt, R., Arora, J., Abdel-Malek, K.: Optimization-based collision avoidance using spheres, finite cylinders and finite planes. In: 3rd International Conference on Applied Human Factors and Ergonomics, AHFE (2010) 10. John, W.: Back Pain and Sciatica. N. Engl. J. Med. 318, 291–300 (1988)
Contact Area Determination between an N95 Filtering Facepiece Respirator and a Headform
Zhipeng Lei and Jingzhou (James) Yang
Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409
Tel.: 806-742-3563, Fax: 806-742-3540
[email protected]
Abstract. This study investigates two methods to determine the contact area between an N95 filtering facepiece (NFF) respirator and a headform. Five headform sizes (large, medium, small, long/narrow, and short/wide) each wear an NFF respirator (3M 8210). A biofidelic finite element model of the headform is built to simulate its interaction with the respirator; during the simulation, the respirator contacts the headform. Two methods are presented for determining contact areas. The first is based on the contact pressure distribution, under the assumption that the contact area is the fraction of the surface area with positive contact pressure. The second extracts the intersecting area between the deformed surfaces of the headform and the NFF respirator. Finally, an experiment that directly measures the dimensions of the contact area between prototypes of the headform and the NFF respirator validates the proposed methods. Keywords: Headform, respirator, finite element (FE) method, contact area.
1 Introduction
The N95 filtering facepiece (NFF) respirator protects the user by filtering at least 95% of airborne particles out of the breathing air. Respirator contact has been studied for evaluating respirator fit (Lei et al., 2010). Contact areas are generated when a respirator contacts a human face. Hidson (1984) obtained the contact area between a mask and a face by mounting the mask on a standard headform and spraying chalk dust into the interior zone; the inner boundary of the contact area was drawn by the chalk dust and the outer boundary by the mask edge. This method is feasible for a mask, which has a thick contact area, but not for a respirator, which has a very narrow contact area. Dellweg et al. (2010) measured the contact area between a non-invasive ventilation mask and a flat surface: color imprints of the mask cushion on scaled paper were taken, and the area of the imprints was recorded by planimetric measurement. To use this method, the boundary of the mark has to lie nearly in one plane, which the NFF respirator does not satisfy. Friess (2004) used a 3D scanner to calculate the respirator seal area. Subjects' faces were scanned before and after they wore a respirator. The outlines of
the inner and outer seals of the respirator were marked and recorded, and a 3D surface model of the area on the face was created. Again, this method is useful for a respirator that has a thick sealing area, but it is not suited for an NFF respirator. This paper focuses on methods to determine the contact area between an NFF respirator and a headform. Contact simulations between an NFF respirator and a headform using high-biofidelity FE models (Lei et al., 2010) are introduced in Section 2. Two methods to determine contact areas are proposed in Section 3. Experiments are set up to validate the methods in Section 4. Finally, a summary is given.
2 Contact Simulations
Five headform sizes (large, medium, small, long/narrow, and short/wide) created by the National Institute for Occupational Safety and Health (NIOSH) (Zhuang et al., 2005) are used in this study, together with an NFF respirator, the 3M 8210 (one size). The headforms and the respirator are digitized with a 3D scanner (Figs. 1 and 2).
Fig. 1. Five size headform models from NIOSH (Zhuang et al., 2005)
Fig. 2. Scanned 3M 8210 respirator model
Each of the five headform sizes wears the respirator separately; thus, there are five contact simulations (Lei et al., 2010). Fig. 3(a) shows that the respirator model comprises multiple layers and two straps. As shown in Fig. 3(b), the headform model is divided into five parts (two areas for the cheeks, one for the upper forehead, one for the chin, and one for the back of the head). It has multiple layers, including a skin layer, muscle layer, fat tissue layer, and bone layer, as shown in Fig. 4. The contact simulation proceeds in two stages. Stage I wraps the straps around the back of the headform and pulls the respirator away from the face; Fig. 5(a) is the initial state of Stage I and Fig. 5(b) is its ending state.
Fig. 3. FE models of (a) the respirator and (b) the headform
Fig. 4. Layers in (a) forehead part; (b) left and right cheek parts; (c) chin part; (d) four front face parts together
Stage II is to release the respirator so that the respirator moves towards the face. Deformed geometries and internal stresses are imported into the initial state of Stage II (Fig. 5c). Strap forces and contact interactions are automatically generated between the respirator and headform. Finally, the respirator contacts the human face (Fig. 5d).
3 Determination of Contact Area This section presents two methods to determine the contact area. The first one is through pressure distributions on the headform surface and the assumption is that
contact area is the fraction of surface area with positive pressures. The second one is through extracting the intersecting area between the deformed surfaces of the headform and the respirator.
Fig. 5. Stages: (a) Stage I initial state; (b) Stage I ending state; (c) Stage II initial state; (d) Stage II ending state
3.1 Contact Area from Pressure Distributions
When two objects contact, stresses and deformations arise (Johnson, 1985). At a point of the contact area, the stress is resolved into a pressure, acting along the common normal of the contact surfaces, and a shear, acting in the tangent plane of the contact surfaces. The pressure is highly concentrated on the contact area. Thus, we can assume that the contact area is the part of the surface area where the pressure is positive. Using contact pressure as the indicator is the easiest way to determine the contact area, since LS-DYNA's post-processing software, LS-PrePost, can directly plot the pressure distribution on the headform surface. Figs. 6(a)-(e) give the contours of pressure on the headform surfaces (large, medium, small, long/narrow, and short/wide). Surface areas in deep blue indicate zero pressure, while areas in other colors indicate the contact area with positive pressure values.
3.2 Contact Area from the Intersection of Surfaces
The contact area depends on the geometry of the contacting bodies. When the respirator contacts the headform, their surfaces touch and intersect; the contact area can therefore be determined by extracting the intersecting area between the surfaces. An intersection is defined as a set of points common to two or more geometric surfaces. In the respirator contact area, points belonging to the respirator surface would lie exactly on the headform surface with zero distance. However, the headform and respirator surfaces are discretized into triangular elements, and conforming contact among these triangles in 3D space is nearly impossible. Thus, we expand the definition of intersection to a set of points on the respirator surface that lie close to the headform surface within a very small offset distance, such as 0.1 mm.
Fig. 6. Pressure distribution: (a) large; (b) medium; (c) small; (d) long/narrow; (e) short/wide
In the final state of Stage II, the shapes and positions of the headform and respirator reach a steady state. The inner surface of the respirator and the outer surface of the headform are extracted and stored as a Keyword file that holds the information of unstructured surfaces consisting of triangles, as shown in Fig. 7. The format of the Keyword file is described in the LS-DYNA Keyword user's manual (Livermore Software Technology Corporation, 2007). A Matlab program is developed to read the Keyword file. Two sets of data are exported from it: the set of triangular elements and the set of nodes, denoted by global Cartesian coordinates. Two matrices are created to store the two data sets for further operations.
Fig. 7. Respirator and headform surfaces in Keyword format
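The kind of parsing the Matlab program performs can be illustrated as follows, transcribed here to Python. The *NODE and *ELEMENT_SHELL card names follow the LS-DYNA Keyword convention, but the simplified whitespace-based field splitting (real Keyword files may use fixed-width fields) is an assumption of this sketch.

```python
import numpy as np

def read_keyword_surface(path):
    """Return (coords, tris): node coordinates and triangle connectivity
    parsed from the *NODE and *ELEMENT_SHELL cards of a Keyword file."""
    nodes, tris, section = {}, [], None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("*"):              # a new keyword card begins
                section = line.upper()
                continue
            if not line or line.startswith("$"):  # blank line or comment
                continue
            fields = line.split()
            if section and section.startswith("*NODE"):
                nid = int(fields[0])
                nodes[nid] = tuple(map(float, fields[1:4]))
            elif section and section.startswith("*ELEMENT_SHELL"):
                # eid, pid, n1, n2, n3 (4-node cards repeat the last node for triangles)
                tris.append(tuple(int(n) for n in fields[2:5]))
    ids = sorted(nodes)
    index = {nid: i for i, nid in enumerate(ids)}
    coords = np.array([nodes[nid] for nid in ids])          # first matrix: node coordinates
    conn = np.array([[index[n] for n in t] for t in tris])  # second matrix: connectivity
    return coords, conn
```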
The next step is an algorithm that finds the shell elements in the contact area. Let us set the respirator surface as the slave segment and the headform surface as the master segment. First, for each slave node on the respirator surface, a search is performed to find its nearest node on the master segment (the headform surface). Consider a slave node Ns and a master node Am, which is Ns's nearest node on the headform surface; the master element m is one of the elements that contain the node Am. Fig. 8 shows that the node Pm is the projection of Ns onto the plane containing element m. The vector g begins at node Am and ends at node Ns; the vector h begins at Am and ends at Pm; the vector l begins at Pm and ends at Ns. The vectors c1 and c2 begin at node Am and end at nodes Bm and Cm, respectively.
Fig. 8. Projection of a slave node Ns onto the plane that has the master element m
l = (g · n) n    (1)

h = g − l    (2)

For element m,

n = (c1 × c2) / |c1 × c2|    (3)

If the slave node Ns contacts the master element m, two conditions should be satisfied. First, the node Pm should lie inside the triangular element m, which can be tested by:

(c1 × h) · (c1 × c2) > 0,   (c1 × h) · (h × c2) > 0    (4)

Second, the direction of l should be opposite to the direction of n, or the magnitude of l should be within a very small tolerance (0.01 mm). According to Equation (1), this condition can be expressed mathematically as follows:

g · n < 0.01 mm    (5)
When g · n < 0 mm, the slave node Ns penetrates the master element m. Using the criteria described above, we can obtain the contact nodes on the respirator surface and the contact elements on the headform; the contact area consists of these contact elements. Figs. 9(a)-(e) show the contact areas obtained from the surface intersections.
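The slave-node contact test of Eqs. (1)-(5) condenses into a short routine. The sketch below mirrors the vector names of Fig. 8 and the 0.01 mm tolerance from the text; the surrounding nearest-node search over the whole surface is omitted.

```python
import numpy as np

TOL = 0.01  # mm, tolerance of Eq. (5)

def node_contacts_element(Ns, Am, Bm, Cm):
    """True if slave node Ns contacts the master triangle (Am, Bm, Cm), per Eqs. (1)-(5)."""
    c1, c2 = Bm - Am, Cm - Am
    n = np.cross(c1, c2)
    n = n / np.linalg.norm(n)                    # Eq. (3): unit normal of element m
    g = Ns - Am
    l = np.dot(g, n) * n                         # Eq. (1)
    h = g - l                                    # Eq. (2): projection Pm = Am + h
    inside = (np.dot(np.cross(c1, h), np.cross(c1, c2)) > 0 and
              np.dot(np.cross(c1, h), np.cross(h, c2)) > 0)   # Eq. (4)
    return inside and np.dot(g, n) < TOL         # Eq. (5); negative values mean penetration

# A node hovering 0.005 mm above a triangle in the xy-plane: contact is detected
Ns = np.array([0.2, 0.2, 0.005])
tri = (np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
print(node_contacts_element(Ns, *tri))  # True
```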
Fig. 9. Contact areas: (a) large; (b) medium; (c) small; (d) long/narrow; (e) short/wide
4 Validation
This section conducts experiments to validate the computed contact areas. Because the contact area is covered by the respirator, and any movement of the respirator affects the contact, it is very hard to obtain the contact area directly from an experiment. However, an experiment can be used to validate our second method of determining the contact area through surface intersection. The experiments are based on the characteristics of the pressure distribution and contact area. Fig. 10(a) gives the pressure distribution on the headform surface. Pressures are concentrated in six key areas of the headform surface, numbered as follows: (1) nasal bridge, (2) top of right cheek, (3) top of left cheek,
(4) bottom of right cheek, (5) bottom of left cheek, and (6) chin, as shown in Fig. 10(b). Since the contact area is related to the pressure distribution, Fig. 10(c) shows that the contact area is also grouped into these six key areas. Thus, we can define three dimensions that depict the contact area: Width 1 is the linear distance between key areas (2) and (3), Width 2 is the linear distance between key areas (4) and (5), and Length is the linear distance between key areas (1) and (6). These three dimensions can be measured both in the computational post-processing and in the experiment, and compared for validation.
Fig. 10. (a) Pressure distribution; (b) six key areas; (c) contact area with dimensions
An experiment for validating the simulation results is set up. Prototypes of the five headform sizes, produced by American Precision Prototyping, LLC with Polypro-Like Accura 25 material, are used. The surfaces of the prototypes are non-deformable, and the detailed facial features and sizes are maintained the same as in the NIOSH models (Zhuang et al., 2008). The Tactilus free-form sensor system, manufactured by Sensor Products Inc., is incorporated into the experiments. Separate thin, tactile sensors are available for measurements on surfaces with complex geometries such as a human face, and the user can place them at desired locations. A hub collects the pressure data and sends them to the computer by cable, where the Tactilus software records and stores the results. A vernier caliper is used to measure linear distances. The respirator is worn by each headform prototype. Because the NFF respirator sits close to the headform, it is hard to locate the contact area purely by eye, so the pressure sensors are used to locate the boundary of the contact area. For each respirator-headform combination, the vernier caliper measures the three dimensions of the contact area (Width 1, Width 2, and Length). Table 1 presents the dimensions of the contact areas from the computational and experimental results. Fig. 11, in which the horizontal axis is the dimensions of the contact area from the experiments and the vertical axis is the dimensions from the simulations, provides a linear regression analysis of the computational and experimental results. The R² value of 0.9105 indicates that our simulation results are well validated by the experiments.
Table 1. The results of contact area dimensions (units: mm)

Headform      Experiments                    Simulations
              Width 1   Width 2   Length     Width 1   Width 2   Length
Large         108.69    100.03    123.14     108.34     98.90    122.15
Medium        108.84    104.72    120.60     104.16    102.91    115.35
Small         104.83     97.89    107.34     106.41     92.26    108.26
Long/narrow   101.80     94.36    122.61      99.99     89.19    122.09
Short/wide    113.94    108.71    110.26     112.90    103.81    114.94
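The regression check of Fig. 11 can be reproduced directly from the Table 1 dimensions, as in the sketch below; an ordinary least-squares fit is assumed here, since the exact regression settings are not stated in the paper.

```python
import numpy as np

# Table 1 (mm): Width 1, Width 2, Length for each of the five headforms
experiments = np.array([108.69, 100.03, 123.14, 108.84, 104.72, 120.60,
                        104.83,  97.89, 107.34, 101.80,  94.36, 122.61,
                        113.94, 108.71, 110.26])
simulations = np.array([108.34,  98.90, 122.15, 104.16, 102.91, 115.35,
                        106.41,  92.26, 108.26,  99.99,  89.19, 122.09,
                        112.90, 103.81, 114.94])

slope, intercept = np.polyfit(experiments, simulations, 1)
pred = slope * experiments + intercept
ss_res = np.sum((simulations - pred) ** 2)
ss_tot = np.sum((simulations - simulations.mean()) ** 2)
print(f"R^2 = {1 - ss_res / ss_tot:.4f}")  # expected to be near the reported 0.9105
```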
Fig. 11. Correlation of experimental and computational results
5 Conclusion
This paper discusses methods to determine the contact area between an NFF respirator and a headform. Five headform sizes (large, medium, small, long/narrow, and short/wide) each wear an NFF respirator (3M 8210). Two methods were developed for determining the contact areas: the first plots the pressure distribution, and the second uses a Matlab program that extracts the geometric intersection. The first method is easy and fast, but the second gives more accurate results. Finally, the computational results are validated by experiments. The contact area between the respirator and headform is very useful for respirator designers and standards makers. It can be used to examine respirator fit: a discontinuous contact-area curve indicates potential leak locations. The contact area can also be used to design a new half-facepiece respirator; by overlapping the contact areas from the five respirator-headform combinations, we obtain the workspace of respirator sealing, as shown in Fig. 12. Future work includes more simulations to obtain contact areas for other respirator brands, and investigation of the relationships between contact area and headform dimensions and between contact area and respirator designs.
Fig. 12. The workspace of respirator sealing
References 1. Dellweg, D., Hochrainer, D., Klauke, M., Kerl, J., Eiger, G., Kohler, D.: Determinants of Skin Contact Pressure Formation During Non-invasive Ventilation. Journal of Biomechanics 43, 652–657 (2010) 2. Friess, M.: Analysis of 3D Data for the Improvement of Respirator Seals - Final Report (2004) 3. Hidson, D.J.: Computer-aided Design of a Respirator Facepiece Model. Defense Research Establishment Ottawa, Report, Ottawa (1984) 4. Johnson, K.L.: Contact Mechanics. Cambridge University Press, Cambridge (1985) 5. Lei, Z., Yang, J.: Toward High Fidelity Respirator and Headform Models. In: 1st International Conference on Applied Digital Human Modeling, Miami, Florida, July 17-20 (2010) 6. Lei, Z., Yang, J., Zhuang, Z.: Contact Pressure of N95 Filtering Face-Piece Respirators Using Finite Element Method. Computer Aided Design and Applications 7(6), 847–861 (2010) 7. Livermore Software Technology Corporation: LS-DYNA Keyword User's Manual, Version 971, LSTC, vol. I (May 2007) 8. Zhuang, Z., Bradtmiller, B.: Head-and-Face Anthropometric Survey of U.S. Respirator Users. Journal of Occupational and Environmental Hygiene 2, 567–576 (2005) 9. Zhuang, Z., Viscusi, D.: A New Approach to Developing Digital 3-D Headforms. In: SAE Digital Human Modeling for Engineering and Design, Pittsburgh, PA, USA (2008)
Ergonomics Evaluation of Three Operation Postures for Astronauts Dongxu Li and Yan Zhao College of Aerospace and Material Engineering, National University of Defense Technology, 410073, Changsha Hunan, China
[email protected],
[email protected]
Abstract. Push/pull is a very common and frequent activity for astronauts during space operations. To investigate the strength changes of astronauts' upper limbs under different operation postures, this paper reports a simulated aviation operation experiment. The paper quantitatively evaluates the absolute peak force and relative peak force of the single left/right hand and the strain energy of the grip produced by the operating force at the comfortable, horizontal widest, and longitudinal widest spans. The results demonstrate that absolute peak force, relative peak force, and strain energy are effective factors for estimating upper-limb strength, and that individual upper-limb power is significantly affected by posture. Keywords: astronaut, upper limb, absolute peak force, relative peak force, strain energy.
1 Introduction
Astronauts play a very important role in manned spacecraft systems. In a microgravity environment, however, they undergo decreases in muscle tone and mass, alterations in neuromuscular function, etc. As a result, these changes induce remodeling of skeletal muscle, followed by decrements in skeletal muscle strength, fatigue resistance, motor performance, and connective tissue integrity [1]. According to spaceflight experience and previous studies, the maximal force of several muscle groups shows a substantial decrease (6-25% of pre-flight values), and the maximal power during very short "explosive" efforts of 0.25-0.30 s shows an even greater fall, being reduced to 65% after 1 month and to 45% of pre-flight values after 6 months [2]. Therefore, astronauts are exposed to different degrees of altered performance in strength, endurance, balance, coordination, and the ability to ambulate, in different muscle groups and mission-specific periods. The structural degradation of muscle function could increase health risks and interfere with the safe completion of critical tasks during spaceflight, emergency procedures, or EVA (extravehicular activity).
The evaluation of skeletal muscle characteristics has become a global research hot spot in the field of manned spaceflight, with significant theoretical meaning. A large number of studies have been performed to obtain quantitative criteria of muscle function, not only in ground-based experiments but also in on-orbit tasks, with humans currently occupying the International Space Station (ISS) for six months at a time and space exploration missions of one to three years being planned. NASA designed the Effect of Prolonged Space Flight on Human Skeletal Muscle program to assess muscle volume [3,4]. Another study, Evaluation of Skeletal Muscle Performance and Characteristics [5], has the similar purpose of understanding muscle de-conditioning and evaluating the effects of in-flight countermeasures on muscle performance. Within this experiment, a survey of hand endurance was carried out at current and relative strength levels with a hand dynamometer: when monitoring current strength, the subjects were asked to hold a 50% contraction as long as possible; absolute endurance was assessed with a series of contractions (3 s contraction, 2 s relaxation) at 75% of preflight strength; relative endurance (relative to the 50% current strength level) increased or was maintained after long-duration flight. The Muscle Atrophy Research and Exercise System (MARES), sponsored by ESA, is capable of supporting measurements and exercises on seven different human joints, encompassing nine different angular movements as well as two additional linear movements (arms/legs) [6,7]. It is considerably more advanced than current ground-based medical dynamometers (which measure force or torque): MARES can gather data for parameters such as position control, velocity control, torque/force control, and quick release, generally in association with any desired external device. The Enhanced Dynamic Load Sensor (EDLS) experiment was installed in the Mir habitat module to record the crew-induced forces and torques imparted on interior surfaces; analysis of the collected data indicated that the crew loads induced on module internal structures were no greater than 70 N in the frequency range from 0 to 10 Hz [8,9,10,11,12]. These research experiences indicate that successful measurement and evaluation of skeletal muscle activity is important for understanding operators' physiological capabilities and limitations, which are critical factors affecting safety, performance, and productivity in spaceflight tasks. In order to evaluate the local load on astronauts' arm muscles, a ground-based experiment was conducted in this study, in which two special circumstances were simulated: ① the subjects wore a spacesuit whose inner excess pressure is 40 kPa and whose inner temperature fluctuates from 18 to 26 °C; ② the body of the suited subject was supported by a metal ring frame near the centre of the waist, with the feet hanging in the air and only slightly touching the ground. Three issues related to muscle activity were considered: ① the type of manual handling (push and pull behaviors were studied); ② the mechanical properties of skeletal muscle (the focus was on isometric, i.e., static, contraction); ③ the position of the upper limb (three positions of the two arms relative to the chest).
2 Method
2.1 Subjects
Eleven healthy male volunteers took part in the laboratory study. Since the normal Body Mass Index (BMI) criterion for adults in China is between 19 and 25, subjects whose BMI met this criterion were selected. All subjects were in good health and were free of physical complaints at the time of the study.
2.2 Simulation of the Operation Task
A handle, mainly made of alloy steel with an embedded six-dimensional (3D forces/moments) strain transducer, was designed for this test. The handle's holding rod is 20 cm long. The two handles are fixed on a Lucite plate with a measuring scale marked on its surface, accurately indicating the installation locations of the handles. Human spaceflight experience shows that push/pull activities are very common in space operations, for instance hatch opening, traversing the space capsule, and climbing the ladder on the outside surface of vehicles; the simulation therefore mainly consisted of push/pull activities. Three positions of the two arms relative to the body were considered: ① P1, left/right handles located at a comfortable height and width over the subject's chest; ② P2, the widest span between the left and right handles at a comfortable height over the subject's chest; ③ P3, the left handle above the chest and the right handle below the chest, giving the furthest span between the handles at a comfortable width over the subject's chest (see Fig. 1).

Fig. 1. Real handle object and its installation location
2.3 Procedures
The participants operated the handles at the P1, P2, and P3 positions while wearing the spacesuit. There are two measurement statuses, single left-hand and single right-hand operation, and six measurement directions: left/right, forward/back, and up/down (Table 1). Thereby, the overall experiment was carried out in six sub-tests. Taking the density of carbon dioxide inside the spacesuit into account, each sub-test lasted about 5±1 min, and between sub-tests the subjects were allowed to rest for about 2 min (5±1 min contraction, 2 min relaxation). An additional emergency break was also allowed for the health of the subjects. Each measurement direction consisted of three successive push/pull isometric contractions with short intervals (12 s). Participants performed the push/pull tasks with their maximum voluntary force.

Table 1. Operating force measurement list

Positions: P1, P2, P3
Status: single left-hand / single right-hand
Directions: left, right, up, down, front, back
2.4 Data Analysis
In order to verify the effect of the different positions on the push and pull isometric contractions, the forces in the sub-tests were measured by the strain transducers of the handles. The transducer recordings were analyzed for the absolute peak forces, the relative peak forces, and the strain energy of the handles. The peak force among the three successive push/pull isometric contractions is taken as the push-pull effort and is called the absolute peak force, Fabs-peak (N). The quotient of the absolute peak force divided by the operator's own weight is the relative peak force, Frel-peak (N/kg):

Frel-peak = Fabs-peak / W    (1)
There is no velocity or displacement when muscles perform isometric contractions. In order to describe the energy of the operations, a method that applies strain energy to estimate the operators' energy is proposed in this paper; Vε is used to represent strain energy. Fig. 2(a) shows the coordinate system of the handle, and Fig. 2(b) gives the parameters of the handle.
Fig. 2. The handle illustration: (a) the coordinate system of the handle; (b) the parameters of the handle (l1, l2)
Strain energy for the grip (SEG) of the handles was calculated using the following formulas (2)-(7):
Vε-axes1 = (Fx/2)² · (l1/2) / (2 E1 A1)    (2)

where Vε-axes1 = SEG along the grip axis (J), Fx = the operating force in direction X (N), E1 = Young's modulus of the grip (N/m²), A1 = cross-sectional area of the grip (m²);
Vε-twist1 = (Mx/2)² · a / (2 G1 Ip1) + ∫₀ᵇ (1/2) [ (Mx/2) · (b − x)/b ]² / (G1 Ip1) dx    (3)
where Vε-twist1 = SEG along grip twist (J), Mx = the operation moment in direction X (N·m), G1 = shear modulus of the grip, Ip1 = polar moment of inertia of the grip (m⁴), and I1 = cross-sectional moment of inertia of the grip (m⁴). The grip of the handle is a hollow cylinder, so:

Ip = (π D⁴ / 32) (1 − α⁴),   I = (π D⁴ / 64) (1 − α⁴)    (4)

where D = external diameter (m), d = interior diameter (m), and α = d/D. The formulas for the strain energy along grip bending in directions Z and Y are as follows, respectively:
Where: D =external diameter(m), d =interior diameter (m). Otherwise, the mathematical formula considering strain energy along grip bend in indirection Z and Y are as following, respectively: 1 1 1 bx − x2 1 x 2 1 1 ( Mz − Fy ⋅ a − Fy ⋅ − Fy ⋅ ) ( Mz − Fy ⋅ x)2 b 2 2 2b 2 2 dx 2 Vεz−bend1 = ∫ 2 dx + ∫ 2 0 0 EI EI 1 1 1 1 a
(5)
Where: Vε z − bend1 = SEG along grip bend in direction Z (J), Fy the operating force in direction Y (N), M z the operation moment in direction Z (N
·m).
1 1 1 bx−x2 1 x 2 1 1 2 ( M + F ⋅ a + Fz ⋅ + Fz ⋅ ) ( M + Fx ) y z y z a b 2 2 2b 2 2 dx 2 Vεy−bend1 = ∫ 2 dx+∫ 2 0 0 EI EI 11 11
(6)
134
D. Li and Y. Zhao
Where: Vε y −bend 1 = SEG along grip bend in direction Y (J), Fy the operating force
·
in direction Z (N), M z the operation moment in direction Y (N m). The overall SEG when subjects applied power on the grip was define as:
Vε − grip =
Vε − axes1 + Vε −twist1 + Vεz −bend1 + Vεy −bend1
(7)
2
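As a small numerical companion to the formulation, the sketch below implements its two unambiguous pieces: the relative peak force of Eq. (1) and the hollow-section properties of Eq. (4). The grip diameters in the example are illustrative assumptions, not the dimensions of the actual handle.

```python
import math

def relative_peak_force(f_abs_peak: float, weight: float) -> float:
    """Eq. (1): relative peak force F_rel-peak = F_abs-peak / W (N/kg)."""
    return f_abs_peak / weight

def hollow_section_properties(D: float, d: float):
    """Eq. (4): polar (Ip) and bending (I) moments of inertia of a hollow circular grip,
    with D the external diameter (m), d the interior diameter (m), and alpha = d/D."""
    alpha = d / D
    Ip = math.pi * D**4 / 32.0 * (1.0 - alpha**4)
    I = math.pi * D**4 / 64.0 * (1.0 - alpha**4)
    return Ip, I

# The reported maximum peak force (226.18 N) for an average-weight subject (65.97 kg)
print(relative_peak_force(226.18, 65.97))           # about 3.43 N/kg
print(hollow_section_properties(D=0.030, d=0.024))  # assumed diameters
```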
3 Results 3.1 Anthropometric Data of Subjects
The subject collective has specific characteristics with respect to BMI 19-25. The subjects were on average 168.66±3.83 cm tall, 65.97±4.59 kg in weight, and 30±5 years old. The distributions of weight and height for the 11 subjects are presented in Fig. 3.

Fig. 3. Distributions of weight and height for subjects
3.2 Absolute/Relative Peak Forces
The absolute/relative peak operation strength of the single left/right hand at the three postures in the six directions is presented in Fig. 4.
3.3 Strain Energy for Grip
The grip is alloy steel, so the subjects' operation produces only slight energy; because of the small deformation of the grip, the corresponding energy level is very low. On the y-axis of Fig. 5, the strain energy of the grip is expressed in units of 10⁻⁶ joules (J).
Fig. 4. Absolute/relative peak forces at 3 postures in 6-D directions of single left/right hand: (a1) X-direction absolute peak forces; (a2) X-direction relative peak forces; (b1) Y-direction absolute peak forces; (b2) Y-direction relative peak forces; (c1) Z-direction absolute peak forces; (c2) Z-direction relative peak forces
4 Discussions
(1) According to the descriptions given by the 11 subjects of their manipulation habits, all of them prefer the right/dominant side; however, the measured performance was not higher for the right/dominant side than for the left/non-dominant side when they wore the spacesuit, in any of the three postures (Fig. 6). This has multiple causes; one of the most important is that the motion of both arms was restricted by the flexibility and range of the spacesuit's joints, for example the wrist, elbow, and shoulder joints. At the same time, the motion of the arms had to overcome the resistance of the 40 kPa excess pressure inside the
spacesuit when the subjects performed extending and bending actions. In such extreme circumstances, the advantages of the right hand, such as strength, balance stability, and the interplay between brain and limbs, decline, and the left/non-dominant hand can even perform more skillfully than the right/dominant one.
Fig. 5. SEG at 3 postures in 6-D directions of single left/right hand
(2) Fig. 6 also reveals that, over the directions of applied force, the minimum peak force appears at the longitudinal widest span with right-hand up-direction pushing: 49.28±8.28 N. The maximum peak force appears at the comfortable span with right-hand forward-direction pushing: 226.18±16.37 N. Meanwhile, across all three postures, the average peak force is largest along the Z-axis (149.36 N) and smallest along the Y-axis (62.83 N).
(3) The stability of the operating force was worst along the Z-axis for the left hand, i.e., for forward and backward push/pull behavior, with a standard deviation over the three postures of 66.40 N; it was best along the Y-axis for the left hand, i.e., for up and down push/pull behavior, with a standard deviation of 14.70 N.
Fig. 6. Comparison of the 3-D forces at three postures of single left/right hand (LH-C/RH-C: left/right hand comfortable span; LH-H/RH-H: left/right hand horizontal widest span; LH-L/RH-L: left/right hand longitudinal widest span)
(4) For either the left or the right hand, the SEG produced by down-direction actions is greater than that in the other directions at all three operation postures.
(5) In accordance with the relative force evaluation standard (over 2.49, excellent "A"; 2.48-2.17, good "B"; 2.16-1.95, favorable "C"; 1.94-1.65, bad "D"; below 1.64, poor "E"), the strength of the operating force can be divided into five ranks (Table 2).

Table 2. Operating force with relative force evaluation standard
Posture  Hand         X-axis             Y-axis             Z-axis
                      rel. force  rank   rel. force  rank   rel. force  rank
P1       left hand    1.40        E      1.01        E      3.21        A
                      1.73        D      2.36        B      2.05        C
         right hand   1.61        E      0.78        E      3.48        A
                      1.37        E      2.38        B      0.91        E
P2       left hand    1.22        E      1.16        E      2.66        A
                      2.26        B      1.89        D      1.61        E
         right hand   2.08        C      0.80        E      1.54        E
                      1.02        E      1.66        D      1.90        D
P3       left hand    1.22        E      1.30        E      0.80        E
                      0.86        E      1.71        D      2.00        C
         right hand   0.85        E      0.76        E      2.10        C
                      1.25        E      1.72        D      0.83        E
5 Conclusions
This paper reports a simulated aviation operation experiment in which three parameters were applied to quantitatively assess the musculoskeletal strength of the upper limb during isometric contractions in 6-D pushing and pulling at three postures: absolute peak force, relative peak force, and strain energy. The results demonstrate that these three indexes are specific and effective measures for estimating the change of upper-limb strength across the three postures, and that individual upper-limb power is significantly affected by posture. Astronauts' operations remain a critical factor in safe and efficient aviation operations. In the future, biomechanics combined with EMG signals will be an important tool to better address human operation evaluation issues; how to move the study of these issues out of the laboratory and into real-world operating environments remains a grand challenge.
References 1. Baldwin, K.M., Caiozzo, V.J., Feeback, D.L., Guilliams, M.E., Lee, S.M.C., Loehr, J.A., Ryder, J.W., Scheuring, R.A., Sharp, G., Spiering, B.A.: Risk of Impaired Performance Due to Reduced Muscle Mass, Strength, and Endurance. NASA report: HRP-47072 4 (2008) 2. Di Prampero, P.E., Narici, M.V.: Muscles in microgravity: from fibres to human motion. Journal of Biomechanics 36, 403–412 (2003)
3. Fitts, R.H., Trappe, S.W., Costill, D.L., Gallagher, P.M., Creer, A.C., Colloton, P.A., Peters, J.R., Romatowski, J.G., Bain, J.L., Riley, D.A.: Prolonged Space Flight-induced Alterations in the Structure and Function of Human Skeletal Muscle Fibres. The Journal of Physiology 588, 3567–3592 (2010) 4. Trappe, S., Costill, D., Gallagher, P.M., Creer, A., Peters, J.R., Evans, H., Riley, D.A., Fitts, R.H.: Exercise in space: human skeletal muscle after 6 months aboard the International Space Station. Journal of Applied Physiology 106(4), 1159–1168 (2009) 5. Siconolfi, S.F., Koslovskaya, I.B.: Skeletal Muscle Performance and Characteristics. Mir 18 Final Science Report, http://www.nasa.gov/ 6. Barattini, P., Schneider, S., Edgerton, R., Castellsague, J.: MARES: A New Tool for Muscular, Neuromuscular and Exercise Research in the International Space Station. Journal of Gravitational Physiology 12, 61–68 (2005) 7. Nunez, F., Romero, A., Chua, J., Mas, J., Tomas, A., Catalan, A., Castellsaguer, J.: Body Position Reproducibility and Joint Alignment Stability Criticality on a Muscular Strength Research Device. Journal of Gravitational Physiology 12(1), 141–142 (2005) 8. Lofton, R., Conley, C.: International Space Station Phase 1 Risk Mitigation and Technology Demonstration Experiments. In: 48th International Astronautical Congress. International Astronautical Federation Press, Turin (1997) 9. Newman, D.J., Tryfonidis, M., Van Schoor, M.: Astronaut-Induced Disturbances in Microgravity. AIAA J. Spacecr Rockets 34(2), 252–254 (1997) 10. Newman, D.J., Van Schoor, M.: Dynamic Load Sensor Experiment: Background, Science Requirements, and Preliminary Design. White paper (1992) 11. Newman, D.J.: Dynamic Load Sensors (DLS) Spaceflight Experiment, report and website, http://web.mit.edu/aeroastro/www/labs/DLS/ 12. Dynast Engineering Company: ISS Phase 1 Risk Mitigation Experiments and Technology Demonstration Summaries and Lessons Learned. Technical report, ISS Phase 1 RME Forum (1998)
In Silico Study of 3D Elbow Kinematics
Kang Li¹ and Virak Tan²
¹ Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ 08854
² Department of Orthopaedic Surgery, University of Medicine and Dentistry of New Jersey, Newark, NJ 07103, USA
Abstract. This study proposes a novel technique to improve the accuracy of estimating bone kinematics. The technique uses the radiographic information of both soft tissue and hard tissue for 2D-3D registration: non-rigid registration and rigid-body registration work seamlessly together to guide the matching process toward the optimal bone pose. Such a technique could improve and accelerate the matching process. Keywords: Elbow, Kinematics, Imaging.
1 Introduction
Musculoskeletal injuries are the most common healthcare problem in the United States. For example, it was reported that more than 57 million musculoskeletal injuries, which accounted for 60 percent of injuries of all types, were treated in healthcare settings in 2004 (AAOS 2009). The healthcare cost of treating these injuries was estimated at $127 billion (AAOS 2008). Beyond the direct cost, these injuries generate far larger indirect costs, since they can lead to short-term or long-term disabilities; in 2004, musculoskeletal injuries resulted in more than 72 million lost work days (AAOS 2008). These huge costs are expected to keep increasing because of the growing prevalence of musculoskeletal injuries, especially in the aging population (AAOS 2008). Although musculoskeletal injury is a prevalent problem, current clinical diagnosis and evaluation methods for most mobility injuries have not advanced significantly. Surgeons rely heavily on subjective evaluation, simple clinical tests, or at most radiographic images in unloaded or static scenarios to diagnose injuries, evaluate surgical techniques, and determine treatment strategies. These simple techniques may not be able to detect the injury-induced critical kinematic changes that are only observed in dynamic scenarios (Li, Tashman et al. 2011). Improving diagnostic techniques and treatment strategies may alleviate the economic burden of musculoskeletal injuries by reducing long-term disability and returning patients to higher levels of functioning. In the last decade of the 20th century, surface marker-based motion capture technology was introduced into clinical studies to examine 3D in vivo joint function, and thus enabled dynamic evaluation of surgical techniques and rehabilitation strategies. However, the accuracy of surface marker-based systems in measuring underlying
skeletal kinematics is compromised by soft tissue artifacts (Tashman, Kopf et al. 2008). In fact, surface marker-based systems are not even able to track the gross movement of some bones and joints, e.g., the carpal bones of the hand. Thus, this type of system cannot determine the subtle but important changes in joint mechanics caused by musculoskeletal injuries and therefore cannot resolve the controversies surrounding the treatment of many injuries. Recently, a small number of research programs in the US have started to develop bi-planar videofluoroscopy systems, which use two high-frequency cardiac cine-radiographic generators to measure in vivo joint kinematics (Tashman, Kopf et al. 2008; Scarvell, Pickering et al. 2010). These systems utilize 2D-3D registration techniques (Fregly, Rahman et al. 2005; Lu, Tsai et al. 2008; Zheng and Zhang 2009; Acker, Li et al. 2010; Matsuki, Matsuki et al. 2010; Scarvell, Pickering et al. 2010; Tersi, Fantozzi et al. 2010) to match digitally reconstructed radiographs based on 3D bone scans with the experimentally measured fluoroscopic images, and then estimate the 3D bone pose with high accuracy and resolution. Although these systems allow joint kinematics to be measured in vivo, a few drawbacks prevent their widespread use in clinical studies: (1) the high cost of building such a system, (2) relatively larger radiation exposure compared to clinically available fluoroscopic devices, (3) a very limited field of view, which does not allow capturing the entire movement of many functional activities (Matsuki, Matsuki et al. 2010), and (4) significant setup and calibration time. Commonly available in the clinical environment is the single-plane fluoroscopy system, such as the mobile C-arm. This type of system has a relatively larger field of view and generates less radiation exposure but works at a lower sampling frequency. Although it has been demonstrated that such a system can be used for measuring in vivo 3D kinematics of the upper and lower extremities (Matsuki, Matsuki et al. 2010; Tersi, Fantozzi et al. 2010), the out-of-plane accuracy is poorer than that of bi-planar X-ray systems (Fregly, Rahman et al. 2005; Acker, Li et al. 2010). Therefore, research effort has been dedicated to developing new algorithms to improve the out-of-plane accuracy (Scarvell, Pickering et al. 2010). Although it is possible to identify the edge of the soft tissue around the joint, such information has so far been ignored and never utilized to help identify the 3D bone pose. We now propose a novel technique to improve the accuracy of estimating out-of-plane kinematics. This technique will use the radiographic information of both soft tissue and hard tissue for the 2D-3D registration; non-rigid registration and rigid-body registration will work seamlessly together to guide the matching process toward the optimal bone pose. Such a technique could improve and accelerate the matching process.
2 Methods
One specimen will be used in this study. The specimen will be free of any elbow injuries or other medical conditions (musculoskeletal or otherwise) that would interfere with performing the requisite testing activities. High-resolution CT scans of the intact elbow specimen will be obtained. The elbow will be mounted onto a jig that holds the humerus in neutral rotation and perpendicular to the floor. Elbow ROM will be easily achieved in the gravity and
antigravity orientations by rotating the jig 180 degrees. Active ROM will be achieved by running cables through a line of action approximating each muscle's centroid for both the elbow flexor and extensor muscle units; the muscle lines of action will be maintained using a tendon alignment unit. Testing of the elbow is done in the supine overhead and upright positions before and after sectioning of the collateral ligaments. In total, five severity levels of injury will be created on the specimen: the intact elbow; the elbow with the LCL torn; the elbow with the LCL and the posterior band of the MCL torn; the elbow with the LCL and the entire MCL disrupted; and the distal humerus stripped of all soft tissues. For each severity level, 2 trials of movement (2 rehabilitation protocols) will be performed (10 trials total). A single-plane radiographic imaging system, consisting of a mobile C-arm fluoroscope retrofitted with 30-cm image intensifiers and high-speed video cameras, will be used for measuring accurate 3D skeletal kinematics; for this study, images will be acquired at 15 Hz. The following outcome variables will be obtained in order to determine elbow instability: 1) displacement of the proximal ulna relative to the distal humerus, and 2) displacement of the radial head relative to the distal humerus and proximal ulna. The recorded dynamic fluoroscopic data will go through a model-based tracking procedure to determine 3D elbow kinematics. CT scans of the elbow will be used to create specimen-specific 3D volumetric bone models. Then, a virtual x-ray system model proportionately identical to the true testing facility will be created. Digitally reconstructed radiographs (DRRs) can be computed with this simulation system for any bone orientation. A computer optimization algorithm will automatically determine the bone position/orientation at each instant in time that maximizes the correlation between the DRRs and the actual radiographs. The relative motions between the distal humerus, proximal ulna, and radial head will then be used to create a 3D model of skeletal movement, from which the joint rotations and translations will be determined. This model-based tracking, coordinate-system determination, and kinematic analysis have been used extensively in our previous studies.
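The structure of this model-based tracking can be conveyed by a toy sketch: render a DRR for a candidate pose, score it against the measured radiograph with normalized cross-correlation, and optimize over the six pose parameters. The orthographic point-splatting "renderer" and the Nelder-Mead search below are illustrative stand-ins for the actual ray-cast DRR generation and the optimizer used in the real system.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def render_drr(points, pose, size=64, scale=20.0):
    """Toy DRR: rotate/translate a bone point cloud and splat it onto the image plane."""
    rx, ry, rz, tx, ty, tz = pose
    R = Rotation.from_euler("xyz", [rx, ry, rz]).as_matrix()
    p = points @ R.T + np.array([tx, ty, tz])
    u = (p[:, 0] * scale + size / 2).astype(int)
    v = (p[:, 1] * scale + size / 2).astype(int)
    img = np.zeros((size, size))
    ok = (u >= 0) & (u < size) & (v >= 0) & (v < size)
    np.add.at(img, (v[ok], u[ok]), 1.0)          # accumulate "attenuation" per pixel
    return img

def ncc(a, b):
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def track_pose(points, radiograph, x0):
    """Find the pose that maximizes the DRR-radiograph correlation."""
    cost = lambda pose: -ncc(render_drr(points, pose), radiograph)
    return minimize(cost, x0, method="Nelder-Mead").x

bone = np.random.default_rng(1).normal(size=(500, 3))   # stand-in bone model
truth = np.array([0.1, -0.05, 0.2, 0.3, -0.2, 0.0])
measured = render_drr(bone, truth)                      # synthetic "radiograph"
print(track_pose(bone, measured, x0=np.zeros(6)))
```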
References [1] AAOS. Burden of musculoskeletal diseases in the United States: prevalence, societal and economic Cost, American Academy of Orthopaedic Surgeons (2008) [2] AAOS. Half of all musculoskeletal injuries occur in home. AAOS now (April 2009) [3] Acker, S., Li, R., et al.: Accuracy of single-plane fluoroscopy in determining relative position and orientation of total knee replacement components. Journal of Biomechanics (2010) (in press) [4] Fregly, B.J., Rahman, H.A., et al.: Theoretical accuracy of model-based shape matching for measuring natural knee kinematics with single-plane fluoroscopy. Journal of Biomechanical Engineering 127(4), 692–699 (2005) [5] Li, K., Tashman, S., et al.: Biodynamical responses during running after isolated PCL injury. In: The 57th Annual Meeting of Orthopaedic Research Society (2011) [6] Lu, T.W., Tsai, T.Y., et al.: In vivo three-dimensional kinematics of the normal knee during active extension under unloaded and loaded conditions using single-plane fluoroscopy. Medical Engineering Physics 30(8), 1004–1012 (2008)
[7] Matsuki, K.O., Matsuki, K., et al.: In vivo 3D kinematics of normal forearms: analysis of dynamic forearm rotation. Clinical Biomechanics (Bristol, Avon) 25(10), 979–983 (2010) [8] Scarvell, J.M., Pickering, M.R., et al.: New registration algorithm for determining 3D knee kinematics using CT and single-plane fluoroscopy with improved out-of-plane translation accuracy. Journal of Orthopaedic Research 28(3), 334–340 (2010) [9] Tashman, S., Kopf, S., et al.: The kinematic basis of ACL reconstruction. Oper. Tech. Sports Med. 16(3), 116–118 (2008) [10] Tersi, L., Fantozzi, S., et al.: 3D Elbow Kinematics with Monoplanar Fluoroscopy: In Silico Evaluation. Eurasip Journal on Advances in Signal Processing (2010) [11] Zheng, G.Y., Zhang, X.: A novel parameter decomposition based optimization approach for automatic pose estimation of distal locking holes from single calibrated fluoroscopic image. Pattern Recognition Letters 30(9), 838–847 (2009)
Implicit Human-Computer Interaction by Posture Recognition Enrico Maier Berlin Institute of Technology, Department of Psychology and Ergonomics, Chair of Human-Machine Systems, Research Training Group prometei, Franklinstr. 28-29, 10587 Berlin, Germany
Abstract. The presented work describes the evolution of the computer into a ubiquitous companion of the human. The resulting constant opportunity to interact with the computer, together with the explicit character of current user interfaces, distracts the human from his actual tasks. Accordingly, the authors suggest the design and study of new, implicit forms of interaction that support the human while he processes his tasks. In this work, a concept for implicit human-computer interaction based on the recognition of harmful postures is presented. The detection of and feedback on harmful postures by mobile computers such as smartphones is intended to warn the user of unhealthy postures in everyday life and at work and to prevent adverse health effects. Keywords: human-computer interaction, implicit interaction, mobile interaction, ubiquitous computing.
1 Introduction
The rapid evolution of computer and communication technology embeds more and more small, mobile computers such as smartphones, personal digital assistants (PDAs) and tablet PCs in our environment and shifts the human into the third era of the modern computer age, the era of ubiquitous computers [8]. The modern computer age started in the first era with the development of mainframes. These huge computers, which filled whole rooms, served many people at the same time and were owned by one or perhaps several companies. Only a few computers served many people in this era. In the second era, the personal computer (PC) was developed and spread into the work and living environments of the human. Nearly every person in the developed world owns, or at least uses, a personal computer, and the ratio between people and computers became balanced. The evolution of mobile computer technology then spread small, mobile computers explosively into the environment, and the computer emerged as a ubiquitous companion of the human. The era in which the human is currently
Fig. 1. Quantified evolution of the computer in the three eras of the modern computer age from [8]
situated is called the era of ubiquitous computing. The number of computers exceeds the number of people by far, and a person no longer operates one computer but many computers. In the first two eras, the interaction between human and computer took place at a desk or in other static places. Through the evolution of mobile computer technology, the computer became a ubiquitous companion of the human, and the interaction left the desk and static places and became mobile as well. Interaction is now possible, and increasingly required, at any time and anywhere. Current user interfaces use different modalities such as text, speech, gestures or graphical objects for interaction with the human, but all are based on the same explicit interaction paradigm: the human communicates with the computer through explicit and precise commands. Every interaction can be understood as a time-limited, bi-directional communication between the computer and the human. Both the information exchange between computer and human and the processing of the actual tasks claim cognitive resources of the human. These cognitive resources are limited, so the human cannot process two cognitive tasks, such as reading and writing, with satisfying results at the same time [5]. This limitation is reflected in Wickens' model of resource allocation during an information-processing task (see Fig. 2). The model shows that the resources for perception as well as the resources for the selection of a reaction are limited. Because the evolution of the computer into a ubiquitous companion makes permanent interaction possible and increasingly required, the limited cognitive resources of the human, i.e., the limited ability to process different tasks in parallel, together with the explicit communication character of the current interaction paradigm, lead to a permanent distraction of the human from his actual tasks. At every interaction with a computer, the human has to interrupt task processing and focus on the dialog with the computer.
Fig. 2. Wickens' model of resource allocation during an information-processing task, from [5]
The computer as a ubiquitous companion requires the design and study of new interaction forms that are based on an implicit and unobtrusive interaction paradigm.
2 Implicit Human-Computer Interaction
The exchange of information between human and computer, and its impact on the operations of the computer system, is called human-computer interaction [11]. Norman [9] describes in his interaction model, the execution-evaluation cycle (see Fig. 3.a), the fundamental structure of an interaction in current, conventional user interfaces. Here, the human formulates an execution plan and then executes it on the user interface. After executing the plan, or part of it, the human observes the interface and determines further actions if necessary. A similar interaction model is described by Abowd and Beale [1]. In their user-system model (see Fig. 3.b), the user communicates with the computer over the interface: he articulates his commands, observes the interface and articulates further commands if necessary. The fundamental structure of interaction in both models can be found in command-line-, speech-, gesture- and object-based (i.e., graphical) user interfaces. The underlying interaction paradigm is explicit in all of these interfaces: the human interacts with the computer through explicit and precise commands. The interaction thus requires a sort of dialog with the computer, so the human has to interrupt the processing of his tasks, and the ubiquity of the computer can turn into a disadvantage. Mark Weiser, a pioneer and co-founder of the research area of ubiquitous computing, noted: "The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it" [14].
Fig. 3. Interaction models: a) The Execution-Evaluation Cycle from Norman [9] and b) The User-System model from Abowd and Beale [1]
For the realization of this vision, not only the computer but also part of the interaction with it has to disappear into the background. To this end, Schmidt [11] suggests a new concept for human-computer interaction, based on an implicit exchange of information, and calls it implicit human-computer interaction (iHCI). He defines the new interaction paradigm in the following way:
Definition: Implicit Human-Computer Interaction (iHCI). iHCI is the interaction of a human with the environment and with artifacts which is aimed to accomplish a goal. Within this process the system acquires implicit input from the user and may present implicit output to the user.
Definition: Implicit Input. Implicit input are actions and behavior of humans, which are done to achieve a goal and are not primarily regarded as interaction with a computer, but captured, recognized and interpreted by a computer system as input.
Definition: Implicit Output. Output of a computer that is not directly related to an explicit input and which is seamlessly integrated with the environment and the task of the user.
Further research and discussion can be found in [7], [12] and [6]. The fundamental idea behind the new interaction concept is the recognition of the interaction between the human and the environment. This information serves as input to the computer system for the selection and activation of situation-supporting services. The activity of the human is captured implicitly and unobtrusively by the system as user input, so the human no longer needs to interact explicitly with the system and can focus on his actual tasks.
3 The Smartphone as Ubiquitous Computer
Since 2000, the capabilities of high-end mobile phones, so-called smartphones, have improved dramatically. They integrate the resources and functionality of computers into palm-sized mobile devices, small enough to fit in every pocket and to be carried by the user at any time. The explosive distribution of these devices has made the smartphone, alongside tablet PCs and personal digital assistants, the most successful ubiquitous computer there has ever been. In 2007, for the first time, more smartphones than notebooks were sold, and forecast shipments show an increasing distribution in the coming years (see Fig. 4).
Fig. 4. Shipments of mobile devices between 2004 and 2008 and forecast between 2008 and 2013 from [8]
As described in Section 1, the operation of these mobile computers through traditional, explicit user interfaces can distract the human from his actual tasks. Accordingly, new implicit interaction forms, which support the human during the processing of his tasks, should be designed and researched for this kind of computer. Smartphones offer not only computer-like resources but also diverse sensors, such as position, magnetic-field, light and proximity sensors or a GPS receiver. The position sensor, a so-called accelerometer, offers the potential for capturing the implicit user input described above. Brezmes [3] and Berchtold [2] demonstrated the feasibility of activity recognition through integrated accelerometers, with recognition rates of 80% up to 97%, and thereby showed the potential of implicit human-computer interaction with smartphones.
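To illustrate the kind of processing such recognition involves, the sketch below extracts simple window statistics from a three-axis accelerometer stream and classifies them with a nearest-centroid rule. The 50 Hz sampling rate, window length and classifier are illustrative assumptions, not the methods evaluated in [2] or [3]:

import numpy as np

def window_features(acc, fs=50, win_s=2.0):
    # Split an (N, 3) accelerometer signal into fixed windows and compute
    # per-axis mean and standard deviation as a minimal feature vector.
    n = int(fs * win_s)
    feats = []
    for start in range(0, len(acc) - n + 1, n):
        w = acc[start:start + n]
        feats.append(np.concatenate([w.mean(axis=0), w.std(axis=0)]))
    return np.array(feats)

class NearestCentroid:
    # Minimal classifier: assign each window to the closest class centroid.
    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.centroids = np.array(
            [X[np.asarray(y) == c].mean(axis=0) for c in self.labels])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return [self.labels[i] for i in d.argmin(axis=1)]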
4 Detection and Feedback of Disadvantageous Postures
The lifting and carrying of loads is an exposure for which the human body is insufficiently prepared. Accordingly, lifting and carrying operations contribute to early degeneration of the musculoskeletal system, which manifests as back pain.
Fig. 5. Advantageous (large images) and disadvantageous (small images) postures during the lifting and carrying of loads, from [13]
Primarily, during lifting and carrying operations, the right posture plays an important role in avoiding overload and impairment of the spine. When lifting with a straight back, the load on the spinal discs is reduced by 20% compared to lifting with a flexed back [13]. Additionally, with a good lifting technique the spinal discs are loaded more consistently, because they are loaded only under pressure and the pressure can spread well over the whole area of the discs and the vertebral bodies. Fig. 5 shows advantageous and disadvantageous postures during the lifting and carrying of loads. Furthermore, around one-third of all employees in the European Union spend more than half the workday in painful and exhausting postures, and around 50% of all employees are engaged in repetitive activities, which go along with painful and exhausting movements [10]. Pain and exhaustion can lead to musculoskeletal diseases, reduced productivity, and worsened posture and movement control. The latter can increase the risk of error, decrease quality, and lead to dangerous situations. The European standard Safety of machinery – Human physical performance [4] identifies three torso postures that should receive attention with respect to health risk: torso inclination forwards/backwards (Fig. 6.a), lateral torso inclination (Fig. 6.b) and torso rotation (Fig. 6.c). Furthermore, the standard defines acceptable, partly acceptable and unacceptable values for the inclination and rotation of the torso. The presented work intends to use the integrated position sensor of mobile computers to measure the current torso position with respect to forward/backward inclination, lateral inclination and rotation. When the thresholds defined in [4] are exceeded, the user receives vibrotactile feedback from the mobile computer. The detection of, and feedback on, disadvantageous torso positions is based on the implicit interaction paradigm defined in [11]: the mobile computer captures the torso posture of the human implicitly and unobtrusively and permanently assesses the health risk for the user. If a disadvantageous posture is detected, the user receives a warning. For the implementation of this interaction form, the mobile computer has to be placed precisely on the lumbar area of the human, so that the current position of the device corresponds to the current position of the torso. In an initial process,
Fig. 6. Health-relevant torso postures, from [4]
the current position is defined as the null position (x = 0, y = 0, z = 0). The device then continuously monitors the current coordinates of the position sensor. If the defined thresholds are exceeded, the user receives vibrotactile feedback.
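A minimal sketch of this monitoring loop is given below. The sensor and actuator interfaces (read_accelerometer, vibrate) and the 20-degree threshold are hypothetical placeholders, since the concrete platform APIs and the exact limit values from [4] are not reproduced in the text:

import math
import time

THRESHOLD_DEG = 20.0  # illustrative inclination limit, not the value from [4]

def inclination_deg(ax, ay, az, a0):
    # Angle between the current gravity vector and the calibrated
    # null-position vector a0 = (ax0, ay0, az0), in degrees.
    dot = ax * a0[0] + ay * a0[1] + az * a0[2]
    norm = math.sqrt(ax * ax + ay * ay + az * az) * math.sqrt(sum(c * c for c in a0))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def monitor(read_accelerometer, vibrate):
    a0 = read_accelerometer()              # initialization: define the null position
    while True:
        angle = inclination_deg(*read_accelerometer(), a0)
        if angle > THRESHOLD_DEG:          # disadvantageous posture detected
            vibrate(duration_ms=300)       # vibrotactile warning feedback
        time.sleep(0.1)                    # poll the sensor at roughly 10 Hz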
5 Conclusion
The described idea for the realization of human-computer interaction based on the recognition of postures presents an application that supports the human during the processing of his tasks without distracting him. The detection of, and feedback on, disadvantageous postures aims to reduce health risks at work and in everyday life. This aim shall be achieved by the immediate prevention of disadvantageous postures and a long-term change in behavior.
Acknowledgements. We want to thank the Deutsche Forschungsgemeinschaft (DFG) for funding this research (prometei 1013).
References
1. Abowd, G.D., Beale, R.: Users, systems and interfaces: a unifying framework for interaction. People and Computers 5, 73–87 (1991)
2. Berchtold, M., Budde, M., Gordon, D., Schmidtke, H., Beigl, M.: ActiServ: Activity recognition service for mobile phones. In: Wearable Computers (ISWC), pp. 1–9 (August 2010)
3. Brezmes, T., Gorricho, J.L., Cotrina, J.: Activity recognition from accelerometer data on a mobile phone. In: Omatu, S., Rocha, M.P., Bravo, J., Fernández, F., Corchado, E., Bustillo, A., Corchado, J.M. (eds.) IWANN 2009. LNCS, vol. 5518, pp. 796–799. Springer, Heidelberg (2009)
4. CEN: Safety of machinery – Human physical performance. European Standard, pp. 1–22 (2009)
5. Wickens, C.D., Hollands, J.G.: Engineering Psychology and Human Performance, 3rd edn. Prentice Hall, Englewood Cliffs (1999)
6. Gellersen, H.: Look who's visiting: supporting visitor awareness in the web. International Journal of Human-Computer Studies 56(1), 25–46 (2002)
7. Gellersen, H.W., Schmidt, A., Beigl, M.: Ambient media for peripheral information display. Personal Technologies 3, 199–208 (1999)
8. Krumm, J. (ed.): Ubiquitous Computing Fundamentals. Chapman & Hall/CRC (2010)
9. Norman, D.A.: The Psychology of Everyday Things. Basic Books, New York (1988)
10. Paoli, P., Merllié, D.: Third European survey on working conditions 2000. European Foundation for the Improvement of Living and Working Conditions, pp. 1–86 (2001)
11. Schmidt, A.: Interactive context-aware systems interacting with ambient intelligence. In: Ambient Intelligence, pp. 1–20 (2005)
12. Schmidt, A., Gellersen, H.W.: Visitor awareness in the web. In: WWW 10 (2001)
13. Steinberg, U.: Heben und Tragen ohne Schaden. Bundesanstalt für Arbeitsschutz und Arbeitsmedizin 5, 1–20 (2008)
14. Weiser, M.: The computer for the 21st century. Scientific American (1991), http://nano.xerox.com/hypertext/weiser/SciAmDraft3.html
Optimization-Based Posture Prediction for Analysis of Box Lifting Tasks
Tim Marler, Lindsey Knake, and Ross Johnson
Center for Computer Aided Design, Virtual Soldier Research (VSR) Program, The University of Iowa, Iowa City, Iowa 52242, USA
[email protected]
Abstract. New methods for optimization-based posture prediction with external forces are presented and tested. The proposed approach incorporates prediction of 113 degrees of freedom, including global position and orientation of the body as well as foot position, while considering balance. Postures and joint torques are successfully predicted and compared to motion-capture data and literature-based data, respectively. This approach is applied to a box-lifting task and provides a robust tool for studying human performance and for preventing injuries. Keywords: Posture prediction, optimization, box lifting, joint torque.
1 Introduction
In an effort to address potential pain and injury associated with box-lifting tasks, new developments with optimization-based posture prediction not only reflect research but also yield a unique and versatile tool for predicting and analyzing human performance. Especially within the manufacturing arena, data-based ergonomic tools help assess the risks associated with demanding tasks. While these tools provide well-accepted standards, they are limited in the types of scenarios to which they apply. Furthermore, traditional human modeling tools are limited in their predictive capabilities and versatility. Consequently, this paper presents new posture-prediction capabilities that are successfully applied to tasks like box lifting. Two novel predictive components are presented in the context of the optimization-based posture-prediction approach developed at the University of Iowa's Virtual Soldier Research (VSR) Program: 1) whole-body posture prediction including global position and orientation, which inherently includes prediction of foot position, and 2) posture prediction with external forces, which includes incorporation of balance and torque-based performance measures. While research on reach analysis and human posture analysis is extensive, relatively little work has been completed on real-time human posture prediction, especially with a whole-body model. The proposed work centers on a whole-body predictive human model that includes a total of 113 predicted degrees of freedom (DOFs). In addition to the underlying kinematic posture-prediction model, we present methods for incorporating external forces. In this way, the mass of body segments and
of components in the virtual environment are considered. Finally, various new torque-based performance measures are implemented to study the effects of joint torque on human performance. This improves realism substantially when modeling tasks such as box lifting. The consequent predictive capabilities provide realistic results, visually and objectively. The predicted torques are well within literature-based results. The ability to predict posture for a large number of DOFs; to predict overall body position and orientation, including foot position; and to predict joint torques while respecting balance restrictions are significant contributions. The fields of posture prediction and box-lifting analysis have both received considerable attention in the DHM community. As Marler et al (2005) and Marler et al (2008) summarize, the field of posture prediction has fostered three primary approaches: data-based approaches (Faraway, 1997; Chaffin et al, 1999), analytical inverse kinematics (IK) approaches, and optimization-based approaches (Zhao et al, 1994; Riffard et al, 1996; Mi et al, 2002; Farrell et al, 2005). While data-based methods can capture the realistic nuances of human motion, they tend to be resource intensive. Analytical approaches are becoming less common because of limitations in the scale of the model that can be used. If formulated and validated properly, optimization-based methods can provide realistic and computationally fast results for complex systems, and can provide unique tools for studying how and why humans behave the way they do. Optimization-based approaches allow one to study how different performance measures (used as objective functions) drive human postures (Marler, 2005; Marler et al, 2005b; Marler et al, 2009), but minimal work has been done with torque-based objectives. Although some optimization-based methods depend on unconstrained formulations (Zhao et al, 1994; Riffard et al, 1996), it can be helpful to separate out objective functions and constraints: the objectives represent what drives human performance, and the constraints represent the boundary conditions of the problem. Some literature suggests that joint torque should be combined with posture-prediction capabilities in order to predict human posture more accurately (Allread et al, 1998; Santos et al, 2000; Zacher and Bubb, 2004), and joint torques can only be calculated if external forces are incorporated in the predictive models. Although much experimental work has been completed regarding joint torque and posture, there are few computational posture-prediction models involving joint torque. Kim et al (2009) use the Santos model and the approach presented by Liu et al (2009) to develop a balance constraint for seated conditions, which involves calculating joint torques. Although a whole-body model is developed, the global degrees of freedom and the lower limbs are essentially frozen during the posture-prediction process. The visual results are similar to those presented by Marler et al (2007). Yang and Kim (2010) use the same underlying human model as Kim et al (2009), and simplify the work of Kim et al (2008) for static cases, as did Liu et al (2009); this work focuses on a method for calculating joint torques of a predefined posture, for standing and seated cases. The test cases are similar to those used by Liu et al (2009) and Marler et al (2007). Liu et al (2009) presents a method for optimization-based posture prediction with external forces.
However, it does not use a whole-body model for the predictions, and it does not consider balance. Thus, the proposed work is an extension of the methods presented by Liu et al (2009).
The literature analyzing box-lifting tasks is extensive, responding primarily to the need to reduce potential injuries. Two common tools for such analysis are the NIOSH lifting equation (Waters et al, 1994) and the Liberty Mutual lifting tables (Snook and Ciriello, 1991). However, these tools are static and limited with respect to the scenarios to which they apply. Alternatively, in the context of complete DHM tools, some work has been done with dynamic motion prediction (Chang et al, 2001; Kim et al, 2005; Xiang et al, 2009). While dynamic motion prediction provides the most accurate analysis of box-lifting tasks (Wagner et al, 2007), the work in this paper provides expanded and fast posture-prediction capabilities, with broad applications defined on the fly by the user, in order to bridge the gap between data-based tools and dynamic simulations.
2 Method: Posture Prediction with External Forces
This section provides an overview of the underlying human model, the conceptual formulation for posture prediction with external forces, and a summary of the technical components of the formulation. The work presented in this paper uses the Santos human model (Abdel-Malek et al, 2004) as a platform for further development. The underlying skeletal structure for Santos™ is modeled as a series of links, with each pair of links connected by one or more revolute joints (Figure 1).
Fig. 1. The Santos Model
There is one joint angle for each DOF, and the relationship between the joint angles and the position of points on the series of links (or on the actual avatar) is defined using Denavit-Hartenberg (DH) notation (Denavit and Hartenberg, 1955).
This structure is becoming a common foundation for additional work in the field of predictive human modeling and has been successfully used in other research efforts (Ma et al, 2009; Howard et al, 2010; Yang and Kim, 2010). Given the structure in Figure 1, postures are predicted using an optimization-based approach detailed by Marler (2005). Joint angles serve as the design variables, which are incorporated in various objective functions and constraints; the fundamental formulation is given as follows:

Find: $\mathbf{q} \in \mathbb{R}^{DOF}$
To minimize: $f(\mathbf{q})$
Subject to:
$\text{Distance} = \left\| \mathbf{x}(\mathbf{q})_{\text{end-effector}} - \mathbf{x}_{\text{target point}} \right\| \leq \varepsilon$
$q_i^L \leq q_i \leq q_i^U; \quad i = 1, 2, \ldots, DOF$

$\mathbf{q}$ is a vector of joint angles, $\mathbf{x}$ is the position of an end-effector or point on the avatar, $\varepsilon$ is a small positive number that approximates zero, and $DOF$ is the total number of degrees of freedom. In this study, a single model with 113 DOFs is used for the human torso, arms, legs, neck, hands, eyes, and global position and orientation. $f(\mathbf{q})$ can be one of many performance measures (Marler et al, 2005; Marler et al, 2005b; Marler, 2005; Marler et al, 2009). The primary constraint, called the distance constraint, requires the end-effector(s) to contact specified target point(s). $q_i^U$ represents the upper limit, and $q_i^L$ represents the lower limit; these limits are derived from anthropometric data. In addition to these basic constraints, many other constraints can be used as boundary conditions to represent the virtual environment. This optimization problem can include a series of additional constraints created by the user on the fly, in order to simulate infinitely many tasks. Automatically including the global DOFs allows one to predict the position and orientation of the body. Consequently, this is not just a method for inverse kinematics of human limbs; it is a method for simulating human performance. During the posture-prediction process, the feet are not fixed to one location. Rather, they are constrained to a bounded plane that represents the ground, so the optimum (most realistic) foot position is automatically predicted in real time.
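The following sketch illustrates this type of formulation on a toy three-DOF planar arm rather than the 113-DOF Santos model; the link lengths, limits, target point and joint-displacement objective are illustrative assumptions, not the published implementation:

import numpy as np
from scipy.optimize import minimize

L = np.array([0.30, 0.28, 0.18])       # assumed link lengths (m)
qL = np.radians([-60.0, 0.0, -45.0])   # lower joint limits
qU = np.radians([150.0, 150.0, 60.0])  # upper joint limits
qN = np.radians([20.0, 30.0, 0.0])     # assumed neutral posture
target = np.array([0.45, 0.35])        # target point for the end-effector

def end_effector(q):
    # Planar forward kinematics of the three-link chain
    angles = np.cumsum(q)
    return np.array([np.sum(L * np.cos(angles)), np.sum(L * np.sin(angles))])

def joint_displacement(q):
    # Performance measure f(q): deviation from the neutral posture
    return np.sum((q - qN) ** 2)

res = minimize(
    joint_displacement, x0=qN,
    bounds=list(zip(qL, qU)),      # joint limits
    constraints=[{"type": "ineq",  # distance constraint: end-effector near target
                  "fun": lambda q: 1e-4 - np.sum((end_effector(q) - target) ** 2)}],
    method="SLSQP")
print(np.degrees(res.x), end_effector(res.x))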
2.1 Formulation
This section uses the basic structure presented above and develops a new approach for posture prediction with external forces. The conceptual formulation is outlined as follows:
Given: 1) End-effectors and associated target points/lines/planes; 2) External loads; 3) Position and orientation of the ground
Find: Joint angles
To minimize: Joint torques and/or joint displacement
Subject to: 1) Distance constraints involving given target points/lines/planes; 2) Joint limits; 3) Torque limits; 4) All contact points remain on the contact plane; 5) Balance; 6) Static equilibrium, including body mass, external forces, and ground reaction forces
The mathematical formulation is outlined as follows and is solved using the SNOPT software (Gill et al, 2002), with analytical gradients determined for all objective functions and for all constraints.
Given: 1) Target point(s) and end-effector(s); 2) External loads $\mathbf{f}_k$, each at a given position for link $k$; 3) Contact plane, specified with a point and a normal
Find: Joint angles $\mathbf{q}$
To minimize: Joint torques $\boldsymbol{\tau}(\mathbf{q})$
Subject to: 1) Distance constraints; 2) Joint limits; 3) Torque limits; 4) All contact points on the contact plane, where each contact point is a local point on the foot; 5) The zero-moment point (ZMP) restricted to be inside the foot support region (FSR), defined by local foot boundary positions
156
T. Marler, L. Knake, and R. Johnson
but with consideration of external forces, not just gravity (Vukobratovic and Borovac, 2004). This point is constrained to remain within a foot support region. The predicted joint-torques are constrained to remain within limits representing strength restrictions. The objective function used to combine torques was developed by Marler (2005) as an approximation of the min-max approach to multi-objective optimization. Thus, the maximum joint-torque in the body is minimized. With smaller values of p, this function tends to mimic a sum, resulting in the collective minimization of all torques.
3 Results As an initial test of the calculated torques, Santos was positioned with his arm extended forward. While considering the weight of the arm and hand, the predicted torque was 11.84Nm. The torque calculated analytically was approximately the same: 12.01Nm. As a more functional basic test, posture is predicted with a 40N upward force applied to each wrist (Figure 2), with the global DOFs fixed. The feet are constrained to the ground, but their precise position is predicted. The torque objective function is combined with a joint displacement function (Marler et al, 2009), using the formulation developed by Marler (2005) and using weights of 0.75 (75%) and 0.25 (25%), respectively. The predicted posture is reasonable and results in low joint torques throughout the body. Furthermore all predicted joint torques are well within literature-based limits.
Fig. 2. Basic Test with Upward Forces
Figure 3 shows a series of tests with downward forces. When joint displacement is used alone, although most joint torques are relatively low, the torque in the right clavicle joint is nearly at its limit, indicating excessive use of the upper back muscles. With the second case, joint displacement is combined with maximum torque, which acts to reduce the clavicle torque. However, the load is distributed to the spine (lower back) where the risk of injury can be high. The third case provides a balance between using upper and lower back muscles. This series of tests demonstrates how different objective functions can be used to study what drives human behavior. Although joint
Optimization-Based Posture Prediction for Analysis of Box Lifting Tasks
157
torque (and associated muscle force) plays a significant role, it is not the only factor. In general, we find that minimizing maximum joint torque (as opposed to a sum of joint torques) provides the most realistic results, a finding that corresponds well with experimental findings in the literature (Zacher and Bubb, 2004). However, in some cases (i.e. Figure 3) humans behave based on slightly different objectives, and the presented model allows for this versatility. In the vein, the proposed work can be used for trade-off analyses to study what kinds of postures tend to minimize which performance measures. 100% joint disp.
25% joint disp. 75% joint torque
Fig. 3. Basic Tests with Downward Forces (panels: 100% joint displacement; 25% joint displacement, 75% joint torque; 25% joint displacement, 75% joint torque with p = 2)
20 lb Box
Motion Capture
Fig. 4. Box Lifting Results with 25% Joint Displacement and 75% Joint Torque
158
T. Marler, L. Knake, and R. Johnson
Figure 5 shows results using a female avatar with an alternate combination of performance measures. Clearly, box lifting tasks require one to focus on minimizing joint torque rather than joint displacement. When more joint displacement is included in the objective function, relatively high joint torques arise in the hip. Thus, typical strategies in lifting boxes help reduce hip torque as well as back torques. 75% joint displacement 25% joint torque 25% joint disp. 75% joint torque
Fig. 5. Box Lifting Results with Alternate Performance Measures
4 Discussion This paper has presented new capabilities for optimization-based posture prediction with external forces. A computationally fast and robust method has been presented for predicting human whole-body postures while considering any set of external forces or torques, as well as balance. Not only is the posture predicted in near real time but so is the overall body position and orientation. Although applied to a box lifting task for his work, the proposed method can be used with infinitely many scenarios. Furthermore, this optimization-based approach allows one to study what drives human performance. During this study, a few interesting issues surfaced. When compared to motioncapture results, the distinction between box-lifting motion and simply touching or pulling on a box surfaces. Posture prediction or even quasi-static analysis does not consider path planning (future conditions at each time step). Using posture prediction for a box lifting task provides only an approximation of the visual results. Nonetheless, it is useful for studying trends in joint torque and thus the propensity for injury. Secondly, with some of the results, there was a slight intersection between the box and the avatar knees. Collision avoidance is incorporated in this study, but the surrogate geometry used to represent the box is approximate and needs to be refined. With an optimization-based approach, one often raises the question of which objective function to use. However, the objective functions not only provide means of modeling and simulating human performance, but also tools in virtual experiments, to see the consequences of different drivers. Acknowledgments. The authors would like to thank Mr. John Meusch for his support and input.
Optimization-Based Posture Prediction for Analysis of Box Lifting Tasks
159
References 1. Abdel-Malek, K., Yang, J., Kim, J., Marler, R.T., Beck, S., Nebel, K.: Santos: A Virtual Human Environment for Human Factors Assessment. In: 24th Army Science Conference, Orlando, FL, Assistant Secretary of the Army (Research, Development and Acquisition). Department of the Army, Washington, DC (November 2004) 2. Allread, W.G., Marras, W.S., Fathallah, F.A.: The Relationship, Between Occupational Musculoskeletal Discomfort and Workplace, Personal, and Trunk Kinematic Factors. In: Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, Chicago, IL, pp. 896–900. Human Factors and Ergonomics Society, Santa Monica (1998) 3. Chaffin, D., Faraway, J., Zhang, X.: Simulating Reach Motions. In: SAE Digital Human Modeling for Design and Engineering International Conference, The Hague, The Netherlands, pp. 18–20 (May 1999) 4. Chang, C.-C., Brown, D.R., Bloswick, D., Hsiang, S.M.: Biomechanical Simulation of Manual Lifting Using Spacetime Optimization. Journal of Biomechanics 34, 527–532 (2001) 5. Denavit, J., Hartenberg, R.S.: A Kinematic Notation for Lower-pair Mechanisms Based on Matrices. Journal of Applied Mechanics 77, 215–221 (1955) 6. Faraway, J.J.: Regression Analysis for a Functional Response. Techometrics 39(3), 254– 262 (1997) 7. Farrell, K., Marler, R.T., Abdel-Malek, K.: Modeling Dual-Arm Coordination for Posture: An Optimization-Based Approach. In: SAE 2005 Transactions Journal of Passenger Cars Mechanical Systems, 114-6, 2891, SAE paper number 2005-01-2686 (2005) 8. Gill, P., Murray, W., Saunders, A.: SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization. SIAM Journal of Optimization 12(4), 979–1006 (2002) 9. Howard, B., Yang, J., Gragg, J.: Toward a New Digital Pregnamt Woman Model and Kinematic Posture Prediction. In: 3rd International Conference on Applied Human Factors and Ergonomics, Miami, FL (July 2010) 10. Kim, J.H., Abdel-Malek, K., Yang, J., Marler, T., Nebel, K.: Lifting Posture Analysis in Material Handling Using Virtual Humans. In: 2005 ASME International Mechanical Engineering Congress and Exposition, American Society of Mechanical Engineers (November 2005) 11. Kim, H.J., Xiang, Y., Bhatt, R., Ynag, J., Chung, H.-J., Patrick, A., Arora, J.S., AbdelMalek, K.: Efficient ZMP Formulation and Effective Whole-Body Motion Generation for a Human-Like Mechanism. In: Proceedings of the AME 2008 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference. American Society of Mechanical Engineers, Brooklyn (2008) 12. Kim, J.H., Yang, J., Abdel-Malek, K.: Multi-objective Optimization Approach for Predicting Seated Posture Considering Balance. International Journal of Vehicle Design 51(3/4), 278–291 (2009) 13. Liu, Q., Marler, T., Yang, J., Kim, J., Harrison, C.: Posture Prediction with External Loads – A Pilot Study. In: SAE 2009 World Congress, Detroit, MI. Society of Automotive Engineers, Warrendale (2009) 14. Ma, L., Zhang, W., Chablat, D., Bennis, F., Guillaume, F.: Multi-objective Optimization Method for Posture Prediction and Analysis with Consideration of Fatigue Effect and Its Application Case. Computers and Industrial Engineering 57, 1235–1245 (2009) 15. Marler, R.T.: A Study of Multi-objective Optimization Methods for Engineering Applications. Ph.D. Dissertation, University of Iowa, Iowa City, IA (2005) 16. Marler, T., Arora, J., Beck, S., Lu, J., Mathai, A., Patrick, A., Swan, C.: Computational Approaches in DHM. In: Duffy, V.G. (ed.) 
Handbook of Digital Human Modeling for Human Factors and Ergonomics. Taylor and Francis Press, London (2008)
160
T. Marler, L. Knake, and R. Johnson
17. Marler, R.T., Arora, J.S., Yang, J., Kim, H.-J., Abdel-Malek, K.: Use of Multi-objective Optimization for Digital Human Posture Prediction. Engineering Optimization 41(10), 295–943 (2009) 18. Marler, R.T., Rahmatalla, S., Shanahan, M., Abdel-Malek, K.: A New Discomfort Function for Optimization-Based Posture Prediction. In: SAE Human Modeling for Design and Engineering Conference, Iowa City, IA. Society of Automotive Engineers, Warrendale (2005) 19. Marler, R.T., Yang, J., Arora, J.S., Abdel-Malek, K.: Study of Bi-Criterion Upper Body Posture Prediction using Pareto Optimal Sets. In: IASTED International Conference on Modeling, Simulation, and Optimization, Oranjestad, Aruba. International Association of Science and Technology for Development, Canada (2005b) 20. Marler, T., Yang, J., Rahmatalla, S., Abdel-Malek, K., Harrison, C.: Validation Methodology Development for Predicted Posture. In: SAE Digital Human Modeling Conference, Seattle, WA. Society of Automotive Engineers, Warrendale (2007) 21. Mi, Z., Yang, J., Abdel-Malek, K., Mun, J.H., Nebel, K.: Real-Time Inverse Kinematics for Humans. In: Proceedings of the 2002 ASME Design Engineering Technical Conferences and Computer and Information in Engineering Conference, 5A, Montreal, Canada, pp. 349–359. American Society of Mechanical Engineers, New York (2002) 22. Riffard, V., Chedmail, P.: Optimal Posture of a Human Operator and CAD in Robotics. In: Proceedings of the 1996 IEEE International Conference on Robotics and Automation, Minneapolis, MN, pp. 1199–1204. Institute of Electrical and Electronics Engineers, New York (1996) 23. Santos, R., Verriest, J.P., Silva, K.: Relationship Between Discomfort Perception and Biomechanical Criteria During A Reaching Task. In: Proceedings of the XIVth Triennial Congress of the International Ergonomics Association and 44th Annual Meeting of the Human Factors and Ergonomics Association, San Diego, CA, pp. 333–336. Human Factors and Ergonomics Society, Santa Monica (2000) 24. Snook, S., Cirello, V.M.: The design of manual handling tasks: revised tables of maximum acceptable weights and forces. Ergonomics 34(9), 1197–1213 (1991) 25. Vukobratovic, M., Borovac, B.: Zero-Moment Point – Thirty Five Years of Its Life. International Journal of Humanoid Robotics 1(1), 157–173 (2004) 26. Wagner, D.W., Reed, M.P., Rasmussen, J.: Assessing the Importance of Motion Dynamics for Ergonomic Analysis of Manual Materials Handling Tasks using the AnyBody Modeling System. In: 2007 Digital Human Modeling Conference, Seattle, WA. Society of Automotive Engineers, Warrendale (2007) 27. Waters, T.R., Putz-Anderson, V., Garg, A.: Applications Manual For The Revised NIOSH Lifting Equation. U.S. Department of Health and Human Services, DHHS 28 (1994) 28. Xiang, Y., Arora, J.S., Rahmatalla, S., Bhatt, R., Marler, T., Abdel-Malek, K.: Human Lifting Simulation using Multi-objective Optimization Approach. Multibody Systems Dynamics 23(4), 431–451 (2009) 29. Yang, J., Kim, J.: Static Joint Torque Determination of a Human Model for Standing and Seating Tasks Considering Balance. Journal of Mechanisms and Robotics 2(3) (2010) 30. Zacher, I., Bubb, H.: Strength Based Discomfort Model of Posture and Movement. In: SAE Digital Human Modeling for Design and Engineering, Rochester, MI. SAE International, Warrendale (2004), SAE paper 2004-01-2139 31. Zhao, J., Badler, N.I.: Inverse Kinematics Positioning Using Nonlinear Programming for Highly Articulated Figures. ACM Transactions on Graphics 13(4), 313–336 (1994)
Planar Vertical Jumping Simulation – A Pilot Study
Burak Ozsoy and Jingzhou (James) Yang
Human-Centric Design Research Laboratory, Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409
[email protected]
Abstract. Vertical jumping is one of the fundamental motions among jumping types in sport biomechanics. Two criteria in sport biomechanics are critical to all athletes: injury and performance. In the literature, two major approaches have been investigated: experiment-based methods and optimization-based methods. Experiment-based methods are time consuming and tedious. Optimization-based methods for musculoskeletal models are computationally expensive because their models include all muscles and require explicit integration of the equations of motion. In this pilot study, a direct optimization-based method for a skeletal model in the sagittal plane was proposed; the formulation was based on joint space and considered only the resultant effect of the muscles (joint torques), rather than individual muscles, to reduce computational time. The cost function included increasing the center of mass velocity at take-off and increasing the center of mass position at take-off. Constraints included joint limits, torque limits, initial posture, ground contact, initial angular velocity and acceleration, and zero ground reaction forces and moment at take-off. This optimization problem was solved by the commercial optimization solver SNOPT, and the CPU time was 227 seconds on a regular PC (Intel® Core™2 Duo CPU, 3.16 GHz, 3.25 GB RAM). Preliminary results correlated highly with results from the literature. This simple planar simulation is the first step toward understanding cause and effect in vertical jumping with or without arm swing. Keywords: vertical jumping, planar model, injury, performance, arm swing.
1 Introduction
Jumping is a fundamental human movement and a vital skill, especially in sports, that requires coordination of the lower and upper extremities. This movement consists of three stages: a propulsive stage, a flying stage and, finally, a landing stage. Because of the contact of the foot with the ground at the end of the propulsive stage, the movement is prone to injuries. In order to prevent injuries and also to find out how to increase performance, significant research has been devoted to this area. A literature review of vertical jumping was performed by Ozsoy et al. (2010). In the literature, two major approaches have been investigated: experiment-based methods and optimization-based methods. Experiment-based methods are time consuming and tedious, and they can investigate the problem only for the tested subject's
performance. Optimization-based methods can predict the motion, but they are computationally expensive because their models include all muscles. There are many studies investigating different types of jumps, such as squat vertical jumping and countermovement jumping with and without arm swing (Luhtanen et al., 1978; Shetty et al., 1989; Khalid et al., 1989; Harman et al., 1990; Lees et al., 2004; Hara et al., 2006; Pandy et al., 1990; Cheng et al., 2008). In this study, an optimization-based recursive dynamic formulation was proposed for the planar simulation of jumping in the sagittal plane, where the formulation was based on joint space and considered only the resultant effect of the muscles (joint torques), rather than individual muscles, to reduce computational time. The problem formulation was subjected to a set of constraints. The cost function, the maximization of jump performance, included two terms: the vertical center of mass position at take-off and the vertical center of mass velocity at take-off. The current problem formulation allows cause and effect to be investigated easily; therefore, it was also used to investigate the effect of arm swing on jump performance.
Fig. 1. Planar body model (joint coordinates q1–q7; global axes x, y, z)
2 Human Body Model
In this study, a planar body model with 7 degrees of freedom was developed. The body model consisted of foot, shank, thigh, trunk, and arm; the model in the sagittal plane is shown in Figure 1. One revolute DOF was defined for the shoulder, hip, knee, and ankle, respectively; three global DOFs (two prismatic and one revolute) were defined; and the global coordinate system was fixed on the ground. The joint angle vector was defined as $\mathbf{q} = [q_1 \; q_2 \; \cdots \; q_7]^T$. The revolute joints used in the body model were modeled as frictionless hinge joints. The three global DOFs were considered virtual joints, which were used for the calculation of ground reaction forces (GRFs). The body model
parameters, link lengths, link masses, moments of inertia and centers of mass were taken from a previous study (Cheng et al., 2008). The human body can be modeled as a kinematic chain consisting of revolute joints representing the musculoskeletal joints, connected by links that represent the bones. A local Cartesian coordinate system was fixed to each link, and simulated motion was created by rotating each of the joints about its local z-axis. The model was represented by generalized coordinates $q_i$. Since the generalized coordinates were measured about the local axes, transformation matrices were needed for each joint and were generated with the Denavit-Hartenberg (DH) method (Denavit and Hartenberg, 1955).
3 Dynamics Model
In this study, the problem of the motion prediction of a two-dimensional human body model was formulated as an optimization problem. Recursive Lagrangian dynamics, which includes forward recursive kinematics and backward recursive dynamics, was used.
3.1 Forward Recursive Kinematics
Transformation matrices obtained from the DH method can be used to determine the forward kinematics. The procedure is shown below:

$\mathbf{A}_j = \mathbf{T}_1 \mathbf{T}_2 \mathbf{T}_3 \cdots \mathbf{T}_j = \mathbf{A}_{j-1}\mathbf{T}_j$   (1)

$\mathbf{B}_j = \dot{\mathbf{A}}_j = \mathbf{B}_{j-1}\mathbf{T}_j + \mathbf{A}_{j-1}\dfrac{\partial \mathbf{T}_j}{\partial q_j}\dot{q}_j$   (2)

$\mathbf{C}_j = \dot{\mathbf{B}}_j = \mathbf{C}_{j-1}\mathbf{T}_j + 2\mathbf{B}_{j-1}\dfrac{\partial \mathbf{T}_j}{\partial q_j}\dot{q}_j + \mathbf{A}_{j-1}\dfrac{\partial^2 \mathbf{T}_j}{\partial q_j^2}\dot{q}_j^2 + \mathbf{A}_{j-1}\dfrac{\partial \mathbf{T}_j}{\partial q_j}\ddot{q}_j$   (3)

where $\mathbf{A}_0 = \mathbf{I}$ (identity matrix) and $\mathbf{B}_0 = \mathbf{C}_0 = \mathbf{0}$. After the calculation of the transformation matrices, the global position, velocity and acceleration of a point in Cartesian space can be calculated as:

${}^0\mathbf{r}_j = \mathbf{A}_j \mathbf{r}_j, \quad {}^0\dot{\mathbf{r}}_j = \mathbf{B}_j \mathbf{r}_j, \quad {}^0\ddot{\mathbf{r}}_j = \mathbf{C}_j \mathbf{r}_j$   (4)
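A direct transcription of recursions (1)-(3) for a planar revolute chain is sketched below; the 4×4 homogeneous DH transforms and link lengths are illustrative, and the second derivative of the transform follows from differentiating the rotation block twice:

import numpy as np

def T_of(q, a):
    # DH transform of a planar revolute joint with link length a
    c, s = np.cos(q), np.sin(q)
    return np.array([[c, -s, 0, a * c],
                     [s,  c, 0, a * s],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]], dtype=float)

def dT_of(q, a):
    # Partial derivative of the transform with respect to q
    c, s = np.cos(q), np.sin(q)
    return np.array([[-s, -c, 0, -a * s],
                     [ c, -s, 0,  a * c],
                     [ 0,  0, 0, 0],
                     [ 0,  0, 0, 0]], dtype=float)

def forward_recursion(q, qd, qdd, a):
    # Recursions (1)-(3): A_j, B_j = dA_j/dt, C_j = d^2A_j/dt^2
    A, B, C = [np.eye(4)], [np.zeros((4, 4))], [np.zeros((4, 4))]
    for j in range(len(q)):
        T, dT = T_of(q[j], a[j]), dT_of(q[j], a[j])
        d2T = -T.copy(); d2T[2:, :] = 0.0   # d^2T/dq^2 for a revolute joint
        A.append(A[-1] @ T)
        B.append(B[-1] @ T + A[-2] @ dT * qd[j])
        C.append(C[-1] @ T + 2 * B[-2] @ dT * qd[j]
                 + A[-2] @ d2T * qd[j] ** 2 + A[-2] @ dT * qdd[j])
    # Per (4): A[j] @ r, B[j] @ r, C[j] @ r give the global position,
    # velocity and acceleration of a homogeneous point r on link j.
    return A, B, C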
3.2 Backward Recursive Dynamics
Torques at each joint that generate the motion were calculated with given mass inertia properties of each link, $\mathbf{I}_i$, external force $\mathbf{f}_k^T = [{}^k f_x \; {}^k f_y \; {}^k f_z \; 0]$ and external moment $\mathbf{h}_k^T = [{}^k h_x \; {}^k h_y \; {}^k h_z \; 0]$ for link $k$, defined in the global coordinate system. The equation consists of four components: torque due to inertia and Coriolis acceleration, gravity, external forces, and external moments.
$\tau_i = \mathrm{tr}\!\left[\dfrac{\partial \mathbf{A}_i}{\partial q_i}\mathbf{D}_i\right] + \mathbf{g}^T\dfrac{\partial \mathbf{A}_i}{\partial q_i}\mathbf{E}_i + \sum_{k=1}^{n}\mathbf{f}_k^T\dfrac{\partial \mathbf{A}_i}{\partial q_i}\mathbf{F}_i^k + \sum_{k=1}^{n}(\mathbf{G}_i^k)^T\mathbf{A}_{i-1}\mathbf{z}_0$   (5)

where

$\mathbf{D}_i = \mathbf{I}_i\mathbf{C}_i^T$   (6)

$\mathbf{E}_i = m_i\,{}^i\mathbf{r}_i + \mathbf{T}_{i+1}\mathbf{E}_{i+1}$   (7)

$\mathbf{F}_i^k = {}^k\mathbf{r}_f\,\delta_{ik} + \mathbf{T}_{i+1}\mathbf{F}_{i+1}^k$   (8)

$\mathbf{G}_i^k = \mathbf{h}_k\,\delta_{ik} + \mathbf{G}_{i+1}^k$   (9)

$\mathbf{D}_{n+1} = \mathbf{0}$ and $\mathbf{E}_{n+1} = \mathbf{F}_{n+1} = \mathbf{G}_{n+1} = \mathbf{0}$; $\mathbf{I}_i$ is the inertia matrix for link $i$; $m_i$ is the mass of link $i$; $\mathbf{g}^T = [0 \; {-g} \; 0 \; 0]$ is the gravity vector; ${}^i\mathbf{r}_i$ is the location of the center of mass of link $i$ in the local frame $i$; ${}^k\mathbf{r}_f$ is the position of the external force in the local frame $k$; $\mathbf{z}_0 = [0 \; 0 \; 1 \; 0]^T$ for a revolute joint and $\mathbf{z}_0 = [0 \; 0 \; 0 \; 0]^T$ for a prismatic joint; $\delta_{ik}$ is the Kronecker delta.
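As an illustration of the backward pass, the sketch below (reusing T_of and dT_of from the previous sketch) evaluates only the gravity term of (5) through the E-recursion (7); the masses and homogeneous local center-of-mass positions (x, y, z, 1) are illustrative inputs, not the model parameters of Cheng et al. (2008):

def gravity_torques(q, a, m, r_local, g=9.81):
    # tau_i (gravity term of (5)) = g^T (dA_i/dq_i) E_i,
    # with E_i = m_i * r_i + T_{i+1} E_{i+1} and E_{n+1} = 0, per (7).
    n = len(q)
    T = [T_of(q[j], a[j]) for j in range(n)]
    A = [np.eye(4)]
    for j in range(n):
        A.append(A[-1] @ T[j])
    gvec = np.array([0.0, -g, 0.0, 0.0])    # gravity vector g^T = [0 -g 0 0]
    E = np.zeros(4)
    tau = np.zeros(n)
    for i in range(n - 1, -1, -1):          # backward recursion over links
        E = m[i] * np.asarray(r_local[i], dtype=float) + (T[i + 1] @ E if i + 1 < n else E)
        tau[i] = gvec @ (A[i] @ dT_of(q[i], a[i])) @ E
    return tau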
4 Numerical Discretization
The motion has to be discretized for numerical calculation. The entire time domain was discretized by 3rd-order B-spline curves, owing to their important properties of local control, differentiability and continuity. As the motion is generated by the joint torques, motion plans for each joint's angular position, velocity and acceleration were represented by 3rd-order B-splines, which are functions of control points and basis functions. Once the joint angular position is represented as a function of time, its first and second derivatives are the profiles for angular velocity and acceleration.
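For example, a single joint's cubic B-spline profile and its derivatives can be generated from control points as follows (the number of control points, the knot layout and the values are illustrative):

import numpy as np
from scipy.interpolate import BSpline

t_total, k = 0.5, 3                               # motion time (s), cubic degree
P = np.array([0.6, 0.55, 0.3, -0.2, -0.9, -1.2])  # control points (rad): the design variables
n = len(P)
# Clamped knot vector on [0, t_total]; the knots are fixed inputs
knots = np.concatenate([np.zeros(k),
                        np.linspace(0.0, t_total, n - k + 1),
                        np.full(k, t_total)])
q = BSpline(knots, P, k)                          # joint angle q(t)
qd, qdd = q.derivative(1), q.derivative(2)        # angular velocity and acceleration
ts = np.linspace(0.0, t_total, 100)
q_profile, qd_profile, qdd_profile = q(ts), qd(ts), qdd(ts)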
5 Optimization Formulation
In this section, the problem formulation developed in this study is described. The design variables, objective function and constraints used in the simulation are explained.
5.1 Design Variables
In the optimization formulation, the design variables were the control points of the B-spline curves. As the total time of the motion and the number of steps are given as inputs to the problem formulation, the knots were already defined. The optimization problem therefore determines the control points that generate the motion plan for jumping under the given objective function.
5.2 Objective Function
The jumping motion, which requires coordination of the upper and lower extremities, can be considered two-dimensional projectile motion. The objective function was
to maximize the center of mass position at the apex of the flight. The center of mass position at the apex of the flight depends on two variables: the center of mass position ($X_{CM}$) and velocity ($V_{CM}$) at take-off. Therefore, the problem was considered a multi-objective optimization (MOO) problem. The objective function is given in (10), where $w_1$ and $w_2$ are weight values, both set to 1.

$\max \left( w_1 X_{CM@take\text{-}off} + w_2 \dfrac{V_{CM@take\text{-}off}^2}{2g} \right)$   (10)
The center of mass position can be calculated directly from the orientation of the individual segments; however, the center of mass velocity requires an additional calculation. To calculate the center of mass velocity, the impulse-momentum relationship can be used (Harman et al., 1990). The relationship is shown in (11), where $m_t$ is the total mass of the body, $W$ is the weight of the body, $GRF_y$ is the vertical ground reaction force, and $t_o$ is the time at take-off, at which the optimization ends.

$V_{CM@take\text{-}off} = \dfrac{1}{m_t} \int_0^{t_o} \left( GRF_y - W \right) dt$   (11)
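Numerically, (11) is a running integral of the net vertical force over the sampled GRF profile; a discrete evaluation is:

import numpy as np

def takeoff_velocity(grf_y, t, m_total, g=9.81):
    # Discrete form of (11): V = (1/m_t) * integral of (GRF_y - W) dt
    W = m_total * g
    return np.trapz(grf_y - W, t) / m_total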
5.3 Constraints
In order to predict a realistic human vertical jump, several constraints were implemented in the problem. These constraints are: joint limits, torque limits, initial posture, ground contact, initial angular velocity and acceleration, and zero GRFs and moment at take-off.
Joint Angle Limits. Joint angle limits were imposed to prevent hyperextension and to predict the motion within the physical range. The angular position of each joint should remain between the upper and lower physical limits, as shown in (12).

$q_i^L \leq q_i(t) \leq q_i^U, \quad 0 \leq t \leq t_o$   (12)
In the restricted arm-swing jump simulation, arm movement was prevented by setting the arm joint limits, as a function of time, to the value at which the arms are parallel to the trunk.
Joint Torque Limits. In this study, the problem formulation was based on joint space and considered only the resultant effect of the muscles (joint torques), rather than individual muscles, to reduce computational time. As with the joint angle limits, the joint torques were constrained, as shown in (13), to keep the problem within the physical range. The joint torque limits were taken from the literature.

$\tau_i^L \leq \tau_i(t) \leq \tau_i^U, \quad 0 \leq t \leq t_o$   (13)
Initial Posture. The initial posture for the simulation was obtained through experiments; the performance of the jump also depends on the initial posture. A posture reconstruction algorithm was applied to the experimental data to determine the initial joint angle positions at which the body was in static equilibrium.
Ground Contact. As only the first stage of the jump was predicted in this study, there was always contact between the body and the ground. The origin was placed at the toe, and its position was constrained in the horizontal and vertical directions.
Initial Angular Velocity and Acceleration. When the body is in static equilibrium at the beginning of the movement, the vertical GRF should be equal to the body weight. Therefore, all initial angular joint velocities and accelerations were constrained to zero.
Zero Ground Reaction Force and Moment at Take-Off. In order to finalize the prediction of the motion, zero-GRF and zero-GRM constraints were used. The physical meaning of this constraint is the release of the ground contact constraint, at which point the motion enters the second stage, flying.
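In practice, constraints (12) and (13) can be imposed at sampled times over the design vector of B-spline control points; the schematic below (not the authors' implementation — the angle and torque evaluations are left as callbacks) builds such constraint sets in the form accepted by common SQP solvers:

import numpy as np

def limit_constraints(eval_q, eval_tau, qL, qU, tauL, tauU, ts):
    # Inequality functions (value >= 0 when feasible) enforcing (12) and (13)
    # at each sampled time; eval_q / eval_tau map the design vector x
    # (spline control points) to joint angles / torques at time t.
    cons = []
    for t in ts:
        cons.append({"type": "ineq", "fun": lambda x, t=t: eval_q(x, t) - qL})
        cons.append({"type": "ineq", "fun": lambda x, t=t: qU - eval_q(x, t)})
        cons.append({"type": "ineq", "fun": lambda x, t=t: eval_tau(x, t) - tauL})
        cons.append({"type": "ineq", "fun": lambda x, t=t: tauU - eval_tau(x, t)})
    return cons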
6 Results and Discussion
The kinematic results of the simulations with arm swing (5S) and without arm swing (4S) are shown in Table 1.

Table 1. Results of jumping with and without arm swing

Kinematic results                          4S      5S
Time elapsed before take-off [s]           0.43    0.5
CM position at take-off [m]                0.899   0.939
CM vertical velocity at take-off [m/s]     2.315   2.676
Maximum CM height at apex [m]              1.172   1.304
The total time required for the jump was used as an input, obtained from the experiment for both types of jump. It can clearly be seen that the jump with arm swing required more time, which increased the center of mass position and velocity at take-off. The center of mass position and velocity profiles are shown as functions of time in Figures 2 and 3. As can be seen from Figure 2, the vertical center of mass positions at the beginning were not the same for the 4S and 5S jumps, due to the different orientation of the arm segment. In order to predict a realistic jump, the body remained in a static posture at the beginning of the motion and then relaxed, as shown in Figures 4 and 5 for both types of jump. This relaxation period yielded a negative vertical velocity at the beginning
for the 5S jump. For the 4S jump, the vertical CM velocity became slightly less than zero at 0.2 s before take-off. The reason for the different timings of the negative velocity may be the arm swing.
Fig. 2. Vertical CM position profile
Fig. 3. Vertical CM velocity profile
Fig. 4. Stick diagram postures without arm swing as a function of time
Fig. 5. Stick diagram postures with arm swing as a function of time
The maximum vertical center of mass velocity achieved for the 5S jump was predicted as 2.724 m/s at 0.010 s before take-off, whereas for the 4S jump it was 2.349 m/s at 0.011 s before take-off. The increase of the CM position from the beginning to take-off was predicted as 0.260 m with arm swing and 0.181 m without arm swing. The CM velocity at take-off was found to be 2.676 m/s and 2.314 m/s with and without arm swing, respectively. These results are close to the values of 2.71 m/s and 2.51 m/s (Hara et al., 2006) and 2.81 m/s and 2.58 m/s (Feltner et al., 1999). The vertical GRFs for both 4S and 5S started with a value equal to the body weight, due to the constraint of zero angular velocity and acceleration at the beginning. Due to the relaxation period at the beginning, the vertical ground reaction profiles first decreased and then started to increase. The arm swing lengthened the pre-take-off duration and yielded a higher peak value in the latter half of the jump, as shown in Figure 6. The take-off velocity, calculated with the impulse-momentum relationship, was enhanced by 0.361 m/s by the contribution of the arm swing. The contribution of the velocity to the performance was found to be 69.1%, and the remaining 30.9% was due to the higher center of mass position at take-off.
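These percentages can be checked directly from the reported take-off values:

$\Delta h_V = \dfrac{V_{5S}^2 - V_{4S}^2}{2g} = \dfrac{2.676^2 - 2.315^2}{2 \times 9.81} \approx 0.092\ \mathrm{m}, \qquad \Delta X_{CM} = 0.939 - 0.899 = 0.040\ \mathrm{m},$

so the velocity term accounts for $0.092/(0.092 + 0.040) \approx 70\%$ of the total apex gain ($1.304 - 1.172 = 0.132$ m), consistent with the reported 69.1% / 30.9% split; the small difference reflects rounding of the reported values.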
Fig. 6. Vertical GRF profiles
Fig. 7. Joint torque profiles: (a) ankle, (b) knee, (c) hip, (d) shoulder
In the former half of both the 4S and 5S jumps, the torque profiles showed the same trend, as shown in Figure 7. The effect of the arm swing on the lower extremities was obvious, especially for the knee and hip joints. In the latter half of the jump, these joints achieved higher torques due to the arm swing.
7 Conclusion
This paper presented a computationally efficient, direct optimization-based motion prediction method for planar vertical jumps with and without arm swing. Resultant joint torques were used in the calculations instead of individual muscles to reduce computational time. Simulations took 227 seconds on a regular PC (Intel® Core™2 Duo CPU, 3.16 GHz, 3.25 GB RAM). According to the simulation results, the vertical center of mass take-off velocities were 2.676 m/s and 2.315 m/s, and the vertical center of mass positions at take-off were 0.939 m and 0.899 m, for jumping with and without arm swing, respectively. Of the performance increase, i.e., the gain in the height the center of mass can reach in the flying stage, 69.1% was due to the enhanced velocity and the remaining 30.9% was due to the higher center of mass position. The take-off velocity was calculated with the impulse-momentum relationship, which means that the enhanced take-off velocity was due to the higher vertical ground reaction profile and the longer contact duration. In conclusion, this simple model is useful for understanding vertical jumping and for investigating cause and effect easily. Preliminary results were obtained through this new formulation and matched literature results. Future work includes: 1) embedding time as a design variable so that the time for the vertical jump can be optimized; 2) extending the planar example to a 3-dimensional (3D) human model with a high number of degrees of freedom; 3) validating the simulation results for the 3D model; and 4) including landing in the optimization to study landing strategies that reduce injuries.
References
1. Cheng, K.B., Wang, C.H., Chen, H.C., Wu, C.D., Chiu, H.T.: The mechanisms that enable arm motion to enhance vertical jump performance - a simulation study. Journal of Biomechanics 41, 1847–1854 (2008)
2. Denavit, J., Hartenberg, R.S.: A kinematic notation for lower-pair mechanisms based on matrices. Journal of Applied Mechanics 22, 215–221 (1955)
3. Feltner, M.E., Frasceti, D.J., Crisp, R.J.: Upper extremity augmentation of lower extremity kinetics during countermovement vertical jumps. Journal of Sports Sciences 17, 449–466 (1999)
4. Hara, M., Shibayama, A., Takeshita, D., Fukashiro, S.: The effect of arm swing on lower extremities in vertical jumping. Journal of Biomechanics 39, 2503–2511 (2006)
5. Harman, E.A., Rosenstein, M.T., Frykman, P.N., Rosenstein, R.M.: The effects of arms and countermovement on vertical jumping. Medicine and Science in Sports and Exercise 22, 825–833 (1990)
6. Khalid, W., Amin, M., Bober, T.: The influence of upper extremities movement on takeoff in vertical jump. Biomechanics in Sports, 375–379 (1989)
7. Lees, A., Vanrenterghem, J., Clercq, D.D.: Understanding how an arm swing enhances performance in the vertical jump. Journal of Biomechanics 37, 1929–1940 (2004)
8. Luhtanen, P., Komi, P.V.: Segmental contribution to forces in vertical jump. European Journal of Applied Physiology and Occupational Physiology 38, 181–188 (1978)
9. Ozsoy, B., Yang, J., Boros, R.: Human jumping motion analysis and simulation - a literature review. In: 3rd International Conference on Applied Human Factors and Ergonomics, Miami, Florida, July 17-20 (2010)
10. Pandy, M.G., Zajac, F.E., Sim, E., Levine, W.S.: An optimal control model for maximum-height human jumping. Journal of Biomechanics 23, 1185–1198 (1990)
11. Shetty, A.B., Etnyre, B.R.: Contribution of arm movement to the force components of a maximum vertical jump. Journal of Orthopedic and Sports Therapy 11, 198–201 (1989)
StabilitySole: Embedded Sensor Insole for Balance and Gait Monitoring Peyton Paulick, Hamid Djalilian, and Mark Bachman Henry Samueli School of Engineering University of California Irvine Department of Biomedical Engineering 305 Rockwell Engineering Center, Irvine CA USA 92697-2700
[email protected],
[email protected]
Abstract. Our group has developed an easy-to-use pressure-sensing insole equipped with accelerometers to evaluate a patient's postural control and balance ability in a non-obtrusive manner. Pressure sensors and MEMS accelerometers are embedded in a shoe insole, and the balance information is displayed on an easy-to-use graphical user interface. Balance and gait monitoring is applicable to a variety of fields, such as the evaluation of neurological disorders, early disease detection in children, new medication monitoring, and elderly fall risk. With this technology, patients can use the insole in a home setting to monitor their balancing ability and relay the data to their physician wirelessly for proper evaluation, eliminating the need for additional in-office visits. Keywords: Balance monitoring, gait monitoring, insole, embedded technology, graphical user interface, pressure sensing, MEMS accelerometer.
1 Introduction The complex human balance system is primarily based on three components: the vestibular system located in the inner ear, visual feedback, and proprioception. Any impairment in these three categories can significantly affect an individual's sense of balance. Balance and gait monitoring is applicable to a variety of fields, such as the evaluation of neurological disorders, early disease detection in children [2], new medication monitoring, post-surgery monitoring, elderly fall risk, and post sports injury analysis, all of which involve conditions that can impair balance and postural control. Specifically, elderly fall risk is of major concern to society due to its high occurrence, detrimental effects on the quality of life of elders, and financial cost. Falls cause not only physical but also psychological trauma, reduced activity, loss of independence, and even death [5]. Additionally, the direct medical costs of non-fatal fall injuries among the elderly totaled $19 billion USD in 2000 [5]. Thus, elderly fall risk is not only a social problem but a financial one as well. When a patient has a prospective balance impairment, physicians recommend a balance assessment using a posturography machine. Traditional posturography machines, which quantify a patient's postural control, are extremely costly and require operation by a technician and a physical therapist for balance evaluation. Although this technology provides complex and useful information for balance evaluation, it requires a visit to the physician and multiple technicians for operation, and it is extremely costly.
With the technology boom, sensors and microprocessing chips are becoming smaller and cheaper and can thus be integrated into "smart clothes" and other small, practical technology for daily use. As the technology becomes more cost effective, integrating these sensors into technologies for home use becomes more feasible. Taking the technology used in posturography and integrating it into an in-home system would be extremely useful for those patients who have difficulty leaving the home on a regular basis. Through the use of inertial sensors and pressure sensors, information about postural control and balance ability can be attained. Our group has developed a shoe insole with integrated pressure sensors and inertial sensors to accurately obtain data about a patient's gait in real time. By developing a graphical user interface that interprets the acquired data, patients can visualize their gait and balance parameters and determine whether their balance is improving or worsening. This can be used as an easy-to-use tool to evaluate the effectiveness of physical therapy or to determine the likelihood that a patient will experience a fall. This technology can be applied to a variety of situations to evaluate gait and balance, from elderly fall detection to post sports injury evaluation. With additional research, those with balance impairments can use these technologies to ameliorate some of the disadvantages that come with a balance disorder and improve daily life.
2 Methods Our group sought to create a technology to monitor balance and gait that would be extremely user friendly and unobtrusive. The most logical approach that fit our criteria was an athletic insole that could easily be placed in a sneaker or sandal and that a patient could use on a daily basis without sensing that the insole was anything different from a standard 'off the shelf' shoe insole. The insole is equipped with four piezoresistive pressure sensors by FlexiForce® placed at four locations in the insole: the heel, the big toe, the right ball, and the left ball. These locations were chosen based on a standard pressure distribution of the foot, which indicates these locations as the places of highest pressure on the foot in both static and dynamic conditions. This pressure sensor was chosen for this application due to its flat geometry and flexibility, so the sensors go unnoticed by the patient using the insoles. A piezoresistive sensor changes resistance based on applied stress. There are two main causes for this change in resistance [1]. First, resistance is determined by bulk resistivity, length, and cross-sectional area; therefore, if the geometry of the sensor changes due to applied stress, the resistance will also change accordingly [1]. Secondly, resistivity depends on the average atomic spacing in a semiconductor lattice, which changes when the material is deformed, thus changing the resistivity of the material [1]. Both the dimensional change and the atomic spacing contribute to the decrease in resistance when a force is applied to the sensor [1]. Additionally, a 3-axis accelerometer by Freescale Semiconductor® was integrated into each insole. Information acquired from an accelerometer can provide additional insight for the diagnosis of a balance or neurological disorder. Identifying such an abnormality in the home setting can pinpoint the disorder earlier than a standard doctor's office visit.
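As a rough illustration of how such a piezoresistive sensor reading could be converted to a force value, the sketch below assumes a simple voltage-divider readout circuit and an approximately linear force-conductance relationship; the circuit values and the calibration constant are hypothetical and are not taken from the paper:

```python
def sensor_resistance(v_out, v_cc=5.0, r_fixed=10e3):
    """Sensor resistance in a voltage divider: the sensor sits between
    v_cc and the measurement node, r_fixed between the node and ground."""
    return r_fixed * (v_cc - v_out) / v_out

def estimated_force(v_out, cal_gain=2.0e4, v_cc=5.0, r_fixed=10e3):
    """Force estimate assuming conductance (1/R) grows roughly linearly
    with the applied load; cal_gain [N*Ohm] is a per-sensor calibration."""
    return cal_gain / sensor_resistance(v_out, v_cc, r_fixed)
```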
Fig. 1. Computer aided conceptual design of StabilitySole demonstrating both the MEMS accelerometer and pressure sensors
The University of Pittsburgh found that abnormal gait in high-functioning older adults can be an indicator of subtle brain abnormalities. Different spatial and temporal parameters of gait, such as speed, stride length and double support time, can be associated with subclinical infarcts or brain atrophy [3]. This group found that poorer gait speed, shorter stride, and longer double support time in high-functioning older adults are associated with high white matter disease and subclinical strokes [3]. A group in Boston, Massachusetts found that velocity, cadence (number of steps per minute), step length and stride length appear to be effective tools for the evaluation of physical therapy outcomes, specifically for the evaluation of patients suffering from multiple sclerosis and hemiparesis [2]. Researchers at the Albert Einstein College of Medicine found that neurological gait abnormalities in elderly subjects are a significant predictor of the risk of developing dementia [6]. These findings validate the importance of using an accelerometer in the StabilitySole balance and gait system: information obtained from an embedded accelerometer can evaluate these gait predictors to identify abnormalities in a non-obtrusive, in-home manner. The novel approach of incorporating an accelerometer into the insole provides bipedal information related to foot displacement, force of contact, velocity, and the gait cycle. The acceleration data in combination with the pressure distribution can be used to evaluate abnormalities of gait and postural control. Preliminary data demonstrate marked differences in acceleration patterns between different types of gait abnormalities, supporting the effectiveness of this added parameter. Information about the pressure distribution and acceleration in the feet is relayed to a graphical user interface that displays a colored pressure map of both feet. This pressure map indicates when a certain pressure sensor has been activated and to what degree. Not only is the color map activated, but pressure values are also indicated in the information window for each sensor to give a more comprehensive look into the patient's balancing abilities. In addition to the pressure
Fig. 2. StabilitySole graphical user interface
Fig. 3. StabilitySole integrated into a standard athletic sandal demonstrating versatility of technology
map, the user interface displays information about a patient's center of gravity using crosshairs. This icon moves within a predefined space based on a center of gravity calculation using information from both insoles. The center of gravity icon also offers a 'trace' feature that shows the past history of the center of gravity position. Numerical values of the center of gravity are recorded in real time and used for analysis. Identifying the center of gravity traces for normal gait and postural control can serve as a means of comparison for an abnormal gait. This feature can be of great value for static postural control and the determination of sway parameters, which relate directly to balance control. Increased postural sway can be an indicator of a
patient who is at risk of falling. These sway parameters are assessed directly by the real-time measurement of the center of gravity values over time. Analog information is relayed from the sensors to a PICAXE® microcontroller and then transmitted to an XBee® wireless module. Next, the data are sent wirelessly from the insoles to the receiver module, which is linked to the desired computer through an easy-to-use USB attachment, and the data are displayed on the GUI. This assistive device can relay important information about patients' gait and postural control that can be used to diagnose fall risk, evaluate a patient's response to new medication and determine the effectiveness of physical therapy regimens.
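A minimal sketch of such a center-of-pressure and sway computation is given below; the sensor coordinates are hypothetical placeholders for the heel, big-toe, right-ball and left-ball positions, and this is an illustration rather than the project's actual GUI code:

```python
import numpy as np

# Hypothetical sensor positions (metres) in an insole coordinate frame:
# heel, big toe, right ball, left ball.
SENSOR_XY = np.array([[0.00, 0.00],
                      [0.02, 0.22],
                      [0.04, 0.16],
                      [-0.02, 0.16]])

def center_of_pressure(pressures):
    """Pressure-weighted centroid of the four sensor locations."""
    p = np.asarray(pressures, dtype=float)
    return (p[:, None] * SENSOR_XY).sum(axis=0) / p.sum()

def sway_path_length(cop_trace):
    """Total path length of the recorded COP trace, a simple sway measure."""
    steps = np.diff(np.asarray(cop_trace), axis=0)
    return np.hypot(steps[:, 0], steps[:, 1]).sum()
```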
3 Results To date, StabilitySole has been built and is successfully transmitting data wirelessly. Preliminary acceleration results have been collected and show a distinct difference between different classifications of abnormal gait. Specifically, z-axis acceleration data seem to show the greatest variation between gait classifications. The graphs display bipedal acceleration over time in the z-direction. Normal, hemiplegic, and spastic gait were used as the three gait categories. All three categories showed marked differences in acceleration patterns, demonstrating the ability to classify gait through acceleration.
Fig. 4. Preliminary bipedal acceleration data for hemiplegic (right limp), normal, and ataxic (right drag), demonstrating marked differences between gait patterns
Moreover, the placement of the accelerometer is of great importance. Determining which portion of the insole would yield the most information was a crucial parameter in building the insole. Three potential options were explored: placement in the toe, middle, and heel of the insole. Preliminary results demonstrate that placement in the toe of the insole provides the most comprehensive look into the patient's gait. This finding is most likely due to the dorsiflexion and plantar flexion movements during a gait cycle. Distinct patterns between different types of gait demonstrate that acceleration data can be a marker for gait and balance abnormalities. The acceleration data can also function as a tilt sensor, giving insight into a patient's sway parameters, which have been shown to be indicative of a balance disorder.
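For illustration, tilt angles can be derived from a static 3-axis accelerometer sample by using gravity as the reference vector; the sketch below shows the standard formulas and is not specific to the StabilitySole firmware:

```python
import numpy as np

def tilt_angles(ax, ay, az):
    """Pitch and roll in degrees from one static accelerometer sample."""
    pitch = np.degrees(np.arctan2(ax, np.hypot(ay, az)))
    roll = np.degrees(np.arctan2(ay, np.hypot(ax, az)))
    return pitch, roll
```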
4 Future Works Future work includes testing subjects to validate the user-friendly nature of the insole and the feasibility of integrating it into daily life. Next, the group will expand the methods for data analysis to create an algorithm that can identify different center of gravity traces and acceleration patterns, and use these data to characterize balance abilities and gait parameters. This algorithm will be based on pattern recognition. Finally, other groups have shown that a variety of somatosensory feedback methods, feeding back information on the patient's center of gravity, can be helpful to those with a balance disorder. Currently our device is strictly a non-intrusive diagnostic tool for at-home use. However, integrating sensory feedback based on the patient's balancing and gait parameters, to help the patient correct for balancing abnormalities, would be of great value, and our group is planning to explore those capabilities.
5 Conclusions StabilitySole is a pressure-sensing insole equipped with accelerometers to evaluate a patient's postural control and balance ability in a non-obtrusive manner. Balance and gait monitoring is applicable to a variety of fields, such as the evaluation of neurological disorders, early disease detection in children, new medication monitoring, and elderly fall risk. With this technology, patients can use the insole in a home setting to monitor their balancing ability and relay the data to their physician wirelessly for proper evaluation, eliminating the need for additional in-office visits. Home health monitoring is of increasing importance, especially for older adults who have difficulty leaving the home; therefore, in-home methods for health care monitoring are of great value. Acknowledgements. I would like to thank Dr. Hamid Djalilian for the medical expertise and input into the StabilitySole project, and Luke Heidbrink for the creation of the graphical user interface of StabilitySole.
References
1. Geyling, F.T., Forst, J.J.: Semiconductor Strain Transducers. The Bell System Technical Journal 39, 705–731 (1960)
2. Holden, M., et al.: Clinical Gait Assessment in the Neurologically Impaired, vol. 64(1) (January 1984)
3. Rosano, C., et al.: Quantitative Measures of Gait Characteristics Indicate Prevalence of Underlying Subclinical Structural Brain Abnormalities in High-Functioning Older Adults. Neuroepidemiology 26, 52–60 (2006)
4. Anne, S.-C., et al.: Effect of balance training on recovery of stability in children with cerebral palsy. Developmental Medicine & Child Neurology 45, 591–602 (2003)
5. Stevens, J.A., Corso, P.S., Finkelstein, E.A., Miller, T.R.: The costs of fatal and non-fatal falls among older adults. Injury Prevention 12, 290–295 (2006)
6. Verghese, J., et al.: Abnormality of Gait as a Predictor of Non-Alzheimer's Dementia. New England Journal of Medicine 347(22) (November 28, 2002)
The Upper Extremity Loading during Typing Using One, Two and Three Fingers Jin Qin1, Matthieu Trudeau1, and Jack T. Dennerlein1,2
1 Department of Environmental Health, Harvard School of Public Health, Boston, MA, 02115, USA
2 Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
[email protected]
Abstract. This study aimed to evaluate the effect of the number of fingers used during typing on the biomechanical loading on the upper extremity. Six subjects typed in phone numbers using their right hand on a stand-alone numeric keypad in three conditions: (1) typing using the index finger; (2) typing using the index and the middle fingers; (3) typing using the index, middle and ring fingers. Typing with three fingers decreased wrist posture deviation, decreased angular velocity at the wrist, elbow and shoulder joints, and decreased peak to peak torques at the wrist and shoulder joints compared to single finger typing, while no difference was found between one and two finger typing. These results demonstrated that different computer keyboarding styles affect the biomechanical loading on the upper extremity. Keywords: Typing; Upper extremity; Kinematics; Kinetics.
1 Introduction More than 76 million people, or 56% of employed people in the US, used a computer at work in 2003 [1]. The high prevalence of computer use and its association with upper extremity musculoskeletal disorders (MSDs) have caused a substantial public health burden in the US. Epidemiologic evidence has consistently reported that more than 50% of computer users sustain musculoskeletal symptoms and disorders of the upper extremity, spanning from the neck and shoulder to the hand and wrist [2,3,4,5]. The biomechanical risk factors, including posture, force, repetition, and dynamic motion, affect the loading on the soft tissues, which is related to injury outcomes for the upper extremity [6]. Studies examining the biomechanics of computer use have focused almost exclusively on the distal joints and segments, and only on the kinematics of the movement [7,8,9]. Kuo et al. [10] characterized index finger tapping on a keyswitch in terms of finger joint kinetics, kinematics, muscle activation patterns, and energy profiles. They found that while both potential energy and kinetic energy are large enough to overcome the work necessary to press the key switches, the motor control strategies nonetheless utilize muscle forces and joint torques to ensure a successful keystroke.
Dennerlein et al. [11] measured both the kinematic and kinetic characteristics of the whole upper extremity. They suggested that the finger, wrist and elbow joints work in synergy to produce fingertip motion, with a majority of the movement being generated by the finger and the wrist joint. However, both of these studies were limited in that subjects were tapping on a single key switch using the index finger only, and the motion occurred mainly in the two-dimensional sagittal plane. Upper extremity movement during real keyboard use is more complicated, involving three-dimensional (3-D) motion and multiple fingers. This study aims to explore the characteristics of the 3-D motion of the wrist, elbow and shoulder joints, and to estimate the joint torques while typing with one, two and three fingers. The force, posture, dynamic motion and joint torque can provide a quantitative assessment of the biomechanical loading on the upper extremity musculoskeletal system. This is the first step in studying the biomechanical risk of upper extremity MSDs associated with computer use.
2 Methods 2.1 Experimental Setup Six participants (mean (SD) age 28.3 (2.0) years, height 168.4 (7.4) cm, weight 64.9 (12.7) kg), three women and three men, entered phone numbers on a stand-alone numeric keypad with their right hand (Fig. 1). The phone numbers consisted only of numbers randomly generated from zero to nine. The experiment included three conditions: (1) typing using the index finger only; (2) typing using the index and middle fingers only; (3) typing using the index, middle and ring fingers only. With a repeated measures study design, each participant completed all three conditions, with the order of the conditions randomized. The participants were free of upper extremity MSDs at the time of the experiments. All subjects were right-handed. Informed consent was obtained, and all experimental procedures and forms were approved by the Harvard School of Public Health Institutional Review Board. A stand-alone numeric keypad was used instead of a full-sized keyboard because we wanted participants to perform a less familiar task than typing on a standard keyboard, to minimize the effect of typing skill on the performance and potentially on the kinematics and kinetics. All conditions were performed at the participant's own comfortable speed. During the experiment, subjects were instructed to type normally as if on their own computers. For the conditions that required typing with multiple fingers, subjects were not instructed to use a specific finger for a specific column of keys; however, the number of fingers used was monitored to ensure that the required fingers were indeed used. They were instructed to proceed if mistakes were made during typing. The key switch's activation force was 0.44 N, and the total travel (the maximum distance the key can be pressed down in the vertical direction) was 3.23 mm. The subjects completed the typing tasks while seated in a chair adjusted such that their thighs were parallel to the ground and the height of the typing surface was even with their seated elbow height. No forearm or palm support was provided.
Fig. 1. Schematic view of the sagittal plane of the right upper extremity during the experiment, defining the global coordinate system
2.2 Experimental Measurements Force vectors applied by the fingertip to the key and keypad were measured by a 6-axis force-torque transducer (ATI Industrial Automation, model Gamma, SI-65-5.5, Apex, NC, USA). The keypad sat on top of a 3 mm-thick aluminum sheet attached to the force-torque transducer. The amplified signals from the force-torque transducer were recorded onto a personal computer with a National Instruments board (NI PCI-6040, National Instruments, Austin, TX, USA) at 200 samples per second. The force signal was low-pass filtered digitally using a fourth-order Butterworth filter with a cutoff frequency of 20 Hz and zero phase shift. Three-dimensional kinematics of the upper extremity were recorded using an active-marker infrared motion analysis system (Optotrak Certus System, Northern Digital, Ontario, Canada). Four clusters of three markers secured on rigid plates were mounted on the hand, forearm, upper arm and torso, each treated as a rigid body segment. A single marker was attached to the fingertip of the index finger. The 3-D positions of these markers were recorded at 200 samples per second. All kinematic data were low-pass filtered digitally using a fourth-order Butterworth filter with a cutoff frequency of 10 Hz and zero phase shift. Using the relationships of the anatomical landmarks and anthropometric data, the position of each segment's center of mass, its inertia tensors and its orientation were calculated at each instant of time based on the 3-D trajectories of the markers [12,13,14]. From the trajectories of the upper extremity segments we calculated joint angles, angular excursions, and joint torques. Joint angles were calculated using Euler angles with reference to the posture of the forearm fully pronated, the elbow flexed 90 degrees, and the upper arm vertical to the ground. The angular velocity was derived as
the derivative of the posture, and the rectified mean was calculated. For the joint posture excursion and torque sign convention, flexion, adduction, supination and internal rotation were positive, and extension, abduction, pronation and external rotation were negative. Angular excursion was calculated as the difference between the 95th and the 5th percentiles of the joint angles. A 3-D multi-segment inverse dynamics model [10,15] calculated the net reactive joint forces and moments for the wrist, elbow and shoulder joints. Peak-to-peak joint torque was calculated as the difference between the 95th and the 5th percentiles of the torques. Participants typed for 20 seconds in each condition, and the middle fifteen seconds of data for each subject in each condition were extracted for analysis. One-minute rests were given between conditions. 2.3 Statistical Analysis The dependent variables were calculated based on each participant's anthropometry. The mean and standard deviation of the dependent variables across participants are presented. To determine the difference in joint loading when using different numbers of fingers during typing, we compared the kinematics and kinetics across the three experimental conditions. We evaluated the effect of typing condition on fingertip force, joint posture, and moment in 3-D coordinates for each joint using a mixed effects model with the condition as the fixed effect and subject as the random effect. If a significant condition effect was found (P ≤ 0.05), we proceeded with Tukey's post-hoc pairwise comparison between each condition pair.
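As an illustration of the signal processing described in this section, the following sketch (using SciPy; the array names are assumptions, not the study's code) applies a fourth-order zero-phase Butterworth low-pass filter and computes the percentile-based excursion:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 200.0  # sampling rate [Hz]

def zero_phase_lowpass(signal, cutoff_hz, fs=FS, order=4):
    """Fourth-order Butterworth low-pass applied forward and backward
    (filtfilt), which gives zero phase shift."""
    b, a = butter(order, cutoff_hz / (fs / 2.0))
    return filtfilt(b, a, signal)

def excursion(angles_deg):
    """Excursion as the difference between the 95th and 5th percentiles."""
    return np.percentile(angles_deg, 95) - np.percentile(angles_deg, 5)

# e.g. wrist_angle_filtered = zero_phase_lowpass(wrist_angle, cutoff_hz=10.0)
```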
3 Results The fingertip force magnitudes were 1.5 to 1.6 times the activation force requirement of the keypad. The average forces on the keypad were 0.65 (0.16) N, 0.65 (0.07) N and 0.72 (0.08) N when using one, two and three fingers during typing, respectively. The mean wrist abduction (ulnar deviation) posture across the six subjects was 6.4 (4.7) degrees (Table 1) during one finger typing, significantly greater than during three finger typing (2.3 (4.1) deg). The wrist flexion excursions in the one (13.6 (2.2) deg) and two (14.1 (1.1) deg) finger typing conditions were significantly greater than in three finger typing (11.6 (1.9) deg). The mean posture of shoulder internal rotation during three finger typing was greater than during two finger typing, and there was more forearm supination excursion during three finger typing than during one finger typing. The flexion angular velocities of the wrist (22.7 (4.0) deg/s), elbow (7.8 (2.3) deg/s) and shoulder (5.0 (2.0) deg/s) during three finger typing were lower than during one finger typing (P values < 0.06). Shoulder internal rotation velocities during one (7.3 (2.2) deg/s) and two (7.3 (3.6) deg/s) finger typing were significantly greater than during three finger typing (6.1 (3.0) deg/s). Two finger typing did not differ from one finger typing in terms of joint angles and velocities. The mean joint torques did not differ across conditions (Table 2), while the peak to peak joint torque of wrist adduction decreased significantly from one (6.7 (1.8) Ncm)
to three finger typing (5.6 (1.5) Ncm). Similarly, the peak to peak torque of shoulder internal rotation during one finger typing (35.4 (10.5) Ncm) was significantly greater than during three finger typing (24.3 (7.6) Ncm). The shoulder adduction peak to peak torque difference between one (25.0 (8.9) Ncm) and three (22.1 (8.9) Ncm) finger typing was marginally significant (P = 0.06). No significant difference in joint torque between one and two finger typing was found.
4 Discussion The goal of this paper was to evaluate the effect of the number of fingers used during typing on the biomechanical loading of the upper extremity. Three-dimensional motion of the upper extremity during random phone number typing was tracked, and the kinematic and kinetic profiles from the shoulder to the wrist joint were recorded and computed. The results showed decreased wrist posture deviation, decreased angular velocity at the wrist, elbow and shoulder joints, and decreased peak to peak torques at the wrist and shoulder joints as the number of fingers increased from one to three during typing. This study is among the first few endeavors to quantitatively assess the 3-D kinematics and kinetics of the upper extremity during computer keyboard use. A previous effort at assessing biomechanical loading evaluated the task of single finger tapping on a single key-switch [11]. As the next step, the task in this study was more complicated in that participants were typing on multiple keys using multiple fingers. This study showed that various typing styles can lead to different biomechanical loading on the upper extremity. However, this is just one piece of a big puzzle. Typing is a complicated motion which can be affected by individual factors, the physical work environment, and cognitive and mental load. Future research needs to evaluate each of those factors and the associated internal loading on the musculoskeletal system. Duration has been reported to be another risk factor for upper extremity MSDs related to computer use [16]; therefore, the temporal changes over a longer period of time need to be evaluated. This study is a stepping stone for designing future research in real work situations, typing on a standard QWERTY computer keyboard for an extended period of time. Note that no differences were found between typing using one and two fingers, but both differed from typing using three fingers. It is likely that typing with four or five fingers of each hand will lead to even more distinct biomechanical loading profiles. More markers can be added on all fingers and each finger segment so that the DIP and PIP joint kinematics and kinetics of each finger can be recorded. In conclusion, the results of this study demonstrated that the number of fingers used during keying leads to different biomechanical loading patterns. The biomechanical loading on the wrist and the shoulder during one and two finger typing was higher than during three finger typing, and no difference was found between one and two finger typing. The results of this study may be used to design future research to study the association between typing styles and the risks and injury mechanisms of computer-use-related upper extremity MSDs.
Table 1. Mean (SD) joint posture, excursion and velocity across six subjects for three typing conditions: using one, two or three fingers. Conditions with different superscript letters are significantly different, and the corresponding means are represented by the letters such that A>B. Bold numbers indicate a condition effect with P ≤ 0.05. Joint angular velocities were mean rectified. Flexion, adduction, supination and internal rotation were positive, and extension, abduction, pronation and external rotation were negative.

                          Flexion                                     Adduction                                  Internal rotation (Supination)
                          One           Two           Three           One           Two           Three          One           Two           Three
Posture mean (deg)
  Wrist                   -10.7 (14.3)  -11.9 (15.5)  -11.7 (15.9)    -6.4 (4.7)B   -4.5 (3.2)AB  -2.3 (4.1)A
  Forearm                                                                                                        25.1 (4.0)    24.8 (3.6)    24.5 (3.7)
  Elbow                   4.9 (8.9)     4.7 (8.4)     2.7 (8.0)
  Shoulder                4.5 (9.9)     3.2 (8.8)     6.1 (7.4)       -17.7 (2.8)   -16.9 (2.3)   -16.7 (2.6)    10.4 (8.4)AB  9.9 (8.9)B    12.7 (8.2)A
Posture excursion (deg)
  Wrist                   13.6 (2.2)A   14.1 (1.1)A   11.6 (1.9)B     5.2 (1.9)     5.8 (0.8)     6.1 (1.4)
  Forearm                                                                                                        3.5 (1.1)B    3.8 (1.2)AB   4.9 (1.7)A
  Elbow                   3.6 (1.7)     3.6 (2.0)     3.5 (1.3)
  Shoulder                5.3 (0.8)     5.9 (1.2)     5.1 (1.9)       1.6 (0.9)     1.7 (1.1)     1.3 (0.6)      3.8 (0.8)     4.0 (1.7)     3.8 (1.9)
Angular velocity (deg/s)
  Wrist                   30.4 (3.8)A   27.0 (3.9)AB  22.7 (4.0)B     12.1 (2.9)    11.8 (3.2)    11.1 (3.8)
  Forearm                                                                                                        10.7 (2.7)    10.6 (3.3)    10.8 (2.9)
  Elbow                   9.4 (3.7)A    9.1 (3.5)AB   7.8 (2.3)B
  Shoulder                6.0 (2.0)     6.1 (2.7)     5.0 (2.0)       2.9 (1.1)     2.7 (1.7)     2.3 (0.9)      7.3 (2.2)A    7.3 (3.6)A    6.1 (3.0)B
Table 2. Mean (SD) of joint torque (Ncm) statistics across six subjects for three typing conditions: using one, two or three fingers. Conditions with different superscript letters are significantly different, and the corresponding means are represented by the letters such that A>B. Bold numbers indicate a condition effect with P ≤ 0.05. Flexion, adduction, supination and internal rotation were positive, and extension, abduction, pronation and external rotation were negative.

                              Flexion                                 Adduction                               Internal rotation (Supination)
                              One          Two          Three         One          Two          Three         One           Two           Three
Mean joint torque (Ncm)
  Wrist                       -26.2 (9.4)  -26.1 (8.6)  -26.4 (8.9)   3.5 (2.0)    3.4 (1.9)    3.1 (2.3)
  Forearm                                                                                                     -17.2 (30.6)  -16.8 (29.7)  -20.7 (32.9)
  Elbow                       -200 (70)    -200 (74)    -200 (71)
  Shoulder                    -255 (147)   -266 (157)   -284 (154)    -164 (44)    -158 (67)    -154 (52)     1.8 (2.7)     2.4 (4.2)     0.4 (2.1)
Peak to peak joint torque (Ncm)
  Wrist                       30.5 (5.7)   28.8 (6.4)   28.2 (5.5)    6.7 (1.8)A   5.9 (1.2)AB  5.6 (1.5)B
  Forearm                                                                                                     11.7 (2.6)    12.0 (3.4)    11.4 (4.2)
  Elbow                       11.4 (1.9)   10.3 (1.4)   10.3 (1.2)
  Shoulder                    60.6 (12.3)  61.0 (21.1)  56.1 (18.2)   25.0 (8.9)   27.6 (12.9)  22.1 (8.9)    35.4 (10.5)A  31.3 (10.4)A  24.3 (7.6)B
Acknowledgements. This work was funded in part by NIOSH R01 OH008373 and the NIOSH ERC at Harvard University (grant T42 OH008416-05).
References
1. US Census Bureau. Computer and Internet Use in the United States, Current Population Reports, pp. 23–208 (2005)
2. Eltayeb, S., Staal, J.B., Kennes, J., Lamberts, P.H., de Bie, R.A.: Prevalence of complaints of arm, neck and shoulder among computer office workers and psychometric evaluation of a risk factor questionnaire. BMC Musculoskel Dis. 8, 68 (2007)
3. Eltayeb, S.M., Staal, J.B., Hassan, A.A., Awad, S.S., de Bie, R.A.: Complaints of the arm, neck and shoulder among computer office workers in Sudan: a prevalence study with validation of an Arabic risk factors questionnaire. Environ. Health 7, 33 (2008)
4. Klussmann, A., Gebhardt, H., Liebers, F., Rieger, M.A.: Musculoskeletal symptoms of the upper extremities and the neck: a cross-sectional study on prevalence and symptom-predicting factors at visual display terminal (VDT) workstations. BMC Musculoskel Dis. 9, 96 (2008)
5. Gerr, F., Marcus, M., Ensor, C., Kleinbaum, D., Cohen, S., Edwards, A., Gentry, E., Ortiz, D.J., Monteilh, C.: A prospective study of computer users: I. Study design and incidence of musculoskeletal symptoms and disorders. Am. J. Ind. Med. 41, 221–235 (2002)
6. NIOSH. In: Bernard, B.P. (ed.) Musculoskeletal disorders and workplace factors: A critical review of epidemiologic evidence for work-related musculoskeletal disorders of the neck, upper extremity, and low back. Cincinnati, OH. DHHS (NIOSH) Publication Number 97-141 (1997)
7. Baker, N.A., Cham, R., Cidboy, E.H., Cook, J., Redfern, M.S.: Kinematics of the fingers and hands during computer keyboard use. Clin. Biomech. (Bristol, Avon) 22, 34–43 (2007)
8. Serina, E.R., Tal, R., Rempel, D.: Wrist and forearm postures and motions during typing. Ergonomics 42, 938–951 (1999)
9. Sommerich, C., Marcus, W., Parnianpour, M.: A quantitative description of typing biomechanics. J. Occup. Rehabil. 6, 33–55 (1996)
10. Kuo, P.-L., Lee, D.L., Jindrich, D.L., Dennerlein, J.T.: Finger joint coordination during tapping. J. Biomech. 39, 2934–2942 (2006)
11. Dennerlein, J.T., Kingma, I., Visser, B., van Dieen, J.H.: The contribution of the wrist, elbow and shoulder joints to single-finger tapping. J. Biomech. 40, 3013–3022 (2007)
12. Cappozzo, A., Catani, F., Croce, U.D., Leardini, A.: Position and orientation in space of bones during movement: anatomical frame definition and determination. Clin. Biomech. (Bristol, Avon) 10, 171–178 (1995)
13. McConville, J.T., Churchill, T.D., Kaleps, I., Clauser, C.E., Cuzzi, J.: Anthropometric relationships of body and body segment moments of inertia. Air Force Aerospace Medical Research Laboratory, Wright-Patterson Air Force Base, Ohio, AFAMRL-TR-80-119 (1980)
14. Veldpaus, F.E., Woltring, H.J., Dortmans, L.J.: A least-squares algorithm for the equiform transformation from spatial marker co-ordinates. J. Biomech. 21, 45–54 (1988)
15. Kingma, I., Toussaint, H.M., De Looze, M.P., Van Dieen, J.H.: Segment inertial parameter evaluation in two anthropometric models by application of a dynamic linked segment model. J. Biomech. 29, 693–704 (1996)
16. Gerr, F., Monteilh, C.P., Marcus, M.: Keyboard use and musculoskeletal outcomes among computer users. J. Occup. Rehabil. 16, 265–277 (2006)
Automatic Face Feature Points Extraction Dominik Rupprecht, Sebastian Hesse, and Rainer Blum Hochschule Fulda – University of Applied Sciences, Marquardstr. 35, 36039 Fulda, Germany {Dominik.Rupprecht,Sebastian.Hesse,Rainer.Blum}@informatik.hs-fulda.de
Abstract. In this paper we present results of our work on automatically equipping a three-dimensional avatar with a model of a user's individual head. For the generation of the head model, certain so-called face feature points must be extracted from a face picture of the user. A survey of several state-of-the-art techniques and the results of an extraction approach are given for the points of the middle of each iris, the nasal wings and the mouth corners. Keywords: Avatar, Face Feature Points, Integral Projection, Circle Detection, E-Commerce.
1 Introduction High-quality three-dimensional virtual human bodies (avatars) are playing an increasingly important role in information technology, for example in networked 3D worlds like Second Life1 and Twinity2, in the simulation of processes, and in virtual try-ons of clothing. In the field of creating avatars, and most of all customized ones, especially in the context of an e-commerce application, the customers' acceptance of their personal avatar is very important. One possible solution to increase the users' acceptance of their avatar is to equip it with an individual, three-dimensional model of the user's head. For this challenge the automatic face modeling software "FaceGen" [1] provides a basis: given a frontal picture of a human face as input, a textured three-dimensional face model is generated. Unfortunately, the software first requires the user to manually identify several so-called face feature points. But, as the context of this work calls for a fluent web shopping experience, it is advisable to completely automate the head model generation process, including the face feature extraction. This would spare the customers any tedious extra work. The approach presented here combines several state-of-the-art techniques. It differs from similar face recognition techniques, where detection of regions and template matching are often applied. In contrast, for the context at hand, it is essential to identify accurate points in order to obtain a good model and texture as the result of the automatic head generation process.
1 http://secondlife.com/
2 http://www.twinity.com/
1.1 Objectives and Significance In this paper we present intermediate findings of the ongoing research project KAvaCo3. The focus of its research activities is to improve customized avatars for usage in an innovative, interactive system supporting apparel retail via the Internet. The results reported here were obtained in the Bachelor thesis of this paper's co-author Sebastian Hesse. To allow our findings to be interpreted correctly and transferred to other research contexts, we provide details on the study's context. 1.2 The KAvaCo Project On the one hand, the aim of the ongoing research project KAvaCo is to develop an interactive system for generating humanoid, anatomically correct, animatable, virtual humans. On the other hand, the avatars are used as representations of individual customers in e-commerce and are evaluated as such. Imitation refers both to geometric properties, such as body sizes and shapes, and to the visual appearance, such as skin or hair color. All of the necessary information can be entered by the users themselves. For a realistic representation of the head, photos are processed, which the customers can provide with little effort using conventional digital cameras. The optimal design of the user interaction is a central goal for both the creation and the use of the avatars. In addition to the technical progress, the project also investigates the until now only partially clarified scientific question of what demands customers place on their personal virtual twins in the context of e-commerce. Concerning the representation, for example, it has to be clarified which level of realism is desired by customers, whether individual "problem areas" and characteristics should be concealed, and whether an idealized representation achieves greater acceptance. The results of the research will be applied and tested in the context of clothing retail. Here it comes into play that the developed avatars can be used directly in systems for virtual try-on of made-to-measure and ready-made clothing. The use for other applications, such as in the area of public health, for example the visualization of weight gain and loss, is also planned.
2 Method The face modeling software employed by the project, "FaceGen", needs eleven points from the frontal face image for its computation. In this work, techniques and approaches for the points of the eyes (feature point: middle of each iris), the nose (nasal wings) and the mouth (mouth corners) were considered. The approach finally used to extract the facial feature points is divided into two stages. The first is to reduce the image data to small areas of interest for the later detection. This means that the feature points do not have to be searched for in the whole face image, but only in small zones. Afterwards, in the second stage, the required points are located using several extraction techniques. The techniques are applied to regions where the feature points are expected. The detailed procedure is described in the following sections.
3 http://www.kavaco.info/
To evaluate the results of each concept, a set of 100 frontal face images was acquired from [2]. To be as representative as possible, the persons in the photos were chosen from different nationalities, ages and genders. Each of the approaches was tested with all pictures and documented. The detected sections and features were logged to validate them against human-specified ones. Classification numbers were defined to decide how robust the region detection and the feature extraction are. For the region detection, the classification numbers are the false-positive, false-negative and right-positive rates in percent; for the feature extraction, it is the variance between the identified and the ideal position in pixels.
3 Region Detection 3.1 Detection Concept As said before, the first part of the identified solution reduces the area that is subject to the later detection to small regions. The detection area is constrained in a "top-down" manner to so-called regions of interest (ROI): here, first the face, then the eyes, mouth and nose. The region detection initially makes use of Haar-like feature extraction with AdaBoost (see e.g. [3]). Haar-like feature extraction is an object recognition technique which uses weak features to detect domain objects. With the learning algorithm AdaBoost, a classifier is constructed for the detector. The identification of the detection area starts by searching the part of the image that contains the frontal face. This area is then divided into three further ones: 1. Area – upper face third, for searching the eyes region 2. Area – middle face third, for searching the nose region 3. Area – lower face third, for searching the mouth region This division of the face is done by intuition, because the position of the searched regions in normal faces is given by human anatomy. These new regions are treated as search candidates for the remaining three ROI. Due to this division into smaller areas the detector needs less performance, and it also reduces the detection of false regions. A test iteration with the 100 human faces showed that the detection of the face and eyes regions has a good detection rate; only two percent of these regions were not detected. But the false-positive rate of the other two regions was too high (30 percent for the nose and 47 percent for the mouth). So there is a need to extend this concept to counteract the high false-positive rate. 3.2 Extended Detection Concept In a second concept, some extensions were made to the first one reported in the previous section in order to decrease the false-positive rate of the mouth and nose regions. Therefore a validation was implemented which checks the regions against a set of rules. The detected regions are related to each other and to the face area. The primary point of reference is the region of the eyes, which has a very good detection rate. The violation
of a rule excludes the region from the list of potential candidates. The rules were set up under the condition that they are neither too detailed nor too general, to avoid excluding too many correctly or too few wrongly detected regions. Even though the eye regions have good results, rules are implemented to prevent a false detection here as well. A false detection can occur in the area of the nostrils because the integral projections of two eyes and of the nostrils can be very similar. Because the nostrils are closer together than the eyes, the width of the eyes region is related to the width of the face: if the width of the eyes region falls below 30 percent of the face region width, it is excluded. For validation, the nose region is related to the eyes region. Two rules check that the nose region's horizontal position is within the eyes region. Another rule ensures that the nose region is not above the eyes. The mouth region uses the same rules as the nose region, but the rule for the vertical position is not applied, because the mouth is only looked for in the lower face third. Using these rules, the false-positive rate could be reduced by 10.03 percent for the mouth region and by 11.43 percent for the nose region. But there were still many detected areas that were too small to be correct. Thus, the rules were adjusted by analyzing the smallest valid region in the context of the other regions. It turned out that the mouth and nose regions each fit about 60 times into the facial region. Extending the rules to check this threshold decreased the false-positive rate by a further 23.22 percent for the mouth region and 6.82 percent for the nose region. The false-positive rate was now low enough to develop and use techniques to extract the desired feature points in the detected regions.
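A compact sketch of this two-stage detection and rule-based validation is given below, using OpenCV's bundled Haar cascades; the thresholds follow the rules above, but the code is an illustration under these assumptions, not the project's implementation:

```python
import cv2

face_cc = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cc = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def plausible(face, eyes, nose, mouth):
    """Rule set from Sect. 3.2; every region is an (x, y, w, h) tuple."""
    fx, fy, fw, fh = face
    ex, ey, ew, eh = eyes
    if ew < 0.30 * fw:                    # eyes narrower than 30 % of face width
        return False
    if nose[1] < ey:                      # nose region above the eyes
        return False
    for x, y, w, h in (nose, mouth):
        if x < ex or x + w > ex + ew:     # horizontally outside the eyes region
            return False
        if w * h < (fw * fh) / 60.0:      # smaller than 1/60 of the face area
            return False
    return True

gray = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in face_cc.detectMultiScale(gray, 1.1, 5):
    upper_third = gray[y:y + h // 3, x:x + w]    # candidate area for the eyes
    eyes = eye_cc.detectMultiScale(upper_third)  # nose/mouth detectors analogous
```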
4 Feature Extraction Facial feature points can be identified through their miscellaneous properties. Their extraction also has high relevance in domains like gesture detection, emotion recognition and the transfer of real facial expressions onto an avatar. Such feature points are often used to identify a face in an image; but, vice versa, detecting the face region first with a different, global approach can help make a subsequent feature extraction more robust and effective. The feature extraction approach presented here makes use of several image processing techniques, which are described in the following sections. 4.1 Extraction of the Center of the Eye The eye is one of the most distinctive regions in the human face. Two well recognizable attributes are the elliptic geometry of the eye and iris and its brightness distribution. One possibility is to use the geometric attribute of the eye to detect its center: to detect the iris, the Circular Hough Transformation can be applied to the respective region. Another technique to find the iris is the integral projection, with which the brightness distribution of the eye region can be extracted. The following sections analyze both techniques.
Circle Detection. To detect the iris with the Circular Hough Transformation (cf. [4]) it is necessary to transform the region from a colored to a gray-scale image. Most image processing algorithms are applied to gray-scale or binary images because they contain less information than colored images. Thereafter, possible image artifacts are removed with a Gaussian filter. Now the iris can be highlighted by an edge detector, here the Canny filter. The result of the Canny filter has the benefit that edges are represented as one-pixel-wide lines (also called binary edges). The image delivered by this pre-processing step forms the basis for the subsequent circle detection. To prevent falsely detected circles, the smallest and largest acceptable radii are calculated dynamically relative to the height of the eye region: the minimum radius has to be greater than one tenth of the eye region height and the maximum radius has to be smaller than half of the region height.
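A possible OpenCV realization of this pre-processing and circle search, with the dynamic radius bounds described above, looks as follows (the Hough parameter values are assumptions; note that cv2.HoughCircles performs the Canny edge step internally):

```python
import cv2

def detect_iris(eye_gray):
    """Circular Hough Transform on a grayscale eye region; returns (x, y, r)."""
    h = eye_gray.shape[0]
    blurred = cv2.GaussianBlur(eye_gray, (5, 5), 0)   # remove image artifacts
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT,
                               dp=1, minDist=h,
                               param1=100, param2=20,  # Canny / accumulator thresholds
                               minRadius=h // 10,      # > 1/10 of the region height
                               maxRadius=h // 2)       # < 1/2 of the region height
    return None if circles is None else tuple(circles[0, 0])
```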
Fig. 1. Results of the Circular Hough Detection
The circle detection shows good results for many eye regions (see Fig. 1, part a). But too many result images show that this method also produces a lot of falsely detected circles (see Fig. 1, part b). This is the result of too dark an illumination of the images or of eyes not opened wide enough. Increasing the contrast or changing from binary to gray-scale images did not produce better results, so this approach was not analyzed further. Integral Projection. As already mentioned in Section 4.1, another method to detect the eye is the integral projection. This method uses the brightness distribution in the eye region. To detect the iris it is necessary to calculate the vertical and horizontal integral projections. As shown in [5], the coordinates of the iris can be extracted from the two integral projections: the x-coordinate can be derived from the horizontal and the y-coordinate from the vertical projection. The calculation of the projection can be disturbed by several factors such as dark bushy eyebrows, eyelashes and dark shadows. So, a pre-processing step is needed to decrease the effect of these factors. After transforming the image into a gray-scale one, a median filter is used to eliminate image artifacts. To highlight the iris and remove reflections in the eye, morphological operations are used. A method called "erosion" enlarges the dark areas in an image; that is how the iris can be highlighted, but it also highlights shadows, eyebrows and eyelashes. This can be counteracted by increasing the contrast (Fig. 2 a) and using "dilation" (Fig. 2 b) to shrink dark areas before applying the erosion method (Fig. 2 c).
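The following sketch illustrates the described pre-processing chain and projection-based localization (OpenCV/NumPy; it assumes an 8-bit grayscale eye region and is only an approximation of the authors' procedure):

```python
import cv2
import numpy as np

def iris_center(eye_gray):
    """Locate the darkest area of the eye region via integral projections."""
    kernel = np.ones((3, 3), np.uint8)
    img = cv2.medianBlur(eye_gray, 3)           # remove image artifacts
    img = cv2.equalizeHist(img)                 # increase the contrast
    img = cv2.dilate(img, kernel)               # shrink dark specks/reflections
    img = cv2.erode(img, kernel, iterations=2)  # enlarge the dark iris
    horizontal = img.sum(axis=0)                # summed brightness per column
    vertical = img.sum(axis=1)                  # summed brightness per row
    return int(np.argmin(horizontal)), int(np.argmin(vertical))  # (x, y)
```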
Fig. 2. Preprocessing the image for the integral projection
The used pre-processing method delivers good results so that the integral projection can extract the darkest area. The large part of disturbing factors can be eliminated through the contrast enhancement and use of dilation. Mostly all of the extracted x-coordinate from the horizontal and the y-coordinate from the vertical integral projection were inside the iris. Only bad exposure and big shadows disturbs the extraction too much. But such face images will also result in bad textures and cannot be used for the threedimensional head-generation. Therefore the exposure problem is not analyzed anyway. At the end the results of this method were usable. For the head generation process with “FaceGen” feature points with a variance less than 20 pixels from the ideal position can be used. Only 12.24 percent of the detected points were too far away and could not be used for the automated head generation process. 4.2 Extraction of the Nasal Wings In contrast to the eyes, the nose does not have such good attributes to recognize. Most noticeable is the tip of the nose, the nostrils and the contours of the nasal wings. In the following two approaches to track the nose are described. Contour tracking. The pre-processing of the contour tracking is similar to the circle detection of the eyes. Often, the nasal wings have differences in brightness to the surrounding face so that they can be extracted as edges. The result of the Canny-filter is the basis for this method. An algorithm is applied onto the edge image which follows the edges to find contours. The contours that are farthest to the right and left potentially represent the nasal wings. Inspecting maps of edges for the 100 test faces, it is obvious that this method cannot deliver useful results. The differences in brightness from the nasal wings and the surrounding face were not big enough. Often, it was even not possible to mark the nasal wings manually because of a fluent crossover between the nasal wings and the surrounding face region (Fig. 3 a). Only if the exposure is good enough the edges are recognizable. Another problem is the falsification of the edge image by facial hair like beards (Fig. 3 b). Brunelli and Poggio [6] had used this method to find the nose but it is not useful to extract the nasal wings.
Fig. 3. Results of the contour tracking of the nose
Integral Projection. As Hua et al. [5] describe, the integral projection can be used to find the tip of the nose or the nostrils. For pre-processing, the region only has to be transformed into a gray-scale image, and possible image artifacts have to be removed using the median filter. The nose region is then analyzed with the horizontal projection: the recognized low points are the nostrils, and the amplitude in between represents the tip of the nose. The vertical position of the nostrils can be extracted by calculating the vertical integral projection, where the lowest point mostly corresponds to the dark nostrils. The effect of the nostrils in the integral projection cannot be highlighted by using erosion. Sometimes the tip of the nose lies below the nostrils so that they are not visible and hence not detectable.
Fig. 4. Results of the integral projection to identify the nostrils
In some cases this method delivered acceptable results (Fig. 4 a). But, depending on the face tilt and the shape of the nose, the nostrils may not be identified (Fig. 4 b). A good and consistent exposure of the nose is also important; otherwise the horizontal projection is useless (Fig. 4 c). A further problem is that the position of the nostril does not imply the position of the respective wing of the nose, which is what is needed for the head generation process. So, this method could not be used to detect the wings of the nose. 4.3 Extraction of the Corners of the Mouth The mouth has some distinctive attributes to recognize. In most cases the mouth is darker than the face. This characteristic can be used to extract the contour of the mouth and its width and position. At first the contour tracking algorithm from Section
4.2 was used to extract the mouth, but the results were not usable. There are too many disturbing factors like beards, small lips or makeup; these circumstances make an extraction with this method impossible. As already mentioned, the mouth is darker than the surrounding face, so it is possible to extract it by applying a vertical and a horizontal integral projection in the mouth region. In the pre-processing step the region is transformed into a gray-scale image; then dilation is used to reduce the effect of dark hairs, and afterwards erosion to emphasize the dark lips. In the horizontal projection the width of the mouth can be extracted by searching for the two outer flanks, and the vertical projection is useful to detect the vertical position. Here it was also not possible to extract the feature points precisely enough. The vertical position is often recognizable as a low point in the vertical integral projection, but a dark beard or bad lighting conditions disturb the extraction massively. In [6] it is described how to extract the width of the mouth with the horizontal projection, but the effect of the disturbing factors is too big. The shape of the mouth often varies so much that the lips were not recognizable. So this method cannot be used to extract the required feature points either.
5 Conclusion

This paper shows that there are many different ways to extract the several required feature points from a face photo. The chosen approach of first detecting the face and then separating the regions of interest for a subsequent feature detection appears to be a good choice. The Haar-like feature extraction with AdaBoost identifies the face with a very high true-positive rate. The implemented rules, which identify the detected ROI as valid, helped to reduce the initially high false-positive rate when selecting the three regions of interest for the eyes, the nose and the mouth. The quality of the analyzed approaches for the feature extraction varies a lot. The integral projection is the best choice for finding the centers of the eyes: almost 90 percent of the given eyes could be detected. For the other two feature points, however, the algorithms are too vulnerable to disturbing factors such as facial hair or bad exposure. The differences in brightness between nose and face are mostly too small to detect the edges of the nose. The extraction of the mouth has similar problems: the shape of the mouth, which varies greatly from face to face, as well as beards impede useful results. As future development it is planned to investigate further procedures to get better results. One possible solution may be the combination of different approaches, for example template matching with deformable templates, to find the edges of mouth and nose. These edges can then be used to extract the desired feature points. Another possible approach might be to separate the mouth from the rest of the face not only by its brightness distribution but by its color, and subsequently detect its corners.

Acknowledgments. This research was financially supported by the German Federal Ministry of Education and Research within the framework of the program "Forschung an Fachhochschulen mit Unternehmen (FHprofUnd)".
References

1. Singular Inversions Inc. - FaceGen, http://facegen.com/
2. SmartNet IBC LTD - Human references for 3D Artists and Game Developers, http://www.3d.sk/
3. Viola, P., Jones, M.J.: Robust real-time object detection. International Journal of Computer Vision 57, 137–154 (2004)
4. Gupta, P., Mehrotra, H., Rattani, A., Chatterjee, A., Kaushik, A.K.: Iris Recognition using Corner Detection. In: Proceedings of 23rd International Biometric Conference (IBC 2006), Montreal, Canada (2006)
5. Hua, G., Guangda, S., Cheng, D.: Feature points extraction from faces. In: Proceedings of the Image and Vision Computing, New Zealand (2003)
6. Brunelli, R., Poggio, T.: Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 1042–1052 (1993)
3D Human Motion Capturing Based Only on Acceleration and Angular Rate Measurement for Lower Extremities

Christoph Schiefer1,2, Thomas Kraus1, Elke Ochsmann1, Ingo Hermanns2, and Rolf Ellegast2

1 RWTH Aachen University, Institute of Occupational and Social Medicine, Medical Faculty, Pauwelsstraße 30, 52074 Aachen, Germany
2 IFA, Institute for Occupational Safety and Health of the German Social Accident Insurance, Alte Heerstrasse 111, 53757 Sankt Augustin, Germany
{cschiefer,tkraus,eochsmann}@ukaachen.de, {christoph.schiefer,ingo.hermanns,rolf.ellegast}@dguv.de
Abstract. Human motion capturing is used in ergonomics for the ambulatory assessment of physical workloads in the field. This is necessary to investigate the risk of work-related musculoskeletal disorders. For more than fifteen years the IFA has been developing and using the motion and force capture system CUELA, which is designed for whole-shift recordings and analysis of work-related postural and mechanical loads. A modified CUELA system based on 3D inertial measurement units was developed to replace all mechanical components in the present system. Each unit consists of an accelerometer and gyroscopes measuring acceleration and angular rate in three dimensions. Orientation determination based on angular rate carries the risk of integration errors due to the sensor drift of the gyroscope. We introduced "zero points" to compensate for gyroscope drift and to reinitialize the orientation computation. In a first evaluation step the movements of the lower extremities are analyzed and compared to the optical motion tracking system Vicon.

Keywords: ambulatory workload assessment, inertial tracking device, motion capturing, CUELA, ergonomic field analysis.
1 Introduction

Human motion capturing is used in ergonomics for the ambulatory assessment of physical workloads in the field. This is necessary to investigate the risk of work-related musculoskeletal disorders. For more than fifteen years the Institute for Occupational Safety and Health of the German Social Accident Insurance (IFA) has been developing and using the motion and force capture system CUELA (computer-assisted recording and long-term analysis of musculoskeletal loads), which is designed for whole-shift recordings and analysis of work-related postural and mechanical loads in ergonomic field analysis [1, 2, 3]. Typical fields of application are measurements in office, factory or construction environments as well as the handling of construction machines and other vehicles [4, 5, 6]. The main focus of the motion analysis is on body angles and motion pattern detection.
The CUELA system uses a combination of gyroscopes, inclinometers and goniometers to measure human body angles. The mechanical components provide satisfactory results; however, at some workplaces the freedom of movement of the test person is restricted, or the use of mechanical components is impossible altogether. Therefore a new sensor based on 3D inertial measurement units (IMUs) was designed for the CUELA system to replace all mechanical components. The IMU consists of a 3D accelerometer and three single-axis gyroscopes measuring acceleration and angular rate. Orientation determination based on angular rate carries the risk of integration errors due to the sensor drift of the gyroscope. To compensate for the integration error, other inertial-sensor-based human motion capturing systems use complementary sensors such as magnetometers [7, 8]. Magnetometers proved not to be suitable for measurements at real workplaces, as ferromagnetic materials or electromagnetic radiation in the vicinity of the magnetometer distort the sensor readings [9]. This article describes an alternative approach to 3D human motion capturing based only on acceleration and angular rate. In a first evaluation step the movements of the lower extremities are analyzed and compared to the optical motion tracking system Vicon [10].
2 Methods

2.1 Motion Capturing and Analysis

The IMU is designed as a USB human interface device with a size of 6.0 cm × 2.5 cm × 0.8 cm and consists of three ADIS16100 gyroscopes mounted perpendicular to each other and the SCA3000 triaxial accelerometer. In this study the device works at a sampling frequency of 50 Hz; higher sampling rates are also possible. A battery-powered USB hub was designed for the power supply of the sensors, so they can be attached to a PC or to a mobile device acting as data logger. Elastic and breathable straps are used to fix the sensors below or above the clothing at the body segments of interest. A person wearing the system is not hindered in his or her movement and can move freely. To support and simplify the motion analysis, measurements are synchronously video recorded. The orientation of each body segment equipped with an IMU is determined by a quaternion-based algorithm. An orientation quaternion q_t represents the current orientation state of the IMU in relation to the initial orientation q_0. Depending on the dynamical state of the IMU, q_t is updated in two ways. In the static case, indicated by an acceleration vector length of 1 g, the accelerometer measures the gravitational acceleration and is therefore interpreted as an inclinometer. A quaternion q_acc,t is calculated that represents the orientation of the IMU at time t, using the factored quaternion algorithm (FQA) [11]. The elevation and roll components of q_acc,t are determined by the FQA. The azimuth component of q_acc,t originates from integrating the vertical angular rate component, which is gained by transforming the measured angular rate vector from the body coordinate frame into the global coordinate frame. In the dynamic state of the IMU, the gravitational acceleration component is superimposed by other acceleration effects. In this case the gyroscope readings, measuring
angular rate in the body coordinate frame, are used to update the previous orientation quaternion q_t-1. The resulting quaternion q_ω,t is fused with q_acc,t by spherical linear interpolation, using a weight factor that depends on the dynamical state of the IMU. The weight factor is computed by a Gaussian bell function with the current acceleration vector length as input. To enhance the long-term accuracy and reduce the orientation drift error, an automated initialization routine based on so-called "zero points" is used. In static situations, with the subject facing in an initial direction, the gyroscope offset and the azimuth angle are set to zero. For later analysis, these zero points are identified in the video. This method proved to be more effective than using a high-pass filter for offset compensation of the gyroscope, as the filter also removes "offsets" resulting from a turn of the person. Depending on whether the human movement is reconstructed in real time or later on for detailed analysis, the algorithm can be applied unidirectionally or bidirectionally to the arriving or stored data. As the motion data are computed forward and backward in time during post processing, the positive initializing effect of the zero points is doubled. All results presented in this article are based on bidirectional motion analysis. The motion data of each body segment are fed into a simplified human body model (Fig. 1). The human model uses constraints on the degrees of freedom of the human joints to prevent body segments from moving in a way that is impossible for a healthy person. The translation of the human body in 3D space is not taken into account in this model. The hip center point is fixed at a predefined height above the origin of the coordinate frame, but it can freely change its orientation.
Fig. 1. Visualization of the simplified human body model
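To make the fusion step of Section 2.1 concrete, the following is a minimal Python sketch of gyroscope integration and slerp fusion with a Gaussian weight. The quaternion convention, the function names and the Gaussian width sigma are our assumptions; the FQA step itself is omitted, and q_acc is taken as given:

    import numpy as np

    def quat_mul(a, b):
        # Hamilton product of quaternions in (w, x, y, z) order
        w1, x1, y1, z1 = a
        w2, x2, y2, z2 = b
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def gyro_update(q, omega, dt):
        # integrate body-frame angular rate omega (rad/s) over one step
        angle = np.linalg.norm(omega) * dt
        if angle < 1e-12:
            return q
        axis = omega / np.linalg.norm(omega)
        dq = np.r_[np.cos(angle / 2.0), np.sin(angle / 2.0) * axis]
        q_new = quat_mul(q, dq)
        return q_new / np.linalg.norm(q_new)

    def slerp(q0, q1, t):
        # spherical linear interpolation between two unit quaternions
        d = np.dot(q0, q1)
        if d < 0.0:
            q1, d = -q1, -d          # take the shorter arc
        ang = np.arccos(min(d, 1.0))
        if ang < 1e-8:
            return q0
        return (np.sin((1 - t) * ang) * q0 + np.sin(t * ang) * q1) / np.sin(ang)

    def fuse(q_gyro, q_acc, acc, sigma=0.1):
        # Gaussian weight: trust the accelerometer only near |acc| = 1 g
        w = np.exp(-(np.linalg.norm(acc) - 1.0) ** 2 / (2.0 * sigma ** 2))
        return slerp(q_gyro, q_acc, w)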
2.2 Evaluation of the CUELA Inertial System

To evaluate the inertial-sensor-based CUELA system, it was compared with the optical motion tracking system Vicon in a laboratory environment. Optical motion tracking systems excel in accuracy and are often used for stationary motion tracking and for the evaluation of other motion tracking systems [12, 13]. As the degrees of freedom of leg
movement are restricted in comparison to other joints and segments of the human body, we decided to start the evaluation of the motion analysis algorithm at the lower extremities. Eleven healthy subjects voluntarily participated in the evaluation study. Their anthropometric characteristics, given in Table 1, varied intentionally to ensure that the human model works independently of the anthropometry.

Table 1. Anthropometric characteristics of the study participants

             Female (n = 5)   Male (n = 6)    Total (n = 11)
             Mean    SD       Mean    SD      Mean    SD     Min   Max
Age (years)  28.2    3.5      39.6    7.8     34.4    8.4    24    49
Weight (kg)  60.6    8.2      82.5    7.5     72.5    13.6   48    91
Height (cm)  167.8   4.5      179.0   6.2     173.9   7.9    164   190
Our Vicon setup tracks passive markers with twelve cameras at a frequency of 100 Hz. 20 reflective markers were placed at the lower extremities of the subjects, following the Plug-in-Gait marker placement protocol; the positions are at the feet, knees and hips. The Vicon BodyBuilder tool is used to compute the corresponding joint center points based on the marker set. All subjects were also equipped with five IMUs at the following positions: frontally at the left and right lower legs as well as the left and right upper legs, and at the back at hip height (see Fig. 2). The segment lengths of each subject were adapted in our human model using the mean segment lengths measured with Vicon BodyBuilder. The subjects were asked to perform different standardized movements. The movements include standing upright, stepping to the front with alternating legs and to the left and right side, squatting in different ways on the ground, climbing stairs and a ladder up and down, as well as turning on point (see Fig. 3). All movements were repeated three times. Some of the movements needed an additional setup. A ladder was put into the measurement field, and the subjects were asked to climb it until their hips reached a height of about two meters, so as not to exceed the upper bound of the measurement volume. Climbing stairs and turning on point were combined into one exercise: a stairway with three steps up and down was put into the measurement field, and the subjects were asked to walk the stairs up and down, turn on point, walk the stairs back and turn again, three times. Both systems synchronously and continuously recorded the sensor readings. The time for changing the measurement setup was not taken into account; the adjusted measurement time was about 2.7 minutes. A zero point was set only before and after a subject performed a task block. To generate comparable data, both systems determined the joint center points of the ankle, knee and hip of the left and right side. As the CUELA model does not capture translation, the translational part of the movements was eliminated by a customized BodyBuilder script. As both human models differ in their complexity, the joint center points proved not to be an adequate comparison criterion: offsets between the joint center points of the two models could not be removed, as their direction is not constant during 3D movement.
Fig. 2. Subject with CUELA inertial (left) and additionally with Vicon markers (middle - right)
Fig. 3. Subjects performing the tasks step to front, to the side, climbing stairs and ladder
For that reason, we took the left and right knee angles according to the neutral zero method and the hip azimuth angle as comparison criteria. The knee angle is computed as the angle enclosed by the vectors pointing from the knee to the ankle and hip joint center points. The hip azimuth angle is computed from the vector pointing from the right to the left hip joint center. The root mean square error (RMSE) of the time course of the angles is used as an objective quality criterion and was computed for all subjects.
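A minimal sketch of the two quantities just defined (Python/NumPy; the function names are ours). Note that, as discussed in the results below, the enclosed angle is only defined between 0° and 180°, so hyperextension cannot produce negative values with this formula:

    import numpy as np

    def knee_angle(hip, knee, ankle):
        # angle (deg) enclosed by the knee->hip and knee->ankle vectors
        u = hip - knee
        v = ankle - knee
        c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

    def rmse(reference, estimate):
        # root mean square error between two angle time series
        diff = np.asarray(reference) - np.asarray(estimate)
        return np.sqrt(np.mean(diff ** 2))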
3 Results

Figure 4 shows the time course of the knee angles of one subject during the stair climbing task with turn on point, measured by Vicon and CUELA. The Vicon curve is mostly covered by the CUELA curve. The neutral zero method defines the knee angle as follows: the transition point from knee flexion to extension corresponds to the neutral knee position and is defined as 0°; in the knee extension case the knee angle is defined as negative. This is problematic in our way of calculating the knee angle, as the angle between two vectors in 3D is only defined in a range of 0° to 180°. This effect is observable in this example in the left knee angle determined by CUELA.
Fig. 4. Knee angles (angle [°] over time [sec]) of the left and right knee during stair climbing and turn on point, measured by Vicon and CUELA
The corresponding hip azimuth angle is shown in Figure 5. Again the CUELA curve mostly covers the Vicon curve, even though the azimuth angle is determined by angular rate integration. The six plateaus with small amplitudes represent the hip rotation while climbing the stairs up and down. The plateau transitions correspond to the 180° turns on point behind the steps.
Fig. 5. Hip azimuth angle (angle [°] over time [sec]) during stair climbing and turn on point, measured by Vicon and CUELA
For an objective analysis of the difference between the Vicon and CUELA measurements, the RMSE of each subject and task is computed for the knee and hip azimuth angles, using Vicon as reference. The mean RMSE over all subjects is shown in Table 2.

Table 2. Mean RMSE values of CUELA inertial using Vicon as reference

task                               knee angle,                 hip azimuth angle,
                                   mean RMSE in degrees (SD)   mean RMSE in degrees (SD)
squatting                          4.6 (± 1.8)                 3.4 (± 1.8)
step to front                      5.5 (± 1.3)                 5.8 (± 2.6)
step to side                       5.7 (± 2.1)                 7.1 (± 2.3)
ladder climbing                    5.7 (± 1.7)                 5.0 (± 1.7)
stair climbing and turn by 180°    6.4 (± 1.4)                 6.4 (± 2.3)
The mean RMSE values of the knee angles range from 4.6° (squatting) to 6.4° (stair climbing and turn by 180°). For the hip azimuth angle they range from 3.4° (squatting) to 7.1° (step to the side).
4 Discussion
In the literature, an RMSE between 1° and 6° is often reported for inertial-sensor-based motion tracking systems [7, 14, 15, 16]. Most of these systems use complementary sensors to increase the accuracy of the gyroscope and accelerometer measurements. Our results in the range of 3.4° to 7.1° are comparable without the use of complementary sensors. In our evaluation study we compared both systems as a whole, so a lot of factors
besides the sensor and algorithm accuracy can influence our results. The markers and IMUs can generate movement artefacts independently of each other, the human models computing the joint center points differ in their complexity, and the calculation of knee angles from vectors in 3D is problematic, to name some external impacts. A reduction of the comparison complexity may improve our results by eliminating external disturbances. For example, fixing the Vicon markers directly to the IMUs would allow measuring the IMU movement independently of movement artefacts and of the different human body models. The calculation of common medical angles based on the body segment quaternions needs investigation in order to get a better basis for comparing both systems independently of gimbal locks. Introducing zero points substantially increased the accuracy of the azimuth angle determination and the gyroscope drift compensation. The correlation between the number of zero points and the long-term accuracy needs to be investigated in the future.
5 Conclusion

For the ergonomic field analysis of human motion, we developed an inertial-sensor-based human motion capturing system. It captures movement without using mechanical components, magnetometers or other complementary components, allowing measurements at various real workplaces. Zero points are used to reinitialize the orientation computation in order to reduce gyroscope drift and its consequences. In this evaluation study we analyzed the motion of the lower extremities and compared it to the optical motion tracking system Vicon, using the mean RMSE values of the knee and hip azimuth angles as quality criterion. The resulting RMSE values are somewhat higher than, but in the range of, published evaluation data of comparable systems and are expected to decrease with further optimization. The CUELA inertial system, based only on accelerometers and gyroscopes, is a viable starting point for the long-term analysis of human motion at real workplaces.
References

1. Ellegast, R.P., Kupfer, J.: Portable posture and motion measuring system for use in ergonomic field analysis. In: Ergonomic Software Tools in Product and Workplace Design, pp. 47–54 (2000)
2. Hermanns, I., Raffler, N., Ellegast, R.P., Fischer, S., Göres, B.: Simultaneous field measuring method of vibration and body posture for assessment of seated occupational driving tasks. International Journal of Industrial Ergonomics 38, 255–263 (2008)
3. Ellegast, R., Hermanns, I., Schiefer, C.: Workload Assessment in Field Using the Ambulatory CUELA System. In: Duffy, V.G. (ed.) ICDHM 2009. LNCS, vol. 5620, pp. 221–226. Springer, Heidelberg (2009)
4. Glitsch, U., Ottersbach, H.J., Ellegast, R.P., Schaub, K., Franz, G., Jäger, M.: Physical workload of flight attendants when pushing and pulling trolleys aboard aircraft. Int. Journal of Ind. Ergonomics 37, 845–854 (2007)
5. Ditchen, D., Ellegast, R.P., Herda, C., Hoehne-Hückstädt, U.: Ergonomic intervention on musculoskeletal discomfort among crane operators at waste-to-energy-plants. In: Bust, P.D., McCabe (eds.) Contemporary Ergonomics, pp. 22–26. Taylor & Francis, London (2005)
6. Weber, B., Hermanns, I., Ellegast, R., Kleinert, J.: A person-centered measurement system for quantification of physical activity and energy expenditure at workplaces. In: Karsh, B.T. (ed.) EHAWC 2009. LNCS, vol. 5624, pp. 121–130. Springer, Heidelberg (2009)
7. Plamondon, A., Delisle, A., Larue, C., Brouillette, D., McFadden, D., Desjardins, P., Larivière, C.: Evaluation of a hybrid system for three-dimensional measurement of trunk posture in motion. Applied Ergonomics 38, 697–712 (2007)
8. Yun, X., Bachmann, E., Kavousanos-Kavousanakis, A., Yildiz, F., McGhee, R.: Design and implementation of the MARG human body motion tracking system. In: International Conference on Intelligent Robots and Systems, vol. 1, pp. 625–630 (2004)
9. Bachmann, E., Yun, X., Peterson, C.: An investigation of the effects of magnetic variations on inertial/magnetic orientation sensors. In: IEEE International Conference on Robotics and Automation, vol. 2, pp. 1115–1122 (2004)
10. Vicon, http://www.vicon.com
11. Yun, X., Bachmann, E., McGhee, R.: A Simplified Quaternion-Based Algorithm for Orientation Estimation From Earth Gravity and Magnetic Field Measurements. IEEE Transactions on Instrumentation and Measurement 57(3), 638–650 (2008)
12. Bergmann, J.H., Mayagoitia, R.E., Smith, I.C.: A portable system for collecting anatomical joint angles during stair ascent: a comparison with an optical tracking device. Dynamic Medicine 8, 3 (2009)
13. Mayagoitia, R.E., Nene, A.V., Veltink, P.H.: Accelerometer and rate gyroscope measurement of kinematics: an inexpensive alternative to optical motion analysis systems. Journal of Biomechanics 35(4), 537–542 (2002)
14. Sabatini, A.: Quaternion-based extended Kalman filter for determining orientation by inertial and magnetic sensing. IEEE Transactions on Biomedical Engineering 53(7), 1346–1356 (2006)
15. Foxlin, E., Harrington, M., Altshuler, Y.: Miniature 6-DOF Inertial System for Tracking HMDs. In: SPIE, Orlando, FL, vol. 3362 (1998)
16. Bachmann, E.R.: Inertial and Magnetic Tracking of Limb Segment Orientation for Inserting Humans into Synthetic Environments. Naval Postgraduate School, United States Navy (2000)
Application of Human Modeling in Multi-crew Cockpit Design

Xiaohui Sun, Feng Gao, Xiugan Yuan, and Jingquan Zhao

School of Aeronautical Science and Engineering, Beihang University, 37 Xueyuan Road, Beijing
[email protected], {gaofeng,yuanxg}@buaa.edu.cn
Abstract. Based on the needs of multi-crew cockpit ergonomic design, we set up a parameterized digital human model. By virtually controlling the digital human model to operate the devices of the cockpit, potential conflicts in the process of multi-crew coordination were identified, and solutions for the device layout were proposed. This method takes human factors into account early in the multi-crew cockpit design, and design efficiency is improved greatly.

Keywords: Human modeling, Cockpit design, Crew Coordination.
1 Introduction

Tasks in a multi-crew cockpit are not only completed by individuals but depend even more on the coordination of the crew. In the coordination process, the diversity of the crew and of their cognition can cause all kinds of conflict, some of which may influence the safety of the airplane. Statistics show that aircraft accidents caused by coordination problems make up more than 70% of all accidents [1]. Most of the current human factors studies analyze the field of vision in the cockpit, the working area, the arrangement of equipment and the comfort of the working position [2-4], but conflicts in the multi-crew cockpit have received little study. Therefore, in order to improve flight safety and avoid conflicts, it is necessary to analyze the working efficiency of the coordination at the cockpit concept design stage by using a digital human model in the multi-crew cockpit.
2 Research on Human Modeling

2.1 Human Statistics

The accuracy of the dimensions of the human model is an important factor for assessing working efficiency. A digital human model based on anthropometric statistics is able to solve the accuracy issue. From the digital human body, a virtual working environment is built for simulation, and this virtual working environment can support the analysis of working efficiency with high credibility [4].
Human body dimensions include static dimensions and dynamic dimensions. The static dimensions of the human model in this paper follow the "Chinese male pilots' measurement" standard (GJB 4586-2003). 25 measurement statistics relating to working efficiency were used, including the sizes of limbs (e.g., standing body height, sitting height, arm length) and the positions of some specific points (e.g., the distance between the eyes, the distance between the hips). The dynamic ranges were defined from an analysis of the literature [5-8] on human range-of-motion data, yielding the final dynamic ranges of the human body model, including range-of-motion data for 28 body parts (e.g., the neck, chest, shoulders and elbows).

2.2 Digital Human Model

The model adopts a surface model method, dividing the human body into a skeleton layer and a segment layer, with the segment layer over the skeleton layer. The skeleton layer conforms to the human topology structure and can express the information of the human body rather precisely; adjusting the position of each joint drives the segment layer to change accordingly. In this way the movement of the digital human model can be controlled conveniently.
Fig. 1. Human skeleton model
When the human skeleton model was set up, the paper considered the composition of the skeleton and the function of every joint. The human skeleton model was divided into 66 segments and 65 joints comprising 131 degrees of freedom. According to the features of the pilot's job and the needs of cockpit analysis, the micro-movements of the trunk and hands of the digital human model needed to be simulated. Therefore, the trunk and the hands were divided into 50 segments, matching the real composition of the human body. The human skeleton model is shown in Fig. 1.

2.3 Posture Control of Human Model

The human model must be set to a specific posture according to the working condition in the human-machine ergonomic evaluation of the cockpit. In order to realize the setting of
the human model posture, the paper determines the positions of the segments of the human model by specifying the joint angles, which is known as forward kinematics. Setting a static posture of the human model depends on the angles of the whole body's 65 joints, which comprise 131 degrees of freedom; once the joint angles are given, the human model posture is completely determined. In the process of controlling an actual posture, the joint angle data are obtained in two ways: the first is to derive the angle of each joint from the specific work of the pilots and the arrangement of the cockpit; the second is to acquire them with a motion capture system (MCS). With the first method, many angles have to be decided manually. Therefore, the Vicon system was used in this paper for acquiring the movement information of the pilots. The captured data points are shown in Fig. 2. The acquired data were corrected, producing a data file containing the spatial coordinates and the changing angles of the main joints. The human model posture is then created by reading this data file.
Fig. 2. The capture data points of MCS
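As a simplified illustration of this forward kinematic posture setting, the planar sketch below computes segment positions from given joint angles (Python/NumPy; a chain of hinge joints is assumed, whereas the actual model uses 65 spatial joints with 131 degrees of freedom):

    import numpy as np

    def rot_z(angle):
        # rotation matrix about the z-axis, angle in radians
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])

    def forward_kinematics(joint_angles, segment_lengths):
        """Positions of all joints of a planar chain, driven purely by
        the joint angles: the posture follows once the angles are set."""
        pos, R = np.zeros(3), np.eye(3)
        positions = [pos.copy()]
        for angle, length in zip(joint_angles, segment_lengths):
            R = R @ rot_z(angle)                    # accumulate orientation
            pos = pos + R @ np.array([length, 0.0, 0.0])
            positions.append(pos.copy())
        return positions

    # e.g. a three-segment arm: shoulder, elbow, wrist angles in degrees
    chain = forward_kinematics(np.radians([30, 45, -20]), [0.30, 0.28, 0.18])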
3 The Application Research of Human Model

3.1 The Analysis of Human Factors in the Multi-crew Cockpit

In the multi-crew cockpit, the coordination of the crew is inevitably needed in the process of completing a task, and it makes human factors issues more complex than in the single-pilot cockpit. Therefore, the coordination process is the main subject of the ergonomic analysis and evaluation of the multi-crew cockpit. Wesley Allan Olson (1999) [9] provided the following definition of coordination: coordination can be viewed either as a cooperative process which requires agents to flexibly and adaptively work together towards goal attainment, or it may be viewed as a conflict resolution process in which interactive participation is not required; instead the focus is on detecting and resolving conflicting goals and actions. Therefore, multi-crew coordination is analyzed from two aspects: (1) the coordination process; (2) the conflicts during coordination. By analyzing multi-crew coordination, Grubb and Simon (2002) [10] proposed the Crew Coordination Model (CCM). In this model, the crew cooperation process will
Fig. 3. The analysis of the multi-crew cockpit design
Fig. 4. The analysis interface of the vision
happen on these occasions: input when the plan changes, notification when situation awareness changes, input when the state of the aircraft changes, and supervision while an operation is performed. In the process of cooperation, the cooperation model can be divided into three categories (taking a two-pilot cockpit as an example): one pilot operates while the other monitors; the two monitor interactively; the two operate interactively. For the analysis of multi-crew conflict, according to Jehn's (1995) [11] view, conflict is divided into task conflict, relationship conflict and process conflict. In the multi-crew cockpit, task conflict refers to inconsistency in the content of the crew tasks; relationship conflict refers to incompatible interpersonal relations between crew members; process conflict refers to conflicts about how to complete the task, especially conflicts between monitoring and operating in the process of cooperation. Despite their different contents, all these conflicts affect the crew's operating performance and increase the crew workload. This paper mainly studies the ergonomics issues of the process conflict, which was analyzed and evaluated with the digital human model.

3.2 The Application Example of Human Model

Taking a two-person crew as an example, the cooperation mode is that one pilot operates and the other monitors. The coordinated task was that the crew needed to re-enter data when situation awareness changed. In this coordination mode, the pilot in the right seat is in charge of entering the new flight data, operating the EICAS and controlling the devices on the overhead panel, while the pilot in the left seat is in charge of supervising the outside, the operating results of the right-seat pilot and the changes of the panel. In the process of multi-crew cockpit design, the process conflict in this coordination mode was analyzed and assessed with the human model, as shown in Fig. 3. The human model in Fig. 3 is the 50th percentile human model. The assessments included: whether the position and arrangement of the monitor was in the best field of vision, whether the
position of the controls operated by the right-seat pilot was inside the field of vision, and whether the result of the operation was blocked from view while the right-seat pilot performed it. The analysis interface for vision is shown in Fig. 4. The accessibility of the right-seat pilot was also analyzed, i.e., it was assessed whether the controls were within the reach envelope of the right-seat pilot. The analysis interface for accessibility is shown in Fig. 5. In the analysis, the posture control and movement control of the human model were accomplished through the control interface of the human model (Fig. 6).
Fig. 5. The analysis interface of the accessibility
Fig. 6. The control interface of human model
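Hypothetical sketches of the two geometric checks described above, a field-of-view cone test and a coarse reach-envelope test (Python/NumPy; the cone half-angle and the reach radius would come from the pilot's anthropometric data, and the function names are ours):

    import numpy as np

    def in_field_of_view(eye, gaze_dir, target, half_angle_deg):
        # is the target inside a cone of half_angle_deg around the gaze?
        v = target - eye
        c = np.dot(gaze_dir, v) / (np.linalg.norm(gaze_dir) * np.linalg.norm(v))
        return np.degrees(np.arccos(np.clip(c, -1.0, 1.0))) <= half_angle_deg

    def reachable(shoulder, control, arm_reach):
        # coarse accessibility test: is the control inside the reach sphere?
        return np.linalg.norm(control - shoulder) <= arm_reach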
4 Summary

By analyzing the human factors of multi-crew design with a digital human model, the paper evaluated the process conflict in the multi-crew cockpit from the aspects of vision and accessibility. Based on the evaluation results, some advice was given on multi-crew cockpit design. Conflict in the cockpit would thereby be reduced, and the safety and efficiency of multi-crew flight operation could be increased. However, the conflicts in a multi-crew cockpit are influenced by many factors, and this paper only analyzed the process conflict. The influence of task conflict and relationship conflict on multi-crew cockpit design needs further research.
References

[1] Orlady, H.W., Orlady, L.M.: Human factors in multi-crew flight operation (1999)
[2] Wang, L.J., Yuan, X.G.: Geometrical Dummy used for ergonomic evaluation on early stage cockpit conceptual design. J. Space Medicine & Medical Engineering 14(6), 439–443 (2001)
[3] Su, R.-E., Xue, H.-U., Song, B.-F.: Ergonomic virtual assessment for cockpit layout of civil aircraft. J. Systems Engineering-Theory & Practice 29(1), 185–191 (2009)
[4] Libo, Z.: Research on technology and visualization of ergonomics analysis and assessment for aircraft cockpit. Beihang University, Beijing (2008)
[5] Zhu, D., Zheng, L.: Human Anatomy. Fudan University publishing, Shanghai (2002)
[6] GJB 2873-1997 Human Engineering design criteria for military equipment and facilities. The criterion publishing company of China, Beijing (1997)
[7] GJB/Z 131-2002 Human engineering design handbook for military equipment and facilities. The criterion publishing company of China, Beijing (2002)
[8] Zheng, X., Jia, S., Gao, Y., et al.: The modern biodynamic of movement. The national defense industry publishing company, Beijing (2002)
[9] Olson, W.A.: Supporting Coordination Widely Distributed Cognitive Systems: The Role Of Conflict Type, Time Pressure, Display Design, And Trust. College Of The University Of Illinois, Illinois (1999)
[10] Grubb, G., Simon, R.: Development of candidate crew coordination training methods and materials. ARI Contractor Report (2002)
[11] Jehn, K.A.: A multi-method examination of the benefits and detriments of intra-group conflict. J. Administrative Science Quarterly 40, 530–557 (1995)
A Biomechanical Approach for Evaluating Motion Related Discomfort: Illustration by an Application to Pedal Clutching Movement

Xuguang Wang1, Romain Pannetier1,2, Nagananda Krishna Burra1, and Julien Numa1

1 Université de Lyon, F-69622, Lyon, France; IFSTTAR, UMR_T9406, LBMC, Université Lyon 1
2 Renault SAS, Service Facteur Humain, Conduite et Vie Abord, 1 avenue du Golf, Guyancourt, France
[email protected]
Abstract. In this paper, a motion related discomfort modelling approach based on the concept of "less constrained movement" is proposed and illustrated by a case study of clutch pedal movements. Using a multi-adjustable car mock-up, 6 existing pedal configurations were tested by 20 subjects (5 young and 5 older males, 5 young and 5 older females) and compared with freely adjusted pedal positions, called "less constrained" configurations. From the questionnaire and the motion analysis of the experimental data, it was observed that pedal resistance had a dominant effect on discomfort perception. The pedal position adjustment seemed to mainly reduce the discomfort at the beginning of travel. The ergonomic criterion for pedal design should therefore take into account two main factors: 1/ pedal resistance, 2/ a good trade-off between the pedal position at the beginning of travel and its end position. The movements corresponding to the less constrained configurations will be used as reference data for defining the ergonomic criterion. In particular, the focus should be put on the hip, knee and ankle joint torques at the end of travel and on the joint angles at the beginning and end of pedal travel.

Keywords: Discomfort, Clutch pedal, Biomechanics, Ergonomics, Task-oriented movement.
1 Introduction

Digital human simulation is believed to be one of the key technologies for proactive design. Almost all components of a product can be digitalized using today's CAD (Computer-Aided Design) technologies. Being able to evaluate customer satisfaction through digital human simulation at an early stage of product design becomes an absolute necessity for reducing the design cycle. It can easily be understood that such an advanced digital human simulation tool can be applied in a large industrial domain where the designed system addresses a high number of users (clients/operators) and where a physical mock-up is difficult to produce. However, although DHMs have existed for over 30 years, the current models do not
fulfil all the requirements desired by the users; refer to the handbook of Digital Human Modeling (DHM) recently edited by Duffy (2008) for an exhaustive review of DHM research and development. The main reasons are, on the one hand, the high expectations of users and, on the other hand, the high complexity of human modelling. Despite recent progress in motion simulation, current DHMs are mainly limited to geometric and kinematic representations of humans, which are enough for a visually human-like simulation of motions. But objective discomfort criteria for evaluating a task are still missing, making it difficult for a design engineer to judge whether the designed product will be appreciated in terms of ease of use and comfort. Discomfort is induced by interactions with the environment and by internal biomechanical constraints affecting the musculoskeletal system. We believe that the simulation of dynamics and muscular activity is required in order to evaluate discomfort. Meanwhile, functional human data, such as joint maximum range of motion and joint muscle strength, are necessary for validating biomechanical models and for defining task-related discomfort criteria. However, few consistent and reliable data are available that can be integrated into a DHM. Therefore, a three-year European collaborative research project named DHErgo (Digital Humans for Ergonomic design of products, www.dhergo.org) was launched in 2008 to develop dynamic, musculoskeletal human models for ergonomic applications. The objective of this paper is to present the biomechanical approach adopted in DHErgo for evaluating motion related discomfort. The approach is illustrated with one of the three case studies performed in this project: the lower limb movement when operating a clutch pedal.
2 General Motion Related Discomfort Modelling Approach

An integrated approach has been adopted for modelling both motion and discomfort (Fig. 1). An objective evaluation of discomfort requires a realistic motion simulation, whereas a realistic motion simulation has to take discomfort criteria into account. Typically, our research begins with an experiment with voluntary subjects and a well-planned experimental design. For each movement to be studied, we use a motion capture system (e.g. Vicon) to measure the trajectories of markers attached to the body surface. Meanwhile, the external contact forces are also required if we want to estimate joint forces and torques. An individual digital mannequin can be defined by superimposing a digital model and the subject's photos in a calibrated space (see, for instance, Seitz and Bubb, 2001). In DHErgo, in order to improve the accuracy of the individualized model, anatomical landmarks are also palpated (Robert et al, 2010). In the case of musculoskeletal models, an attempt has also been made to scale muscle strength with the measured maximum joint strength (Pannetier et al, 2011). With the marker trajectories, the initial digital anthropometric model and the external forces as inputs, the movement can be reconstructed kinematically using a global optimization procedure (Ausejo et al., 2006), and the joint forces and torques can then be estimated using an inverse dynamics procedure. For instance, inverse dynamic analysis was recently applied to truck (Monnier et al, 2009 & 2010) and car ingress/egress movements (Causse et al, 2009 & 2011). Using a static optimisation method, muscle forces can also be estimated (see, for instance, Fraysse et al, 2009). Then, motion
analysis can be performed in order to: 1/ identify and understand motion control strategies, 2/ structure the motion data according to the subject's characteristics, the task and the motion control strategies, 3/ understand the sources of discomfort and search for possible relationships between discomfort and biomechanical parameters. Moreover, some basic biomechanical data, such as joint maximum range of motion and joint strength, are required for motion simulation and discomfort evaluation. Meanwhile, specific efforts are needed to develop an efficient motion adaptation algorithm in order to adapt an existing motion to a new simulation scenario.
Fig. 1. Integrated approach for simulating motion and discomfort
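To give a flavour of the inverse dynamics step mentioned above, here is a minimal static special case in Python/NumPy. The full procedure works segment by segment over the whole body and includes inertial terms; the function name and the example inputs are ours:

    import numpy as np

    def static_joint_torque(joint, contact_point, contact_force,
                            segment_com, segment_mass, g=9.81):
        """Net torque the joint must supply to balance one external
        contact force and the weight of the distal segment (statics:
        the inertial terms of full inverse dynamics are neglected)."""
        weight = np.array([0.0, 0.0, -segment_mass * g])
        torque_ext = np.cross(contact_point - joint, contact_force)
        torque_grav = np.cross(segment_com - joint, weight)
        return -(torque_ext + torque_grav)

    # e.g. ankle torque from a measured 170 N pedal force (assumed geometry)
    tau = static_joint_torque(joint=np.array([0.0, 0.0, 0.1]),
                              contact_point=np.array([0.15, 0.0, 0.05]),
                              contact_force=np.array([-120.0, 0.0, -120.0]),
                              segment_com=np.array([0.08, 0.0, 0.08]),
                              segment_mass=1.2)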
It should be pointed out that existing ergonomic assessment methods such as OWAS (Karhu et al, 1977), RULA (McAtamney and Corlett, 1993), REBA (Hignett and McAtamney, 2000) and OCRA (Occhipinti, 1998) were initially developed for the observation of working postures in industry. Only a very rough estimation of posture is usually required, either from direct visual estimation or from recorded video. In addition, the postural evaluation criteria were decided by a group of ergonomics experts. These methods can certainly be helpful for detecting the main risk factors of a workplace, but they can hardly be used for the ergonomic evaluation of a product such as a vehicle. Any task-oriented motion is more or less constrained by the environment. A design engineer often needs to choose one solution among several alternatives. If we accept the idea that better comfort is obtained when people can make appropriate adjustments by themselves, then less constrained motions can be obtained experimentally. For this, a multi-adjustable mock-up is usually required. These less constrained motions, also called "neutral" motions by Dufour and Wang (2005), can then serve as reference data against which a proposed solution is compared. Therefore, the proposed discomfort modelling approach for the ergonomic assessment of a task-related motion can be summarized as follows:
1. Identify the main critical design parameters that may affect the discomfort perception and develop working hypotheses.
2. Plan the experiment. In this step, one has to propose an appropriate experimental design with identified independent variables, which should be related to the main critical parameters defined in Step 1. Appropriate apparatus should be chosen to manipulate the independent variables and to measure the dependent responses. In the case of product design optimisation, a multi-adjustable experimental mock-up is usually needed, with the necessary facilities allowing the participants to easily choose their preferred adjustments. In addition to motion capture with the measurement of external contact forces, a discomfort questionnaire is the only direct way to "measure" the subjective feeling of the participants.
3. Conduct the experiment with a sample of voluntary participants.
4. Process and analyse the data. Three types of data analysis have to be performed, mainly by comparing imposed and freely adjusted configurations: a) questionnaire analysis for understanding the effects of the independent variables on perceived discomfort, b) motion analysis for identifying the main motion characteristics (key frames, motion control strategies, etc.), c) search for possible relationships between perceived discomfort and biomechanical parameters from the motion analysis.
5. Define the ergonomic criterion based on the data analysis for implementation in a DHM tool.
3 Application to Pedal Clutching Movement

In the DHErgo project, the pedal clutching movement was chosen as the first case study to check how the proposed approach could help improve pedal design when using a DHM tool.

3.1 Data Collecting

Two types of data were collected: 1/ data characterizing a participant in terms of anthropometry, ranges of motion, joint strength and joint torque perception, and 2/ clutching motions for several pedal configurations proposed by the three car manufacturers involved in DHErgo. A pedal configuration is mainly characterized by seat height, pedal resistance at the end of travel, pedal travel length and travel inclination angle (Wang et al, 2000). Six different pedal configurations were tested. In this experiment, a less constrained configuration with an adjustable pedal position was also tested, allowing the investigation of the effects of pedal position. For this, a multi-adjustable experimental mock-up similar to that used in a past study (Wang et al, 2000, Fig. 2) was used. 10 younger and 10 older male and female subjects participated in the experiment. They were divided into 4 groups (young male, young female, older male, older female), with statures close to the respective group average. A high number of physical parameters related to the clutch pedal operation were measured: whole body motions with external surface markers, the 3-axis force applied on the pedal, the pressure between subject and seat, EMG signals of the main muscle groups of the lower limb, etc. Meanwhile, discomfort feelings when clutching were
also collected through a questionnaire. Prior to the clutch pedal experiment, data characterizing each subject's individual physical capacity of the left lower limb were also collected (Fig. 2): joint strength, joint torque perception and joint range of motion. All these data were used for estimating biomechanical parameters such as joint angles, joint torques and muscle forces, for a better understanding of the perceived discomfort when clutching and for proposing a predictive discomfort model of pedal clutching.
Fig. 2. Experimental set up for the measurement of ankle joint maximum dorsi-flexion (a), of the hip maximum flexion torque and four other intermediate force levels (b) and of the clutching pedal movements (c)
3.2 Main Experimental Results

Here only the main results from the data analysis are presented, to illustrate how the proposed discomfort modelling approach has been applied to the pedal clutching operation. More details will be presented elsewhere (Pannetier et al, 2011, Burra et al, 2011). In this case study, the responses to the questionnaire and the biomechanical parameters were analysed according to three independent variables:

• Group of subjects (Group or G), i.e. 4 groups according to age and gender
• Configuration (Config or C), 6 existing pedal configurations provided by three car manufacturers
• Type of configuration (ConfigType or T), i.e. imposed or freely chosen

Here we mainly focus on the comparison between imposed and free pedal configurations.

Range of Pedal Adjustment. After having tested each of the imposed pedal configurations, the participants were asked to change only the pedal position, without modifying the other car parameters. Table 1 summarizes the pedal adjustment in x (longitudinal, + backwards), y (lateral, + towards right) and z (vertical, + upwards) with respect to the six imposed configurations. The adjustments in the x, y and z directions were on average -8.8, -21 and -16.8 mm respectively. They were strongly dependent on the pedal configuration. An effect of the subject group was observed only in the x and z directions. Globally, the participants preferred to put the pedal slightly further forwards (Δx = -8.8 mm on average), a little to the left (Δy = -21 mm on average) and also slightly lower (Δz = -16.8 mm on average).
Table 1. Means and standard deviations of the adjustment in x (longitudinal, + backwards), y (lateral, + towards right) and z (vertical, + upwards) with respect to the six imposed pedal positions
Config   N     Δx (mm)             Δy (mm)            Δz (mm)
C1       20    -3.0±17.7           -41.0±18.8         -45.5±32.9
C2       20    -3.4±20.6           -4.2±12.7          -46.6±24.8
C3       20    -1.4±32.6           0.9±4.0            -43.5±30.8
C4       20    -3.9±24.8           -31.0±22.6         -4.4±25.7
C5       60    -14.9±18.0          -18.3±14.0         0.2±27.9
C6       20    -13.8±23.1          -37.9±16.3         4.6±36.4
Total    160   -8.8C*,G* ±22.5     -21.0C*** ±20.6    -16.8C***,G** ±36.5
*p<0.05, **p<0.01,*** p<0.001: significance levels of factors in ANOVA. C: configuration, G: group of subjects
Fig. 3. Discomfort ratings centred to each participant’s mean value in terms of configuration (a) and perceived resistance levels (b)
Questionnaire Analysis. Two questions were formulated to find out how the participants perceived the pedal position at the beginning and at the end of travel. Most of the participants found the imposed pedal positions too close, too far to the right and too high at the beginning, in agreement with the range of adjustments they preferred (Table 1). This was particularly true for the female participants. However, a higher proportion of participants found the end pedal positions too far away after adjustment, suggesting that the participants preferred a more comfortable pedal position at the beginning at the expense of a less comfortable posture at the end of pedal travel. The analysis of the responses on the perception of the pedal resistance showed that the distribution of the responses (5 levels from "very low" to "very high") was affected neither by the type of configuration nor by the group of subjects; it depended only on the configuration. Three groups of configurations were defined: 1) C3, with most responses "very low" and "low"; 2) C1, C2 and C5, with most responses "medium"; 3) C4 and C6, with most responses "medium" or "high". The participants were asked to rate the global discomfort of clutching a pedal using the CP-50 scale. The analysis of the discomfort ratings showed that the perceived
discomfort was strongly correlated with the perceived pedal resistance level. The pedal configurations with a low perceived resistance also had a lower discomfort rating (Fig. 3). A small but significant difference in perceived discomfort was found between imposed and less constrained pedal configurations: as expected, the freely adjusted pedal configurations were rated better.

Motion Analysis. After reconstructing the clutching pedal movements, biomechanical parameters such as joint angles and joint torques were analysed. Several key frames were identified, among them the beginning and the end of pedal travel. The comparison between imposed and less constrained pedal configurations showed significant differences in the joint angles at the hip, knee and ankle due to the change of pedal position. The resultant pedal force at the end of travel varied on average from 166 N to 184 N depending on the pedal configuration. Note that the pedal C3 had the lowest resistance, in agreement with the subjective perception of the resistance level. No significant difference in pedal force was found between imposed and less constrained configurations. A high pedal force does not necessarily produce a high joint torque, because the joint torque also depends on the force direction, knowing that the pedal force direction is not fully imposed (Wang et al, 2000).
Fig. 4. Mean knee extension torques in Nm (a) and mean pedal resultant forces in N (b) at end of travel for pedal configurations
Fig. 4 shows the mean values of the knee extension torques and the pedal resultant forces for the 6 configurations (imposed and free). From visual inspection of the knee joint torques, the same three groups of configurations as those defined according to discomfort rating and force perception can be observed, suggesting that the joint torque might be used for defining an ergonomic assessment criterion.

Joint Torque Perception. The maximum joint torques as well as the joint torques corresponding to four intermediate force levels (very low, low, medium, high) were also collected prior to the clutch pedal experiment with the same participants. These data allowed us to test the hypothesis that joint force perception depends only on the force level normalized by the corresponding maximum force. Maximum flexion and extension active voluntary isometric forces were measured for the left lower limb joints (hip, knee and ankle) at different joint angular positions. The active muscular
joint torque, which takes into account the effects of body segment weight, was calculated. Joint torques were measured at 4 intermediate voluntary force levels, in addition to the rest force and the maximum forces, to analyze the subjective joint force perception. The force levels normalized by the corresponding maximum force were analyzed as a function of joint, joint position and force direction. High variations were observed in the measured forces for the four intermediate categories of force levels. Our results seemed to support that joint force perception depends only on the ratio between the current force level and its maximum voluntary value: the normalized joint torque did not depend on age, gender, joint, joint position, force direction, force application mode or experimental session. Fig. 5 shows an example for knee flexion at a knee flexion angle of 45°. If we use Borg's CR-10 rating scale, the joint torque perception law can be obtained in a numerical form. This provides a way to interpret joint torque.
Fig. 5. Mean normalized net joint torques of all subjects at the 4 intermediate force levels (very low, low, medium, high) for knee flexion at 45° (a) and the corresponding fitting curve when the four levels are transformed into numbers as in Borg's CR-10 (b); fit over all subjects: R² = 0.61964, C1 = 8.7956, C2 = 0.012067
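The functional form of the fitted perception law is not given in the text, so the following Python sketch only illustrates the fitting procedure; the two-parameter power law, the CR-10 numbers assigned to the four verbal levels and the torque values are all assumptions:

    import numpy as np
    from scipy.optimize import curve_fit

    # assumed CR-10 numbers for very low / low / medium / high,
    # and assumed mean normalized torques measured at those levels
    borg_level = np.array([1.0, 2.0, 3.0, 5.0])
    torque_norm = np.array([0.10, 0.22, 0.38, 0.62])

    def perception_law(level, c1, c2):
        # assumed Stevens-type form: T_norm = c2 * level ** c1
        return c2 * level ** c1

    (c1, c2), _ = curve_fit(perception_law, borg_level, torque_norm,
                            p0=(1.0, 0.1))
    print(f"fitted C1 = {c1:.3f}, C2 = {c2:.3f}")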
Ergonomic Criterion. In summary, from the questionnaire and the motion analysis of the clutching pedal movements, it can be concluded that pedal resistance had a dominant effect, and that the pedal position adjustment seemed to mainly reduce the discomfort at the beginning of travel. These observations are in agreement with the results of our previous study (Wang et al, 2004). The ergonomic criterion for pedal design should therefore take into account two main factors: 1/ pedal resistance, 2/ a good trade-off between the pedal position at the beginning of travel and its end position. The movements corresponding to the less constrained configurations will be used as reference data for defining the ergonomic criterion. In particular, we are going to focus on the hip, knee and ankle joint torques at the end of travel and on the joint angles at the beginning and end of pedal travel.
4 Discussion and Concluding Remarks

In this paper, a motion related discomfort modelling approach based on the concept of "less constrained movement" has been proposed and illustrated by a case study of
clutching pedal movements. From the analysis of the collected data, the way to define an ergonomic criterion for optimising pedal design has been identified. At the time of writing this paper, this work is still ongoing; the identification of the ergonomic criterion and its implementation in a DHM will be available soon. Like the task-oriented motion simulation approach we proposed before (Monnier et al, 2008), the proposed discomfort modelling approach is also data-based. As all data-based methods, it suffers from two main drawbacks: 1/ a new experiment has to be performed for every new task-related design problem, which is a time-consuming process; 2/ it is difficult to extrapolate outside the experimental conditions.

Acknowledgement. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n°218525.
References
1. Ausejo, S., Suescun, Á., Celigüeta, J., Wang, X.: Robust human motion reconstruction in the presence of missing markers and the absence of markers for some body segments. In: SAE 2006 International Conference and Exposition of Digital Human Modeling for Design and Engineering, Lyon, France, July 4-6 (2006); SAE Paper 2006-01-2321
2. Borg, G.: Borg's perceived exertion and pain scales. Human Kinetics (1998)
3. Burra, N.K., Numa, J., Pannetier, R., Wang, X.: Experimental investigation on subjective joint force perception. In: 1st International Symposium on Digital Human Modelling, Lyon, France, June 14-16 (2011)
4. Causse, J., Chateauroux, E., Monnier, G., Wang, X., Denninger, L.: Dynamic analysis of car ingress/egress movement: an experimental protocol and preliminary results. In: SAE 2009 International Conference and Exposition of Digital Human Modeling for Design and Engineering, Göteborg, Sweden, June 9-11 (2009); SAE Paper 2009-01-2309
5. Causse, J., Wang, X., Denninger, L.: Use of the steering wheel during the car egress motion. In: XXIIIth Congress of the International Society of Biomechanics, Brussels, Belgium, July 3-7 (2011) (submitted)
6. Duffy, V.: Handbook of Digital Human Modeling: Research for Applied Ergonomics and Human Factors Engineering. CRC Press, Boca Raton (2008)
7. Dufour, F., Wang, X.: Discomfort assessment of car ingress/egress motions using the concept of neutral movement. In: SAE 2005 Transactions Journal of Passenger Cars Mechanical Systems, pp. 2905-2914 (2005)
8. Fraysse, F., Dumas, R., Cheze, L., Wang, X.: Comparison of global and joint-to-joint methods for estimating the hip joint and the muscle forces during walking. Journal of Biomechanics 42(14), 2357-2362 (2009)
9. Hignett, S., McAtamney, L.: Rapid entire body assessment (REBA). Applied Ergonomics 31, 201-205 (2000)
10. Karhu, O., Kansi, P., Kuorinka, I.: Correcting working postures in industry: a practical method for analysis. Applied Ergonomics 8(4), 199-201 (1977)
11. McAtamney, L., Corlett, E.N.: RULA: a survey method for the investigation of work-related upper limb disorders. Applied Ergonomics 24(2), 91-99 (1993)
12. Monnier, G., Wang, X., Trasbot, J.: RPx: a motion simulation tool for car interior design. In: Duffy, V.G. (ed.) Handbook of Digital Human Modeling. CRC Press, Boca Raton (2008)
13. Monnier, G., Chateauroux, E., Wang, X., Roybin, C.: Inverse dynamic reconstruction of truck cabin ingress/egress motions. In: SAE 2009 International Conference and Exposition of Digital Human Modeling for Design and Engineering, Göteborg, Sweden, June 9-11 (2009); SAE Paper 2009-01-2286
14. Monnier, G., Chateauroux, E., Roybin, C., Wang, X., Robert, T.: Upper and lower limb coordinations in truck cabin accessibility: comparison of kinematic and dynamic coordinations. In: 11th International Symposium on the 3-D Analysis of Human Movement, San Francisco, USA, July 14-16 (2010)
15. Occhipinti, E.: OCRA: a concise index for the assessment of exposure to repetitive movements of the upper limbs. Ergonomics 41(9), 1290-1311 (1998)
16. Pannetier, R., Burra, N.K., Numa, J., Robert, T., Wang, X.: Effects of the free adjustment in pedal position on clutching movement and perceived discomfort. In: 1st International Symposium on Digital Human Modelling, Lyon, France, June 14-16 (2011)
17. Pannetier, R., Robert, T., Holmberg, J., Wang, X.: Optimization-based muscle force scaling for subject-specific maximum isometric force estimation. In: XXIIIth Congress of the International Society of Biomechanics, Brussels, Belgium, July 3-7 (2011)
18. Robert, T., Ausejo, S., Beurier, G., Celigueta, J.T., Sholukha, V., Van Sint Jan, S., Viossat, P., Wirsching, H.J., Wang, X.: An automated procedure for the personalization of digital human models for human motion analysis. In: Proceedings of the 1st Virtual Physiological Human Conference, Brussels, Belgium, September 30-October 1 (2010)
19. Seitz, T., Bubb, H.: Human-model based movement-capturing without markers for ergonomic studies. In: Digital Human Modeling for Design and Engineering Conference and Exhibition, Arlington, VA, USA (2001); SAE Paper 2001-01-2113
20. Wang, X., Verriest, J.P., Lebreton-Gadegbeku, B.: Experimental investigation and biomechanical analysis of lower limb movements for clutch pedal operation. Ergonomics 43(9), 1405-1429 (2000)
21. Wang, X., Le Breton-Gadegbeku, L., Bouzon, L.: Biomechanical evaluation of the comfort of automobile clutch pedal operation. International Journal of Industrial Ergonomics 34, 209-221 (2004)
Footbed Influences on Posture and Perceived Feel

Thilina W. Weerasinghe and Ravindra S. Goonetilleke

Human Performance Laboratory, Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
[email protected]
Abstract. Past studies have evaluated body postures when wearing high-heeled shoes. However, the effects of the various parameters that give the footbed its shape have not been investigated. This study determined the perceived feel and the associated postures for different footbed shapes at two heel heights, 50 and 75 mm. Results show that a 10-degree wedge angle at 50 mm and an 18-degree wedge angle at 75 mm have the highest perceived feel during standing. The corresponding postures were significantly different from the others tested, suggesting that a well-designed footbed is a necessary condition for feeling comfortable and maintaining good posture.

Keywords: Posture, Footbed, Comfort, High-heels, Footwear.
1 Introduction

Even with all the negativity, women continue to wear high-heeled shoes and consider them a fashion item. The primary reason for the discomfort and pain associated with high-heeled shoes is the mismatch between the foot and the surface of the shoe, known as the footbed. The two feet contain approximately 25% of the total number of bones in the body, and hence they are quite flexible. If the shoe surface is not designed well, the foot will attempt to deform, within its capabilities, to a shape that conforms to the footbed. However, the foot cannot deform in an infinite number of ways: the deformation it can undergo is constrained by the length of the bones and the anatomical structure. Thus, a poorly designed footbed can put the foot into a stressful posture, leading to a cascading effect on the rest of the body. Maintaining non-neutral postures requires elevated levels of muscular effort, which invariably leads to muscle fatigue (Joseph and Nightingale, 1956; Lee et al., 2001). The studies of Opila et al. (1988), Franklin et al. (1995) and Lee et al. (2001) recorded significant levels of lumbar flattening in high-heeled shoe wearers when compared to the foot-flat-stance lumbar lordosis. Opila et al. (1988) and Franklin et al. (1995) also observed that the legs become straighter and the pelvis rolls backwards in a high-heeled shoe stance when compared with the foot-flat stance. Thus, there are many deformations within the whole body. We attempted to investigate whether the footbed shape and material can have any influence on body posture and the associated perceptions.
Previous studies have shown that an SS60 material with a wedge angle of 10° or 11° at a heel height of 50 mm, and of 16° or 18° at a heel height of 75 mm, was perceived to be the most comfortable (Witana et al., 2009a, b). Those results were explained with contact area, peak plantar pressure and the percentage of force acting on the forefoot region. Thus, the objective of this study was to evaluate whether any postural changes exist when standing on these ideal shapes as opposed to standing flat on the ground and on other surfaces with non-ideal conditions.
2 Methodology

2.1 Equipment

The Profile Assessment Device (PAD) (Goonetilleke et al., 2010) was used to simulate different footbed shapes. A 3D motion analysis system (Motion Analysis Corp.) with six cameras was used to capture the posture of the subject during each 30-second trial at a frequency of 100 Hz. Thirty retroreflective markers were placed on the body to obtain the subjects' posture. The F-Scan system was used to obtain the pressure profile of the subjects' feet. The motion analysis system and the F-Scan system were calibrated at the beginning. The pressure data are reported elsewhere; this paper focuses on the perceived feel and postural aspects.

2.2 Subjects

The subjects were twelve female college students with a mean age of 20 years and a mean height of 158.5 cm. All subjects were free of foot deformities or ailments, postural instability diseases, and knee or spinal injuries. Subjects who passed this screening had to undergo another screening test to check their ability to estimate line length.

2.3 Procedure

Two heel heights (50 and 75 mm), three wedge angles (4°, 10°, 14° at 50 mm; 14°, 18°, 22° at 75 mm) and three different mid-foot supports (45-PU, HL-PU, HL-SS) formed the test conditions (Witana et al., 2009a, b). The toe spring was controlled to be 0°. Every subject stood on the PAD when it was set to each of the conditions. The subject was asked to maintain a static posture for 30 seconds with the heel-to-heel distance controlled at 17 cm, and to maintain equal loads on both feet when having the same PAD settings under the feet. The participants rated their perceived feel while their posture and foot pressure were captured using the motion analysis and F-Scan systems, respectively. To increase reliability, two replicates were used; only the second replicate was analyzed. Each subject was asked to rate the perceived feeling of each test condition on a scale of 0-100, where 100 corresponded to standing on the ground.
3 Results

An ANOVA performed at each heel height on the perceived feel ratings showed wedge angle, mid-foot surface and their interaction to have significant effects (p < 0.05). A simple-effects analysis showed that, at a heel height of 50 mm, the 10°-HL-SS condition was significantly different from all the other conditions (p < 0.01) (Figures 1 and 2). At a heel height of 75 mm, the 18° wedge angle, HL-SS mid-foot condition was different from the others (p < 0.01) (Figures 3 and 4).
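A minimal sketch (not the authors' analysis) of such a two-way ANOVA with the statsmodels formula API; the data frame below is purely illustrative, with hypothetical column names and ratings:

import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "wedge_angle": ["4", "10", "14"] * 12,                       # deg
    "midfoot": (["45-PU"] * 3 + ["HL-PU"] * 3 + ["HL-SS"] * 3) * 4,
})
# illustrative ratings: one 3x3 cell pattern replicated with noise
base = np.array([60, 75, 65, 55, 70, 60, 65, 90, 70] * 4, dtype=float)
df["feel"] = base + rng.normal(0.0, 3.0, size=36)

model = ols("feel ~ C(wedge_angle) * C(midfoot)", data=df).fit()
print(anova_lm(model, typ=2))   # main effects and interaction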
Fig. 1. Perceived feel (0-100) interaction of mid-foot condition (45-PU, HL-PU, HL-SS) and wedge angle (4°, 10°, 14°) at a heel height of 50 mm
The shoulder and knee marker data were also analyzed. To determine the shoulder displacement for each condition, the anterior superior iliac spine (ASIS) marker of the subject in each condition was aligned with that when standing on the ground. Thereafter, the displacement of the shoulder was determined relative to the shoulder marker when standing on the ground. The ANOVA on shoulder displacement showed a trend (p = 0.059) at 50 mm and a significant effect at 75 mm (p < 0.01). A paired t-test revealed that, at 50 mm, the condition with a wedge angle of 10 degrees and HL-SS had a significantly lower shoulder displacement than the other conditions (p < 0.05). At 75 mm, 18°-HL-SS gave the minimum shoulder displacement (p < 0.05) (Figures 5 and 6). Similarly, the knee displacements were evaluated with the ankle as a reference, with displacements calculated relative to the knee marker when standing on the ground. The knee displacements at 75 mm are shown in Figure 7.
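A minimal sketch of this displacement computation, assuming hypothetical 3-D marker positions in mm:

import numpy as np

def shoulder_displacement(asis_cond, shoulder_cond, asis_ground, shoulder_ground):
    """Displacement (mm) of the shoulder marker relative to ground standing,
    after translating the condition posture so the ASIS markers coincide."""
    offset = asis_ground - asis_cond            # translation aligning the ASIS
    shoulder_aligned = shoulder_cond + offset
    return np.linalg.norm(shoulder_aligned - shoulder_ground)

# example: one footbed condition vs. standing on the ground (invented values)
print(shoulder_displacement(np.array([0.0, 10.0, 900.0]),
                            np.array([150.0, 20.0, 1350.0]),
                            np.array([0.0, 0.0, 905.0]),
                            np.array([148.0, 8.0, 1352.0])))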
Fig. 2. Perceived feel of the nine experimental conditions at 50 mm
Fig. 3. Perceived feel (0-100) interaction of mid-foot condition (45-PU, HL-PU, HL-SS) and wedge angle (14°, 18°, 22°) at a heel height of 75 mm
Fig. 4. Perceived feel of the nine experimental conditions at 75 mm
Fig. 5. Shoulder displacement from the foot-flat condition for the different experimental conditions at 50 mm
Fig. 6. Shoulder displacement (mm) from the foot-flat condition for the different experimental conditions at 75 mm

Fig. 7. Knee displacement (mm) relative to the foot-flat condition for the different experimental conditions at 75 mm
4 Discussion

The subjective ratings are in agreement with those of Witana et al. (2009a, b), where the 10-degree wedge angle with HL-SS had the highest rating at 50 mm and the 18-degree wedge angle with HL-SS had the highest rating at 75 mm.
Lateur et al. (1991) stated that the highest compensation for the deviation of a high-heeled stance takes place at the ankle joint. Franklin et al. (1995) claimed that the
external rotation at the ankle puts the joint into a locked state and hence the compensation is limited. They did not find any change in knee flexion angle between subjects standing with their heels on a board 5.1 cm high and subjects standing on flat ground. As they pointed out, the lack of any midfoot support may have been the reason for the similarity. Joseph and Nightingale (1956) found that the activity of the soleus muscle increased with increased heel height; they claimed that the increase possibly serves to prevent the body from falling forward at the ankle. At 50 mm with a wedge angle of 10°, and at 75 mm with 18°, the knee displacement in the posterior direction was the maximum. Extending the results of Joseph and Nightingale, it may be inferred that soleus muscle activity may be reduced in these preferred configurations, implying that the ankle joint has more freedom to compensate for the high-heeled stance. Mundermann et al. (2001) claimed that there is a relationship between perceived comfort and body alignment. This is seen even in this study, where the ideal footbeds with the highest perceived feel have significantly different shoulder and knee displacements compared with the other settings. Previous research has shown that wearing high-heeled shoes impacts posture and produces significant deviations from the neutral standing posture. This study is different in the sense that the results show that ideal footbed shapes can minimize the deviations from the neutral posture, at least in the shoulder and knee joints. More research is needed to analyze the dynamic posture of high-heeled gait.
5 Conclusion

The ideally shaped footbed that gave the highest perceived feel was identified using a special device that allowed the footbed shape to be manipulated. At 50 and 75 mm, the highest perceived feel was obtained with a wedge angle of 10° and 18°, respectively, with an HL-SS footbed. The postural deviations with such footbed shapes were significantly different from those of the other footbed shapes.

Acknowledgment. The authors would like to thank the Research Grants Council of Hong Kong for funding this study under grant HKUST GRF 613607.
References
1. Franklin, M.E., Chenier, T.C., Brauninger, L., Cook, H., Harris, S.: Effect of positive heel inclination on posture. Journal of Orthopaedic & Sports Physical Therapy 21(2), 94-99 (1995)
2. Goonetilleke, R.S., Witana, C.P., Weerasinghe, T.W.: Customized shoe and insole, method and apparatus for determining shape of a foot and for making a shoe or insole. US Patent 7854071B2, December 21 (2010)
3. Joseph, J., Nightingale, A.: Electromyography of muscles of posture: leg and thigh muscles in women, including the effects of high heels. Journal of Physiology 132(3), 465-468 (1956)
4. Lateur, B.J., Giaconi, R.M., Questad, K.A., Ko, M., Lehmann, J.F.: Footwear and posture: compensatory strategies for heel height. Am. J. Phys. Med. Rehabilitation 70, 246-254 (1991)
5. Lee, C.M., Jeong, E.H., Freivalds, A.: Biomechanical effects of wearing high-heeled shoes. International Journal of Industrial Ergonomics 28, 321-326 (2001)
6. Mundermann, A., Stefanyshyn, D.J., Nigg, B.M.: Relationship between footwear comfort of shoe inserts and anthropometric and sensory factors. Medicine & Science in Sports & Exercise 33(11), 1939-1945 (2001)
7. Opila, K.A., Wagner, S.S., Schiowitz, S., Chen, J.: Postural alignment in barefoot and high-heeled stance. Spine 13(5), 542-547 (1988)
8. Witana, C.P., Goonetilleke, R.S., Xiong, S., Au, E.Y.: Effects of surface characteristics on the plantar shape of feet and subjects' perceived sensations. Applied Ergonomics 40(2), 267-279 (2009a)
9. Witana, C.P., Goonetilleke, R.S., Au, E.Y., Xiong, S., Lu, X.: Footbed shapes for enhanced footwear comfort. Ergonomics 52(5), 617-628 (2009b)
Postural Observation of Shoulder Flexion during Asymmetric Lifting Tasks

Xu Xu (1), Chien-Chi Chang (1), Gert S. Faber (2), Idsart Kingma (2), and Jack T. Dennerlein (3,4)

(1) Liberty Mutual Research Institute for Safety, 71 Frankland Road, Hopkinton, MA 01748, U.S.A.
(2) Institute for Fundamental and Clinical Human Movement Sciences, Faculty of Human Movement Sciences, VU University, Van der Boechorststraat 9, 1081 BT Amsterdam, The Netherlands
(3) Department of Environmental Health, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115, U.S.A.
(4) Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, U.S.A.
{Xu.Xu,chien-chi.chang}@LibertyMutual.com, {gfaber,jax}@hsph.harvard.edu,
[email protected]
Abstract. The aim of this study was to evaluate the observation error of the shoulder flexion angle during an asymmetric lifting task. The results indicated that the average absolute estimate error was 14.7 degrees and the correlation coefficient between the measured and estimated shoulder flexion was 0.91. The observation error may be due to arm abduction.

Keywords: Shoulder flexion, Posture observation, Side-view, Lifting tasks.
1 Introduction

Shoulder musculoskeletal disorders (MSDs) are a major source of occupational injury. Some studies have found that shoulder joint deviation can lead to shoulder joint injury or discomfort (Holmstrom et al., 1992). One way to monitor awkward shoulder postures is by postural observation. Although postural observation is not as accurate as using laboratory equipment, it is still widely adopted by ergonomists to assess mechanical exposure, as it does not require specialized equipment and does not strongly interfere with the normal operations of those being surveyed. A previous study indicated that observers could provide good estimates of shoulder flexion when the arms moved in the sagittal plane (Genaidy et al., 1993). On the other hand, a more recent study indicated that the observation error of shoulder posture was large during lifting tasks (Lu et al., in press). The main objective of this study was to evaluate the observation error of the shoulder flexion angle during an asymmetric lifting task. A lifting task was designed
to provide direct, motion-tracking-system-based measurement of shoulder flexion as well as video clips from which the shoulder flexion could be estimated by raters with a customized program.
2 Method

Twelve healthy male lifters (age: 47.5 (11.3) years; height: 1.74 (0.07) m; weight: 84.5 (12.7) kg) performing various lifting tasks in a laboratory were recorded by a side-view camcorder and a synchronized motion tracking system. The task involved lifting a plastic crate weighing 10 kg over three vertical lifting ranges (floor to knuckle height, floor to shoulder height, and knuckle height to shoulder height) and three end-of-lift asymmetry angles (0 deg, 30 deg to the right, and 60 deg to the right). For each lifting condition, two repetitions were performed. A customized computer program extracted four frames from each video clip at times T, T+Δt/3, T+2Δt/3, and T+Δt, where T was the starting time of the lift and Δt was the time needed to perform the lift. Ten raters (age: 37.0 (16.7) years) estimated the shoulder flexion angle for each of the four frames. The sequence of video clips was randomized for each rater, and training was provided to the raters with 3 video clips before the observation. For each frame, the measured shoulder flexion angle was extracted from the synchronized motion tracking system. The estimated shoulder flexion angle minus the measured shoulder flexion angle was defined as the estimate error. The average of the absolute estimate error over all frames and the correlation coefficient between the measured and estimated shoulder flexion angles were calculated.
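A minimal sketch of the frame sampling and error statistics just described, with hypothetical angle values:

import numpy as np

def frame_times(t_start, dt):
    """The four sampled instants of one lift: T, T+dt/3, T+2dt/3, T+dt."""
    return [t_start + k * dt / 3.0 for k in range(4)]

measured = np.array([-85.0, -30.0, 20.0, 55.0])    # deg, motion tracking
estimated = np.array([-70.0, -20.0, 25.0, 60.0])   # deg, rater estimates

error = estimated - measured                       # estimate error per frame
mae = np.mean(np.abs(error))                       # average absolute error
r = np.corrcoef(measured, estimated)[0, 1]         # correlation coefficient
print(f"frames: {frame_times(0.0, 3.0)}, MAE = {mae:.1f} deg, r = {r:.2f}")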
3 Results and Discussion

The average absolute estimate error was 14.7 degrees for shoulder flexion. The correlation coefficient between the measured and estimated shoulder flexion was 0.91. Figure 1 (left) shows that there were a few trials where the rater-estimated shoulder flexion was around 0 degrees (upper arm horizontal to the ground) while the measured angle was around -90 degrees (upper arm vertical to the ground and at the side of the trunk). Such discrepancy may be due to arm abduction (Figure 1, right). During the lifting experiment, lifters tended to have approximately 90 degrees of shoulder abduction when the box reached the highest point of the lift. In this case, the raters may estimate the shoulder flexion as 0 degrees when they observe such a posture from the side view, whereas the motion tracking system yields a shoulder flexion of -90 degrees based on Euler angle decomposition. It is unclear whether estimating the 3-D shoulder angle can improve observation accuracy, because previous research (Lowe, 2004) found that observers had low precision in 3-D shoulder angle estimation.
Fig. 1. Left: Measured shoulder flexion angle vs. estimated shoulder flexion angle. Right: Shoulder abduction angle vs. estimation error of shoulder flexion.
References
1. Genaidy, A.M., Simmons, R.J., Guo, L., Hidalgo, J.A.: Can visual perception be used to estimate body part angles? Ergonomics 36, 323-329 (1993)
2. Holmstrom, E.B., Lindell, J., Moritz, U.: Low back and neck/shoulder pain in construction workers: occupational workload and psychosocial risk factors. Part 2: Relationship to neck and shoulder pain. Spine 17, 672-677 (1992)
3. Lowe, B.D.: Accuracy and validity of observational estimates of shoulder and elbow posture. Appl. Ergon. 35, 159-171 (2004)
4. Lu, M.-L., Waters, T., Werren, D., Piacitelli, L.: Human posture simulation to assess cumulative spinal load due to manual lifting. Part II: accuracy and precision. Theoretical Issues in Ergonomics Science (in press)
An Alternative Formulation for Determining Weights of Joint Displacement Objective Function in Seated Posture Prediction

Qiuling Zou (1), Qinghong Zhang (2), Jingzhou (James) Yang (1), Robyn Boothby (1), Jared Gragg (1), and Aimee Cloutier (1)

(1) Human-Centric Design Research Lab, Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409
[email protected]
(2) Department of Mathematics and Computer Science, Northern Michigan University, Marquette, MI 49855
Abstract. The human posture prediction model is one of the most important and fundamental components of digital human models. The direct optimization-based method has recently gained attention due to its ability to give greater insight, compared to other approaches, into how and why humans assume a certain pose. However, one longstanding problem of this method is how to determine the cost function weights in the optimization formulation. This paper presents an alternative formulation based on our previous inverse optimization approach. The cost function contains two components: the first is the weighted summation of the difference between experimental joint angles and the neutral posture, and the second is the weighted summation of the difference between predicted joint angles and the neutral posture. The final objective function is the difference of these two components. Constraints include: (1) normalized weights within limits; (2) an inner optimization problem that solves for the joint angles, with joint displacement as the objective function; (3) the end-effector reaching the target point; and (4) joint angles within their limits. Furthermore, weight limits and linear weight constraints determined through observation are implemented. A 24-degree-of-freedom (DOF) human upper body model is used to study the formulation. An in-house motion capture system is used to obtain the realistic postures. Four subjects of different percentiles were selected and a total of 18 target points were designed for the experiment. The results show that using the new objective function in this alternative formulation can greatly improve the accuracy of the predicted posture.

Keywords: Posture prediction; direct optimization-based posture prediction; digital human.
1 Introduction

Posture prediction is a fundamental utility of digital human models and is applied in vehicle interior design and other product design tasks. Different approaches have
been proposed in the past decades (Das and Sengupta, 1995; Das and Behara, 1998; Faraway et al., 1999; Kee et al., 1994; Jung and Choe, 1996; Wang and Verriest, 1998; Wang, 1999; Tolani et al., 2000; Byun, 1991; Bean et al., 1998; Park, 1973; Kerk, 1992; Dysart and Woldstad, 1996). Compared to other methods, the direct optimization-based approach has attracted a lot of attention from the digital human community because it has advantages over traditional methods (Yang et al., 2004; Yang et al., 2006; Yang et al., 2007; Mi et al., 2009; Yang et al., 2010; Howard et al., 2010). However, one longstanding issue for this method, not yet completely addressed and studied, is how to determine the relative importance of the different components in the weighted-sum form of the objective function, e.g., joint displacement (Jung et al., 1994; Yang et al., 2004). Among other methods, such as the consistency ratio method (Saaty, 1977; 1991), the most common way of determining these weights is the trial-and-error approach (Yang et al., 2004; Messac and Mattson, 2002). Kim and Weck (2004; 2005) and Khan and Ardil (2009) proposed adaptive weighted-sum methods for bi-objective and multi-objective optimization, which focus on unexplored regions by changing the weights adaptively rather than using a priori weight selection. Zhang and Gao (2006) integrated adaptive weightings in a min-max method for optimization. Dong (2008) adopted an orthogonal interactive genetic algorithm to calculate the weights of the different factors that affect posture selection. In our previous study (Zou et al., 2010), a mathematical model was developed to systematically determine the weights of the joint displacement function in posture prediction. This paper presents an alternative formulation to improve the accuracy of the predicted posture. The paper is organized as follows: Section 2 briefly introduces a digital human model based on the Denavit-Hartenberg (DH) method (Denavit and Hartenberg, 1955). Section 3 briefly presents optimization-based posture prediction. Section 4 gives the detailed formulation of a bi-level optimization problem. Section 5 discusses the realistic postures for various tasks obtained through experiments. Section 6 illustrates the procedure through examples to demonstrate the efficiency and effectiveness of the proposed method.
2 Upper Body Model

A human's upper body can be modeled as a kinematic system: a series of links connected by revolute joints that represent musculoskeletal joints, such as the spine, shoulder, arm, and wrist. The rotation of each joint in the human body is described by a generalized coordinate $q_i$, where the hip has 3 DOFs, the spine has 12 DOFs, and the right arm has 9 DOFs; the left arm is not included in this study. Joint angles are defined as $\mathbf{q} = [q_1 \;\cdots\; q_{24}]^T$. According to the DH method, the position vector of a point of interest (the end-effector) of a human articulated model (e.g., a point on the thumb with respect to the torso coordinate system) can be written in terms of the joint variables as

$\mathbf{x} = \mathbf{x}(\mathbf{q})$,   (1)

where $\mathbf{q} \in R^{24}$ is the joint angle vector, and $\mathbf{x}(\mathbf{q})$ can be obtained from the multiplication of the homogeneous transformation matrices defined by the DH method:

${}^0\mathbf{T}_n = {}^0\mathbf{T}_1 \, {}^1\mathbf{T}_2 \cdots {}^{n-1}\mathbf{T}_n = \begin{bmatrix} {}^0\mathbf{R}_n(\mathbf{q}) & \mathbf{x}(\mathbf{q}) \\ \mathbf{0} & 1 \end{bmatrix}$,   (2)

where ${}^i\mathbf{R}_j$ is the rotation matrix relating coordinate frames $i$ and $j$, and ${}^{j-1}\mathbf{T}_j$ is the transformation matrix from frame $j-1$ to frame $j$. The vector function $\mathbf{x}(\mathbf{q})$ characterizes the set of all points touched by the end-effector.
Fig. 1. A 24 DOF human upper body model
In Fig. 1, q0 through q2 represent the mid-hip joints, q3 through q14 the spine, q15 through q19 the shoulder and clavicle, and q20 through q23 the right arm.
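A minimal sketch of the forward-kinematics chain behind Eq. (2), assuming standard DH parameters (theta, d, a, alpha) per joint; the actual DH table of the 24-DOF model is not reproduced here, so the three-joint parameters below are hypothetical:

import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform (j-1)T_j for one joint, standard DH convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def end_effector(q, dh_params):
    """x(q): end-effector position by multiplying the transforms, as in Eq. (2)."""
    T = np.eye(4)
    for qi, (d, a, alpha) in zip(q, dh_params):
        T = T @ dh_transform(qi, d, a, alpha)
    return T[:3, 3]          # x(q) sits in the last column of 0T_n

# toy 3-joint chain (hypothetical link parameters, metres)
dh = [(0.0, 0.30, 0.0), (0.0, 0.25, 0.0), (0.0, 0.05, 0.0)]
print(end_effector(np.array([0.1, 0.4, -0.2]), dh))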
3 Optimization-Based Posture Prediction

The direct optimization-based posture prediction formulation (Yang et al., 2004) is as follows:

Find: joint angles $\mathbf{q}$
Minimize: $f(\mathbf{q}) = \sum_i w_i \left( \dfrac{q_i - q_i^N}{q_i^U - q_i^L} \right)^2$   (3)
Subject to: $d^2 = (\mathbf{x} - \mathbf{x}^{Target})^2 = 0$; $q_i^L \le q_i \le q_i^U$, $i = 1, \ldots, n$,

where $\mathbf{q}^N$ denotes a relatively comfortable position (neutral posture), $q_i^L$ and $q_i^U$ are the lower and upper bounds of $q_i$, and $\mathbf{w} = [w_1 \;\cdots\; w_n]^T$.
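A minimal, self-contained sketch of how Eq. (3) can be solved with an off-the-shelf SQP solver; a planar two-link arm stands in for the full 24-DOF chain, and the neutral posture, bounds, weights and target point are all hypothetical:

import numpy as np
from scipy.optimize import minimize

n = 2
qN = np.zeros(n)                                 # neutral posture
qL, qU = -np.pi * np.ones(n), np.pi * np.ones(n)
w = np.ones(n) / n                               # joint displacement weights

def end_effector(q):
    # planar 2-link arm as a stand-in for the 24-DOF x(q) of Eq. (1)
    l1, l2 = 0.30, 0.25
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

target = np.array([0.40, 0.20])                  # hypothetical target point

def f(q):                                        # Eq. (3) objective
    return np.sum(w * ((q - qN) / (qU - qL)) ** 2)

def reach(q):                                    # d^2 = (x - x_target)^2 = 0
    return np.sum((end_effector(q) - target) ** 2)

res = minimize(f, x0=np.full(n, 0.1), bounds=list(zip(qL, qU)),
               constraints=[{"type": "eq", "fun": reach}], method="SLSQP")
print("predicted posture:", res.x)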
4 Formulation

The alternative formulation to determine the weights in Eq. (3) is a bi-level optimization problem:

Find: $\mathbf{q}$, $\mathbf{w}$
Minimize: $\sum_{i=1}^{n} w_i \left( \dfrac{q_i^* - q_i^N}{q_i^U - q_i^L} \right)^2 - \sum_{i=1}^{n} w_i \left( \dfrac{q_i - q_i^N}{q_i^U - q_i^L} \right)^2$   (4)
Subject to: $\sum_{i=1}^{n} w_i = 1$, $w_i \ge 0$,

where the variable $\mathbf{q}$ is found from the following inner optimization problem:

Find: joint angles $\mathbf{q}$
Minimize: $f(\mathbf{q}) = \sum_{i=1}^{n} w_i \left( \dfrac{q_i - q_i^N}{q_i^U - q_i^L} \right)^2$
Subject to: $g(\mathbf{q}) = (\mathbf{x} - \mathbf{x}^{Target})^2 = 0$; $q_i^L \le q_i \le q_i^U$, $i = 1, \ldots, n$.
To solve the bi-level optimization problem in Eq. (4), one can transform it into a one-level optimization problem by using Lagrange and KKT theory:

Find: $\mathbf{q}$, $\mathbf{w}$
Minimize: $\sum_{i=1}^{n} w_i \left( \dfrac{q_i^* - q_i^N}{q_i^U - q_i^L} \right)^2 - \sum_{i=1}^{n} w_i \left( \dfrac{q_i - q_i^N}{q_i^U - q_i^L} \right)^2$   (5)
Subject to:
$\sum_{i=1}^{n} w_i = 1$, $w_i \ge 0$;
$\sum_{i=1}^{n} w_i \nabla f_i(\mathbf{q}) + \lambda \nabla g(\mathbf{q}) + (\mu_1, \mu_2, \ldots, \mu_n)^T + (\nu_1, \nu_2, \ldots, \nu_n)^T = 0$;
$g(\mathbf{q}) = (\mathbf{x} - \mathbf{x}^{Target})^2 = 0$;
$\mu_i (q_i - q_i^L) = 0$, $\nu_i (q_i^U - q_i) = 0$, $\mu_i \ge 0$, $\nu_i \ge 0$, $q_i^L \le q_i \le q_i^U$, $i = 1, 2, \ldots, n$;
$w_i^L \le w_i \le w_i^U$;
$w_1 = w_3 = 2w_4$, $w_2 = 2w_5$, $w_4 = w_7 = w_{10} = w_{13} = 80w_{21}$, $w_5 = w_8 = w_{11} = w_{14}$, $w_6 = w_9 = w_{12} = w_{15} = w_4$, $w_{22} = w_{23} = w_{24} = w_4$, $w_{18} = w_{19} = w_{20} = w_{21}$,

where $f_i(\mathbf{q}) = \left( \dfrac{q_i - q_i^N}{q_i^U - q_i^L} \right)^2$, and $\lambda$, $\mu_i$, $\nu_i$ are Lagrange multipliers.
.
Where q* is the realistic posture (given). The original formulation (Zou et al., 2010) is defined as follows:
Find: q , w
An Alternative Formulation n
Minimize:
qi − qi* 2 ) U L i − qi Subject to: All constraints same as in Eq. (5)
∑ w (q i =1
i
235
(6)
5 Realistic Postures from Motion Capture Realistic postures are obtained through motion capture experiments. In the experiments, the target points (tasks) are as in Fig. 2. There are 18 target points, which are set in three different heights (high, medium, and low) with six different target points at each height. The six different target points at each height level are set at three different orientations (right, center, and left) with three points closer to the subject (marked by odd numbers) and three points further away from the subject (marked by even numbers). 12 11
Fig. 2. Experimental setup
Four subjects were used, with an average height of 170.1 (SD 11) cm and an average mass of 71.2 (SD 12) kg, aged between 21 and 34 years. All participants gave informed consent prior to the study. Subject 1 is the 50th-percentile male, Subject 2 the 95th-percentile male, Subject 3 the 5th-percentile female, and Subject 4 the 50th-percentile female; these percentiles are by stature. In the experiment, an optical motion capture system (Motion Analysis®) was used to collect objective data for 3D motion analysis. This system includes eight Eagle-4 cameras with a 4-megapixel resolution, a shutter speed from 0-2000 µs, a focal length ranging from 18 mm to 52 mm, and a maximum of 500 frames per second in up to a 10 ft × 10 ft envelope. In our experiments, the frame rate was set to 120 frames per second. The marker protocol used for the motion analysis system was the plug-in gait marker protocol. In all cases, the joint center location for each joint was estimated by the intersection of the lines connecting the markers on each joint. After obtaining the various joint centers, a transformation analysis was used to calculate the location of each joint center with respect to the human global coordinate system.
The experimental data recorded from the motion capture system are the joint center locations (x, y, and z with respect to the global coordinate system). In order to obtain the joint angles from the joint center data, a posture reconstruction model has been used (Gragg et al., 2011).
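As an illustration of going from joint centers to a joint angle, the following minimal sketch (not the Gragg et al. model, and with hypothetical coordinates in metres) recovers the included elbow angle from three joint-center locations expressed in the global frame:

import numpy as np

def included_angle(p_prox, p_joint, p_dist):
    """Angle (rad) at p_joint between the two adjacent body segments."""
    u = p_prox - p_joint
    v = p_dist - p_joint
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))

shoulder = np.array([0.20, 0.00, 1.40])
elbow    = np.array([0.25, 0.05, 1.12])
wrist    = np.array([0.45, 0.10, 1.00])
print(np.degrees(included_angle(shoulder, elbow, wrist)))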
6 Illustrative Example

In this paper, we use the same subjects and target points (tasks) as those in Zou et al. (2010). By implementing the formulation in Eq. (5), one can determine all weights for each subject for each task. The goal of the direct optimization-based method is to have one set of weights for all tasks and all populations; we call this set the global weights. In this work, we propose that the global weights are defined as the average weights over all subjects and all tasks in Table 1. After the global weights are obtained, we use the direct optimization-based posture prediction in Eq. (3) to predict postures $\mathbf{q}$ for all subjects and all tasks. Note that the neutral posture (position) is from Yang et al. (2004): $q_i^N = 0$, $i = 1, 2, \ldots, 15, 22, 23$; $q_{16}^N = -0.261792$, $q_{17}^N = 0.349056$, $q_{18}^N = 1.74529$, $q_{19}^N = 0.17453$, $q_{20}^N = -1.39622$, $q_{21}^N = -0.61085$, $q_{24}^N = 0.2617917$. To validate whether the weights obtained by the proposed method are plausible, we need to compare the experimental postures $\mathbf{q}^*$ with the postures $\mathbf{q}$ predicted from the obtained weights.

Table 1. The global weights for all subjects and all tasks

Weight  Subject 1 Avg  Subject 2 Avg  Subject 3 Avg  Subject 4 Avg  Global Weights  Std. Deviation
w1      0.102568167    0.098515111    0.096888333    0.098704889    0.099169125     0.002408232
w2      0.067227       0.072844722    0.075561611    0.074301333    0.072483667     0.003676074
w3      0.102645778    0.098598222    0.096949611    0.098748944    0.099235639     0.002415101
w4      0.051245167    0.049234389    0.048420111    0.049336056    0.049558931     0.001196565
w5      0.0336115      0.036424056    0.037767278    0.037143167    0.0362365       0.001834042
w6      0.0512565      0.049245889    0.048406       0.049322833    0.049557806     0.001206196
w7      0.051341056    0.049325611    0.048520111    0.049436       0.049655695     0.001195438
w8      0.033621222    0.036446333    0.037801222    0.037141167    0.036252486     0.001839335
w9      0.051278722    0.049293778    0.048411389    0.049353111    0.04958425      0.001208944
w10     0.051339056    0.049318167    0.048520111    0.049436       0.049653334     0.001195188
w11     0.033660889    0.036467722    0.037809611    0.037147944    0.036271542     0.001824622
w12     0.051313389    0.0492785      0.048376833    0.049324056    0.049573195     0.001239418
w13     0.0513355      0.049331       0.048520111    0.049435833    0.049655611     0.001192345
w14     0.033673111    0.036495111    0.037784667    0.037164167    0.036279264     0.001815481
w15     0.051322778    0.049362944    0.048450667    0.0494035      0.049634972     0.001208145
w16     0.011909889    0.029214056    0.0291985      0.019600722    0.022480792     0.008376619
w17     0.013245278    0.0090645      0.0136435      0.0136435      0.012399195     0.002231041
w18     0.000617833    0.000616278    0.000637722    0.000609278    0.000620278     1.22103E-05
w19     0.000608667    0.000643389    0.0006655      0.000618833    0.000634097     2.55092E-05
w20     0.000625556    0.0006545      0.000659944    0.000592889    0.000633222     3.08347E-05
w21     0.000639722    0.000614833    0.000604778    0.000616       0.000618833     1.4809E-05
w22     0.051300722    0.049323278    0.048497889    0.049404389    0.04963157      0.001185743
w23     0.051367611    0.049406111    0.048566333    0.049470389    0.049702611     0.001183947
w24     0.051345389    0.049376444    0.048560167    0.049407111    0.049672278     0.00118236
Two different criteria are used to compare these postures: linear regression and joint angle difference. Fig. 3 shows the linear regression for Subject 1; the horizontal axis denotes the experimental postures (joint angles) and the vertical axis the postures (joint angles) predicted by the direct optimization-based posture prediction with the global weights obtained by the proposed method. All joint angles are in radians.
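The regression criterion can be reproduced with a few lines of numpy; the angle values below are hypothetical:

import numpy as np

exp_q  = np.array([-1.2, -0.4, 0.0, 0.3, 0.9, 1.6])   # experimental (rad)
pred_q = np.array([-1.0, -0.5, 0.1, 0.2, 0.8, 1.4])   # predicted (rad)

slope, intercept = np.polyfit(exp_q, pred_q, 1)        # least-squares line
resid = pred_q - (slope * exp_q + intercept)
r2 = 1.0 - np.sum(resid**2) / np.sum((pred_q - pred_q.mean())**2)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r2:.3f}")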
From Fig. 3, the R² value for Subject 1 is greater than 0.7, and the R² values for all other subjects are also greater than 0.7. This shows that the direct optimization-based posture prediction algorithm can predict reasonably accurate postures using the global weights obtained through the proposed formulation. Comparing (a) and (b) in Fig. 3, the alternative formulation is seen to have a consistently higher R² than the original formulation; similar results hold for Subjects 2-4. Therefore, the alternative formulation generates more accurate postures than the original formulation. In Fig. 4, all joint angles are plotted for all of the subjects on the same graph. The R² value in Fig. 4 is greater than 0.7, which shows that the alternative formulation is accurate.
Fig. 3. Experimental and predicted postures (joint angles, rad) for Subject 1: (a) alternative formulation; (b) original formulation in Zou et al. (2010). Fitted lines y = 0.8628x − 0.0172 and y = 0.8262x − 0.0003 (R² = 0.7525 and 0.744, respectively)
Fig. 4. Regression of the alternative formulation (predicted vs. experimental joint angles, rad, all subjects): y = 0.8444x − 0.0094, R² = 0.8022
Fig. 5. Regression of the original formulation (predicted vs. experimental joint angles, rad, all subjects): y = 0.7386x + 0.0014, R² = 0.7486
Fig. 5 shows the regression results for all the subjects with the original formulation used in Zou et al. (2010). Comparing Figs. 4 and 5, the R² value is larger for the alternative formulation, which shows that the alternative formulation is better. Fig. 6 shows the mean values of the absolute joint angle errors, averaged over all tasks, for each subject with the two formulations; the mean values are in radians. The largest errors occur at the elbow and shoulder joints. For most joints, the joint angle deviations from the alternative formulation are smaller than those from the original formulation, although the wrist joints show larger deviations with the alternative formulation.
(a) Subject 1: 50th-percentile male; (b) Subject 2: 95th-percentile male; (c) Subject 3: 5th-percentile female; (d) Subject 4: 50th-percentile female
Fig. 6. Mean values of the absolute joint angle errors (rad) for all joints (1-24), with the alternative and the original formulations
Fig. 7 shows the mean values of the absolute joint angle errors, averaged over all joints, for each subject. For most tasks, the mean absolute deviations from the alternative formulation are smaller than those from the original formulation.
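A minimal sketch of the two aggregations behind Figs. 6 and 7, assuming the absolute errors are stored in a hypothetical (18 tasks x 24 joints) array:

import numpy as np

rng = np.random.default_rng(1)
err = np.abs(rng.normal(0.0, 0.1, size=(18, 24)))  # |predicted - measured|, rad

per_joint = err.mean(axis=0)   # Fig. 6: mean over the 18 tasks, per joint
per_task  = err.mean(axis=1)   # Fig. 7: mean over the 24 joints, per task
print(per_joint.round(3))
print(per_task.round(3))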
(a) Subject 1; (b) Subject 2; (c) Subject 3; (d) Subject 4
Fig. 7. Mean values of the absolute joint angle errors (rad) for all tasks (1-18), with the alternative and the original formulations
7 Conclusion

An alternative inverse optimization method was proposed in this paper to determine the weights of the joint displacement cost function for seated posture prediction. A bi-level optimization problem was formulated and then converted into a regular optimization problem using Lagrange and KKT theory. Four subjects from four different percentiles participated in the experiments, with eighteen tasks studied for each subject. Experimental and predicted postures were used to demonstrate the accuracy of the proposed formulation, and the results were compared in terms of two criteria: joint angle errors (deviations) and linear regression. Two important factors, weight limits and linear weight constraints, play an important role in this formulation. The weights obtained through this formulation are
different for different subjects and tasks. The global weights are obtained by averaging the weights over all subjects and all tasks; this set of global weights can be used for direct optimization-based posture prediction. The R² value for the alternative formulation is higher than that of the original formulation in Zou et al. (2010), and the joint angle deviations are smaller for the alternative formulation. Future work includes: 1) testing other ways to obtain the global weights, such as using the summation of mean and standard deviation; 2) investigating whether standing and seated postures share the same set of global weights; 3) using a larger number of subjects; and 4) extending this formulation to other human performance measures in posture and motion prediction.

Acknowledgements. This research was partly supported by the National Science Foundation (NSF) through an award to TTU (Award # 09-26549), and by a Faculty Research Grant from Northern Michigan University.
References 1. Bottaso, C.L., Prilutsky, B.I., Croce, A., Imberti, E., Sartirana, S.: A numerical procedure for inferring from experimental data the optimization cost functions using a multi-body model of the neuro-musculoskeletal system. Multibody Syst. Dyn. 16, 123–154 (2006) 2. Choi, J., Armstrong, T.J.: 3-D Dimensional kinematic model for predicting hand posture during certain gripping tasks. In: ASB 29th Annual Meeting (2005) 3. Craig, J.J.: Introduction to Robotics: Mechanics and Control, 3rd edn. (2006) 4. Das, B., Behara, D.N.: Three-dimensional workspace for industrial workstations. Human Factors 40(4), 633–646 (1998) 5. Das, I., Dennis, J.: Normal-boundary intersection: an alternate method for generating pareto optimal points in multicriteria optimization problems, NASA Contract No. NASI19480 (November 1996) 6. Dong, Z., Xu, J., Zou, N., Chai, C.: Posture prediction based on orthogonal interactive genetic algorithm. In: Fourth International Conference on Natural Computation, vol. 1, pp. 336–340 (2008) 7. Dysart, M.J., Woldstad, J.C.: Posture prediction for static sagittal-plane lifting. Journal of Biomechanics 29(10), 1393–1397 (1996) 8. Faraway, J.J., Zhang, X.D., Chaffin, D.B.: Rectifying postures reconstructed from joint angles to meet constraints. Journal of Biomechanics 32, 733–736 (1999) 9. Gill, P., Murray, W., Saunders, A.: SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal of Optimization 12(4), 979–1006 (2002) 10. Gragg, J., Yang, J., Boothby, R.: Posture reconstruction method for mapping joint angles from motion capture experiment to simulation models. In: HCII 2011, Hilton Orlando Bonnet Creek, Orlando, Florida, USA, July 9-14 (2011) 11. Halvorsen, K., Lesser, M., Lindberg, A.: A new method for estimating the axis of rotation and the center of rotation. Journal of Biomechanics 32, 1221–1227 (1999) 12. Howard, B., Yang, J., Gragg, J.: Toward a new digital pregnant woman model and posture prediction. In: 1st International Conference on Applied Digital Human Modelling, Miami, Florida, July 17-20 (2010) 13. Jung, E.S., Park, S.: Prediction of human reach posture using a neural network for ergonomic man models. Computers & Industrial Engineering 27(1-4), 369–372 (1994)
14. Jung, E.S., Choe, J.: Human reach posture prediction based on psychophysical discomfort. International Journal of Industrial Ergonomics 18, 173–179 (1996) 15. Khan, S.U., Ardil, C.: A weighted sum technique for the joint optimization of performance and power consumption in data centers. International Journal of Electrical, Computer, and Systems Engineering 3, 1 (2009) 16. Kim, I.Y., Weck, O.L.: Adaptive weighted sum method for multiobjective optimization. In: AIAA, 2004–4322 (2004) 17. Kim, I.Y., Weck, O.L.: Adaptive weighted-sum method for bi-objective optimization: Pareto front generation. Struct. Multidisc. Optim. 29, 149–158 (2005) 18. Ma, L., Wei, Z., Chablat, D., Bennis, F., Guillaume, F.: Multi-objective optimization method for posture prediction and analysis with consideration of fatigue effect and its application case. Computers & Industrial Engineering 57(4), 1235–1246 (2009) 19. Messac, A., Mattson, C.A.: Generating well-distributed sets of pareto points for engineering design using physical programming. Optimization and Engineering 3, 431–450 (2002) 20. Mi, Z., Yang, J., Abdel-Malek, K.: Real-time inverse kinematics for humans. In: Proceedings of 2002 ASME Design Engineering Technical Conferences, Montreal, Canada, September 29-October 2 (2002) 21. Mi, Z., Yang, J., Abdel-Malek, K.: Optimization-based posture prediction for human upper body. Robotica 27(4), 607–620 (2009) 22. Miller, C., Mulavara, A., Bloomberg, J.: A quasi-static method for determining the characteristic of motion capture camera system in a ‘split-volume’ configuration. Gait & Posture 16(3), 283–287 (2002) 23. Robert, J.J., Michele, O., Gordon, L.H.: Validation of the Vicon 460 motion capture systemTM for whole-body vibration acceleration determination. In: ISB XXth CongressASB 29th Annual Meeting, Cleveland, Ohio, July 31-August 5 (2005) 24. Saaty, T.L.: A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology 15, 57–68 (1977) 25. Saaty, T.L., Vargas, L.G.: The Logic of Priorities: Applications of the Analytic Hierarchy Process in Business, Energy, Health, & Transportation. RWS Publications, Pittsburgh (1991) 26. Tolani, D., Goswami, A., Badler, N.: Real-time inverse kinematics techniques for anthropomorphic limbs. Graphical Models 62, 353–388 (2000) 27. Wang, X.: Behavior-based inverse kinematics algorithm to predict arm prehension postures for computer-aided ergonomic evaluation. Journal of Biomechanics 32(5), 453– 460 (1999) 28. Yang, J., Marler, R.T., Kim, H., Arora, J., Abdel-Malek, K.: Multi-objective optimization for upper body posture prediction. In: 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Albany, New York, USA, August 30-September 1 (2004) 29. Yang, J., Abdel-Malek, K., Marler, T., Kim, J.: Real-time optimal reach posture prediction in a new interactive virtual environment. Journal of Computer Science and Technology 21(2), 189–198 (2006) 30. Yang, J., Kim, J.H., Abdel-Malek, K., Marler, T., Beck, S., Kopp, G.R.: A new digital human environment and assessment of vehicle interior design. Computer-Aided Design 39, 548–558 (2007) 31. Yang, J., Marler, T., Rahmatalla, S.: Multi-objective optimization-based kinematic posture prediction: development and validation. Robotics (2010) (in press) 32. Zeleny, M.: Compromise Programming in Multiple Criteria Decision Making, pp. 262– 301. University of South Carolina Press, Columbia (1973) 33. 
Zhang, J., Fan, Y., Jia, D., Wu, Y.: Kinematic simulation of a parallel NC machine tool in the manufacturing process. Front. Mech. Eng. China 2, 173–176 (2006)
34. Zhang, K.S., Li, W.J., Song, W.P.: Bi-level adaptive weighted sum method for multidisciplinary multi-objective optimization. In: AIAA 2008-908 (2008) 35. Zhang, W.H., Yang, H.C.: A study of the weighting method for a certain type of multicriteria optimization problem. Computers and Structures 79, 2741–2749 (2001a) 36. Zhang, W.H., Gao, T.: A min–max method with adaptive weightings for uniformly spaced Pareto optimum points. Computers and Structures 84, 1760–1769 (2006) 37. Zhang, W.H., Domaszewski, M., Fleury, C.: An improved weighting method with multibounds formulation and convex programming for multicriteria structural optimization. International Journal for Numerical Methods in Engineering 52, 889–902 (2001b) 38. Zou, Q.L., Zhang, Q.H., Yang, J.: Determining weights of joint displacement function in direct optimization-based posture prediction–A pilot Study. In: The 3rd International Conference on Applied Human Factors and Ergonomics, Miami, FL (July 2010)
Videogames and Elders: A New Path in LCT?

Nicola D'Aquaro (1), Dario Maggiorini (2), Giacomo Mancuso (2), and Laura A. Ripamonti (2)

(1) Associazione AQUA-Onlus, via Don Giovanni Calabria, 19 - 20132 Milano, Italy
(2) Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano, via Comelico, 39 - 20135 Milano, Italy
[email protected],
[email protected], {ripamonti,dario}@dico.unimi.it
Abstract. The current demographic ageing in Europe is the result of significant economic, social, and medical development. Nevertheless, it is also leading to an increase in the demand for Long Term Care (LTC) by seniors. One viable way to offer qualified care at home, while at the same time containing costs, is to exploit digital technologies as enablers of constant interaction with assisting personnel. The main contribution of this paper is to propose a design methodology that lays the basis for deploying, step by step, a "virtual hospital at home" based on Ambient Intelligence (AmI). The envisaged system will integrate consumer-grade devices for videogame consoles which, thanks to their user-friendly interfaces and smooth learning curves, will help minimize the interference in the elder's private life.

Keywords: Ambient Intelligence (AmI); context awareness; videogames; healthcare; LTC (Long Term Care); usability.
1 Introduction: An Ageing Europe

It has been estimated that, by 2025, people over 60 worldwide will number 1.2 billion, and by 2050 they will reach 2 billion (in 2000 they were "only" 600 million). In the same year (2050), in Europe, the number of people over 60 will equal 40% of the total population and 60% of the population of working age, that is, 15-64 years old (see e.g., [1, 8, 10]). In the following decades the so-called "baby boomers" (i.e., the large generation born in the 1950s-1960s) will start to retire, further exacerbating the situation. In particular, Italy is among the oldest countries in Europe, with a ratio of 143 elders to 100 people under 65 [7]. The demographic ageing in Europe is the result of significant economic, social, and medical development that provides us with longer and better lives compared to those of our parents and grandparents. Nevertheless, this progressive increase in population age, besides creating new opportunities, deeply impacts a number of areas, such as healthcare, pensions, housing, and community care. Among these areas, one of the most afflicted is long-term socio-medical assistance. This picture turns even darker when coupled with the lack of support once
offered by the traditional "enlarged" families. The percentage of elders living alone and at risk of losing their independence and autonomy due to permanent diseases or health deterioration is increasing drastically. The OECD (Organisation for Economic Co-Operation and Development) defines Long-Term Care (LTC) as any kind of care supplied over a long period of time, without a pre-defined end date (see e.g., [11]). LTC encompasses a heterogeneous set of services, both highly specialized and not: medical and paramedical assistance, domestic help, personal care, and social assistance. The shifting balance between the younger and the elder population in the European countries implies an increase in the demand for LTC that must be supported by declining (or, at best, more slowly growing) public financial resources. The consequences of this emerging situation are forcing governments to adopt countermeasures. On one hand, it is necessary to reorganize the assistance supply through the development of innovative management approaches, the enrichment of socio-medical services, the integration between hospitals and local communities, and the adoption of multidisciplinary perspectives. On the other hand, it is urgent to collect the resources needed to satisfy the growing demand for LTC services. In spite of all these efforts, at the moment the supply of services addressed to elders is largely insufficient in all the EU countries. As an example, in 2006 the northern EU countries had the best ratio between need and supply, with 12% of elders receiving home care, while in Japan the ratio was already 73.4% in 2003. The situation is even worse in the southern EU countries, with, e.g., a ratio of 3% in Italy and of 1% in Greece [7].
2 AmI for an Effective Technological Support to LTCs at Home

In Italy, as we saw, the percentage of elders is growing at a fast pace; as a consequence, the number of hospitalizations is growing as well. Hospitals are often forced to prolong patients' stays because of the shortage of services able to support LTC at home. In spite of this, domiciliary care would be an effective alternative to long-term hospitalization from several points of view. From the patient's perspective, the psychological and affective benefit would be huge, leading to a stronger commitment to therapy and, as a consequence, an easier and more rapid recovery. From the community's perspective, shorter hospitalizations would mean shorter queues to access public health services and a reduction in costs. One viable way to offer qualified care at home to elders, while at the same time containing costs, is to exploit digital technologies as enablers of constant interaction between the elder and the people in charge of her assistance. Ambient Intelligence (AmI) is a fast-developing multi-disciplinary field which promises to have a significant beneficial influence on our society. AmI "aims to enhance the way people interact with their environment to promote safety and to enrich their lives" [3, p.1]. AmI enriches an environment with technology (mainly sensors and devices interconnected through a network) with the goal of providing help and/or support to the user in an "intelligent" way. In this context, "intelligent" means that the support provided by the system should ideally be "in tune" with the person's mood and needs. To supply
this kind of assistance, AmI-enabled environments borrow aspects of many cognate areas of Computer Science: HCI, Artificial Intelligence, Pervasive/Ubiquitous Computing, computer networks, and sensor hardware and software [2, 9]. We are developing a medium-term joint project with a non-profit Italian organization (namely Assoaqua - www.assoaqua.it) focused on home LTC for elders. The main goal of the project is to lay the basis for deploying, step by step, a "virtual hospital at home". Assoaqua needs an AmI solution able to support its multiple activities (among which physiotherapy) by allowing remote interaction with elders (especially those living alone) on a daily basis, to assign exercises, to verify progress in mobility, to monitor health and environmental parameters, etc. Providing all these functions implies the adoption, and the consequent integration, of a number of different devices and technologies, namely: domotics, support for social interaction, remote monitoring, audio-visual monitoring, movement and presence detection, and humidity and temperature measurement. In particular, several of these functions could be provided by exploiting specific features of consumer-grade devices recently introduced in the videogame console market (namely the Sony PS3 Move and Microsoft Xbox Kinect), besides the Nintendo Wii, which has already been adopted in the medical field. Due to their intended usage, these devices already have user-friendly interfaces with a quick and smooth learning curve. Nevertheless, one of the main constraints of the project is to minimize the number of devices to be put in the elder's environment, in order to reduce the interference in her private life to a minimum. In Fig. 1, we have summarized which interactions between the caretakers and the elder - embedded in her environment - could be supported by an integrated digital system. Note that the figure intentionally does not include explicitly the whole variety of possible interactions that could take place among the caretakers, nor all the interactions that would likely take place through other media (e.g., home visits, phone calls, etc.).
[Figure: the elder at the centre, with the doctor (health monitoring), the nurse (environmental and daily monitoring, danger alerts) and the physiotherapist (new exercises, patient progress) interacting with her and her environment]
Fig. 1. Interactions with the elder and her environment that could be supported by ICTs
3 A Methodological Approach Based on the Dyad Elder-Caretaker

Within the project framework, we are defining and testing an approach to effectively support the design phase of the AmI system. One of the main critical points is to provide an interface (in the broad sense of the term) to the system that respects several mandatory constraints, such as being: friendly enough for elders, unobtrusive, resilient to unforeseeable behaviours, context-aware, etc. In our vision, and in line with Dey's approach (see e.g. [4, 5]), these goals can be achieved only if the intended "user" (the dyad elder-caretaker, which can be – at different moments – an elder plus a doctor, a nurse or a physiotherapist), embedded in its environment, is put "at the centre" of the design process, especially during the definition of the problem space [12]. Not only should the requirements and constraints provided by the dyad be taken into appropriate consideration, but also those deriving from the environment in which the LTCs take place. This also implies analyzing the most common issues arising in elders' homes at large, such as the size of the rooms, the presence of pets, and the habits and technological proficiency of domestic workers living with the elder. It is hence, in our view, mandatory to explore requirements and constraints through an immersive approach, which is the only viable way to define the problem space clearly in this specific situation. To address this issue, we have envisaged a methodological approach to the design of the AmI solution that tightly intermingles the different competencies present in the project: those of computer science with those of the caretaker-senior dyad. In particular, we organized mixed teams (typically: a physiotherapist or a nurse and a computer scientist, or a physiotherapist, a nurse and a computer scientist) that conducted on-the-field research to collect qualitative data about the needs and constraints encountered by the dyad. This led to multiple multidisciplinary brainstorming sessions on the emerging results, in order to define the most suitable interaction model and, subsequently, the candidate technological architecture. More in detail, the methodological approach we are following is composed of three iterative, sequential phases, each of which collects feedback from the others and takes as input a certain number of constraints and requirements (see Fig. 2). Moreover, different actors (the caretakers, the elder and the computer scientists) are involved with different emphasis in each phase (see Table 1: a greater number of "+" stands for a more intense involvement).

Table 1. Involvement of different actors in the main phases of the process
Phase | Computer scientist | Elder | Caretaker
1. Multidisciplinary brainstorming | + | + | +++
2. Technical validation | +++ | + | +
3a. On-the-field data collection | ++ | + | +++
3b. On-the-field candidate solution testing | +++ | ++ | ++
4. Prototype design | +++ | ++ | +
5. Prototype development | +++ | + | +
6. Prototype testing | ++ | +++ | +++
[Figure: flow diagram of the three macro-phases – 1: requirements definition (multidisciplinary brainstorming → a. requirements list), 2: empirical validation of the requirements list (technical validation → b. selection of candidate technical solutions; on-the-field data collection and on-the-field candidate solution testing → c. environmental and technical feasibility), 3: design and prototyping (prototype design → d. list of desired technical features; prototype development → e. prototype; prototype testing → f. final prototype) – annotated with the requirements R(·) and constraints C(·) of the elder (e), the caretakers (ct) and the computer scientists (cs)]
Fig. 2. Methodological approach to the design phase of the AmI system
3.1 Macro-phase 1: Requirements Definition

The first macro-phase of the process focuses on the definition of the overall system requirements, and required multiple interactions between the caretakers (physician, nurse, physiotherapist) and the computer scientist. Preliminary inputs to the phase were the requirements defined by the caretakers (R(ct) in Fig. 2, e.g. the list of environmental parameters to monitor) and by the elder (R(e) in Fig. 2, e.g. a non-obtrusive system, effective physiotherapy), and the constraints posed by the caretakers (C(ct) in Fig. 2, e.g. the maximum acceptable error/delay when collecting physical data). After several iterations, a list of desired features for the system was produced. This list, together with several constraints produced by the elder, the caretakers and the computer scientists, became the input for the subsequent phase 2. The involvement of the computer scientist in this first phase was very light, and the inputs she provided had to be quite general, aimed solely at excluding a priori features that would not be achievable with existing technologies. This guaranteed that the brainstorming focused effectively on what is really relevant from the point of view of the dyad, without worrying in advance about how the technology could support it.

3.2 Macro-phase 2: Empirical Validation of the Requirements List

Once the first list of requirements had been produced, its actual feasibility had to be proved. This task was accomplished mainly by the computer scientist, who was in charge of developing an extensive survey of existing technologies able to support the requirements listed in the first phase. The survey produced a list of candidate technologies, whose features were tested against the requirements during an ongoing interaction with the caretakers. In particular, a young computer scientist was relocated to the NGO and actively interacted, on a daily basis, with the caretakers for more than three months. This made it possible, at least, to:

− further refine the list of candidate technological choices;
− develop a deeper understanding of the envisioned system from the perspective of technological issues (e.g. the choice of connecting the Kinect to a PC equipped with free software instead of a "native/proprietary" Xbox derived from the necessity of collecting data also from environmental sensors).

Nevertheless, in our opinion, several aspects needed further investigation. In particular, the idea of moulding technologies intended for entertainment into a support for physiotherapy called for a deep comprehension of the implications of such a choice. To achieve this goal we set up two processes to collect data in the field, jointly with the dyad.

On the Field Data Collection. To complete this activity, a team composed of a computer scientist and a physiotherapist conducted 30 on-site visits, each lasting 45 minutes, over one month. The team carried no technical equipment, except for the usual tools of the physiotherapist. The goal of these visits was, on the one hand, to determine the major constraints on the deployment of the envisaged solution in the elders' homes and, on the other hand, to observe how a physiotherapy session takes place.
The first goal produced a list of environmental constraints (e.g. in which room the television is placed, how much empty floor space is left, how the room is lit, etc.; see § 4). The second goal produced a list of key points that should or could be supported by technology (typically: which movements the elders are required to perform during the session, how they are oriented in space, which specific movements the therapist's attention focuses on, etc.; see § 4). The output of this sub-phase is a more precise and objective list of the features the envisaged system should – and could – provide. As a consequence, a certain number of possible technical solutions were discarded, among them Sony's PlayStation Move and Nintendo's Wii (the latter already adopted in clinical environments, see e.g. [6]), and one candidate solution was selected: Microsoft's Kinect.

On the Field Candidate Solution Testing. Since the candidate technology (Microsoft Kinect) is quite recent and aimed at the entertainment market, its features have not yet been explored enough for our purposes. This called for a further phase of on-the-field data collection. During this phase, the same team (a computer scientist and a physiotherapist) conducted several other on-site visits, equipped with a Kinect connected to a laptop. This allowed us to verify objectively the feasibility of employing this solution to support the physiotherapists' work. Several new constraints emerged (see Fig. 5): for example, a room flooded with direct sunlight may cause trouble in movement recognition, and it is necessary to verify how precise the movement tracking is, since, e.g., over-extension of a limb may injure a patient (we are planning to compare the tracking produced by the Kinect with that produced by a motion capture system). In parallel with the fieldwork, several experiments – more focused on technical issues – were conducted by the computer scientist alone. In particular, she investigated the availability of effective software applications for PCs (the Kinect is intended for the Microsoft Xbox videogame console), the quantification of the space necessary for good tracking of an entire or partial human shape, and the hardware requirements able to guarantee good enough performance, for example in terms of lag. The output of macro-phase 2 was a final list of requirements and of technological solutions that could satisfy them effectively. This outcome, jointly with several preliminary constraints produced by the computer scientists and by the elder embedded in her environment, became the input for macro-phase 3. Note also that the output of this phase could lead to reconsidering one or more decisions taken in macro-phase 1 (see Fig. 2).
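As a rough illustration of the planned comparison, the sketch below (ours, not part of the project's software) computes the root-mean-square error between a Kinect joint trajectory and a motion-capture reference; the two trajectories are hypothetical NumPy arrays standing in for real recordings.

import numpy as np

def joint_rmse(kinect_traj, mocap_traj):
    # Root-mean-square Euclidean error between two (n_frames, 3) joint
    # trajectories, assumed time-aligned and in the same coordinate frame
    # (in practice this requires resampling and rigid registration, omitted here).
    diff = np.asarray(kinect_traj) - np.asarray(mocap_traj)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

# Stand-in data: a wrist trajectory over 100 frames from both systems.
rng = np.random.default_rng(0)
mocap = rng.uniform(-1.0, 1.0, size=(100, 3))
kinect = mocap + rng.normal(scale=0.02, size=(100, 3))  # ~2 cm of sensor noise
print(f"wrist RMSE: {joint_rmse(kinect, mocap):.3f} m")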
3.3 Macro-phase 3: Design and Prototyping

This last phase, currently under development, is aimed at converting the outputs of the previous phases into a prototype. The prototype design (see Fig. 2, phase 4 inside macro-phase 3) will define detailed technical specifications. In particular, the design will be informed by the output of the previous phase and will respect several further constraints set by the computer scientist and by the elder, whose involvement will from now on increase more and more (see Table 1). Indeed, the elder at her home is the one whose interaction with the interface to the system should be studied even more deeply, in order to anticipate possible difficulties which could frustrate the benefits (in terms of fewer home visits and increased monitoring) provided by the system.
[Figure: the elder's home, with Kinect sensors and a PC connected via the Internet to the physiotherapist, the doctor and the nurse]
Fig. 3. The general structure of the envisioned system
In Fig. 3 we have sketched the general structure of the system. Note that, to minimize the intrusion into the elder's private life and home, we will install only one personal computer, in charge of providing the interface between her home and the caretakers (dashed lines represent wireless connections). Through this channel, data (environmental parameters, medical data, physiotherapy exercises, alerts, etc.) are exchanged and the possibility of interacting remotely with the caretakers is provided. Fig. 3 shows more than one Kinect connected to the PC, since we are planning to adopt this technology also for monitoring the elder's status (e.g. issuing an alert if the elder falls and cannot get up, or does not wake up in the morning) in a more fine-grained way than with, e.g., usual movement sensors: this would guarantee more effective monitoring and fewer false alarms. Phase 3 will end by producing a final prototype and a set of generalized guidelines aimed at supplying help and advice to people interested in creating solutions to support the interaction between elders and the people providing them LTCs.
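To make the fall-monitoring idea concrete, here is a minimal sketch of the alert logic; the read_head_height and send_alert callables are hypothetical stand-ins for the Kinect skeleton stream and the network channel to the caretakers, and the thresholds are illustrative, not values fixed by the project.

import time

FALL_HEIGHT_M = 0.5      # a tracked head below this height suggests a person on the floor
FALL_DURATION_S = 30.0   # how long the condition must persist before alerting

def monitor(read_head_height, send_alert):
    # Minimal monitoring loop; runs on the single PC in the elder's home.
    below_since = None
    while True:
        h = read_head_height()           # metres, or None if no skeleton is visible
        if h is not None and h < FALL_HEIGHT_M:
            if below_since is None:
                below_since = time.monotonic()
            elif time.monotonic() - below_since > FALL_DURATION_S:
                send_alert("possible fall: person on the floor for >30 s")
                below_since = time.monotonic()   # repeat the alert at most every 30 s
        else:
            below_since = None           # person upright again: reset the timer
        time.sleep(1.0)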
4 What We Have Discovered: Some Data about Physiotherapy

The deployment of our methodological approach suggests that this is a promising path toward the definition of one possible effective paradigm for the design of
AmI solutions. As an example, in the following figures we summarize several data we have collected that proactively affected the design process: these data could hardly have been figured out without immersion in the dyad's environment. The data about where in the elders' homes and how (Fig. 4) the physiotherapy exercises are performed gave us a precise idea of several constraints on the choice and use of the devices (e.g. they excluded the adoption of Sony's Move). The data about the temporal distribution of the different activities during the same physiotherapy session (Fig. 5, left) provided guidelines for the future development of appropriate software applications based on the use of the Kinect. In the same vein, we established the list and importance of problems that could afflict the use of a Kinect (the priority is ranked by the frequency of the problem as observed during phase 3b, see Fig. 5, right).
Fig. 4. Where (left) and how (right) physiotherapy exercises are performed (30 patients)
Fig. 5. Average time required for each phase of physiotherapy rehabilitation (left) and major constraints on the use of the Kinect encountered during 30 site visits (right)
5 Conclusion

The preliminary results we have obtained so far seem to confirm our hypothesis about the necessity of immersing the designers of an AmI application deeply and iteratively in the "world" of its intended users, that is to say the recipient(s) of the service in her environment. This approach should lead to a better design, and hence to applications that have more chances of being adopted and of effectively enhancing the way people interact with their surroundings – in the broadest sense of the term.
References

1. Afsarmanesh, H., Msanjila, S.S.: ePAL Vision 2020 for Active Ageing of Senior Professionals. In: Camarinha-Matos, L.M., Boucher, X., Afsarmanesh, H. (eds.) PRO-VE 2010. IFIP Advances in Information and Communication Technology, vol. 336, pp. 60–72. Springer, Heidelberg (2010)
2. Augusto, J.C.: Ambient Intelligence: the Confluence of Ubiquitous/Pervasive Computing and Artificial Intelligence. In: Intelligent Computing Everywhere, pp. 213–234. Springer, London (2007)
3. Augusto, J.C., McCullagh, P.: Ambient Intelligence: Concepts and Applications. ComSIS 4(1) (June 2007)
4. Dey, A.K.: Understanding and Using Context. Personal and Ubiquitous Computing Journal 5(1), 4–7 (2001)
5. Dey, A.K.: It Really is All About Location! In: Erickson, T., McDonald, D.W. (eds.) HCI Remixed: Essays on Works that Have Influenced the HCI Community, ch. 10, pp. 61–66. MIT Press, Cambridge (2008)
6. Gil-Gomez, J.A., Lozano, J.A., Alcaniz, M., Perez, S.A.: Nintendo Wii Balance Board for balance disorders. In: Proc. of Virtual Rehabilitation Conference, Haifa, June 29–July 2, p. 213. IEEE, Los Alamitos (2009) ISBN 978-1-4244-4188-4
7. Istat: Previsioni della popolazione residente per sesso, età e regione: 2001-2051 (2003) ISBN: 88-458-0756-8, http://www.istat.it/dati/catalogo/20030326_01/
8. Jeavans, C.: Will we still be working at 70? (2004), http://news.bbc.co.uk/2/hi/uk_news/4016969.stm
9. Nakashima, H., Aghajan, H., Augusto, J.C.: Handbook of Ambient Intelligence and Smart Environments, vol. XVIII. Springer, Heidelberg (2010) ISBN: 978-0-387-93807-3
10. Stranges, M.: Immigration As a Remedy for Population Decline? An Overview of the European Countries. In: European Papers on the New Welfare, Special Issue on the Counter Ageing Society, paper no. 8 (2008)
11. OECD: Health at a Glance: Europe 2010. European Report (2010), http://ec.europa.eu/health/reports/european/health_glance_2010_en.htm
12. Sharp, H., Rogers, Y., Preece, J.: Interaction Design: Beyond Human-Computer Interaction. John Wiley and Sons, Chichester (2009)
Research on Digital Human Model Used in Human Factor Simulation and Evaluation of Load Carriage Equipment

Dayong Dong¹, Lijing Wang², Xiugan Yuan², and Shan Fu¹

¹ School of Aeronautics and Astronautics, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai 200240, P.R. China
² School of Aeronautical Science and Engineering, Beihang University, 37 Xueyuan Rd., Beijing 100191, P.R. China
[email protected]
Abstract. A data structure for the digital human model has been constructed to meet the particular demands of human factor evaluation of load carriage system equipment. Anthropometric data for different percentiles were obtained through regression calculation; range of motion (ROM) data for the different joints, data on the upper-limb reach zone, and accessibility data for the human body surface were also obtained by experiment. A reach-zone envelope was constructed from the experimental data. Different types of view cones were constructed on the basis of related standard data for use in human factor evaluation.

Keywords: Load carriage equipment, Simulation, Human factors, Digital human model.
1 Introduction
With the development of computer visualization technology, the application of digital human models to carry out initial human factor evaluation in the early digital stage of system design has gradually been accepted by designers and developers. Presently, the use of digital human models in the design of both aircraft and automobiles has become relatively universal. There is a Human Engineering module in CATIA which can build digital human models at different percentile heights for different countries (e.g. America, France, Japan, etc.) for the evaluation of visibility and accessibility on a digital model of an aircraft system, and it also offers posture analysis. Load carriage equipment is usually used by soldiers, police and outdoor travelers. However, it is generally attached to the human body surface and varies with human motion, which differs greatly from the driving environment in the
aircraft or car, where the pilot or driver usually works in a sitting posture. The equipment used in the cockpit for human-machine interaction does not move with the human body; therefore, the digital human model has distinctive requirements when used in load carriage system design evaluation. Research has been carried out to address this distinctiveness. First, a human model structure is built with 66 segments, 65 joints and 131 degrees of freedom; secondly, anthropometric data for different percentiles are generated by regression calculation on the basis of the original GJB (China's military standards) data; thirdly, the range of joint motion and accessibility data are determined from experimental data; lastly, visibility data are determined based on GJB.
2 Data Structure
H-Anim is an international standard for virtual human models; its full name is the Humanoid Animation Specification. The standard was established for the representation of virtual humans in online virtual environments, taking into account compatibility, adaptability and conciseness. The current version, H-Anim 2001, describes models using VRML97 [1]. The H-Anim standard utilizes the prototype support in VRML97, defining five types of custom nodes to describe the virtual human model – Humanoid, Joint, Segment, Site and Displacer – among which the Joint node is used for composing the bone structure of the virtual human. The H-Anim standard divides the whole human body into 1 human body centre of gravity, 77 joints and 47 bone segments; these elements constitute an integrated virtual human model. Since the virtual human's bone segments are connected by joints, the motion of the body's centre of gravity, of every bone segment and of every joint will affect the status of the other nodes related to it. This definition accords with the physiological characteristics of human motion. The H-Anim hierarchy is realized by nested Joint nodes. The sacroiliac joint at the end of the spinal column is regarded as the root of the whole skeleton structure, which is traversed from top to bottom. All the joints are organized into a tree-shaped inheritance structure according to the order in which the various joints are encountered. Each joint node is the parent node of those ranking behind it, which forms the skeleton of the human model. This research is based on the above-mentioned international standard and simplifies the neck segments according to actual needs. It divides the human body into 66 segments, 65 joints and 131 degrees of freedom, which include: the body (pelvis and 17 spine segments, totaling 18 segments; the 17 spine segments comprise 12 thoracic vertebrae and 5 lumbar vertebrae), left shoulder, right shoulder, neck, head, left upper arm, right upper arm, left forearm, right forearm, left hand (palm and 15 finger bones, totaling 16 segments), right hand (palm and 15 finger bones, totaling 16 segments), left thigh, right thigh, left shin, right shin, left foot (2 segments) and right foot (2 segments). The topological structure is shown in Fig. 1.
[Figure: tree of segments rooted at the pelvis – lumbar (l1–l5) and thoracic (t1–t12) spine up to the cervical spine and head; clavicle, upper arm, forearm, hand and 15 fingers on each side; thigh, calf, hindfoot and forefoot on each side]
Fig. 1. Human Model Topological Structure
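The following sketch (ours, not part of the cited standard or the implemented system) illustrates the nested-joint idea in a few lines of Python: each Joint node owns its children, so any motion applied to a joint propagates to all of its descendants, as in the H-Anim hierarchy.

from dataclasses import dataclass, field

@dataclass
class Joint:
    # One node of an H-Anim-style joint tree.
    name: str
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def descendants(self):
        # All joints affected when this joint moves (depth-first traversal).
        for c in self.children:
            yield c
            yield from c.descendants()

# The sacroiliac joint is the root of the whole skeleton structure.
root = Joint("sacroiliac")
l5 = root.add(Joint("vl5"))
shoulder = l5.add(Joint("l_shoulder"))
elbow = shoulder.add(Joint("l_elbow"))
wrist = elbow.add(Joint("l_wrist"))

# Rotating the shoulder affects the elbow and wrist, but not the spine above it.
print([j.name for j in shoulder.descendants()])  # ['l_elbow', 'l_wrist']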
3 Human Body Measurement Size
When carrying out human body modeling, it is necessary to define the sizes of the various segments according to human body measurement data. GJB4856-2003 provides basic dimensions for a subset of percentiles. In order to realize human body modeling at any percentile, Haiqiang Li from Beijing University of Aeronautics and Astronautics studied the correlation between the main human body measurements and height and weight, and derived the relevant regression parameters [2]. In this paper those findings were adopted to control the human body model dimensions, so as to realize human models of different dimensions, as shown in Fig. 2.
Fig. 2. Human Models of Different Percentiles
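A minimal sketch of how such a regression can be applied follows; the linear form (dimension = b0 + b1·stature + b2·weight) matches the description above, but the coefficients shown are placeholders, not the parameters reported in [2].

def segment_dimension(stature_mm, weight_kg, b0, b1, b2):
    # Linear regression of a body dimension on stature and weight:
    #   dimension = b0 + b1 * stature + b2 * weight
    return b0 + b1 * stature_mm + b2 * weight_kg

# Placeholder coefficients for an arbitrary segment (not the values from [2]):
print(segment_dimension(1750, 70, b0=12.0, b1=0.14, b2=0.30))  # -> 278.0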
4 Range of Motion of the Joint
The restraints which load carriage equipment exerts on human joints mainly center on the shoulders and the trunk. In this study, subjects carried a 20 kg backpack on their shoulders to simulate the restrictions that a load carriage system exerts on the joints. The shoulder straps of the backpack are 80 mm wide. The range of motion of the joints was measured under this condition, and a database of joint range of motion was established to define the ROM data sheet. The data structure is shown in Table 1 below.
Table 1. Data Structure of Joint ROM

Field name | Description
Joint Name | Name of joint
X1 | Minimum value about the X axis
X2 | Maximum value about the X axis
Y1 | Minimum value about the Y axis
Y2 | Maximum value about the Y axis
Z1 | Minimum value about the Z axis
Z2 | Maximum value about the Z axis
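A possible in-memory representation of a Table 1 record, together with a clamping helper that keeps requested joint angles inside the measured range, is sketched below; the example shoulder values are hypothetical, not data from the experiment.

from dataclasses import dataclass

@dataclass
class JointROM:
    # One record of the ROM data sheet in Table 1 (angles in degrees).
    joint_name: str
    x1: float  # minimum rotation about the X axis
    x2: float  # maximum rotation about the X axis
    y1: float
    y2: float
    z1: float
    z2: float

    def clamp(self, rx, ry, rz):
        # Force a requested posture back inside the measured range of motion.
        return (min(max(rx, self.x1), self.x2),
                min(max(ry, self.y1), self.y2),
                min(max(rz, self.z1), self.z2))

# Hypothetical entry for a shoulder restricted by 80 mm shoulder straps:
shoulder = JointROM("l_shoulder", -45, 160, -30, 90, -60, 60)
print(shoulder.clamp(200, 0, 0))  # flexion capped at 160 degrees: (160, 0, 0)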
In the user interface, the topological structure of the human model's joints is shown by means of human-computer interaction, and the joint angles can be adjusted manually, as shown in Fig. 3.
Fig. 3. Human Model Joint Adjustment
5 Accessibility Data

5.1 Accessible Region Enveloping Surface
First, the subject stands at attention at the origin; then, the horizontal plane is divided at 15° intervals and 15° interval lines are drawn on the ground; next, a plane is defined every 10 cm in the vertical direction, measured from the middle fingertip of the naturally hanging arm; after that, a vertical scale ruler is used to guide the subject in performing touch actions, and the VICON motion tracking system captures the subject's actions in the capture space; finally, the coordinates of the fingertips are recorded and three-dimensional software is used to generate the enveloping surface. Twelve male subjects were selected for measuring the accessible region of the upper limbs. In order to measure the maximum accessible region under free conditions as well as the maximum accessible region and the comfortable accessible region under the condition of restricted shoulder joints, measurements were carried out in three conditions:

1. Free condition: the subjects carry no equipment and the shoulder joints are not restricted in any way. The subjects stretch their upper limbs as far as they can and perform touch actions under guidance.
2. Restricted shoulder joints, maximum reach: the subjects carry the load system with 80 mm wide shoulder straps, so the shoulder joints are somewhat restricted by the straps. The subjects stretch their upper limbs as far as they can and perform touch actions under guidance.
3. Restricted shoulder joints, comfortable reach: the subjects carry the load system with 80 mm wide shoulder straps, so the shoulder joints are somewhat restricted by the straps. The subjects stretch their upper limbs only as far as feels comfortable, judged subjectively, and perform touch actions under guidance.
In order to avoid measurement errors caused by fatigue, the subjects took a 15-minute break after each collection of experimental data until all the data were collected. The generated enveloping surface is shown in Fig. 4.
Fig. 4. Accessible Region Enveloping Surface
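One way to generate such an envelope from the recorded fingertip coordinates is a convex hull, as sketched below with SciPy; the point cloud here is random stand-in data, and the actual study used dedicated three-dimensional software.

import numpy as np
from scipy.spatial import ConvexHull

# Stand-in fingertip coordinates (metres); in the experiment these would be
# the points recorded by the VICON system.
points = np.random.default_rng(1).uniform(-0.9, 0.9, size=(500, 3))

hull = ConvexHull(points)  # enveloping surface of the reach points
print(f"reach volume: {hull.volume:.2f} m^3")
print(f"envelope area: {hull.area:.2f} m^2")
# hull.simplices lists the triangles of the envelope, which can be exported
# to a 3D modelling tool for visualisation.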
5.2 Accessibility of Human Body
The human body was divided into 32 parts (see Fig. 5). Subjects were asked to assume a standing position, a crouched position and a creeping posture, to perform touch actions toward the different body parts, and to make a subjective judgment of accessibility on three levels: comfortably reachable, barely reachable and inaccessible. Comfortably reachable means that the subjects can easily touch the body part without any uncomfortable feeling; barely reachable means that the subjects can touch the body part but experience discomfort; inaccessible means that the subjects cannot reach the body part at all. The experimental data are available in reference [3] and can be mapped onto the digital human model surface.
Fig. 5. Regional Division of Human Body
6 Visibility
According to the related regulations in GJB2873-97 [4] and MIL-HDBK-759A [5], this paper constructs the visual field in both the horizontal and vertical directions to draw view cones under three conditions: eye rotation only, head rotation, and head and eye rotation (Table 2). The different types of view cone of the digital human model are shown in Fig. 6.

Table 2. Visual Field (degrees)

View cone | Up | Down | Left/right
Preferred | 15 | 15 | 15
Eye rotation only | 40 | 20 | 35
Head rotation | 65 | 35 | 60
Head and eye rotation | 90 | 75 | 95
Fig. 6. View cone of digital human model
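The angular limits of Table 2 translate directly into a containment test, as the sketch below shows; the coordinate convention (x forward, y left, z up) is our assumption for illustration, not part of the cited standards.

import math

# Angular limits in degrees from Table 2: (up, down, left/right).
VIEW_CONES = {
    "preferred": (15, 15, 15),
    "eye rotation only": (40, 20, 35),
    "head rotation": (65, 35, 60),
    "head and eye rotation": (90, 75, 95),
}

def in_view_cone(target, cone="eye rotation only"):
    # target = (x, y, z) in the eye's frame: x forward, y left, z up.
    x, y, z = target
    up, down, lateral = VIEW_CONES[cone]
    azimuth = math.degrees(math.atan2(y, x))                   # left/right angle
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))  # up/down angle
    return abs(azimuth) <= lateral and -down <= elevation <= up

print(in_view_cone((1.0, 0.2, 0.5)))  # True: about 11 deg left, 26 deg up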
7 Conclusion
In this paper, the authors introduced a digital human model used in the human factor evaluation of load carriage equipment. Experiments were carried out, and the data from the experiments were used in the human body modeling.
Acknowledgement. This work was supported by a grant from the National Basic Research Program of China (973 Program, No. 2010CB734103).
References

1. International Organization for Standardization (ISO): The Humanoid Animation Specification. International Standard ISO/IEC FCD 19774:200X (2001)
2. Li, H.: The Correlation Analysis of Body Measurements for Human Model. BUAA (2007) (in Chinese)
3. Dong, D., Wang, L., Yuan, X.: Experimental Research on the Confirmation of Human Body Surface Reachable Grade. AHFE (2010)
4. GJB2873-1997: Human engineering design criteria for military equipment and facilities (1997) (in Chinese)
5. MIL-HDBK-759A: Human Factors Engineering Design for Army Materiel. Military Handbook, U.S. (1982)
Multimodal, Touchless Interaction in Spatial Augmented Reality Environments

Monika Elepfandt and Marcelina Sünderhauf

Technische Universität Berlin, Zentrum Mensch-Maschine-Systeme, GRK prometei, Franklinstr. 28/29, 10587 Berlin, Germany
{monika.elepfandt,marcelina.suenderhauf}@mms.tu-berlin.de
Abstract. Spatial augmented reality environments are spreading in many domains. However, from a user-centred perspective we still do not know how to interact with them properly. Thus, there is a need for new interaction concepts. In this paper we discuss certain aspects which are specific to interaction in spatial augmented reality environments. We conclude that the most promising form of interaction will be multimodal and touchless. Additionally, it should be taken into account that the user is interacting in a 3D space; therefore, different ways of interaction might be more appropriate for different spaces around the user, instead of only one. We give a short review of the advantages and disadvantages of touchless and multimodal input devices and then outline spatial aspects of perception and interaction in 3D space.

Keywords: spatial augmented reality, touchless interaction, multimodal interaction, multimodality, interaction in 3D space.
1 Introduction

With technological advances, our everyday life is getting more and more augmented with additional information. Large displays are becoming ubiquitous and it will soon be possible to use every physical surface as an interactive display. A commonly accepted definition of such augmented reality (AR) systems is provided by Azuma [1]: these systems combine the real and virtual world, are interactive in real time, and are registered in 3D. That means that the real world is overlaid with computer-generated virtual information to support the user. The real and the virtual world are combined such that the user interprets the newly created three-dimensional environment as a unit and can interact with it in real time. The most commonly known example of AR is the use of head-up displays in aircraft to visualise airspeed, altitude, the horizon line, etc. Despite the fact that head-mounted displays are still mainly used for AR applications when the user is moving around, spatial AR is a fast-growing field. In these systems the displays are placed and integrated in the environment instead of being attached to the user as with head-mounted or hand-held displays. Normally this is done by using projectors and screens, and the information is directly projected on real objects or even on everyday surfaces. Olwal et al. [2], for example, used spatial AR for industrial CNC machines. The operator sees the real machinery in the workspace
through a display. Process data is presented on the display in such a way that tools and workpieces are overlaid with the visualisations to augment the user's understanding and to simplify machine operation. The advantages for users are obvious: with spatial displays the users do not have to wear heavy equipment. Hence, these systems are less cumbersome and more ergonomic. On top of that, the field of view is less limited compared to head-mounted displays, and spatial AR is more convenient for multi-user settings [3], [4]. Application domains for spatial AR are wide-ranging and include industrial settings like control rooms, product design and manufacturing, medical settings like surgery, and scientific simulations. Spatial AR applications can also be found in everyday life settings such as office environments, museums or smart homes. For a more detailed description see also Bimber and Raskar [3] or Streitz, Kameas and Mavrommati [5]. However, as the development of such systems is mainly technology-driven, we still do not know how to interact with them properly from a user-centred perspective. Though there is a lot of research on interaction techniques, it mostly concentrates on a specific implementation which is often not tested with users or in terms of usability criteria. And even if interaction techniques are tested against usability criteria, they are seldom compared with each other. Thus many challenges remain for interaction in spatial AR environments and there is a considerable need for new interaction concepts and design guidelines. As a first step, this paper discusses key aspects of interacting in spatial AR, focusing on input devices. The first aspect is the question of how the input should be realised. In most human-machine systems the keyboard, mouse, remote-control unit, and touch display are still the main means of interaction. But in spatial AR these input devices are no longer suitable. The idea of spatial AR is that the users are moving around a room while interacting with various screens and projections. Hence, fixed controls like a keyboard and mouse would not be suitable, as they would limit the users' freedom of mobility. Neither would the use of remote controls be a good choice: on the one hand the advantage of detaching the user from the technology would get lost, and on the other hand the need for frequent gaze shifts between the remote control and the screen can be really annoying and time-consuming. Direct touch is also not applicable, as the user often stands at some distance from the screen. On top of that, there are environments where touch cannot be used even if the screen is within reach, because they have to remain sterile (e.g. in surgery or laboratory settings). Thus we think that touchless interaction is the most promising way of interacting in spatial AR, and we will therefore focus on these input devices here. The second important aspect is that the user has to interact with screens and projections at various distances. As mentioned above, spatial AR implies that the user is not sitting in front of a display but is moving around. From neuropsychology it is known that the brain constructs multiple spatial representations in order to interact with the world around us, and that different spaces around a person are characterized by different functions, e.g. manipulation of objects, visual search, spatial orientation, etc. (see chapter 3). The question is if and how various distances influence the type of interaction in spatial AR.
2 Touchless Interaction

We first offer an overview of the input modalities that are mainly used for touchless interaction – gaze, speech and gesture. Their advantages and disadvantages are outlined from a user-centred perspective. Additionally, the question of unimodal vs. multimodal use is discussed.

2.1 Gaze

Gaze is the only reliable predictor of the area of visual attention. During human-human communication it is used by the listener to identify about what (e.g. which person or which object) a person speaks [6], [7]. In addition, gaze is sometimes also directly used to point at objects, people, etc. that are around [7]. Until now, the research focus of gaze as an input modality has been interaction with close displays. Apart from research, it is mainly used by people with motor disabilities. Among the advantages are that the selection of objects on a screen via gaze seems to be faster than hand-eye coordination (e.g. using a mouse) [8], and that users do not have to learn a difficult gaze control, in contrast to hand-eye coordinated devices [6]. This is due to the fact that users have to look at the target anyway before they act mechanically [9], [10]. Accordingly, Ware and Mikaelian [9] could show that gaze interaction is faster than regular input devices. However, when using gaze as a mode of input it should be considered that the eye is primarily a sensory organ for perception and should not be overloaded with other tasks [11], [12]. In most cases, an object is selected by gaze direction and activated by dwell time, a blink or an additional keystroke. The dwell-time method uses the fixation on an object to activate or select it. The critical point for this common method is the setting of the fixation duration. If the dwell time is too short, the so-called "Midas touch" problem occurs [10], i.e. every glance unintentionally activates a command: the user cannot look anywhere without selecting or issuing a command. If the dwell time is too long, the time advantage of fast eye movements disappears [12]. And even with a longer dwell time, the problem of activating an object accidentally still remains, as the user always looks somewhere.
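A minimal sketch of the dwell-time method described above follows; the gaze-sample stream and the hit_test function are assumptions of this illustration, not a concrete eye-tracker API.

DWELL_TIME_S = 0.6  # too short invites the "Midas touch"; too long wastes the speed advantage

def dwell_select(gaze_samples, hit_test):
    # Yield an object once gaze has rested on it for DWELL_TIME_S.
    # gaze_samples: iterable of (timestamp, x, y); hit_test: maps screen
    # coordinates to the object under gaze, or None.
    current, since = None, None
    for t, x, y in gaze_samples:
        obj = hit_test(x, y)
        if obj != current:                 # gaze moved to another object
            current, since = obj, t
        elif obj is not None and t - since >= DWELL_TIME_S:
            yield obj                      # fixation long enough: select it
            since = t                      # restart, so the object is not re-selected at once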
2.2 Speech

Communicating via voice is natural for humans, and compared to gaze or gesture it is the modality most used for direct communication. In human-machine interaction, a main advantage is that speech can be used where the hands are occupied and the eyes are busy. Additionally, speech is not only independent of static input devices like keyboard and mouse, but also works during motion and at a distance, which gives the user more freedom and offers more possibilities to solve several tasks at once. In many empirical studies, speech input is compared to standard manual input devices like keyboard and mouse. Benefits are faster task completion, particularly for tasks that are command-intensive [13], and lower error rates than typing. Martin [14] gives an overview of various studies and also points out that using speech increases user productivity by providing an additional response channel. Users spend less time looking at keyboards when speech is available, which also cuts performance time. Draper et al. [15] tested these advantages with pilots performing a continuous flight/navigation control task while completing various data entry tasks. Speech input was significantly better than manual input in terms of task completion time, task accuracy, flight/navigation measures, and pilot ratings. The data entry time was also reduced considerably with speech input. In addition, users prefer to speak rather than type if natural language interaction is chosen as a modality of human-computer interaction [16]. However, there are also disadvantages to using speech as an input device. There are many situations where users do not want to be overheard by others – whether for privacy reasons or to avoid disturbances, e.g. in an office setting. Depending on the interface, another disadvantage is that voice commands often have to be learned and remembered, which places additional cognitive demands on the user. Voice commands are not always short and intuitive: in some applications the commands are quite long, detailed and circuitous, e.g. if spatial specifications are required [17]. Additionally, speech is not always under full conscious control. During the production of speech, lexical disfluencies (e.g. content self-corrections, false starts, verbatim repetitions, spoken pauses) may appear [18], which a user cannot completely avoid. Current research on speech as an input modality focuses more on integrating speech in a multimodal setting than on using speech on its own (e.g. [19], [20]).

2.3 Gesture

Gestures are an important part of human-human communication. They are used in everyday life and are self-explanatory. Following Fikkert [21], this leads to a joyful and natural use of gestures in human-machine interaction. However, in human-human communication they normally highlight what is being said and are seldom used on their own. Furthermore, the only gesture which is natural is pointing [7]; all others are learned through cultural influences and, during communication, are rarely used on a conscious level. Hence, for utilizing gestures in human-machine interaction the problem arises that they have to be specially designed and are not necessarily intuitive. Nielsen et al. [22] could show that different user groups produce different gestures for the same task. On top of that, the participants used two to three different gestures for the same instruction, although their choice was made at a conscious level. If there is not even real consistency within one person, it seems impossible to find gestures that are really intuitive. Thus, gestures in human-machine interaction always have to be learned and remembered by the user and are therefore cognitively demanding. Jetter, Gerken and Reiterer [23] even argue that "symbolic gestures are indirect and resemble learning an artificial sign language to converse with a system about triggering actions". Therefore, the research focus should not be on symbolic gestures but rather on manipulations, as these are more natural. Apart from that, it should also be taken into account that gesticulating may be physically stressful and tiring if used frequently or for a long time. Users prefer gestures which require a minimum of effort and rate fast and easy switching between resting and interacting as very important [21].
2.4 Multimodality

Instead of using gaze, speech and gesture as unimodal input devices – with their respective advantages and disadvantages – they can also be combined to provide multimodal input. In human-human communication, speech and gesture are normally used multimodally for complementary information. Speech typically specifies subject, verb, and object information, while manual gestures convey location, number, manner of movement and related information [24]. As described earlier, gaze is also sometimes used to point at objects or people. In human-machine interaction, users also have a strong preference for multimodal rather than unimodal interaction across a wide range of application domains [25]. Here again, the modalities are used to complement one another, not redundantly. Bolt [26] already used speech and gesture in a complementary way in his "Put-That-There" interface: the user selected an object or location on the interface by pointing at it and gave a spoken command (e.g. "Create a blue square there."). However, the preference is very task-dependent. Oviatt, DeAngeli and Kuhn [17] conducted a task analysis of speech and pen input for a dynamic map system. They found that the use of multimodal commands depends on the type of action to be performed. Users were most likely to use speech and pen in spatial location commands (adding, moving, modifying and calculating distances). A moderate use of multimodal interaction occurred during selection commands (querying, deleting, labeling and zooming), while during general action commands (e.g. scrolling, printing and locating) a marginal level of multimodality was found (i.e. less than 1% of the time). Tan et al. [27] tested gaze and speech as a data entry method and showed that users prefer the multimodal use of gaze and speech to other methods like handwriting, mouse and keyboard, or speech only. The multimodal input was neither the fastest nor the most accurate, but obtained the highest hedonic quality and naturalness according to the users. Besides, Oviatt [28] reports that errors in multimodal interaction lead to less subjective frustration than errors in unimodal interfaces. Comparing speech-only input with speech-gesture input in map-based tasks, multimodal input results in more efficient speech (fewer spoken words), fewer task performance errors, fewer disfluencies (e.g. content self-corrections, false starts, spoken pauses), and faster task performance [29]. Oviatt et al. [17] also showed that speech commands get shorter and more explicit when used to complement another modality. Thus there are many advantages to adding complementary multimodality to an interface. It is quite important to consider which modalities to combine and for which subtask each modality should be used in combination with the others. For instance, studies show performance benefits when users indicate spatial objects and locations (e.g. points, paths, areas, groupings and containment) through gestures instead of speech [28], [29], [30]. On the other hand, speech is more useful than gesturing for specifying abstract actions [20].
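A "Put-That-There"-style pairing of modalities can be sketched as simple time-window fusion; the event streams below are hypothetical and the 1.5 s window is an illustrative choice, not a value from the cited studies.

def fuse(speech_events, pointing_events, window_s=1.5):
    # Pair each spoken command with the pointing target closest in time,
    # provided it falls within the fusion window.
    commands = []
    for t_s, verb in speech_events:
        candidates = [(abs(t_s - t_p), target)
                      for t_p, target in pointing_events
                      if abs(t_s - t_p) <= window_s]
        if candidates:
            _, target = min(candidates)
            commands.append((verb, target))
    return commands

speech = [(10.2, "move that")]
pointing = [(9.8, "lamp"), (12.4, "table")]
print(fuse(speech, pointing))  # [('move that', 'lamp')]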
Summarizing, we have shown that gaze, speech and gesture have different advantages and disadvantages when used as input modalities. Their benefits and weaknesses should be taken into consideration when designing new interaction techniques for spatial AR environments. Thus, using only one modality does not seem appropriate. Furthermore, as the interaction takes place in 3D space, most of the tasks will have a spatial component. Therefore we think that a multimodal approach is the most promising for interaction in spatial AR.
3 Interaction in 3D Space

As already mentioned, an important challenge for interaction in spatial AR is that the user is no longer sitting in one place but is moving around in a 3D space, interacting with various screens and projections at different distances. Thus, spatial aspects of perception and interaction should always be considered in spatial AR environments. Neuropsychological studies have discovered that the brain builds multiple spatial representations in order to interact with the world around us [31] and that particular representations are created to guide certain actions like eye movements, head movements or arm movements [32]. According to Previc [33], "our behavioral interactions in each region of the 3-D space are associated with a characteristic 3-D locus of operation, a select set of sensory-perceptual and motor operations, and a predominant coordinate system". In his work on 3D spatial interaction he integrated neuropsychological findings into a model of four major behavioral realms: (1) peripersonal, (2) focal extrapersonal, (3) action extrapersonal, (4) ambient extrapersonal (see Fig. 1).
Fig. 1. A theoretical model of 3D spatial interaction, adapted from Previc [33]
The peripersonal realm is within hand reach; the three extrapersonal realms are out of reach. The extent of the realms therefore also depends on an individual's arm length [34]. The peripersonal behavioral system is characterized by grasping, reaching and manipulating objects, while the extrapersonal behavioral systems are characterized by the visual search for and recognition of objects (focal extrapersonal), navigation and target orientation (action extrapersonal), and spatial orientation, postural control and locomotion (ambient extrapersonal).
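As a toy illustration of how a system might act on this model, the sketch below classifies a display distance into a realm; the 2 m boundary between the extrapersonal realms is an arbitrary illustrative cut-off of ours, not a value from Previc [33].

def realm(distance_m, arm_length_m=0.7):
    # Classify a display distance into one of Previc's realms.
    if distance_m <= arm_length_m:
        return "peripersonal"                  # grasping, direct manipulation
    if distance_m <= 2.0:                      # illustrative cut-off only
        return "focal/action extrapersonal"    # visual search, target orientation
    return "ambient extrapersonal"             # spatial orientation, locomotion

for d in (0.5, 1.5, 4.0):
    print(d, "->", realm(d))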
For interaction at varying distances, the differentiation between peripersonal and extrapersonal space is most important. As the peripersonal realm is within reach, it is highly optimized for direct interaction, because this is the only space in which we are naturally able to manipulate objects. Other authors refer to the manipulative realm [35] or grasping space [36]. It can also be extended by the use of instruments [36], [37]. Objects beyond this space can normally not be reached or manipulated without moving toward them. Naturally, the main functions of the extrapersonal realms are more perceptive and visual and are not geared toward direct interaction. But what happens if we have to select or manipulate an object on a distant screen, i.e. in extrapersonal space? Or if we have to interact with objects on a screen that is in peripersonal space but which we are not able to touch? Is there a difference for interaction in the different realms? Or is the only difference the ability to touch? These questions should be considered when developing new interaction concepts for spatial AR environments.
4 Conclusion

In many application domains such as industry, research, and even home environments, spatial AR systems are on the rise. Interaction with these systems is one of the most basic but also most critical components. Unfortunately, input devices designed for traditional computers are not always suitable for new technological approaches – particularly not for interaction in spatial AR. Although a lot of research has been carried out on interaction techniques, we still do not know how to interact with these environments in a suitable way, because the research is often more technologically driven than user-centred. As a first step towards a more user-centred design, this paper pointed out and discussed certain aspects that are specific to interaction in spatial AR. The following aspects should be considered for input devices. First, as the users are moving around in a room while interacting with different screens and projections at various distances, the most promising way of interaction will be touchless. Second, as the user is interacting in 3D space, most or all of the tasks will have a spatial component; thus the design of input devices should focus more on multimodal interaction, as users prefer multimodality for spatial tasks. Third, as the brain builds multiple spatial representations in order to interact with the world around us, and as different spaces around a person are characterized by different functions, it should be taken into account that different modes of interaction might be more appropriate for various distances. Many challenges remain for interaction in spatial AR environments, and there is an urgent need for new interaction concepts and systematic research from a user-centred perspective.

Acknowledgments. We want to thank the Deutsche Forschungsgemeinschaft, the Chair of Human-Machine Systems Berlin, and the research training group prometei for their support.
References

1. Azuma, R.T.: A survey of augmented reality. Presence: Teleoperators and Virtual Environments 6, 355–385 (1997)
2. Olwal, A., Gustafsson, J., Lindfors, C.: Spatial augmented reality on industrial CNC machines. In: SPIE Electronic Imaging, vol. 6804 (2008)
3. Bimber, O., Raskar, R.: Spatial Augmented Reality: Merging Real and Virtual Worlds. A.K. Peters, Wellesley, Massachusetts (2005)
4. Zhou, F., Duh, H.B.-L., Billinghurst, M.: Trends in augmented reality tracking, interaction and display: A review of ten years of ISMAR. In: 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 193–202. IEEE Press, Washington (2008)
5. Streitz, N., Kameas, A., Mavrommati, I.: The Disappearing Computer. Springer, Heidelberg (2007)
6. Kaur, C.S., Tremaine, M., Huang, M., Wilder, N., Gacovski, J., Flippo, Z., Mantravadi, F.: Where is it? Event synchronization in gaze-speech input systems. In: ICMI, pp. 151–158. ACM, Vancouver (2003)
7. Poggi, I.: Mind, Hands, Face and Body: A Goal and Belief View of Multimodal Communication. Weidler, Berlin (2007)
8. Sibert, L.E., Jacob, R.J.K.: Evaluation of eye gaze interaction. In: SIGCHI Conference on Human Factors in Computing Systems CHI 2000, pp. 281–288. ACM Press, New York (2000)
9. Ware, C., Mikaelian, H.H.: An evaluation of an eye tracker as a device for computer input. ACM SIGCHI Bulletin 17, 183–188 (1987)
10. Jacob, R.J.K.: The Use of Eye Movements in Interaction Techniques: What You Look At is What You Get. ACM Transactions on Information Systems 9, 152–169 (1991)
11. Zhai, S., Morimoto, C., Ihde, S.: Manual and gaze input cascaded (MAGIC) pointing. In: SIGCHI Conference on Human Factors in Computing Systems CHI 1999, pp. 246–253. ACM Press, New York (1999)
12. Drewes, H., Hußmann, H., Schmidt, A.: Blickgesten als Fernbedienung. In: Mensch & Computer, pp. 79–88. Oldenbourg-Verlag, Weimar (2007)
13. Karl, L., Pettey, M., Shneiderman, B.: Speech-Activated versus Mouse-Activated Commands for Word Processing Applications: An Empirical Evaluation. International Journal of Man-Machine Studies 39, 667–687 (1993)
14. Martin, G.L.: The utility of speech input in user-computer interfaces. International Journal of Man-Machine Studies 30, 355–375 (1989)
15. Draper, M., Calhoun, G., Ruff, H., Williamson, D., Barry, T.: Manual versus speech input for unmanned aerial vehicle control station operations. In: Human Factors & Ergonomics Society's 47th Annual Meeting, pp. 109–113 (2003)
16. Cohen, P.R., Oviatt, S.L.: The role of voice input for human-machine communication. Proceedings of the National Academy of Sciences of the United States of America 92, 9921–9927 (1995)
17. Oviatt, S., DeAngeli, A., Kuhn, K.: Integration and synchronization of input modes during multimodal human-computer interaction. In: SIGCHI Conference on Human Factors in Computing Systems CHI 1997, pp. 415–422. ACM Press, New York (1997)
18. Oviatt, S.: Multimodal interfaces for dynamic interactive maps. In: SIGCHI Conference on Human Factors in Computing Systems Common Ground – CHI 1996, pp. 95–102. ACM Press, New York (1996)
19. Kammerer, Y., Scheiter, K., Beinhauer, W.: Looking my way through the menu: the impact of menu design and multimodal input on gaze-based menu selection. In: Symposium on Eye Tracking Research & Applications, pp. 213–220 (2008)
20. Tse, E., Shen, C., Greenberg, S., Forlines, C.: Enabling interaction with single user applications through speech and gestures on a multi-user tabletop. In: Working Conference on Advanced Visual Interfaces – AVI 2006, pp. 336–343 (2006)
21. Fikkert, F.W.: Gesture Interaction at a Distance. University of Twente, Centre for Telematics and Information Technology (2010)
22. Nielsen, M., Störring, M., Moeslund, T.B., Granum, E.: A procedure for developing intuitive and ergonomic gesture interfaces for HCI. In: Camurri, A., Volpe, G. (eds.) GW 2003. LNCS (LNAI), vol. 2915, pp. 409–420. Springer, Heidelberg (2004)
23. Jetter, H.-C., Gerken, J., Reiterer, H.: Natural User Interfaces: Why We Need Better Model-Worlds, Not Better Gestures. In: CHI 2010 Workshop – Natural User Interfaces: The Prospect and Challenge of Touch and Gestural Computing, pp. 1–4 (2010)
24. McNeill, D.: Hand and Mind: What Gestures Reveal about Thought. The University of Chicago Press, Chicago (1992)
25. Oviatt, S.: Multimodal Interfaces. In: Jacko, J.A., Sears, A. (eds.) A Handbook of Human-Computer Interaction, pp. 414–428. Lawrence Erlbaum Associates, New Jersey (2002)
26. Bolt, R.A.: 'Put-that-there': Voice and gesture at the graphics interface. In: 7th Annual Conference on Computer Graphics and Interactive Techniques, pp. 262–270. ACM Press, New York (1980)
27. Tan, D.S., Gergle, D., Scupelli, P., Pausch, R.: With similar visual angles, larger displays improve spatial performance. In: SIGCHI Conference on Human Factors in Computing Systems, pp. 217–224. ACM Press, New York (2003)
28. Oviatt, S.: Ten myths of multimodal interaction. Communications of the ACM 42, 74–81 (1999)
29. Oviatt, S.: Multimodal interactive maps: Designing for human performance. Human–Computer Interaction 12, 93–129 (1997)
30. Cohen, P.R., Johnston, M., McGee, D., Oviatt, S.: QuickSet: Multimodal Interaction for Distributed Applications. In: 5th ACM International Conference on Multimedia, pp. 31–40. ACM Press, New York (1997)
31. Colby, C.L.: Action-oriented spatial reference frames in cortex. Neuron 20, 15–24 (1998)
32. Colby, C.L., Duhamel, J.R.: Spatial representations for action in parietal cortex. Cognitive Brain Research 5, 105–115 (1996)
33. Previc, F.H.: The Neuropsychology of 3-D Space. Psychological Bulletin 124, 123–164 (1998)
34. Longo, M.R., Lourenco, S.F.: Space perception and body morphology: extent of near space scales with arm length. Experimental Brain Research 177, 285–290 (2007)
35. May, M.: Raum. In: Funke, J., Frensch, P.A. (eds.) Handbuch der Allgemeinen Psychologie – Kognition, pp. 66–74. Hogrefe, Göttingen (2006)
36. Grüsser, O.-J.: Multimodal structure of extrapersonal space. In: Hein, A., Jeannerod, M. (eds.) Spatially Oriented Behavior, pp. 327–352. Springer, New York (1983)
37. Holmes, N.P., Spence, C.: The body schema and the multisensory representation(s) of peripersonal space. Cognitive Processing 5(2), 94–105 (2004)
Introducing ema (Editor for Manual Work Activities) – A New Tool for Enhancing Accuracy and Efficiency of Human Simulations in Digital Production Planning

Lars Fritzsche¹, Ricardo Jendrusch², Wolfgang Leidholdt², Sebastian Bauer², Thomas Jäckel², and Attila Pirger¹

¹ imk automotive, Inc., 5 Research Drive, 29607 Greenville SC, U.S.A.
² imk automotive, GmbH, Annaberger Str. 73, 09111 Chemnitz, Germany
[email protected]
Abstract. The aging workforce is a risk factor for manufacturing industries that contain many jobs with high physical workloads. Thus, ergonomic risk factors have to be avoided in early phases of production planning. This paper introduces a new tool for simulating manual work activities with 3D human models, the so-called ema. For the most part, the ema software is based on a unique modular approach comprising a number of complex operations that were theoretically developed and empirically validated by means of motion capturing technologies. Using these modules to define the digital work process enables the production planner to compile human simulations more accurately and much more quickly than with any of the existing modeling tools. Features of the ema software implementation, such as ergonomic evaluation and MTM time analyses, and the workflow for practical application are presented.

Keywords: Human modeling, Production planning, Ergonomics, Efficiency.
1 Introduction

Manufacturing industries are facing major challenges due to the continued aging of their workforce, which is caused by increasing life expectancy and decreasing fertility rates. This demographic development is considered a particular risk factor for work tasks that are associated with low autonomy and high physical task demands, for example automotive assembly [1], [2]. In order to maintain the work ability of older employees and avoid work-related musculoskeletal disorders, such as severe low back pain and carpal tunnel syndrome, manufacturers need to apply ergonomic principles of workplace design in early phases of production planning [3], [4]. Computer-aided simulation tools, such as digital human models (DHM), are considered very promising for facilitating pre-production planning and proactive ergonomic assessment [5], [6]. However, current DHM tools are often complicated to handle, and preparing human simulations for specific areas of application is therefore mostly time-consuming and inefficient. Furthermore, it is important to ensure that simulations of cycle times and ergonomic workloads are precise and reflect reality well, because analysis results may lead to substantial
investments in workplace (re)design [7]. Facing these practical requirements, this paper introduces the "Editor for Manual Work Activities" (ema) – a new software tool that reduces the effort of preparing simulations of human work and, at the same time, improves the accuracy of the simulations. ema can be applied in various manufacturing environments with clock-cycled assembly lines and manual work stations, particularly in the field of automotive production planning.
2 Theoretical and Empirical Background

The newly developed software tool "Editor for Manual Work Activities" (ema) utilizes a modular approach for describing and generating human work activities. ema is based on so-called "complex operations" representing an aggregation of single elementary movements that are directed at carrying out a certain work task. Thus, using highly automated algorithms, ema strongly reduces the effort of simulating human work, enabling the production planner to generate a simulation of the entire work process by describing operations at a rough level of abstraction. In order to achieve correct simulation results, all complex operations implemented in ema were first derived on a theoretical basis and then empirically validated.

2.1 Theoretical Considerations

Current tools for human work simulation, such as DELMIA Human Builder (previously known as Safeworks) and Siemens NX Human Modeling (previously known as Jack), often produce unrealistic dynamic movements. In order to improve the accuracy and validity of simulation results as well as user acceptance, human behavior and motions need to be generated and presented more realistically. Therefore, all movements of the human model have to be based on the principles of biomechanics and ergonomics (i.e., anthropometrics). However, following these principles alone is not sufficient, because neither scientific discipline specifically addresses the generation of realistic motions. Biomechanics investigates human abilities to carry out specific activities based on the functions of the musculoskeletal system and the laws of physics [8]. Ergonomics examines the level of physical strain, including factors like body postures, action forces, and manual weight handling, when people conduct a specific task in a specific work environment [9]. In reality, however, human movements in a work setting are not only determined by biomechanical abilities and ergonomic strain but also by a number of other influences. People tend to carry out work tasks in the most convenient and most comfortable way. Eklund, for instance, found that people put less effort into correctly performing their work tasks when they felt fatigue or pain, in order to avoid discomfort [10]. Nevertheless, the most convenient way is not necessarily the most ergonomic way of working, and thus people often need to be trained to move correctly, for example when lifting heavy parts. This is the reason why many human movements in a production environment are strongly influenced by the training level of operators following the standard work descriptions that are pre-defined by supervisors. Thus,
for example, people might use their left hand instead of the right hand not because it is easier, but because it avoids injuries and is requested by their standard work description. Based on these theoretical assumptions and practical observations, the artificial generation and simulation of correct and realistic human movements at work has to consider at least three factors: (1) biomechanical principles, (2) ergonomic strain, and (3) the training level of real operators. For the development of the new ema software, all three factors were investigated, as described in the following sections.

2.2 Defining Complex Operations
ema utilizes a modular approach for describing and generating human work activities that is based on so-called "complex operations". Complex operations represent an aggregation of single elementary movements that are all directed at carrying out the same work task in a logical sequence. For instance, the operation "get and place part" may consist of the following single movements: steps forward – bend – hand to object – pick object – object to body – step backward – turn – steps forward – bend – place object – let loose – hand back. Of course, in a multifaceted work environment like automotive assembly many complex operations are needed, and most of them are more complicated than the example above. One major challenge in the development of ema was to define and implement all the complex operations and each single step in the correct sequence. At first, operations were logically defined based on theoretical analyses and practical observations. The most complicated part, however, was the software-technical implementation of each operational step. In particular, a number of parameters were defined for each complex operation, enabling the software user to quickly adjust the boundary conditions of the work task, for instance the weight of the part that has to be handled. Considerable research, analysis, and testing effort was invested to optimize the algorithms for each operational step in accordance with biomechanical and ergonomic principles in order to improve the correctness of movements under the given physical restrictions. To this end, the software developers also used motion-capturing data from real operators with a certain training level.
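To make the modular idea concrete, the following C++ sketch shows how a complex operation might be represented as a parameterized sequence of elementary movements. All type and field names here are illustrative assumptions for exposition; they are not the actual ema data model.

    #include <string>
    #include <vector>

    // Illustrative only: one elementary movement within a complex operation.
    struct ElementaryMovement {
        std::string name;  // e.g. "bend", "pick object", "step backward"
    };

    // A complex operation aggregates elementary movements in a fixed logical
    // sequence and exposes a few parameters that adapt the generated motion
    // to the concrete work situation (hypothetical selection of parameters).
    struct ComplexOperation {
        std::string name;                          // e.g. "get and place part"
        std::vector<ElementaryMovement> sequence;  // ordered single movements
        double partWeightKg;                       // boundary condition
        double workingHeightM;                     // boundary condition
        bool useLeftHand;                          // from standard work description
    };

    ComplexOperation makeGetAndPlacePart() {
        return ComplexOperation{
            "get and place part",
            {{"steps forward"}, {"bend"}, {"hand to object"}, {"pick object"},
             {"object to body"}, {"step backward"}, {"turn"}, {"steps forward"},
             {"bend"}, {"place object"}, {"let loose"}, {"hand back"}},
            8.5,    // partWeightKg (invented value)
            0.4,    // workingHeightM (invented value)
            false}; // useLeftHand
    }

In this picture, the planner only selects an operation and sets its parameters; the generation of the actual motion is left to ema's algorithms.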
2.3 Empirical Validation with Real Operators

Currently there are no theoretical methods that can fully describe the complexity of generating human motions. Therefore, an empirical approach that uses the movements of real humans is needed in order to develop more realistic simulation methods. Modern motion-capturing technologies are able to record movements in space and describe them as three-dimensional vectors. In this way, it is possible to collect reliable motion data that can be used for enhancing the quality of the software-technical generation and simulation of human movements. In this project, an A.R.T. (Advanced Realtime Tracking) motion-capturing system was used for the development of ema to specifically record human movements during pre-defined sequences of manual work activities. In order to collect motion data that can be used for the purpose of realistically generating and simulating human work, it was necessary to prepare an artificial environment in which
experienced operators from real production lines are recorded and real standard tools, such as battery-driven screwdrivers, are used. Based on these premises, a large number of trials were set up to motion-capture simple sequences, such as "get and place part", as well as more complex movements, such as "car ingress/car egress". A number of external parameters that may influence the recorded movements were systematically varied and tested, for instance the working height, the force direction, and the weight of the handled parts. Furthermore, the characteristics of the study participants were also varied, particularly body height and skill level. Finally, the following conditions were found to be most useful for recording motion data that can be used for human simulations in pre-production planning:

• Capture approximately 50th percentile body height
• Use skilled workers with extensive experience and training in the simulated task
• Record operations in very small steps while systematically varying parameters
• Identify continuous behavior, erratic movements, and impossibilities/refusals
Figure 1 illustrates one example of the experimental setting that was used for the empirical development and validation of ema. These studies were conducted with experienced skilled workers at the Volkswagen Training Center in Germany. The complex operation shown in the picture is "screwing with battery tool". Before recording motion data, all possible screwing directions and angles were derived from theory and expertise, considering influencing factors such as the working height. Based on this, a test specimen was prepared that contained 181 drilled holes in a range of more than 180 degrees. Following that, some test trials were conducted to determine the exact separation and sequence of recordable movements. Finally, after an elaborate phase of preparation, the actual motion-capturing trials took place.
Fig. 1. Development and empirical validation of ema: experimental setting for motion-capturing the complex operation "screwing with battery tool (pistol grip)"
After recording, the motion-capturing data have to be post-processed in order to eliminate measurement inaccuracies, numerical errors, missing markers, etc. Following that, all
data needs to be translated into 3D vectors and related to a specific model of the test participant, in which all movements are described by joints. In a second step, the individual motion data has to be enriched with meta-data for categorization and stored in a common database for typed movements. Finally, this database is the source for the motion generator that is used by ema to simulate complex operations following the principle of similarity: individual movements for specific conditions are derived from the database by searching for the best match (i.e., the most appropriate and most similar motion in storage). This mechanism is a complex procedure based on many algorithms and represents the actual "heart and brain" of ema.
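The "principle of similarity" can be pictured as a nearest-neighbor search over the typed-motion database. The following C++ sketch illustrates this idea only; the record fields and the weighted distance measure are invented assumptions, not ema's actual similarity algorithm.

    #include <cmath>
    #include <limits>
    #include <vector>

    // Illustrative record for one stored, typed motion.
    struct StoredMotion {
        double workingHeightM;  // condition under which it was captured
        double partWeightKg;
        double bodyHeightM;     // of the recorded operator
        int id;                 // key into the joint-trajectory data
    };

    // Hypothetical distance between the requested conditions and a stored
    // motion; the weights are arbitrary placeholders.
    double conditionDistance(const StoredMotion& m, double height,
                             double weight, double body) {
        return std::abs(m.workingHeightM - height) +
               0.1 * std::abs(m.partWeightKg - weight) +
               0.5 * std::abs(m.bodyHeightM - body);
    }

    // Return the id of the most similar stored motion.
    int bestMatch(const std::vector<StoredMotion>& db, double height,
                  double weight, double body) {
        int best = -1;
        double bestDist = std::numeric_limits<double>::max();
        for (const StoredMotion& m : db) {
            double d = conditionDistance(m, height, weight, body);
            if (d < bestDist) { bestDist = d; best = m.id; }
        }
        return best;
    }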
2.4 Ergonomic Risk Assessment with ema

Biomechanical correctness and a high accuracy of movements were both important criteria for finally defining and implementing the complex operation modules in ema. Thus, using the pre-defined operations and specifying their external parameters to compile digital human simulations not only reduces the time effort and enhances simulation efficiency, but also enables the production planner to conduct more precise ergonomic evaluations [6]. ema includes a standard tool for ergonomic risk assessment, the AAWS ("Automotive Assembly Worksheet") [11], respectively its updated format EAWS ("European Assembly Worksheet"). This checklist has been shown to produce reliable results in real-world settings as well as in the evaluation of digital human simulations [7]. It was specifically developed for ergonomic assessments of repetitive assembly tasks. In contrast to other methods like RULA [12] or OWAS [13], which are both mainly focused on postures and already available for DELMIA Human and Siemens Jack, the EAWS covers several physical risk factors, such as static postures, action forces, manual weight handling, and specific "extra work strains". Moreover, EAWS systematically takes the intensity, duration, and frequency of all four risk factors into account and thus allows a more comprehensive ergonomic evaluation of the entire assembly process. Using the EAWS methodology as a foundation, ema enables a semiautomatic ergonomic assessment, which improves not only the efficiency but also the objectivity of 3D human work analyses. To this end, the joint angles and positions of the body segments are recorded throughout the entire cycle (i.e., the simulation time). Based on these data, each posture is categorized into one of the standard posture classes defined by EAWS (e.g., standing upright, bent forward, overhead, etc.). In some situations the categorization based on joint data alone is not definite; for instance, when sitting on an assembly chair the joint positions of the legs may be very similar to a squatting posture. However, the modeling methodology of ema facilitates correct posture assessment in such cases because it assigns an initial body movement that is typical for each posture; for example, there is always a movement to "sit down" prior to the posture "seated". This approach ensures a correct and unambiguous automatic categorization of all postures. Moreover, certain complex tasks, such as "car ingress/car egress", may involve further workloads leading to additional scores for "extra work strains". To complete the ergonomic evaluation, information regarding action forces and object weights needs to be entered manually by the user. Finally, ema combines all ergonomic data and
calculates a total risk score that is rated according to the traffic-light system (green – yellow – red) defined by EAWS. This way, ema allows a comprehensive semiautomatic ergonomic assessment using automatically retrieved data on postures and movements, enriched with additional information on forces and weights provided by the user. In future releases, however, ema will also be able to retrieve this kind of data automatically from CAD part specifications, when available.
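As a schematic illustration of this semiautomatic assessment, the sketch below categorizes a posture from joint data and maps a total score onto the traffic-light zones. The posture classes, the angle threshold, and the 25/50 zone boundaries are simplified placeholders and would have to be taken from the official EAWS documentation in any real implementation.

    #include <string>

    enum class PostureClass { StandingUpright, BentForward, Overhead, Seated };

    // Simplified posture categorization from joint data; the 20-degree trunk
    // flexion threshold is an invented placeholder. The "sat down before"
    // flag mirrors ema's rule that an initial "sit down" movement precedes
    // the posture "seated".
    PostureClass categorize(double trunkFlexionDeg, bool handsAboveShoulder,
                            bool satDownBefore) {
        if (satDownBefore) return PostureClass::Seated;
        if (handsAboveShoulder) return PostureClass::Overhead;
        if (trunkFlexionDeg > 20.0) return PostureClass::BentForward;
        return PostureClass::StandingUpright;
    }

    // Traffic-light rating of a total score (assumed zone boundaries).
    std::string trafficLight(double totalScore) {
        if (totalScore < 25.0) return "green";
        if (totalScore <= 50.0) return "yellow";
        return "red";
    }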
3 ema Software Implementation

In general, ema is conceptualized as an open software system. The ema simulation kernel can be incorporated into any available PLM (Product Lifecycle Management) system and any 3D engine, due to its modular architecture and various interfaces. In the current version, ema is available as a plug-in for "DELMIA V5" by Dassault Systèmes, one of the leading PLM systems for automotive production planning. As mentioned above, DELMIA already provides a 3D man-model that, however, can only generate dynamic human simulations by defining single body movements in a very time-consuming step-by-step process. The so-called "MoveToPosture" activities contain information about the current values of the joint angles (DOFs – degrees of freedom) and the time duration for reaching a certain target posture. Incorporating ema into DELMIA V5 aims at reducing the time effort, the handling difficulties, and the inaccuracies of the current modeling process by enabling interactive digital planning. Figure 2 shows the graphical user interface of ema V5.
Fig. 2. Screenshot of the ema V5 user interface for task configuration (in German)
Before the planner can start working with ema V5, the simulation environment first needs to be prepared with the given DELMIA tools. To this end, all relevant data of products (e.g., parts) and resources (e.g., machinery) have to be loaded and saved
in its "initial state". Furthermore, at least one human model needs to be added to the scene; all properties of the human model (e.g., height, gender, etc.) may be defined in the DELMIA "Human Builder". Following that, the planner may start the ema V5 plug-in, which is added to the DELMIA workbench "Human Task Simulation". After initializing ema V5, the pre-defined complex operations (categorized and presented on the left side of the screen) can be used for describing the work process with a simple drag-and-drop mechanism that adds operations to the human model's task bar. ema V5 automatically presents a list of parameters to specify the operation after it is added to the task bar – in some cases specifications have to be completed with drop-down menus (e.g., left vs. right hand, object weights, etc.), whereas other parameters are defined by directly interacting with the 3D simulation environment via mouse click (e.g., walking ways, points of assembly, etc.). Based on the parameter settings, all pre-defined movements included in the complex operation are adapted and calculated to fit the specific situation. This approach enables the planner to describe the entire work process merely by choosing the correct complex operations and entering a small number of defining parameters. Following the interactive description of the work process, ema V5 automatically generates the 3D human simulation including all single steps (i.e., MoveToPostures), which are stored accordingly in the DELMIA process structure. Thereby, ema V5 uses its improved algorithms for the generation of human motions as described above. Furthermore, by running the simulation, ema V5 automatically calculates the MTM time for each operation and for the entire work process (see Figure 3). The MTM analyses can be stored and, potentially, transferred directly to alphanumerical planning software, such as DELMIA Process Engineer (DPE).
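As a rough illustration of the time calculation, the fragment below sums pre-determined times over a sequence of operations. The conversion 1 TMU = 0.00001 h = 0.036 s is the standard MTM definition; the per-operation values themselves would come from the MTM analysis.

    #include <numeric>
    #include <vector>

    // One planned operation with its pre-determined time in TMU
    // (Time Measurement Units; 1 TMU = 1e-5 h = 0.036 s).
    struct PlannedOperation {
        double tmu;
    };

    // Total process time in seconds for a sequence of operations.
    double processTimeSeconds(const std::vector<PlannedOperation>& ops) {
        double totalTmu = std::accumulate(
            ops.begin(), ops.end(), 0.0,
            [](double acc, const PlannedOperation& op) { return acc + op.tmu; });
        return totalTmu * 0.036;
    }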
Fig. 3. Screenshot of the ema V5 user interface for MTM time analyses (in German)
Furthermore, one of the most helpful features of ema V5 is the semiautomatic ergonomic assessment based on the premises of EAWS (see above). While the 3D human simulation is running, all postures are recorded and stored in a database. The
duration, frequency, and intensity of all postures are then summarized in one list. Additional information regarding action forces and object weights is either retrieved from standard product specifications or added manually in order to derive a total score for the ergonomic risk evaluation. Moreover, ema V5 enables detailed analyses of the joint angles of certain body segments across the time course of the entire work process, which allows risky movements and operations with high workload amplitudes to be specifically identified (see Figure 4).
Fig. 4. Screenshot of the ema V5 user interface for posture analysis (in German)
In summary, the current release of ema V5 1.0 includes the following features:

− Intuitive graphical user interface (GUI)
− Full integration into the existing PLM infrastructure of DELMIA V5
− Library of elementary complex operations
− Generation of MoveToPosture activities, including automatic calculation of walking ways, collision avoidance, and determination of grip points
− Automatic calculation of MTM times or manual input of planning times
− Posture recognition and semiautomatic ergonomic assessment based on EAWS

For future releases it is planned to enlarge the motion library by adding more complex operations and to integrate more condensed methods of pre-determined times (e.g., additional MTM methods). Moreover, ema will be made available as a standalone tool with an integrated 3D engine.
4 Practical Application

ema has already been tested in a number of pilot applications in car manufacturing. First results show that ema can reduce the effort of preparing human simulations by up to 90% compared to manual step-by-step simulations. Most importantly, the first pilot applications showed that ema can be integrated into corporate software architectures to support the standard product development process used by several car manufacturers as well as by other production industries. Firstly, in the early concept phase ema may be used to validate the product's buildability, which includes verifying that the vehicle can be manufactured with the given planning premises, equipment restrictions, and abilities of manual operators. Ergonomic analyses in that phase can, for instance, check well-known issues of the predecessor car (i.e., the reference vehicle) and other previous models. ema supports this phase by using available CAD data to quickly set up human simulations for comparing concept alternatives that influence the future assembly process. Thus, part design might be revised to improve the ease of manual assembly, which may not only reduce ergonomic workload but also save costly production time. By enabling accurate 3D analyses of the future assembly process, ema may also significantly reduce the costs of late corrective design changes and part optimizations after the start of production (SOP). Furthermore, ema provides the opportunity to visually document good design solutions in a best-practice database that could serve as a guideline for the development of future models. Secondly, in the phase of pre-production planning ema may be used for the compilation and validation of the future work process. Utilizing the features described above, ema supports the production planner in quickly setting up a standard work sequence and generating 3D simulations for visual inspection and optimization. ema provides a set of tools that facilitate the design of efficient work processes by avoiding "waste" (with reference to the Toyota Production System), such as ergonomic strains (far reaches, bending, etc.), long walking ways, and double handling of parts and tools. Thus, ema enables the comparison of process alternatives by means of objective quantitative analyses of ergonomics and MTM time in early phases of pre-production planning, based merely on digital product data that is readily available in the existing PLM environment, such as DELMIA V5. Finally, in the phase of series production ema may be used for investigating product, equipment, and process optimizations before implementing the actual changes on the production line or setting up costly production trials. In order to support the continuous improvement process after SOP, ema allows the series planner to quickly simulate and verify the integration of new concepts into an existing production environment. Again, the incorporated tools for quantitative analyses of ergonomics and MTM time provide an objective evaluation of the proposed changes, and thus the costs of extensive production trials decrease. Furthermore, the ema-generated simulation can be used to communicate evaluation results to the involved parties, such as the workers' union, safety experts, plant management, etc., and to reach a consensus on the final solution. Lastly, the same simulations can also be used to introduce the new equipment to the workers and to provide first training on the correct usage and the new work process.
5 Conclusions

ema is a new tool for simulating and editing manual work activities in digital production planning. Based on a large body of research, ema improves the simulation accuracy of existing man-models, such as DELMIA Human, and significantly reduces the effort of compiling human modeling studies through its unique modules of complex operations. ema enables the human model to quickly transfer standard work descriptions into sequences of natural movement – just like a real operator would. In that sense, ema makes the human model a little smarter by utilizing the skills and knowledge of a qualified worker. Finally, ema supports production planners in analyzing future ergonomic conditions and proactively avoiding physical overload in order to maintain the work ability of the aging workforce in manufacturing industries.
References

1. Streb, C., Voelpel, S., Leibold, M.: Managing the aging workforce: Status quo and implications for the advancement of theory and practice. European Management Journal 26, 1–10 (2009)
2. Weichel, J., Buch, M., Urban, D., Frieling, E.: Sustainability and the ageing workforce: Considerations with regard to the German car manufacturing industry. In: Docherty, P., Kira, M., Shani, A.B. (eds.) Creating Sustainable Work Systems, pp. 70–83. Routledge, New York (2009)
3. Fritzsche, L.: Work Group Diversity and Digital Ergonomic Assessment as New Approaches for Compensating the Aging Workforce in Automotive Production. Doctoral Dissertation, Technical University of Dresden, Germany (2010)
4. Ilmarinen, J.E.: The ageing workforce – Challenges for occupational health. Occupational Medicine 56, 362–364 (2006)
5. Demirel, H.O., Duffy, V.G.: Applications of digital human modeling in industry. In: Duffy, V.G. (ed.) HCII 2007 and DHM 2007. LNCS, vol. 4561, pp. 824–832. Springer, Heidelberg (2007)
6. Chaffin, D.B.: On simulating human reach motions for ergonomics analyses. Human Factors and Ergonomics in Manufacturing 12, 235–247 (2002)
7. Fritzsche, L.: Ergonomics risk assessment with digital human models in car assembly: Simulation versus real-life. Human Factors and Ergonomics in Manufacturing & Service Industries 20, 287–299 (2010)
8. Chaffin, D.B., Andersson, G.B.J., Martin, B.J.: Occupational Biomechanics. Wiley & Sons, New York (2006)
9. Karwowski, W., Salvendy, G.: Ergonomics in manufacturing: raising productivity through workplace improvement. Society of Manufacturing Engineers, Dearborn (1998)
10. Eklund, J.: Relationships between ergonomics and quality in assembly work. Applied Ergonomics 26, 15–20 (1995)
11. Winter, G., Schaub, K., Landau, K.: Stress screening procedure for the automotive industry: Development and application of screening procedures in assembly and quality control. Occupational Ergonomics 6, 107–120 (2006)
12. McAtamney, L., Corlett, E.N.: RULA: A survey method for the investigation of work-related upper limb disorders. Applied Ergonomics 24, 91–99 (1993)
13. Karhu, O., Kansi, P., Kuorinka, I.: Correcting working postures in industry: A practical method for analysis. Applied Ergonomics 8, 199–201 (1977)
Accelerated Real-Time Reconstruction of 3D Deformable Objects from Multi-view Video Channels

Holger Graf1, Leon Hazke1, Svenja Kahn1, and Cornelius Malerczyk2

1 Fraunhofer Institute for Computer Graphics Research IGD, Fraunhoferstr. 5, 64283 Darmstadt, Germany
{Holger.Graf,Leon.Hazke,Svenja.Kahn}@igd.fraunhofer.de
2 University of Applied Sciences Giessen-Friedberg, Faculty of Mathematics, Natural Sciences and Computer Sciences, Wilhelm-Leuschner-Strasse 13, 61169 Friedberg, Germany
[email protected]
Abstract. In this paper we present a new framework for the accelerated 3D reconstruction of deformable objects within a multi-view setup. It is based on a new memory management scheme and an enhanced algorithm pipeline for the well-known Image-Based Visual Hull (IBVH) algorithm, which enables efficient and fast reconstruction and opens up new perspectives for the scalability of time-consuming computations within larger camera environments. As a result, a significant increase in frame rates for the volumetric reconstruction of deformable objects can be achieved using an optimized CUDA-based implementation on NVIDIA's Fermi GPUs. Keywords: Image-based 3D reconstruction, GPU, CUDA, Real-time reconstruction, Image-Based Visual Hull.
1 Introduction

In this paper we present an efficient high-resolution Image-Based Visual Hull (IBVH) algorithm that runs entirely in real time on a single consumer graphics card. The target application is a real-time motion capture system with full-body 3D reconstruction. The topic of 3D scene reconstruction from multiple images has been investigated over the last twenty years and has produced numerous results in the areas of computer graphics and computer vision. In particular, real-time 3D reconstruction of target objects within a GPU environment has become a very active topic. Image-based visual hull reconstruction (IBVH) is a 3D reconstruction technique from multiple images [1], which is based on the visual hull concept defined by Laurentini [2]. The visual hull of a 3D object is a conservative 3D shell which contains the 3D object. The IBVH algorithm calculates the visual hull of a 3D object by intersecting the silhouette cones of the 3D object in several images. Whereas other 3D reconstruction techniques are often computationally intensive or limited to the reconstruction of static 3D scenes [3,4,5], the IBVH algorithm can reconstruct the 3D
visual hull of rigid and deformable objects. A common application scenario of 3D visual hull reconstruction from multi-view video channels is the acquisition, analysis, and visualization of human movements. Corazza et al. estimate human body poses by fitting an articulated 3D model of a human into the reconstructed visual hull of a person [6]. Graf, Yoon, and Malerczyk reconstruct the visual hull to give the user the possibility of inspecting the deformable object from an arbitrary viewpoint [7]. While the user moves a virtual camera around the deformable object, new views of the object are synthesized in real time from the point of view of the virtual camera. Whereas the visual hull based pose estimation proposed in [6] is an offline process, the 3D reconstruction for the synthesis of new views needs to be calculated in real time [7]. The processing time of the visual hull reconstruction increases with the number of cameras and their resolution. On current state-of-the-art processors, the frame rate of the CPU-based IBVH algorithm on multi-view video channels [1] is not high enough for a smooth real-time visualization. However, the immense speed-up of the processing power of graphics processing units in the last few years allows for the development of very fast GPU-based 3D reconstruction algorithms. Mapping computer vision algorithms to the GPU can significantly accelerate their processing time. Here, Owens et al. present a broad survey of general-purpose computations on graphics processors [8], while Fung and Mann focus on the mapping of computer vision and image processing tasks to the GPU [9]. Rossi et al. [10] as well as Weinlich et al. [11] show that 3D reconstruction algorithms can be considerably accelerated by adapting them for execution on the GPU. Waizenegger et al. [12] as well as Ladikos, Benhimane, and Navab [13] and Kim et al. [14] proposed GPU versions of visual hull 3D reconstruction algorithms. Whereas Ladikos, Benhimane, and Navab [13] and Kim et al. [14] accelerate visual hull algorithms that use a voxel representation to explicitly reconstruct the 3D geometry, Waizenegger et al. [12] present a GPU version of image-based visual hull reconstruction that reduces the number of line segment intersection tests by a caching strategy. In this paper we also propose a GPU-accelerated image-based visual hull reconstruction algorithm. However, in contrast to [12], we focus on adapting the algorithm pipeline and its memory management such that they are optimized for the parallelized GPU architecture. Castano-Diez et al. evaluate the performance of image processing algorithms on the GPU [15]. They note that the speed-up which can be attained by mapping an algorithm to the GPU strongly depends on the characteristics of the algorithm itself: a high speed-up can be attained for algorithms which exhibit a high level of data parallelism (operations which are independent from one another) and which are characterized by a high number of sequentially and independently applied instructions. Whereas algorithms which fulfill these requirements can benefit from a high speed-up if they are mapped to the GPU, memory access can significantly slow down the calculations. This is due to the fact that the GPU processing pipeline is not optimized for fast access of large amounts of memory. In this work we present a 3D reconstruction framework which accounts for these requirements to attain a maximal speed-up of the reconstruction algorithm on the
GPU with CUDA. The algorithm pipeline and the memory management of our 3D reconstruction framework have been carefully adapted to the inherent parallelism of GPU architectures.
2 Concept

We follow a photo-realistic 3D reconstruction methodology from multiple images that exploits camera calibration data and foreground extraction information. We extract the silhouette of a foreground object from an array of multiple static cameras using background subtraction based on kernel density estimation. The IBVH algorithm computes a 3D volumetric reconstruction of a target object from its 2D projections captured in a sequence of images at different viewpoints (viewing rays). We consider a scene observed by n calibrated static cameras, and we focus on the state of one voxel at position X chosen among the positions of the 3D lattice used to discretize the scene (Figure 1).
Fig. 1. Basic principle of the IBVH mechanism: on the left, different viewing rays and line intersections reconstructing the 3D lattice; on the right, the resulting target object (untextured)
For the available silhouettes we assume that each silhouette is represented by a binary array and that the positions can be derived using the extrinsic and intrinsic calibration information of each camera. As a result of the IBVH we obtain a sequence of 3D intervals along each viewing ray. Those intervals contain the volumetric area of the target object. The proposed framework exploits the hardware features of recent consumer graphics cards and is deployed on a single graphics board. For our implementation we used an open-source CPU implementation of the IBVH (Image-Based Visual Hull) algorithm as proposed in [1]. In order to speed up the overall process chain, the algorithm pipeline and its memory management have been adapted to the inherently parallel GPU architecture. Our concept is based on the subsequent strategies.
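The interval arithmetic at the core of this step can be sketched as follows: each camera contributes a list of 1D parameter intervals along a viewing ray (from the intersection with that camera's silhouette cone), and the visual hull along the ray is the intersection of all per-camera lists. The C++ fragment below is an illustrative host-side version, assuming sorted, disjoint interval lists; in the framework itself these computations run inside CUDA threads.

    #include <algorithm>
    #include <vector>

    struct Interval { float tNear, tFar; };  // parametric range along a ray

    // Intersect two sorted, disjoint interval lists along one viewing ray.
    std::vector<Interval> intersect(const std::vector<Interval>& a,
                                    const std::vector<Interval>& b) {
        std::vector<Interval> out;
        std::size_t i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            float lo = std::max(a[i].tNear, b[j].tNear);
            float hi = std::min(a[i].tFar, b[j].tFar);
            if (lo < hi) out.push_back({lo, hi});
            // advance the interval that ends first
            if (a[i].tFar < b[j].tFar) ++i; else ++j;
        }
        return out;
    }

    // Visual hull intervals along one ray: intersection over all n cameras.
    std::vector<Interval> visualHullRay(
            const std::vector<std::vector<Interval>>& perCamera) {
        std::vector<Interval> result = perCamera.at(0);
        for (std::size_t c = 1; c < perCamera.size(); ++c)
            result = intersect(result, perCamera[c]);
        return result;
    }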
2.1 Optimizing Memory Management for the 2D and 3D Intervals

In order to fully exploit the parallelism of the CUDA architecture, the computation of the individual viewing rays is mapped to individual CUDA threads. As a consequence, we use as many CUDA threads as there are viewing rays. If the maximum number of intersection points for each viewing ray is defined as n_vr, the allocated memory block M for the computation of the intersection points is

    M = n_vr ⋅ I_w ⋅ I_h ⋅ n_Ip ⋅ P_Ip(x, y),   ∀(x, y) ∈ I_p,   n_Ip = |I_p|

where I_p is the set of 3D intersection points, P_Ip the size of an intersection element at pixel value (x, y), and I_w, I_h the image width and height, respectively. In order to implement the overall memory consumption efficiently, memory access is realized in local areas rather than within global alignment procedures. Here, several threads within a CUDA WARP (the smallest unit of threads (32) capable of being physically executed in parallel by the multiprocessor) can access sequentially ordered data blocks, i.e., neighboring threads access sequentially ordered data sets. Thus, the pattern of memory storage resembles a volumetric lattice with multiple "image layers". Within each "image layer" we store the intersection points of each corresponding viewing ray. The number of image layers corresponds to the maximum number of intersection points. The challenge we faced is that the way the image is sampled along its epipolar lines (see Figure 2) within each viewpoint is not optimal for a CUDA implementation.
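Read as an allocation recipe, the formula reserves one buffer slot per pixel (viewing ray) and per possible intersection. The CUDA fragment below is a simplified sketch of such an allocation; the element layout is an assumption, not the framework's actual data structure.

    #include <cuda_runtime.h>

    // Assumed layout of one stored intersection element along a viewing ray.
    struct IntersectionPoint { float t; int cameraId; };

    // Allocate maxIntersectionsPerRay "image layers" of imageW x imageH
    // elements, so that neighboring threads of a WARP touch neighboring
    // elements within one layer.
    IntersectionPoint* allocateIntervalBuffer(int imageW, int imageH,
                                              int maxIntersectionsPerRay) {
        IntersectionPoint* d_buf = nullptr;
        size_t bytes = static_cast<size_t>(maxIntersectionsPerRay) *
                       imageW * imageH * sizeof(IntersectionPoint);
        cudaMalloc(&d_buf, bytes);
        return d_buf;
    }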
Fig. 2. Projection of viewing ray onto reference image (containing silhouette), left; traversal (sampling) of reference image along epipolar line, right
Coalesced memory access is not possible here, as the threads within a WARP typically cannot access the resulting neighboring (and sequentially aligned) data sets. To reduce the negative impact on performance, we store the silhouette images as textures. The use of textures is optimized for 2D memory access, as the multiprocessor is capable of transferring 2D areas of an image into its texture cache. This strategy allows us to make use of the overall CUDA capability for fast texture cache access
that is as fast and efficient as classical register access (this is only possible if several threads within a CUDA block actually access the uploaded 2D area; otherwise it results in a cache mismatch). We achieve a further optimization by reducing the number of cache mismatches. This can be achieved by a "packaging" strategy for the silhouette images, which are stored in a binary array. Here, we condense the images into one image element (one byte for 8 images or two bytes for 16 images) that contains the information of multiple images (1 bit per image).

2.2 GPU-Adapted Silhouette-Line Intersection Algorithm

For further memory optimization we adapted the Wedge-Cache algorithm described by Matusik [1] by adjusting the memory storage patterns for parallel processing on the GPU. The idea of the Wedge-Cache algorithm is that the projections of the viewing rays onto the image plane form a 2D cone. As a consequence, the projections of different rays may coincide and therefore induce the same epipolar lines, which then have to be calculated only once, leading to a reduction of memory accesses. The algorithm uses indexed border pixels; therefore, it is possible to assign each epipolar line used for traversal to a specific wedge-cache index, which is calculated as the furthest intersection point of the epipolar line and the image border (see Fig. 3). Each traversal of an epipolar line is implemented as a single GPU thread; therefore, the number of threads is reduced to 2 × image width + 2 × image height. Furthermore, all intersection points of epipolar lines with the object (silhouette) border are stored at the corresponding position (wedge-cache index).
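The packaging strategy from Section 2.1 combines naturally with this traversal: with up to eight binary silhouettes condensed into one byte per pixel, a single fetch answers the inside/outside test for all eight cameras at once. The CUDA sketch below illustrates this bit layout; it reads from plain device memory for brevity, whereas the framework fetches the packed data through the texture cache, and the exact layout is an assumption.

    // Assumed packing: bit c of each byte is 1 iff the pixel is foreground
    // in the silhouette of camera c (c = 0..7).
    __device__ bool insideSilhouette(const unsigned char* packed,
                                     int imageW, int x, int y, int camera) {
        unsigned char bits = packed[y * imageW + x];
        return (bits >> camera) & 1u;
    }

    // Host-side packing of n <= 8 binary silhouette images of equal size.
    void packSilhouettes(const unsigned char* const* silhouettes, int n,
                         int imageW, int imageH, unsigned char* packedOut) {
        for (int p = 0; p < imageW * imageH; ++p) {
            unsigned char bits = 0;
            for (int c = 0; c < n; ++c)
                if (silhouettes[c][p]) bits |= static_cast<unsigned char>(1u << c);
            packedOut[p] = bits;
        }
    }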
Fig. 3. Projection of viewing rays onto the image plane and building the wedge cache indexes
2.3 Efficient CUDA Alignment

Within this step, we bundle the optimized calculation of silhouette-line intersections (bundles of viewing rays) into dedicated CUDA blocks. This again increases locally aligned memory access, reducing the number of cache mismatches and avoiding longer latencies. The projections of bundles of rays lead to several
epipolar lines on the image plane. Because these lines diverge from each other towards the image borders, their cache indices may not be consecutive, and due to CUDA's coalesced memory access the wedge cache can then only be accessed sequentially. To avoid this, we arrange the CUDA blocks in such a way that viewing rays in the x- and y-directions are calculated within one block of size 16 × 16 or 32 × 32 (depending on the GPU used). Furthermore, we configure the execution of the CUDA kernels in such a way that a CUDA block is capable of sampling multiple images along the epipolar line in parallel. Thus we reduce cache mismatches by increasing the probability of successful memory accesses to the same 2D areas by several threads within one CUDA block. Aligning the CUDA blocks according to the position of the epipole on the image plane leads to a further optimization of the CUDA memory handling: depending on this position, not all areas of the wedge-cache alignment have to be computed. We thereby achieve a higher degree of parallelism, which we exploit in the subsequent implementation; a possible launch configuration is sketched below.
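A launch configuration along these lines might look like the following CUDA fragment; the kernel name is a placeholder, and the per-thread body is omitted.

    #include <cuda_runtime.h>

    __global__ void traverseEpipolarLines() {
        // per-thread epipolar line traversal (omitted in this sketch)
    }

    void launchTraversal(int imageW, int imageH) {
        // Square blocks bundle viewing rays in x- and y-direction so that the
        // threads of one block touch nearby wedge-cache entries and 2D areas.
        dim3 block(16, 16);  // or 32 x 32, depending on the GPU
        dim3 grid((imageW + block.x - 1) / block.x,
                  (imageH + block.y - 1) / block.y);
        traverseEpipolarLines<<<grid, block>>>();
        cudaDeviceSynchronize();
    }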
3 Implementation

The GPU architecture is designed for massively parallel computations; to achieve maximal computational power, it is necessary to use as many stream processors as possible at the same time. Therefore, the original IBVH algorithm described by Matusik [1] has been adapted in such a way that all separately computable portions of the algorithm exploit the GPU's parallel capabilities. By executing multiple CUDA kernels in parallel, we fully utilize the GPU's capacity and reduce the overall execution time of the 3D reconstruction; by keeping several time-consuming memory access tasks within a local area rather than accessing global memory, we reduce the overall latency of data transfer. Especially the calculation of the 3D cones (as described in the previous sections) allows a high degree of parallelism and therefore saves a considerable amount of computational time, because all viewing rays can be calculated independently. Nevertheless, the IBVH algorithm is obviously not parallelizable as a whole, and the parallelization has to be implemented in several sections that are executed sequentially:

    computeMatrixAndEpipoles();
    initializeViewingRayBuffer();
    initializeWedgeCacheBuffer();
    compressSilhouettes();
    compute2DIntervals();
    for each referenceImage r in R:
        computeViewingRaysAndProject2DIntervalsBackIn3D();
        merge3DIntervals();

All methods of the pseudocode above are realized as separate CUDA kernels.
3.1 Hardware Setup

As usual for computer-vision-based algorithms that have to meet real-time requirements, it is important to have an appropriate hardware setup at hand. For our system we use The Imaging Source (http://www.theimagingsource.com/) DBK 21BF04 FireWire cameras with a VGA resolution of 640 × 480 pixels and 4 mm lenses. Using a Bayer filter, the cameras deliver up to 60 frames per second in 8-bit mode, which are converted to RGB images by de-mosaicing them under CUDA on the GPU. To ensure that all image tuples used for both calibration and reconstruction purposes are acquired at exactly the same time, the cameras are triggered by a 5 V TTL signal. We use a USB-PIO device by BMC Messsysteme GmbH (http://www.bmc-messsysteme.de/us/), which features three 8-bit bidirectional ports and therefore allows up to 127 cameras to be triggered simultaneously. All cameras are connected with BNC cables to a server PC, which is used to generate the trigger signals. Nevertheless, using multiple calibrated cameras for the reconstruction process leads to the typical bottleneck of vision-based systems: a huge amount of data has to be transmitted via several different buses. Both the FireWire and the PCI Express bus allow up to four cameras to be connected to a single PC. Therefore, we propose two different hardware setups depending on the number of cameras used: one for up to four cameras with a single PC, and one for five or more cameras using a network-based client-server architecture. The first system is able to build the visual hull from up to four synchronous cameras connected to a single PC, which is used both for triggering the cameras and for all image processing and computer vision tasks (see Fig. 4, left).
Fig. 4. Hardware system setup: Up to four cameras connected to one single PC (left) and scalable setup for n cameras using a client/server architecture (right)
Obviously, with only four cameras the reconstruction is limited to a restricted volume, due to the fact that the target object has to be fully visible in all camera images at all times. To overcome this limitation, more cameras surveying the interaction volume are used, which are connected to several PCs in a network. While two to four cameras each are connected to one client, one PC is used as the server, triggering the cameras, performing the visual hull reconstruction on the GPU, and rendering the output images. All other PCs are used as clients, preprocessing the camera images by de-mosaicing the incoming images and performing the foreground extraction. Because after silhouette extraction not the full
images, but only run-length-encoded images of the silhouettes, are transmitted to the server PC, the system is able to perform at more than 40 frames per second using, e.g., eight synchronous cameras. A sketch of such an encoding is given below.
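The run-length encoding of a binary silhouette row might look like the following C++ sketch; the concrete wire format used between the clients and the server is an assumption.

    #include <cstdint>
    #include <vector>

    // Encode one binary silhouette row as alternating run lengths, starting
    // with a background run (possibly of length 0). Format assumed.
    std::vector<std::uint16_t> encodeRow(const std::uint8_t* row, int width) {
        std::vector<std::uint16_t> runs;
        std::uint8_t current = 0;  // background first
        std::uint16_t length = 0;
        for (int x = 0; x < width; ++x) {
            std::uint8_t v = row[x] ? 1 : 0;
            if (v == current) {
                ++length;
            } else {
                runs.push_back(length);
                current = v;
                length = 1;
            }
        }
        runs.push_back(length);
        return runs;
    }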
4 Evaluation

We tested the CUDA-based implementation of the algorithm described in the previous sections on different setups. Since only off-the-shelf hardware is used for the system setup, the GPU tests were performed with NVIDIA GeForce consumer graphics cards only. The evaluation parameters are the overall processing time in milliseconds for running through the algorithm from the first to the last step (including visualization and data transfer to and from the GPU) and the memory required on the graphics board. The following table shows the results for five different graphics cards in setups with four, eight, and sixteen cameras at three different image resolutions.

Table 1. Framework benchmarks on different GPUs at different resolutions
GPU                 16 Cameras   8 Cameras   4 Cameras   Resolution    Memory Usage (16 cameras)
GeForce GTX 470     4.1 ms       2.7 ms      2.0 ms      320 x 240     429 Mb
GeForce GTX 460     6.9 ms       3.8 ms      2.4 ms      320 x 240
GeForce GTX 260     6.4 ms       3.7 ms      2.4 ms      320 x 240
GeForce GTX 9800+   10.4 ms      6.0 ms      3.8 ms      320 x 240
GeForce GTX 8800    12.1 ms      7.0 ms      4.6 ms      320 x 240
GeForce GTX 470     12.7 ms      7.5 ms      5.0 ms      640 x 480     651 Mb
GeForce GTX 460     16.9 ms      9.1 ms      6.4 ms      640 x 480
GeForce GTX 260     21.0 ms      12.0 ms     7.4 ms      640 x 480
GeForce GTX 9800+   38.7 ms      21.6 ms     13.6 ms     640 x 480
GeForce GTX 8800    45.8 ms      26.6 ms     17.3 ms     640 x 480
GeForce GTX 470     48.3 ms      28.5 ms     18.5 ms     1280 x 960    980 Mb
GeForce GTX 460     64.0 ms      36.7 ms     23.4 ms     1280 x 960
GeForce GTX 260     76.0 ms      43.7 ms     26.4 ms     1280 x 960
GeForce GTX 9800+   128 ms       78.3 ms     48.9 ms     1280 x 960
GeForce GTX 8800    170 ms       98.0 ms     64.0 ms     1280 x 960
5 Conclusion

In this paper we presented a new framework for the accelerated 3D reconstruction of deformable objects within a multi-view setup. It is based on a new memory management scheme and an enhanced algorithm pipeline that enables efficient and fast reconstruction and opens up new perspectives for the scalability of time-consuming computations within larger camera environments. As a result, a significant increase in frame rates for the volumetric reconstruction of deformable objects can be
achieved. We restricted several time-consuming memory access tasks to local areas rather than accessing global memory, reducing the overall latency of data transfer. To this end, we adapted the proposed Wedge-Cache algorithm to the parallel nature of the GPU, leading to a reduction of inefficient memory accesses. As the projection of several viewing rays onto a 2D plane is computationally expensive and results in a 2D cone, the use of the Wedge-Cache algorithm avoids multiple calculations, since corresponding projections of multiple viewing rays onto the image plane do not have to be recalculated. Furthermore, we bundle the optimized calculation of silhouette-line intersections into dedicated CUDA blocks. This again increases locally aligned memory access, reducing the number of cache mismatches and avoiding longer latencies. Finally, the proposed strategies lead to a higher degree of parallelism, which we can exploit, as the framework has been designed to execute several kernels in parallel (which is only possible on the new generation of graphics boards, e.g., NVIDIA Fermi GPUs).
References

1. Matusik, W., Buehler, C., Raskar, R., Gortler, S.J., McMillan, L.: Image-Based Visual Hulls. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, pp. 369–374. ACM Press/Addison-Wesley Publishing Co., New York (2000)
2. Laurentini, A.: The Visual Hull Concept for Silhouette-Based Image Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(2), 150–162 (1994)
3. Seitz, S.M., Curless, B., Diebel, J., Scharstein, D., Szeliski, R.: A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006, pp. 519–526 (2006)
4. Hornung, A., Kobbelt, L.: Robust and Efficient Photo-Consistency Estimation for Volumetric 3D Reconstruction. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 179–190. Springer, Heidelberg (2006)
5. Furukawa, Y., Ponce, J.: Accurate, Dense, and Robust Multiview Stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1362–1376 (2010)
6. Corazza, S., Mündermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.: Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation. International Journal of Computer Vision 87, 156–169 (2010)
7. Graf, H., Yoon, S.M., Malerczyk, C.: Real-time 3D Reconstruction and Pose Estimation for Human Motion Analysis. In: Proceedings of the 2010 17th IEEE International Conference on Image Processing, ICIP 2010, pp. 3981–3984 (2010)
8. Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A., Purcell, T.J.: A Survey of General-Purpose Computation on Graphics Hardware. Computer Graphics Forum 26(1), 80–113 (2007)
9. Fung, J., Mann, S.: Using Graphics Devices in Reverse: GPU based Image Processing and Computer Vision. In: Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, ICME 2008, pp. 9–12 (2008)
10. Rossi, R., Savatier, X., Ertaud, J.-Y., Mazari, B.: Real-time 3D Reconstruction for Mobile Robot Using Catadioptric Cameras. In: Proceedings of the 2009 IEEE International Workshop on Robotic and Sensors Environments, ROSE 2009, pp. 104–109 (2009)
11. Weinlich, A., Keck, B., Scherl, H., Kowarschik, M., Hornegger, J.: Comparison of High-Speed Ray Casting on GPU using CUDA and OpenGL. In: Proceedings of the First International Workshop on New Frontiers in High-performance and Hardware-aware Computing, pp. 25–30 (2008)
12. Waizenegger, W., Feldmann, I., Eisert, P., Kauff, P.: Parallel High Resolution Real-time Visual Hull on GPU. In: Proceedings of the 2009 16th IEEE International Conference on Image Processing, ICIP 2009, pp. 4301–4304 (2009)
13. Ladikos, A., Benhimane, S., Navab, N.: Efficient Visual Hull Computation for Real-time 3D Reconstruction using CUDA. In: Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2008, pp. 1–8 (2008)
14. Kim, H., Sakamoto, R., Kitahara, I., Orman, N., Toriyama, T., Kogure, K.: Compensated Visual Hull for Defective Segmentation and Occlusion. In: Proceedings of the 17th International Conference on Artificial Reality and Telexistence, ICAT 2007, pp. 210–217 (2007)
15. Castano-Diez, D., Moser, D., Schoenegger, A., Pruggnaller, S., Frangakis, A.S.: Performance Evaluation of Image Processing Algorithms on the GPU. Journal of Structural Biology 164, 153–160 (2008)
Second Life as a Platform for Creating Intelligent Virtual Agents

Larry F. Hodges1, Amy Ulinski2, Toni Bloodworth1, Austen Hayes1, John Mark Smotherman1, and Brandon Kerr3

1 Clemson University, School of Computing, Clemson, SC 29634-0974
2 University of Wyoming, Department of Computer Science, Laramie, WY, 82071
3 University of North Carolina at Charlotte, Department of Computer Science, Charlotte, NC 28223
{lfh,tbloodw,ahayes,jsomthe}@clemson.edu, [email protected], [email protected]
Abstract. Intelligent virtual agents (IVAs) are animated characters that interact with humans and with each other using natural modalities such as speech and gestures. Creating successful IVA applications is challenging, since development requires the integration of a broad range of specialized tools and skills. We present a discussion of the advantages and technical challenges of using the Second Life programming environment as an accessible platform for undergraduates to develop an intelligent virtual agent and the environment that the IVA inhabits. Keywords: Intelligent Virtual Agents, Virtual Worlds, Animation, Avatars, Virtual Characters.
1 Introduction and Background

Intelligent virtual agents (IVAs) are animated characters that interact with humans and with each other using natural modalities such as speech and gestures. Recent applications of IVAs include teaching communication skills [1], training nursing students in interview techniques [2], training healthcare workers [3], cultural protocol training [4], children's museum docents [5], and conducting police photo lineups [6]. Creating successful IVA applications is challenging, since development requires the integration of a broad range of specialized tools and skills. Typical tasks include modeling (both characters and environment), animation, voice recognition, conversational modeling, text-to-speech with lip syncing, perception of events in both the real and the virtual world, and awareness of social conventions. Creating the tool infrastructure to teach just one of these skills is non-trivial and costly. As a result, involving university students in IVA projects, courses, and research is significantly hampered by the lack of accessible and usable tools. Commercial products that allow the creation and animation of animated characters, such as Autodesk Maya [7] or DI-Guy Human Simulation Software [8], are costly and have steep learning curves, particularly for students with no previous experience. Free tools such as Unity3D [9]
or Blender [10] provide accessible tools for environment building but not for custom character creation and animation. An alternative approach to creating an IVA is to modify components in an existing virtual world. The term virtual world can be used to describe any electronic environment that visually depicts or mimics complex physical spaces and in which people can interact with each other and with virtual objects while being represented by an animated character (an avatar) [11]. Currently, there are two main types of online virtual worlds: creativity-oriented environments such as Linden Lab's Second Life (SL) [12] and Active Worlds [13], and large multiplayer online role-playing games such as World of Warcraft (WoW) [14]. Active Worlds and SL are environments where users are visually represented by avatars and have the ability to interact with each other via chat or voice input [15]. Users in SL have the ability to build objects and environments, edit the appearance of their avatars, and interact socially with other users. SL is being used as a programming environment by professors and students to create distance learning classrooms, where students have the ability to meet with professors and other students in a virtual classroom. In Second Life there are virtual replicas of entire cities, universities, and companies [16]. WoW is an environment where players create characters with distinctive looks and characteristics ranging from clothes and jewelry to strength, agility, and healing powers. In this environment, players work together to complete tasks to advance through levels of play in an environment that is full of different landscapes, objects, animals, and fantasy elements [17]. WoW and SL both use avatars to represent users and have large 3-D environments, complex internal economies, voice communication, and simple artificial intelligence. One of the main differences between these two environments is the way they are used: WoW focuses on the gaming and fantasy aspect of a virtual world, while users in SL focus on the social interaction and building aspects of a virtual world. SL lends itself well to the rapid development and prototyping of virtual worlds. In this paper we present an experiment in which a group of undergraduate students with no previous experience in creating IVAs were challenged to design, model, and implement a virtual world complete with an interactive character that could communicate with a user via voice recognition, gestures, and text-to-speech – all in just eight weeks. Besides the ambitious schedule, the unique aspect of this project was to create the application using tools from the SL programming environment. The underlying assumption for how users use the SL environment is that the virtual character is an avatar that represents, and is under the control of, a particular user in the virtual world. Our challenge was to invert this concept and convert an SL avatar from a representation of a user in a virtual world into an intelligent virtual agent that could directly interact with a user. We describe the resulting product and summarize the advantages and challenges of this approach.
2 Project Description and Goals The Project Team consisted of eight undergraduate students from four different universities who were on the Clemson campus to participate in a summer research program in Human-Centered Computing. All were majoring in computer science or a
related discipline. None of the students had previous experience with Second Life, voice recognition, or IVAs. Two faculty members and one Ph.D. student served as mentors to advise the team. The assigned goal for the group was to design a virtual human that could interact naturally with a user via voice recognition, speech, and animated gestures. The context for the interaction was an interactive presentation of the Clemson School of Computing. The projected users of the system were visiting high school seniors, potential computer science majors, and their parents. The project had to be completed in eight weeks.

During the first week of the project, the students used online tutorials and books to learn basic SL skills: how to control and customize their avatars, how to model objects, and how to program objects using the Linden Scripting Language (LSL). Students were also introduced to the basics of voice recognition and text-to-speech software. In the second week we added daily meetings to discuss the content and design of our IVA application. Since we were all (including the mentors) new users of SL, we decided to first build an initial prototype to explore and learn more about the available tools before designing and implementing the final product.
Fig. 1. CARA sitting at her desk
2.1 Initial Prototype

The initial prototype was named CARA (Clemson Autonomous Recruiting Agent). CARA was presented on a large screen display and used voice recognition and recorded audio to communicate with a user. The goal was to implement an initial agent that allowed us to test all the capabilities needed for the final project: environmental modeling, avatar creation, animation, voice recognition, agent speech with lip syncing, capabilities of the SL scripting language, embedding multimedia content, and methods for overall program control. The result, which took two weeks to build, was a teenage girl sitting at a desk in an office. CARA greeted the user and asked the user's name. She then stood up and asked the user if they would like to see a short video about one of the current research projects in the School of Computing.
The video played on a large virtual TV screen in the office. At the end of the video, CARA told the user goodbye and the demo concluded. A simple state machine controlled the flow of the system. Figure 1 shows a screenshot of CARA at her desk.

2.2 Final Design

Once we had the prototype working we began the design of the final product. In addition to identifying several technical problems that required workarounds while creating CARA, we also realized that SL offered rich tools for fast modeling and a large content space that we had not taken advantage of in the prototype. Instead of an office space, we designed an entire island world that included a grizzled older guide (CLEM) who takes the user on a tour providing an overview of one of the three divisions in the Clemson School of Computing: Computer Science, Human-Centered Computing, or Visual Computing. CLEM carries on a short conversation with the user to determine the preferred tour, and then they set off in his vehicle through a jungle environment complete with rivers, foliage, and creatures. Along the way CLEM relates tales of Clemson history and traditions. Eventually the user arrives at a clearing where CLEM uses posters and videos to explain current research and education projects for the chosen division. CLEM and his world were presented to the user on a large screen display, and communication with CLEM was done using voice recognition and text-to-speech.
Fig. 2. CLEM in his vehicle at the Human-Centered Computing exhibit
3 Technical Discussion

3.1 Avatar Creation and Animation

Second Life accounts provide users with a free, customizable avatar that can be easily modified to many different shapes and sizes. Clothing, skin textures, hair styles and
other avatar accessories can be found either free or at low cost. The pre-made, customizable avatar bodies and wide range of available textures meant that our virtual agents could be made quickly without a large time or monetary investment.

Avatar animation can be controlled by the user via a keyboard and mouse or by prerecorded animations triggered via keyboard commands. Animations such as walking and basic gestures are built into the avatar. Controlling a walking movement directly with the keyboard movement keys, however, is nondeterministic: the exact same sequence of keyboard commands may produce a different final location and orientation of the avatar. The consequence is that we could not reliably animate an avatar's movements by capturing and resending a sequence of keyboard commands. Animation files created in a tool such as Blender or Qavimator can be uploaded to SL to allow the user to customize the walking animation or perform other predetermined animations the avatar skeleton is capable of. There is a severe limitation, however, on prerecorded animations: they can be no longer than 60 seconds, and the associated file can be no larger than 60 KB. Once an animation finishes, the avatar snaps back to the position and configuration it was in before the animation was triggered. The end result is that both CARA and CLEM had to be restricted to a fixed position or to very short animated movements that always returned to an initial position. Another issue is that head and eye animations cannot be controlled directly through a script. Instead, both the head and eyes follow the user's mouse pointer, so the avatar is always looking toward whatever spot on the display screen the mouse pointer was last left.

3.2 Camera Control and Object Animation

Although scripted avatar animations are not supported in SL, scripted movement of objects (anything in the world that is not an avatar) is. One motivation for designing CLEM around the theme of a jungle tour guide was the idea that the guide could lead the tour while sitting in a vehicle, and we could write a script to animate the vehicle. To make this work we needed camera control functions so that the view of the scene followed the movement of the vehicle, and we had to create an animation script for the vehicle.

Camera Control. The standard camera view of a scene in SL is either a third-person view from directly behind and above a user's avatar or a first-person view from the eye-point of the avatar. For both CARA and CLEM we wanted to control the camera so as to present the world from the viewpoint of the user instead of an avatar-controlled view. To help create this effect we needed fluid camera movements that allowed the user to virtually look around or move around a space. While there are scripted functions in the built-in scripting language (LSL) for instantaneous camera position changes, there are no built-in interpolation functions to provide fluid camera movement from one point of interest to the next. We had to create our own interpolation functions from the basic commands provided by LSL. For our camera algorithm to work we needed to change both the position and the rotation of the camera. The incomplete implementation of the camera API in LSL made scripting difficult. For example, while there is a get camera rotation function,
there is no set camera rotation function. We used the built-in functions for the positional movement of the camera and used a "camera focus offset" to work around the missing camera "set rotation" function. Since no interpolation functions existed, we wrote an algorithm that took the starting point and target point of the camera and divided the movement into a set number of "frames", each frame being an incremental step toward the target position and rotation (a sketch of this frame-stepping is given below). Because the movement function calls snap the camera into its new position, the motion could not be smoothed enough for the eye to stop distinguishing individual frames, resulting in usable but slightly jerky camera movements.

Vehicle Animation. To overcome the limitations on scripted avatar movements for CLEM, we built our story line so that the tour guide rode in a vehicle. SL allows modeled objects to be scripted for movement; however, like the camera movement functions, scripted object movement in SL is limited, with no interpolation functions for smooth, continuous object movement. This meant that the vehicle's movement appeared jerky. Further, positional and rotational changes were separate function calls, meaning there was no way to fluidly turn corners as a real vehicle would. Instead, the movement script would incrementally change the position and rotation of the vehicle separately, both calls being distinguishable to the observer.

3.3 Program Control, Voice Recognition and Lip-Synching

Program control for both CARA and CLEM was handled by simple finite state machines. CARA carried on a limited conversation with the user and then showed them a video about a current Clemson School of Computing research project. The state machine used three main utilities behind the scenes: a timer, a listener, and a current state. Each state controlled its context-specific actions, normally followed by a prompt to the user and an initiation of the listener. From there, the listener would take over, listening for specific recognized text (again, depending on the current state). While the listener waits for a given input, a timer runs concurrently in order to control what happens when CARA does not receive input (when it is assumed that the user did not say anything). CLEM was developed with a similar logic structure to CARA's but was more complex, since CLEM gave the user choices and the number of interactions increased.

LSL scripts may be used to start animations, play sounds, and do various other things inside SL. For our CARA prototype we used LSL for our entire logic script, maintaining a state machine for all control logic inside the SL script. To solve some of the architectural issues we encountered in the implementation of CARA, we moved the logic script for CLEM outside of SL and into a C# script. LSL scripts are, however, the only method to enact changes inside the virtual world. The SL viewer program (used to connect a computer to the SL environment) limited us to using a keyboard emulator to send commands to LSL scripts through hidden text chat channels. SL has a main chat channel and any number of other channels that one can specify to chat on. Every scripted object we built received commands on a particular channel; for instance, the doors to a room would swing open upon hearing the word "open" on the channel they were listening on.
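The frame-stepping workaround described in Sect. 3.2 for both the camera and the vehicle reduces to a short linear-interpolation loop. The following C# sketch illustrates the logic only; the production code was written in LSL, and the names here (MoveInFrames, applyPose) are illustrative stand-ins for the corresponding SL calls:

```csharp
using System;
using System.Numerics;

class FrameStepper
{
    // Divide a camera (or vehicle) move into a fixed number of frames.
    // Each frame snaps to an intermediate pose, so the motion is usable
    // but slightly jerky, as observed in SL.
    public static void MoveInFrames(Vector3 startPos, Vector3 targetPos,
                                    float startYawDeg, float targetYawDeg,
                                    int frames, Action<Vector3, float> applyPose)
    {
        for (int i = 1; i <= frames; i++)
        {
            float t = (float)i / frames;                        // interpolation parameter, 0..1
            Vector3 pos = Vector3.Lerp(startPos, targetPos, t); // incremental positional step
            float yaw = startYawDeg + (targetYawDeg - startYawDeg) * t; // separate rotation step
            applyPose(pos, yaw); // in SL, position and rotation are separate calls
        }
    }
}
```

Because position and rotation are applied as separate per-frame calls, the same snapping described for the vehicle is visible here as well.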
Fig. 3. Basic state machine for CLEM: the demo begins; the system listens for input and begins a tour; the tour proceeds through Exhibits 1, 2 and 3; the user then chooses another tour or ends the demo.
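A minimal C# sketch of this control structure, combining the current state, the keyword listener and the input timeout described above. All names (DemoState, OnKeyword, the channel number 42) are our own illustrations, not the project's actual code; in the real system, commands were typed into a hidden SL chat channel by a keyboard emulator:

```csharp
using System;
using System.Collections.Generic;

enum DemoState { Greeting, ChoosingTour, Exhibit, Closing }

class TourController
{
    DemoState state = DemoState.ChoosingTour;

    // Keywords the recognizer listens for in each state (illustrative).
    readonly Dictionary<DemoState, string[]> keywords = new()
    {
        [DemoState.ChoosingTour] = new[] { "computer science", "human-centered", "visual" },
        [DemoState.Exhibit]      = new[] { "another", "done" },
    };

    // Called by the speech recognizer when it detects a keyword.
    public void OnKeyword(string word)
    {
        if (!keywords.TryGetValue(state, out var valid) || Array.IndexOf(valid, word) < 0)
            return; // ignore words not expected in the current state

        if (state == DemoState.ChoosingTour)
        {
            SendToChannel(42, "start_tour:" + word);
            state = DemoState.Exhibit;
        }
        else if (state == DemoState.Exhibit && word == "done")
        {
            SendToChannel(42, "end_demo");
            state = DemoState.Closing;
        }
    }

    // Called by a concurrent timer when no input arrives in time.
    public void OnTimeout() => SendToChannel(42, "reprompt");

    void SendToChannel(int channel, string command)
    {
        // Stand-in for the keyboard emulator that typed commands into the
        // SL viewer's chat on a hidden channel, where LSL listeners fired.
        Console.WriteLine($"/{channel} {command}");
    }
}
```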
For CARA, timing the sequence of events was straightforward except for animations and audio. Animations and sounds can be triggered, but the calls do not block and provide no way of signaling that they have finished. We had to time them ourselves and build wait times into our scripts. Audio files and animations also had to load from the SL server into the SL viewer, which took time and changed the wait time needed by the script; the timing was therefore always uncertain, depending on the speed of and traffic on the network connection. There is a flag that can be set to trigger a preload of an audio file before it is needed, but either the flag simply does not work or the amount of audio we wanted to preload was too much for the SL viewer's cache. For CLEM, we not only had to build wait times into the C# script for the sounds and animations, but there was an additional lag between the input of commands into the chat channel and the firing of the listeners on our scripted objects. Because our state machine was outside SL, this lag was a major difficulty in timing our scripts for animation and movies. CLEM's voice, however, was generated by the C# script, so there was no lag on his voice.

SL does not provide a voice recognition capability. For CARA we used Dragon NaturallySpeaking, and for CLEM we switched to Microsoft's SAPI. In both cases we implemented a keyword-based structure that listened only for certain words and sent commands to SL based on which keywords were recognized. The CARA system sent only what it heard, maintaining no state, while the CLEM implementation could send more detailed information because both the state machine and the voice recognition were outside of SL.

At the time of our project, SL used a proprietary tool, VIVOX, for both voice chat and lip-synching. The system can be set up to lip-synch audio coming through an SL voice chat channel. For CARA, we prerecorded voice audio files and loaded them into SL; our script would trigger the audio, which was then fed through a microphone on that computer into voice chat to trigger the avatar to lip-synch. CLEM used text-to-speech instead of prerecorded audio files, via a C# SAPI script, which was also fed through voice chat for lip-synching. We did this in both cases by cabling our audio speakers directly into a microphone.
Using this setup, we had to turn lip-synching off with our script every time the avatar finished speaking so that the avatar would not lip-synch every audio source in the virtual world.

3.4 Modeling and Streaming Media

The SL environment provides free and straightforward 3D modeling tools for building content for the virtual world that an agent inhabits. In addition, many SL users build content that they give away to other users for free or sell for a modest price. As a result we used free assets to build a very rich environment for our CLEM world, including trees, plants, birds, animals, and textures. Screenshots are shown in Figure 4.
Fig. 4. Screenshots of the CLEM environment
Since we had video content to illustrate many of the projects that we wanted to feature on our tour, we planned to embed this content into the virtual world that we had created. Second Life provides many useful controls for working with streaming media, including the capability to play a video, provided it is in the correct format, from any web address onto any object of any size or shape in the virtual world. However, making this work in the context of a demo with numerous collaborators, timing constraints, dynamic content, and multiple media locations was very complicated.

Collaboration was an issue because of the way SL handles permissions for who can modify the script controlling the media player. At present SL requires that the owner of the script also be an owner of that particular parcel of virtual real estate. The rest of the group could not have permission to edit the script controlling the media player, and SL would prevent even the owner from making modifications once the object had been placed in the environment. If everyone became an owner it would also present a security issue, as there are many things an owner can do that would jeopardize the integrity of the entire virtual world (especially if that person is a novice SL programmer).

Timing was an issue because each media-playing object (in this case, a model of a rectangular projector screen) took a variable amount of time to load the content to be played. If the screens were not triggered early enough, the user would see the screens loading and possibly be subjected to a long wait before the content was actually viewable. Having the screens play dynamic content (for the CARA demo, the screen script selected a random media URL to play) was also a timing issue. Since the objects displaying media in SL are not aware of when their media is done playing,
event timings had to be recorded and maintained in their respective order. This allowed the script to keep track of when the media was finished and, subsequently, when to proceed with the rest of the demo. Having multiple locations playing media, such as in the CLEM demo, was an issue because of the global way SL controls media playing for the virtual island: the commands one screen gives to play its own media also start media on any screen in the same virtual space. Parceling the virtual real estate into separate areas, each with a separate screen, proved ineffective. The user therefore had to be kept out of sight of the remaining screens while viewing the content of a particular exhibit.
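Since a screen cannot report that its media has finished, the bookkeeping above amounts to replaying a pre-recorded timeline of (screen, duration) entries. A small C# sketch of that idea, with hypothetical screen names and durations:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class MediaScheduler
{
    // (screen, secondsOfContent) recorded ahead of time, in playback order,
    // because SL media objects cannot signal completion themselves.
    readonly Queue<(string Screen, int Seconds)> timeline = new();

    public void Add(string screen, int seconds) => timeline.Enqueue((screen, seconds));

    public void Run(Action<string> triggerScreen)
    {
        while (timeline.Count > 0)
        {
            var (screen, seconds) = timeline.Dequeue();
            triggerScreen(screen);                        // start media early so it can load
            Thread.Sleep(TimeSpan.FromSeconds(seconds));  // wait out the known duration
        }
    }
}
```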
4 Summary

The primary advantages of using SL as a development platform were the tool sets for modeling the environment and modifying the appearance of avatars. In addition, many objects, such as trees, birds and water simulations, were available for free and did not have to be modeled in-house. Second Life also provided automatic lip syncing, simple animation tools and a scripting language that allowed us to program object behaviors such as the opening of the gates to enter the tour and the movement of the vehicle. The lack of support for databases, text-to-speech, and voice recognition in the SL programming environment meant that we had to move these tasks outside of SL into a C# script and use SL chat channels to transfer data between SL and our C# script. The most surprising problem was SL's limited capacity for scripted animations: the scripting language can be used to program the movements of objects other than avatars, but the tools provided are inadequate for creating smooth animations. Despite these limitations, the final product was completed on time and has been shown to potential students at regular Friday demo tours in the School of Computing for a year. Although the animations were coarse, the visual models were rich and inviting. To quote one visitor to the lab: "We have been on tours and seen project demos at some of the best universities on the East Coast. Nowhere have we seen a system built by undergraduates that was this impressive!"

Acknowledgments. This research was supported, in part, by NSF Research Experiences for Undergraduates (REU) Site Grant CNS-0850695. Janelle Arita, Freddie Dunn III, Hugh Kinsey III, and Nichole Wilson helped with the programming and modeling of this project.
References

1. Stevens, A., Hernandez, J., Johnsen, K., Dickerson, R., Raij, A., Harrison, C., DiPietro, M., Allen, B., Ferdig, R., Foti, S., Jackson, J., Shin, M., Cendan, J., Watson, R., Duerson, M., Lok, B., Cohen, M., Wagner, P., Lind, D.: The Use of Virtual Patients to Teach Medical Students Communication Skills. The American Journal of Surgery 191(6), 806–811 (2005)
2. Ziemkiewicz, C., Ulinski, A., Zanbaka, C., Hardin, S., Hodges, L.F.: Interactive Digital Patient for Triage Nurse Training. In: Proceedings of HCI International: Virtual Reality (2005)
3. Bertrand, J., Babu, S., Polgreen, P., Segre, A.: Virtual agents based simulation for training healthcare workers in hand hygiene procedures. In: Safonova, A. (ed.) IVA 2010. LNCS, vol. 6356, pp. 125–131. Springer, Heidelberg (2010)
4. Babu, S., Suma, E., Barnes, T., Hodges, L.F.: Can Immersive Virtual Humans Teach Social Conversational Protocols? In: Proceedings of the IEEE International Conference on Virtual Reality, pp. 215–218 (2007)
5. Swartout, W., Traum, D., Artstein, R., Noren, D., Debevec, P., Bronnenkant, K., Williams, J., Leuski, A., Narayanan, S., Piepol, D., Lane, C., Morie, J., Aggarwal, P., Liewer, M., Chiang, J.-Y., Gerten, J., Chu, S., White, K.: Ada and Grace: Toward Realistic and Engaging Virtual Museum Guides. In: Allbeck, J., Badler, N., Bickmore, T., Pelachaud, C., Safonova, A. (eds.) IVA 2010. LNCS (LNAI), vol. 6356, pp. 286–300. Springer, Heidelberg (2010)
6. Cutler, B.F., Daugherty, B., Babu, S., Hodges, L.F., Van Wallendael, L.: Creating Blind Photoarrays Using Virtual Human Technology: A Feasibility Test. Police Quarterly 12, 289–300 (2009)
7. Maya, http://usa.autodesk.com/adsk/servlet/pc/index?id=13577897&siteID=123112
8. DI-Guy, http://www.diguy.com/diguy/
9. Unity3D, http://unity3d.com/
10. Blender, http://www.blender.org/
11. Bainbridge, W.: The Scientific Research Potential of Virtual Worlds. Science 317(5837), 472–476 (2007)
12. Second Life, http://www.secondlife.com/
13. Active Worlds, http://activeworlds.com/
14. World of Warcraft, http://us.battle.net/wow
15. Dickey, M.D.: Three-Dimensional Virtual Worlds and Distance Learning: Two Case Studies of Active Worlds as a Medium for Distance Education. British Journal of Educational Technology 36(3), 439–451 (2005)
16. Boulos, M.N.K., Hetherington, L., Wheeler, S.: Second Life: an overview of the potential of 3-D virtual worlds in medical and health education. Health Information and Libraries Journal 24, 233–245 (2007)
17. Nardi, B., Harris, J.: Strangers and Friends: Collaborative Play in World of Warcraft. In: Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work (2006)
A Framework for Automatic Simulated Accessibility Assessment in Virtual Environments

Nikolaos Kaklanis 1,2, Panagiotis Moschonas 1, Konstantinos Moustakas 1, and Dimitrios Tzovaras 1

1 Informatics and Telematics Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
2 Department of Computing, University of Surrey, Guildford, United Kingdom

{nkak,moschona,moustak,Dimitrios.Tzovaras}@iti.gr
Abstract. The present paper introduces a framework that enforces the accessibility of products and services by enabling automatic simulated accessibility assessment at all stages of development. The proposed framework is based on a new virtual user modelling technique describing in detail all the physical parameters of a user with disability(ies). The proposed user modelling methodology generates a dynamic and parameterizable virtual user model that is used by a simulation framework to assess the accessibility of virtual prototypes. Experimental results illustrate the use of the proposed framework in a realistic application scenario.

Keywords: accessibility evaluation; user modelling; task modelling; simulation; UsiXML; virtual user.
1 Introduction

The population is ageing: according to the World Health Organisation, the worldwide population of people aged 60 years and over will have increased from 580 million in 1998 to 1,000 million in 2020. As global populations grow older, together with an increase in people with disabilities, the moral and financial need for products and services "designed for all" becomes obvious. The lack of accessible products can cause large productivity losses, with many people being unable to fully participate at work, in education, or in a wide range of economic and social activities. People's choice of leisure activities may be narrower than it otherwise could be. User studies involving users with disabilities often incur greater financial cost and complexity than those involving general populations. Consequently, accessibility issues may not be identified during the earlier phases of product/service design, when designs are still malleable. Additionally, it can be difficult to create controlled studies with multiple groups of very similar subjects due to the extremely heterogeneous nature of the impact of many motor and visual disabilities.

Accessible systems, that is, systems which can be used by users who are disabled as well as by the average user, are challenging to build. Like usability, accessibility is
a goal that requires iteration, customer-centered design, and significant expertise to be done right. Unfortunately, user testing with special populations often requires greater effort, time, and monetary commitment on the part of designers, developers, and their companies than user testing with the general population. Additionally, it may be difficult to find homogeneous groups of participants. Effective evaluation of design concepts at an early design stage can help to achieve shorter lead times and reduced costs. With the advent of inexpensive high-speed computing, it has become feasible to verify a design based on a virtual prototype, using modelling and simulation technology. Even if there have been some limited and isolated attempts [10] to support accessibility testing of novel products and applications, there is a clear lack of a holistic framework that comprehensively supports virtual user modelling, simulation and testing at all development stages.

The present paper presents a framework that performs automatic simulated accessibility evaluation based on a new user modelling technique. The great importance of such a framework lies in the fact that it enables the automatic accessibility evaluation of any environment for any user by testing its equivalent virtual environment with the corresponding virtual user. Moreover, the main innovation of the proposed framework is that the whole framework is based on a new virtual user modelling technique covering physical, cognitive and behavioral/psychological aspects and parameters of the user with disability. The proposed technique is not constrained to static modelling of the virtual user but generates a dynamic and parameterizable user model, including interactions, that can be used directly in simulation frameworks, adaptive interfaces and optimal design evaluation frameworks.
2 Related Work

Virtual users with disabilities fulfill the role of standardized patients by simulating a particular clinical presentation with a high degree of consistency and realism, and they offer a promising alternative [14]. Different user model representations have been proposed in the literature using different syntaxes and implementations, varying from flat file structures and relational databases to full-fledged RDF with bindings in XML. OntobUM [21], the first ontology-based user modelling architecture, was introduced by Razmerita et al. in 2003. A similar but far more extensive approach to ontology-based representation of user models was presented in [10]. GUMO, probably the most comprehensive publicly available user modelling ontology to date, is proposed in [9]. XML-based languages for user modelling have also been proposed. UserML was introduced in [8] as a user model exchange language; it is based on an ontology that defines the semantics of the XML vocabulary (UserOL).

The use of "personas", which are empirically based abstract descriptions of people, has gained popularity and has been developed into a well-documented design practice. Pruitt et al. [20] claim that currently there are no explanations of why many prefer to develop personas instead of focusing directly on the scenarios that describe the actual work processes that the design is intended to support. However, a potential candidate for a theory of using 'personas' in design is, according to Pruitt et al. [20], the 'theory
of mind’ that says that we, as humans, always use our knowledge of other people’s mental states to predict their behavior [2]. In the context of cognitive modelling, SOAR [22] offers a general cognitive architecture for developing systems that exhibit intelligent behavior. It defines a single framework for all tasks, a mechanism for generating goals and a learning mechanism. Another popular cognitive architecture is ACT-R [1], which is based on the distinction between declarative and procedural knowledge, and the view that learning consists of two main phases. In the first phase the declarative knowledge is encoded and in the second it is turned into more efficient procedural knowledge [18]. The idea of simulating disability has been applied to Web accessibility research in the past. For example, IBM’s aDesigner [6] renders a Web page so that text or images that cannot be viewed by a person who is blind are impossible to see (covered in black). WebAIM (Web Accessibility In Mind) [27] provides scripted simulations with a number of capabilities, including those that illustrate screen reader use, those that demonstrate the use of screen-enlarging software for users with low vision, and those that simulate distractibility and the accompanying frustrations that a user with cognitive disabilities might experience. Virtual User Models are also used in application areas such as automotive and aviation design. The RAMSIS human modelling tool [17] is currently used in the design of automobiles in about seventy percent of auto-manufacturers. SAMMIE is a computer aided human modelling system that represents a widely used tool to accommodate the needs of a broad range of differently sized and shaped people into the design of products [23]. HADRIAN [16] is another computer-based inclusive design tool that has been developed to support designers in their efforts to develop products that meet the needs of a broader range of users.
3 Proposed User Modelling Methodology

In order to support the automatic accessibility testing of ICT and non-ICT services and products, it is essential to create machine-readable descriptions of the disabled user and the tasks that the user is able to perform, as well as formal descriptions of the products and services to be tested. In [11] the concepts of Virtual User Model, Task Model and Simulation Model (all expressed in UsiXML format) have been introduced. These models are used by the core component of the proposed framework, the Simulation Module. More specifically, the Simulation Module takes as input a Virtual User Model (describing a virtual user with disabilities), a Simulation Model that describes the functionality of the product/service to be tested, and the Task Models that describe the user's complex tasks in detail, and it simulates the interaction of the virtual user (as defined in the Simulation Model) within a virtual environment.

3.1 Virtual User Model

A Virtual User Model [11] describes an instance of a virtual user, including the user's preferences, needs and capabilities, and is modelled through an extension of UsiXML. It includes all the possible disabilities of the user, the tasks affected by those disabilities,
motor, visual, hearing, speech, cognitive and behavioral user characteristics, as well as some general preferences of the user (e.g. preferred input/output modality, unsuitable input/output modality, etc.).

3.2 Task Model

Task models [11] describe the interaction between the virtual user and the virtual environment. User tasks are divided into primitive tasks (e.g. grasp, pull, walk, etc.) and complex tasks (e.g. driving, telephone use, computer use, etc.). For each complex task a Task Model is developed in order to specify how the complex task can be decomposed into primitive tasks (as defined by the designers/developers according to the functionality of the prototypes whose accessibility is to be tested). Fig. 1 presents a schematic description of a complex task and the primitive tasks into which it can be decomposed.
Fig. 1. Task Model example. The root node represents a complex task while the leaf nodes represent primitive tasks. The arrows between the primitive tasks indicate that they are executed sequentially.
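As a concrete illustration, a task tree like the one in Fig. 1 can be represented as a recursive structure whose leaves are primitive tasks executed in order. The following C# sketch is our own illustration of the decomposition, not the UsiXML serialization the framework actually uses:

```csharp
using System.Collections.Generic;

// A user task is either primitive (a leaf) or complex (an ordered list of subtasks).
abstract class UserTask
{
    public abstract IEnumerable<string> Primitives();
}

class PrimitiveTask : UserTask
{
    public string Name;                      // e.g. "reach", "grasp", "pull"
    public PrimitiveTask(string name) { Name = name; }
    public override IEnumerable<string> Primitives() { yield return Name; }
}

class ComplexTask : UserTask
{
    public List<UserTask> Subtasks = new();  // executed sequentially, as in Fig. 1

    public override IEnumerable<string> Primitives()
    {
        foreach (var t in Subtasks)           // flatten the tree in execution order
            foreach (var p in t.Primitives())
                yield return p;
    }
}
```

A "pull handbrake" task, for instance, could be built as a ComplexTask whose subtasks are the primitives reach, grasp and pull.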
3.3 Simulation Model

A Simulation Model [11] refers to a specific product or service and describes all the functionalities of the product/service as well as the involved interaction with the user, including all the different interaction modalities supported.
Fig. 2. A Simulation Model example representing some common tasks that can be performed in a car interior. Tasks at the same level are connected by a choice relationship, which means that only one of the two can be executed at a time.
Fig. 2 presents a simulation model referring to a car interior. According to this simulation model, the interaction between the user and the environment includes the use of the handbrake and the storage compartment.
4 Simulation Framework

4.1 Virtual Environment

The proposed framework performs automatic simulated accessibility evaluation in virtual environments, so virtual environments have to be developed according to the specifications of the real environments to be tested. The virtual environment is used by the Simulation Module, which is the core module of the proposed framework and performs the evaluation process. Fig. 3 depicts a virtual environment representing a common car interior. Clearly, the credibility and accuracy of the simulation results depend on the detailed representation of the virtual environment, which is given as input to the Simulation Module.
Fig. 3. Virtual environment example representing a common car interior
4.2 Simulation Module

The Simulation Module [11] takes as input a Virtual User Model (describing a virtual user with disabilities), a Simulation Model (describing the functionality of the product/service to be tested), one or more Task Models (describing in detail the complex tasks of the user) and a 3D virtual prototype representing the product/service to be tested. It then simulates the interaction of the virtual user (as defined in the Simulation Model) with the virtual environment. The disabled virtual user is the main "actor" of the physically-based simulation, which aims to assess whether the virtual user is able to accomplish all the necessary actions described in the Simulation Model, taking into account the constraints posed by the disabilities. Although the proposed framework is designed to cover cognitive, behavioral and physical disabilities, the present work reports first results on the modeling and simulation of motor arm disabilities. Simulation planning is performed using inverse kinematics, while dynamic properties of the human limbs (e.g. torques and forces) related to the corresponding actions (e.g. grasping) are obtained using inverse dynamics.

The simulation module is composed of three sub-modules: a) the Scene module, b) the Humanoid module and c) the Task manager module. The scene module is responsible for the representation of the scene and the management of all the objects in it. The humanoid module represents the human model and controls the humanoid's movements. The task manager module defines the interactions between the humanoid and the scene objects in a specific manner defined by the task and simulation models. These sub-modules are further explained in the following sections.
4.3 Scene Module

The scene module [11] is responsible for creating the scene, adding the objects in it and defining their special attributes and behavior. The scene is modeled by two sets of objects: static objects and moveable objects. Both types of objects have geometry (volume) and a visual representation (textures). Static objects do not have mass and cannot be moved by the humanoid. Moveable objects, on the other hand, are modeled as rigid bodies, with properties such as uniformly distributed mass over their volume (constant density) and linear and angular velocity.
Fig. 4. The three states showing the car's storage compartment functionality. The arrows represent the rotational degrees of freedom. The red box shows the POI (point of interest) used for interacting with the object. Three objects are presented in the screenshots: the handle (moveable), the storage compartment's door (moveable) and the car's dashboard (static).
For example, the functionality of the car's storage compartment (Fig. 4) is described by two DoF-chains: one that connects the handle with the storage compartment and another that connects the storage compartment with the dashboard. In this example, a scene rule is used to check the state of the handle at every timestep: if the angle relative to its parent (i.e. the compartment's door) exceeds the predefined limit, the storage compartment is opened by a spring force.

4.4 Humanoid Module

The humanoid [11] is modeled by a skeletal model represented by a set of bones and a set of joints. The skeletal model follows a hierarchical approach, meaning that each joint connects a child bone with its parent. The bones are modeled as rigid bodies, with properties such as mass, volume, position and velocity. The mass is distributed uniformly over the bone's volume. A primitive geometric shape (box, sphere or capsule) is attached to each bone; it represents the flesh of the corresponding body part and is used for collision testing. The joints induce or restrict the movement of their attached bones and have the following properties: a) rotational degrees of freedom, which define the rotational axes about which the attached child bone is able to rotate relative to its parent (every joint can have one to three degrees of freedom); b) minimum and maximum angle per degree of freedom, two angles that constrain the range of motion of the attached bones (and their body parts); c) minimum and maximum joint torque, an abstract representation of the muscles attached to the joint.
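A minimal sketch of this joint description, with per-DoF angle limits and a joint torque bound; the C# field names are our own, not the framework's:

```csharp
using System;
using System.Collections.Generic;

// One rotational degree of freedom with its range-of-motion limits.
class DegreeOfFreedom
{
    public float MinAngleDeg, MaxAngleDeg;

    // Clamp a requested rotation to what this joint (and hence this
    // virtual user's range of motion) allows.
    public float Clamp(float requestedDeg) =>
        Math.Clamp(requestedDeg, MinAngleDeg, MaxAngleDeg);
}

// A bone in the hierarchical skeleton: a rigid body attached to its
// parent through a joint with one to three degrees of freedom.
class Bone
{
    public string Name;                    // e.g. "forearm"
    public float MassKg;                   // uniformly distributed over the volume
    public Bone Parent;                    // null for the root bone
    public List<DegreeOfFreedom> JointDofs = new();
    public float MinTorqueNm, MaxTorqueNm; // abstract muscle strength of the joint
}
```

A reduced range of motion, such as the spinal-cord-injury shoulder values in Table 1 below, then simply becomes tighter MinAngleDeg/MaxAngleDeg bounds on the corresponding joints.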
Currently, the model supports two basic modes: kinematic and dynamic. In kinematic mode, the humanoid moves by directly changing the position and orientation of each bone part per time step. In dynamic mode, the humanoid changes its state by applying torques at each joint; both forward and inverse dynamics techniques [5] are supported. Various high-level motion planning and collision avoidance techniques are supported by the humanoid module. Configuration space [15] and structures that define and explore it, such as rapidly exploring random trees [13] and multi-query planners [24], are extensively used in order to compute a non-colliding path for the humanoid.

4.5 Task Manager Module

The task manager module [11] is responsible for managing the actions of the humanoid in order to provide a solution to a given task. After splitting the complex task into a series of primitive tasks, the task manager instructs the humanoid model to perform a series of specific actions in order to accomplish them. At every step, the task manager supervises and checks for task completion and reports to the system if something goes wrong. The cycle pattern followed at each simulation step in the dynamic mode is presented in [11]. In order to decrease the complexity of each primitive task and its success/failure condition, the task manager uses a system of task rules. Each primitive task can have a set of rules that are checked at each timestep. Following the same rule model as in the scene module, each rule has two main parts: a condition part and a result part. When a condition is met, the rule's result part is applied. Conditions can check various simulation elements and states, such as the current distance of a specific bone from a POI, the time elapsed since the task started, or how many times a specific humanoid action has been performed.
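The condition/result rule system can be sketched as a pair of delegates evaluated each timestep. The signatures and the example thresholds below (2 cm, 30 s) are our assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;

// Snapshot of the simulation state a rule's condition can inspect.
class SimState
{
    public float HandToPoiDistanceM;  // distance of the hand bone from the POI
    public float ElapsedSeconds;      // time since the primitive task started
    public bool TaskSucceeded, TaskFailed;
}

// A task rule: when Condition holds, Result is applied.
class TaskRule
{
    public Func<SimState, bool> Condition;
    public Action<SimState> Result;
}

class TaskManager
{
    readonly List<TaskRule> rules = new()
    {
        // Success: the hand reached the point of interest (hypothetical 2 cm threshold).
        new TaskRule { Condition = s => s.HandToPoiDistanceM < 0.02f,
                       Result    = s => s.TaskSucceeded = true },
        // Failure: the task took too long (hypothetical 30 s timeout).
        new TaskRule { Condition = s => s.ElapsedSeconds > 30f,
                       Result    = s => s.TaskFailed = true },
    };

    // Called once per simulation timestep.
    public void Step(SimState s)
    {
        foreach (var r in rules)
            if (r.Condition(s)) r.Result(s);
    }
}
```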
5 Experiments

In order to show how the proposed framework can be used in practice, an evaluation scenario is presented in this section. In the scenario, a car designer performs accessibility evaluation of a prototype for different virtual users with disabilities. The designer initially develops the virtual workspace prototype presented in Fig. 3. Then, using the Virtual User Model Editor described in [11], the designer generates two Virtual User Models, whose problematic physical parameters are presented in Table 1. Two Task Models, describing the handbrake use and the opening of the storage compartment, respectively, were developed in the context of the experimental scenario (similar to the Task Model presented in Fig. 1). Finally, a simple Simulation Model describing the interaction of the virtual user in the specific virtual environment was developed (similar to the Simulation Model presented in Fig. 2).
Table 1. Virtual User Models - Details

Physical characteristic | Normal values [4][12] | Elderly (60-84) [12][25][3][26] | Spinal cord injury [7]
Hand maximum pull force (N) | 335 | 76.8 | -
Wrist radial deviation (°) | 0-27.5 | 0-19 | -
Wrist ulnar deviation (°) | 0-35 | 0-26 | -
Forearm supination (°) | 0-85 | 0-74 | -
Forearm pronation (°) | 0-85 | 0-71 | -
Elbow hyper-extension (°) | 0-10 | 0-4 | -
Shoulder flexion (°) | 0-160 | 0-86 | 0-67
Shoulder abduction (°) | 0-85 | - | 0-63
Shoulder internal rotation (°) | 0-80 | - | 0-21
Shoulder external rotation (°) | 0-45 | - | 0-12
Spinal column flexion (°) | 0-90 | - | 0-23.6
Spinal column extension (°) | 0-30 | - | 0-17
Spinal column left lateral flexion (°) | 0-25 | - | 0-19
Spinal column right lateral flexion (°) | 0-25 | - | 0-20
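At its core, the simulated assessment compares what a task demands against the limits encoded in a table like the one above. The toy C# check below uses the elderly user's Table 1 values; the class and field names, and the task's demanded values, are hypothetical:

```csharp
using System;

class VirtualUserLimits
{
    public float MaxHandPullForceN;
    public float MaxShoulderFlexionDeg;
}

class TaskDemand
{
    public float PullForceN;
    public float ShoulderFlexionDeg;
}

class AccessibilityCheck
{
    static bool IsAccessible(VirtualUserLimits u, TaskDemand d) =>
        d.PullForceN <= u.MaxHandPullForceN &&
        d.ShoulderFlexionDeg <= u.MaxShoulderFlexionDeg;

    static void Main()
    {
        // Elderly user from Table 1: 76.8 N pull force, shoulder flexion 0-86 deg.
        var elderly = new VirtualUserLimits { MaxHandPullForceN = 76.8f,
                                              MaxShoulderFlexionDeg = 86f };
        // Hypothetical demand for pulling a stiff handbrake.
        var stiffHandbrake = new TaskDemand { PullForceN = 120f, ShoulderFlexionDeg = 45f };

        Console.WriteLine(IsAccessible(elderly, stiffHandbrake)
            ? "Success" : "Failure - reduced hand pull force");
    }
}
```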
Table 2. Simulation results

Task | Virtual user | Scenario | Simulation result
Pull handbrake | Elderly (60-84) | Handbrake resistance (torque): 17 Nm | Failure - reduced hand pull force
Pull handbrake | Elderly (60-84) | Handbrake resistance (torque): 6 Nm | Success
Open storage compartment | Spinal cord injury | Storage compartment that opens by pulling a handle | Failure - reduced range of motion
Open storage compartment | Spinal cord injury | Storage compartment without handle that opens by pushing it | Success
The handbrake use was simulated for an elderly user (60-84 years old) in two different scenarios. In the first scenario, the handbrake (mass 200 g) had a resistance
(torque) of 17 Nm, which, loosely speaking, translates into lifting a weight of about 4.8 kg, while in the second scenario the resistance was 6 Nm (~1.7 kg). Furthermore, the opening of the storage compartment was simulated for a virtual user with spinal cord injury in two different scenarios. In the first scenario, the storage compartment could be opened by pulling a handle, while in the second scenario it could be opened by pushing the whole storage compartment. The results of the simulation process clearly revealed accessibility issues in the specific car interior. More specifically, as shown in Table 2, the elderly virtual user failed to use the handbrake when the resistance was 17 Nm, and the virtual user with spinal cord injury failed to open the storage compartment with the handle, which means that the specific environment is inaccessible for them.
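The stated weight equivalents follow from the lever relation; back-solving from the paper's own figures implies a handbrake lever arm of roughly 0.36 m (our inference; the lever length is not stated in the paper):

```latex
F = \frac{\tau}{L} = \frac{17\,\mathrm{N\,m}}{0.36\,\mathrm{m}} \approx 47\,\mathrm{N} \approx 4.8\,\mathrm{kgf},
\qquad
\frac{6\,\mathrm{N\,m}}{0.36\,\mathrm{m}} \approx 16.7\,\mathrm{N} \approx 1.7\,\mathrm{kgf}
```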
6 Conclusions

The present paper proposes a framework that performs automatic simulated accessibility testing of designs in virtual environments using a new user modelling technique. The great importance of such a framework lies in the fact that it enables the automatic accessibility evaluation of any environment for any user by testing its equivalent virtual environment with the corresponding virtual user.

Acknowledgment. This work is supported by the EU funded project VERITAS (FP7 - 247765).
References

[1] Anderson, J.R.: Cognitive Psychology and its Implications, 5th edn., p. 241. Worth Publishers, New York (2000)
[2] Astington, J.W., Jenkins, J.M.: Theory of mind development and social understanding. Cognition and Emotion 9, 151–165 (1995)
[3] Berryman-Reese, N., Bandy, W.: Joint Range of Motion and Muscle Length Testing. Saunders, Philadelphia (2002)
[4] Debrunner, H.: Orthopaedic Diagnosis. E & S Livingstone, Edinburgh (1970)
[5] de Jalon, J.G., Bayo, E.: Kinematic and Dynamic Simulation of Multibody Systems: The Real-Time Challenge, p. 440. Springer, New York (1994); ISBN 0-387-94096-0
[6] Dvorak, P.: Looking for a User-Friendly Internet. The Wall Street Journal (September 16, 2004)
[7] Eriks-Hoogland, I.E., de Groot, S., Post, M.W.M., van der Woude, L.H.V.: Passive shoulder range of motion impairment in spinal cord injury during and one year after rehabilitation. Journal of Rehabilitation Medicine 41(6), 438–444 (2009)
[8] Heckmann, D.: Introducing situational statements as an integrating data structure for user modeling, context-awareness and resource-adaptive computing. In: ABIS 2003, Karlsruhe, Germany, pp. 283–286 (2003)
[9] Heckmann, D.: Ubiquitous User Modelling. Akademische Verlagsgesellschaft Aka GmbH, Berlin (2006); ISBN 3-89838-297-4, ISBN 1-58603-608-4
[10] Heckmann, D., Schwartz, T., Brandherm, B., Schmitz, M., von Wilamowitz-Moellendorff, M.: GUMO – The General User Model Ontology. In: Ardissono, L., Brna, P., Mitrović, A. (eds.) UM 2005. LNCS (LNAI), vol. 3538, pp. 428–432. Springer, Heidelberg (2005)
[11] Kaklanis, N., Moschonas, P., Moustakas, K., Tzovaras, D.: Enforcing accessible design of products and services by enabling automatic accessibility evaluation. In: CONFIDENCE 2010 (2010)
[12] Kumar, S.: Muscle Strength. CRC Press, Boca Raton (2004)
[13] LaValle, S.M., Kuffner, J.J.: Rapidly-Exploring Random Trees: Progress and Prospects. In: Algorithmic and Computational Robotics: New Directions (2000)
[14] Lok, B., Ferdig, R.E., Raij, A., Johnsen, K., Dickerson, R., Coutts, J., Stevens, A., Lind, D.S.: Applying Virtual Reality in Medical Communication Education: Current Findings and Potential Teaching and Learning Benefits of Immersive Virtual Patients. Virtual Reality 10, 185–195 (2006)
[15] Lozano-Pérez, T.: Spatial planning: A configuration space approach. IEEE Transactions on Computers C-32(2), 108–120 (1983)
[16] Marshall, R., Porter, J.M., Sims, R.E., Summerskill, S., Gyi, D.E., Case, K.: The HADRIAN approach to accessible transport. Work: A Journal of Prevention, Assessment & Rehabilitation 33(3) (2009), ISSN 1051-9815, doi:10.3233/WOR-2009-0881
[17] van der Meulen, P., Seidl, A.: RAMSIS – The Leading CAD Tool for Ergonomic Analysis of Vehicles. In: Duffy, V.G. (ed.) HCII 2007 and DHM 2007. LNCS, vol. 4561, pp. 1008–1017. Springer, Heidelberg (2007)
[18] Mitrovic, A., Koedinger, K.R., Martin, B.: A Comparative Analysis of Cognitive Tutoring and Constraint-Based Modeling. In: Proceedings of the 9th International Conference on User Modeling 2003, USA, pp. 313–322 (2003)
[19] Palmon, O., Oxman, R., Shahar, M., Weiss, P.L.: Virtual environments as an aid to the design and evaluation of home and work settings for people with physical disabilities. In: Proc. 5th Intl Conf. Disability, Virtual Reality & Assoc. Tech., Oxford, UK (2004)
[20] Pruitt, J., Grudin, J.: Personas: Practice and Theory. In: Proc. DUX 2003 (2003), http://research.microsoft.com/en-us/um/people/jgrudin/
[21] Razmerita, L., Angehrn, A., Maedche, A.: Ontology-based user modeling for Knowledge Management Systems. In: Proceedings of UM 2003, the Ninth International Conference on User Modeling, USA (2003)
[22] Rosenbloom, P.S., Laird, J.E., Newell, A.: The Soar Papers: Research on Integrated Intelligence. MIT Press, Cambridge (1993)
[23] SAMMIE, SAMMIE CAD Ltd. (2007), http://www.sammiecad.com/
[24] Siciliano, B., Khatib, O. (eds.): Springer Handbook of Robotics. Springer-Verlag New York, Inc., Secaucus (2007)
[25] Steenbekkers, L.P.A., van Beijsterveldt, C.E.M. (eds.): Design-relevant characteristics of ageing users. Series Ageing and Ergonomics, vol. 1. Delft University Press, Delft (1998)
[26] Trudelle-Jackson, E., Fleisher, L.A., Borman, N., Morrow Jr., J.R., Frierson, G.M.: Lumbar Spine Flexion and Extension Extremes of Motion in Women of Different Age and Racial Groups. Spine (Phila Pa 1976) 35(16), 1539–1544 (2010), doi:10.1097/BRS.0b013e3181b0c3d1
[27] WebAIM Simulations, WebAIM, Center for Persons with Disabilities, Utah State University, http://www.webaim.org/simulations/
Cloth Modeling and Simulation: A Literature Survey

James Long, Katherine Burns, and Jingzhou (James) Yang*

Department of Mechanical Engineering, Texas Tech University, Lubbock, TX 79409
Tel.: 806-742-3563, Fax: 806-742-3540
[email protected]

* Corresponding author.
Abstract. Cloth modeling and simulation has gained significant momentum in recent years due to advances in computational hardware and software. Cloth plays an important role not only in daily life but also in special scenarios such as firefighters' clothing and space suits, which carry special requirements such as protecting the human body and preserving mobility once the firefighter or astronaut puts the clothing on. Traditional assessment of cloth requires building prototypes and running experiments with subjective ratings, which is time consuming and expensive. Virtual cloth modeling and simulation provides a means to demonstrate and assess performance before the cloth is made. This paper gives a literature review summarizing the state of the art of cloth modeling and simulation.

Keywords: Cloth modeling and simulation; finite element method; comfort.
1 Introduction

The concept of developing a mathematical model to define the behavior of cloth and its properties started back in the 1930s, when the engineering community developed textile mechanics to support the textile industry [13]. Since that time technology has evolved exponentially. Advancements in computers and software have made virtual simulation a reality. With the development and advancement of finite element analysis software and computational fluid dynamics software, engineers are able to solve problems at speeds never imagined. There is no reason clothing cannot undergo a similar process. The benefits of cloth simulation would include a decrease in prototyping cost; with less prototyping, companies could also save time. It is estimated that time and cost could be reduced by as much as 80-90%. Much more complicated problems can also be solved through virtual simulation. The development of this software led to the concept of virtual cloth simulation, which started in the late 1980s and began to see implementation in the 1990s [13].

The idea of cloth simulation involves two interrelated focuses. The first is the simulation of cloth for engineering applications, whose main objectives include comfort, reliability, performance and ensuring safety in cloth designed for specific operations. Cloth comfort includes how the cloth contacts the skin and how much heat is allowed to transfer between the body and cloth. The
modeling of pressure maps, or contact pressure between the cloth and human skin, is needed to determine how cloth contacts the human body, while the study of heat transfer determines how warm clothing might make a person in specific environments. The other main focus of virtual cloth simulation is cloth modeling and animation.

Research into cloth modeling has been developed by two main groups. The first group, the textile industry, is the main community that tries to understand cloth from the perspective of mechanical engineering. Before the era of computer-aided design, researchers in the textile industry struggled to develop an accurate model for predicting cloth behavior. The first model, the Pierce model, was developed in the 1930s [13]. The next models were the strain energy models; however, they failed to accurately model cloth due to its highly anisotropic material properties. The latest models, which have been implemented in CAD (computer-aided design) systems, are based on the laws of elasticity combined with continuum mechanics and finite element methods, and they are much more accurate at predicting cloth behavior than their predecessors.

The second group that has advanced cloth modeling is the computer graphics industry. Its main objective is to obtain realistic visualization of cloth at minimal computational cost, and it has conducted the majority of the research on the virtual simulation of cloth. The finished product for this group is in the form of computer-generated images and animations, used by cloth designers, game designers and the movie industry. The majority of computer graphics developers, however, do not care whether their models are physically correct; their interest is in developing the simplest method to obtain the most realistic appearance of cloth. While several physically based simulations exist, the other simulations are based on geometric approaches. Physically based models used by the computer graphics industry include mass-spring systems, particle-based systems, and elasticity-based models. Researchers pursue several visual-effect goals in cloth simulation, including, but not limited to, the illumination of the chosen fabric, the deformation or way a particular fabric folds, and the visual texture of the fabric. Visualizing cloth is also used for animation: being able to animate cloth with a realistic model is one of the largest drivers of cloth simulation, although each approach is often limited by its specific computational constraints.

The organization of this literature survey is as follows. To understand clothing modeling it is important to understand the terminology of textile mechanics, since many of the concepts used for modeling virtual cloth are based heavily on core engineering concepts; it is key to understand how fiber mechanics affect yarn mechanics, which in turn affect fabric and finally cloth mechanics. For this reason the first section is a brief summary of terminology and the various engineering concepts of textile engineering, including the fundamentals of cloth mechanics. The final section of the paper covers how clothing simulation can be used in solving engineering problems, with a few examples and the theory behind solving them. The literature survey wraps up with a conclusion.
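Of the physically based approaches mentioned above, the mass-spring model is the easiest to sketch: the cloth is a grid of point masses joined by springs and stepped forward with explicit integration. The following C# sketch is a generic textbook-style illustration, not an implementation from any specific surveyed system:

```csharp
using System;

// Minimal 2D mass-spring cloth: a grid of particles joined by
// structural springs, integrated with explicit Euler steps.
class MassSpringCloth
{
    readonly int w, h;
    readonly float[] px, py, vx, vy;   // positions and velocities
    const float Mass = 0.01f, Stiffness = 50f, Rest = 0.05f, Dt = 0.001f, Gravity = -9.81f;

    public MassSpringCloth(int width, int height)
    {
        w = width; h = height;
        px = new float[w * h]; py = new float[w * h];
        vx = new float[w * h]; vy = new float[w * h];
        for (int j = 0; j < h; j++)
            for (int i = 0; i < w; i++)
            { px[j * w + i] = i * Rest; py[j * w + i] = -j * Rest; }
    }

    public void Step()
    {
        var fx = new float[w * h]; var fy = new float[w * h];
        for (int j = 0; j < h; j++)
            for (int i = 0; i < w; i++)
            {
                int a = j * w + i;
                fy[a] += Mass * Gravity;                    // gravity on each particle
                if (i + 1 < w) AddSpring(a, a + 1, fx, fy); // horizontal spring
                if (j + 1 < h) AddSpring(a, a + w, fx, fy); // vertical spring
            }
        for (int a = 0; a < w * h; a++)
        {
            if (a < w) continue;                 // pin the top row of particles
            vx[a] += Dt * fx[a] / Mass; vy[a] += Dt * fy[a] / Mass;
            px[a] += Dt * vx[a];        py[a] += Dt * vy[a];
        }
    }

    void AddSpring(int a, int b, float[] fx, float[] fy)
    {
        float dx = px[b] - px[a], dy = py[b] - py[a];
        float len = MathF.Sqrt(dx * dx + dy * dy);
        float f = Stiffness * (len - Rest);      // Hooke's law on the spring
        fx[a] += f * dx / len; fy[a] += f * dy / len;
        fx[b] -= f * dx / len; fy[b] -= f * dy / len;
    }
}
```

Real systems add shear and bend springs and use implicit integration for stability; this sketch only shows the basic structure of the approach.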
2 Mechanics of Clothing

To virtually simulate clothing it is important to understand the properties of clothing, starting with fiber mechanics (Fig. 1). Dai and Li divide the study of fiber mechanics into four parts: macrostructure, microstructure, submicroscopic structure and fine structure of fibers. The macrostructure includes features that are visible to the human eye: fiber width, fiber length, and fiber crimp. The most important property of a fiber is its size, commonly referred to as its fineness. Variation in fiber size affects the stiffness of the fiber; fiber stiffness affects the stiffness of the fabric, which in turn affects the feel of the fabric and how it drapes. The next feature, fiber length, affects fiber strength: longer fibers produce stronger yarn. The last macrostructure feature, fiber crimp, refers to waves, bends, twists or curls along a strand of fiber. Fibers can be linear, crimped in only two dimensions, or crimped in three dimensions; crimped fibers have higher elongation than linear fibers. The microstructure of fiber includes surface contour and cross-sectional shape. A fiber's surface contour may be smooth, serrated, lobed, striated, pitted, scaly, or convoluted; it affects comfort against the skin and the fiber's frictional properties. Cross-sectional shape changes with the type of fiber used and affects the bending stiffness and torsional stiffness of the fiber. An electron microscope is needed to study the submicroscopic structure, which gives more information on the surface of the fiber. The last sub-division, fine structure, examines the length, width, shape and chemical composition of the polymers. Polymers are thousands of elements bonded together by covalent bonds; a longer polymer chain usually means a stronger fiber.

The mechanical properties of fiber are essential to understanding yarn and then fabric mechanics. The elastic recovery of fiber is initially very similar to that of an elastic spring: the stress-strain curve is linear. The fibers will also partially recover after the yield point has been reached. Bending in fiber follows the traditional material/mechanical models, but fiber differs from traditional mechanics in compression: it buckles very easily under compressive forces. Fiber friction is the force that holds together the fibers in yarn. Higher friction can lead to a stronger yarn but is not always desired; lower fiber friction helps minimize fiber wear and provides a better fabric drape.
[Figure: hierarchy linking molecular properties → fiber structure and fiber properties → yarn structure and yarn properties → fabric structure and fabric properties → garment construction and garment performance]
Fig. 1. Hierarchical relationships of fiber, yarn, fabric and garment to the biomechanical function performance of clothing and textile devices
Next, it is important to understand yarn mechanics and how they influence fabric mechanics. To understand yarn mechanics one must first understand yarn structure
and the distribution of the fibers. The way fibers settle and are spun into yarn causes the structure of fiber in yarn to be a helix with a variable helix radius. The fibers migrate within this helix, and their location can only be hypothesized by various theories. Fiber migration can have a very significant impact on the mechanical properties of yarn. The fiber distribution in the cross section depends on fiber type and twist level. The material properties that affect yarn the most are the stress-strain relationship of the fibers, fiber friction and the compressional properties of fiber. There are two types of yarn: staple yarn and continuous filament yarn. Staple yarn, the most widely used, has discontinuities throughout its fiber lengths; the fibers are prevented from slipping by their friction coefficient. During extension the fibers move in different directions, which makes solving for the resultant friction force quite complex. A few equations exist to predict yarn mechanical properties from fiber mechanics; these equations are based on experimental data and require unknown constants derived from experiments. With this background it is possible to understand fabric mechanics. Fabrics are the main materials used in constructing cloth garments and textiles. The mechanical properties of fabric are discontinuous, inhomogeneous, and anisotropic. The structure of fabrics consists of yarn and/or fibers: fabrics are made of overlapping yarns woven at 90-degree angles, which leads to the complex mechanical behavior of clothing. As mentioned before, yarn is assembled from thousands of fibers, and fabric is then assembled from thousands of strands of yarn. Several parameters are used to describe fabric. Fabric count describes the number of yarns per inch of a particular fabric. Warp is the yarn extended lengthwise in a loom. Weft is the yarn running across the width of the fabric. Balance is the ratio of warp yarn to weft yarn. Fabric weight is the mass per unit area and indicates how thick the fabric is. In everyday use fabrics go through a wide range of motion. Dai, Choi and Li explain that it is necessary to group all the deformations of fabric into four groups: tensile, shearing, bending and twisting. Tensile deformation is caused when fabric is pulled in the warp or weft direction; the friction created between the fibers holds the fabric together. Because of the anisotropy of the fabric, tension in the warp and weft directions causes dramatically different effects. Shear deformation in fabric is not linear because fabric is not elastic; it also differs greatly from that of continuous materials because of the slippage that occurs between fibers. Fabric can undergo large bending deformations under relatively small forces, and it is able to buckle like no other sheet material. The fabric has an initial resistance to bending caused by a frictional force, which is itself caused by the pressure of the yarns coming into contact at all of the cross sections of the weave. The final deformation comes from twist: when fabric is bent it also experiences a twist deformation. All of the relationships used to describe the different deformations of fabric are non-linear. The complex mechanical properties of cloth have led researchers to develop many different models to approximate the behavior of clothing.
3 Engineering Applications of Simulating Clothing

There are many different methods used to simulate clothing and cloth materials in virtual environments. The method is usually determined by the level of accuracy
needed and the mechanics involved in the particular experiment. When using cloth in finite element analysis (FEA) it might be necessary to model each fiber of the cloth, or to use a thin shell material with the proper material properties. FEA solvers have been used to simulate the effectiveness of various clothing protective systems, and several researchers have successfully simulated projectiles coming into contact with such systems. The methods and results vary as follows. Cheeseman et al. [5] built a digital model of Kevlar in which each piece of fiber is digitally modeled; their simulation covered only a 5 × 5 square piece of Kevlar fabric (Fig. 2). Modeling a whole garment with each strand of fiber in the fabric would not only be complicated to build, but would also require a large amount of computational time. Similarly, Gu and Ding used FEA to simulate a projectile impacting a cloth material called Twaron, which is similar to Kevlar [10]. However, Gu and Ding used a solid model with many elements to represent their fabric [10], treating the fabric layer like a composite composed of several different layers of lamina to obtain material properties representing Twaron [10] (Fig. 3). A solid model takes significantly less time to build than a model containing every strand of fiber.
Fig. 2. (Left) (a) Deformed mesh of a 5 × 5 Kevlar fabric layer impacted by a cylindrical projectile and (b) close-up view of the impacted area

Fig. 3. (Right) Mesh of inclined lamina and projectile. The projectile and the geometrical model of the 3-D braided composite were meshed with eight-node hexahedron solid elements
The use of material properties to simulate cloth successfully does not apply only to projectile-impact simulations. Wu et al. used material properties to realistically model cloth in draping simulations. The process Wu et al. used for developing material properties is as follows. For cloth simulation, geometric deformation is related to the energy function by the material properties of the cloth. Solving for the material properties of cloth can be extremely difficult. A commonly used piece of equipment, the Kawabata Evaluation System, is used to obtain the mechanical properties of cloth: it plots test data from tensile, shear and bending tests of fabric. With the contributions of Breen et al. (1994) and Eischen and Bigliani (2000) it is possible to solve for the elastic constants of fabric using the plots from the Kawabata Evaluation System. The equations for the material properties are derived from the strain energy density equation [25]. The
tensors can be simplified and substituted to derive the material property equations that are used to describe clothing. The equations are simplified to allow the input of the different moduli. The Kawabata system yields the tensile, shear and bending moduli; Poisson's ratio and the twisting modulus are derived from them. With these material properties of cloth known, it is possible to virtually simulate cloth by using them in a finite element solver. Wu et al. [25] were able to perform draping simulations on cloth with the measured and calculated material properties (Fig. 4).
Fig. 4. Results of using material properties in simulation. The images compare a real image of cotton with a simulated image.
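As a hedged illustration of how an elastic constant might be read off a Kawabata-style plot, the snippet below fits the initial slope of an assumed tensile curve. The data values and the simple linear-fit shortcut are placeholders for demonstration, not the actual procedure of Breen et al. or Eischen and Bigliani.

```python
import numpy as np

# Hypothetical KES-style tensile data: strain (dimensionless) vs. force per
# unit width (gf/cm), loosely shaped like a stiffening fabric response.
strain = np.array([0.000, 0.005, 0.010, 0.015, 0.020, 0.025, 0.030])
stress = np.array([0.0,   1.1,   2.3,   3.4,   4.8,   6.5,   8.6])

# Effective tensile modulus from the initial (approximately linear) region.
lin = strain <= 0.015
modulus = np.polyfit(strain[lin], stress[lin], 1)[0]
print(f"effective tensile modulus ~ {modulus:.1f} gf/cm per unit strain")
```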
While solid elements have lower computational requirements than models built with each strand of fiber, simulation accuracy is sacrificed for the more efficient model. Nilakantan et al. have proposed a 'Hybrid Element Analysis' (HEA) method that proves to be accurate while not placing a high demand on computational resources [17] (Fig. 5). Their hybrid method combines sections made up of solid elements with sections that contain each strand of fiber: a small section of the fabric that is to be impacted is modeled with each strand of fiber, while the surrounding area of the material is modeled with shell elements that use material properties to represent the textile being tested [17].
Fig. 5. HEA containing both solid and shell elements
Textile products are not only simulated virtually to determine their drape or ability to protect humans; they are also simulated to predict the comfort of an almost
unlimited number of products that users may wear every day. The most common way to test for comfort in clothing is to test for the discomfort a user might feel. A user's discomfort can be visualized by calculating the pressure distribution of cloth as it comes into contact with the user. Wang et al. used several different tools to develop pressure maps of what a human might feel while wearing a shirt. A pressure map is a color-coded graph used to show the higher concentrations of pressure on an object. Wang et al. determined clothing deformation and human body deformation using M-smart, and were then able to feed the human body deformation into ANSYS to map the pressure distribution. This pressure map shows where the particular human model might experience discomfort from the garment being tested [23] (Fig. 6). Wong et al. completed similar simulations to pressure-map tight-fit sportswear on the human body [24]: they simulated the clothing pressure of three tight-fit garments and validated their simulations with experimental results [24]. The work of Seo et al. also addresses pressure maps for tight-fit clothing (Fig. 7), but differs in approach: Seo et al. used a mass-spring system that accurately models the stress-strain relationship of clothing. The clothing's strain is measured from the deformation of a circular stamp that is placed along the garment [19]. Seo et al. were able to validate their virtual simulations with experimental results. There are many other types of simulations developed by researchers to virtually simulate cloth; the inputs and outputs are determined by the researcher's goals and expected outcomes.
Fig. 6. The pressure distribution on the skin
Fig. 7. Simulated measurement of tight-fit cloth pressure on differently sized 3D mannequins
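The circular-stamp measurement lends itself to a small worked example (the diameters below are assumed numbers, not Seo et al.'s data): printing a stamp of known diameter on the relaxed garment and re-measuring its axes when worn gives the engineering strain in each direction.

```python
def stamp_strain(d0, d_warp, d_weft):
    """Engineering strain of a circular stamp of initial diameter d0 (mm)
    that deforms into an ellipse with axes d_warp and d_weft when worn."""
    return (d_warp - d0) / d0, (d_weft - d0) / d0

eps_warp, eps_weft = stamp_strain(20.0, 23.1, 21.4)   # assumed measurements
print(f"warp strain {eps_warp:.1%}, weft strain {eps_weft:.1%}")
```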
4 Conclusion

The complex structure of fabric creates an anisotropic material with non-linear properties, which makes virtually simulating cloth a highly complex problem. Several methods and tools are currently available for researchers to predict the various behaviors of clothing. The movie and gaming industries prefer quick results concerned only with the appearance of the clothing. Engineers and
researchers, however, are often concerned with more than the appearance of clothing in their simulations; they can choose tools that are more accurate at the expense of higher computational requirements. Advances in technology have made it possible to conduct virtual simulations, which have helped improve the design process. In the case of clothing simulation it is now possible to cut down on the time and expense of producing large numbers of prototypes and on any testing that accompanies this process.
References

1. Breen, D., House, D., Wozny, M.: Predicting the drape of woven cloth using interacting particles. In: SIGGRAPH 1994: Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques (1994)
2. Bridson, R., Fedkiw, R., Anderson, J. (n.d.): Robust treatment of collisions, contact and friction for cloth animation. Informally published manuscript. Stanford University, Palo Alto, California
3. Bridson, R., Marino, S., Fedkiw, R.: Simulation of clothing with folds and wrinkles. In: Proceedings of the Eurographics/SIGGRAPH Symposium on Computer Animation, pp. 28–36 (2003)
4. Carignan, M., Yang, Y., Magnenat Thalmann, N., Thalmann, D.: Dressing animated synthetic actors with complex deformable clothes. Computer Graphics 26(2), 99–104 (1992)
5. Cheeseman, B., Yen, C., Scott, B., Powers, B., Bogetti, T.: From filaments to fabric packs: simulating the performance of textile protection systems. U.S. Army Research Laboratory, Aberdeen Proving Ground, MD (2006)
6. Chittaro, L., Corvaglia, D.: 3D virtual clothing: from garment design. Association for Computing Machinery (2003)
7. Choi, K.J., Ko, H.S.: Research problems in clothing simulation. Computer-Aided Design 37, 585–592 (2005)
8. Eischen, J., Deng, S., Clapp, T.: Finite-element modeling control of flexible fabric parts. Computer Graphics Textiles and Apparel, 71–80 (1996)
9. Fan, J., Wang, Q., Chen, S., Yuen, M., Chan, C.: A spring–mass model-based approach for. The Journal of Visualization and Computer Animation 9, 215–217 (1998)
10. Gu, B., Ding, X.: A refined quasi-microstructure model for finite element analysis of three-dimensional braided composites under ballistic penetration. Journal of Composite Materials 39, 685–710 (2004)
11. Goldenthal, R., Harmon, D., Fattal, R., Bercovier, M., Grinspun, E.: Efficient Simulation of Inextensible Cloth. ACM Trans. Graph. 26(3), Article 49 (2007), http://doi.acm.org/10.1145/1239451.1239500, doi:10.1145/1239451.1239500
12. Hayler, G.: Cloth modelling – a literature survey (2004) (unpublished manuscript)
13. House, D., Breen, D.: Cloth modeling and animation. A.K. Peters, Ltd., Natick (2000)
14. Lee, S., Kamijo, M., Honywood, M., Nishimatsu, T., Shimizu, Y.: Analysis of finger motion in evaluating the hand of a cloth using a glove-type measurement system. Textile Research Journal 77(1) (2007), Retrieved from http://trj.sagepub.com/content/77/1/13
15. Li, Y., Dai, X.-Q.: Biomechanical engineering of textiles and clothing. CRC Press LLC, Boca Raton (2006)
16. Meng, Y., Mok, P.Y., Jin, X.: Interactive virtual try-on clothing design systems. Computer-Aided Design 42, 310–321 (2010)
17. Nilakantan, G., Keefe, M., Gillespie, J., Bogetti, T.: Simulating the impact of multi-layer fabric targets using a. In: Proceedings of Texcomp 9 – International Conference on Textile Composites, Newark (2008)
18. Saeki, T., Furukawa, T., Shimizu, Y.: Dynamic clothing simulation based on skeletal motion of the human body. International Journal of Clothing Science and Technology 9(3), 256 (1997)
19. Seo, H., Kim, S.J., Cordier, F., Hong, K.: Validating a cloth simulator for measuring tight-fit clothing pressure. In: Proceedings of the Association for Computing Machinery, pp. 431–438 (2007)
20. Tan, S.T., Wong, T.N., Zhao, Y.F., Chen, W.J.: A constrained finite element method for modeling cloth deformation. The Visual Computer 15, 90–99 (1999)
21. Tutt, B., Taylor, A.: Applications of LS-DYNA to structural problems related to recovery systems and other fabric structures. In: Proceedings of the 7th International LS-DYNA Users Conference
22. Volino, P., Magnenat-Thalmann, N.: Accurate garment prototyping and simulation. Computer-Aided Design & Applications 2(5), 645–654 (2005)
23. Wang, R., Li, Y., Luo, X.-N., Zhang, X.: Visualization application in clothing biomechanical design. In: Pan, Z., Cheok, D.A.D., Haller, M., Lau, R., Saito, H., Liang, R. (eds.) ICAT 2006. LNCS, vol. 4282, pp. 522–529. Springer, Heidelberg (2006)
24. Wong, A., Li, Y., Zhang, X.: Influence of fabric mechanical property on clothing dynamic pressure distribution and pressure comfort on tight-fit sportswear. Sen'I Gakkaishi 60(10), 293–299 (2004)
25. Wu, Z., Au, C.K., Yuen, M.: Mechanical properties of fabric materials for draping simulation. International Journal of Clothing 15(1) (2002), Retrieved from http://www.emeraldinsight.com/0955-6222.htm
26. Yamada, T., Matsuo, M.: Clothing pressure of knitted fabrics estimated in relation to. Textile Research Journal 79 (2009), Retrieved from http://trj.sagepub.com/content/79/11/1021
Preliminary Study on Dynamic Foot Model

Ameersing Luximon1 and Yan Luximon2

1 Institute of Textiles and Clothing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
2 School of Design, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
[email protected]
Abstract. It is generally accepted that improper footwear design causes injuries and illnesses. The existing literature indicates that mainly static footwear fit has been studied, even though dynamic footwear fit is known to differ. In addition to static fit issues, illnesses and injuries can occur due to dynamic pressure, friction, and foot movement beyond the normal range of motion. This study proposes a method for dynamic foot shape computation. The dynamic foot is generated by multi-dimensional transformation of the 3D static foot based on the angular values of the foot movement. This provides a simple algorithm to deform the foot to create a dynamic 3D foot shape. The results of this study are essential for building a model for dynamic fit computation.
1 Introduction

The human foot, an unsymmetrical object, consists of complex arrangements of 26 bones and other tissues, namely cartilage, muscles, tendons, ligaments, nerves and blood vessels [1, 2]. Even though footwear has become more specialized these days, research on fit, especially in dynamic situations, has often been neglected [3], leading to foot-related problems such as bunions, calluses, corns, plantar forefoot pain, hallux valgus, metatarsalgia, and ankle sprain [4, 5]. The foot has size, shape and proportion variations within a person and among persons [6]. For a specific individual there are considerable differences between the right and left foot, and between weight-bearing, dynamic and thermal conditions [7]. Apart from the variation of the feet within a person, there are variations depending on gender [3-5, 8-9], age [4, 8] and race [5, 9]. In order to better understand foot shape, feet have been classified [10] according to racial differences [5-7], arch type [11-12], flare angle [13-14], rear foot position [10, 15-20], big toe length [1] and foot print indices [20-24]. This variation clearly indicates that developing better fitting footwear is a challenge, and that it is very important to understand the 3D foot shape. Currently footwear is produced in different sizes based on foot length and girth (or width). Since foot length and width are only somewhat correlated, current sizing methods are not optimal. Some studies have shown the importance of foot curvature (or flare angle) [25-27], and an accurate and reliable method has been developed to quantify foot curvature (or foot flare) using the principal component method [26]. Moreover, analysis of footwear outlines and footwear fit perception showed that differences in
footwear curvature can produce differences in perception [25]. There should be a match between foot curvature and footwear curvature [27-29]. Foot outline and foot pressure distribution are important shapes that can influence fitting; the foot outline has been represented accurately using eight specially chosen landmarks [30], with accuracy calculated based on dimensional differences [30]. In addition to 2D shapes, 3D foot shape has been predicted using three methods [25, 31-32]. The first method uses foot length, foot width, foot height, foot curvature and a standard foot shape to predict the complete foot shape [31]. The second method uses foot outline, foot height, and a standard foot shape, and the third method uses foot outline, foot profile and a standard foot shape [32]. The third method can predict foot shape to an accuracy of around 1 mm. These techniques show that better foot prediction can be used for future developments of sizing based on 3D foot shape. Many of these studies were done using static foot shape data; dynamic foot data are still missing for better fitting footwear development. Several studies have aimed at better quantification of foot shape and the use of computational tools to capture it. The foot height [33] and mid-foot shape [34] have been modeled to develop footwear without laces [35], boots [36] and ladies' shoes [37]. The plantar foot shape changes under different load-bearing conditions have also been quantified [38-39]. Foot types (normal, pes cavus, and pes planus) have been evaluated based on dynamic foot pressure distributions [40], and foot contact properties have been analyzed based on footprints and dynamic foot pressure distributions [41]. Recent studies have aimed to develop comfortable plantar support for high heel shoes [42-43]. In dynamic situations such as normal gait, the foot undergoes different stages, including pronation during heel strike and supination during push-off. Supination is a combination of inward rotation at the ankle, adduction of the hindfoot, inversion of the forefoot, and plantarflexion, while pronation is a combination of abduction of the forefoot, eversion of the hindfoot and dorsiflexion [44]. The foot has 28±6.9° abduction and 28±4.8° adduction [44]; 37±4.5° inversion and 21±5.0° eversion; and 13±4.4° dorsiflexion and 56±6.1° plantarflexion [45]. Dorsiflexion/plantarflexion differs by gender and age group [46]. The foot range of motion is used for foot evaluation [47], and it is widely accepted that foot movement beyond the normal range causes foot injury, mainly ankle sprains [48]. Van Rijn [49] provides a detailed review of ankle sprain injury. Therefore knowledge of dynamic as well as static foot shape is essential, in medicine as well as in footwear design. During the last decade, the emergence of 3D laser scanners and digital imaging technologies [50-51] has enabled the quantification of 3D foot shape. Even though 3D laser scanning technology is widely used [52], the capture of dynamic foot shape is still in its infancy. So far, few studies have used projection patterns and digital cameras to capture dynamic foot shape while walking [53-54]. This study uses a simple multi-dimensional deformation to model dynamic foot shape. Once the dynamic foot has been generated, it can be used for fit evaluation and for complex computational models.
2 Dynamic Fit Model

Let $P$ be the data point set representing the 3D foot shape, where $p_i = (x_i, y_i, z_i) \in P$, $i = 1, \ldots, n$, is the $i$-th data point. In order to have a non-linear 'shear' or deformation, an interface plane is required. The interface plane separates two regions: one region will be deformed while the other is not. Let us assume the interface plane is plane $A$, whose equation is given by Equation (1):

$$e x + f y + g z = h \qquad (1)$$

If we want to consider the toe side, then $S$, the point set of the selected part, is denoted as $S = \{\, s_i = (x_{s,i}, y_{s,i}, z_{s,i}) \in P \mid e x_i + f y_i + g z_i < h,\; i = 1, \ldots, n_s \,\}$. For a given point $s_i$, the point $a_i$ on plane $A$ is computed, where $d_{s,i}$ is the shortest distance from the point $s_i$ to the plane $A$ (Equation (2)):

$$d_{s,i} = \lVert s_i - a_i \rVert \qquad (2)$$

The vector $t_s$ indicates the shear direction. The offset $o_{s,i}$ of the point $s_i$ is influenced by $t_s$ and $d_{s,i}$. If a cubic Hermite spline function is used, the relationship between $d_{s,i}$ and $o_{s,i}$ is given in Equation (3), where $O_{s\max}$ is the maximum offset and $D_{\max}$ is the maximum of $d_{s,i}$:

$$o_{s,i} = \left[ -2 \left( \frac{d_{s,i}}{D_{\max}} \right)^{3} + 3 \left( \frac{d_{s,i}}{D_{\max}} \right)^{2} \right] \times O_{s\max}, \qquad 0 \le d_{s,i} \le D_{\max} \qquad (3)$$

The new position $s_i^{*}$ of the point $s_i$ after a non-linear shear transformation along vector $t_s$ is given in Equation (4), where $\hat{t}_s$ is the unit vector of $t_s$:

$$s_i^{*} = s_i + o_{s,i} \times \hat{t}_s \qquad (4)$$

The shear transformation will be used to calculate plantarflexion/dorsiflexion and, with some modifications, foot shape for different heel heights and shank curves. Similar to non-linear shear, non-linear rotation can be used to calculate forefoot or rear-foot everted/inverted foot shape. Non-linear scaling will be used to generate changes in foot shape due to different weight bearing. Preliminary results are shown in Fig. 1. Currently the data have not been validated; the prediction error between predicted and measured feet can be calculated using the average and standard deviation of dimensional differences [30-32].

[Figure: (a) original digital 3D foot; (b) eversion (20°)/inversion (20°); (c) plantar-flexion (20°)/dorsi-flexion (20°); (d) abduction (5°)/adduction (20°)]

Fig. 1. Preliminary results for 3D dynamic shape prediction using non-linear multi-dimensional transformation
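Equations (1)–(4) translate almost directly into code. The following sketch is an illustrative implementation under assumed inputs: the plane coefficients, shear direction, maximum offset and the synthetic point cloud are placeholders, not the authors' data.

```python
import numpy as np

def nonlinear_shear(points, plane, t_s, o_max):
    """Apply the non-linear shear of Equations (1)-(4).

    points : (n, 3) array of foot surface points.
    plane  : (e, f, g, h) with plane A given by e*x + f*y + g*z = h.
    t_s    : shear direction vector (normalized internally).
    o_max  : maximum offset O_smax, applied at the farthest selected point.
    """
    e, f, g, h = plane
    normal = np.array([e, f, g], dtype=float)
    t_hat = np.asarray(t_s, dtype=float)
    t_hat /= np.linalg.norm(t_hat)

    side = points @ normal - h                        # which side of plane A
    sel = side < 0                                    # selected (toe) region, Eq. (1)
    d = np.abs(side[sel]) / np.linalg.norm(normal)    # Eq. (2): distance to plane A
    u = d / d.max()
    offset = (-2 * u**3 + 3 * u**2) * o_max           # Eq. (3): cubic Hermite blend

    out = points.copy()
    out[sel] += offset[:, None] * t_hat               # Eq. (4): shear along t_s
    return out

# Example with an assumed synthetic point cloud (mm) and interface plane.
rng = np.random.default_rng(0)
foot = rng.uniform([-50, 0, 0], [50, 250, 60], size=(1000, 3))
bent = nonlinear_shear(foot, plane=(0.0, 1.0, 0.0, 150.0),
                       t_s=(0.0, 0.0, 1.0), o_max=25.0)
```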
3 Discussion and Conclusion

This study attempts to generate dynamic foot shape from static foot shape using a simple multi-dimensional transformation. Further research is needed to create a comprehensive model, but the results seem promising. Although dynamic foot scanners are being developed, there will still be a need for a model that generates dynamic foot shape from static foot shape. Currently foot scanners capture the 3D foot shape, usually in a static state. It is, however, fairly easy to capture the locations of landmarks on the foot in dynamic situations (for example in normal gait or running) using motion capture systems. The relative positions of the landmarks can be used to find the angular values of the foot movement in terms of plantar-flexion/dorsi-flexion, eversion/inversion, and adduction/abduction. The dynamic foot is then generated by multi-dimensional transformation of the 3D static foot based on the calculated angular values of the foot movement. Preliminary results indicate that the static foot shape can be deformed; further studies are still required to validate our results.

Acknowledgment. This work was supported by a Hong Kong Polytechnic University grant (A-PH98).
References

1. Morton, D.J.: The Human Foot. Columbia University Press, New York (1935)
2. Sarrafian, S.K.: Anatomy of the Foot and Ankle: Descriptive, Topographic, Functional. JB Lippincott, Philadelphia (1993)
3. Rossi, W.A.: The Futile Search for the Perfect Shoe Fit. Journal of Testing and Evaluation 16, 393–403 (1988)
4. Pfeiffer, M., Kotz, R., Ledl, T., Hauser, G., Sluga, M.: Prevalence of Flat Foot in Preschool-Aged Children. Pediatrics 118, 634–639 (2006)
5. Hawes, M.R., Sovak, D., Miyashita, M., Kang, S.-J., Yoshihuku, Y., Tanaka, S.: Ethnic Differences in Forefoot Shape and the Determination of Shoe Comfort. Ergonomics 37(1), 187–196 (1994)
6. Cavanagh, P.R.: The Running Shoe Book. Anderson World, Mountain View (1980)
7. Cheskin, M.P.: The Complete Handbook of Athletic Footwear. Fairchild Publications, New York (1987)
8. Adrian, K.C.: American Last Making: Procedures, Scale Comparisons, Sizing and Grading Information, Basic Shell Layout, Bottom Equipment Standards. Shoe Trades Publishing, Arlington (1991)
9. Fessler, D.M.T., Haley, K.J., Lal, R.D.: Sexual Dimorphism in Foot Length Proportionate to Stature. Annals of Human Biology 32(1), 44–59 (2005)
10. Razeghi, M., Batt, M.E.: Foot Type Classification: a Critical Review of Current Methods. Gait and Posture 15, 282–291 (2002)
11. Nachbauer, W., Nigg, B.M.: Effects of Arch Height of the Foot on Ground Reaction Forces in Running. Med. Sci. Sports Exerc. 24(11), 1264–1269 (1992)
12. Williams, D.S., McClay, I.S.: Measurements Used to Characterize the Foot and the Medial Longitudinal Arch: Reliability and Validity. Phys. Ther. 80(9), 864–871 (2000)
13. Freedman, A., Huntington, E.C., Davis, G.C., Magee, R.B., Milstead, V.M., Kirkpatrick, C.M.: Foot Dimensions of Soldiers (Third Partial Report Project No. T-13). Armored Medical Research Laboratory, Fort Knox, Kentucky (1946)
14. Yavatkar, A.S.: Computer Aided System Approach to Determine the Shoe-Last Size and Shape Based on Statistical Approximated Model of a Human Foot. Unpublished Masters Thesis. Tufts University, Medford, MA (1993)
15. Kouchi, M., Tsutsumi, E.: Relation Between the Medial Axis of the Foot Outline and 3-D Foot Shape. Ergonomics 39, 853–861 (1996)
16. Sell, K., Verity, T.M., Worrel, T.W., Pease, B.J., Wigglesworth, J.: Two Measurement Techniques for Assessing Subtalar Joint Position: A Reliability Study. J. Orthop. Sports Phys. Ther. 19(3), 162–167 (1994)
17. Smith, L.S., Clarke, T.E., Hamill, C.L., Santopietro, F.: The Effects of Soft and Semi-Rigid Orthoses upon Rearfoot Movement in Running. J. Am. Podiatr. Med. Assoc. 76(4), 227–233 (1986)
18. Kernozek, T.W., Greer, N.L.: Quadriceps Angle and Rearfoot Motion: Relationships in Walking. Arch Phys. Med. Rehab. 74(4), 407–410 (1993)
19. Kernozek, T.W., Ricard, M.D.: Foot Placement Angle and Arch Type: Effect on Rearfoot Motion. Arch Phys. Med. Rehab. 71(12), 988–991 (1990)
20. McPoil, T.G., Cornwall, M.W.: Relationship between Three Static Angles of the Rearfoot and the Pattern of Rearfoot Motion During Walking. J. Orthop. Sports Phys. Ther. 23(6), 370–375 (1996)
21. Cavanagh, P.R., Rodgers, M.M.: The Arch Index: a Useful Measure from Footprints. J. Biomech. 20(5), 547–551 (1987)
22. Staheli, L.T., Chew, D.E., Corbett, M.: The Longitudinal Arch. A Survey of Eight Hundred and Eighty-Two Feet in Normal Children and Adults. J. Bone Joint Surg. Am. 69(3), 426–428 (1987)
23. Chu, W.C., Lee, S.H., Chu, W., Wang, T.J., Lee, M.C.: The Use of Arch Index to Characterize Arch Height: a Digital Image Processing Approach. IEEE Trans. Biomed. Eng. 42(11), 1088–1093 (1995)
24. Hamill, J., Bates, B.T., Knutzen, K.M., Kirkpatric, G.M.: Relationship between Selected Static and Dynamic Lower Extremity Measures. Clin. Biomech. 4, 217–225 (1989)
25. Luximon, A.: Foot Shape Evaluation for Footwear Fitting. Ph.D. Thesis, Hong Kong University of Science and Technology, Hong Kong (2001)
26. Goonetilleke, R.S., Luximon, A.: Foot Flare and Foot Axis. Human Factors 41, 596–607 (1999)
27. Luximon, A., Goonetilleke, R.S.: Critical Dimensions for Footwear Fitting. In: IEA 2003 Conference, Seoul, Korea (2003)
28. Witana, C.P., Goonetilleke, R.S., Feng, J.J.: Dimensional Differences for Evaluating the Quality of Footwear Fit. Ergonomics 47(12), 1301–1317 (2004)
29. Luximon, A., Goonetilleke, R.S.: A 3-D Methodology to Quantify Footwear Fit. In: Tseng, M.M., Piller, F. (eds.) The Customer Centric Enterprise – Advances in Customization and Personalization, ch. 28, pp. 491–499 (2003)
30. Luximon, A., Goonetilleke, R.S., Tsui, K.L.: Foot Landmarking for Footwear Customization. Ergonomics 46(4), 364–383 (2003)
31. Luximon, A., Goonetilleke, R.S.: Foot Shape Prediction. Human Factors 46(2), 304–315 (2004)
32. Luximon, A., Goonetilleke, R.S., Zhang, M.: 3D Foot Shape Generation from 2D Information. Ergonomics 48(6), 625–641 (2005)
33. Xiong, S., Goonetilleke, R.S., Witana, C.P., Au, E.Y.L.: Modelling Foot Height and Foot Shape-Related Dimensions. Ergonomics 51(8), 1272–1289 (2008)
34. Witana, C.P., Goonetilleke, R.S.: Midfoot Shape when Standing on Soft and Hard Footbeds. In: HFES 50th Annual Meeting, San Francisco, California (2006)
35. Xiong, S., Goonetilleke, R.S.: Can We Do away with those Cumbersome Shoe Laces? In: Eighth Pan-Pacific Conference on Occupational Ergonomics (PPCOE 2007), Bangkok, Thailand (2007)
36. Xiong, S., Witana, C.P., Goonetilleke, R.S.: The Use of a Foot Dorsal Height Model for the Design of Wellington Boots. In: Agriculture Ergonomics Development Conference (Aedec), Kuala Lumpur, Malaysia (2007)
37. Xiong, S., Goonetilleke, R.S.: Midfoot Shape for the Design of Ladies' Shoes. In: BME2006 Biomedical Engineering Conference, Hong Kong (2006)
38. Tsung, Y.S., Zhang, M., Fan, Y.B., Boone, D.A.: Quantitative Comparison of Plantar Foot Shapes Under Different Weight Bearing Conditions. J. Rehab. Res. Dev. 40(6), 517–526 (2003)
39. Xiong, S., Goonetilleke, R.S., Zhao, J., Li, W., Witana, C.P.: Foot Deformations under Different Load Bearing Conditions and their Relationships to Stature and Body Weight. Anthropological Science 117(2), 77–88 (2009)
40. Leung, A.K.L., Cheung, J.C.Y., Zhang, M., Fan, Y., Dong, X.: Contact Force Ratio: a New Parameter to Assess Foot Arch Function. Prosthet. Orthot. Int. 28, 167–174 (2004)
41. Tsung, B.Y.S., Zhang, M., Mak, A.F.T., Wong, M.W.N.: Effectiveness of Insoles on Plantar Pressure Redistribution. J. Rehab. Res. Dev. 41(6A), 767–774 (2004)
42. Luximon, Y., Luximon, A., Yu, J., Zhang, M.: Foot plantar pressure evaluation for different heel heights and with different weight-bearing. Acta Mechanica Sinica (in press, 2011)
43. Witana, C.P., Goonetilleke, R.S., Au, E.Y.L., Xiong, S., Lu, X.: Footbed Shapes for Enhanced Footwear Comfort. Ergonomics 52(5), 617–628 (2009)
44. Roaas, A., Andersson, G.B.: Normal Range of Motion of the Hip, Knee and Ankle Joints in Male Subjects, 30-40 Years of Age. Acta Orthopaedica Scandinavica 53(2), 205–208 (1982)
45. Boone, D.C., Azen, S.P., Lin, C.M., Spence, C., Baron, C., Lee, L.: Reliability of Goniometric Measurements. Physical Therapy 58(11), 1355–1390 (1978)
46. Nigg, B.M., Fisher, V., Allinger, T.L., Ronsky, J.R., Engsberg, J.R.: Range of Motion of the Foot as a Function of Age. Foot Ankle 13, 336 (1992)
47. Rome, R.: Ankle Joint Dorsiflexion Measurement Studies. A Review of the Literature. Journal of the American Podiatric Medical Association 86(5), 205–211 (1996)
48. Kaufman, K.R., Brodine, S.K., Shaffer, R.A., Johnson, C.W., Cullison, T.R.: The Effect of Foot Structure and Range of Motion on Musculoskeletal Overuse Injuries. The American Journal of Sports Medicine 27(5), 585–593 (1999)
49. Van Rijn, R.M., Van Os, A.G., Bernsen, R.M., Luijsterburg, P.A., Koes, B.W., Bierma-Zeinstra, S.M.: What is the Clinical Course of Acute Ankle Sprains? A Systematic Literature Review. The American Journal of Medicine 121(4), 324–331 (2008)
50. Wang, J., Saito, H., Kimura, M., Mochimaru, M., Kanade, T.: Shape Reconstruction of Human Foot from Multi-Camera Images Based on PCA. In: Proceedings of the Fifth International Conference on 3-D Digital Imaging and Modeling (3DIM 2005) (2005)
51. Amstutz, E., Teshima, T., Kimura, M., Mochimaru, M., Saito, H.: PCA Based 3D Shape Reconstruction of Human Foot using Multiple Viewpoint Cameras. In: Gasteratos, A., Vincze, M., Tsotsos, J.K. (eds.) ICVS 2008. LNCS, vol. 5008, pp. 161–170. Springer, Heidelberg (2008)
52. Nácher, B., Alcántara, E., Alemany, S., García-Hernández, J., Juan, A.: 3D Foot Digitizing and its Application to Footwear Fitting. In: Proc. of 3D Modelling 2004, Paris (2004)
53. Kimura, M., Mochimaru, M., Kanade, T.: Measurement of 3D Foot Shape Deformation in Motion. In: PROCAMS 2008, Marina Del Rey, California (2008)
54. Kimura, M., Mochimaru, M., Kouchi, M., Kanade, T.: 3D Cross-Sectional Shape Measurement of the Foot while Walking. In: Proceedings of the 7th Symposium on Footwear Biomechanics, International Society of Biomechanics, pp. 34–35 (2005)
Three-Dimensional Grading of Virtual Garment with Design Signature Curves

Roger Ng

Institute of Textiles and Clothing, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, China
[email protected]
Abstract. The key difference between tailoring and mass production is the use of a single size versus a size chart that contains different sizes. In practice, the garment pattern of a reference size is made for the confirmation of style and sizing during the product development process. Then, garment patterns of other sizes are derived from this reference size using a technique called grading. In flat-patterning techniques, there are three types of grading: rectangular (Cartesian coordinate), radial (polar coordinate) and line (localized Cartesian coordinate). All these methods suffer from increasing deformation as the sizes increase. In this article, I shall present a three-dimensional method of grading, which can maintain the styling and comfort characteristics for wearers of different sizes. This is achieved through the concept of the Design Signature Curve. After presenting the concept, I shall also present an example as verification.

Keywords: 3-D Grading, Design Signature Curve, Design, Fitting, Apparel.
1 Introduction

In current computer-aided design (CAD) software with virtual fitting, the pattern designer can virtually sew the pattern and put it on a virtual mannequin to simulate the fitting. The virtual mannequin typically comes from direct digitization of a human subject or from parametric generation based on the data of a national size survey. The final virtual garment defines a second skin of the virtual mannequin. Once the fitting of the reference size (i.e., the currently working mannequin) is confirmed, the grading process is typically executed separately on the flat pattern to obtain other sizes. It is known that the grading process is only effective within a small range of extrapolation: the fitting deforms as the sizes increase. The current study aims to develop the Design Signature Curve (DSC) method to handle the grading process on the three-dimensional garment pattern, rather than the flat pattern. The concept of the DSC is to establish the fitting relationship between the garment and the mannequin or a model; this fitting relationship is then applied to mannequins of other sizes. Hence, with this novel method, the deformation due to size increase can be reduced and the fitting improved.
In this article, I shall present the concept of the Design Signature Curve and then an example as verification. Finally, I shall discuss the potential extension to other garment parts.
2 Literature Review

Theoretically, if a garment can be unfolded directly from the virtual 3-D body, the grading process can be considered as an unfolding process on virtual 3-D mannequins of different sizes. Hence, I shall focus on the mathematical modeling of garment patterns, which can be traced back to as early as 1956. Mark and Taylor [1] presented a paper on the 3-D fitting of woven fabrics to surfaces of revolution. The idea was subsequently taken up by Heisey, and this line of study was completed by Samelson [2] in 1989. Since woven fabric is constructed by interlocking warp and weft yarns, Mark and Taylor considered the interlocking square unit of the woven fabric as four inelastic beams connected by four movable joints. Hence, when the fabric is laid on a 3-D surface, the beams move freely about the joints, and the angles between the beams change accordingly. In other words, the fitting process is based on the shearing property of the woven fabric. Heisey [3, 4, 5, 6] required as a constraint that the shear angle be less than the shearability of the fabric, which can be measured using fabric objective measurement methods. Moreover, Heisey defined the rotational ridge to be the curve that connects the rotational maximum on each vertical section, above which the fabric will conform to the surface with minor interpolation around concave areas and below which the fabric will drape vertically. Samelson proved the theoretical conditions that guarantee the existence of the garment pattern. Efrat [7] utilized the concept of folding a pattern into a 3-D cone to approximate the fitting process. The work was subsequently taken up by Shen and Huck [8] in 1992. Efrat selected 17 crucial points on the front right panel and 16 points for the back right panel of the upper body. Using the bust point as the origin of a geodesic polar coordinate system for the front panel, he measured the distances from the crucial points to the bust point along the surface, including the angles. Shen and Huck used a somatographic technique to input the physical data, and reported the potential of adapting such technology for the apparel industry. Hinds and McCartney [9, 10] took a totally different approach to the fitting problem. They represented the digitized image of the human body as B-spline patches and further represented the 2-D pattern panel as a series of named sides, straight lines or curves, with an internal reference point on the panel, under a polar coordinate system. Bassett and Postle [11] designed a fitting scheme based on the shearing and bending properties of the woven fabric, using a hemispherical shell as an approximation of the surface of the human body. This method was quite similar to that of Mark and Taylor, except that it depended on numerical approximation, whereas Mark and Taylor derived an analytic solution. Wang [12, 13] reported a study on the relation between ease allowance and the garment shape of the X-style jacket. The proposed model divided the body cross-section into three sections by inserting a small segment of almost rectangular shape; introducing such a rectangular shape avoids negative values of radial ease.
Hu, Ding, Yu, Zhang and Yan [14] proposed a hybrid intelligent method, based on a genetic algorithm and an artificial neural network, to learn how to relate the fitting of a garment to the set of garment patterns. The artificial neural network was used to learn the relationship between body measurements and the fit values assigned by a human expert; the genetic algorithm was then adopted to calculate the optimal size. Cho, Komatsu, Inui, Takatera, Shimizu and Park [15] presented a computerized draping method to develop the 2-D garment pattern. They represented the garment surface as a collection of triangles. Each triangle was unfolded, or mapped, onto a flat surface according to the fabric properties. Finally, these triangles were combined, and darts were created.
3 Design Signature Curve Method

The key concept of the Design Signature Curve method is to define the fitting relationship between the virtual garment and the virtual mannequin. Therefore, it is important first to define what fitting is and how to measure it. Next, by integrating the regional fitting index, one can define the overall fitting via a Design Signature Curve that contains the fitting information with respect to height. Once this curve is defined, it represents the design requirement of the designer (Fig. 1). When another size is considered, one can apply this Design Signature Curve directly to the virtual mannequin and calculate the virtual garment satisfying the same fitting requirements. The technical detail will now be presented. Firstly, the definition of fitting adopted in this study is based on the ease allowance, which is defined by the difference between the body and the garment (Fig. 2). There are two methods of measuring ease allowance. In a size chart, all the measurements are gross measurements, in the sense that a waist measurement is a girth measurement rather than four separate regional arc length measurements of the left-front waist arc, right-front waist arc, left-back waist arc, and right-back waist arc. This is the traditional method. Alternatively, the ease allowance can be measured as the radial difference between the two surfaces at any specific angle. Naturally, since only a size chart is available, it is easier to work with the traditional definition based on gross measurements.
Fig. 1. Design Signature Curve
Fig. 2. Cross-sectional view of ease, body and garment
Secondly, a three-dimensional mannequin can be defined as an atlas of surface patches. The virtual garment, which is considered the second skin of the virtual mannequin, can be defined in the same way. The surface patch model adopted in this study is the bicubic Bezier tensor-product surface patch, defined as follows:

$$S(u, v) = P\, B_3(u)\, B_3(v)^{T} \qquad (1)$$

$$B_3(t) = \left( B_{3,0}(t) = (1-t)^3,\; B_{3,1}(t) = 3(1-t)^2 t,\; B_{3,2}(t) = 3(1-t)t^2,\; B_{3,3}(t) = t^3 \right) \qquad (2)$$
In equation (1), S is the surface patch defined by the u- and v-parameters; B3(t) are the cubic Bernstein basis functions in t; and P is the control matrix. The atlas of a surface is thus defined as a collection of patches S(u, v). Under this formulation, both the virtual mannequin and the virtual garment are atlases of bicubic tensor-product surface patches. As a simplification of the calculation, the u-parameter corresponds to the cross-section while the v-parameter corresponds to the vertical section (i.e., height). Typically, the body trunk is divided into layers of four connected patches, for example, left-front waist, right-front waist, left-back waist, and right-back waist. The number of surface patches should be chosen strategically, because if there are too many control points, the Design Signature Curve may not provide enough information to determine a unique solution [16]. Thirdly, the difference between the second skin and the virtual mannequin can be described as an ease allowance curve that varies with the height level (v-parameter). This difference must be non-negative. Since the second skin is a final design of the garment, the difference is the ease allowance, and specifically the styling ease, rather than the comfort ease or movement ease. For this reason, the functional relationship between the second skin and the virtual mannequin is called the Design Signature Curve and is defined by Equation (3).
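A minimal evaluation of Equations (1)–(2), assuming a 4 × 4 grid of 3D control points (the control net below is a placeholder, not a real body patch):

```python
import numpy as np

def bernstein3(t):
    """Cubic Bernstein basis B3(t) of Equation (2)."""
    return np.array([(1 - t)**3, 3 * (1 - t)**2 * t, 3 * (1 - t) * t**2, t**3])

def bezier_patch(P, u, v):
    """Evaluate the bicubic tensor-product patch S(u, v) of Equation (1).

    P : (4, 4, 3) array of control points.
    """
    bu, bv = bernstein3(u), bernstein3(v)
    return np.einsum('i,ijk,j->k', bu, P, bv)   # sum over both index sets

# Placeholder control net: a gently bulged quad (assumed values).
P = np.zeros((4, 4, 3))
P[..., 0], P[..., 1] = np.meshgrid(np.linspace(0, 3, 4), np.linspace(0, 3, 4))
P[1:3, 1:3, 2] = 1.0                      # raise the four inner control points
point = bezier_patch(P, 0.5, 0.5)         # surface point at the patch center
```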
$$DSC(v) = \frac{\sum_i \operatorname{arclength}(S_G(u, v)) - \sum_i \operatorname{arclength}(S_B(u, v))}{\sum_i \operatorname{arclength}(S_B(u, v))} \qquad (3)$$

$$\operatorname{Arclength}(S(u, v)) = \int ds \qquad (4)$$

In equation (3), $S_G$ and $S_B$ are the surface patches of the garment and the body respectively. The DSC(v) thus sums up the total arc length of the garment at height v, subtracts the total arc length of the body, and expresses the difference as a percentage of the body arc length. In equation (4), the Arclength function measures the arc length of the boundary of the surface patch.
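Numerically, Equations (3)–(4) reduce to summing polyline arc lengths of the garment and body cross-sections at one height level. The sketch below uses assumed circular cross-sections (body radius 16 cm, garment radius 17.5 cm, chosen so that the result is close to the 9.4% chest ease reported in Table 2 later in this paper).

```python
import numpy as np

def arclength(curve):
    """Polyline approximation of Equation (4) for a sampled cross-section
    curve given as an (n, 2) array of (x, y) points."""
    return np.sum(np.linalg.norm(np.diff(curve, axis=0), axis=1))

def dsc(garment_segments, body_segments):
    """Equation (3): total garment girth minus body girth, expressed as a
    fraction of body girth, at one height level v."""
    g = sum(arclength(s) for s in garment_segments)
    b = sum(arclength(s) for s in body_segments)
    return (g - b) / b

# Assumed circular cross-sections at chest level.
theta = np.linspace(0, 2 * np.pi, 200)
body = [np.column_stack([16.0 * np.cos(theta), 16.0 * np.sin(theta)])]
garment = [np.column_stack([17.5 * np.cos(theta), 17.5 * np.sin(theta)])]
print(f"DSC at this level ~ {dsc(garment, body):.1%}")   # about 9.4%
```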
[Figure: cubic Bezier arcs between the shortest and longest arc, with points A, B, C, D, parameter values 0 and 1, and shape ratios r1 and r2]

Fig. 3. Reconstruction of an arc that satisfies the system of equations
4 Virtual Grading

In fact, the construction of the DSC is very straightforward. However, Virtual Grading, i.e., the reconstruction of the second skin from the DSC, requires a special strategy. Firstly, since a DSC contains the extra girth measurement at each height level, and this value is the sum of the differences of the curve segments along the circumference, the value must be subdivided and apportioned to each curve segment so that the segments can be modified individually. Secondly, a v-cross-section of a bicubic Bezier patch is a cubic Bezier curve, which contains two end points and two control vertices. The positions of both control points must be calculated to match the expanded arc length, so additional conditions must be used. For example, it is canonical to require the surface to be at least C1 continuous at the connecting endpoints of neighboring patches. This condition forces the trajectory of a control point to lie along the line (BC) or (CD), i.e., Pi,2−Pi,3(=Pi+1,0)−Pi+1,1, which is the geometric representation of C1 continuity (Fig. 3). Thirdly, the position of a control point has three coordinates. Although the height has been fixed by the v-parameter, two values, the x- and y-coordinates, remain to be determined. Fortunately, under the C1 continuity requirement, the ratio between x and y is fixed by the slope of the line segment Pi,2−Pi,3(=Pi+1,0)−Pi+1,1, so a unique position of the control point can be determined. Finally, it is possible to have two or more configurations of control vertices yielding the same
arc length. If the second skin is to maintain a shape structure similar to that of the underlying body (r1 and r2), the ratios between the distances of the control points to the neighboring endpoints should be maintained. Then the solution is unique, smooth, and matches the underlying body shape. The algorithm is summarized in Table 1.

Table 1. The algorithm of the Virtual Grading

Step 1. For each key cross-section do
Step 2. Distribute the DSC(v) proportionally according to the arc length ratio of each cross-sectional curve segment
Step 3. For each curve segment do
Step 4. Move the positions of the control vertices to shoot for the new arc length; the trajectory of the movement must stay on the original tangent lines at the respective end points. That is, the set of defining equations is: New_garment_girth(v) = body_girth(v) × (1 + DSC(v)); C1 continuity at both end points; shoot while maintaining a similar shape proportion, i.e., if multiple solutions exist for the set of equations, select the solution that matches the body shape.
End each curve segment; End each key cross-section
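Step 4 of Table 1 can be treated as a one-dimensional shooting problem: both inner control points slide along the fixed end tangents (preserving C1 continuity) until the cubic segment reaches the target arc length. The sketch below is one possible realization under assumed endpoint and tangent data; it is not the author's implementation.

```python
import numpy as np

def bezier_points(b0, b1, b2, b3, n=200):
    """Sample a cubic Bezier segment at n parameter values."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t)**3 * b0 + 3 * (1 - t)**2 * t * b1
            + 3 * (1 - t) * t**2 * b2 + t**3 * b3)

def arclength(pts):
    return np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))

def shoot_segment(b0, b3, tan0, tan3, target_len, r=(1.0, 1.0)):
    """Slide both inner control points along the fixed end tangents until
    the segment reaches target_len (bisection on a common scale s).  The
    ratio r between the two tangent lengths plays the role of r1 and r2.
    Assumes arc length grows monotonically with s, which holds for the
    outward-pointing tangents used here."""
    length = lambda s: arclength(bezier_points(
        b0, b0 + s * r[0] * tan0, b3 + s * r[1] * tan3, b3))
    lo, hi = 0.0, 1.0
    while length(hi) < target_len:            # bracket the solution
        hi *= 2.0
    for _ in range(60):                       # bisection refinement
        s = 0.5 * (lo + hi)
        lo, hi = (s, hi) if length(s) < target_len else (lo, s)
    return b0 + s * r[0] * tan0, b3 + s * r[1] * tan3

# Assumed quarter cross-section of radius 16 cm, to be expanded by 9.4% ease.
b0, b3 = np.array([16.0, 0.0]), np.array([0.0, 16.0])
tan0, tan3 = np.array([0.0, 1.0]), np.array([1.0, 0.0])
base = arclength(bezier_points(b0, b0 + 8.84 * tan0, b3 + 8.84 * tan3, b3))
b1, b2 = shoot_segment(b0, b3, tan0, tan3, target_len=1.094 * base)
```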
Fig. 4. Mannequin
Fig. 5. Digitized image of a mannequin
5 Worked Example

The Virtual Grading of a man's T-shirt is now presented as a verification of the proposed method. The mannequin used is a male upper body without hands (Fig. 4). The white tapes are the reference lines for the pattern unfolding process; the shoulder reference line serves as the dart reference. The digitized image of the mannequin is shown in Fig. 5. A T-shirt was drafted with the standard ease amount according to the body measurements of the mannequin on a Lectra Modaris system (Fig. 6 and Fig. 7). The
set of patterns was made into a sample and was digitized into a virtual garment. This step is indispensable because the Lectra system supports only the virtual fitting visualization, without output of the required ease data at each height level.
Fig. 6. Unfolded T-shirt pattern (back)
Fig. 7. Unfolded T-shirt pattern (front)
In Fig. 8, a cross-sectional view of the ease at the chest level is displayed; the outer red line is the garment while the inner red line is the body. The Design Signature Curve is then calculated according to Table 2. For the next size, if the ease percentage is maintained, the new measurements of the T-shirt can be calculated as in Table 3. The cross-sectional view of the chest is displayed in Fig. 9; the outer broken blue line is the garment while the inner broken blue line is the body. It can be observed that the fitting remains similar in both sizes. The corresponding Design Signature Curve is presented in Fig. 10. In this curve, the origin is set at the hip level, measuring upward; hence the x-axis is the height and the y-axis is the ease percentage ("Ease %").

Table 2. Table of ease at different levels

Girth Measurement | Body Measurement (BM) (cm) | T-shirt Measurement (GM) (cm) | Ease (%) = (GM−BM)/BM
Neck | 45 | 46 | 2.2%
Chest | 106 | 116 | 9.4%
Waist | 98 | 116 | 18.4%
Hip | 106 | 116 | 9.4%
Table 3. Table of ease at different levels after grading

Girth Measurement | Body Measurement (BM) (cm) | T-shirt Measurement (GM) (cm) | Ease (%) = (GM−BM)/BM
Neck | 47 | 48 | 2.2%
Chest | 111 | 121.5 | 9.4%
Waist | 103 | 112.7 | 18.4%
Hip | 111 | 121.5 | 9.4%
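The grading step itself then reduces to one multiplication per girth: the new garment measurement preserves the ease percentage stored in the DSC. Reproducing the chest and neck rows of Table 3 from the values above:

```python
def grade(body_new, ease_pct):
    """New garment girth preserving the ease percentage of the DSC."""
    return body_new * (1.0 + ease_pct)

print(grade(111.0, 0.094))   # chest: 111 cm body -> ~121.4 cm T-shirt
print(grade(47.0, 0.022))    # neck:  47 cm body  -> ~48.0 cm T-shirt
```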
Fig. 8. Cross-section view of ease
Fig. 9. Cross-section view of graded ease
Fig. 10. Design Signature Curve
6 Result and Discussion

The purpose of virtual grading is to generate new patterns for other sizes while maintaining a similar fitting ease. In this article, the ease percentage is preserved. The computation of the ease percentage involves calculating the difference between the girth measurements of the virtual mannequin and the second skin, expressed as a percentage. The cross-sectional styling ease can then be measured at different height levels; when the full height range of the second skin is processed, the Design Signature Curve is obtained. It should be noted that there are two types of Design Signature Curve: the Angular Design Signature Curve (i.e., fixed angle) and the Holistic Design Signature Curve (i.e., total styling ease on a cross-section). The Angular Design Signature Curve can produce a more accurate second skin, but the Holistic Design Signature Curve matches the common practice of styling ease in the apparel industry. When the DSC is applied to generate a new second skin for other sizes, the variational method must be used, because there are conditions that must be expressed as equations and satisfied. In this article, both methods have been presented and discussed. In the present example, only a bodice is used; other garment parts, such as the sleeves and the collar, can be treated similarly as long as the second skin is considered an offset surface of the body.
Acknowledgement. This project is financially supported by the Internal Research Grant of Hong Kong Polytechnic University, under the account number of ZZ65. The author would like to express his gratitude.
References 1. Mark, C., Taylor, H.: The Fitting Of Woven Cloth To Surfaces. J. Text. Ins. 47, T477– T488 (1956) 2. Samelson, S.L.: Tchebychev Nets on Two Dimensional Riemannian Manifolds, Ph.D. Thesis, Carnegie-Mellon University (1989) 3. Heisey, F., Brown, P., Johnson, R.: Three-Dimensional Pattern Drafting: A Theoretical Framework. Text. Res. J. 58(3), 1–9 (1988) 4. Heisey, F., Brown, P., Johnson, R.: Three-Dimensional Pattern Drafting: Part I: Projection. Text. Res. J. 60(11), 690–696 (1990) 5. Heisey, F., Brown, P., Johnson, R.: Three-Dimensional Pattern Drafting: Part II: Garment Modelling. Text. Res. J. 60(12), 731–773 (1990) 6. Heisey, F., Haller, K.: Fitting Woven Fabric to Surfaces in Three Dimensions. J. Text. Ins. 2, 250–263 (1988) 7. Efrat, S.: The Development of a Method for Generating Patterns for Garment That Conform to the Shape of the Human Body, Ph.D. Thesis, Leicester Polytechnic (1982) 8. Shen, L., Huck, J.: Bodice Pattern Development Using Somatographic and Physical Data. I.J.C.S.T. 5(1), 6–16 (1993) 9. Hinds, B.K., McCartney, J., Hadden, C., Diamond, J.: 3D CAD for Garment Design. I.J.C.S.T. 4(4), 6–14 (1992) 10. Hinds, B.K., McCartney, J., Woods, G.: Pattern Development for 3D Surface. CAD 23(8), 583–592 (1991) 11. Bassett, R.J., Postle, R.: Fabric Mechanical and Physical Properties, Part 4: The Fitting of Woven Fabrics to a Three-Dimensional Surface. I.J.C.S.T. 2(1), 26–31 (1990) 12. Wang, Z., Newton, E., Ng, R., Zhang, W.Y.: Ease Distribution in Relation to the X-line Style Jacket. Part 1: Development of a Mathematical Model. J. Text. Ins. 97(3), 247–256 (2006) 13. Wang, Z., Newton, E., Ng, R., Zhang, W.Y.: Ease Distribution in Relation to the X-line Style Jacket. Part 2: Application to Pattern Alteration. J. Text. Ins. 97(3), 257–264 (2006) 14. Hu, Z., Ding, Y., Yu, X., Zhang, W., Yan, Q.: A Hybrid Neural Network and Immune Algorithm Approach for Fit Garment Design. Text. Res. J. 79, 1319–1330 (2009) 15. Cho, Y., Komatsu, T., Inui, S., Takatera, M., Shimizu, Y., Park, H.: Individual Pattern Making Using Computerized Draping Method for Clothing. Text. Res. J. 76, 646–654 (2006) 16. Ng, R.: Computer Modelling for Garment Pattern Design, Ph. D. Thesis, Hong Kong Polytechnic University (1998)
A Model of Shortcut Usage in Multimodal Human-Computer Interaction

Stefan Schaffer1, Robert Schleicher2, and Sebastian Möller2

1 Research training group prometei, Franklinstr. 28-29, 10587 Berlin, Germany
[email protected]
2 Deutsche Telekom Laboratories, TU Berlin, Ernst-Reuter-Platz 7, 10587 Berlin, Germany
{Sebastian.Moeller,Robert.Schleicher}@telekom.de
Abstract. Users of multimodal systems have to choose between different interaction strategies, whereby the number of interaction steps to solve a task can vary across the available modalities. In this work we introduce such a task and present empirical data showing that users' strategy selection is affected by modality-specific shortcuts. The system under investigation offered touch screen and speech as input modalities. We introduce a first version of an ACT-R model that uses the architecture's inherent mechanisms of production compilation and utility learning to identify modality-specific shortcuts. A simple task analysis is implemented in declarative memory. The model matches the human data reasonably accurately. In our further work we will try to obtain a better fit by extending the model with further influencing factors of modality selection, such as speech recognition errors. Furthermore, the model will be refined concerning the cognitive processes of speech production and touch screen interaction.

Keywords: Multimodal HCI, User Modeling, Automated Usability Evaluation.
1 Introduction

Research interest in automated usability evaluation has grown rapidly in recent years [1,2]. At the same time, systems with novel interaction techniques like multimodal input are becoming more and more popular. Thus the simulation of multimodal interaction is of explicit interest for the field of automated usability evaluation. Simulation frameworks that can be used directly by interaction designers are still rare. Möller et al. [3] introduced the MeMo workbench for semi-automatic usability testing. MeMo incorporates a general user simulation framework applicable to different kinds and classes of systems, such as spoken dialog systems and graphical user interfaces (GUIs). The workbench supports the design of system models with multiple input modalities, but a mechanism to select a specific modality in a way comparable to the behavior of a real user is still missing. In CogTool [4] the designer is also able to build multimodal system models, but the modeler still has to select manually which modality will be used to solve a task. Studies on user behavior in multimodal human-computer interaction revealed that multiple factors are involved when a user selects the input modality for the next
interaction step. Each user of a system has individual attributes. Deploying the MeMo workbench, Jameson et al. [5] distinguish the following user attributes: perceptual and motor capabilities (e.g. visual acuity); relevant prior knowledge and experience (e.g. amount of experience with systems similar to the one under consideration); and dynamic factors like the amount of attention that the user is devoting to the performance of the task. Cognitive theories assume that the cognitive workload perceived during the interaction also affects modality selection [6]. McCracken and Aldrich [7] developed a theory of workload which implies that for a single interaction step, speech input can be more demanding than touch input: preparing and uttering a speech phrase is assumed to be more straining than the processes of initiating and performing touch screen interaction [8]. Furthermore, situational constraints like a noisy environment (e.g. traffic), so-called "eyes busy, hands busy" settings (e.g. driving), and the system itself may affect modality selection. Regarding the system, efficiency in terms of the number of interaction steps can vary for different modalities. In our previous work we observed that users selected speech instead of touch screen input if speech offered a shortcut in terms of interaction steps [9]. In addition, the error-proneness of a system and the accuracy of input devices affect the effectiveness of interaction [10,11]. Several studies have shown that users tend to prefer more accurate modalities [12,13]. Furthermore, Naumann et al. [14] revealed that the likelihood of task success influences modality selection. All these factors taken together play a part in each individual modality selection, which makes it hard to model single effects. The scope of this paper is to build a model for tasks offering shortcuts in terms of interaction steps. We present the Attentive Display study, in which human data about modality usage was recorded. More specifically, the Employee Search Task offered a shortcut for speech input compared to touch screen input, whereas the Floor Change Task involved the same number of interaction steps in either modality. Both tasks have been modeled with the cognitive architecture ACT-R [15]. In the following, we briefly summarize the collection of the empirical data which were then used for modeling.
2 Data Collection: Attentive Display Study

The Attentive Display is a wall-mounted information system for employee and room search tasks in a smart office environment. Different floors are displayed on an interactive map of an office building. The system is controlled via touch screen or a speech interface. All tasks can be done in either modality. The output is always given via the GUI.

2.1 Task

In the Attentive Display study participants performed six different tasks [16]. Our modeling efforts so far focused on two specific tasks: the Employee Search Task and the Floor Change Task. The Floor Change Task can be accomplished with one interaction step in both modalities. Using speech, a floor change can be performed by saying a command like
“Go to floor eighteen”. After processing the speech input, the map of floor eighteen is displayed. Using the touch screen, a specific floor button has to be pressed to accomplish the task (cf. Fig. 1).
Fig. 1. The GUI of the Attentive Display. After pressing a floor button the map of a specific floor is displayed.

Fig. 2. The Employee Search Task. Speech input: a direct transition from the start state to the goal state is possible by saying a keyword in combination with an employee's name. Touch screen input: press the button to open the employee search mask; type the name; press the button with the desired name (the information with the employee's bookings pops up).
Fig. 2 shows an example of the Employee Search Task. To search for an employee by means of the touch screen, a user has to open the search mask, type a name, and confirm the entry. While typing a name, a list with all names matching the letters
entered so far is shown. The more letters are correctly entered, the shorter the list gets. The list consists of buttons with employee names as textual labels. If a button is pressed, information about the current workplace and room bookings of this employee is displayed. Via speech input, a direct search is possible by saying a keyword in combination with an employee's name, e.g. "Search Patrick Pattern". The system directly displays the information of the employee.

2.2 Participants

Thirty-six German-speaking participants (17 male, 19 female) between the ages of 21 and 39 (M=31.24, SD=3.45) took part in the study. A single experiment took approximately one hour. Participants received a remuneration of €10.

2.3 Procedure

First, demographic data were gathered using a questionnaire. After that, the system was explained and the usage of touch and speech was demonstrated. Next, the participants obtained detailed information about the system in written form. Open questions were answered if the information was not sufficient. The actual test comprised 3 blocks: (1) speech, (2) touch, and (3) multimodal. To account for learning effects, the order of the first two blocks was alternated between participants. In the multimodal run the preferred modality could be selected in each interaction step. As the multimodal block was always the last, the participants had prior experience with both modalities in this block. Within each block the same six tasks were presented. Table 1 shows the interaction steps (IS) the participants could derive from the instructions for the Employee Search Task and the Floor Change Task.

Table 1. Interaction steps of the Employee Search Task and the Floor Change Task

Task                  Modality  Interaction Steps
Floor Change Task     Touch     1. Press the button of the desired floor
                      Speech    1. Say "Go to floor <number>"
Employee Search Task  Touch     1. Press the button "Employees"
                                2. Type first letter of employee name
                                3. Type second letter of employee name¹
                                4. Press the button labeled with the desired name
                      Speech    1. Say "Search <employee name>"
¹ The minimum number of necessary interaction steps is derived from the instructions. It was also possible to type the whole name before pressing the button labeled with the desired name.

2.4 Results

As the Floor Change Task (FCT) can be conducted with one interaction step in either modality, its profit PFCT of speech usage amounts to zero:
PFCT = ISTouch − ISSpeech = 1 − 1 = 0 .    (1)

Speech input was selected by 22.2% of the participants. A χ²-test revealed significant differences between both modalities (χ²(1, N=36) = 11.11, p = .001). The profit PEST of speech usage for the Employee Search Task (EST) amounts to three:

PEST = ISTouch − ISSpeech = 4 − 1 = 3 .    (2)

Here, speech usage was observed in 75% of the cases. Again, a χ²-test revealed significant differences between both modalities (χ²(1, N=36) = 9, p = .004).

2.5 Discussion

The study indicates that the profit of speech interaction affects modality selection. Speech input was strongly preferred in the Employee Search Task: the shortcut via speech was used, and users appeared to seek efficient interaction strategies. Different results were observed for the Floor Change Task. An equal number of IS for both modalities implied that usage frequency should be equal for both modalities. However, the data showed that users tended to use touch input in this condition. A possible explanation for this pattern is given in the general discussion in Section 5.
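As a side note, the reported χ² statistics can be reproduced from the raw selection counts (22.2% and 75% of N=36 correspond to splits of 8/28 and 27/9). A minimal sketch using SciPy — the counts below are derived from the percentages above, not taken from the original data files:

```python
# Reproduce the reported chi-square tests from the selection counts.
# 22.2% of 36 participants = 8 speech users (FCT); 75% of 36 = 27 (EST).
from scipy.stats import chisquare

chi2_fct, p_fct = chisquare([8, 28])   # Floor Change Task, expected 18/18
chi2_est, p_est = chisquare([27, 9])   # Employee Search Task, expected 18/18

print(f"FCT: chi2(1, N=36) = {chi2_fct:.2f}, p = {p_fct:.3f}")  # 11.11
print(f"EST: chi2(1, N=36) = {chi2_est:.2f}, p = {p_est:.3f}")  # 9.00
```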
3 ACT-R Model

A simple task analysis was implemented in the declarative memory of the model. An example for the Employee Search Task is shown in Table 2. Each interaction step is represented by a chunk with a unique name (e.g. "VOICE_1"). In the step slot the current interaction step is encoded. The action slot specifies which modality will be used to perform system input. The next slot defines which chunk will subsequently be retrieved to continue within a task.

Table 2. Representation of interaction steps of the Employee Search Task in the declarative memory of the model. Each line in the table represents a chunk encoding an interaction step.

Chunk     step    action    next
VOICE_1   ONE     SPEECH    FINISH
TOUCH_1   ONE     TOUCH     TWO
TOUCH_2   TWO     TOUCH     THREE
TOUCH_3   THREE   TOUCH     FOUR
TOUCH_4   FOUR    TOUCH     FINISH
FINISH    FINISH  FINALIZE  ONE
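In ACT-R itself, such chunks are defined in Lisp; purely as an implementation-neutral illustration, the task analysis of Table 2 can also be encoded as a lookup structure, here in Python (a hypothetical rendering, not the authors' model code):

```python
# Hypothetical Python rendering of Table 2: chunk name -> slot/value pairs.
DECLARATIVE_MEMORY = {
    "VOICE_1": {"step": "ONE",    "action": "SPEECH",   "next": "FINISH"},
    "TOUCH_1": {"step": "ONE",    "action": "TOUCH",    "next": "TWO"},
    "TOUCH_2": {"step": "TWO",    "action": "TOUCH",    "next": "THREE"},
    "TOUCH_3": {"step": "THREE",  "action": "TOUCH",    "next": "FOUR"},
    "TOUCH_4": {"step": "FOUR",   "action": "TOUCH",    "next": "FINISH"},
    "FINISH":  {"step": "FINISH", "action": "FINALIZE", "next": "ONE"},
}

def retrieve(step):
    """Return all chunks whose 'step' slot matches the requested value."""
    return [name for name, c in DECLARATIVE_MEMORY.items() if c["step"] == step]

print(retrieve("ONE"))  # ['VOICE_1', 'TOUCH_1'] -- two competing strategies
```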
In the current version, the model's procedural memory consists of production rules for retrieving instructions ("retrieve"), encoding instructions ("encode"), and finishing the task ("finish"). Fig. 3 shows how the production rules are linked.
At the beginning of a new task the model always retrieves the chunk with the value "ONE" in the step slot. For touch screen usage this is the chunk "TOUCH_1". The value "TWO" is derived from the next slot, which means that the chunk to be retrieved next should have the value "TWO" in its step slot. The information in the action slot does not currently affect the model's behavior; it will be used in upcoming versions to consider the cognitive processes of touch and speech interaction and visual search. The model's retrieval of interaction steps continues until the finish chunk is decoded. When the production rule "finish" fires, the model realizes that the task has been processed to the end. For speech usage the chunk "VOICE_1" is retrieved first. The next slot of this chunk leads directly to the "FINISH" chunk. When this chunk is decoded, the "finish" production fires again and task processing ends. To model the Floor Change Task only slight changes have to be made: the chunks "TOUCH_2", "TOUCH_3" and "TOUCH_4" can be deleted and the next slot of the chunk "TOUCH_1" has to be changed to the value "FINISH". The speech interaction steps do not have to be changed in this version of the model.
Fig. 3. The procedural memory of the model: the production rules "retrieve", "encode", and "finish" and their links.
The declarative instruction chunks are associated through the step and next slots. This representation features a practical flexibility which can be used for simulating multimodal interaction with ACT-R. If two different chunks with the same step slot are added to declarative memory, different interaction strategies (chunks) can be retrieved (e.g. "TOUCH_1" and "VOICE_1"). Furthermore, different subsequent chunks can be defined in the next slots. It has to be taken into account, however, that chunks with the same value in the step slot are usually chosen randomly; by itself, the model would therefore not reproduce human behavior. We solved this problem by making use of the ACT-R inherent mechanisms production compilation and utility learning. The production compilation mechanism combines two productions into one new rule (production) and substitutes retrievals from declarative memory directly into the new rule. Thus, specialized productions for speech and touch interaction are created.
Fig. 4. The effect of production compilation for speech interaction. For each interaction step the "retrieve" production, the "encode" (or "finish") production and a chunk are compiled into one new production.
Utility learning rewards all productions which are involved in reaching the goal. The total reward is a fixed value that is spread over the involved productions; consequently, the reward per production is lower if more rules were involved. By means of these mechanisms we let ACT-R learn the utilities of the new production rules. After several model runs the strategy involving fewer production rules has a higher utility. Hence, a modality is used with a higher probability if it offers a shortcut.
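To make these mechanisms concrete, the sketch below emulates the utility-learning update U ← U + α(R − U) for two compiled strategies, with the reward per production shrinking as more rules are involved. All parameter values (α, noise, total reward) are illustrative assumptions, and the selection rule is a simplification of ACT-R's conflict resolution:

```python
# Simplified emulation of ACT-R utility learning for modality choice.
# Each strategy's reward is total_reward / number_of_productions, as
# described in the text; parameter values are illustrative only.
import random

ALPHA, NOISE, TOTAL_REWARD = 0.2, 1.0, 10.0

def run(touch_steps, trials=100):
    utility = {"speech": 0.0, "touch": 0.0}
    steps = {"speech": 1, "touch": touch_steps}
    speech_count = 0
    for _ in range(trials):
        # noisy utility comparison approximates conflict resolution
        pick = max(utility, key=lambda s: utility[s] + random.gauss(0, NOISE))
        reward = TOTAL_REWARD / steps[pick]   # fewer rules -> higher reward
        utility[pick] += ALPHA * (reward - utility[pick])
        speech_count += (pick == "speech")
    return speech_count / trials

# Employee Search Task (touch: 4 steps) vs. Floor Change Task (touch: 1 step)
print("EST speech share:", run(4))   # rises well above 0.5
print("FCT speech share:", run(1))   # stays near 0.5
```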
4 Results

Figs. 5 and 6 show the usage of speech input for the Employee Search Task and the Floor Change Task, respectively, with the data in dashed lines and the model results in solid lines. The presented model data represent the average speech usage over 1000 independent model runs; within each model run the task was conducted 100 times. Within the first approximately (~) 30 trials of the Employee Search Task, random modality selection can be observed. From ~30 to ~40 trials the utility values of the newly learned speech productions (cf. Fig. 4) increase. The increase ends at 61.84%. In the Attentive Display study, a speech usage of 75% emerged in the empirical data. The model thus underestimates modality usage for the Employee Search Task; the misjudgment accounts for over 10%. Within the first ~40 trials of the Floor Change Task, random modality selection can be observed. After ~40 trials, utility learning levels off and speech usage settles at 51.97%. In the Attentive Display study, a speech usage of 22.2% emerged in the empirical data. The model highly overestimates modality usage for the Floor Change Task; the misjudgment accounts for nearly 30%.
Fig. 5. Modality usage in the Employee Search Task (percentage of speech usage over trials 1–99; empirical data dashed, model solid).
Fig. 6. Modality usage in the Floor Change Task (percentage of speech usage over trials 1–99; empirical data dashed, model solid).
5 Conclusion

The results indicate that the model's approximation of the human data is reasonably accurate for the high-profit task. The ACT-R inherent mechanisms production compilation and utility learning cause speech usage to increase up to a level of 61.84%. With better parameter tuning, the model's fit to the human data should be further improvable for the Employee Search Task. For the Floor Change Task the model fails. In line with our assumptions, the level of speech usage settled at ~50%; this is the level where modality learning should theoretically end up given the ACT-R learning mechanisms used. However, with a speech usage of only 22.2%, the human data showed a considerably different value. The results indicate that other factors of modality selection have a higher influence if the number of interaction steps is equal for both modalities. Personal preferences or situational factors might have a high impact. Other efficiency-related factors might also be relevant. An easy-to-measure factor is the time that one interaction step takes: in many studies, touch was more efficient in terms of time if one interaction step was needed in both modalities to fulfill the task [17,18]. Up to now the model does not consider the effect of automatic speech recognition (ASR) errors. This type of system error highly affects the efficiency of interaction. If an ASR error occurs, the profit of speech usage might vanish, because the user needs additional interaction steps to recover from the error. Users might stick to touch input if they cannot benefit from speech usage. In the Attentive Display study, 66.8% ASR errors occurred. This high error rate might explain the sparse speech usage in the non-profit condition.
It must also be mentioned that the cognitive processes of language production and touch screen input are difficult to compare. Cognitive theories imply that preparing and uttering a speech phrase might be more demanding than performing touch input [7], but up to now these differences are difficult to measure and no standardized method has been established. In order to make the interaction more realistic, we will further develop the ACT-R model and the Attentive Display simulation. A module for the realistic reproduction of errors will be integrated to consider the influence of ASR errors on modality selection. Additionally, the model will be refined concerning the cognitive processes of speech production and touch screen interaction.

Acknowledgments. The research is funded by the German Research Foundation (DFG - 1013 'Prospective Design of Human-Technology Interaction') and supported by Deutsche Telekom Laboratories.
References

1. Möller, S.: Assessment and Evaluation of Speech-Based Interactive Systems: From Manual Annotation to Automatic Usability Evaluation. Speech Technology, 301–322 (2010)
2. Baker, S., Au, F., Dobbie, G., Warren, I.: Automated Usability Testing Using HUI Analyzer. In: 19th Australian Conference on Software Engineering, ASWEC 2008, pp. 579–588 (2008)
3. Möller, S., Englert, R., Engelbrecht, K., Hafner, V., Jameson, A.: MeMo: Towards Automatic Usability Evaluation of Spoken Dialogue Services by User Error Simulations. System, 1–4 (2006)
4. John, B.E., Salvucci, D.: Multipurpose prototypes for assessing user interfaces in pervasive computing systems. Pervasive Computing 4, 27–34 (2005)
5. Jameson, A., Mahr, A., Kruppa, M., Rieger, A., Schleicher, R.: Looking for Unexpected Consequences of Interface Design Decisions: The MeMo Workbench. In: 19th Australian Conference on Software Engineering, ASWEC 2008, pp. 579–588 (2008)
6. Wickens, C.D.: Multiple resources and performance prediction. Theoretical Issues in Ergonomics Science 3, 159–177 (2002)
7. McCracken, J.H., Aldrich, T.B.: Analysis of selected LHX mission functions: Implications for operator workload and system automation goals (Technical Note ASI479-024-84). Fort Rucker, AL (1984)
8. Bierbaum, C.R., Szabo, S.M., Aldrich, T.B.: A comprehensive task analysis of the UH-60 mission with crew workload estimates and preliminary decision rules for developing a UH-60 workload prediction model (Technical Report ASI690-302-87[B], Vol. I–IV). Fort Rucker, AL (1987)
9. Wechsung, I., Engelbrecht, K.-P., Naumann, A., Möller, S., Schaffer, S., Schleicher, R.: Investigating Modality Selection Strategies. In: Workshop on Spoken Language Technology (SLT). IEEE, Los Alamitos (2010)
10. Card, S.K., Mackinlay, J.D., Robertson, G.G.: The design space of input devices. In: Proc. SIGCHI 1990, pp. 117–124. ACM Press, New York (1990)
11. Chen, X., Tremaine, M.: Patterns of Multimodal Input Usage in Non-Visual Information Navigation. In: Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS 2006). IEEE, Los Alamitos (2006)
12. Bilici, V., Krahmer, E., Riele, S., Veldhuis, R.: Preferred Modalities in Dialogue Systems. In: Proc. ICSLP 2000, pp. 727–730 (2000) 13. Suhm, B., Myers, B., Waibel, A.: Model-based and empirical evaluation of multimodal interactive error correction. In: Proc. CHI 1999, pp. 584–591. ACM Press, New York (1999) 14. Naumann, A.B., Wechsung, I., Möller, S.: Factors Influencing Modality Choice in Multimodal Applications. In: Perception in Multimodal Dialogue, pp. 37–43 (2008) 15. Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., Qin, Y.: An integrated theory of the mind. Psychological Review 111, 1036–1060 (2004) 16. Schaffer, S., Seebode, J., Wechsung, I., Metze, F., Möller, S.: Benutzerstudien zur Bewertung multimodaler, interaktiver Anzeigetafeln in unterschiedlichen Entwicklungsstufen. In: Kain, S., Struve, D., Wandke, H. (eds.) Workshop-Proceedings der Tagung Mensch und Computer 2009, pp. 22–27. Logos Berlin, Berlin (2009) 17. Perakakis, M., Potamianos, A.: Multimodal system evaluation using modality efficiency and synergy metrics. In: Proc. ICMI 2008, pp. 9–16. ACM Press, New York (2008) 18. Wechsung, I., Naumann, A.B., Möller, S.: Multimodale Anwendungen: Einflüsse auf die Wahl der Modalität. Mensch & Computer, 437–440 (2008)
Multimodal User Interfaces in IPS² Ulrike Schmuntzsch and Matthias Rötting Technische Universität Berlin, Department of Psychology and Ergonomics, Chair of Human-Machine Systems, Franklinstr. 28-29, 10587 Berlin, Germany {ulrike.schmuntzsch,matthias.roetting}@mms.tu-berlin.de
Abstract. The field of Industrial Product-Service Systems (IPS²) faces various challenges. The goal of the recently started project is the multimodal design of interaction-specific warnings and instructions in IPS² for human operators, in relation to their interactions. This approach should help to prevent mistakes by human operators during interaction, and it should support and optimize work processes in the heterogeneous area of IPS². Here we discuss how such a project can be realized step by step.

Keywords: Multimodal user interfaces, interaction specific warnings, user generated instructions, IPS².
1 Introduction

Regarding industrial production, a global trend is the growing specialization and complexity within supply chains. The automation level increases over time, and the pressure to improve productivity grows in many areas. Procedures become increasingly multifaceted and require more expert knowledge. The globalized field of Industrial Product-Service Systems (IPS²) in particular is faced with these challenges. IPS² are characterized by the integrated and reciprocally determined design, development, provision and utilization of products and services. This also includes the possibility to substitute partial aspects of services and products [1] during the design and provision phases of IPS². These highly automated production systems can be seen as socio-technical systems involving both human factors and mechanical components. Human factors can be described as a set of human-specific physical, cognitive, or social skills and abilities such as attention, information processing, or dealing with stress [2]. Thus, within socio-technical systems such as IPS², the interplay between human operators, between technical components, and within the human-machine interaction is important. A variety of different requirements puts high demands on human operators within their working environment. The dynamic and complex structure requires flexible operations and adaptability. Furthermore, there is a need for broad understanding as well as special system knowledge. For these purposes, it is important to provide concepts that enable human operators to handle the complexity and dynamics, and to optimize human reliability and performance. At the least, the developed systems should reduce human error and adapt to it.
Most current research and development work focuses mainly on the improvement of technical components, neglecting the human factor within the socio-technical system. As a result, 50–80% of reported incidents are classified as having a human cause [3, 4]. For this reason, approaches are needed which allow an automatic and predictive detection of human errors and malfunctions. The major goal behind this approach is an adaptive and user-centered support of human operators. It is particularly important to pay the best care and attention to error situations because they are critical for usability for two reasons. Firstly, they stand for situations where the user is in trouble or even in danger and potentially will be unable to use the system to achieve the desired goals. Secondly, they present options for supporting the user in bringing the task to a favorable conclusion and, in the end, understanding the system better. That is why warnings are considered an essential part of a general interface design strategy of guidance for users. It is important to make sure that error messages are integrated, coordinated and consistent across one or more applications [5]. This article presents an approach to designing a multimodal user interface for IPS² that outputs interaction-specific warnings and user-generated instructions. It begins with a brief introduction to the theoretical background of the topic. Chapter three deals with the approach for a multimodal user interface in IPS². Based on the theoretical background and the described approach, future considerations are briefly outlined in chapter four. The article concludes with a discussion in chapter five and a short summary in chapter six.
2 Theoretical Background

Besides the special context of IPS², two aspects seem most important concerning the project definition: the idea of a multimodal user interface and the interaction-specific warnings.

2.1 Multimodal User Interfaces

Multimodal user interfaces are characterized by two or more combined user input and system output modes which are featured in a coordinated manner. This kind of interface intends to identify naturally occurring human language and behavior and includes at least one recognition-based technology. In order to support intelligent adaptation to the user, task and context, information on the user's interface and the surrounding environment is tracked and integrated from multiple sensors. Furthermore, information can be transmitted to the user in different modes. Well-known examples are the presentation of speech or sound and their registration on the acoustic side. Regarding the visual side, pictures, text or videos are often used as system output. Likewise, gestures, facial expressions and eye movements can be analyzed. Haptic feedback is often transmitted by pressure or vibrations and registered by force or manual operations [6]. Figure 1 provides a brief overview of the different input and output modes in human-machine systems.
Fig. 1. The three perception modes on which interface designers mainly focus
Several advantages are offered by the multimodal interface approach. For example, it allows different users to interact with a technical system in a way which best suits their individual skills and abilities. Also, human factors such as native language status, skill, or the age of the human operator can be taken into consideration. Furthermore, a continuous accommodation to changing working conditions and contexts is possible. For example, if there is high noise pollution, it is feasible to switch from the acoustic mode to the visual or haptic one, or to receive these feedback modalities as backup. Moreover, in case of temporary disability, for instance when a human operator is not able to use an input mode for a period of time, it is possible to use another one in order to perform the task. Thus, temporary disabilities need no longer prevent assemblers who are otherwise healthy from executing their work [7]. Additionally, these interfaces fulfill users' needs to interact multimodally, because users are already familiar with communicating in such a way from their natural interpersonal interactions. Regarding error handling, various authors acknowledge that multimodal design is superior in error avoidance and successful recovery from errors [8].

2.2 Warnings in Human-Machine Systems

Warnings are an essential part of a general interface design strategy of user guidance. Difficulties in understanding warnings and instructions are sometimes encountered when they do not correspond with the way human operators usually perform their tasks. This indicates an obvious information gap in transitioning from the error message to supporting the user in performing the corrective action. Moreover, it is a sign of a gap between the system model the designers have and the users' mental model. The latter can be described as a representation of the surrounding world, the relations between its individual components, and the intuitive perception humans have of their own activities and their consequences [9]. Overall, warnings and instructions are a sensitive issue. Since errors occur because of incorrect understanding, inadvertent slips, or a lack of knowledge, users are likely to feel uncertain, confused, or inadequate. Furthermore, users are intimidated by an imperious tone, which makes it more difficult to correct the problem. Thus, warnings have to be specific and constructive, formulated positively, and presented in an appropriate physical format [10]. Moreover, the dialogues should be simple and
natural in order to speak the user's language. Information objects and operations must be presented in a sequence which corresponds to the way users typically accomplish the task. Therefore, it is necessary to discover mappings between machine information and the user's mental model of that information. This requires carrying out a so-called Usability Engineering Process, i.e. a structured approach with empirically proven methods. In the following, a common way of realizing it is introduced.

2.3 Usability Engineering Lifecycle

The Usability Engineering Lifecycle by Deborah Mayhew consists of the following three phases:

1. Requirements Analysis
2. Design/Testing/Development
3. Installation [11]
Each phase consists of a series of tasks. It begins with the requirements analysis where, for example, user profiles and task analyses have to be carried out. These are intended to describe specific characteristics of the potential users, for example the level of job experience or the expected frequency of use. The developed user profiles will influence the design decisions concerning the user interface and also help to identify the major user categories. The task analyses are used to study workflows and current user tasks in order to get an understanding of the underlying user goals. The gathered information is used to set the usability goals and to guide the user interface design process. Additionally, the platform capabilities and constraints as well as the general design principles have to be considered within the requirements analysis. As a result, usability goals and design style guides are available for the second phase. It is therefore necessary to carry out the user profiles and task analyses carefully and with the best attention. Based on the established usability goals and the developed style guides, the design, testing and development of the user interface is carried out in the second phase. This is regarded as an iterative process in which the development runs from conceptual models and mockups to increasingly sophisticated prototypes. At the end, a detailed user interface will be tested. Here it is important to make sure that the established usability goals are met and that all system functionalities are addressed. Otherwise, it does not make any sense to install the half-finished user interface in the human operators' work environment because it would only be counterproductive. Phase three includes the installation and the on-site observation of human operators in the daily working routine. At this point, various problems may emerge so that a re-engineering process becomes necessary. In conclusion, it is really important to pay the best care and attention to the first phase, where user profiles and task analyses are performed: the more time and effort is invested here, the more is saved later on. Figure 2 shows an overview of the complete Usability Engineering Lifecycle.
Fig. 2. The Usability Engineering Lifecycle by Deborah Mayhew. It consists of three phases – requirements analysis, development / design / testing and installation. Each phase consists of a series of tasks [12].
3 Approach for a Multimodal User Interface

In accordance with the heterogeneous and dynamic context of IPS² and the particular importance of error situations, the remit of the project is to design a multimodal user interface that outputs interaction-specific warnings and user-generated instructions. The mission of this project is to take human factors more into consideration within a socio-technical system such as IPS². Consequently, it is intended to make optimal use of human abilities and skills; weaknesses and limitations should be compensated by the technical system. The major goals behind this mission are to prevent mistakes by human operators while interacting and to support as well as optimize work processes. Because of the heterogeneity of users and contexts, both individual and contextual assistance is regarded as particularly useful. Based on the automatic detection and evaluation of human interaction, multimodal instructions and warnings will be generated and then presented contextually to the human operator. For these goals, we are striving to develop a concept which includes the following three aspects:

1. contextual and multimodal assistance
2. instructions
3. expert platform
First of all, there is a need to provide contextual and multimodal assistance for human operators who are about to make mistakes while operating. Second, instructions about how to do better have to be featured. And third, an expert platform is required, e.g. to collect all previous successful solutions so that they can be drawn upon in case of problems. The integrated combination of these three aspects is necessary to support the human operator, so that human abilities and skills can be used in an optimal way. In detail, realization means that a decision has to be made whether the user or the system state or both should be influenced. If the current situation is classified as less critical and solvable, the intention is to support the human operator in task performance. But if there is a safety-relevant situation, the machine state should be changed, e.g. to stop the engine, and information should be provided to the human operator [13]. After deciding who or what is to be regulated or changed, it has to be determined at which level of complexity and granularity the future action should be implemented. Regarding automation systems, it is quite common to distinguish between different types of support functions, such as informing or warning the user, suggesting one or more action alternatives, or even having the machine implement a suitable action itself [14]. Thereafter, an action will be initiated, which can mean a regulation on the human side and/or a change of the technical system state. These operations lead to a new state of the socio-technical system which again affects both human operator and machine. To realize this concept, detailed information about products and services as well as their reciprocal effects is necessary. Furthermore, adequate knowledge of the system state has to be provided in order to initiate specific instructions, e.g. to open/close a valve or to speed up the flow rate. In particular, information about the detected and classified deviation of human operators from the predicted action sequences will be required. For this purpose, cognitive user models are developed in order to
simulate the optimum user behavior and to predict future behavior [15]. All this information is provided by other subprojects. The interdependencies between the different subprojects and the information they provide are shown in Figure 3.
Fig. 3. A demonstrator at a micro production scenario in IPS² is shown. The aim is to develop a multimodal user interface that provides interaction specific warnings and user-generated instructions. Detailed sensor data and cognitive user models are required as well as information about products and services.
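Purely for illustration, the decision logic described above (whether to influence the user, the system state, or both, and at which support level) can be sketched as follows; the criticality classes, threshold and action names are hypothetical placeholders, not part of the actual TRR 29 implementation:

```python
# Schematic sketch of the assistance decision, loosely following the
# support levels of Parasuraman et al. [14] (inform, warn, suggest,
# implement). Thresholds and action names are hypothetical.
from enum import Enum

class Criticality(Enum):
    LOW = 1    # less critical and solvable -> support the operator
    HIGH = 2   # safety-relevant -> change the machine state

def assist(criticality: Criticality, deviation: float):
    """Map a detected deviation from the predicted action sequence to
    actions on the human side and/or the technical system side."""
    if criticality is Criticality.HIGH:
        return ("stop_engine", "warn_operator")    # change system state
    if deviation > 0.5:                            # hypothetical threshold
        return ("suggest_alternative_action",)     # stronger guidance
    return ("inform_operator",)                    # mild contextual hint

print(assist(Criticality.LOW, 0.7))   # ('suggest_alternative_action',)
print(assist(Criticality.HIGH, 0.9))  # ('stop_engine', 'warn_operator')
```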
Regarding the project goals it is intended to accomplish a Usability Engineering Process according to the proposed concept by Deborah Mayhew.
4 Future Considerations

Considering both the Usability Engineering Lifecycle and the described project goals, preparatory steps are being taken in order to develop a system which provides contextual and multimodal assistance to human operators who are about to make mistakes during interaction. First of all, the user interface requirements have to be clearly defined, e.g. by applying ethnographic observations of potential users and their work environment. Furthermore, major problems in the execution of different work
steps have to be identified through task analysis. For this, action-oriented methods such as the Hierarchical Task Analysis (HTA) or cognitive task analyses like the Critical Action and Decision Evaluation Technique (CADET) can be used [16]. After getting a feeling for the daily working routine, the everyday speech, and the problems human operators can face, a concept for contextual and multimodal assistance can be worked out for exemplary fault scenarios. At the beginning, the developed interaction-specific warnings can be tested in paper-pencil versions or by using the card sorting method or a paired similarity rating. It is well known from usability research experiments that the terms preferred by users enable them to get a better understanding of the system; for example, they can more often generalize their knowledge to correctly use the system in new ways [17]. After making sure that the users' mental model of the system matches the ideas of the warning concept, it should be implemented and tested in scenarios with a prototype. The warnings can then be optimized and tested again.
5 Discussion

Designing a multimodal user interface that outputs interaction-specific warnings and user-generated instructions, and implementing it within the heterogeneous area of IPS², takes time. The heterogeneity of users and contexts dominates the work environment. That is why the aspects of multimodality and sensitivity to contexts as well as to individual user preferences are of such importance. But this heterogeneity and individuality is also a great challenge for developing a multimodal user interface for IPS². It is therefore necessary to focus only on a few exemplary user contexts or fault scenarios and to work these out well. Moreover, it has to be pointed out that the current behavior of the human operator, the estimation of the situation's criticality, and the presentation of the interaction-specific warnings have to be handled in real time. Another open question is what kinds of processes and operations can be supported at all. It is potentially feasible to present interaction-specific warnings for each sequence of actions which is well-structured and defined in advance. But every procedure that is executed spontaneously by the human operator will be difficult to deal with, because no cognitive user models or machine-learning patterns could be defined beforehand. Thus, it is not possible to find a corresponding action in the database, and so no appropriate warning or instruction can be displayed. Regarding the acceptance of the proposed concept in real work environments, it has to be clarified how the human operator will interact with the developed system and how the digital data storage will be accepted.
Multimodal User Interfaces in IPS²
355
due to. This might be a great challenge concerning to the acceptance of the technology. Another issue raised by the data storage will be the unresolved question of data security. These and other challenges have to be considered.
6 Summary The written article describes an approach how to develop a multimodal user interface that provides interaction specific warnings and instructions in the heterogeneous field of IPS². This special industrial sector is determined by a dynamic and complex structure in which the design, provision and utilization of products and services are integrated and reciprocally. The theoretical background of this project is the design of multimodal user interfaces as well as its current inventions. Therefore the particular conditions and challenges in the field of IPS², e.g. regarding the micro production, should be carefully analyzed and considered in the planning phase. It seems most important that best care and attention should be paid to the implementation of a Usability Engineering Process in order to develop an appropriate and usable multimodal interface. Accordingly, in the near future it is intended to get an idea of the daily working routine and the special problem the human operator can be faced with. Based on these consolidated findings the project work will be put forward step-by-step. Finally, the designed multimodal user interface leads to a more user-centered approach in IPS² which contributes to prevent mistakes of human operators and to support as well as to optimize work processes. The TRR 29 B4 project aims to provide a method to integrate human factors into IPS² as a core context in its interaction techniques. Acknowledgments. We thank the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) for funding this research within the Transregional Collaborative Research Project TRR 29 on Industrial Product-Service Systems – dynamic interdependencies of products and services in production area.
References

1. Meier, H., Uhlmann, E., Kortmann, D.: Hybride Leistungsbündel – Nutzenorientiertes Produktverständnis durch interferierende Sach- und Dienstleistungen. In: Wt Werkstattstechnik Online, 95. Jahrgang, pp. 528–532. Springer-VDI-Verlag, Heidelberg (2005)
2. Wickens, C.D., Gordon-Becker, S.E., Liu, Y., Lee, J.D.: Introduction to Human Factors Engineering – International Edition, 2nd edn. Prentice Hall, New Jersey (2003)
3. Major Accident Hazards Bureau (MAHB): Major Accident Reporting System (MARS), http://mahbsrv.jrc.it/mars/Default.html (October 01, 2007)
4. Working Group on Chemical Accidents: Preparation of a workshop on human factors in chemical accidents and incidents. In: Proceedings of the 16th Meeting of the Working Group on Chemical Accidents, Varese (2006)
5. Reason, J.: Human Error, 2nd edn. Cambridge University Press, Cambridge (2003)
6. Hedicke, V.: Multimodalität in Mensch-Maschine-Schnittstellen. In: Timpe, K.-P., Jürgensohn, T., Kolrep, H. (eds.) Mensch-Maschine-Systemtechnik – Konzepte, Modellierung, Gestaltung, Evaluation, 5th edn., pp. 203–232. Symposion Publishing GmbH, Düsseldorf (2002)
7. Oviatt, S.: Multimodal Interfaces. In: Jacko, J., Sears, A. (eds.) Handbook of Human-Computer Interaction. Lawrence Erlbaum, New Jersey (2002)
8. Oviatt, S., Bernard, J., Levow, G.: Linguistic adaptation during error resolution with spoken and multimodal system. Language and Speech, Special Issue on Prosody and Conversation 41(3-4), 415–438 (1999)
9. Norman, D.A., Draper, S.: User centered system design: New perspectives on human-computer interaction. Lawrence Erlbaum Associates, Mahwah (1986)
10. Shneiderman, B., Plaisant, C.: Designing the User Interface – Strategies for Effective Human-Computer Interaction, 5th edn. Addison-Wesley, Boston (2010)
11. Mayhew, D.: The Usability Engineering Lifecycle – A Practitioner's Handbook for User Interface Design. Academic Press, San Francisco (1999)
12. Mayhew, D.: The Usability Engineering Lifecycle, http://www.isrc.umbc.edu/HCIHandbook/figures/47-01.jpg (January 31, 2011)
13. Dzaack, J., Höge, B., Rötting, M.: User-Centric and Contextual Support in IPS2. In: CIRP IPS2 Conference, pp. 13–19 (2010)
14. Parasuraman, R., Sheridan, T.B., Wickens, C.D.: A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics 30, 286–297 (2000)
15. Beckmann, M., Dzaack, J.: Incorporating emotion data and cognitive models in IPS2 (in press, 2011)
16. Embrey, D.: Task Analysis Techniques. Human Reliability Associates Ltd. (2000)
17. Nielsen, J.: Usability Engineering. Academic Press, San Francisco (1993)
The Application of the Human Model in the Thermal Comfort Assessment of Fighter Plane’s Cockpit Haifeng Shen and Xiugan Yuan School of Aeronautic Science and Engineering, Beihang University, 100191, Beijing, P.R. China [email protected]
Abstract. Thermal comfort is an important concern of aircraft design and operating departments. Computational Fluid Dynamics (CFD) is a main numerical simulation method used in this field. A human model of a pilot is built according to its geometrical and thermal features in order to simulate the air flow and heat transfer in a fighter plane's cockpit. The velocity and temperature fields are obtained by using a numerical simulation platform involving the human model above. Meanwhile, an experiment was carried out to verify the effectiveness of the numerical simulation. The comparison between calculation and experimental results shows that the calculation errors of velocity and temperature are 15% and 6%, respectively, which proves that the human model can be used for numerical simulation and satisfies the requirement on simulation precision. Furthermore, some thermal comfort assessment criteria, such as the mean skin temperature and the core temperature of the body, are obtained by feeding the calculation results above into the Human Thermal Regulation System (HTRS) model.

Keywords: cockpit; CFD; human model; thermal comfort; assessment.
1 Introduction

The thermal comfort of a pilot in the cockpit is an issue to which aircraft design and operating departments attach great importance. The velocity and temperature fields formed by the air flow and heat transfer inside a cockpit are the working conditions for a pilot and directly influence the cockpit's thermal comfort, which in turn affects the pilot's working efficiency [1-2]. For this reason, we should first study the velocity and temperature distribution of the airflow inside the cockpit. Currently, numerical simulation based on CFD is a popular method for solving the air flow and heat transfer in confined spaces such as rooms, cabins, and so on. Meanwhile, the human model is a key factor affecting the precision and rationality of a simulation, especially when the issue of comfort is involved [3-10]. A human model for a numerical simulation involves both geometrical and thermal features. With the development of 3D modeling technology, the geometrical features of a human model can be generated using 3D modeling software. However, the thermal features of a human model are relatively difficult to describe because of the
complexity of the heat transfer between the human body and the external environment. Another difficulty is how to evaluate thermal comfort once the environment around the body is known from the numerical simulation. To solve the above-mentioned problems, a human model of a pilot is built for an air velocity and temperature numerical simulation. An experiment was carried out to verify the effectiveness of the numerical simulation. Afterwards, an assessment is made by combining the simulation above with the HTRS simulation.
2 Calculation Models and Methods

2.1 Governing Equations

The basic conservation equations were solved by employing the finite volume method and the SIMPLEC algorithm. The realizable k–ε turbulence model was used. The simulation results presented in the paper are steady-state results. The air was assumed to be incompressible and its physical properties were assumed constant. For convergence, the residuals of continuity, momentum, turbulence kinetic energy and its dissipation rate had to reach a magnitude of 5×10⁻⁵, while the residual of energy had to reach a magnitude of 1×10⁻⁸. The conservation equations were formulated as follows:
(1)
⎤ ∂ u k ⎞⎟ ∂ ⎡ ⎛⎜ ∂ u i ∂ u j 2 ∂p ∂ ∂ ⎢μ + − δ ij + − ρ u ' i u ' j ⎥ (2) (ρ ui ) + (ρui ⋅ u j ) = − ∂ x k ⎟⎠ ∂x i ∂ x i ∂ x j ⎢ ⎜⎝ ∂ x j ∂x j ∂t 3 ⎥⎦ ⎣
⎤ ∂ ∂ ∂ ⎡ μ ∂T (ρT ) + (ρ u j ⋅T ) = − ρ u ' j T '⎥ + S T ⎢ ∂t ∂x j ∂ x j ⎣⎢ Pr ∂ x j ⎥⎦ ∂(ρu jk ) ∂x
∂ ( ρ u jε ) ∂x
j
=
∂ ∂x
j
j
=
∂ ∂x
⎡⎛ μt ⎢ ⎜⎜ μ + σ ε ⎣⎢ ⎝
j
⎡⎛ μt ⎢ ⎜⎜ μ + σ k ⎢⎣ ⎝
⎞ ∂k ⎟⎟ ⎠ ∂x j
⎤ ⎥ + G k + G b − ρε − Y M ⎥⎦
(3)
(4)
⎞ ∂ε ⎤ ε2 ε ⎟⎟ + c 1ε c 3 ε G b (5) ⎥ + c1 ρ S ε − c 2 ρ x k ∂ k + νε j ⎥ ⎠ ⎦
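Solving Eqs. (1)–(5) on the cockpit grid requires a full CFD code; purely as a didactic sketch of the finite-volume discretization and the residual-based convergence check described above, consider steady 1D heat diffusion between two walls (a toy example, not the cockpit solver):

```python
# Toy finite-volume example: steady 1D diffusion with fixed wall
# temperatures, iterated until the residual falls below a tolerance
# (mirroring the residual criteria quoted above).
import numpy as np

N, T_LEFT, T_RIGHT, TOL = 50, 40.0, 20.0, 1e-8
T = np.full(N, 30.0)  # initial guess for the cell-centered temperatures

for it in range(200_000):
    T_old = T.copy()
    T[0] = (2.0 * T_LEFT + T[1]) / 3.0      # half-cell next to the left wall
    for i in range(1, N - 1):
        T[i] = 0.5 * (T[i - 1] + T[i + 1])  # interior flux balance
    T[-1] = (2.0 * T_RIGHT + T[-2]) / 3.0   # half-cell next to the right wall
    if np.max(np.abs(T - T_old)) < TOL:     # residual-based convergence check
        break

print(f"converged after {it} sweeps; T spans {T.min():.2f}..{T.max():.2f} C")
```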
2.2 Geometrical Features of the Human Model

When calculating the velocity and temperature fields by the CFD method, grid generation is the first step. To generate a high-quality grid, a good geometrical model is an
important precondition. From the perspective of geometrical modeling, the human model is the most difficult part of the whole cockpit geometry. First, the geometry of the human body is so complicated that it is hard to model accurately. In fact, the more complicated the geometrical model is, the more difficult it is to generate the grid. Even if the grid can be generated, some deformed grid cells will probably be produced; consequently, the convergence and accuracy of the numerical simulation will be affected remarkably. On the other hand, if the geometry of the human body is oversimplified and some key features are lost, the simulation will not describe the real picture of the velocity and temperature fields. Therefore, in this research a reasonable human model geometry was built according to the standard human body data in the military criteria [11-12]. The sizes and surface features of several main body parts were described relatively precisely, while other unnecessary details were ignored. Moreover, the actual working posture was taken into consideration in the design. As a result, the human model's geometrical size, shape and posture were comparatively close to the real situation. A fighter plane cockpit's configuration with the human body is indicated in Fig. 1.
Fig. 1. Geometry of cockpit with human body
Based on the geometrical model above, an unstructured multi-block grid was used to solve the problem. The grid contains 1,123,265 cells. Supply air travels from a main pipe lying on the left floor and is then divided into five branches on its way, including the back hatch pipe, windshield pipe, feet pipe, left nozzle and right nozzle. The outlet is on the right side of the back bulkhead.
2.3 Thermal Features of the Human Model

Heat transfer boundary conditions are critical foundations in CFD. A reasonable heat transfer boundary condition should reflect the true thermal features of the boundary and the true heat transfer process between the boundary and the environment. Owing to its physiological characteristics, the thermal features of a human body are too complicated to be described by a simple boundary condition. In past research, a human body was generally treated as an isothermal wall or a uniform heat flux wall. However, the parts of the body differ greatly in their respective physiological characteristics and heat transfer processes, so this simplification is unreasonable. In this research, the human body was divided into four parts: head, trunk, arms and legs. The generated heat of each part is shown in Table 1, and each part can then be regarded as a uniform heat flux wall.

Table 1. Generated heat of each part of the body

Body Part    Head   Trunk   Arms   Legs
Ratio (%)    18.4   66      7.3    8.3
Total Power  150 W
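To impose Table 1 as uniform-heat-flux wall boundary conditions, each part's power is divided by that part's surface area in the grid. The areas in the sketch below are rough illustrative values (the actual model areas are not reported here):

```python
# Derive uniform heat-flux boundary values from Table 1.
# The surface areas (m^2) are assumed placeholder values, so the
# resulting fluxes are illustrative only.
TOTAL_POWER = 150.0  # W, from Table 1
RATIO = {"head": 0.184, "trunk": 0.66, "arms": 0.073, "legs": 0.083}
AREA  = {"head": 0.11,  "trunk": 0.60, "arms": 0.25,  "legs": 0.55}  # assumed

for part, ratio in RATIO.items():
    power = TOTAL_POWER * ratio
    flux = power / AREA[part]   # W/m^2, applied as the wall boundary value
    print(f"{part:5s}: {power:5.1f} W -> {flux:6.1f} W/m^2")
```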
3 Experiment

3.1 Measurement Planes and Points

The measurement planes chosen for the experiment on measuring the velocity and temperature fields are shown in Fig. 2, where plane1 is the symmetry plane of the cockpit (x=0) and plane2 is in front of the human body, the vertical distance between plane2 and the body being about 50–80 mm; in addition, a region near the clothing surface was defined for the temperature field measurement. The measurement points on each plane or region are indicated in Fig. 3.

3.2 Heat Load Simulation and Measurement Equipment

The aerodynamic heat load and the generated heat load of the human body were simulated in the experiment. The aerodynamic heat load was simulated by using a radiation heating chamber to heat the outside surface of the cockpit; the temperature of the outside surface was controlled according to the simulated flight condition. The generated heat load was simulated by using a dummy wrapped in heating tape, with the heat power of each part as shown in Table 1. An ATM2400 36-channel airflow and temperature instrument produced by the DegreeC company in the U.S.A. was used to measure the velocity and temperature fields in the cockpit.
4 Comparison between Numerical and Experimental Data

Under the condition of air supply G0 = 300 kg/h with both nozzles closed, the temperature contours of measurement plane1 and plane2 are presented in Fig. 4 and Fig. 5, respectively. The comparisons of the temperature fields between numerical and experimental results are presented in Fig. 6. Under this condition, the calculation errors of the velocity field of plane1 and plane2 compared with the experimental results are 13.48% and 18.46%, respectively, and the calculation errors of the temperature field of plane1, plane2 and the region near the clothing surface are 5.24%, 5.70% and 8.63%, respectively. This shows that the human model built above can satisfy the requirements of the numerical simulation, which means that the geometrical and thermal features defined above are reasonable.
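The quoted error figures are mean relative deviations between calculated and measured values over the measurement points. A minimal sketch of such a comparison (the sample numbers below are made up; the study's point data are not reproduced here):

```python
# Mean relative error between calculated and measured point values.
# The four sample temperatures are invented for illustration.
import numpy as np

t_calc = np.array([24.1, 23.5, 25.0, 22.8])  # simulated values at points
t_meas = np.array([23.0, 22.6, 23.9, 21.5])  # measured values at points

rel_err = np.mean(np.abs(t_calc - t_meas) / np.abs(t_meas)) * 100.0
print(f"mean relative error: {rel_err:.2f}%")
```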
Fig. 2. Measurement planes

Fig. 3. Measurement points: (a) Plane1; (b) Plane2; (c) near the clothing surface
5 Thermal Comfort Assessment

5.1 Thermal Comfort Assessment Criteria

After the velocity and temperature fields in the cockpit are calculated by the above method, the calculation results can be used to assess the thermal comfort of the cockpit by referring to the relevant standards. There are two assessment methods: one is the mean temperature near the body (Tdb), and the other is a physiological method including the mean skin temperature (Tskin) and the core temperature (Tcore). The thermal comfort standards for these three assessment criteria are indicated in Table 2 [13]. Tdb can be calculated directly from the numerical simulation results of the temperature field, but Tskin and Tcore must be obtained by feeding the calculation results above into the HTRS model.

Table 2. Thermal comfort assessment standards of the cockpit

                      Temperature Range
Assessment Criteria   Feeling comfort   Ensuring efficiency   Slight stress
Tdb (°C)              16.0–26.0         27.0–32.0             33.0–35.0
Tcore (°C)            37.0±0.2          37.0±0.4              >37.6 or <36.4
Tskin (°C)            33.0±1.5          33.0±2.6              33.0±4.0
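Once Tdb, Tcore and Tskin are available, the assessment against Table 2 reduces to a range check; a small sketch with the ranges transcribed from the table (the open-ended Tcore stress band is noted in a comment):

```python
# Classify the three criteria against the bands of Table 2.
def band(value, bands):
    """Return the label of the first (label, low, high) band containing value."""
    for label, lo, hi in bands:
        if lo <= value <= hi:
            return label
    return "outside listed bands"

T_DB   = [("comfort", 16.0, 26.0), ("efficiency", 27.0, 32.0), ("stress", 33.0, 35.0)]
T_CORE = [("comfort", 36.8, 37.2), ("efficiency", 36.6, 37.4)]  # stress: <36.4 or >37.6
T_SKIN = [("comfort", 31.5, 34.5), ("efficiency", 30.4, 35.6), ("stress", 29.0, 37.0)]

print(band(23.3, T_DB), band(37.03, T_CORE), band(33.68, T_SKIN))
# -> comfort comfort comfort (cf. Table 3 in Sect. 5.3)
```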
5.2 Human Thermal Regulation System Model [14]

The actual HTRS is very complicated and can be regarded as a feedback system. The whole system has two subsystems: the control system and the controlled system. The control system is the nervous system participating in temperature regulation, including the regulation center and the thermoreceptors. The controlled system is made up of effectors. In the HTRS, the controlled variables are Tskin and Tcore. The regulation center collects and integrates all signals received by the thermoreceptors and then drives the relevant effectors to generate the physiological reactions of temperature regulation. Therefore, if the environment temperatures are known, Tskin and Tcore can be calculated using the HTRS model.
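The HTRS model of [14] is far more detailed; purely as a conceptual illustration of this feedback structure (effectors driven by the deviation of Tcore and Tskin from their set points), a minimal two-node core/skin sketch is given below. All coefficients are invented for the illustration and are not those of the cited model:

```python
# Conceptual two-node (core/skin) feedback sketch of thermoregulation.
# Coefficients and heat capacities are invented illustrative values.
def simulate(t_env, minutes=120, dt=1.0):
    t_core, t_skin = 37.0, 33.0
    m = 150.0 / 1.8  # metabolic heat per body area, W/m^2 (1.8 m^2 assumed)
    for _ in range(int(minutes / dt)):
        # control law: skin blood flow rises with warm error signals
        blood_flow = 6.0 + 75.0 * max(t_core - 37.0, 0.0) \
                         + 0.5 * max(t_skin - 33.0, 0.0)
        q_core_to_skin = (5.28 + 1.163 * blood_flow) * (t_core - t_skin)
        q_skin_to_env = 8.0 * (t_skin - t_env)  # convective loss to cabin air
        t_core += dt * 60.0 * (m - q_core_to_skin) / 30_000.0   # lumped capacity
        t_skin += dt * 60.0 * (q_core_to_skin - q_skin_to_env) / 6_000.0
    return t_core, t_skin

print(simulate(t_env=23.3))  # settles near comfortable Tcore/Tskin values
```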
Fig. 4. Temperature contour of measurement plane1, G0 = 300 kg/h and both nozzles closed: (a) calculation result; (b) experimental result

Fig. 5. Temperature contour of measurement plane2, G0 = 300 kg/h and both nozzles closed: (a) calculation result; (b) experimental result
Fig. 6. Comparison between calculation and experimental results of the temperature field, G0 = 300 kg/h and both nozzles closed: (a) Plane1; (b) Plane2; (c) near the clothing surface
5.3 Thermal Comfort Assessment Analysis

According to the velocity and temperature fields obtained above and the HTRS model, Tdb, Tskin and Tcore can be calculated. The results are indicated in Table 3. Comparing Table 3 with the standards in Table 2 shows that, under the condition of air supply G0 = 300 kg/h with both nozzles closed, the three assessment criteria above are all in the "feeling comfort" range, which means that the temperature environment in the cockpit can satisfy the thermal comfort requirement.

Table 3. Calculation results of the thermal comfort assessment criteria

Tdb (°C)     23.3
Tcore (°C)   37.03
Tskin (°C)   33.68
6 Conclusions

The following conclusions can be drawn from this investigation: (1) The human model is a key factor in the numerical simulation of air flow and heat transfer in a confined space with a human body inside, and a human model with reasonable geometrical and thermal features can improve the simulation accuracy; (2) The thermal comfort assessment criteria can be calculated by combining the numerical simulation of air flow and heat transfer with the HTRS model. This means that if physiological features are taken into consideration, an integrated human model can be built and a complete method of thermal comfort assessment can be established accordingly.
References

1. Chen, X., Yuan, X.: General Human-Machine-Environment System. Beihang University Press, Beijing (2000) (in Chinese)
2. Shou, R., He, H.: Aircraft's Environment Control. Beihang University Press, Beijing (2004) (in Chinese)
3. DeJager, A.W., Lytle, D.B.: Commercial airplane air distribution system development through the use of computational fluid dynamics. In: AIAA-1992-0987, pp. 1–9 (1992)
4. Singh, A., Hosni, M.H., Hortsman, R.H.: Numerical simulation of airflow in an aircraft cabin section. ASHRAE Transactions 108(1), 1005–1013 (2002)
5. Gunther, G., Bosbach, J., Pennecot, J., et al.: Experimental and numerical simulations of idealized aircraft cabin flows. Aerospace Science and Technology (10), 563–573 (2006)
6. Zhang, T.F., Chen, Q.Y.: Novel air distribution systems for commercial aircraft cabins. Building and Environment (42), 1675–1684 (2007)
7. Yuan, X.: Study on Mathematic Model of Human-Cockpit's Environment Temperature. Selected Papers of The Key Research Topic of Human-Machine-Environment Engineering, pp. 17–23 (1987) (in Chinese)
8. Yang, C.: The numerical simulation on the thermal behavior of aircraft cabin. Acta Aeronautica et Astronautica Sinica 16(1), 64–68 (1995) (in Chinese)
9. Lin, G.: Numerical fluid flow and heat transfer within an air-conditioning cockpit. School of Aeronautic Science and Engineering, Beihang University, Beijing (1998) (in Chinese)
10. Xiong, X., Liu, W., Ang, H., et al.: Flow fields and thermal comfort in training-plane's air-conditioned cockpit. J. Applied Sciences 25(6), 639–644 (2007)
11. Technology and Industry for National Defense: GJB20-84, Standard series on individual protecting and lifesaving equipment of pilot. Military Criterion Press of Technology and Industry for National Defense, Beijing
12. Technology and Industry for National Defense: GJB36-85, The design and use requests of pilot's body model. Military Criterion Press of Technology and Industry for National Defense, Beijing
13. Technology and Industry for National Defense: GJB1129-91, Methodology and physiological requirements for temperature assessments of cockpit in military aircraft. Military Criterion Press of Technology and Industry for National Defense, Beijing
14. Qiu, Y.: Numerical and experimental study on liquid cooling garment used for extravehicular activity – human thermal regulation system. School of Aeronautic Science and Engineering, Beihang University, Beijing (2002) (in Chinese)
Mass Customization Methodology for Footwear Design

Yifan Zhang1, Ameersing Luximon1, Xiao Ma1, Xiaoling Guo2, and Ming Zhang3

1 Institute of Textiles and Clothing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
2 Zhongshan Boai Hospital, Guangdong, China
3 Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
[email protected]
Abstract. Nowadays consumers are sophisticated and want better quality personalized products at lower prices. To satisfy them, footwear companies have to predict consumer trends and produce huge varieties of designs using mass production technologies, which results in wastage and high inventory cost. Recently, mass customization methodologies have been proposed. Mass customization enables the design and production of an almost custom-made product at almost mass-production cost, achieved by reducing wastage and inventory cost. The products are designed specifically for the consumer, eventually improving consumer satisfaction and the companies' market share. Many researchers have discussed the framework of mass customization; however, there is limited study of the system details, especially for footwear design. The main purpose of this research is to propose a methodology for the mass customization of footwear – both style and fit customization. Keywords: Footwear design, Mass customization, Custom footwear, Style customization, Fit customization, Computer-Aided Design (CAD) technology.
1 Introduction

The traditional methods of production used by manufacturing companies focus on mass production, which tends to produce large quantities of products with minor variations. Nowadays, the combination of globalization and digitalization has tremendous economic impact and has radically transformed the environments in which companies operate. The internationalization and digitalization of the markets offer companies the possibility to diversify their supply, production and distribution networks. Facing a plethoric supply, consumers have become more demanding and selective; products with variation and customization are a trend in the current market-oriented manufacturing environment. Companies try to respond as precisely as possible to consumers' needs with highly focused products, each one aiming at a specific and highly differentiated consumer class, or each being personalized for a single consumer [1]. Mass customization relates to the ability to provide customized products or services through flexible
processes in high volumes and at reasonably low costs [2]. Selladurai [3] studied methods to achieve mass customization; he explained why the term is not an oxymoron but describes a real situation, looked at its advantages and disadvantages, and discussed how it may be effectively used in production and operations management. Similarly, Aigbedo [4] proposed a framework for evaluating the levels of mass customization that enables the manufacturer to meet consumers' requirements in a cost-effective manner. The shoe industry is no exception. It is a very competitive industry with many companies and educated consumers, whose preferences and perception of shoes strongly influence acceptance. Today's consumers, especially the young, are sophisticated: they demand shoes that satisfy their needs at reasonable prices, and at the same time they wish to personalize the style and fit of their shoes. To capture the footwear market, companies need to manufacture customized shoes that satisfy the consumer's needs for style, fit and comfort. To enable footwear customization, the most important aspect is to design a customized, or better fitting, shoe-last [5-7]. A shoe-last, the 3D mold used for making footwear, has a relatively complex shape without any straight lines and is normally made of high-density polyethylene [8]. To realize footwear fit customization, it is necessary to consider the wide variations in foot dimension, weight, shape, surface area, volume and composition. For mass production, footwear fit is influenced by sizing and grading methods [9-16]. For footwear, there is a choice to customize the fit or the style. The aim of this study is to create a methodology to customize shoes for fit as well as style depending on the consumer's requirements. Previous studies [17-19] have discussed the framework of footwear mass customization; however, very few have discussed the details in terms of Computer-Aided Design (CAD) technologies. In this study, the processes of generating designs and modifying shoe-last designs are also discussed. Since 3D foot-scan data are used, the system enables design customization as well as fit customization.
2 Method

The framework of the proposed study is shown in Fig. 1. It covers four product paths: mass-produced shoes, fit-customized shoes, style-customized shoes, and style-and-fit-customized shoes. For mass-produced shoes, the consumer's foot anthropometric measures are required to extract the size information; if the consumer has digital foot data acquired from a foot scanner, the size information is extracted from the digital foot. Based on the sizing information, the shoe-last can be sized and graded, and shoes are produced according to mass production techniques.

2.1 Customization of Mass Produced Shoes via Sizing and Grading

Each country has its own guidelines for shoe-last sizing and grading; however, methods for shoe-last design are limited. Two widely used shoe-last design guidelines
Fig. 1. Framework of mass customized shoe design system

Fig. 2. Sizing and grading system: (a) side view; (b) front view
include the AKA64-WMS American system [20, 21] and the Chinese system [22]. The AKA64 system, developed in 1964, was based on an anthropometric study of children and is now used for sizing children as well as adults [21]. Fig. 2 shows shoe-last sizing and grading according to the American system. Currently, sizing and grading are based on foot length and foot girth. If a better sizing and grading method were developed, footwear fit and comfort could be improved even with mass production technologies; in addition, using half sizes improves the tolerance.

2.2 Style Customization

In addition to mass-customized footwear, consumers can decide on style customization. Style customization can be accomplished by modification of the upper or modification of the shoe-last (Fig. 3). When modifying the shoe upper, consumers can change the style lines to create new designs (Fig. 4a and Fig. 4b); this is done by modifying the style curves, which are represented as splines or NURBS curves. The style can also be modified by removing or adding accessories, such as removal of buckles (Fig. 4c), or by changing material (Fig. 4d and Fig. 4e) and color (Fig. 4f). For shoe-last style modification, the toe design (Fig. 5 and Fig. 6a), toe spring and heel height (Fig. 6b) can be modified. It should be noted that modification of the toe design, toe spring and heel height will also influence the fit to some extent. Shoe-last style modification is included in shoe style modification, because normal sizing and grading is used after the shoe style modifications (Fig. 1).
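Since the style lines are represented as splines or NURBS curves, a style edit amounts to moving control points and re-evaluating the curve. A minimal B-spline sketch using SciPy follows; the control points are hypothetical, not taken from any actual upper design.

import numpy as np
from scipy.interpolate import BSpline

# Hypothetical 2D control points of one style line on a flattened upper (cm).
ctrl = np.array([[0.0, 0.0], [3.0, 2.0], [6.0, 2.5], [9.0, 1.0], [12.0, 0.0]])
degree = 3
n = len(ctrl)
# Clamped knot vector so the curve starts/ends at the end control points.
knots = np.concatenate(([0.0] * degree, np.linspace(0.0, 1.0, n - degree + 1), [1.0] * degree))

original = BSpline(knots, ctrl, degree)

# Style customization step: the consumer drags one interior control point.
edited_ctrl = ctrl.copy()
edited_ctrl[2] += [0.0, 1.2]          # raise the mid-curve point by 1.2 cm
edited = BSpline(knots, edited_ctrl, degree)

u = np.linspace(0.0, 1.0, 5)
print("original:", np.round(original(u), 2).tolist())
print("edited:  ", np.round(edited(u), 2).tolist())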
Fig. 3. Framework of shoe style customization system
Fig. 4. Shoe style modification: (a), (b) style line changes; (c) accessory removal; (d), (e) material changes; (f) color change

Fig. 5. Shoe-lasts

Fig. 6. Shoe-last style modification: (a) modification of toe shape; (b) modification of toe spring
2.3 Fit Customization

To some extent, fit is customized by modification of the shoe-last style. However, fit customization is done mostly by modifying the shoe-last according to the consumer's foot data (Fig. 7). From the digital foot data, foot parameters are extracted; the number of parameters determines the degree of fit customization. For example, the parameters can be foot length, foot flare, foot width, and foot height. The matching shoe-last parameters are extracted in the same way. The shoe-last is then modified based on these parameters and given tolerances to create a customized shoe-last (Fig. 8), which is used for making the customized shoes.
Fig. 7. Framework of shoe fit customization system

Fig. 8. Custom made shoe-last (master shoe-last, customized shoe-last, and foot)
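The per-dimension modification of the master last can be sketched as applying the foot parameters plus functional allowances. The reduced parameter set and allowance values below are illustrative assumptions, not figures from the grading standards discussed in Sect. 2.1.

from dataclasses import dataclass

@dataclass
class Dimensions:            # a reduced parameter set, in mm (assumed)
    length: float
    width: float
    height: float

# Assumed functional allowances (e.g., toe allowance, width ease), mm.
ALLOWANCE = Dimensions(length=10.0, width=-2.0, height=1.0)

def customize_last(master: Dimensions, foot: Dimensions):
    """Return the customized last dimensions (foot + allowance) and the
    per-dimension change applied to the master last."""
    target = Dimensions(
        length=foot.length + ALLOWANCE.length,
        width=foot.width + ALLOWANCE.width,
        height=foot.height + ALLOWANCE.height,
    )
    deltas = {f: getattr(target, f) - getattr(master, f)
              for f in ("length", "width", "height")}
    return target, deltas

foot = Dimensions(length=255.0, width=98.0, height=60.0)     # from the foot scan
master = Dimensions(length=268.0, width=94.0, height=62.0)   # graded master last
target, deltas = customize_last(master, foot)
print(target)
print("modification applied to master last:", deltas)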
Fig. 9. Shoe-last selection using fit rating
Also, the consumer can choose a shoe-last based on the quality of fit. Different shoes are first selected based on the consumer's preference, and the matching shoe-lasts are extracted. The selected shoe-lasts are graded and sized according to the consumer's size information, and the modified shoe-lasts are compared with the feet to create a fit rating [6] and a ranking of the shoe-lasts (Fig. 9). Based on the rating and ranking, the consumer can select the shoes; a sketch of this selection step follows Sect. 2.4.

2.4 Fit and Style Customization

Depending on the cost, most customers will prefer to have both fit and style customization (Fig. 1). With style and fit customization, the customer has the freedom to modify the style of the shoe-last and the shoe. In addition to these style modifications, the shoe-last is modified to account for variation in the foot.
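As referenced in Sect. 2.3, the shoe-last selection by fit rating can be sketched as scoring each graded candidate last by its dimensional mismatch to the foot and ranking the candidates. The aggregate metric below is a simple stand-in, not the published fit rating of [6], and all dimensions are hypothetical.

import numpy as np

# Foot and graded candidate lasts described by length, width, height, girth (mm).
foot = np.array([255.0, 98.0, 60.0, 240.0])
graded_lasts = {
    "sporty": np.array([262.0, 96.0, 61.0, 238.0]),
    "dress":  np.array([266.0, 93.0, 59.0, 234.0]),
    "casual": np.array([263.0, 97.5, 60.5, 239.0]),
}

def fit_rating(last_dims):
    """Lower is better: sum of squared dimensional mismatch (stand-in metric)."""
    return float(np.sum((last_dims - foot) ** 2))

for name in sorted(graded_lasts, key=lambda n: fit_rating(graded_lasts[n])):
    print(f"{name}: rating = {fit_rating(graded_lasts[name]):.1f}")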
3 Discussion and Conclusion

Mass customization is a solution that addresses the new market realities while still enabling manufacturers to capture the efficiency advantages of mass production. In this study, a footwear customization methodology has been discussed in which both style and fit are customized. The proposed methodology covers all the different scenarios of footwear customization, and previous studies on different levels of fit customization [5, 6] can be incorporated into the proposed system. This system is advantageous to consumers, the footwear industry, and society. Consumers are able to design their shoes according to their style and fit requirements, while footwear companies are able
to increase market share and reduce waste and inventory cost. In addition, mass customization benefits society, as it creates less pollution.

Acknowledgment. This work was supported by The Hong Kong Polytechnic University PhD studentship.
References

1. Labarthe, O., Espinasse, B., Ferrarini, A., Montreuil, B.: Toward a Methodological Framework for Agent-Based Modelling and Simulation of Supply Chains in a Mass Customization Context. Simulation Modelling Practice and Theory 15(2), 113–136 (2007)
2. Silveira, G.D., Borenstein, D., Fogliatto, F.S.: Mass Customization: Literature Review and Research Directions. International Journal of Production Economics 72, 1–13 (2001)
3. Selladurai, R.S.: Mass Customization in Operations Management: Oxymoron or Reality? Omega 32(4), 295–300 (2004)
4. Aigbedo, H.: An Assessment of the Effect of Mass Customization on Suppliers' Inventory Levels in a JIT Supply Chain. European Journal of Operational Research 181(2), 704–715 (2007)
5. Luximon, A., Goonetilleke, R.S.: A 3-D Methodology to Quantify Footwear Fit. In: Tseng, M.M., Piller, F. (eds.) The Customer Centric Enterprise – Advances in Customization and Personalization, ch. 28, pp. 491–499. Springer, Heidelberg (2003)
6. Luximon, A., Goonetilleke, R.S.: A Fit Metric for Footwear Customization. In: Proceedings of the World Congress on Mass Customization and Personalization, Hong Kong (2001)
7. Luximon, A., Luximon, Y.: Shoe-last Design Innovation for Better Shoe Fitting. Computers in Industry 60(8), 621–628 (2009)
8. Cheskin, M.P.: The Complete Handbook of Athletic Footwear. Fairchild Publications, New York (1987)
9. Ball, R., Shu, C., Xi, P.C., Rioux, M., Luximon, Y., Molenbroek, J.: A Comparison Between Chinese and Caucasian Head Shapes. Applied Ergonomics 41(6), 832–839 (2010)
10. Chen, W., Zhuang, Z., Benson, S., Du, L., Yu, D., Landsittel, D., Wang, L., Viscusi, D., Shaffer, R.E.: New respirator fit test panels representing the current Chinese civilian workers. Annals of Occupational Hygiene 53(3), 297–305 (2009)
11. Kouchi, M., Mochimaru, M.: Analysis of 3D Face Forms for Proper Sizing and CAD of Spectacle Frames. Ergonomics 47(14), 1499–1516 (2004)
12. Jussi, R.: Evaluation of an Augmented Reality Audio Headset and Mixer. Master's Thesis, Helsinki University of Technology (2009)
13. Robinette, K.M., Annis, J.F.: A Nine-Size System for Chemical Defense Gloves. Technical Report, Anthropology Research Project Inc., Yellow Springs, OH (1986)
14. Luximon, A., Goonetilleke, R.S.: Critical Dimensions for Footwear Fitting. In: Proceedings of the IEA 2003 Conference, CD-ROM (2003)
15. Zheng, R., Yu, W., Fan, J.T.: Development of a New Chinese Bra Sizing System Based on Breast Anthropometric Measurements. International Journal of Industrial Ergonomics 37, 697–705 (2007)
16. Maria, L.M., Philip, N.A., Nickolas, S.S.: A New Methodology for the Development of Sizing Systems for the Mass Customization of Garments. International Journal of Clothing Science and Technology 22(1), 49–68 (2010)
17. Sheffer, A., de Sturler, E.: Parameterization of Faceted Surfaces for Meshing Using Angle Based Flattening. Engineering with Computers 17(3), 326–337 (2001)
18. Lee, Y., Kim, H.-S., Lee, S.: Mesh Parameterization with a Virtual Boundary. Computers and Graphics 26(5), 677–686 (2002)
19. Luximon, A., Luximon, Y.: Creation of Design Variations Using CAD. In: 86th Textile Institute World Conference Proceedings, Hong Kong, pp. 1465–1471 (2008)
20. Adrian, K.C.: American Last Making: Procedures, Scale Comparisons, Sizing and Grading Information, Basic Shell Layout, Bottom Equipment Standards. Shoe Trades Publishing Co., Arlington (1991)
21. Pivecka, J., Laure, S.: The Shoe Last: Practical Handbook for Shoe Designers. Pivecka Foundation, Slavicin (1995)
22. National Footwear Research Institute: Chinese Shoe Size and Last Design. China Light Industry Press, Beijing (1984) (in Chinese)
Incorporating Motion Data and Cognitive Models in IPS²

Michael Beckmann and Jeronimo Dzaack

Chair of Human-Machine Systems, Department of Psychology and Ergonomics, Technische Universität Berlin, Franklinstr. 28-29, 10587 Berlin, Germany
{michael.beckmann,jeronimo.dzaack}@mms.tu-berlin.de
Abstract. In the SFB/TR29 a focus lies on human factors and their integration into Industrial Product-Service Systems (IPS²). These innovative systems are complex and dynamic, and human operators must perform a multitude of complex tasks in such socio-technical systems, which is challenging because of the high complexity. Automatic assistance systems are therefore necessary for the overall reliability and effectiveness of such a system. This article describes a theoretical approach for simulating human behavior with cognitive models. The performed actions are recognized with motion capturing in combination with machine learning. By evaluating the perceived action against reality, a description of the situation can be generated automatically in real time. This can be used, e.g., to provide the human operator with real-time contextual feedback. Keywords: Cognitive User Models, Error Prevention, Human-Machine Interaction, Machine Learning, Gesture Recognition.
1 Introduction

Industrial Product-Service Systems (IPS²) are innovative production systems incorporating human operators and technical systems [1]. They are characterized by an integrated and reciprocally determined development, provision and utilization of products. In modern production systems the strict separation of services and products is no longer possible; IPS² provides a possibility to substitute partial aspects of services and products during the design and provision phases. IPS², like other socio-technical systems, consist of human actors and technical systems. IPS² are highly dynamic systems, because they need to be flexible in their operation and adapt to different settings. These dynamics lead to a high complexity of the technical system, requiring a vast amount of information about the systems for a multitude of tasks from the human operators. The information needed by a human operator to perform tasks on the technical systems consists of structural and status information, provided by, e.g., internal heat sensors. This information needs to be evaluated by the human operator to choose the optimal course of action. In these systems the human operators cannot specialize in only a small subset of procedures, but need to handle diverse tasks and working conditions. The
optimization of such technical systems in the design phase only considers technical aspects without regard to the special needs of human operators, which are crucial for the optimal and error-free operation of IPS². Research in the field of human factors considers the human operator as having limited resources [2] and optimizes human well-being and system performance by applying theories, principles, data and other methods to the design of human-machine systems [3]. Factors which can influence the human in the system include age, state of mind, emotions and the propensity for common mistakes, errors and cognitive biases. Allowing the human operator to control the complexity and dynamics of the technical system increases human reliability and performance, and learning from the errors of human operators can be used to reduce future errors. Designing IPS² with consideration for human factors can increase the resilience of the overall system and the performance of the human operators, increasing overall productivity. The problem of insufficient regard for human needs, in terms of knowledge and courses of action, can be addressed by providing human operators with additional contextual information and assistance during the execution of different tasks. In previous work [4] this was addressed by utilizing an expert to support a novice: the novice was the operator on site, and the expert could see the gaze and field of view of the novice from a remote location. In this paper an automatic approach is described that combines motion capturing and machine learning to recognize human behavior or actions. Valid behavior is simulated with cognitive user models. By assessing the system state and comparing the observed human behavior with the predicted courses of action, dangerous situations for the human operator as well as for the technical systems can be recognized, and system messages with an evaluation of the situation issued. These messages can be used to generate multimodal feedback for the human operator to prevent malpractice (e.g., errors of omission or confusion). Research on the optimal way of providing multimodal feedback to avert such situations is not part of this subproject, but is carried out in subproject B4, "Multimodal user interfaces – Interaction specific warnings and user generated instructions for IPS²" [5]. An example task is the partial disassembly of a system in which a part is mounted by several screws. After removal of the last screw the part falls down if not held properly, damaging the system. In this setting the cognitive user model predicts the holding of the part with one hand. The evaluation of the human movement provides the information that a screwdriver is in use, but does not recognize a holding action in the appropriate place for one of the hands. The prediction of the model and the action recognized from the recorded movement data are compared and evaluated: the invalid course of action is detected in real time and an error message generated. An appropriate reaction to the message might be haptic feedback to the hand holding the screwdriver, getting the attention of the human operator, thus stopping the screwing action and preventing the damage to the part. In the case of an automatic screwdriver, the unscrew action can be directly intercepted by the technical system.
The previous elaboration of IPS² and the examples above make it evident that contextual real-time assistance for human operators in socio-technical systems will increase the overall robustness and effectiveness of these systems. In this article a theoretical approach for recognizing mismatches between observed human behavior and valid behavior is presented. This approach combines the fields of cognitive modeling and machine learning with motion capturing. By evaluating the
simulated actions against real operator behavior, the situation and the state of the human-machine system can be assessed. This evaluation provides an objective and model-based basis for contextual feedback for human operators.
2 Approach

The following sections describe a theoretical approach to automatically recognize mistakes of a human operator in IPS² in real time. Information about the mistakes can be used to provide the human operator with contextual information and prevent harm to the human as well as to the technical system. Cognitive user models (i.e., optimal user models) can be used to model characteristics of the human operator and to predict expected actions. The combination of motion capturing with machine learning provides the possibility to recognize different actions of the human operator. A mistake is detected if there is a mismatch between the simulated action and the performed action of the human operator. Such a mismatch might be (1) a deviation from the optimal course of action, (2) a dangerous situation for the machine or the human operator, (3) an omission of previous steps, or (4) an invalid course of action for the situation. Figure 1 visualizes the general idea. The following paragraphs describe the related fields.
Fig. 1. General visualization of the workflow
2.1 Cognitive User Models Research in the field of cognitive psychology provides several theories to understand human behavior [6]. Cognitive user models incorporate these theories and make the modeling of selected cognitive processes possible. There are different cognitive architectures for developing models to simulate human behavior. These are basically programming frameworks and can be assigned to the two categories of high-level and low-level cognitive architectures. High-level architectures use elementary behavior as building blocks and describe cognitive processes as a fixed sequence of actions. They are most suitable for
investigating errors and difficulties in using interfaces in the fields of usability and human performance. An example of a high-level architecture is Goals, Operators, Methods and Selection Rules (GOMS) and its derivatives [7], which can be used to predict the time to finish routine tasks in a specific setting. High-level architectures are limited in the description of signal-detection or decision-making processes. Low-level architectures, like Atomic Components of Thought – Rational (ACT-R) [8], describe human behavior on an atomic level. Most of these architectures use production systems to simulate human processing and cognition. This corresponds to a bottom-up process when processing data and makes reactions to external stimuli possible, as well as the interruption and resumption of cognitive processes. Complex paradigms, like dual-tasking and decision making, and their underlying processes can be simulated with cognitive models designed with low-level architectures [9]. Because of the bottom-up processing of information, the predictions of cognitive models based on low-level architectures are more natural, taking into account human complexity and dynamics. Most cognitive user models are applied to predict user behavior offline, and the performance of simulations is compared with experiments to validate the model. A valid model explains the observed behavior, but the actual decision process can be completely different from the one simulated.

2.2 Motion Capturing

Motion capturing comprises several techniques to record the movements of real objects, such as a human being. For example, walking, grasping motions or the movements of a tennis serve can be recorded for later evaluation of the ergonomics or effectiveness of the motion, or for realistic animation in a virtual environment. The desired object can be captured using different technologies. An RGB and a depth camera can be used to recognize shapes and match these shapes to a model of the tracked object, recording the information by mapping the shape to the model and the movement of the shape to the movement of the model. Another technique places markers on, e.g., the human body in defined positions, tracks the markers with, e.g., infrared cameras, and moves the markers in the simulated model under the constraints of a human body. Usually the user needs to assume a specific posture for the initialization of the model. The movement of the model is used for later analysis or online processing.

2.3 Machine Learning

The field of machine learning provides a rich repertoire of methods [10] to learn from observed data. These methods are used to adjust the parameters of a model to a training dataset. Using the trained model, new data points from the same distribution as the training data points are transformed, resulting in the desired output. There are several models with different objectives. One objective can be to extract relevant features of a data set for later processing, which is the field of feature extraction. Another application is the training of a model which learns classes or categories from data points and can assign new data points to their respective class. If the labels of the data points in the training set are known, the process is called supervised
learning; otherwise it is unsupervised learning, in which case the model tries to infer classes or categories from the data. Assigning an unknown data point to a class is called classification. An application of supervised learning is handwriting recognition: images of letters are presented with their labels to the model, which later discriminates new handwritten digits and assigns the assumed label. In the present approach, human movement has to be recognized and classified into actions. With human movement the start and end of an action are not known in advance; to solve this problem, gesture spotting [11] can be used to obtain sequences for further classification. As a result, instances of the same action may not have the same duration or number of data points. This problem was researched in the field of speech recognition, where timing and pronunciation vary. One approach is to use dynamic time warping [12], which stretches or compresses the time axis to match a sequence to a template under several constraints. Hidden Markov models (HMM) represent another prominent approach in the field of gesture recognition [13]. HMMs are probabilistic models of latent and observed variables with transition probabilities for the states in the model. For movement, the trajectory of the tracking device can be interpreted as the observable variables, while the latent variables represent the different actions under consideration.
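Dynamic time warping as referenced above [12] aligns two sequences of unequal length by locally stretching or compressing the time axis. A textbook O(nm) sketch in Python follows; the toy trajectories are invented.

import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic dynamic time warping between two sequences; the local cost
    is the Euclidean distance between samples."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Toy example: a template wrist trajectory and a time-stretched observation.
template = np.sin(np.linspace(0, np.pi, 20))
observed = np.sin(np.linspace(0, np.pi, 31))      # same gesture, performed slower
other = np.linspace(0.0, 1.0, 25)                 # a different movement
print(f"observed vs template: {dtw_distance(observed, template):.2f}")  # small
print(f"other vs template:    {dtw_distance(other, template):.2f}")     # larger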
3 Concept

The aim of this theoretical approach is to provide the means to process human behavior and the technical system state in real time and to automatically recognize a mismatch or error in the action of the human operator, so that automatic assistance can be provided depending on the type of mismatch and the knowledge of the involved human. The data flow can be summarized in four consecutive steps: (1) data gathering, (2) classification and simulation of the human action, (3) comparison of the classification with the simulated action, and (4) evaluation of the actual system state with respect to the comparison. This process is interrupted when the last step generates a system message indicating an error which needs to be handled by providing the human operator with feedback. The approach is illustrated in Figure 2. In the first step the data for further processing is gathered. The movement data of the human operator can be collected using motion or gesture tracking systems. The technical systems in industrial settings usually integrate several sensors to observe the system state; further sensors can be added for additional data. Initial data about the state space of a task can be collected in a setup step. The second step is the evaluation of the captured movement using machine learning. Actions like holding a part, moving a part from one location to another or unscrewing a screw are recognized from the recorded movement in real time. Concurrent actions need to be recognized, because the use of different tools at the same time is common in industrial settings. A scenario for concurrent actions is the removal of a part which involves an unscrew action and a holding action at the same time, because the part would otherwise fall down and get damaged. To recognize these concurrent actions, the body movement is segmented into different areas which are involved in different actions. Action recognition is done on each part
separately, and the performed actions are deduced from the results of these segments, making recognition of concurrent actions possible. The optimal behavior of the operator is simulated with cognitive user models. These models provide quantitative and theoretical data about ideal operators for specific tasks, such as execution times. Action sequences, times and further characteristics are extracted from the classification and the simulation for comparison to determine the validity of the performed action. In the described scenario the model would issue the actions of holding the part and unscrewing the screw; these commands also contain timing information for the execution of the task. In the third step, a real-time comparison of the characteristics is performed. The differences between the simulation and the recognized action can be computed and further processed by the evaluation step. For the scenario described above, if the human operator wants to unscrew the screw but does not hold the part, there is a mismatch between the holding action required by the cognitive model and the absence of this action in reality. A message about this mismatch will be sent to the following evaluation step.
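The comparison step can be illustrated as matching the actions required by the cognitive model against the actions recognized per body segment. The action names and the message fields below are hypothetical illustrations of the screw-holding scenario, not a defined interface.

# Sketch of the comparison step: actions predicted by the cognitive user model
# vs. actions recognized from motion data. Names and fields are hypothetical.
predicted = {"left_hand": "hold_part", "right_hand": "unscrew"}
recognized = {"left_hand": "idle", "right_hand": "unscrew"}

def compare(predicted: dict, recognized: dict) -> list:
    """Return one message per body segment whose recognized action deviates
    from the model's prediction."""
    messages = []
    for segment, required in predicted.items():
        performed = recognized.get(segment, "unknown")
        if performed != required:
            messages.append({
                "segment": segment,
                "required_action": required,
                "performed_action": performed,
                # holding a part is safety-critical in the described scenario
                "severity": "critical" if required == "hold_part" else "warning",
            })
    return messages

for msg in compare(predicted, recognized):
    print(msg)   # basis for haptic/visual feedback to the operator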
Fig. 2. Conceptual framework to automatically support human-computer interaction in IPS²
The last, evaluation step takes into account the differences computed in the previous step and the sensors of the technical system. Dangerous actions can be recognized in this step, e.g., moving a hand near a dangerous area in which the heat has not yet dissipated. On recognizing a valid action, the state of the model is adjusted to the taken action and the process continues. Otherwise appropriate system messages are generated, which serve as a basis for providing the human operator with contextual feedback (part of subproject B4). The message consists of the system state, the action performed by the human operator, a set of valid actions and an evaluation of the severity of the invalid action. In the described case the mismatch of not holding the part is critical, because unscrewing the screw leads to damage of the technical system, and the missing action, holding the part, is the required one.
4 Discussion

The human operator needs assistance to increase the robustness and effectiveness of IPS². This article described a theoretical approach to support human operators in such socio-technical systems. The automatic assessment of a working task can be used to generate contextual feedback for the human operator to prevent the execution of improper courses of action. Several challenges need to be tackled to realize the described approach. The first challenge is the real-time processing of the data. This can be solved by predicting courses of action in advance with cognitive user models and adjusting the models to a recognized action in case it was valid. The models must contain mechanisms for interruption, modification and resumption. Differences in action must be recognized as early as possible because of the real-time processing; thus low-level architectures should be used, because human behavior is simulated on an atomic level, which makes an earlier comparison of actions possible. Tools exist to map high-level models to low-level models [14], making it possible to design such models at a more abstract level and transform them to a level easier to evaluate. Another challenge is the action recognition using machine learning. The model can be trained offline on past data; applying the trained model to new data is fast and can be done in real time. An open problem is the type as well as the amount of movement information, e.g., trajectories, necessary to reach reliable recognition of different actions. The previously mentioned work in gesture recognition and other fields suggests that this problem can be solved by studying different practical settings. A first step in the realization of this approach is the simulation and recognition of a basic normalized industrial process involving human operators and technical systems. For this, data from the technical systems, movement data and additional sensory data are used for the evaluation of the situation. In a second step the data provided by additional sensors is reduced. In a further step individual cognitive user models for the human operators will be employed, providing the human operators with individualized contextual support. Beyond the contextual information, additional value is added by evaluating the rich set of sensory data collected for this approach. Data collected from different operators, tasks and systems can be evaluated to plan training workshops for skill enhancement for those tasks. Training videos can be created semi-automatically from the recorded movement data, and optimal process patterns derived from the recorded work of an expert, with additional information about critical situations added by a human. The additional information is necessary because dependencies and reasoning cannot be reconstructed from movement data alone. Detected common problems in operation can be perceived, and the process for a task or the technical system adjusted. The method of providing the human operator with appropriate feedback to avoid malpractice or error is not covered in this approach; this is a well-studied field in human-computer interaction, and subproject B4 of the SFB/TR29 is researching the optimal methods for providing this feedback.
5 Summary

IPS², like other socio-technical systems, are complex systems, and human factors are not considered appropriately when these systems are designed. The multitude of tasks a human operator needs to perform and the know-how necessary for these tasks increase rapidly, reducing the operator's practice in each task and thus the familiarity; this, in turn, has a negative impact on the performance of the involved humans. The described approach, which observes human operators, recognizes their behavior from collected data with the help of machine learning, simulates valid behavior with cognitive models, and identifies mismatches, can be used to provide the human operator with contextual feedback. This reduces the complexity of the technical system and helps the human operator to cope with the multitude of facets in these socio-technical systems.

Acknowledgments. We express our sincere thanks to the Deutsche Forschungsgemeinschaft (DFG) for funding this research within the Collaborative Research Project SFB/TR29 on "Industrial Product-Service Systems – dynamic interdependency of products and services in the production area". We thank Bo Höge for his support concerning this work.
References

1. Meier, H., Völker, O.: Industrial Product-Service-Systems – Typology of Service Supply Chain for IPS2 Providing. In: Mitsuishi, M., Ueda, K., Kimura, F. (eds.) Manufacturing Systems and Technologies for the New Frontier, pp. 485–488. Springer, London (2008)
2. Jones, B.D.: Bounded Rationality. Annu. Rev. Polit. Sci. 2, 297–321 (1999)
3. Human Factors and Ergonomics Society, http://www.hfes.org/web/abouthfes/about.html
4. Höge, B., Schlatow, S., Rötting, M.: A Shared-Vision System for User Support in the Field of Micromanufacturing. In: Proceedings of HCI International 2009 – Posters, pp. 855–859. Springer, Heidelberg (2009)
5. Schmuntzsch, U., Rötting, M.: Multimodal User Interfaces in IPS2. In: Proceedings of HCI International 2011. Springer, Heidelberg (in press, 2011)
6. Solso, R.L.: Cognitive Psychology, 6th edn. Allyn and Bacon, Needham Heights (2000)
7. John, B.E., Kieras, D.E.: The GOMS family of user interface analysis techniques: Comparison and contrast. ACM Transactions on Computer-Human Interaction 3(4), 320–351 (1996)
8. ACT-R Theory and Architecture of Cognition, http://act-r.psy.cmu.edu/
9. Dzaack, J., Kiefer, J., Urbas, L.: An approach towards multitasking in ACT-R/PM. In: Proceedings of the 12th Annual ACT-R Workshop, Italy (2005)
10. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
11. Junker, H., Amft, O., Lukowicz, P., Tröster, G.: Gesture spotting with body-worn inertial sensors to detect user activities. Pattern Recognition 41(6), 2010–2024 (2008)
12. Salvador, S., Chan, P.: Toward accurate dynamic time warping in linear time and space. Intelligent Data Analysis 11, 561–580 (2007)
13. Ying, Y., Randall, D.: Toward natural interaction in the real world: real-time gesture recognition. In: ICMI-MLMI 2010, pp. 15:1–15:8. ACM, New York (2010)
14. Amant, R.S., Freed, A.R., Ritter, F.E.: Specifying ACT-R models of user interaction with a GOMS language. Cognitive Systems Research 6(1), 71–88 (2005)
Study on Synthetic Evaluation of Human Performance in Manually Controlled Spacecraft Rendezvous and Docking Tasks

Ting Jiang, Chunhui Wang*, Zhiqiang Tian, Yongzhong Xu, and Zheng Wang

National Laboratory of Human Factors Engineering, Astronaut Research and Training Center of China, Beijing 100094, China
[email protected]
* Corresponding author.
Abstract. Manual-Control Rendezvous and Docking (MCRVD) is a complex and demanding task which requires astronauts to accurately control all six degrees of freedom of the spacecraft by hand. Its performance is closely related to the design of the spacecraft HCI, the control ability of the astronaut, and how well the two match. In this paper, human performance in these tasks was measured and a mathematical model to evaluate MCRVD performance quantitatively is proposed. First, 3500 trials were run on a ground-simulated RVD system to examine the characteristics and regularities of MCRVD performance indexes such as control deviation and fuel consumption. Twenty-five male volunteers aged 25-35 participated in the experiment. The analysis shows that the performance indexes of MCRVD exhibit stage-dependent characteristics: the MCRVD process can be divided into three stages, tracking control (beyond about 20 m), accurate control (within 20 m), and docking (0 m). The performance indexes of the tracking and accurate control stages show correlated characteristics, and the docking precision index reflects the difficulty of controlling the direction of the spacecraft. Accordingly, several statistical techniques, such as the factor analysis method and the entropy analysis method, are utilized to determine the weight coefficient of each performance index. We then present a novel multi-hierarchy integrated evaluation method with four hierarchies: the performance of the tracking control, the performance of the accurate control, the precision of docking, and the result of docking. Finally, we apply this method to analyze human performance in MCRVD, which verifies its validity. Keywords: Manual-control Rendezvous and Docking, Human Performance, evaluation method.
1 Introduction

The rendezvous and docking (RVD) process consists of orbital maneuvers and controlled trajectories, which successively bring the active vehicle (chaser) into the
vicinity of, and eventually into contact with, the passive vehicle (target). RVD technology and techniques are key elements in missions such as in-orbit assembly of large units, re-supply of orbital platforms and stations, exchange of crew, and lunar/planetary missions [1-6]. Manual control is an important technique in spacecraft rendezvous and docking; human performance is closely related to the design of the spacecraft HCI, the control ability of the astronaut, and how well the two match. This paper develops a human performance model to evaluate this degree of matching.
2 Manual-Control Rendezvous and Docking

As shown in Fig. 1, manual-control rendezvous and docking (MCRVD) is a manned system consisting of the human, displays, and controls. MCRVD starts from the final phase of rendezvous, at a range of about 150 meters. In this phase, the astronaut estimates the relative distance and attitude between the spacecraft based on the video of the target shown on the display and on parameter data from the measurement system. At the same time, the astronaut performs six-axis control of the spacecraft.
Fig. 1. The human-display-control system in manual-control rendezvous and docking (components: astronaut, translation and orientation controllers, video and graph displays, camera, sensor, spacecraft)
In the human-display-control system, task performance depends on the matching between the parts of the system. This matching is difficult to estimate with a few simple performance indexes, so a system performance model that evaluates composite performance from multiple indexes is needed.
3 Evaluation Index System and Screening Methods

The performance index system should be constructed to meet two principles. The first is comprehensiveness: the evaluation index system should reflect all aspects and properties of the evaluation object. The second is typicality: each index must have an actual physical meaning and reflect some aspect or feature of the evaluation object. For the
characteristics of MCRVD, this paper uses primary selection followed by screening to satisfy both principles.

3.1 Evaluation Index System

The purpose of primary selection is to choose as many indexes as possible to describe all aspects and features of the evaluation object, fully satisfying the principle of comprehensiveness. In this study, the Analytic Hierarchy Process (AHP) was used to establish the structure of the index system. Through analysis of the indexes' properties and physical meaning, the index system was divided into two levels and three categories: docking results, docking accuracy, and the indexes of the control process, as shown in Table 1.

Table 1. Evaluation index system
Level 1                        Level 2
Docking results                Success rate
Docking accuracy               Docking deviation; docking velocity
Indexes of control process     Control time; fuel consumption; control deviation; alignment hold rate; velocity
Docking deviation comprises the translation distance deviation and the roll, pitch, and yaw angle deviations. Docking velocity comprises the X velocity, the translation velocity, and the roll, pitch, and yaw angular velocities. Control time is the time from the beginning of manual control to contact between the two spacecraft. Fuel consumption is the total fuel used during the whole manually controlled rendezvous and docking. The control deviation indexes comprise the total, average, and standard deviation of the Y distance deviation, Z distance deviation, translation distance deviation, and roll, pitch, and yaw angle deviations. The velocity indexes comprise the maximum and average of the X, Y, and Z velocities and of the roll, pitch, and yaw angular velocities.

3.2 Screening Method

Each index in the performance index system should have a specific physical meaning and reflect differences in operating characteristics. In this study, statistical measures of variation (e.g., the mean and standard deviation) were used to quantify the degree of difference. The greater the variability, the higher the discriminative power of the index; conversely, a low value indicates weak discriminability. This method was used to screen the indexes of docking accuracy.
In the index system, especially within each level, the indexes contain largely overlapping information, so index screening is needed to keep the set as representative as possible. The correlation coefficient was used to screen the indexes of the rendezvous process. According to the data analysis, the control process was divided into two parts: the tracking control stage and the accurate control stage. The tracking control stage runs from the starting distance to about 20 m; owing to the large distance to the target, detailed features cannot be observed, and the astronaut mainly tracks the target by observing the shape of its features. The accurate control stage runs from 20 m to 0 m, in which the astronaut precisely controls the attitude and translation of the vehicle to meet the docking requirements. The screened indexes of both stages are shown in Table 2.
Table 2. The indexes of the control process: control time; fuel consumption; alignment hold rate; total roll angle deviation; total translation distance deviation; total attitude angle deviation; attitude alignment hold rate; maximum X velocity; average X velocity; average translation velocity (nine of these are used for the tracking control stage and eight for the accurate control stage)
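Both screening steps are mechanical once trial data are available: a variability measure flags indexes with weak discriminability, and pairwise correlation flags redundant ones. A sketch on simulated stand-in data follows; the thresholds are assumptions, not the values used in the study.

import numpy as np

rng = np.random.default_rng(0)
# Rows = trials, columns = candidate indexes (simulated stand-in data).
X = rng.normal(loc=[10.0, 5.0, 2.0], scale=[3.0, 0.05, 0.8], size=(100, 3))
X = np.column_stack([X, X[:, 0] * 1.05 + rng.normal(0.0, 0.1, 100)])  # near-duplicate
names = ["control_time", "hold_rate", "fuel", "total_deviation"]

# Step 1: discriminability via the coefficient of variation (threshold assumed).
cv = X.std(axis=0) / np.abs(X.mean(axis=0))
keep = cv > 0.05
print("coefficient of variation:", dict(zip(names, np.round(cv, 3))))

# Step 2: redundancy via pairwise correlation (|r| > 0.9 assumed).
r = np.corrcoef(X, rowvar=False)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if keep[i] and keep[j] and abs(r[i, j]) > 0.9:
            keep[j] = False          # drop the later index of a redundant pair
print("retained indexes:", [n for n, k in zip(names, keep) if k])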
4 Synthetic Evaluation Model

4.1 The First Step of the Performance Model

The first step of the performance model is

S = F1 × (w2·F2 + w3·F3)    (1)

where S is the overall performance value; F1, F2, and F3 are the values of the docking result, the docking accuracy, and the control-process indexes, respectively; and w2 and w3 are the weight coefficients of the docking accuracy and of the control-process indexes, determined by an expert assignment method.
4.2 The Second Step of the Performance Model

The second step of the performance model has two parts: a model for evaluating docking precision and a model for evaluating the rendezvous process. Docking precision covers offset precision and velocity precision, and its model is

F2 = Σ (i = 1 to 9) w2i·x2i    (2)

where F2 is the docking precision value, w2i is the weight coefficient of the i-th index, and x2i is the value of the i-th index. The weight coefficients were obtained by the entropy analysis method. The rendezvous process is divided into the tracking control stage and the accurate control stage, and its model is

F3 = wA·FA + wB·FB    (3)

where F3 is the value for the whole rendezvous process, FA and FB are the values of the tracking control stage and the accurate control stage, and wA and wB are the corresponding weight coefficients, determined by an expert assignment method. The values of the two stages were obtained by the factor analysis method.
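The entropy analysis method behind the w2i in Eq. (2) can be sketched as follows: indexes whose values vary more across trials carry more information and receive larger weights. The data matrix below is simulated; in the study the weights come from the experiment data.

import numpy as np

def entropy_weights(X):
    """Entropy weight method for a (trials x indexes) matrix of positive,
    benefit-oriented index values."""
    P = X / X.sum(axis=0, keepdims=True)      # column-wise proportions
    n = X.shape[0]
    logs = np.where(P > 0, np.log(P), 0.0)    # treat 0*log(0) as 0
    e = -(P * logs).sum(axis=0) / np.log(n)   # entropy per index, in [0, 1]
    d = 1.0 - e                               # degree of diversification
    return d / d.sum()

rng = np.random.default_rng(1)
X = np.abs(rng.normal([1.0, 1.0, 1.0], [0.02, 0.2, 0.5], size=(50, 3)))
w = entropy_weights(X)
print("weights:", np.round(w, 3))             # most-varying index weighs most
F2 = X @ w                                    # Eq. (2) applied to each trial
print("F2 of first trial:", round(float(F2[0]), 3))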
5 Validation Experiment

5.1 MCRVD Simulator

To examine the characteristics and regularities of the MCRVD performance indexes, a ground RVD simulator was developed. The simulator reproduces the MCRVD process over the last 150 m and provides a human-machine interface comprising a spacecraft cabin, seats, controllers, a video display interface, and a control dynamics module [7]. To meet the experiment requirements, the design parameters of the simulator can be changed, e.g., the installation position of the seats and the contents, layout, target sign, and reference ruler of the display.

5.2 Experiment Design

The experiment comprised three schemes of different task difficulty and aimed at comparing their performance. The scheme parameters, the numerical value display type and the assistant task, are given in Table 3.
Table 3. Experiment scheme parameters

Scheme name   Numerical value display type   Assistant task
Scheme A      Numerical value                -
Scheme B      Non-numerical value            -
Scheme C      Non-numerical value            Distinguishing sound stimulation
The operators in the experiment were technicians from the Astronaut Research and Training Center of China. They were 25 males ranging from 22 to 40 years old; all had at least a bachelor's degree and work experience related to manned spaceflight. Each operator performed the task three times and, after the experiment, filled in a subjective post-experiment questionnaire.

5.3 Data Analysis

Multiple comparisons of the performance value between the schemes were conducted using a one-way ANOVA. The results are shown in Table 4.

Table 4. Multiple comparisons in the one-way ANOVA
Scheme I    Scheme J    Sig.
Scheme A    Scheme B    0.211
Scheme A    Scheme C    0.001
Scheme B    Scheme C    0.001
The analysis showed that the performance value differed significantly between scheme A and scheme C (p < 0.05) and between scheme B and scheme C (p < 0.05), consistent with the results of the subjective post-experiment questionnaire. This supports the validity of the evaluation model.
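The Table 4 comparisons can be reproduced with standard tools. A sketch on simulated composite scores follows; the scores and group means are invented, and a real analysis would also apply a multiple-comparison correction (e.g., Bonferroni or Tukey's HSD), since it is not stated which post-hoc procedure produced Table 4.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Simulated composite performance values S for the three schemes (invented).
scores = {
    "A": rng.normal(0.80, 0.05, 25),
    "B": rng.normal(0.78, 0.05, 25),
    "C": rng.normal(0.70, 0.05, 25),
}

f, p = stats.f_oneway(scores["A"], scores["B"], scores["C"])
print(f"one-way ANOVA: F = {f:.2f}, p = {p:.4f}")

# Pairwise comparisons, as laid out in Table 4.
for i, j in [("A", "B"), ("A", "C"), ("B", "C")]:
    t, p_ij = stats.ttest_ind(scores[i], scores[j])
    print(f"scheme {i} vs scheme {j}: p = {p_ij:.3f}")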
6 Conclusions

The performance indexes of MCRVD exhibit stage-dependent characteristics. The MCRVD process can be divided into three stages: tracking control (beyond about 20 m), accurate control (within 20 m), and docking (0 m). The performance indexes of the tracking and accurate control stages show correlated characteristics, and the docking precision index reflects the difficulty of controlling the direction of the spacecraft. For this reason, several statistical techniques, such as the factor analysis method and the entropy analysis method, were utilized to determine the weight coefficient of each performance index. We then presented a novel multi-hierarchy integrated evaluation method comprising four hierarchies: the performance of the tracking
control, the performance of the accurate control, the precision of docking, and the result of docking. Finally, we applied this method to analyze human performance in MCRVD, which verified the validity of the method.

Acknowledgements. This study is supported by the National Basic Research Program of China (973) (No. 2011CB711000).
References

1. Zhang, Y.J., Xu, Y.Z., Li, Z.Z., Li, J., Wu, S.: Influence of Monitoring Method and Control Complexity on Operator Performance in Manually Controlled Spacecraft Rendezvous and Docking, vol. 5(12), pp. 619–624 (2008)
2. Carolina, L.L., Edward, V.B.: A Survey of Rendezvous and Docking Issues and Development. AAS-89-158 (1989)
3. Collins, D.J.: Flight Operations for Shuttle Rendezvous Navigation and Targeting. Proceedings IEEE (1984)
4. Fehse, W.: Rendezvous and Docking Technology Development for Future European Missions, vol. 9(1). ESA (1985)
5. Clandinon, B.: Detailed System Studies on RVD. N84-33428 (1984)
6. Eggleston, J.M.: A Study of the Optimum Velocity Change to Intercept and Rendezvous. NASA-TND-1029 (1962)
7. Nunamaker, R.R., Rowell, L.F.: Langley Research Center Resources and Need for Manned Space Operations Simulation. NASA Langley Research Center, Paper (1987)
Dynamic Power Tool Operation Model: Experienced Users vs. Novice Users

Jia-Hua Lin, Raymond W. McGorry, and Chien-Chi Chang

Liberty Mutual Research Institute for Safety, Hopkinton, MA, USA
[email protected]
Abstract. A previous study demonstrated that a single-degree-of-freedom mechanical model can represent a human power tool operator subjected to impulsive torque reactions. Using only novice tool users, it was shown that the mechanical capability to respond to tool torque reaction depends on workstation location and orientation, and varies among users (Lin, Radwin, & Richard, 2000). It was hypothesized that the mechanical model elements of experienced tool operators may differ from those of novice users. A laboratory study was carried out to measure the equivalent mechanical parameters of novice and experienced tool users. The results demonstrate differences between the two groups, which may represent strategies developed by experienced users to minimize the impact of impulsive torque reactions.
1 Introduction

Human limbs have been modeled as single-degree-of-freedom mechanical systems to understand their dynamic characteristics (Aruin & Zatsiorsky, 1984; Hunter & Kearney, 1982; Lakie, Walsh, & Wright, 1980; Reynolds & Soedel, 1972; Zawadzki & Kornecki, 1988). Tool operators have also been modeled as equivalent mechanical systems (Hansson & Kihlberg, 1983; Lin, Radwin, & Richard, 2001; Lindqvist, 1993) to understand the effect of powered hand tool torque reaction on the hand-arm system. The operator can be represented using a single-degree-of-freedom system consisting of a spring, an inertial mass, and a damping element. The mechanical elements can be identified by observing the response when the hand-arm is subjected to a viscoelastic loading (Lacquaniti, Licata, & Soechting, 1982; Lin et al., 2001). The mechanical elements were found to be affected by work locations (Lin et al., 2001) and postures (Hogan, 1990). Experience was identified as a factor influencing hand tool performance because of the different techniques used by experienced and novice users (Rockwell & Marras, 1986). Kearney and Hunter (1990) pointed out that the substantial inter-subject variability in joint dynamics suggests that data measured for one population may have limited application to another. Therefore, this project aimed to identify whether operator
experience affected the hand-arm equivalent mechanical system responding to the impulsive torque reactions encountered in common power tool operations.
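For concreteness, the single-degree-of-freedom representation discussed above can be written in rotational form about the handle axis. The notation below (I, c, k for the equivalent inertia, damping, and stiffness) is our illustrative choice and not necessarily the symbols used in the cited studies:

```latex
% Equivalent SDOF hand-arm model subjected to a tool torque reaction T(t):
%   I - equivalent inertial mass, c - damping element, k - spring (stiffness)
\[
  I\,\ddot{\theta}(t) + c\,\dot{\theta}(t) + k\,\theta(t) = T(t),
  \qquad
  \omega_n = \sqrt{k/I}, \qquad \zeta = \frac{c}{2\sqrt{kI}} .
\]
```

The natural frequency ω_n and damping ratio ζ of this system determine the decaying oscillation observed with the test battery described in the next section.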
2 Methods
To identify the hand-arm equivalent mechanical parameters, an oscillating elastic loading test battery was constructed. The principle of the battery was that when the hand gripped the attached handle, the mechanical system of the operator (consisting of equivalent mass, spring, and damping elements) was added to that of the battery. This resulted in a new mechanical system with different impulse response characteristics. By measuring these response characteristics, the operator system could be identified. The battery consisted of a mass coupled with a spring. Through the center of the mass ran a spindle, whose end was connected to a handle for the operator to grasp. The handle, which simulates pistol grip or right-angle handle shapes, could measure hand force and spindle torque. A solenoid switch, mounted on the battery and controlled by the trigger installed at the handle, released the mass and started the oscillation for a trial. Thirty right-handed male participants, free of musculoskeletal disorders, were recruited and gave informed consent to participate in this study, which was approved by our institutional review board. Participants were considered experienced tool users if they had at least one full year in a previous or current job in which the use of power hand tools such as screwdrivers, nutrunners, or wrenches was a necessary part of the work. There were 27 trials (3 oscillating frequencies × 3 horizontal distances × 3 vertical distances) for each participant. For each trial, the participant was asked to grip the handle on the test battery in the same manner as holding a corresponding tool. When ready, the participant pressed the trigger and attempted to stabilize the handle around its neutral position. Each oscillation lasted less than 3 seconds. The mechanical elements measured with the pistol grip handle on the vertical surface are reported in this paper.
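The parameter identification implied by this procedure can be illustrated with a logarithmic-decrement analysis of the recorded oscillation. The sketch below is our illustration of one standard way such parameters can be separated, not the authors' analysis code; the signal names, the linear-fit separation across spring settings, and negligible battery damping are assumptions:

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import linregress

def damped_oscillation_params(theta, fs):
    """Natural frequency (rad/s) and damping ratio from one decaying trace."""
    peaks, _ = find_peaks(theta)
    Td = np.mean(np.diff(peaks)) / fs                      # damped period
    delta = np.mean(np.log(theta[peaks][:-1] / theta[peaks][1:]))
    zeta = delta / np.sqrt(4.0 * np.pi**2 + delta**2)      # log decrement
    wn = (2.0 * np.pi / Td) / np.sqrt(1.0 - zeta**2)
    return wn, zeta

def identify_operator(trials, m_batt):
    """Separate operator stiffness/mass from combined-system frequencies.

    trials : list of (k_batt, theta, fs) for several battery spring settings.
    With several springs, stiffness and mass become separable, since
        k_batt_i = m_total * wn_i**2 - k_hand
    so a linear fit of k_batt against wn**2 yields m_total (slope) and
    -k_hand (intercept).  Per trial, c_total = 2 * zeta * wn * m_total.
    """
    wn2 = [damped_oscillation_params(th, fs)[0] ** 2 for _, th, fs in trials]
    k_batt = [k for k, _, _ in trials]
    fit = linregress(wn2, k_batt)
    m_total, k_hand = fit.slope, -fit.intercept
    return k_hand, m_total - m_batt   # operator stiffness and inertial mass
```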
3 Results and Discussion
The results demonstrate that operator hand-arm stiffness was significantly affected by vertical (p < 0.001) and horizontal (p = 0.001) locations. The trends in Figures 1 and 2 show that hand-arm stiffness decreased as the horizontal distance between the handle and the operator increased and as the handle height was lowered. This is consistent with previous findings (Lin et al., 2003). Experienced tool users exerted less stiffness against the tool handle than novice users, although the difference was not statistically significant (Figure 1).
(Figure: hand-arm stiffness in Nm/rad plotted against horizontal location at 25%, 50%, and 75% of reach, with separate curves for novice and experienced users.)
Fig. 1. Average operator hand-arm stiffness by experience and horizontal location
The equivalent inertial mass the operator posed to the tool handle was significantly affected by experience (p = 0.008) and by horizontal (p < 0.001) and vertical (p = 0.001) locations. The means of the operator inertial mass for each study level are listed in Table 1. The damping element was not affected by any study factor, although experienced operators had greater coefficients (1.9 N·sec/m) than novice operators (1.2 N·sec/m). Stiffness and inertial mass are the two elements with the greatest influence in the model that describes and predicts operator hand-arm dynamics (Lin et al., 2003). Based on the single-degree-of-freedom model, the greater the stiffness and the inertial mass, the smaller the hand displacement. In the current findings, experienced operators produced a greater inertial mass but less stiffness while counteracting impulsive disturbances similar to those encountered in power tool use. Lin et al. (2007) observed that, when exposed to different tasks that resulted in different levels of torque exposure, experienced tool users allowed consistent handle displacement, while novice users were highly affected by the torque exposure level. Novice users allowed a significantly greater handle motion for a soft joint than for a hard joint. Further, in both exposures, experienced users allowed less displacement. The current findings support the idea that experienced users may alter their equivalent mechanical properties to accommodate various torque exposures and to minimize discomfort or physical disturbance.
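The claim that greater stiffness and inertial mass lead to less hand displacement follows directly from the single-degree-of-freedom model. As a hedged illustration (our derivation in the notation introduced above, not a formula from the cited papers), an impulsive torque of angular impulse J applied to a lightly damped system gives

```latex
% Peak deflection of a lightly damped SDOF system after an angular impulse J:
\[
  \theta_{\max} \,\approx\, \frac{J}{I\,\omega_n} \,=\, \frac{J}{\sqrt{k\,I}} ,
\]
```

so the peak displacement shrinks as either the equivalent stiffness k or the equivalent inertia I grows, consistent with the strategies attributed to experienced users.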
Fig. 2. Average operator hand-arm stiffness by vertical location

Table 1. Average operator hand-arm inertial mass by experience, vertical, and horizontal location

Factor                         Level           Inertial mass (Nm·s²)
Experience                     Novice          0.019
                               Experienced     0.031
Vertical location (cm)         Shoulder - 30   0.021
                               Shoulder        0.020
                               Shoulder + 30   0.034
Horizontal location (% reach)  25              0.019
                               50              0.019
                               75              0.036
References
Aruin, A.S., Zatsiorsky, V.M.: Biomechanical characteristics of the human ankle joint muscles. European Journal of Applied Physiology and Occupational Physiology 52, 400–406 (1984)
Hansson, J.-E., Kihlberg, S.: A test rig for the measurement of vibration in hand-held power tools. Applied Ergonomics 14(1), 11–18 (1983)
Hogan, N.: Mechanical impedance of single- and multi-articular systems. In: Winters, J.M., Woo, S.L.-Y. (eds.) Multiple Muscle Systems, pp. 149–164. Springer, New York (1990)
Hunter, I.W., Kearney, R.E.: Dynamics of human ankle stiffness: variation with mean ankle torque. Journal of Biomechanics 15(10), 747–752 (1982)
Kearney, R.E., Hunter, I.W.: System identification of human joint dynamics. Critical Reviews in Biomedical Engineering 18, 55–87 (1990)
Lacquaniti, F., Licata, F., Soechting, J.F.: The mechanical behaviour of the human forearm in response to transient perturbations. Biological Cybernetics 44, 35–46 (1982)
Lin, J.-H., McGorry, R.W., Chang, C.-C., Dempsey, P.G.: Effects of user experience, working posture and joint hardness on powered nutrunner torque reactions. Ergonomics 50, 859–876 (2007)
Lin, J.-H., Radwin, R.G., Richard, T.G.: Development and validation of a dynamic biomechanical model for power hand tool torque build-up reaction force. In: Proceedings of the Human Factors and Ergonomics Society 44th Annual Meeting, pp. 295–312. HFES, Santa Monica (2000)
Lin, J.-H., Radwin, R.G., Richard, T.G.: Dynamic biomechanical model of the hand and arm in pistol grip power handtool usage. Ergonomics 44(3), 295–312 (2001)
Lin, J.-H., Radwin, R.G., Richard, T.G.: A single degree-of-freedom dynamic model predicts the range of human responses to impulsive forces produced by power hand tools. Journal of Biomechanics 36, 1845–1852 (2003)
Lindqvist, B.: Torque reaction in angled nutrunners. Applied Ergonomics 24(3), 174–180 (1993)
Reynolds, D.D., Soedel, W.: Dynamic response of the hand-arm system to a sinusoidal input. Journal of Sound and Vibration 21(3), 339–353 (1972)
Rockwell, T.H., Marras, W.S.: An evaluation of tool design and method of use of railroad leverage tools on back stress and tool performance. Human Factors 28(3), 303–315 (1986)
Zawadzki, J., Kornecki, S.: Influence of joint angle and static tension of muscles on dynamic parameters of the elbow joint. In: de Groot, G., Hollander, A.P., Huijing, P.A., van Ingen Schenau, G.J. (eds.) International Series on Biomechanics: Biomechanics XI-A, pp. 94–100. Free University Press, Amsterdam (1988)
An Empirical Study of Disassembling Using an Augmented Vision System Barbara Odenthal, Marcel Ph. Mayer, Wolfgang Kabuß, Bernhard Kausch, and Christopher M. Schlick Institute of Industrial Engineering and Ergonomics at RWTH Aachen University, Bergdriesch 27, 52062 Aachen, Germany {b.odenthal,m.mayer,w.kabuss, b.kausch,c.schlick}@iaw.rwth-aachen.de
Abstract. Within the Cluster of Excellence "Integrative Production Technology for High-Wage Countries" of RWTH Aachen University, a numerical control unit and its ergonomic human-machine interface are being developed for a robotized production unit. In order to cope with such novel systems, the human operator will have to meet new challenges regarding the work requirements. Therefore, based on a first prototype of an augmented vision system (AVS) that assists the human operator in error detection, the system is enhanced to deal with the task of error correction. Laboratory tests were performed to find a preferable solution for supporting the human operator in disassembling the erroneous object in order to correct the detected error in cooperation with the robot. Significant effects of the supporting mode on human disassembly time are presented. Keywords: Augmented Reality, Disassembly, Cognitive Automation.
1 Introduction
The ability of self-organization or self-optimization within a production system, meaning that the system is able to decide which part of a given task can best be resolved by a certain resource, can be regarded as intelligent behavior. To include such abilities in production systems, new models capable of describing this behavior are required. A feasible approach is the integration of cognitive abilities, based on so-called cognitive architectures, into technical systems. These cognitive architectures rest on an information-technological approach to describing and simulating human cognition. As a result, the cognitive technical system should be able to plan and (if necessary) optimize, e.g., a production process. Furthermore, it shall implement the process on the machine autonomously and under variable conditions, based on insecure and/or incomplete information. This leads to the research question: How must the human-machine interface (HMI) be designed to provide the human operator with all the information necessary to continually monitor the cognitive system, given the inherent "intelligence" of the system?
Therefore, within the Cluster of Excellence "Integrative Production Technology for High-Wage Countries" of RWTH Aachen University, a Cognitive Control Unit (CCU) and its ergonomic user-centered HMI are being developed for a robotized production unit ([1],[2],[3]). By means of the CCU, non-value-adding planning and implementation tasks of the skilled worker should be transferred partially to the technical cognitive system (CCU), so that the production unit is able to adapt to new conditions, e.g., assembly modifications. To keep system behavior in conformity with operator expectations, and therefore to avoid the risk of an operator-out-of-the-loop scenario, the importance of an ergonomic user-centered HMI increases. The HMI shall support the human operator when intervention is required due to system failure, as well as increase the safety of work and process. A possible working procedure of a CCU-controlled production unit is described in the following: First, a representation of a target (e.g., a 3D assembly object) is handed over to the CCU. This initial step is initiated and performed by the human operator. As a second step, the CCU plans and performs the assembly autonomously. During this performance, an error occurs that the CCU is only qualitatively able to detect, meaning it is not possible for the CCU to identify the error; it can only pass on the information to the skilled worker that an error has occurred. In the case of an error (e.g., an error in the plan of the assembly process, an error in the implementation of the control program, or an unforeseen failure of a machine of the production plant), and based on the aforementioned assumption that the CCU cannot identify it, the error must be identified and corrected by the skilled worker. Under this assumption, a first prototype of a supporting system was developed for the task of error identification in an assembly object (incorrect construction of the assembly object). More precisely, a prototype of an augmented vision system (AVS) was developed and implemented with a focus on the presentation of assembly information. The aim of this system was to put the human operator or skilled worker in a position to detect construction errors quickly and adequately [4]. Based upon this first prototype, the synthetic vision system was extended to assist the human operator in disassembling the erroneous object in order to correct the detected error in cooperation with the robot. For this reason, laboratory tests were performed to find a solution, preferable from an ergonomic point of view, for displaying the information needed to correct the mistake by disassembling the assembly object. Augmented reality technology in the form of a head-mounted display (HMD) is used in the laboratory tests, so that additional synthetic assembly or unit information can be displayed in the natural field of view of the human operator.
2 Method
Experimental Design. The experimental design distinguishes two factors: 1) the so-called supporting mode with four levels: no support, Human, Robot, and Robot & Human (see Fig. 1); 2) the group with two levels: engineers and technicians. Supporting Mode (SM), see Fig. 1:
• No Support: No LEGO objects are highlighted within this mode.
• Human: Within this mode, those LEGO objects are highlighted that are removable by a human operator at this particular moment. That means that the whole top surface and at a minimum one of the lateral surfaces of the LEGO object are completely exposed.
• Robot: Within this mode, those LEGO objects are highlighted that are removable by a robot at this particular moment. That means that the whole top surface and two opposing lateral surfaces of the LEGO object are completely exposed. Therefore, all objects removable by the robot can also be disassembled by the human operator, but not vice versa.
• Robot & Human: Within this mode, the LEGO objects that are removable by the robot are highlighted. Additionally, the objects that are removable solely by the human operator at this particular moment are highlighted in a different color. (These removability rules are illustrated in the code sketch following Fig. 1.)
Fig. 1. Factor levels of the factor “Supporting Mode” in the laboratory study
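The removability rules above lend themselves to a simple geometric test on a voxel grid of cubic bricks. The sketch below is a hypothetical illustration of those rules; the grid representation, function names, and the greedy loop standing in for the CCU function are our assumptions, not the authors' implementation:

```python
import numpy as np

def exposed_sides(occ, x, y, z):
    """Which of the four lateral sides of brick (x, y, z) are free."""
    nx, ny, _ = occ.shape
    return [not (0 <= x + dx < nx and 0 <= y + dy < ny and occ[x + dx, y + dy, z])
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]

def top_free(occ, x, y, z):
    return z + 1 >= occ.shape[2] or not occ[x, y, z + 1]

def human_removable(occ, x, y, z):
    # Human: whole top surface and at least one lateral surface exposed.
    return occ[x, y, z] and top_free(occ, x, y, z) and any(exposed_sides(occ, x, y, z))

def robot_removable(occ, x, y, z):
    # Robot: top surface and two OPPOSING lateral surfaces exposed.
    s = exposed_sides(occ, x, y, z)
    return occ[x, y, z] and top_free(occ, x, y, z) and ((s[0] and s[1]) or (s[2] and s[3]))

def ccu_function(occ):
    """Greedy robot phase: remove bricks until none is robot-removable."""
    removed = []
    while True:
        candidates = [(x, y, z) for x, y, z in zip(*np.nonzero(occ))
                      if robot_removable(occ, x, y, z)]
        if not candidates:
            return removed
        # Take the highest brick first so lower ones become exposed.
        x, y, z = max(candidates, key=lambda p: p[2])
        occ[x, y, z] = False
        removed.append((x, y, z))
```

A 27-brick model stored as a boolean array can then be re-checked each cycle to highlight bricks according to the active supporting mode.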
The two subject groups differ in professional education. The first group, "Engineers", consists of male (future) engineers aged 20 to 40 years who are either at university or have finished their studies. The second group, "Technicians", consists of male subjects aged 20 to 40 years who completed a technical apprenticeship (e.g., in the field of assembly or as an (industrial) mechanic). Task. During the experimental phase, the human operator has to virtually disassemble eight models in cooperation with a virtually presented robot by means of the AVS (four supporting modes with two models each). On the one hand, because of the trend toward an increasing level of automation in the production industry and the associated role of human operators, human interaction in the running process must be minimized and focused on situations in which human intervention is crucial and essential. On the other hand, changes between the robot and the human operator are time-consuming operations. Hence, the following restrictions were set during the laboratory test: 1) The human operator has to disassemble as few bricks as possible; 2) The assembly object should be disassembled with a minimum of human-robot changes. Regarding the cooperative disassembly, there are two possibilities to remove bricks: 1) The human operator marks a brick by means of an input device (see apparatus) and removes it by a voice command, or 2) the human operator decides that the robot has to remove bricks (the so-called CCU function). The CCU function is called
by a voice command. The CCU-controlled robot removes as many bricks as possible until no more bricks are removable by the robot (see the definition of removable bricks). Eight distinct assembly objects were developed according to the following criteria: All models consist of cubic bricks. The total number of bricks is identical (27 pieces). A minimum of three bricks must be removed by the human operator. There are two categories of models (four models each): in the ideal case, the robot has to remove the first brick (1), or the human operator has to remove the first brick (2). Apparatus. The position and orientation of the HMD are tracked by the infrared real-time tracking system smARTtrack of A.R.T. GmbH. A binocular optical see-through HMD (nVisorST by NVIS Inc.) is used to display the synthetic assembly information. A self-developed marker pen was used by the subjects to virtually mark the bricks in order to disassemble the assembly object. The pen is located at a predefined position on the table. The manipulation of the virtual LEGO models is performed by voice control (voice recognition by VoCon 2.1 by Philips). The robot cell is virtually presented on an off-the-shelf TFT flat screen (SyncMaster 2243BW by Samsung). The apparatus is shown in Fig. 2.
Fig. 2. Main components of the AVS (right); view through the HMD (left)
Procedure. The procedure was divided into two phases: • Pretests and training under experimental conditions: First, personal data were collected, e.g., age, profession, experience with computers, experience with virtual and augmented reality systems, as well as assembly skills concerning LEGO models. Second, a visual acuity test, a stereo vision test, and a color test according to Ishihara were administered by means of a vision test device (Rodatest 302 by Vistec Vision Technologies). Third, the "dice test" of the IST 2000-R (module: figural intelligence) [5] was carried out in order to quantify spatial visualization ability. Because the system is controlled
by voice, a calibration of the voice recognition system had to be performed. After completing the pretests, the subjects had some minutes to get familiar with the system. • Data acquisition: Each subject started by filling in a questionnaire about visual fatigue. Each subject conducted eight trials, because each supporting mode was tested with two different assembly objects. Each subject started with the mode "no support", as a baseline against which to assess the other supporting modes. Each subject conducted the following trial four times: The disassembly time started with the beginning of the system variant. The task was to completely disassemble a virtual assembly object in cooperation with the robot (see task). The presentation of the assembly object on the HMD appeared at the middle of the work table. With the infrared optical tracking system, the position and orientation of the HMD were measured in real time and the user's viewing direction was estimated. When the CCU function within the disassembly was called by the human operator, the presentation of the virtual model disappeared on the HMD, and the simulation and animation of the robot cell started on the table-mounted display. After the robot had removed all possible bricks of the model, the presentation of the (partly) disassembled object appeared again on the HMD. When the CCU function was activated, the disassembly by the robot could not be interrupted by the human operator. After the complete disassembly of the model, the next model followed. Finally, in the post-test (after the fourth trial), the participant took off the HMD and filled in the questionnaire about visual fatigue. The total time-on-task was approximately two hours for each participant. The subjects were balanced across the experimental conditions. Subjects. A total of 32 subjects participated in the laboratory study (16 in each group). All of them satisfied the requirements concerning stereo vision, color vision, and emmetropia. The subjects were divided into two groups: Group 1 "male (future) engineers": The age of the subjects was between 21 and 35 years (mean 26.4 years, SD 4.7). 37.5% of the subjects had no experience with HMDs. 50% of the subjects had prior experience with 3D computer games. The average weekly playing time was approx. 2.5 hours. The average experience in assembling LEGO models was 3.6, and in assembling in general 2.5, on a scale ranging from 0 (low) to 5 (high). One person reported problems in the past after working with electronic information displays (back pain). On average, the subjects allocated 17 of 20 dice correctly. Group 2 "male technicians": The age of the 16 subjects was between 23 and 37 years (mean 27.9 years, SD 3.7). 81.3% of the subjects had no experience with HMDs. 62.5% of the subjects had prior experience with 3D computer games. The average weekly playing time was approx. 4.1 hours. The average experience in assembling LEGO models was 4.1, and in assembling in general 3.7, on a scale ranging from 0 (low) to 5 (high). One person reported problems in the past after working with electronic information displays (burning eyes). On average, the subjects allocated 14 of 20 dice correctly.
Dependent variables. The following dependent variables were considered:
• Normalized Human Disassembly Time (HDTnorm): The human disassembly time (HDT) represents the time elapsed from the start of the trial until the virtual object is completely disassembled, reduced by the removal time of the robot (RDT). The normalized human disassembly time HDTnorm is the human disassembly time divided by the minimal number of human cycles N_HCopt:

\[ HDT_{norm} = \frac{HDT}{N_{HCopt}} \]
• Distance to optimal Performance (DoP): The Distance to optimal Performance (DoP) is a measure of the deviation from optimal performance. It is a function of the ratio r_BH, defined as the number of bricks disassembled by the human operator, N_BH, divided by the minimal number of bricks, N_BHmin, and of the normalized human cycle count HC_norm, defined as the number of human cycles N_HC during the experiment divided by the minimal number of cycles N_HCopt. Because of their linear independence, DoP is defined as

\[ DoP = \sqrt{(r_{BH}-1)^2 + (HC_{norm}-1)^2}, \quad \text{with} \quad r_{BH} = \frac{N_{BH}}{N_{BHmin}} = \frac{N_{BH}}{3} \quad \text{and} \quad HC_{norm} = \frac{N_{HC}}{N_{HCopt}} . \]
With the optimal value of the ratio r_BH (= 1) and the optimal value of HC_norm (= 1), the optimal value of DoP is 0. Hypotheses. The following null hypotheses were formulated: • The supporting mode (H01) and the group (H02) do not significantly influence the normalized human disassembly time (HDTnorm). • The supporting mode (H03) and the group (H04) do not significantly influence the Distance to optimal Performance (DoP). Two analyses were carried out on the basis of the data acquired. First, the samples from both groups (group 1, male (future) engineers; group 2, male technicians) were compared with regard to the dependent variables. To do so, a two-way analysis of variance (ANOVA) with repeated measures was calculated to test hypotheses H01-H04 (one within-subjects factor (SM) and one between-subjects factor (group)). Second, the groups were analyzed separately in order to identify differences between the groups with regard to the different supporting modes. To do so, a one-way analysis of variance (ANOVA) with repeated measures was calculated to test hypotheses H01 and H03.
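To make the dependent measures and the within-subject analysis concrete, the following sketch computes HDTnorm and DoP per trial and runs a one-way repeated-measures ANOVA over the supporting modes. It is a hedged illustration only: the column names, the placeholder values, and the use of statsmodels' AnovaRM are our assumptions, not the authors' analysis pipeline.

```python
import math
import pandas as pd
from statsmodels.stats.anova import AnovaRM

N_BH_MIN = 3  # at least three bricks must be removed by the human

def hdt_norm(hdt_seconds, n_hc_opt):
    """Normalized human disassembly time: HDT / N_HCopt."""
    return hdt_seconds / n_hc_opt

def dop(n_bh, n_hc, n_hc_opt):
    """Distance to optimal performance; 0 is optimal."""
    r_bh = n_bh / N_BH_MIN
    hc_norm = n_hc / n_hc_opt
    return math.hypot(r_bh - 1.0, hc_norm - 1.0)

print(dop(n_bh=4, n_hc=3, n_hc_opt=2))  # example: 4 bricks, 3 of 2 cycles

# One row per subject x supporting mode (placeholder values, averaged
# over the two models per mode).
trials = pd.DataFrame({
    "subject":  [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "mode":     ["none", "human", "robot", "both"] * 3,
    "hdt_norm": [41.2, 39.8, 37.5, 48.0, 44.1, 43.9, 40.2, 55.3,
                 39.0, 38.2, 36.1, 47.5],
})

# One-way repeated-measures ANOVA (within-subject factor: supporting mode).
res = AnovaRM(trials, depvar="hdt_norm", subject="subject", within=["mode"]).fit()
print(res)
```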
3 Results
Normalized Human Disassembly Time (HDTnorm). The means and the 95% confidence intervals of the normalized human disassembly time under the different experimental conditions are shown in Fig. 3.
Fig. 3. Normalized Human Disassembly Time [s] under the experimental conditions
All Subjects: The supporting mode (SM) significantly influenced the normalized human disassembly time (F(3,90) = 2.728, p = 0.048), so that null hypothesis H01 was rejected. The between-subjects factor "Group" showed no significant influence on the normalized human disassembly time (F(1,30) = 3.084, p = 0.089), so that null hypothesis H02 was not rejected. Statistically significant interactions were not found (SM*Group: F(3,90) = 1.076; p = 0.363). When using the supporting mode "Human", the average HDTnorm was only 2.48% shorter compared with the HDTnorm of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average HDTnorm of the robot condition was 7.95% shorter. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average HDTnorm that was 16.39% longer than under the "no support" condition. Group 1 – male (future) engineers: The supporting mode (SM) did not significantly influence the normalized human disassembly time (F(3,45) = 0.408, p = 0.748), so that null hypothesis H01.1 was not rejected. When using the supporting mode "Human", the average HDTnorm was 7.6% shorter compared with the HDTnorm of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average HDTnorm of the robot condition was 9.7% shorter. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average HDTnorm that was only 2.0% longer than under the "no support" condition. Group 2 – male technicians: The supporting mode (SM) significantly influenced the normalized human disassembly time (F(3,45) = 3.094, p = 0.036), so that null hypothesis H01.2 was rejected. When using the supporting mode "Human", the average HDTnorm was 2.1% longer compared with the HDTnorm of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average
HDTnorm of the robot condition was 6.4% shorter. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average HDTnorm that was 29.3% longer than under the "no support" condition. Distance to optimal Performance (DoP). The means and the 95% confidence intervals of the distance to optimal performance under the different experimental conditions are shown in Fig. 4.
Fig. 4. Distance to optimal Performance under the experimental conditions
All Subjects: The supporting mode (SM) did not significantly influence the Distance to optimal Performance (F(3,90) = 1.049, p = 0.375), so that null hypothesis H03 was not rejected. The between-subjects factor "Group" showed no significant influence on the DoP (F(1,30) = 1.056, p = 0.312), so that null hypothesis H04 could not be rejected. Statistically significant interactions were not found (SM*Group: F(3,90) = 0.470; p = 0.704). When using the supporting mode "Human", the average DoP was 12.66% higher compared with the DoP of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average DoP of the robot condition was 31.67% lower. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average DoP that was 12.57% higher than under the "no support" condition. Group 1 – male (future) engineers: The supporting mode (SM) did not significantly influence the DoP (F(3,45) = 0.900, p = 0.449), so that null hypothesis H03.1 was not rejected. The average DoP of the "Human" condition was 16.73% lower compared with the DoP of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average DoP of the robot condition was 49.36% lower. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average DoP that was 16.06% lower than under the "no support" condition. Group 2 – male technicians: The supporting mode (SM) did not significantly influence the DoP (F(3,45) = 0.667, p = 0.576), so that null hypothesis H03.2 was not rejected. When using the supporting mode "Human", the average DoP was 46.32% higher compared with the DoP of the "no support" condition. When comparing the supporting modes "Robot" and "no support", the average DoP of the robot condition
was 11.41% lower. In the comparison between the "Robot & Human" and "no support" conditions, the "Robot & Human" condition led to an average DoP that was 45.38% higher than under the "no support" condition. Results of the Visual Fatigue Questionnaire. As pointed out, the data of the visual fatigue questionnaire were collected before and after the last (fourth) trial. A difference of +1 represents an increase in visual fatigue of 10%. In Group 1 (engineers), the largest increases in visual fatigue are 1.59 for the item "neck pain", 1.47 for the item "headache", and 1.31 for the item "mental fatigue". In Group 2 (technicians), the largest increase in visual fatigue is 0.81 for "neck pain". The acquired fatigue values are much smaller than expected. Nevertheless, it is important to point out that the consecutive time-on-task with the HMD worn was only a short period in the experimental study. It is to be expected that longer load cycles would significantly increase visual fatigue [6].
4 Summary and Conclusion
In this paper, the results of a laboratory study on the cooperative disassembly of a LEGO object by a human and a robot were presented. Different supporting modes (SM) and two subject groups differing in their professional education were evaluated with regard to the normalized human disassembly time (HDTnorm) and the distance to optimal performance (DoP). Across all subjects, the supporting mode had a significant influence on the normalized human disassembly time (HDTnorm), but with a difference between the groups: the supporting mode had less influence on the HDTnorm for group 1 than for group 2, and within group 1 the differences between the four modes were not significant. Fig. 3 shows that in group 2 (technicians) only the "Robot" mode led to an improvement compared with the "no support" condition. In contrast, the "Robot & Human" mode led to a considerable increase of the HDTnorm (29.3%). Regarding the distance to optimal performance (DoP), the influence of the supporting modes was insignificant, but the experiment showed that the "Robot" mode achieved the best results regarding the average DoP compared with the "no support" mode (improvements of about 49% and 11% in the two groups, respectively). In both groups, the conditions "Human" and "Robot & Human" scored much lower than the "Robot" mode, and in the technicians group these two conditions led to a considerable deterioration compared with the "no support" condition. The deterioration under the "Robot & Human" condition can be attributed to the higher mental workload caused by the larger amount of presented information, whereas the "Robot" condition is directly linked to the fulfillment of the task (the human operator has to disassemble as few bricks as possible; hence, the robot has to disassemble as many bricks as possible). Depending on the supporting mode, the users have to perceive, analyze, and process different amounts of presented information, which is a strenuous and time-consuming operation. For cooperative (human-robot) disassembly under the aforementioned conditions, and on the basis of the empirical findings, the following design recommendations can be given. In a group of engineers it does not matter which
supporting mode is presented, so all modes can be offered for the operator to use alternatively. In a group of technicians, the "Robot" mode seems to be most promising. Acknowledgement. The authors would like to thank the German Research Foundation DFG for its kind support within the Cluster of Excellence "Integrative Production Technology for High-Wage Countries".
References
1. Ewert, D., Mayer, M., Kuz, S., Schilberg, D., Jeschke, S.: A hybrid approach to cognitive production systems. In: Garetti, M. (ed.) Proceedings of the International Conference Advances in Production Management Systems, Cernobbio, pp. 1–8 (2010)
2. Kempf, T., Herfs, W., Brecher, C.: Cognitive Control Technology for a Self-Optimizing Robot Based Assembly Cell. In: Proceedings of the ASME 2008 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference (2008)
3. Mayer, M., Odenthal, B., Faber, M., Kabuß, W., Jochems, N., Schlick, C.: Cognitive Engineering for Self-Optimizing Assembly Systems. In: Karwowski, W., Salvendy, G. (eds.) Advances in Human Factors, Ergonomics, and Safety in Manufacturing and Service Industries, pp. 1–11. Taylor & Francis, Boca Raton (2010)
4. Odenthal, B., Mayer, M., Kabuß, W., Kausch, B., Schlick, C.: Investigation of Error Detection in Assembled Workpieces Using an Augmented Vision System. In: Proceedings of the IEA 2009 - 17th World Congress on Ergonomics (CD-ROM), Beijing, China, pp. 1–9 (2009)
5. Liepmann, D., Beauducel, A., Brocke, B., Amthauer, R.: IST 2000 R - Intelligenz Struktur Test 2000 R Manual (2007)
6. Pfendler, C., Schlick, C.: A comparative study of mobile map displays in a geographic orientation task. Behaviour & Information Technology 26(6), 455–463 (2007)
7. Schlick, C., Bruder, R., Luczak, H.: Arbeitswissenschaft. Springer, Berlin (2010), ISBN 978-3-540-78332-9; e-ISBN 978-3-540-78333-6
Polymorphic Cumulative Learning in Integrated Cognitive Architectures for Analysis of Pilot-Aircraft Dynamic Environment Yin Tangwen and Shan Fu School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, P. R. China [email protected]
Abstract. A Polymorphic Cumulative Learning (PCL) conception is proposed in order to make it feasible for Digital Pilots (DP) to incorporate various forms of learning mechanisms as rich as those of humans. An integrated cognitive architecture for analysis of the pilot-aircraft dynamic environment (ICA-APADE), intended to facilitate aircraft design and evaluation, is devised as a framework to make the analysis and implementation of the PCL conception more concrete and practical.
1 Introduction
Digital pilots are of great significance in aircraft design and evaluation. Yet few digital pilots have the capability to incorporate various forms of learning mechanisms as rich as those of humans, which leads to manifest infidelity and confines their utilization. The purpose of this paper is to work out more plausible learning mechanisms that could be adopted by digital pilots in integrated cognitive architectures for analysis of the pilot-aircraft dynamic environment, which in turn will facilitate aircraft design and evaluation. One of the common characteristics in the shaping of such various forms of learning mechanisms is that humans can learn how to recognize and predict erroneous actions at different levels, and the achieved learning mechanisms themselves become appropriate tools and amplified strengths [1] that can be intelligently utilized to improve the working environment by reducing the number of erroneous actions before the situation becomes unfavorable or even unforgiving [1, 2, 3]. Fulfilling the purpose of this paper means that the expected digital pilots should be able to recognize and predict erroneous actions, and be able to reduce the number of erroneous actions [1]. To be more specific, working out more plausible learning mechanisms means that the expected digital pilots should be able to establish various forms of learning mechanisms while recognizing and predicting erroneous actions, and be able to accumulate the learning mechanisms established while reducing the number of erroneous actions. The above establishment and accumulation of various forms of learning mechanisms that can be incorporated by digital pilots may be characterized as Polymorphic Cumulative Learning (PCL).
2 Overview
2.1 Overview
Humans possess the following essential elements of intelligence [4]:
(A) Have a definite overall objective
(B) Be able to set specific goals and problems to be solved under certain circumstances
(C) Be able to obtain related information on the premise of some given context of problem-environment-goal
(D) Be able to refine the information and extract relevant knowledge to accomplish cognition
(E) Be able to activate the knowledge to generate intelligent strategies for the confronted context of problem-environment-goal
(F) Be able to convert the intelligent strategies to intelligent behavior, and ultimately solve the problems to achieve the goals
(G) Be able to set new goals and problems to be solved according to the overall objective and the new environment
Among the above seven essential elements, A, B, and G are exclusive to humans, while the remaining ones are desirable for and achievable by human artifacts. Humans have the ability to abstract knowledge from information and to activate that knowledge to gain intelligence. More importantly, humans can convert the intelligence into intelligent tools that make them more intelligent and more capable.
2.2 Digital Pilots' Intelligence
Digital pilots are a kind of virtual human artifact, and their intelligence is confined to C, D, E, and F. Therefore, digital pilots' intelligence can be defined as the capability to obtain related information purposefully in the context of problem-environment-goal, to refine the obtained information properly so as to extract relevant knowledge and accomplish cognition, to generate intelligent strategies in combination with the subjective goals, and to successfully solve the problems in the given environment using the intelligent strategies generated.
2.3 Digital Pilots' Learning
Digital pilots' intelligence relies highly on the accomplishment of the cognition of the environment and the generation of the strategies to tackle the environment. Both the environment cognition process and the strategy generation process should be sufficient for the digital pilots to cope with various kinds of tasks in various kinds of scenarios. Therefore, digital pilots' learning mainly lies in the environment cognition process and the strategy generation process. The Polymorphic Cumulative Learning (PCL) conception mentioned above is indispensable for digital pilots to have the capability to incorporate various forms of learning mechanisms as rich as those of humans.
2.4 Integrated Cognitive Architecture
Since digital pilots' intelligence is a subset of human intelligence, in order to build intelligent digital pilots, and more specifically here to clarify and implement the Polymorphic Cumulative Learning (PCL) conception, the digital pilots should be integrated into a multifunctional architecture. Such a multifunctional architecture should be a quaternity consisting of a Digital Pilot Cognitive Model (DPCM), a Modeling Environment (ME), a Simulation Framework (SF), and a Tool for Digital Pilot Performance Analysis (TDPPA) [10]. The effort of this paper is to explore learning mechanisms in the Digital Pilot Cognitive Model, with the intermediate goal of analyzing the pilot-aircraft dynamic environment and the ultimate objective of facilitating aircraft design and evaluation. For this reason, the multifunctional architecture mentioned above is here regarded as an Integrated Cognitive Architecture (ICA).
2.5 Analysis of Pilot-Aircraft Dynamic Environment
Any preferable integrated cognitive architecture should meet the fundamental and ultimate requirement of supporting intelligent behavior in fulfilling real-time tasks, especially under time pressure. Cognitive processing is essentially equivalent to deliberation that drives conditioned responses, i.e., intelligent behavior. So an integrated cognitive architecture's kernel is cognitive processing. Based on this understanding, an Integrated Cognitive Architecture for Analysis of Pilot-Aircraft Dynamic Environment (ICA-APADE) is proposed in this paper. The integrated cognitive architecture consists of two loosely coupled models, namely ICA and APADE. The ICA model mainly corresponds to the virtual digital pilots along with an interface slot into which any virtual external environment can be plugged, and the APADE model corresponds to the external dynamic environment. The purpose of combining the two models is to provide a context for the learning mechanisms conceived in this paper. As stated above, the ICA is multifunctional and can be treated as a Modeling Environment (ME), so that it can create and instantiate an APADE model and plug the APADE model into the corresponding interface slot [5].
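As a hedged illustration of the loose coupling and the plug-in slot described above (our sketch; all class and method names are hypothetical, not the authors' implementation), the quaternity could be expressed as follows:

```python
from typing import Protocol

class Environment(Protocol):
    """Plug-in slot: any virtual external environment (APADE) fits here."""
    def observe(self) -> dict: ...
    def act(self, command: dict) -> None: ...
    def step(self, dt: float) -> None: ...

class DigitalPilotCognitiveModel:
    """DPCM: perceives the environment and the goals, then acts."""
    def deliberate(self, observation: dict, goals: dict) -> dict:
        # Placeholder for cognitive processing (perception, prediction,
        # planning, learning); a real model would live here.
        return {"control": 0.0}

class SimulationFramework:
    """SF: drives the DPCM-environment loop; ME and TDPPA omitted for brevity."""
    def __init__(self, pilot: DigitalPilotCognitiveModel, env: Environment):
        self.pilot, self.env = pilot, env

    def run(self, goals: dict, steps: int, dt: float = 0.02) -> None:
        for _ in range(steps):
            obs = self.env.observe()
            self.env.act(self.pilot.deliberate(obs, goals))
            self.env.step(dt)
```

The point of the Protocol is that the ICA side never depends on a concrete APADE; any compatible environment model can be instantiated and plugged into the slot.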
3 The ICA-APADE and Its Theory Background
The ICA-APADE was motivated by the need to support aircraft design and evaluation in their earlier stages, and was conceived with inspiration from scientific fields such as System Science, Cognitive Science, Synergetics, Control Theory, Information Theory, Intelligence Science, Modeling and Simulation, and Computer Science.
3.1 System View of the ICA-APADE
The apparent complexity of an ant's behavior over time is largely a reflection of the complexity of the environment in which it finds itself [1]. Likewise, the complexity of a system's behavior is largely a function of the complexity of the system's local environment, as well as a function of its internal complexity.
(Figure: block diagram with the modules Objective, Goals and Problems, DPCM, and Environment (APADE), plus the Configuration Tool and the Analysis Tool, composing the ICA-APADE.)
Fig. 1. System View of the ICA-APADE
A system view of the ICA-APADE, with only its key components sketched, is drawn on the basis of this reflection rationale (see Fig. 1). For the ICA-APADE itself, the Objective module is its external environment. The Goals and Problems module is a skeleton to be filled in according to the Objective. The Environment module is a combination of models such as the aircraft and its surrounding atmosphere [5]. The DPCM module acts as a user that solves the specified problems and thereby achieves the goals. The Configuration Tool module and the Analysis Tool module are self-explanatory.
3.2 Control and Information View of the ICA-APADE
A control and information view of the ICA-APADE is conceived as shown in Fig. 2. What is unique here is that the DPCM (the virtual pilot) has to handle two tasks concurrently instead of one, and the two tasks are treated equivalently. The first task is to cope with the environment, which is what conventional control and information systems actually do. The second task is to arrive at the goals and solve
(Figure: the DPCM is linked to the Goals and Problems module via Acquisition (Goal) and Control (Solve) flows, and to the Environment (APADE) via Acquisition (APADE) and Control (Operate) flows, all within the ICA-APADE.)
Fig. 2. Control and Information View of the ICA-APADE
the problems. The two tasks are different faces of the same objective. As can be seen, there are two information flows and two control flows. All the flows interlock rigidly with one another in the DPCM, as they are subject to the same objective. Such a conception allows the ICA-APADE to utilize theory from Synergetics [6, 7, 8, 9], on which the Polymorphic Cumulative Learning (PCL) proposed earlier is based. The two information flows represent the acquisition of information from both the goals to be achieved and the environment confronted. In terms of Synergetics, the information flows are Macro-to-Micro processes of Information Compression (MMIC), which can be realized by a top-down approach introduced by H. Haken [6, 7]. The two control flows are Micro-to-Macro processes of Strategy Generation (MMSG), which can be achieved by a bottom-up approach established by H. Haken [8, 9]. It is worth noting here that the acquisition of goal-oriented information serves the control of the environment (to operate the aircraft), while the acquisition of environment-oriented information serves the control of the problems (to solve them). Each acquisition-control situation corresponds to one of the two tasks mentioned above. Since the two tasks are just different faces of the same objective, the two situations are bound to validate each other mutually. This mutual validation is the foundation of Polymorphic Cumulative Learning (PCL).
3.3 Learning View of the ICA-APADE
Based on the insight above, it is natural to reach a learning view of the ICA-APADE (see Fig. 3). Before explaining the learning view, some basics of Synergetics are necessary.
(Figure: three coupled modules, each with its own state vector and dynamics. G1 Goals and Problems: q_G = {q_Gi(T_i0, T_i, I_i, A_i)}, with q̇_G = N_G(q_G) + F_G(t); D1 DPCM: q_D = {q_Dj}, with q̇_D = N_D(q_D) + F_D(t); E1 Environment (APADE): an analogous state equation. The order parameters ξ_Gu, ξ_Eu, and ξ_Du are exchanged among the modules.)
Fig. 3. Learning View of the ICA-APADE
The status of a system can be described by a group of variables {q_i}; its status vector q(t) = (q_1(t), ..., q_N(t)) can be expressed by

\[ \dot{q} = N(q, \alpha) + F(t) \]  (1)

where N represents the deterministic part, F the fluctuation effect, and α the control parameters [6, 7, 8, 9].
The top-down approach to realizing a Macro-to-Micro process of Information Compression (MMIC) can be summarized as follows [6, 7]: Suppose a system is in one of its attractor statuses (neutral status) before a message arrives; the reception of the message means that the initial values of α and q are to be determined. The system may remain in its neutral status or transfer to a new attractor status. If the system enters an attractor k upon a message j, define a matrix element M_jk = 1 (otherwise M_jk = 0). The system may enter several different attractors if the fluctuation effect is taken into account.
Definition 1: The relative importance of attractor k is P'_k.
Definition 2: The relative importance of message j, with fluctuation considered, is

\[ P_j = \sum_k L_{jk} P'_k = \sum_k \frac{M_{jk}}{\sum_{j'} M_{j'k}} \, P'_k \]  (2)

Definition 3:

\[ S^{(0)} = -\sum_j P_j \ln P_j, \qquad S^{(1)} = -\sum_k P'_k \ln P'_k \]  (3)

Theory of maximum information entropy: The probability distribution of a system's statuses can be inferred by maximizing S^{(0)} under the constraint

\[ \sum_j P_j f_j^{(k)} = \bar{f}_k \]  (4)

where \bar{f}_k is the mean value of the possible fluctuation. The message with the maximum value of P_j is the message to be sent. So the theory of maximum information entropy can be utilized to pick out the message to be sent by the G1 module and the E1 module (see Fig. 3).
The bottom-up approach to realizing a Micro-to-Macro process of Strategy Generation (MMSG) can be summarized as follows [8, 9]: Suppose the deterministic part of Eq. (1), \dot{q} = N(q, \alpha), with a given control parameter α_0, has the solution q_0, and suppose q = q_0 + w. Then the equation can be rewritten as \dot{q}_0 + \dot{w} = N(q_0 + w, \alpha). Expanding this as a series and keeping only the first two terms gives \dot{q}_0 + \dot{w} = N(q_0, \alpha) + L(q_0) w, which eventually reduces to

\[ \dot{w} = L(q_0)\, w \]  (5)

The general solution of Eq. (5) is w(t) = e^{\lambda t} v, with eigenvalues {λ_i} and eigenvectors {v_i(x)}. Then we obtain the expression

\[ q = q_0 + \sum_j \xi_j v_j(x) = q_0 + \sum_u \xi_u v_u(x) + \sum_s \xi_s v_s(x), \qquad \lambda_u > 0, \ \lambda_s < 0 \]  (6)
Equation (6) is an order parameter equation: the ξ_u are called order parameters, while the ξ_s are called servo (slaved) parameters, which can be expressed in terms of the ξ_u. According to Synergetics, the ξ_u represent the macroscopic spatial, temporal, or functional structure of the system. So the ξ_u can be used to generate the DPCM's intelligent behavior (see Fig. 3). To be more explicit, ξ_Gu indicates which statuses of the Goals and Problems module are to be updated, including both directions and amplitudes; ξ_Eu indicates which kinds of information are to be obtained from the environment; ξ_Du indicates which kinds of intelligent behavior are to be generated to solve the problems, and hence to achieve the goals, via manipulation of the environment. The previous explanation reveals that the top-down approach can be used to determine what the subsystems (Goals and Problems, Environment) need to change (to achieve the goals or to cope with the environment) via information compression, and the bottom-up approach can be used to determine what the subsystem (DPCM) needs to do (to solve the problems or to operate the aircraft) via strategy generation. Selective, limited attention is an obvious property of the human perceptual system; it has been implicated as a source of 'bounded rationality' in decision-making, and it both strongly affects and notably limits intelligent behavior. The implication of this property is that attention shifts are driven by the particular dimension of task information needed. So the evaluation of each option or preference is confidently based on the currently gathered information. Apart from the use of the currently gathered information for option or preference activation, which forms the course of cognitive processing or deliberation itself, the available information can also be used to validate the internal dynamics of the integrated cognitive architecture, and hence to build Polymorphic Cumulative Learning (PCL) mechanisms. According to the Integrated Cognitive Architecture for Analysis of Pilot-Aircraft Dynamic Environment (ICA-APADE) above, it is clear that cognitive processing (perception, prediction, planning, and learning) or deliberation is at the core of the framework, and two major information loops are established via cognitive processing [10]. The essence and effect of the ICA-APADE are thereby made clear. When implementing the ICA-APADE, the Cognitive Processing module can be built as the kernel, and other ancillary modules can be built and plugged in on top of the Cognitive Processing module. Both information loops in the ICA-APADE are Self-Validation Loops (SVL). That is why the available information can be used to build polymorphic cumulative learning mechanisms. During the course of deliberation for decision-making, Cognitive Processing performs perception, prediction, planning, and learning concurrently. Since the ICA model and the APADE model are loosely coupled, they can be developed independently, and other types of compatible APADE can be plugged into the integrated cognitive architecture. Once an initial ICA model and a ready APADE model are in position, the ICA will evolve until it is sufficiently mature to cope with the scenarios defined by reference to the integrated cognitive architecture.
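The mode decomposition behind Eqs. (5) and (6) can be illustrated numerically. The sketch below is our illustration (the example dynamics and all names are assumptions): it linearizes a system around a reference state, eigendecomposes the Jacobian L(q_0), and separates order parameters (unstable modes, λ > 0) from slaved servo modes (λ < 0):

```python
import numpy as np

def decompose_modes(jacobian, q, q0):
    """Project the deviation w = q - q0 onto the eigenmodes of L(q0).

    Returns (order_params, servo_params): the mode amplitudes xi
    belonging to eigenvalues with positive real part (order parameters)
    and those with negative real part (slaved/servo parameters).
    """
    lam, V = np.linalg.eig(jacobian)      # L(q0) v_i = lambda_i v_i
    xi = np.linalg.solve(V, q - q0)       # amplitudes in the eigenbasis
    unstable = lam.real > 0
    return xi[unstable], xi[~unstable]

# Toy example: one unstable and one stable direction around q0 = 0.
L = np.array([[0.3, 0.0],
              [0.0, -2.0]])
order, servo = decompose_modes(L, q=np.array([0.1, 0.05]), q0=np.zeros(2))
print("order parameters:", order)   # slow, growing mode: drives behavior
print("servo parameters:", servo)   # fast, decaying mode: slaved
```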
4 Results and Conclusion
The ICA-APADE takes into consideration pilots' characteristics and human factors, which traditional nonlinear flight dynamics systems fail to do. The decisive step is the Polymorphic Cumulative Learning mechanisms, which draw inferences from cross-attested knowledge structures concealed in cognitive processing. Acknowledgments. This research work was supported by the National Basic Research Program of China (973 Program, No. 2010CB734103).
References
1. Hollnagel, E.: The Phenotype of Erroneous Actions: Implications for HCI Design. Academic Press Limited, London (1991)
2. Byrne, M.D., Kirlik, A.: Using Computational Cognitive Modeling to Diagnose Possible Sources of Aviation Error. The International Journal of Aviation Psychology 15(2), 135–155 (2005)
3. Leiden, K., Keller, J., French, J.: Context of Human Error in Commercial Aviation. Micro Analysis & Design, Inc. (2001)
4. Zhong, Y.X.: An introduction to cognetics: theory of transforms from information to knowledge and further to intelligence. Chinese Engineering Science 6, 1–8 (2004)
5. Zipfel, P.H.: Modeling and Simulation of Aerospace Vehicle Dynamics. The American Institute of Aeronautics and Astronautics, Inc. (2007)
6. Haken, H.: Synergetic Computers and Cognition: A Top-Down Approach to Neural Nets. Springer, Berlin (1991)
7. Haken, H.: Information and Self-Organization: A Macroscopic Approach to Complex Systems. Springer, Berlin (1988, 2000)
8. Haken, H.: Advanced Synergetics. Springer, Berlin (1983)
9. Haken, H.: Principles of Brain Functioning: A Synergetic Approach to Brain Activity, Behavior and Cognition. Springer, Berlin (1996)
10. Foyle, D.C., Hooey, B.L.: Human Performance Modeling in Aviation. Taylor & Francis Group, LLC (2008)
A Context-Aware Adaptation System for Spatial Augmented Reality Anne Wegerich and Matthias Rötting Technische Universität Berlin, Chair of Human-Machine Systems, Franklinstr. 28/29, 10587 Berlin, Germany {anne.wegerich,roetting}@mms.tu-berlin.de
Abstract. From an HCI point of view Augmented Reality (AR) displays have a very specific characteristic. The shown information consists of a virtual and a nearly uncontrollable and cluttering real part. Thus, AR visualizations can be ambiguous, imprecise, or difficult to understand. To avoid this, the presented project shows a systematic method for adapting AR visualizations to the user's needs and perceptual properties. Therefore, we firstly reused and extended established 2D visualization models with AR specific visualization parameters and secondly incorporated known AR specific perception facts to set up an AR smart home control system in a kitchen. The control system involves spatial AR displays (sAR) and makes use of context information like current user tasks or the user's position. The goal is to provide a generic approach of displaying unambiguous AR information at the right time and location without overwhelming the user. Keywords: Spatial Augmented Reality, Information Visualization, Perception, Evaluation, Context-Awareness, Smart Kitchen Displays.
1 Introduction
By definition, Augmented Reality combines virtual information with the real world, in real time and with a content-related connection to the point where it is presented [1]. This task still implies difficulties for hardware and software because of their (physical) constraints. This is one of the reasons why AR technologies were not usable for a long time. But current system developments show the applicability of AR via mobile phones or light-weight Head-Up-Displays (HUDs). However, besides the still remaining problems like tracking, registration, occlusions, etc., virtual superimpositions with AR technologies also involve information visualization (InfoVis) as a main task. Currently, the AR information that is shown is typically whatever the respective (rendering) system is able to provide, sometimes even without following all parts of the AR definition (see Fig. 1). But for most applications these presentations are not the result of an evaluated visualization synthesis. Thus, users often have problems understanding the AR information, which is a problem not only for safety-critical contents.
Fig. 1. Example of bad superimpositions. Most notably, it is unclear which restaurant in the user's field of view is meant. Also, the navigation list on the left side does not have a specific relation to the real scene and thus does not fulfill the AR definition in a strict sense. In detail, this means that no distinction could be made between this "AR display" and any other mobile device with the shown applications.
AR InfoVis is the focus of our project UARKit (Ubiquitous Augmented Reality Kitchen), the user-centered and context-aware adaptation system for sAR displays. The UARKit project follows the UAR paradigm [9], in which several AR (interaction) concepts (tangible, multimodal AR) are brought together. As an additional specification, we only use spatial AR devices. Spatial AR (sAR) uses only fixed projection systems or HUDs instead of mobile devices, to keep the user from wearing heavy display and camera systems. Thus, the sAR devices present information at the location where it belongs, in a user-friendly way. But to avoid overwhelming the user with information projections, a control system will accomplish the visualization synthesis based on constraints of human perception, the AR-specific visualization guidelines, the user's position, and the context. The following two sections will explain how sAR visualization can be adapted to the user's perception. They relate to 2D visualization parameters and empirical experience with commercially available AR display technologies. To implement a visualization synthesis for the UARKit project, we carried out a systematization of these parameters and tried to identify open questions concerning specific sAR visualization characteristics for the kitchen application. The fifth section therefore also refers to additional AR perception experiments.
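To make the intended visualization synthesis more tangible, the sketch below outlines a hypothetical control loop for such a system. All names, the scoring heuristic, and the context attributes are our assumptions; the actual UARKit system is described only conceptually in this paper:

```python
from dataclasses import dataclass

@dataclass
class Context:
    user_pos: tuple   # tracked position in the kitchen (x, y) in m
    task: str         # current user task, e.g. "boil_water"
    clutter: dict     # per-surface background clutter estimate in [0, 1]

@dataclass
class Surface:
    name: str         # projection surface / HUD region
    pos: tuple        # surface location (x, y) in m

def choose_display(context: Context, surfaces: list, relevance: dict) -> Surface:
    """Pick the projection surface for the next information item.

    Heuristic: prefer surfaces that are task-relevant, close to the
    user, and have a calm (low-clutter) background.
    """
    def score(s: Surface) -> float:
        dist = ((s.pos[0] - context.user_pos[0]) ** 2 +
                (s.pos[1] - context.user_pos[1]) ** 2) ** 0.5
        return (relevance.get((context.task, s.name), 0.0)
                - 0.5 * dist
                - 1.0 * context.clutter.get(s.name, 1.0))
    return max(surfaces, key=score)

ctx = Context(user_pos=(1.0, 2.0), task="boil_water",
              clutter={"stove_hood": 0.2, "counter": 0.7})
surfaces = [Surface("stove_hood", (1.2, 2.1)), Surface("counter", (3.0, 0.5))]
relevance = {("boil_water", "stove_hood"): 1.0, ("boil_water", "counter"): 0.3}
print(choose_display(ctx, surfaces, relevance).name)  # -> stove_hood
```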
2 Theory of 2D Vis Parameters and Their Transferability In most cases AR uses 2D display surfaces or position-dependent 2.5D presentations to simulate 3D. Thus, AR InfoVis deals with problems similar to those of 2D InfoVis. To identify the transferable rules of 2D InfoVis, one has to distinguish between the different aims of (s)AR applications. Many of them do not need an abstraction of the virtual visualization because their goal is to present a virtual object as realistically as possible. This
is not the aim of visualization adaptation. Thus, for our project it is necessary to reduce the set of proposed AR visualization types with a categorization based on their function. Following Dubois and Nigay [5], two basic concepts lead to four categories (see Table 1).
Table 1. The four AR types with the aim of this work tagged (grey)
The visualization adaptation only makes sense for virtual information about a real object (No. 4). This incorporates abstraction of the information and thus has to make use of parameters for a reduced form of information presentation. The other kinds of AR information presentation either do not relate to a real object or aim to manipulate the real object in a virtual way, which means that the augmentation again leads to a virtual object. This is not the aim of this work, because virtual or abstract objects can have any representation a designer creates (e.g., a file icon on a desktop). The research field of (2D) InfoVis solves similar visualization problems for two-dimensional virtual (desktop) displays. It is mainly based on findings concerning perception (e.g., the Gestalt Laws [13], the Semiology of Graphics [2], etc.) and combines parameters such as a specific color or shape of the presented information to enforce the right interpretation of the data. For 2D visualization, basic models determine visualization parameters and dependencies as a given set which is later used to adapt the visualization for a specific application. Typically these models contain visualization possibilities ranging from textual to graphic forms and from static to animated forms. They also involve image variables such as plane, size, value, and relation of the presentation, as well as differential variables of the image such as texture, color, orientation, and shape. All parameters are basic forms except value and relation, which are formed by a selected combination of the others. To specify the parameters (which color, orientation, shape, etc. to present), these models also determine (domain-related) dependencies. These are the anchors for choosing the best possible visualization. The dependencies involve user skills (from novice to expert), data types (object data, attribute data, and meta-information), and data relationship structures (linear, circular, ordered-tree relations, etc.). Other dependencies describe task types (based on the task types of Shneiderman [10]), interactivity types, and contextual information such as the user's life experience, history, intent, needs, and devices. For this model and others a large set of rules exists to optimize the information presentation. Each rule gives guidelines to specify single parameters in general or for a
special purpose. To set up the visualization for a given system, the developer has to determine the current dependencies in order to reduce the set of rules that have to be applied for the task at hand. Different approaches exist to automate this determination, such as visualization exploration models [7] or expert systems. 2D InfoVis parameters are not naturally transferable to AR technology. One general rule says, for example, that graphic visualizations are always better than annotations alone (textual form). For a virtual augmentation, however, it is relevant how cluttered the background at the proposed location is. Sometimes a graphical visualization of the information about a real object is also very complex, consists of many (heterogeneous) parts, and is thus not perceivable in front of a cluttered background. In such a case an annotation might help because of its homogeneity. So it has to be tested which 2D parameters are transferable and whether they have to be organized into a hierarchy for AR-specific conditions. Thus, for our project we collect a list of rules and set up a knowledge base which follows the principles of good visualizations (expressivity, effectiveness, and appropriateness) and the known AR perception characteristics; a minimal sketch of such dependency-based rule reduction follows.
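The following minimal Python sketch illustrates this kind of dependency-based rule reduction. The rule names, dependency keys, and values are hypothetical and only stand in for the knowledge base described above; it is a sketch of the selection mechanism, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    name: str
    applies_when: dict = field(default_factory=dict)  # dependency -> allowed values
    parameter: str = ""                               # visualization parameter it sets
    value: str = ""                                   # recommended setting

def applicable(rules, context):
    """Keep only the rules whose dependency constraints match the current context."""
    return [r for r in rules
            if all(context.get(dep) in allowed
                   for dep, allowed in r.applies_when.items())]

rules = [
    Rule("prefer-graphics", {"background": {"homogeneous"}}, "form", "graphic"),
    Rule("prefer-annotation", {"background": {"cluttered"}}, "form", "textual"),
    Rule("novice-legend", {"user_skill": {"novice"}}, "legend", "verbose"),
]

# A cluttered background selects the annotation rule instead of the general
# "graphics are better" rule, mirroring the hierarchy discussed above.
for rule in applicable(rules, {"background": "cluttered", "user_skill": "expert"}):
    print(rule.name, "->", rule.parameter, "=", rule.value)
```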
3 Established AR Specific Vis Parameters The existing InfoVis models only involve parameters to optimize 2D (desktop) displays. With them it is possible to guide attention from one piece of virtual information to another (in a relatively small virtual space) or to facilitate the interpretation of virtual data relationships, e.g., by presenting them in a virtually rotatable 2.5D tree structure. But if the information consists of virtual and real parts, it should be distinguishable from the other real information in the scene. First, the user has to recognize which parts of the real world belong to the virtual augmentation (and which do not). Then he has to understand the meaning of the superimposed information. An additional difficulty is that in most cases the virtual information is not complete or self-explanatory without its real part. Many AR visualization parameters already facilitate the handling of these tasks. Typically these parameters belong to a meta-level of the basic InfoVis parameters, which means they combine, e.g., color and shape in a specific way to achieve the right interpretation. These (still incomplete) parameters and corresponding exemplary rules are: • Transparency. When using transparent superimpositions, maximize the contrast between the (transparent) virtual and real parts of the information; the background must be as homogeneous as possible to improve perception [6]. • Motion Parallax. Subjects underestimate sizes and distances of objects while they are moving (while motion parallax is involved); this is similar to a known Virtual Reality effect [8]. • Cognitive Tunneling. Users tend to fixate the virtually displayed information and miss events in the real scene. This effect is multiplied by heterogeneous and distributed superimpositions [3]. • Conformal Symbology. Perception can be improved by superimposing conformal symbology at the location of the real object it represents [3].
• Nonconformal Symbology. Should be presented at a distance from the relevant object in reality to keep the object visible [3]. • Clutter. Clutter causes cognitive tunneling. Highlighting important things and lowlighting (reducing the luminance and contrast of) less important things helps to avoid this [3]. • Depth Cues. The usage of depth cues improves the estimation of the proposed projection depth. Pictorial cues are, e.g., interposition, linear perspective, and density of texture [4]. • Gestalt Laws. Very common rules for supporting the connection of (displayed) information parts (in this case real and virtual parts) [13].
4 Experiments for Missing Parameters and Visualization Synthesis Collecting and formalizing general rules in a knowledge base lays the foundation for a rule-based expert system that can automate the decision for the best possible sAR InfoVis. However, gathering visualization guidelines for different types of information in a complex setting generates a huge number of rules, which is a serious problem. We therefore use the context of the UARKit project and the incorporated sAR display technologies to reduce them. The analysis of missing parameters is the result of a reduction process over the possible types of information, user positions, context data, and display limitations. Furthermore, by using a fuzzy expert system (fXPS) we do not need a complete decision tree containing only consistent rules. Fuzzy inference engines operate with approximate or plausible reasoning, which needs fewer (and fuzzily defined) rules to produce the most feasible result (visualization) for a given set of accessible and perhaps inconsistent rules. Another advantage is that rule definitions need not be only true (=1) or false (=0); terms can be defined nominally using any value between true and false. For example, a piece of information can be very important, of average importance, or unimportant but nice, and the visualization accordingly has to be flashy, noticeable, or merely visible. In this way the XPS is more suitable for the real application UARKit and solves the problem of inconsistency among all the visualization rules. The visualization synthesis we propose is the result of the fuzzy inference engine of the XPS. With a draft of its architecture and its decision principles we also identified open questions for optimized perception of sAR InfoVis in the kitchen context: For further classification of usable visualization parameters, at one point of the context analysis it has to be decided whether the real object the virtual information belongs to exists physically or is an abstract idea in the real setting, such as a direction in which the user has to go. The next decision step for a physical object has to judge the kind of evaluation or execution. As already mentioned, the referenced object first has to be recognized in an efficient way. For this we have to find appropriate rules for visualizing the needed saliency of a real object and, while doing so, define specific parameters. For the abstract type of information we used indoor navigation as the example use case of the UARKit project. A minimal sketch of the graded rule evaluation described above follows.
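As a toy illustration of the fuzzy formalization described above, the following sketch maps a graded importance value to a salience level. The membership functions, term names, and output scale are assumptions for illustration only, not the authors' actual fXPS rules.

```python
def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def salience(importance):
    """Map importance in [0, 1] to a crisp salience value in [0, 1]."""
    # degree to which each rule fires: unimportant -> visible,
    # average important -> noticeable, very important -> flashy
    fire = {
        "visible":    tri(importance, -0.5, 0.0, 0.5),
        "noticeable": tri(importance,  0.0, 0.5, 1.0),
        "flashy":     tri(importance,  0.5, 1.0, 1.5),
    }
    level = {"visible": 0.2, "noticeable": 0.5, "flashy": 0.9}  # assumed outputs
    total = sum(fire.values())
    # weighted-average defuzzification over the fired rules
    return sum(fire[t] * level[t] for t in fire) / total if total else 0.0

print(salience(0.7))  # fairly important information -> salience of about 0.66
```

Because adjacent membership functions overlap, two partially contradicting rules can fire at once and still yield one usable output, which is exactly how the fuzzy approach tolerates an inconsistent rule set.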
Fig. 2. Operating mode and example of the fuzzy inference engine: deriving an AR visualization from the knowledge bases (light grey) and the currently associated rules (darker grey) for one state of visualization
The open questions concerning missing parameters for the UARKit project relate on the one hand to real-object recognition for physical manipulation tasks, and on the other hand to the comprehension of an abstract idea (in this case a direction) and the execution of the demanded indoor navigation task. In both settings we tested a parameter derived from (non)conformal symbology, which we call redundancy, and the parameter similarity as a combination of realism and the Gestalt Laws. Redundancy connects nonconformal and conformal symbology by merging the proposed rules of both and therefore reduces clutter: the referenced object is augmented where it is located, but the whole augmentation is presented at a distance from the visible real object. Similarity summarizes the Gestalt Laws related to the characteristics of AR visualizations (continued textures, similar motion, same or similar colors, etc.). Furthermore, it incorporates the degree of realism used in the virtual superimposition, because of the unsolved question of how realistic the augmentation must be to facilitate perception. 4.1 Building Block Experiment In this experiment we tested the influence of varied redundancy and similarity while presenting augmented markers on building blocks in a fixed building-block setting. The subjects had to recognize which block was meant in the shown AR visualization and predict where to put it in an imaginary rebuilding task. We found that redundancy brings no advantage when a simple contour can be presented, but there is also no decline in performance when using redundancy. Furthermore, distances between representations and the real object decrease performance. So conformal superimpositions are preferred, but only if they are distinguishable from the real object. An augmentation should therefore be of average similarity to improve its detection while keeping the relation between it and the real object. Such an augmentation is even as good as the commonly used contour.
4.2 Indoor Navigation Experiment In this experiment the subjects had to find different objects distributed in one room and were given varied projected navigational hints [12]. After determining the AR hints based on a questionnaire, we compared less redundant with redundant visualizations (arrows vs. maps) to examine the effect of varied redundancy. The analysis shows no difference between them for the performance variables. On the other hand, the user feedback shows that the more redundant visualizations (maps) are preferred. The arrow was found to be quite inexact, and users are very familiar with navigation systems and the corresponding maps. Within the scope of the provided Smart Home scenario, however, both visualizations are equally applicable. The second evaluated parameter was the realism of the visualization. We therefore presented two types of maps (a schematic and a textured version). The results again show no effect on user performance. Some participants reported not even having noticed that there were different types of maps. Accordingly, we could not find a preference for one of the maps after analyzing the questionnaire. Thus, we conclude that there is no necessity to make virtual information as realistic as possible for AR type 4 (see Section 2).
Fig. 3. Settings in the two experiments: a) building-block setting with augmented marker (blue block on the right side); b) projections and different hints for indoor navigation
5 UARKit System The proposed context-aware presentation system is the core of the development of an sAR system in a Smart Home environment. It will be integrated into a kitchen that involves different spatial AR displays and interaction paradigms (see Fig. 4) as well as diverse but fixed tasks. The context information consists of the user's position, the state of the current user task, the light conditions relevant for presenting information with projections and low-contrast displays, and information about objects in the kitchen (amount of needed ingredients for a meal, positions of needed tools, etc.). Context data is acquired through sensors and a context model for the steps of user tasks. For example, we incorporated an intelligent floor projection which changes the location and content of the projection. It visualizes information based on the position of the user and on whether perception would be efficient when light conditions, physical obstacles, and/or the user's distance to the projection are taken into account [11].
Fig. 4. Kitchen of the UARKit project
The software for user adaptation of the sAR visualization is Java-based, with the architecture of a Blackboard system. It incorporates different knowledge sources (KS) for information management, which each solve parts of the problem independently of one another. The system is thus platform independent and allows highly flexible configurations and problem-solving strategies, and it is not tied to one type of problem: the kitchen setting could be replaced by the setting of an industrial factory. Furthermore, the Blackboard system is able to deal with all the different pieces of information which lead to one visualization. The knowledge sources solve the tracking, handle the context model, the database of display restrictions, etc., and provide their current results to every other knowledge source. This agent-like behavior makes it easy to replace one component when parts of the context or the task change. The fXPS is one of the knowledge sources and gets the information needed by the inference engine from the other KS. As a result it provides the current visualization, which is displayed by the KS that handles display management. With the decision for the Blackboard system, the foundation is also laid for developing a multi-agent system, which is the next step towards a functionality based on learning algorithms. Moreover, we keep the UARKit system highly flexible and scalable for the generic approach of a user-centered adaptation of sAR visualizations. A minimal sketch of the blackboard control cycle follows.
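The sketch below illustrates the blackboard pattern described above in Python (the paper's system is Java-based). The knowledge sources, their trigger conditions, and the posted values are hypothetical stand-ins for tracking, the fXPS, and display management.

```python
class Blackboard(dict):
    """Shared store that all knowledge sources read from and write to."""

class Tracker:
    def can_run(self, bb): return "user_pos" not in bb
    def run(self, bb): bb["user_pos"] = (1.2, 0.4)   # stub sensor reading

class FuzzyXPS:
    def can_run(self, bb): return "user_pos" in bb and "visualization" not in bb
    def run(self, bb): bb["visualization"] = {"form": "conformal", "salience": 0.66}

class DisplayManager:
    def can_run(self, bb): return "visualization" in bb and not bb.get("displayed")
    def run(self, bb):
        print("project", bb["visualization"], "at", bb["user_pos"])
        bb["displayed"] = True

bb = Blackboard()
sources = [Tracker(), FuzzyXPS(), DisplayManager()]
for _ in range(3):                       # simple control cycle
    for ks in sources:
        if ks.can_run(bb):
            ks.run(bb)
```

Swapping the kitchen for a factory setting would then only mean registering different knowledge sources against the same blackboard, which is the flexibility argued for above.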
6 Summary and Outlook The presented UARKit approach outlines a generic way of adapting spatial Augmented Reality visualizations. We analyzed 2D InfoVis guidelines and tested their applicability to sAR concerns. Additionally, we extended the resulting set of rules with AR-specific knowledge from our own and external experiments. Afterwards, we reduced this immense number of rules using established rule-reduction methods based on the context of our application. The fuzzy formalization of the remaining rules also eases the handling of the proposed expert system with respect to consistency and completeness. The chosen architecture for the UARKit system design supports the component character of the involved problems, such as tracking, the context model, sensor monitoring,
etc., and the fXPS. It solves the problem of scalability and provides the possibility of exchanging components to adapt the system to other applications, user tasks, or new findings concerning AR visualization adaptation. Furthermore, it leads towards multi-agent systems and appropriate learning strategies, which are state of the art for Smart Home environments. Acknowledgements. We want to thank the Deutsche Forschungsgemeinschaft, the Chair of Human-Machine Systems Berlin, and prometei for their support.
References
1. Azuma, R., Baillot, Y., Behringer, R., Feiner, S., Julier, S., MacIntyre, B.: Recent Advances in Augmented Reality. IEEE Comput. Graph. Appl. 21, 34–47 (2001)
2. Bertin, J.: Semiology of Graphics. University of Wisconsin Press (1983)
3. Crawford, J., Neal, A.: A Review of the Perceptual and Cognitive Issues Associated with the Use of Head-Up Displays in Commercial Aviation. The International Journal of Aviation Psychology 16(1), 1–19 (2006)
4. Drascic, D., Milgram, P.: Perceptual Issues in Augmented Reality. In: Stereoscopic Displays and Virtual Reality Systems III, vol. 2653, pp. 123–134. SPIE, San Jose (1996)
5. Dubois, E., Nigay, L.: Augmented Reality: Which Augmentation for Which Reality? In: DARE 2000, Designing Augmented Reality Environments, Elsinore, Denmark, pp. 165–166 (2000)
6. Gabbard, J.L., Swan, J.E.: Usability Engineering for Augmented Reality: Employing user-based studies to inform design. IEEE Transactions on Visualization and Computer Graphics 14(3), 513–525 (2008)
7. Jankun-Kelly, T.J., Ma, K.-L., Gertz, M.: A Model and Framework for Visualization Exploration. IEEE Transactions on Visualization and Computer Graphics 13, 357–369 (2007)
8. Jones, J.A., Swan II, J.E., Singh, G., Kolstad, E., Ellis, S.R.: The Effects of Virtual Reality, Augmented Reality, and Motion Parallax on Egocentric Depth Perception. In: Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization, APGV 2008, New York, NY, USA, pp. 9–14 (2008)
9. Sandor, C., Klinker, G.: A Rapid Prototyping Software Infrastructure for User Interfaces in Ubiquitous Augmented Reality. Personal Ubiquitous Comput. 9, 169–185 (2005)
10. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley Longman Publishing Co., Inc., Boston (1986)
11. Wegerich, A., Adenauer, J., Dzaack, J., Rötting, M.: A Context-Aware Adaptation System for Spatial Augmented Reality Projections. In: ICINCO 2009: Proceedings of the 7th International Conference on Informatics in Control, Automation, and Robotics (2009)
12. Wegerich, A., Dzaack, J., Rötting, M.: Optimizing Virtual Superimpositions: User-Centered Design for a UAR Supported Smart Home System. In: IFAC 2010: Proceedings of the IFAC Human Machine Systems 2010 Symposium, Valenciennes, France (2010)
13. Wertheimer, M.: Über Gestalttheorie. Talk Transcription, Verlag der Philosophischen Akademie, Erlangen (1924)
Using Physiological Parameters to Evaluate Operator's Workload in Manual Controlled Rendezvous and Docking (RVD)

Bin Wu 1, Fang Hou 2, Zhi Yao 1, Jianwei Niu 2, and Weifen Huang 1

1 State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, 100094, P.R. China
2 Department of Logistics Engineering, University of Science and Technology Beijing, Beijing, P.R. China
[email protected]
Abstract. How to train astronauts for complicated spaceflight operations such as manually controlled rendezvous and docking (RVD) is a common and important question. In this paper, two training methods, named the routine method and the confusing method, were derived from astronaut training experience. A two-factor experiment was conducted, the factors being training method and operation difficulty level. Sixteen male subjects participated in the experiment: eight were trained with the routine method and the other eight with the confusing method. Physiological parameters, such as respiratory rate, body temperature, heart rate (HR), and heart rate variability (HRV), were adopted as the dependent variables to evaluate operator workload. Results show significant differences in body temperature, respiratory rate, the cubic root of the high frequency component (HF) of HRV, the normalized low frequency component (LFNu) of HRV, and mean heart rate (HR) between training methods (p values of 0.010, 0.000, 0.042, 0.009, and 0.000, respectively), while the other physiological parameters show no significant difference. Furthermore, only a significant difference in body temperature (p = 0.010) was found between operation difficulty levels. Suitable training methods therefore deserve thorough investigation, since they have significant effects on the physiological workload of the operators. In conclusion, this work provides insight into the effect of some influencing factors on workload during the RVD procedure and will benefit future astronaut training. Keywords: Rendezvous and Docking (RVD), Training method, Difficulty level, Physiological parameter, Workload
1 Introduction In long-term spaceflight and deep space exploration, the spaceflight operations of astronauts become more difficult and complicated [1]. A challenge is how to resolve the conflict between the requirements of complex operations and limited resources [2]. This dilemma is especially prominent in the final RVD phase of spaceflight. This paper aims to
explore suitable and efficient training methods for the complicated operations of the final manually controlled RVD phase. Docking technology has been studied since the 1960s. The ETS-VII docking system developed by the National Space Development Agency of Japan (NASDA) for unmanned spacecraft is an excellent autonomous docking system, but it needs a highly accurate onboard control system with many complicated sensors [3]. For the sake of safety and reliability, manual RVD is more suitable in current practice for China. The Apollo spacecraft docking system is a very reliable example for manned spaceflight: the system did not need an onboard control system because the astronauts completed the docking procedure [3]. Other docking systems for manned spaceflight, such as those of the International Space Station, Soyuz, and Mir, adopt a mechanism similar to the Apollo docking system [3]. The object vehicle's state as displayed on the screen in front of the astronauts in typical manual RVD tasks is shown in Fig. 1. Taking the Soyuz spacecraft as an example, with the aid of a television camera shooting device and an optical sighting telescope, the astronauts are able to manipulate the motion of the chasing spacecraft along the X, Y, and Z axes using the translation controller on the left, and to control the attitude of the spacecraft using the orientation controller on the right [4]. Since it is difficult for researchers to realize the microgravity environment on the ground, a manual control RVD simulation system was constructed [5, 6]. This system is based on international engineering experience. It can simulate the scenario and provide the information that the astronauts need to accomplish their tasks.
Fig. 1. Drone of the spacecraft in the manual control mode [6]: (a) USA; (b) Russia
The docking mechanism mates the spacecraft stably by gradually adjusting the relative velocity and attitude between the chasing vehicle and the object vehicle. Ultra-complicated attitude control and mating systems place a high workload on the astronauts. Which factors affect this workload, and how, needs investigation. Workload is a concept influenced by various variables, e.g., work requirements, time pressure, and the capability and engagement of the operator [7]. In turn, workload affects the physiological balance of the autonomic nervous system. In normal situations, the sympathetic and the vagal (parasympathetic) nervous systems dynamically maintain a balance between each other. Under stress, sympathetic
components dominate and result in an unbalanced autonomic nervous system [8]. Electrocardiography (ECG), together with the HRV measures derived from it, is an easy, noninvasive, and instantaneous quantitative measure of autonomic nervous system activity [9, 10]. A Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology announced that HRV is the best quantification index for evaluating the activity of the autonomic nervous system [11]. Research has shown that HRV values increase with increasing workload [12]. HRV has been applied in the assessment of workload in many areas, such as driving, aviation, and automobile assembly. Several HRV parameters have been presented [13], such as mean heart rate (HR), the standard deviation of the RR intervals between normal beats (SDNN), and the root mean square of successive differences (RMSSD) as time-domain features; and the very low frequency (VLF), low frequency (LF), and high frequency (HF) components, the ratio of LF to HF (LF/HF), normalized LF (LFnu), and normalized HF (HFnu) as frequency-domain features [11]. The principles and devices for recording HRV have been well established for decades, but compared to more conventional methods such as self-reported workload questionnaires, little work using HRV in spaceflight studies has been reported. This paper is organized as follows. In Section 2, we introduce the experiment. Section 3 reports the results. Some discussions are given in Section 4. Finally, Section 5 summarizes this study.
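For concreteness, the sketch below computes the time-domain features named above from a list of RR intervals, and the usual normalized frequency-domain ratios from given band powers. The sample values are made up, and real frequency-domain analysis additionally requires resampling and spectral estimation.

```python
import math

def hrv_time_domain(rr_ms):
    """Mean HR, SDNN and RMSSD from RR intervals in milliseconds."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    hr = 60000.0 / mean_rr                      # mean heart rate in beats/min
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"HR": hr, "SDNN": sdnn, "RMSSD": rmssd}

def normalized_bands(lf, hf):
    """LFnu, HFnu and LF/HF as commonly defined relative to (LF + HF)."""
    return {"LFnu": lf / (lf + hf), "HFnu": hf / (lf + hf), "LF/HF": lf / hf}

print(hrv_time_domain([812, 790, 845, 801, 830, 795]))
print(normalized_bands(lf=540.0, hf=410.0))
```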
2 Experiment 2.1 Participants Twenty-two male technicians, aged between 22 and 35 years, from the China Astronaut Research and Training Center were recruited for this experiment. They all had at least a bachelor's degree and work experience related to manned spaceflight. All of them signed consent forms. The preliminary phase was participant screening. Each participant was tested using the DXC-6 Psychological Assessment Instrument (developed by the Fourth Military Medical University, Xi'an, China). The tests included a velocity estimation test, a rotation test, and a both-hand coordination ability test. Finally, the sixteen persons who passed the tests with a threshold score were admitted to the following experiment. Before the experiment began, the participants took part in an RVD theory course in which they were taught the basic knowledge and scientific disciplines of RVD techniques. 2.2 Design A within-participants design for operation difficulty level and a between-participants design for training method were used in this study. The sixteen participants were divided into two sub-groups by training method. Each participant performed an operation series twice, each series consisting of nine operation units classified into three difficulty levels. Before the formal experiment, each participant conducted three rounds of trials to get familiar with the operations.
Two training methods, the routine method and the confusing method, were proposed. They are based on astronaut training experience. Before the experiment, the subjects were trained in three stages of operations, i.e., elementary training, advanced training, and pre-mission training. For the sub-group with the routine training method, units of the same difficulty level were performed in each stage. In other words, units of difficulty level 1, the easiest level, were performed in stage one; units of difficulty level 2, the median difficulty level, were performed in stage two; and units of difficulty level 3 in stage three. In contrast, for the sub-group with the confusing training method, units of different difficulty levels were performed alternately in each stage; that is, each stage contained operations of all three difficulty levels. The independent variable of difficulty level was determined by changing the original distance and attitude of the chasing spacecraft relative to the object vehicle, as well as the velocity along each axis. Subjective evaluation combined with training expertise was adopted to judge the difficulty level of each operation. For example, if the original distance and attitude deviation were set small and the velocity was set low, the operation was deemed to be of low difficulty. Three difficulty levels, the easiest, the moderate, and the most difficult, were determined in the end. 2.3 Apparatus and Measures The final RVD phase is a complicated problem in which the high accuracy of relative position and attitude determination is achieved by manual control. The image of the cross drone on the docking interface of the object vehicle was recorded by a television camera shooting system and displayed on the screen in front of the astronaut [14]. The cross drone images guide the astronauts to the object vehicle while they carefully and gradually adjust the chasing vehicle's velocity and attitude. The ideal mode of the chasing vehicle is the zero state mode, in which there is neither displacement nor attitude deviation of the chasing vehicle relative to the object vehicle. The cross drone may tilt, roll, yaw, or displace along each axis during the RVD, and the astronaut has to manually adjust the controllers to bring the chasing vehicle back into the zero state mode. The electrocardiography of the participants was recorded by a physiological recording device at multiple time points. The devices were worn next to the skin while the participants performed the RVD tasks. The KF2 physiological parameter analysis software, developed by a research laboratory in China, was used to extract the physiological parameters from the electrocardiography data. In order to eliminate bias resulting from the specific physiological characteristics of each participant, the original absolute physiological parameters recorded during the RVD were replaced by new measures obtained by dividing them by the corresponding values recorded while the participants were at rest just before the experiment began. A sketch of this normalization is given below.
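A minimal sketch of the baseline normalization just described, assuming hypothetical parameter names: each task value is divided by the participant's own resting value, so that all measures become unitless ratios relative to the quiet baseline.

```python
def to_baseline_ratios(task_values, rest_values):
    """Return task/rest ratios for every physiological parameter present in both."""
    return {name: task_values[name] / rest_values[name]
            for name in task_values
            if name in rest_values and rest_values[name] != 0}

rest = {"heart_rate": 68.0, "resp_rate": 14.0, "body_temp": 36.5}   # quiet status
task = {"heart_rate": 81.0, "resp_rate": 19.0, "body_temp": 36.9}   # during RVD
print(to_baseline_ratios(task, rest))  # e.g. heart_rate ratio of about 1.19
```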
3 Results SPSS 16.0 was used for the data analysis in this study. All physiological parameters, i.e., the body temperature, the respiratory rate, the heart rate, the natural logarithm of VLF, the square root of LF, the cubic root of HF, LFNu, the natural logarithm of HFNu, and the natural logarithm of LF/HF, passed the
normality test by the Kolmogorov-Smirnov method, but the respiratory rate and the natural logarithm of LF/HF failed the homogeneity-of-variances test by Levene's method. Non-parametric tests were therefore conducted on the respiratory rate and the natural logarithm of LF/HF, while the other physiological parameters were analyzed using two-way ANOVA. The results of the analysis are shown in Tables 1-4 and Figs. 2-3. No significant effect of the interaction was found (p values of 0.974, 0.842, 0.871, 0.802, 0.462, 0.824, and 0.856 for the body temperature, the heart rate, the natural logarithm of VLF, the square root of LF, the cubic root of HF, LFNu, and the natural logarithm of HFNu, respectively).

Table 1. ANOVA of physiological parameters between training methods

dependent variable     F value    Sig.
body temperature       8.00       0.010
respiratory rate^a     -          0.000
heart rate             15.827     0.000
ln^b (VLF)             0.419      0.518
square root LF         2.957      0.087
cubic root HF          4.188      0.042
LFNu                   6.913      0.009
ln^b (HFNu)            2.141      0.145
ln^b (LF/HF)^a         -          0.100

a. Non-parametric Mann-Whitney test. b. ln means the natural logarithmic value.
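The test-selection logic described before Table 1 can be sketched with SciPy as follows. The paper used SPSS, and the reduction to a single two-sample comparison and all sample data here are illustrative assumptions: distributions are first checked for normality and homogeneity of variances, and a parametric or non-parametric test is chosen accordingly.

```python
import numpy as np
from scipy import stats

def compare_groups(routine, confusing, alpha=0.05):
    """Choose a parametric or non-parametric comparison of two groups."""
    normal = all(
        stats.kstest(g, "norm", args=(np.mean(g), np.std(g, ddof=1))).pvalue > alpha
        for g in (routine, confusing))
    equal_var = stats.levene(routine, confusing).pvalue > alpha
    if normal and equal_var:
        return "ANOVA", stats.f_oneway(routine, confusing).pvalue
    return "Mann-Whitney", stats.mannwhitneyu(routine, confusing).pvalue

rng = np.random.default_rng(0)
routine = rng.normal(1.00, 0.20, 16)    # made-up HF ratios, routine group
confusing = rng.normal(0.85, 0.20, 16)  # made-up HF ratios, confusing group
print(compare_groups(routine, confusing))
```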
ANOVA showed statistically significant effects of training method on the cubic root of HF and on LFNu (p values of 0.042 and 0.009, respectively). Post hoc pairwise comparisons of the cubic root of HF showed that the means for the routine training method were significantly higher than those for the confusing training method, while the comparisons of LFNu showed the opposite trend, as shown in Fig. 2 and Table 2. This can be explained as follows. With the routine training method, the participants were trained on operations of the same difficulty level in each stage, and each stage consisted of nine operation units. The monotony within each stage made the participants feel the task was humdrum and sterile; consequently, this procedure distracted their attention from the operations and made them relaxed. With the confusing training method, however, the participants were trained on all difficulty levels alternately in each stage. This frequent change of operation difficulty kept their brain cortex in tension and made it hard for the participants to relax; they had no choice but to focus on the operations. Consequently, the routine training method results in a lower mental workload on the operators than the confusing training method. These results conform to the view of Kim et al. [8], whose research team reported that workload decreases the high frequency components of HRV while it increases the low frequency components.
Fig. 2. Cubic root of High Frequency (HF) (a) and normalized Low Frequency (LFNu) (b) for the different training methods and difficulty levels during RVD. Blue denotes the routine training method, green the confusing training method. Both the cubic root of HF and LFNu are ratios relative to the corresponding values recorded while the participants were at rest just before the experiment began.

Table 2. Statistics of High Frequency (HF) and normalized Low Frequency (LFNu)

physiological parameter    difficulty level    routine method      confusing method
                                               M.      Std.        M.      Std.
cubic root of HF ratio     1                   1.02    0.45        0.84    0.28
                           2                   0.91    0.32        0.84    0.28
                           3                   0.92    0.32        0.87    0.30
LFNu ratio                 1                   0.92    0.17        0.99    0.18
                           2                   0.94    0.16        1.02    0.21
                           3                   0.97    0.14        1.01    0.20
Results also demonstrate a significant training method effect on the body temperature (p = 0.010). Furthermore, a significant difference in body temperature (p = 0.010) was found between the operation difficulty levels as well. For the natural logarithm of VLF, in contrast, the differences between either the training methods or the difficulty levels were not significant. This seems incompatible with the medical common knowledge that VLF, the component below 0.04 Hz, is primarily associated with the regulation of body temperature [15]. But if we plot the comparisons of body temperature and VLF between training methods and difficulty levels, as shown in Fig. 3, it can be seen that the body temperature increases gradually with increasing difficulty level, and the same trend appears for VLF. Perhaps the variation of workload is not large enough to
influence VLF significantly, but VLF and the body temperature do share the same trend. Statistics of body temperature and VLF are shown in Table 3.
Fig. 3. Body temperature (a) and very Low Frequency (VLF) (b) for the different training methods and difficulty levels during RVD

Table 3. Statistics of body temperature and very Low Frequency (VLF)

physiological parameter                   difficulty level    routine method      confusing method
                                                              M.      Std.        M.      Std.
body temperature ratio                    1                   1.10    0.06        1.12    0.07
                                          2                   1.12    0.06        1.15    0.06
                                          3                   1.13    0.06        1.16    0.06
natural logarithmic value of VLF ratio    1                   -0.28   0.91        -0.48   1.26
                                          2                   -0.34   1.02        -0.43   1.18
                                          3                   -0.21   0.82        -0.22   1.15
The analyses of the respiratory rate and the cubic root of HF both showed statistically significant effects of training method (p = 0.000 and 0.042, respectively). Post hoc pairwise comparisons show that the respiratory rate and HF share a similar trend between training methods: the average values of both are higher for the routine training method than for the confusing training method. This result is consistent with the following explanation: HF, the relatively high frequency component between 0.15 and 0.40 Hz, corresponds to the frequency of respiration [15]. In the analysis related to task difficulty levels, no clear correlation between the difficulty of the task and HRV was found, as shown in Table 4.
Table 4. ANOVA of physiological parameters between difficulty levels

dependent variable     F value    Sig.
body temperature       4.84       0.010
respiratory rate^a     -          0.525
heart rate             0.265      0.768
ln^b (VLF)             0.523      0.594
square root LF         0.200      0.819
cubic root HF          0.442      0.643
LFNu                   0.691      0.502
ln^b (HFNu)            0.227      0.797
ln^b (LF/HF)^a         -          0.623

a. Non-parametric Kruskal-Wallis test. b. ln means the natural logarithmic value.
4 Discussions A more thorough understanding of physiological characteristics under real working situations needs to be developed, both theoretically and in practice. In fact, arguments about the relationship between physiological parameters and workload have never ceased, and no optimal parameter has been identified so far. It has been reported that workload decreases the high frequency components of HRV while it increases the low frequency components [8]. Ryu and Myung, however, argued that an increase in mental workload is typically related to a reduction of the power in the mid-frequency band of the HRV, implying a temporary suppression of normal arterial pressure regulation [16]. Others argued that HRV features representing high frequency components can be estimated more accurately [17], while HRV features representing low frequency components have poor repeatability and cannot be estimated accurately from ECG segments shorter than 10 min [18]. Researchers have also pointed out that a single physiological measure may not provide adequate information to assess workload [16]. In this experiment, no difference between operation difficulty levels was found for the majority of physiological parameters. This can be explained as follows. Before the formal experiment, each participant conducted three rounds of trials, after which the participants had become used to the operational patterns. This training effect decreases the influence of operation difficulty on the physiological workload. Another reason may be that the difficulty level was determined empirically, by merely changing the distance, velocity, and attitude of the chasing vehicle relative to the object vehicle. It was assumed that varying the controlled parameters would necessarily affect the difficulty level, but this hypothesis should have been validated in advance. Furthermore, operation complexity theory could be consulted to develop spaceflight operation difficulty measures that quantify the difficulty level of RVD operations [19]. Interestingly, Jorna concluded that spectral measures are not sensitive to increased difficulty levels within the same type of task [20].
Some researchers argue that subjective workload evaluation approaches, such as the NASA Task Load Index (NASA-TLX) questionnaire, should also be considered in the assessment of the participants' workload. Indeed, typical subjective workload measures such as the Cooper-Harper scale have been widely used in workload assessment. A multivariate workload evaluation index, which integrates physiological parameters and one subjective parameter through Principal Component Analysis, was proposed by Miyake [21]. However, no matter how honestly the subjects try to recall their feelings during the task, subjective data are not reliable or comprehensive enough [22], and this forgetting effect gets worse if the task lasts a long time. It is also very difficult, or even impossible, to prevent some subjective bias. Developing a real-time subjective workload assessment method is a meaningful direction (as done in the latest Observer product of Noldus Co. Ltd, www.noldus.com), but this is not easy, since the operators have no time to fill in such a questionnaire during the task, especially in emergency environments such as firefighting, aviation, and spaceflight.
5 Conclusions This study is intended to serve as a starting point for developing effective and efficient training methods for the future training of Chinese astronauts. With 16 technicians recruited, the factors influencing operator workload in manually controlled RVD were addressed using physiological parameters. Results show significant effects of the training methods on physiological parameters, whereas no significant effect of task difficulty level was found. More subjects and experiments are required to further investigate the effectiveness of physiological parameters for evaluating operator mental workload in RVD. Moreover, the relationship between physiological parameters and operation performance is worth further investigation in the future. Besides RVD, the proposed method could be applied to other fields such as aviation, driving, and nuclear power plant control. Acknowledgments. This research was supported by China Manned Space Project Fund 2009SY5411005 and the National Natural Science Foundation of China under grant 71001092. The authors express their gratitude to the subjects for their participation in the experiment.
References
1. Zhu, R.Z.: Rendezvous and docking techniques of spacecraft. National Defense Industry Press, Beijing (2007)
2. Zhang, Y.J., Li, Z.Z., Wu, B., Wu, S.: A spaceflight operation complexity measure and its experimental validation. International Journal of Industrial Ergonomics 39, 756–765 (2009)
3. Ui, K., Matunaga, S., Satori, S., Ishikawa, T.: Microgravity Experiments of Nano-Satellite Docking Mechanism for Final Rendezvous Approach and Docking Phase, pp. 56–63 (2005)
4. Hall, R.D., Shayler, D.J.: Soyuz: A universal spacecraft. Springer, Berlin (2006)
5. Jiang, Z.C., Zhou, J.P., Wang, Y.F.: Manual control rendezvous and docking simulation based on cross drone. National Univ. of Defense Technology 29(7), 100–103 (2007) (in Chinese)
6. Yang, J., Jiang, G.H., Chao, J.G.: A Cross Drone Image-Based Manual Control Rendezvous and Docking Method. Journal of Astronautics 31(5), 1398–1404 (2010) (in Chinese)
7. Moray, N.: Mental Workload: Its Theory and Measurement, pp. 5–6. Plenum Press, New York (1979)
8. Kim, D., Seo, Y., Kim, S.H., Jung, S.: Short Term Analysis of Long Term Patterns of Heart Rate Variability in Subjects under Mental Stress. In: International Conference on BioMedical Engineering and Informatics, pp. 487–491 (2008)
9. Zhong, X., Hilton, H.J., Gates, G.J., Jelic, S., Stern, Y., Bartels, M.N., DeMeersman, R.E., Basner, R.C.: Increased sympathetic and decreased parasympathetic cardiovascular modulation in normal humans with acute sleep deprivation. J. Appl. Physiol. 98(6), 2024–2032 (2005)
10. Henelius, A., Hirvonen, K., Holm, A., Korpela, J., Muller, K.: Mental Workload Classification using Heart Rate Metrics. In: Proceedings of the 31st Annual International Conference of the IEEE EMBS, Minneapolis, Minnesota, USA, September 2-6 (2009)
11. Malik, M., Bigger, J.T., Camm, A.J., et al.: Heart rate variability: standards of measurement, physiological interpretation and clinical use. European Heart Journal 17(5), 354–381 (1996)
12. Di Rienzo, M., Mancia, G., Parati, G., et al.: Blood Pressure and Heart Rate Variability - Computer Analysis, Modeling and Clinical Applications, pp. 123–127. IOS Press, Amsterdam (1993)
13. Allen, J.J.B., Chambers, A.S., Towers, D.N.: The many metrics of cardiac chronotropy: a pragmatic primer and a brief comparison of metrics. Biol. Psychol. 74(2), 243–262 (2007)
14. Jiang, Z.C., Zhou, J.P., Wang, Y.F., Li, J.R.: Manual Control Rendezvous and Docking Simulation Based on Cross Drone. Journal of National University of Defense Technology 29(5), 100–103 (2007)
15. Mulder, G.: Sinus arrhythmia and mental workload. In: Moray, N. (ed.) Mental Workload: Its Theory and Measurement, pp. 327–343. Plenum, New York (1979)
16. Ryu, K., Myung, R.: Evaluation of mental workload with a combined measure based on physiological indices during a dual task of tracking and mental arithmetic. International Journal of Industrial Ergonomics 35, 991–1009 (2005)
17. Sandercock, G.R., Bromley, P.D., Brodie, D.A.: The reliability of short-term measurements of heart rate variability. International Journal of Cardiology 103(3), 238–247 (2005)
18. McNames, J., Aboy, M.: Reliability and accuracy of heart rate variability metrics versus ECG segment duration. Med. Biol. Eng. Comput. 44(9), 747–756 (2006)
19. Zhang, Y.J., Xu, Y.Z., Li, Z.Z., Li, J., Wu, S.: Influence of Monitoring Method and Control Complexity on Operator Performance in Manually Controlled Spacecraft Rendezvous and Docking. Tsinghua Science and Technology 13(5), 619–624 (2008)
20. Jorna, P.G.A.M.: Spectral analysis of heart rate and psychological state: a review of its validity as a workload index. Biological Psychology 34, 237–257 (1992)
21. Miyake, S.: Multivariate workload evaluation combining physiological and subjective measures. International Journal of Psychophysiology 40, 233–238 (2001)
22. Parker, R., Moore, D., Baillie, B.: Measurement of Rural Fire Fighter Physiological Workload and Fire Suppression Productivity. New Zealand Fire Service Commission Research Report Number 92 (2008)
Task Complexity Related Training Effects on Operation Error of Spaceflight Emergency Task

Yijing Zhang 1,2, Bin Wu 1, Xiang Zhang 1, Wang Quanpeng 1, and Min Liu 1

1 China Astronaut Research and Training Center, State Key Laboratory of Space Medicine Fundamental and Application, Beijing, 100094, P.R. China
2 University of Illinois, Beckman Institute, Urbana-Champaign, 61801, Illinois, U.S.
[email protected]
Abstract. This paper aims to investigate the training effects on the operation errors of Chinese operators when they conduct spaceflight emergency tasks of different complexity. Twenty-eight operators participated in a training experiment and performed nine tasks, which were divided into three complexity levels based on complexity measures. Six operation errors were recognized based on an engineering classification method. The results showed that training can significantly reduce the errors in practice trials. Different effects were also found on the operation errors of tasks with varying complexity. Training did not exert a significant influence on some error types, such as the time estimation error, which might rely more on inherent cognitive ability. In test trials, training differentially reduced errors on the high-complexity tasks, but showed no effect on the other tasks. The results imply that future training programs should be designed based on the complexity of tasks and the cognitive characteristics of astronauts. Keywords: Training effect; Operation error; Task complexity; Emergency tasks; Error classification.
1 Introduction The industrial psychology and training literature has shown that training can reduce performance errors in many domains, such as performance ratings [1], medication administration [2-3], and the everyday problems test (EPT) [4]. In aviation, Reason acknowledged that error is inevitable, which made crew resource management (CRM) evolve into "error management". The concept of error management is aimed at error avoidance and trapping; realizing the impossibility of completely eliminating errors, training also targets the mitigation of the consequences of error [5-7]. This error management training has also been transferred to many other domains [8-9]; however, training effects on errors have seldom been reported in spaceflight [10]. Moreover, that training research and practice mainly treated operation errors as a single dependent variable, rather than relating the errors to task complexity or to error type and cause.
The goal of this study is to examine the effects of training on the operation errors of spaceflight emergency tasks with different complexity, with a particular emphasis on elucidating how the training influences the occurrence of different errors. The remainder of this paper is organized as follows. Sections 1.1 and 1.2 describe the measures for operation complexity in spaceflight and the error classification method used in this study. The research method is described in Section 2. The experiment results and data analysis are presented in Section 3. The last two sections discuss and conclude this study. 1.1 Complexity Measures To investigate complexity-related training effects, the complexity of the spaceflight operations has to be determined first. Therefore, two spaceflight operation complexity measures were used to evaluate operation complexity in this study: one is an operation complexity measure based on entropy [11]; the other is the Weighted Halstead Measure [12]. The two complexity measures have been validated by correlation/regression analysis between complexity values and performance values on astronaut training data and experiment data [11, 12]. Operation Complexity Measure Based on Entropy. This operation complexity measure was proposed as a weighted Euclidean norm over four factors: the complexity of the operation step size, the complexity of the operation logic structure, the complexity of the operation instrument information, and the complexity of the space mission information. The development of the measure followed four steps. First, the four factors to be reflected in the operation complexity measure for spaceflight were identified. Second, the entropy theory for graphs was adopted to measure the four factors [13]. Then, the weights of the four factors were determined based on a questionnaire survey of Chinese astronauts. Finally, the operation complexity values of spaceflight operations were determined by the weighted Euclidean norm of the four factors [11]. Weighted Halstead Measure. The Halstead method indicates that a software program has a more complicated procedure structure if it contains more operators and operands [14]. For a spaceflight operation, an operator was defined as an action of the astronauts, and an operand as the instrument operated by the astronauts. To reflect the complexity of spaceflight operations, the numbers of weighted operators and weighted operands were used in the Halstead volume formula. The operators were weighted based on the McCracken-Aldrich theory [15], and the operands were weighted according to the complexity level of the operation interface. This measure can evaluate operation complexity from two aspects: the operation activity and the cognitive load [12]. A sketch of both constructions is given below.
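The sketch below shows the shape of the two measures just described. The factor values, weights, and counts are hypothetical placeholders, since the paper derives them from graph entropy, an astronaut questionnaire, and the McCracken-Aldrich weighting.

```python
import math

def entropy_based_complexity(factors, weights):
    """Weighted Euclidean norm over the four complexity factors."""
    return math.sqrt(sum(weights[k] * factors[k] ** 2 for k in factors))

def weighted_halstead_volume(w_operators, w_operands, n_operators, n_operands):
    """Halstead volume V = N * log2(n), using weighted operator/operand counts."""
    N = w_operators + w_operands        # weighted total occurrences
    n = n_operators + n_operands        # distinct operators + operands
    return N * math.log2(n)

factors = {"step_size": 1.2, "logic": 0.8, "instrument": 1.5, "mission": 0.9}
weights = {"step_size": 0.30, "logic": 0.25, "instrument": 0.25, "mission": 0.20}
print(round(entropy_based_complexity(factors, weights), 3))
print(round(weighted_halstead_volume(34.0, 27.5, 9, 12), 1))
```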
1.2 Error Classification Each organization or industry has its own understanding of the definition of human error [16]. In complex systems, such as aerospace and nuclear plants, human error is usually defined as an action or inaction that leads to a deviation from crew intentions or situational requirements such as policies, regulations, and standard operating procedures [17]. Correspondingly, there are many different human error classifications, depending on the objectives of the researcher. Psychologists may be interested in understanding the psychological causes of a human error by comparison with theoretical models; engineers will want to classify human errors in a way that enables practical error reduction steps to be taken quickly [18]. In this study, an error classification based on the characteristics of the spacecraft interface and spaceflight tasks was used to explore how training affects operation errors. After analyzing historical operation errors and a subjective survey of astronauts and astronaut trainers, six error types were defined: sequence disorder error, instrument regulation error, operation error, observation error, report error, and time estimation error. Sequence disorder usually happens when operators do not acquire an integrated understanding of the task. An emergency task contains a series of operations which must be performed in a fixed sequence to clear the alarm; if operators mix up the sequence when performing an emergency task, a serious error happens which can even lead to mission failure. Instrument regulation errors are actions that violate the instrument regulations. For example, sending an instruction from instrument panel LB of the spacecraft involves three sequential steps: first select the row key, then select the column key, and finally push the "send" key. If operators do not obey these regulations, the instruction cannot be sent successfully. Operation errors are operations performed using a wrong instrument. Observation errors are slips in which useful or important information on the screen is ignored; observation errors also happen when operators cannot find the specified information. Report errors occur when operators do not report back to the ground in time as required. Time estimation errors occur when operators cannot resend an instruction within the specified time period or fail to give a correct time estimation. An illustrative encoding of this taxonomy is sketched below.
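Purely as an illustration (this is not the authors' tooling), the six error types can be encoded as an enumeration so that logged operator actions can be tallied per type and per trial:

```python
from collections import Counter
from enum import Enum, auto

class ErrorType(Enum):
    SEQUENCE_DISORDER = auto()
    INSTRUMENT_REGULATION = auto()
    OPERATION = auto()
    OBSERVATION = auto()
    REPORT = auto()
    TIME_ESTIMATION = auto()

# hypothetical error log from one practice trial
trial_log = [ErrorType.OPERATION, ErrorType.REPORT, ErrorType.OPERATION]
counts = Counter(trial_log)
total_error = sum(counts.values())      # the "total error" analyzed in Sect. 3
print(counts, total_error)
```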
2 Method A within-participants experiment was designed to explore the training effects on operation errors. Twenty-eight participants performed nine spaceflight operation tasks, which were divided into three complexity levels based on the complexity measures. The independent variables were training stage and operation complexity; the operation errors, measured and classified as described above, were the dependent variables.
2.1 Participants Twenty-eight male university students were recruited as participants for this experiment. They all had college education or above in aeronautics and astronautics. Their ages ranged from 21 to 26 years (Mean = 23.1, SD = 1.57). They all signed consent forms before they participated in the experiment. 2.2 Experiment Task The experiment tasks comprised nine spaceflight operation tasks, each of which is an activity group for handling one malfunction of a spacecraft system. These tasks were evaluated to obtain complexity values ranging from 0.9 to 1.7 (based on the entropy measure) and from 23.4 to 293 (based on the Weighted Halstead measure). According to the complexity values, the nine tasks were divided into three complexity levels, each of which included three operation tasks. The nine tasks appeared in random sequence for each participant of a group; a participant finished a trial when he completed all nine tasks. The experiment tasks were performed in two forms: operating the tasks with the help of a manual, called a practice trial, and operating the tasks without a manual, called a test trial. Practice trials simulate the operations under normal spaceflight situations; test trials represent the operations of extreme situations, when astronauts have no time to check the manual. 2.3 Facilities The experiment platform was a simulated spacecraft instrument system, which included the spacecraft instrument panel and the software modules (training support system). The training support system could record important operation actions and the corresponding times. Another software program was developed for this experiment to record the time required to perform every operation unit. The whole experiment procedure was recorded by a video-recording system. 2.4 Dependent Variables The operation errors are the dependent variables; they comprise the six error types of the engineering classification and the total error, i.e., the total number of errors committed while performing the operations of one complexity level. 2.5 Experiment Design The experiment was conducted in the China Astronaut Research and Training Center and had four sub-experiments. The first three sub-experiments were conducted at one-week intervals and represent initial training, repeat training, and determined training. The fourth one represents regain training and was conducted 140 days after the third
440
Y. Zhang et al.
sub-experiment was finished. The practice trials were carried out at the beginning of each of the four sub-experiments. The test trials were carried out only at the end of the third and the fourth sub-experiments.
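As a rough, non-authoritative illustration of the complexity evaluation mentioned in Sect. 2.2, the sketch below computes an entropy-based score and a classic Halstead volume for a task described as a sequence of operations. The mapping from task steps to operators/operands and the omission of weights are simplifying assumptions; the study's actual entropy and weighted Halstead measures are defined in [11, 13, 14].

```python
import math
from collections import Counter

def entropy_complexity(operations):
    """Shannon entropy of the distribution of operation types in a task;
    a simplified stand-in for an entropy-based complexity measure [13]."""
    counts = Counter(operations)
    n = len(operations)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def halstead_volume(operators, operands):
    """Classic Halstead volume V = N * log2(eta) [14], where N is the
    total number of operator/operand occurrences and eta the number of
    distinct ones; a weighted variant would scale each occurrence by a
    difficulty weight before summing."""
    N = len(operators) + len(operands)
    eta = len(set(operators)) + len(set(operands))
    return N * math.log2(eta)

# A toy emergency task: key presses (operators) acting on panels (operands).
ops = ["select_row", "select_column", "send", "observe", "report"]
panels = ["panel_LB", "panel_LB", "panel_LB", "screen", "ground"]
print(entropy_complexity(ops), halstead_volume(ops, panels))
```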
3 Results

SPSS 15.0 was used for the data analysis in this study. The effects of training on the six error types and the total error were analyzed. One-way ANOVA and nonparametric tests (Kruskal-Wallis tests) were applied at each of the three complexity levels. Both the practice-trial data and the test-trial data were analyzed in this section.

3.1 High Complexity

For high-complexity operations, the Kruskal-Wallis results for all the error variables examined in the practice trials of the four training stages are presented in Table 1. None of the error variables passed the normality and homogeneity-of-variance tests, so all were explored with Kruskal-Wallis tests. For the six error types of the engineering classification, training showed a significant effect on instrument regulation error, operation error and observation error (ps < 0.05), but no significant effect on the other three error types. The results also showed that training had a significant effect on the total error.

In the test trials, none of the error variables passed the normality and homogeneity-of-variance tests, so all were explored with Kruskal-Wallis tests. For the six error types, training showed a significant (or marginally significant) effect on disorder error (χ² = 6.59, p = 0.010), instrument regulation error (χ² = 3.80, p = 0.051), operation error (χ² = 3.85, p = 0.050) and report error (χ² = 4.23, p = 0.040), but no significant effect on observation error, time error or the total error.

3.2 Medium Complexity

For medium-complexity operations, the Kruskal-Wallis results for all the error variables examined in the practice trials of the four training stages are presented in Table 2; the error variables collected in the test trials of the last two training stages were also analyzed. None of the error data passed the normality and homogeneity-of-variance tests, so all were explored with Kruskal-Wallis tests. In the practice trials, training showed a significant effect on instrument regulation error and operation error (ps < 0.05), but no significant effect on the other four error types. In the test trials, training did not show a significant effect on any of the error types (ps > 0.05).
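As a hedged illustration of this analysis pipeline, the snippet below runs a Kruskal-Wallis test over the four training stages using SciPy in place of SPSS; the error counts are placeholder values, not the study's data.

```python
from scipy.stats import kruskal

# Hypothetical per-participant error counts for one error type at one
# complexity level, grouped by the four training stages.
stage1 = [3, 5, 2, 4, 6]   # placeholder data, not the study's values
stage2 = [1, 2, 1, 3, 2]
stage3 = [0, 1, 0, 1, 1]
stage4 = [1, 2, 1, 2, 1]

# Kruskal-Wallis is used because the error counts failed the
# normality and homogeneity-of-variance checks required by ANOVA.
h, p = kruskal(stage1, stage2, stage3, stage4)
print(f"chi-square = {h:.2f}, p = {p:.3f}")
```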
Table 1. Kruskal-Wallis results (χ² and p) for the six error types and the total error of high-complexity tasks in practice trials

Error type | χ² | p
Disorder error | 1.02 | 0.797
Instrument regulation error | 25.1 | <.001
Operation error | 17.3 | 0.001
Observation error | 9.29 | 0.026
Report error | 7.69 | 0.053
Time error | 6.34 | 0.096
Total error | 28.8 | <.001
Table 2. Kruskal-Wallis results (χ² and p) for the six error types and the total error of medium-complexity tasks in practice trials

Error type | χ² | p
Sequence disorder | 6.17 | 0.104
Instrument regulation error | 44.7 | <.001
Operation error | 29.5 | <.001
Observation error | 3.12 | 0.374
Report error | 7.52 | 0.057
Time error | 4.61 | 0.203
Total error | 23.1 | <.001
3.3 Low Complexity

For low-complexity operations, the Kruskal-Wallis results for all the error variables examined in the practice trials of the four training stages are presented in Table 3; the error variables collected in the test trials of the last two training stages were also analyzed. None of the error data passed the normality and homogeneity-of-variance tests, so all were explored with Kruskal-Wallis tests. In the practice trials, training showed a significant effect on most of the error types (ps < 0.05), but no significant effect on time errors (disorder errors did not occur at all at this complexity level). In the test trials, training did not show a significant effect on any of the error types (ps > 0.05).

Table 3. Kruskal-Wallis results (χ² and p) for the six error types and the total error of low-complexity tasks in practice trials

Error type | χ² | p
Disorder error | 0.00 | 1.00
Instrument regulation error | 30.9 | <.001
Operation error | 19.1 | <.001
Observation error | 8.92 | 0.03
Report error | 13.2 | 0.004
Time error | 3.00 | 0.392
Total error | 23.4 | <.001
4 Discussion

The positive effect of training on performance has been demonstrated by much previous research [1-4]. In this study, training decreased most of the error types in practice trials at each of the three complexity levels. However, some error types, such as sequence disorder, time errors and report errors, were not significantly affected by training. This seems to contradict the previous findings [1-4], but it becomes clear when these error types are examined from the cognitive perspective and in the context of the operation environment. Sequence disorder may involve high-order decision-making processes and tends to happen under complex or urgent situations. In practice trials, operators were asked to perform the tasks with the help of a procedural manual, so operators seldom made such errors. That is why sequence disorder errors did not
show any change as the training stages progressed. The time error and report error can be understood from the cognitive basis of such error types. Time error depends on how accurately a person can estimate the time interval between the current and a prospective operation. It mainly depends on one's inherent cognitive ability and is hard to change through short-term training. Report error may reflect a problem of attentional distribution: when focusing on the emergency operation, an operator can easily forget to report back because of his limited attentional resources. Such errors cannot be readily eliminated by short-term training either. However, there was a decrease in report errors in low-complexity tasks, suggesting that such errors are related not only to training but also to poor task design that did not take the operators' ability limits into consideration. If the task demand is beyond human cognitive capacities, training goals aimed at increasing performance will be compromised.

In test trials, training did not show a significant effect on any of the error types for medium-complexity and low-complexity tasks. However, operators showed better performance on sequence disorder, instrument regulation errors, operation errors and report errors in high-complexity tasks. One clear implication is that high-complexity tasks need more training sessions to reach the performance plateau. Instrument regulation errors and operation errors usually happened when operators were not familiar with the instrument interface and instrument regulations; training evidently has a positive effect on these two kinds of errors. For the report errors, the significant effect of training indicates that, without a manual's distraction, operators can free some of their limited attentional resources and shift them towards reporting important and urgent messages. The high arousal level and learning motivation elicited by the high-complexity tasks may be another explanation for the positive effect of training on high-complexity tasks.
5 Conclusions

The effects of training on the operation errors of spaceflight emergency tasks with different complexities were investigated in this paper. Different effects were found for different operation errors and task complexities. In practice trials, training did not produce a significant influence on some error types, such as time estimation error and report error, which rely more on inherent cognitive ability. In test trials, larger training effects were observed on the errors of the high-complexity tasks. These results imply that future training programs should be designed based on the complexity of the tasks as well as the cognitive characteristics of the astronauts. As described by Morphew [19], human errors can result from factors such as poor cockpit design, psychological factors, and the nature of the training. Other research also implies that diagnosing the underlying cause of an observed performance limitation is an effective approach [20]. Future research should look more deeply into the psychological and engineering causes of errors, and should also address which types of training (e.g., massed vs. spaced) can decrease which types of errors in performing spaceflight emergency tasks.
Acknowledgments. This research was supported in part by the Advanced Space Medico-Engineering Research Project of China (2009SY5411005) and the National Natural Science Foundation of China under grant 70771057. The authors thank the Lab of Science and Technology on Human Factors Engineering for its support.
References
1. Borman, W.C.: Format and Training Effects on Rating Accuracy and Rater Errors. Journal of Applied Psychology 64(4), 410–421 (1979)
2. Daniel, G.F., Seybert, A.L., Smithburger, P.L., et al.: Impact of simulation-based learning on medication error rates in critically ill patients. Intensive Care Medicine 36(9), 1526–1531 (2010)
3. Brannick, M.T., Fabri, P.J., Zayas-Castro, J., Bryant, R.H.: Evaluation of an Error-Reduction Training Program for Surgical Residents. Academic Medicine 84(12), 1809–1814 (2009)
4. Boron, J.B., Willis, S.L., Schaie, K.W.: Cognitive Training Effects on Everyday Problem Solving in the Seattle Longitudinal Study. In: Biannual Cognitive Aging Conference, Atlanta, GA (April 2004)
5. Merritt, A., Klinect, J.: Defensive Flying for Pilots: An Introduction to Threat and Error Management (2006)
6. Thomas, M.J.W.: Instructional use of error: the challenges facing effective error management in aviation training. In: Proceedings of the Sixth International Aviation Psychology Symposium. Australian Aviation Psychology Association, Sydney (2003)
7. Reason, J.: Human Error. Cambridge University Press, Cambridge (1990)
8. Keith, N., Frese, M.: Effectiveness of Error Management Training: A Meta-Analysis. Journal of Applied Psychology 93(1), 59–69 (2008)
9. LaPoint, J.L.: The effects of aviation-based error management training on perioperative safety attitudes. University of Phoenix (2008)
10. Zhang, Y.J., Zhang, X., Liu, M., Wang, Q.P., Wu, B.: Research on Astronaut Training Methods for Complicated Operation. In: 3rd AHFE International Conference, Miami, Florida, USA, July 17-20 (2010)
11. Zhang, Y.J., Li, Z.Z., Wu, B., Wu, S.: A Spaceflight Operation Complexity Measure and its Experimental Validation. International Journal of Industrial Ergonomics 39(5), 756–765 (2009)
12. Zhang, Y.J., Shi, L.J., Li, Z.Z., Wu, B., Xu, S.: Validation and Application of Spaceflight Operation Complexity Measure in Chinese Astronauts Training of Extra Vehicular Activity Mission. In: The 60th International Astronautical Congress (IAC), Daejeon, South Korea, October 12-16 (2009)
13. Mowshowitz, A.: Entropy and the complexity of graphs. Ph.D. Dissertation, University of Michigan, pp. 101–113 (1967)
14. Halstead, M.H.: Elements of Software Science. Elsevier Science Inc., New York (1977)
15. McCracken, J.H., Aldrich, T.B.: Analyses of selected LHX mission functions: implications for operator workload and system automation goals. Human Factors Research in Aircrew Performance and Training, MDA903-81-C-0504, ASI479-024-84 (1984)
16. Woods, D.D., Dekker, S., Cook, R., Johannesen, L., Sarter, N.: Behind Human Error, p. 235 (2009)
17. Klinect, J.R., Wilhelm, J.A., Helmreich, R.L.: Threat and Error Management: Data from Line Operations Safety Audits. In: Proceedings of the Tenth International Symposium on Aviation Psychology, pp. 683–688. The Ohio State University, Columbus (1999); University of Texas (UTHFRP Pub 241)
18. Human Error Classification and Data Collection. Report of a technical committee meeting organized by the International Atomic Energy Agency, Vienna, February 20-24 (1989)
19. Morphew, M.E.: Psychological and Human Factors in Long Duration Spaceflight. McGill Journal of Medicine 6(1), 74–80 (2001)
20. Kirlik, A. (ed.): Adaptive Perspectives on Human-Technology Interaction: Methods and Models for Cognitive Engineering and Human-Computer Interaction, pp. 29–41 (2006)
The Research of Crew Workload Evaluation Based on Digital Human Model Yiyuan Zheng and Shan Fu School of Aeronautics and Astronautics, Shanghai Jiao Tong University 800 Dongchuan Rd., Shanghai 200240, P.R. China [email protected]
Abstract. This paper presents the requirements for a digital human model used in the research of human factors in flight deck design, describes methods to assess workload, and establishes a model to evaluate crew workload on a given task. Keywords: Digital Human Model, Human Factors, Fuzzy Model Identification.
1 Introduction

Aircraft accidents have always been a major safety concern for the aviation industry. According to NASA statistics, over 70% of these accidents can be attributed to human performance. The study of human factors is therefore important to aircraft safety. Errors made in the preliminary design phase cost much more time and effort to correct in the detailed design phase, so human factors should be considered seriously from the beginning of the design process, especially in cockpit design, since over 90% of flight operations are performed there. Currently, the majority of human factors research in aviation is carried out from the perspective of aviation psychology. Although such measures are fairly reliable, they are derived from accidents and catastrophes after the fact. It is therefore reasonable to study human factors from the perspective of engineering design, with digital humans in the role of pilots.
2 Construction of the Digital Human

Currently, digital human models are widely used in fields such as training, risk assessment and UI design. Using them in engineering design and human factors engineering brings many benefits, such as shorter design time, lower development cost and increased quality. They narrow the gap between engineering and ergonomics, and realize the human-centered concept in the design process.

2.1 Requirements of the Digital Human

A digital human model for the aviation industry, especially for researching human factors in the flight deck, is an emerging technique. Based on the specifications and activities of pilots, the development of such a digital human model should consider the following features related to the physical and behavioral attributes or requirements of a real pilot:
• Coverage: The range of pilot selection covers 5%–95% of the normal population [1].
• Visibility: The pilot's angle of view to the outside, sitting in the normal position, is at least 15°.
• Comfort: Whether the seat is comfortable enough directly influences the pilot's behavior.
• Accessibility: The digital human model must be able to reach and operate every switch easily.
• Ingress and egress: The model must be able to get in and out of the cockpit easily.
• Cooperation: Could the performance of the pilot or the co-pilot affect the other, and thereby influence the performance of the aircraft?
• Decision making: The digital human model should have the basic mental characteristics of a human.
2.2 Construction of the Digital Human

Besides the above requirements, the digital human model should conform to the basic physical parameters of pilots. As a special profession, piloting has its own specifications; the following table gives the main dimensions of male pilots in China:

Table 1. Main dimensions of male pilots in China

Monitoring Item | Value | Monitoring Item | Value | Monitoring Item | Value
Height/mm | 1720 | Eye height/mm | 1606 | Shoulder breadth/mm | 388
Weight/kg | 71 | Shoulder height/mm | 1402 | Sitting hip breadth/mm | 349
Upper arm/mm | 320 | Sitting height/mm | 932 | Total head height/mm | 233
Forearm/mm | 238 | Sitting eye height/mm | 819 | Hand length/mm | 186
Thigh/mm | 500 | Sitting shoulder height/mm | 614 | Foot length/mm | 254
According to the parameters in Table 1 and the requirements above, we built a digital human model with the corresponding parameters, shown in Fig. 1:
Fig. 1. The Standard Chinese Male Pilot Digital Human Model
Visibility and accessibility analyses were carried out on a prototype model of the MA60, and the results were satisfactory.
3 Crew Workload Research

Workload can be defined as the portion of an operator's resources required to perform a task under certain environmental and operational conditions; these resources comprise both the physical and the mental demands on the operator. An appropriate workload may improve the operator's efficiency, whereas excessive workload definitely results in severe performance decrements. Workload evaluation techniques are typically classified into three categories: subjective assessment, physiological methods and task performance measures [2]. Task performance measures basically separate into two types: primary-task measures and secondary-task measures. The primary task is frequently used to generalize the results of a study, while the secondary task has no practical meaning in itself and serves only to measure the operator's workload. To make primary-task measures reliable, it is necessary to plan carefully which parameters should be measured. Frequently, primary-task measures combine experimental and operational assessment; parameters such as speed, response time, accuracy and error rate often serve to evaluate primary-task performance.

We used primary-task measures to evaluate the workload. The primary task was a continuous portion of the take-off operations, which included two operations on the mode control panel, two operations on the overhead panel and one operation on the CDU. The parameters selected to indicate workload were the average time to fulfill the task and the angle of rotation of the shoulder joint of the digital human, where the average time is calculated with the MOD method [3] and the angle of rotation relies on the Human Posture Analysis module in CATIA. After the simulation, we compared the results with data from an experienced pilot who performed the same task in a semi-physical simulation, to assess whether the established digital human model was suitable for workload research.

3.1 Measurement of Features of the Digital Human and Determination of Evaluation Types

The digital human model was made to accomplish the above task in simulation, yielding the average time and the angle of rotation of the shoulder joint given in the following table:

Table 2. Measurement results of the digital human model

Digital Human Model | Average Time | Angle of Rotation of Shoulder Joint
Standard Human Pilot Model | 7.5 s | 366°
To provide a comparative reference, we gathered five experienced pilots to classify the average time and the angle of rotation of the shoulder joint according to their flight experience. The consensus was as follows:
Table 3. The types of behavior

Type | Average Time | Angle of Rotation of Shoulder Joint
Bad | >7.8 s | >390°
Common | 7.3 s – 8.2 s | 355° – 395°
Good | 6.8 s – 7.6 s | 340° – 370°
Excellent | <7.2 s | <360°
3.2 Fuzzy Model Identification

To analyze the workload we then used the Principle of Maximum Membership Degree. Assuming that the parameters Average Time and Angle of Rotation of Shoulder Joint follow normal fuzzy sets, the membership function is [4]:

\[
A_j(x_j) = \begin{cases} 0, & |x_j - \bar{x}_j| > 2s_j \\[4pt] 1 - \left(\dfrac{x_j - \bar{x}_j}{2s_j}\right)^{2}, & |x_j - \bar{x}_j| \le 2s_j \end{cases}
\]

where x_j is the parameter (Average Time or Angle of Rotation of Shoulder Joint), \bar{x}_j is the average value of the consensus, and 2s_j is the standard deviation. The membership is then determined by comparing, for each type, the values

\[
A_i(x) \approx \frac{1}{2}\sum_{j=1}^{2} A_j(x_j).
\]
3.3 Results

According to the calculation, the membership degree of the digital human model to 'Common' is 0.7444, which is higher than its degree to 'Good' (0.4499). We therefore classified the performance of the digital human as 'Common'. The results of the model are in accordance with expectations, and we consider that the model can be used to evaluate crew workload. To make the model more human-like, a mental component will be added to it.
4 Conclusion

In this paper, the authors introduced digital human models for evaluating pilot crew workload and established one model to accomplish a given task. The results are in line with expectations.
Acknowledgement

This research work was supported by the National Basic Research Program of China (973 Program, No. 2010CB734103).
450
Y. Zheng and S. Fu
References
1. Van der Meulen, P., DiClemente, P.: Ergonomic Evaluation of an Aircraft Cockpit with RAMSIS 3D Human Modeling Software. Tecmath of North America (2001)
2. Farmer, E., Brownson, A.: Review of workload measurement, analysis and interpretation methods. European Organisation for the Safety of Air Navigation (2003)
3. Sheng, H.: Modular Arrangement of Predetermined Time Standard. Repute (2009)
4. The Methods and Applications of Fuzzy Math. HUST (2003)
A Simulation Environment for Analysis and Optimization of Driver Models Ola Benderius1, Gustav Markkula1,2, Krister Wolff1, and Mattias Wahde1 1
Department of Applied Mechanics, Chalmers University of Technology, 412 96 Göteborg, Sweden {ola.benderius,krister.wolff,mattias.wahde}@chalmers.se 2 Volvo Technology Corp., 405 08 Göteborg, Sweden [email protected]
Abstract. A simulation environment for evaluation and optimization of driver models is introduced and described. The simulation environment features models of vehicles and drivers, as well as a representation of the traffic environment (roads, buildings etc.). In addition, an optimization framework based on stochastic optimization algorithms has been implemented as an integral part of the simulation environment. Given observed (time series) data of driver behavior and, possibly, vehicle dynamics, the optimization framework can be used for inferring driver model parameters. The simulation environment has been evaluated in two scenarios, one involving emergency braking and one involving a double lane change. Keywords: driver models, optimization, vehicle simulation, emergency braking, double lane change.
1 Introduction and Motivation

In this paper, a simulation environment (running on a standard desktop PC) for evaluation and optimization of driver models is introduced and described by means of two specific examples. Computer simulation is increasingly used, in both academia and industry, as a tool for system development and validation. In automotive research, computer simulation can, in some cases, be used as a complement to costly traditional vehicle tests, such as test track driving, driving simulator studies, or crash tests. For some situations, traditional tests might not even be sufficiently safe to carry out in a real vehicle, for example when testing active safety systems. Such systems are activated in critical situations and are thus difficult to test without exposing the driver to danger. In order to obtain meaningful results from vehicle safety simulations, one must not only have a simulation environment capable of representing the essential aspects of the vehicle dynamics, but also a sufficiently accurate human driver model for the scenario at hand [1]. Driver models are often devised using assumptions regarding underlying driver psychology, or by analyzing emerging patterns in driving data [1]. Either way, some parameters are always associated with each driver model, and the mapping from a
given set of parameters to the detailed behavior of the driver model is typically far from trivial. In fact, a common problem with many existing driver models is that systematic tuning of their parameters to data from real driving has not been reported in the literature, even though some authors (see, for example, [2], [3]) have considered optimization of driver models based on data obtained in vehicle simulators or computer simulations. Thus, the development of methods for setting the model parameters appropriately is an important issue in the study of driver models. Therefore, unlike most other vehicle simulation environments, the one presented here includes, as an integral part, an optimization framework based on stochastic optimization algorithms, such as genetic algorithms (GAs) and particle swarm optimization (PSO). The paper is structured as follows: Sect. 2 introduces the simulation environment, with special emphasis on the optimization framework. Sect. 3 introduces the driver models used here and also describes the data. In Sect. 4, the results of two experiments are given as illustrations of the simulation environment and its optimization framework. The results are then discussed in Sect. 5.
2 Simulation Environment

2.1 General Description

The simulation environment has been implemented in C# and features vehicle models, driver models, and models of the traffic environment. So far, three different vehicle dynamics models have been implemented: (1) a simple kinematic bicycle model, (2) a bicycle model with lateral dynamics, and (3) a full dynamics model with both lateral and longitudinal dynamics involving force contributions from all wheels [4]. Normally, the full dynamics model is used for evaluation of the vehicle under study, whereas the bicycle models can be used for representing other vehicles present in the simulation (for which the detailed dynamics is less important). The full dynamics model features a steering system, a powertrain containing engine, gearbox and differential, wheels with suspension, and a braking system. The various parts are implemented in a modular fashion, making it easy to modify any part of the vehicle dynamics. It is also possible to implement and add active safety systems to the simulated vehicle. Generally, these additions are activated under specific conditions and might then affect vehicle dynamics or driver behavior. A generic active safety system is included, which can easily be modified to represent any specific active safety system, such as, for example, an anti-lock braking system. Since the main purpose of the simulation environment is to study driver models, an effort has also been made to simplify the process of implementing and using various driver models. The amount of computer code associated with a driver model is very small: In simple cases, only a few lines of code are needed. The driver model output generally includes the steering wheel angle, the brake pedal position, the gas pedal position, and the selected gear. Inputs are acquired from the environment and from other vehicles. Typical inputs are, for example, road boundary positions and the position of other vehicles, as estimated by the driver model.
Fig. 1. Two views from the simulation environment. Left panel: An exterior view of the braking scenario described in Sect. 3.1. Right panel: An interior view (i.e. from the viewpoint of the driver) of the same scenario.
In addition to the vehicle and driver models, a traffic scenario also requires defining the traffic environment, the initial conditions, and the termination criteria. The traffic environment mainly consists of roads defined as splines with five parameters (three spatial dimensions, road banking, and a friction coefficient). It also contains ground, sky, and stationary objects, such as buildings. Initial conditions specify, for example, the position, heading, and speed of each vehicle in the scenario. Termination criteria can easily be defined and combined depending on the details of the scenario. For instance, a typical termination criterion can involve a simple time limit, but it could also be defined based on the actions of the driver or the dynamics of the vehicle. When any termination criterion is met, the simulation is automatically stopped. If optimization is enabled (see below), the next simulation (typically using different driver model parameters) is then started. A data manager, responsible for handling vehicle signals logged during simulation as well as managing external data sets, is integrated in the simulation environment. From the data manager it is possible to load external data, such as a steering wheel angle signal, and then visualize the resulting vehicle trajectory. The Ogre graphics framework (http://www.ogre3d.org/, last retrieved 2011-01-31) is used for three-dimensional visualization. This framework is capable of accurately rendering all objects present in the scenario, such as roads, vehicles, buildings etc. It is also possible to use different camera views, as shown in Fig. 1. When developing the simulation environment, the running speed has been an important criterion, especially in view of the optimization framework (see below), which depends on fast evaluation of driver models. On a standard desktop computer, the simulation environment runs around 50 times faster than real time. For visualization purposes, it is possible to temporarily slow down the simulations to real time.

2.2 Optimization Framework

The simulation environment features an optimization framework in which several stochastic optimization algorithms, such as genetic algorithms (GAs) [5] and particle
swarm optimization (PSO) [6] are available for inferring driver models based on time series data. Such methods have been used successfully in many optimization problems (see, for example, [7]). Both GAs and PSO are population-based algorithms, in which a set of candidate solutions (to the problem at hand), referred to as individuals, is maintained. Each candidate solution encodes a set of values for the parameters of the driver model. In the case of a GA (the algorithm used in this paper), all candidate solutions are first evaluated, and a new population is then formed by the three operators selection, crossover, and mutation. The selection operator selects candidate solutions (for inclusion in the new population) probabilistically, in proportion to their performance (taken, for example, as the inverse of the error measure, described below). The crossover operator combines the parameters from two individuals, and the mutation operator makes small, random changes of the parameters. Two types of mutations exist: full-range mutations, in which the new value is generated randomly in the allowed range, and creep mutations, in which the new value is generated using a narrow distribution (whose width is referred to as the creep rate) centered on the previous value. Selection, crossover, and mutation take place until a new population of the same size as the previous population has been generated. The new population then replaces the previous one, at which point a generation has been completed. The algorithm typically runs for a large number (hundreds) of generations.
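The GA loop just described can be sketched as follows. This is an illustrative Python rendering (the environment itself is written in C#); the fitness function, the parameter ranges, and the Gaussian form of the creep distribution are assumptions rather than details given in the paper.

```python
import random

N_PARAMS, POP_SIZE = 4, 30
RANGES = [(0.0, 1.0)] * N_PARAMS          # placeholder parameter ranges
P_CROSS, P_MUT, P_CREEP, CREEP_RATE = 0.75, 0.25, 0.50, 0.02

def fitness(ind):
    # Placeholder: in the simulation environment this would be the
    # inverse of the error measure e obtained by running the driver model.
    return 1.0 / (1e-9 + sum((x - 0.5) ** 2 for x in ind))

def select(pop, fits):
    # Fitness-proportionate (roulette-wheel) selection.
    return random.choices(pop, weights=fits, k=1)[0]

def crossover(a, b):
    # Single-point crossover combining the parameters of two individuals.
    cut = random.randrange(1, N_PARAMS)
    return a[:cut] + b[cut:]

def mutate(ind):
    out = list(ind)
    for j, (lo, hi) in enumerate(RANGES):
        if random.random() < P_MUT:
            if random.random() < P_CREEP:   # creep mutation
                out[j] += random.gauss(0.0, CREEP_RATE * (hi - lo))
                out[j] = min(hi, max(lo, out[j]))
            else:                           # full-range mutation
                out[j] = random.uniform(lo, hi)
    return out

pop = [[random.uniform(lo, hi) for lo, hi in RANGES] for _ in range(POP_SIZE)]
for generation in range(100):
    fits = [fitness(ind) for ind in pop]
    pop = [mutate(crossover(select(pop, fits), select(pop, fits)))
           if random.random() < P_CROSS else mutate(select(pop, fits))
           for _ in range(POP_SIZE)]
```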
Fig. 2. The braking scenario considered in the first experiment. The vehicle controlled by the driver model (small white rectangle) is a passenger car arriving at an intersection when, suddenly, a truck (large white rectangle) emerges from behind a building (large gray rectangle). See also Fig. 1.
During evaluation, each individual is evaluated and is then assigned an error measure. Any combination of time series (for which data are available) can be used when forming the error measure. Furthermore, a data set typically contains several repetitions of each experiment. As a specific example, consider a case in which only the speed V of the vehicle is used when inferring the driver model (as in the second braking setup described in Sect. 4.1 below). The error measure e is then given by

\[
e = \sum_{i=1}^{N}\sum_{j=1}^{K_i}\left(V_i(j) - U_i(j)\right)^{2} \tag{1}
\]
where V_i(j) denotes the samples in the speed time series obtained for repetition i of the experiment under consideration (see Sects. 3.1 and 3.2 for specific examples) and U_i(j) denotes the observed speed for the corresponding repetition in the simulation environment. N denotes the number of repetitions and K_i denotes the number of samples (time steps) in repetition i. The optimization framework also contains methods for avoiding overfitting, as discussed in Sect. 5 below.
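Under the squared-difference reading of Eq. (1) used above, the error measure can be transcribed directly into code; the nested-list data layout is an assumption.

```python
def error_measure(V, U):
    """Sum of squared speed deviations over all N repetitions.

    V[i][j]: recorded speed sample j of repetition i (the data);
    U[i][j]: speed produced by the simulated driver model.
    The squared-difference form is an assumption consistent with the
    surrounding definitions of N and K_i."""
    return sum(
        sum((v - u) ** 2 for v, u in zip(V_i, U_i))
        for V_i, U_i in zip(V, U)
    )
```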
3 Driver Models and Data

In order to test the simulation environment and, in particular, the ability of the optimization framework to find a driver model consistent with the observed data, two experiments have been carried out. In the first experiment, artificial data were generated for an emergency braking task, using a driver model with four parameters. The optimization framework was then used for recovering the parameters of the driver model based solely on the generated data series. The second experiment was based on data collected in a test track experiment involving a real truck driven on an icy road in Northern Sweden.

3.1 Driver Model for Emergency Braking (Artificial Data)

As a first simple test of the simulation environment, a simple emergency braking scenario was considered, in which the driver had to brake to avoid a collision with a vehicle suddenly appearing from behind a building. The scenario is illustrated in Fig. 2. This experiment was aimed at illustrating the ability of the optimization framework to find (the parameters of) a driver model, given observed driving data, but without any prior knowledge regarding the values of the model parameters. In this particular experiment, the task of recovering the driver model parameters was of course simplified (compared to a case involving real driving), since the same driver model (rather than an actual driver) was used for generating the data, making it possible to obtain, in principle, an exact fit to the data. The driver model used was inspired by the braking model described by Jurecki and Stanczyk [8], in which the vehicle's deceleration consists of one term proportional to its speed, and one term inversely proportional to the distance between the vehicle and the obstacle. However, the model described in [8] is not, strictly speaking, a model of the driver's actions, but rather the result (the deceleration) of those actions. Thus, here, a modified model involving the brake pedal position has been used. The model takes the following form:

\[
\tau_B\,\dot{p}(t) + p(t) = s\!\left(\frac{V(t-\tau_R)}{V_c} + \frac{S_c}{S(t-\tau_R)}\right) \tag{2}
\]
where p denotes the pedal position (in the range [0,1]) exerted by the driver, and V and S denote the speed of the vehicle and the distance to the obstacle, respectively.
The model has four parameters, namely the reaction time τR, the time constant τB, and the constants Vc and Sc. The squashing function s(z) = tanh(z) serves to keep the brake pedal position in the allowed range.
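The exact printed form of the braking model did not survive extraction; under the reading reconstructed in Eq. (2) above (a first-order lag with time constant τB driving the pedal toward s(V/Vc + Sc/S), evaluated with reaction delay τR), a forward-Euler simulation might look as follows. The point-mass vehicle, the maximum deceleration, and the initial distance are crude placeholders.

```python
import math

def simulate_braking(v0, s0, tau_r=0.2, tau_b=0.1, v_c=50.0, s_c=20.0,
                     dt=0.01, a_max=8.0):
    """Forward-Euler simulation of the reconstructed pedal model
    tau_b * p'(t) + p(t) = s(V(t - tau_r)/Vc + Sc/S(t - tau_r)),
    with s(z) = tanh(z). The equation form, the point-mass vehicle,
    and a_max are assumptions; only the four parameters and the
    squashing function are given in the paper."""
    delay = max(1, int(round(tau_r / dt)))
    v, s, p = v0, s0, 0.0
    history = [(v0, s0)] * delay          # inputs as seen tau_r ago
    trace = []
    while s > 0.5 and v > 0.0:
        v_del, s_del = history.pop(0)     # delayed speed and distance
        history.append((v, s))
        target = math.tanh(v_del / v_c + s_c / max(s_del, 1e-3))
        p += dt * (target - p) / tau_b    # first-order pedal response
        v = max(0.0, v - dt * a_max * p)  # pedal maps to deceleration
        s -= dt * v                       # distance to the obstacle
        trace.append((p, v, s))
    return trace

# One of the three artificial maneuvers: initial speed 20 m/s.
trace = simulate_braking(v0=20.0, s0=60.0)
```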
Fig. 3. A schematic representation of the obstacles (traffic cones) and the intended trajectory for the double lane change scenario
In this case, artificial data series were generated by carrying out three braking maneuvers in the simulation environment, starting from three different initial longitudinal speeds, namely 16, 18, and 20 m/s. For each initial speed, two data series were collected (with a sampling frequency of 100 Hz), representing the brake pedal position and the vehicle's speed. The parameters used when generating the artificial data were taken as τR = 0.2 s, τB = 0.1 s, Vc = 50 m/s, and Sc = 20 m. The duration of the maneuvers was a few seconds.

Table 1. The second column shows the actual parameter values used for generating the time series data for the emergency braking experiment. The third and fourth columns show the inferred parameter values (average and 95% confidence interval, assuming normally distributed values) from the first setup (error based on the difference in brake pedal position) and the second setup (error based on the difference in speed), respectively.

Parameter | Actual value | First setup | Second setup
τR | 0.200 | 0.204±0.002 | 0.206±0.027
τB | 0.100 | 0.100±0.001 | 0.100±0.036
Vc | 50.0 | 49.6±0.7 | 52.6±8.1
Sc | 20.0 | 19.9±0.1 | 20.0±0.9

3.2 Driver Model for Double Lane Change (Real Data)

Here, a double lane change maneuver, illustrated in Fig. 3, was considered. Data were collected in a brief precursor experiment carried out using a rigid truck on a test track in Northern Sweden. Unfortunately, the data obtained were incomplete: The position
of the truck (as a function of time) was not logged due to a problem with the GPS logging equipment. Thus, the driver model analysis presented here simply concerns the variation of the steering wheel angle (SWA) over time. Based on the observed SWA variation, a simple phenomenological, open-loop driver model was defined in the simulation environment. In this model, the SWA was equal to zero until the distance between the vehicle and the first cone pair dropped below S0 (at which point the time t was set to 0). After that, the SWA (denoted d) varied as a parameterized function of time for t ≤ td, where a, t1, t2, and β, along with S0 and td, are the model parameters. Time series data from three repetitions of the experiment were available. In all three repetitions, the initial longitudinal speed was 50 km/h. Note that positive values of S0 imply that steering begins before the first pair of traffic cones is reached.
4 Results

The stochastic optimization procedure (described in Sect. 2.2 above) was applied in order to infer the model parameters for the two cases described in Sect. 3.

4.1 Emergency Braking

Here, the task was to determine the four parameters τR, τB, Vc, and Sc, given the generated time series (see Sect. 3.1). Two different setups were used for the optimization runs: First, the parameters were inferred from the three time series measuring the brake pedal position. In the second, more challenging case, the parameters were instead inferred using only the three time series measuring the vehicle speed. In both cases, the error (see Eq. (1)) measured the difference between the time series data and the corresponding variable generated in the simulation being evaluated. Thus, in the first setup, the difference in brake pedal position was used to
Fig. 4. The brake pedal position (left panel) and the longitudinal speed (right panel) for one particular repetition (starting from a speed of 20 m/s). As can be seen, the results from the driver model (using the average parameters obtained for the second setup; see Table 1) overlap the (artificial) data more or less exactly.
form the error measure whereas, in the second case, the difference in vehicle speed was used. For each setup, 10 optimization runs were carried out, with a population size for the GA (see Sect. 2.2) of 30, a crossover rate of 0.75, and a mutation rate of 0.25. Mutations were either full-range mutations or creep mutations. The creep probability was set to 0.50 and the creep rate to 0.02. Each run lasted 1000 generations. The parameters inferred for the two setups are shown in Table 1. The variations in brake pedal position and vehicle speed are exemplified in Fig. 4. 4.2 Double Lane Change The values of the parameters S0, td, a, t1, t2, and β (see Sect. 3.2) were inferred by running the optimization algorithm 10 times. Essentially, two different sets of (equally viable, in terms of the SWA error e) solutions were found; one in which the driver model begins the double lane change rather late (after passing the first pair of traffic cones) and then steers quite violently, and one in which the driver model begins steering before reaching the first traffic cone pair, and can therefore steer rather softly throughout the maneuver. For the early steering case, the average parameter values were S0 = 9.7 (m), td = 8.5 (s), a = 2.2 (radians), t1 = -19 (s), t2 = 0.38 (s), and β = 0.73, whereas, for the late steering case, the average values were S0 = -5.9 (m), td = 8.4 (s), a = 3.6 (radians), t1 = -5.2 (s), t2 = 0.052 (s), and β = 0.43. The SWA is plotted in Fig. 5, both for the three data series and for the inferred driver models for the early and late steering cases.
Fig. 5. Left panel: A view from the simulation environment, depicting the double lane change scenario. Right panel: The variation in steering wheel angle from the real truck data (solid lines) and from the best-fitting driver models found (dashed lines). The dashed curve with large amplitude corresponds to the case of late steering discussed in Sect. 4.2.
5 Discussion and Conclusion As can be seen in Sect. 4, the optimization framework is able to find the parameters of a driver model, using time series of the observed behavior of the driver, the vehicle, or both.
In the braking scenario, the driver model parameters were accurately inferred in both of the cases considered, even though the spread in the obtained values was larger in the second case (in which the evaluation was based on the speed of the vehicle rather than the brake pedal position). This is not surprising, since, in the second case, the mapping from the driver model to the observed variable (speed) is not as direct as in the first case. Regarding the double lane change, it is interesting to note that two different solutions (with equal performance) were found, indicating that, for real data, one cannot, perhaps, expect to find a single set of parameters for the driver model. In fact, it is well known [1], [9] that there is a clear variability in driving characteristics between different drivers. Thus, the simulation environment can perhaps be used for inferring driver model parameters corresponding to several different driver types. In most driver behavior studies, the amount of available data is typically limited to, at most, a few tens of repetitions of each experiment. Thus, when inferring a driver model from data, one must be careful to avoid overfitting, i.e. fitting the driver model to irrelevant (and irreproducible) aspects of the data resulting from the limited size of the data set, sensor noise, etc. For this reason, holdout validation has been implemented as an integral part of the optimization framework in the simulation environment. Thus, when inferring a driver model, only a part of the data (the training set) is used for giving feedback to the optimization algorithm. Another part of the data (the validation set) is used for determining when to stop the optimization, whereas the remaining data (the test set) are used only after optimization, to assess the true performance of the inferred driver model. However, in the optimization examples considered in this paper, holdout validation has not been used: In the first case (emergency braking), the data were generated artificially using a single set of driver model parameters, and were thus (unrealistically) free from noise and other variations between repetitions, and in the second case (double lane change), only three time series were available, too few to carry out holdout validation. Optimization of the kind considered in this paper is generally rather time-consuming, since a large number of driver model parameters must be evaluated. However, due to the high speed of the simulation environment (see Sect. 2), optimization of a driver model requires a running time of a few hours or less on a standard desktop computer. To conclude, a simulation environment for rapid evaluation and optimization of driver models has been introduced and then validated in two different scenarios. It has been demonstrated that the optimization framework, which is integrated in the simulation environment, is able to infer driver model parameters rapidly and reliably.
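A schematic of the holdout scheme described above, splitting the available repetitions into training, validation, and test sets; the 60/20/20 proportions are an illustrative assumption, not taken from the paper.

```python
def holdout_split(repetitions, train_frac=0.6, val_frac=0.2):
    """Split experiment repetitions into a training set (drives the
    optimizer), a validation set (decides when to stop), and a test
    set (final assessment of the inferred driver model)."""
    n = len(repetitions)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    train = repetitions[:n_train]
    val = repetitions[n_train:n_train + n_val]
    test = repetitions[n_train + n_val:]
    return train, val, test
```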
References 1. Plöchl, M., Edelmann, J.: Driver Models in Automobile Dynamics Application. Vehicle System Dynamics 45, 699–741 (2007) 2. Dogan, U., Edelbrunner, H., Iossifidis, I.: Towards a Driver Model: Preliminary Study of Lane Change Behavior. In: Proceedings of the 11th International IEEE Conference on Intelligent Transportation Systems, pp. 931–937 (2008) 3. Pellecchia, A., Igel, C., Edelbrunner, J., Schöner, G.: Making Driver Modeling Attractive. IEEE Intelligent Systems 20, 8–12 (2005)
4. Lucet, E., Grand, C., Sallé, D., Bidaud, P.: Stabilization Algorithm for a High Speed Car-like Robot Achieving Steering Maneuver. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2540–2545 (2008) 5. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (1975) 6. Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: Proceedings of the IEEE International Conference on Neural Networks, pp. 1942–1948 (1995) 7. Wahde, M.: Biologically Inspired Optimization Methods: An Introduction. WIT Press, Southampton (2008) 8. Jurecki, R., Stanczyk, T.L.: Driver Model for the Analysis of Pre-accident Situations. Vehicle System Dynamics 47, 589–612 (2009) 9. Erséus, A.: Driver-vehicle Interaction: Identification, Characterization and Modelling of Path Tracking Skill. PhD Thesis, Royal Institute of Technology (KTH), Stockholm (2010)
Learning the Relevant Percepts of Modular Hierarchical Bayesian Driver Models Using a Bayesian Information Criterion Mark Eilers*,** and Claus Möbus*,** Transportation Systems, Learning and Cognitive Systems OFFIS Institute for Information Technology, C.v.O. University, Oldenburg, Germany [email protected], [email protected] http://www.offis.de/, http://www.lks.uni-oldenburg.de/
Abstract. Modeling drivers' behavior is essential for the rapid prototyping of error-compensating assistance systems. Various authors proposed control-theoretic and production-system models. Based on psychological studies various perceptual measures (angles, distances, time-to-x-measures) have been proposed for such models. These proposals are partly contradictory and depend on special experimental settings. A general computational vision theory of driving behavior is still pending. We propose the selection of drivers' percepts according to their statistical relevance. In this paper we present a new machine-learning method based on a variant of the Bayesian Information Criterion (BIC) using a parent-child monitor to obtain minimal sets of percepts which are relevant for drivers' actions in arbitrary scenarios or maneuvers. Keywords: Probabilistic Driver model, Bayesian Autonomous Driver model, Mixture-of-Behavior model, Bayesian Real-Time-Control, Machine-Learning, Bayesian Information Criterion, Hierarchical Bayesian Models.
1 Introduction

The Human or Cognitive Centered Design of intelligent transport systems requires computational models of human behavior and cognition [1, 2]. Particularly the modeling of drivers' behavior is essential for the rapid prototyping of error-compensating assistance systems [1]. Based on psychological studies [3, 9-11, 13, 20, 21] various perceptual measures (angles, distances, time-to-x-measures) have been proposed for such models. These proposals are partly contradictory and depend on special experimental settings. A general computational vision theory of driving behavior is still pending. Due to the variability of human cognition and behavior, the irreducible lack of knowledge about underlying cognitive mechanisms, and the irreducible incompleteness of
* Project Integrated Modeling for Safe Transportation (IMOST) sponsored by the Government of Lower Saxony, Germany under contracts ZN2245, ZN2253, ZN2366.
** Project ISi-PADAS funded by the European Commission in the 7th Framework Program, Theme 7 Transport FP7-218552.
knowledge about the environment [1], we conceptualize, estimate and implement models of human drivers as probabilistic models: Bayesian Autonomous Driver (BAD) models. In contrast to [21], BAD models need not be programmed like traditional simulation software but are condensed and abstracted in an objective manner, using machine-learning techniques, from human behavior traces.
2 Bayesian Autonomous Driver Mixture-of-Behaviors Models

In earlier research [14] we developed a BAD model with dynamic Bayesian networks, based on the Bayesian Programming approach [1] and on the assumption that a single skill is sufficient for lateral and longitudinal control. Later, we realized that modeling the complex competence of human drivers requires a skill hierarchy. We modified the simple BAD model architecture into a hierarchical, modular probabilistic architecture. Now we are able to construct driver models by decomposing complex maneuvers into basic behaviors and vice versa: Bayesian Autonomous Driver Mixture-of-Behaviors (BAD MoB) models [5, 6, 15, 16]. BAD MoB models consist of Gating-, Behavior-Classification-, and Action-models. Their functional interaction allows the generation of context-dependent driver behavior by sequencing and mixing pure basic behaviors [5, 6] (Fig. 1).
Fig. 1. Exemplary mixing of behaviors in a BAD MoB model assembled from two Action-models, one Behavior-Classification-model, and one Gating-model. The Gating-model calculates a weighted sum over the answers of the two Action-models, according to the appropriateness of their corresponding behaviors, i.e., their probabilities or mixing coefficients, inferred by the Behavior-Classification-model.
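In probabilistic terms, the gating of Fig. 1 computes P(Actions^t | …) = Σ_b P(B^t = b | …) · P(Actions^t | …, B^t = b). A minimal sketch with hypothetical two-behavior tables:

```python
# Mixture-of-behaviors gating as in Fig. 1: the Gating-model forms a
# weighted sum of the Action-models' CPDs, with mixing coefficients
# supplied by the Behavior-Classification-model. All numbers below are
# hypothetical placeholders.
action_values = range(3)

# P(Actions^t | percepts, behavior) for two behaviors (e.g. Straight, Wide).
cpd_straight = {0: 0.7, 1: 0.2, 2: 0.1}
cpd_wide     = {0: 0.1, 1: 0.3, 2: 0.6}

# Mixing coefficients P(B^t = b | percepts) from the classifier.
mix = {"straight": 0.25, "wide": 0.75}

mixture = {a: mix["straight"] * cpd_straight[a] + mix["wide"] * cpd_wide[a]
           for a in action_values}
# mixture == {0: 0.25, 1: 0.275, 2: 0.475}; sampling from it every 50 ms
# yields the motor command.
```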
Based on a skill hierarchy partitioning complex scenarios into maneuvers and maneuvers into simpler behaviors (cf. Fig. 2), each behavior is modeled by an Action-model. This is implemented by a dynamic Bayesian network that realizes the sensor-motor schema of the desired behavior. It can be utilized to infer the conditional probability distribution (CPD) of the actions given the former actions and the current percepts: P(Actions^t | Actions^(t-1), Percepts^t). For each complex scenario
(or maneuver) in the skill hierarchy, a Behavior-Classification-model is used to infer the appropriateness of the corresponding simpler maneuvers (or behaviors). A Gating-model computes a weighted sum over the inferred CPDs of the Action-models, using the appropriateness of their corresponding behaviors, inferred by the Behavior-Classification-models, as mixing coefficients (Fig. 1). By calculating weighted sums over the mixture distributions, BAD MoB models are able to combine mixture distributions in a hierarchical manner. These models thus allow the combination of pure behaviors into more complex maneuvers, and of maneuvers into scenarios. BAD MoB models sample random values Actions^t = actions^t from the inferred CPD P(Actions^t | Actions^(t-1), Percepts^t) every 50 ms. These are used as motor commands to autonomously control (simulated) vehicles.

2.1 Skill Hierarchy

For an experimental BAD MoB model in the racing simulation TORCS¹, we defined a skill hierarchy of three hierarchical layers. The Racing Scenario was partitioned into the three maneuvers LaneFollowing, CarFollowing and Overtaking. LaneFollowing was partitioned into the behaviors for driving on a straight segment (Straight), through a wide curve (Wide) and through a sharp curve (Sharp), and so on (Fig. 2).
Fig. 2. Skill hierarchy for a racing scenario with three hierarchical layers
2.2 Training Phase

The learning of a BAD MoB model requires multivariate time series of human behavior traces. These were obtained from a single driver, who drove several laps on two different racing courses in the TORCS simulation. We recorded approximately 15000 data samples. Each time-stamped data record contained values for 211 discrete random variables (Table 1): two action-variables Acc^t and Steer^t, denoting the position of a combined acceleration and braking pedal and the steering wheel angle, four behavior-variables representing the partitioning of the task hierarchy (Fig. 2), and a set of 205 time-independent (estimates of distances and angles) and time-dependent percept-variables (TDPs), similar but not identical to Lee's time-to-x (tau) measures [12, 13, 17].
¹ http://torcs.sourceforge.net/ (last retrieved 2011-01-31)
Table 1. Overview of the two action-variables, four behavior-variables and 205 percept-variables defined for the foveal and ambient visual channel of the driver [7]

Variable | Range | Description
Acc^t | {0,…,14} | Position of a combined acceleration and braking pedal. Ranges from full braking (0) to full acceleration (14).
Steer^t | {0,…,29} | Steering wheel angle. Ranges from full turning to the right (0) to full turning to the left (29).
B^t_Sc | {0,…,2} | Represents the maneuvers LaneFollowing, CarFollowing and Overtaking that compose the Racing Scenario.
B^t_LF | {0,…,2} | Represents the LaneFollowing behaviors Straight, Wide and Sharp.
B^t_CF | {0,…,2} | Represents the CarFollowing behaviors FollowStraight, FollowWide and FollowSharp.
B^t_OT | {0,…,2} | Represents the Overtaking behaviors PassOut, PassCar and PassIn.
LS^t | {0,…,20} | Longitudinal speed of the ego-car.
FCA^t_5m, FCA^t_10m, …, FCA^t_250m | {0,…,20} | 50 Fixed-Distance Course Angles.
SCA^t_0.2s, SCA^t_0.4s, …, SCA^t_10.0s | {0,…,20} | 50 Speed-Dependent Course Angles.
FLA^t_5m, FLA^t_10m, …, FLA^t_250m | {0,…,20} | 50 Fixed-Distance Lane Angles.
SLA^t_0.2s, SLA^t_0.4s, …, SLA^t_10.0s | {0,…,20} | 50 Speed-Dependent Lane Angles.
NCA^t | {0,…,20} | Nearest Alter-Car Angle.
NCS^t | {0,…,20} | Nearest Alter-Car Space.
NCD^t | {0,…,20} | Nearest Alter-Car Distance.
NCVA^t | {0,…,20} | Nearest Alter-Car Viewing Angle.
2.3 Learning of Relevant Peephole Percepts

So far, the structures of skill hierarchies have to be created manually, but both the graph structure of Action- and Behavior-Classification-models and the parameters of their probability distributions can be obtained by machine-learning methods from multivariate time series of human behavior traces. To completely cover the skill hierarchy (Fig. 2), nine Action-models and four Behavior-Classification-models have to be learnt [5, 6]. The structure of the four complementary Gating-models can then be derived automatically from the structure of the Action- and Behavior-Classification-models.

To ensure efficiency in the real-time control of a BAD MoB model, we constrain the structure of Action- and Behavior-Classification-models to dynamic (first-order Markov) naïve Bayesian classifiers. For Action-models we further assume the action-variables Acc^t and Steer^t to be independent given both of the former actions Acc^(t-1) and Steer^(t-1), and that a percept must not be conditioned on both Acc^t and Steer^t. These assumptions allow boosting the inference performance by splitting the intended CPD P(Acc^t, Steer^t | Acc^(t-1), Steer^(t-1), Percepts^t) into two independent distributions: P(Acc^t | Acc^(t-1), Steer^(t-1), Percepts^t) for longitudinal control and P(Steer^t | Acc^(t-1), Steer^(t-1), Percepts^t) for lateral control.

Our BAD MoB models rest on the assumption that there is considerable uncertainty about the relevant percepts for the realization and classification of natural driving behaviors, so the relevant percepts should be identified during the modeling process. We rely on a step-wise structure-learning technique that exploits the probabilistic foundations of Bayesian driver models and determines the 'peephole' percepts from a universe of hypothetically possible or available percepts, based on the Bayesian Information Criterion (BIC) [4, 8, 18].

The Parent-Child Bayesian Information Criterion. The BIC rewards how well a model fits the data while penalizing the model complexity. Let δ denote a set of n data
rows associated with the behavior to be generated by an Action-model $\pi_A$ or the mixture of behaviors to be classified by a Behavior-Classification-model $\pi_B$, let $L(\delta \mid \pi)$ denote the likelihood of $\delta$ given a model $\pi$, and let $\mathrm{size}(\pi)$ denote the size or complexity of a model $\pi$ (we define $\mathrm{size}(\pi)$ as the number of edges in the DBN of the model). The BIC for an Action-model $\pi_A$ is then defined as

$$\log L(\delta \mid \pi_A) - \frac{\mathrm{size}(\pi_A)}{2}\log n \;=\; \sum_{i=1}^{n}\log P(\mathit{actions}_i, \mathit{actions}_{i-1}, \mathit{percepts}_i \mid \pi_A) - \frac{\mathrm{size}(\pi_A)}{2}\log n \qquad (1)$$

and the BIC for a Behavior-Classification-model $\pi_B$ is defined as

$$\log L(\delta \mid \pi_B) - \frac{\mathrm{size}(\pi_B)}{2}\log n \;=\; \sum_{i=1}^{n}\log P(\mathit{behavior}_i, \mathit{behavior}_{i-1}, \mathit{percepts}_i \mid \pi_B) - \frac{\mathrm{size}(\pi_B)}{2}\log n. \qquad (2)$$
To focus on the intended purpose of Action- and Behavior-Classification-models, we evolved a version of the BIC, which we refer to as the Parent-Child BIC (PCh-BIC), where the likelihood is replaced by a parent-child-monitor [4]. Following the foregoing definition, the PCh-BIC for an Action-model $\pi_A$ is defined as
$$\sum_{i=1}^{n}\log P(\mathit{actions}_i \mid \mathit{actions}_{i-1}, \mathit{percepts}_i, \pi_A) - \frac{\mathrm{size}(\pi_A)}{2}\log n \qquad (3)$$

while the PCh-BIC for a Behavior-Classification-model $\pi_B$ is defined as

$$\sum_{i=1}^{n}\log P(\mathit{behavior}_i \mid \mathit{percepts}_i, \pi_B) - \frac{\mathrm{size}(\pi_B)}{2}\log n. \qquad (4)$$
Learning Procedure. As the learning procedure for pertinent percepts does not differ between Action- and Behavior-Classification-models, it will be described for the learning of Action-models only. Starting with an initial Action-model $\pi_A$ without any percepts (Fig. 3), new percepts are included in a step-wise manner; currently, a simple greedy heuristic is used. For each one of the available percepts, the PCh-BIC is calculated for the initial model extended by an edge from the action-variable Acct to the respective percept. Using the intended inference for longitudinal control P(Acct | Acct-1, Steert-1, Perceptst) as the parent-child-monitor, the PCh-BIC is calculated by:
$$\sum_{i=1}^{n}\log P(\mathit{acc}_i \mid \mathit{acc}_{i-1}, \mathit{steer}_{i-1}, \mathit{fca5m}_i, \dots, \mathit{ncva}_i, \pi_A) - \frac{\mathrm{size}(\pi_A)}{2}\log n. \qquad (5)$$
Fig. 3. DBN of an initial or blind Action-model and Behavior-Classification-model without any used percepts
The percept leading to the best PCh-BIC (Fig. 4) can be seen as the most pertinent percept of the given possibilities for longitudinal control and is permanently included in the model.
Fig. 4. Plot of PCh-BICs computed for an Action-model selecting one of 205 possible percepts for longitudinal control at a time. The PCh-BIC is maximized for a time-dependent percept SLAt4.0s, revealing it as the most pertinent percept of the given possibilities for longitudinal control.
Next, for each of the remaining percepts, the PCh-BIC is calculated for the improved model extended by a new edge from the action-variable Steert to the respective percept. Using the intended inference for lateral control P(Steert|Acct-1,Steert-1,Perceptst) as the parent-child-monitor, the PCh-BIC is calculated by:
$$\sum_{i=1}^{n}\log P(\mathit{steer}_i \mid \mathit{acc}_{i-1}, \mathit{steer}_{i-1}, \mathit{fca5m}_i, \dots, \mathit{ncva}_i, \pi_A) - \frac{\mathrm{size}(\pi_A)}{2}\log n. \qquad (6)$$
Fig. 5. Plot of PCh-BICs computed for an Action-model using one of 204 remaining possible percepts for lateral control at a time. The PCh-BIC is maximized for a time-dependent percept SLAt0.8s, revealing it as the most pertinent percept of the given possibilities for lateral control.
The percept leading to the best PCh-BIC (Fig. 5) can be regarded as the most pertinent percept of the given possibilities for lateral control and is likewise included permanently in the model. The procedure is then repeated with the new model. In this step-wise manner, percepts are added until the PCh-BIC cannot be improved any further for any percept conditioned by Acct or Steert. As a result, the learning procedure selects a minimal set of peephole percepts.
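Expressed as code, the greedy loop is only a few lines. The Python sketch below is ours, not the original implementation; in particular, score_with is a hypothetical stand-in for the DBN inference that evaluates the parent-child-monitor on the training data.

import math

def pch_bic(log_prob_sum, num_edges, n):
    # Parent-Child BIC (Eq. 3): data fit under the parent-child-monitor
    # minus the complexity penalty size(pi)/2 * log(n).
    return log_prob_sum - (num_edges / 2.0) * math.log(n)

def greedy_percept_selection(candidates, score_with, n):
    """Step-wise inclusion of percepts until the PCh-BIC stops improving.

    candidates: list of still-available percept names
    score_with: hypothetical helper; given the currently selected percepts,
                it returns (sum of log parent-child probabilities, number
                of edges) for the correspondingly extended Action-model
    """
    selected = []
    best = pch_bic(*score_with(selected), n)
    improved = True
    while improved and candidates:
        improved = False
        # Score every remaining percept as a one-edge extension of the model.
        scored = [(pch_bic(*score_with(selected + [p]), n), p) for p in candidates]
        top_score, top_percept = max(scored)
        if top_score > best:          # include only strict improvements
            best, improved = top_score, True
            selected.append(top_percept)
            candidates.remove(top_percept)
    return selected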
3 Results and Discussion

Using the learning procedure we revealed the most relevant peephole percepts for all nine Action-models and four Behavior-Classification-models of the skill hierarchy (Fig. 2). Learning the Action-models, 15 peephole percepts could be revealed as pertinent for longitudinal control, with the speed LSt and the time-independent percept FLAt5m being the two most frequent ones (Table 2).

Table 2. Summary of the most relevant 15 peephole percepts used for longitudinal control

Nr. | Percept | Times used | Relevant for longitudinal control in the
1 | FCAt40m | 1 | PassCar Action-model
2 | FCAt80m | 1 | FollowSharp Action-model
3 | FCAt110m | 1 | FollowWide Action-model
4 | FCAt135m | 1 | Sharp Action-model
5 | FCAt225m | 2 | FollowStraight and PassIn Action-model
6 | FLAt5m | 3 | Wide, Sharp and FollowWide Action-model
7 | SCAt0.4s | 1 | PassIn Action-model
8 | SCAt3.2s | 1 | Wide Action-model
9 | SLAt4.0s | 1 | Straight Action-model
10 | SLAt8.4s | 1 | FollowStraight Action-model
11 | SLAt9.0s | 1 | PassOut Action-model
12 | LSt | 4 | Straight, Sharp, FollowStraight and PassOut Action-model
13 | NCAt | 1 | FollowSharp Action-model
14 | NCDt | 1 | FollowWide Action-model
15 | NCVAt | 1 | FollowSharp Action-model
For lateral control, 22 pertinent percepts could be revealed, with the time-independent percept FCAt5m and the time-dependent percept SLAt0.8s being the most frequent ones (Table 3). Table 4 shows a summary of all used peephole percepts pertinent for the classification of appropriate maneuvers or behaviors in the four Behavior-Classification-models.

Using only the 41 different peephole percepts that could be revealed during the learning process, the resulting BAD MoB model is able to drive on different racing courses in the TORCS simulation environment while overtaking slower vehicles (videos are available at http://www.lks.uni-oldenburg.de/46350.html).
Table 3. Summary of the most relevant 22 peephole percepts used for lateral control

Nr. | Percept | Times used | Relevant for lateral control in the
1 | FCAt5m | 4 | Straight, Sharp, FollowStraight and PassOut Action-model
2 | FCAt65m | 1 | PassIn Action-model
3 | FCAt75m | 1 | Sharp Action-model
4 | FCAt175m | 1 | Sharp Action-model
5 | FCAt190m | 2 | Straight and FollowWide Action-model
6 | FLAt5m | 1 | PassOut Action-model
7 | FLAt10m | 1 | FollowSharp Action-model
8 | FLAt15m | 1 | FollowWide Action-model
9 | FLAt75m | 1 | PassIn Action-model
10 | FLAt135m | 1 | FollowWide Action-model
11 | FLAt185m | 1 | Wide Action-model
12 | FLAt225m | 1 | FollowStraight Action-model
13 | SLAt0.2s | 2 | FollowWide and FollowSharp Action-model
14 | SLAt0.8s | 3 | Straight, Wide and Sharp Action-model
15 | SLAt1.2s | 1 | FollowSharp Action-model
16 | SLAt1.6s | 1 | Sharp Action-model
17 | SLAt2.0s | 1 | PassCar Action-model
18 | SLAt8.8s | 1 | PassOut Action-model
19 | SLAt9.6s | 1 | FollowSharp Action-model
20 | SCAt0.4s | 1 | Wide Action-model
21 | SCAt0.6s | 1 | FollowStraight Action-model
22 | NCDt | 1 | PassOut Action-model
Table 4. Summary of the most relevant 11 peephole percepts used for behavior-classification

Nr. | Percept | Times used | Relevant for classification of driving behavior in the
1 | FCAt30m | 1 | LaneFollowing Behavior-Classification-model
2 | FCAt60m | 1 | LaneFollowing Behavior-Classification-model
3 | FCAt205m | 1 | Racing Scenario Behavior-Classification-model
4 | FLAt5m | 2 | CarFollowing and Racing Scenario Behavior-Classification-model
5 | FLAt200m | 1 | CarFollowing Behavior-Classification-model
6 | FLAt240m | 1 | CarFollowing Behavior-Classification-model
7 | SCAt0.2s | 1 | Overtaking Behavior-Classification-model
8 | SCAt0.6s | 2 | LaneFollowing and CarFollowing Behavior-Classification-model
9 | SCAt8.2s | 1 | LaneFollowing Behavior-Classification-model
10 | NCAt | 1 | Overtaking Behavior-Classification-model
11 | NCDt | 1 | Racing Scenario Behavior-Classification-model
As a first validation, we computed the correlations between the actions acct and steert in the experimental data and the actions that would be chosen by the corresponding Action-models by drawing values from the inferred distributions P(Acct | acct-1, steert-1, perceptst) and P(Steert | acct-1, steert-1, perceptst). The mean correlation for longitudinal control is 0.902, with the best result provided by the PassCar Action-model (0.964) and the worst result provided by the Sharp Action-model (0.818). The mean correlation for lateral control is 0.909, with the best result provided by the Sharp Action-model (0.992) and the worst result provided by the PassIn Action-model (0.783). As a next step, the percepts obtained should be validated by experiments with human drivers [16].
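A minimal sketch of this validation step, assuming a hypothetical helper infer_action_dist that returns the model's distribution P(Acct | acct-1, steert-1, perceptst) as a probability vector:

import numpy as np

def validate_action_model(records, infer_action_dist, seed=0):
    """Correlate recorded actions with actions drawn from the model.

    records: iterable of (prev_acc, prev_steer, percepts, acc) tuples from
             the experimental data
    infer_action_dist: hypothetical helper returning a probability vector
                       over the discrete pedal positions {0, ..., 14}
    """
    rng = np.random.default_rng(seed)
    observed, generated = [], []
    for prev_acc, prev_steer, percepts, acc in records:
        dist = infer_action_dist(prev_acc, prev_steer, percepts)
        observed.append(acc)
        generated.append(rng.choice(len(dist), p=dist))  # draw one action value
    return np.corrcoef(observed, generated)[0, 1]        # Pearson correlation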
References
1. Bessière, P., Laugier, C., Siegwart, R. (eds.): Probabilistic Reasoning and Decision Making in Sensory-Motor Systems. Springer, Berlin (2008)
2. Cacciabue, P.C.: Modelling Driver Behaviour in Automotive Environments. Springer, London (2007)
3. Chattington, M., Wilson, M., Ashford, D., Marple-Horvat, D.E.: Eye-Steering Coordination in Natural Driving. Experimental Brain Research 180, 1–14 (2007)
4. Cowell, R.G., Dawid, A.P., Lauritzen, S.L., Spiegelhalter, D.J.: Probabilistic Networks and Expert Systems. Springer, Berlin (1999)
5. Eilers, M., Möbus, C.: Learning of a Bayesian Autonomous Driver Mixture-of-Behaviors (BAD-MoB) Model. In: Duffy, V.G. (ed.) Advances in Applied Digital Human Modeling, pp. 436–445. CRC Press, Taylor & Francis Group, Boca Raton (2010/2011)
6. Eilers, M., Möbus, C.: Lernen eines modularen Bayesian Autonomous Driver Mixture-of-Behaviors (BAD MoB) Modells. In: Jürgensohn, T., Kolrep, H. (eds.) Fahrermodellierung in Wissenschaft und Wirtschaft, 3. Berliner Fachtagung für Fahrermodellierung, Fortschrittsbericht des VDI in der Reihe 22 (Mensch-Maschine-Systeme), pp. 61–74. VDI-Verlag (2010)
7. Horrey, W.J., et al.: Modeling Driver's Visual Attention Allocation While Interacting With In-Vehicle Technologies. J. Exp. Psych. 12, 67–78 (2006)
8. Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs, 2nd edn. Springer, Heidelberg (2007)
9. Land, M.F.: The Visual Control of Steering. In: Harris, L.R., Jenkin, M. (eds.) Vision and Action, pp. 163–180. Cambridge University Press, Cambridge (1998)
10. Land, M., Horwood, J.: Which Parts of the Road Guide Steering? Nature 377, 339–340 (1995)
11. Land, M., Lee, D.N.: Where we look when we steer. Nature 369, 742–744 (1994)
12. Lee, D.N.: A theory of visual control of braking based on information about time-to-collision. Perception 5, 437–459 (1976)
13. Lee, D.N.: How Movement is Guided (2005), http://www.perception-in-action.ed.ac.uk/PDF_s/Howmovementisguided.pdf
14. Möbus, C., Eilers, M.: Further Steps Towards Driver Modeling according to the Bayesian Programming Approach. In: Salvendy, G., Smith, M.J. (eds.) HCI International 2009. LNCS (LNAI), vol. 5618, pp. 413–422. Springer, Heidelberg (2009)
15. Möbus, C., Eilers, M.: Mixture of Behaviors and Levels-of-Expertise in a Bayesian Autonomous Driver Model. In: Duffy, V.G. (ed.) Advances in Applied Digital Human Modeling, pp. 425–435. CRC Press, Taylor & Francis Group, Boca Raton (2010/2011)
16. Möbus, C., Eilers, M., Garbe, H.: Prediction of Deficits in Situation Awareness with a Modular Hierarchical Bayesian Driver Model, HCII 2011 (submitted; this volume) (2011)
17. Pepping, G.J., Grealy, M.A.: Closing the Gap: The Scientific Writings of David N. Lee. Lawrence Erlbaum Associates, Publishers, Mahwah (2007)
18. Schwarz, G.: Estimating the Dimension of a Model. The Annals of Statistics 6(2), 461–464 (1978)
19. Wilkie, R.M., Wann, J.P.: Driving as night falls: The contribution of retinal flow and visual direction to the control of steering. Current Biology 12, 2014–2017 (2002)
20. Wilkie, R.M., Wann, J.P.: Eye-Movements Aid the Control of Locomotion. Journal of Vision 3, 677–684 (2003)
21. Xu, Y., Lee, K.K.C.: Human Behavior Learning and Transfer. CRC Press, Boca Raton (2006)
Impact and Modeling of Driver Behavior Due to Cooperative Assistance Systems Florian Laquai, Markus Duschl, and Gerhard Rigoll Institute for Human-Machine-Communication Technische Universität München Theresienstraße 90, 80333 Munich, Germany {Laquai,Rigoll}@tum.de [email protected]
Abstract. Current developments in Car2X communication technology provide the basis for novel driver assistance systems. To assess the impact of such a system on a group of cars approaching a non-moving traffic jam, the driver behavior resulting from a system which supports anticipatory driving is modeled and used in a sub-microscopic traffic simulation. By equipping various percentages of a simulated group of cars with the model, the effect of the system on maximum deceleration and fuel consumption can be assessed. Finally, the problems resulting from switching between different driver models are discussed. Keywords: Driver modeling, Deceleration, Traffic simulation, Anticipatory driving.
1 Introduction

Current development in driver assistance systems makes increasing use of environment information. The area scanned by state-of-the-art radar, vision and ultrasonic sensors typically covers up to 200 meters around the car. Upcoming technologies like Car2X communication can extend this range to many kilometers by using other cars as remote sensors or simply by obtaining their positions. As a result, a period of time can be covered where no assistance was possible until now.

In the presented work we will look at an assistance system for anticipatory driving and its impact on a group of cars with varying percentages of equipped ones. It fosters a safer and more economic way of driving by informing the driver about future traffic and regulatory conditions in which he is forced to decelerate. This was accomplished by displaying a birdeye perspective on a simplified road visualisation. Additionally, the ego-car is displayed, where its color and size denote the required deceleration type (no action required / engine drag torque / braking). Also foreign vehicles and a rudimentary view of the upcoming situation are included in the perspective (see fig. 1 left).

The process flow to obtain a driver model of the deceleration phase during assisted driving and its impact on a group of cars is as follows:
• Analysis of a driving simulator experiment with the proposed assistance system
• Iterative modeling and testing with a single equipped car in a traffic simulator
• Equipping various percentages of cars in a test group with the system
• Analysis of overall fuel consumption and maximum deceleration in the group
After the final experiment analysis the occurring problems are discussed and future work regarding experiment improvements and redesign of the models will be presented.
2 Driving Simulator Experiment

A driving simulator experiment for evaluation of the system was conducted in a fixed-base simulator with 3 visual channels projected on large screens. The group of test persons consisted of 30 drivers, 18 male and 12 female, with an average age of 36 years. For simplicity, the representative situation of a non-moving traffic jam on a highway behind a blind curve was chosen. The jam stands still until the group of test vehicles reaches it, then slowly starts moving again. The implemented assistance function takes as input the speeds of the ego vehicle and the preceding vehicle on the same lane and the distance between them, and has a model of the speed-dependent engine drag force.
Fig. 1. Left: Birdeye Visualisation for upcoming traffic jam, Right: Test course
The assistance system logic output produced 4 possible messages:
• No action required: idle state (output "0")
• Step off gas pedal: engine drag force is sufficient (output "3")
• Comfortable braking (-0.2 g): engine drag force not sufficient (output "2")
• Strong braking (-0.4 g): comfortable braking not sufficient (output "1")
The logic's output is then fed into the birdeye visualization, where the size and color of the ego vehicle change accordingly. The encoding is as follows:
• Idle state: standard size, white
• Step off gas pedal: increased size, yellow
• Brake (two brake states combined): maximum size, orange
During the experiment all driver actions (pedal positions, distance keeping), dynamic properties (ego and foreign car speed and accelerations) and warning outputs are recorded for later analysis.
3 Data Analysis

In a first step, the speed recordings were windowed to 35 seconds, starting at the first output from the logic that differed from "0". As mentioned in [Bennet94], deceleration behavior is a function of starting speed and time. Other parameters (such as deceleration strategy advice through the ego vehicle encoding) had no significant effect on curve shapes. Consequently, the data sets were allocated to speed intervals with a width of ca. 20 km/h, depending on the availability of data sets.
Fig. 2. Selected speed curves from driving experiment, starting at first assistance output
Fig. 3. Left: Averaged speed curves, Right: Integrals of speed curves (travelled distance)
Then an average curve for every speed interval is computed to eliminate individual speed oscillations. The steps during the deceleration phase which can be seen in the raw data will later be resynthesised in the model, but are eliminated in the averaging process. It can be seen that with higher starting speed, deceleration curves are steeper, the time gap between a warning output and a reaction becomes shorter and drivers
reach their target speed earlier. These effects have already been described in the literature (see [Bennet94]) and will be modeled in the behavior model described in the following sections. It was also found that the travelled distances all approach 1000m, which was the actual distance to the jam when the system was activated. This shows that although drivers approached the situation with different speeds, they all managed to decelerate appropriately, while the steps in the curves suggest that they reassessed the remaining distance implied by their deceleration curve several times whenever it differed by more than a certain threshold from the actual distance.
4 Behavior Modeling

The findings from the data analysis of the driving simulator experiment let us develop a simplified model of the deceleration phase, consisting of a section with constant speed followed by a Gaussian-shaped deceleration that begins at a starting-speed-dependent time and is delimited by the driven distance.
Fig. 4. Section-wise defined deceleration model
Thus the speed is defined by the following section-wise equation:

$$v(t) = \begin{cases} v_0 & \text{for } 0 \le t < t_s \\ v_{dec}(t) & \text{for } t \ge t_s \end{cases}$$

The model is calibrated with a target distance of 1000 m, so the proportion $k = d_{target} / 1000\,\text{m}$ is introduced for being able to apply the model to different distances $d_{target}$, too.

At first the time $t_s$ shall be modeled, which is fitted to the corner points of the four curves in figure 3 (left) as a power function with leading coefficient 473.13.

The actual deceleration part defined by $v_{dec}(t)$ is modeled with a Gaussian curve to create a smooth transition at the beginning and ending of the deceleration phase (fitted constants: 12649, 0.016, 6.467, and 523.024).

As the last parameter, the target speed must be determined. This is modeled with a 4th-order polynomial to reach the targeted distance as closely as possible (fitted coefficients: 0.0001192, 0.01448, 0.566, 6.2198, and 65.92).

The final fitting result can be evaluated in fig. 5 for various starting speeds.
Fig. 5. Real versus modeled deceleration curves
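The structure of the section-wise model can be sketched compactly. The Python sketch below is illustrative only: since the fitted coefficients of the paper are only partly legible above, the parameters here (t_s, v_target, mu, sigma) are free placeholders that stand in for the starting-speed-dependent fitted functions, and would have to be re-fitted before any comparison with Fig. 5.

import numpy as np

def speed_profile(v0, v_target, t_s, mu, sigma, dt=0.1, t_max=35.0):
    """Section-wise speed model: v(t) = v0 for t < t_s, followed by a
    Gaussian-shaped deceleration that takes the speed down to v_target.

    v0, v_target : start and end speed (m/s)
    t_s          : start of the deceleration section (s)
    mu, sigma    : center and width of the Gaussian deceleration pulse;
                   in the paper these depend on the starting speed
    """
    t = np.arange(0.0, t_max, dt)
    pulse = np.exp(-0.5 * ((t - mu) / sigma) ** 2)  # Gaussian deceleration shape
    pulse[t < t_s] = 0.0                            # no braking before t_s
    # Scale the pulse so its time integral equals the total speed reduction.
    decel = pulse * (v0 - v_target) / (pulse.sum() * dt)
    v = v0 - np.cumsum(decel) * dt
    return t, v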
As mentioned before, the model does not yet fully reproduce the original speed curves, as it lacks the steps seen in the real deceleration curves. It is concluded that the steps have their origin in a re-estimation of the target distance by the driver, due to poor distance estimation for large distances (up to 500m). To build an initial model of the tolerated difference between expected and real distance to the preceding car, experimental results from [Daum09] are used to create a function describing distance estimation accuracy depending on the absolute distance. In this experiment participants had to estimate absolute distances in vista space. The standard deviation between real and perceived distance in the [Daum09] experiment was initially used to create the desired function, with linear interpolation between known points.
Fig. 6. Left: Tolerance curve derived from [Daum09], Right: Adapted Curve in the model
The driver model compares the expected (at the beginning of the deceleration phase) and the real distance to the preceding vehicle and rebuilds the speed curve if the tolerance is exceeded. First experiments with the left curve in fig. 6 showed that the tolerance is too large to produce the desired steps in the speed curves. An adapted version of it was then implemented to obtain the targeted results. Two adaptations were applied to the initial curve: the maximum tolerance limit is set to 30m and the minimum tolerance is 8m to prevent too frequent curve updates.
Fig. 7. Step during deceleration phase resulting from distance re-estimation
The envisaged effect of step-by-step deceleration as produced by the real drivers is shown in fig. 7.
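The tolerance mechanism fits in a few lines. In the sketch below, the interpolation anchor points are illustrative placeholders for the [Daum09]-derived values, while the 8 m / 30 m clamp follows the adaptation described above.

import numpy as np

# Illustrative anchor points for the raw distance-estimation tolerance;
# the published [Daum09]-derived values would replace these.
DIST_M = np.array([50.0, 150.0, 300.0, 500.0])
RAW_TOL_M = np.array([10.0, 20.0, 35.0, 60.0])

def tolerated_error(distance_m, lo=8.0, hi=30.0):
    """Tolerated gap between expected and real distance to the lead car:
    linear interpolation between anchor points, clamped to [lo, hi]."""
    return float(np.clip(np.interp(distance_m, DIST_M, RAW_TOL_M), lo, hi))

def needs_replanning(expected_m, real_m):
    # A 'step' (speed-curve rebuild) is triggered when the discrepancy
    # exceeds the tolerance at the current real distance.
    return abs(expected_m - real_m) > tolerated_error(real_m)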
5 Traffic Simulation For evaluation of the model the scenario which was used for the driving simulator experiments was remodeled in the sub-microscopic traffic simulation PELOPS. As
depicted in fig. 1 on the right, the 1.8 km long test course is populated with foreign vehicles to create a stationary jam. The movement of the cars in the jam was programmed to be the same as in the driving simulation. The tests were conducted using a Software-In-the-Loop (SIL) topology (fig. 8), meaning that the Simulink model was introduced in the processing loop of PELOPS. During design time test runs can be made quickly using the SIL topology with Socket communication, while for later testing with multiple modified driver models the Simulink model is used in a compiled binary module which is loaded by PELOPS at runtime. The SIL interface provides detailed information about the ego-vehicle and some basic information about the surrounding foreign vehicles. Currently the only way to influence longitudinal vehicle dynamics with this interface is to forcibly set an acceleration value that can be enabled or disabled. When disabled, the normal PELOPS driver model controls the vehicle.
Fig. 8. Traffic simulation with Simulink model as Software-In-The-Loop model
For assessing the influence of such a model on a group of 20 cars instead of just one test vehicle, the group is evenly distributed on all three lanes of the highway, starting at the position where the single car was before, with a speed of 140 km/h. The experiment is repeated with varying percentages of equipped cars, with the cars to be equipped selected from the head of the test group. All other cars are controlled by the standard PELOPS driver model.
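The override mechanism of the SIL coupling can be summarized as follows. This is a schematic sketch, not the actual PELOPS interface; the real calls and data structures differ in detail, and the method names on decel_model are assumptions.

def sil_step(ego_state, decel_model, pelops_acceleration):
    """One Software-in-the-Loop step.

    ego_state           : stand-in for the data exposed by the SIL interface
                          (ego speed, distance to the jam, ...)
    decel_model         : the Simulink-style deceleration model
    pelops_acceleration : acceleration proposed by the standard PELOPS
                          driver model, which keeps running in parallel
    """
    if decel_model.is_active(ego_state):
        # Forcibly set the acceleration computed by the deceleration model.
        return decel_model.acceleration(ego_state)
    # When disabled, the normal PELOPS driver model controls the vehicle;
    # this hand-back is where the speed jumps discussed in Sect. 7 originate.
    return pelops_acceleration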
6 Experiment Analysis

The goal of the assistance system is to increase safety and fuel efficiency by providing early information about future situations in which a driver is forced to decelerate. Additionally, an early and smooth deceleration phase decreases the likelihood of new traffic jams and is favorable for recuperation in hybrid and electric vehicles. As there are numerous parameters to describe traffic flow, some selected ones shall be discussed here, in particular fuel consumption and maximum deceleration.
Table 1. Average maximum deceleration and average fuel consumption of test group

Percentage | Max. deceleration mean (m/s²) | Max. deceleration SD | Fuel consumption mean (liters) | Fuel consumption SD
0% | 5.6427 | 1.1557 | 0.2324 | 0.02162
5% | 5.9134 | 1.1326 | 0.2319 | 0.02265
10% | 6.6970 | 0.5212 | 0.2269 | 0.02024
20% | 6.5900 | 1.1665 | 0.2292 | 0.02116
50% | 6.5200 | 1.3540 | 0.2298 | 0.02445
70% | 6.5200 | 1.3540 | 0.2298 | 0.02445
100% | 6.5200 | 1.3540 | 0.2298 | 0.02445
For easier interpretation the data from table 1 is rendered as a bar diagram with standard deviation indicators in fig. 9.
Fig. 9. Average max. deceleration [m/s²], varying percentages of equipped cars
From percentages of 10% and higher, a significantly (α = 5%) higher maximum deceleration can be detected. This is somewhat unexpected, as the assistance system was designed to create smoother deceleration phases, while the results with our model show the opposite. Still, the results from the driving simulator experiments showed the expected smoothing effect regarding maximum deceleration and also the standard deviation of the acceleration. In terms of fuel consumption, no significant (α = 5%) improvement could be detected. Again, the driving simulator results showed a fuel saving potential of up to 47% (for this single traffic situation and course length, see [Popiv102]). The reasons for these poor experiment results are discussed in the following section.
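For reference, a comparison of this kind can be reproduced from the summary statistics in Table 1 with a Welch two-sample t-test. The group size per condition used below (the 20 cars of the test group) is an assumption, as it is not restated with the table.

from scipy import stats

# Welch t-test from summary statistics: 0% vs. 10% equipped cars.
# nobs=20 is an assumed group size, not taken from the paper's table.
t, p = stats.ttest_ind_from_stats(mean1=5.6427, std1=1.1557, nobs1=20,
                                  mean2=6.6970, std2=0.5212, nobs2=20,
                                  equal_var=False)
print(f"t = {t:.2f}, p = {p:.4f}")   # reject equality at alpha = 0.05 if p < 0.05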
7 Issues and Future Work

Both parameters discussed above (maximum deceleration, fuel consumption) do not show the effects that the driving simulator experiments with real drivers suggested. The main reasons for this are sudden speed changes resulting in excessive acceleration and fuel consumption. These occur when the model described above is switched off and the normal PELOPS driver model regains control over the vehicle. The PELOPS model keeps running in parallel to the deceleration model, but its acceleration output is overridden. As soon as the deceleration model is switched off, a large positive acceleration step is applied, as the PELOPS model has a higher desired speed (see fig. 10). With the current implementation it is not possible to overcome this issue without a substantial redesign of the Simulink controller or enhancements in the SIL interface. Regarding the SIL interface, a better parameter to control deceleration behavior might be the desired speed of the driver model instead of forcibly setting an acceleration. The other, less elegant, way would be to also implement a full car-following model in Simulink, which can be parameterized at model-switching events to prevent the current problems.
Fig. 10. Peaks in speed curve resulting from driver model switching
A last solution would be a changeover to a different traffic simulation which exposes a more flexible and transparent interface to the user than PELOPS. This could possibly be achieved with the open-source framework SUMO (described in detail in [Kraj06]) developed by the German Aerospace Center. Investigating the proposed solutions will be part of future work on this topic, as the current results are still not satisfying. Another point for improvement would be a switch from a distance-based regeneration of the deceleration curve to a TTC-based approach, which might be a more realistic indicator (see [Kiefer06]). Also, the driving simulator experiment should be conducted a second time in a different hardware mockup but with the same software to validate or falsify the presence of the steps in the speed curves during deceleration.
Acknowledgements. This work was partially supported by BMW Forschung und Technik GmbH. Markus Duschl contributed the birdeye visualization, Darya Popiv and Kerstin Sommer provided substantial support for the driving simulator experiments.
References
[Duschl10] Duschl, M., et al.: Birdeye visualisation for assisting prospective driving. In: FISITA World Automotive Congress, Budapest (2010)
[Popiv10] Popiv, D., et al.: Effects of assistance of anticipatory driving on driver's behaviour during deceleration phases. In: Humanist Conference, Berlin (2010)
[Popiv102] Popiv, D., et al.: Reduction of fuel consumption by early anticipation and assistance of deceleration phases. In: FISITA World Automotive Congress, Budapest (2010)
[Bennet94] Bennett, C.: Modelling driver acceleration and deceleration behaviour in New Zealand. N.D. Lea International, Vancouver (1994)
[Daum09] Daum, S.O., Hecht, H.: Distance estimation in vista space. Attention, Perception, and Psychophysics 71, 1127–1137 (2009)
[PELOPS10] Forschungsgesellschaft Kraftfahrwesen mbH: PELOPS White Paper, Aachen (2010)
[Kiefer06] Kiefer, et al.: Time-to-collision judgements under realistic driving conditions. Human Factors 48, 334–345 (2006)
[Kraj06] Krajzewicz, et al.: The Open Source Traffic Simulation Package SUMO. In: RoboCup Infrastructure Simulation Competition, Bremen, Germany (2006)
Predicting the Focus of Attention and Deficits in Situation Awareness with a Modular Hierarchical Bayesian Driver Model Claus Möbus*,**, Mark Eilers*,**, and Hilke Garbe Learning and Cognitive Systems / Transportation Systems C.v.O University / OFFIS, Oldenburg, Germany http://www.lks.uni-oldenburg.de/ {claus.moebus,hilke.garbe}@uni-oldenburg.de, [email protected]
Abstract. Situation Awareness (SA) is defined as the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future [1]. Lacking SA or having inadequate SA has been identified as one of the primary factors in accidents attributed to human error [2]. In this paper we present a probabilistic machine-learning-based approach for the real-time prediction of the focus of attention and deficits of SA using a Bayesian driver model as a driving monitor. This Bayesian driving monitor generates expectations conditional on the actions of the driver which are treated as evidence in the Bayesian driver model. Keywords: Focus of attention, deficits in situation awareness, Bayesian autonomous driver model, Bayesian driving monitor, modular hierarchical Bayesian driver model, learning of action-relevant percepts.
1 Introduction

Driver behavior should be regarded as an instance of Cognition in the Wild [3]. Its study is not possible in highly controlled laboratory experiments but only in natural (real or simulated) traffic scenarios. A natural scenario includes all hazards, even when they are rare events. Because of their clinical cleanness, most simulated traffic scenarios lack this kind of naturalness. They are not ecologically valid with respect to experiments and cues: the external validity of simulation experiments is doubtful. This is especially important when studying SA. Scenarios without hazards or with a biased set of hazards cause biased theoretical hypotheses and empirical results. In this paper we assume that all SA studies planned according to our proposals are ecologically valid. Unfortunately, this was not the case for the simulated driving studies discussed here.
* Project Integrated Modeling for Safe Transportation (IMOST) sponsored by the Government of Lower Saxony, Germany under contracts ZN2245, ZN2253, ZN2366.
** Project ISi-PADAS funded by the European Commission in the 7th Framework Program, Theme 7 Transport FP7-218552.
2 Early Models of Looking and Driving

There is a wealth of reported eye movement strategies and relevant perceptual cues which are sometimes contradictory or ambiguous [4, ch. 7]. When driving on empty straight roads, two cues are required for proper lane maintenance: the location relative to the lane edges at some preview distance and the splay angles (Fig. 1) that the lane edges make with the driver's heading [5, 6].

Fig. 1. Steering corrections in the UK (right-hand-drive car in the left lane) as a function of heading (bold arrow), splay angles, and location (displacement) at preview distance (horizontal dotted line; adapted from [4, p.118]). Splay angles in reality differ from those in Fig. 1.

Only one corrective rotation is needed to reduce the risky situation 1.b to the normal position 1.a, or to go from 1.c to 1.a. Two corrections are necessary to reduce 1.d to 1.b and then to 1.a, or to reduce 1.e to 1.c and then to 1.a. Changing lanes involves a three-stage maneuver: a turn out-of-lane (1.a→1.c), a period of oblique travel, and then a counterturn (1.c→1.a) in the new lane [4]. The quality of the maneuver was reported to improve when a slower vehicle ahead was the reason for the lane change and could be used as an inertial cue for perception [7].

Steering around (gentle) bends is more complicated, because splay angles change with distance and during the drive. The gaze sampling strategy assumes that drivers choose a point on the future path at a constant preview distance and use the horizontal eccentricity of this point to adjust steering [4, 8, 9]. The steering wheel angle is a function of the path curvature, which can be estimated by the angle between gaze direction (to a preview point P at a distance D) and heading. According to flow-field theory the driver samples cues in the form of flow lines from various parts of the view field [4, 8-10].

Winding country roads require anticipatory estimates of the future curvature from road features or flow lines. Land & Lee [11] found that drivers spend much time
looking at the tangent point of the upcoming bend. This is the moving point on the inside where the driver's gaze line is tangential to the road edge. It is nearly stationary in the flow field when the curvature is constant. Similar findings in favor of this strategy were reported [12-15]. Other studies favor the hypothesis that it is not the tangent point but the future path that is viewed more frequently [16-18]. Either way, the steering angle seems to be a function of the gaze angle with a time lag of ca. 0.8 sec [4, Fig. 7.5b]. This seems to be a planned delay to enhance the anticipatory horizon of the driver. When the course is not too demanding, there are off-road glances between 0.5 and 1 sec [19], which provide evidence for lateral control with peripheral [20] or ambient [21] vision.

Computational models of steering behavior have been engineered since 1978 [22]. Donges' model uses feedback and feed-forward or anticipatory information from near and distant parts of the road. There is empirical evidence for this particular combination of information [23, 24]. The two-point control model of steering [25] rests on the same combination. The change of the steering wheel angle is a function of the change of the angles between heading and two preview points (near and far point) and of the angle between heading and the near point.

To summarize, there is universal agreement that fluent driving on winding roads requires visual information from both near and distant road regions. The near region involves feedback from the angular position and velocity of the lane boundaries. There is, however, less agreement about which features in the far road region are necessary or sufficient to provide prospective feed-forward information (Mars, 2008) [4, p.129]. During natural driving, the gaze may be intermittently directed to the tangent point (TP) and the road ahead as a way to determine the future path and keep steering smooth at the same time [14]. Other scenarios like braking, overtaking, entering highways, driving in cities, multitasking, etc. provide and require different cues and other, more interleaved visual strategies. These are at the moment only rudimentarily understood or formalized. So we developed a nearly assumption-free modeling strategy which rests on machine learning, data mining, and stochastic models. One of the few assumptions is that relevant percepts or cues could be learnt on-the-fly by stochastic structure-learning techniques.
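The geometric core of these gaze-based strategies is compact: for a preview point at distance D seen at eccentricity θ from the heading, the circular arc from the current position, tangent to the heading and passing through that point, has radius R = D / (2 sin θ) and hence curvature κ = 2 sin θ / D. A small illustration follows; the steering gain in it is a free placeholder, not an empirical value.

import math

def path_curvature(theta_rad, preview_distance_m):
    # Circular arc tangent to the heading through a point at distance D
    # and eccentricity theta: kappa = 2 * sin(theta) / D.
    return 2.0 * math.sin(theta_rad) / preview_distance_m

def steering_command(theta_rad, preview_distance_m, gain=15.0):
    # Steering wheel angle as a simple linear function of the estimated
    # path curvature; the gain is illustrative only.
    return gain * path_curvature(theta_rad, preview_distance_m)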
3 Hierarchical Modular Bayesian Autonomous Driver Models In our earlier research [26-28] we developed a Bayesian Autonomous Driver (BAD) model based on Dynamic Bayesian Networks proposed in [29] (Fig. 2). The model assumed that one single model is sufficient to generate actions for longitudinal and lateral control in an arbitrary virtual course. Now, we have a more sophisticated view of human competence. We assume and model skill hierarchies (Fig. 3) of human drivers with a new hierarchical modular probabilistic architecture [30, 31] (Fig. 4). According to our skill hierarchy (Fig. 3) the maneuver Overtaking is partitioned into the three behaviors PassOut, PassCar, and PassIn. If we had reused Land & Tatler’s proposal of partitioning the ChangeLane maneuver we would get a partition of seven behaviors for the Overtaking maneuver.
Fig. 2. Reactive Bayesian Autonomous Driver (BAD) model based on a 2-time-sliced dynamic Bayesian network
Fig. 3. Skill hierarchy of a driver with partitioning the overtaking maneuver into the three behaviors PassOut, PassCar, and PassIn
Fig. 4. Implementation of the skill Maneuverj by the assemblage of a Bayesian Gating Model Gj and submodels Bj and Ajk with BIC-relevant peephole percepts P
4 Learning Relevant Percepts in Hierarchical Modular Bayesian Autonomous Driver Models

Our BAD models are able to drive autonomously in the TORCS simulation environment on a virtual country road, including passing or overtaking maneuvers without opposing traffic (Fig. 5). They rest on the assumption that there is considerable uncertainty about the relevant percepts in natural driving maneuvers, so the relevant percepts have to be identified during the modeling process.

To model the driver's ambient and foveal vision channels [21] we used 205 different time-independent (estimates of speed, distance and angle) and time-dependent percepts (TDPs). TDPs are percepts similar but not identical to Lee's time-to-x (tau) measures [32-34]. TDPs (e.g. the SLA angles in Fig. 6 and the SCA angles in Fig. 7) are defined on points of reference or preview in the scenario. One point of reference is the current position; the points of preview are located on the future course and will be reached by the driver in a certain amount of time depending on the speed of his vehicle. A more detailed description of the variables is given in [31]. This set of variables proved to be informative for lateral and longitudinal control by machine-learning standards.

Single time-dependent angles tended to over-anticipate the upcoming course of the road when approaching sharp curves (e.g. hairpins) at high speed [26]. Fanning out distance sensors from the driver to the street boundaries [28] solved the problem of over-anticipation but was too restrictive and not able to anticipate the course of the road in curves. Now we use sets of time-dependent and time-independent angles to encode the vision field of the driver and to avoid the problem of over-anticipation.
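A short sketch makes the speed dependence of the preview points explicit; the 0.2 s to 10.0 s grid corresponds to the SLA/SCA definitions, while the helper name is ours.

def preview_distances(speed_mps, horizons_s=None):
    """Ground distances of the speed-dependent preview points: each point
    will be reached after a fixed time horizon, so it moves out with speed."""
    if horizons_s is None:
        horizons_s = [round(0.2 * k, 1) for k in range(1, 51)]  # 0.2 s ... 10.0 s
    return [speed_mps * t for t in horizons_s]

# At 25 m/s the SLA(0.8s) preview point lies 20 m ahead; at 10 m/s it lies
# only 8 m ahead, while the fixed-distance FLA/FCA points stay at 5 m, ..., 250 m.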
In a training phase we selected percepts relevant for the actions steering and de-/acceleration with a step-wise structure-learning technique using the Bayesian Information Criterion (BIC). We assume, as said before, that the training phase is ecologically valid with respect to all hazards. For each skill in the hierarchy the procedure selected a minimal set of relevant ('peephole') percepts PAjk and PBj (Fig. 4, 6, 7). In total, 41 of the 205 percepts were identified as relevant for Action- and Behavior-Classification-models [31].
Fig. 5. Experimental setup with TORCS course, variables of interest, and data classification
5 Predicting the Focus of Attention and Deficits in Situation Awareness with the Bayesian Driver Model as a Driving Monitor

After the training phase the BAD model can be used as a multi-purpose driving monitor. First, it can be used to check whether the drive can be considered 'normal' or 'abnormal' according to the standards of a correctly selected action model (Fig. 4, 6). This can be checked by computing the likelihood of the drive. The selection of action models is done by the Behavior-Classification model (Fig. 4, 7) and its peephole percepts (e.g. NCA and SLA0.2s in Fig. 7).
Fig. 6. Action-model PassOut (one speed-, two distance-, and time-based peephole percepts)
Fig. 7. Behavior-Classification-model Overtaking with one distance-based and one time-based peephole percept
Second, predictions of the focus of attention can be obtained. Because the Behavior and the Action-model (Fig. 4) are conceptualized and validated as efficient Bayesian classifier models we are able to predict the relevance or adequacy of percepts on the basis of the driver actions by asking the questions P(Percepts | Actions). In the case of driving the questions are of course reversed in direction to P(Actions | Percepts). Third, predictions of deficits in situation awareness can be obtained during the same monitoring process. There are two alternatives in SA deficit prediction. In the first case we predict that in certain behaviors or maneuvers some percepts are not relevant in the view of the action model (question marks in Figures 6-9). We expect that we can introduce hazards (e.g. vehicles or pedestrians appearing on the ?-marked sides of the car) which will not be noticed by the driver.
Fig. 8. Inadequate percept-action mapping behavior according to the PassOut-Action model
Fig. 9. Adequate percept-action mapping behavior according to the PassOut-Action model
In the second case the prediction of deficits in SA rests on an abnormality or likelihood check of the driver behavior according to the standards of the selected action model. It is assumed that drivers sometimes choose actions not adequate for the scenario. This could be due to the fact that they substitute real-world percepts by mental assumptions or anticipations. E.g. if we assume that the driver chooses the same speed in Figure 6 and 8 we infer with the help of the driver model, that the relevant percepts are identical.
But in Figure 8 the relevant percepts SLA are inadequate for the chosen behavior because they are occluded by the bus. The driver is driving too fast according to the standards of the PassOut-Action model. An adequate percept-action mapping according to the PassOut-Action model is depicted in Figure 9. A preference for slower driving guarantees that the relevant time-dependent percepts are not "out of sight". Further empirical studies could reveal the necessity of a new PassOut-Schoolbus-Action model with a new percept-action mapping (Figure 9).
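A driving monitor along these lines reduces to a running likelihood computation. The sketch below is schematic, and action_model.prob is a hypothetical interface to the DBN inference P(Actions_t | Actions_{t-1}, Percepts_t).

import math

def drive_log_likelihood(trace, action_model):
    """Log-likelihood of an observed drive under the selected Action-model.

    trace: sequence of (prev_actions, percepts, actions) evidence tuples
    """
    return sum(math.log(action_model.prob(actions, prev_actions, percepts))
               for prev_actions, percepts, actions in trace)

def reduced_sa_suspected(trace, action_model, threshold):
    # Flag a potential SA deficit when the drive is too unlikely under the
    # scenario-relevant action model; the threshold would have to be
    # calibrated on drives judged 'normal'.
    return drive_log_likelihood(trace, action_model) < threshold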
6 Summary

The BAD MoB model can be used as a monitor of the driver's behavior. First, it can be used to compute the likelihood of the actual driving behavior under the assumption of a correctly selected action model. Second, it can be used to predict the focus of attention on the basis of driver actions by answering the questions P(Percepts | Actions). Third, it can be used to predict deficits in SA. Behavior and actions which seem unlikely in the view of the scenario-relevant valid action model are indicators of reduced SA. At the same time, the abduced BIC-relevant percepts should be checked as to whether they can really be observed in the driving situation. If not, this gives a hint that the driving behavior is inadequate for the situation. Furthermore, the abduced non-relevant percepts give hints where hazards could intrude into the local vicinity of the vehicle unnoticed by the situational attention system of the driver.
References
1. Endsley, M.R.: Toward a theory of situation awareness in dynamic systems. Human Factors 37(1), 32–64 (1995)
2. Horswill, M.S., McKenna, F.P.: Driver's Hazard Perception Ability: Situation Awareness on the Road. In: Banbury, Tremblay (eds.) A Cognitive Approach to Situation Awareness: Theory and Application, pp. 155–172. Ashgate Publishing, Ltd. (2004)
3. Hutchins, E.: Cognition in the Wild. MIT Press, Cambridge (1996)
4. Land, M.F., Tatler, B.W.: Looking and Acting: Vision and eye movements in natural behavior. Oxford University Press, Oxford (2009), ISBN 978-0-19-857094-3
5. Beall, A.C., Loomis, J.M.: Visual control of steering without course information. Perception 25, 481–494 (1996)
6. Riemersma, J.B.: Visual control during straight road driving. Acta Psych. 48, 215–225 (1981)
7. Macuga, K.L., Beall, A.C., Kelly, J.W., Smith, R.S., Loomis, J.M.: Changing Lanes: inertial information facilitates performance when visual feedback is removed. Experimental Brain Research 178, 141–150 (2006)
8. Wann, J.P., Land, M.F.: Steering with or without the flow: Is the retrieval of heading necessary? Trends in Cognitive Science 4, 319–324 (2000)
9. Wann, J.P., Wilkie, R.M.: How do we control high-speed steering? In: Vaina, L.M., Beardsley, S.A., Rushton, S.K. (eds.) Optical Flow and Beyond, pp. 371–389. Kluwer Academic Publishers, Dordrecht (2004)
10. Lee, D.N., Lishman, R.: Visual control of locomotion. Scand. J. Psych. 18(1), 224–230 (1977)
11. Land, M., Lee, D.N.: Where we look when we steer. Nature 369, 742–744 (1994)
12. Underwood, G., Chapman, P., Crundall, D., Cooper, S., Wallen, R.: The visual control of steering and driving: where do we look when negotiating curves? In: Gale, A.G. (ed.) Vision in Vehicles VII, pp. 245–252. Elsevier, Amsterdam (1999)
13. Chattington, M., Wilson, M., Ashford, D., Marple-Horvat, D.E.: Eye-steering coordination in natural driving. Experimental Brain Research 180, 1–14 (2007)
14. Mars, F.: Driving around bends with manipulated eye-steering coordination. J. Vision 8(11) (2008)
15. Kandil, F.I., Rotter, A., Lappe, M.: Driving is smoother and more stable when using the tangent point. Journal of Vision 9(1), 1–11 (2009)
16. Robertshaw, K.D., Wilkie, R.M.: Does gaze influence steering around a bend? Journal of Vision 8(4), 18, 1–13 (2008)
17. Wilkie, R.M., Wann, J.P.: Driving as night falls: The contribution of retinal flow and visual direction to the control of steering. Current Biology 12, 2014–2017 (2002)
18. Wilkie, R.M., Wann, J.P.: Eye-Movements Aid the Control of Locomotion. Journal of Vision 3, 677–684 (2003)
19. Yilmaz, E.R., Nakayama, K.: Fluctuation of attention levels during driving. Investigative Ophthalmology and Visual Science 36, 940 (1995)
20. Mourant, R.R., Rockwell, T.H.: Mapping Eye-Movement Patterns to the Visual Scene in Driving: An Exploratory Study. Human Factors 12(1), 81–87 (1970)
21. Horrey, W.J., Wickens, C.D., Consalus, K.P.: Modeling Driver's Visual Attention Allocation While Interacting With In-Vehicle Technologies. J. Exp. Psych. 12, 67–78 (2006)
22. Donges, E.: A 2-level model of driver steering behavior. Hum. Factors 20, 691–707 (1978)
23. Land, M.F.: The Visual Control of Steering. In: Harris, L.R., Jenkin, M. (eds.) Vision and Action, pp. 163–180. Cambridge University Press, Cambridge (1998)
24. Land, M., et al.: Which Parts of the Road Guide Steering? Nature 377, 339–340 (1995)
25. Salvucci, D.D., Gray, R.: A Two-Point Visual Control Model of Steering. Perception 33, 1233–1248 (2004)
26. Möbus, C., Eilers, M.: Further Steps towards Driver Modeling According to the Bayesian Programming Approach. In: Duffy, V.G. (ed.) ICDHM 2009. LNCS, vol. 5620, pp. 413–422. Springer, Heidelberg (2009)
27. Möbus, C., Eilers, M.: Integrating Anticipatory Competence into a Bayesian Driver Model. In: Cacciabue, C., et al. (eds.) Human Modeling in Assisted Transportation. Springer, Heidelberg (in press), ISBN-13: 978-8847018204
28. Möbus, C., Eilers, M.: Prototyping Smart Assistance with Bayesian Autonomous Driver Models. In: Chong, N.Y., Mastrogiovanni, F. (eds.) Handbook of Research on Ambient Intelligence and Smart Environments, IGI Global (in press), ISBN-13: 9781616928575
29. Forbes, T., Kanazawa, H.K., Russell, S.: The BATmobile: Towards a Bayesian Automated Taxi. In: IJCAI 1995, vol. 2, pp. 1878–1885 (1995)
30. Eilers, M., Möbus, C.: Learning of a Bayesian Autonomous Driver Mixture-of-Behaviors (BAD-MoB) Model. In: Duffy, V.G. (ed.) Advances in Applied Digital Human Modeling, pp. 436–445. CRC Press, Taylor & Francis Group, Boca Raton (2010), ISBN 978-1-4398-3511-1
31. Eilers, M., Möbus, C.: Learning the Relevant Percepts for Modular Hierarchical Bayesian Driver Models using the Bayesian Information Criterion, HCII (in press, 2011)
32. Lee, D.N.: A theory of visual control of braking based on information about time-to-collision. Perception 5, 437–459 (1976)
33. Lee, D.N.: How Movement is Guided (2005), http://www.perception-in-action.ed.ac.uk/PDF_s/Howmovementisguided.pdf
34. Pepping, G.J., Grealy, M.A.: Closing the Gap: The Scientific Writings of David N. Lee. Lawrence Erlbaum Associates, Mahwah (2007), ISBN-13: 978-0805856194
The Two-Point Visual Control Model of Steering New Empirical Evidence Hendrik Neumann and Barbara Deml Otto-von-Guericke-Universität Magdeburg, Institute of Ergonomics, Manufacturing Systems and Automation (IAF), Universitätsplatz 2, D-39106 Magdeburg, Germany {Hendrik.Neumann,Barbara.Deml}@ovgu.de
Abstract. Formal models of human steering behavior can enhance our understanding of perceptual and cognitive processes involved in lateral control. One such model is the two-point visual control model of steering proposed by Salvucci and Gray [8]. An experiment was conducted to test one of its central assumptions, namely that people use information coming from only two locations, a near region about 8m in front of the car and a far region 0.9s down the road, for lane keeping. 42 subjects completed a simulated driving task; road visibility was either unrestricted or reduced according to the assumptions of the two-point model. Additionally, the subjects could either freely choose where to look or had to fixate a target located in the far region. Analysis of steering precision data showed that reduced visibility did not reduce steering precision, thus lending support to the near/far region assumption of the two-point model. Keywords: Driver Modeling, Lane Keeping, Near/Far Point, Two-Point Visual Control Model of Steering.
1 Introduction

Keeping a car on the lane is one of the fundamental tasks faced by human drivers. To gain a better understanding of the mechanisms of lane keeping, several models of steering behavior have been proposed, using control theoretic (for an overview, see [1]) or computational (e.g., [2], [3]) approaches. Most of these models cannot claim to realistically model the perceptual and cognitive processes that enable human drivers to steer around curves and generally stay on the road, though. For example, traditional control theoretic models, like the one proposed by Donges [4], tend to ignore the peculiar limitations of human perception and information processing, using variables like curvature which are difficult for human drivers to judge correctly (cf. [5], [6]). In recent years, new approaches have been submitted that explicitly use findings from perceptual psychology (e.g. [7]), thus trying to overcome the limitations of the more traditional approaches. One such model is the two-point visual control model of steering introduced by Salvucci and Gray [8] (also see [9]). In the following section, an overview of this model and its underlying assumptions is given, and it is shown that an empirical test for the model's basic premises is still lacking. The rest of this paper describes an experiment that tried to explicitly test one of its central
assumptions. A description of the method used and the results are given in section 3. A discussion of these results can be found in section 4.
2 The Two-Point Visual Control Model of Steering

The two-point visual control model of steering (henceforth two-point model) likens human steering behavior to a tracking task and is based on two premises [8]. The first premise states that information relevant for lane keeping comes from two regions of the visual scene, one located near to the car (near point) and one further down the road (far point). The second basic assumption is derived from a study reported by Land and Lee [10], who found that human drivers tend to look at the tangent point while steering around bends.

Based on these premises, the two-point model assumes that by fixating a far point, which in curves coincides with the tangent point, the visual angle to that point can be assessed by human drivers. The visual angle towards the near point, located in the middle of the lane, can be judged likewise; see Figure 1 for an illustration of these points and their location on the road. The amount of steering is linked to these angles in a way that can be formulated as a PI-controller (cf. Formula 1): the drivers try to minimize changes in the angles to the near and far point, and additionally steer to keep the angle to the near point at zero (indicating a centered position in the lane).

Fig. 1. Far (cross) and near (circle) point used by the two-point visual control model of steering on straights (left) and in curves (right) [8]

Formula 1 gives the equation for the described controller in its continuous form:

$$\dot{\varphi} \;=\; k_{far}\,\dot{\theta}_{far} + k_{near}\,\dot{\theta}_{near} + k_{I}\,\theta_{near}. \qquad (1)$$

Here, $\dot{\varphi}$ is the change in steering angle, $\dot{\theta}_{far}$ and $\dot{\theta}_{near}$ are the changes in the angles to the far and near point, respectively, and $\theta_{near}$ is the angle to the near point. The coefficients $k_{far}$, $k_{near}$ and $k_{I}$ are used to scale the contributions of each angle information. A discrete version of this controller (and thus the two-point model) has been implemented in the cognitive architecture ACT-R [11] to form a computational driver model [12].

Originally, Salvucci and Gray did not specify if the near point information is acquired through fixation of the near point or peripherally, only stating that fixation
might not be necessary [8]. The implemented model (its LISP code has been made publicly available by the author at http://www.cs.drexel.edu/~salvucci/threads.html), though, explicitly fixates only the far point, so all further discussion of the model is based on this more specific version.

2.1 Empirical Evidence Concerning the Near/Far Point Assumption

Empirical evidence for this assumption comes from a study conducted by Land and Horwood [13]. The authors used a scenario where only limited visual information was present for steering: instead of showing the whole road, only segments subtending a visual angle of 1° were shown to the participants [13], [14]. Either one or two such segments were visible, which were varied with regard to their position. With only one segment visible, steering precision (measured as the reciprocal of the standard deviation of lane position, measured from lane center) was worse than when the full road was displayed. When an additional segment was shown, performance increased; optimal driving precision was reached when one segment was presented about 4° down from the horizon and the other near to the (virtual) car, about 7° down from the horizon. In this condition, driving performance was as good as with full road visibility. When varying speed, the optimal near segment stayed in the region 7° down from the horizon (or approx. 8.0 to 9.0m ahead); the optimal far region varied in distance, though. When measuring the preview time, this was constant at about 0.8 to 0.9s. Thus, it seems that steering requires visual information about the road trajectory from one point near to the car and another point approximately 0.8s to 0.9s down the road [13], [14], [15].

While the described finding supports the basic assumption of the two-point model, Land and Horwood report only data from three participants each conducting five drives [13]. Additionally, Chatziastros, Wallis and Bülthoff could not reproduce the results found by Land and Horwood [16]. Using a similar method, they registered no improvement of driving performance (lateral deviation) for adding a second visible road segment beyond the one near to the car. Furthermore, in the experiments reported by Land and Horwood, participants were free to direct their gaze wherever they wanted [13]. Though about 50% of all fixations were found to fall into the far region, the authors did not prevent glances to the near area, which could possibly have aided their subjects in keeping the lane. As the two-point model in its implemented form explicitly states that normal steering performance is possible even if only the far point is fixated, a rigorous test of the near/far point assumption is necessary to validate the model.

2.2 Implications of the Tangent Point Assumption

Which point in the far region is actually fixated by drivers? In this regard, the two-point model builds on the results of Land and Lee's analysis of drivers' gaze during curve negotiation [10]. The authors found that human drivers tend to fixate an area of the road close to the tangent point. Land and Lee proposed that the visual angle to the tangent point can be used to estimate curvature, which then can be used to control steering behavior [10]. This proposal suffers from the same problems as some other
control-theoretic approaches, namely that humans have difficulty judging curvature correctly [5], [6]. According to the two-point model, however, there is no need to estimate curvature; the driver simply needs to fixate the tangent point and keep the visual angle to it stable in order to smoothly follow the curve (cf. the PI controller stated above). On the other hand, Robertshaw and Wilkie [17] (see also [7], [18], [19]) propose that for optimal steering drivers should fixate not the tangent point but a point on the future trajectory of their car. As was shown in [20], drivers tend to steer towards the fixated point, even if instructed to stay in the middle of the lane. According to this model, fixating a point located on the lane center (center point) would be the best strategy when trying to keep the car in the middle of the lane. The PI controller of the two-point model could actually accommodate both the tangent point and the center point for steering control [8]. When testing the near/far point assumption while holding drivers' gaze fixated on the far region, it is more convenient to use the center point, though. The vertical position of the fixation target should be fixed to that of the far region; as the tangent point's vertical position depends on curvature and the car's position in the lane, this cannot be guaranteed by experimental manipulation alone. Additionally, the implemented version of the two-point model uses the center point on straights. As there would be no difference in fixation point in this instance, testing the assumptions could only be done on curves, lessening the external validity of the study. To conclude, fixating a point located centrally on the lane (center point) might actually be the best strategy for drivers trying to keep their car located centrally in the lane, and it allows for locking the fixation target to the same far point region in curves and on straights. A test of the near/far point assumption using restricted gaze could therefore use this point instead of the tangent point. This still allows for testing the near/far point hypothesis, though not the tangent point assumption of the model.
3 Testing the Near/Far Point Assumption of the Two-Point Model

The experiment reported in the following section had the goal of testing the assumption that two visible road segments are indeed sufficient for normal steering performance (defined as driving precision and measured by the reciprocal of the SD of lateral deviation and the SD of heading error) when the gaze is exclusively directed at the "far" segment, compared to a free-gaze baseline condition. To independently assess the impact of reduced visibility and restricted gaze, a 2 (visibility restricted vs. unrestricted) × 2 (center point fixation vs. free gaze) experimental design was chosen. The hypothesis derived from the two-point model assumptions, combined with the evidence concerning gaze direction [19], was as follows: reducing road visibility does not lead to decreased driving precision (i.e., significantly higher SD of lateral deviation and heading error). If drivers show impaired steering precision in the limited visibility condition, this would directly invalidate the two-point model.
3.1 Method

A simulator study analogous to the Land and Horwood experiment [13], [14] was devised to test the stated hypothesis. A Logitech G27 steering wheel mounted on a table was used as a car mockup, with participants sitting in a real car seat in front of the table. The simulation was displayed on a projection screen (2m wide by 2m high) located approximately 2.5m in front of the steering wheel. For projection, an Acer P1165 projector set to a resolution of 1280 x 1024 pixels was used. It was located 2m behind and above the driver's seat, with the experimenter and the computer running the simulation software located to one side. The simulated single-lane road had a length of 1360m and a width of 3m and was created using the driving simulation software SILAB. Only white road edge markings of 0.1m width on a black background were visible. The road was either displayed in its entirety (full visibility condition) or with only two horizontal segments visible (reduced visibility condition). Those segments were adjusted to be 1° high and were located at heights corresponding to the near point (8.3m in front of the car) and the far point (0.85s ahead of the car). Thus the positions of the visible bars were set to the values (as derived from [15]) allowing optimal performance. The track consisted of a series of four left and four right curves of 135m length, with 50m of straight road in between the bends, and straights 150m long at the start and the end of the track. A radius of 47m was chosen for the curves, resulting in a challenging driving task. Gaze fixation was regulated by the use of pink, circular fixation targets displayed at the lane center in the far region. The road configuration for all conditions is displayed in Figure 2. In order to test compliance with the fixation instructions, the point of gaze was recorded using a Dikablis eye tracking system, synchronized with the simulation so that each logging step of the simulation recorded the corresponding information about gaze direction. To allow for automated mapping of eye tracker gaze coordinates to simulation window coordinates, two geometric markers positioned to the left and right of the lane were displayed along with the road (see Figure 2). Participants were 42 students of Otto-von-Guericke-Universität Magdeburg. After a welcome by the experimenter, each subject signed an informed consent form and completed a questionnaire assessing demographic variables (age, gender), driving experience (time since the driving license was acquired, yearly mileage), and familiarity with driving games; this was used to exclude novice drivers from the study. Next, the participants were briefed about their task in the experiment, followed by four practice trials of 5 minutes' duration each. Those included all visibility and fixation conditions, but the track differed from that used in the experimental trials. The practice trials were followed by calibration of the eye tracking device. After the calibration, each participant drove all four experimental tracks, each preceded by a repetition of the relevant instructions. The trials were presented in random order to avoid systematic bias due to additional practice. Questions about the study could be asked at the end of the experiment. The whole process took less than one hour per participant. All subjects were explicitly told to keep as close to the lane center as possible. As dependent variables, three lane keeping performance measures were used (cf. [20]):
mean lateral position (relative to the lane centerline), standard deviation of lateral position, and standard deviation of the heading error. For computation of mean lateral position, the absolute deviation from the centerline was used. This was done to avoid lateral deviations in left and right curves cancelling each other out due to sign differences, which would result in a mean lateral deviation near zero even in subjects showing marked cutting of curves. The relevant data (lane position in m, heading error in rad) was logged automatically by the simulation software at a rate of 60 Hz.
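As a sketch, the three dependent measures can be computed from the logged time series as follows (assuming NumPy arrays of signed lane position in m and heading error in rad, sampled at 60 Hz; the variable names are illustrative):

```python
import numpy as np

def lane_keeping_measures(lane_pos, heading_err):
    """lane_pos: signed lateral offset from the lane centerline (m);
    heading_err: heading error (rad)."""
    # Absolute deviation avoids left/right curve sign cancellation.
    mean_lat = np.mean(np.abs(lane_pos))
    # Steering precision = reciprocal of the standard deviations.
    prec_lat = 1.0 / np.std(lane_pos, ddof=1)
    prec_head = 1.0 / np.std(heading_err, ddof=1)
    return mean_lat, prec_lat, prec_head
```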
Fig. 2. Road configuration including two markers for the eye tracker and fixation circles for the two fixation conditions center point (left) and free gaze (right). The top row shows the reduced, the bottom row the full road visibility conditions. Note that the fixation targets were originally displayed in pink color to increase their saliency.
3.2 Results

The data from 8 participants had to be excluded from the analysis due to lane exceedances (resulting in extreme values for the lane deviation and/or loss of the fixation point) or failure to comply with the fixation instructions. The remaining sample consisted of 19 male and 15 female subjects (total N = 34, mean age = 25.62 years, SD = 4.29, range 19 to 39 years), all of whom reported at least two years of driving experience and only passing familiarity with computer racing games. The first and last 100m of the track were excluded from the analysis because the car needed some time to accelerate to 15m/s after each trial start and the imminent end of the road might have resulted in odd steering behavior from the participants. To test the stated hypothesis, two repeated measures ANOVAs were conducted on the remaining
data. Table 1 gives the mean steering precision measures for the four experimental conditions and their standard deviations (SD). Precision descriptively increased both with reduced visibility and in the center point fixation condition.

Table 1. Mean and SD of steering precision (reciprocal of the SD of lateral deviation in m and heading error in rad), listed separately for each of the four experimental conditions

Fixation       Visibility      1/SD lateral deviation    1/SD heading error
                               Mean      SD              Mean      SD
free gaze      unrestricted    3.577     1.162           57.214     9.045
free gaze      restricted      3.888     1.335           74.991    19.076
center point   unrestricted    3.844     1.381           68.156    13.186
center point   restricted      3.955     1.366           77.085    16.992
For the reciprocal of the SD of lateral deviation, the ANOVA revealed no significant main effect of fixation, F(1, 33) = 1.44, p = .24 (η² = .04), or of visibility condition, F(1, 33) = 1.39, p = .25 (η² = .04). The interaction yielded no significant effect either, F(1, 33) < 1, p = .56 (η² = .01). For the reciprocal of the SD of heading error, the repeated measures ANOVA showed significant main effects of fixation, F(1, 33) = 17.35, p < .001 (η² = .35), and visibility, F(1, 33) = 36.17, p < .001 (η² = .52). The interaction effect was significant, too, F(1, 33) = 9.49, p < .01 (η² = .22). As can be seen in Figure 3, the significant effects resulted from the better steering precision in the restricted visibility and center point fixation conditions, with the worst performance shown in the baseline condition (full visibility, free gaze).
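An equivalent 2 × 2 repeated-measures ANOVA can be sketched as follows, assuming the per-subject precision scores are available in a long-format table (the file and column names are illustrative; the analysis software used by the authors is not stated):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Expected columns (illustrative): subject, fixation ('free'/'center'),
# visibility ('unrestricted'/'restricted'), precision (e.g., 1/SD of heading error)
df = pd.read_csv("steering_precision.csv")

result = AnovaRM(df, depvar="precision", subject="subject",
                 within=["fixation", "visibility"]).fit()
print(result)  # F values, degrees of freedom and p for both main effects and the interaction
```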
Fig. 3. Steering precision results for the four experimental conditions. In a), the marginal means for the reciprocal of the SD of heading error are reported (in 1/rad) with standard error bars shown; in b), the marginal means of the reciprocal of the SD of lateral deviation (incl. standard error bars; in 1/m) are reported.
4 Discussion

The reported experiment was conducted to test the central assumption of the two-point visual control model of steering, namely, that lane information coming from one region near to the car and one approximately 0.9s down the road is sufficient for precise steering, even with eye fixations falling only into the far region. The results indeed support this hypothesis, as steering precision (as measured by the reciprocal of the SD of lateral deviation and heading error) was descriptively (in the case of lateral deviation) or even significantly (in the case of heading error) better with reduced visibility than with the whole road visible. This replicates the data reported by Land and Horwood [13], and additionally shows that even when fixating the far region, sufficient information about road position can be gleaned peripherally from the near area. The contradictory findings reported by Chatziastros et al. [16] could be the result of different curvature: in the Chatziastros et al. experiment, curve radius varied between 114.6m and 916.7m, as opposed to 47m in this study. Drivers probably benefit more from the far road region information when curvature is high, necessitating a faster steering response. Additionally, Chatziastros et al. used steering accuracy (lateral deviation) as their dependent measure, and not steering precision as Land and Horwood [13] and this study did. Having only near region information might be sufficient to keep approximately in the middle of the lane, but steering becomes much more stable when a second, far road segment is added. This hypothesis is supported by the steering wheel instability index reported by Land and Horwood, showing that the normalized number of peak steering responses is reduced to baseline levels when two visible segments are displayed instead of one [13]. One factor limiting the generalizability of this study is that the reported results are based only on those participants who did not exceed the lane boundaries by more than half the car's width. Eight subjects failed to stay within this limit; two thirds of all lane exceedances were observed in the reduced visibility, center point fixation condition, with nearly all of the remaining exceedances occurring in the reduced visibility, free gaze condition. This shows that about one fifth of the subjects had difficulties with the basic lane keeping task when road visibility was limited. Though the data indicate that reduced information is sufficient for most drivers to steer precisely and keep to the middle of the lane (a post hoc analysis of mean lateral deviation showed no detrimental effect of reduced visibility, either), there might be some people who use different information for steering than that available in the reported experiment, e.g., by relying more on optical flow than on visual angle information (cf. [21]). As the current study does not hint at the underlying mechanisms, further research is definitely needed in this regard. Lastly, the second important assumption of the two-point model, namely that drivers use the tangent point as fixation target, was not tested in this experiment at all. Fixation was directed at a target located at the lane center, in accordance with theories positing that drivers should look where they steer (e.g., [7]). Though the PI controller of the two-point model works with any fixation point located in the far region [8], a thorough test of the implemented two-point model should include a tangent point fixation condition. Additionally, eye tracking data was only used to
control for compliance with the fixation instruction. The eye tracking data recorded during the free gaze trials could be checked for evidence concerning tangent point usage. Further analyses and experiments examining the role of the tangent point as fixation target in human drivers' steering maneuvers will be the next step in our examination of the viability of the two-point visual control model of steering.
References

1. Jürgensohn, T.: Control Theory Models of the Driver. In: Cacciabue, P.C. (ed.) Modelling Driver Behaviour in Automotive Environments, pp. 277–292. Springer, London (2007)
2. Tsimhoni, O., Liu, Y.: Modeling Steering Using the Queueing Network – Model Human Processor (QN-MHP). In: Proceedings of the Human Factors and Ergonomics Society 47th Annual Meeting, pp. 1875–1879. Human Factors and Ergonomics Society, Santa Monica (2003)
3. Möbus, C., Hübner, S., Garbe, H.: Driver Modelling: Two-Point- or Inverted Gaze-Beam-Steering. In: Rötting, M., Wozny, G., Klostermann, A., Huss, J. (eds.) Prospektive Gestaltung von Mensch-Technik-Interaktion. Fortschritt-Berichte VDI-Reihe 22, vol. 25, pp. 483–488. VDI Verlag, Düsseldorf (2007)
4. Donges, E.: A Two-level Model of Driver Steering Behaviour. Human Factors 20, 691–707 (1978)
5. Fildes, B.N., Triggs, T.J.: The Effect of Changes in Curve Geometry on Magnitude Estimates of Road-like Perceptive Curvature. Perception & Psychophysics 37, 218–224 (1985)
6. Shinar, D.: Curve Perception and Accidents on Curves: An Illusive Curve Phenomenon? Zeitschrift für Verkehrssicherheit 23, 16–21 (1977)
7. Wilkie, R.M., Wann, J.P., Allison, R.S.: Active Gaze, Visual Look-Ahead, and Locomotor Control. J. of Exp. Psych.: Human Perception and Performance 34, 1150–1164 (2008)
8. Salvucci, D.D., Gray, R.: A Two-point Visual Control Model of Steering. Perception 33, 1233–1248 (2004)
9. Rushton, S.K., Salvucci, D.D.: An Egocentric Account of the Visual Guidance of Locomotion. Trends in Cognitive Sciences 5, 6–7 (2001)
10. Land, M.F., Lee, D.N.: Where We Look when We Steer. Nature 369, 742–744 (1994)
11. Anderson, J.R.: How Can the Human Mind Occur in the Physical Universe? Oxford University Press, Oxford (2007)
12. Salvucci, D.D.: Modeling Driver Behavior in a Cognitive Architecture. Human Factors 48, 362–380 (2006)
13. Land, M.F., Horwood, J.: Which Parts of the Road Guide Steering? Nature 377, 339–340 (1995)
14. Land, M.F., Horwood, J.: How Speed Affects the Way Visual Information is Used in Steering. In: Gale, A.G. (ed.) Vision in Vehicles VI, pp. 43–50. North Holland, Amsterdam (1998)
15. Land, M.F.: The Visual Control of Steering. In: Harris, L.R., Jenkins, H. (eds.) Vision and Action, pp. 172–190. Cambridge University Press, Cambridge (1998)
16. Chatziastros, A., Wallis, G.M., Bülthoff, H.H.: The Effect of Field of View and Surface Texture on Driver Steering Performance. In: Gale, A.G. (ed.) Vision in Vehicles VII, pp. 253–259. North Holland, Amsterdam (1997)
17. Robertshaw, K.D., Wilkie, R.M.: Does Gaze Influence Steering Around a Bend? J. of Vision 8(4):18, 1–13 (2008)
18. Wilkie, R.M., Wann, J.P.: Eye-movements Aid the Control of Locomotion. J. of Vision 3, 677–684 (2003)
19. Wann, J.P., Swapp, D.K.: Why You Should Look Where You Are Going. Nature Neuroscience 3, 647–648 (2000)
20. Knappe, G., Keinath, A., Meinecke, C.: Empfehlungen für die Bestimmung der Spurhaltegüte im Kontext der Fahrsimulation. MMI-Interaktiv 11, 3–13 (2006)
21. Wilkie, R., Wann, J.: Controlling Steering and Judging Heading: Retinal Flow, Visual Direction, and Extraretinal Information. J. of Exp. Psych.: Human Perception and Performance 29, 363–378 (2003)
Automation Effects on Driver's Behaviour When Integrating a PADAS and a Distraction Classifier

Fabio Tango, Luca Minin, Raghav Aras, and Olivier Pietquin

Centro Ricerche Fiat, Strada Torino 50, 10043 Orbassano, Italy
University of Modena & Reggio Emilia, via Amendola 2, 42122 Reggio Emilia, Italy
IMS Supélec, Metz, France
[email protected], [email protected], [email protected], [email protected]
Abstract. The FP7 EU project ISi-PADAS aims at conceiving an intelligent system, called PADAS, to support drivers; it intervenes continuously, from warnings up to automatic braking, in the whole longitudinal control of the vehicle. However, such supporting systems can have unwanted side-effects: due to the presence of automation in the driving task, drivers need to devote less attention and fewer interventions to the longitudinal control of the vehicle. This paper investigates the effects of the level of automation on drivers, in particular on their situation awareness, when the user is supported by a specific PADAS application integrated with a driver distraction classifier.
1 Introduction

The FP7 EU project ISi-PADAS (Integrated Human Modelling and Simulation to support Human Error Risk Analysis of Partially Autonomous Driver Assistance Systems) aims at conceiving an intelligent system called PADAS (Partially Autonomous Driver Assistance System) which helps human users drive safely, by providing them with pertinent and accurate real-time information about the external conditions and by acting as a co-pilot in emergency situations. The system interacts with the driver through a Human-Machine Interface (HMI) installed on the vehicle, using an adequate Warning and Intervention Strategy (WIS). Such a system intervenes continuously, from warnings up to automatic braking, in the whole longitudinal control of the vehicle [1]. Recent data have identified inattention – and in particular distraction – as the primary cause of accidents [3], [4]. The ISi-PADAS project has developed a model able to detect and classify driver distraction, which has been integrated into the PADAS in order to avoid or mitigate these negative effects and make the system "smarter" and more adaptive [15]. This paper addresses the problem of evaluating the effects of automation on the driver's situation awareness when using a PADAS integrating a distraction (DIS) classifier.
1.1 The Context and the Problem Addressed

Systems like Adaptive Cruise Control (ACC) can (partially) automate the longitudinal driving task and thus support users; on the other hand, they can also cause reduced attention and longer reaction times in drivers, especially in the case of unexpected events. Ward found that ACC appeared to improve driving safety by reducing instances of unsafe headway distance in following tasks [5]. However, he also highlighted a relevant impairment of situation awareness (SA) because of reduced attention dedicated to positioning and slower response times to unexpected events. In addition, other studies have demonstrated the influence of ACC automation on performance, workload and attention allocation, e.g., [6]. Despite the indubitable advantages, drivers may direct their attention away from the driving task when using ACC, creating an unsafe situation due to a loss of situation awareness. The study of Rudin-Brown confirms such results, with an unexpected increase in accidents from driver distraction due to the interaction with the secondary task [7].

1.2 The PADAS Concept

The choice to focus on the longitudinal driving control task derived from a preliminary in-depth accident analysis, conducted to derive hypotheses about the causes of driver errors responsible for crashes [8]. In this context, many studies have shown the benefits of supporting systems in reducing accidents [2]. Therefore, the project has developed a PADAS called "Advanced Adaptive Cruise Control" (ACC+) with the following functionalities. In normal conditions, the system can use the "traditional" ACC. When the driver acts on the brake pedal, thus indicating the will to brake, the Assisted Braking function is able to modulate the braking action automatically. Finally, if the driver ignores the warnings and does not intervene on the brakes, the Emergency Braking function acts autonomously in order – at least – to minimise the effects of a collision.
2 The MDP Approach

Consider a system that assists the driver by applying corrective controls to the vehicle (giving the driver warning signals and intervening in the braking) so that collisions may be avoided. When a collision is imminent, there is no doubt about what controls to apply: a clear collision warning must be given to the driver, and the brakes must be applied as hard as possible to bring the host vehicle to a halt as quickly as possible. When there is a safe distance between the two vehicles, however, it is not clear what the optimal controls are. Should no warnings and no braking interventions occur? It could be, for example, that not giving a warning when the vehicle is at a certain "safe distance" leads (eventually) to a situation where a collision is unavoidable, whereas a warning leads to an avoidance of the eventual collision. Deciding what corrective controls to exercise in order to avoid an eventual collision is essentially a problem of "credit assignment": suppose an outcome is a consequence of a sequence of decisions. How do we decide what part each of the decisions plays in the outcome? In other words, the credit assignment problem calls for a system for associating decisions with their long-term outcomes. One of the most important theories for formulating and solving credit assignment in sequential decision-making problems is the Markov decision process
(MDP) theory (see [2] for details). In modelling a problem as an MDP, we contemplate a decision maker who is required to take decisions over a sequence of discrete time periods. In each period, the problem occupies one of N possible states and the decision maker is called upon to choose from K possible alternatives. Such a decision and the state of the problem together determine, according to fixed probabilities P, the state occupied by the problem in the next period. Each choice is evaluated by an immediate cost c, which is a function only of the state in which the choice is made (Markovian property). The immediate cost tells the decision maker how good or bad a choice is in the short term (i.e., its immediate consequence). The immediate costs can be used to determine the long-term desirability of a sequence of decisions. The objective of the decision maker is to make decisions that minimize his long-term expected cost. The solution to a sequential decision-making problem is, of course, a sequential decision-making policy. A policy is a function which assigns an alternative to each state. If the decision maker is following the optimal policy, then in any period, whatever the state occupied by the problem, the decision the policy dictates will lead to a long-term expected cost that is as small as or smaller than that of any other decision he could have made. Thus the MDP approach applied to conceiving an optimal PADAS consists of two elements: modelling the problem as an MDP, and solving the (possibly unknown) MDP.

2.1 Modelling a PADAS Target Application as an MDP

First, the state of the MDP in the collision avoidance problem has to be defined. In this problem, the decision maker is the warning and intervention system (of the PADAS). The relevant factors that determine if a collision with the leading vehicle can occur are: time to collision; headway; velocities of the host and leading vehicles; accelerations of the host and leading vehicles; brake and gas pedal positions; driver's distraction level. So, the state in the proposed MDP can be considered as a 10-vector. However, the number of states of such an MDP would be too large for solving the MDP. If each element of the state is allowed to take only 3 possible values (a very conservative estimate, indeed most elements would take continuous values), we would end up with 3^10 states! For the sake of computational tractability, we must determine which of these elements contain most information relevant to collision avoidance. With this in mind, the time to collision (defined as the time it would take for a collision to occur if the leading and follower vehicles maintained their current velocities, i.e., the distance divided by their relative speed) and the driver's distraction variables are two good candidates.

2.2 Distraction Model as an MDP State to Improve the PADAS

A distraction classifier has been developed in the ISi-PADAS project and integrated into the optimal WIS for the PADAS, with the aforementioned MDP approach. The distraction model has also been implemented following a machine learning approach, which constitutes a quite common method to detect human visual distraction in a non-intrusive way, based on vehicle dynamics data ([9], [10]). A broad explanation of the distraction classifier developed by the project is outside the scope of this paper, but some details are provided in the following (the
interested reader can see [11]). In particular, a model based on Support Vector Machines (SVM, a well-known data-mining method) was selected, since it achieved the best performance [11]. So, to sum up, the PADAS is conceived as an MDP, including an SVM distraction classifier in the states defining the MDP.
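For illustration only, the textbook way of solving a known, finite MDP of this kind is value iteration over the discretized state space (e.g., binned time to collision crossed with the binary distraction flag). The sketch below assumes given transition and cost arrays; the project itself learned the policy from interaction data via batch reinforcement learning [15], so this is merely the classical counterpart:

```python
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-6):
    """P: (K, N, N) array with P[a, s, s'] = transition probability;
    c: (N,) immediate cost per state (the cost depends only on the
    state, as in the text). Returns the optimal action (e.g., warning
    level or braking intervention) for each of the N discretized states."""
    K, N, _ = P.shape
    V = np.zeros(N)
    while True:
        # Expected long-term cost of each action in each state.
        Q = c[None, :] + gamma * (P @ V)   # shape (K, N)
        V_new = Q.min(axis=0)              # best achievable cost per state
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmin(axis=0)        # optimal policy
        V = V_new
```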
3 The Experimental Set-up Description

The scope of this experiment was twofold: on the one hand, to assess the effect of automation on the driver's situation awareness; on the other hand, to compare the effects of the ACC+ solutions in reducing the risk of collision and on system acceptability from a driver perspective. Therefore, the following configurations have been considered:

• ACC+ with a starting (initial) MDP policy p0, including a model for driver distraction;
• ACC+ with the MDP running the p* policy (that is, the optimal policy developed by solving the MDP problem), including a model for driver distraction detection;
• ACC+ with a policy based on deceleration (the pDec policy, described in [14]);
• without supporting systems (reference baseline).
Concerning the reference policy pDec, it provides the driver with the same kind of information as the p* policy, but the approach is different. The main idea of pDec is to compute the acceleration needed for the host vehicle to reach the speed of the lead vehicle within a certain distance (a kinematic sketch of this computation is given after the following list). Then, if this value exceeds certain thresholds, a corresponding warning or action is issued [14]. This policy represents a traditional algorithm for the implementation of a longitudinal distance control/warning system. For each of the above configurations, ten participants aged between 21 and 45 years (M = 31.0, SD = 9.0) were asked to complete 12 driving sessions: each condition consisted of a car-following task on a highway road in which unexpected braking events of the leading car were reproduced. Hence, each of the 10 subjects performed 36 driving sessions, divided as follows:

• 12 with the ACC+ working with the p* policy;
• 12 with the ACC+ working with the pDec policy;
• 12 without any supporting system.
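The kinematic computation behind a pDec-style policy can be sketched as follows, assuming constant acceleration over the remaining gap; the warning and braking thresholds are illustrative placeholders, not the values of [14]:

```python
def pdec_needed_accel(v_host, v_lead, gap):
    """Acceleration (m/s^2, negative = deceleration) needed for the host
    vehicle to reach the lead vehicle's speed within the current gap (m),
    from the constant-acceleration relation v_lead^2 - v_host^2 = 2*a*gap."""
    return (v_lead ** 2 - v_host ** 2) / (2.0 * gap)

def pdec_action(a_needed, warn_at=-2.0, brake_at=-4.0):
    """Map the needed deceleration onto warning/intervention levels.
    Threshold values are illustrative assumptions."""
    if a_needed <= brake_at:
        return "automatic braking"
    if a_needed <= warn_at:
        return "warning"
    return "no action"
```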
Two different levels of road visibility (low, LV; high, HV) were reproduced, combined with the activation of the secondary task and the level of automation of the vehicle's longitudinal control. LV (200m) was introduced with a fog effect to induce low situation awareness; it was compared with the HV condition (7000m). The secondary task adopted was the SuRT (Surrogate visual Research Task [13]). Drivers were asked to interact with the SuRT every time the task was presented on the lateral (right-hand side) touch screen installed in the simulator cabin. A high level of SuRT difficulty was set: a high number of distractors with a reduced time to task completion (i.e., the SuRT screen was updated every 3–6 seconds). The secondary task was activated in 6 conditions out of 12 for each driver.
SA was measured at the end of each session by means of a self-report questionnaire called SART (Situation Awareness Rating Technique [12]) submitted to participants: we expected a decrease of situation awareness when scenarios were critical (i.e., low visibility and secondary task on) and the level of automation was low. The number of collisions was also recorded, to evaluate the effects of highly automated longitudinal control, in particular when dealing with low situation awareness conditions. Finally, in order to verify the influence of the different ACC+ automation solutions on performance, workload and attention allocation ([8] and [9]), additional investigations were conducted on specific indicators: NASA-TLX and Steering Entropy (SE). The experiment was conducted on the static driving simulator of the University of Modena and Reggio Emilia (see www.scaner2.com).
4 Data Processing and Results

During the experiment, several measures of the drivers' behaviour were recorded from the simulator network at a sample rate of 20Hz (one data point every 0.05s).

4.1 Indicators and Indexes of Performance

SA was measured by means of the SART questionnaire, a quick and easy self-rating measurement technique [12]. From the interviews underlying the questionnaire, 10 dimensions were derived and used to measure SA. For each basic driving condition (4 in total) and for each driving session (3 in total, 2 with ACC+ and 1 without), the following measures were analysed:

• The number of accidents, allowing the evaluation of drivers' longitudinal performance with reference to the level of longitudinal automation and the scenario complexity.
• Steering Entropy [16], computed to evaluate drivers' impairment in lateral control for each of the above-mentioned situations. SE is a measure of randomness in a driver's steering control: it is higher when drivers make larger erratic steering movements, indicating potentially unsafe behaviour (a sketch of the computation follows this list).
• NASA-TLX [17], used to obtain a general understanding of the level of cognitive and physical effort required of drivers while coping with concurrent tasks (car following and secondary task interaction).
Finally, participants were asked to fill in the subjective questionnaire defined by the ISi-PADAS consortium to assess both the ACC+ running the p* policy and the ACC+ running the pDec policy.

4.2 Results for Automation Effects on Driver's Behaviour

The SART questionnaire was submitted to each subject four times for each of the three automation conditions. The aim of the evaluation conducted on the collected data was to compare the average values of the SART indicator, aggregated over the SART scales,
between subjects for each level of automation and basic driving condition. According to [12], the 10 scales of the SART were aggregated into three dimensions: Demands on attentional resources (D), Supply of attentional resources (S) and Understanding of the situation (U).
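The paper does not spell out its aggregation formula; under the standard SART scoring [12], the three dimensions are commonly combined as SA = U − (D − S), e.g.:

```python
def sart_index(d, s, u):
    """Standard SART combination: understanding minus unmet attentional
    demand (assumed here; the paper does not state its exact formula)."""
    return u - (d - s)
```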
[Figure: bar chart of the average SART indicator (scale 0–19) for each automation level (ACC+ PStar, ACC+ PDec, No PADAS) across the four driving conditions (SuRT ON/OFF × Visibility Low/High).]

Fig. 1. Average values of SART computed on the whole sample
The figure shows the quantitative variation of situation awareness across the driving conditions (x-axis) and the automation conditions. Variations among automation conditions are relevant between the automated (ACC+) and non-automated situations, with a general positive increment of SA for the former. Moreover, the effect of the driving conditions (SuRT and visibility) on the perceived SA is visible: the absence of the secondary task and of fog clearly increases the level of perceived situation awareness. A t-test analysis of the SART values was conducted for each driving condition and (in pairs) between the three automation configurations, in order to evaluate whether there were significant differences between the corresponding SART levels. Significant differences (p < 0.05) between the average values of SART with and without automation were found for all driving conditions but the "SuRT + Visibility low" one. All in all, no significant differences were found when comparing the average values of SART computed for the ACC+ p* and the ACC+ pDec. Concerning the lateral driving behaviour, the average values of steering entropy (SE) were computed for all subjects for the two automation types: ACC+ with p* and ACC+ with pDec. As anticipated, SE revealed steering impairments due to factors influencing the vehicle's lateral control; in this case, these factors were the interaction with the SuRT and the low visibility. The t-test analysis conducted on the SE values showed a statistically significant difference (p < 0.05) in SE between driving conditions (x-axis of the next figure) for each automation condition considered separately. SE significantly increased in the presence of the secondary task, compared to the conditions without the SuRT (see the following figure). For each driving condition, a t-test was also conducted to compare the average values of SE in the two automated conditions, to highlight potential statistical differences. No statistical differences between these values, however, were found in any of the driving conditions.
[Figure: bar chart of steering entropy (scale 0.00–3.00) for ACC+ pDec and ACC+ pStar in the four basic driving conditions (SuRT ON/OFF × Visibility Low/High).]

Fig. 2. SE variations in the 4 driving conditions, w.r.t. the two kinds of automation
Finally, the automation effects on the driver's workload were analyzed. The NASA-TLX questionnaire was submitted to each participant after each of the last four driving conditions for each driving session (ACC+ p*, ACC+ pDec and without supporting systems). All in all, the level of workload perceived by drivers was higher in the conditions where the SuRT was active, and lower when no secondary task was present and visibility was high. Significant differences among the workload levels measured for each automation type in the four driving conditions were not found, except for the "SuRT OFF + Visibility high" condition, where participants stated that driving without longitudinal assistance induced high workload. In this case, the NASA scales showing this pattern were physical demand (because drivers had to modulate the speed autonomously) and effort (physical and mental demand). Moreover, the level of NASA-TLX was higher in the ACC+ pStar conditions compared to ACC+ pDec: the NASA scale contributing to the increase in the level of workload perceived by drivers was "Frustration". Drivers stated they were annoyed by the continuous warning signals provided by the ACC+ pStar, also in conditions where they did not perceive a risk of collision and where they were able to modulate the speed by themselves in case more deceleration was needed.

4.3 Results for PADAS Performances

Some preliminary results on the performance of the PADAS applications integrating the distraction classifier are pointed out in this section. Figure 3 (top) shows the effects of the type of longitudinal automation (ACC+ with p*, ACC+ with pDec) compared to the control condition (no automation, No-PADAS) in the four basic driving conditions. The benefits of the ACC+ running the p* policy are evident: besides some user-interaction issues discussed in the previous sections, the WIS, developed following an MDP approach, allowed drivers to intervene in time on the brakes and avoid collision with the lead vehicle when it suddenly stopped. All in all, drivers found the ACC+ with the p* policy more cautious, in the sense that the system provided warning signals also in conditions where the lead vehicle's braking was smoother than the sudden one and where they did not perceive an imminent risk of collision. This "learning effect" of the system induced drivers to be prepared to brake before the sudden stop event.
Figure 3 (bottom) also reveals the drivers' impairment, for each driving condition, in coping not only with the longitudinal control but also with the secondary task activation and the road visibility (i.e., conditions where SA was critical, in particular without the support of longitudinal automation).
Fig. 3. Total collisions using different PADAS configurations: cumulative collisions for all subjects in the top figure; collisions for each subject in every driving session in the bottom figure
5 Discussion and Conclusions

This paper has presented an evaluation of the impact on driver behaviour of supporting systems with a high level of automation. In addition, the system's performance has been illustrated. We have focused on the following aspects: situation awareness, lateral behaviour and workload. Concerning the automatic system, we developed the PADAS application following an MDP approach. Experiments were carried out in a static driving simulator. Concerning SA, the SART questionnaires showed significant benefits of the automated over the non-automated driving conditions. In the former, higher values of situation awareness reveal a relevant impact of the ACC+ (whether p* or pDec) in providing the driver with a better understanding of the situation, particularly while dealing with concurrent tasks like monitoring the driving environment and executing the secondary task. The effects of the ACC+ p* and pDec on lateral vehicle control were also assessed using the SE indicator, revealing that the two automation solutions did not generate different lateral control impairments, while significantly different levels of lateral impairment were measured among the driving conditions because of the presence of the secondary task and the low visibility. The drivers' feedback on the different NASA scales revealed an overall workload increase in conditions where the secondary task was active and visibility was obstructed, as expected. Within these conditions, we did not find relevant workload differences between the two ACC+ solutions, but for most of these conditions driver workload was higher without automation than with ACC+. This can be linked to the difficulty of performing a secondary task like the SuRT in a harsh environment (low visibility conditions). So, we can conclude that highly automated systems such as ACC+ neither increased the drivers' impairment in controlling the vehicle nor reduced their situation awareness; moreover, driver workload did not increase due to system use.
Another very interesting aspect was the evaluation of the PADAS performance, in particular its efficacy in avoiding extremely dangerous situations. From the preliminary results presented here, considering the number of collisions that occurred when using the ACC+ with p* or pDec compared with the situation without any supporting system, the results point in favour of the ACC+ p* system. In fact, this solution proved to be the most effective in avoiding accidents, compared to the numbers measured for the other automation solution and for the driving condition without automation. The major benefit introduced by the ACC+ p* was the cautious warning and assistance strategy that allowed the driver to anticipate a potential sudden braking of the lead car. Despite these good results, internal acceptance questionnaires showed that drivers were annoyed by this cautious strategy, which provided assisted-braking signals (haptic and acoustic in particular) frequently along the test session. This research represents a feasibility study for an innovative approach to the development of PADAS applications, based on an MDP and integrating an SVM driver distraction classifier. Of course, all these methods, taken separately, are not new in the literature, but to the best of the authors' knowledge, their use in this kind of domain and with this type of integrated approach is quite innovative. With respect to other traditional algorithms used for implementing the WIS of ADAS applications (i.e., pDec), our approach is based on learning the "behaviour" directly from the interaction between driver and system (vehicle + automation) in the specific environment; we therefore regard it as more appropriate for providing the right information at the right time (as demonstrated by the drastic reduction in the number of collisions). On the other hand, such a system needs to be refined, since users assessed it as too conservative; however, thanks to that direct interaction between human and system, we are confident that this kind of PADAS can be tuned appropriately. So, the next steps will involve the refinement of the cost function, a more precise definition of the parameters constituting the state of the MDP, new possible techniques for solving the MDP problem and – last but not least – a deeper investigation and integration of the distraction model into the WIS of the system. The authors would like to specially thank the ISi-PADAS consortium that has supported the development of this research.
References

[1] Alsen, J., et al.: Integrated Human Modelling and Simulation to support Human Error Risk Analysis of Partially Autonomous Driver Assistance Systems: the ISi-PADAS Project. Transport Research Arena Europe, Brussels (2010)
[2] Tango, F., Aras, R., Pietquin, O.: Learning optimal control strategies from interactions for a Partially Autonomous Driver Assistance System. In: Proceedings of the 1st Workshop on Human Modelling in Assisted Transportation (HMAT), Belgirate, Italy, June 30 – July 2 (2010) (to be published by Springer)
[3] Beirness, D.J., et al.: The Road Safety Monitor: Driver Distraction. Traffic Injury Research Foundation, Ontario (2002)
[4] Wang, J., Knipling, R.R., Goodman, M.J.: The role of driver inattention in crashes; New statistics from the 1995 crashworthiness data system (CDS). In: Proc. 40th Annu. Assoc. Advancement Automotive Med., pp. 377–392 (1996)
[5] Ward, N.J.: Automation of task processes: an example of intelligent transportation systems. Human Factors and Ergonomics in Manufacturing 10(4), 395–408 (2000)
[6] Parker, H.A., Malisia, A.R., Rudin-Brown, C.M.: Adaptive cruise control and driver workload. In: Proceedings of the XVth Triennial Congress of the International Ergonomics Association Conference (CD-ROM), Seoul, Korea, August 24–29 (2003)
[7] Rudin-Brown, C.M., Parker, H.A., Malisia, A.R.: Behavioral adaptation to adaptive cruise control. In: Proceedings of the 47th Annual Meeting of the Human Factors and Ergonomics Society. Human Factors and Ergonomics Society, Santa Monica (2003)
[8] Briest, S., Vollrath, M.: In welchen Situationen machen Fahrer welche Fehler? Ableitung von Anforderungen an Fahrerassistenzsysteme durch In-Depth-Unfallanalysen. In: VDI (ed.) Integrierte Sicherheit und Fahrerassistenzsysteme, pp. 449–463. VDI, Wolfsburg (2006) (in German)
[9] Liang, Y., Reyes, M.L., Lee, J.D.: Real-time Detection of Driver Cognitive Distraction Using Support Vector Machines. IEEE Transactions on Intelligent Transportation Systems 8(2) (June 2007)
[10] Salvucci, D.D.: Modeling tools for predicting driver distraction. In: Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting. Human Factors and Ergonomics Society, Santa Monica (2005)
[11] Tango, F., Minin, L., Montanari, R., Botta, M.: Non-intrusive Detection of Driver Distraction using Machine Learning Algorithms. In: Proceedings of the XIX European Conference on Artificial Intelligence (ECAI), Lisbon, Portugal (August 16–19, 2010)
[12] Taylor, R.M.: Situational awareness rating technique (SART): The development of a tool for aircrew systems design. In: Proceedings of the AGARD AMP Symposium on Situational Awareness in Aerospace Operations, CP478, NATO AGARD, Neuilly-sur-Seine (1989)
[13] Mattes, S.: The lane change task as a tool for driver distraction evaluation. In: Strasser, H., Rausch, H., Bubb, H. (eds.) Quality of Work and Products in Enterprises of the Future, pp. 57–60. Ergonomia Verlag, Stuttgart (2003)
[14] Saroldi, L., Bertolino, D., Sidoti, C.: Driving in the Fog with a Collision Warning System: a Driving Simulator Experiment. In: ITS 1997 – 4th World Congress on Intelligent Transport Systems, Berlin (October 21–24, 1997)
[15] Pietquin, O., Tango, F., Aras, R.: Batch Reinforcement Learning for Optimizing Longitudinal Driving Assistance Strategies. In: IEEE SSCI 2011 Symposium Series on Computational Intelligence, Paris, France (April 11–15, 2011)
[16] Nakayama, O., Futami, T., Nakamura, T., Boer, E.R.: Development of a steering entropy method for evaluating driver workload. Paper presented at the SAE International Congress and Exposition, Detroit, Michigan, USA (1999)
[17] Hart, S.G., Staveland, L.E.: Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. In: Hancock, P.A., Meshkati, N. (eds.) Human Mental Workload, pp. 239–250. North Holland Press, Amsterdam (1988)
What Is Human? How the Analysis of Brain Dynamics Can Help to Improve and Validate Driver Models

Sebastian Welke¹, Janna Protzak¹, Matthias Rötting¹, and Thomas Jürgensohn²

¹ TU Berlin, Department for Human-Machine Systems, Germany
² HFC Human-Factors-Consult GmbH, Berlin
{sebastian.welke,janna.protzak}@zmms.tu-berlin.de, [email protected], [email protected]
Abstract. To model realistic driver behavior, research interest is growing in new approaches that integrate cognitive processing steps in order to tune the modeled behavior to be more humanlike. But what does humanlike mean in this context? This paper deals with the question of how brain dynamics can help to gain insights into the internal processes preceding intended driver behavior. Features extracted from electroencephalography (EEG) data can serve as indicators of such neural activity. This approach is illustrated by two experiments, dealing with simple button presses and with more complex, realistic steering maneuvers in an obstacle avoidance task. In both experiments we found the lateralized readiness potential (LRP) preceding button presses (~200ms) and steering maneuvers (~190ms), a valid neuronal characteristic of motor preparation processes. These data indicate that a time range of about 200ms should be considered between the decision and action components of a driver model to make it more humanlike.

Keywords: EEG, LRP, realistic driver modeling
1 Introduction

In order to develop and test driver support and assistance systems, the demand is growing to simulate the behavior of the entire vehicle. The required accuracy of the simulation depends on the goal of the intended development. In most cases, the behavior of the driver plays a crucial role in the behavior of the whole vehicle. Therefore, human factors, including non-mechanical behavior, should always be kept in mind. Hence, the need for a realistic simulation of the driver's behavior becomes evident in this context: only this way can the functions and support levels of such systems be developed and tested against realistic human driver behavior. But what does human mean in this context? The temporal behavior of mathematical driver models is often too perfect. Aspects like human action or reaction require the calculation of specific temporal parameters, and, unlike in machines, human reaction times depend on complex neural processing steps that these calculations must account for. The aim of the presented paper is to demonstrate how the brain dynamics of the driver can help to gain insights into the internal processes preceding intended behavior. Methods of
cognitive neuroscience, especially electrophysiological techniques like electroencephalography (EEG), provide insights into the temporal order and duration of the neural processes involved in motor sequences and action planning. The EEG signal makes it possible to measure neural processes with a fine-grained temporal resolution in the range of milliseconds; a proper mapping of the time course of neural activity is therefore feasible. We use this method in our studies to estimate, by way of example, the shortest time range between decision-making and the conversion of this abstract internal representation into a final motor action. Two experiments were carried out to determine the neural characteristics of these cognitive processing steps. The identification of potential neural indicators was first validated under highly controlled conditions and afterwards transferred into a more realistic driving context. Ideally, the identified parameters could be used to determine these specific temporal time ranges for driver models in an adequate way.

1.1 Neural Structures Involved in Movement Planning and Preparation

To understand the particularities of human information processing and action selection, the underlying basic principles of psychophysiology and the associated neuronal correlates are described briefly in this paragraph. Cognitive action selection is linked to information processing through complex and balanced neural networks [1, 2] (Fig. 1). Different brain regions, including the cortical motor regions, the brainstem, the spinal cord, the basal ganglia and the cerebellum, modulate the implementation of a cognitively selected action into a motor response.
Fig. 1. The figure shows a decision circuit and the involved neural structures (Figure taken from a paper by Platt [2])
Deliberate movements are produced by distributed cognitive information processes within these brain structures. Finally, all information merges into the motor areas of the cortex. In order to convert decision-making into a final motor action, primary motor cortex neurons send the processed information via electrical impulses to the lower spinal cord. Subsequently, this structure provokes the corresponding output signals leading to the final muscle contraction [3].
1.2 EEG Indicators for Movement Planning and Motor Action

One of the most established indicators mapping the neural preparation of motor actions is the lateralized readiness potential (LRP). The LRP is a spatially negative cortical event-related potential measured by the EEG over motor cortex areas. It reflects the relative activation of the motor cortex due to the preparation and activation of motor responses of a specific body side [4]. According to the moved body side, the potential occurs prior to the onset of a self-paced movement, with peak activity over the corresponding contralateral hemisphere [5, 6]. The LRP is well known in the context of simple motor tasks, like button press experiments [7, 8, 9]. In general, the LRP can be considered to reflect the output of cortical processes that are connected to movements, or rather muscle contractions, of a specific body side [6, 10]. Since the LRP indicates the point in time of converting a motor decision into a final motor action, we analyzed this brain potential to identify the onset of the anticipated activation within the motor areas of the brain due to button presses and steering maneuvers.
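A minimal sketch of the standard double-subtraction LRP average, assuming response-locked epoch matrices (trials × samples) at C3 and C4, split by response hand; the preprocessing pipeline is not specified here and would be study-dependent:

```python
import numpy as np

def lrp(c3_left, c4_left, c3_right, c4_right):
    """Double-subtraction LRP: contralateral-minus-ipsilateral difference
    at the motor electrodes, averaged over both hands so that
    hand-unspecific asymmetries cancel out. Inputs: (trials, samples)
    response-locked epochs; returns a (samples,) waveform whose negative
    deflection precedes the movement."""
    left_hand = (c4_left - c3_left).mean(axis=0)     # contralateral is C4
    right_hand = (c3_right - c4_right).mean(axis=0)  # contralateral is C3
    return 0.5 * (left_hand + right_hand)
```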
2 Experiment 1 – Neural Indicators Preceding Finger Movements

The first experiment was carried out to determine motor-related brain activity patterns in EEG signals before simple index finger movements. These patterns are necessary to assign the neural activity to the underlying brain structures. Afterwards, this information serves as a prototypic activation template to validate the brain dynamics in the more comprehensive setting of the second experiment.

2.1 Method

A highly controlled stimulus-response task was used to replicate the known EEG correlates of movement preparation and to validate the selected methods. 44 participants (24 female; age mean: 27, SD: 6.16) were requested to press a left and a right button with the corresponding index finger as fast as possible directly after a visual stimulus. The letters L and R served as stimuli and were presented in random order. They were shown in the middle of a screen for durations of 700 ms, with inter-stimulus intervals of 300 ms plus a random jitter to overcome the problem of automated reactions. The experiment was divided into two blocks of eleven minutes each. Together, both blocks contained about 650 trials with a nearly counterbalanced number of requests to press a marked left and right button on a PC keyboard. EEG data were recorded with 32 electrodes, arranged in the positions of the standardized extended 10-20 system [11], referenced to a signal measured at the intersection of the frontal and two nasal bones of the skull. The EEG signals at electrodes C3 and C4 (over the left and right motor cortices) were response-locked and averaged over all button presses for every participant. To validate the neural activity measured by the EEG, we used independent component analysis (ICA) based on the infomax principle [12], as implemented in EEGLAB 6.03, a freely available open source toolbox running under Matlab [13].
This method identifies temporally independent signal sources in multi-channel EEG data as linear combinations of the original channels – the so-called independent components (ICs). Since the aim of the presented paper is to identify neural motor processes, the challenge is to identify the motor ICs. By exploiting the well-known characteristic of the LRP, we used the correlation of this brain potential with the averaged activity of each independent linear combination in order to identify the corresponding LRP-ICs. In conjunction with modeling the activity of these ICs through equivalent dipoles (using the EEGLAB plug-in DIPFIT), the activity of the potential sources can be allocated within the human brain and assigned to neural structures using the Talairach brain atlas [14, 15]. The allocation of the highly correlated ICs serves the validation process: it should demonstrate that the signal characteristic of the LRP and the corresponding ICs is generated by the activity of neural structures within the motor areas.

2.2 Results

As expected, the LRP appeared approximately 200 ms (mean: -201.25 ms; SD: 36.84) prior to the button press in the averaged EEG signal of every participant, as a negative shift over the corresponding contralateral hemisphere. This activation had a left or right distribution over the lateral motor areas. Fig. 2 shows, as an example, the averaged hemispheric difference of the EEG signals over all right and left hand trials of participant 22, revealing the temporal characteristic of the known LRP.
Fig. 2. The figure exemplarily shows the averaged LRP of participant 22 in the case of left and right finger movements (button press at 0)
Fig. 3 shows the ICs with the highest correlation between their mean activation and the corresponding LRP for this participant, for right finger movements (r = 0.96, p < 0.001) as well as for left finger movements (r = 0.93, p < 0.001). The single-trial activation due to right finger movements shows a distribution over the left motor cortex areas, while left finger movements generate the known activation over the right motor cortex. Since the hemispheric asymmetry of the motor cortices for a given body side is well known, the characteristic of the identified ICs indicates the validity of the signal decomposition via the ICA and of the identification of the motor ICs.
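A minimal sketch of this correlation-based selection of LRP-ICs, assuming the averaged IC activations and the averaged LRP are already available as arrays (function name and thresholds are illustrative, not the authors' EEGLAB code):

    import numpy as np
    from scipy.stats import pearsonr

    def find_lrp_ics(ic_activations, lrp, r_min=0.8, p_max=0.001):
        """Rank ICs by the correlation of their averaged activation
        with the participant's averaged LRP time course.

        ic_activations: array of shape (n_components, n_samples)
        lrp:            array of shape (n_samples,)
        Returns [(component_index, r, p), ...] sorted by |r|, descending,
        keeping only components that pass both thresholds.
        """
        hits = []
        for k, activation in enumerate(ic_activations):
            r, p = pearsonr(activation, lrp)
            if abs(r) >= r_min and p <= p_max:
                hits.append((k, r, p))
        return sorted(hits, key=lambda hit: -abs(hit[1]))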
Fig. 3. The figure shows the activity highest correlated to the LRP (ICs 17/28, participant 22)
By reconstruction of equivalent dipoles, both components could be localized within the motor cortex areas (Fig. 4). For the coordinates of these dipoles, the Talairach brain atlas confirms the left and right motor cortices as the involved neural structures. In this way, the motor-related brain activity could be identified for all participants in this experiment. We used the activity from this validated neural source to determine the shortest time range for converting a motor decision into a final motor action. In fact, we inferred from this time course of neural motor preparation, leading to a simple muscle contraction, that the considered time range should be at least approximately 200 ms.
Fig. 4. The figure shows the equivalent dipole reconstruction of IC 17 (green) and IC 28 (red) from participant 22
3 Experiment 2 – Neural Indicators Preceding Steering Actions
3.1 Method
To test the relevance of the findings of the first experiment under realistic driving conditions, an obstacle avoidance task was carried out on a test track. Fourteen participants (3 female; age mean: 26, SD: 2.6) took part in this experiment. All held a current, full car driver's license and drove regularly. All of them reported being free of neurological and psychiatric disorders. A VW Touran TDI, manufactured in 2003, was used in this study. While performing the experiment, electroencephalographic data of the driver were recorded with 64 impedance-optimized electrodes, arranged in the positions of the standardized extended 10-20 system analogously to experiment 1. Furthermore, for artifact control, activity from the lateral neck muscles as well as horizontal and vertical eye movements were recorded. To measure the drivers'
behavior, data from the steering wheel angle were recorded simultaneously. In order to trigger the steering maneuvers, red-shaded, shoe-mounted flash lights installed on the dashboard were used (Fig. 5). The flash lights were activated by an infrared sensor attached in the vehicle. This sensor was triggered by an external infrared light mounted on a rack placed on the test track. The participants were requested to initiate a steering maneuver as fast as possible when the flash lights lit up. Each maneuver was carried out in the same test environment, with one entering lane and three parallel adjacent corridors. Furthermore, the participants were instructed that the flash lights indicated obstacles and that they therefore had to take the opposite corridor (Fig. 5).
Fig. 5. The figure shows the obstacle avoidance task and the view when the flash lights simulate obstacles and target the avoidance corridors
Data from the steering wheel position sensor were segmented according to the given flash stimuli. In comparison with the protocol that was taken for every trial, the point in time at which the participant started to steer was inferred by visual inspection and cross-checked by two subject matter experts. Failed trials were rejected from further analysis. The observed onset of steering served as the marker for the EEG analysis. The recorded EEG data were divided into causal epochs (segments) related to the driver's steering reaction. The steering direction observed from the steering wheel angle served as the key information to distinguish between first-left and first-right steering reactions.
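The onsets were labeled manually in the study; purely for illustration, a programmatic first pass over the steering-angle trace could look like the following sketch (threshold value and sign convention are assumptions, not taken from the paper):

    import numpy as np

    def first_steering_onset(angle, stim_idx, threshold=2.0):
        """First-pass estimate of the steering onset after a flash stimulus.

        angle:     steering wheel angle trace (e.g., in degrees)
        stim_idx:  sample index at which the flash lights were activated
        threshold: hypothetical deflection that counts as 'steering'

        Returns (onset_index, direction) or None if no deflection occurs.
        The sign convention (positive = left) is an assumption.
        """
        post = angle[stim_idx:] - angle[stim_idx]   # deflection relative to stimulus time
        beyond = np.flatnonzero(np.abs(post) > threshold)
        if beyond.size == 0:
            return None
        first = beyond[0]
        direction = 'left' if post[first] > 0 else 'right'
        return stim_idx + first, direction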
To obtain information about the existence of commonly valid EEG characteristics for all subjects, all trials were averaged according to the corresponding steering direction. EEG signal processing, source reconstruction and identification were realized analogously to experiment 1.
3.2 Results
588 trials for first-left and 484 trials for first-right steering reactions could be extracted from all 14 subjects. Fig. 6 shows the grand average, analogous to the analysis in experiment 1, for all right and left steering trials of all participants. We observed a consistent negative LRP characteristic over all participants, starting approximately 190 ms before the steering onset (mean: 186.79 ms; SD: 28.8). There was no significant difference between left and right steering maneuvers in the onset of the negative signal shift. This signal characteristic reproduces the signal behavior preceding button presses.
Fig. 6. The figure shows the grand averaged LRP of all participants in the case of left (red) and right (green) steering maneuvers (steering onset at 0)
One of the problems of the LRP in steering tasks is the disturbance by eye movements. Horizontal eye movements in particular generate a lateralized shift in the EEG signals that is similar to the time course of LRPs. Hence, validating the source causing the LRP characteristic is an important step towards identifying neural motor processes. Fig. 7 exemplarily shows the ICA solutions that correlate with the individual LRP of participant RLCT-1 at r > 0.8 and p < 0.001.
Fig. 7. The figure shows the motor ICs highest correlated to the LRP (participant RLCT-1)
These results indicate that, analogously to the button press experiment, activity from the motor areas can be analyzed under realistic driving conditions. Since the participants were instructed to steer with both hands, we observed activity over both hemispheres.
Fig. 8 exemplarily shows the equivalent dipole reconstruction of ICs 26 and 25 from participant RLCT-1. Analogously to experiment 1, the neural structures causing the LRP characteristic can be identified as the motor structures within the human brain. The localization of the equivalent dipoles from all participants revealed potential sources for these activations within the motor cortex, also preceding steering maneuvers. In addition to the results of experiment 1, we used the activity from this validated neural source to determine the shortest time range for converting a motor decision into a steering maneuver. From the time course of neural motor preparation leading to a steering reaction, we inferred a time range of approximately 190 ms as a minimum.
Fig. 8. The figure shows the equivalent dipoles of IC 25 (green) and IC 26 (red) of participant RLCT-1
4 Discussion and Conclusion
In the first part of this paper, we used the LRP observed in a simple button press experiment to identify the neural activity preceding an intended muscle contraction of the index finger under highly controlled conditions. Like Prime [9], we found the well-known LRP characteristic over motor cortex areas. Furthermore, we observed the LRP preceding steering maneuvers as a valid neuronal characteristic of motor preparation processes. Hence, we were able to transfer the results from controlled task conditions into a realistic driving context with fitting ecological validity. In conclusion, the data indicate that a time range of about 200 ms should be considered as the shortest time range between decision-making and action components, or rather steering maneuvers, to describe driver models in a more humanlike way. Otherwise, the temporal order of the modeled behavior would be more perfect than it is in fact and would not match real human conditions. This minimal latency period, which applies to simple button presses as well as to more complex steering, became evident in the presented EEG analysis reflecting the activity of the underlying neural structures.
The benefit of these results is the exploration of realistic time ranges for simple motor preparation processes. It was possible to validate our methods in controlled
conditions and to replicate our findings in a practical driving situation. It can be assumed that these results are more adequate than theoretically calculated data. With regard to the presented results, currently used modeling approaches should be checked for their temporal behavior. An integration of human factors into driver models seems to be useful for appropriate implementations. The experiment also incidentally investigated the role of the LRP as a predictor of the steering direction before this information becomes evident in the steering angle. The offline analysis of the EEG data indicates that this neural activity pattern preceding the steering onset allows for predictions of the steering direction with an accuracy of approximately 85% over all participants. This result is not within the scope of the presented approach but points to interesting further research in this context.
Acknowledgements. The authors would like to thank Brain Products for providing the EEG equipment and Team PhyPA from the Berlin Institute of Technology, Chair of Human-Machine Systems, for the algorithmic support. We would also like to thank the Anschutz Entertainment Group GmbH for the provision of the testing ground and the corresponding resources. This work was supported in part by grants of the Deutsche Forschungsgemeinschaft, DFG (GRK 1013/2 prometei).
References
1. Bar-Gad, I., Bergman, H.: Stepping out of the box: information processing in the neural networks of the basal ganglia. Curr. Opin. Neurobiol. 11(6), 689–695 (2001)
2. Platt, M.L.: Neural correlates of decisions. Curr. Opin. Neurobiol. 12(2), 141–148 (2002)
3. Gazzaniga, M.S., Ivry, R.B., Mangun, G.R.: Cognitive Neuroscience: The Biology of the Mind. Norton & Company, New York (2002)
4. Wascher, E., Wauschkuhn, B.: The interaction of stimulus- and response-related processes measured by event-related lateralizations of the EEG. Electroencephal. Clin. Neurophysiol. 99, 149–162 (1996)
5. Jahanshahi, M., Hallett, M.: The Bereitschaftspotential: what does it measure and where does it come from? In: Jahanshahi, M., Hallett, M. (eds.) The Bereitschaftspotential. Movement-Related Cortical Potentials, pp. 1–18. Kluwer-Plenum, New York (2003)
6. Shibasaki, H., Hallett, M.: What is the Bereitschaftspotential? Clin. Neurophysiol. 117, 2341–2356 (2006)
7. Coles, M.G.H.: Modern mind-brain reading: Psychophysiology, physiology, and cognition. Psychophysiology 26, 251–269 (1989)
8. Hackley, S.A., Miller, J.: Response complexity and precue interval effects on the lateralized readiness potential. Psychophysiology 32, 230–241 (1995)
9. Prime, D.J., Ward, L.M.: Inhibition of return from stimulus to response. Psychol. Sci. 15(4), 272–276 (2004)
10. Haggard, P., Eimer, M.: On the relation between brain potentials and the awareness of voluntary movements. Exp. Brain Res. 126, 128–133 (1999)
11. Jasper, H.: Ten-twenty electrode system of the International Federation. Electroenceph. Clin. Neurophysiol. 10, 371–375 (1958)
12. Bell, A.J., Sejnowski, T.J.: An information-maximization approach to blind separation and blind deconvolution. Neur. Comput. 7(6), 1129–1159 (1995)
13. Delorme, A., Makeig, S.: EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics. J. Neurosci. Meth. 134, 9–21 (2004)
14. Lancaster, J.L., Rainey, L.H., Summerlin, J.L., Freitas, C.S., Fox, P.T., Evans, A.C., Toga, A.W., Mazziotta, J.C.: Automated labeling of the human brain: a preliminary report on the development and evaluation of a forward-transform method. Hum. Brain Mapp. 5(4), 238–242 (1997)
15. Lancaster, J.L., Woldorff, M.G., Parsons, L.M., Liotti, M., Freitas, C.S., Rainey, L., Kochunov, P.V., Nickerson, D., Mikiten, S.A., Fox, P.T.: Automated Talairach atlas labels for functional brain mapping. Hum. Brain Mapp. 10(3), 120–131 (2000)
Less Driving While Driving? An Approach for the Estimation of Effects of Future Vehicle Automation Systems on Driver Behavior Bertram Wortelen and Andreas Lüdtke OFFIS – Institute for Information Technology, R&D Division Transportation, Escherweg 2, 26122 Oldenburg, Germany {Bertram.Wortelen,Andreas.Luedtke}@offis.de
Abstract. We present a concept for the automatic estimation of the effect of newly introduced in-vehicle systems on driver’s gaze behavior by means of executable cognitive driver models. Structure and properties of the complete system are shown. Particular attention is paid to the underlying model of visual attention which is integrated in the cognitive architecture CASCaS. Different alternatives for the operationalization of human expectancy as one influencing factor of visual attention are presented and discussed. Principal applicability of the model is demonstrated on a laboratory four-instrument visual scanning task. First results are briefly discussed. Future steps are outlined. Keywords: Cognitive Driver Model, CASCaS, Visual Scanning, In-Vehicle System Evaluation.
1 Introduction
Since the time of the first automobiles, there has been a constant process of technical improvement. Especially in recent years, many new systems have been integrated into new cars which aim to automate parts of the driver's task. The level of automation varies between systems. For the longitudinal control task there are informatory systems like collision warning systems as well as supporting, partially autonomous driver assistance systems (PADAS) like ACC (Adaptive Cruise Control). Similar assistance can be found for other subtasks: there are Lane Departure Warning Systems and Lane Keeping Assistants for the lateral control task, Navigation Systems for route planning, and many informatory and warning systems like Blind Spot Detection or Traffic Jam Warning. Current technical progress, e.g., in advanced sensor networks and C2C/C2I communication, bears high potential for further automation in the automotive domain. The DARPA Urban Challenge is aiming at an even more ambitious goal: it strives to pave the future of driving by motivating international teams to develop vehicles able to drive fully autonomously in urban areas. Some of these teams, like the German team of Jörn Wille (TU Braunschweig) or the Google team of Sebastian Thrun, have even tested their vehicles in real traffic.
Besides automation, another trend can be foreseen in the automotive area. The number and quality of infotainment systems are likely to rise in the near future [1]. Internet access, freely available development kits and operating systems will introduce a wide range of new applications. An example of a near-to-market system is the AutoLinQ system under development at Continental. A similar trend in the smartphone market resulted in a rapidly growing app market. These apps might easily allocate the free attentional resources drivers gain from using automation.
The introduction of more and more automation in cars will change the driver's role step by step from manual control to supervisory control. A lot can be learned from the aviation domain, where this automation process took place earlier. Potential problems this change entails are loss of situation awareness, new attentional demands, or complacency induced by too much confidence in automation [2]. Important to consider are behavioral changes induced by the automation which do not lead directly to accidents but may be a contributing factor. Dekker [3] pointed out that the difficult question is not which function to automate, but how to ensure a well-working cooperation between human and machine. On the one hand, the assistance system should support the driver; on the other hand, it should always keep the driver in the loop, because the driver is held responsible for the consequences induced by the driver-car system.
One very important aspect in the context of supervisory control systems is human visual scanning behavior, as it directly influences how well the system operator is aware of the current system state. When introducing a new assistance system in a car, the system designer has to evaluate possible side effects on drivers' behavior, including unwanted changes in their visual scanning behavior. Estimating the effect of a novel system on drivers' behavior can be very valuable information for the designer. Different complementary approaches exist for this purpose [4]. Testing the completed system or prototypes with human test participants is an established way of evaluating the system. It provides the advantage of directly observing the participants' reactions to the system. Participants are able to articulate their impressions of and feelings about the system, while data recording also allows for objective measures of influences on driving behavior. Unfortunately, this has to be done at a late design stage, when a prototype of the system is available. In addition, the evaluation of long-term effects is expensive and normally only applicable after market introduction. Another alternative is to consult a human factors expert for estimations of the system's effect on driver behavior, which is a good option in early design stages, as the expert can already work with informal system descriptions.
Our approach aims at giving software tool support for the evaluation process and at partially automating it. The basic concept is to run closed-loop simulations with all involved components of the driver-vehicle-environment (DVE) system, including an executable model of the driver. In the optimum case, integrating a system prototype and an interaction description of how to use the assistance system into the DVE simulation system leads to an automatic prediction of drivers' behavior as it would have been observed with test participants in a driving simulator.
Predicting human interaction with novel technical systems in a simulative way is a known approach in the scientific literature. A famous general-purpose example is CogTool [5], which automatically generates a predictive human performance model
from a user interface prototype. Salvucci adopted this approach for his promising Distract-R [6] system, which predicts changes in driving behavior when interacting with an in-vehicle interface. In the spirit of Distract-R, we are investigating whether the effect of newly introduced systems on visual scanning behavior can be evaluated in an automated way. Our approach goes beyond Distract-R as it takes into account not only the direct interaction with the user interface, but also the demands that are placed on the driver's visual scanning strategy. This allows evaluating a wider range of systems. We will show how our approach might even be used to evaluate partially autonomous driver assistance systems, which require only few direct interactions with the user interface. Instead, these systems are critical because of their potential to provoke out-of-the-loop effects. We aim to predict these effects by simulating the adaptation of drivers' visual scanning behavior to the newly introduced system.
2 System Structure
Our long-term objective is to make a software-supported methodology available to system designers which allows predicting the effects of a system design on driver behavior. One important requirement for the practical application of this methodology is that it should add as little additional cost to the design process as possible, in terms of time and effort. In this section we present the software system structure and describe how the individual components are configured to enable an efficient interaction with the system designer.
2.1 Joint Driver-Vehicle-Environment Platform
Our research focuses on modeling and simulating driver behavior within a joint driver-vehicle-environment (JDVE) platform. The specific platform we are using has been developed by Schindler et al. [7] to especially fit the needs of a model-driven system evaluation process. The platform consists of all components typical for a realistic commercial driving simulator (see Fig. 1). The environment component contains a model of the physical environment, the surrounding traffic and the scenario with its related events, while the vehicle component simulates the dynamics of a specific vehicle model. The platform has been designed in a flexible way to allow an easy exchange and integration of new components. This allows integrating new driver assistance systems into the simulation. The driver component is also exchangeable. It can be connected to a physical driving simulator cabin for tests with human drivers, or it can be instantiated by an executable driver model. This allows direct comparison of model behavior and human driver behavior. Used with driver models, the JDVE platform reveals further advantages. As no interaction with a human driver is needed, there is no need for real-time simulation. The JDVE platform is able to run in an as-fast-as-possible mode to significantly speed up processing time. By disconnecting all visualization components from the simulation, the processing time can be improved further. Built-in batch execution support allows repeating the same scenario multiple times to produce driver behavior traces in a Monte-Carlo-like way, showing a reasonably realistic variance in behavior.
Fig. 1. JDVE setup for automatic system evaluation. The human driver is replaced by a driver model combined with an interaction description for the system under investigation.
2.2 Driver Model
The driver model we are using basically consists of two parts: a task-independent and a task-dependent model of human behavior. For the task-independent model we use the cognitive architecture CASCaS (Cognitive Architecture for Safety Critical Task Simulation) developed at the German institute OFFIS. CASCaS was originally developed in the aeronautics domain to simulate pilot-cockpit interaction [8]. In recent years it has been extended with several competencies to account for the special needs of driver models [9, 10]. It in turn contains a set of sub-models (see Fig. 2) able to simulate the different cognitive processes that are involved in everyday tasks. This allows CASCaS to produce reasonably realistic human-like behavior. But CASCaS alone cannot generate behavior. It additionally needs a description of the knowledge humans possess for handling a specific task like flying a plane or driving a car. The representation of task-dependent knowledge can differ widely. This depends on the layer chosen for processing the knowledge and the modeling technique chosen for the process implementation on that layer. Based on Fitts' concept of the three learning stages [11], CASCaS has three processing layers. The core layer of CASCaS, which is used in every CASCaS model, is the associative layer. It interprets GOMS-like [12], rule-based knowledge to generate behavior. It represents conscious but familiar processing steps. In contrast, the cognitive layer represents information processing in unfamiliar situations and the autonomous layer represents fast and automated behavior, which requires no conscious processing. While knowledge processing on the associative layer is fixed within the architecture, the processing on the cognitive and autonomous layers can be adapted to the special needs of a specific model. For these layers an interface is provided to dynamically connect software libraries, which do the actual knowledge processing, regardless of the utilized modeling technique. For the autonomous layer several modeling techniques have been evaluated for use within a driver model. The control-theoretic model of Salvucci and Gray [13] for
lateral vehicle control and Boer's [14] model for longitudinal control, as well as Möbus' and Eilers' [15] BAD-MoB models for both aspects, have been adapted for the subtasks of lane keeping, car following and driving with no leading vehicle. They have been used in different scenarios, including highway, rural and urban areas.
Fig. 2. High-level view of the main components of CASCaS
SVMs, stepwise linear regression models and BAD-MoB models have been evaluated for modeling parts of the behavior at traffic light approaches.
2.3 Scanning Model in CASCaS
The process that mainly drives the gaze behavior of models executed in CASCaS is the selection of goals. CASCaS maintains a set of active goals (G_act) which the model tries to achieve at the same time. This set can change dynamically during the simulation. For a driver model such a set can be:
• Keep an appropriate distance to the leading vehicle
• Keep the car within the driving lane
• Find a route to the target location
• Listen to the radio / talk to passengers / answer the mobile phone
• …
Typically goals are hierarchically decomposed into sub-goals down to single actions. The goals listed above are all active at the same time. On the associative layer only one of the active goals is processed at any time. We refer to this as the selected goal. To achieve all goals the model has to multitask by switching the selected goal from time to time. Which goal is selected is determined by the goal module. The autonomous layer, in contrast, can truly process multiple goals in parallel, as the behavior on this layer is highly skilled and needs little conscious processing. But these goals might compete for the limited resource of visual attention. In such a case the selected goal will get the resource. Therefore goal selection is the process that mainly drives the gaze behavior of models executed in CASCaS. In order to enable realistic gaze behavior, we integrated a visual scanning model into the goal selection process of CASCaS. We chose the SEEV model of Wickens
[16], which predicts the probability of visually attending an information source based on four general influencing factors:
P(IS) = S − Ef + Ex · V    (1)

Here S − Ef constitutes the bottom-up part and Ex · V the top-down part of the model.
The influencing factors are:
(S) Saliency of the information source
(Ef) Effort required for obtaining information from an information source
(Ex) Expectancy that an information source offers new information
(V) Value of the information from an information source for solving a particular task
We focus on the top-down factors Expectancy and Value as these are highly relevant for very familiar and skilled task executions. Like Wickens [16] we also distinguish between different sources of expectancy:
Ex = Cx + Er    (2)
(Cx) Contextual cues, like an alarm signal, which probably results in moving attention to the source of the alarm
(Er) Event rate (information bandwidth), which denotes how often an information source provides new information
As goal selection mainly drives visual attention in CASCaS, we enable the annotation of goals with the top-down factors of the SEEV model. The probability of selecting a goal, P(g), in CASCaS is given by the relative value of these factors:

P(g) = Ex(g) · V(g) / Σ_{g_a ∈ G_act} ( Ex(g_a) · V(g_a) )    (3)
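A minimal sketch of this relative-weight selection rule, with hypothetical goal names and parameter values (our illustration, not CASCaS's internal implementation):

    import random

    def selection_probabilities(goals):
        """Eq. (3): relative Expectancy*Value weights over the active goal set.

        goals: dict mapping goal name -> (expectancy, value)
        Returns a dict mapping goal name -> selection probability P(g).
        """
        weights = {g: ex * val for g, (ex, val) in goals.items()}
        total = sum(weights.values())
        return {g: w / total for g, w in weights.items()}

    # Example: four concurrent driving goals with equal value
    active_goals = {
        'keep_distance': (2.0, 1.0),
        'keep_lane':     (4.0, 1.0),
        'find_route':    (0.5, 1.0),
        'listen_radio':  (1.0, 1.0),
    }
    probs = selection_probabilities(active_goals)
    selected = random.choices(list(probs), weights=list(probs.values()), k=1)[0]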
Care has to be taken with this implementation, as goals and information sources cannot be used synonymously. A goal in CASCaS can use more than one information source. As a consequence, the described way of goal selection would result in different percent dwell times (PDTs) for the information sources than the original SEEV model would predict. There are easy ways to work around this in CASCaS, which we will not discuss in this paper.
2.4 Operationalization of Expectancy
If a system designer has to do the work of modeling the system interaction, then she/he also has to define the parameters for the SEEV model. System designers are normally engineers who are not very knowledgeable in human factors issues. From their point of view, defining these parameters is much effort for little gain. To reduce the effort and increase the acceptability of the method for the designer, we provide the possibility to automatically estimate the expectancy factor Ex. In this section we first present how the event rate Er, as part of Ex, can be operationalized. Then we show how it can be generalized to also include expectancy stemming from contextual cues (Cx).
We define an event in this context as a reaction of the model to some visual information. The number of reactions per unit time defines the event rate. The idea behind this definition is that a reaction to information is a good indicator of the relevance of that specific piece of information for the current task. Of course there are exceptional cases where this assumption is not true: people might perceive and notice relevant information without showing a visible response. Nevertheless, we think that in most cases this gives a useful approximation. Within CASCaS a reaction to visual information occurs when a procedure rule is fired whose left-hand side matches the visually perceived information and whose right-hand side defines the actions to be taken. Thus an event is derived dynamically from the simulation of the model's interaction description. This has some consequences. This definition of event rate deviates slightly from that of Wickens [16]. Wickens does not explicitly operationalize an event, but in his model the event rate, or information bandwidth, is an environmental property of the information source, which in principle can be measured in an objective way. With our definition, the interaction description explicitly defines all possible events. But the model might miss some events, and only the ones it perceives determine the event rate. Thus it is a subjective measurement of the model, which can deviate from the real environmental event rate. Another consequence is that the model is now able to adapt to changes in the event rate, because the event rate is no longer a static parameter but is determined dynamically during the simulation. Assuming that humans are good at adapting to properties of the environment, they might also adapt to more aspects of the event sequence than just the average event rate. An information source that periodically produces events with a fixed distance between two events might result in a different visual scanning behavior than an information source with the same average event rate but a very random distribution of event distances. So a better strategy would be to base the decision where to look on the probability that new relevant information is available at an information source, given the time since the last event. Therefore we refine the operationalization of event rate by:
Er(g) = P(e_g | Δe_l)    (4)
Here P(e_g | Δe_l) is the probability that since the last event e_l a new event e_g has been produced for goal g, under the condition that the time since the last event for that goal is Δe_l. This probability is obtained by tracing the cumulative event distance distribution during the simulation. Again assuming that people are good at adapting to environmental properties, they might not only use the time since the last event Δe_l as a predictor for new events. Other contextual information can also be used as a predictor. Therefore we generalize equation 4 to also include contextual predictors:

Ex(g) = P(e_g | Δe_l, cx_1, …, cx_n)    (5)

Thus the event rate Er and contextual cues Cx as expectancy-producing factors are treated in a similar way within CASCaS. Which factors are used as predictors has to be defined by the modeler.
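A minimal sketch of such a tracker, assuming events arrive as timestamps (our illustration of the cumulative-distribution idea behind Eq. (4), not CASCaS code):

    import numpy as np

    class ExpectancyTracker:
        """Traces the cumulative distribution of inter-event distances to
        estimate Eq. (4): P(new event for goal g | time since last event)."""

        def __init__(self):
            self.intervals = []          # observed event-to-event distances
            self.last_event = None       # time of the most recent event

        def observe_event(self, t):
            """Call whenever a procedure rule fires for this goal."""
            if self.last_event is not None:
                self.intervals.append(t - self.last_event)
            self.last_event = t

        def expectancy(self, t):
            """Empirical CDF of inter-event distances at the current wait."""
            if self.last_event is None or not self.intervals:
                return 0.0
            waited = t - self.last_event
            return float(np.mean(np.asarray(self.intervals) <= waited))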
3 Application to a Laboratory Task
In order to show the general applicability of the visual scanning model, we created an interaction description for a well-known laboratory task developed to investigate scanning behavior: the Four Instrument Scanning Task of Senders [17]. In Sections 2.3 and 2.4 we showed that CASCaS offers different ways to use the scanning model. One way is to define fixed values for the top-down SEEV parameters. In the CASCaS procedure language this is done by annotating goals when they are instantiated in the model. For observing the gauges we create one goal for each gauge and annotate it with the pointer speeds Senders used for each instrument, measured in rad/s:

Goal, name=observe_gauge_1, value=1.0, event_rate=0.5
Goal, name=observe_gauge_2, value=1.0, event_rate=1.0
Goal, name=observe_gauge_3, value=1.0, event_rate=2.0
Goal, name=observe_gauge_4, value=1.0, event_rate=4.0

Due to the fixed parameters we are able to calculate the expected fixation probability for each gauge before the simulation. Gaze fixation and selected goal should correlate perfectly for this model. Therefore we determine the fixation probability for a gauge via its respective goal, P_fix(g). It is given by the probability that a transition is made from any other goal to that specific one. The probability of making a transition from goal a to goal b is given by the probability of having selected a times the probability of selecting b as the next goal.
P_{a→b} = P_fix(a) · P(b) / Σ_{g ∈ G\{a}} P(g)    (6)
The sum term does not include P(a), because reselecting a would result in a transition of neither the selected goal nor the fixated information source. With equation 6 we get:
P_fix(g) = Σ_{g̃ ∈ G\{g}} P_{g̃→g}    (7)
Solving the resulting set of linear equations gives the fixation probabilities in column 3 of Table 1. We connected the executable CASCaS model to an implementation of Senders' task, so that it receives the current state of all gauges. Running the model for one hour resulted in the simulated fixation probabilities listed in the fourth column. Simulated and calculated probabilities are almost identical, which indicates that the implementation is working correctly. The correlation between model and experimental data is high, with a Pearson coefficient of 0.881, though with a very small sample size (n = 4).
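Under these fixed parameters, Eqs. (6) and (7) define a stationary-distribution problem that can be solved numerically; the following NumPy sketch (our illustration, not the authors' tooling) reproduces the calculated column of Table 1:

    import numpy as np

    # Selection probabilities from Eq. (3): equal values, expectancy
    # proportional to the annotated event rates of the four gauges.
    rates = np.array([0.5, 1.0, 2.0, 4.0])
    p_sel = rates / rates.sum()
    n = len(p_sel)

    # Eq. (6): transition probability from goal a to goal b (a != b);
    # reselecting the same goal does not count as a transition.
    T = np.array([[0.0 if a == b else p_sel[b] / (1.0 - p_sel[a])
                   for b in range(n)] for a in range(n)])

    # Eq. (7) makes P_fix the stationary distribution of T: pi = pi T.
    # Solve (T^T - I) pi = 0 together with the constraint sum(pi) = 1.
    A = np.vstack([T.T - np.eye(n), np.ones((1, n))])
    rhs = np.append(np.zeros(n), 1.0)
    p_fix = np.linalg.lstsq(A, rhs, rcond=None)[0]

    print(np.round(p_fix, 3))   # [0.1, 0.186, 0.314, 0.4] -> Table 1, column 3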
4 Application to In-Vehicle Systems
Infotainment systems tend to draw drivers' attention to their display, while PADAS might reduce the attentional demand of the primary driving task. So both kinds of
systems bear the potential danger of inappropriately reducing the amount of time drivers spend looking at the road scenery, especially when used in combination. The proposed modeling approach aims at predicting these changes in visual attention. The example of Senders' task served to investigate the principal applicability of the visual scanning model. For realistic systems a number of additional problems are expected. The definition of an event might not be clear in every situation. In particular, people might weight events differently. The information value might be hard to define in relation to other tasks. Gaze behavior in real-world tasks is normally not pure scanning behavior. In some situations people show fixed gaze patterns. When changing lanes on a highway, people develop gaze patterns like looking-to-side-mirror → looking-over-shoulder → looking-to-leading-vehicle. In other situations people show visual search behavior, e.g., when searching for specific information on a display. Visual search behaviors and fixed gaze patterns are mixed with visual scanning behavior in real-world tasks. Our next step is to apply the visual scanning model to a cognitive driver model. We will investigate whether the visual scanning model is able to make useful predictions despite all the anticipated problems. We will investigate a scenario in which two kinds of systems are involved. One system is a variant of an ACC, as an example of a partially autonomous driver assistance system, which is expected to reduce the attentional demands of the primary driving task. The other system is an artificial visual search task which is used as a placeholder for in-vehicle information systems.

Table 1. Calculated, simulated and observed fixation probabilities for each instrument
           Signal bandwidth   Calculated P_fix   Simulated P_fix   Observed P_fix (Senders [17])
Gauge 1    0.5 rad/s          0.1                0.095             0.077
Gauge 2    1 rad/s            0.186              0.185             0.172
Gauge 3    2 rad/s            0.314              0.319             0.214
Gauge 4    4 rad/s            0.4                0.402             0.537
5 Discussion
We successfully integrated parts of the SEEV model into the cognitive architecture CASCaS and demonstrated it on a laboratory visual scanning task. With equal value parameters and fixed event rate parameters, our implementation is basically identical to Senders' Random Constrained Sampler [17]. Even though the results seem to fit Senders' data reasonably, in later experiments with a six-instrument setup [18] Senders argued that the Random Constrained Sampler fails to explain the experimental results. The model especially underestimated fixation probabilities for instruments with low bandwidths and overestimated those for high-bandwidth instruments.
Acknowledgments. The research leading to these results has received funding from the European Commission Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 218552, Project ISi-PADAS.
References
1. Löwer, C., Bolduan, G.: Smartphones auf Rädern, vol. (1), pp. 48–51. Heise, Hamburg (2011)
2. Sarter, N.B., Woods, D.D., Billings, C.E.: Automation Surprises. In: Salvendy, G. (ed.) Handbook of Human Factors & Ergonomics. John Wiley & Sons, Chichester (1997)
3. Dekker, S.W.A.: Ten Questions about Human Error: A New View of Human Factors and System Safety. Lawrence Erlbaum Associates, Mahwah (2005)
4. Servel, A., Knapp, A., Jung, C., Donner, E., Dilger, E., Tango, F., Mihm, J., Schwarz, J., Wood, K., Ojeda, L., Neumann, M., Brockmann, M., Meyer, M., Kiss, M., Flament, M., Kompfner, P., Walz, R., Cotter, S., Winkle, T., Janssen, W.: Code of Practice for the Design and Evaluation of ADAS. Tech. Report, V3.0 (2010), http://www.preventip.org/download/deliverables/RESPONSE3/D11.2/Response3_CoP_v3.0.pdf
5. John, B.E., Blackmon, M.H., Polson, P.G., Fennel, K., Teo, L.: Rapid Theory Prototyping: An Example of an Aviation Task. In: Human Factors and Ergonomics Society Annual Meeting Proceedings, vol. 53(12), pp. 794–798, San Antonio, Texas (2009)
6. Salvucci, D.D.: Rapid Prototyping and Evaluation of In-Vehicle Interfaces. ACM Transactions on Computer-Human Interaction 16(2), 9:1–9:33 (2009)
7. Schindler, J., Harms, C., Noyer, U., Richter, A., Flemisch, F., Köster, F., Bellet, T., Mayenobe, P., Gruyer, D.: JDVE: A Joint Driver-Vehicle-Environment Simulation Platform for the Development and Accelerated Testing of Autonomous Assistance and Automation Systems. In: Cacciabue, P.C., Hjälmdahl, M., Lüdtke, A., Riccioli, C. (eds.) Human Modeling in Assisted Transportation. Springer, Italia (2011)
8. Lüdtke, A., Pfeifer, L.: Human Error Analysis Based on a Semantically Defined Cognitive Pilot Model. In: Saglietti, F., Oster, N. (eds.) SAFECOMP 2007. LNCS, vol. 4680, pp. 134–147. Springer, Heidelberg (2007)
9. Wortelen, B., Zilinski, M., Baumann, M., Muhrer, E., Vollrath, M., Eilers, M., Lüdtke, A., Möbus, C.: Modelling Aspects of Longitudinal Control in an Integrated Driver Model: Detection and Prediction of Forced Decisions and Visual Attention Allocation at Varying Event Frequencies. In: Cacciabue, P.C., Hjälmdahl, M., Lüdtke, A., Riccioli, C. (eds.) Human Modeling in Assisted Transportation. Springer, Italia (2011)
10. Lüdtke, A., Osterloh, J.-P., Weber, L., Wortelen, B.: Modeling Pilot and Driver Behavior for Human Error Simulation. In: Duffy, V.G. (ed.) ICDHM 2009. LNCS, vol. 5620, pp. 403–412. Springer, Heidelberg (2009)
11. Fitts, P.M., Posner, M.I.: Human Performance. Brooks/Cole, Belmont, CA (1967)
12. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Erlbaum, Hillsdale (1983)
13. Salvucci, D., Gray, R.: A Two-Point Visual Control Model of Steering. Perception 33(10), 1233–1248 (2004)
14. Boer, E.R., Ward, N.J., Manser, M.P., Yamamura, T., Kuge, N.: Driver Performance Assessment with a Car Following Model. In: Proceedings of the Third International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, pp. 433–440 (2005)
15. Möbus, C., Eilers, M.: Mixture of Behaviors and Levels-of-Expertise in a Bayesian Autonomous Driver Model. In: Duffy, V.G. (ed.) Advances in Applied Digital Human Modeling, pp. 425–435. CRC Press, Boca Raton (2010)
16. Wickens, C.D., McCarley, J.S.: Applied Attention Theory. CRC Press, Boca Raton (2008)
17. Senders, J.W.: The Human Operator as a Monitor and Controller of Multidegree of Freedom Systems. IEEE Trans. Hum. Factors Electron. HFE-5(1), 2–5 (1964)
18. Senders, J.W.: Visual Scanning Processes. University of Tilburg, Netherlands (1983)
Author Index

Abdel-Malek, Karim 109
Aras, Raghav 503
Arora, Jasbir 109
Bachman, Mark 171
Bauer, Sebastian 272
Beckmann, Michael 379
Benderius, Ola 453
Bengler, Klaus 79
Bhatt, Rajan 109
Bloodworth, Toni 292
Blum, Rainer 186
Boothby, Robyn 59, 69, 231
Bubb, Heiner 79
Burns, Katherine 312
Burra, Nagananda Krishna 210
Chang, Chien-Chi 228, 394
Chao, Chuzhi 30
Chee-kooi, Chan 22
Chen, Bi-Hui 3
Chiou, Wen-Ko 3
Chou, Wei-Ying 3
Cloutier, Aimee 59, 231
D'Aquaro, Nicola 245
Deml, Barbara 493
Dennerlein, Jack T. 178, 228
Djalilian, Hamid 171
Dong, Dayong 255
Duschl, Markus 473
Dzaack, Jeronimo 379
Eilers, Mark 463, 483
Elepfandt, Monika 263
Ellegast, Rolf 195
Engstler, Florian 79
Ericsson, Mikael 99
Faber, Gert S. 228
Fang, Jing-Jing 12
Fang, Sheng-Yi 12
Feng, Ailan 37, 46
Fritzsche, Lars 272
Fu, Shan 255, 409, 446
Gao, Feng 204
Garbe, Hilke 483
Goonetilleke, Ravindra S. 220
Graf, Holger 282
Gragg, Jared 69, 231
Guenzkofer, Fabian 79
Guo, Xiaoling 367
Hariri, Mahdiar 109
Hayes, Austen 292
Hazke, Leon 282
He, Ketai 37, 46
Hermanns, Ingo 195
Hesse, Sebastian 186
Hodges, Larry F. 292
Hou, Fang 426
Howard, Brad 89
Huang, Weifen 426
Ip, W.H. 22
Jäckel, Thomas 272
Jendrusch, Ricardo 272
Jiang, Ting 387
Johansson, Henrik 99
Johnson, Ross 151
Jürgensohn, Thomas 513
Kabuß, Wolfgang 399
Kahn, Svenja 282
Kaklanis, Nikolaos 302
Kausch, Bernhard 399
Kerr, Brandon 292
Keyvani, Ali 99
Kingma, Idsart 228
Knake, Lindsey 151
Kraus, Thomas 195
Kwon, HyunJung 109
Lämkull, Dan 99
Laquai, Florian 473
Lei, Zhipeng 119
Leidholdt, Wolfgang 272
Li, Dongxu 129
Li, Kang 139
Lin, Jia-Hua 394
Liu, Min 436
Liu, Taijie 30, 37, 46
Long, James 312
Lüdtke, Andreas 523
Luximon, Ameersing 22, 321, 367
Luximon, Yan 321
Ma, Xiao 367
Maggiorini, Dario 245
Maier, Enrico 143
Malerczyk, Cornelius 282
Mancuso, Giacomo 245
Markkula, Gustav 453
Marler, Tim 151
Mayer, Marcel Ph. 399
McGorry, Raymond W. 394
Minin, Luca 503
Möbus, Claus 463, 483
Möller, Sebastian 337
Moschonas, Panagiotis 302
Moustakas, Konstantinos 302
Neumann, Hendrik 493
Ng, Roger 328
Niu, Jianwei 37, 46, 426
Numa, Julien 210
Ochsmann, Elke 195
Odenthal, Barbara 399
Örtengren, Roland 99
Ozsoy, Burak 161
Pannetier, Romain 210
Paulick, Peyton 171
Pietquin, Olivier 503
Pirger, Attila 272
Protzak, Janna 513
Qin, Jin 178
Quanpeng, Wang 436
Ran, Linghua 30, 37, 46
Rigoll, Gerhard 473
Ripamonti, Laura A. 245
Rötting, Matthias 347, 417, 513
Rupprecht, Dominik 186
Schaffer, Stefan 337
Schiefer, Christoph 195
Schleicher, Robert 337
Schlick, Christopher M. 399
Schmuntzsch, Ulrike 347
Shen, Haifeng 357
Sixiang, Peng 22
Smotherman, John Mark 292
Sun, Xiaohui 204
Sünderhauf, Marcelina 263
Tan, Virak 139
Tango, Fabio 503
Tangwen, Yin 409
Tian, Zhiqiang 387
Trudeau, Matthieu 178
Tzovaras, Dimitrios 302
Ulinski, Amy 292
Wahde, Mattias 453
Wang, Chunhui 387
Wang, Lijing 255
Wang, Xuguang 210
Wang, Yanyu 37, 46
Wang, Zheng 387
Weerasinghe, Thilina W. 220
Wegerich, Anne 417
Welke, Sebastian 513
Wolff, Krister 453
Wortelen, Bertram 523
Wu, Bin 426, 436
Xu, Xu 228
Xu, Yongzhong 387
Yang, Jingzhou (James) 59, 69, 89, 119, 161, 231, 312
Yao, Zhi 426
Yuan, Xiugan 204, 255, 357
Zhang, Ming 367
Zhang, Qinghong 231
Zhang, Xiang 436
Zhang, Xin 30, 37, 46
Zhang, Yifan 367
Zhang, Yijing 436
Zhao, Jingquan 204
Zhao, Yan 129
Zheng, Yiyuan 446
Zou, Qiuling 231